A blazing-fast, modular, type-safe parser for the OMG IDL 4.2 specification, written in Rust. Now with code generation for C, C++, Rust, and Python! https://hdds.io
Olivier ESTEVE 7fdfedef94
feat(codegen): C/C++ migrate to encode_cdr2_le_at + cap-4 + zero-fill (1.6.1e)
Symmetric port of the F01 fix to the C and C++ codegen backends, mirroring
the Rust-side migration that landed in 1.6.1a-trait through
1.6.1a-callers-internal plus 1.6.1a-codegen-rust + 1.6.1a-codegen-encode.

C codegen:
- primitive_scalar_layout (codec/mod.rs): 8-byte primitives now `align: 4`
  per XCDR2 §7.4.3.4.1 Tab.15 cap-4. Pre-1.6.1e value `align: 8` was the
  F01 systemic bug source on this backend.
- DefinitionIndex::align_of (index.rs): same cap-4 fix. Used by encode_array
  for pre-alignment; otherwise `array<int64>` / `array<double>` would have
  re-introduced F01 through a different code path (caught by Sonnet review).
- definitions.rs: emit `Foo_encode_cdr2_le_at(const Foo*, uint8_t*, size_t,
  size_t* offset_io)` for structs and unions; pre-existing
  `Foo_encode_cdr2_le(const Foo*, uint8_t*, size_t)` is preserved as a
  trivial wrapper for back-compat. Contract: `_at` reads `*offset_io` on
  entry, writes back on success; on error (negative return code), the
  caller-visible offset state is undefined (matches the Rust
  encode_cdr2_le_at contract).
- codec/encode.rs: nested-struct caller site migrated from sub-buffer slice
  (`dst + offset, len - offset`) to offset-propagating call
  (`dst, len, &offset`). Bitset/bitmask hardcoded `encode_scalar(8, 8, ...)`
  changed to `(4, 8, ...)` for cap-4.
- tests.rs: added assertion for the new `_at` signature alongside the
  existing legacy assertions.
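
The cap-4 rule and the `_at` contract listed above can be sketched in Rust as follows. This is an illustration only, not hddsgen's generated output: `cap4_align`, `Inner`, `encode_at`, and `encode` are hypothetical names, and the sketch leaves the offset untouched on error, which is stricter than the "undefined on error" contract described above.

```rust
/// XCDR2 alignment for a primitive: its own size, capped at 4 bytes.
fn cap4_align(size: usize) -> usize {
    size.min(4)
}

/// Hypothetical type mirroring `struct Inner { uint8 b; uint64 v; }`.
struct Inner {
    b: u8,
    v: u64,
}

impl Inner {
    /// Offset-propagating encoder: reads `*offset` on entry and writes it
    /// back on success. Alignment is computed against the absolute stream
    /// offset, not against the start of this struct.
    fn encode_at(&self, dst: &mut [u8], offset: &mut usize) -> Result<(), ()> {
        let mut pos = *offset;
        // uint8: alignment 1, no padding needed.
        *dst.get_mut(pos).ok_or(())? = self.b;
        pos += 1;
        // uint64: 8 bytes wide, but aligned to 4 under the cap-4 rule.
        let align = cap4_align(8);
        let aligned = (pos + align - 1) / align * align;
        let end = aligned + 8;
        if end > dst.len() {
            return Err(());
        }
        dst[pos..aligned].fill(0); // zero-fill the padding bytes
        dst[aligned..end].copy_from_slice(&self.v.to_le_bytes());
        *offset = end;
        Ok(())
    }

    /// Legacy-style wrapper: starts at offset 0 and returns the byte count.
    fn encode(&self, dst: &mut [u8]) -> Result<usize, ()> {
        let mut offset = 0;
        self.encode_at(dst, &mut offset)?;
        Ok(offset)
    }
}
```

With `offset = 1` on entry (one tag byte already written), `v` lands at absolute offset 4. Encoding the same value into a sub-slice starting at byte 1 would align relative to the slice and place `v` at absolute offset 5, which is one way the sub-buffer form can diverge from the offset-propagating form.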

C++ codegen:
- primitive_scalar_layout (codec/mod.rs) + align_of (index.rs) + union
  primitive_scalar_layout (unions.rs) + discriminator_layout (unions.rs):
  same cap-4 fix on all four sites.
- All three struct encode emitters — FINAL (no DHEADER), APPENDABLE
  (DHEADER), MUTABLE (DHEADER + EMHEADER) — now emit
  `encode_cdr2_le_at(std::uint8_t*, std::size_t, std::size_t& offset)` as
  the real implementation with a legacy `encode_cdr2_le(dst, len)` wrapper
  for back-compat. Same pattern for the union encode method (unions.rs).
- Added `cdr2::pad_to_align(dst, offset, len, alignment)` helper that
  zero-fills the padding bytes per XCDR2 §7.4.3.4.2. All encode-side call
  sites — encode_scalar/encode_string/encode_wstring/encode_wchar/
  encode_fixed/encode_array/encode_sequence/encode_map + their duplicates
  in unions.rs + emit_encode_discriminator — migrated from
  `cdr2::align_offset(offset, ...)` (no zero-fill) to `cdr2::pad_to_align`.
  `cdr2::align_offset` itself is left in place because the decode side
  still uses it (deferred to 1.6.11).
- Nested-struct/union caller sites (codec/encode.rs + unions.rs) migrated
  from sub-buffer slice to offset-propagating call.
- encode_scalar heuristic (unions.rs): extended from
  `contains("static_cast")` to also include `ends_with(')')` so that call
  expressions take the temp-variable path. Mirrors the corresponding
  heuristic in codec/encode.rs; the two implementations were divergent
  pre-fix and could have produced rvalue-address bugs on union fields
  whose value expression ended with `)` (caught by Sonnet review).
- PubSubType (codec/pubsub_types.rs): UNCHANGED. Still calls the legacy
  `encode_cdr2_le` wrapper. This preserves the DDS TopicDataType ABI
  (`serialize(void*, SerializedPayload_t*)` returns byte count, not
  propagated offset), which is what FastDDS / Connext / OpenDDS plumbing
  expects on a top-level message.
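
The difference between `align_offset` (no zero-fill) and `pad_to_align` (zero-fill) can be sketched as below. The real helpers live in the generated C++ `cdr2` namespace with the signature quoted above; this Rust rendering is an assumed analogue with a simplified signature, written only to show the behavior.

```rust
/// Round `offset` up to the next multiple of `alignment` (a power of two).
/// No bytes are written, so skipped bytes keep whatever they held before.
fn align_offset(offset: usize, alignment: usize) -> usize {
    (offset + alignment - 1) & !(alignment - 1)
}

/// Same rounding, but the skipped bytes are set to zero.
/// Returns `None` when the padded range does not fit in `dst`.
fn pad_to_align(dst: &mut [u8], offset: usize, alignment: usize) -> Option<usize> {
    let aligned = align_offset(offset, alignment);
    dst.get_mut(offset..aligned)?.fill(0);
    Some(aligned)
}
```

On a reused buffer that still holds stale data, `align_offset` alone would leave that data in the padding bytes, so two encodings of the same value could differ byte for byte; `pad_to_align` makes the output deterministic.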

Decode is intentionally NOT migrated (sub-workstream 1.6.11 territory).
The encode-side nested-struct callers now emit `_at` symmetric calls, but
the decode-side nested-struct callers (codec/decode.rs in C + C++, unions.rs
decode emitter) still use the sub-buffer slice form. To make this asymmetry
visible to downstream consumers, a `// TODO(hddsgen 1.6.11): migrate to
decode_cdr2_le_at to match encode side.` comment is now emitted alongside
each decode caller (Opus 3 review finding HIGH — encode/decode asymmetry
footgun in generated headers).

Rust backend (encode.rs + encode_containers.rs): formatting cleanup only.
Six `format_args!(...)` calls that `cargo fmt` had flagged as overly
multi-line since 1.6.1a-codegen-encode (commit da005a7) were compacted.
No semantic change; the compaction is equivalence-preserving.

Empirical validation:
- All 147 hddsgen library tests pass.
- `cargo fmt --all -- --check`: clean.
- `cargo clippy --all-targets --all-features -- -D warnings`: clean.
- Smoke probe `struct Outer { uint8 tag; Inner nested; sequence<uint32>
  nums; string name; }` with `struct Inner { uint8 b; uint64 v; }` —
  gen c + gcc -std=c11 -Wall -Wextra runs OK n=31 first16=
  `ab 01 00 00 88 77 66 55 44 33 22 11 02 00 00 00`. Same probe via gen cpp
  + g++ -std=c++17 -Wall -Wextra: BYTE-IDENTICAL output, same n=31. The
  `00 00` at offset 2-3 is the cap-4 padding (was 6 bytes pre-fix).
- Sonnet HIGH #1 (`align_of` cap-4 fix) — separate probe
  `struct ArrayInt64 { octet tag; long long arr[2]; }`: n=20 bytes,
  `ab 00 00 00` then arr[0] LE then arr[1] LE. Pre-fix would have been
  n=24 with 7 padding bytes. C and C++ byte-identical here too.
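
The byte counts in the two probes can be re-derived by hand with a small Rust sketch. It is independent of hddsgen and encodes the layout manually under the cap-4 rule. The `Writer` type and both functions are hypothetical, and the `nums` and `name` values are assumptions (the commit does not state them); they were chosen as two elements and a two-character string so that the total comes to 31 bytes.

```rust
/// Minimal little-endian writer applying cap-4 alignment with zero padding.
struct Writer {
    buf: Vec<u8>,
}

impl Writer {
    fn new() -> Self {
        Writer { buf: Vec::new() }
    }

    /// Pad with zeros to min(size, 4), then append the raw bytes.
    fn put(&mut self, bytes: &[u8]) {
        let align = bytes.len().min(4).max(1);
        while self.buf.len() % align != 0 {
            self.buf.push(0);
        }
        self.buf.extend_from_slice(bytes);
    }
}

/// struct Outer { uint8 tag; Inner nested; sequence<uint32> nums; string name; }
/// struct Inner { uint8 b; uint64 v; }
fn encode_outer(tag: u8, b: u8, v: u64, nums: &[u32], name: &str) -> Vec<u8> {
    let mut w = Writer::new();
    w.put(&[tag]);
    w.put(&[b]);
    w.put(&v.to_le_bytes());
    w.put(&(nums.len() as u32).to_le_bytes()); // sequence length
    for n in nums {
        w.put(&n.to_le_bytes());
    }
    // string: uint32 length counting the trailing NUL, then bytes, then NUL
    w.put(&(name.len() as u32 + 1).to_le_bytes());
    w.buf.extend_from_slice(name.as_bytes());
    w.buf.push(0);
    w.buf
}

/// struct ArrayInt64 { octet tag; long long arr[2]; }
fn encode_array_int64(tag: u8, arr: [i64; 2]) -> Vec<u8> {
    let mut w = Writer::new();
    w.put(&[tag]);
    for x in arr.iter() {
        w.put(&x.to_le_bytes());
    }
    w.buf
}
```

For `Outer`, the layout is 2 bytes (`tag`, `b`), 2 padding bytes, 8 bytes for `v`, 4 + 8 bytes for the sequence, and 4 + 3 bytes for the string, giving 31. For `ArrayInt64`, it is 1 byte, 3 padding bytes, and 16 bytes of array data, giving 20.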

Version bump 1.1.0 → 1.1.1 (PATCH). Source-level API is preserved (legacy
wrappers stay); the wire output changes for any type containing 8-byte
primitives due to the cap-4 fix, which justifies a patch bump to signal
the wire-level behavior change to downstream HDDS / SDK builds that pin
hddsgen by version.

3-agent review:
- Sonnet: 2 HIGH + 1 MEDIUM actionable. HIGH #1 (`align_of` returning 8 in
  `c/index.rs` + `cpp/index.rs` used by `encode_array`) — fixed inline
  before commit, empirically validated via array<int64> probe. MEDIUM
  (encode_scalar heuristic divergence between codec/encode.rs and
  unions.rs) — fixed inline by aligning unions.rs on the
  `ends_with(')')` extra check.
- Haiku: COMPLETE. 9/9 sanity checks PASS (tests, fmt, clippy, no legacy
  sub-buffer pattern, `encode_cdr2_le_at` present, no align_offset on
  encode path, cap-4 landed, pad_to_align helper present, smoke regen OK).
- Opus 3: HIGH (encode/decode asymmetry footgun in generated headers) —
  addressed inline by emitting `// TODO(hddsgen 1.6.11)` comments
  alongside each decode caller site. INFO findings on dheader_pos
  capture under reference semantics (correct) and idiomatic C/C++
  convention split (correct). LOW (max_cdr2_size unused-param,
  pre-existing) — noted but out of scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-11 02:29:44 +02:00
.github HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
examples fix: @key compute_key() for bool, char, enum, struct, Fixed, typedef string 2026-03-12 12:23:37 +01:00
fuzz HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
scripts feat(c): support @optional fields + cross-language roundtrip test 2026-03-08 02:47:53 +01:00
sdk/c-micro/include HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
src feat(codegen): C/C++ migrate to encode_cdr2_le_at + cap-4 + zero-fill (1.6.1e) 2026-05-11 02:29:44 +02:00
tests fix: @key compute_key() for bool, char, enum, struct, Fixed, typedef string 2026-03-12 12:23:37 +01:00
.gitignore HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
.pre-commit-config.yaml HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
build.rs HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
Cargo.lock feat(codegen): C/C++ migrate to encode_cdr2_le_at + cap-4 + zero-fill (1.6.1e) 2026-05-11 02:29:44 +02:00
Cargo.toml feat(codegen): C/C++ migrate to encode_cdr2_le_at + cap-4 + zero-fill (1.6.1e) 2026-05-11 02:29:44 +02:00
CODE_OF_CONDUCT.md HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
CONTRIBUTING.md HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
hdds-logo.png HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
LICENSE HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
LICENSE-APACHE HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
LICENSE-MIT HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
Makefile feat(c): support @optional fields + cross-language roundtrip test 2026-03-08 02:47:53 +01:00
README.md HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
SECURITY.md HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00
test_all_idl.sh HDDS v1.0.8 -- initial public release 2026-02-27 17:13:00 +01:00


# hdds_gen


High-assurance OMG IDL 4.2 parser and multi-language code generator for DDS (Data Distribution Service) applications.

Overview

hdds_gen is a Rust-based toolchain that parses OMG IDL 4.2 files and generates serialization code for multiple target languages. It provides CDR2 (Common Data Representation version 2) encode/decode implementations suitable for DDS middleware interoperability.

Key capabilities:

  • Full OMG IDL 4.2 parser with preprocessor support
  • Code generation for 6 target backends
  • Semantic validation with detailed diagnostics
  • Pretty-printer for IDL formatting
  • CLI tool with subcommands for parsing, generation, validation, and formatting

Supported IDL Types

Primitive Types

IDL Type                         Description
boolean                          Boolean value
char, wchar                      8-bit and wide characters
octet                            8-bit unsigned
short, unsigned short            16-bit signed/unsigned
long, unsigned long              32-bit signed/unsigned
long long, unsigned long long    64-bit signed/unsigned
float, double, long double       Floating-point types
string, wstring                  Unbounded strings
string<N>, wstring<N>            Bounded strings
int8, int16, int32, int64        Fixed-width signed integers
uint8, uint16, uint32, uint64    Fixed-width unsigned integers
fixed<D,S>                       Fixed-point decimal (D digits, S scale)
void                             Void type (for operations)

Constructed Types

Type                   Description
struct                 Aggregated data structure with optional inheritance
enum                   Enumeration with optional explicit values
union                  Discriminated union with case labels
typedef                Type alias with annotation support
bitset                 Packed bit fields with explicit widths
bitmask                Named flag constants
const                  Constant definitions
module                 Namespace scoping
forward declaration    Forward struct/union declarations
@annotation            Custom annotation declarations

Container Types

Type              Description
sequence<T>       Unbounded sequence
sequence<T, N>    Bounded sequence (max N elements)
T[N]              Fixed-size array
map<K, V>         Unbounded map
map<K, V, N>      Bounded map (max N entries)

Interfaces (Feature-Gated)

With --features interfaces:

Type            Description
interface       Interface with operations and attributes
exception       Exception type declarations
oneway          One-way operations
in/out/inout    Parameter direction qualifiers
raises          Exception specifications

Supported Annotations

DDS/XTYPES Standard Annotations

Annotation                                  Target            Description
@key                                        Field             Marks field as part of topic key
@optional                                   Field             Field may be absent
@id(N)                                      Field             Explicit member ID
@autoid(SEQUENTIAL|HASH)                    Type              Auto-generate member IDs
@extensibility(FINAL|APPENDABLE|MUTABLE)    Type              Type evolution policy
@final                                      Type              Shorthand for FINAL extensibility
@appendable                                 Type              Shorthand for APPENDABLE extensibility
@mutable                                    Type              Shorthand for MUTABLE extensibility
@must_understand                            Field             Reader must understand this field
@nested                                     Type              Nested type (no topic)
@external                                   Field             External reference
@default_literal                            Enum              Default discriminator value
@default                                    Union case        Default union case
@position(N)                                Bitset/Bitmask    Explicit bit position
@bit_bound(N)                               Enum/Bitmask      Maximum bit width
@data_representation(XCDR1|XCDR2)           Type              Wire format selection
@non_serialized                             Field             Exclude from serialization

Documentation Annotations

Annotation              Description
@unit("...")            Unit of measurement
@min(N)                 Minimum value constraint
@max(N)                 Maximum value constraint
@range(min=N, max=M)    Value range constraint
@value(...)             Default value
@verbatim(...)          Language-specific code injection

Interface Annotations

Annotation    Description
@service      Mark interface as service
@oneway       One-way operation (no reply)
@ami          Asynchronous method invocation

Custom Annotations

User-defined annotations via @annotation declarations with typed members and default values.

Code Generation Backends

Rust (rust)

  • Idiomatic Rust structs with #[derive(Debug, Clone, PartialEq)]
  • CDR2 serialization via Cdr2Encode / Cdr2Decode traits
  • Option<T> for @optional fields
  • Vec<T> for sequences, HashMap<K,V> for maps
  • Module namespacing preserved
  • PL-CDR2 support for mutable/appendable types

C++ (cpp)

  • C++17 compatible headers
  • STL containers (std::vector, std::map, std::array, std::string)
  • Inline CDR2 encode/decode methods
  • Namespace wrapping via --namespace-cpp
  • Compatible with FastDDS, Cyclone DDS, RTI Connext

C (c)

  • C99/C11 compatible header-only output
  • Static inline encode/decode functions
  • Struct definitions with explicit padding
  • Type descriptors for runtime introspection

Python (python)

  • Python 3.7+ with @dataclass decorators
  • Type hints via typing module
  • IntEnum for enumerations
  • CDR2 encode_cdr2_le() / decode_cdr2_le() methods
  • compute_key() for @key field hashing

Micro (micro) - no_std Rust

  • #![no_std] compatible for embedded targets
  • Uses heapless::Vec and heapless::String
  • Inline CDR encode/decode (no trait dispatch)
  • Configurable buffer sizes
  • Target: bare-metal Rust with hdds-micro crate

C-Micro (c-micro) - Header-Only C for MCUs

  • C89/C99 compatible, no dynamic allocation
  • Fixed-size buffers with compile-time bounds
  • Target: STM32, AVR, PIC, ESP32, any MCU with C compiler
  • Minimal runtime footprint

CLI Usage

# Install
cargo install --path .

# Parse and validate
hddsgen parse input.idl
hddsgen parse input.idl --pretty      # Pretty-print parsed IDL
hddsgen parse input.idl --json        # JSON diagnostics

# Check (validation only, CI-friendly exit codes)
hddsgen check input.idl
hddsgen check input.idl --json

# Generate code
hddsgen gen rust input.idl -o output.rs
hddsgen gen cpp input.idl -o output.hpp
hddsgen gen c input.idl -o output.h
hddsgen gen python input.idl -o output.py
hddsgen gen micro input.idl -o output.rs
hddsgen gen c-micro input.idl -o output.h

# Generate with namespace (C++)
hddsgen gen cpp input.idl --namespace-cpp MyApp::Types -o output.hpp

# Generate full project with examples
hddsgen gen rust input.idl --example --out-dir ./my_project
hddsgen gen cpp input.idl --example --out-dir ./my_project --build-system cmake

# Format IDL
hddsgen fmt input.idl -o formatted.idl

# Include directories for #include resolution
hddsgen parse main.idl -I ./includes -I /usr/share/idl

Subcommands

Command    Description
parse      Parse and validate IDL, optionally pretty-print
gen        Generate code for target language
check      Validate only (returns non-zero on error)
fmt        Reformat IDL via pretty-printer

Generation Options

Option                   Description
-o, --out <FILE>         Output file (stdout if omitted)
--out-dir <DIR>          Output directory for module files
--namespace-cpp <NS>     C++ namespace (e.g., A::B::C)
--example                Generate full project with publisher/subscriber examples
--build-system <TYPE>    Build system: cargo, cmake, make
--hdds-path <PATH>       Path to hdds crate for Rust examples
-I, --include <DIR>      Include directory for #include resolution

Preprocessor

Full C-style preprocessor with:

  • #include "file.idl" and #include <file.idl>
  • #define NAME value and #define MACRO(args) body
  • #ifdef, #ifndef, #if, #elif, #else, #endif
  • #undef
  • Cycle detection for include guards
  • Macro expansion with function-like macros
  • Token pasting (##) and stringification (#)
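
One common way to detect include cycles is to keep a stack of the files currently being expanded and reject any file that is already on it. The Rust sketch below illustrates that general technique; it is not taken from hddsgen's preprocessor. `check_includes` is a hypothetical name, and the `includes` map is a stand-in for reading `#include` directives from disk.

```rust
use std::collections::HashMap;

/// Walk includes depth-first. Returns `Err(chain)` with the offending
/// include chain when a file ends up including itself, directly or not.
/// `includes` maps a file name to the files it `#include`s.
fn check_includes<'a>(
    file: &'a str,
    includes: &HashMap<&'a str, Vec<&'a str>>,
    stack: &mut Vec<&'a str>,
) -> Result<(), Vec<&'a str>> {
    // A file already on the active stack means we have looped back.
    if stack.contains(&file) {
        let mut chain = stack.clone();
        chain.push(file);
        return Err(chain);
    }
    stack.push(file);
    if let Some(children) = includes.get(file) {
        for &child in children {
            check_includes(child, includes, stack)?;
        }
    }
    // Done with this file: a later sibling may include it again legally.
    stack.pop();
    Ok(())
}
```

Because the file is popped once its includes are processed, a diamond (two files including the same third file) is accepted, while a true cycle is reported with the full chain for diagnostics.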

Validation Rules

The validator enforces semantic correctness:

Struct Rules

  • No duplicate field names
  • Valid type references
  • @key only on serializable fields
  • Extensibility annotation conflicts

Enum Rules

  • No duplicate enumerator names
  • Explicit values within @bit_bound limits

Union Rules

  • No duplicate case labels
  • At most one default case
  • Valid discriminator type
  • @default annotation consistency

Bitset Rules

  • Bit positions must not overlap
  • Total width within bounds
  • Valid @position annotations
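
The overlap rule amounts to an interval-intersection check on bit ranges. A minimal Rust sketch is shown below; `BitField` and `find_overlap` are hypothetical names and do not reflect the validator's actual types.

```rust
/// Hypothetical view of a bitset field once its position is resolved.
struct BitField {
    name: &'static str,
    position: u32, // index of the first bit
    width: u32,    // number of bits
}

/// Return the names of the first pair of fields whose bit ranges intersect.
fn find_overlap(fields: &[BitField]) -> Option<(&'static str, &'static str)> {
    for (i, a) in fields.iter().enumerate() {
        for b in &fields[i + 1..] {
            // Half-open ranges [position, position + width) intersect
            // exactly when each one starts before the other ends.
            if a.position < b.position + b.width && b.position < a.position + a.width {
                return Some((a.name, b.name));
            }
        }
    }
    None
}
```

For a 3-bit field at position 0 and a 5-bit field at position 4, the ranges are bits 0 to 2 and bits 4 to 8, so no overlap is reported. Moving the second field to position 2 makes both claim bit 2, and the pair is returned.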

Interface Rules (with feature)

  • No duplicate operation names
  • No operation/attribute name collisions
  • Valid parameter types
  • raises references existing exceptions
  • oneway operations must return void

Custom Annotations

  • Parameters match declared annotation members
  • Required parameters provided

Project Architecture

src/
  lib.rs              # Public API exports
  ast.rs              # Abstract Syntax Tree types
  types.rs            # IDL type system (primitives, annotations)
  token.rs            # Lexer token definitions
  error.rs            # Error types and handling

  lexer/              # Lexical analysis
    mod.rs            # Lexer entry point
    scanner.rs        # Character scanning
    numbers.rs        # Numeric literal parsing
    state.rs          # Lexer state machine

  parser/             # Syntax analysis
    mod.rs            # Parser entry point
    annotations.rs    # Annotation parsing
    const_expr.rs     # Constant expression evaluation
    interfaces.rs     # Interface parsing (feature-gated)
    types.rs          # Type parsing
    definitions/      # Definition parsers
      structs.rs
      enums.rs
      unions.rs
      bitsets.rs
      bitmasks.rs
      typedefs.rs
      consts.rs
      module.rs
      forwards.rs

  validate/           # Semantic validation
    mod.rs            # Validation entry point
    engine.rs         # Validation orchestration
    rules/            # Validation rules
      structs.rs
      enums.rs
      unions.rs
      bitsets.rs
      interfaces.rs
    diagnostics.rs    # Diagnostic types
    references.rs     # Reference resolution

  codegen/            # Code generation
    mod.rs            # Backend trait and registry
    rust_backend/     # Rust code generator
    cpp/              # C++ code generator
    c/                # C code generator
    python.rs         # Python code generator
    micro/            # no_std Rust generator
    c_micro/          # Header-only C for MCUs
    examples.rs       # Example code generation
    examples_project.rs  # Full project scaffolding

  pretty/             # Pretty-printer
    mod.rs            # Formatter entry point
    formatter.rs      # Core formatting logic
    structs.rs        # Struct formatting
    enums.rs          # Enum formatting
    unions.rs         # Union formatting
    bitsets.rs        # Bitset formatting
    modules.rs        # Module formatting
    interfaces.rs     # Interface formatting (feature-gated)

  bin/
    hddsgen.rs        # CLI entry point
    hddsgen/
      commands.rs     # Subcommand implementations
      preprocessor.rs # Preprocessor implementation

tests/                # Integration tests
examples/             # Example IDL files
  canonical/          # Reference test cases
  invalid/            # Expected-failure cases
  include/            # Include resolution tests
  macros/             # Preprocessor tests
  interfaces/         # Interface feature tests

Build and Test

# Build
make build              # Debug build
make release            # Release build

# Test
make test               # Unit tests
make validate-ci        # Full CI validation suite

# Code quality
make fmt                # Format code
make clippy             # Run linter
make doc                # Generate documentation

# Install
make install            # Install to ~/.cargo/bin

Feature Flags

Feature       Description
interfaces    Enable interface/exception parsing and pretty-printing

# Build with interfaces support
cargo build --features interfaces

Examples

Basic IDL

@extensibility(APPENDABLE)
struct HelloWorld {
    unsigned long index;
    string message;
};

Advanced IDL

module Comp {
    @appendable
    struct Msg {
        @key int32 id;
        @optional string content;
        string<16> name;
        sequence<int32, 10> values;
    };

    enum Color { Red = 0, Green = 1, Blue = 2 };

    typedef map<string, int32, 100> ConfigMap;

    bitset Flags {
        bitfield<3> mode;
        bitfield<5> value, @position(4);
    };

    bitmask Permissions { Read, Write, Execute };

    union Data switch(int32) {
        case 1: int32 integer;
        default: octet raw;
    };

    const int32 MAGIC = 42;
};

Generated Rust Usage

use hdds::{Cdr2Encode, Cdr2Decode};

let msg = Comp::Msg {
    id: 1,
    content: Some("Hello".to_string()),
    name: "test".to_string(),
    values: vec![1, 2, 3],
};

let mut buffer = [0u8; 256];
let size = msg.encode_cdr2_le(&mut buffer)?;
let (decoded, _) = Comp::Msg::decode_cdr2_le(&buffer)?;
assert_eq!(msg, decoded);

Statistics

  • ~22,000 lines of Rust code
  • 6 code generation backends
  • 60+ example IDL files
  • Comprehensive validation suite

License

Licensed under either of:

  • Apache License (LICENSE-APACHE)
  • MIT License (LICENSE-MIT)

at your option.

Copyright (c) 2025-2026 naskel.com

Repository

https://git.hdds.io/hdds/hdds_gen.git