Derive Macro xso::FromXml

#[derive(FromXml)]
{
    // Attributes available to this derive:
    #[xml]
}

§Make a struct or enum parseable from XML

This derives the FromXml trait. This trait (like IntoXml and DynNamespace) can be derived on structs and enums.

Each Rust item (struct or enum) is mapped to an XML subtree, starting at the header of an element. In order to fully describe this element, additional information (“metadata”, or “meta” for short) needs to be added to the Rust item and its members.

For this, we use Rust attributes (similarly to how serde operates). Because the “attribute” word is also used in the XML context, we will instead use the term “meta”, which is similar to the terminology used in the Rust language reference itself.

§Examples

static MY_NAMESPACE: &'static str = "urn:uuid:55c56882-3915-49de-a7ee-fd672d7a85cf";

#[derive(FromXml)]
#[xml(namespace = MY_NAMESPACE, name = "foo")]
struct Foo;

// parses <foo xmlns="urn:uuid:55c56882-3915-49de-a7ee-fd672d7a85cf"/>

§Field order

Field order matters. The fields are parsed in the order they are declared (for children, anyway). If multiple fields match a given child element, the first field which matches will be taken. The only exception is #[xml(elements)] (without further arguments), which is always processed last.

When XML is generated from a struct, the child elements are also generated in the order of the fields. That means that passing an XML element through FromXml and IntoXml may re-order some child elements.

Sorting order between elements which match the same field is generally preserved, if the container preserves sort order on insertion.
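
The effect of the container on ordering can be illustrated without the derive at all. The following is a minimal std-only sketch, assuming that collection fields are filled item by item through Default and Extend (the bounds listed for children further below); collect_children is a hypothetical stand-in for the generated code, not part of this crate.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for how parsed children might be accumulated:
/// start from `Default` and add items one at a time via `Extend`.
fn collect_children<C, T>(parsed: impl IntoIterator<Item = T>) -> C
where
    C: Default + Extend<T> + IntoIterator<Item = T>,
{
    let mut container = C::default();
    for item in parsed {
        container.extend(std::iter::once(item));
    }
    container
}

fn main() {
    // Children in the order they appear in the document.
    let document_order = ["b", "c", "a"];

    // Vec keeps insertion order, so document order is preserved.
    let as_vec: Vec<&str> = collect_children(document_order);
    assert_eq!(as_vec, ["b", "c", "a"]);

    // BTreeSet sorts its items, so document order is lost.
    let as_set: BTreeSet<&str> = collect_children(document_order);
    assert_eq!(as_set.into_iter().collect::<Vec<_>>(), ["a", "b", "c"]);
}
```

A Vec field therefore round-trips sibling order within that field, while a sorted or hashed container does not.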

§Meta types

The values accepted by the Rust attributes (we call them “meta” here to disambiguate between Rust and XML attributes) used by this crate may be of one of the following types:

  • path: A path is a (qualified) Rust name, such as foo::bar, ::std::vec::Vec, or just xyz.
  • string literal: A string literal, like "Hello World!".
  • flag: In that case, the meta does not accept a value. Its mere presence changes the behavior and it must not be followed by =.
  • ident or identifier: A Rust identifier (such as Foo or bar), without path delimiters (::).
  • nested: The meta is followed by more arguments wrapped in ().

§Struct, enum and enum variant metadata

These meta are available on structs, enums and/or enum variants. Almost all meta listed here are available on structs. Some are only available on enums, others are only available on enum variants.

Meta                 | Type           | Available on       | Short description
namespace            | path or dyn    | see below          | Specifies the XML namespace of the struct.
name                 | string literal | see below          | Specifies the XML name of the struct.
attribute            | string literal | see below          | Specifies the XML attribute to match (enums only).
value                | string literal | see below          | Specifies the XML attribute’s value to match (enum variants only).
validate             | path           | enums              | Function to allow validation or postprocessing of the item after parsing.
prepare              | path           | enums              | Function to allow preprocessing of the item before serialisation.
on_unknown_attribute | ident          | variants           | UnknownAttributePolicy variant to use
on_unknown_child     | ident          | variants           | UnknownChildPolicy variant to use
transparent          | flag           | variants           | Transparently wrap the inner item.
element              | flag or nested | variants           | Transparently parse a minidom::Element
wrapped_with         | nested         | enums              | Add an XML wrapper around the item
fallback             | flag           | enum variants only | Mark the enum variant as fallback variant
exhaustive           | flag           | enums only         | Mark the enum as exhaustive
normalize_with       | path           | enums only         | Preprocess an attribute value before matching it

§Struct (de-)serialisation

Structs can be laid out in three different ways:

  • Normal struct: The XML namespace and name are fixed. The namespace and name meta must be set on the struct, and transparent or element cannot be set.

    The struct can have any contents, provided they are convertible to/from XML or have been marked as to be ignored.

  • Transparent struct: The struct must have exactly one member and that member must be unnamed and it must implement FromXml and/or IntoXml. Such a struct has the transparent flag set and must not have namespace, name or element metas.

    The struct will be parsed and serialised using the member’s implementations, but wrapped in the struct on the Rust level. The XML representation does not change.

  • Element struct: The struct must have exactly one member and that member must be unnamed and must be of type minidom::Element. Such a struct has the element meta and must not have namespace, name or transparent metas.

    This struct may accept any XML subtree, provided its element header matches the additional optional selector in the element meta.

§Enum (de-)serialisation

For enums, there are three different modes for matching a given XML element against the enum:

  • Name matched: The XML namespace is fixed and the XML name determines the enum variant. The namespace meta is required on the enum and variants must have the name meta set.

  • Attribute matched: The XML namespace and name are fixed. A specific XML attribute must exist on the element and its value determines the variant. The namespace, name and attribute meta must be set on the enum, and the value meta must be set on each variant.

  • Fully dynamic: Each variant is matched separately. Each variant has almost the same behaviour as structs, with the notable difference that prepare, validate and wrapped_with are only available on the enum itself.

§Item meta reference

§namespace meta (on items)

The namespace meta controls the XML namespace of the XML element representing a Rust struct, enum, or enum variant. It may be specified in one of two ways:

  • A path which refers to a &'static str static which contains the namespace URI to match.

  • The keyword dyn: The XML namespace is matched using a member field via the DynNamespaceEnum trait. For this to work, exactly one field must be marked with the namespace meta and have a type implementing DynNamespaceEnum.

§name meta

The name meta controls the XML name of the XML element representing a Rust struct, enum, or enum variant. It must be specified as string literal.

§attribute meta (on items)

The attribute meta controls the XML attribute used for XML attribute value matched enums. The attribute name must be specified as string literal.

Presence of this meta requires presence of the namespace and name metas and is only allowed on enums.

§value meta

The value meta controls the value of the XML attribute matched for XML attribute value matched enums. It is only allowed on enum variants inside enums which have the attribute meta.

§validate meta

The validate meta is optional and allows specifying the name of a function which is called after successful deserialisation. That function must have the signature fn(&mut T) -> Result<(), xso::error::Error> where T is the type of the Rust enum or struct on which this meta is declared.
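
As an illustration of that signature, here is a minimal sketch of a validation function. The Range type and check_range function are hypothetical, and the local Error type is only a stand-in for xso::error::Error so that the snippet compiles on its own; in real code the struct would carry the derive and a validate = check_range meta.

```rust
/// Stand-in for `xso::error::Error`, defined locally so that this sketch
/// compiles without the crate. Real code would use the crate's error type.
#[derive(Debug, PartialEq)]
struct Error(String);

// Hypothetical item. With the derive, it would additionally carry
// `#[xml(.., validate = check_range)]`.
#[derive(Debug)]
struct Range {
    min: u32,
    max: u32,
}

/// Called after successful deserialisation. It receives `&mut T`, so it
/// may either reject the value or adjust it in place.
fn check_range(range: &mut Range) -> Result<(), Error> {
    if range.min > range.max {
        return Err(Error(format!(
            "min {} exceeds max {}",
            range.min, range.max
        )));
    }
    Ok(())
}

fn main() {
    let mut ok = Range { min: 1, max: 5 };
    assert!(check_range(&mut ok).is_ok());

    let mut bad = Range { min: 9, max: 5 };
    assert!(check_range(&mut bad).is_err());
}
```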

§prepare meta

The prepare meta is optional and allows specifying the name of a function which is called before serialisation. That function must have the signature fn(&mut T) -> () where T is the type of the Rust enum or struct on which this meta is declared.
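
A function of that shape could, for instance, bring a value into canonical form before it is written out. The sketch below is an assumption-laden illustration: the Tags type and sort_tags function are hypothetical, and the meta that would reference the function is shown only as a comment.

```rust
// Hypothetical item. With the derive, it would additionally carry
// `#[xml(.., prepare = sort_tags)]`.
struct Tags {
    values: Vec<String>,
}

/// Called before serialisation: sort the tags and drop duplicates so that
/// the generated XML is stable.
fn sort_tags(tags: &mut Tags) {
    tags.values.sort();
    tags.values.dedup();
}

fn main() {
    let mut tags = Tags {
        values: vec!["b".to_string(), "a".to_string(), "b".to_string()],
    };
    sort_tags(&mut tags);
    assert_eq!(tags.values, ["a", "b"]);
}
```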

§on_unknown_attribute meta

This meta controls how unknown attributes are handled. If set, it must be the identifier of an UnknownAttributePolicy enum variant.

See the documentation of UnknownAttributePolicy for available variants and their effects.

Note that this policy only affects the processing of attributes of the item it is declared on; it has no effect on the handling of unexpected attributes on children, which is controlled by the respective item’s policy.

§on_unknown_child meta

This meta controls how unknown child elements are handled. If set, it must be the identifier of an UnknownChildPolicy enum variant.

See the documentation of UnknownChildPolicy for available variants and their effects.

Note that this policy only affects the processing of direct children of the item it is declared on; it has no effect on the handling of unexpected grandchildren, which is controlled by the respective item’s policy.

§transparent meta

If present, it switches the struct or enum variant into transparent mode.

§element meta (on items)

If present, it switches the struct or enum variant into element mode.

This meta can either be used standalone, or it may contain additional arguments in key = value syntax. The following keys are supported, all of them accepting only string literals:

  • namespace: Restrict the matched elements to XML elements with the given namespace URI.

  • name: Restrict the matched elements to XML elements with the given name.

Note that both namespace and name are optional and can be used separately. For example, the following is valid (albeit weird):

#[derive(FromXml, IntoXml)]
#[xml(element(name = "foo"))]
struct Foo(minidom::Element);

And it matches all XML elements (no matter the namespace) which have the local name foo.

§wrapped_with meta

If present, it wraps the item in an additional XML element, both when parsing and serialising. The XML element is specified by additional arguments to the wrapped_with meta which must be specified as comma-separated key = value pairs inside parentheses:

  • namespace: Sets the XML namespace URI of the wrapping element.
  • name: Sets the XML name of the wrapping element.

Both are required. Example:

#[derive(FromXml, IntoXml)]
#[xml(namespace = "uri:foo", name = "baz", wrapped_with(namespace = "uri:foo", name = "bar"))]
struct Foo();

This would match (and generate) <bar xmlns="uri:foo"><baz/></bar>.

§exhaustive meta

If present, the enum considers itself authoritative. That means that if the XML name (for name matched enums) or attribute value (for attribute matched enums) does not match any of the variants, a hard parse error is emitted.

By contrast, if the meta is not present, in such a situation, parsing may continue with other attempts (e.g. if the enum is itself a member of a dynamic enum).

This meta can only be used on name matched or attribute matched enums. It cannot be used on structs or dynamically matched enums.

§fallback meta

If set on an enum variant and an unexpected XML name (for name matched enums) or attribute value (for attribute matched enums) is encountered, this variant is assumed.

This meta can only be used on variants inside name matched or attribute matched enums. It may only be present on one variant.

§normalize_with meta

This meta can only be used on XML attribute matched enums.

The normalize_with meta may be set to the path referring to a function. This function will be called on the attribute value before it is compared against the values of the enum variants.

The function must have the signature fn(&str) -> Cow<'_, str>.
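
The Cow return type lets the function avoid allocating when the value is already in normal form. A minimal sketch of such a function follows; the name normalize_case and the choice of ASCII lowercasing are assumptions made for illustration only.

```rust
use std::borrow::Cow;

/// Hypothetical normalisation function: compare attribute values
/// case-insensitively by lowercasing them before matching.
fn normalize_case(value: &str) -> Cow<'_, str> {
    if value.chars().any(|c| c.is_ascii_uppercase()) {
        // Something needs to change, so allocate a new string.
        Cow::Owned(value.to_ascii_lowercase())
    } else {
        // Already normalised: hand back the input without copying.
        Cow::Borrowed(value)
    }
}

fn main() {
    assert_eq!(normalize_case("GET"), "get");
    assert!(matches!(normalize_case("set"), Cow::Borrowed(_)));
}
```

With such a function set as normalize_with, the value metas on the variants would be written in lowercase.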

§Field metadata

Meta      | XML representation
attribute | Attribute
child     | Child element, processed with FromXml/IntoXml
children  | Collection of child elements, processed with FromXml/IntoXml
element   | Child element, as minidom::Element
elements  | Collection of elements, as minidom::Element
text      | Character data (text)
namespace | Parent element’s XML namespace
ignore    | none

§Field metadata reference

§attribute meta (on fields)
§child meta
§children meta
§element meta (on fields)
§elements meta
§text meta
§namespace meta (on fields)
§ignore meta

Field attributes are composed of a field kind, followed by a value or a list of attributes. Examples:

#[xml(attribute)]
#[xml(attribute = "foo")]
#[xml(attribute(name = "foo"))]

If the kind = .. syntax is allowed, the attribute which is specified that way will be marked as default.

The following field kinds are available:

  • attribute, attribute = name, attribute(..): Extract a string from an XML attribute. The field type must implement FromOptionalXmlText (for FromXml) or IntoOptionalXmlText (for IntoXml), unless the codec option is set.

    • name = .. (default): The XML name of the attribute. If this is not set, the field’s identifier is used.

    • namespace = ..: The XML namespace of the attribute. This is optional, and if absent, only unnamespaced attributes are considered.

    • default, default = ..: If set, a field value is generated if the attribute is not present and FromOptionalXmlText did not create a value from None, instead of failing to parse. If the optional argument is present, it must be the path to a callable which returns the field’s type. Otherwise, std::default::Default::default is used.

    • codec = ..: Path to a type implementing TextCodec to use instead of the FromOptionalXmlText / IntoOptionalXmlText implementation of the field’s type.

      If set, you need to explicitly add the default flag to fields of type Option<_>, because the default option logic of FromOptionalXmlText is not present.

  • child(.., extract(..)): Extract data from a child element.

    • name = .. (required): The XML name of the child to match.

    • namespace = .. (required): The XML namespace of the child to match. This can be one of the following:

      • A path referring to a &'static str static which contains the namespace URI of the XML element represented by the struct.
      • super: Only usable inside compounds with #[xml(namespace = dyn)]. Using #[xml(namespace = super)] on an extracted field makes the field’s child use the dynamically determined namespace of the parent (both during serialisation and during deserialisation).
    • extract(..) (required): Specification of data to extract. See below for options.

    • skip_if: If set, this must be the path to a callable. That callable is invoked with a reference to the field’s type at serialisation time. If the callable returns true, the field is omitted from the output completely.

      This should often be combined with default.

    • default, default = ..: If set, a field value is generated if the child is not present instead of failing to parse. If the optional argument is present, it must be the path to a callable which returns the field’s type. Otherwise, std::default::Default::default is used.

      Note: When using extract(..), this is required even when the field’s type is Option<..>.

  • child, child(..) (without extract(..)): Extract an entire child element. The field type must implement FromXml (for FromXml) or IntoXml (for IntoXml).

    • namespace = super: If set, the field must also implement DynNamespace and the compound the field is in must be set to be namespace = dyn. In this case, the field’s child is forced to be in the same namespace as the parent during parsing.

    • skip_if: If set, this must be the path to a callable. That callable is invoked with a reference to the field’s type at serialisation time. If the callable returns true, the field is omitted from the output completely.

      This should often be combined with default.

    • default, default = ..: If set, a field value is generated if the child is not present instead of failing to parse. If the optional argument is present, it must be the path to a callable which returns the field’s type. Otherwise, std::default::Default::default is used.

      Note: When using extract(..), this is required even when the field’s type is Option<..>.

    Aside from namespace = super, matching of the XML namespace / name is completely delegated to the FromXml implementation of the field’s type and thus the namespace and name attributes are not allowed.

  • children(.., extract(..)): Like child(.., extract(..)), with the following differences:

    • More than one clause inside extract(..) is allowed.

    • More than one matching child is allowed.

    • The field type must implement Default, Extend<T> and IntoIterator<Item = T>.

      T must be a tuple type matching the types provided by the extracted parts.

    • Extracts must specify their type, because it cannot be inferred through the collection.

  • children, children(..) (without extract(..)): Extract zero or more entire child elements. The field type must implement Default, Extend<T> and IntoIterator<Item = T>, where T implements FromXml (and IntoXml for IntoXml).

    • skip_if: If set, this must be the path to a callable. That callable is invoked with a reference to the field’s type at serialisation time. If the callable returns true, the field is omitted from the output completely.

      This should often be combined with default.

    The namespace and name to match are determined by the field type, thus it is not allowed to specify them here. namespace = super is not supported.

  • text, text(..): Extract the element’s text contents. The field type must implement FromXmlText (for FromXml) or IntoXmlText (for IntoXml), unless the codec option is set.

  • element, element(..): Collect a single element as minidom::Element instead of attempting to destructure it. The field type must implement From<Element> and Into<Option<Element>> (minidom::Element implements both).

    • name = .. (optional): The XML name of the element to match.
    • namespace = .. (optional): The XML namespace of the element to match.
    • default, default = ..: If set, a field value is generated if the child is not present instead of failing to parse. If the optional argument is present, it must be the path to a callable which returns the field’s type. Otherwise, std::default::Default::default is used.

    If the field converts into None when invoking Into<Option<Element>>, the element is omitted from the output altogether.

  • elements(..): Collect otherwise unknown children as minidom::Element.

    • namespace = ..: The XML namespace of the element to match.
    • name = .. (optional): The XML name of the element to match. If omitted, all elements from the given namespace are collected.
  • elements: Collect all unknown children as minidom::Element. The field type must be Vec<Element>.

  • namespace: Represent the parent struct/enum variant’s XML namespace. This requires that the compound is declared with #[xml(namespace = dyn)]. The field type must implement DynNamespaceEnum.

  • ignore: The field is not considered during parsing or serialisation. The type must implement Default.
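
The skip_if and default callables mentioned in the list above are ordinary functions: skip_if takes a reference to the field’s type and returns a bool, and default takes no arguments and returns the field’s type. The following std-only sketch shows a matching pair; is_zero and default_priority are hypothetical names, and whether a method path such as Vec::is_empty is accepted by the derive is an assumption.

```rust
/// Candidate for `skip_if = is_zero`: receives a reference to the field's
/// value and returns true if the field should be omitted from the output.
fn is_zero(priority: &u8) -> bool {
    *priority == 0
}

/// Candidate for `default = default_priority`: produces the field value
/// when the attribute or child is absent from the input.
fn default_priority() -> u8 {
    0
}

fn main() {
    // The pair is consistent: the value produced for a missing field is
    // exactly the value which is skipped on output.
    assert!(is_zero(&default_priority()));
    assert!(!is_zero(&3));

    // Existing std methods can have the right shape too (assumption: the
    // derive accepts any path to a callable of this form).
    let skip_empty: fn(&Vec<String>) -> bool = Vec::is_empty;
    assert!(skip_empty(&Vec::new()));
}
```

Keeping the two callables in agreement is what makes combining skip_if with default safe: a field omitted on output parses back to the same value.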

§Extraction specification

Inside extract(..), there must be a list of annotations as used on fields. Any annotation which can be used on a field can also be used here. Because no inference on the field type is possible here, the following field kinds support an additional, optional type argument:

  • attribute
  • text

If the extract(..) contains exactly one part and the type of the extract is not specified on that one part, it is assumed to be equal to the type of the field the extract is used on.

Otherwise, the default is String, which is not going to work in many cases. This limitation could be lifted, but we need a use case for it first :). So if you run into this, please file an issue!

Derive macro for FromXml.