spice-common/docs/spice_protocol.txt

Spice protocol format file
==========================

Copyright (C) 2016 Red Hat, Inc.
Licensed under a Creative Commons Attribution-Share Alike 3.0
United States License (see http://creativecommons.org/licenses/by-sa/3.0/us/legalcode).

Basic
-----
The spice protocol format file defines the network protocol used by spice.
Its syntax resembles C.

    file ::= <definitions> <protocol> ;
    definitions ::= <definition>|<definitions><definition> ;
    definition ::= <typedef>|<struct>|<enum>|<flag>|<message>|<channel> ;
    protocol ::= "protocol" <identifier> "{" <protocol_channels> "}" ";" ;
    protocol_channels ::= <protocol_channel>|<protocol_channels><protocol_channel> ;
    protocol_channel ::= <identifier> <identifier> [ "=" <integer> ] ";" ;
    integer ::= <hex>|<dec> ;
    dec ::= [+-]?[0-9]+ ;
    hex ::= "0x" [0-9a-fA-F]+ ;
    identifier ::= [a-zA-Z][a-zA-Z0-9_]* ;

(The grammar is written in BNF, with regular expressions for some terminals.)
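
As a rough illustration of the `integer` and `identifier` rules, the following Python sketch classifies tokens with regular expressions. The names `parse_integer` and `is_identifier` are hypothetical and are not part of the spice code generator. The patterns accept upper-case letters and an optional sign, which is an assumption based on the examples in this document (such as `ExampleChannel` and `1001`).

```python
import re

# Assumed patterns: case-insensitive, sign optional (inferred from the examples).
HEX_RE = re.compile(r"0x[0-9a-f]+", re.IGNORECASE)
DEC_RE = re.compile(r"[+-]?[0-9]+")
IDENT_RE = re.compile(r"[a-z][a-z0-9_]*", re.IGNORECASE)


def parse_integer(token):
    """Return the value of a hex or decimal literal, or raise ValueError."""
    if HEX_RE.fullmatch(token):
        return int(token, 16)
    if DEC_RE.fullmatch(token):
        return int(token, 10)
    raise ValueError("not an integer literal: %r" % token)


def is_identifier(token):
    """True if the whole token matches the identifier pattern."""
    return IDENT_RE.fullmatch(token) is not None
```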

The file is used to automatically generate the code that marshals and demarshals the network data.

Example:

    channel ExampleChannel {
       message {
          uint32 dummy;
       } Dummy;
    };

    protocol Example {
        ExampleChannel first = 1001;
    };

As you can see, braces are used as in C and structures look like C ones, but
keywords such as `channel`, `protocol` and `message`, and predefined types
such as `uint32`, are specific to the protocol format.

Comments
--------
Both C and C++ style comments are supported:

    // this is a comment
    /* this is a comment too
       but can be split in multiple lines */

Base types
----------

All integer sizes from 8 to 64 bits (8, 16, 32 and 64) are supported, either signed or unsigned.
You can also pass a Unix file descriptor.

    base_type ::= "int8"|"uint8"|"int16"|"uint16"|"int32"|"uint32"|"int64"|"uint64"|"unix_fd" ;

Example:

    int16 x;
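
To make the sizes concrete, here is a minimal Python sketch that maps the integer base types to `struct` format codes. The table `BASE_TYPES` and the helpers `encode` and `wire_size` are hypothetical names used only for illustration. Little-endian byte order is taken from the pointer example later in this document. `unix_fd` is left out because this document does not describe how it is transmitted.

```python
import struct

# struct format codes for the integer base types; "<" selects little endian
# and disables padding.
BASE_TYPES = {
    "int8": "b", "uint8": "B",
    "int16": "h", "uint16": "H",
    "int32": "i", "uint32": "I",
    "int64": "q", "uint64": "Q",
}


def encode(type_name, value):
    """Encode one value of the given base type as little-endian bytes."""
    return struct.pack("<" + BASE_TYPES[type_name], value)


def wire_size(type_name):
    """Number of bytes the base type occupies in the stream."""
    return struct.calcsize("<" + BASE_TYPES[type_name])
```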

Enumerations and flags
----------------------

It's possible to specify enumerations and flags. The difference is that flag values are powers of 2
and can be combined. Enumerations and flags must have a size (`8`, `16` or `32`) specified.

    enum ::= <enum_type> "{" [ <enumflag_items> ] "}" <attributes> ";" ;
    flag ::= <flag_type> "{" [ <enumflag_items> ] "}" <attributes> ";" ;
    enum_type ::= "enum8"|"enum16"|"enum32" ;
    flag_type ::= "flag8"|"flag16"|"flag32" ;
    enumflag_items ::= <enumflag_item>|<enumflag_items><enumflag_item> ;
    enumflag_item ::= <enum_name> [ "=" <integer> ] [ "," ] ;
    enum_name ::= [a-zA-Z0-9_]* ;

Example:

    enum16 Level {
        LOW = 0x100,
        MEDIUM,
        HIGH = 0x1000
    };
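
The difference between enumerations and flags can be sketched in Python with the standard `enum` module. This only models the values, not the generated code. The `Access` flag type is hypothetical, and the value of `MEDIUM` assumes that an item without an explicit value takes the previous value plus one, as in C.

```python
import enum


class Level(enum.IntEnum):
    """Mirror of the `enum16 Level` example."""
    LOW = 0x100
    MEDIUM = 0x101  # assumption: one more than the previous item, as in C
    HIGH = 0x1000


class Access(enum.IntFlag):
    """Hypothetical flag type: every item is a distinct power of 2."""
    READ = 1
    WRITE = 2
    EXECUTE = 4


def fits(value, bits):
    """True if value can be stored in an unsigned field of the given size."""
    return 0 <= int(value) < (1 << bits)


# Flags can be combined; enumeration values are mutually exclusive.
combined = Access.READ | Access.EXECUTE
```

Here `fits(Level.HIGH, 16)` holds while `fits(Level.HIGH, 8)` does not, which is why the example needs at least `enum16`.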

Variables
---------

As you may have already noticed, variable declarations are similar to C, but there are
some differences.

    variable ::= <type> [ "*" ] <identifier> [ "[" <array_size> "]" ] <attributes>;

The `*` specifies a pointer. This is quite different from C: in the protocol stream
a relative offset is stored in place of the field, and it points to the variable's data,
which is usually placed after all defined fields. This also applies to arrays, so for instance

    int32 *n;

containing the value 0x12345678 could end up encoded as

    04 00 00 00 // 4 as offset
    78 56 34 12 // `n`

(little endian). While an array of 2 items defined as

    int32 *n[2];

and containing 0x12345678 and 0x09abcdef could end up encoded as

    04 00 00 00 // 4 as offset
    78 56 34 12 // `n`[0]
    ef cd ab 09 // `n`[1]

Note that `int32 *n[2]` defines a pointer to an array of 2 items and not
an array of pointers as in C.
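
The two byte dumps above can be reproduced with a short Python sketch. It assumes that the pointer field is the only field of the message, that the offset is a 32-bit little-endian value counted from the start of the message, and that the data follows immediately. The function name `encode_int32_pointer` is hypothetical.

```python
import struct


def encode_int32_pointer(values):
    """Encode `int32 *n` (one value) or `int32 *n[k]` (k values).

    Assumption: a 4-byte little-endian offset, measured from the start of
    the message, followed directly by the pointed-to int32 data.
    """
    offset = 4  # the data starts right after the 4-byte offset field
    data = struct.pack("<%di" % len(values), *values)
    return struct.pack("<I", offset) + data
```

Calling it with `[0x12345678]` gives the first dump, and with `[0x12345678, 0x09abcdef]` the second.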

*WARNING*: You should avoid using pointers in the protocol unless necessary, as they are complicated
to handle without autogenerated code and also use more space on the network.

Arrays
------

As seen above, the easiest way to define an array size is to specify a constant value.
However, there are multiple ways to specify the size:

    array_size ::= <integer>|<identifier>|""|<array_size_image>|<array_size_cstring> ;
    array_size_image ::= "image_size" "(" <integer> "," <identifier> "," <identifier> ")" ;
    array_size_cstring ::= "cstring()" ;

We have already seen the integer form.
Specifying an identifier instead (it should be the name of a variable) indicates that the length is stored
in another field, for instance

    uint8 name_len;
    int8 name[name_len];

allows a name of `name_len` elements.
An empty size means that the array ends where the containing message ends, so if we have

    int8 name[];

and the message is

    66 6f 6f

the name is probably `foo` (66 6f 6f is the ASCII encoding of `foo`).
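
The two array forms described so far, a length taken from another field and an open-ended `[]` array, can be sketched in Python as follows. The function names are hypothetical and the code only illustrates the byte layout; it is not the generated demarshaller.

```python
import struct


def parse_counted_name(data):
    """Parse `uint8 name_len; int8 name[name_len];` from the start of data.

    Returns the name bytes and the number of bytes consumed.
    """
    (name_len,) = struct.unpack_from("<B", data, 0)
    name = data[1:1 + name_len]
    if len(name) != name_len:
        raise ValueError("message too short for name_len=%d" % name_len)
    return name, 1 + name_len


def parse_open_name(data, start=0):
    """Parse `int8 name[];`: everything from start to the end of the message."""
    return data[start:]
```

With the counted form further fields can follow the array, while the open-ended form consumes the rest of the message.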

TODO: what happens with two [] arrays in the same structure?
TODO: can a [] array be placed before the last field, and what happens then?

`image_size` allows specifying an array that holds an image, for instance

    uint16 width;
    uint16 height;
    uint8 raw_image[image_size(8, width, height)];

could contain raw data in `raw_image`. The constant `8` is the bit size of the image.
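
One plausible way to turn the three `image_size` arguments into a byte count is sketched below in Python. It assumes that the first argument is the number of bits per pixel and that every row is padded to a whole number of bytes. This document does not state the exact formula, so treat it as an illustration only.

```python
def image_size(bpp, width, height):
    """Bytes needed for `height` rows of `width` pixels at `bpp` bits each.

    Assumption: each row is rounded up to a whole number of bytes.
    """
    row_bytes = (bpp * width + 7) // 8
    return row_bytes * height
```

For the example above, a 4x3 image at 8 bits needs `image_size(8, 4, 3)` bytes, that is one byte per pixel.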

`cstring` allows specifying a NUL-terminated sequence, so with

    int8 name[cstring()];

and the message as

    66 6f 6f 00

we get the name `foo`. Note that the field does not need to end the message, unlike the `int8 name[]` example.
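
A minimal Python sketch of reading such a field follows. The helper name `parse_cstring` is hypothetical. It returns the position after the terminator, so that further fields can be read from there.

```python
def parse_cstring(data, start=0):
    """Read a NUL-terminated byte sequence beginning at `start`.

    Returns the bytes without the terminator and the offset of the next
    field. Raises ValueError if no terminator is found.
    """
    end = data.index(b"\x00", start)
    return data[start:end], end + 1
```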

Structures
----------

The simplest compound type is the structure. As in C, it is defined as a list of fields (any variable or switch).
However, as this is a protocol definition, there is no alignment or padding and the fields (apart from pointer values) follow one another directly.

    struct ::= "struct" <identifier> "{" [ <fields> ] "}" <attributes> ";" ;
    fields ::= <field>|<fields><field> ;
    field ::= <variable>|<switch> ;

Example:

    struct Point {
        int32 x;
        int32 y;
    };
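
The absence of padding can be illustrated with Python's `struct` module, where the `<` prefix selects little-endian encoding without alignment. The `Tagged` layout (`uint8 tag; int32 value;`) is a hypothetical structure added only to show a case where a C compiler would normally insert padding.

```python
import struct

# `struct Point { int32 x; int32 y; }` as it appears on the wire.
POINT = struct.Struct("<ii")

# Hypothetical `struct Tagged { uint8 tag; int32 value; }`: 1 + 4 bytes,
# with no padding between the fields.
TAGGED = struct.Struct("<Bi")


def encode_point(x, y):
    """Encode a Point as 8 little-endian bytes."""
    return POINT.pack(x, y)


def decode_point(data):
    """Decode a Point from the start of data, returning (x, y)."""
    return POINT.unpack_from(data)
```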

Messages
--------

Messages have the same syntax as structures (apart from the `message` keyword), with the difference that they can
be used directly inside channels.

    message ::= "message" <identifier> "{" [ <fields> ] "}" <attributes> ";" ;

Switches
--------

TODO

Type definitions
----------------

As in C, a type definition introduces a new, usually shorter, name for an existing type.

    typedef ::= "typedef" <identifier> <type> <attributes> ;

Note that, unlike in C, the name comes before the type.

Example:

    typedef XCoord int32;

Channels
--------

    channel ::= "channel" <identifier> [ ":" <identifier> ] "{" <channel_messages> "}" <attributes> ";" ;
    channel_messages ::= <channel_message>|<channel_messages><channel_message> ;
    channel_message ::= "server:" | "client:" | "message" <identifier> [ "=" <integer> ] ;

Example:

    channel ExampleChannel {
    server:
       message {
          uint32 dummy;
       } Dummy;
    };

Note that every message is associated with a number which is used in the protocol.
The assignment works in a similar way to enumerations in C, except that the first message is
assigned the value 1 and not 0. So the first message (if no integer is specified) is assigned
1, the second 2 and so on.
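
The numbering rule can be sketched in Python as below. The helper `assign_numbers` is hypothetical. It assumes that an explicit value restarts the count from that value, as in a C enumeration.

```python
def assign_numbers(items, first=1):
    """Assign protocol numbers to a list of (name, explicit_value_or_None).

    Items without an explicit value get the previous number plus one; the
    very first item defaults to `first`.
    """
    numbers = {}
    next_value = first
    for name, explicit in items:
        if explicit is not None:
            next_value = explicit
        numbers[name] = next_value
        next_value += 1
    return numbers
```

The same helper called with `first=0` would model the channel numbering described in the Protocol section.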

`server:` or `client:` specifies the direction of the messages that follow: `server` marks
messages sent by the server, `client` messages sent by the client. If no direction is
specified, the messages are assumed to come from the server.

For each channel you can specify a parent channel. A derived channel inherits all
messages specified in the parent.
Note that messages from the parent can be overridden by derived channels.
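
Inheritance and overriding can be modelled with a small Python sketch. The data layout and the name `resolve_messages` are hypothetical, and the sketch assumes that a derived message replaces the parent message with the same name. This document does not say how messages are matched.

```python
def resolve_messages(channel, channels):
    """Return the effective messages of `channel`, including inherited ones.

    `channels` maps a channel name to a dict with a "parent" entry (a
    channel name or None) and a "messages" entry (message name mapped to
    its definition).
    """
    spec = channels[channel]
    merged = {}
    if spec["parent"] is not None:
        merged.update(resolve_messages(spec["parent"], channels))
    # Assumption: definitions in the derived channel override by name.
    merged.update(spec["messages"])
    return merged
```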

Protocol
--------

    protocol ::= "protocol" <identifier> "{" <protocol_channels> "}" ";" ;
    protocol_channels ::= <protocol_channel>|<protocol_channels><protocol_channel> ;
    protocol_channel ::= <identifier> <identifier> [ "=" <integer> ] ";" ;

Example:

    protocol Example {
        ExampleChannel first = 1001;
    };

The protocol specifies the list of supported channels. Each channel has an associated number,
assigned in a similar way to message numbers (incremented from one channel to the next), except
that the first starts from 0 if not specified.

*NOTE*: Due to the way the code is currently generated, you should use
small numbers.

Attributes
----------

As you have probably noticed, attributes can be specified for many definitions.
They allow changing the generated code or specifying constraints of the protocol.

    attributes ::= ""|<attributes><attribute>|<attribute> ;
    attribute ::= <attribute_name> [ "(" <attribute_values> ")" ] ;
    attribute_values ::= <attribute_values> "," <attribute_value> | <attribute_value> ;
    attribute_value ::= <integer> | <identifier> ;
    attribute_name ::= @[a-z][a-z0-9_]* ;

Most attributes take no arguments; the others currently take only one
argument.

*NOTE*: Some comments can also be found in the `python_modules/ptypes.py`
source file of `spice-common`.

ctype
~~~~~

Specifies the structure type name that the generated marshaller/demarshaller code
will use. By default the name is converted to CamelCase and prefixed with
`Spice`. With the attribute the given name is used instead, so for example a structure like

    struct Point {
        int32 x;
        int32 y;
    } @ctype(MyPoint);

will be marshalled into a C structure like

    struct MyPoint {
        int32_t x;
        int32_t y;
    };
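
The naming rule can be approximated in Python as follows. The function `c_type_name` is a hypothetical helper. The CamelCase conversion (splitting on `_` and capitalising each part) is an assumption about how the default name is built.

```python
def c_type_name(name, ctype=None):
    """Return the C type name used for a protocol type.

    With `ctype` given (as in `@ctype(MyPoint)`), that name is used as is.
    Otherwise the name is converted to CamelCase and prefixed with `Spice`
    (assumed conversion: capitalise each `_`-separated part).
    """
    if ctype is not None:
        return ctype
    camel = "".join(part[:1].upper() + part[1:] for part in name.split("_"))
    return "Spice" + camel
```

Under these assumptions `Point` without the attribute maps to `SpicePoint`, while `@ctype(MyPoint)` yields `MyPoint` as in the example above.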

prefix
~~~~~~

This attribute specifies the prefix used for generated enumerations (both
protocol enumerations and flags generate C enumerations). By default each constant
is the upper-case enum/flag name prefixed with `SPICE_` and followed by the item name, so

    enum32 level {
        LOW,
        HIGH,
    };

will generate

    typedef enum SpiceLevel {
        SPICE_LEVEL_LOW,
        SPICE_LEVEL_HIGH,
        SPICE_LEVEL_ENUM_END
    } SpiceLevel;

while

    enum32 level {
        LOW,
        HIGH,
    } @prefix(LVL_);

will generate

    typedef enum SpiceLevel {
        LVL_LOW,
        LVL_HIGH,
        SPICE_LEVEL_ENUM_END
    } SpiceLevel;

(note that an `END` enumeration constant is generated automatically and its name is not affected by the prefix).
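
The constant names in the two generated enumerations above can be reproduced with a short Python sketch. The function `enum_constants` is hypothetical and only covers enumerations as shown in these examples.

```python
def enum_constants(name, items, prefix=None):
    """Return the C constant names generated for an enumeration.

    Without `prefix`, constants are `SPICE_<NAME>_<ITEM>`. With `prefix`
    (as in `@prefix(LVL_)`), they are `<prefix><ITEM>`. The trailing
    `SPICE_<NAME>_ENUM_END` constant does not depend on the prefix.
    """
    default_prefix = "SPICE_" + name.upper() + "_"
    used_prefix = default_prefix if prefix is None else prefix
    constants = [used_prefix + item for item in items]
    constants.append(default_prefix + "ENUM_END")
    return constants
```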

end
~~~

This attribute specifies that the data will be appended/embedded in the final C structure.

Example:

    struct test {
        uint16 len;
        uint16 array[len] @end;
    };

Output C structure:

    struct test {
        uint16_t len;
        uint16_t array[0];
    };

The generated code will allocate the C structure with enough space for the extracted array.

*WARNING*: This option is often confused with the empty array size. The
empty array size specifies an array that extends to the end of the network data, while
the `@end` attribute specifies that the C structure is extended (for instance, in the example
the attribute is attached to a `len`-sized array).

to_ptr
~~~~~~

This specifies that the corresponding C structure field contains a pointer to
the data. On marshalling the pointer is used, on demarshalling the data is
allocated in the memory block that holds the returned structure.
The type of this field must be a structure.

Example:

    struct test {
        uint16 num;
    };

    struct msg {
        test ptr @to_ptr;
    };

Output C structure:

    struct test {
        uint16_t num;
    };

    struct msg {
        struct test *ptr;
    };

nocopy
~~~~~~

TODO

as_ptr
~~~~~~

TODO

nomarshal
~~~~~~~~~

Do not generate code for marshalling this variable.
Usually used on the last array element to make it possible to feed the data manually.

Example:

    struct Data {
        uint32 data_size;
        uint8 data[data_size] @nomarshal;
    };

zero_terminated
~~~~~~~~~~~~~~~

The field should be terminated by zero.
Currently it is not used by the Python code, so it is not enforced and no
code is generated.

marshall
~~~~~~~~

TODO

nonnull
~~~~~~~

This pointer field cannot be NULL. This means that the marshaller assumes the C structure
contains a non-NULL pointer, and the demarshaller will fail to demarshal the message if the offset
is 0.
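
The check performed on demarshalling can be sketched in Python as follows. The function `read_pointer` is hypothetical and assumes a 32-bit little-endian offset field. It is not the generated demarshaller code.

```python
import struct


def read_pointer(data, pos, nonnull=False):
    """Read the offset stored in a pointer field at `pos`.

    An offset of 0 stands for a NULL pointer. It is returned as None,
    unless the field is marked `@nonnull`, in which case the message is
    rejected.
    """
    (offset,) = struct.unpack_from("<I", data, pos)  # assumed 32-bit offset
    if offset == 0:
        if nonnull:
            raise ValueError("NULL pointer in @nonnull field")
        return None
    return offset
```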

unique_flag
~~~~~~~~~~~

This flag field should contain just a single flag.
Currently it is not used by the Python code, so it is not enforced and no
code is generated.

deprecated
~~~~~~~~~~

This attribute currently applies only to enumeration and flag types and marks the
generated C enumeration constants as deprecated.

ptr_array
~~~~~~~~~

TODO

outvar
~~~~~~

TODO

anon
~~~~

TODO

chunk
~~~~~

TODO

ifdef
~~~~~

TODO

zero
~~~~

TODO

virtual
~~~~~~~

TODO