office-gobmx/unoidl/README.md
Stephan Bergmann a8485d558f [API CHANGE] Remove deprecated idlc and regmerge from the SDK
* Client code must replace uses of idlc and regmerge with uses of unoidl-write,
  see the changes to odk/examples/ and ure/source/uretext/ in
  40f2aee658 "Provide unoidl-write also for the
  SDK" for examples.

* The new types.rdb format is not compatible with LibreOffice < 4.1.  Clients
  generating extensions containing such files are advised to use appropriate
  LibreOffice-minimal-version elements.

* For compatibility with old extensions, reading the legacy types.rdb format is
  still supported.

* The SDK no longer ships an idl/ sub-directory containing the udkapi and offapi
  .idl files (as, unlike idlc, unoidl-write does not need them).
  odk/config/cfgWin.js had to be adapted to look (somewhat arbitrarily) for an
  examples/ sub-directory instead of idl/ when checking for "an sdk folder".
  gb_UnoApi_package_idlfiles became unused and has been removed.

* The idlc and regmerge executables have been removed.  Module idlc has been
  removed except for idlc/test/parser/, which is also used by
  CustomTarget_unoidl/unoidl-write_test, and which may eventually be moved into
  module unoidl.  Module external/ucpp and the corresponding configure options
  have also been removed.

Change-Id: I42a0231699b863b5ebe2bee63bc32c8f79278cc1
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/122363
Tested-by: Jenkins
Reviewed-by: Stephan Bergmann <sbergman@redhat.com>
2022-06-10 17:15:49 +02:00


# Support for UNOIDL Registry Formats
`Library_unoidl` contains the `unoidl::Manager` and `unoidl::Provider` implementations
for the following registry formats:
* The new `UNOIDL` binary `types.rdb` format.
* The legacy binary `types.rdb` format (based on modules "store" and
  "registry").
* A source-file format, reading (multiple) `UNOIDL` entity definitions directly
  from a single `.idl` source file.
* A source-tree format, reading `UNOIDL` entity definitions directly from a tree
  of `.idl` source files rooted at a given directory. An entity named
  `foo.bar.Baz` is expected in a file named `foo/bar/Baz.idl` within that tree.

While `.idl` files still contain `#include` directives for legacy idlc, the
source-based formats ignore any preprocessing directives starting with `#` in
the `.idl` files.

`unoidl::Manager::addProvider` transparently detects the registry format
for a given URI and instantiates the corresponding provider implementation.
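
The source-tree naming rule can be illustrated with a small sketch. It is not part of `Library_unoidl`; the function name `entity_to_idl_path` is hypothetical and only shows how a dotted entity name maps to a relative path inside the tree.

```python
def entity_to_idl_path(entity_name):
    """Map a dotted UNOIDL entity name to its path relative to the tree root.

    Hypothetical helper for illustration only.
    """
    parts = entity_name.split(".")
    # Reject empty segments such as "foo..Baz" or a leading/trailing dot.
    if not all(parts):
        raise ValueError("malformed entity name: %r" % (entity_name,))
    return "/".join(parts) + ".idl"


print(entity_to_idl_path("foo.bar.Baz"))
```

A source-tree provider rooted at some directory would then look for that relative path below the root.
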

`Executable_unoidl-write` is a helper tool to convert from any of the registry
formats to the `UNOIDL` format. It is used at build-time to compile `UNOIDL` format
`.rdb` files from source `.idl` files. Those `.rdb` files are either used at
build-time only, or included in installation sets (in `URE` or `program/types/`),
or shipped as part of bundled extensions that are created during the build
rather than merely included as pre-built `.oxt` files.

`Executable_unoidl-read` is a helper tool to convert from any of the registry
formats to the source-file format. It can be used manually after a LibreOffice
version update to create new reference registries for `Executable_unoidl-check`.

`Executable_unoidl-check` is a helper tool to check that one registry is
backwards-compatible with another registry. It is used at build-time to detect
inadvertent breakage of the udkapi and offapi APIs.

## Specification of the New UNOIDL types.rdb Format
The format uses byte-oriented, platform-independent, binary files. Larger
quantities are stored LSB first, without alignment requirements. Offsets are
32 bit, effectively limiting the overall file size to 4 GB, but that is not
considered a limitation in practice (and avoids unnecessary bloat compared to
64 bit offsets).

Annotations can be added for (non-module) entities and certain parts of such
entities, and consist of arbitrary sequences of name/value strings. For example,
an annotation can be attached both to an interface type definition and to a
direct method of an interface type definition. The idea is that annotations can
be added for direct parts that form a "many-to-one" relationship; there is a
tradeoff between generality of concept and size of representation, especially
for the C++ representation types in namespace `unoidl`.
Each name/value string is encoded as a single UTF-8 string containing a name (an
arbitrary sequence of Unicode code points not containing `U+003D EQUALS SIGN`),
optionally followed by `U+003D EQUALS SIGN` and a value (an arbitrary sequence of
Unicode code points). The only annotation name currently in use is "deprecated"
(without a value).
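
As a minimal sketch of the name/value encoding described above, the following splits one annotation string at the first `U+003D EQUALS SIGN`. The function name `parse_annotation` is hypothetical, and representing "no value" as `None` is a choice made for this example only.

```python
from typing import Optional, Tuple


def parse_annotation(raw: bytes) -> Tuple[str, Optional[str]]:
    """Split one UTF-8 encoded annotation into (name, value or None)."""
    text = raw.decode("utf-8")
    # The name cannot contain "=", so the first "=" ends the name.
    # The value may itself contain further "=" characters.
    name, sep, value = text.partition("=")
    return name, (value if sep else None)


print(parse_annotation(b"deprecated"))
```

An annotation with an empty value (`name=`) stays distinguishable from one without any value (`name`).
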

The following definitions are used throughout:

* `UInt16`: 2-byte value, LSB first
* `UInt32`: 4-byte value, LSB first
* `UInt64`: 8-byte value, LSB first
* Offset: `UInt32` value, counting bytes from start of file
* `NUL`-Name: zero or more non-`NUL` US-ASCII bytes followed by a `NUL` byte
* Len-String: `UInt32` number of characters, with `0x80000000` bit 0, followed by
  that many US-ASCII bytes (for `UNOIDL` related names) or UTF-8 bytes (for
  annotations)
* Idx-String: either an Offset (with `0x80000000` bit 1) of a Len-String, or a
  Len-String
* Annotations: `UInt32` number `N` of annotations followed by `N * Idx-String`
* Entry: Offset of `NUL`-Name followed by Offset of payload
* Map: zero or more Entries
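
The definitions above could be read along the following lines. This is a sketch, not the `Library_unoidl` implementation: all function names are hypothetical, and it assumes that the Len-String count is a number of bytes (which for UTF-8 annotation strings may differ from a number of characters).

```python
import struct


def read_uint32(buf, off):
    """Read a 4-byte value, LSB first."""
    return struct.unpack_from("<I", buf, off)[0]


def read_nul_name(buf, off):
    """Read a NUL-Name: US-ASCII bytes up to (not including) the NUL byte."""
    end = buf.index(b"\0", off)
    return buf[off:end].decode("ascii")


def read_len_string(buf, off, encoding="ascii"):
    """Read a Len-String; return (string, offset just past it)."""
    length = read_uint32(buf, off)
    if length & 0x80000000:
        raise ValueError("Len-String must have the 0x80000000 bit cleared")
    start = off + 4
    if start + length > len(buf):
        raise ValueError("Len-String runs past end of data")
    return buf[start:start + length].decode(encoding), start + length


def read_idx_string(buf, off, encoding="ascii"):
    """Read an Idx-String; return (string, offset just past it in the stream)."""
    value = read_uint32(buf, off)
    if value & 0x80000000:
        # Indirect form: the remaining 31 bits are the Offset of a Len-String.
        text, _ = read_len_string(buf, value & 0x7FFFFFFF, encoding)
        return text, off + 4
    # Direct form: the Len-String is stored inline.
    return read_len_string(buf, off, encoding)
```

The indirect form consumes only four bytes at the point of use, so a repeated name can be stored once and referenced by offset elsewhere.
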

The file starts with an 8 byte header, followed by information about the root
map (`unoidl-write` generates files in a single depth-first pass, so the root map
itself is at the end of the file):

* 7 byte magic header `UNOIDL\xFF`
* version byte 0
* Offset of root Map
* `UInt32` number of entries of root Map

...

Files generated by `unoidl-write` follow that by a
`"\0** Created by LibreOffice " LIBO_VERSION_DOTTED " unoidl-write **\0"`
banner (cf. `config_host/config_version.h.in`), as a debugging aid. (Old versions
used `reg2unoidl` instead of `unoidl-write` in that banner.)
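
A minimal sketch of reading the header and the root Map entries, under the layout given above. The function names `read_header` and `read_map_entries` are hypothetical; the sketch reads entries in stored order and makes no assumption about how they are ordered.

```python
import struct

MAGIC = b"UNOIDL\xff"


def read_header(buf):
    """Return (offset of root Map, number of root Map entries)."""
    if buf[:7] != MAGIC:
        raise ValueError("missing UNOIDL magic header")
    if buf[7] != 0:
        raise ValueError("unsupported version byte %d" % buf[7])
    # Two UInt32 values, LSB first, directly after the 8 byte header.
    return struct.unpack_from("<II", buf, 8)


def read_map_entries(buf, offset, count):
    """Return a list of (name, payload offset) pairs for a Map."""
    entries = []
    for i in range(count):
        # Each Entry is an Offset of a NUL-Name plus an Offset of the payload.
        name_off, payload_off = struct.unpack_from("<II", buf, offset + 8 * i)
        end = buf.index(b"\0", name_off)
        entries.append((buf[name_off:end].decode("ascii"), payload_off))
    return entries
```

Given the bytes of a `types.rdb` file in `data`, `read_map_entries(data, *read_header(data))` lists the top-level names together with the offsets of their payloads.
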

Layout of per-entry payload in the root or a module Map:

* kind byte:
    * 0: module
        * followed by:
            * `UInt32` number `N1` of entries of Map
            * `N1 * Entry`
    * otherwise:
        * `0x80` bit: 1 if published
        * `0x40` bit: 1 if annotated
        * `0x20` bit: flag (may only be 1 for certain kinds, see below)
        * remaining bits:
            * 1: enum type
                * followed by:
                    * `UInt32` number `N1` of members
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * `UInt32`
                        * if annotated: Annotations
            * 2: plain struct type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations
            * 3: polymorphic struct type template
                * followed by:
                    * `UInt32` number `N1` of type parameters
                    * `N1 * Idx-String`
                    * `UInt32` number `N2` of members
                    * `N2 * tuple` of:
                        * kind byte: `0x01` bit is 1 if parameterized type
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations
            * 4: exception type (with base if flag is 1)
                * followed by:
                    * if "with base": `Idx-String`
                    * `UInt32` number `N1` of direct members
                    * `N1 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations
            * 5: interface type
                * followed by:
                    * `UInt32` number `N1` of direct mandatory bases
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional bases
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct attributes
                    * `N3 * tuple` of:
                        * kind byte:
                            * `0x02` bit: 1 if read-only
                            * `0x01` bit: 1 if bound
                        * `Idx-String` name
                        * `Idx-String` type
                        * `UInt32` number `N4` of get exceptions
                        * `N4 * Idx-String`
                        * `UInt32` number `N5` of set exceptions
                        * `N5 * Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N6` of direct methods
                    * `N6 * tuple` of:
                        * `Idx-String` name
                        * `Idx-String` return type
                        * `UInt32` number `N7` of parameters
                        * `N7 * tuple` of:
                            * direction byte: 0 for in, 1 for out, 2 for in-out
                            * `Idx-String` name
                            * `Idx-String` type
                        * `UInt32` number `N8` of exceptions
                        * `N8 * Idx-String`
                        * if annotated: Annotations
            * 6: typedef
                * followed by:
                    * `Idx-String`
            * 7: constant group
                * followed by:
                    * `UInt32` number `N1` of entries of Map
                    * `N1 * Entry`
            * 8: single-interface--based service (with default constructor if flag is 1)
                * followed by:
                    * `Idx-String`
                    * if not "with default constructor":
                        * `UInt32` number `N1` of constructors
                        * `N1 * tuple` of:
                            * `Idx-String`
                            * `UInt32` number `N2` of parameters
                            * `N2 * tuple` of:
                                * kind byte: `0x04` bit is 1 if rest parameter
                                * `Idx-String` name
                                * `Idx-String` type
                            * `UInt32` number `N3` of exceptions
                            * `N3 * Idx-String`
                            * if annotated: Annotations
            * 9: accumulation-based service
                * followed by:
                    * `UInt32` number `N1` of direct mandatory base services
                    * `N1 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N2` of direct optional base services
                    * `N2 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N3` of direct mandatory base interfaces
                    * `N3 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N4` of direct optional base interfaces
                    * `N4 * tuple` of:
                        * `Idx-String`
                        * if annotated: Annotations
                    * `UInt32` number `N5` of direct properties
                    * `N5 * tuple` of:
                        * `UInt16` kind:
                            * `0x0100` bit: 1 if optional
                            * `0x0080` bit: 1 if removable
                            * `0x0040` bit: 1 if maybedefault
                            * `0x0020` bit: 1 if maybeambiguous
                            * `0x0010` bit: 1 if readonly
                            * `0x0008` bit: 1 if transient
                            * `0x0004` bit: 1 if constrained
                            * `0x0002` bit: 1 if bound
                            * `0x0001` bit: 1 if maybevoid
                        * `Idx-String` name
                        * `Idx-String` type
                        * if annotated: Annotations
            * 10: interface-based singleton
                * followed by:
                    * `Idx-String`
            * 11: service-based singleton
                * followed by:
                    * `Idx-String`
        * if annotated, followed by: Annotations
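
The bit layout of the kind byte can be sketched as follows. The function name `decode_kind_byte`, the returned dictionary shape, and the choice to raise `ValueError` on invalid input are assumptions of this example. The set of kinds for which the flag bit may be 1 is taken from the "(with base if flag is 1)" and "(with default constructor if flag is 1)" notes in the list above.

```python
ENTITY_KINDS = {
    1: "enum type",
    2: "plain struct type",
    3: "polymorphic struct type template",
    4: "exception type",
    5: "interface type",
    6: "typedef",
    7: "constant group",
    8: "single-interface-based service",
    9: "accumulation-based service",
    10: "interface-based singleton",
    11: "service-based singleton",
}

# Kinds whose description above gives the 0x20 flag bit a meaning.
FLAG_ALLOWED = {2, 4, 8}


def decode_kind_byte(kind_byte):
    """Decode the kind byte of an entry in the root or a module Map."""
    if not 0 <= kind_byte <= 0xFF:
        raise ValueError("kind byte out of range")
    if kind_byte == 0:
        return {"kind": "module", "published": False,
                "annotated": False, "flag": False}
    kind = kind_byte & 0x1F
    flag = bool(kind_byte & 0x20)
    if kind not in ENTITY_KINDS:
        raise ValueError("unknown kind %d" % kind)
    if flag and kind not in FLAG_ALLOWED:
        raise ValueError("flag bit set for kind %d" % kind)
    return {
        "kind": ENTITY_KINDS[kind],
        "published": bool(kind_byte & 0x80),
        "annotated": bool(kind_byte & 0x40),
        "flag": flag,
    }
```

For example, `0xA2` decodes as a published plain struct type with a base, and `0x45` as an annotated, unpublished interface type.
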

Layout of per-entry payload in a constant group Map:

* kind byte:
    * `0x80` bit: 1 if annotated
    * remaining bits:
        * 0: `BOOLEAN`
            * followed by value byte, 0 represents false, 1 represents true
        * 1: `BYTE`
            * followed by value byte, representing values with two's complement
        * 2: `SHORT`
            * followed by `UInt16` value, representing values with two's complement
        * 3: `UNSIGNED SHORT`
            * followed by `UInt16` value
        * 4: `LONG`
            * followed by `UInt32` value, representing values with two's complement
        * 5: `UNSIGNED LONG`
            * followed by `UInt32` value
        * 6: `HYPER`
            * followed by `UInt64` value, representing values with two's complement
        * 7: `UNSIGNED HYPER`
            * followed by `UInt64` value
        * 8: `FLOAT`
            * followed by 4-byte value, representing values in ISO/IEC 60559 binary32
              format, LSB first
        * 9: `DOUBLE`
            * followed by 8-byte value, representing values in ISO/IEC 60559 binary64
              format, LSB first
* if annotated, followed by: Annotations