office-gobmx/xmloff
Tomaž Vajngerl f5cf0c10eb tdf#123396 actually use the mimetype set in the from document
In the document we set the mime type for the images stored, but
at import we never actually used the mime types. We always did
mime type detection from the filename extension. The problem with
this is that files stored as base64 streams don't have a filename
so it is also not possible to determine the mime type from the
file name.

The consequence of this is that we can't know which image to take
if we have multiple images (fallback images) so we always take the
last one, which could be the wrong one. This happend in the test
document.

This changes the behavior so that we always first use the document
set mime type as there is no reason to not trust that. Only if the
mime type isn't provided by the document we use other mime type
detection methods.

Change-Id: I175c509ce5f11eab2c0454d4d9901ca1fe975272
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/160237
Tested-by: Jenkins
Reviewed-by: Tomaž Vajngerl <quikee@gmail.com>
2023-12-02 11:10:34 +01:00
..
documentation
dtd
inc tdf#123396 actually use the mimetype set in the from document 2023-12-02 11:10:34 +01:00
qa
source tdf#123396 actually use the mimetype set in the from document 2023-12-02 11:10:34 +01:00
util
CppunitTest_xmloff_draw.mk
CppunitTest_xmloff_style.mk
CppunitTest_xmloff_text.mk
CppunitTest_xmloff_uxmloff.mk
CustomTarget_generated.mk
IwyuFilter_xmloff.yaml
JunitTest_xmloff_unoapi.mk
Library_xo.mk
Library_xof.mk
Makefile
Module_xmloff.mk
Package_dtd.mk
README.md

ODF Import and Export Filter Logic

The main library "xo" contains the basic ODF import/export filter implementation for most applications. The document is accessed via its UNO API, which has the advantage that the same import/export code can be used for text in all applications (from/to Writer/EditEngine). The filter consumes/produces via SAX UNO API interface (implemented in "sax"). Various bits of the ODF filters are also implemented in applications, for example [git:sw/source/filter/xml].

There is a central list of all element or attribute names in [git:include/xmloff/xmltoken.hxx]. The main class of the import filter is SvXMLImport, and of the export filter SvXMLExport.

The Import filter maintains a stack of contexts for each element being read. There are many classes specific to particular elements, derived from SvXMLImportContext.

Note that for export several different versions of ODF are supported, with the default being the latest ODF version with "extensions", which means it may contain elements and attributes that are only in drafts of the specification or are not yet submitted for specification. Documents produced in the other (non-extended) ODF modes are supposed to be strictly conforming to the respective specification, i.e., only markup defined by the ODF specification is allowed.

There is another library "xof" built from the source/transform directory, which is the filter for the OpenOffice.org XML format. This legacy format is a predecessor of ODF and was the default in OpenOffice.org 1.x versions, which did not support ODF. This filter works as a SAX transformation from/to ODF, i.e., when importing a document the transform library reads the SAX events from the file and generates SAX events that are then consumed by the ODF import filter.

OpenOffice.org XML File Format

There is some stuff in the "dtd" directory which is most likely related to the OpenOffice.org XML format but is possibly outdated and obsolete.

Add New XML Tokens

When adding a new XML token, you need to add its entry in the following three files:

  • [git:include/xmloff/xmltoken.hxx]
  • [git:xmloff/source/core/xmltoken.cxx]
  • [git:xmloff/source/token/tokens.txt]