office-gobmx/xmloff
László Németh 3a332d9f1c tdf#158885 cui offapi sw xmloff: fix hyphenation at stem boundary
Add new hyphenation option "Compound characters at line end",
equivalent of libhyphen's COMPOUNDLEFTHYPHENMIN, to limit bad
pattern based hyphenation of compound words using morphological
analysis of Hunspell.

* Add checkbox to Text Flow in paragraph formatting dialog window
* Store property in paragraph model:
  css::style::ParagraphProperties::ParaHyphenationCompoundMinLeadingChars
* Add ODF import/export (loext:hyphenation-compound-remain-char-count)
* Add ODF unit tests

Note: slower Hunspell based hyphenation is used only if
ParaHyphenationCompoundMinLeadingChars >= 3 (we assume that
libhyphen hyphenation patterns cover the smaller distances
correctly). Hunspell based hyphenation doesn't introduce
new hyphenation breaks, only detects the stem boundaries
from the libhyphen based hyphenation breaks.

Follow-up to commit c899d3608d
"tdf#158885 sw: don't hyphenate right after a stem boundary",
replacing hyphenation zone dependence with the new "Compound
characters at line end".

Note: preset COMPOUNDLEFTHYPHENMIN values aren't loaded yet
from hyphenation dictionaries.

Note: the suffix of the last stem of the compound is always
hyphenated, i.e. the distance limits only hyphenation
inside the stem, not inside its suffix or at the end of the
stem before the suffix.

Change-Id: I46a0288929a66f7453e3ff97fbc5a0c6a01f038f
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/164983
Tested-by: László Németh <nemeth@numbertext.org>
Reviewed-by: László Németh <nemeth@numbertext.org>
2024-03-20 13:04:09 +01:00
documentation xmloff: document the GutterAtTop Writer setting 2023-10-24 13:46:28 +02:00
dtd
inc tdf#158885 cui offapi sw xmloff: fix hyphenation at stem boundary 2024-03-20 13:04:09 +01:00
qa Clamp extrusion light level to allowed range for ODF 2024-03-18 15:58:53 +01:00
source tdf#158885 cui offapi sw xmloff: fix hyphenation at stem boundary 2024-03-20 13:04:09 +01:00
util
CppunitTest_xmloff_draw.mk xmloff: use XThemeColor in ODF, change the format for themes 2023-01-13 13:44:09 +00:00
CppunitTest_xmloff_style.mk Tests with color stops to assert Color and not BColor values 2023-08-21 08:57:15 +02:00
CppunitTest_xmloff_text.mk sw floattable, per-frame wrap-on-all-pages mode: add ODT filter 2023-11-28 09:53:13 +01:00
CppunitTest_xmloff_uxmloff.mk xmloff: use XThemeColor in ODF, change the format for themes 2023-01-13 13:44:09 +00:00
CustomTarget_generated.mk
IwyuFilter_xmloff.yaml tdf#146619 Recheck xmloff/*cxx with IWYU 2024-03-12 10:37:31 +01:00
JunitTest_xmloff_unoapi.mk
Library_xo.mk xmloff: fix import of CharComplexColor - add StylePropertiesContext 2023-08-01 08:15:15 +02:00
Library_xof.mk
Makefile
Module_xmloff.mk
Package_dtd.mk
README.md

ODF Import and Export Filter Logic

The main library "xo" contains the basic ODF import/export filter implementation for most applications. The document is accessed via its UNO API, which has the advantage that the same import/export code can be used for text in all applications (from/to Writer/EditEngine). The filter consumes and produces XML through the SAX UNO API interface (implemented in "sax"). Some parts of the ODF filters are also implemented in the applications themselves, for example in [git:sw/source/filter/xml].

There is a central list of all element or attribute names in [git:include/xmloff/xmltoken.hxx]. The main class of the import filter is SvXMLImport, and of the export filter SvXMLExport.

The import filter maintains a stack of contexts, one for each element currently being read. There are many classes specific to particular elements, all derived from SvXMLImportContext.
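
The context stack can be illustrated with a minimal, self-contained sketch. It does not use the real SvXMLImportContext interface: the class names (ImportContext, BodyContext, ParagraphContext, Importer) and their methods are hypothetical and only model the idea that each open element gets its own context, created by the context of its parent.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an element context (not the real xmloff API).
// The default implementation ignores the element and all of its content.
class ImportContext
{
public:
    virtual ~ImportContext() = default;
    // Called for each child element; returns the context that handles it.
    virtual std::unique_ptr<ImportContext> createChildContext(const std::string&)
    {
        return std::make_unique<ImportContext>();
    }
    virtual void characters(const std::string&) {}
    virtual void endElement() {}
};

// Collects the text of one paragraph and stores it when the element ends.
class ParagraphContext : public ImportContext
{
    std::vector<std::string>& m_rParagraphs;
    std::string m_aText;

public:
    explicit ParagraphContext(std::vector<std::string>& rParagraphs)
        : m_rParagraphs(rParagraphs) {}
    void characters(const std::string& rChars) override { m_aText += rChars; }
    void endElement() override { m_rParagraphs.push_back(m_aText); }
};

// Root context: knows which child elements it understands.
class BodyContext : public ImportContext
{
    std::vector<std::string>& m_rParagraphs;

public:
    explicit BodyContext(std::vector<std::string>& rParagraphs)
        : m_rParagraphs(rParagraphs) {}
    std::unique_ptr<ImportContext> createChildContext(const std::string& rName) override
    {
        if (rName == "text:p")
            return std::make_unique<ParagraphContext>(m_rParagraphs);
        return std::make_unique<ImportContext>(); // unknown element: ignored
    }
};

// Receives SAX-like events and keeps one context per open element.
class Importer
{
    std::vector<std::unique_ptr<ImportContext>> m_aContexts;
    std::vector<std::string> m_aParagraphs;

public:
    void startElement(const std::string& rName)
    {
        std::unique_ptr<ImportContext> pNew;
        if (m_aContexts.empty())
            pNew = std::make_unique<BodyContext>(m_aParagraphs);
        else
            pNew = m_aContexts.back()->createChildContext(rName);
        m_aContexts.push_back(std::move(pNew));
    }
    void characters(const std::string& rChars)
    {
        if (!m_aContexts.empty())
            m_aContexts.back()->characters(rChars);
    }
    void endElement()
    {
        assert(!m_aContexts.empty());
        m_aContexts.back()->endElement();
        m_aContexts.pop_back();
    }
    std::size_t depth() const { return m_aContexts.size(); }
    const std::vector<std::string>& paragraphs() const { return m_aParagraphs; }
};
```

In this model, an element that the parent context does not recognize receives the default context, so its content is skipped without any special handling in the importer itself.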

Note that for export several different versions of ODF are supported, with the default being the latest ODF version with "extensions", which means it may contain elements and attributes that are only in drafts of the specification or are not yet submitted for specification. Documents produced in the other (non-extended) ODF modes are supposed to be strictly conforming to the respective specification, i.e., only markup defined by the ODF specification is allowed.
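
The difference between the extended and the strict export modes can be sketched as a simple filter over attributes. This is only an illustration under stated assumptions: the OdfMode enum, the Attribute struct and the exportAttributes function are hypothetical and do not correspond to the version constants or export classes actually used in xmloff.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical export modes (the real filter has its own version constants).
enum class OdfMode
{
    Strict,   // only markup defined by the ODF specification
    Extended  // specification markup plus extension markup
};

// Hypothetical attribute description used only in this sketch.
struct Attribute
{
    std::string name;
    std::string value;
    bool isExtension; // true if not defined by the ODF specification
};

// Returns the attributes that may be written in the given mode.
std::map<std::string, std::string> exportAttributes(const std::vector<Attribute>& rAttributes,
                                                    OdfMode eMode)
{
    std::map<std::string, std::string> aResult;
    for (const Attribute& rAttr : rAttributes)
    {
        // Extension markup is dropped unless the extended mode is active.
        if (rAttr.isExtension && eMode != OdfMode::Extended)
            continue;
        aResult[rAttr.name] = rAttr.value;
    }
    return aResult;
}
```

For example, an attribute in the loext namespace such as loext:hyphenation-compound-remain-char-count would be marked as an extension in this model and therefore written only in the extended mode.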

There is another library "xof" built from the source/transform directory, which is the filter for the OpenOffice.org XML format. This legacy format is a predecessor of ODF and was the default in OpenOffice.org 1.x versions, which did not support ODF. This filter works as a SAX transformation from/to ODF, i.e., when importing a document the transform library reads the SAX events from the file and generates SAX events that are then consumed by the ODF import filter.
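
The idea of a SAX transformation that sits in front of the ODF import filter can be sketched as a handler that rewrites events and forwards them. The interface and class names below (SaxHandler, RecordingImporter, RenamingTransformer) and the element names are hypothetical. The real transform library handles far more than element renaming.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified SAX-like handler interface.
struct SaxHandler
{
    virtual ~SaxHandler() = default;
    virtual void startElement(const std::string& rName) = 0;
    virtual void endElement(const std::string& rName) = 0;
    virtual void characters(const std::string& rChars) = 0;
};

// Stands in for the ODF import filter: it just records the events it gets.
class RecordingImporter : public SaxHandler
{
    std::vector<std::string> m_aEvents;

public:
    void startElement(const std::string& rName) override { m_aEvents.push_back("start:" + rName); }
    void endElement(const std::string& rName) override { m_aEvents.push_back("end:" + rName); }
    void characters(const std::string& rChars) override { m_aEvents.push_back("chars:" + rChars); }
    const std::vector<std::string>& events() const { return m_aEvents; }
};

// Reads events in the legacy vocabulary and emits events in the new one.
class RenamingTransformer : public SaxHandler
{
    SaxHandler& m_rTarget;
    std::map<std::string, std::string> m_aRenames;

    // Returns the new name, or the unchanged name if no rule applies.
    std::string mapName(const std::string& rName) const
    {
        auto it = m_aRenames.find(rName);
        return it == m_aRenames.end() ? rName : it->second;
    }

public:
    RenamingTransformer(SaxHandler& rTarget, std::map<std::string, std::string> aRenames)
        : m_rTarget(rTarget), m_aRenames(std::move(aRenames)) {}
    void startElement(const std::string& rName) override { m_rTarget.startElement(mapName(rName)); }
    void endElement(const std::string& rName) override { m_rTarget.endElement(mapName(rName)); }
    void characters(const std::string& rChars) override { m_rTarget.characters(rChars); }
};
```

Because the transformer implements the same interface as its target, the downstream filter does not need to know whether its events come directly from a parser or from a transformation step.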

OpenOffice.org XML File Format

The "dtd" directory contains files that are most likely related to the OpenOffice.org XML format, but they are possibly outdated and obsolete.

Add New XML Tokens

When adding a new XML token, you need to add its entry in the following three files:

  • [git:include/xmloff/xmltoken.hxx]
  • [git:xmloff/source/core/xmltoken.cxx]
  • [git:xmloff/source/token/tokens.txt]
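
Why a token has to be added in more than one place can be illustrated with a simplified model of an enum and a matching string table. The names below (Token, TokenEntry, aTokenTable, getTokenString, "my-new-attribute") are hypothetical, and the real files use their own macros and layout. The sketch only shows that the enum and the table must stay in the same order and have the same length.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical token enum (plays the role of the header file).
enum Token : std::size_t
{
    TOKEN_PARAGRAPH,
    TOKEN_SPAN,
    TOKEN_MY_NEW_ATTRIBUTE, // newly added token
    TOKEN_COUNT             // must stay last
};

// Hypothetical table entry: the XML name and the enum value it belongs to.
struct TokenEntry
{
    const char* name;
    Token token;
};

// Hypothetical string table (plays the role of the implementation file).
constexpr TokenEntry aTokenTable[] = {
    { "p", TOKEN_PARAGRAPH },
    { "span", TOKEN_SPAN },
    { "my-new-attribute", TOKEN_MY_NEW_ATTRIBUTE }, // same position as in the enum
};

// Compile-time check: a token added to the enum but not to the table fails here.
static_assert(sizeof(aTokenTable) / sizeof(aTokenTable[0]) == TOKEN_COUNT,
              "token table and enum have different lengths");

// Compile-time check: every entry sits at the index of its enum value.
constexpr bool tableMatchesEnum()
{
    for (std::size_t i = 0; i < TOKEN_COUNT; ++i)
        if (aTokenTable[i].token != i)
            return false;
    return true;
}
static_assert(tableMatchesEnum(), "token table and enum are out of order");

// Looks up the XML name of a token by direct indexing.
constexpr const char* getTokenString(Token eToken) { return aTokenTable[eToken].name; }
```

In this model, forgetting one of the two entries or inserting them at different positions is caught at compile time instead of producing a wrong name at run time.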