office-gobmx/lingucomponent
László Németh c899d3608d tdf#158885 sw: don't hyphenate right after a stem boundary
in compound words, for better typography and orthography
and more readable text, if the hyphenation zone is enabled.

If the libhyphen-based hyphenation finds multiple possible
break points in the word, keep only the best ones, using
Hunspell morphological data: the compound word decomposition
of non-dictionary words (pa: fields), the extra morphological
data of dictionary words (hy: fields), or their combination.

For readability and by tradition, orthography and typography
prefer or only allow hyphenation between stems in compound
words in several languages, like Danish, Dutch, German,
Hungarian, Norwegian and Swedish.

The hyphenation zone exists to avoid too much or bad hyphenation.
Preferring stem boundaries for hyphenation within the hyphenation
zone is a natural extension of it, i.e. skip hyphenation within
stems if there is a stem boundary within the hyphenation zone.

Now skip break points after stem boundaries, if they are
3 or fewer characters away (COMPOUNDLEFTHYPHENMIN = 4).

Skip also break points on stem boundaries, if there is a
weighted stem boundary before them within 3 characters.
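
The two skip rules above can be sketched as a small filter. This is
only an illustration, not the code of this commit: StemBoundary,
kMaxSkipDistance and filterBreaks are hypothetical names, positions
are plain character offsets, and the hyphenation zone check is left
out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for this sketch: a stem boundary at character
// offset `pos`; `weighted` marks the stronger kind of boundary.
struct StemBoundary
{
    std::size_t pos;
    bool weighted;
};

// Assumption: "3 or fewer characters" is COMPOUNDLEFTHYPHENMIN - 1.
constexpr std::size_t kMaxSkipDistance = 3;

// Returns the break points that survive both skip rules.
std::vector<std::size_t> filterBreaks(const std::vector<std::size_t>& breaks,
                                      const std::vector<StemBoundary>& boundaries)
{
    std::vector<std::size_t> kept;
    for (std::size_t b : breaks)
    {
        // Is this break point itself on a stem boundary?
        bool onBoundary = false;
        for (const StemBoundary& s : boundaries)
            if (s.pos == b)
                onBoundary = true;

        bool skip = false;
        for (const StemBoundary& s : boundaries)
        {
            if (s.pos >= b || b - s.pos > kMaxSkipDistance)
                continue; // boundary is not shortly before this break
            // A plain break yields to any preceding boundary; a break
            // on a boundary yields only to a weighted one.
            if (!onBoundary || s.weighted)
                skip = true;
        }
        if (!skip)
            kept.push_back(b);
    }
    return kept;
}
```

With breaks at offsets 3, 5, 9 and 11 and weighted boundaries at 3
and 9, this sketch keeps only 3 and 9.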

Weighted stem boundaries are the boundaries between the pa: fields
(the stems produced by the compound word decomposition),
and the boundaries marked in the hy: field by a double ||
instead of a single |.
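
As a rough illustration of the | and || markers, the sketch below
reads boundaries from a hy: value. It assumes that the value is the
word with the bars inserted at the boundaries, and it counts bytes,
so it only fits ASCII input. HyBoundary and parseHyField are
hypothetical names, not part of Hunspell or LibreOffice.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical type for this sketch: a boundary found in a hy: value.
struct HyBoundary
{
    std::size_t pos; // number of letters before the boundary
    bool weighted;   // true for "||", false for a single '|'
};

// Assumption: `value` is the text after "hy:", e.g. "ab|cd||ef".
std::vector<HyBoundary> parseHyField(const std::string& value)
{
    std::vector<HyBoundary> result;
    std::size_t letters = 0;
    for (std::size_t i = 0; i < value.size(); ++i)
    {
        if (value[i] != '|')
        {
            ++letters;
            continue;
        }
        const bool weighted = i + 1 < value.size() && value[i + 1] == '|';
        if (weighted)
            ++i; // consume the second bar of "||"
        result.push_back({letters, weighted});
    }
    return result;
}
```

For the made-up value "ab|cd||ef" this gives a plain boundary at
offset 2 and a weighted boundary at offset 4.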

More information: man 5 hunspell, and the -m option of hunspell.

Note: for languages with fogemorphemes (linking elements), break
points are skipped only in the last stems for now, because the
Hunspell morphological analysis output is incomplete for them.

Change-Id: I739908716d11a9c2db0c9d36fba8657ba6f53bee
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/161498
Tested-by: Jenkins
Reviewed-by: László Németh <nemeth@numbertext.org>
2024-01-01 00:42:35 +01:00
..
config Load the locales from config file for languagetool 2023-10-22 19:02:06 +02:00
source tdf#158885 sw: don't hyphenate right after a stem boundary 2024-01-01 00:42:35 +01:00
IwyuFilter_lingucomponent.yaml
Library_guesslang.mk
Library_hyphen.mk
Library_LanguageTool.mk Use officecfg instead of SvxLanguageToolOptions 2023-03-09 19:36:57 +00:00
Library_lnth.mk
Library_MacOSXSpell.mk
Library_numbertext.mk
Library_spell.mk
Makefile
Module_lingucomponent.mk Fix --disable-curl build 2023-09-14 12:55:33 +02:00
README.md
StaticLibrary_ulingu.mk

Linguistics Components

lingucomponent contains spellcheck, hyphenator, thesaurus, etc.