office-gobmx/writerfilter
László Németh 692bc46b25 tdf#146140 sw DOCX import: fix moveFrom regression with broken text content
(Also a small clean-up: increase the character limit for tracked text
moving detection. Only deletions of 2 or more (non-whitespace) characters
are checked for it, because single characters are often typos or
control-like characters, e.g. a soft hyphen, not real text moves.)

Details of the regression: commit d32d9a2b3c
"tdf#123460 DOCX track changes: moveFrom completely" fixed
the missing redline import of the end of the moved paragraphs,
but paragraph end was imported as w:del, not w:moveFrom explicitly.
From commit f51fa75344
"tdf#145718 sw, DOCX import: complete tracked text moving"
this resulted in two deletions (a moved one and a plain one) instead of
the previous single one.

Moreover, exporting these double deletions at the same position to
ODT raised a backward-compatibility issue with broken text content, see
tdf#107292 (solved recently, but not in older LibreOffice versions).

Removing the explicit w:del code path in writerfilter solved
the regression from commit f51fa75344
"tdf#145718 sw, DOCX import: complete tracked text moving".

See also commit 9e1e88ad5c
"tdf#145720 DOCX export: fix loss of tracked moving".

Change-Id: I15bfc83b87dd42a762ff84edf5bae765fe02a5ae
Reviewed-on: https://gerrit.libreoffice.org/c/core/+/126631
Tested-by: Jenkins
Reviewed-by: László Németh <nemeth@numbertext.org>
2021-12-11 10:32:31 +01:00
documentation
inc
qa RTF import: handle \snext 2021-11-09 08:22:41 +01:00
source tdf#146140 sw DOCX import: fix moveFrom regression with broken text content 2021-12-11 10:32:31 +01:00
util
CppunitTest_writerfilter_dmapper.mk
CppunitTest_writerfilter_filters_test.mk
CppunitTest_writerfilter_misc.mk
CppunitTest_writerfilter_rtftok.mk RTF import: handle \snext 2021-11-09 08:22:41 +01:00
CustomTarget_source.mk
IwyuFilter_writerfilter.yaml
Library_writerfilter.mk Generally determine Rdb content from gb_*_set_componentfile calls 2021-12-10 08:14:24 +01:00
Makefile
Module_writerfilter.mk
README.md

Import Filters for LibreOffice Writer

The writerfilter module contains import filters for Writer, using its UNO API.

It provides the import filters for DOCX and RTF.

  • Module contents

    • documentation: RNG schema for the OOXML tokenizer, etc.
    • inc: module-global headers (can be included by any files under source)
    • qa: cppunit tests
    • source: the filters themselves
    • util: UNO passive registration config
  • Source contents

    • dmapper: the domain mapper, hiding UNO from the tokenizers, used by DOCX and RTF import
      • In dbgutil builds, the incoming traffic of dmapper can be dumped into an XML file in /tmp. To enable this, start soffice with the SW_DEBUG_WRITERFILTER=1 environment variable.
    • filter: the UNO filter service implementations, invoked by UNO and calling the dmapper + one of the tokenizers
    • ooxml: the DOCX tokenizer
    • rtftok: the RTF tokenizer
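
The dmapper dump mentioned under "Source contents" is switched on purely through the environment, so it can be scripted. The following is a minimal sketch, assuming a dbgutil build whose `soffice` binary is on the PATH. The helper names `writerfilter_debug_env` and `launch_soffice` are hypothetical and not part of LibreOffice; only the SW_DEBUG_WRITERFILTER variable comes from this README.

```python
import os
import subprocess


def writerfilter_debug_env(base_env=None):
    """Return a copy of the environment with the dmapper dump enabled.

    Hypothetical helper: it only sets SW_DEBUG_WRITERFILTER=1, the variable
    named in this README, and leaves the caller's mapping untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["SW_DEBUG_WRITERFILTER"] = "1"
    return env


def launch_soffice(document, soffice="soffice"):
    """Start soffice on a DOCX or RTF file with the dump enabled.

    Assumes `soffice` resolves to a dbgutil build; in other builds the
    variable is expected to have no effect.
    """
    return subprocess.Popen([soffice, document], env=writerfilter_debug_env())
```

Calling `launch_soffice("test.docx")` would then import the file with the dump active. The resulting XML file should appear in /tmp, as described above; its exact name is not stated in this README.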