Support for Emscripten Cross Build
This subdirectory provides support for building LibreOffice as WASM, with the Emscripten toolchain.
You can build LibreOffice for WASM for two separate purposes: 1) to produce a WASM binary of LibreOffice as such, using Qt5 for its GUI, or 2) to compile just LibreOffice core ("LibreOffice Technology") to WASM without any UI, for use in other software that provides the UI, like Collabora Online built as WASM.
The first purpose was the original reason for the WASM port and this document was originally written with that in mind. For the second purpose, look towards the end of the document for the section "Building headless LibreOffice as WASM for use in another product".
Status of LibreOffice as WASM with Qt
The build generates a Writer-only LO build. You should be able to run any of
$ emrun --serve_after_close instdir/program/qt_soffice.html
$ emrun --serve_after_close workdir/LinkTarget/Executable/qt_vcldemo.html
$ emrun --serve_after_close workdir/LinkTarget/Executable/qt_wasm-qt5-mandelbrot.html
REMINDER: Always start new tabs in the browser; a reload might fail or hit the cache! INFO: recent browsers no longer work with 0.0.0.0 and need 127.0.0.1 instead.
Setup for the LO WASM build (with Qt)
We're using Qt 5.15.2 with Emscripten 3.1.46. There are a bunch of Qt patches to fix the most grave bugs. Also there's rapid development in Emscripten, so using another version often causes arbitrary problems.
- See below under Docker build for another build option
Setup emscripten
https://emscripten.org/docs/getting_started/index.html
git clone https://github.com/emscripten-core/emsdk.git
./emsdk install 3.1.46
./emsdk activate 3.1.46
Example bashrc scriptlet:
EMSDK_ENV=$HOME/Development/libreoffice/git_emsdk/emsdk_env.sh
[ -f "$EMSDK_ENV" ] && \. "$EMSDK_ENV" 1>/dev/null 2>&1
Setup Qt
https://doc.qt.io/qt-5/wasm.html
Most of the information from https://doc.qt.io/qt-6/wasm.html is still valid for Qt5; generally the Qt6 WASM documentation is much better, because it incorporates a lot of information from the Qt Wiki.
FWIW: Qt 5.15 LTS is not maintained publicly and Qt WASM has quite a few bugs. Most WASM fixes from Qt 6 are needed for Qt 5.15 too. Allotropia offers a Qt repository with the necessary patches cherry-picked.
With "-opensource -confirm-license" you agree to the open source license.
git clone https://github.com/allotropia/qt5.git
cd qt5
git checkout v5.15.2+wasm
./init-repository --module-subset=qtbase
./configure -opensource -confirm-license -xplatform wasm-emscripten -feature-thread -prefix <whatever> QMAKE_CFLAGS+=-sSUPPORT_LONGJMP=wasm QMAKE_CXXFLAGS+=-sSUPPORT_LONGJMP=wasm
make -j<CORES> module-qtbase
Optionally you can add the configure flag "-compile-examples". But then you also have to patch at least mkspecs/wasm-emscripten/qmake.conf with EXIT_RUNTIME=0, otherwise the examples will fail to run. In addition, the build will break on some of the examples, but at that point Qt itself and most of the examples already work. Or just skip them. Other interesting flags might be "-nomake tests -no-pch -ccache".
Linking takes quite a long time, because emscripten-finalize rewrites the whole WASM files with some options. This way the LO WASM possibly needs 64GB RAM. For faster link times add "-s WASM_BIGINT=1", change to ASSERTIONS=1 and use -g3 to prevent rewriting the WASM file and generating source maps (see emscripten.py, finalize_wasm, and avoid modify_wasm = True). This is just needed for Qt examples, as LO already uses the correct flags!
Qt5 needs to be installed to the chosen prefix; otherwise LO won't find all the needed files in the right place. For installation you can do
make -j<CORES> install
or
make -j8 -C qtbase/src install_subtargets
Current Qt fails to start the demo webserver: https://bugreports.qt.io/browse/QTCREATORBUG-24072
Use emrun --serve_after_close to run Qt WASM demos.
Setup LO
autogen.sh is patched to use emconfigure. That basically sets various
environment vars, especially EMMAKEN_JUST_CONFIGURE, which will create the
correct output file names, checked by configure (a.out).
There's a distro config for WASM, but it just provides --host=wasm32-local-emscripten, which should be enough setup. The build itself is a cross build and the cross-toolset just depends on a minimal toolset (gcc, libc-dev, flex, bison); everything else is built from source, because the final result does not depend on the build system at all.
Recommended configure setup is thusly:
- grab defaults
  --with-distro=LibreOfficeWASM32
- local config
  QT5DIR=/dir/of/qt5/install/prefix
- if you want to use ccache on both sides of the build
  --with-build-platform-configure-options=--enable-ccache
  --enable-ccache
FWIW: it's also possible to build an almost static Linux LibreOffice by just using --disable-dynloading --enable-customtarget-components. System externals are still linked dynamically, but everything else is static.
"Deploying" soffice.wasm
tar -chf wasm.tar --xform 's/.*program/lo-wasm/' instdir/program/soffice.* \
instdir/program/qt*
Your HTTP server needs to provide additional headers:
- add_header Cross-Origin-Opener-Policy same-origin
- add_header Cross-Origin-Embedder-Policy require-corp
The default html to use should be qt_soffice.html
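The two headers above enable cross-origin isolation, which browsers require before they expose SharedArrayBuffer to a threaded WASM build. As a minimal sketch for local testing only, assuming Node.js is available and the tarball was unpacked into a directory named lo-wasm (the LO_WASM_ROOT variable and all function names are made up for this example), a static server that adds them could look like this:

```javascript
const http = require("node:http");
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical location of the unpacked wasm.tar contents.
const ROOT = path.resolve(process.env.LO_WASM_ROOT || "lo-wasm");

const MIME = {
  ".html": "text/html",
  ".js": "text/javascript",
  ".wasm": "application/wasm",
};

// The two headers named in this section, as a Node.js header object.
function isolationHeaders() {
  return {
    "Cross-Origin-Opener-Policy": "same-origin",
    "Cross-Origin-Embedder-Policy": "require-corp",
  };
}

// Map a request URL to a file below ROOT; returns null for anything outside it.
function resolveRequest(url) {
  let rel;
  try {
    rel = decodeURIComponent(url.split("?")[0]).replace(/^\/+/, "");
  } catch (e) {
    return null; // malformed percent-encoding
  }
  const file = path.resolve(ROOT, rel || "qt_soffice.html");
  return file.startsWith(ROOT + path.sep) ? file : null;
}

function createLoServer() {
  return http.createServer((req, res) => {
    const file = resolveRequest(req.url);
    if (!file || !fs.existsSync(file) || !fs.statSync(file).isFile()) {
      res.writeHead(404, isolationHeaders());
      res.end("not found\n");
      return;
    }
    const type = MIME[path.extname(file)] || "application/octet-stream";
    res.writeHead(200, { ...isolationHeaders(), "Content-Type": type });
    fs.createReadStream(file).pipe(res);
  });
}

// To serve on http://127.0.0.1:8080/ uncomment:
// createLoServer().listen(8080, "127.0.0.1");
```

A request for / is answered with qt_soffice.html, matching the default named above. For a real deployment the same two headers would be set in the production web server's configuration instead.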
Debugging setup
For some time now, Chrome has been able to debug WASM using the DWARF information that LLVM embeds into it. You need to enable an experimental feature and install an additional extension. The whole setup is described in:
https://developer.chrome.com/blog/wasm-debugging-2020/
This way you don't need source maps (much faster linking!) and can resolve local WASM variables to C++ names!
Per default, the WASM debug build splits the DWARF information into an additional WASM file, postfixed '.debug.wasm'.
Using Docker to cross-build with emscripten
If you prefer a controlled environment that is easy to replicate across different machines (sadly emsdk install/activate is not stable over time, as e.g. nodejs versions evolve), consider the docker images we're providing.
For the config/setup file, see https://git.libreoffice.org/lode/+/ccb36979563635b51215477455953252c99ec013
Run
docker-compose build
in the lode/docker dir to get the container prepared. Run
PARALLELISM=4 BUILD_OPTIONS= BUILD_TARGET=build docker-compose run --rm \
-e PARALLELISM -e BUILD_TARGET -e BUILD_OPTIONS builder
to perform an actual srcdir != builddir build; the container mounts the
checked-out git repo and output dir via docker-compose.yml (so make
sure the path names there match your setup):
The lode setup expects, inside the lode/docker subdir, the following directories:
- core (git checkout)
- workdir (the output dir - gets written into)
- cache (ccache tree)
- tarballs (external project tarballs get written and cached there)
Ideas for an UNO bridge implementation
My post to Discord #emscripten:
"I'm looking for a way to do an abstract call
from one WASM C++ object to another WASM C++ object, so like FFI / WebIDL,
just within WASM. All my code is C++ and normally I have bridge code, with
assembler to implement the function call /RTTI and exception semantics of the
specified platform. Code is at
https://cgit.freedesktop.org/libreoffice/core/tree/bridges/source/cpp_uno.
I've read a bit about call_indirect and stuff, but I don't have yet a good
idea, how I could implement this (and there is an initial feature/wasm branch
for the interested). I probably need some fixed lookup table, like on iOS,
because AFAIK you can't dynamically generate code in WASM. So any pointers or
ideas for an implementation? I can disassemble some minimalistic WASM example
and read clang code for WASM_EmscriptenInvoke, but if there were some
standalone code or documentation I'm missing, that would be nice to know."
We basically would go the same way as the other backends. Write the bridge in
C++, which is probably largely boilerplate code, but the function call in WAT
(https://github.com/WebAssembly/wabt) based on the LLVM WASM calling
conventions in WASM_EmscriptenInvoke. I didn't get a reply to that question for
hours. Maybe I'll open an Emscripten issue, if we really have to implement
this.
WASM dynamic dispatch:
UNO bindings with Embind
Right now there's a very rough implementation in place, with lots of different bits unimplemented, and it might be leaking memory. I.e., lots of room for improvement! ;)
Some usage examples of the current implementation from JavaScript:
// inserts a string at the start of the Writer document.
const uno = init_unoembind_uno(Module);
const css = uno.com.sun.star;
const xModel = Module.getCurrentModelFromViewSh();
const xTextDocument = css.text.XTextDocument.query(xModel);
const xText = xTextDocument.getText();
const xTextCursor = xText.createTextCursor();
xTextCursor.setString("string here!");
// changes each paragraph of the Writer document to a random color.
const uno = init_unoembind_uno(Module);
const css = uno.com.sun.star;
const xModel = Module.getCurrentModelFromViewSh();
const xTextDocument = css.text.XTextDocument.query(xModel);
const xText = xTextDocument.getText();
const xEnumAccess = css.container.XEnumerationAccess.query(xText);
const xParaEnumeration = xEnumAccess.createEnumeration();
while (xParaEnumeration.hasMoreElements()) {
    const next = xParaEnumeration.nextElement();
    const xParagraph = css.text.XTextRange.query(next.get());
    const xParaProps = css.beans.XPropertySet.query(xParagraph);
    const color = new Module.uno_Any(Module.uno_Type.Long(), Math.floor(Math.random() * 0xFFFFFF));
    xParaProps.setPropertyValue("CharColor", color);
    next.delete();
    color.delete();
}
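Since the Any objects returned by nextElement() and created with new Module.uno_Any(...) have to be deleted by hand, an exception thrown between creation and delete() would skip the cleanup. One possible way to guard against that is a try/finally wrapper. The helper below, named scoped here, is made up for this example; it is not part of the bindings and only assumes the delete() method already used above:

```javascript
// Hypothetical helper, not part of the LibreOffice Embind bindings.
// Runs body(track); every object passed through track() gets delete()
// called afterwards, in reverse order of tracking, even if body throws.
function scoped(body) {
  const owned = [];
  const track = (obj) => {
    owned.push(obj);
    return obj;
  };
  try {
    return body(track);
  } finally {
    while (owned.length > 0) {
      owned.pop().delete();
    }
  }
}

// Possible use in the paragraph loop (commented out, since it needs a
// running LO WASM instance providing Module, css and xParaEnumeration):
//
// while (xParaEnumeration.hasMoreElements()) {
//     scoped((track) => {
//         const next = track(xParaEnumeration.nextElement());
//         const xParagraph = css.text.XTextRange.query(next.get());
//         const xParaProps = css.beans.XPropertySet.query(xParagraph);
//         const color = track(new Module.uno_Any(
//             Module.uno_Type.Long(), Math.floor(Math.random() * 0xFFFFFF)));
//         xParaProps.setPropertyValue("CharColor", color);
//     });
// }
```

Whether such a wrapper is worth it depends on how often the calling code can throw; for short scripts the explicit delete() calls shown above are just as good.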
Tools for problem diagnosis
nm -s should list the symbols in the archive, based on the index generated by ranlib. Use it if you get linking errors complaining that an archive has no index.
Emscripten filesystem access with threads
This is closed, but not really fixed IMHO:
Dynamic libraries / modules in emscripten
There is a good summary in:
Summary: you can't use modules and threads.
This is mentioned at the end of:
The usage of MAIN_MODULE and SIDE_MODULE has other problems, a major one IMHO is symbol resolution at runtime only.
So this works really more like plugins in the sense of symbol resolution without dependencies / rpath.
There is some clang-level dynamic-linking in progress (WASM dlload). The following link is already a bit old, but I found it a good summary of problems to expect:
Mixed information, links, problems, TODO
More info on Qt WASM emscripten pthreads:
WASM needs -pthread at compile time, not just at link time, for atomics support. Alternatively you can provide
-s USE_PTHREADS=1, but neither seems to work reliably on its own, so best provide both.
https://github.com/emscripten-core/emscripten/issues/10370
The output file must have the suffix .o, otherwise the WASM files will get a
node.js shebang (!) and ranlib won't be able to index the library (link errors).
Qt with threads has a further memory limit. From Qt configure:
Project MESSAGE: Setting PTHREAD_POOL_SIZE to 4
Project MESSAGE: Setting TOTAL_MEMORY to 1GB
You can actually allocate 4GB:
LO uses a nested event loop to run dialogs in general, but that won't work, because you can't drive the browser event loop like VCL does with the system event loop in the various VCL backends. Changing this will need some major work (basically dropping Application::Execute).
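To illustrate the constraint: in the browser, code has to return to the event loop and be called back later instead of spinning a loop of its own. The sketch below shows that shape in plain JavaScript; the dialog object with show() and onClose is purely hypothetical and does not correspond to any VCL or LibreOffice API.

```javascript
// Hypothetical dialog: show() displays it, and the UI later calls
// dialog.onClose(result) from an ordinary browser event handler.
function executeDialogAsync(dialog) {
  return new Promise((resolve) => {
    // Register the continuation first, then show the dialog and return
    // immediately, so the browser event loop keeps running.
    dialog.onClose = (result) => resolve(result);
    dialog.show();
  });
}

// A caller cannot block until the dialog is done; it continues in a
// callback or after an await:
//
// const result = await executeDialogAsync(dialog);
```

Code that expects a dialog to run to completion before the call returns would have to be restructured along these lines, which is part of why the change is described as major work.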
But with the known problems with exceptions and threads, this might change:
- https://github.com/emscripten-core/emscripten/pull/11518
- https://github.com/emscripten-core/emscripten/issues/11503
- https://github.com/emscripten-core/emscripten/issues/11233
- https://github.com/emscripten-core/emscripten/issues/12035
We're also using emconfigure at the moment. Originally I patched emscripten, because it
wouldn't create the correct a.out file for C++ configure tests. Later I found that
emconfigure sets EMMAKEN_JUST_CONFIGURE to work around the problem.
ICU bug:
Alternative, probably:
There is a wasm64, but that still uses 32bit pointers!
Old outdated docs:
Reverted patch:
Generally https://emscripten.org/docs/porting:
- https://emscripten.org/docs/porting/guidelines/api_limitations.html#api-limitations
- https://emscripten.org/docs/porting/files/file_systems_overview.html#file-system-overview
- https://emscripten.org/docs/porting/pthreads.html
- https://emscripten.org/docs/porting/emscripten-runtime-environment.html
This will be interesting:
This didn't help much yet:
Emscripten supports standalone WASI binaries:
- https://github.com/emscripten-core/emscripten/wiki/WebAssembly-Standalone
- https://www.qt.io/qt-examples-for-webassembly
- http://qtandeverything.blogspot.com/2017/06/qt-for-web-assembly.html
- http://qtandeverything.blogspot.com/2020/
- https://emscripten.org/docs/api_reference/Filesystem-API.html
- https://discuss.python.org/t/add-a-webassembly-wasm-runtime/3957/12
- http://git.savannah.gnu.org/cgit/config.git
- https://webassembly.org/specs/
- https://developer.chrome.com/docs/native-client/
- https://emscripten.org/docs/getting_started/downloads.html
- https://github.com/openpgpjs/openpgpjs/blob/master/README.md#getting-started
- https://developer.mozilla.org/en-US/docs/WebAssembly/Using_the_JavaScript_API
- https://github.com/bytecodealliance/wasmtime/blob/main/docs/WASI-intro.md
- https://www.ip6.li/de/security/x.509_kochbuch/openssl-fuer-webassembly-compilieren
- https://emscripten.org/docs/introducing_emscripten/about_emscripten.html#about-emscripten-porting-code
- https://emscripten.org/docs/compiling/Building-Projects.html
Building headless LibreOffice as WASM for use in another product
Set up Emscripten
Follow the instructions in the first part of this document.
No Qt needed.
You don't need any dependencies other than those that normally are downloaded and compiled when building LibreOffice.
Set up LO
For instance, this autogen.input works for me:
--disable-debug
--enable-sal-log
--disable-crashdump
--host=wasm32-local-emscripten
--disable-gui
--with-main-module=writer
For building LO core for use in COWASM, Emscripten 3.1.30 is known to work (and not just 2.0.31, which is what the LO+Qt5 work has been using in the past).
That's all
After all, in this case you are building LO core headless for it to be used by other software.
Note that a soffice.wasm will be built, but that is just because of how the makefilery has been set up. We do need the soffice.data file that contains the in-memory file system needed by the LibreOffice Technology core code during run-time, though. That is at the moment built as a side-effect when building soffice.wasm.