Records the uno commands from different instances of ChildSession and
dumps the last 4 uno commands into the crashlog during a fatal crash
Signed-off-by: Gopi Krishna Menon <krishnagopi487.github@outlook.com>
Change-Id: I838f71769dc08df7076c040f3d72c15f7607e9d3
Using for fuzzing and integration testing.
With unit-tests.
Change-Id: I23f8c619e239310d92c74c4d5e4157afb52a5e56
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
std::atoi() assumes a null-terminated string and our strings are not
always null-terminated. So add a version that takes a length parameter,
this way we don't have to copy strings around.
Also switch to this in http::StatusLine::parse().
Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
Change-Id: I449b356c1b9948c562434618596e8e3b38656088
To differentiate between non-printable data
and no-data, we use '.' for non-printables
and print nothing visible (i.e. whitespace)
when we run out of data. This makes the hex
dumps more readable.
Change-Id: I8eeb78ab72d63ed613b7c330949063c0cb8cbfca
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
And guard http data dumping with debug directives.
Change-Id: I22a725ba49bfb0399a27889ce9732dfe061e2563
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
The fuzzer ran out of memory, 955443527 bytes (79%) of the used memory
was this map.
Change-Id: I2dd84a094d3dd3d98618667e3c78591e2193bce2
Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
On non-Linux systems we should default to std:🧵:id
which needs to be serialized using ostream interface.
While Util::getThreadId does specialize for Linux, the
code using it doesn't always handle the different return
types.
While std:🧵:id is the standard interface to the
thread ID, using such abstraction has proven to be costly
when converting the thread ID on each and every log via
ostringstream (due to the cost of memory allocation).
In practice Linux is the primary and so far only platform,
so the getThreadId is optimized for it. Other systems
can either use the default std:🧵:id, or can also
specialize as necessary.
Change-Id: I91cf279a8fdff12636a534957db5069dee51bd65
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
This replaces Util::getFileTimestamp with
FileUtil::Stat::modifiedTimepoint() and fixes a potential bug:
getFileTimestamp had only 1 second precision (it simply dropped
sub-second data). This could mean that any modifications to a file
within a second could not be detected.
Minor simplifications done where possible and overly long lines
have been reformatted.
This is a non-functional change (except that file modified-time
now supports microsecond precision).
Change-Id: I3606638a86fc3e00c0ad5cb602bdbb2b4651867b
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
In the old code, if the evaluation first allocates the memory for the
raw pointer, then calls firstLine() and an exception is thrown before
the std::unique_ptr construction, then the memory is leaked. Using
make_unique() has the benefit of avoiding this problem.
Convert only a single usage, so the remaining places can be done as easy
hacks.
Change-Id: Iaf3d8051a8a0627a57fdf1196bde7d5f8612fcff
Our header parses was overly simplistic and
didn't support a number of corner cases that
rfc2616 specifies (folding, for example). The
new approach is to simply normalize the headers by
removing invalid line-breaks and then let the
MessageHeader parser take care of parsing the
headers individually, which we then set on the request.
The new utility setHttpHeaders should be used
whenever we need to set a header in an request
to make sure it are sanitized and valid.
Change-Id: Ifa16fa9364f42183316749276c5d0a4c556cb740
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/96371
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Ashod Nakashian <ash@collabora.com>
loolmount now works and supports mounting and
unmounting, plus numerous improvements,
refactoring, logging, etc.. When enabled,
binding improves the jail setup time by anywhere
from 2x to orders of magnitude (in docker, f.e.).
A new config entry mount_jail_tree controls
whether mounting is used or the old method of
linking/copying of jail contents. It is set to
true by default and falls back to linking/copying.
A test mount is done when the setting is enabled,
and if mounting fails, it's disabled to avoid noise.
Temporarily disabled for unit-tests until we can
cleanup lingering mounts after Jenkins aborts our
build job. In a future patch we will have mount/jail
cleanup as part of make.
The network/system files in /etc that need frequent
refreshing are now updated in systemplate to make
their most recent version available in the jails.
These files can change during the course of loolwsd
lifetime, and are unlikely to be updated in
systemplate after installation at all. We link to
them in the systemplate/etc directory, and if that
fails, we copy them before forking each kit
instance to have the latest.
This reworks the approach used to bind-mount the
jails and the templates such that the total is
now down to only three mounts: systemplate, lo, tmp.
As now systemplate and lotemplate are shared, they
must be mounted as readonly, this means that user/
must now be moved into tmp/user/ which is writable.
The mount-points must be recursive, because we mount
lo/ within the mount-point of systemplate (which is
the root of the jail). But because we (re)bind
recursively, and because both systemplate and
lotemplate are mounted for each jails, we need to
make them unbindable, so they wouldn't multiply the
mount-points for each jails (an explosive growth!)
Contrarywise, we don't want the mount-points to
be shared, because we don't expect to add/remove
mounts after a jail is created.
The random temp directory is now created and set
correctly, plus many logging and other improvements.
Change-Id: Iae3fda5e876cf47d2cae6669a87b5b826a8748df
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/92829
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
The access_header can contain a lot of nonsense, like whitespace around
or additional \n's or \r's. We used to sanitize that, but then
regressed in e95413d151 where the
"tokenize by any of \n\r" was by mistake replaced with "tokenize by
string '\n\r'".
Unfortunately the unit test didn't uncover that, and the further
refactorings of the related code have hidden that even more.
Change-Id: Ie2bf950d0426292770b599e40ee2401101162ff2
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/96638
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Andras Timar <andras.timar@collabora.com>
The tokenizer(s) are more generic than the protocol
logic, and are used from contexts that don't involve
the protocol as such.
Change-Id: Ie8c256bf11a91e466bff794021f41603c9596a7f
More readable and typically more efficient.
Change-Id: I9bd5bfc91f4ac255bb8ae0987708fb8b56b398f8
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/95285
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Sometimes kit process goes into a heavy processing state (or even hangs)
and is not able to report its memory usage. Thus we can't implement cleanup
of problematic kit processes based on memory information reported by kit.
By moving memory reporting to admin module we avoid this problem.
Change-Id: Icf274e3a3a97b33623a93f9d2dc1e640ad9b7d99
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/92752
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
LibreOffice core uses that, too, and we support an even more
restricted set of compilers.
Change-Id: I0d0e2c8608e323eb5ef0f35ee8c46d02ab49a745
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/92467
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Tor Lillqvist <tml@collabora.com>
The bulk of this commit just changes std::vector<std::string> to
StringVector when we deal with tokens from a websocket message.
The less boring part of it is the new StringVector class, which is a
wrapper around std::vector<std::string>, and provides the same API,
except that operator[] returns a string, not a string&, and this allows
returning an empty string in case that prevents reading past the end of
the underlying array.
This means in case client code forgets to check size() before invoking
operator[], we don't crash. (See the ~3 previous commits which fixed
such crashes.)
Later the ctor could be changed to take a single underlying string to
avoid lots of tiny allocations, that's not yet done in this commit.
Change-Id: I8a6082143a8ac0b65824f574b32104d7889c184f
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/89687
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
- target ClientSession::_handleInput(), since crashing there would bring
down the whole loolwsd (not just a kit process), and it deals with
input from untrusted users (browsers)
- add a --enable-fuzzers configure switch to build with
-fsanitize=fuzzer (compared to normal sanitizers build, this is the only
special flag needed)
- configuring other sanitizers is not done automatically, either use
--with-sanitizer=... or the environment variables from LODE's sanitizer
config
- run the actual fuzzer like this:
./clientsession_fuzzer -max_len=16384 fuzzer/data/
- note that at least openSUSE Leap 15.1 sadly ships with a clang with
libfuzzer static libs removed from the package, so you need a
self-built clang to run the fuzzer (either manual build or one from
LODE)
- <https://chromium.googlesource.com/chromium/src/testing/libfuzzer/+/refs/heads/master/efficient_fuzzing.md#execution-speed>
suggests that "You should aim for at least 1,000 exec/s from your fuzz
target locally" (i.e. one run should not take more than 1 ms), so try
this minimal approach first. The alternative would be to start from the
existing loolwsd_fuzzer binary, then step by step cut it down to not
fork(), not do any network traffic, etc -- till it's fast enough that
the fuzzer can find interesting input
- the various configurations start to be really complex (the matrix is
just very large), so try to use Util::isFuzzing() for fuzzer-specific
changes (this is what core.git does as well), and only resort to ifdefs
for the Util::isFuzzing() itself
Change-Id: I72dc1193b34c93eacb5d8e39cef42387d42bd72f
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/89226
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Using double caused all sorts of rounding issues,
especially with random unit-test failures.
Luckily, we don't need doubles and can do everything
with integers.
Also added a new function to print time_point as
iso8601 string, for logging and convenience.
Change-Id: I1c2040c02d1143282dbde0dadef32613b77c330d
Reviewed-on: https://gerrit.libreoffice.org/81578
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
With password-protected files, the first loading attempt
always fails due to missing password. At that point the
client is notified of the missing password and the user
is prompted. The second attempt includes a (hopefully)
correct password and the document loading commences.
Due to the fact that an exception is raised when
the loading fails, this left the loading latch
triggered, which blocked subsequent attempts.
Change-Id: I7cc257a36eb1cc080f460aac8cdb7030783a5914
Util added getHttpTime
WhiteBoxTests added test for getHttpTime
Change-Id: Ifb6a3fb2dc9b059b925e7b881362b72759a8b56b
Reviewed-on: https://gerrit.libreoffice.org/79754
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Michael Meeks <michael.meeks@collabora.com>
Avoids a warning when compiling for iOS: format specifies type
'unsigned long' but the argument has type 'uint64_t' (aka 'unsigned
long long').
Change-Id: I8b5205dd0c3a8ae2f531f1647b3e3bac27ea6065
Reviewed-on: https://gerrit.libreoffice.org/78985
Reviewed-by: Tor Lillqvist <tml@collabora.com>
Tested-by: Tor Lillqvist <tml@collabora.com>
Added functions to get file timestamp and to convert
chrono timestamp in ISO8601 fraction format and some
test cases.
Change-Id: I58961a31f7262b367cff9f33cffdec7571a2f8f7