With the use of a single flag for both, the
logic is now less ambiguous, as we cannot have
termination flagged without also implying
shutting down.
The assertions are no longer needed.
Now that setting the termination flag
explicitly implies having the shut down flag
as well, the checks are simpler. We only
need to check that the shutdown is not set
to continue running as normal, since having
the termination flag must perfoce mean shut
down is also set, there is no need to check
both.
Change-Id: I99e22f5668385182b0594040a8e3354b55e74642
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
Emscripten is very opinionated about sys/poll.h so use poll.h instead.
Signed-off-by: Michael Stahl <michael.stahl@allotropia.de>
Change-Id: I9691519e27a080f03a19f0cc0dd8f796fe323062
This adds support for code-coverage HTML reporting.
To achieve this, we must use file-linking in jails
so that we can update the coverage data (.gcda files)
from the jails. This means that creating jails is
slower than with bind-mounting and we need to
account for that in our timeouts.
We also can't kill child processes with SIGKILL,
which is un-catchable. Instead, we use SIGTERM
and dump the profile data before exiting.
Change-Id: I16fa534f6ed42f7133014d841bb024423315e0a4
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
Explicit is always better. We also need to
terminate more gracefully when profiling.
Change-Id: I7145cb59583c5d7c6362bbf9c74e9d21799eaa33
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
We now have USR2 signal that dumps the
stack-trace of each process. This is useful for
capturing the state of misbehaving instances.
COOLWSD forwards USR2 to forkit and the kits
so they dump their stacktraces too.
This patch does not change the behavior of USR1.
Specifically, unlike USR2, USR1 is not forwarded
from wsd to frk and the kits. Also unlike USR2,
USR1 dumps into stderr.
Change-Id: I1d82f678f30f430f627692cc42961b1928f69e11
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
SIGUSR2 can now be used to dump the
stacktrace of coolwsd, forkit, and the
kit processes.
Also, support writing signalLog to files.
Although we write to stderr, we normalize
the interface used for signal logging and
allow for writing to any file descriptor.
Change-Id: If6366bb6ddbd9f8863baca52e4f65ebb468dc1f1
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
This replaces the hardcoded STDERR_FILENO
to allow for writing to any file, including
stdout or a redirection to disk, if needed.
Change-Id: I76f6461f46fd4bc035fcf643d01e60c2e3236894
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
In preparation to log to a file in the jail.
This will allow for including the log in the
log file, thus capturing all output from our
thread-group into the same log file.
Change-Id: Ia5c4ed35786d28f5d45f3065919d53f2c8492cb0
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
This simplifies the signal handling setup.
Change-Id: Id121a9df45fc11bfdea627f9828e0b624b1b2284
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
This is to ensure that libgcc is loaded and
backtrace is available during signal handling.
Change-Id: I5bb36b69401dbedf4c991ba7d60d2e806441a625
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
We should always set the shutdown flag first.
Otherwise, we run afoul of a race condition.
Change-Id: Ic99793d68b3b943496ff932b4bdafd336fef7f82
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
With --unattended, do not wait for a debugger
upon seg-faulting. This avoids the unnecessary wait
that prolongs failed unit-tests in automated runs.
Now run_unit.sh and Cypress Makefile set this flag.
Note that the wait only happens when in debug
builds, or when envar COOL_DEBUG is set. This
prevents us from waiting when running a debug
build where we can't see the output, or indeed
the run is on a CI build machine.
This flag can also be used by devs when reproducing
failures where there is no interest in attaching
a debugger. The logs are shorter and more
readable, too. At least in trace level.
Change-Id: Ice15482c6724abc47f5955402295198eb7f671ee
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
Based on previous crashes, it is useful to have more granularity
on what happened last.
Change-Id: If18a3a4d7817be23a6f8aadd301827a8e1bc007e
Signed-off-by: Michael Meeks <michael.meeks@collabora.com>
Certain parts of the code assume ShutdownRequested is set
when relying on the Termination flag. This is because
they serve as degrees not as independent flags.
The Termination flag is basically a more impatient
ShutdownRequested flag. It is used to forcefully
and immediately terminate. It is not designed to be
used independently from ShutdownRequested.
This pair is a good candidate for unifying as a single enum.
Change-Id: I8e3913a1959868197d8c5a059e89cbdbc6cef070
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
During startup we wait for extended time until
a child process is up and running. In case this
takes a long time, or indeed if forking permanently
fails, we simply don't respond to shutdown requests.
This patch makes it possible to wait for about 20
seconds at a time for a child. This way on average
we will exit the process within about 10 seconds,
assuming we are blocked waiting for a child that
will never spawn.
Change-Id: I4409cbc60aa3c7bd30970d4638c820bc581b65ba
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
(cherry picked from commit 6ed8b4bb1a6532d947a6bb00f936183767b54a80)
Assuming that the intention is to output a pair of strings on start and
a pair of string on finish.
Signed-off-by: Miklos Vajna <vmiklos@collabora.com>
Change-Id: I1ab3b03c353a8d0875e2fee451ca34965fc3038e
Hopefully helps to hunt segv's more effectively in development.
Signed-off-by: Michael Meeks <michael.meeks@collabora.com>
Change-Id: Ife800f9902fe8d95711b3b14867ee21bb3c3f21b
Records the uno commands from different instances of ChildSession and
dumps the last 4 uno commands into the crashlog during a fatal crash
Signed-off-by: Gopi Krishna Menon <krishnagopi487.github@outlook.com>
Change-Id: I838f71769dc08df7076c040f3d72c15f7607e9d3
It was a request an approved by @kendy despite the signal
handler limitations. After several tries, we were blind
when we introduce the commit using the heap.
This reverts commit 9720eb0f81.
Most C and Posix API clobber errno. By failing to save
it immediately after invoking an API we risk simply
reporting the result of an arbitrary subsequent API call.
This adds LOG_SYS_ERRNO to take errno explicitly.
This is necessary because sometimes logging is not done
immediately after calling the function for which we
want to report errno. Similarly, log macros that log
errno need to save errno before calling any functions.
This is necessary as the argements might contain calls
that clobber errno.
This also converts some LOG_SYS entries to LOG_ERR
because there can be no relevant errno in that context
(f.e. in a catch clause).
A couple of LOG_ macros have been folded into others,
reducing redundancy.
Finally, both of these log macros append errno to the
log message, so there is little point in ending the
messages with a period.
Change-Id: Iecc656f67115fec78b65cad4e7c17a17623ecf43
Signed-off-by: Ashod Nakashian <ashod.nakashian@collabora.co.uk>
loolmount now works and supports mounting and
unmounting, plus numerous improvements,
refactoring, logging, etc.. When enabled,
binding improves the jail setup time by anywhere
from 2x to orders of magnitude (in docker, f.e.).
A new config entry mount_jail_tree controls
whether mounting is used or the old method of
linking/copying of jail contents. It is set to
true by default and falls back to linking/copying.
A test mount is done when the setting is enabled,
and if mounting fails, it's disabled to avoid noise.
Temporarily disabled for unit-tests until we can
cleanup lingering mounts after Jenkins aborts our
build job. In a future patch we will have mount/jail
cleanup as part of make.
The network/system files in /etc that need frequent
refreshing are now updated in systemplate to make
their most recent version available in the jails.
These files can change during the course of loolwsd
lifetime, and are unlikely to be updated in
systemplate after installation at all. We link to
them in the systemplate/etc directory, and if that
fails, we copy them before forking each kit
instance to have the latest.
This reworks the approach used to bind-mount the
jails and the templates such that the total is
now down to only three mounts: systemplate, lo, tmp.
As now systemplate and lotemplate are shared, they
must be mounted as readonly, this means that user/
must now be moved into tmp/user/ which is writable.
The mount-points must be recursive, because we mount
lo/ within the mount-point of systemplate (which is
the root of the jail). But because we (re)bind
recursively, and because both systemplate and
lotemplate are mounted for each jails, we need to
make them unbindable, so they wouldn't multiply the
mount-points for each jails (an explosive growth!)
Contrarywise, we don't want the mount-points to
be shared, because we don't expect to add/remove
mounts after a jail is created.
The random temp directory is now created and set
correctly, plus many logging and other improvements.
Change-Id: Iae3fda5e876cf47d2cae6669a87b5b826a8748df
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/92829
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Seems to not cause any serious regressions in the iOS app or in "make
run", but of course I am not able to run a comprehensive check of all
functionality.
Change-Id: I44a0e8d60bdbc0a885db88475961575c5e95ce88
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/93037
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Tor Lillqvist <tml@collabora.com>
More readable and typically more efficient.
Change-Id: I9bd5bfc91f4ac255bb8ae0987708fb8b56b398f8
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/95285
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Jenkins
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
LibreOffice core uses that, too, and we support an even more
restricted set of compilers.
Change-Id: I0d0e2c8608e323eb5ef0f35ee8c46d02ab49a745
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/92467
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Tor Lillqvist <tml@collabora.com>
1) Don't actually kill anything with the kill command, otherwise kill(0,
SIGKILL) will kill the fuzzer itself.
2) Don't require a valid signature when authenticating with JWT, since
the private key is generated on each process startup.
3) Log when the JWT would be invalid due to an expired timestamp.
Change-Id: I0da285617e27910329c0e7ed80a6d02e86344ccf
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/91737
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Miklos Vajna <vmiklos@collabora.com>
ignoring the segv can lead to not making progress, while churning debug.
Change-Id: I97af266cec3feefe2dcbd9adb8dbf4b13a4d69bd
Reviewed-on: https://gerrit.libreoffice.org/c/online/+/87002
Tested-by: Jenkins CollaboraOffice <jenkinscollaboraoffice@gmail.com>
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
It is painful to check and search manually the PID to attach the LOKit
process when exists several pre-spawned waiting to load a document.
This patch helps to attach the debugger when the LOKit process is about
to load a document then send the "signal SIGUSR1" to resume it.
Change-Id: I3b15bd522c6ef3ef57dc3453b457dcf91f2661b9
Reviewed-on: https://gerrit.libreoffice.org/85430
Reviewed-by: Michael Meeks <michael.meeks@collabora.com>
Tested-by: Henry Castro <hcastro@collabora.com>
Reviewed-by: Henry Castro <hcastro@collabora.com>
This is the cleanest way to achieve the goal
of immediately exiting a child. This is used
for cleaning up kit instances when closing
docs, as well as in unit-tests.
Change-Id: I76870234b130a508044044b102419646abe81ac8
Reviewed-on: https://gerrit.libreoffice.org/83699
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
malloc is not signal safe, and must not be called
from signal-safe functions. If malloc itself signals,
calling it in the signal handler can deadlock.
Luckily, we only needed malloc for getting the
backtrace strings. Now we just write directly to
stderr, which is faster, cleaner, and safer.
Change-Id: I54093f45e05f2a0fd3c5cde0cc2104ffe6d81d2a
Reviewed-on: https://gerrit.libreoffice.org/83151
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>