Forkit always spawns a single child.
This is done to validate that forking
children is working and to be ready to
serve clients.
However, this initial forking can be slow,
mostly due to cold file reading and loading.
This often causes multiple child spawning
and other undesirable effects.
This patch makes sure that a single child
is always at the ready before proceeding
while simplifying the code. Otherwise, if
forkit fails to fork a single child within
4x an expected command-timeout (currently 5
seconds) then we fail the service altogether.
Change-Id: Ie2a441a2479db98a54f7fb9b9c8e98cda2f07c1c
Reviewed-on: https://gerrit.libreoffice.org/33128
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
When a client disconnects, we autosave only when there
are no other sessions. The idea being that it'd be
wasteful to save on every client disconnect, so long
that we have a later chance of saving.
This doesn't hold, however, when the only other
remaining session is one that had just connected
and hasn't loaded a view yet. WSD wouldn't
know about this, but when unloading the
disconnecting session, Kit will exit the process
as the last view is gone, losing all unsaved changes.
This patch tracks the sessions in WSD that have
loaded a view and takes that into account when
deciding to autosave on disconnect or not, thereby
fixing this corner case that may result in data loss.
Change-Id: I699721f2d710128ed4db65943b9357dff54380f5
Reviewed-on: https://gerrit.libreoffice.org/33126
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
In some cases when the last session is destroying
and we expect the DocumentBroker to be removed,
while waiting for autosave, a new client might
connect to the very same document.
In those cases we shouldn't fail but should retry
loading the document again once it has been unloaded.
Change-Id: I97ebb73991b9739cfb4e93b7de9115bdb48fda7a
Reviewed-on: https://gerrit.libreoffice.org/33125
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Before setting up the socket where we listen for client requests, try
connecting to the same port. If that succeeds, another loolwsd process
is already running and listening on that port. That is obviously
undesirable.
Yes, there is a race condition if multiple loolwsd processes are
started simultaneously and do this check before any of them have
actually created the socket. Live with it. Multiple loolwsd processes
is a problem that happens accidentally for developers only anyway. In
a production environment systemd takes care of having just one, I
hope.
Thanks to Kendy for the idea.
Change-Id: Ifdde83472f9a56e592ec5dc7649dd7706efc2f7c
The server tells the client the hash of each tile it sends (calculated
from the contents of the tile, not its PNG encoding). When the client
asks for a tile to be refreshed, it tells the server what the hash of
the existing tile is. If the server notices that the tile contents
hasn't actually changed, it doesn't PNG encode it and doesn't send it
to the client.
The intent is that this will reduce load on the server and also avoid
unnecessary tile traffic.
Change-Id: Ia06ca68655ea984ed4319f24f4470afda322eccf
Prespawning proactively is optimistic, but is not critical.
When getting a space child, more will be spawn perforce,
so prespawnChildren shouldn't block.
Change-Id: I60cc8c1ab87cba384ebc9aca9e79b89f69a99252
Reviewed-on: https://gerrit.libreoffice.org/32858
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Previously tilecombine had its own version, which is
nonesensical, since it's not really a tile.
Now it passes the version to the tiles when
parsing and serializes version per-tile.
Change-Id: I5db8d94880431e3d2a40b6787c6fe51a05771305
Reviewed-on: https://gerrit.libreoffice.org/32633
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
There is no way to let the user of document currently being
opened, in case of failure, know that disk is low on space.
We check the disk space when forking children after which we try
to alert all users but this would end up doing nothing for
current document because document broker is not registered at
this time (we iterate through doc brokers when alerting). Another
conditional disk check is performed just before opening the
document but this is performed only if last disk check was
performed greater than 60 seconds which would never be the case
because document open is always preceded by a child fork (when
rebalancing children).
Lets not cache the disk check when forking the children to
prevent above mentioned situation while still minimizing the
number of disk checks performed.
Change-Id: Id3add998f94e23f9f8c144f09e5efe9f0b63821c
Since we always need to set the thread-pool size
anyway, we cannot have 'unlimited' connections.
Actually, we never did, so that was misleading
in configure.ac anyway.
The current defaults are 20 connections and
10 documents, instead of the previous 1024
connections.
The reason for this "low" limit is to
enable unittesting these limits automatically
for the default configure.
There is also a lower-limit (needed by unittests
and internal technical requirements) of 3 connections
and 2 documents.
Change-Id: I6ccf3a607c50bb2a86bf1c0a16ebb6326ee34c7d
Reviewed-on: https://gerrit.libreoffice.org/32712
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
In the rare event that it fails, we will
cleanup the DocBrokers correctly.
Change-Id: I6f5346c9ec5021bdadc8a01ed389951ca18d07a3
Reviewed-on: https://gerrit.libreoffice.org/32676
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
When saving we need to differentiate between no-op
and failure. The lastSaveTime must always be updated
when saving didn't fail (i.e. no modification or saved).
Change-Id: I0e2455afac22c82f0b623f9441fbc0bca8a7cb83
Reviewed-on: https://gerrit.libreoffice.org/32629
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
While time_t is much simpler, it's too
opaque. The new chrono library is type-safe
and does conversion correctly, as well as
guarantees monotonity and other desirable
properties.
Change-Id: Id41c44c397a31d73e894e8f1715ff18f2b67df53
Reviewed-on: https://gerrit.libreoffice.org/32627
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Since we already enumerate the DocumentBrokers and
remove dead ones before every autosave (every 30 seconds)
as well as when loading/unloading documents, there
is little reason to have a separate timeout and loop
for idle documents.
In addition, the previous code only killed the child
but didn't remove the dead DocumentBroker entry from
DocBrokers. Not clear if this was intentional, but it
would have been removed on the next cleanup anyway.
Change-Id: Ie1e09c77133165dd1255930b0b0be06fde7646b1
Reviewed-on: https://gerrit.libreoffice.org/32626
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Between waits on forkit we shouldn't spend too much time.
This is to recover from forkit crashes.
Now we first spawn children, and only when not successful
(i.e. no more spare children needed) we check for autosave
and DocBrokers cleanup. Only when none of the above is done
do we sleep.
This gives the best balance between forkit waits and reduces
the unittests by a good 25 seconds (crash tests down from 34s
to about 10s only).
Change-Id: If69284746859bc78d14f1c6eda07aef4e006709e
Reviewed-on: https://gerrit.libreoffice.org/32624
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Otherwise we throttle spawning to allow
time to fork the process. If we don't wait
we can bomb a loaded server and bring it down.
However during startup this is not necessary
as there are no in-flight spawn requests.
Change-Id: I1beac94571f6d8d96136d32c81310bea6547242b
Reviewed-on: https://gerrit.libreoffice.org/32622
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
And say 'forkit' in the logs where we recognize it
instead of the generic 'child'.
Change-Id: I7628b064bb6330db145a948640e48b727def3270
Reviewed-on: https://gerrit.libreoffice.org/32619
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
And check for save-failed first before finding
the session, as otherwise the session is useless.
Change-Id: I3eb45e54872547eb36765b2c8409e1aa51aee589
Reviewed-on: https://gerrit.libreoffice.org/32611
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Avoid unnecessary code under lock
or when DocBrokers is updated,
which requires removal on exception.
Change-Id: Id01aed42cd66616b910b7e16a8a1ed6c1d1e74b9
Reviewed-on: https://gerrit.libreoffice.org/32558
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
... and print error in the logs. So that if there is some problem
in the setup and we are not able to interact with WOPI host for some
reason, we print it in the error logs clearly.
Convert to new style logging - LOG_XXX macros.
Change-Id: I53ce4f61136ffd256b0eae8712dc7f22a620e2bf
Simplifies the code and makes it less error prone.
Also add more logging on modifying DocBrokers.
Change-Id: I861495912eb4994a32b1ccaccc181fb1aad26e1f
Reviewed-on: https://gerrit.libreoffice.org/32295
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The WS could be chocked on write, but we shouldn't
assume the child is dead because of that. We are
only trying to test if the child process is alive
in that helper.
Change-Id: I108a297e43f923cab0dfa30204837dc4df15d3a1
Reviewed-on: https://gerrit.libreoffice.org/32289
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Since the local path already ends in foreslash,
no need to keep the on in provided by the client.
Change-Id: Ia2bc24c7faa84509f9ec18deefb14cad2858e856
Reviewed-on: https://gerrit.libreoffice.org/32288
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
For now, do the check only when a new session connects to the
document, because at that point we fetch the document information (in
separate function for WOPI and local files) and construct the
FileInfo, including timestamp.
For now, just log an ERR message if we notice that the document in its
storage system (WOPI or local file system) has an unexpected last
modified time. What should we do? If we don't have unsaved changes,
most likely we should just silently reload the document and force all
sessions to refresh. But if we have unsaved changes, and the document
has changed underneath, we have a problem.
We need to fetch the timestamp also also after saving ("persisting")
as we can't assume that the clock on the machine running loolwsd and
that of the storage (as reported in the WOPI case in CheckFileInfo)
are in synch. (Assuming separate machines, they certainly won't ever
exactly in synch, but aren't necessarily even just a few seconds apart
(think incorrectly set up timezone etc), so no amount of tolerance in
the comparison would be good enough, because after all, it might be
that in the problematic cases we are looking for the timestamps also
are separated by a quite short time.)
Yes, this means there is a race condition; what if the document is
modified behind out back right after we have persisted it, before we
ask for its timestamp? It would be much better if the persisting
operation atomically also told what the timestamp of the document in
the storage is after persisting, but alas, WOPI doesn't do that.
Rename the DocumentBroker::origDocumentLastModifiedTime field to
_documentLastModifiedTime as that is less misleading. It is not the
"original" document timestamp but the timestamp of the document in its
storage system.
This needs much more work: Ideally the timestamp of the document in
its storage system should be retrieved and checked against the
expected value also before we are about to save it.
But unfortunately experience has shown that the WOPI CheckFileInfo
operation can be expensive, so we'll see what can be done. Ideally
WOPI should contain the optional functionality to return an error if,
when saving a document, its timestamp (and size?) in storage are not
what the saving client expects.
Also add a few FIXME comments.
Change-Id: I5a9b55d4b55a8db0c9ee8638edd368dc0aa325d5
The information in StorageBase::_fileInfo is not expensive to
construct. What might be expensive we were doing anyway for each time
a session connects to the document. So just update _fileInfo whenever
we fetch currrent information about the document in its storage, which
is useful, as the timestamp and size after all will change whenever
the document is persisted.
Change-Id: I173394c88b4d6448ad5bf1ab9b41694cffdf1ff4
And not as the current time. The getLocalFileInfo() and
getWOPIFileInfo() functions will reset it anyway.
Change-Id: Ief821482cf789113cecd85e5d59afecf6ad2c3ab
It would be another thing if it would be abstract and overridden in
the concrete derived classes. But now it just returns the _fileInfo
field that is set by the separate getLocalFileInfo() and
getWOPIFileInfo() functions. Possibly this should be re-designed
and/or stuff re-named for clarity.
Change-Id: I30d01277ac0c6bd4b9daa317aade319b6ef39342
Poco::Timestamp::fromEpochTime() is a static member function that
returns a Timestamp.
Let's hope this doesn't introduce any regression.
Change-Id: I0997c4c3128ec07d5db76dbf3ecc388cd19ac2ac
Initial effort was setting it when we are compiling with
--enable-debug or when not, unit tests are running, but since we don't
run the unit tests in absence of --enable-debug anyways,
there is no use for this #else code block.
Change-Id: I9d7c66a18be160a47afd2bf336d89d50d4f2463e
loleaflet_logging defaults to true with compiled with
--enable-debug otherwise false.
Browser will print additional debug info when this property is
set to true.
Change-Id: Id9fabf134bd8d19fa1a09ca8c0987df46d4f1a4c
By that time the loleaflet code has already done the greying-out (this
happens at the latest after 10 minutes of inactivity) and marked them
as 'inactive'. Bluntly closing the session seems to work out fine, no
new horribly alarming message is displayed by the loleaflet code in
addition to the already visible 'Inactive document ...' one.
Change-Id: I420f0d7530194e6c8d6f34d7985ab810cde5a76a
To avoid degrading performance for everyone
because of a single slow/bad connection, we
send data to clients each in its own thread.
Change-Id: I6f980c25a404c4d05bcdb1979849ea3d2776c7b9
Reviewed-on: https://gerrit.libreoffice.org/31984
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The call to the setRaw() virtual function during construction will not
dispatch to derived class, so the ctor is correct only in case there are
no subclasses.
Change-Id: I484d276dd05e53211b513373100b70586a5857c0
using these WOPI protocol extensions: DisablePrint,
DisableExport, DisableCopy. All assumed to be false if not
provided.
Change-Id: I415597d710107f9d8cbb8757f361365cc2a88eb1
We want document owners (sessions with their WOPI UserId =
OwnerId) close the document (for all sessions) only if
EnableOwnerTermination property is specified. Note that
EnableOwnerTermination is an extension to WOPI, and not part of
original WOPI standard.
Change-Id: I7237d172c4c54477572bc132336910250af075a3
The pool is checked for expansion before processing
items from the queue. By tracking the number of idle
thread (i.e. not currently sending data) we ensure
that there is at least one idle thread before
we invoke the socket.
If there is not enough idle threads at that point,
a new thread is spawned, so long as we're below the
limit. This guarantees that even if all the active
threads block on the socket, we'd always have one
more to process new data (until we reach the limit,
which is as many client connections as we have).
Technically, a single slow connection could
still monopolize all connections if there are
many messages to be sent to it. For that we'd
need to track and assign one thread per connection,
something we don't currently do.
Change-Id: Ic8b5e064da068b37bcfa773005495b198763c31d
Reviewed-on: https://gerrit.libreoffice.org/31886
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This adds SenderQueue and a wrapper of messages to
send back to clients.
Currently no threading takes place, but the messages
are pumped through the queue nonetheless.
Change-Id: Id9997539c0a2a351cbf406f649c268dd3643e88e
Reviewed-on: https://gerrit.libreoffice.org/31883
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Use a new protocol message, 'resetidle' to inform Admin clients
whenever a user has done anything in a document view. This is a
message that Admin clients need to subscribe to.
Also add the current idle time for each document to the 'documents'
message.
To reduce protocol chatter, the idle time is updated at most once per
10 s.
Change-Id: I418e82b05048a3628f21dcd240ccd974b3a01356
Reviewed-on: https://gerrit.libreoffice.org/31653
Reviewed-by: Tor Lillqvist <tml@collabora.com>
Tested-by: Tor Lillqvist <tml@collabora.com>
Set a timer in loleaflet 15 minutes before access token expiry
date (access_token_ttl value) to prompt the user to save and
refresh the session.
Change-Id: I98c3e47c9b7031e29e002f653d488747b9c17df8
Reviewed-on: https://gerrit.libreoffice.org/31381
Reviewed-by: Jan Holesovsky <kendy@collabora.com>
Tested-by: Jan Holesovsky <kendy@collabora.com>
We now guarantee that forkChildren will be invoked
under DocBrokersMutex lock.
This eliminates the case when alertAllUsersInternal
is invoked when this mutex isn't locked.
Change-Id: Ibb259bbb4f380300a90ad2fc7affe6013dd71fef
Unfortunately this brings back the deadlock that
51c88c5fb7 fixed :-(
This reverts commit da3b1b208a.
Change-Id: If48c0b3ddebf3fb366786d90cb35c3c22963b1a1
Fixup the part for EMPTY invalidation as well.
N.B. Should replace 'EMPTY' with '0,0,INT_MAX,INT_MAX'
to be consistent. Since both are expected, no compat
issues should be expected.
Change-Id: I066981622b1de71e9e849530df0583291b74b804
Reviewed-on: https://gerrit.libreoffice.org/31393
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Messages larger than a certain size are preambled
with a 'nextmessage' message that hold the size
of the subsequent message.
This is a workaround to a limitation the Poco
WebSocket API where if the buffer size is
smaller than the received frame the socket
ends up in a bad state and must be closed.
Unfortunately the new API that avoids this
workaround is not yet released by Poco.
Here we minimize the need for 'nextmessage'
to truely large messages. The limit is now
raised from above 1KB to over 63KB.
We may raise this limit further, but that will
cost each socket that much dedicated buffer size.
Change-Id: I01e4c68cdbe67e413c04a9725152224a87ab8267
Reviewed-on: https://gerrit.libreoffice.org/31286
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>