bccu#1433, bccu#1757 related.
Piggyback editlock information to tile messages so that kit can
use that information to allow changing parts only for messages
with editlock.
... and decline tile render request for tile messages without editlock
information
Change-Id: I9cedb870cd977741375665cb258d04c818481a14
The tile and tilecombine messages apparently have optional
appendages at their rear ends. Not one, but two (at least).
While the first (timestamp) seems to be truly optional
(in the sense that leaving it out doesn't break anything),
the same can't be said of the second (id).
For Impress slides this id is used to identify the slide
to which the tile belongs. Or rather the slide being
rendered, as it seems meaningful only for the slide
thumbnails.
Previously the complete arguments of tile were copied
verbatim from the input to the output (i.e. back to the
client) and so any extra payload was also echoed back.
But when id is missing (when expected) loleaflet not
only fails to show these tiles (understandably), but
it also fails to show the scrollbar for said slide
thumbnails altogether!
With the new logic to move the tile communication to
the child socket instead of the clients, the arguments
are parsed and then serialized back in the response.
So all fields must be explicitly known in advance.
This change is necessary because tilecombine is broken
down into tile commands and so both share common code.
This means that echoing back the request verbatim would
break loleaflet, since the tilecombine arguments (which
are a list) are not a valid response (which has the
format of tile). So the internal representation
has to be something neutral/common.
The fix is to parse the timestamp and id only when
provided and to echo back the id only in that case.
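As an illustration of the optional-field handling described above, here is
a minimal sketch. TileDesc, parseTile and serializeTile are hypothetical
names, and the message is reduced to three fields; the real tile message
has more and the actual code may be structured differently.

```cpp
#include <optional>
#include <sstream>
#include <string>

// Hypothetical neutral representation of one tile request.
struct TileDesc
{
    int part = 0;
    std::optional<long> timestamp; // set only when the client sent it
    std::optional<int> id;         // set only when the client sent it
};

// Parse "name=value" tokens; unknown names are ignored in this sketch.
TileDesc parseTile(const std::string& args)
{
    TileDesc desc;
    std::istringstream in(args);
    std::string token;
    while (in >> token)
    {
        const auto eq = token.find('=');
        if (eq == std::string::npos)
            continue;
        const std::string name = token.substr(0, eq);
        const std::string value = token.substr(eq + 1);
        if (name == "part")
            desc.part = std::stoi(value);
        else if (name == "timestamp")
            desc.timestamp = std::stol(value);
        else if (name == "id")
            desc.id = std::stoi(value);
    }
    return desc;
}

// Serialize the response header, echoing id only if it was provided.
std::string serializeTile(const TileDesc& desc)
{
    std::ostringstream out;
    out << "tile: part=" << desc.part;
    if (desc.id)
        out << " id=" << *desc.id;
    return out.str();
}
```

A request without id round-trips without an id field, so nothing is
invented on behalf of the client.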
Change-Id: Ic97cf0de4083d412a4b3543e8b9a8713ac27a27c
Reviewed-on: https://gerrit.libreoffice.org/24669
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
For the moment, it will allow running 'make check' that does not conflict with
an already running loolwsd (e.g. from 'make run'). Later we can consider
running more tests in parallel.
So that they can run in parallel to a 'production' loolwsd (like one from
'make run' or so) in the future; for the moment it is not possible, as the
MASTER_PORT is hardcoded.
WSD's DocumentBroker and Kit's Document now handle
the communication of tiles as well as all aspects
of rendering, caching, unifying requests and
distributing results etc.
Change-Id: Ie22fbaaae26b91185ee348ec7077f30430eec0f6
Reviewed-on: https://gerrit.libreoffice.org/24640
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
...child from DocumentBroker""
Restore the communication with child from DocumentBroker.
This reverts commit 20ab6e8ae7.
Change-Id: I248bededff7074d8fb482b2cdd172048f80c02b2
Reviewed-on: https://gerrit.libreoffice.org/24639
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
For me, it is deadlocking - while we are waiting on the condition variable,
the tear down of the actual LOK / document is not progressing, because it is
waiting on something.
I was unable to get to the bottom of this in a reasonable time, so just
disabled the test for the moment; Ash is working on a similar problem I think,
so let's see if his solution helps here too (or can be applied the same way or
something).
Without this, the unit-prefork gives unpredictable results depending on
whether the entire unit test run output is redirected to another logfile or
not (as then the stdout is inherited, and points to an unexpected place).
Maybe we should just exclude the fd 0, 1, 2 from the testing; but this is good
enough for now.
UnitPrefork got what I assume is one of those PING frames that
ChildProcess::isAlive() sends before the actual reply when it sent the
"unit-memdump" message, and did not like it. Uncommenting the line
that outputs the "memory stats" message it expects showed:
Got memory stats 'PING'
Followed by:
Assertion `tokens.count() == 2' failed.
Fix by factoring out the handling of PING frames, PONG frames, and the
pseudo-PONG frames that we send ourselves in reply to PING frames into
a new function IoUtil::receiveFrame(). Use that in a couple of
places. (Probably should use in many more places.)
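The idea of hiding control frames from callers can be sketched as below.
This is not the actual IoUtil::receiveFrame(); Frame, FakeSocket and
receiveDataFrame are hypothetical stand-ins used to show the loop without
a real WebSocket. Only the opcode values come from RFC 6455.

```cpp
#include <deque>
#include <string>

// Opcodes as defined by RFC 6455.
enum class Opcode { Text = 0x1, Binary = 0x2, Close = 0x8, Ping = 0x9, Pong = 0xA };

struct Frame
{
    Opcode op;
    std::string payload;
};

// Hypothetical stand-in for a socket: frames are popped from `incoming`,
// replies are appended to `outgoing`.
struct FakeSocket
{
    std::deque<Frame> incoming;
    std::deque<Frame> outgoing;
};

// Store the next non-PING/PONG frame in `result`, answering PINGs along
// the way. Returns false when no more frames are available.
bool receiveDataFrame(FakeSocket& socket, Frame& result)
{
    while (!socket.incoming.empty())
    {
        Frame frame = socket.incoming.front();
        socket.incoming.pop_front();
        if (frame.op == Opcode::Ping)
        {
            socket.outgoing.push_back({Opcode::Pong, frame.payload});
            continue;
        }
        if (frame.op == Opcode::Pong)
            continue; // nothing to do, keep waiting for real data
        result = frame;
        return true;
    }
    return false;
}
```

With this shape, a test waiting for the memory stats reply never sees the
PING that precedes it.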
Getting past this then leads to later cppunit tests again being run,
and their failures then again showing up...
Note that we currently have unit tests that incorrectly (IMHO) require
some frames sent by the server to indeed be 'text' ones from the
WebSocket perspective. That is an unnecessary restriction.
Autosave should only save when the user has been idle
for a certain time, or the periodic autosave time elapses.
The document is considered for autosave only when it's modified.
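A minimal sketch of the decision described above. The function name and
both thresholds are hypothetical parameters, not values taken from the
code.

```cpp
#include <chrono>

using Seconds = std::chrono::seconds;

// Decide whether an autosave should be issued now.
// idleThreshold and autosavePeriod are assumed to be configurable.
bool isAutosaveDue(bool modified, Seconds idleFor, Seconds sinceLastSave,
                   Seconds idleThreshold, Seconds autosavePeriod)
{
    if (!modified)
        return false; // unmodified documents are never considered
    return idleFor >= idleThreshold || sinceLastSave >= autosavePeriod;
}
```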
Change-Id: Ia239173ff6636e52c1a2b7e1f6bf9bd6860175ed
Reviewed-on: https://gerrit.libreoffice.org/24602
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The WebSocket that each child created with WSD is not used
except to request the child to load the document a client
requests. Beyond this point, it was not utilized for anything.
In fact, there are no handlers in WSD for messages coming
from the child; it is a one-way communication.
That is until now. With the move to unify communication
between WSD and each child, DocumentBroker can now
receive and handle messages from its ChildProcess.
Change-Id: Ie7f030a92db8303cd7087fff2325f136a49bc7fc
Reviewed-on: https://gerrit.libreoffice.org/24581
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
New unittests to verify TileCache logic on the unit level.
Change-Id: Ia36181e850b349abb88ba5f04f1e5244771bacc6
Reviewed-on: https://gerrit.libreoffice.org/24574
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Indeed, tests are built when invoking make in loolwsd
directory, thereby helping catch build errors in test
before committing.
Change-Id: I34cffcb5d0aed6485e578cf20f64217bee337d23
Reviewed-on: https://gerrit.libreoffice.org/24573
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Since lokit process counting waits until WSD
is ready (i.e. until it has created lokit processes),
there is no need to sleep explicitly anymore.
Provided, of course, we always call the lokit
process counter before running any tests, which we
currently do.
Change-Id: Idf7ad925688251f1c81ef8628530714d2dc92d9c
Reviewed-on: https://gerrit.libreoffice.org/24528
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
While there are two separate callbacks registered
(one with lokit and the other with lokitDocument),
there is no reason why they should be handled
separately (and indeed differently).
The lokit callback only sends notifications on
status indicator (during loading and saving)
and document password type (if protected).
Due to the different callback handlers
the status indicator was only sent to the
first client, not all (as one expects).
Furthermore, because the lokit callback
was processed on the Core thread, it
was bound to cause performance and
thread-safety issues. Specifically, it
deadlocked if another callback was
in flight when a save issued a status
indicator callback.
By unifying the callbacks and putting
all callback messages on the message
queue we avoid all of the above and
simplify the code.
Change-Id: I5bd790b6ce88b7939186c1ec1dead7fb6cabf7e0
Reviewed-on: https://gerrit.libreoffice.org/24522
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Due to filesystem timestamp precision and other factors
we assume a timestamp match within 10 seconds to mean
the document has been recently saved and store it.
A document's modified timestamp has to be more than
10 seconds older than our last timestamp for it to
be deemed unchanged in the interim and skipped
from storing again.
Change-Id: I39b4bf64b221ba30dc7b818a330e779a2d0ecbd4
Reviewed-on: https://gerrit.libreoffice.org/24472
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
When loading a document first we set the rendering
options. Beyond that, the document is shared and
we shouldn't change the rendering options.
Change-Id: I0d2ac6fc43553b8395111ba2b8a3cc2796d2f0a4
Reviewed-on: https://gerrit.libreoffice.org/24470
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
As this assert fails at the moment (it did even before my previous
commit), I can't be 100% sure it is correct now. So sue me. Or revert
both my changes.
DocumentBroker is a central document management object.
Using it to communicate with clients leaves it open
to the whims of slow connections. When its lock
is held for a long time, all clients stall, giving
users a very poor experience.
The culprit in this case was takeEditLock, which
sent to all clients the new edit lock state.
To avoid this, DocumentBroker now sends this
state to the children (via a loopback socket),
which process messages in a separate process and
each on its own queue thread. The children then
in turn echo this edit lock state back to the
clients. This communication back is done on the
prisoner socket thread, which doesn't lock or
stall any shared objects or threads.
Change-Id: I475f6b3ecac9ae2a689bd30f43d416871aa0e384
Reviewed-on: https://gerrit.libreoffice.org/24420
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This makes for a more compact API and avoids
a race between issuing the save and waiting for it.
Also added force flag and autoSave now checks the
modified state of the document. If a document is
not modified, nor save forced, autoSave checks
the last activity on the document and only
if there is any since last save does it issue
a save command.
Change-Id: I962e36df18d7edf5f658992e97b5def5f6247dc3
Reviewed-on: https://gerrit.libreoffice.org/24382
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
When multiple users have a document open,
save notifications are broadcast to all.
Each session then tries to store the document
to the Storage when only the first should suffice.
A new file modified-time member tracks the file's
timestamp and only persists when it changes,
thereby avoiding excessive stores.
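The guard can be sketched roughly as follows. StoreGuard and shouldStore
are hypothetical names; the real member lives elsewhere and may compare
timestamps differently.

```cpp
#include <ctime>

// Remembers the modified time of the file at the last store.
class StoreGuard
{
    std::time_t _lastModifiedTime = 0;

public:
    // Returns true when the caller should store the document, and
    // remembers the timestamp so later sessions skip the same save.
    bool shouldStore(std::time_t fileModifiedTime)
    {
        if (fileModifiedTime == _lastModifiedTime)
            return false; // another session already stored this save
        _lastModifiedTime = fileModifiedTime;
        return true;
    }
};
```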
Change-Id: I138f1aa812963a2120a1fcac763dfacccc542c1a
Reviewed-on: https://gerrit.libreoffice.org/24381
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The test was unreliable, but any change there made it reliable, so not sure
yet what was the root cause - but at least this should help seeing the
brokenness once it appears again.
WebSocket::receiveFrame() does not null-terminate the buffer even when
it successfully reads something into it, even less when it
doesn't. (Why would it, as it is perfectly fine to transmit WebSocket
(binary) frames that contain zero bytes.) So the 'received' string was
always full of random bytes.
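A safe way to turn such a buffer into a string is to pass the received
length explicitly. The helper name below is hypothetical; the point is
only the two-argument std::string constructor.

```cpp
#include <cstddef>
#include <string>

// Build a string from exactly `length` bytes; never rely on a terminator.
// A non-positive length (nothing read, or an error) yields an empty string.
std::string frameToString(const char* buffer, int length)
{
    if (length <= 0)
        return std::string();
    return std::string(buffer, static_cast<std::size_t>(length));
}
```

This also preserves zero bytes inside binary frames, which a
strlen()-based conversion would truncate.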
Sessions are referenced in DocumentBroker instances,
which themselves are referenced in a container.
When exceptions are thrown either while creating a new
session, or during the lifetime of one, these references
must be correctly cleaned up, otherwise we introduce
internal instability in addition to stalling the client.
Change-Id: I3177e45564860897528da6d7fbcbe346d3bd1c75
Reviewed-on: https://gerrit.libreoffice.org/24338
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The invalidatetiles is normally a notification coming from
LOK and it signifies that the tiles in question need
rendering anew. Issuing this internally from the Kit
removes TileCache images unnecessarily.
Furthermore, since this message is always sent in response
to a useractive message, there is no need to issue it from
WSD when loleaflet is perfectly capable of issuing it
itself (internally).
Change-Id: Ia97de6d803745dca3f6e73100f2d921dbbdf76f6
Reviewed-on: https://gerrit.libreoffice.org/24316
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Use info() instead. Also, only log the info if the file actually
existed and was removed. We don't want misleading noise logging about
removing files that were not there. Use std::remove() directly to
avoid unnecessary layers that just make it harder to know whether the
removal worked or not.
All changes are supposed to be persistent. This simplifies the tile
caching code quite a lot.
The TileCache object no longer needs to keep any state whether the
document is being edited or whether it has been modified without
saving etc.
Update the modtime.txt file after saving the document. Otherwise the
tile cache would wrongly be considered invalid next time.
As a sanity check, we put a flag file 'unsaved.txt' into the cache
directory whenever we get a callback that indicates the document has
been modified, and remove it when the document is saved. If the flag
file is present when we take an existing tile cache into use, we can't
trust it.
Even after these changes, we still don't use an existing tile cache as
much (or at all?) as we could, though. The INVALIDATE_TILES EMPTY
callback that LO does early on in a connection causes us to remove all
cached tiles...
These notifications are important to be sent once the user
becomes active again to sync their view with the latest.
Change-Id: Id8f9fff83eea888cdcc8d6ed1d4f12111de39a6e
Reviewed-on: https://gerrit.libreoffice.org/24288
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This also avoids the feedback loop that results from the kit
thinking the previously inactive client is now active and
sending commands (.uno:Save).
Change-Id: I47074b35a922da15592d550032d494ba1efab83e
Reviewed-on: https://gerrit.libreoffice.org/24287
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This test requires the renderid parameter to be present in the 'tile:'
response messages, and that is the case only when ENABLE_DEBUG, so we
can run the test only in a debug build.
In a debug build it contains also the parameter renderid. (And in the
future, might be extended in other ways, too.) Construct a new request
message that has exactly and only the parameters we want.
loleaflet can now send userinactive when the user
has switched tabs or the browser window loses
focus. Similarly, it can send useractive when
focus is regained.
Change-Id: Id3186949b10a8263e29ada1a790d3123a79e8f08
Reviewed-on: https://gerrit.libreoffice.org/24272
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Will be used in unit test to verify that several clients of the same
document asking for the same tile simultaneously indeed do cause just
one tile rendering to take place.
It is still possible to access them directly via loleaflet/dist/<something>,
but such use can lead to unexpected behaviour due to various caching in the
browsers etc.
Implement this for tilecombine, and do tile writes in each client's
thread separately. Add env-var. to trigger sleep, and tune it to 1
second; easily long enough to exercise this code-path.
Keep track of tiles being rendered in TileCache, and when asked to
render the same tile as is already being rendered, just "subscribe" to
the existing ongoing rendering. When a tile has been rendered and is
being sent out to clients, check if there are "subscriptions" and send
it to them, too.
One problem is that if the client that caused a tile rendering to be
initiated goes away before the rendering has completed, it will never
complete, and the subscribers are left without the tile.
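The subscription bookkeeping described above can be sketched like this.
RenderTracker, the string tile key and the integer client ids are all
hypothetical; the real TileCache keeps its own structures.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

class RenderTracker
{
    // tile key -> clients waiting for that tile
    std::map<std::string, std::vector<int>> _pending;

public:
    // Returns true if the caller must start a rendering; false if the
    // client was subscribed to a rendering already in progress.
    bool request(const std::string& tileKey, int clientId)
    {
        const bool isNew = (_pending.find(tileKey) == _pending.end());
        _pending[tileKey].push_back(clientId);
        return isNew;
    }

    // Called once the tile is rendered: returns every client to send it
    // to (initiator first) and forgets the tile.
    std::vector<int> finished(const std::string& tileKey)
    {
        std::vector<int> clients;
        auto it = _pending.find(tileKey);
        if (it != _pending.end())
        {
            clients = std::move(it->second);
            _pending.erase(it);
        }
        return clients;
    }
};
```

Note that this sketch does not address the problem mentioned above: if
finished() is never called for a key, its subscribers wait forever.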
Change-Id: Icca237876a0f466c29eb5bf60ffd4da3d9d68600
Reviewed-on: https://gerrit.libreoffice.org/24228
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Since auto-discovery is problematic, this patch implements
support for both regex patterned hostnames/IPs to allow,
and those to block/deny.
A hostname/IP must be both allowed, and not denied, to
be accepted.
By setting ranges of allowed hostnames/IPs, and others
to block/deny, an admin can configure Online with
great flexibility.
Defaults updated with same values, but not exhaustive.
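The allow-and-not-deny rule can be sketched with std::regex as below.
The function names are hypothetical, and whether the real code matches
the whole hostname or only a part of it is an assumption here (this
sketch requires a full match).

```cpp
#include <regex>
#include <string>
#include <vector>

// True if `host` fully matches at least one of the regex patterns.
bool matchesAny(const std::vector<std::string>& patterns, const std::string& host)
{
    for (const auto& pattern : patterns)
        if (std::regex_match(host, std::regex(pattern)))
            return true;
    return false;
}

// A host is accepted only if it is allowed and not denied.
bool isHostAccepted(const std::vector<std::string>& allow,
                    const std::vector<std::string>& deny,
                    const std::string& host)
{
    return matchesAny(allow, host) && !matchesAny(deny, host);
}
```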
Change-Id: Iedfcafe41d07d905b549fb450c3fe625ad44599e
Reviewed-on: https://gerrit.libreoffice.org/24233
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The closing handshake.
Either peer can send a control frame with data containing
a specified control sequence to begin the closing handshake.
Upon receiving such a frame, the other peer sends a
Close frame in response, if it hasn't already sent one.
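The rule above amounts to a small piece of state per connection. A
minimal sketch, with hypothetical names, that only tracks whether a
Close frame still has to be sent:

```cpp
// Tracks the closing handshake for one connection.
struct CloseHandshake
{
    bool closeSent = false;
    bool closeReceived = false;

    // Local side wants to close. Returns true if a Close frame must be
    // sent now, false if one has already gone out.
    bool initiate()
    {
        if (closeSent)
            return false;
        closeSent = true;
        return true;
    }

    // The peer's Close frame arrived. Returns true if we must reply with
    // our own Close frame, false if we already sent one.
    bool onCloseReceived()
    {
        closeReceived = true;
        if (closeSent)
            return false;
        closeSent = true;
        return true;
    }

    // The handshake is complete once a Close frame went both ways.
    bool done() const { return closeSent && closeReceived; }
};
```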
When compiled with --enable-debug, when requesting a tile for part=42,
actually use part=0, and sleep five seconds before passing the
rendered tile back up. This makes it easier to debug handling of
simultaneous requests for the same tile from multiple clients.
In loolforkit, whenever we have forked a new loolkit, also check if
any previously forked children have exited. Remove the jails of
those. (The loolkit process itself does not even try to remove all of
its jail, see 3aadd910c6e32c0e557671effa5a4c606cd6e8bf.)
In order to be able to notice exited child processes in loolforkit, we
no longer can set the action for SIGCHLD to SIG_IGN. That means that
exiting loolkit processes will be in the zombie state until loolforkit
picks up their exit status. As loolforkit does this check only in
connection with forking a new child, zombie loolkit processes will
hang around for some time, until the next loolkit process is
forked. Not sure if this is a problem.
countLoolKitProcesses() in httpwstest now needs to skip zombies.
Loolwsd still takes care of removing whatever jails are left when it
finishes.
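For skipping zombies when counting processes, one option is to look at
the state field of /proc/<pid>/stat, which reads 'Z' for a zombie. The
helpers below are hypothetical and only parse an already-read line; how
countLoolKitProcesses() actually does it is not shown here.

```cpp
#include <string>

// Extract the state character from a /proc/<pid>/stat line, which has
// the form "pid (comm) state ...". Returns '\0' if the line is malformed.
char procState(const std::string& statLine)
{
    // comm may itself contain spaces or parentheses, so use the last ')'.
    const auto close = statLine.rfind(')');
    if (close == std::string::npos || close + 2 >= statLine.size())
        return '\0';
    return statLine[close + 2];
}

// True if the stat line describes a zombie process.
bool isZombie(const std::string& statLine)
{
    return procState(statLine) == 'Z';
}
```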
When inside the chroot, what we would need to do is remove everything
below / . But doing that is a bit too risky, in case some developer
screws up some detail and that code happens to run outside the chroot
after all, and the developer's machine gets trashed. So just remove
paths we can reasonably assume won't exist as global pathnames on a
developer machine: loSubPath and JAILED_DOCUMENT_ROOT.
Currently the actual complete cleanup of loolkit jails happens in
loolwsd when it is exiting. That is a bug and will have to be
fixed. It should be done in loolforkit as soon as possible after the
loolkit process has exited.
At least, that is the value of the num_prespawn_children element in
the loolwsd.xml as shipped. But maybe that is not what is meant with
"default"? It is unclear to me what the "default" attribute means.
When a new view is created on a document that is
in the process of unloading, all sorts of things
can go wrong. This is especially problematic when
the document needs to be saved before unloading,
which takes significantly longer than otherwise.
Change-Id: Ib33a18cafa9d5a3a17f6bd8c6145f9331ae54044
Reviewed-on: https://gerrit.libreoffice.org/24184
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Normally, when each client view closes, the
session count is decremented until the last
view is closed. However this doesn't work
when the kit child process terminates.
Due to a race condition between the last
client disconnecting, and the internal
structure destructing, and the next
client connecting (on the same doc),
the Admin loses track of the doc and pid.
This is an issue of assuming a document
and its pid are unique and will always
remain unchanged.
This patch adds a new API to remove a
doc and all its views unconditionally
to try to avoid the above issues.
Change-Id: I0c181260679875b0464dd9b6548b29b8d6a361f7
Reviewed-on: https://gerrit.libreoffice.org/24183
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Standardized error handling in request-handlers.
There is a new family of internal exceptions designed
to signify the type of error and how to handle it.
All handlers must throw one of those errors
and they will be translated to the correct HTTP
response when caught.
Since some requests send a response as part of their
handling (convert-to, for example) those handlers
must return a flag signaling whether or not they
sent a response. If not, HTTP OK response is sent
at the end of the handler.
To complicate things, some requests upgrade the
connection to WebSocket. In those cases errors
must be sent via the WebSocket and not as an
HTTP response. The error message sent can (and
in most cases should) be displayed to the end-user.
A new file, UserMessages.hpp, has been added to
hold user-visible messages that can be
reviewed and translated.
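The HTTP part of this scheme can be sketched as follows. The exception
names, the Handler alias and dispatch() are hypothetical; the real
exception family and the WebSocket error path are not shown.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical exception family; each type maps to one HTTP status.
struct BadRequestError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};
struct UnauthorizedError : std::runtime_error
{
    using std::runtime_error::runtime_error;
};

// A handler returns true if it already sent its own response.
using Handler = std::function<bool()>;

// Returns the HTTP status to send, or 0 if the handler already responded.
int dispatch(const Handler& handler)
{
    try
    {
        return handler() ? 0 : 200;
    }
    catch (const BadRequestError&)
    {
        return 400;
    }
    catch (const UnauthorizedError&)
    {
        return 401;
    }
    catch (const std::exception&)
    {
        return 500; // anything unexpected is an internal error
    }
}
```

Keeping the translation in one place means individual handlers only
throw and never build error responses themselves.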
Change-Id: Icc725f3313446d4514cf6d092635158ee7171f5d
Reviewed-on: https://gerrit.libreoffice.org/24133
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This ensures that bundled fonts in instdir/share end up resolved to
the same path that they were in when the forkit font config was setup.
It may also help locate other pre-inited resources.
Also copy in ~/.fonts in debug mode - can't hurt.
SocketProcessor doesn't need to take response
instance, since by the time it is called we
are already upgraded to WebSocket and it's
too late to set a request-level status.
Change-Id: Id95087e60354a50148c88427130613356679cf82
Reviewed-on: https://gerrit.libreoffice.org/24110
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Connecting to a Kit process is managed by the document broker, which does
several jobs to establish the bridge connection between the client and the
Kit process. As a result, the unit test mostly times out waiting for
messages and can fail.
connectLOKit ensures the websocket is connected to a kit process.
Some messages are not forwarded to the client session. This happens
because, by the time the client session is assigned, the prison session
is already forwarding to a peer session that has not been assigned yet.