As this assert fails at the moment (it did even before my previous
commit), I can't be 100% sure it is correct now. So sue me. Or revert
both my changes.
DocumentBroker is a central document management object.
Using it to communicate with clients leaves it open
to the whims of slow connections. When its lock
is help for a long time, all clients stall, giving
users a very poor experience.
The culprit in this case was takeEditLock, which
sent to all clients the new edit lock state.
To avoid this, DocumentBroker now sends this
state to the children (via a loopback socket,)
which process messages in a separate process and
each on its own queue thread. The children then
in turn echo this edit lock state back to the
clients. This communication back is done on the
prisoner socket thread, which doesn't lock or
stall any shared objects or threads.
Change-Id: I475f6b3ecac9ae2a689bd30f43d416871aa0e384
Reviewed-on: https://gerrit.libreoffice.org/24420
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This makes for amore compact API and avoids
a race between issuing the save and waiting for it.
Also added force flag and autoSave now checks the
modified state of the document. If a document is
not modified, nor save forced, autoSave checks
the last activity on the document and only
if there is any since last save does it issue
a save command.
Change-Id: I962e36df18d7edf5f658992e97b5def5f6247dc3
Reviewed-on: https://gerrit.libreoffice.org/24382
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
When multiple users have a document open,
save notficiations are broadcast to all.
Each session then tries to store the document
to the Storage when only the first should suffice.
A new file modified-time member tracks the file's
timestamp and only persists when it changes,
thereby avoid excessive stores.
Change-Id: I138f1aa812963a2120a1fcac763dfacccc542c1a
Reviewed-on: https://gerrit.libreoffice.org/24381
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The test was unreliable, but any change there made it reliable, so not sure
yet what was the root cause - but at least this should help seeing the
brokeness once it appears again.
WebSocket::receiveFrame() does not null-terminate the buffer even when
it successfully reads something into it, even less when it
doesn't. (Why would it, as it is perferctly fine to transmit WebSocket
(binary) frames that contain zero bytes.) So the 'received' string was
always full of random bytes.
Sessions are referrenced in DocumentBroker instances,
which themselves are referrenced in a container.
When exceptions are thrown either while creating a new
session, or during the lifetime of one, these references
must be correctly cleaned up, otherwise we introduce
internal instability in addition to stalling the client.
Change-Id: I3177e45564860897528da6d7fbcbe346d3bd1c75
Reviewed-on: https://gerrit.libreoffice.org/24338
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The invalidatetiles is normally a notification coming from
LOK and it signifies that the tiles in quesion need
rendering anew. Issuing this internally from the Kit
removes TileCache images unnecessarily.
Furthermore, since this message is always sent in response
useractive message, there is no need in issuing it from
WSD when loleaflet is perfectly capable of issuing it
itself (internally).
Change-Id: Ia97de6d803745dca3f6e73100f2d921dbbdf76f6
Reviewed-on: https://gerrit.libreoffice.org/24316
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Use info() instead. Also, only log the info if the file actually
existed and was removed. We don't want misleading noise logging about
removing files that were not there. Use std::remove() directly to
avoid unnecessary layers that just make it harder to know whether the
removal worked or not.
All changes are supposed to be persistent. This simplifies the tile
caching code quite a lot.
The TileCache object no longer needs to keep any state whether the
document is being edited or whether it has been modified without
saving etc.
Update the modtime.txt file after saving the document. Otherwise the
tile cache would wrongly be considered invalid next time.
As a sanity check, we put a flag file 'unsaved.txt' into the cache
directory whenever we get a callback that indicates the document has
been modified, and remove it when the document is saved. If the flag
file is present when we take an existing tile cache into use, we can't
trust it.
Even after these changes, we still don't use an existing tile cache as
much (or at all?) as we could, though. The INVALIDATE_TILES EMPTY
callback that LO does early on in a conection causes us to remove all
cached tiles...
These notifications are important to be sent once the user
becomes active again to sync their view with the latest.
Change-Id: Id8f9fff83eea888cdcc8d6ed1d4f12111de39a6e
Reviewed-on: https://gerrit.libreoffice.org/24288
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This also avoids the feedback loop that results from the kit
thinking the previously inactive client is now active and
sending commands (.uno:Save).
Change-Id: I47074b35a922da15592d550032d494ba1efab83e
Reviewed-on: https://gerrit.libreoffice.org/24287
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This test requires the renderid parameter to be present in the 'tile:'
response messages, and that is the case only when ENABLE_DEBUG, so we
can run the test only in a debug build.
In a debug build it contains also the parameter renderid. (And in the
future, might be extended in other ways, too.) Construct a new request
message that has exactly and only the parameters we want.
loleaflet can now send userinactive when the user
has switched tabs or the browser window loses
focus. Similarly, it can send useractive when
focus is regained.
Change-Id: Id3186949b10a8263e29ada1a790d3123a79e8f08
Reviewed-on: https://gerrit.libreoffice.org/24272
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Will be used in unit test to verify that several clients of the same
document asking for the same tile simultaneously indeed do cause just
one tile rendering to take place.
It is still possible to access them directly via loleaflet/dist/<something>,
but such use can lead to unexpected behaviour due to various caching in the
browsers etc.
Implement this for tilecombine, and do tile writes in each client's
thread separately. Add env-var. to trigger sleep, and tune it to 1
second; easily long enough to exercise this code-path.
Keep track of tiles being rendered in TileCache, and when asked to
render the same tile as is already being rendered, just "subscribe" to
the existing ongoing rendering. When a tile has been rendered and is
being sent out to clients, check if there are "subscriptions" and send
it to them, too.
One problem is that if the client that caused a tile rendering to be
initiated goes away before the rendering has completed, it will never
complete, and the subscribers are left without the tile.
Change-Id: Icca237876a0f466c29eb5bf60ffd4da3d9d68600
Reviewed-on: https://gerrit.libreoffice.org/24228
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Since auto-discovery is problematic, this patch implements
support for both regex patterned hostnames/IPs to allow,
and those to block/deny.
A hostname/IP must be both allowed, and not denied, to
be accepted.
By setting ranges of allowed hostnames/IPs, and others
to block/deny, an admin can configure Online with
great flexibility.
Defaults updated with same values, but not exhaustive.
Change-Id: Iedfcafe41d07d905b549fb450c3fe625ad44599e
Reviewed-on: https://gerrit.libreoffice.org/24233
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The closing handshake.
Either peer can send a control frame with data containing
a specified control sequence to begin the closing handshake.
Upon receiving such a frame, the other peer sends a
Close frame in response, if it hasn't already sent one.
When compiled with --enable-debug, when requesting a tile for part=42,
actually use part=0, and sleep five seconds before passing the
rendered tile back up. This makes it easier to debug handling of
simultaneous requests for the same tile from multiple clients.
In loolforkit, whenever we have forked a new loolkit, also check if
any previously forked children have exited. Remove the jails of
those. (The loolkit process itself does not even try to remove all of
its jail, see 3aadd910c6e32c0e557671effa5a4c606cd6e8bf.)
In order to be able to notice exited child processes in loolforkit, we
no longer can set the action for SIGCHLD to SIG_IGN. That means that
exiting loolkit processes will be in the zombie state until loolforkit
picks up their exit status. As loolforkit does this check only in
connection with forking a new child, zombie loolkit processes will
hang around for some time, until the next loolkit process is
forked. Not sure if this is a problem.
countLoolKitProcesses() in httpwstest now needs to skip zombies.
Loolwsd still takes care of removing whatever jails are left when it
finishes.
When inside the chroot, what we would need to do is remove everything
below / . But doing that is a bit too risky, in case some developer
screws up some detail and that code happens to run outside the chroot
after all, and the developer's machine gets trashed. So just remove
paths we can reasonably assume won't exist as global pathnames on a
developer machine: loSubPath and JAILED_DOCUMENT_ROOT.
Currently the actual complete cleanup of loolkit jails happens in
loolwsd when it is exiting. That is a bug and will have to be
fixed. It should be done in loolforkit as soon as possible after the
loolkit process has exited.
At least, that is the value of the num_prespawn_children element in
the loolwsd.xml as shipped. But maybe that is not what is meant with
"default"? It is unclear to me what the "default" attribute means.
When a new view is created on a document that is
in the process of unloading, all sorts of things
can go wrong. This is especially problematic when
the document needs to be saved before unloading,
which takes significantly longer than otherwise.
Change-Id: Ib33a18cafa9d5a3a17f6bd8c6145f9331ae54044
Reviewed-on: https://gerrit.libreoffice.org/24184
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Normally, when each client view closes, the
session count is decremented until the last
view is closed. However this doesn't work
when the kit child process terminates.
Due to a race condition between the last
client disconnecting, and the internal
structure destructing, and the next
client connecting (on the same doc),
the Admin loses track of the doc and pid.
This is an issue of assuming a document
and its pid are unique and will always
remain unchanged.
This patch adds a new API to remove a
doc and all its views unconditionally
to try to avoid the above issues.
Change-Id: I0c181260679875b0464dd9b6548b29b8d6a361f7
Reviewed-on: https://gerrit.libreoffice.org/24183
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Standardized error handling in request-handlers.
There is a new family of internal exeptions designed
to signify the type of error and how to handle it.
All handlers must throw one of those errors
and they will be translated to the correct HTTP
response when caught.
Since some requests send a response as part of their
handling (convert-to, for example) those handlers
must return a flag signlaning whether or not they
sent a response. If not, HTTP OK response is sent
at the end of the handler.
To complicate things, some requests upgrade the
connection to WebSocket. In those cases errors
must be sent via the WebSocket and not as an
HTTP response. The error message sent can (and
in most cases should) be displayed to the end-user.
A new file, UserMessages.hpp, has been added to
hold user-visible messages that can be
reviewed and translated.
Change-Id: Icc725f3313446d4514cf6d092635158ee7171f5d
Reviewed-on: https://gerrit.libreoffice.org/24133
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
This ensures that bundled fonts in instdir/share end up resolved to
the same path that they were in when the forkit font config was setup.
It may also help locate other pre-inited resources.
Also copy in ~/.fonts in debug mode - can't hurt.
SocketProcessor doesn't need to take response
instance, since by the time it is called we
are already upgraded to WebSocket and it's
too late to set a request-level status.
Change-Id: Id95087e60354a50148c88427130613356679cf82
Reviewed-on: https://gerrit.libreoffice.org/24110
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Connecting to a Kit process is managed by document broker, that it does several
jobs to establish the bridge connection between the Client and Kit process,
The result, it is mostly time outs to get messages in the unit test and it could fail.
connectLOKit ensures the websocket is connected to a kit process.
Some messages are not forwarded to the client session, this is caused
by the time the client session is assigned, the prison session,
it is already forwarding to not assigned peer session.
Sleep in both places before counting them, and catch Poco exceptions
while counting (can happen for instance when a /proc entry goes away
while we are looking for its 'comm' file).
Do not distinguish between normal shutdown or abnormal shutdown.
Also remove 'disconnect' frame to indicate normal shutdown.
Change-Id: I98fd9f5a219feb1097c57302dba14e08ad9bf143
The number of loolkit processes after running the cppunit tests should
be the same as before running them. (At least, that is the intention
at the moment, I think. If/when this changes, the test will have to be
adapted, of course.)
That global flag is checked all over the place, so setting it will
actually make the threads eventually finish. (All polling is done with
timeout, I think, and then checking TerminationFlag whenever the poll
times out.)
Sure, it would be much better to use an eventfd and poll that, too,
instead of timing out from the polls all the time to check a plain old
boolean flag.
If you can't come up with a meaningful message to use in a
CPPUNIT_ASSERT_MESSAGE(), just use CPPUNIT_ASSERT() instead. These
messages aren't intended for end-users but for developers, so it is
pointless to make them high-level and dumbed-down.
In case of abnormal termination of session from client-side,
we might still have data being processed in the kit process, for
example, during auto-save. Trying to send such data over an
expired socket towards the client would throw exceptions which
need to be handled, otherwise the auto-save process would not be
successful. Further, the unhandled exception can bring the document
broker in an unstable state with dockey still present in the
object.
Also do not send 'editlock: 0' to a websocket session which is
going to be removed because the session might have been
forcefully terminated from the client-side, in which case sending
data would go nowhere.
Change-Id: I10eb9c818bc81d4db26d5a19dc8bd44f6fbdf32c
Enforce user being 'lool' for setcap binaries loolmount and loolforkit.
Add warnings if configured without --enable-debug.
Developers should pass --enable-debug to configure.
The loolwsd process created it and opened it for reading, but nothing
opened it for writing.
There is still documentation for it in README, that needs to be either
rewritten to match reality or removed.
Comes in handly in some testing situations where you don't want to
send a signal to get loolwsd to finish. Option is present only in an
--enable-debug build.
If the first session used to save does not hold the edit-lock,
all the .uno:Save commands will get converted to dummy messages
and never reach the kit process.
Change-Id: I36cbee778f8c2c5866dcf276cf156fdf9ed8388e
It is a status code, a 2-byte unsigned integer in network byte order,
potentially followed by a textual reason. But we never include any
specific status codes in the CLOSE frames anyway. (With a Poco-based
WebSocket peer it is always WS_NORMAL_CLOSE, 1000, 0x03 0xE8.)
Since WSD now takes into account the fact that
ForKit always spawns one child, the unit-test
shouldn't expect +1 preforked children.
Change-Id: I5cbe9d817a0d2ffdf9fb0953ef85450f7b8b224f
Reviewed-on: https://gerrit.libreoffice.org/23980
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Last client disconnection now correctly issues a save
and waits for the confirmation before tearing down
the sockets, queues and threads.
Change-Id: I28c28d79a17d359e9aa1fe67b983ca9fb592b847
Reviewed-on: https://gerrit.libreoffice.org/23978
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
Connection thread should not attempt to disconnect the session,
which in turn will try to unload the document, which will
wait on the connection to destroy. The latter will never
happen since the connection destructor must, correctly,
wait for its thread to finish, which is waiting on itself now.
Since the session disconnect is already called from the session
destructor, there is no need to explicitly invoke it here.
Change-Id: Iaf9e8a10d4caa9001208084e909a14b4d4c5105e
Reviewed-on: https://gerrit.libreoffice.org/23966
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>
The sessions container already has the number of sessions.
No need for separate counters to track them.
Change-Id: I838865e2b8a843e87e81a6cc1226bcacd774b032
Reviewed-on: https://gerrit.libreoffice.org/23964
Reviewed-by: Ashod Nakashian <ashnakash@gmail.com>
Tested-by: Ashod Nakashian <ashnakash@gmail.com>