= Introduction =

The VCL scheduler handles LO's primary event queue. It is simple by design:
currently just a singly-linked list, processed in list order by priority,
using round-robin for recurring tasks.

The scheduler has the following behaviour:

B.1. Tasks are scheduled purely based on priority
B.2. Implicitly cooperative AKA non-preemptive
B.3. It's not "fair" in any way (a consequence of B.2)
B.4. Tasks are handled round-robin (per priority)
B.5. Higher priorities have lower values
B.6. A small set of priorities instead of a flexible value AKA int

There are some consequences due to this design.

C.1. Higher priority tasks starve lower priority tasks
     As long as a higher priority task is available, lower priority tasks
     are never run!
     See Anti-pattern.

C.2. Tasks should be split into sensible blocks
     If this can't really be done, process pending tasks by calling
     Application::Reschedule(). Or use a thread.

C.3. This is not an OS scheduler
     There is no real way to "fix" B.2. and B.3.
     If you need to do a preemptive task, use a thread!
     Otherwise make your task suspendable and check SalInstance::AnyInput
     or call Application::Reschedule regularly.
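
The "suspendable task" advice from C.2 and C.3 can be sketched as follows.
This is only an illustration: ChunkedSum and the input-pending callback are
hypothetical names and not part of VCL; in real code the callback would be
backed by something like SalInstance::AnyInput.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical suspendable job; NOT an actual VCL class.
class ChunkedSum
{
    std::vector<int> maData;
    std::size_t mnPos = 0;
    long mnSum = 0;

public:
    explicit ChunkedSum(std::vector<int> aData) : maData(std::move(aData)) {}

    // One invocation by the scheduler: process up to nChunk items, but
    // suspend early as soon as the (assumed) input check reports pending
    // input. Returns true while work remains, so the task stays active.
    bool Invoke(std::size_t nChunk, const std::function<bool()>& rInputPending)
    {
        for (std::size_t i = 0; i < nChunk && mnPos < maData.size(); ++i)
        {
            if (rInputPending())
                break;
            mnSum += maData[mnPos++];
        }
        return mnPos < maData.size();
    }

    long Sum() const { return mnSum; }
};
```

Each Invoke() call returns quickly, so other tasks and system events get a
chance to run between the blocks.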

= Driving the scheduler AKA the system timer =

  1. There is just one system timer, which drives the LO event loop
  2. The timer has to run in the main window thread
  3. The scheduler is run with the Solar mutex acquired
  4. The system timer is a single-shot timer
  5. The scheduler system event / message has a low system priority.
     All system events should have a higher priority.

Every time a task is started, the scheduler timer is adjusted. When the timer
fires, it posts an event to the system message queue. If the next most
important task is an Idle (AKA instant, 0ms timeout), the event is pushed to
the back of the queue, so we don't starve system messages, otherwise to the
front. This is especially important to get correct SalInstance::AnyInput
handling, as this is used to suspend long background Idle tasks.

Every time the scheduler is invoked it searches for the next task to process,
restarts the timer with the timeout for the next event and then invokes the
task. After invoking the task and if the task is still active, it is pushed
to the end of the queue and the timeout is adjusted if needed.
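
The selection and re-queue behaviour described above can be modelled with a
small toy. This is a sketch under assumptions, not the real implementation:
ToyTask and runToyScheduler are hypothetical names, timeouts and the system
timer are left out, and a task "stops" after a fixed number of invocations.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <vector>

// Toy model only: these are NOT the real VCL Task / Scheduler classes.
struct ToyTask
{
    std::string name;
    int priority;   // lower value == higher priority (B.5)
    int remaining;  // invocations left until the task stops itself
};

// Runs at most maxSteps invocations and returns the names in invoke order.
std::vector<std::string> runToyScheduler(std::list<ToyTask> queue,
                                         std::size_t maxSteps)
{
    std::vector<std::string> order;
    while (!queue.empty() && order.size() < maxSteps)
    {
        // Find the first task with the best (lowest) priority value.
        auto best = queue.begin();
        for (auto it = queue.begin(); it != queue.end(); ++it)
            if (it->priority < best->priority)
                best = it;

        ToyTask task = *best;
        queue.erase(best);

        // "Invoke" the task; it is never interrupted (B.2).
        order.push_back(task.name);
        --task.remaining;

        // A still active task goes to the end of the list, which yields
        // round-robin among tasks of the same priority (B.4).
        if (task.remaining > 0)
            queue.push_back(task);
    }
    return order;
}
```

With two tasks of equal priority the invocations alternate, while a
permanently active higher priority task keeps a lower priority one from ever
running (C.1).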

= Locking =

The locking is quite primitive: all interaction with internal Scheduler
structures is locked. This includes the ImplSchedulerContext and the
Task::mpSchedulerData, which is actually a part of the scheduler.
Before invoking the task, we have to release the lock, so others can
start new Tasks.

= Lifecycle / thread-safety of Scheduler-based objects =

A scheduler object is thread-safe in the sense that it can be associated with
any thread and any thread is free to call any function on it. The owner must
guarantee that the Invoke() function can be called while the Scheduler object
exists / is not disposed.

= Anti-pattern: Dependencies via (fine grained) priorities =

"Idle 1" should run before "Idle 2", therefore give "Idle 1" a higher priority
than "Idle 2". This only works correctly for low frequency idles, but otherwise
always breaks!

If you have some longer work - even if it can be split into schedulable,
smaller blocks - you normally don't want to schedule it with a non-default
priority, as it starves all lower priority tasks. Even if a block was processed
in "Idle 1", it is scheduled with the same (higher) priority again. Changing
the "Idle" to a "Timer" also won't work, as this breaks the dependency.

What is needed is task based dependency handling, so if "Task 1" is done, it
has to start "Task 2" and if "Task 1" is started again, it has to stop
"Task 2". This currently has to be done by the implementor, but this feature
can be added to the scheduler reasonably.
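
A minimal sketch of such hand-written dependency handling is shown here.
DependentTask and its member names are hypothetical and only model the
start / stop rules from the paragraph above; the real Task class has no such
child link.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical helper, NOT part of the VCL Task API.
class DependentTask
{
    std::string maName;
    DependentTask* mpChild;   // task depending on this one, may be nullptr
    bool mbActive = false;

public:
    explicit DependentTask(std::string aName, DependentTask* pChild = nullptr)
        : maName(std::move(aName)), mpChild(pChild) {}

    // (Re)starting a task stops the task that depends on it.
    void Start()
    {
        mbActive = true;
        if (mpChild)
            mpChild->Stop();
    }

    void Stop() { mbActive = false; }

    // Called once the task has finished its work: start the dependent task.
    void Done()
    {
        mbActive = false;
        if (mpChild)
            mpChild->Start();
    }

    bool IsActive() const { return mbActive; }
    const std::string& Name() const { return maName; }
};
```

Both tasks can then use the same default priority, as the ordering is
expressed by the dependency and not by the priority.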

= Implementation details =

== General: main thread deferral ==

Currently for Mac and Windows, we run main thread deferrals by disabling the
SolarMutex using a boolean. In the case of the redirect, this makes
tryToAcquire and doAcquire return true or 1, while a release is ignored.
Also the IsCurrentThread() mutex check function will act accordingly, so all
the DBG_TESTSOLARMUTEX won't fail.

Since we just disable the locks when we start running the deferred code in the
main thread, we won't let the main thread run into stuff, where it would
normally wait for the SolarMutex.

Eventually this will move into the GenericSolarMutex. KDE / Qt also does main
thread redirects using Qt::BlockingQueuedConnection.

== General: processing all current events for DoYield ==

This is easily implemented for all non-priority queue based implementations.
Windows and MacOS both have a timestamp attached to their events / messages,
so simply get the current time and just process any event with an earlier
timestamp. For the KDE backend this is already the default behaviour - single
event processing isn't even supported. The headless backend accomplishes this
by just processing a copy of the list of current events.
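
The snapshot idea used by the headless backend can be sketched like this.
ToyEventQueue is a hypothetical stand-in, not the actual backend code, and it
takes the snapshot by swapping the list out instead of copying it.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy event queue; NOT the real headless backend implementation.
class ToyEventQueue
{
    std::vector<std::function<void()>> maEvents;

public:
    void Post(std::function<void()> aEvent)
    {
        maEvents.push_back(std::move(aEvent));
    }

    // Processes only the events that were queued when the call started.
    // Events posted by the handlers stay queued for the next call.
    std::size_t ProcessCurrentEvents()
    {
        std::vector<std::function<void()>> aCurrent;
        aCurrent.swap(maEvents);   // snapshot of the current events
        for (auto& rEvent : aCurrent)
            rEvent();
        return aCurrent.size();
    }

    std::size_t Pending() const { return maEvents.size(); }
};
```

As handlers only ever append to the live list, a handler that keeps posting
new events can't turn a single yield into an endless loop.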

Problematic in this regard is the Gtk+ backend. g_main_context_iteration
dispatches "only those highest priority event sources". There is no real way
to tell, when these became ready. I've added a workaround idea to the TODO
list. FWIW: Qt runs just a single timer source in the glib main context,
basically the same we're doing with the LO scheduler as a system event.

The gen X11 backend has some levels of redirection, but needs quite some work
to get this fixed.

== MacOS implementation details ==

Generally the Scheduler is handled as expected, except on resize, which is
handled with different runloop-modes in MacOS. In case of a resize, the normal
runloop is suspended in sendEvent, so we can't call the scheduler via posted
main loop-events. Instead the scheduler uses the timer again.

Like the Windows backend, all Cocoa / GUI handling also has to be run in
the main thread. We're emulating Windows out-of-order PeekMessage processing,
via a YieldWakeupEvent and two conditionals. When in a RUNINMAIN call, all
the DBG_TESTSOLARMUTEX calls are disabled, as we can't release the SolarMutex,
but we can prevent running any other SolarMutex based code. Those wakeup
events must be ignored to prevent busy-locks. For more info read the "General:
main thread deferral" section.

We can neither rely on MacOS dispatch_sync code block execution nor the
message handling, as both can't be prioritized or filtered and the first
also doesn't allow nested execution and is just processed in sequence.

There is also a workaround for a problem with pushing tasks to an empty queue,
as [NSApp postEvent: ... atStart: NO] doesn't append the event, if the
message queue is empty.

Probably that's the reason, why some code comments spoke of lost events and
there was some distinct additional event processing implemented.

== Windows implementation details ==

Posted or sent event messages often trigger processing of WndProc in
PeekMessage, GetMessage or DispatchMessage, independently from the message to
fetch, remove or dispatch ("During this call, the system delivers pending,
nonqueued messages..."). Additionally messages have an inherited priority
based on the function used to generate them. Even if WM_TIMER should have been
the lowest prio, a posted WM_TIMER is processed with the prio of a posted
message.

Therefore the current solution always starts a (threaded) timer even for the
instant Idles and syncs to this timer message in the main dispatch loop.
Using SwitchToThread(), this seems to work reasonably well.

An additional workaround is implemented for the delayed queuing of posted
messages, where PeekMessage in WinSalTimer::Stop() won't be able to remove the
just posted timer callback message. We handle this by adding a timestamp to
the timer callback message, which is checked before starting the Scheduler.
This way we can end up with multiple timer callback messages in the queue,
which we previously asserted against.
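
The timestamp check can be illustrated with a platform independent toy model.
Everything here is an assumption made for the sketch: ToyTimer, its members
and the manually advanced clock are hypothetical, and the exact comparison
used by WinSalTimer may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>

// Toy model of the idea; NOT the WinSalTimer implementation.
class ToyTimer
{
    std::uint64_t mnNow = 1;         // fake microsecond clock
    std::uint64_t mnIgnoreUpTo = 0;  // messages stamped <= this are stale
    std::deque<std::uint64_t> maMessages; // timestamps of posted messages
    int mnSchedulerRuns = 0;

public:
    void AdvanceClock() { ++mnNow; }

    // The timer fired: post a callback message stamped with the current time.
    void TimerFired() { maMessages.push_back(mnNow); }

    // Stop() can't reliably remove an already posted message, so it just
    // remembers the time up to which messages have to be ignored.
    void Stop() { mnIgnoreUpTo = mnNow; }

    // Message loop: run the scheduler only for messages that are still valid.
    void ProcessMessages()
    {
        while (!maMessages.empty())
        {
            const std::uint64_t nStamp = maMessages.front();
            maMessages.pop_front();
            if (nStamp > mnIgnoreUpTo)
                ++mnSchedulerRuns;
        }
    }

    int SchedulerRuns() const { return mnSchedulerRuns; }
};
```

A message posted before Stop() is still delivered, but it is dropped by the
check, so only the message of the restarted timer runs the scheduler.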

To run the required GUI code in the main thread without unlocking the
SolarMutex, we "disable" it. For more info read the "General: main thread
deferral" section.

== KDE implementation details ==

This implementation also works as intended. But there is a different Yield
handling, because Qt's QAbstractEventDispatcher::processEvents will always
process all pending events.

= TODOs and ideas =

== Task dependencies AKA children ==

Every task can have a list of children / a child.

 * When a task is stopped, the children are started.
 * When a task is started, the children are stopped.

This should be easy to implement.

== Per priority time-sorted queues ==

This would result in an O(1) scheduler. It was used in the Linux kernel for
some time (search Ingo Molnar's O(1) scheduler). This can be a scheduling
optimization, which would prevent walking longer event lists. But probably the
management overhead would be too large, as we have many one-shot events.

To find the next task the scheduler just walks the (constant) list of priority
queues and schedules the first ready event of any queue.

The downside of this approach: Insert / Start / Reschedule (for "auto" tasks)
now need O(log(n)) to find the position in the queue of the priority.
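
A rough sketch of this idea follows, assuming a fixed number of priorities and
one std::multimap per priority, keyed by the due time. ToyPriorityQueues,
PRIORITY_COUNT and the plain integer time are hypothetical and only serve the
illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical number of priorities, index 0 being the highest priority.
constexpr std::size_t PRIORITY_COUNT = 3;

// Toy model; NOT a proposal for the real Scheduler data structures.
class ToyPriorityQueues
{
    // One queue per priority, each kept sorted by due time.
    std::array<std::multimap<std::uint64_t, std::string>, PRIORITY_COUNT>
        maQueues;

public:
    // O(log(n)) insert into the queue of the given priority.
    void Start(std::size_t nPriority, std::uint64_t nDueTime,
               const std::string& rName)
    {
        maQueues.at(nPriority).emplace(nDueTime, rName);
    }

    // Walks the constant list of queues and only looks at the head of each.
    // Returns the name of the first ready task or an empty string.
    std::string PopNextReady(std::uint64_t nNow)
    {
        for (auto& rQueue : maQueues)
        {
            auto it = rQueue.begin();
            if (it != rQueue.end() && it->first <= nNow)
            {
                std::string aName = it->second;
                rQueue.erase(it);
                return aName;
            }
        }
        return std::string();
    }
};
```

Finding the next task only inspects PRIORITY_COUNT queue heads, while every
Start() pays the logarithmic insert cost mentioned above.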

== Always process all (higher priority) pending events ==

Currently Application::Reschedule() processes a single event or "all" events,
with "all" defined as "100 events" in most backends. This already is ignored
by the KDE4 backend, as Qt defines its QAbstractEventDispatcher::processEvents
as processing all pending events (there are ways to skip event classes, but no
easy way to process just a single event).

Since the Scheduler is always handled by the system message queue, there is
really no more reason to stop after 100 events to prevent LO Scheduler
starvation.

== Drop static inherited or composed Task objects ==

The sequence of destruction of static objects is not defined. So the
destruction of a static Task can not be guaranteed to happen before the
destruction of the Scheduler. When dynamic unloading is involved, this becomes
an even worse problem. Dropping static Tasks would allow us to remove the
mbStatic workaround from the Task class.

== Run the LO application in its own thread ==

This would probably get rid of most of the MacOS and Windows implementation
details / workarounds, but is quite probably a large amount of work.

Instead of LO running in the main process / thread, we run it in a 2nd thread
and defer all GUI calls to the main thread. This way it'll hopefully not block
and can process system events.

That's just a theory - it definitely needs more analysis before even attempting
an implementation.

== Re-evaluate the MacOS ImplNSAppPostEvent ==

Probably a solution comparable to the Windows backend's delayed PostMessage
workaround using a validation timestamp is better than the current peek,
remove, re-postEvent, which has to run in the main thread.

Originally I didn't evaluate, if the event is actually lost or just delayed.

== Drop nMaxEvents from Gtk+ based backends ==

    gint last_priority = G_MAXINT;
    bool bWasEvent = false;
    // declared outside of the loop body, as the loop condition needs it
    bool bHasPending = false;
    do {
        gint max_priority;
        // query the priority of the most important ready event source
        g_main_context_acquire( NULL );
        bHasPending = g_main_context_prepare( NULL, &max_priority );
        g_main_context_release( NULL );
        if ( bHasPending )
        {
            // in glib a lower value means a higher priority
            if ( last_priority > max_priority )
            {
                // bWait is expected to come from the surrounding Yield code
                bHasPending = g_main_context_iteration( NULL, bWait );
                bWasEvent = bWasEvent || bHasPending;
            }
            else
                bHasPending = false;
        }
    }
    while ( bHasPending );

The idea is to use g_main_context_prepare and keep the max_priority as an
indicator. We cannot prevent running newer lower priority events, but we can
prevent running new higher priority events, which should be sufficient for
most stuff.

This also touches user event processing, which currently runs as a high
priority idle in the event loop.

== Drop nMaxEvents from gen (X11) backend ==

A few layers of indirection make this code hard to follow. The SalXLib::Yield
and SalX11Display::Yield architecture makes it impossible to process just the
current events. This really needs a refactoring and rearchitecture step, which
will also affect the Gtk+ and KDE4 backends for the user event handling.