office-gobmx/vcl/README.scheduler

= Introduction =

The VCL scheduler handles LOs primary event queue. It is simple by design,
currently just a single-linked list, processed in list-order by priority
using round-robin for reoccurring tasks.

The scheduler has the following behaviour:

B.1. Tasks are scheduled just priority based
B.2. Implicitly cooperative AKA non-preemptive
B.3. It's not "fair" in any way (a consequence of B.2)
B.4. Tasks are handled round-robin (per priority)
B.5. Higher priorities have lower values
B.6. A small set of priorities instead of an flexible value AKA int

There are some consequences due to this design.

C.1. Higher priority tasks starve lower priority tasks
     As long as a higher task is available, lower tasks are never run!
     See Anti-pattern.

C.2. Tasks should be split into sensible blocks
     If this can't really be done, process pending tasks by calling
     Application::Reschedule(). Or use a thread.

C.3. This is not an OS scheduler
     There is no real way to "fix" B.2. and B.3.
     If you need to do a preemptive task, use a thread!
     Otherwise make your task suspendable and check SalInstance::AnyInput
     or call Application::Reschedule regularly.


= Driving the scheduler AKA the system timer =

  1. There is just one system timer, which drives LO event loop
  2. The timer has to run in the main window thread
  3. The scheduler is run with the Solar mutex acquired
  4. The system timer is a single-shot timer
  5. The scheduler system event / message has a low system priority.
     All system events should have a higher priority.

Every time a task is started, the scheduler timer is adjusted. When the timer
fires, it posts an event to the system message queue. If the next most
important task is an Idle (AKA instant, 0ms timeout), the event is pushed to
the back of the queue, so we don't starve system messages, otherwise to the
front. This is especially important to get a correct SalInstance::AnyInput
handling, as this is used to suspend long background Idle tasks.

Every time the scheduler is invoked it searches for the next task to process,
restarts the timer with the timeout for the next event and then invokes the
task. After invoking the task and if the task is still active, it is pushed
to the end of the queue and the timeout is eventually adjusted.


= Locking =

The locking is quite primitive: all interaction with internal Scheduler
structures are locked. This includes the ImplSchedulerContext and the
Task::mpSchedulerData, which is actually a part of the scheduler.
Before invoking the task, we have to release the lock, so others can
Start new Tasks.


= Lifecycle / thread-safety of Scheduler-based objects =

A scheduler object it thread-safe in the way, that it can be associated to
any thread and any thread is free to call any functions on it. The owner must
guarantee that the Invoke() function can be called, while the Scheduler object
exists / is not disposed.


= Anti-pattern: Dependencies via (fine grained) priorities =

"Idle 1" should run before "Idle 2", therefore give "Idle 1" a higher priority
then "Idle 2". This just works correct for low frequency idles, but otherwise
always breaks!

If you have some longer work - even if it can be split by into schedulable,
smaller blocks - you normally don't want to schedule it with a non-default
priority, as it starves all lower priority tasks. Even if a block was processed
in "Idle 1", it is scheduled with the same (higher) priority again. Changing
the "Idle" to a "Timer" also won't work, as this breaks the dependency.

What is needed is task based dependency handling, so if "Task 1" is done, it
has to start "Task 2" and if "Task 1" is started again, it has to stop
"Task 2". This currently has to be done by the implementor, but this feature
can be added to the scheduler reasonably.


= Implementation details =

== MacOS implementation details ==

Generally the Scheduler is handled as expected, except on resize, which is
handled with different runloop-modes in MacOS. In case of a resize, the normal
runloop is suspended in sendEvent, so we can't call the scheduler via posted
main loop-events. Instead the scheduler uses the timer again.

Like the Windows backend, all Cocoa / GUI handling also has to be run in
the main thread. We're emulating Windows out-of-order PeekMessage processing,
via a YieldWakeupEvent and two conditionals. When in a RUNINMAIN call, all
the DBG_TESTSOLARMUTEX calls are disabled, as we can't release the SolarMutex,
but we can prevent running any other SolarMutex based code. Same for all the
SolarMutex acquire and release calls, so the calling and the main thread
don't deadlock. Those wakeup events must be ignored to prevent busy-locks.

We can neither rely on MacOS dispatch_sync code block execution nor the
message handling, as both can't be priorized or filtered and the first
does also not allow nested execution and is just processed in sequence.

There is also a workaround for a problem for pushing tasks to an empty queue,
as [NSApp postEvent: ... atStart: NO] doesn't append the event, if the
message queue is empty.

Probably that's the reason, why some code comments spoke of lost events and
there was some distinct additional event processing implemented.

== Windows implementation details ==

Posted or sent event messages often trigger processing of WndProc in
PeekMessage, GetMessage or DispatchMessage, independently from the message to
fetch, remove or dispatch ("During this call, the system delivers pending,
nonqueued messages..."). Additionally messages have an inherited priority
based on the function used to generate them. Even if WM_TIMER should have been
the lowest prio, a posted WM_TIMER is processed with the prio of a posted
message.

Therefore the current solution always starts a (threaded) timer even for the
instant Idles and syncs to this timer message in the main dispatch loop.
Using SwitchToThread(), this seem to work reasonably well.

== KDE implementation details ==

This implementation also works as intended. But there is a different Yield
handling, because Qts QAbstractEventDispatcher::processEvents will always
process all pending events.


= TODOs and ideas =

== Task dependencies AKA children ==

Every task can have a list of children / a child.

 * When a task is stopped, the children are started.
 * When a task is started, the children are stopped.

This should be easy to implement.

== Per priority time-sorted queues ==

This would result in O(1) scheduler. It was used in the Linux kernel for some
time (search Ingo Molnar's O(1) scheduler). This can be a scheduling
optimization, which would prevent walking longer event list. But probably the
management overhead would be too large, as we have many one-shot events.

To find the next task the scheduler just walks the (constant) list of priority
queues and schedules the first ready event of any queue.

The downside of this approach: Insert / Start / Reschedule(for "auto" tasks)
now need O(log(n)) to find the position in the queue of the priority.

== Always process all (higher priority) pending events ==

Currently Application::Reschedule() processes a single event or "all" events,
with "all" defined as "100 events" in most backends. This already is ignored
by the KDE4 backend, as Qt defines its QAbstractEventDispatcher::processEvents
processing all pending events (there are ways to skip event classes, but no
easy way to process just a single event).

Since the Scheduler is always handled by the system message queue, there is
really no more reasoning to stop after 100 events to prevent LO Scheduler
starvation.

== Run the LO application in its own thread ==

This would probably get rid of most of the MacOS and Windows implementation
details / workarounds, but is quite probably a large amount of work.

Instead of LO running in the main process / thread, we run it in a 2nd thread
and defer al GUI calls to the main thread. This way it'll hopefully not block
and can process system events.

That's just a theory - it definitly needs more analysis before even attemding
an implementation.
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`= Introduction =`

			`The VCL scheduler handles LOs primary event queue. It is simple by design,`
			`currently just a single-linked list, processed in list-order by priority`
Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`using round-robin for reoccurring tasks.`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00
			`The scheduler has the following behaviour:`

			`B.1. Tasks are scheduled just priority based`
			`B.2. Implicitly cooperative AKA non-preemptive`
			`B.3. It's not "fair" in any way (a consequence of B.2)`
			`B.4. Tasks are handled round-robin (per priority)`
			`B.5. Higher priorities have lower values`
			`B.6. A small set of priorities instead of an flexible value AKA int`

			`There are some consequences due to this design.`

Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`C.1. Higher priority tasks starve lower priority tasks`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`As long as a higher task is available, lower tasks are never run!`
			`See Anti-pattern.`

			`C.2. Tasks should be split into sensible blocks`
			`If this can't really be done, process pending tasks by calling`
			`Application::Reschedule(). Or use a thread.`

			`C.3. This is not an OS scheduler`
			`There is no real way to "fix" B.2. and B.3.`
Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`If you need to do a preemptive task, use a thread!`
			`Otherwise make your task suspendable and check SalInstance::AnyInput`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`or call Application::Reschedule regularly.`


			`= Driving the scheduler AKA the system timer =`

			`1. There is just one system timer, which drives LO event loop`
			`2. The timer has to run in the main window thread`
			`3. The scheduler is run with the Solar mutex acquired`
			`4. The system timer is a single-shot timer`
			`5. The scheduler system event / message has a low system priority.`
			`All system events should have a higher priority.`

Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`Every time a task is started, the scheduler timer is adjusted. When the timer`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`fires, it posts an event to the system message queue. If the next most`
Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`important task is an Idle (AKA instant, 0ms timeout), the event is pushed to`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`the back of the queue, so we don't starve system messages, otherwise to the`
Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`front. This is especially important to get a correct SalInstance::AnyInput`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`handling, as this is used to suspend long background Idle tasks.`

Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`Every time the scheduler is invoked it searches for the next task to process,`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`restarts the timer with the timeout for the next event and then invokes the`
			`task. After invoking the task and if the task is still active, it is pushed`
			`to the end of the queue and the timeout is eventually adjusted.`


			`= Locking =`

Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`The locking is quite primitive: all interaction with internal Scheduler`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`structures are locked. This includes the ImplSchedulerContext and the`
			`Task::mpSchedulerData, which is actually a part of the scheduler.`
			`Before invoking the task, we have to release the lock, so others can`
			`Start new Tasks.`


			`= Lifecycle / thread-safety of Scheduler-based objects =`

			`A scheduler object it thread-safe in the way, that it can be associated to`
			`any thread and any thread is free to call any functions on it. The owner must`
			`guarantee that the Invoke() function can be called, while the Scheduler object`
			`exists / is not disposed.`


			`= Anti-pattern: Dependencies via (fine grained) priorities =`

			`"Idle 1" should run before "Idle 2", therefore give "Idle 1" a higher priority`
			`then "Idle 2". This just works correct for low frequency idles, but otherwise`
			`always breaks!`

			`If you have some longer work - even if it can be split by into schedulable,`
			`smaller blocks - you normally don't want to schedule it with a non-default`
			`priority, as it starves all lower priority tasks. Even if a block was processed`
			`in "Idle 1", it is scheduled with the same (higher) priority again. Changing`
			`the "Idle" to a "Timer" also won't work, as this breaks the dependency.`

			`What is needed is task based dependency handling, so if "Task 1" is done, it`
			`has to start "Task 2" and if "Task 1" is started again, it has to stop`
			`"Task 2". This currently has to be done by the implementor, but this feature`
			`can be added to the scheduler reasonably.`


			`= Implementation details =`

			`== MacOS implementation details ==`

OSX fix updates during resize While resizing MacOS suspends the NSDefaultRunLoopMode. So in this caae we can't post to the system event loop, but must use timers to restart ourself. Since the timer itself is scheduled on the NSEventTrackingRunLoopMode it' also triggers on resize events. There is still some minor glitch: when resizing too fast some part of LibreOffice isn't painted, while the left mouse button is down. Since there isn't any layouting triggered by the mouse up, there has to be an other inconsistency. Change-Id: I3ccba78bd23ec8526f21e7b93b027f3d3279f901 2017-08-15 01:23:31 -05:00			`Generally the Scheduler is handled as expected, except on resize, which is`
			`handled with different runloop-modes in MacOS. In case of a resize, the normal`
			`runloop is suspended in sendEvent, so we can't call the scheduler via posted`
tdf#99784 OSX run GUI stuff in the main thread The extension manager starts dialogs from its own thread. But the OSX backend currently doesn't defer these calls to the main thread. This implements the deference by running the called function in the main thread, using a code ^Block, and returning the result via a SalYieldMutex member. Change-Id: Id8977991e3eda91da27c23d8021e028d4f4cefe5 Reviewed-on: https://gerrit.libreoffice.org/42448 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Jan-Marek Glogowski <glogow@fbihome.de> 2017-08-08 08:03:37 -05:00			`main loop-events. Instead the scheduler uses the timer again.`

			`Like the Windows backend, all Cocoa / GUI handling also has to be run in`
			`the main thread. We're emulating Windows out-of-order PeekMessage processing,`
			`via a YieldWakeupEvent and two conditionals. When in a RUNINMAIN call, all`
			`the DBG_TESTSOLARMUTEX calls are disabled, as we can't release the SolarMutex,`
			`but we can prevent running any other SolarMutex based code. Same for all the`
			`SolarMutex acquire and release calls, so the calling and the main thread`
			`don't deadlock. Those wakeup events must be ignored to prevent busy-locks.`

			`We can neither rely on MacOS dispatch_sync code block execution nor the`
			`message handling, as both can't be priorized or filtered and the first`
			`does also not allow nested execution and is just processed in sequence.`
OSX fix updates during resize While resizing MacOS suspends the NSDefaultRunLoopMode. So in this caae we can't post to the system event loop, but must use timers to restart ourself. Since the timer itself is scheduled on the NSEventTrackingRunLoopMode it' also triggers on resize events. There is still some minor glitch: when resizing too fast some part of LibreOffice isn't painted, while the left mouse button is down. Since there isn't any layouting triggered by the mouse up, there has to be an other inconsistency. Change-Id: I3ccba78bd23ec8526f21e7b93b027f3d3279f901 2017-08-15 01:23:31 -05:00
			`There is also a workaround for a problem for pushing tasks to an empty queue,`
			`as [NSApp postEvent: ... atStart: NO] doesn't append the event, if the`
			`message queue is empty.`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00
			`Probably that's the reason, why some code comments spoke of lost events and`
			`there was some distinct additional event processing implemented.`

			`== Windows implementation details ==`

			`Posted or sent event messages often trigger processing of WndProc in`
			`PeekMessage, GetMessage or DispatchMessage, independently from the message to`
			`fetch, remove or dispatch ("During this call, the system delivers pending,`
			`nonqueued messages..."). Additionally messages have an inherited priority`
			`based on the function used to generate them. Even if WM_TIMER should have been`
			`the lowest prio, a posted WM_TIMER is processed with the prio of a posted`
			`message.`

			`Therefore the current solution always starts a (threaded) timer even for the`
			`instant Idles and syncs to this timer message in the main dispatch loop.`
			`Using SwitchToThread(), this seem to work reasonably well.`

			`== KDE implementation details ==`

Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`This implementation also works as intended. But there is a different Yield`
			`handling, because Qts QAbstractEventDispatcher::processEvents will always`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`process all pending events.`


			`= TODOs and ideas =`

			`== Task dependencies AKA children ==`

			`Every task can have a list of children / a child.`

			`* When a task is stopped, the children are started.`
			`* When a task is started, the children are stopped.`

			`This should be easy to implement.`

			`== Per priority time-sorted queues ==`

			`This would result in O(1) scheduler. It was used in the Linux kernel for some`
Fix typos Change-Id: Iaa9c0aea3ea1a239e378bd714ba335f91bb1faf3 Reviewed-on: https://gerrit.libreoffice.org/41194 Reviewed-by: Michael Stahl <mstahl@redhat.com> Tested-by: Michael Stahl <mstahl@redhat.com> 2017-08-17 09:41:20 -05:00			`time (search Ingo Molnar's O(1) scheduler). This can be a scheduling`
Add VCL scheduler documentation Change-Id: Ifb2332b6d3c8bf472c684d3a79c861cc9035d246 2017-08-04 06:46:44 -05:00			`optimization, which would prevent walking longer event list. But probably the`
			`management overhead would be too large, as we have many one-shot events.`

			`To find the next task the scheduler just walks the (constant) list of priority`
			`queues and schedules the first ready event of any queue.`

			`The downside of this approach: Insert / Start / Reschedule(for "auto" tasks)`
			`now need O(log(n)) to find the position in the queue of the priority.`

			`== Always process all (higher priority) pending events ==`

			`Currently Application::Reschedule() processes a single event or "all" events,`
			`with "all" defined as "100 events" in most backends. This already is ignored`
			`by the KDE4 backend, as Qt defines its QAbstractEventDispatcher::processEvents`
			`processing all pending events (there are ways to skip event classes, but no`
			`easy way to process just a single event).`

			`Since the Scheduler is always handled by the system message queue, there is`
			`really no more reasoning to stop after 100 events to prevent LO Scheduler`
			`starvation.`
tdf#99784 OSX run GUI stuff in the main thread The extension manager starts dialogs from its own thread. But the OSX backend currently doesn't defer these calls to the main thread. This implements the deference by running the called function in the main thread, using a code ^Block, and returning the result via a SalYieldMutex member. Change-Id: Id8977991e3eda91da27c23d8021e028d4f4cefe5 Reviewed-on: https://gerrit.libreoffice.org/42448 Tested-by: Jenkins <ci@libreoffice.org> Reviewed-by: Jan-Marek Glogowski <glogow@fbihome.de> 2017-08-08 08:03:37 -05:00
			`== Run the LO application in its own thread ==`

			`This would probably get rid of most of the MacOS and Windows implementation`
			`details / workarounds, but is quite probably a large amount of work.`

			`Instead of LO running in the main process / thread, we run it in a 2nd thread`
			`and defer al GUI calls to the main thread. This way it'll hopefully not block`
			`and can process system events.`

			`That's just a theory - it definitly needs more analysis before even attemding`
			`an implementation.`