2.3. Fully distributed semantics.
As existing approaches do not offer an adequate solution, another one was sought. As one of the
goals was to provide scalability without any need to change the source code, a good starting point
is to look at the services offered by single-processor real-time kernels. This has the advantage that
the embedded programmer can keep the programming style he is used to. Most real-time kernels
have also evolved over the years, and the services offered are the result of natural selection during
practical use. The conclusion was to mimic the services traditionally found in most real-time
kernels, but to implement them in a fully distributed way. This proved possible provided the
semantics were adapted accordingly. The result is a safe programming paradigm based on
"objects". The objects can be active (tasks) or passive (semaphores, FIFO queues, message
mailboxes, resources) and are the unit of distribution in the parallel system. Except during the
system definition phase, the application developer can write task code independently of the
mapping of the objects onto the hardware topology.
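As an illustration, the sketch below shows what such location-transparent task code could look like. All names used here (the kernel header, the KS_ service calls, the object identifiers) are assumptions made for the example and do not reproduce the actual Virtuoso API; the point is only that the task refers to objects by logical identifier, while their placement on the hardware topology is decided in the system definition phase.

    /* Hypothetical sketch: a task uses passive objects purely by logical
       identifier; whether they reside on this node or on a remote one is
       decided in the system definition phase, not in the task code. */
    #include "kernel.h"                 /* assumed kernel API header */

    extern KS_ID SEM_DATA_READY;        /* passive objects declared at  */
    extern KS_ID FIFO_SAMPLES;          /* system definition time       */
    extern int  read_sensor(void);      /* application-specific input   */

    void producer_task(void)
    {
        int sample;

        for (;;) {
            sample = read_sensor();
            /* put the sample in a FIFO queue and signal its availability;
               both calls behave identically whatever node hosts the objects */
            KS_FifoPutW(FIFO_SAMPLES, &sample, sizeof sample);
            KS_SemaSignal(SEM_DATA_READY);
        }
    }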
In terms of productivity, the major benefit is that the programmer can now concentrate on his
application. The application programmer is often a specialist in a particular domain, one that is
frequently expressed more elegantly in terms of mathematical expressions than in terms of the
target hardware. Hence, it is desirable to provide tools that permit him to program without being
overly concerned with system-level issues. Just as the system software handles memory
management and page swapping on a workstation, on a parallel processing target communication
should preferably be handled by the system software as well. In practice this also justifies the
system software designer spending time optimizing the implementation, while it remains of
secondary priority for the application developer.
2.4. The programming model.
The final semantics are very similar to what is offered by most modern real-time kernels on a single
processor. The small but important changes, however, relate to the underlying programming
model. The first major point is that the multi-tasking model does not presuppose the existence of
common memory. Hence all pointers passed between tasks are local. As a result, all communication
services pass, by default, a copy of the data from local workspace to local workspace. The benefit
of this approach is that it still works when common memory is available. When the programmer
wants to exploit the potential performance benefits of passing common-memory pointers, he can
still do so, but this will be very visible in his program and he will have lost its scalability.
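A minimal sketch of what this copy-based model implies for the programmer is given below. The mailbox services and their signatures are assumed for illustration; the essential point is that both sides work on buffers in their own local workspace, so the same code runs unchanged with or without common memory.

    /* Sketch of copy semantics (service names and signatures assumed):
       the kernel copies the data from the sender's local buffer to the
       receiver's local buffer; no pointer ever crosses a task boundary. */
    #include "kernel.h"

    extern KS_ID MBOX_RESULTS;                      /* mailbox object        */
    extern void  compute(double *buf);              /* application functions */
    extern void  consume(const double *buf);

    void sender_task(void)
    {
        double result[16];                          /* local workspace */

        compute(result);
        KS_MsgPutW(MBOX_RESULTS, result, sizeof result);
        /* the -W call returns only when the kernel no longer needs the
           source buffer, so 'result' may safely be reused here */
    }

    void receiver_task(void)
    {
        double local_copy[16];                      /* local workspace */

        KS_MsgGetW(MBOX_RESULTS, local_copy, sizeof local_copy);
        consume(local_copy);                        /* works on its own copy */
    }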
The second point is the semantic need for an underlying synchronization protocol to protect the
data from being prematurely overwritten. E.g. when a task sends a message, the source data will be
split into multiple packets that are copied, using DMA and multiple, but not necessarily identical,
communication paths, to the destination area indicated by the receiving task. Hence, the system
layer and the programmer must assure that the data packets are put in the right order in the
destination memory and that the last byte has been copied before another transfer is started. In
Virtuoso this has resulted in most kernel services being available in multiple semantic variations:
blocking (safest under all conditions), non-blocking, and blocking with time-out. The respective
kernel services are indicated with a different suffix (-W, -WT). In addition, some services are
available with an -A suffix, indicating an asynchronous operation. An example is the system-wide
KS_memcopyA, the unsafe version of KS_memcopyW.
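The fragment below sketches how these flavours differ in use. Only the names KS_memcopyW and KS_memcopyA are taken from the text; the -WT form, its time-out parameter and the return code shown are assumptions made for illustration.

    /* Illustrative use of the blocking (-W), time-out (-WT) and
       asynchronous (-A) variants; exact signatures are assumed. */
    #include "kernel.h"

    void transfer_examples(void *dst, const void *src, unsigned size)
    {
        /* Blocking: returns only when the last byte has arrived at 'dst',
           so 'src' may be overwritten immediately afterwards. */
        KS_memcopyW(dst, src, size);

        /* Blocking with time-out (assumed form): gives up if the transfer
           cannot be completed within the given number of ticks. */
        if (KS_memcopyWT(dst, src, size, /* ticks */ 100) != RC_OK) {
            /* handle the time-out, e.g. retry or report an error */
        }

        /* Asynchronous, "unsafe" variant: returns immediately; the
           programmer must not touch 'src' or 'dst' until completion has
           been signalled by some other means (e.g. a semaphore). */
        KS_memcopyA(dst, src, size);
    }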
Another simple example is the use of counting semaphores. Whereas in single processor
environments, one can "sequentialise" the signaling of a semaphore, allowing the use of binary
semaphores, in a parallel processing environment, one cannot prevent simultaneous signaling