7RZDUGV)DXOW7ROHUDQW5HDO7LPH6FKHGXOLQJLQWKHVH/0LFURNHUQHO
Libin Xu
1,2
, Yuebin Bai
1
, Kun Cheng
1
, Lingyu Ge
1
, Danning Nie
1
, Lijun Zhang
1,2
, Wenjia Liu
1
1
School of Computer Science and Engineering
Beihang University
Beijing 100191, China
Email: 13521941768@163.com
2
Science and Technology on Communication Networks Laboratory
Shijiazhuang 050081, China
Abstract—System dependability and real-time requirement
have been receiving widespread attention in modern computer
systems. Fault-tolerant real-time scheduling provides a method of
combining timing requirements and faults tolerance for real-time
systems. This paper presents a preliminary study of building a
fault-tolerant real-time mechanism for seL4 microkernel. seL4, as
the first formally-verified operating system kernel in the world, is
an ideal platform for security-critical systems. However, the lack
of real-time and fault-tolerant policy reduces its the reliability and
user experience. We have implemented basic real-time and
checkpoint mechanisms in seL4, and propose the Check-Point
based Fault-Tolerant RM (CP-FTRM) scheduling policy. The
experiments have shown the feasibility and efficiency of them.
Keywords—real-time system; fault-tolerance; time redundancy;
rate monotonic algorithm; checkpoint; seL4
I.
I
NTRODUCTION
Modern embedded systems are characterized by multiple
functionality and sophisticated user interface, which are driven
by the vastly different requirements including real-time
applications. The dependability of a real-time system relies not
only on functional correctness, but also on timing constrains [1].
According to requirements of task deadline, real-time system
can be classified into hard and soft real-time. Missing tasks’
deadline could cause a catastrophic result in the former while
decline part of service quality in the latter. Effective real-time
schedule algorithm is the key technology to ensure that all user
space tasks finish by the time of deadline. Rate monotonic (RM)
and Earliest deadline first (EDF) [2], as the optimal real-time
scheduling algorithms based on periodic task sets, are
representative of static and dynamic scheduling respectively.
Fault-tolerance has shown to be an effective technology to
realize system dependability, especially in real-time system [3].
The timing nature of real-time systems requires that it must be
equipped with fault-tolerance ability to guarantee normal
operations when inevitable hardware and software faults occur.
At the same time, the error tasks should have finished by the
time of deadline. As an important redundancy technology, time
redundancy [4] reserves additional processor slacks in task
scheduling for the system roll-back to normal status and re-
execution when faults occur, thus has become the main idea to
design fault-tolerant real-time scheduling policies.
seL4 is a high-assurance, high-performance and formally
verified operating system based on L4-microkernel, featuring
mathematically proved correctness and strong security
properties [5]. The seL4 microkernel provides a capability-based
access-control model to isolate software components, and also
to enable authorized system calls and Inter-Process
Communication (IPC) [6, 7]. Even though seL4 microkernel has
implemented memory and security isolation, it still suffers from
lacking of temporal isolation [8, 9]. In fact, the scheduling policy
of current seL4 is based on round-robin with 256 priority levels,
and threads do not even have time attributes such as budget,
period or deadline, which might not meet user real-time
requirements. On the other hand, whilst formally verification
ensures microkernel itself bug free, other bugs such as hardware
faults, user application errors are still inevitable. It is necessary
to introduce fault-tolerance in seL4 to improve system
dependability.
In this paper, we make several contributions. Firstly, we
provide the support for kernel time management and periodic
task model, which are the basic mechanisms for designing real-
time systems. Further, we implement RM and EDF real-time
scheduling policy in seL4 microkernel and provide related
system calls for user space applications to configure and control
periodic tasks. Secondly, we implement the checkpoint
mechanism which saves the context of a real-time task during its
execution and restarts it based on established checkpoints. Then,
Check-Point based Fault-tolerant RM (CP-FTRM) scheduling
policy is proposed to decrease tasks’ error recovery overhead.
The policy inserts several checkpoints in the execution of tasks
and avoids the re-execution from the beginning when errors
occur, which greatly saves processor resources. Besides, we
deduce the feasibility condition of CP-FTRM, that is the least
upper bound of the total processor utilization over all sets of
tasks that fully utilize the processor. Finally, we have evaluated
the performance overheads of related components on ARM
board. Then we present exact execution traces of system tasks
and comparisons between implemented policies.
The rest of the paper is organized as follows. Section II
presents the key technologies about real-time system and related
work on fault-tolerant real-time scheduling. In section III we
provide the design and implementation of a real-time scheduler.
Time Event Queue (TEQ), Interrupt Service Routine (ISR) and
periodic resource model are introduced to construct the real-time
system in seL4. Further, section IV describes the design and
analysis of CP-FTRM policy, which introduces checkpoint
mechanism to FTRM policy to decrease the task recovery
2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International
Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems
978-1-5090-4297-5/16 $31.00 © 2016 IEEE
DOI 10.1109/HPCC-SmartCity-DSS.2016.203
711