1. Introduction to multithreading
You may have heard about threads and multithreading already, but you may not know exactly what those words actually mean. The precise definition is not important for this story. What is important are the general ideas. If you want to read more, you should start with the entry in Wikipedia. If you are well versed in these topics, you can freely skip ahead to the next chapter, Introduction to OTL.
When you start a program, the operating system creates an internal entity called a process. It contains almost everything that the operating system associates with the started program - allocated memory, open file and window handles, sockets and so on. A process, however, does not contain the information related to the CPU state. That is part of a thread.
A thread encapsulates the state of the CPU registers, the thread stack and thread-local variables. Every process contains one thread - the main thread. The operating system creates it together with the process when it starts a program.
Modern operating systems allow multiple threads of execution inside a process. This is achieved by creating additional background threads in the code. To create a thread we can use operating system primitives (such as CreateThread on Windows), low-level runtime library functionality such as Delphi's TThread, or higher-level concepts such as tasks and abstractions (also known as patterns).
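As an illustration of the middle option, here is a minimal sketch that starts a background thread with Delphi's own TThread, using the CreateAnonymousThread helper from the System.Classes unit (this helper is not mentioned in the text above and requires a recent Delphi version with anonymous-method support):

uses
  System.Classes;

begin
  // Start a background thread running the anonymous procedure;
  // the main thread continues immediately and does not wait for it.
  TThread.CreateAnonymousThread(
    procedure
    begin
      // work executed in the background thread goes here
    end
  ).Start;
end.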
On a multi-CPU machine these threads can execute in parallel, one on each core. A typical computer system, however, runs many more threads than it has CPUs, and that's where multithreading kicks in.
Multithreading is not a new invention. It appeared decades before multiprocessor boards became available to the general public. In a typical multithreading system the operating system allows a thread to run for some time (typically a few milliseconds), then stores the thread's state (registers etc.) into the operating system's internal data structures and activates some other thread. By quickly switching between multiple threads, the system gives the illusion that it is running many threads in many programs in parallel.
1.1 Multithreading as a source of problems
Multithreading is a powerful concept, but it also brings many problems. They are all a consequence of the same fact – that threads are not isolated.
An operating system works hard to isolate one process from another. This prevents one
process from accessing the memory belonging to another, which is an important security
feature. It also makes sure that a bug in one process won’t affect any other process in the
system. If you think that's nothing special, you are not old enough. Some of us had to work on Windows 3, which had no such isolation, and we still remember how one badly behaved application could bring down the whole system.
Since threads are not isolated, each thread can access all of the memory belonging to a process. This gives us extreme power and flexibility. Starting multiple threads provides all of the threads with access to the same data. It is also the source of all problems. If multiple threads not only read from the shared data but also write to it, you'll get into trouble.
It should be obvious that one thread must not modify data while another thread is reading from it. It is harder to enforce this in practice. We can protect access to shared data by using locking (critical sections, spinlocks, TMonitor …) or interlocked operations, but they will slow the program down and they can introduce new problems (deadlocks).
That is why I prefer data copying, communication and aggregation over data sharing. Modern CPUs are fast and can easily copy the few tens or hundreds of bytes that are then sent to a thread, so it can work on its own copy of the data and not on shared data. This approach is the basis of OmniThreadLibrary and you'll see it all through the book.
To write better code you should know which operations, completely safe in a single-threaded world, will get you in trouble when you are multithreading.
1.1.1 Reading and writing shared data
Let’s start with a simple variable access. One thread is reading from a variable and
another is writing to it.
var
  a, b: integer;

// thread 1
a := 42;

// thread 2
b := a;
This looks safe. Variable b will contain either 42 or the previous data stored in a. Well, that is not entirely true; b may also contain a mixture of both.
Depending on how a is stored in memory (how it is aligned), the processor will access it with either one or two operations. If two operations are used, a read operation may partially overlap the write operation. For example, the code may read the first part of a, then a gets fully updated, and only then does the code read the second part of a.
This code behaves “as expected” (doesn't cause problems) if the address of a gives a remainder of 0 when divided by 4. We say that a is 4-aligned. Delphi in most cases makes all integers 4-aligned. Sadly, on some occasions, especially when structured types – classes and records – are nested, variables are not well-aligned.
We say that an operation is atomic if it is performed in one step. Reads and writes of 4-aligned integers are atomic on the Intel platform.
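As a quick illustration (a hypothetical check, not something you would normally ship), you can inspect a variable's address at run time to see whether it is 4-aligned:

var
  a: integer;
begin
  // a is 4-aligned when its address is divisible by 4
  if NativeUInt(@a) mod 4 = 0 then
    Writeln('a is 4-aligned; reads and writes of it are atomic')
  else
    Writeln('a is not 4-aligned; a concurrent reader may see a torn value');
end.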
The simplest way to be sure that a value is correctly aligned is to use the TOmniAlignedInt32 record provided by the OmniThreadLibrary. It makes sure that the underlying data storage is correctly aligned and can as such be accessed atomically.
OmniThreadLibrary also implements the TOmniAlignedInt64 type, which creates 8-aligned int64 data. This data may be accessed atomically only from 64-bit code.
Reads and writes of 8-aligned int64s are atomic only in 64-bit code.
TOmniAlignedInt64 implements atomic operations on int64s that can be used in 32-bit code.
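A minimal usage sketch follows. The OtlSync unit name and the Value property are my assumptions (check your OmniThreadLibrary version); assignment from an integer and the Increment/Decrement methods are used exactly as in the examples later in this chapter.

uses
  OtlSync; // assumed location of TOmniAlignedInt32/TOmniAlignedInt64

var
  counter: TOmniAlignedInt32;
  big: TOmniAlignedInt64;
begin
  counter := 0;           // assignment from an integer, as in the example below
  counter.Increment;      // atomic increment on correctly aligned storage
  counter.Decrement;      // atomic decrement
  Writeln(counter.Value); // assumed Value property for reading the data

  big := 0;
  big.Increment;          // atomic even when called from 32-bit code
end.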
1.1.2 Modifying shared data
Another example of typical code that doesn't work correctly in a multithreaded environment is modifying the same data from two threads. In a typical program this data modification is hidden inside an operation that we think of as atomic while in reality it isn't.
A standard example of this problem is implemented with two threads, one incrementing the shared value and the other decrementing it.
var
  data: integer;

data := 0;

// thread 1
for i := 1 to 100000 do
  Inc(data);

// thread 2
for i := 1 to 100000 do
  Dec(data);
In a single-threaded world, the value of data is 0 when this pseudo code finishes. In a
multithreaded scenario, however, it will lie somewhere between -100000 and 100000 – and
that’s all we can say.
The problem with this code is that Inc and Dec are not atomic operations. Rather, they are implemented inside the CPU as a series of three operations – read data from memory, increment (or decrement) the data, and write the data back to memory.
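The following sketch shows one interleaving of those three steps that loses an update; the register names are purely illustrative.

// One possible interleaving (R1 and R2 stand for CPU registers):
//   thread 1: read data into R1       (R1 = 0)
//   thread 2: read data into R2       (R2 = 0)
//   thread 1: increment R1            (R1 = 1)
//   thread 2: decrement R2            (R2 = -1)
//   thread 1: write R1 back to data   (data = 1)
//   thread 2: write R2 back to data   (data = -1; thread 1's increment is lost)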
One way to fix such code is to use the already mentioned TOmniAlignedInt32, which implements atomic Increment and Decrement operations.
var
  data: TOmniAlignedInt32;

data := 0;

// thread 1
for i := 1 to 100000 do
  data.Increment;

// thread 2
for i := 1 to 100000 do
  data.Decrement;
Another option is to use interlocked operations from Delphi's System.SyncObjs unit.
var
  data: integer;

data := 0;

// thread 1
for i := 1 to 100000 do
  TInterlocked.Increment(data);

// thread 2
for i := 1 to 100000 do
  TInterlocked.Decrement(data);
The third option is to protect increment/decrement operations by wrapping them in a critical section. You could, for example, use OmniThreadLibrary's TOmniCS.
var
  data: integer;
  lock: TOmniCS;

data := 0;

// thread 1
for i := 1 to 100000 do begin
  lock.Acquire;
  Inc(data);
  lock.Release;
end;

// thread 2
for i := 1 to 100000 do begin
  lock.Acquire;
  Dec(data);
  lock.Release;
end;
Using interlocked operations is a bit slower than normal arithmetic, and locking with critical sections is even slower. That's one of the reasons I prefer to use data copying and aggregation.
The next example shows how the increment/decrement example could be rewritten with
these principles in mind.
var
  data: integer;

// thread 1
var
  tempData: integer;

tempData := 0;
for i := 1 to 100000 do
  Inc(tempData);

send tempData to main thread

// thread 2
var
  tempData: integer;

tempData := 0;
for i := 1 to 100000 do
  Dec(tempData);

send tempData to main thread

// main thread
data := result from thread 1 + result from thread 2
This approach doesn’t manipulate shared data which makes it run faster than interlocked
and locking solutions. It does, however, require some communication mechanism that will
send data to and from thread. Different parts of this book will deal with that.
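To make the pseudo-code above a bit more concrete, here is a minimal sketch that uses plain Delphi TThread instead of OmniThreadLibrary's messaging (which later chapters introduce); the TCounterThread class is purely illustrative. Each worker keeps its own tempData and the main thread aggregates the results only after both threads have finished, so no data is shared while the threads are running.

uses
  System.Classes;

type
  TCounterThread = class(TThread)
  private
    FDelta: integer; // +1 to increment, -1 to decrement
    FTotal: integer; // thread-local result, read only after WaitFor
  protected
    procedure Execute; override;
  public
    constructor Create(ADelta: integer);
    property Total: integer read FTotal;
  end;

constructor TCounterThread.Create(ADelta: integer);
begin
  FDelta := ADelta;
  inherited Create(False); // the thread starts after the constructor completes
end;

procedure TCounterThread.Execute;
var
  i, tempData: integer;
begin
  tempData := 0;
  for i := 1 to 100000 do
    Inc(tempData, FDelta); // works only on the thread's own copy
  FTotal := tempData;      // published once; read by the main thread after WaitFor
end;

var
  t1, t2: TCounterThread;
  data: integer;
begin
  t1 := TCounterThread.Create(1);
  t2 := TCounterThread.Create(-1);
  t1.WaitFor;              // aggregation happens only after both workers are done
  t2.WaitFor;
  data := t1.Total + t2.Total;
  t1.Free;
  t2.Free;
end.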
1.1.3 Writes masquerading as reads
Another source of problems are shared objects. If we share an object between threads and
then execute some methods of that object in different threads we have to know exactly