All data moves in or out of a process by this mechanism. The machinery inside the
operating system that performs these transfers can be incredibly complex, but
conceptually, it's very straightforward.
Figure 1-1 shows a simplified logical diagram of how block data moves from an external
source, such as a disk, to a memory area inside a running process. The process requests
that its buffer be filled by making the read() system call. This results in the kernel issuing
a command to the disk controller hardware to fetch the data from disk. The disk
controller writes the data directly into a kernel memory buffer by DMA without further
assistance from the main CPU. Once the disk controller finishes filling the buffer, the
kernel copies the data from the temporary buffer in kernel space to the buffer specified by
the process when it requested the read() operation.
Figure 1-1. Simplified I/O buffer handling
This obviously glosses over a lot of details, but it shows the basic steps involved.
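Seen from the process's side, the whole sequence collapses into a single call. The following is a minimal sketch, not a definitive implementation: the class name ReadIntoBuffer, the helper fill(), and the temporary file are all hypothetical names chosen for illustration. It assumes a typical JVM, where FileInputStream.read(byte[]) ends up issuing a read() system call on the process's behalf. The array passed in plays the role of the process buffer in Figure 1-1.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class ReadIntoBuffer {
    // Hypothetical helper: asks for buf to be filled from the file.
    // Returns the number of bytes placed in buf, or 0 if the file is empty.
    static int fill(Path file, byte[] buf) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            // One request to fill the buffer. On a typical JVM this becomes
            // a read() system call (an assumption about the implementation).
            int n = in.read(buf);
            return Math.max(n, 0); // read() reports end-of-file as -1
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("io-demo", ".txt");
        try {
            Files.write(tmp, "hello, kernel".getBytes(StandardCharsets.US_ASCII));
            byte[] buf = new byte[64]; // the process buffer in user space
            int n = fill(tmp, buf);
            System.out.println(n + " bytes: "
                    + new String(buf, 0, n, StandardCharsets.US_ASCII));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Nothing in this code mentions the disk controller, DMA, or the kernel buffer. Those steps all happen on the far side of the call, which is why the conceptual model stays simple even when the machinery is not.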
Note the concepts of user space and kernel space in Figure 1-1. User space is where
regular processes live. The JVM is a regular process and dwells in user space. User space
is a nonprivileged area: code executing there cannot directly access hardware devices, for
example. Kernel space is where the operating system lives. Kernel code has special
privileges: it can communicate with device controllers, manipulate the state of processes
in user space, etc. Most importantly, all I/O flows through kernel space, either directly (as
described here) or indirectly (see Section 1.4.2).
When a process requests an I/O operation, it performs a system call, sometimes known as
a trap, which transfers control into the kernel. The low-level open(), read(), write(), and
close() functions so familiar to C/C++ coders do nothing more than set up and perform
the appropriate system calls. When the kernel is called in this way, it takes whatever steps
are necessary to find the data the process is requesting and transfer it into the specified
buffer in user space. The kernel tries to cache and/or prefetch data, so the data being
requested may already be available in kernel space. If so, it is simply copied out to the
process's buffer. If the data isn't available, the process is suspended while the kernel
goes about bringing the data into memory.
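The open/read/close pattern described above looks much the same from Java. The sketch below is illustrative only: the class SyscallLoop and the method countBytes() are hypothetical names, and the correspondence between these Java calls and the underlying open(), read(), and close() system calls is an assumption about typical JVM implementations rather than something the Java API guarantees. Each call to FileChannel.read() hands control to the kernel, which either copies out data it already holds or suspends the thread until the data arrives.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyscallLoop {
    // Hypothetical helper: reads the whole file through a fixed-size
    // user-space buffer and returns the total number of bytes seen.
    static long countBytes(Path file, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        long total = 0;
        // Opening the channel corresponds to open(); leaving the
        // try-with-resources block corresponds to close().
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            ByteBuffer buffer = ByteBuffer.allocate(bufferSize);
            int n;
            // Each read() may block while the kernel brings data into memory.
            while ((n = channel.read(buffer)) != -1) {
                total += n;
                buffer.clear(); // make the whole buffer available again
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("syscall-demo", ".bin");
        try {
            Files.write(tmp, new byte[10_000]);
            System.out.println("bytes read: " + countBytes(tmp, 4096));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The caller cannot tell whether a given read() was satisfied from the kernel's cache or had to wait for the disk. The only visible difference is how long the call takes to return.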
Looking at Figure 1-1, it's probably occurred to you that copying from kernel space to the
final user buffer seems like extra work. Why not tell the disk controller to send it directly
to the buffer in user space? There are a couple of problems with this. First, hardware is
usually not able to access user space directly.[2] Second, block-oriented hardware devices
such as disk controllers operate on fixed-size data blocks. The user process may be