MIT Operating Systems Textbook: xv6 Explained

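"The MIT operating systems textbook, which mainly covers the xv6 operating system. It is a classic resource for learning operating systems and is best read alongside the source code."

This document is the textbook used in MIT's operating systems course. It teaches basic operating system concepts by studying an operating system kernel named xv6. Xv6 is a re-implementation of Dennis Ritchie's and Ken Thompson's Unix Version 6 (v6). It follows the structure and style of v6, but is written in ANSI C for an x86-based multiprocessor. The textbook covers the following topics:

1. **Operating system interfaces**: how the operating system provides services to users and applications, including basic interfaces such as system calls and process control.
2. **The first process**: the concept of a process, how processes are created, managed, and scheduled, and how processes communicate.
3. **Page tables**: the basis of virtual memory management, how page tables map physical memory, and how page faults are handled.
4. **Traps, interrupts, and drivers**: exception handling (traps) and hardware interrupts, and the role drivers play in talking to hardware.
5. **Locking**: how locks and other synchronization mechanisms protect shared resources and prevent data races in a concurrent environment.
6. **Scheduling**: different scheduling algorithms, such as FCFS, SJF, and priority scheduling, and their use in a multitasking environment.
7. **File system**: how files are organized, stored, and retrieved, and how file systems are mounted, unmounted, and their metadata managed.
8. **PC hardware** (Appendix A): a brief introduction to the PC hardware components most relevant to the operating system, such as the CPU, memory, and I/O devices.
9. **The boot loader** (Appendix B): the startup process, and how the kernel is loaded from disk into memory.

The textbook teaches from the xv6 source code and encourages readers to read the code alongside the text. This hands-on approach is inspired by John Lions's Commentary on UNIX 6th Edition, and online resources for v6 and xv6 are available on the course website. The text has been used in real courses and aims to help students understand how operating systems work through practice and code analysis. For beginners who want to learn operating system design and implementation, it is a valuable reference.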

would not change.

Real world

Unix's combination of the "standard" file descriptors, pipes, and convenient shell syntax for operations on them was a major advance in writing general-purpose reusable programs. The idea sparked a whole culture of "software tools" that was responsible for much of Unix's power and popularity, and the shell was the first so-called "scripting language." The Unix system call interface persists today in systems like BSD, Linux, and Mac OS X.

Modern kernels provide many more system calls, and many more kinds of kernel services, than xv6. For the most part, modern Unix-derived operating systems have not followed the early Unix model of exposing devices as special files, like the console device file discussed above. The authors of Unix went on to build Plan 9, which applied the "resources are files" concept to modern facilities, representing networks, graphics, and other resources as files or file trees.

The file system abstraction has been a powerful idea, most recently applied to network resources in the form of the World Wide Web. Even so, there are other models for operating system interfaces. Multics, a predecessor of Unix, abstracted file storage in a way that made it look like memory, producing a very different flavor of interface. The complexity of the Multics design had a direct influence on the designers of Unix, who tried to build something simpler.

This book examines how xv6 implements its Unix-like interface, but the ideas and concepts apply to more than just Unix. Any operating system must multiplex processes onto the underlying hardware, isolate processes from each other, and provide mechanisms for controlled inter-process communication. After studying xv6, you should be able to look at other, more complex operating systems and see the concepts underlying xv6 in those systems as well.

DRAFT as of August 28, 2012 16 http://pdos.csail.mit.edu/6.828/xv6/

Chapter 1

The first process

This chapter explains what happens when xv6 first starts running, through the creation of the first process. In doing so, the text provides a glimpse of the implementation of all major abstractions that xv6 provides, and how they interact. Most of xv6 avoids special-casing the first process, and instead reuses code that xv6 must provide for standard operation. Subsequent chapters will explore each abstraction in more detail.

Xv6 runs on Intel 80386 or later ("x86") processors on a PC platform, and much of its low-level functionality (for example, its process implementation) is x86-specific. This book assumes the reader has done a bit of machine-level programming on some architecture, and will introduce x86-specific ideas as they come up. Appendix A briefly outlines the PC platform.

Process overview

A process is an abstraction that provides the illusion to a program that it has its own abstract machine. A process provides a program with what appears to be a private memory system, or address space, which other processes cannot read or write. A process also provides the program with what appears to be its own CPU to execute the program's instructions.

Xv6 uses page tables (which are implemented by hardware) to give each process its own address space. The x86 page table translates (or "maps") a virtual address (the address that an x86 instruction manipulates) to a physical address (an address that the processor chip sends to main memory).

Xv6 maintains a separate page table for each process that defines that process's address space. As illustrated in Figure 1-1, an address space includes the process's user memory starting at virtual address zero. Instructions come first, followed by global variables, then the stack, and finally a "heap" area (for malloc) that the process can expand as needed.

Each process's address space maps the kernel's instructions and data as well as the user program's memory. When a process invokes a system call, the system call executes in the kernel mappings of the process's address space. This arrangement exists so that the kernel's system call code can directly refer to user memory. In order to leave room for user memory to grow, xv6's address spaces map the kernel at high addresses, starting at 0x80100000.

The xv6 kernel maintains many pieces of state for each process, which it gathers into a struct proc (2103). A process's most important pieces of kernel state are its page table, its kernel stack, and its run state. We'll use the notation p->xxx to refer to elements of the proc structure.


[Figure omitted. Address labels: 0x80000000, 0x80100000, 0xFFFFFFFF. Kernel region: BIOS, text and data, free memory. User region: user text and data, user stack, heap.]

Figure 1-1. Layout of a virtual address space

Each process has a thread of execution (or thread for short) that executes the process's instructions. A thread can be suspended and later resumed. To switch transparently between processes, the kernel suspends the currently running thread and resumes another process's thread. Much of the state of a thread (local variables, function call return addresses) is stored on the thread's stacks. Each process has two stacks: a user stack and a kernel stack (p->kstack). When the process is executing user instructions, only its user stack is in use, and its kernel stack is empty. When the process enters the kernel (via a system call or interrupt), the kernel code executes on the process's kernel stack; while a process is in the kernel, its user stack still contains saved data, but isn't actively used. A process's thread alternates between actively using the user stack and the kernel stack. The kernel stack is separate (and protected from user code) so that the kernel can execute even if a process has wrecked its user stack.

When a process makes a system call, the processor switches to the kernel stack, raises the hardware privilege level, and starts executing the kernel instructions that implement the system call. When the system call completes, the kernel returns to user space: the hardware lowers its privilege level, switches back to the user stack, and resumes executing user instructions just after the system call instruction. A process's thread can "block" in the kernel to wait for I/O, and resume where it left off when the I/O has finished.

p->state indicates whether the process is allocated, ready to run, running, waiting for I/O, or exiting.

p->pgdir holds the process's page table, in the format that the x86 hardware expects. xv6 causes the paging hardware to use a process's p->pgdir when executing that process. A process's page table also serves as the record of the addresses of the physical pages allocated to store the process's memory.
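
The three pieces of per-process state described above can be pictured as fields of one structure. The following is a simplified sketch, not the actual definition at (2103): the real struct proc has more fields, the state names other than UNUSED and EMBRYO do not appear in this section and should be checked against the source, and proc_is_runnable is a hypothetical helper added only for illustration.

```c
#include <stdint.h>

/* Run states matching the description of p->state: unused, being
 * allocated, waiting for I/O, ready to run, running, exiting.
 * Names other than UNUSED and EMBRYO are assumptions here. */
enum procstate { UNUSED, EMBRYO, SLEEPING, RUNNABLE, RUNNING, ZOMBIE };

typedef uint32_t pde_t;   /* one 32-bit x86 page-directory entry */

struct context;           /* saved kernel registers; layout omitted */

/* Simplified per-process state (sketch only). */
struct proc {
  pde_t *pgdir;             /* page table, in the format the hardware expects */
  char *kstack;             /* kernel stack for this process */
  enum procstate state;     /* run state */
  int pid;                  /* process identifier */
  struct context *context;  /* where the kernel thread resumes */
};

/* Hypothetical helper: true when the process is ready to run. */
static int proc_is_runnable(const struct proc *p) {
  return p->state == RUNNABLE;
}
```

Because UNUSED is the first enumerator, a zero-filled struct proc starts out as an unused slot.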

Code: the first address space


[Figure omitted. Virtual address space labels: 0x80000000, 0x80100000, 0xFFFFFFFF, BIOS, text and data. Physical memory labels: 0, 4 Mbyte, Top physical memory, BIOS, kernel text and data.]

Figure 1-2. Layout of a virtual address space

When a PC powers on, it initializes itself and then loads a boot loader from disk into memory and executes it. Appendix B explains the details. Xv6's boot loader loads the xv6 kernel from disk and executes it starting at entry (1040). The x86 paging hardware is not enabled when the kernel starts; virtual addresses map directly to physical addresses.

The boot loader loads the xv6 kernel into memory at physical address 0x100000. The reason it doesn't load the kernel at 0x80100000, where the kernel expects to find its instructions and data, is that there may not be any physical memory at such a high address on a small machine. The reason it places the kernel at 0x100000 rather than 0x0 is that the address range 0xa0000:0x100000 contains I/O devices.

To allow the rest of the kernel to run, entry sets up a page table that maps virtual addresses starting at 0x80000000 (called KERNBASE (0207)) to physical addresses starting at 0x0 (see Figure 1-1). Setting up two ranges of virtual addresses that map to the same physical memory range is a common use of page tables, and we will see more examples like this one.

The entry page table is defined in main.c (1311). We look at the details of page tables in Chapter 2, but the short story is that entry 0 maps virtual addresses 0:0x400000 to physical addresses 0:0x400000. This mapping is required as long as entry is executing at low addresses, but will eventually be removed.

Entry 512 maps virtual addresses KERNBASE:KERNBASE+0x400000 to physical addresses 0:0x400000. (Each entry covers 4 Mbytes, so with KERNBASE at 0x80000000 the index is 0x80000000 / 0x400000 = 512.) This entry will be used by the kernel after entry has finished; it maps the high virtual addresses at which the kernel expects to find its instructions and data to the low physical addresses where the boot loader loaded them. This mapping restricts the kernel instructions and data to 4 Mbytes.
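
The two mappings can be sketched as a 1024-entry table in which only two slots are filled. This is a minimal illustration, not the definition at (1311): the names entrypgdir_sketch, pdx, and translate are hypothetical, the flag values are the standard x86 page-directory bits, and the translation is done in software here whereas the paging hardware does it on a real machine. With KERNBASE at 0x80000000 and 4-Mbyte entries, the kernel mapping's index works out to 0x80000000 >> 22 = 512.

```c
#include <stdint.h>

#define KERNBASE   0x80000000u  /* first kernel virtual address (from the text) */
#define PDXSHIFT   22           /* each directory entry covers 1 << 22 = 4 Mbytes */
#define NPDENTRIES 1024         /* entries in an x86 page directory */

/* Standard x86 page-directory flag bits. */
#define PTE_P  0x001u           /* present */
#define PTE_W  0x002u           /* writeable */
#define PTE_PS 0x080u           /* entry maps one 4-Mbyte page directly */

typedef uint32_t pde_t;

/* Directory index: the top 10 bits of a virtual address. */
static unsigned pdx(uint32_t va) { return va >> PDXSHIFT; }

/* Both filled entries point at physical address 0. */
static pde_t entrypgdir_sketch[NPDENTRIES] = {
  [0]                    = 0 | PTE_P | PTE_W | PTE_PS,  /* VA 0:0x400000 */
  [KERNBASE >> PDXSHIFT] = 0 | PTE_P | PTE_W | PTE_PS,  /* VA KERNBASE:KERNBASE+0x400000 */
};

/* Software model of the lookup for 4-Mbyte pages.
 * Returns 1 and stores the physical address if va is mapped, else 0. */
static int translate(const pde_t *pgdir, uint32_t va, uint32_t *pa) {
  pde_t e = pgdir[pdx(va)];
  if (!(e & PTE_P))
    return 0;
  *pa = (e & 0xFFC00000u) | (va & 0x003FFFFFu);
  return 1;
}
```

Under these assumptions, virtual addresses 0x00100000 and 0x80100000 both translate to physical address 0x100000, while an address just past the first 4 Mbytes, such as 0x00400000, is unmapped.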

Returning to entry, it loads the physical address of entrypgdir into control register %cr3. The paging hardware must know the physical address of entrypgdir, because it doesn't know how to translate virtual addresses yet; it doesn't have a page table yet. The symbol entrypgdir refers to an address in high memory, and the macro V2P_WO (0220) subtracts KERNBASE in order to find the physical address. To enable the paging hardware, xv6 sets the flag CR0_PG in the control register %cr0.
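
The address arithmetic in this step is small enough to write out. The sketch below assumes KERNBASE is 0x80000000 as stated earlier and that CR0_PG is bit 31 of %cr0, which is where the x86 architecture places the paging-enable flag. The function names v2p, p2v, and with_paging_enabled are hypothetical, and the last one only computes a new register value; actually loading %cr0 or %cr3 requires privileged instructions that are not shown.

```c
#include <stdint.h>

#define KERNBASE 0x80000000u  /* first kernel virtual address (from the text) */
#define CR0_PG   0x80000000u  /* paging-enable flag: bit 31 of %cr0 */

/* Kernel virtual address -> physical address, as V2P_WO does by
 * subtracting KERNBASE. Only meaningful for addresses at or above KERNBASE. */
static uint32_t v2p(uint32_t va) { return va - KERNBASE; }

/* The inverse direction: physical address -> kernel virtual address. */
static uint32_t p2v(uint32_t pa) { return pa + KERNBASE; }

/* Value to write back to %cr0 in order to turn paging on. */
static uint32_t with_paging_enabled(uint32_t cr0) { return cr0 | CR0_PG; }
```

For example, a symbol linked at 0x80100000 has physical address 0x100000, which is where the boot loader placed the start of the kernel.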

The processor is still executing instructions at low addresses after paging is enabled, which works since entrypgdir maps low addresses. If xv6 had omitted entry 0 from entrypgdir, the computer would have crashed when trying to execute the instruction after the one that enabled paging.

Now entry needs to transfer to the kernel's C code, and run it in high memory. First it makes the stack pointer, %esp, point to memory to be used as a stack (1054). All symbols have high addresses, including stack, so the stack will still be valid even when the low mappings are removed. Finally entry jumps to main, which is also a high address. The indirect jump is needed because the assembler would otherwise generate a PC-relative direct jump, which would execute the low-memory version of main. Main cannot return, since there's no return PC on the stack. Now the kernel is running in high addresses in the function main (1217).

Code: creating the first process

After main initializes several devices and subsystems, it creates the first process by calling userinit (1239). Userinit's first action is to call allocproc. The job of allocproc (2205) is to allocate a slot (a struct proc) in the process table and to initialize the parts of the process's state required for its kernel thread to execute. Allocproc is called for each new process, while userinit is called only for the very first process. Allocproc scans the proc table for a slot with state UNUSED (2211-2213). When it finds an unused slot, allocproc sets the state to EMBRYO to mark it as used and gives the process a unique pid (2201-2219). Next, it tries to allocate a kernel stack for the process's kernel thread. If the memory allocation fails, allocproc changes the state back to UNUSED and returns zero to signal failure.
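
The slot search and the failure path described above can be sketched in ordinary C. This is an illustration under stated assumptions, not the code at (2205): the table size NPROC, the names ptable_sketch, allocproc_sketch, alloc_ok, and alloc_fail, and the use of malloc in place of the kernel's page allocator are all hypothetical, and the locking that a kernel needs around a shared process table is omitted.

```c
#include <stddef.h>
#include <stdlib.h>

#define NPROC 64   /* assumed size of the process table */

enum procstate { UNUSED, EMBRYO };  /* only the states this step touches */

struct proc {
  enum procstate state;
  int pid;
  char *kstack;
};

static struct proc ptable_sketch[NPROC];  /* zero-filled, so every slot is UNUSED */
static int nextpid = 1;

/* Stand-ins for the kernel stack allocator (hypothetical). */
typedef char *(*stack_alloc_fn)(void);
static char *alloc_ok(void)   { return malloc(4096); }
static char *alloc_fail(void) { return NULL; }

/* Claim the first UNUSED slot, give it a pid, then try to get a kernel
 * stack. On allocation failure, release the slot and return NULL. */
static struct proc *allocproc_sketch(stack_alloc_fn alloc_stack) {
  struct proc *p;
  for (p = ptable_sketch; p < &ptable_sketch[NPROC]; p++) {
    if (p->state != UNUSED)
      continue;
    p->state = EMBRYO;        /* mark the slot as taken */
    p->pid = nextpid++;       /* pids are never reused in this sketch */
    p->kstack = alloc_stack();
    if (p->kstack == NULL) {
      p->state = UNUSED;      /* undo the claim on failure */
      return NULL;
    }
    return p;
  }
  return NULL;                /* table is full */
}
```

Calling allocproc_sketch(alloc_ok) on a fresh table returns the first slot in state EMBRYO; calling it with alloc_fail leaves the slot it tried UNUSED, so a later call can claim the same slot again.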

Now allocproc must set up the new process's kernel stack. allocproc is written so that it can be used by fork as well as when creating the first process. allocproc sets up the new process with a specially prepared kernel stack and set of kernel registers that cause it to "return" to user space when it first runs. The layout of the prepared kernel stack will be as shown in Figure 1-3. allocproc does part of this work by setting up return program counter values that will cause the new process's kernel thread to first execute in forkret and then in trapret (2236-2241). The kernel thread will start executing with register contents copied from p->context. Thus setting p->context->eip to forkret will cause the kernel thread to execute at the start of forkret (2533). This function will return to whatever address is at the bottom of the stack. The context switch code (2708) sets the stack pointer to point just beyond the end of p->context. allocproc places p->context on the stack, and puts a pointer to trapret just above it; that is where forkret will return. trapret restores user registers from values stored at the top of the kernel stack and jumps into the process (3027). This setup is the same for ordinary fork and for creating the first process, though in the latter case the process will start executing at user-space location zero rather than at
