Lesson 3. Booting the Computer (2)

In Lesson 2, you saw that the BIOS POST routine searches for a boot device and then loads its boot sector for execution. You may wonder how the POST routine knows whether a disk is a boot device. The secret is that, to qualify as a boot device, the first sector of the disk must end with the two bytes "55 AA" (0x55 at offset 510 and 0xAA at offset 511). These two bytes mark the sector as a boot sector. The BIOS POST routine reads the first sector of each device in a list, in order, and checks for these two bytes until it finds a match. When such a boot sector is found, it is loaded into memory at address 0x07C00. The boot sector is 512 bytes long and usually contains a boot loader, which is responsible for bringing up other boot routines or the operating system kernel.
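
To make the signature rule concrete, here is a minimal Python sketch that builds a 512-byte sector image and checks it the way the text describes. The helper names make_boot_sector and is_boot_sector are made up for this illustration, and the two code bytes 0xEB 0xFE (an x86 "jump to itself" loop) are just a placeholder for a real boot loader.

```python
SECTOR_SIZE = 512
BOOT_SIGNATURE = b"\x55\xAA"  # the last two bytes of a boot sector


def make_boot_sector(code: bytes) -> bytes:
    """Pad the given code with zeros to 510 bytes and append the signature."""
    room = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > room:
        raise ValueError("boot code does not fit in 510 bytes")
    return code + bytes(room - len(code)) + BOOT_SIGNATURE


def is_boot_sector(sector: bytes) -> bool:
    """Return True if the sector is 512 bytes long and ends with 55 AA."""
    return len(sector) == SECTOR_SIZE and sector[510:512] == BOOT_SIGNATURE


# 0xEB 0xFE encodes a short jump to itself, i.e. an endless loop.
sector = make_boot_sector(b"\xEB\xFE")
print(len(sector), is_boot_sector(sector))
```

A sector made only of zeros fails the check, which is how a blank disk gets skipped during the search.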

Before introducing the boot loader implementation, let's first figure out why the BIOS loads the boot sector at 0x07C00 instead of 0x00000 or some other location. The location is not arbitrary: the BIOS lays out the first 1 MB of memory (the range addressable in real mode) in a specific way during initialization. Figure 1 shows the memory allocation by the BIOS. The area from 0x07C00 to 0x07DFF is reserved for the boot loader. The first 1 KB of memory holds the interrupt vector table, whose entries point to the interrupt service routines stored between 0xC0000 and 0xFFFFF. The following 256 bytes are used for the BIOS data area. The 128 KB from 0xA0000 to 0xBFFFF is used for video memory. There are two free memory areas that you can use for your own purposes. Now you know that the boot sector cannot be placed arbitrarily. Otherwise data could be overwritten, leading to corrupted execution.
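
The layout can be written down as a small table. The Python sketch below is only an illustration: it lists the regions whose bounds are given explicitly above (the BIOS data area is left out) and reports which of them a given load range would touch. The names RESERVED_REGIONS and overlapping_regions are made up for this example.

```python
# (first address, last address, description), taken from the text above
RESERVED_REGIONS = [
    (0x00000, 0x003FF, "interrupt vector table"),
    (0x07C00, 0x07DFF, "boot sector"),
    (0xA0000, 0xBFFFF, "video memory"),
    (0xC0000, 0xFFFFF, "interrupt service routines"),
]


def overlapping_regions(start: int, length: int) -> list:
    """Return the names of the regions that [start, start + length) touches."""
    end = start + length - 1
    return [name for low, high, name in RESERVED_REGIONS
            if start <= high and end >= low]


# Loading 512 bytes at 0x00000 would overwrite the interrupt vector table.
print(overlapping_regions(0x00000, 512))
# Loading 512 bytes at 0x07C00 lands exactly in the boot sector area.
print(overlapping_regions(0x07C00, 512))
```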
The main job of the boot loader is to fetch the OS kernel into memory and then start it. Without the support of high-level OS disk routines, the boot loader has to rely on the BIOS interrupt routines for disk operations. This is also why the boot loader cannot be loaded at 0x00000: that area holds the interrupt vector table, which the interrupt services depend on. The interrupt service you want to use is INT 0x13, which provides all kinds of disk operations. Before calling INT 0x13, you have to set up a list of parameters so that the interrupt service routine knows what action to perform. The parameters are passed in registers, and the results are returned in registers, as listed below:
                    Parameters:
                    AH = Function index, 0x02 means to read sectors from drive
                    AL = Number of sectors to read
                    CH = Cylinder number
                    CL = Sector number
                    DH = Head number
                    DL = Drive number, 0x80 means the 1st hard disk
                    ES:BX = Buffer address pointer
                    Results:
                    CF = Set on error, clear if no error
                    AH = Return code
                    AL = Number of actual sectors read
After the interrupt has been handled, the disk sectors you specified are stored in memory starting at address ES:BX. With the environment properly configured, you can then use a jump instruction to transfer execution to the loaded code.
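
A real boot loader sets these registers in assembly. The Python sketch below only models the values, to show how the request is put together and where ES:BX points in memory. The function names are made up for this illustration, and it makes one simplifying assumption: the cylinder number fits in CH (0 to 255), so the extra cylinder bits that a real call packs into CL are not modelled.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Real-mode address translation: segment * 16 + offset."""
    return (segment << 4) + offset


def read_sectors_request(count, cylinder, sector, head, drive, es, bx):
    """Return the register values for INT 0x13 with AH = 0x02 (read sectors)."""
    if not 1 <= sector <= 63:
        raise ValueError("sector numbers start at 1 and end at 63")
    if not 0 <= cylinder <= 255:
        raise ValueError("this sketch only handles cylinders 0 to 255")
    return {
        "AH": 0x02,      # function index: read sectors from drive
        "AL": count,     # number of sectors to read
        "CH": cylinder,  # cylinder number
        "CL": sector,    # sector number
        "DH": head,      # head number
        "DL": drive,     # drive number, 0x80 is the 1st hard disk
        "ES": es,        # buffer segment
        "BX": bx,        # buffer offset
    }


# Read one sector (the one right after the boot sector) to 0x1000:0x0000.
regs = read_sectors_request(count=1, cylinder=0, sector=2, head=0,
                            drive=0x80, es=0x1000, bx=0x0000)
buffer_address = real_mode_address(regs["ES"], regs["BX"])
print(hex(buffer_address))
```

The buffer address 0x1000:0x0000 is an arbitrary choice for the example; any free area of the memory map would do.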

In summary, a boot sector is marked by the two bytes "55 AA" at the end of the sector. The boot loader in it uses BIOS interrupt 0x13 to read in the OS kernel and prepares the necessary execution environment before transferring control to the OS.

[1] Intel Architecture Software Developer's Manual Volume 3 - System Programming
[2] 操作系统引导探究 / Operating System Booting Investigation (Chinese)

Lesson 2. Booting the Computer (1)

Booting is the first task the OS has to deal with after you power on a computer, so the boot loader is probably the best starting point for studying the OS. An interesting question is: what happens when you power on the computer? Of course, the OS starts functioning. But what actions are performed exactly? To answer this question, we first have to know which microprocessor is installed. Since Intel's processors are dominant in the PC market, we assume the OS is running on an x86 machine. The "Intel Architecture Software Developer's Manual" is the best source for reference, as it describes the processor in full detail. Volume 3 is especially helpful because it describes the operating system support environment of the Intel processors. The "FreeBSD Architecture Handbook" also has a very detailed description of the hardware behavior and the FreeBSD implementation. Related references are listed at the end.

Let's start the journey by powering on the computer. The processor first performs its hardware initialization and an optional built-in self-test (BIST). One of the tasks is to set the processor's registers to a predefined state. The first instruction executed is located at physical address 0xFFFFFFF0. This starting address is formed as follows. During a hardware reset, address generation is handled somewhat differently from both real mode and protected mode. The CS (Code Segment) register has two parts: the visible segment selector part and the hidden base address part. During the reset, these two parts are loaded with 0xF000 and 0xFFFF0000 respectively. The EIP (Instruction Pointer) register is initialized to 0xFFF0. The starting address is formed by adding the value in EIP to the base address (that is, 0xFFFF0000 + 0xFFF0 = 0xFFFFFFF0). Note that the first time the CS register is loaded with a new value after a hardware reset, the processor starts to follow the real-mode address translation rule (memory addressing modes will be discussed in later topics). Thus, to ensure that the base address in the CS register remains intact during initialization, the initialization code must not contain a far jump or far call, or allow an interrupt to occur.
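
The arithmetic can be checked with a few lines of Python. This is only a sketch of the two address calculations described above; the function names are made up for this illustration.

```python
def reset_address(cs_base: int, eip: int) -> int:
    """Right after reset: hidden CS base address + EIP."""
    return cs_base + eip


def real_mode_address(selector: int, ip: int) -> int:
    """Normal real-mode rule: segment selector * 16 + offset."""
    return (selector << 4) + ip


# Values loaded by the hardware reset, as given in the text.
first_instruction = reset_address(0xFFFF0000, 0xFFF0)
print(hex(first_instruction))

# The same selector and offset under the ordinary real-mode rule,
# which applies once CS has been reloaded.
print(hex(real_mode_address(0xF000, 0xFFF0)))
```

The second value stays below 1 MB, which is why reloading CS too early would move execution away from the code near the top of the 4 GB address space.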

Control register CR0 is initialized to 0x60000010, as shown in Figure 1. This places the processor in real mode with paging disabled, so the linear address and the physical address are identical. According to the real-mode address translation rule, the maximum addressable memory space is 1 MB. However, the first instruction is at 0xFFFFFFF0, just 16 bytes below 4 GB, which is outside the memory space addressable in real mode. The hardware translates this address so that it points to a piece of code in the basic input/output system (BIOS).
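
To see why 0x60000010 means "real mode, paging disabled", the value can be split into its flag bits. The Python sketch below decodes a handful of CR0 flags using the bit positions documented in the Intel manual (PE is bit 0, ET bit 4, NW bit 29, CD bit 30, PG bit 31); the remaining flags are left out for brevity, and the names CR0_FLAGS and decode_cr0 are made up for this illustration.

```python
# Bit positions of a few CR0 flags.
CR0_FLAGS = {
    "PE": 0,   # protection enable: 0 means real mode
    "ET": 4,   # extension type
    "NW": 29,  # not write-through
    "CD": 30,  # cache disable
    "PG": 31,  # paging: 0 means paging is disabled
}


def decode_cr0(value: int) -> dict:
    """Return the state of each listed flag as True (set) or False (clear)."""
    return {name: bool((value >> bit) & 1) for name, bit in CR0_FLAGS.items()}


flags = decode_cr0(0x60000010)
for name, is_set in flags.items():
    print(name, "set" if is_set else "clear")
```

PE and PG come out clear, which corresponds to real mode with paging disabled.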

The BIOS is stored in a read-only memory (ROM) chip on the motherboard. It contains many low-level service routines designed specifically for that motherboard. After hardware initialization completes, the processor starts to fetch and execute instructions from address 0xFFFFFFF0, which resides in the BIOS ROM. Usually, a jump instruction is placed at that address, leading to the BIOS's power-on self-test (POST) routine. The POST routine performs a series of hardware tests covering memory, the system bus and other peripheral devices. It also prepares the necessary execution environment before handing over control. One important step is determining the boot device from a list of candidates (e.g. floppy disk, hard disk, CD-ROM and USB drive). Finally, the POST routine loads the first sector (512 bytes) of the boot device into memory at address 0x07C00 and then jumps to that memory location for execution. The first sector of the boot device is called the boot sector or master boot record (MBR).
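
The last step can be imitated in a few lines of Python. This is a toy model, not how a BIOS is actually written: the devices are plain byte strings, memory is a bytearray, and a device counts as bootable when its first sector ends with the "55 AA" signature described in Lesson 3. All names in the sketch are made up for this illustration.

```python
SECTOR_SIZE = 512
BOOT_ADDRESS = 0x7C00


def find_boot_device(candidates):
    """Return (name, first_sector) of the first bootable candidate, or None."""
    for name, first_sector in candidates:
        if len(first_sector) == SECTOR_SIZE and first_sector[510:512] == b"\x55\xAA":
            return name, first_sector
    return None


def load_boot_sector(memory: bytearray, sector: bytes) -> int:
    """Copy the sector to 0x7C00 and return the address to jump to."""
    memory[BOOT_ADDRESS:BOOT_ADDRESS + SECTOR_SIZE] = sector
    return BOOT_ADDRESS


blank = bytes(SECTOR_SIZE)
bootable = b"\xEB\xFE" + bytes(508) + b"\x55\xAA"  # placeholder boot code
candidates = [("floppy disk", blank), ("hard disk", bootable), ("CD-ROM", blank)]

memory = bytearray(1024 * 1024)  # the 1 MB addressable in real mode
found = find_boot_device(candidates)
if found is not None:
    name, sector = found
    entry = load_boot_sector(memory, sector)
    print("booting from", name, "at", hex(entry))
```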

In summary, after the computer is powered on, the BIOS routine loads the boot sector into memory at address 0x07C00 and starts executing instructions from there.

[1] Intel Architecture Software Developer's Manual Volume 3 - System Programming
[2] FreeBSD Architecture Handbook
[3] 操作系统引导探究 / Operating System Booting Investigation (Chinese)

eyeOS - a new OS concept

Can you imagine an OS written in PHP, XML and JavaScript? eyeOS is a web service of exactly this kind: it provides a platform that can be accessed from anywhere through the Internet. eyeOS comes with a preloaded set of applications such as a word processor, a music player, a calendar, a file manager and so on. You can find most of the common applications on this platform. You can even upload files and develop new applications of your own. The following screenshots show the user interface, which is simple and pretty. Many functions that you find in your local desktop OS are implemented.
eyeOS brings us the concept of cloud computing. The key advantages include:
(1) The centralized platform resides on the server, allowing you to work and collaborate from anywhere through the Internet.
(2) Unified formats remove the compatibility issues of traditional operating systems.
(3) Your data is stored on the server, so you never need to worry about disk failure.

If you find this interesting, check out the demo :)

Lesson 1. The Role of the Operating System

When I was a middle school student, I dreamed of writing an operating system (OS) by myself. The OS was like a Pandora's box to me, magically managing all the hardware components. It would be very exciting to develop my own. After taking the OS class in college, however, I realized that the OS is one of the most complex software systems. Since it has to communicate with all kinds of hardware devices, you can imagine how much effort it takes to develop a fully functional OS.

The mysterious inner workings of the OS have always kept me curious. Unfortunately, most OS textbooks focus only on algorithms and principles. They seldom touch the implementation details of a real-world system, which is what interests me most. I believe many people are in the same situation. The best source for studying the OS is certainly the Linux code. However, getting to know the details is a slow and painful process. As a newbie, I wasted a lot of time trying to understand some very specific implementation issues. While those issues are not critical for understanding the concepts, they become real obstacles when you try to write your own OS. This tutorial is meant to share my experience of learning about operating systems, post useful materials related to OS development and give you a starting point for building your own OS. Due to my limited knowledge, there may be errors here and there in the tutorial. Your comments are always appreciated. Last but not least, I hope your dream comes true! :)

Now, let's start the topic. As we all know, the OS is probably the first piece of software you install on your computer. It provides an abstract and consistent view of the underlying hardware so that applications can run on computers with very different hardware configurations. Figure 1 illustrates the role of the operating system in the entire system. The underlying hardware generally consists of the CPU, memory and peripheral devices. The OS interacts with these hardware components through the instruction set architecture (ISA). The ISA provides an abstract view of the underlying hardware to the OS so that the latter does not have to worry about the hardware implementation details. Using "layers" and "abstraction" is a very important design approach in computer systems, and you will become familiar with it. The ISA includes a set of machine instructions and a group of registers. By executing instructions and reading or writing the corresponding registers, the OS can instruct the hardware to accomplish a particular task and get the necessary feedback. This is exactly how the "magic" happens. Indeed, this is the universal way in which software and hardware cooperate. Thus the ISA is often called the hardware/software interface.

Now, the OS is able to control the CPU and memory transactions using the ISA. However, there are a large number of peripheral devices on the market and new products come out every day. How can the OS handle such variety? The common solution is to adopt a modular approach. The device driver module of the OS is provided by the product vendor. The device driver implements a set of predefined action handlers required by the OS. Whenever the OS wants to access a peripheral device, it requests the service through those action handlers. Most of the major OS tasks are implemented in the kernel. They include booting/initialization, process scheduling, memory management, file system management and so on. Hopefully, we can cover them!
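
The idea of "predefined action handlers" can be sketched in Python as an abstract interface that every driver must implement. This is a toy model only: real drivers are usually written in C against the kernel's own interfaces, and the names DeviceDriver, MemoryDeviceDriver and ToyKernel are made up for this illustration.

```python
from abc import ABC, abstractmethod


class DeviceDriver(ABC):
    """The set of action handlers the (toy) OS requires from every driver."""

    @abstractmethod
    def read(self, count: int) -> bytes: ...

    @abstractmethod
    def write(self, data: bytes) -> int: ...


class MemoryDeviceDriver(DeviceDriver):
    """A vendor-supplied driver for an imaginary device backed by a buffer."""

    def __init__(self):
        self._buffer = bytearray()

    def read(self, count: int) -> bytes:
        data = bytes(self._buffer[:count])
        del self._buffer[:count]
        return data

    def write(self, data: bytes) -> int:
        self._buffer.extend(data)
        return len(data)


class ToyKernel:
    """The OS side: it only ever talks to devices through the handlers."""

    def __init__(self):
        self._drivers = {}

    def register_driver(self, name: str, driver: DeviceDriver) -> None:
        self._drivers[name] = driver

    def device_write(self, name: str, data: bytes) -> int:
        return self._drivers[name].write(data)

    def device_read(self, name: str, count: int) -> bytes:
        return self._drivers[name].read(count)


kernel = ToyKernel()
kernel.register_driver("mem0", MemoryDeviceDriver())
kernel.device_write("mem0", b"hello")
print(kernel.device_read("mem0", 5))
```

The kernel code never changes when a new device arrives; only a new driver class is registered.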

The OS exposes all the services it provides to applications through a mechanism commonly called the "system call". An OS can have more than one hundred system calls, serving all kinds of requests. The role of system calls is similar to that of the ISA: they are the interface between user applications and the OS. By providing system calls, the OS further abstracts its implementation details and allows user applications to control the underlying hardware indirectly and safely without knowing any specifics.
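
You can try system calls from a user application without writing any kernel code. In Python, the functions in the os module such as os.pipe, os.write and os.read are low-level wrappers around the corresponding services of the operating system (on a Unix-like system, the pipe, write and read system calls). A small sketch:

```python
import os

# Ask the OS for a pipe: a pair of file descriptors managed by the kernel.
read_fd, write_fd = os.pipe()

# Hand the bytes to the OS; it reports how many bytes it accepted.
written = os.write(write_fd, b"hello")
os.close(write_fd)

# Ask the OS to give back up to 16 bytes from the other end of the pipe.
data = os.read(read_fd, 16)
os.close(read_fd)

print(written, data)
```

The program never touches any hardware or kernel data structure directly. It only names what it wants through file descriptors, and the OS does the rest.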

In conclusion, the role of the OS is to provide services to user applications while hiding the complexity of the underlying hardware as much as possible. So in order to understand an operating system implementation, knowledge of microprocessor architecture and hardware organization is necessary. But do not worry too much, as I will present the related background when we need it. I will also demonstrate some experiments and code to help you understand better.