x86 Segmentation and Paging Overview

09 Mar 2016 · 5 minute read

Introduction

x86’s memory management is both segmented and paged.

Segmentation
- Segmentation divides memory into separate code, data, and stack segments. This allows multiple programs to run simultaneously without interfering with each other.
Paging
- Paging provides conventional virtual memory management: it maps fixed-size pages of the linear address space onto physical memory.

Concepts:

Logical Address:

The addresses that appear in instructions are logical addresses. A logical address is not a real address; it is a relative address that must be translated into a physical address before the content at that address can be accessed. Under x86 segmentation, a logical address has the form [segment identifier : offset]. The segment identifier is given by a selector, and the offset is the position inside that segment.

Linear Address:

Also known as a virtual address, a linear address is produced by address translation. Like a logical address, it is not a real address. Under x86 segmentation, the address that goes into the segmentation step is the logical address, and the address that comes out is the linear address (segment base + offset). Under paging, the address that goes into the paging step is the linear address, and the address that comes out is the physical address. The address space formed by all linear addresses, which contains every segment, is called the linear address space.
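The segmentation step described above (linear address = segment base + offset) can be sketched as follows. This is a minimal illustration, not a model of the real hardware: the `Segment` class, the `logical_to_linear` function, and the example base and limit values are all hypothetical names and numbers chosen for the example, and the limit check is simplified to a single comparison.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base: int   # linear address at which the segment starts
    limit: int  # highest valid offset inside the segment (simplified)


def logical_to_linear(segment: Segment, offset: int) -> int:
    """Segmentation step: linear address = segment base + offset."""
    if offset > segment.limit:
        # Real hardware would raise a protection fault here.
        raise ValueError("offset exceeds segment limit")
    return segment.base + offset


# Hypothetical data segment starting at 0x00400000 with a 64 KiB span.
data_segment = Segment(base=0x0040_0000, limit=0xFFFF)
print(hex(logical_to_linear(data_segment, 0x1234)))  # 0x401234
```

The result is still not a physical address; when paging is enabled it is the input to the paging step.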

Overview

32-bit paging, 4-KByte pages
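For this paging mode, a 32-bit linear address is split into a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit offset within the 4-KByte page. The text does not spell this split out, so the sketch below is only an illustration; `split_linear` is a hypothetical helper name.

```python
def split_linear(linear: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address for 32-bit paging with 4-KByte pages."""
    directory_index = (linear >> 22) & 0x3FF  # bits 31..22
    table_index = (linear >> 12) & 0x3FF      # bits 21..12
    page_offset = linear & 0xFFF              # bits 11..0
    return directory_index, table_index, page_offset


directory_index, table_index, page_offset = split_linear(0x00401234)
print(directory_index, table_index, hex(page_offset))  # 1 1 0x234
```

The two indexes select a page-directory entry and then a page-table entry; the physical address is the page frame found there plus the unchanged 12-bit offset.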

Segmentation: (Logical Address -> Linear Address)

Segment Registers

To reduce addressing time and code complexity, the processor provides 6 segment registers to store selectors. Each register supports references to a specific type of memory (code, stack, or data). The segments essential to program execution, namely the code segment, stack segment, and data segment, are associated with the Code Segment Register, Stack Segment Register, and Data Segment Register, respectively. These registers hold the corresponding segment selectors.

While a program may have many segments, only 6 of them can be used at any one time. To access any other segment, its selector must first be loaded into one of these registers.

CS: Code Segment Register
SS: Stack Segment Register
DS: Data Segment Register
ES: Additional Data Segment Register
FS: Additional Data Segment Register
GS: Additional Data Segment Register

Each register is divided into two parts:

Visible Part: Used to store the selector.
Hidden Part: When a selector is loaded, the processor also copies the base address, length limit, and access privileges from the corresponding descriptor into this part. This lets the processor use the descriptor's information during addressing without an additional bus cycle to read the descriptor table.
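The effect of the hidden part can be illustrated with a small sketch. Everything here is hypothetical and simplified: `Descriptor`, `SegmentRegister`, the list used as a descriptor table, and the access value are stand-ins chosen for the example, and the table index is taken from the upper 13 bits of the selector.

```python
import copy
from dataclasses import dataclass


@dataclass
class Descriptor:
    base: int
    limit: int
    access: int


class SegmentRegister:
    def __init__(self) -> None:
        self.selector = 0   # visible part
        self.hidden = None  # hidden part: cached copy of the descriptor

    def load(self, selector: int, table: list) -> None:
        """Loading a selector reads the descriptor once and caches a copy."""
        self.selector = selector
        self.hidden = copy.copy(table[selector >> 3])

    def translate(self, offset: int) -> int:
        """Later accesses use only the cached copy, never the table."""
        return self.hidden.base + offset


# Entry 0 is unused; the access value 0x92 is only an illustrative number.
table = [None, Descriptor(base=0x10000, limit=0xFFFF, access=0x92)]
ds = SegmentRegister()
ds.load(0x08, table)

# Changing the table does not affect the register until it is reloaded.
table[1] = Descriptor(base=0x20000, limit=0xFFFF, access=0x92)
print(hex(ds.translate(0x10)))  # 0x10010
ds.load(0x08, table)
print(hex(ds.translate(0x10)))  # 0x20010
```

This is also why a change to a descriptor only takes effect after the segment register is reloaded.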

Instructions for manipulating segment registers:

Instructions that operate on segment registers directly include MOV, POP, LDS, LES, LSS, LGS, and LFS. These instructions reference a segment register explicitly. Instructions that operate on segment registers implicitly include CALL, JMP, RET, SYSENTER, SYSEXIT, and others. These instructions may change the contents of a segment register, most often the Code Segment Register.

Selector:

In x86, a selector is needed to locate a specific segment. The selector is loaded into one of the 6 segment registers. Each selector is 16 bits long. The upper 13 bits (bits 15..3) are the index of an entry in a segment descriptor table. The next bit (bit 2, the table indicator) says whether that descriptor is in the GDT (0) or the LDT (1). The lowest two bits (bits 1..0) hold the requested privilege level (RPL), which is used in privilege and protection checks. If the index is 0 and the table is the GDT, the selector is a null selector, and the corresponding segment register cannot be used to access memory.

Memory Management Registers:

GDTR: The register that holds information about the GDT (Global Descriptor Table). It contains a 32-bit base address and a 16-bit length limit. The LGDT and SGDT instructions load and store the contents of GDTR, respectively. The CPU locates the GDT through this base address.

LDTR: The register that records where the LDT (Local Descriptor Table) is. It holds a 32-bit segment base address, a length limit, and other attributes. LDTR also includes a 16-bit selector, which locates the LDT's own descriptor in the GDT.
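The selector fields and the GDTR contents described above can be combined into a small sketch that finds where a descriptor lives in memory. The function names and example values are hypothetical. The sketch assumes 8-byte descriptor table entries and the 6-byte memory image used by LGDT and SGDT in 32-bit mode (16-bit limit followed by 32-bit base), and it only handles selectors that refer to the GDT.

```python
import struct

DESCRIPTOR_SIZE = 8  # bytes per descriptor table entry


def decode_selector(selector: int) -> tuple[int, str, int]:
    """Split a 16-bit selector into (index, table, RPL)."""
    index = selector >> 3                              # bits 15..3
    table = "LDT" if (selector >> 2) & 1 else "GDT"    # bit 2
    rpl = selector & 0b11                              # bits 1..0
    return index, table, rpl


def unpack_gdtr(raw: bytes) -> tuple[int, int]:
    """Decode the 6-byte LGDT/SGDT operand into (base, limit)."""
    limit, base = struct.unpack("<HI", raw)
    return base, limit


def descriptor_address(selector: int, base: int, limit: int) -> int:
    """Linear address of the GDT entry that a selector refers to."""
    index, table, _rpl = decode_selector(selector)
    if table != "GDT":
        raise NotImplementedError("LDT lookups are not modelled here")
    offset = index * DESCRIPTOR_SIZE
    if offset + DESCRIPTOR_SIZE - 1 > limit:
        # The limit is the offset of the last valid byte of the table.
        raise ValueError("selector index is outside the GDT")
    return base + offset


# Hypothetical GDT with 3 entries (limit = 3 * 8 - 1) at linear address 0x1000.
gdtr_image = struct.pack("<HI", 3 * DESCRIPTOR_SIZE - 1, 0x1000)
base, limit = unpack_gdtr(gdtr_image)

print(decode_selector(0x2B))                       # (5, 'GDT', 3)
print(hex(descriptor_address(0x10, base, limit)))  # 0x1010
```

An LDT lookup would follow the same pattern, using the base and limit held in LDTR instead of those in GDTR.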