Operating Systems and PAE Support
Updated: July 14, 2006
PAE: 32- vs. 64-Bit Systems
Addressing
physical memory above 4 GB requires more than the 32 bits of address
offered by the standard operating mode of Intel (32-bit) processors. To
this end, Intel introduced the 36-bit physical addressing mode called
PAE, starting with the Intel Pentium Pro processor.
This article
describes some techniques that Microsoft Windows operating systems and
several UNIX operating systems use to provide support to applications
using PAE mode addressing. Because processes running in these
environments have 32-bit pointers, the operating system must manage and
present PAE's 36 bits of address in such a way that the applications
can practically use it. The key question is: how does the operating
system solve this problem? The performance, functionality, simplicity
of programming, and reliability of how these issues are handled will
determine the usefulness of the large memory support.
PAE is
supported only on 32-bit versions of the Windows operating system;
64-bit versions of Windows do not support PAE. For information about
device driver and system requirements for 64-bit versions of Windows,
see 64-bit System Design.
The Address Windowing Extension (AWE) API is supported on 32-bit
systems. It is also supported on x64 systems for both native and Wow64
applications.
Although support for PAE memory is typically
associated with support for more than 4 GB of RAM, PAE can be enabled
on Windows XP SP2, Windows Server 2003, and later 32-bit versions of
Windows to support hardware-enforced Data Execution Prevention (DEP).
The
information in this article applies to Windows 2000, Windows XP
Professional, Windows Server™ 2003, and later versions of these
operating systems, referred to as "Windows" in this paper.
Technical Background
Address Translation in standard 32-bit mode
All IA-32 processors (Intel Pentium, Pentium Pro, Pentium II Xeon, and
Pentium III Xeon) support 32 bits of physical address (4 GB), and
applications running on them can address 4 GB of virtual address space.
The system must translate the 32-bit virtual addresses that applications
and the operating system use into the 32-bit physical addresses used by
the hardware. (The Pentium Pro was the first processor in the IA-32
family to support PAE, but 36-bit physical addressing also requires
chipset support, which was usually lacking.)
Windows
uses two levels of mapping to do the translation, which is facilitated
by a set of data structures called page directories and page tables
that the memory manager creates and maintains.
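To make the two-level mapping concrete, here is a minimal sketch in C of how a 32-bit virtual address breaks down under standard IA-32 4-KB paging; the helper name is illustrative, not part of Windows.

    #include <stdint.h>
    #include <stdio.h>

    /* Standard IA-32 4-KB paging: 10-bit directory index, 10-bit table
       index, 12-bit byte offset (10 + 10 + 12 = 32 bits of virtual address). */
    static void split_va_32(uint32_t va)
    {
        uint32_t pde_index = (va >> 22) & 0x3FF;  /* selects one of 1024 PDEs  */
        uint32_t pte_index = (va >> 12) & 0x3FF;  /* selects one of 1024 PTEs  */
        uint32_t offset    =  va        & 0xFFF;  /* byte within the 4-KB page */

        printf("VA 0x%08X -> PDE %u, PTE %u, offset 0x%03X\n",
               va, pde_index, pte_index, offset);
    }

    int main(void)
    {
        split_va_32(0x12345678);  /* PDE 72, PTE 837, offset 0x678 */
        return 0;
    }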
PSE Mode
IA-32 supports two methods to access memory above 4 GB (32 bits). PSE
(Page Size Extension) was the first method, which shipped with the
Pentium II. This method offers a compatibility advantage because it
kept the PTE (page table entry) size of 4 bytes. However, the only
practical implementation of this is through a driver. This approach
suffers from significant performance limitations, due to a buffer copy
operation necessary for reading and writing above 4 GB. PSE mode is
used in the PSE 36 RAM disk usage model.
PSE
uses a standard 1K-entry page directory and no page tables, extending
the page size to 4 MB (and eliminating one level of indirection for that
mode). Each Page Directory Entry (PDE) contains 14 bits of address,
which, combined with the 22-bit byte index, yields the 36-bit extended
physical address. Both 4-KB and 4-MB pages are supported simultaneously
below 4 GB, with the 4-KB pages handled in the standard way.
Note that pages located above 4 GB must use PSE mode (with 4-MB page sizes).
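To make the PSE arithmetic concrete, a small C sketch (with an illustrative helper name) of how a 14-bit page frame number from the PDE combines with the 22-bit byte index to form a 36-bit physical address:

    #include <stdint.h>
    #include <stdio.h>

    /* PSE-36 with 4-MB pages: the 32-bit virtual address is a 10-bit
       directory index plus a 22-bit byte index; the selected PDE holds a
       14-bit page frame number, giving 14 + 22 = 36 bits of physical address. */
    static uint64_t pse36_physical(uint32_t pde_frame_14, uint32_t va)
    {
        uint32_t byte_index = va & 0x3FFFFF;              /* low 22 bits   */
        return ((uint64_t)(pde_frame_14 & 0x3FFF) << 22)  /* 14-bit frame  */
               | byte_index;                              /* 36-bit result */
    }

    int main(void)
    {
        /* A frame number of 0x0801 places the 4-MB page just above 8 GB. */
        printf("0x%09llX\n",
               (unsigned long long)pse36_physical(0x0801, 0x00123456));
        return 0;
    }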
PAE Mode
PAE is the second method of accessing memory above 4 GB, and it has
been widely implemented. PAE maps up to 64 GB of physical memory into a
32-bit (4 GB) virtual address space using either 4-KB or 2-MB pages.
The page directories and page tables are extended to 8-byte formats,
allowing the base addresses of page tables and page frames to grow from
20 to 24 bits. This is where the extra four bits are introduced to
complete the 36-bit physical address.
Windows
supports PAE with 4-KB pages. PAE also supports a 2-MB page mode, on
which many of the UNIX operating systems rely. In that mode, address
translation is done without page tables (the PDE supplies the page
frame address directly).
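As a companion sketch (again with an illustrative helper name), here is how PAE with 4-KB pages carves up the same 32-bit virtual address: a 2-bit page-directory-pointer index, a 9-bit directory index, a 9-bit table index, and a 12-bit byte offset, with the 8-byte PTE supplying a 24-bit page frame number.

    #include <stdint.h>
    #include <stdio.h>

    /* PAE with 4-KB pages: 2 + 9 + 9 + 12 = 32 bits of virtual address.
       Page directories and page tables have 512 8-byte entries each, and a
       PTE carries a 24-bit frame number: 24 + 12 = 36 bits of physical address. */
    static void split_va_pae(uint32_t va)
    {
        uint32_t pdpt_index = (va >> 30) & 0x3;    /* one of 4 page directories */
        uint32_t pde_index  = (va >> 21) & 0x1FF;  /* one of 512 PDEs           */
        uint32_t pte_index  = (va >> 12) & 0x1FF;  /* one of 512 PTEs           */
        uint32_t offset     =  va        & 0xFFF;  /* byte within the 4-KB page */

        printf("VA 0x%08X -> PDPT %u, PDE %u, PTE %u, offset 0x%03X\n",
               va, pdpt_index, pde_index, pte_index, offset);
    }

    int main(void)
    {
        split_va_pae(0x12345678);
        /* The 36-bit physical address is (24-bit frame from the PTE << 12) | offset. */
        return 0;
    }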
Operating System Implementation and Application Support
The
next issue is how the operating system can manage and present PAE's 36
bits of address in such a way that an application (with 32-bit
pointers) can practically use the additional memory.
There are
five application support models. The first two (Server Consolidation
and Large Cache) are handled completely within the operating system and
require no changes to the application. The next two models (Application
Windowing and Process Fork) require application changes to support API
extensions for large memory. The last model (PSE 36 RAM Disk) requires
no changes to the operating system (it is implemented in a driver) but
mandates application changes to support the driver.
1. Server Consolidation
A PAE-enabled operating system should be capable of using all of the
physical memory in the system to load multiple applications; for
example, App #1, App #2, ... App #N, each with up to 4 GB of virtual
address space. On a system without PAE, the result can be a great deal
of paging, since physical memory is limited to a maximum of 4 GB.
With
the additional physical memory supported under PAE mode, an operating
system can keep more of these applications in memory without paging.
This is valuable in supporting server consolidation configurations,
where support of multiple applications in a single server is typically
required. Note that no application changes are required to support this
capability.
2. Large Cache
Using additional PAE-enabled memory for a data cache is also possible.
If the operating system supports this feature, applications need not be
recoded to take advantage of it. Windows Advanced Server and Datacenter
Server support caching on a PAE platform and can utilize all of the
available memory.
3. Application Windowing
A PAE-enabled operating system can introduce an API to allow a properly
coded application access to physical memory anywhere in the system,
even though it may be above 4 GB. Ideally, the API to allocate "high"
physical memory and create or move the window should be quick and
simple to code. This is highly advantageous for applications that
require fast access to large amounts of data in memory.
Sharing
high memory between processes can introduce quite a bit of complexity
into the API and the implementation. Windows avoids this kind of
sharing.
In addition, the support of paging makes the design and
implementation of the operating system much more difficult and makes
deterministic performance more difficult to achieve. Windows avoids
paging of high memory as well.
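A minimal sketch of this windowing model using the Windows AWE calls described later in this paper (AllocateUserPhysicalPages, VirtualAlloc with MEM_PHYSICAL, MapUserPhysicalPages, and FreeUserPhysicalPages); the 64-MB window size is arbitrary, error handling is reduced to early returns, and the process is assumed to hold the "Lock pages in memory" privilege.

    #include <windows.h>

    /* Allocate physical pages (which may live above 4 GB on a PAE system),
       reserve a 64-MB virtual "window", and map the pages into it. Moving the
       window later is just another MapUserPhysicalPages call with a different
       page array. Requires the "Lock pages in memory" privilege. */
    int main(void)
    {
        SYSTEM_INFO si;
        GetSystemInfo(&si);

        ULONG_PTR numPages = (64 * 1024 * 1024) / si.dwPageSize;
        PULONG_PTR frames = (PULONG_PTR)HeapAlloc(GetProcessHeap(), 0,
                                                  numPages * sizeof(ULONG_PTR));
        if (frames == NULL)
            return 1;

        /* 1. Allocate the physical pages. */
        if (!AllocateUserPhysicalPages(GetCurrentProcess(), &numPages, frames))
            return 1;

        /* 2. Reserve the virtual window that the pages will appear in. */
        PVOID window = VirtualAlloc(NULL, numPages * si.dwPageSize,
                                    MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
        if (window == NULL)
            return 1;

        /* 3. Map the physical pages into the window; the application can now
              use ordinary 32-bit pointers into 'window' to reach that memory. */
        if (!MapUserPhysicalPages(window, numPages, frames))
            return 1;

        ((char *)window)[0] = 42;   /* touch the high memory through the window */

        /* 4. Unmap and free when done. */
        MapUserPhysicalPages(window, numPages, NULL);
        FreeUserPhysicalPages(GetCurrentProcess(), &numPages, frames);
        VirtualFree(window, 0, MEM_RELEASE);
        HeapFree(GetProcessHeap(), 0, frames);
        return 0;
    }

Because the window is simply remapped rather than copied, the speed of the remap path (and of the TLB shoot-down it triggers) determines how practical this model is for data-intensive applications.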
4. Process Fork and Shared Memory
This application support model splits the current process into two or
more nearly identical copies. A copy is made of the user and system
stacks, the allocated data space, and the registers. The major
difference is that one copy keeps the Process ID (PID) of the parent
while the other receives a new PID. The fork returns a PID: zero to the
copy that is the child, and the child's PID to the copy that is the
parent.
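For reference, a minimal C illustration of the fork return-value convention described above, using the POSIX fork() call (this model applies to the UNIX systems discussed here, not to Windows):

    #include <stdio.h>
    #include <sys/types.h>
    #include <unistd.h>

    int main(void)
    {
        pid_t pid = fork();   /* duplicate the process: stacks, data, registers */

        if (pid == 0) {
            /* Child copy: fork() returned zero; the child has a new PID. */
            printf("child:  my PID is %d\n", (int)getpid());
        } else if (pid > 0) {
            /* Parent copy: fork() returned the child's PID. */
            printf("parent: my PID is %d, child PID is %d\n",
                   (int)getpid(), (int)pid);
        }
        return 0;
    }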
5. PSE36 RAM Disk
Through use of a kernel device driver, much like a RAM disk, it is
possible to utilize memory above 4 GB with no change whatsoever to the
operating system. Compatibility between the base operating system
(running in 32-bit mode) and the driver (running in PSE mode) is
maintained because the page table entries are kept 4 bytes wide. The
trade-offs for this very low development impact are several:
• Performance degrades due to all I/O being forced to perform double buffering.
• Application development impact is not appreciably less than that required for current APIs.
• It cannot be used as a "consolidation server" because all applications share the same 4 GB physical memory space.
Design Implementation
The operating system implementations for large memory support must
directly address these issues in order to be successful. The
simplicity, reliability, and performance of the operating system will
be directly impacted, based on the design choices made in handling
these issues.
Technical Issues with Large Memory Support in IA32
Memory Sharing and Inter-Process Communications
In all cases where memory remap is being used for allocating memory to
processes, which is common to many PAE variants, memory sharing is
problematic. The physical memory being remapped is "outside" the
process virtual address space. Thus, the physical memory is less
connected to the process in the sense of sharing the process's internal
access and security controls, as well as those provided by the
operating system.
To
apply access and security controls, it is necessary to greatly increase
the bookkeeping required of the operating system memory manager as well
as the API set the application developer must use. This negatively
impacts the high performance possible using very fast remap operations.
It is also important to remember that IPC/memory sharing may still take
place between two processes' virtual address spaces in any case,
regardless of the physically mapped memory each may be using.
TLB Shoot-down
Translation Look-aside Buffers (TLBs) are processor registers, or a
cache, that provide a direct logical-to-physical mapping of page table
entries. Once the TLB is loaded, the processor has to read the page
directories only infrequently (on TLB misses) unless a task switch occurs.
During
a remap operation, it is necessary to ensure that all processors have a
valid logical-to-physical mapping on chip. Remap operations therefore
require a TLB shoot-down, because the remap invalidates the existing
logical-to-physical association (where "logical" means the
application/process view of memory).
There is a performance impact while the
processor (or processors) reload the TLB. All operating systems have
this issue, and in the case of PAE memory support, they ameliorate the
issue in different ways:
• Windows provides the ability for a single application to "batch" the required remap operations so that they all happen at once, causing a single TLB shoot-down and a single performance dip instead of scattered remaps, each of which would impact performance. This is quite adequate for large applications, which typically run on single-purpose systems. (A sketch of this batching appears after this list.)
• Other operating systems provide "victim" buffers or allow one process to share another process's mappings, but at the cost of more synchronization and API complexity.
Windows XP also provides this "batch" or Scatter/Gather functionality. Additionally, the performance of these operations was improved for Windows Server 2003, Enterprise Edition and Datacenter Edition.
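As an illustration of the batching mentioned in the first bullet, a sketch that assumes the AWE call MapUserPhysicalPagesScatter and a window and frame array obtained as in the earlier AWE sketch; remapping all the pages in one call means one TLB shoot-down instead of one per page.

    #include <windows.h>

    /* Remap 'numPages' AWE pages in a single call: each entry of 'va' is the
       virtual page that should map the corresponding physical frame in 'frames'.
       One call means one TLB shoot-down instead of numPages of them. */
    static BOOL remap_batched(PVOID window, ULONG_PTR numPages,
                              PULONG_PTR frames, DWORD pageSize)
    {
        PVOID *va = (PVOID *)HeapAlloc(GetProcessHeap(), 0,
                                       numPages * sizeof(PVOID));
        if (va == NULL)
            return FALSE;

        for (ULONG_PTR i = 0; i < numPages; i++)
            va[i] = (char *)window + i * pageSize;

        BOOL ok = MapUserPhysicalPagesScatter(va, numPages, frames);
        HeapFree(GetProcessHeap(), 0, va);
        return ok;
    }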
I/O
At one level or another, all the PAE variants support both 32-bit and
64-bit DMA I/O devices with the attendant drivers. However, there are a
number of provisos and conditions.
Kernel and memory organization
Typically, kernel memory space organization is unchanged from the
standard kernel for the operating system. In many cases, items such as
the memory pool size remain the same. For backward compatibility, PCI
base address registers (BARs) remain the same. Larger memory sizes
cause some shifting of kernel address space, usually when between 16 GB
and 32 GB of memory is physically present in the system.
One difference between operating systems is whether memory allocations are dynamic:
• Some operating systems require the administrator to configure the amount of memory used for various purposes (caching, mapping, consolidation, and so on).
• Windows does not require the administrator to configure memory allocations, because the usage is dynamic, within the constraints of the APIs used.
Hardware Support
The PCI standard provides a method whereby adapters may physically
address more than 4 GB of memory by sending the high 32 bits of address
and the low 32 bits of address in two separate sends. This is called
Dual Address Cycle (DAC) and is used both for 32-bit adapters that
understand 64-bit addresses but have only 32 address lines and for
adapters that do have 64 address lines. This is a backward
compatibility feature.
Given
the method by which PCI addresses memory beyond 32 bits, there is a
subtle failure mode. Any I/O range that "spans" two 4-GB regions must
be treated specially. If it is not, the address range will be decoded
correctly for only one part of the transfer, and the remaining part
will be transposed to an incorrect memory location. This corrupts
memory and will crash the system, crash the application, or silently
corrupt data at that location. Applications cannot prevent this because
they are presented only with virtual addresses and have no visibility
into the physical level. All operating systems that use PAE face this
problem, but some do not explicitly prevent it and instead depend on
the device driver to take the correct actions.
Windows, however, explicitly prevents this problem. When an I/O range
spans two 4-GB regions in this fashion, Windows returns two separate
addresses and ranges to the device and driver. The final special case
is the first transition from below 4 GB to beyond it: no DAC is
required for the region below 4 GB, but DAC is required for the rest of
the transfer. Again, Windows returns two separate addresses and ranges
in this case to prevent memory corruption.
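A hypothetical sketch of that splitting rule, purely to show the arithmetic: any physical range that crosses a 4-GB boundary is handed back as separate address/length pairs.

    #include <stdint.h>
    #include <stdio.h>

    #define FOUR_GB 0x100000000ULL

    /* Split a physical range at each 4-GB boundary it crosses, as an
       illustration of why Windows hands the driver two address/length pairs
       for a transfer that spans regions. */
    static void split_at_4gb(uint64_t base, uint64_t length)
    {
        while (length > 0) {
            uint64_t regionEnd = (base / FOUR_GB + 1) * FOUR_GB;
            uint64_t chunk = (base + length > regionEnd) ? regionEnd - base : length;
            printf("segment: base 0x%010llX length 0x%llX\n",
                   (unsigned long long)base, (unsigned long long)chunk);
            base += chunk;
            length -= chunk;
        }
    }

    int main(void)
    {
        /* A 1-MB transfer starting 256 KB below the 4-GB line is split into
           a 256-KB piece (no DAC needed) and a 768-KB piece (DAC required). */
        split_at_4gb(FOUR_GB - 0x40000, 0x100000);
        return 0;
    }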
Obviously, DAC or 64-bit adapters
and drivers provide the best performance as no buffering of I/O occurs.
This buffering is required, however, whenever the adapter and driver
cannot utilize more than 32 bits of address information. All operating
systems that utilize PAE mode addressing support this "double
buffering" in some fashion, as a backward compatibility feature. This
buffering does have a performance penalty that is dependent on several
factors:
• Adapter hardware performance
• Driver performance
• Operating system support provided for double buffering
• Amount of physical memory installed in the system
As
physical memory increases, the proportion of I/O addresses beyond 32
bits grows relative to those below 32 bits. In most cases, the
operating system transparently provides double buffering, although some
Unix variants provide no assistance here and require 32-bit devices and
drivers to manage their own double-buffering routines and allocations.
Driver Issues
Typically, device drivers must be modified in a number of small ways.
Although the actual code changes may be small, they can be difficult.
This is because, when PAE memory addressing is not in use, a device
driver can assume that physical addresses fit within the same 32-bit
limit as virtual addresses. PAE memory makes this assumption untrue.
Several
assumptions and shortcuts that could previously be used safely no
longer apply. In general, these fall into three categories:
• Code that allocates and aligns shared memory buffers must be modified so that it does not ignore the upper 32 bits of the physical address.
• Truncation of address information, in the many locations where it might be kept, must be avoided (see the sketch after this list).
• Virtual and physical address references must be strictly segregated so that DMA operations do not transfer information to or from random memory locations.
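A small sketch of the truncation pitfall from the second category, using the kernel-mode types PHYSICAL_ADDRESS (a 64-bit LARGE_INTEGER) and MmGetPhysicalAddress; the routine and variable names are illustrative only.

    #include <ntddk.h>

    /* On a PAE system the physical address of a buffer can exceed 32 bits,
       so keeping it in a 32-bit variable silently truncates it. */
    VOID ExamplePhysicalAddressUse(PVOID buffer)
    {
        PHYSICAL_ADDRESS pa = MmGetPhysicalAddress(buffer);  /* 64-bit value */

        /* Wrong: drops the upper 32 bits; DMA would target the wrong frame. */
        ULONG truncated = pa.LowPart;

        /* Right: carry the full 64-bit QuadPart through to the hardware
           programming (or use the system DMA support, which handles this). */
        ULONGLONG full = (ULONGLONG)pa.QuadPart;

        DbgPrint("truncated=0x%08X full=0x%I64X\n", truncated, full);
    }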
PAE mode can
be enabled on Windows XP SP2, Windows Server 2003 SP1, and later
versions of Windows to support hardware-enforced DEP. However, many
device drivers designed for these systems may not have been tested on
configurations with PAE enabled. To limit the impact on device driver
compatibility, changes were made to the hardware abstraction layer
(HAL) in Windows XP SP2 and Windows Server 2003 SP1 Standard Edition
that limit the physical address space to 4 GB. Driver developers are
encouraged to read about DEP.
Paging
Most operating systems supporting PAE support virtual memory paging of
some nature for the physical memory beyond 4 GB. This usually occurs
with some restrictions such as limiting the boot/system paging file to
4 GB or spreading the paging file (or files) across multiple operating
system-organized volumes (not necessarily physical spindles).
Although
this allows the obvious benefits of virtual memory, the downside is the
performance impact on applications that have one or more of the
following characteristics:
• Use a large amount of physical memory for their data sets
• Do a great deal of I/O
• Have large executable working sets
Finally, paging support typically comes at the expense of increasing the API set and slowing development and version migration.
User APIs
All operating systems supporting PAE have APIs that let processes use
physical memory beyond the virtual address range possible on IA-32
processors. These APIs differ primarily in how much support they
provide for the items described earlier: memory sharing, inter-process
communications, paging, and so on. Windows provides a simple and
straightforward API set, the Address Windowing Extensions (AWE) API
set, which consists of only five API calls; the most complex competing
API set is four times larger and involves both kernel-level and
user-level calls.
The
proliferation of proprietary APIs, some of which are tied directly to
the processor architecture at the kernel level, makes porting
applications from one Unix variant to another expensive and
time-consuming, and turns it into a constant struggle to balance cost
against performance optimization. Windows provides an API set that is
simple, fast, and completely portable between 32-bit and 64-bit
hardware platforms, requiring only a recompile in order to function.
Page Size
Almost all operating systems supporting PAE use differing page sizes
when providing physical memory beyond 4 GB to an application. The
primary exception is Windows, which presents only 4-KB pages to
applications on IA-32 platforms (this differs on Itanium-based platforms).
The
issue with exposing varying page sizes to applications is the
additional application complexity required to function correctly with
differing memory allocation sizes, as well as subtle effects on the
assumptions about page size that almost all applications make. Although
research shows that a small class of applications can benefit from
larger page sizes (2 MB or 4 MB), because each TLB entry then spans a
greater address range, the general rule is that applications do not
benefit from larger page sizes.
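To make the TLB-coverage point concrete, a short back-of-the-envelope calculation; the 64-entry TLB size is an assumed, illustrative figure.

    #include <stdio.h>

    int main(void)
    {
        /* Assume a data TLB with 64 entries (illustrative figure only). */
        const unsigned long long entries = 64;
        const unsigned long long small   = 4ULL * 1024;          /* 4-KB pages */
        const unsigned long long large   = 2ULL * 1024 * 1024;   /* 2-MB pages */

        /* Address range covered without a TLB miss. */
        printf("4-KB pages: %llu KB covered\n", entries * small / 1024);
        printf("2-MB pages: %llu MB covered\n", entries * large / (1024 * 1024));
        return 0;
    }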
Windows and PAE
Windows 2000 Professional; Windows XP: AWE API and 4 GB of physical RAM
Windows XP SP2 and later: AWE API and 4 GB of physical address space
Windows 2000 Server; Windows Server 2003, Standard Edition: AWE API and 4 GB of RAM
Windows Server 2003 SP1, Standard Edition: AWE API and 4 GB of physical address space
Windows Server 2003, Enterprise Edition: 8 processors and 32 GB RAM
Windows Server 2003 SP1, Enterprise Edition: 8 processors and 64 GB RAM
Windows 2000 Advanced Server: 8 processors and 8 GB RAM
Windows 2000 Datacenter Server: 32 processors and 32 GB RAM (support for 64 GB was not offered because of a lack of systems for testing)
Windows Server 2003, Datacenter Edition: 32 processors and 64 GB RAM
Windows Server 2003 SP1, Datacenter Edition: 32 processors and 128 GB RAM
For more information about PAE and Windows, including guidelines for developers, see PAE Memory and Windows.