ntoskrnl.exe

{{Short description|Windows NT kernel image}}

{{Lowercase title}}

{{About|a computer file that contains a part of the Windows NT kernel|the Windows NT kernel itself|Architecture of Windows NT}}

{{More citations needed|date=April 2014}}

ntoskrnl.exe (short for Windows NT operating system kernel executable), also known as the kernel image, contains the kernel and executive layers of the Microsoft Windows NT kernel, and is responsible for hardware abstraction, process handling, and memory management. In addition to the kernel and executive layers, it contains the cache manager, security reference monitor, memory manager, scheduler (Dispatcher), and blue screen of death (the prose and portions of the code). Russinovich, M: [https://web.archive.org/web/20080408141752/http://technet.microsoft.com/en-us/sysinternals/bb897446.aspx Systems Internals Tips and Trivia], SysInternals Information

Overview

x86 versions of ntoskrnl.exe depend on bootvid.dll, hal.dll and kdcom.dll (x64 variants of ntoskrnl.exe have these DLLs embedded in the kernel to improve performance). However, it is not a native application, so it is not linked against ntdll.dll. Instead, ntoskrnl.exe has its own entry point, "KiSystemStartup", which calls the architecture-independent kernel initialization function. Because it requires a static copy of the C Runtime objects, the executable is usually about 10 MB in size.

In Windows XP and earlier, the Windows installation source ships four kernel image files to support uniprocessor systems, symmetric multiprocessor (SMP) systems, CPUs with PAE, and CPUs without PAE. Windows Setup determines whether the system is uniprocessor or multiprocessor, then installs both the PAE and non-PAE variants of the kernel image for the detected kind. On a multiprocessor system, Setup installs ntkrnlmp.exe and ntkrpamp.exe but renames them to ntoskrnl.exe and ntkrnlpa.exe respectively.
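The selection rule described above can be illustrated with a minimal sketch. The helper `select_kernel_images` and its return shape are hypothetical and are not part of Windows Setup; the sketch also assumes that on a uniprocessor system the files keep their source names, which is consistent with the filenames given in this article.

```python
# Toy model of the Windows XP-era kernel image selection described above.
# select_kernel_images is a hypothetical helper, not a real Setup routine.

# (supports SMP, supports PAE) -> filename on the installation source
SOURCE_IMAGES = {
    (False, False): "ntoskrnl.exe",
    (True, False): "ntkrnlmp.exe",
    (False, True): "ntkrnlpa.exe",
    (True, True): "ntkrpamp.exe",
}

# Name each installed file carries on disk, keyed by PAE support
INSTALLED_NAMES = {False: "ntoskrnl.exe", True: "ntkrnlpa.exe"}


def select_kernel_images(multiprocessor):
    """Return {installed name: source filename} for the detected system kind."""
    return {
        INSTALLED_NAMES[pae]: SOURCE_IMAGES[(multiprocessor, pae)]
        for pae in (False, True)
    }
```

Calling `select_kernel_images(True)` models the multiprocessor case, in which both installed files originate from the `mp` variants on the installation source.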

Starting with Windows Vista, Microsoft began unifying the kernel images as multi-core CPUs took to the market and PAE became mandatory.

{| class="wikitable sortable" style="text-align: center; margin: 0px auto;"
|+ Kernel image filenames
|-
! colspan="3" | 32-bit kernel (32-bit Windows)
|-
! Filename
! Supports SMP
! Supports PAE
|-
| ntoskrnl.exe
| {{No}}
| {{No}}
|-
| ntkrnlmp.exe
| {{Yes}}
| {{No}}
|-
| ntkrnlpa.exe
| {{No}}
| {{Yes}}
|-
| ntkrpamp.exe
| {{Yes}}
| {{Yes}}
|-
! colspan="3" | 64-bit kernel (x64 editions)
|-
! Filename
! Supports SMP
! Supports 57-bit VA
|-
| ntoskrnl.exe
| {{No}}
| {{No}}
|-
| ntkrnlmp.exe
| {{Yes}}
| {{No}}
|-
| ntkrla57.exe
| {{Yes}}
| {{Yes}}
|}

The Windows kernel follows a consistent naming convention. Functions and global variables use PascalCase names with an additional prefix that identifies the part of the kernel they belong to.

Examples are IoCreateDevice and ObReferenceObjectByHandle. The two functions carry different prefixes that identify the kernel component they belong to: Io is used for [https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/windows-kernel-mode-i-o-manager I/O Manager] functions and Ob for Object Manager functions.

Variations of these prefixes exist for internal functions that are not exported by the kernel, formed by adding an i after the first letter (e.g., Ki for “Kernel Internal”) or by appending p to the full prefix (e.g., Psp for “Process Support Internal”).

The following table lists the prefixes.

{| class="wikitable sortable"
|+ NT function prefixes
|-
! Export Prefix
! Internal Prefix
! Meaning
|-
| Cc
| Ccp
| File system cache{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff539010.aspx | title=Cache Manager Routines | publisher=Microsoft Corporation | access-date=2009-06-13 }}
|-
| Cm
| Cmp
| Configuration Manager, the kernel mode side of Windows Registry
|-
| Dbg
| Dbg
| Debugging aid functions, such as a software break point
|-
| Dbgk
| Dbgk
| A set of debugging functions that are exposed to user mode through ntdll.dll
|-
| Ex
| Exp
| Windows executive, an "outer layer" of ntoskrnl.exe
|-
| FsRtl
| FsRtlp
| File system runtime library{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff540426.aspx | title=File System Runtime Library Routines | publisher=Microsoft Corporation | access-date=2009-06-13 }}
|-
| Io
| Iop
| I/O manager{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff551797.aspx | title=I/O Manager Routines | publisher=Microsoft Corporation | access-date=2009-06-13 }}
|-
| Ke
| Ki
| Core kernel routines{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff542078.aspx | title=Core Kernel Library Support Routines | publisher=Microsoft Corporation | access-date=2009-06-13 }}
|-
|
| Kx
| Interrupt handling, semaphores, spinlocks, multithreading and context switching related functions
|-
|
| Ks
| Kernel streaming
|-
| Ldr
| Ldrp
| NT's PE executable loader
|-
| Lpc
| Lpcp
| Local Procedure Call, an internal, undocumented, interprocess or user/kernel message passing mechanism
|-
| Lsa
| Lsap
| Local Security Authority
|-
| Mm
| Mi
| Memory management
|-
| Nls
| Nls
| Native Language Support (similar to code pages)
|-
| Ob
| Obp
| Object Manager
|-
| Po
| Pop
| Plug-and-play and power management{{cite web | author=Microsoft Corporation | author-link=Microsoft Corporation | year=2009 | url=http://msdn.microsoft.com/en-us/library/ff559835.aspx | title=Power Manager Routines | publisher=Microsoft Corporation | access-date=2009-06-13 }}
|-
| Ps
| Psp
| Process and thread management (task management)
|-
| Rtl
| Rtlp
| Runtime library, i.e., many utility functions that can be used by native applications, yet don't directly involve kernel support
|-
| Se
| Sep
| Security Manager, access token for the Win32 API
|-
| Vf
| Vi
| Driver Verifier
|-
| Zw/Nt
|
| Nt or Zw are system calls declared in ntdll.dll and ntoskrnl.exe. When called from ntdll.dll in user mode, these groups are almost exactly the same; they trap into kernel mode and call the equivalent function in ntoskrnl.exe via the SSDT. When calling the functions directly in ntoskrnl.exe (only possible in kernel mode), the Zw variants ensure kernel mode, whereas the Nt variants do not.{{cite journal | author=The NT Insider | journal=OSR Online | volume=10 | issue=4 | date=August 27, 2003 | url=http://www.osronline.com/article.cfm?article=257 | title=Nt vs. Zw - Clearing Confusion On The Native API | publisher=OSR Open Systems Resources | access-date=2013-09-16 }}
|}
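The naming convention in this section can be illustrated with a minimal sketch that recovers the owning component from a routine name. The helper `classify` and its matching rule (a known prefix followed by an uppercase letter) are assumptions made for illustration. The prefix lists cover only a subset of the table above, and `PspCreateThread` is used purely as a sample name.

```python
# Hypothetical helper: map a kernel routine name to its component prefix.

# Internal prefix -> export prefix of the same component (subset of the table)
INTERNAL_PREFIXES = {
    "Ccp": "Cc", "Cmp": "Cm", "Exp": "Ex", "FsRtlp": "FsRtl", "Iop": "Io",
    "Ki": "Ke", "Ldrp": "Ldr", "Lpcp": "Lpc", "Lsap": "Lsa", "Mi": "Mm",
    "Obp": "Ob", "Pop": "Po", "Psp": "Ps", "Rtlp": "Rtl", "Sep": "Se", "Vi": "Vf",
}

EXPORT_PREFIXES = [
    "FsRtl", "Dbgk", "Dbg", "Ldr", "Lpc", "Lsa", "Nls", "Rtl", "Cc", "Cm",
    "Ex", "Io", "Ke", "Mm", "Ob", "Po", "Ps", "Se", "Vf", "Zw", "Nt",
]


def _has_prefix(name, prefix):
    # A prefix only counts when the next character starts a new PascalCase word.
    return name.startswith(prefix) and name[len(prefix):len(prefix) + 1].isupper()


def classify(name):
    """Return (component prefix, 'internal' or 'export'), or None if unknown."""
    for prefix, owner in INTERNAL_PREFIXES.items():
        if _has_prefix(name, prefix):
            return owner, "internal"
    for prefix in EXPORT_PREFIXES:
        if _has_prefix(name, prefix):
            return prefix, "export"
    return None
```

Requiring an uppercase letter after the prefix keeps `Iop` and `Io` apart: in `IoCreateDevice` the character after `Io` is uppercase, whereas in a name beginning with `Iop` the `p` is lowercase, so only the longer internal prefix matches.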

Initialization

When the kernel receives control, it gets a pointer to a structure from the bootloader. The structure contains information about the hardware, the path to the Windows Registry file, kernel parameters containing boot preferences or options that change the behavior of the kernel, and the paths of the files loaded by the bootloader (SYSTEM Registry hive, NLS files for character encoding conversion, and VGA font).{{cite web|url=http://www.nirsoft.net/kernel_struct/vista/LOADER_PARAMETER_BLOCK.html|title=struct LOADER_PARAMETER_BLOCK|website=www.nirsoft.net}} The definition of this structure can be retrieved by using the kernel debugger or downloading it from the Microsoft symbol database.{{cite book|title=Practical Reverse Engineering Using X86, X64, Arm, Windows Kernel, and Reversing Tools.|date=2014|publisher=John Wiley & Sons Inc|isbn=978-1118787311}}{{Page needed|date=October 2014}}

In the x86 architecture, the kernel receives the system already in protected mode, with the GDT, IDT and TSS ready.{{elucidate|date=October 2014}} But since it does not know the address of each one, it has to load them one by one to fill the PCR structure.{{technical statement|date=October 2014}}

The main entry point of ntoskrnl.exe performs some system-dependent initialization, then calls a system-independent initialization routine, and finally enters an idle loop.{{Contradict-inline|reason=It does? From the code sample above it looks like it calls KiInitializeKernel and then returns to caller.|date=October 2014}}

Interrupt handling

{{About|NT implementation of interrupt handlers||Interrupt handling}}

Modern operating systems use interrupts instead of I/O port polling to wait for information from devices.

In the x86 architecture, interrupts are handled through the Interrupt Dispatch Table (IDT). When a device triggers an interrupt and the interrupt flag (IF) in the FLAGS register is set, the processor's hardware looks for an interrupt handler in the table entry corresponding to the interrupt number, which in turn has been translated from the IRQ by PIC chips or, on more modern hardware, APICs. Interrupt handlers usually save some subset of the register state before handling the interrupt and restore the original values when done.

The interrupt table contains handlers for hardware interrupts, software interrupts, and exceptions. In some IA-32 versions of the kernel, one example of such a software interrupt handler (of which there are many) is IDT entry 2E<sub>16</sub> (46 in decimal), invoked in assembly language as INT 2EH for system calls. In the actual implementation, the entry points to an internal subroutine named (as per symbol information published by Microsoft) KiSystemService. Newer versions instead use mechanisms based on the SYSENTER instruction and, on x86-64, the SYSCALL instruction.

One notable feature of NT's interrupt handling is that interrupts are usually conditionally masked based on their priority (called "IRQL"), instead of disabling all IRQs via the interrupt flag. This permits various kernel components to carry on critical operations without necessarily blocking services of peripherals and other devices.{{cite web | author=CC Hameed | date=January 22, 2008 | url=https://blogs.technet.microsoft.com/askperf/2008/01/22/what-is-irql-and-why-is-it-important/ | title=What is IRQL and why is it important? {{!}} Ask the Performance Team Blog | publisher=Microsoft Corporation | access-date=2018-11-11 }}
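The priority-based masking described above can be illustrated with a toy simulation. The `Processor` class and its methods are hypothetical, and the numeric levels are arbitrary rather than real IRQL values. The sketch only models the rule that an interrupt is serviced when its level exceeds the current one and is otherwise held pending until the level drops.

```python
class Processor:
    """Toy model of IRQL-style masking; not how the kernel is implemented."""

    def __init__(self):
        self.irql = 0        # current level; 0 models the lowest one
        self.pending = []    # (level, name) pairs held back by masking
        self.serviced = []   # interrupt names in the order they were handled

    def raise_irql(self, new_irql):
        """Raise the current level and return the previous one."""
        previous = self.irql
        self.irql = max(self.irql, new_irql)
        return previous

    def lower_irql(self, new_irql):
        """Lower the level, then deliver pending interrupts that are now unmasked."""
        self.irql = new_irql
        while True:
            ready = [p for p in self.pending if p[0] > self.irql]
            if not ready:
                break
            level, name = max(ready, key=lambda p: p[0])  # highest level first
            self.pending.remove((level, name))
            self._service(level, name)

    def raise_interrupt(self, level, name):
        """Service the interrupt now if it outranks the current level, else queue it."""
        if level > self.irql:
            self._service(level, name)
        else:
            self.pending.append((level, name))

    def _service(self, level, name):
        previous, self.irql = self.irql, level  # handler runs at the interrupt's level
        self.serviced.append(name)
        self.lower_irql(previous)
```

In this model, code that raises the level to 5 still sees a level-8 interrupt serviced immediately, while a level-3 interrupt waits until the level is lowered again.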

Memory manager

{{About|NT implementation of a memory manager||memory management}}

The entire physical memory (RAM) address range is divided into many small blocks called pages, each 4 KB in size, which are mapped to virtual addresses. Some of the properties of each page are stored in structures called page table entries, which are managed by the OS and accessed by the processor's hardware. Page tables are organized into a tree structure, and the physical page number of the top-level table is stored in control register 3 (CR3).
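The tree lookup can be illustrated by splitting a virtual address into the table indices the hardware uses at each level. This sketch assumes x86-64 four-level paging with 4 KB pages (a 12-bit page offset and 9 index bits per level), which the text above does not specify. The function name is hypothetical.

```python
PAGE_SHIFT = 12   # 4 KB pages: the low 12 bits are the offset within the page
INDEX_BITS = 9    # each x86-64 page table holds 512 entries
LEVELS = 4        # assumed four-level paging


def split_virtual_address(va):
    """Return ([top-level index, ..., lowest-level index], page offset)."""
    offset = va & ((1 << PAGE_SHIFT) - 1)
    indices = []
    for level in range(LEVELS - 1, -1, -1):
        shift = PAGE_SHIFT + level * INDEX_BITS
        indices.append((va >> shift) & ((1 << INDEX_BITS) - 1))
    return indices, offset
```

The first index selects an entry in the top-level table located through CR3; each subsequent index selects an entry in the table that the previous entry points to, and the offset selects a byte within the final 4 KB page.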

Microsoft Windows divides virtual address space into two regions. The lower part, starting at zero, is instantiated separately for each process and is accessible from both user and kernel mode. Application programs run in processes and supply code that runs in user mode.

The upper part is accessible only from kernel mode, and with some exceptions, is instantiated just once, system-wide. ntoskrnl.exe is mapped into this region, as are several other kernel mode components. This region also contains data used by kernel mode code, such as the kernel mode heaps and the file system cache.

{| class="wikitable"
|+ Virtual Address Space Layouts
|-
! Arch
! MmHighestUserAddress
! MmSystemRangeStart
|-
| x86{{efn|Tunable via /userva or /3gb switch.}}
| rowspan=2 | 0x7fffffff
| rowspan=2 | 0x80000000
|-
| ARM
|-
| x86-64
| 0x000007ff'ffffffff (until Windows 8.1 Update 2)<br />0x00007fff'ffffffff (from Windows 8.1 Update 3)
| 0xffff8000'00000000
|}
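As a rough illustration of how the two boundary values partition the address space, the following sketch classifies an address using the default x86 values and the newer x86-64 values from the table. The `region` helper and the label used for addresses that fall between the two boundaries are assumptions made for this example.

```python
# (MmHighestUserAddress, MmSystemRangeStart) per architecture, from the table
LAYOUTS = {
    "x86": (0x7FFFFFFF, 0x80000000),
    "x86-64": (0x00007FFF_FFFFFFFF, 0xFFFF8000_00000000),
}


def region(arch, address):
    """Classify an address as 'user', 'system', or 'gap' (covered by neither)."""
    highest_user, system_start = LAYOUTS[arch]
    if address <= highest_user:
        return "user"
    if address >= system_start:
        return "system"
    return "gap"
```

On x86 the two values are adjacent, so every address is either user or system; on x86-64 a large range lies between them and belongs to neither region.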

Registry

{{Details|Windows Registry}}

Windows Registry is a repository for configuration and settings information for the operating system and for other software, such as applications. It can be thought of as a filesystem optimized for small files.{{cite book|last=Tanenbaum|first=Andrew S.|title=Modern operating systems|date=2008|publisher=Pearson Prentice Hall|location=Upper Saddle River, N.J.|isbn=978-0136006633|pages=829|edition=3rd}} However, it is not accessed through file system-like semantics, but rather through a specialized set of APIs, implemented in kernel mode and exposed to user mode.

The registry is stored on disk as several different files called "hives." One, the System hive, is loaded early in the boot sequence and provides configuration information required at that time. Additional registry hives, providing software-specific and user-specific data, are loaded during later phases of system initialization and during user login, respectively.

Drivers

{{further|Device driver}}

The list of drivers to be loaded from the disk is retrieved from the Services key of the current control set's key in the SYSTEM registry hive. That key stores device drivers, kernel processes and user processes. They are collectively called "services" and are all stored together in the same place.

During initialization or upon driver load request, the kernel traverses that tree looking for services tagged as kernel services.
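The filtering step can be sketched against an in-memory stand-in for the Services key. The constants are the documented values of each service's `Type` registry value. The service names, the dictionary layout, and the choice to count file system drivers as kernel services are assumptions made for illustration.

```python
# Documented values of the "Type" registry value under each service's key
SERVICE_KERNEL_DRIVER = 0x1
SERVICE_FILE_SYSTEM_DRIVER = 0x2
SERVICE_WIN32_OWN_PROCESS = 0x10

# Hypothetical snapshot of the Services key: subkey name -> its values
SAMPLE_SERVICES = {
    "exampledisk": {"Type": SERVICE_KERNEL_DRIVER, "Start": 0},
    "examplefs": {"Type": SERVICE_FILE_SYSTEM_DRIVER, "Start": 1},
    "exampleapp": {"Type": SERVICE_WIN32_OWN_PROCESS, "Start": 2},
}


def kernel_services(services):
    """Return the sorted names of entries whose Type marks a kernel-mode driver."""
    mask = SERVICE_KERNEL_DRIVER | SERVICE_FILE_SYSTEM_DRIVER
    return sorted(
        name for name, values in services.items()
        if values.get("Type", 0) & mask
    )
```

With the sample data, only the two driver entries are selected; the user-mode entry is left for a user-mode service manager to handle.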


Notes

{{notelist}}

As mentioned in [https://learn.microsoft.com/en-us/sysinternals/resources/windows-internals Windows Internals Book 7th edition], the boot-time option increaseuserva and a corresponding header in the executable image are required for this feature.

References

{{Reflist}}

Further reading

  • {{Cite book|last=Tanenbaum|first=Andrew S.|title=Modern Operating Systems|date=2008|publisher=Pearson Prentice Hall|location=Upper Saddle River, N.J.|isbn=978-0136006633|pages=829|edition=3rd}}
  • {{Cite book|author1=Bruce Dang|author2=Alexandre Gazet|author3=Elias Bachaalany|title=Practical Reverse Engineering: x86, x64, ARM, Windows Kernel, Reversing Tools, and Obfuscation|date=2014|publisher=Wiley|isbn=978-1118787311|pages=384}}