Kernel page-table isolation
{{Redirect|KPTI}}
File:Kernel page-table isolation.svg
Kernel page-table isolation (KPTI or PTI, previously called KAISER){{Cite news |url=https://lwn.net/Articles/741878/|title=The current state of kernel page-table isolation |author-last=Corbet |author-first=Jonathan |author-link=Jonathan Corbet |date=2017-12-20 |work=LWN.net }}{{Cite news |url=https://www.bleepingcomputer.com/news/security/os-makers-preparing-patches-for-secret-intel-cpu-security-bug/ |title=OS Makers Preparing Patches for Secret Intel CPU Security Bug |author-last=Cimpanu |author-first=Catalin |date=2018-01-03 |work=Bleeping Computer |language=en-us}} is a Linux kernel feature that mitigates the Meltdown security vulnerability (affecting mainly Intel's x86 CPUs){{Cite news |url=https://www.extremetech.com/computing/261439-spectre-meltdown-new-critical-security-flaws-explored-explained |title=Spectre, Meltdown: Critical CPU Security Flaws Explained – ExtremeTech |date=2018-01-04 |work=ExtremeTech |access-date=2018-01-05 |language=en-US}} and improves kernel hardening against attempts to bypass kernel address space layout randomization (KASLR). 
It works by better isolating user space and kernel space memory.{{Cite news |url=https://lwn.net/Articles/738975/ |title=KAISER: hiding the kernel from user space |author-last=Corbet |author-first=Jonathan |author-link=Jonathan Corbet |date=2017-11-15 |work=LWN.net}}{{Cite conference |date=2017-06-24 |author-last1=Gruss |author-first1=Daniel |author-last2=Lipp |author-first2=Moritz |author-last3=Schwarz |author-first3=Michael |author-last4=Fellner |author-first4=Richard |author-last5=Maurice |author-first5=Clémentine |author-last6=Mangard |author-first6=Stefan |title=KASLR is Dead: Long Live KASLR |url=https://gruss.cc/files/kaiser.pdf |conference=Engineering Secure Software and Systems 2017}} KPTI was merged into Linux kernel version 4.15,{{Cite news |url=https://lwn.net/Articles/742404/ |title=Kernel page-table isolation merged |author-last=Corbet |author-first=Jonathan |author-link=Jonathan Corbet |date=2017-12-20 |work=LWN.net}} and backported to Linux kernels 4.14.11, 4.9.75, and 4.4.110.{{Cite web |url=https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.14.11 |title=Linux 4.14.11 Changelog |author-last=Kroah-Hartman |author-first=Greg |date=2018-01-02 |website=kernel.org }}{{Cite web |url=https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.9.75 |title=Linux 4.9.75 Changelog |author-last=Kroah-Hartman |author-first=Greg |date=2018-01-05 |website=kernel.org }}{{Cite web |url=https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.4.110 |title=Linux 4.4.110 Changelog |author-last=Kroah-Hartman |author-first=Greg |date=2018-01-05 }} Windows{{Cite tweet |number=930412525111296000 |user=aionescu |title=Windows 17035 Kernel ASLR/VA Isolation In Practice |author-first=Alex |author-last=Ionescu |date=2017-11-14}} and macOS{{Cite web |url=http://appleinsider.com/articles/18/01/03/apple-has-already-partially-implemented-fix-in-macos-for-kpti-intel-cpu-security-flaw |title=Apple has already partially implemented fix in macOS for 'KPTI' Intel CPU security flaw 
|website=AppleInsider |date=3 January 2018 |language=en-US |access-date=2018-01-03}} released similar updates. KPTI does not address the related Spectre vulnerability.{{Cite news |url=https://techcrunch.com/2018/01/03/kernel-panic-what-are-meltdown-and-spectre-the-bugs-affecting-nearly-every-computer-and-device/ |title=Kernel panic! What are Meltdown and Spectre, the bugs affecting nearly every computer and device? |author-last=Coldewey |author-first=Devin |date=2018-01-04 |work=TechCrunch |language=en}}
Background on KAISER
The KPTI patches were based on KAISER (short for Kernel Address Isolation to have Side-channels Efficiently Removed), a technique conceived in 2016{{cite web |author-first=Daniel |author-last=Gruss |date=2018-01-03 |title=#FunFact: We submitted #KAISER to #bhusa17 and got it rejected |via=Twitter |url=https://twitter.com/lavados/status/948536300830851072 |access-date=2018-01-08 |url-status=live |archive-url=https://web.archive.org/web/20180108013055/https://mobile.twitter.com/lavados/status/948536300830851072 |archive-date=2018-01-08}} and published in June 2017, before Meltdown was known. KAISER makes it harder to defeat KASLR, a mitigation adopted in 2014 for a much less severe issue.
In 2014, the Linux kernel adopted kernel address space layout randomization (KASLR),{{cite web |url=http://kernelnewbies.org/Linux_3.14#head-192cae48200fccde67b36c75cdb6c6d8214cccb3 |title=Linux kernel 3.14, Section 1.7. Kernel address space randomization |date=2014-03-30 |website=kernelnewbies.org |access-date=2014-04-02}} which makes it more difficult to exploit other kernel vulnerabilities{{Cite book |url=https://books.google.com/books?id=roM4DwAAQBAJ&pg=PA56 |title=Architectural and Operating System Support for Virtual Memory |author-last1=Bhattacharjee |author-first1=Abhishek |author-last2=Lustig |author-first2=Daniel |date=2017-09-29 |publisher=Morgan & Claypool Publishers |isbn=978-1-62705-933-6 |pages=56 |language=en}} and relies on kernel address mappings remaining hidden from user space.{{Cite news |url=http://www.eweek.com/security/kpti-intel-chip-flaw-exposes-security-risks |title=KPTI Intel Chip Flaw Exposes Security Risks |author-last=Kerner |author-first=Sean Michael |date=2018-01-03 |work=eWEEK |language=en-US}} Although access to these kernel mappings is prohibited, several side-channel attacks on modern processors can leak the location of this memory, making it possible to work around KASLR.{{Cite book |author-last1=Jang |author-first1=Yeongjin |author-last2=Lee |author-first2=Sangho |author-last3=Kim |author-first3=Taesoo |title=Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security |chapter=Breaking Kernel Address Space Layout Randomization with Intel TSX |date=2016 |chapter-url=http://people.oregonstate.edu/~jangye/assets/papers/2016/jang:drk-bh.pdf |series=CCS '16 |location=New York, NY, USA |publisher=ACM |pages=380–392 |doi=10.1145/2976749.2978321 |isbn=978-1-4503-4139-4|doi-access=free }}{{Cite book |author-last1=Gruss |author-first1=Daniel |author-last2=Maurice |author-first2=Clémentine |author-last3=Fogh |author-first3=Anders |author-last4=Lipp |author-first4=Moritz 
|author-last5=Mangard |author-first5=Stefan |title=Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security |chapter=Prefetch Side-Channel Attacks |date=2016 |chapter-url=https://gruss.cc/files/prefetch.pdf |series=CCS '16 |location=New York, NY, USA |publisher=ACM |pages=368–379 |doi=10.1145/2976749.2978356 |isbn=978-1-4503-4139-4|s2cid=15973158 }}{{Cite book |author-last1=Hund |author-first1=R. |author-last2=Willems |author-first2=C. |author-last3=Holz |author-first3=T. |title=2013 IEEE Symposium on Security and Privacy |chapter=Practical Timing Side Channel Attacks against Kernel Space ASLR |date=May 2013 |chapter-url=https://www.ieee-security.org/TC/SP2013/papers/4977a191.pdf |pages=191–205 |doi=10.1109/sp.2013.23 |isbn=978-0-7695-4977-4 |s2cid=215754624 }}
KAISER addressed these problems in KASLR by eliminating some sources of address leakage. Whereas KASLR merely prevents address mappings from leaking, KAISER also prevents the data from leaking, thereby covering the Meltdown case.{{cite web |url=https://meltdownattack.com/meltdown.pdf |title=Meltdown}}
KPTI is based on KAISER. Without KPTI enabled, Linux keeps its entire kernel memory mapped in the page tables whenever user-space code (applications) is executing, although that memory is protected from access. The advantage is that when an application makes a system call into the kernel or an interrupt is received, the kernel page tables are already present, so most context-switching overheads (TLB flushes, page-table swapping, etc.) can be avoided.
Meltdown vulnerability and KPTI
In January 2018, the Meltdown vulnerability was published, known to affect Intel's x86 CPUs and ARM Cortex-A75.{{Cite news |url=https://www.extremetech.com/computing/261439-spectre-meltdown-new-critical-security-flaws-explored-explained |title=Spectre, Meltdown: Critical CPU Security Flaws Explained – ExtremeTech |date=2018-01-04 |work=ExtremeTech |access-date=2018-01-05 |language=en-US}}{{Cite news |url=https://techcrunch.com/2018/01/03/kernel-panic-what-are-meltdown-and-spectre-the-bugs-affecting-nearly-every-computer-and-device/ |title=Kernel panic! What are Meltdown and Spectre, the bugs affecting nearly every computer and device? |author-last=Coldewey |author-first=Devin |date=2018-01-04 |work=TechCrunch |language=en}} It was a far more severe vulnerability than the KASLR bypass that KAISER was originally intended to fix: the contents of kernel memory could also be leaked, not just the locations of memory mappings, as previously thought.
KPTI (conceptually based on KAISER) mitigates Meltdown by ensuring that most protected locations are no longer mapped in the page tables used while user-space code runs.
AMD x86 processors are not currently known to be affected by Meltdown and do not need KPTI as a mitigation.{{Cite news |url=https://www.amd.com/en/corporate/speculative-execution |title=An Update on AMD Processor Security |date=2018-01-04 |publisher=AMD}} However, AMD processors are still susceptible to KASLR bypass when KPTI is disabled.
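Whether a given Linux system considers itself affected can usually be read from the kernel's sysfs vulnerability report. The following is a minimal sketch, assuming the kernel exposes <code>/sys/devices/system/cpu/vulnerabilities/meltdown</code> and reports one of the common status prefixes ("Not affected", "Mitigation", "Vulnerable"). The helper names are hypothetical and are not part of any kernel interface.

```python
from pathlib import Path

# Assumed location of the kernel's Meltdown status report (present on
# kernels that ship the sysfs "vulnerabilities" directory).
MELTDOWN_STATUS = Path("/sys/devices/system/cpu/vulnerabilities/meltdown")


def classify_meltdown_status(text: str) -> str:
    """Map the kernel's status line to a coarse label (hypothetical helper)."""
    status = text.strip()
    if status.startswith("Not affected"):
        return "not-affected"
    if status.startswith("Mitigation"):
        return "mitigated"
    if status.startswith("Vulnerable"):
        return "vulnerable"
    return "unknown"


def read_meltdown_status(path: Path = MELTDOWN_STATUS) -> str:
    """Read and classify the status file; 'unknown' if it cannot be read."""
    try:
        return classify_meltdown_status(path.read_text())
    except OSError:
        return "unknown"


if __name__ == "__main__":
    print(read_meltdown_status())
```

On a processor that is not affected by Meltdown, such as the AMD parts mentioned above, the report would be expected to fall into the "not-affected" class, while an affected processor running with KPTI would fall into "mitigated".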
Implementation
KPTI fixes these leaks by separating user-space and kernel-space page tables entirely. One set of page tables includes both kernel-space and user-space addresses, as before, but it is used only when the system is running in kernel mode. The second set, for use in user mode, contains a copy of the user-space mappings and a minimal set of kernel-space mappings that provide what is needed to enter or exit system calls, interrupts and exceptions.
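The two sets of page tables can be illustrated with a toy model. This is a conceptual sketch only: real page tables are hardware-defined multi-level structures, and the class, function and page descriptions below are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PageTables:
    """Toy page-table set: maps a virtual page address to a description."""
    mappings: dict = field(default_factory=dict)

    def can_translate(self, page: int) -> bool:
        # A page that is not mapped cannot be translated at all.
        return page in self.mappings


def build_kpti_tables(user_pages: dict, kernel_pages: dict, entry_pages: list):
    """Return (kernel_view, user_view) in the spirit of KPTI.

    kernel_view maps user and kernel pages, as before KPTI.
    user_view maps user pages plus only the minimal kernel pages
    needed to enter or exit the kernel.
    """
    kernel_view = PageTables({**user_pages, **kernel_pages})
    minimal_kernel = {page: kernel_pages[page] for page in entry_pages}
    user_view = PageTables({**user_pages, **minimal_kernel})
    return kernel_view, user_view


if __name__ == "__main__":
    user = {0x1000: "application code"}
    kernel = {0xFFFF0000: "entry/exit stub", 0xFFFF8000: "kernel data"}
    kernel_view, user_view = build_kpti_tables(user, kernel, [0xFFFF0000])
    print(user_view.can_translate(0xFFFF8000))    # kernel data in user mode
    print(kernel_view.can_translate(0xFFFF8000))  # kernel data in kernel mode
```

In this model the kernel data page is simply absent from the user-mode set, so there is no translation for a speculative access to follow, whereas the entry stub remains reachable so that the switch into the kernel can still take place.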
On processors that support process-context identifiers (PCID), a translation lookaside buffer (TLB) flush can be avoided when switching between the two sets of page tables, but even then KPTI comes at a significant performance cost, particularly in syscall-heavy and interrupt-heavy workloads.{{Cite news |url=https://www.theregister.co.uk/2018/01/02/intel_cpu_design_flaw/ |title=Kernel-memory-leaking Intel processor design flaw forces Linux, Windows redesign |author-last1=Leyden |author-first1=John |date=2018-01-02 |journal=The Register |author-last2=Williams |author-first2=Chris}}
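On x86 Linux systems, PCID support is normally advertised as the <code>pcid</code> flag in <code>/proc/cpuinfo</code>. The sketch below parses that file's <code>flags</code> line. The function names are hypothetical, and the layout of <code>/proc/cpuinfo</code> differs on other architectures, so this is an assumption that holds only for x86.

```python
def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags from the first 'flags' line of cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_pcid(cpuinfo_text: str) -> bool:
    """True if the processor advertises process-context identifiers."""
    return "pcid" in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            print("PCID supported:", has_pcid(handle.read()))
    except OSError:
        print("could not read /proc/cpuinfo")
```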
The overhead was measured to be 0.28% according to KAISER's original authors; a Linux developer measured it to be roughly 5% for most workloads and up to 30% in some cases, even with the PCID optimization; for database engine PostgreSQL the impact on read-only tests on an Intel Skylake processor was 7–17% (or 16–23% without PCID),{{cite web |url=https://www.postgresql.org/message-id/20180102222354.qikjmf7dvnjgbkxe%40alap3.anarazel.de |title=heads up: Fix for intel hardware bug will lead to performance regressions |author-first=Andres |author-last=Freund |work=PostgreSQL development mailing list (pgsql-hackers) |date=2018-01-02}} while a full benchmark lost 13–19% (Coffee Lake vs. Broadwell-E).{{cite web |url=https://www.phoronix.com/scan.php?page=article&item=linux-415-x86pti&num=2 |title=Initial Benchmarks Of The Performance Impact Resulting From Linux's x86 Security Changes |work=Phoronix |author-first=Michael |author-last=Larabel |date=2018-01-02}} Many benchmarks have been done by Phoronix,{{Cite web |url=https://www.phoronix.com/scan.php?page=news_item&px=x86-PTI-Initial-Gaming-Tests |title=Linux Gaming Performance Doesn't Appear Affected By The x86 PTI Work |author-last=Larabel |author-first=Michael |date=2018-01-02 |website=Phoronix |language=en}}{{Cite web |url=https://www.phoronix.com/scan.php?page=article&item=linux-kpti-kvm |title=VM Performance Showing Mixed Impact With Linux 4.15 KPTI Patches – Phoronix |author-last=Larabel |author-first=Michael |date=2018-01-03 |website=Phoronix |language=en}}{{Cite web |url=https://www.phoronix.com/scan.php?page=article&item=linux-more-x86pti |title=Further Analyzing The Intel CPU "x86 PTI Issue" On More Systems |author-last=Larabel |author-first=Michael |date=2018-01-03 |website=Phoronix |language=en}} Redis slowed by 6–7%. 
Linux kernel compilation slowed down by 5% on Haswell.{{Cite web |url=https://medium.com/@loganaden/linux-kpti-performance-hit-on-real-workloads-8da185482df3 |title=Linux KPTI performance hit on real workloads |author-last=Velvindron |author-first=Loganaden |date=2018-01-04 |website=Loganaden Velvindron |access-date=2018-01-05}}
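Because the cost is paid on every transition between user mode and kernel mode, a loop of cheap system calls gives a rough feel for it. The following is a minimal sketch rather than a rigorous benchmark: it times <code>os.getppid()</code>, the measurement includes Python interpreter overhead, and the function name and call count are arbitrary choices.

```python
import os
import time


def mean_syscall_ns(calls: int = 200_000) -> float:
    """Average wall-clock nanoseconds per os.getppid() call.

    Each call enters and leaves the kernel once, so the result is
    sensitive to the cost of the user/kernel transition.
    """
    start = time.perf_counter_ns()
    for _ in range(calls):
        os.getppid()
    elapsed = time.perf_counter_ns() - start
    return elapsed / calls


if __name__ == "__main__":
    print(f"{mean_syscall_ns():.0f} ns per call (includes interpreter overhead)")
```

Running the same script on the same machine with KPTI enabled and then disabled would give a relative comparison, although the absolute numbers depend heavily on the processor, kernel version and interpreter.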
KPTI can be partially disabled with the "nopti" kernel boot option. Provisions were also made to disable KPTI if newer processors fix the information leaks.
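Whether such an option was passed at boot can be checked by inspecting the kernel command line, which Linux exposes in <code>/proc/cmdline</code>. The sketch below looks for "nopti" and, as an assumption drawn from the kernel's command-line parameter documentation rather than from the text above, the equivalent <code>pti=</code> form. The function name is hypothetical, and letting the last matching token win is a simplification.

```python
def pti_cmdline_setting(cmdline: str) -> str:
    """Return the PTI setting requested on a kernel command line.

    Returns 'off' for 'nopti', the given value for 'pti=<value>',
    and 'default' when neither option is present.
    """
    setting = "default"
    for token in cmdline.split():
        if token == "nopti":
            setting = "off"
        elif token.startswith("pti="):
            setting = token[len("pti="):]
    return setting


if __name__ == "__main__":
    try:
        with open("/proc/cmdline") as handle:
            print("PTI boot setting:", pti_cmdline_setting(handle.read()))
    except OSError:
        print("could not read /proc/cmdline")
```

A result of "default" only means that no option was given; whether KPTI is actually active then depends on the processor and the kernel's own decision.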
References
{{Reflist}}
External links
- [https://www.kernel.org/doc/html/v5.15/x86/pti.html Page Table Isolation (PTI) - The Linux Kernel documentation]
- [https://lkml.org/lkml/2017/12/18/1523 KPTI documentation patch]