x86 instruction listings#Original 8086/8088 instructions
{{Short description|List of x86 microprocessor instructions}}
{{lowercase title|title=x86 instruction listings}}
{{X86 instruction listings}}
The x86 instruction set refers to the set of instructions that x86-compatible microprocessors support. The instructions are usually part of an executable program, often stored as a computer file and executed on the processor.
The x86 instruction set has been extended several times, introducing wider registers and datatypes as well as new functionality.{{cite web
| url = http://www.intel.com/content/www/us/en/processors/processor-identification-cpuid-instruction-note.html?wapkw=processor-identification-cpuid-instruction
| title = Re: Intel Processor Identification and the CPUID Instruction
| access-date = 2013-04-21
}}
x86 integer instructions
{{Main|x86 assembly language}}
Below is the full 8086/8088 instruction set of Intel (81 instructions total).{{Cite web | url=https://eecs.wsu.edu/~ee314/handouts/x86ref.pdf | title=Intel 80x86 Instruction Set Summary | website=eecs.wsu.edu}} These instructions are also available in 32-bit mode, in which they operate on 32-bit registers (eax, ebx, etc.) and values instead of their 16-bit (ax, bx, etc.) counterparts. The updated instruction set is grouped according to architecture (i186, i286, i386, i486, i586/i686) and is referred to as (32-bit) x86 and (64-bit) x86-64 (also known as AMD64).
= Original 8086/8088 instructions =
This is the original instruction set. In the 'Notes' column, r means register, m means memory address and imm means immediate (i.e. a value).
{{sticky header}}
class="wikitable sortable sticky-header"
|+ Original 8086/8088 instruction set ! Instruction !! Meaning !! Notes !! Opcode | |||
id=mnem-aaa
| {{mono|AAA}} | ASCII adjust AL after addition | used with unpacked binary-coded decimal | {{mono|0x37}} |
{{mono|AAD}} | ASCII adjust AX before division | The 8086/8088 datasheet documents only the base-10 version of the AAD instruction (opcode {{mono|0xD5}} {{mono|0x0A}}), but any other base will work. Later Intel documentation describes the generic form as well. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10 and ignore the argument, causing a number of incompatibilities. | {{mono|0xD5}} |
{{mono|AAM}} | ASCII adjust AX after multiplication | Only the base-10 version (operand {{mono|0x0A}}) is documented; see the notes for AAD. | {{mono|0xD4}} |
{{mono|AAS}} | ASCII adjust AL after subtraction | {{mono|0x3F}} | |
{{mono|ADC}} | Add with carry | (1) r += (r/m/imm+CF); (2) m += (r/imm+CF); | {{mono|0x10}}...{{mono|0x15}}, {{mono|0x80}}...{{mono|0x81/2}}, {{mono|0x83/2}} |
{{mono|ADD}} | Add | (1) r += r/m/imm; (2) m += r/imm; | {{mono|0x00}}...{{mono|0x05}}, {{mono|0x80/0}}...{{mono|0x81/0}}, {{mono|0x83/0}} |
{{mono|AND}} | Logical AND | (1) r &= r/m/imm; (2) m &= r/imm; | {{mono|0x20}}...{{mono|0x25}}, {{mono|0x80}}...{{mono|0x81/4}}, {{mono|0x83/4}} |
{{mono|CALL}} | Call procedure | {{code|lang=nasm|push eip; eip points to the instruction directly after the call}} | {{mono|0x9A}}, {{mono|0xE8}}, {{mono|0xFF/2}}, {{mono|0xFF/3}} |
{{mono|CBW}} | Convert byte to word | AX = AL ; sign extended | {{mono|0x98}} |
{{mono|CLC}} | Clear carry flag | CF = 0; | {{mono|0xF8}} |
{{mono|CLD}} | Clear direction flag | DF = 0; | {{mono|0xFC}} |
{{mono|CLI}} | Clear interrupt flag | IF = 0; | {{mono|0xFA}} |
{{mono|CMC}} | Complement carry flag | CF = !CF; | {{mono|0xF5}} |
{{mono|CMP}} | Compare operands | (1) r - r/m/imm; (2) m - r/imm; | {{mono|0x38}}...{{mono|0x3D}}, {{mono|0x80}}...{{mono|0x81/7}}, {{mono|0x83/7}} |
{{mono|CMPSB}} | Compare bytes in memory. May be used with a {{mono|REPE}} or {{mono|REPNE}} prefix to test and repeat the instruction {{mono|CX}} times. | {{sxhl|lang=c|1=if (DF==0) *(byte*)SI++ - *(byte*)ES:DI++;
else *(byte*)SI-- - *(byte*)ES:DI--;}} | {{mono|0xA6}} |
{{mono|CMPSW}} | Compare words. May be used with a {{mono|REPE}} or {{mono|REPNE}} prefix to test and repeat the instruction {{mono|CX}} times. | {{sxhl|lang=c|1=if (DF==0) *(word*)SI++ - *(word*)ES:DI++;
else *(word*)SI-- - *(word*)ES:DI--;}} | {{mono|0xA7}} |
{{mono|CWD}} | Convert word to doubleword | {{mono|0x99}} | |
{{mono|DAA}} | Decimal adjust AL after addition | (used with packed binary-coded decimal) | {{mono|0x27}} |
{{mono|DAS}} | Decimal adjust AL after subtraction | {{mono|0x2F}} | |
{{mono|DEC}} | Decrement by 1 | {{mono|0x48}}...{{mono|0x4F}}, {{mono|0xFE/1}}, {{mono|0xFF/1}} | |
{{mono|DIV}} | Unsigned divide | (1) AX = DX:AX / r/m; resulting DX = remainder (2) AL = AX / r/m; resulting AH = remainder | {{mono|0xF7/6}}, {{mono|0xF6/6}} |
{{mono|ESC}} | Used with floating-point unit | {{mono|0xD8}}..{{mono|0xDF}} | |
{{mono|HLT}} | Enter halt state | {{mono|0xF4}} | |
{{mono|IDIV}} | Signed divide | (1) AX = DX:AX / r/m; resulting DX = remainder (2) AL = AX / r/m; resulting AH = remainder | {{mono|0xF7/7}}, {{mono|0xF6/7}} |
{{mono|IMUL}} | Signed multiply in One-operand form | (1) DX:AX = AX * r/m; (2) AX = AL * r/m | {{mono|0xF7/5}}, {{mono|0xF6/5}} |
{{mono|IN}} | Input from port | (1) AL = port[imm]; (2) AL = port[DX]; (3) AX = port[imm]; (4) AX = port[DX]; | {{mono|0xE4}}, {{mono|0xE5}}, {{mono|0xEC}}, {{mono|0xED}} |
{{mono|INC}} | Increment by 1 | {{mono|0x40}}...{{mono|0x47}}, {{mono|0xFE/0}}, {{mono|0xFF/0}} | |
{{mono|INT}} | Call to interrupt | {{mono|0xCC}}, {{mono|0xCD}} | |
{{mono|INTO}} | Call to interrupt if overflow | {{mono|0xCE}} | |
{{mono|IRET}} | Return from interrupt | {{mono|0xCF}} | |
{{mono|Jcc}} | Jump if condition | {{mono|JA, JAE, JB, JBE, JC (same as JB), JE, JG, JGE, JL, JLE, JNA (same as JBE), JNAE (same as JB), JNB (same as JAE), JNBE (same as JA), JNC (same as JAE), JNE, JNG (same as JLE), JNGE (same as JL), JNL (same as JGE), JNLE (same as JG), JNO, JNP, JNS, JNZ (same as JNE), JO, JP, JPE (same as JP), JPO (same as JNP), JS, JZ (same as JE)}}{{cite web | url = http://unixwiz.net/techtips/x86-jumps.html | title = Intel x86 JUMP quick reference | access-date = 2025-04-01 }}. | {{mono|0x70}}...{{mono|0x7F}} |
{{mono|JCXZ}} | Jump if CX is zero | {{mono|JECXZ}} for ECX instead of CX in 32-bit mode (same opcode). | {{mono|0xE3}} |
{{mono|JMP}} | Jump | {{mono|0xE9}}...{{mono|0xEB}}, {{mono|0xFF/4}}, {{mono|0xFF/5}} | |
{{mono|LAHF}} | Load FLAGS into AH register | {{mono|0x9F}} | |
{{mono|LDS}} | Load DS:r with far pointer | r = [m]; DS = [m + 2]; | {{mono|0xC5}} |
{{mono|LEA}} | Load Effective Address | {{mono|0x8D}} | |
{{mono|LES}} | Load ES:r with far pointer | r = [m]; ES = [m + 2]; | {{mono|0xC4}} |
{{mono|LOCK}} | Assert BUS LOCK# signal | (for multiprocessing) | {{mono|0xF0}} |
{{mono|LODSB}} | Load string byte. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) AL = *SI++; else AL = *SI--;}} | {{mono|0xAC}} |
{{mono|LODSW}} | Load string word. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) AX = *SI++; else AX = *SI--;}} | {{mono|0xAD}} |
{{mono|LOOP}}/ {{mono|LOOPx}} | Loop control | ({{mono|LOOPE, LOOPNE, LOOPNZ, LOOPZ}}) {{code|lang=c|1=if (--CX && x) goto lbl;}} | {{mono|0xE0}}...{{mono|0xE2}} |
{{mono|MOV}} | Move | (1) r = r/m/imm; (2) m = r/imm; (3) r/m = sreg; (4) sreg = r/m; | {{mono|0xA0}}...{{mono|0xA3}}, {{mono|0x8C}}, {{mono|0x8E}} |
{{mono|MOVSB}} | Move byte from string to string. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{sxhl|lang=c|1=if (DF==0) *(byte*)ES:DI++ = *(byte*)SI++;
else *(byte*)ES:DI-- = *(byte*)SI--;}} | {{mono|0xA4}} |
{{mono|MOVSW}} | Move word from string to string. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{sxhl|lang=c|1=if (DF==0) *(word*)ES:DI++ = *(word*)SI++;
else *(word*)ES:DI-- = *(word*)SI--;}} | {{mono|0xA5}} |
{{mono|MUL}} | Unsigned multiply | (1) DX:AX = AX * r/m; (2) AX = AL * r/m; | {{mono|0xF7/4}}, {{mono|0xF6/4}} |
{{mono|NEG}} | Two's complement negation | {{code|lang=c|1=r/m = 0 - r/m;}} | {{mono|0xF6/3}}...{{mono|0xF7/3}} |
{{mono|NOP}} | No operation | opcode equivalent to {{code|XCHG EAX, EAX}} | {{mono|0x90}} |
{{mono|NOT}} | Negate the operand, logical NOT | {{code|lang=c|1=r/m ^= -1;}} | {{mono|0xF6/2}}...{{mono|0xF7/2}} |
{{mono|OR}} | Logical OR | (1) r ∣= r/m/imm; (2) m ∣= r/imm; | {{mono|0x08}}...{{mono|0x0D}}, {{mono|0x80}}...{{mono|0x81/1}}, {{mono|0x83/1}} |
{{mono|OUT}} | Output to port | (1) port[imm] = AL; (2) port[DX] = AL; (3) port[imm] = AX; (4) port[DX] = AX; | {{mono|0xE6}}, {{mono|0xE7}}, {{mono|0xEE}}, {{mono|0xEF}} |
{{mono|POP}} | Pop data from stack | r/m/sreg = *SP++; | {{mono|0x07}}, {{mono|0x17}}, {{mono|0x1F}}, {{mono|0x58}}...{{mono|0x5F}}, {{mono|0x8F/0}} |
{{mono|POPF}} | Pop FLAGS register from stack | FLAGS = *SP++; | {{mono|0x9D}} |
{{mono|PUSH}} | Push data onto stack | {{code|lang=c|1=*--SP = r/m/sreg;}} | {{mono|0x06}}, {{mono|0x0E}}, {{mono|0x16}}, {{mono|0x1E}}, {{mono|0x50}}...{{mono|0x57}}, {{mono|0xFF/6}} |
{{mono|PUSHF}} | Push FLAGS onto stack | {{code|lang=c|1=*--SP = FLAGS;}} | {{mono|0x9C}} |
{{mono|RCL}} | Rotate left (with carry) | {{mono|0xC0}}...{{mono|0xC1/2}} (186+), {{mono|0xD0}}...{{mono|0xD3/2}} | |
{{mono|RCR}} | Rotate right (with carry) | {{mono|0xC0}}...{{mono|0xC1/3}} (186+), {{mono|0xD0}}...{{mono|0xD3/3}} | |
{{mono|REPxx}} | Repeat MOVS/STOS/CMPS/LODS/SCAS | ({{mono|REP, REPE, REPNE, REPNZ, REPZ}}) | {{mono|0xF2}}, {{mono|0xF3}} |
{{mono|RET}} | Return from procedure | Not a real instruction. The assembler will translate these to a RETN or a RETF depending on the memory model of the target system. | |
{{mono|RETN}} | Return from near procedure | {{mono|0xC2}}, {{mono|0xC3}} | |
{{mono|RETF}} | Return from far procedure | {{mono|0xCA}}, {{mono|0xCB}} | |
{{mono|ROL}} | Rotate left | {{mono|0xC0}}...{{mono|0xC1/0}} (186+), {{mono|0xD0}}...{{mono|0xD3/0}} | |
{{mono|ROR}} | Rotate right | {{mono|0xC0}}...{{mono|0xC1/1}} (186+), {{mono|0xD0}}...{{mono|0xD3/1}} | |
{{mono|SAHF}} | Store AH into FLAGS | {{mono|0x9E}} | |
{{mono|SAL}} | Shift Arithmetically left (signed shift left) | (1) r/m <<= 1; (2) r/m <<= CL; | {{mono|0xC0}}...{{mono|0xC1/4}} (186+), {{mono|0xD0}}...{{mono|0xD3/4}} |
{{mono|SAR}} | Shift Arithmetically right (signed shift right) | (1) (signed) r/m >>= 1; (2) (signed) r/m >>= CL; | {{mono|0xC0}}...{{mono|0xC1/7}} (186+), {{mono|0xD0}}...{{mono|0xD3/7}} |
{{mono|SBB}} | Subtraction with borrow | (1) r -= (r/m/imm+CF); (2) m -= (r/imm+CF); alternative 1-byte encoding of {{nowrap|SBB AL, AL }} is available via undocumented SALC instruction | {{mono|0x18}}...{{mono|0x1D}}, {{mono|0x80}}...{{mono|0x81/3}}, {{mono|0x83/3}} |
{{mono|SCASB}} | Compare byte string. May be used with a {{mono|REPE}} or {{mono|REPNE}} prefix to test and repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) AL - *ES:DI++; else AL - *ES:DI--;}} | {{mono|0xAE}} |
{{mono|SCASW}} | Compare word string. May be used with a {{mono|REPE}} or {{mono|REPNE}} prefix to test and repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) AX - *ES:DI++; else AX - *ES:DI--;}} | {{mono|0xAF}} |
{{mono|SHL}} | Shift left (unsigned shift left) | Same opcode as SAL, since logical left shifts are equal to arithmetical left shifts. | {{mono|0xC0}}...{{mono|0xC1/4}} (186+), {{mono|0xD0}}...{{mono|0xD3/4}} |
{{mono|SHR}} | Shift right (unsigned shift right) | {{mono|0xC0}}...{{mono|0xC1/5}} (186+), {{mono|0xD0}}...{{mono|0xD3/5}} | |
{{mono|STC}} | Set carry flag | CF = 1; | {{mono|0xF9}} |
{{mono|STD}} | Set direction flag | DF = 1; | {{mono|0xFD}} |
{{mono|STI}} | Set interrupt flag | IF = 1; | {{mono|0xFB}} |
{{mono|STOSB}} | Store byte in string. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) *ES:DI++ = AL; else *ES:DI-- = AL;}} | {{mono|0xAA}} |
{{mono|STOSW}} | Store word in string. May be used with a {{mono|REP}} prefix to repeat the instruction {{mono|CX}} times. | {{code|lang=c|1=if (DF==0) *ES:DI++ = AX; else *ES:DI-- = AX;}} | {{mono|0xAB}} |
{{mono|SUB}} | Subtraction | (1) r -= r/m/imm; (2) m -= r/imm; | {{mono|0x28}}...{{mono|0x2D}}, {{mono|0x80}}...{{mono|0x81/5}}, {{mono|0x83/5}} |
{{mono|TEST}} | Logical compare (AND) | (1) r & r/m/imm; (2) m & r/imm; | {{mono|0x84}}, {{mono|0x85}}, {{mono|0xA8}}, {{mono|0xA9}}, {{mono|0xF6/0}}, {{mono|0xF7/0}} |
{{mono|WAIT}} | Wait until not busy | Waits until BUSY# pin is inactive (used with floating-point unit) | {{mono|0x9B}} |
{{mono|XCHG}} | Exchange data | {{code|lang=asm|1=r :=: r/m;}} A spinlock typically uses xchg as an atomic operation. (coma bug). | {{mono|0x86}}, {{mono|0x87}}, {{mono|0x91}}...{{mono|0x97}} |
{{mono|XLAT}} | Table look-up translation | behaves like {{code|MOV AL, [BX+AL]}} | {{mono|0xD7}} |
{{mono|XOR}} | Exclusive OR | (1) r ^= r/m/imm; (2) m ^= r/imm; | {{mono|0x30}}...{{mono|0x35}}, {{mono|0x80}}...{{mono|0x81/6}}, {{mono|0x83/6}} |
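The string-instruction pseudocode in the table (direction chosen by DF, repetition counted in CX) can be illustrated with a short Python model. This is only a sketch under stated assumptions: the function name `rep_movs`, the flat `bytearray` standing in for memory, and the omission of segment registers are all simplifications introduced here, not part of any Intel specification.

```python
def rep_movs(mem, si, di, cx, df=0, width=1):
    """Model REP MOVSB (width=1) or REP MOVSW (width=2) on a flat bytearray.

    Hypothetical helper: segment registers are ignored, so SI and DI index
    directly into `mem`. Returns the final (SI, DI, CX), wrapped to 16 bits.
    """
    step = -width if df else width        # DF=1 walks downwards, DF=0 upwards
    while cx != 0:                        # with REP, CX=0 means no iterations
        for i in range(width):            # copy one byte or one word
            mem[(di + i) & 0xFFFF] = mem[(si + i) & 0xFFFF]
        si = (si + step) & 0xFFFF
        di = (di + step) & 0xFFFF
        cx -= 1
    return si, di, cx


mem = bytearray(0x10000)                  # 64 KiB, one 8086 segment

# Forward copy (DF=0): five bytes from offset 0 to offset 16.
mem[0:5] = b"HELLO"
forward = rep_movs(mem, si=0, di=16, cx=5)

# Backward copy (DF=1): SI and DI start at the last byte of each block.
mem[0x100:0x103] = b"ABC"
backward = rep_movs(mem, si=0x102, di=0x202, cx=3, df=1)
```

After a forward copy the index registers point just past the copied block; after a backward copy they point just below it.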
= Added in specific processors =
== Added with [[Intel 80186|80186]]/[[Intel 80188|80188]] ==
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction !! Opcode !! Meaning !! Notes | |||
{{mono|BOUND}} | 62 /r | Check array index against bounds | raises software interrupt 5 if test fails |
{{mono|ENTER}} | C8 iw ib | Enter stack frame | Modifies stack for entry to procedure for high level language. Takes two operands: the amount of storage to be allocated on the stack and the nesting level of the procedure. |
rowspan="2" | {{mono|INSB/INSW}}
| 6C | rowspan="2" | Input from port to string. May be used with a REP prefix to repeat the instruction CX times. | rowspan="2"| equivalent to: IN AL, DX MOV ES:[DI], AL INC DI ; adjust DI according to operand size and DF | |||
6D | |||
{{mono|LEAVE}} | C9 | Leave stack frame | Releases the local stack storage created by the previous ENTER instruction. |
rowspan="2"| {{mono|OUTSB/OUTSW}}
| 6E | rowspan="2" | Output string to port. May be used with a REP prefix to repeat the instruction CX times. | rowspan="2"| equivalent to: MOV AL, DS:[SI] OUT DX, AL INC SI ; adjust SI according to operand size and DF | |||
6F | |||
{{mono|POPA}} | 61 | Pop all general-purpose registers from stack | equivalent to:
POP DI POP SI POP BP POP AX ; no POP SP here, all it does is ADD SP, 2 (since AX will be overwritten later) POP BX POP DX POP CX POP AX |
{{mono|PUSHA}} | 60 | Push all general-purpose registers onto stack | equivalent to:
PUSH AX PUSH CX PUSH DX PUSH BX PUSH SP ; The value stored is the initial SP value PUSH BP PUSH SI PUSH DI |
rowspan="2"| {{mono|PUSH}} immediate
| 6A ib | rowspan="2"| Push an immediate byte/word value onto the stack | rowspan="2"| example: PUSH 1200h | |||
68 iw | |||
rowspan="2" | {{mono|IMUL}} immediate
| 6B /r ib |rowspan="2"| Signed and unsigned multiplication of immediate byte/word value |rowspan="2"| example: IMUL DX,1200h IMUL CX, DX, 12h IMUL BX, SI, 1200h IMUL DI, word ptr [BX+SI], 12h IMUL SI, word ptr [BP-4], 1200h Note that since the lower half is the same for unsigned and signed multiplication, this version of the instruction can be used for unsigned multiplication as well. | |||
69 /r iw | |||
rowspan="2" | {{mono|SHL/SHR/SAL/SAR/ROL/ROR/RCL/RCR}} immediate
| C0 | rowspan="2" | Rotate/shift bits with an immediate value greater than 1 | rowspan="2" | example: SHR BL,3 | |||
C1 |
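The behaviour of PUSHA and POPA described in the table (PUSHA stores the initial SP value, and POPA skips the stored SP) can be sketched in Python. The names `pusha`, `popa` and `REG_ORDER`, and the use of a dictionary keyed by word address as the stack, are assumptions made for illustration only.

```python
# Order in which PUSHA pushes the 16-bit registers.
REG_ORDER = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]


def pusha(regs, stack):
    """Push all eight registers; the SP slot holds the SP value before PUSHA."""
    initial_sp = regs["SP"]
    for name in REG_ORDER:
        value = initial_sp if name == "SP" else regs[name]
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF
        stack[regs["SP"]] = value         # hypothetical: one word per address key


def popa(regs, stack):
    """Pop in reverse order; the stored SP word is read but discarded."""
    for name in reversed(REG_ORDER):
        value = stack[regs["SP"]]
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":
            regs[name] = value


original = {"AX": 1, "CX": 2, "DX": 3, "BX": 4,
            "SP": 0x100, "BP": 5, "SI": 6, "DI": 7}
regs = dict(original)
stack = {}

pusha(regs, stack)
sp_after_push = regs["SP"]                # 16 bytes below the initial SP
stored_sp_word = stack[0x100 - 10]        # fifth word pushed is the SP slot

for name in REG_ORDER:                    # clobber everything except SP
    if name != "SP":
        regs[name] = 0
popa(regs, stack)
```

In this model SP returns to its starting value only because eight words are popped, not because the saved SP word is loaded.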
== Added with [[80286]] ==
The new instructions added in 80286 add support for x86 protected mode. Some but not all of the instructions are available in real mode as well.
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction !! Opcode !! Instruction description !! Real mode !! Ring |
colspan="5" | |
---|
LGDT m16&32 {{efn|name=gdt_idt_descriptor|text=The descriptors used by the LGDT, LIDT, SGDT and SIDT instructions consist of a 2-part data structure. The first part is a 16-bit value, specifying table size in bytes minus 1. The second part is a 32-bit value (64-bit value in 64-bit mode), specifying the linear start address of the table. For LGDT and LIDT with a 16-bit operand size, the address is ANDed with 00FFFFFFh.
On Intel (but not AMD) CPUs, the | | Load GDTR (Global Descriptor Table Register) from memory.{{efn|name=i286_serialize|text=The | rowspan="4" {{yes}} | rowspan="6" {{no|0}} |
{{nowrap|LIDT m16&32 {{efn|name=gdt_idt_descriptor}}}}
| | Load IDTR (Interrupt Descriptor Table Register) from memory.{{efn|name=i286_serialize}} |
LMSW r/m16
| | Load MSW (Machine Status Word) from 16-bit register or memory.{{efn|text=The |
CLTS
| | Clear task-switched flag in the MSW. |
LLDT r/m16
| | Load LDTR (Local Descriptor Table Register) from 16-bit register or memory.{{efn|name=i286_serialize}} | rowspan="2" {{no|#UD}} |
LTR r/m16
| | Load TR (Task Register) from 16-bit register or memory.{{efn|name=i286_serialize}} The TSS (Task State Segment) specified by the 16-bit argument is marked busy, but a task switch is not done. |
colspan="5" | |
{{nowrap|SGDT m16&32 {{efn|name=gdt_idt_descriptor}}}}
| | Store GDTR to memory. | rowspan="3" {{yes}} | rowspan="5" {{yes2|Usually 3{{efn|text=If This has been a significant security problem for software-based virtualization, since it enables these instructions to be used by a VM guest to detect that it is running inside a VM. Oracle Corp, [https://docs.oracle.com/en/virtualization/virtualbox/6.0/admin/swvirt-details.html Oracle® VM VirtualBox Administrator's Guide for Release 6.0, section 3.5: Details About Software Virtualization]. [https://web.archive.org/web/20231208205121/https://docs.oracle.com/en/virtualization/virtualbox/6.0/admin/swvirt-details.html Archived] on 8 Dec 2023. MBC Project, [https://github.com/MBCProject/mbc-markdown/blob/7223fa76d69015ceb63cb094257e64c3cc6bf3b9/anti-behavioral-analysis/virtual-machine-detection.md Virtual Machine Detection (permanent link)] or [https://github.com/MBCProject/mbc-markdown/blob/main/anti-behavioral-analysis/virtual-machine-detection.md Virtual Machine Detection (non permanent link)]}}}} |
SIDT m16&32 {{efn|name=gdt_idt_descriptor}}
| | Store IDTR to memory. |
SMSW r/m16
| | Store MSW to register or 16-bit memory.{{efn|name=i286_extend16}} |
SLDT r/m16
| | Store LDTR to register or 16-bit memory.{{efn|name=i286_extend16|text=The
| rowspan="2" {{no|#UD}} |
STR r/m16
| | Store TR to register or 16-bit memory.{{efn|name=i286_extend16}} |
colspan="5" | |
{{nowrap|ARPL r/m16,r16 }}
| | Adjust RPL (Requested Privilege Level) field of selector. The operation performed is:
| {{no|#UD{{efn|The | rowspan="5" {{yes|3}} |
LAR r,r/m16
| | Load access rights byte from the specified segment descriptor. | rowspan="4" {{no|#UD}} |
LSL r,r/m16
| | Load segment limit from the specified segment descriptor. Sets ZF=1 if the descriptor could be loaded, ZF=0 otherwise.{{efn|name=lar_lsl_unmod}} |
VERR r/m16
| {{nowrap| | Verify a segment for reading. Sets ZF=1 if segment can be read, ZF=0 otherwise. |
VERW r/m16
| | Verify a segment for writing. Sets ZF=1 if segment can be written, ZF=0 otherwise.{{efn|text=On some Intel CPU/microcode combinations from 2019 onwards, the |
colspan="5" | |
{{unofficial2|align="left"|{{mono| LOADALL}}{{efn|name=i286_undoc|Undocumented, 80286 only. Intel, [https://docs.pcjs.org/manuals/intel/80286/80286_LOADALL.pdf Undocumented iAPX 286 Test Instruction]. [https://web.archive.org/web/20231220173720/https://docs.pcjs.org/manuals/intel/80286/80286_LOADALL.pdf Archived] on 20 Dec 2023. VCF Forums, [https://forum.vcfed.org/index.php?threads/i-found-the-saveall-opcode.71519/ I found the SAVEALL opcode], Jun 21, 2019. [https://web.archive.org/web/20230413203921/https://forum.vcfed.org/index.php?threads/i-found-the-saveall-opcode.71519/ Archived] on 13 Apr 2023. rep lodsb, [https://rep-lodsb.mataroa.blog/blog/intel-286-secrets-ice-mode-and-f1-0f-04/ Intel 286 secrets: ICE mode and F1 0F 04], Aug 12, 2022. [https://web.archive.org/web/20231208175920/https://rep-lodsb.mataroa.blog/blog/intel-286-secrets-ice-mode-and-f1-0f-04/ Archived] on 8 Dec 2023. (A different variant of LOADALL with a different opcode and memory layout exists on 80386.)}}}}
| {{unofficial2|align="left"|{{mono| 0F 05}}}} | Load all CPU registers from a 102-byte data structure starting at physical address | rowspan="2" {{yes}} | rowspan="2" {{no|0}} |
{{unofficial2|align="left"|{{mono| STOREALL}}{{efn|name=i286_undoc}}}}
| {{unofficial2|align="left"|{{mono| F1 0F 04}}}} | Store all CPU registers to a 102-byte data structure starting at physical address |
{{notelist}}
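The two-part memory operand used by LGDT, LIDT, SGDT and SIDT (a 16-bit size-minus-one value followed by a 32-bit linear address) can be sketched in Python with the standard `struct` module. The function names `make_table_descriptor` and `load_table_register` are hypothetical, and the model covers only the non-64-bit layout described in the note above.

```python
import struct


def make_table_descriptor(base, size_in_bytes):
    """Build the 6-byte operand: 16-bit limit (size - 1), then 32-bit base.

    Hypothetical helper; little-endian layout, no padding.
    """
    if not 1 <= size_in_bytes <= 0x10000:
        raise ValueError("table size must be between 1 and 65536 bytes")
    return struct.pack("<HI", size_in_bytes - 1, base & 0xFFFFFFFF)


def load_table_register(descriptor, operand_size=32):
    """Model what LGDT/LIDT would load; returns (base, limit)."""
    limit, base = struct.unpack("<HI", descriptor)
    if operand_size == 16:
        base &= 0x00FFFFFF                # 16-bit operand size keeps 24 address bits
    return base, limit


# A table of three 8-byte descriptors placed at linear address 0x12345678.
descriptor = make_table_descriptor(0x12345678, 3 * 8)
loaded_32 = load_table_register(descriptor, operand_size=32)
loaded_16 = load_table_register(descriptor, operand_size=16)
```

With a 16-bit operand size the top byte of the address is dropped, so the same operand can yield a different base depending on the operand size in effect.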
== Added with [[80386]] ==
The 80386 added support for 32-bit operation to the x86 instruction set. This was done by widening the general-purpose registers to 32 bits and introducing the concepts of OperandSize and AddressSize – most instruction forms that would previously take 16-bit data arguments were given the ability to take 32-bit arguments by setting their OperandSize to 32 bits, and instructions that could take 16-bit address arguments were given the ability to take 32-bit address arguments by setting their AddressSize to 32 bits. (Instruction forms that work on 8-bit data continue to be 8-bit regardless of OperandSize. Using a data size of 16 bits will cause only the bottom 16 bits of the 32-bit general-purpose registers to be modified – the top 16 bits are left unchanged.)
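How a data size narrower than 32 bits leaves the upper register bits untouched can be illustrated with a small Python model. This is a sketch only: the function name `write_register` and the representation of a register as a plain integer are assumptions, and the 8-bit case models a low-byte register such as AL.

```python
def write_register(old_value, new_value, operand_size):
    """Return the 32-bit register contents after a write of `operand_size` bits.

    Hypothetical helper: a register is modelled as a plain Python int.
    The 8-bit case models a low-byte register (AL, BL, CL, DL) only.
    """
    if operand_size not in (8, 16, 32):
        raise ValueError("operand_size must be 8, 16 or 32")
    mask = (1 << operand_size) - 1
    # Bits above the operand size keep their old value.
    return (old_value & 0xFFFFFFFF & ~mask) | (new_value & mask)


eax = 0x12345678
after_16 = write_register(eax, 0xABCD, 16)   # like MOV AX, 0ABCDh
after_32 = write_register(eax, 0xABCD, 32)   # like MOV EAX, 0ABCDh
after_8 = write_register(eax, 0xEF, 8)       # like MOV AL, 0EFh
```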
The default OperandSize and AddressSize to use for each instruction is given by the D bit of the segment descriptor of the current code segment: D=0 makes both 16-bit, D=1 makes both 32-bit. Additionally, they can be overridden on a per-instruction basis with two new instruction prefixes that were introduced in the 80386:
- 66h: OperandSize override. Will change OperandSize from 16-bit to 32-bit if CS.D=0, or from 32-bit to 16-bit if CS.D=1.
- 67h: AddressSize override. Will change AddressSize from 16-bit to 32-bit if CS.D=0, or from 32-bit to 16-bit if CS.D=1.
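The interaction between the CS.D bit and the two override prefixes can be sketched as a small Python function. The name `effective_sizes` and the representation of prefixes as a collection of byte values are assumptions made for illustration. Instruction forms that always operate on 8-bit data are not modelled.

```python
def effective_sizes(cs_d, prefixes=()):
    """Return (operand_size, address_size) in bits for one instruction.

    cs_d     -- the D bit of the current code segment descriptor (0 or 1)
    prefixes -- prefix byte values present on the instruction (hypothetical
                representation; only 0x66 and 0x67 are examined)
    """
    default = 32 if cs_d else 16
    toggled = 16 if cs_d else 32          # each override flips to the other size
    operand_size = toggled if 0x66 in prefixes else default
    address_size = toggled if 0x67 in prefixes else default
    return operand_size, address_size


sizes_16bit_segment = effective_sizes(0)
sizes_16bit_with_66 = effective_sizes(0, [0x66])
sizes_32bit_with_67 = effective_sizes(1, [0x67])
sizes_32bit_with_both = effective_sizes(1, [0x66, 0x67])
```

Each prefix acts as a toggle relative to the segment default rather than selecting a fixed size, which is why the same prefix byte means "32-bit" in a 16-bit code segment and "16-bit" in a 32-bit one.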
The 80386 also introduced the two new segment registers FS and GS as well as the x86 control, debug and test registers.
The new instructions introduced in the 80386 can broadly be subdivided into two classes:
- Pre-existing opcodes that needed new mnemonics for their 32-bit OperandSize variants (e.g. CWDE, LODSD)
- New opcodes that introduced new functionality (e.g. SHLD, SETcc)
For instruction forms where the operand size can be inferred from the instruction's arguments (e.g. ADD EAX,EBX can be inferred to have a 32-bit OperandSize due to its use of EAX as an argument), new instruction mnemonics are not needed and not provided.
{{sticky header}}
class="wikitable sortable sticky-header"
|+ 80386: new instruction mnemonics for 32-bit variants of older opcodes ! Type !! Instruction mnemonic !! Opcode !! Description !! Mnemonic for older 16-bit variant !! Ring | ||||
rowspan="7" | String instructions{{efn|text=For the 32-bit string instructions, the ±± notation is used to indicate that the indicated register is post-decremented by 4 if EFLAGS.DF=1 and post-incremented by 4 otherwise. For the operands where the DS segment is indicated, the DS segment can be overridden by a segment-override prefix – where the ES segment is indicated, the segment is always ES and cannot be overridden. The choice of whether to use the 16-bit SI/DI registers or the 32-bit ESI/EDI registers as the address registers to use is made by AddressSize, overridable with the 67 prefix.}}{{efn|text=The 32-bit string instructions accept repeat-prefixes in the same way as older 8/16-bit string instructions. For LODSD, STOSD, MOVSD, INSD and OUTSD, the REP prefix (F3) will repeat the instruction the number of times specified in rCX (CX or ECX, decided by AddressSize), decrementing rCX for each iteration (with rCX=0 resulting in no-op and proceeding to the next instruction). For CMPSD and SCASD, the REPE (F3) and REPNE (F2) prefixes are available, which will repeat the instruction, decrementing rCX for each iteration, but only as long as the flag condition (ZF=1 for REPE, ZF=0 for REPNE) holds true AND rCX ≠ 0.}}
| | AD | Load string doubleword: EAX := DS:[rSI±±] | LODSW
| rowspan="5" {{yes|3}} | |
STOSD | AB | Store string doubleword: ES:[rDI±±] := EAX | STOSW | |
MOVSD | A5 | Move string doubleword: ES:[rDI±±] := DS:[rSI±±] | MOVSW | |
CMPSD | A7 | Compare string doubleword: temp1 := DS:[rSI±±] | CMPSW | |
SCASD | AF | Scan string doubleword: temp1 := ES:[rDI±±] | SCASW | |
INSD | 6D | Input string from doubleword I/O port: ES:[rDI±±] := port[DX] {{efn|For the INSB/W/D instructions, the memory access rights for the ES:[rDI] memory address might not be checked until after the port access has been performed – if this check fails (e.g. page fault or other memory exception), then the data item read from the port is lost. As such, it is not recommended to use this instruction to access an I/O port that performs any kind of side effect upon read.}} | INSW | rowspan="2" {{no2|Usually 0{{efn|I/O port access is only allowed when CPL≤IOPL or the I/O port permission bitmap bits for the port to access are all set to 0.}}}} |
OUTSD | 6F | Output string to doubleword I/O port: port[DX] := DS:[rSI±±] | OUTSW | |
colspan="6" | | ||||
---|---|---|---|---|
rowspan="8" | Other
| | 98 | Sign-extend 16-bit value in AX to 32-bit value in EAX{{efn|The CWDE instruction differs from the older CWD instruction in that CWD would sign-extend the 16-bit value in AX into a 32-bit value in the DX:AX register pair.}} | CBW
| rowspan="5" {{yes|3}} | |
CDQ | 99 | Sign-extend 32-bit value in EAX to 64-bit value in EDX:EAX.
Mainly used to prepare a dividend for the 32-bit | | ||
{{nowrap|JECXZ rel8 }} | {{nowrap|E3 cb }}{{efn|For the E3 opcode (JCXZ/JECXZ), the choice of whether the instruction will use CX or ECX for its comparison (and consequently which mnemonic to use) is based on the AddressSize, not OperandSize. (OperandSize instead controls whether the jump destination should be truncated to 16 bits or not). This also applies to the loop instructions LOOP, LOOPE, LOOPNE (opcodes E0, E1, E2); however, unlike JCXZ/JECXZ, these instructions have not been given new mnemonics for their ECX-using variants.}} | Jump if ECX is zero | JCXZ | |
PUSHAD | 60 | Push all 32-bit registers onto stack{{efn|For PUSHA(D) , the value of SP/ESP pushed onto the stack is the value it had just before the PUSHA(D) instruction started executing.}} | PUSHA | |
POPAD | 61 | Pop all 32-bit general-purpose registers off stack{{efn|For POPA /POPAD , the stack item corresponding to SP/ESP is popped off the stack (performing a memory read), but not placed into SP/ESP.}} | POPA | |
PUSHFD | 9C | Push 32-bit EFLAGS register onto stack | PUSHF
| rowspan="3" {{yes2|Usually 3{{efn|The | |
POPFD | 9D | Pop 32-bit EFLAGS register off stack | POPF | |
IRETD | CF | 32-bit interrupt return. Differs from the older 16-bit IRET instruction in that it will pop interrupt return items (EIP,CS,EFLAGS; also ESP{{efn|text=If IRETD is used to return from kernel mode to user mode (which will entail a CPL change) and the user-mode stack segment indicated by SS is a 16-bit segment, then the IRETD instruction will only restore the low 16 bits of the stack pointer (ESP/RSP), with the remaining bits keeping whatever value they had in kernel code before the IRETD . This has necessitated complex workarounds on both Linux ("ESPFIX")LKML, [https://lkml.org/lkml/2014/4/29/626 (PATCH) x86-64, espfix: Don't leak bits 31:16 of %esp returning to 16-bit stack], Apr 29, 2014. [https://web.archive.org/web/20180104155340/https://lkml.org/lkml/2014/4/29/626 Archived] on Jan 4, 2018 and Windows.Raymond Chen, [https://devblogs.microsoft.com/oldnewthing/20160404-00/?p=93261 Getting MS-DOS games to run on Windows 95: Working around the iretd problem], Apr 4, 2016. [https://web.archive.org/web/20190315174141/https://devblogs.microsoft.com/oldnewthing/20160404-00/?p=93261 Archived] on Mar 15, 2019 This issue also affects the later 64-bit IRETQ instruction.}} and SS if there is a CPL change; and also ES,DS,FS,GS if returning to virtual 8086 mode) off the stack as 32-bit items instead of 16-bit items. Should be used to return from interrupts when the interrupt handler was entered through a 32-bit IDT interrupt/trap gate.
Instruction is serializing. | |
{{notelist}}
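The difference between the sign-extension instructions listed above (CWDE widens within EAX, whereas CWD and CDQ spread the result over a register pair) can be illustrated with a short Python sketch. The helper names are hypothetical and registers are modelled as plain integers.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low `from_bits` of `value` to `to_bits` bits."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):    # sign bit set: fill the upper bits with 1s
        value |= ((1 << to_bits) - 1) & ~((1 << from_bits) - 1)
    return value


def cwde(eax):
    """CWDE: EAX := AX sign-extended to 32 bits (old upper half is ignored)."""
    return sign_extend(eax & 0xFFFF, 16, 32)


def cwd(ax):
    """CWD: DX:AX := AX sign-extended to 32 bits; returns (DX, AX)."""
    wide = sign_extend(ax, 16, 32)
    return wide >> 16, wide & 0xFFFF


def cdq(eax):
    """CDQ: EDX:EAX := EAX sign-extended to 64 bits; returns (EDX, EAX)."""
    wide = sign_extend(eax, 32, 64)
    return wide >> 32, wide & 0xFFFFFFFF


cwde_result = cwde(0x12348000)            # AX = 8000h is negative
cwd_result = cwd(0x8000)
cdq_negative = cdq(0x80000000)
cdq_positive = cdq(0x7FFFFFFF)
```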
{{sticky header}}
class="wikitable sortable sticky-header"
|+ 80386: new opcodes introduced ! Instruction mnemonics !! Opcode !! Description !! Ring | ||
BT r/m, r | 0F A3 /r | rowspan="2" | Bit Test.{{efn|name=bt_offsetting|text=For the BT , BTS , BTR and BTC instructions:
Second operand specifies which bit of the first operand to test. The bit to test is copied to EFLAGS.CF. | rowspan="8" {{yes|3}} |
BT r/m, imm8 | 0F BA /4 ib | |
BTS r/m, r | 0F AB /r | rowspan="2" | Bit Test-and-set.{{efn|name=bt_offsetting}}{{efn|name=bt_atomic|text=The BTS , BTC and BTR instructions accept the LOCK (F0 ) prefix when used with a memory argument – this results in the instruction executing atomically.}}
Second operand specifies which bit of the first operand to test and set. |
BTS r/m, imm8 | 0F BA /5 ib | |
BTR r/m, r | 0F B3 /r | rowspan="2" | Bit Test and Reset.{{efn|name=bt_offsetting}}{{efn|name=bt_atomic}}
Second operand specifies which bit of the first operand to test and clear. |
BTR r/m, imm8 | 0F BA /6 ib | |
BTC r/m, r | 0F BB /r | rowspan="2" | Bit Test and Complement.{{efn|name=bt_offsetting}}{{efn|name=bt_atomic}}
Second operand specifies which bit of the first operand to test and toggle. |
BTC r/m, imm8 | 0F BA /7 ib | |
colspan="4" | | ||
---|---|---|
BSF r, r/m | {{nowrap|NFx 0F BC /r {{efn|If the F3 prefix is used with the {{nowrap|0F BC /r }} opcode, then the instruction will execute as TZCNT on systems that support the BMI1 extension. TZCNT differs from BSF in that TZCNT but not BSF is defined to return operand size if the source operand is zero – for other source operand values, they produce the same result (except for flags).}}}} | Bit scan forward. Returns bit index of lowest set bit in input.{{efn|name=bsf_bsr_zero|text=BSF and BSR set the EFLAGS.ZF flag to 1 if the source argument was all-0s and 0 otherwise. If the source argument was all-0s, then the destination register is documented as being left unchanged on AMD processors, but set to an undefined value on Intel processors.}} | rowspan="6" {{yes|3}} |
BSR r, r/m | {{nowrap|NFx 0F BD /r {{efn|If the F3 prefix is used with the {{nowrap|0F BD /r }} opcode, then the instruction will execute as LZCNT on systems that support the ABM or LZCNT extensions. LZCNT produces a different result from BSR for most input values.}}}} | Bit scan reverse. Returns bit index of highest set bit in input.{{efn|name=bsf_bsr_zero}} |
SHLD r/m, r, imm8 | 0F A4 /r ib | rowspan="2" | Shift Left Double. The operation of SHLD arg1,arg2,shamt is: arg1 := (arg1<<shamt) | (arg2>>(operand_size - shamt)) {{efn|name=shld_shamt|text=For SHLD and SHRD, the shift-amount is masked – the bottom 5 bits are used for 16/32-bit operand size and 6 bits for 64-bit operand size. SHLD and SHRD with 16-bit arguments and a shift-amount greater than 16 produce undefined results. (Actual results differ between different Intel CPUs, with at least three different behaviors known. sandpile.org, [https://www.sandpile.org/x86/flags.htm x86 architecture rFLAGS register], see note #7. [https://web.archive.org/web/20111103093624/https://www.sandpile.org/x86/flags.htm Archived] on 3 Nov 2011.)}} |
SHLD r/m, r, CL | 0F A5 /r | |
{{nowrap|SHRD r/m, r, imm8 }} | {{nowrap|0F AC /r ib }} | rowspan="2" | Shift Right Double. The operation of SHRD arg1,arg2,shamt is: arg1 := (arg1>>shamt) | (arg2<<(operand_size - shamt)) {{efn|name=shld_shamt}} |
SHRD r/m, r, CL | 0F AD /r | |
colspan="4" | | ||
MOVZX reg, r/m8 | 0F B6 /r | rowspan="2" | Move from 8/16-bit source to 16/32-bit register with zero-extension.
| rowspan="7" {{yes|3}} |
MOVZX reg, r/m16 | 0F B7 /r | |
MOVSX reg, r/m8 | 0F BE /r | rowspan="2" | Move from 8/16-bit source to 16/32/64-bit register with sign-extension. |
MOVSX reg, r/m16 | 0F BF /r | |
SETcc r/m8
| {{nowrap| {{(!}} class="wikitable sortable" ! x !! cc !! Condition (EFLAGS) {{!}}- {{!}} 0 {{!!}} O {{!!}} OF=1: "Overflow" {{!}}- {{!}} 1 {{!!}} NO {{!!}} OF=0: {{nowrap|"Not Overflow"}} {{!}}- {{!}} 2 {{!!}} C,B,NAE {{!!}} CF=1: "Carry", "Below", {{nowrap|"Not Above or Equal"}} {{!}}- {{!}} 3 {{!!}} NC,NB,AE {{!!}} CF=0: {{nowrap|"Not Carry"}}, {{nowrap|"Not Below"}}, {{nowrap|"Above or Equal"}} {{!}}- {{!}} 4 {{!!}} Z,E {{!!}} ZF=1: "Zero", "Equal" {{!}}- {{!}} 5 {{!!}} NZ,NE {{!!}} ZF=0: {{nowrap|"Not Zero"}}, {{nowrap|"Not Equal"}} {{!}}- {{!}} 6 {{!!}} NA,BE {{!!}} (CF=1 or ZF=1): {{nowrap|"Not Above"}}, {{nowrap|"Below or Equal"}} {{!}}- {{!}} 7 {{!!}} A,NBE {{!!}} (CF=0 and ZF=0): "Above", {{nowrap|"Not Below or Equal"}} {{!}}- {{!}} 8 {{!!}} S {{!!}} SF=1: "Sign" {{!}}- {{!}} 9 {{!!}} NS {{!!}} SF=0: {{nowrap|"Not Sign"}} {{!}}- {{!}} A {{!!}} P,PE {{!!}} PF=1: "Parity", {{nowrap|"Parity Even"}} {{!}}- {{!}} B {{!!}} NP,PO {{!!}} PF=0: {{nowrap|"Not Parity"}}, {{nowrap|"Parity Odd"}} {{!}}- {{!}} C {{!!}} L,NGE {{!!}} SF≠OF: "Less", {{nowrap|"Not Greater Or Equal"}} {{!}}- {{!}} D {{!!}} NL,GE {{!!}} SF=OF: {{nowrap|"Not Less"}}, {{nowrap|"Greater Or Equal"}} {{!}}- {{!}} E {{!!}} LE,NG {{!!}} (ZF=1 or SF≠OF): {{nowrap|"Less or Equal"}}, {{nowrap|"Not Greater"}} {{!}}- {{!}} F {{!!}} NLE,G {{!!}} (ZF=0 and SF=OF): {{nowrap|"Not Less or Equal"}}, {{nowrap|"Greater"}} {{!)}} }}{{efn|text=For | Set byte to 1 if condition is satisfied, 0 otherwise. | ||
Jcc rel16 Jcc rel32
| | Conditional jump near. Differs from the older variants of conditional jumps in that it accepts a 16/32-bit offset rather than just an 8-bit offset. | ||
IMUL r, r/m | 0F AF /r | Two-operand non-widening integer multiply. |
colspan="4" | | ||
FS: | 64 | rowspan="2" | Segment-override prefixes for FS and GS segment registers.
| rowspan="9" {{yes|3}} |
GS: | 65 | |
PUSH FS | 0F A0 | rowspan="4" | Push/pop FS and GS segment registers. |
POP FS | 0F A1 | |
PUSH GS | 0F A8 | |
POP GS | 0F A9 | |
LFS r16, m16&16 LFS r32, m32&16 | 0F B4 /r | rowspan="3" | Load far pointer from memory.
Offset part is stored in destination register argument, segment part in FS/GS/SS segment register as indicated by the instruction mnemonic.{{efn|For |
LGS r16, m16&16 {{nowrap| LGS r32, m32&16 }} | 0F B5 /r | |
LSS r16, m16&16 {{nowrap| LSS r32, m32&16 }} | 0F B2 /r | |
colspan="4" | | ||
MOV reg,CRx | 0F 20 /r {{efn|name=movcr_modrm|text=For MOV to/from the CRx, DRx and TRx registers, the reg part of the ModR/M byte indicates the CRx/DRx/TRx register and the r/m part indicates the general register.
Uniquely for the {{nowrap| | Move from control register to general register.{{efn|name=movcr_opsiz|For moves to/from the CRx and DRx registers, the operand size is always 64 bits in 64-bit mode and 32 bits otherwise.}}
| rowspan="6" {{no|0}} |
MOV CRx,reg | 0F 22 /r {{efn|name=movcr_modrm}} | Move from general register to control register.{{efn|name=movcr_opsiz}}
Moves to the On Pentium and later processors, moves to the |
MOV reg,DRx | 0F 21 /r {{efn|name=movcr_modrm}} | Move from x86 debug register to general register.{{efn|name=movcr_opsiz}} |
MOV DRx,reg | 0F 23 /r {{efn|name=movcr_modrm}} | Move from general register to x86 debug register.{{efn|name=movcr_opsiz}}
On Pentium and later processors, moves to the DR0-DR7 debug registers are serializing. |
MOV reg,TRx | 0F 24 /r {{efn|name=movcr_modrm}} | Move from x86 test register to general register.{{efn|name=movtr_pent|The MOV TRx instructions were discontinued from Pentium onwards.}} |
MOV TRx,reg | 0F 26 /r {{efn|name=movcr_modrm}} | Move from general register to x86 test register.{{efn|name=movtr_pent}} |
colspan="4" | | ||
{{unofficial2|align="left"|{{mono| ICEBP, INT01, INT1{{efn|The INT1/ICEBP (F1) instruction is present on all known Intel x86 processors from the 80386 onwards, but only fully documented for Intel processors from the May 2018 release of the Intel SDM (rev 067) onwards. Michal Necasek, [https://www.os2museum.com/wp/icebp-finally-documented/ ICEBP finally documented], OS/2 Museum, May 25, 2018. [https://web.archive.org/web/20180606211954/https://www.os2museum.com/wp/icebp-finally-documented/ Archived] on 6 June 2018. Before this release, mention of the instruction in Intel material was sporadic, e.g. AP-526 rev 001. Intel, [https://web.archive.org/web/19961222093646/http://www.intel.com/design/pro/applnots/24281601.pdf AP-526: Optimization For Intel's 32-bit Processors], order no. 242816-001, October 1995 – lists SALC on page 83, INT1 on page 86 and FFREEP on page 114. Archived from the [http://www.intel.com/design/pro/applnots/24281601.pdf original] on 22 Dec 1996. For AMD processors, the instruction has been documented since 2002. AMD, [https://kib.kiev.ua/x86docs/AMD/AMD64/24593_APM_v2-r3.06.pdf AMD 64-bit Technology, vol 2: System Programming], order no. 24593, rev 3.06, Aug 2002, page 248}}}}}} | {{unofficial2|align="left"|{{mono| F1}}}} | In-circuit emulation breakpoint. Performs software interrupt #1 if executed when not using in-circuit emulation.{{efn|text=The operation of the
CD 01 }} will check CPL against the interrupt descriptor's DPL field as an access-rights check, while F1 will not.| rowspan="7" {{yes|3}} | ||
{{unofficial2|align="left"|{{mono| UMOV r/m, r8}}}}
| {{unofficial2|align="left"|{{mono| 0F 10 /r}}}} | rowspan="4" | User Move – perform data moves that can access user memory while in In-circuit emulation HALT mode. Performs same operation as | ||
{{unofficial2|align="left"|{{mono| UMOV r/m, r16/32}}}}
| {{unofficial2|align="left"|{{mono| 0F 11 /r}}}} | ||
{{unofficial2|align="left"|{{mono| UMOV r8, r/m}}}}
| {{unofficial2|align="left"|{{mono| 0F 12 /r}}}} | ||
{{unofficial2|align="left"|{{mono| UMOV r16/32, r/m}}}}
| {{unofficial2|align="left"|{{mono| 0F 13 /r}}}} | ||
{{unofficial2|align="left"|{{mono| XBTS reg,r/m}}}}
| {{unofficial2|align="left"|{{mono| 0F A6 /r}}}} | Bitfield extract (early 386 only).{{efn|name=xbts_discon|The They have been used by software mainly for detection of the buggy{{Cite web|url=https://www.pcjs.org/documents/manuals/intel/80386/#b0-stepping|title=Intel 80386 CPU Information | PCjs Machines|website=www.pcjs.org}} B0 stepping of the 80386. Microsoft Windows (v2.01 and later) will attempt to run the | ||
{{unofficial2|align="left"|{{mono| IBTS r/m,reg}}}}
| {{unofficial2|align="left"|{{mono| 0F A7 /r}}}} | Bitfield insert (early 386 only).{{efn|name=xbts_discon}}{{efn|name=xbts_op}} | ||
{{unofficial2|align="left"|{{mono| LOADALLD, LOADALL386}}{{efn|name=i386_loadall|Undocumented, 80386 only.Robert Collins, [https://web.archive.org/web/19970605213204/http://www.x86.org/articles/loadall/tspec_a3_doc.html The LOADALL Instruction]. Archived from the [http://www.x86.org/articles/loadall/tspec_a3_doc.html original] on Jun 5, 1997.}}}} | {{unofficial2|align="left"|{{mono| 0F 07}}}} | Load all CPU registers from a 296-byte data structure starting at ES:EDI, including "hidden" part of segment descriptor registers. | {{no|0}} |
{{notelist}}
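The condition-code table used by SETcc (and by the near Jcc forms) can be read as a small decision function over the flags. The following is a minimal Python sketch, not a description of any hardware implementation; the helper names <code>cond</code> and <code>setcc</code> are hypothetical and were chosen for illustration only. It relies on the pattern visible in the table that each odd-numbered condition is the negation of the preceding even-numbered one.

```python
def cond(x, cf=False, zf=False, sf=False, of=False, pf=False):
    """Evaluate condition number x (0x0..0xF) against the given flag values."""
    pair = x >> 1  # conditions come in pairs: a predicate and its negation
    if pair == 0:
        result = of                     # O / NO
    elif pair == 1:
        result = cf                     # C,B,NAE / NC,NB,AE
    elif pair == 2:
        result = zf                     # Z,E / NZ,NE
    elif pair == 3:
        result = cf or zf               # NA,BE / A,NBE
    elif pair == 4:
        result = sf                     # S / NS
    elif pair == 5:
        result = pf                     # P,PE / NP,PO
    elif pair == 6:
        result = sf != of               # L,NGE / NL,GE
    else:
        result = zf or (sf != of)       # LE,NG / NLE,G
    # An odd condition number inverts the predicate of its pair.
    return (not result) if (x & 1) else result


def setcc(x, **flags):
    """Model SETcc: produce the byte 1 if the condition holds, 0 otherwise."""
    return 1 if cond(x, **flags) else 0
```

For example, <code>setcc(0x4, zf=True)</code> models SETZ with ZF=1, and <code>setcc(0xF, sf=True, of=True)</code> models SETG with ZF=0 and SF=OF.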
== Added with [[80486]] ==
class="wikitable sortable"
! Instruction !! Opcode !! Description !! Ring |
BSWAP r32
| {{nowrap| | Byte Order Swap. Usually used to convert between big-endian and little-endian data representations. For 32-bit registers, the operation performed is:
Using | rowspan="5" {{yes|3}} |
CMPXCHG r/m8,r8
| {{nowrap| | rowspan="2" | Compare and Exchange. If accumulator (AL/AX/EAX/RAX) compares equal to first operand,{{efn|The Instruction atomic only if used with |
{{nowrap|CMPXCHG r/m,r16 }}{{nowrap| CMPXCHG r/m,r32 }}
| {{nowrap| |
XADD r/m,r8
| {{nowrap| | rowspan="2" | eXchange and ADD. Exchanges the first operand with the second operand, then stores the sum of the two values into the destination operand. Instruction atomic only if used with |
XADD r/m,r16 XADD r/m,r32
| |
INVLPG m8
| {{nowrap| | Invalidate the TLB entries that would be used for the 1-byte memory operand.{{efn| Instruction is serializing. | rowspan="3" {{no|0}} |
INVD
| | Invalidate Internal Caches.{{efn|name=invd_scope|text=The |
WBINVD
| {{nowrap| | Write Back and Invalidate Cache.{{efn|name=invd_scope}} Writes back all modified cache lines in the processor's internal cache to main memory and invalidates the internal caches. |
{{notelist}}{{vpad}}
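The data-movement effect of BSWAP, CMPXCHG and XADD can be illustrated with a few lines of Python. This is only a sketch of the register-level results under the usual vendor-documented semantics; it does not model atomicity, bus locking or flags other than ZF, and the function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bswap32(value):
    """Reverse the byte order of a 32-bit value, as BSWAP r32 does."""
    value &= MASK32
    return int.from_bytes(value.to_bytes(4, "little"), "big")


def cmpxchg(acc, dest, src):
    """Model CMPXCHG dest, src. Returns (new_acc, new_dest, zf).

    If the accumulator equals dest, src is stored into dest and ZF is set;
    otherwise dest is loaded into the accumulator and ZF is cleared.
    """
    if acc == dest:
        return acc, src, True
    return dest, dest, False


def xadd(dest, src, bits=32):
    """Model XADD dest, src. Returns (new_dest, new_src).

    dest receives the sum (truncated to the operand size) and src receives
    the old value of dest.
    """
    mask = (1 << bits) - 1
    return (dest + src) & mask, dest & mask
```

For instance, <code>bswap32(0x12345678)</code> yields <code>0x78563412</code>, which is the big-endian/little-endian conversion mentioned in the table.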
== Added in [[Pentium (original)|P5]]/[[P6 (microarchitecture)|P6]]-class processors ==
This section lists integer and system instructions that were not present in the basic 80486 instruction set but were added in various x86 processors prior to the introduction of SSE. Discontinued instructions are not included.
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction !! Opcode !! Description !! Ring !! Added in |
colspan="5" | |
---|
RDMSR
| | Read Model-specific register. The MSR to read is specified in ECX. The value of the MSR is then returned as a 64-bit value in EDX:EAX.{{efn|name="p5rd_clear_hi32"|1=In 64-bit mode, the | rowspan="2" {{no|0}} | rowspan="2" | IBM 386SLC,Frank van Gilluwe, "The Undocumented PC, second edition", 1997, {{ISBN|0-201-47950-8}}, page 55 |
WRMSR
| | Write Model-specific register. The MSR to write is specified in ECX, and the data to write is given in EDX:EAX.{{efn|On Intel and AMD CPUs, the Instruction is, with some exceptions, serializing.{{efn|text=Writes to the following MSRs are not serializing:Intel, [http://kib.kiev.ua/x86docs/Intel/SDMs/253668-078.pdf Software Developer’s Manual, vol 3A], order no. 253668-078, Dec 2022, section 9.3, page 299.Intel, [https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/cpuid-enumeration-and-architectural-msrs.html CPUID Enumeration and Architectural MSRs], 8 Aug 2023. [https://web.archive.org/web/20240523214955/https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/cpuid-enumeration-and-architectural-msrs.html Archived] on 23 May 2024. {{(!}} class="wikitable sortable" ! Number !! Name {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!}}- {{!}} {{!)}}
}} |
RSM {{cite web|url=http://www.softeng.rl.ac.uk/st/archive/SoftEng/SESP/html/SoftwareTools/vtune/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/instruct32_hh/vc279.htm|title=RSM—Resume from System Management Mode|url-status=dead |archive-url=https://web.archive.org/web/20120312224625/http://www.softeng.rl.ac.uk/st/archive/SoftEng/SESP/html/SoftwareTools/vtune/users_guide/mergedProjects/analyzer_ec/mergedProjects/reference_olh/mergedProjects/instructions/instruct32_hh/vc279.htm|archive-date=2012-03-12}}
| | Resume from System Management Mode. Instruction is serializing. | {{n/a |
2 (SMM)}} | {{nowrap|Intel 386SL,Microprocessor Report, [http://www.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/060805.PDF System Management Mode Explained] (vol 6, no. 8, june 17, 1992). [https://web.archive.org/web/20220629220530/https://www.cecs.uci.edu/~papers/mpr/MPR/ARTICLES/060805.PDF Archived] on Jun 29, 2022.Ellis, Simson C., "The 386 SL Microprocessor in Notebook PCs", Intel Corporation, Microcomputer Solutions, March/April 1991, page 20 486SL,{{efn|System Management Mode and the |
CPUID
| | CPU Identification and feature information. Takes as input a CPUID leaf index in EAX and, depending on leaf, a sub-index in ECX. Result is returned in EAX, EBX, ECX, EDX.{{efn|On some older 32-bit processors, executing Instruction is serializing, and causes a mandatory #VMEXIT under virtualization. Support for | {{yes2|Usually 3{{efn|On some Intel processors starting from Ivy Bridge, there exist MSRs that can be used to restrict | Intel Pentium,{{efn|name="cpuid_backported"| |
{{nowrap|CMPXCHG8B m64 }}
| {{nowrap| | Compare and Exchange 8 bytes. Compares EDX:EAX with m64. If equal, set ZF{{efn|Unlike the older Instruction atomic only if used with | {{yes|3}} | Intel Pentium, |
RDTSC
| | Read 64-bit Time Stamp Counter (TSC) into EDX:EAX.{{efn|name="rdtsc_pmc_unordered"|text=The In early processors, the TSC was a cycle counter, incrementing by 1 for each clock cycle (which could cause its rate to vary on processors that could change clock speed at runtime) – in later processors, it increments at a fixed rate that doesn't necessarily match the CPU clock speed.{{efn|text=Fixed-rate TSC was introduced in two stages:{{glossary}}{{term|Constant TSC}}{{defn|TSC running at a fixed rate as long as the processor core is not in a deep-sleep (C2 or deeper) mode, but not synchronized between CPU cores. Introduced in Intel Prescott, Yonah and Bonnell. Also present in all Transmeta and VIA Nano CPUs (Linux kernel 5.4.12, [https://elixir.bootlin.com/linux/v5.4.12/source/arch/x86/kernel/cpu/centaur.c#L110 /arch/x86/kernel/cpu/centaur.c]). Does not have a CPUID bit.}}{{term|Invariant TSC}}{{defn|TSC running at a fixed rate, and remaining synchronized between CPU cores in all P-, C- and T-states (but not necessarily S-states). | {{yes2|Usually 3{{efn|text= | Intel Pentium, |
colspan="5" | |
RDPMC
| | Read Performance Monitoring Counter. The counter to read is specified by ECX and its value is returned in EDX:EAX.{{efn|name="rdtsc_pmc_unordered"}}{{efn|name="p5rd_clear_hi32"}} | {{yes2|Usually 3{{efn|text= | {{nowrap|Intel Pentium MMX,}} |
{{nowrap|CMOVcc reg,r/m }}
| {{nowrap| {{(!}} class="wikitable sortable" ! x !! cc !! Condition (EFLAGS) {{!}}- {{!}} 0 {{!!}} O {{!!}} OF=1: "Overflow" {{!}}- {{!}} 1 {{!!}} NO {{!!}} OF=0: {{nowrap|"Not Overflow"}} {{!}}- {{!}} 2 {{!!}} C,B,NAE {{!!}} CF=1: "Carry", "Below", {{nowrap|"Not Above or Equal"}} {{!}}- {{!}} 3 {{!!}} NC,NB,AE {{!!}} CF=0: {{nowrap|"Not Carry"}}, {{nowrap|"Not Below"}}, {{nowrap|"Above or Equal"}} {{!}}- {{!}} 4 {{!!}} Z,E {{!!}} ZF=1: "Zero", "Equal" {{!}}- {{!}} 5 {{!!}} NZ,NE {{!!}} ZF=0: {{nowrap|"Not Zero"}}, {{nowrap|"Not Equal"}} {{!}}- {{!}} 6 {{!!}} NA,BE {{!!}} (CF=1 or ZF=1): {{nowrap|"Not Above"}}, {{nowrap|"Below or Equal"}} {{!}}- {{!}} 7 {{!!}} A,NBE {{!!}} (CF=0 and ZF=0): "Above", {{nowrap|"Not Below or Equal"}} {{!}}- {{!}} 8 {{!!}} S {{!!}} SF=1: "Sign" {{!}}- {{!}} 9 {{!!}} NS {{!!}} SF=0: {{nowrap|"Not Sign"}} {{!}}- {{!}} A {{!!}} P,PE {{!!}} PF=1: "Parity", {{nowrap|"Parity Even"}} {{!}}- {{!}} B {{!!}} NP,PO {{!!}} PF=0: {{nowrap|"Not Parity"}}, {{nowrap|"Parity Odd"}} {{!}}- {{!}} C {{!!}} L,NGE {{!!}} SF≠OF: "Less", {{nowrap|"Not Greater Or Equal"}} {{!}}- {{!}} D {{!!}} NL,GE {{!!}} SF=OF: {{nowrap|"Not Less"}}, {{nowrap|"Greater Or Equal"}} {{!}}- {{!}} E {{!!}} LE,NG {{!!}} (ZF=1 or SF≠OF): {{nowrap|"Less or Equal"}}, {{nowrap|"Not Greater"}} {{!}}- {{!}} F {{!!}} NLE,G {{!!}} (ZF=0 and SF=OF): {{nowrap|"Not Less or Equal"}}, {{nowrap|"Greater"}} {{!)}} }} | Conditional move to register. The source operand may be either register or memory.{{efn|In 64-bit mode, | {{yes|3}} | Intel Pentium Pro, |
colspan="5" | |
NOP r/m ,NOPL r/m
| {{nowrap| {{(!}} class="wikitable sortable" ! Length !! Byte Sequence {{!}}- {{!}} 2 {{!!}} {{!}}- {{!}} 3 {{!!}} {{!}}- {{!}} 4 {{!!}} {{!}}- {{!}} 5 {{!!}} {{!}}- {{!}} 6 {{!!}} {{!}}- {{!}} 7 {{!!}} {{!}}- {{!}} 8 {{!!}} {{!}}- {{!}} 9 {{!!}} {{!)}} For cases where there is a need to use more than 9 bytes of NOP padding, it is recommended to use multiple NOPs. }} | Official long NOP. Other than AMD K7/K8, broadly unsupported in non-Intel processors released before 2005.{{efn|Unlike other instructions added in Pentium Pro, long NOP does not have a CPUID feature bit.}}JookWiki, [https://www.jookia.org/wiki/Nopl "nopl"], sep 24, 2022 – provides a lengthy account of the history of the long NOP and the issues around it. [https://web.archive.org/web/20221028233225/https://www.jookia.org/wiki/Nopl Archived] on oct 28, 2022. | {{yes|3}} | Intel Pentium Pro,{{efn|text= The whole {{nowrap| |
UD2 ,{{efn|While the {{nowrap|0F 0B }} opcode was officially reserved as an invalid opcode from Pentium onwards, it only got assigned the mnemonic UD2 from Pentium Pro onwards. Intel, [https://www.ardent-tool.com/CPU/docs/Intel/IA/243191-001.pdf Intel Architecture Software Developer’s Manual, Volume 2], 1997, order no. 243191-001, pages 3-9 and A-7.}}UD2A {{efn|name=ud2_binutils|text=GNU Binutils have used the UD2A and UD2B mnemonics for the {{nowrap|0F 0B }} and {{nowrap|0F B9 }} opcodes since version 2.7. John Hassey, [https://sourceware.org/pipermail/gas2/1995/000421.html Pentium Pro changes], GAS2 mailing list, 28 Dec 1995 – patch that added the UD2A and UD2B instruction mnemonics to GNU Binutils. [https://web.archive.org/web/20230725214633/https://sourceware.org/pipermail/gas2/1995/000421.html Archived] on 25 Jul 2023. Neither UD2A nor UD2B originally took any arguments - UD2B was later modified to accept a ModR/M byte, in Binutils version 2.30. Jan Beulich, [https://sourceware.org/pipermail/binutils-cvs/2017-November/046908.html x86: correct UDn], binutils-gdb mailing list, 23 Nov 2017 – Binutils patch that added ModR/M byte to UD1/UD2B and added UD0. [https://web.archive.org/web/20230725214642/https://sourceware.org/pipermail/binutils-cvs/2017-November/046908.html Archived] on 25 Jul 2023.}}
| | rowspan="3" | Undefined Instructions – will generate an invalid opcode (#UD) exception in all operating modes.{{efn|The These instructions are provided for software testing to explicitly generate invalid opcodes. The opcodes for these instructions are reserved for this purpose. | rowspan="3" {{yes|(3)}} | rowspan="2" | (80186),{{efn|name=ud_186|text=The UD0/1/2 opcodes - {{nowrap| |
UD1 reg,r/m ,{{efn|While the {{nowrap|0F B9 }} opcode was officially reserved as an invalid opcode from Pentium onwards, it only got assigned its mnemonic UD1 much later – AMD APM started listing UD1 in its opcode maps from rev 3.17 onwards,AMD, [https://kib.kiev.ua/x86docs/AMD/AMD64/24594_APM_v3-r3.17.pdf AMD64 Architecture Programmer’s Manual Volume 3], publication no. 24594, rev 3.17, dec 2011 – see page 416 for UD0 and page 415 and 419 for UD1 . while Intel SDM started listing it from rev 061 onwards.Intel, [https://kib.kiev.ua/x86docs/Intel/SDMs/253667-061.pdf Software Developer's Manual, vol 2B], order no. 253667-061, dec 2016 – lists UD1 (with {{nowrap|ModR/M}} byte) and UD0 (without ModR/M byte) on page 4-687.}}{{nowrap| UD2B reg,r/m {{efn|name=ud2_binutils}}}}
| |
OIO ,UD0 ,UD0 reg,r/m {{efn|For the {{nowrap|0F FF }} opcode, the OIO mnemonic was introduced by Cyrix, while the UD0 mnemonic (without arguments) was introduced by AMD and Intel at the same time as the UD1 mnemonic for {{nowrap|0F B9 }}. Later Intel (but not AMD) documentation modified its description of UD0 to add a ModR/M byte and take two arguments. Intel, [https://kib.kiev.ua/x86docs/Intel/SDMs/253667-064.pdf Software Developer's Manual, vol 2B], order no. 253667-064, Oct 2017 – lists UD0 (with ModR/M byte) on page 4-683.}}
| | (80186),{{efn|name=ud_186}} |
colspan="5" | |
SYSCALL
| | Fast System call. | {{yes|3}} | rowspan="2" | AMD K6,{{efn|On K6, the |
SYSRET
| | Fast Return from System Call. Designed to be used together with | {{no|0{{efn|name="syscall_realmode"|text=The |
SYSENTER
| | Fast System call. | {{yes|3{{efn|name="syscall_realmode"}}}} | rowspan="2" | Intel Pentium II,{{efn|text=The |
SYSEXIT
| | Fast Return from System Call. Designed to be used together with | {{no|0{{efn|name="syscall_realmode"}}}} |
{{notelist}}
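Several instructions in this table (RDMSR, WRMSR, RDTSC, RDPMC, CMPXCHG8B) pass a 64-bit quantity as the register pair EDX:EAX, with EDX holding the upper half. The following Python sketch shows that pairing and applies it to CMPXCHG8B. The "not equal" branch and the use of ECX:EBX as the replacement value follow the commonly documented behaviour of the instruction and are an assumption relative to the table entry above; the function names are hypothetical, and atomicity is not modelled.

```python
MASK32 = 0xFFFFFFFF


def join_pair(high, low):
    """Combine two 32-bit registers (e.g. EDX and EAX) into one 64-bit value."""
    return ((high & MASK32) << 32) | (low & MASK32)


def split_pair(value):
    """Split a 64-bit value into its (high, low) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32


def cmpxchg8b(m64, edx, eax, ecx, ebx):
    """Model CMPXCHG8B m64. Returns (new_m64, new_edx, new_eax, zf)."""
    if join_pair(edx, eax) == m64:
        # Equal: store ECX:EBX to memory and set ZF.
        return join_pair(ecx, ebx), edx, eax, True
    # Not equal: load the memory value into EDX:EAX and clear ZF.
    high, low = split_pair(m64)
    return m64, high, low, False
```

The same <code>join_pair</code> helper models how software would reassemble a Time Stamp Counter or MSR value from the two registers.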
{{vpad}}
= Added as instruction set extensions =
== Added with [[x86-64]] ==
These instructions can only be encoded in 64-bit mode. They fall into four groups:
- original instructions that reuse existing opcodes for a different purpose (MOVSXD replacing ARPL)
- original instructions with new opcodes (SWAPGS)
- existing instructions extended to a 64-bit address size (JRCXZ)
- existing instructions extended to a 64-bit operand size (remaining instructions)
Most instructions with a 64-bit operand size encode this using a REX.W prefix; in the absence of the REX.W prefix, the corresponding instruction with 32-bit operand size is encoded. This mechanism also applies to most other instructions with 32-bit operand size. These are not listed here as they do not gain a new mnemonic in Intel syntax when used with a 64-bit operand size.
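The REX.W mechanism described above can be sketched in Python as a one-byte prefix placed before the opcode. The layout used here (a fixed upper nibble of 0100 followed by the W, R, X and B bits) comes from the general x86-64 encoding rules rather than from this section, and the helper names are hypothetical.

```python
def rex(w=0, r=0, x=0, b=0):
    """Build a REX prefix byte with the bit layout 0100WRXB."""
    return 0x40 | (w << 3) | (r << 2) | (x << 1) | b


def with_rex_w(opcode_bytes):
    """Prepend a REX.W prefix to select the 64-bit operand size."""
    return bytes([rex(w=1)]) + bytes(opcode_bytes)


# Opcode 98 encodes the 32-bit operand size form (CWDE); with REX.W in front,
# the same opcode encodes CDQE.
cdqe_encoding = with_rex_w([0x98])
```

Here <code>cdqe_encoding</code> is the two-byte sequence 48 98, which illustrates why the 64-bit forms need no opcode of their own.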
class="wikitable sortable"
! Instruction !! Encoding !! Meaning !! Ring |
CDQE
| | Sign extend EAX into RAX | rowspan="13" {{yes|3}} |
CQO
| | Sign extend RAX into RDX:RAX |
CMPSQ
| | CoMPare String Quadword |
{{nowrap|CMPXCHG16B m128 }}{{efn|The memory operand to CMPXCHG16B must be 16-byte aligned.}}{{efn|text=The CMPXCHG16B instruction was absent from a few of the earliest Intel/AMD x86-64 processors. On Intel processors, the instruction was missing from Xeon "Nocona" stepping D (CPU-World, [https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=75151 CPUID for Intel Xeon 3.40 GHz] – Nocona stepping D CPUID without CMPXCHG16B) but added in stepping E (CPU-World, [https://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=75154 CPUID for Intel Xeon 3.60 GHz] – Nocona stepping E CPUID with CMPXCHG16B). On AMD K8 family processors, it was added in stepping F, at the same time as DDR2 support was introduced (SuperUser StackExchange, [https://superuser.com/questions/187254/how-prevalent-are-old-x64-processors-lacking-the-cmpxchg16b-instruction How prevalent are old x64 processors lacking the cmpxchg16b instruction?]). For this reason, CMPXCHG16B has its own CPUID flag, separate from the rest of x86-64.}}
| {{nowrap| | CoMPare and eXCHanGe 16 Bytes. |
IRETQ
| | 64-bit Return from Interrupt |
JRCXZ rel8
| | Jump if RCX is zero |
LODSQ
| | LoaD String Quadword |
{{nowrap|MOVSXD r64,r/m32 }}
| | MOV with Sign Extend 32-bit to 64-bit |
MOVSQ
| | Move String Quadword |
POPFQ
| | POP RFLAGS Register |
PUSHFQ
| | PUSH RFLAGS Register |
SCASQ
| | SCAn String Quadword |
STOSQ
| | STOre String Quadword |
SWAPGS
| | Exchange GS base with KernelGSBase MSR | {{no|0}} |
{{notelist}}{{vpad}}
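The sign-extending instructions in this table (CDQE, CQO and MOVSXD) all copy the top bit of the source into every newly created upper bit. A minimal Python sketch of that operation follows; the function names are hypothetical and register values are modelled as non-negative integers.

```python
def sign_extend(value, from_bits, to_bits):
    """Sign-extend the low from_bits of value to a to_bits-wide result."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        # Sign bit set: fill all bits between from_bits and to_bits with 1s.
        value |= ((1 << to_bits) - 1) ^ ((1 << from_bits) - 1)
    return value


def cdqe(eax):
    """CDQE: sign-extend EAX into RAX. Returns the new RAX."""
    return sign_extend(eax, 32, 64)


def cqo(rax):
    """CQO: sign-extend RAX into RDX:RAX. Returns (rdx, rax)."""
    wide = sign_extend(rax, 64, 128)
    return wide >> 64, wide & ((1 << 64) - 1)
```

MOVSXD r64,r/m32 performs the same 32-to-64-bit extension as <code>cdqe</code>, but with an arbitrary source and destination.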
== Bit manipulation extensions ==
{{Main|X86 Bit manipulation instruction set}}
This section lists the bit manipulation instructions. For all of the VEX-encoded instructions defined by BMI1 and BMI2, the operand size may be 32 or 64 bits, controlled by the VEX.W bit; none of these instructions are available in 16-bit variants. The VEX-encoded instructions are not available in Real Mode or Virtual-8086 mode; apart from that, the bit manipulation instructions are available in all operating modes on supported CPUs.
class="wikitable sortable"
! Bit Manipulation Extension !! Instruction |
colspan="5" | |
---|
rowspan="4" | {{glossary}}{{term|ABM (LZCNT){{efn|text=On AMD CPUs, the "ABM" extension provides both POPCNT and LZCNT. On Intel CPUs, however, the CPUID bit for "ABM" is only documented to indicate the presence of the LZCNT instruction and is listed as "LZCNT", while POPCNT has its own separate CPUID feature bit. However, all known processors that implement the "ABM"/"LZCNT" extensions also implement POPCNT and set the CPUID feature bit for POPCNT, so the distinction is theoretical only. (The converse is not true – there exist processors that support POPCNT but not ABM, such as Intel Nehalem and VIA Nano 3000.)}}}}{{defn|Advanced Bit Manipulation}}{{glossary end}}
| | | rowspan="2" | Population Count. Counts the number of bits that are set to 1 in its source argument. | rowspan="4" | K10, |
POPCNT r64,r/m64
| |
LZCNT r16,r/m16 LZCNT r32,r/m32
| | rowspan="2" | Count Leading zeroes.{{efn|The |
LZCNT r64,r/m64
| {{nowrap| |
colspan="5" | |
rowspan="7" | {{glossary}}{{term|BMI1}}{{defn|Bit Manipulation Instruction Set 1}}{{glossary end}}
| | | rowspan="2" | Count Trailing zeroes.{{efn|The | rowspan="7" | Haswell, |
TZCNT r64,r/m64
| {{nowrap| |
ANDN ra,rb,r/m
| | Bitwise AND-NOT: |
BEXTR ra,r/m,rb
| | Bitfield extract. Bitfield start position is specified in bits [7:0] of
|
BLSI reg,r/m
| | Extract lowest set bit in source argument. Returns 0 if source argument is 0. Equivalent to |
BLSMSK reg,r/m
| | Generate a bitmask of all-1s bits up to the lowest bit position with a 1 in the source argument. Returns all-1s if source argument is 0. Equivalent to |
BLSR reg,r/m
| | Copy all bits of the source argument, then clear the lowest set bit. Equivalent to |
colspan="5" | |
rowspan="8" | {{glossary}}{{term|BMI2}}{{defn|Bit Manipulation Instruction Set 2}}{{glossary end}}
| | {{small| | Zero out high-order bits in | rowspan="8" | Haswell, |
MULX ra,rb,r/m
| {{small|{{nowrap| | Widening unsigned integer multiply without setting flags. Multiplies EDX/RDX with |
PDEP ra,rb,r/m
| {{small|{{nowrap| | Parallel Bit Deposit. Scatters contiguous bits from
|
PEXT ra,rb,r/m
| {{small|{{nowrap| | Parallel Bit Extract. Uses
|
{{nowrap|RORX reg,r/m,imm8 }}
| {{small|{{nowrap| | Rotate right by immediate without affecting flags. |
SARX ra,r/m,rb
| {{small|{{nowrap| | Arithmetic shift right without updating flags. |
SHRX ra,r/m,rb
| {{small|{{nowrap| | Logical shift right without updating flags. |
SHLX ra,r/m,rb
| {{small|{{nowrap| | Shift left without updating flags. |
{{notelist}}{{vpad}}
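Several of the BMI1 and BMI2 operations above have short equivalents in terms of ordinary bitwise arithmetic. The Python sketch below uses the well-known identities for ANDN, BLSI, BLSMSK and BLSR, plus straightforward loops for PDEP and PEXT. It is an illustration only: the function names and the <code>bits</code> parameter are hypothetical, flags are not modelled, and the identities are stated from general knowledge rather than taken from the table entries.

```python
def _mask(bits):
    return (1 << bits) - 1


def andn(rb, rm, bits=64):
    """ANDN ra,rb,r/m: invert the first source, then AND it with the second."""
    return ~rb & rm & _mask(bits)


def blsi(x, bits=64):
    """Isolate the lowest set bit; gives 0 when x is 0."""
    return x & -x & _mask(bits)


def blsmsk(x, bits=64):
    """Mask of all bits up to and including the lowest set bit; all-1s for 0."""
    return (x ^ (x - 1)) & _mask(bits)


def blsr(x, bits=64):
    """Clear the lowest set bit."""
    return x & (x - 1) & _mask(bits)


def pdep(src, mask):
    """Deposit the low-order bits of src at the positions of the 1s in mask."""
    result, next_src_bit, position = 0, 0, 0
    while mask >> position:
        if (mask >> position) & 1:
            if (src >> next_src_bit) & 1:
                result |= 1 << position
            next_src_bit += 1
        position += 1
    return result


def pext(src, mask):
    """Gather the bits of src selected by mask into the low-order result bits."""
    result, next_dst_bit, position = 0, 0, 0
    while mask >> position:
        if (mask >> position) & 1:
            if (src >> position) & 1:
                result |= 1 << next_dst_bit
            next_dst_bit += 1
        position += 1
    return result
```

PDEP and PEXT are inverses on the bits selected by the mask: <code>pext(pdep(v, m), m)</code> returns the low bits of <code>v</code> that fit into the mask.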
== Added with Intel TSX ==
{{Main|Transactional Synchronization Extensions}}
class="wikitable sortable"
! TSX Subset !! Instruction !! Opcode !! Description !! Added in |
colspan="5" | |
---|
rowspan="4" | {{glossary}}{{term|RTM}}{{defn|Restricted Transactional Memory}}{{glossary end}}
| | | Start transaction. If transaction fails, perform a branch to the given relative offset. | rowspan="4" | Haswell |
XABORT imm8
| | Abort transaction with 8-bit immediate as error code. |
XEND
| {{nowrap| | End transaction. |
XTEST
| {{nowrap| | Test if in transactional execution. Sets |
colspan="5" | |
rowspan="2" | {{glossary}}{{term|HLE}}{{defn|Hardware Lock Elision}}{{glossary end}}
| | | Instruction prefix to indicate start of hardware lock elision, used with memory atomic instructions only (for other instructions, the | rowspan="2" | Haswell |
XRELEASE
| | Instruction prefix to indicate end of hardware lock elision, used with memory atomic/store instructions only (for other instructions, the |
colspan="5" | |
rowspan="2" | {{glossary}}{{term|TSXLDTRK}}{{defn|Load Address Tracking suspend/resume}}{{glossary end}}
| | {{nowrap| | Suspend Tracking Load Addresses | rowspan="2" | {{nowrap|Sapphire Rapids}} |
XRESLDTRK
| {{nowrap| | Resume Tracking Load Addresses |
{{vpad}}
== Added with [[Control-flow integrity#Intel Control-flow Enforcement Technology|Intel CET]] ==
Intel CET (Control-Flow Enforcement Technology) adds two distinct features to help protect against security exploits such as return-oriented programming: a shadow stack (CET_SS), and indirect branch tracking (CET_IBT).
class="wikitable sortable"
! CET Subset !! Instruction !! Opcode !! Description !! Ring !! Added in |
colspan="6" | |
---|
rowspan="12" | {{glossary}}{{term|CET_SS}}{{defn|Shadow stack. When shadow stacks are enabled, return addresses are pushed on both the regular stack and the shadow stack when a function call is made. They are then both popped on return from the function call – if they do not match, then the stack is assumed to be corrupted, and a #CP exception is issued. The shadow stack is additionally required to be stored in specially marked memory pages which cannot be modified by normal memory store instructions.}}{{glossary end}} | | | rowspan="2" | Increment shadow stack pointer | rowspan="8" {{yes|3}} | rowspan="12" | {{nowrap|Tiger Lake,}} |
INCSSPQ r64
| |
RDSSPD r32
| | Read shadow stack pointer into register (low 32 bits){{efn|name="rdssp_nop"|text=The |
RDSSPQ r64
| | Read shadow stack pointer into register (full 64 bits){{efn|name="rdssp_nop"}} |
SAVEPREVSSP
| | Save previous shadow stack pointer |
RSTORSSP m64
| | Restore saved shadow stack pointer |
WRSSD m32,r32
| | Write 4 bytes to shadow stack |
WRSSQ m64,r64
| {{nowrap| | Write 8 bytes to shadow stack |
WRUSSD m32,r32
| | Write 4 bytes to user shadow stack | rowspan="4" {{no|0}} |
{{nowrap|WRUSSQ m64,r64 }}
| {{nowrap| | Write 8 bytes to user shadow stack |
SETSSBSY
| | Mark shadow stack busy |
CLRSSBSY m64
| | Clear shadow stack busy flag |
colspan="6" | |
rowspan="3" | {{glossary}}{{term|CET_IBT}}{{defn|Indirect Branch Tracking. When IBT is enabled, an indirect branch (jump or call) to any instruction that is not an ENDBR32/64 instruction will cause a #CP exception.}}{{glossary end}}
| | | Terminate indirect branch in 32-bit mode{{efn|name="endbr_nop"|text= | rowspan="3" {{yes|3}} | rowspan="3" | Tiger Lake |
ENDBR64
| | Terminate indirect branch in 64-bit mode{{efn|name="endbr_nop"}} |
NOTRACK
| | Prefix used with indirect |
{{notelist}}{{vpad}}
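The shadow-stack check described for CET_SS can be modelled in a few lines of Python. This is a conceptual sketch only: the class and exception names are hypothetical, the two stacks are plain lists, and the memory-protection aspect (shadow-stack pages that ordinary stores cannot modify) is represented merely by the test code being able to corrupt only the regular stack.

```python
class ControlProtectionFault(Exception):
    """Stands in for the #CP exception in this model."""


class ShadowStackModel:
    def __init__(self):
        self.stack = []    # regular stack: writable by ordinary stores
        self.shadow = []   # shadow stack: only touched by call/ret here

    def call(self, return_address):
        """On a call, push the return address on both stacks."""
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def ret(self):
        """On a return, pop both copies and fault if they differ."""
        address = self.stack.pop()
        expected = self.shadow.pop()
        if address != expected:
            raise ControlProtectionFault(
                f"return address {address:#x} does not match {expected:#x}"
            )
        return address
```

Overwriting the top of <code>stack</code> between <code>call</code> and <code>ret</code>, as a stack-smashing exploit would, makes <code>ret</code> raise the fault instead of returning the corrupted address.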
== Added with XSAVE ==
The XSAVE instruction set extensions are designed to save/restore CPU extended state (typically for the purpose of context switching) in a manner that can be extended to cover new instruction set extensions without the OS context-switching code needing to understand the specifics of the new extensions. This is done by defining a series of state-components, each with a size and offset within a given save area, and each corresponding to a subset of the state needed for one CPU extension or another. The EAX=0Dh CPUID leaf is used to provide information about which state-components the CPU supports and what their sizes/offsets are, so that the OS can reserve the proper amount of space and set the associated enable-bits.
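The size/offset scheme described above can be sketched in Python as follows. The component table here is a hypothetical stand-in for what an OS would read from the CPUID leaf; its offsets and sizes are illustrative values, not taken from a real processor. The sketch also assumes the standard (non-compacted) layout, in which a 512-byte legacy region and a 64-byte header always precede the extended components.

```python
LEGACY_REGION_AND_HEADER = 512 + 64  # assumed fixed part of the save area


def feature_bitmap(edx, eax):
    """Form the 64-bit state-component bitmap passed in EDX:EAX."""
    return ((edx & 0xFFFFFFFF) << 32) | (eax & 0xFFFFFFFF)


def save_area_size(components, bitmap):
    """Return the bytes needed to hold every component selected in bitmap.

    components maps a state-component index to an (offset, size) pair.
    """
    size = LEGACY_REGION_AND_HEADER
    for index, (offset, length) in components.items():
        if (bitmap >> index) & 1:
            size = max(size, offset + length)
    return size


# Hypothetical component table, for illustration only.
example_components = {2: (576, 256), 5: (1088, 64)}
```

Because the OS only consults the table, support for a new state-component requires no change to <code>save_area_size</code>, which is the extensibility property the paragraph above describes.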
class="wikitable sortable"
! XSAVE Extension !! Instruction |
colspan="6" | |
---|
rowspan="4" | {{glossary}}{{term|XSAVE}}{{defn|Processor Extended State Save/Restore.}}{{glossary end}}
| | | Save state components specified by bitmap in EDX:EAX to memory. | rowspan="3" {{yes|3}} | rowspan="4" | Penryn,{{efn|XSAVE was added in steppings E0/R0 of Penryn and is not available in earlier steppings.}} |
XRSTOR mem XRSTOR64 mem
| | Restore state components specified by EDX:EAX from memory. |
XGETBV
| | Get value of Extended Control Register. |
XSETBV
| | Set Extended Control Register.{{efn|The | {{no|0}} |
colspan="6" | |
{{glossary}}{{term|XSAVEOPT}}{{defn|Processor Extended State Save/Restore Optimized}}{{glossary end}}
| | | Save state components specified by EDX:EAX to memory. | {{yes|3}} | {{nowrap|Sandy Bridge,}} |
colspan="6" | |
{{glossary}}{{term|XSAVEC}}{{defn|Processor Extended State save/restore with compaction.}}{{glossary end}}
| | | Save processor extended state components specified by EDX:EAX to memory with compaction. | {{yes|3}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|XSS}}{{defn|Processor Extended State save/restore, including supervisor state.}}{{glossary end}}
| | | Save processor extended state components specified by EDX:EAX to memory with compaction and optimization if possible. | rowspan="2" {{no|0}} |
XRSTORS mem XRSTORS64 mem
| | Restore state components specified by EDX:EAX from memory. |
{{notelist}}
{{vpad}}
== Added with other cross-vendor extensions ==
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction Set Extension !! Instruction |
colspan="6" | |
---|
rowspan="5" | {{glossary}}{{term|SSE{{efn|name="k7_mmxext"}}}}{{defn|(non-SIMD)}}{{glossary end}}
| | | Prefetch with Non-Temporal Access. | rowspan="5" {{yes|3}} | rowspan="5" | Pentium III, |
PREFETCHT0 m8
| | Prefetch data to all levels of the cache hierarchy.{{efn|name="prefetch_hint"}} |
PREFETCHT1 m8
| | Prefetch data to all levels of the cache hierarchy except L1 cache.{{efn|name="prefetch_hint"}} |
PREFETCHT2 m8
| | Prefetch data to all levels of the cache hierarchy except L1 and L2 caches.{{efn|name="prefetch_hint"}} |
SFENCE
| | Store Fence.{{efn|The |
colspan="6" | |
rowspan="4" | {{glossary}}{{term|SSE2}}{{defn|(non-SIMD)}}{{glossary end}}
| | | Load Fence and Dispatch Serialization.{{efn|The | rowspan="4" {{yes|3}} |
MFENCE
| | Memory Fence.{{efn|The |
MOVNTI m32,r32 MOVNTI m64,r64
| | Non-Temporal Memory Store. |
PAUSE
|
| Pauses CPU thread for a short time period.{{efn|The actual length of the pause performed by the |
colspan="6" | |
{{glossary}}{{term|CLFSH{{efn|While the CLFLUSH instruction was introduced together with SSE2, it has its own CPUID flag and may be present on processors not otherwise implementing SSE2 and/or absent from processors that otherwise implement SSE2. (E.g. AMD Geode LX supports CLFLUSH but not SSE2.)}}}}{{defn|Cache Line Flush.}}{{glossary end}}
| {{nowrap| | | Flush one cache line to memory. | {{yes|3}} | (SSE2), |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|MONITOR{{efn|While the MONITOR and MWAIT instructions were introduced at the same time as SSE3, they have their own CPUID flag that needs to be checked separately from the SSE3 CPUID flag (e.g. Athlon 64 X2 and VIA C7 supported SSE3 but not MONITOR.)}}}}{{defn|Monitor a memory location for memory writes.}}{{glossary end}}
| | | Start monitoring a memory location for memory writes. The memory address to monitor is given by DS:AX/EAX/RAX.{{efn|For | rowspan="2" {{no2|Usually 0{{efn|name="monitor_ring3"|On some processors, such as Intel Xeon Phi x200Intel, [https://web.archive.org/web/20170305002312/https://software.intel.com/en-us/blogs/2016/10/06/intel-xeon-phi-product-family-x200-knl-user-mode-ring-3-monitor-and-mwait Intel® Xeon Phi™ Product Family x200 (KNL) User mode (ring 3) MONITOR and MWAIT] (archived 5 mar 2017) and AMD K10AMD, [https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf BIOS and Kernel Developer’s Guide (BKDG) For AMD Family 10h Processors], order no. 31116, rev 3.62, page 419. [https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/programmer-references/31116.pdf Archived] on Apr 8, 2024. and later, there exist documented MSRs that can be used to enable |
MWAIT {{efn|name="monitor_explicit_op"}}MWAIT EAX,ECX
| | Wait for a write to a monitored memory location previously specified with {{(!}} class="wikitable sortable" ! Bits !! MWAIT Extension {{!}}- {{!}} 0 {{!!}} Treat interrupts as break events, even when masked (EFLAGS.IF=0). (Available on all non-NetBurst implementations of {{!}}- {{!}} 1 {{!!}} {{unofficial2|align="left"|Timed MWAIT: end the wait when the TSC reaches or exceeds the value in EDX:EBX. (Undocumented, reportedly present in Intel Skylake and later Intel processors)<ref>R. Zhang et al, [https://publications.cispa.saarland/3769/1/mwait_sec23.pdf (M)WAIT for It: Bridging the Gap between Microarchitectural and Architectural Side Channels], 3 Jan 2023, page 5. [https://web.archive.org/web/20230105140516/https://publications.cispa.saarland/3769/1/mwait_sec23.pdf Archived] from the original on 5 Jan 2023.</ref>}} {{!}}- {{!}}- {{!}} 31:3 {{!!}} {{n/a|align="left"|Not used, must be set to zero.}} {{!)}} }} and hint{{efn|text=The hint flags available for {{(!}} class="wikitable sortable" ! Bits !! MWAIT Hint {{!}}- {{!}} 3:0 {{!!}} Sub-state within a C-state (see bits 7:4) (Intel processors only) {{!}}- {{!}} 7:4 {{!!}} Target CPU power C-state during wait, minus 1. (E.g. 0000b for C1, 0001b for C2, 1111b for C0) {{!}}- {{!}} 31:8 {{!!}} {{n/a|align="left"|Not used.}} {{!)}} The C-states are processor-specific power states, which do not necessarily correspond 1:1 to ACPI C-states. }} flags, respectively. |
colspan="6" | |
{{glossary}}{{term|SMX}}{{defn|Safer Mode Extensions. Load, authenticate and execute a digitally signed "Authenticated Code Module" as part of Intel Trusted Execution Technology.}}{{glossary end}} | | {{nowrap| | Perform an SMX function. The leaf function to perform is given in EAX.{{efn|text=The leaf functions defined for {{(!}} class="wikitable sortable" ! EAX !! Function {{!}}- {{!}} 0 (CAPABILITIES) {{!!}} Report SMX capabilities {{!}}- {{!}} 2 (ENTERACCES) {{!!}} Enter execution of authenticated code module {{!}}- {{!}} 3 (EXITAC) {{!!}} Exit execution of authenticated code module {{!}}- {{!}} 4 (SENTER) {{!!}} Enter measured environment {{!}}- {{!}} 5 (SEXIT) {{!!}} Exit measured environment {{!}}- {{!}} 6 (PARAMETERS) {{!!}} Report SMX parameters {{!}}- {{!}} 7 (SMCTRL) {{!!}} SMX Mode Control {{!}}- {{!}} 8 (WAKEUP) {{!!}} Wake up sleeping processors in measured environment {{!)}} Any unsupported value in EAX causes an #UD exception. }} | {{no2|Usually 0{{efn|text=For | {{nowrap|Conroe/Merom,}} |
colspan="6" | |
{{glossary}}{{term|RDTSCP}}{{defn|Read Time Stamp Counter and Processor ID.}}{{glossary end}}
| | | Read Time Stamp Counter and processor core ID.{{efn|name="rdtscp_rdpid"}} | {{yes2|Usually 3{{efn|text= | K8,{{efn|Support for |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|POPCNT{{efn|While the POPCNT instruction was introduced at the same time as SSE4.2, it is not considered to be a part of SSE4.2, but instead a separate extension with its own CPUID flag. On AMD processors, it is considered to be a part of the ABM extension, but still has its own CPUID flag.}}}}{{defn|Population Count.}}{{glossary end}} | | | rowspan="2" | Count the number of bits that are set to 1 in its source argument. | rowspan="2" {{yes|3}} |
POPCNT r64,r/m64
| |
colspan="6" | |
rowspan="3" |{{glossary}}{{term|SSE4.2}}{{defn|(non-SIMD)}}{{glossary end}}
| | | rowspan="3" | Accumulate CRC value using the CRC-32C (Castagnoli) polynomial 0x11EDC6F41 (normal form 0x1EDC6F41). This is the polynomial used in iSCSI. In contrast to the more popular one used in Ethernet, its parity is even, and it can thus detect any error with an odd number of changed bits. | rowspan="3" {{yes|3}} | rowspan="3" | Nehalem, |
CRC32 r32,r/m16 CRC32 r32,r/m32
| |
CRC32 r64,r/m64
| |
colspan="6" | |
rowspan="4" | {{glossary}}{{term|FSGSBASE}}{{defn|Read/write base address of FS and GS segments from user-mode. Available in 64-bit mode only.}}{{glossary end}} | | | Read base address of FS: segment. | rowspan="4" {{yes|3}} | rowspan="4" | Ivy Bridge, |
RDGSBASE r32 RDGSBASE r64
| | Read base address of GS: segment. |
WRFSBASE r32 WRFSBASE r64
| | Write base address of FS: segment. |
WRGSBASE r32 WRGSBASE r64
| | Write base address of GS: segment. |
colspan="6" | |
rowspan="4" | {{glossary}}{{term|MOVBE}}{{defn|Move to/from memory with byte order swap.}}{{glossary end}}
| | | rowspan="2" | Load from memory to register with byte-order swap. | rowspan="4" {{yes|3}} | rowspan="4" | Bonnell, |
MOVBE r64,m64
| {{nowrap| |
MOVBE m16,r16 MOVBE m32,r32
| | rowspan="2" | Store to memory from register with byte-order swap. |
MOVBE m64,r64
| {{nowrap| |
colspan="6" | |
{{glossary}}{{term|INVPCID}}{{defn|Invalidate TLB entries by Process-context identifier.}}{{glossary end}}
| | | Invalidate entries in TLB and paging-structure caches based on invalidation type in register{{efn|text=The invalidation types defined for {{(!}} class="wikitable sortable" ! Value !! Function {{!}}- {{!}} 0 {{!!}} Invalidate TLB entries matching PCID and virtual memory address in descriptor, excluding global entries {{!}}- {{!}} 1 {{!!}} Invalidate TLB entries matching PCID in descriptor, excluding global entries {{!}}- {{!}} 2 {{!!}} Invalidate all TLB entries, including global entries {{!}}- {{!}} 3 {{!!}} Invalidate all TLB entries, excluding global entries {{!)}} Any unsupported value in the register argument causes a #GP exception.}} and descriptor in m128. The descriptor contains a memory address and a PCID.{{efn|Unlike the older Instruction is serializing on AMD but not Intel CPUs. | {{no|0}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|PREFETCHW{{efn|The PREFETCH and PREFETCHW instructions are mandatory parts of the 3DNow! instruction set extension, but are also available as a standalone extension on systems that do not support 3DNow!}}}}{{defn|Cache-line prefetch with intent to write.}}{{glossary end}}
| {{nowrap| | | Prefetch cache line with intent to write.{{efn|name="prefetch_hint"}} | rowspan="2" {{yes|3}} | rowspan="2" | K6-2, |
{{nowrap|PREFETCH m8 }}{{efn|The PREFETCH ({{nowrap|0F 0D /0 }}) instruction is a 3DNow! instruction, present on all processors with 3DNow! but not necessarily on processors with the PREFETCHW extension. On AMD CPUs with PREFETCHW, opcode {{nowrap| 0F 0D /0 }} as well as opcodes {{nowrap|0F 0D /2../7 }} are all documented to perform prefetch. On Intel processors with PREFETCHW, these opcodes are documented as performing reserved-NOPs<ref>Intel, [http://kib.kiev.ua/x86docs/Intel/SDMs/325384-078.pdf Intel® 64 and IA-32 Architectures Software Developer’s Manual] volume 3, order no. 325384-078, december 2022, chapter 23.15</ref> (except {{nowrap| 0F 0D /2 }} being {{nowrap|PREFETCHWT1 m8 }} on {{nowrap|Xeon Phi}} only) – third-party testing<ref>Catherine Easdon, [https://www.cattius.com/images/thesis-unsigned.pdf Undocumented CPU Behaviour on x86 and RISC-V Microarchitectures: A Security Perspective], 10 May 2019, page 39</ref> indicates that some or all of these opcodes may be performing prefetch on at least some Intel Core CPUs.}}
| | Prefetch cache line.{{efn|name="prefetch_hint"}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|ADX}}{{defn|Enhanced variants of add-with-carry.}}{{glossary end}}
| {{nowrap| | | Add-with-carry. Differs from the older | rowspan="2" {{yes|3}} | rowspan="2" | Broadwell, |
{{nowrap|ADOX r32,r/m32 }}ADOX r64,r/m64
| | Add-with-carry, with the overflow-flag |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|SMAP}}{{defn|Supervisor Mode Access Prevention. Repurposes the EFLAGS.AC (alignment check) flag to a flag that prevents access to user-mode memory while in ring 0, 1 or 2.}}{{glossary end}}
| | | Clear | rowspan="2" {{no|0}} | rowspan="2" | Broadwell, |
STAC
| | Set |
colspan="6" | |
{{glossary}}{{term|CLFLUSHOPT}}{{defn|Optimized Cache Line Flush.}}{{glossary end}}
| {{nowrap| | | Flush cache line. | {{yes|3}} |
colspan="6" | |
{{glossary}}{{term|PREFETCHWT1}}{{defn|Cache-line prefetch into L2 cache with intent to write.}}{{glossary end}}
| | | Prefetch data with T1 locality hint (fetch into L2 cache, but not L1 cache) and intent-to-write hint.{{efn|name="prefetch_hint"}} | {{yes|3}} | {{nowrap|Knights Landing,}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|PKU}}{{defn|Protection Keys for user pages.}}{{glossary end}}
| | | Read User Page Key register into EAX. | rowspan="2" {{yes|3}} | rowspan="2" | Skylake-X, |
WRPKRU
| | Write data from EAX into User Page Key Register, and perform a Memory Fence. |
colspan="6" | |
{{glossary}}{{term|CLWB}}{{defn|Cache Line Writeback to memory.}}{{glossary end}}
| | {{nowrap| | Write one cache line back to memory without invalidating the cache line. | {{yes|3}} |
colspan="6" | |
{{glossary}}{{term|RDPID}}{{defn|Read processor core ID.}}{{glossary end}}
| | | Read processor core ID into register.{{efn|name="rdtscp_rdpid"|text=The "core ID" value read by | {{yes|3{{efn|text=Unlike the older | {{nowrap|Goldmont Plus,}} |
colspan="6" | |
{{glossary}}{{term|MOVDIRI}}{{defn|Move to memory as Direct Store.}}{{glossary end}}
| | | Store to memory using Direct Store (memory store that is not cached or write-combined with other stores). | {{yes|3}} |
colspan="6" | |
{{glossary}}{{term|MOVDIR64B}}{{defn|Move 64 bytes as Direct Store.}}{{glossary end}}
| {{nowrap| | | Move 64 bytes of data from m512 to address given by ES:reg. The 64-byte write is done atomically with Direct Store.{{efn|For | {{yes|3}} |
colspan="6" | |
{{glossary}}{{term|WBNOINVD}}{{defn|Whole Cache Writeback without invalidate.}}{{glossary end}}
| | | Write back all dirty cache lines to memory without invalidation.{{efn|The | {{no|0}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|PREFETCHI}}{{defn|Instruction prefetch.}}{{glossary end}}
| | | Prefetch code to all levels of the cache hierarchy.{{efn|name=prefetchi_note|text=In initial implementations, the | rowspan="2" {{yes|3}} | rowspan="2" | Zen 5, |
PREFETCHIT1 m8
| | Prefetch code to all levels of the cache hierarchy except first-level cache.{{efn|name=prefetchi_note}} |
{{notelist}}{{vpad}}
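The CRC-32C accumulation described in the SSE4.2 row of the table above can be modelled in software. The following is a minimal Python sketch, not the processor's implementation: it assumes the common bit-reflected formulation, in which the polynomial 0x1EDC6F41 appears bit-reversed as 0x82F63B78, and the function names <code>crc32c_step</code> and <code>crc32c</code> are hypothetical. The initial and final inversion in <code>crc32c</code> is the convention used by protocols such as iSCSI and is applied by software around the accumulate step.

```python
POLY_REFLECTED = 0x82F63B78  # bit-reversed form of the CRC-32C polynomial 0x1EDC6F41


def crc32c_step(crc: int, byte: int) -> int:
    """Fold one byte into a running 32-bit CRC value (bitwise software model)."""
    crc ^= byte & 0xFF
    for _ in range(8):
        # Shift out one bit; apply the polynomial when the shifted-out bit is 1.
        if crc & 1:
            crc = (crc >> 1) ^ POLY_REFLECTED
        else:
            crc >>= 1
    return crc & 0xFFFFFFFF


def crc32c(data: bytes) -> int:
    """CRC-32C of a byte string, with the usual all-ones start and final inversion."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32c_step(crc, byte)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    # 0xE3069283 is the standard CRC-32C check value for the ASCII string "123456789".
    print(hex(crc32c(b"123456789")))
```

A table-driven or hardware-assisted version would process more than one bit per step; the bitwise loop is used here only because it is the shortest form that shows the polynomial directly.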
== Added with other Intel-specific extensions ==
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction Set Extension !! Instruction |
colspan="6" | |
---|
rowspan="2" | {{glossary}}{{term|SSE2 branch hints}}{{defn|Instruction prefixes that can be used with the Jcc instructions to provide branch taken/not-taken hints.}}{{glossary end}}
| Intel XED uses the mnemonics {{nowrap| | | Instruction prefix: branch hint weakly not taken. | rowspan="2" {{Yes|3}} | rowspan="2" | Pentium 4,{{efn|Branch hints are supported on all NetBurst (Pentium 4 family) processors - but not supported on any other known processor prior to their re-introduction in "Redwood Cove" CPUs, starting with "Meteor Lake" in 2023.}} |
HST ,hint-taken {{efn|name=wmt_hint}}
| | Instruction prefix: branch hint strongly taken. |
colspan="6" | |
rowspan="3" | {{glossary}}{{term|SGX}}{{defn|Software Guard Extensions. Set up an encrypted enclave in which a guest can execute code that a compromised or malicious host cannot inspect or tamper with.}}{{glossary end}} | | {{nowrap| | Perform an SGX Supervisor function. The function to perform is given in EAX{{efn|text=The leaf functions defined for {{(!}} class="wikitable sortable" ! EAX !! Function {{!}}- {{!}} 0 (ECREATE) {{!!}} Create an enclave {{!}}- {{!}} 1 (EADD) {{!!}} Add a page {{!}}- {{!}} 2 (EINIT) {{!!}} Initialize an enclave {{!}}- {{!}} 3 (EREMOVE) {{!!}} Remove a page from EPC (Enclave Page Cache) {{!}}- {{!}} 4 (EDBGRD) {{!!}} Read data by debugger {{!}}- {{!}} 5 (EDBGWR) {{!!}} Write data by debugger {{!}}- {{!}} 6 (EEXTEND) {{!!}} Extend EPC page measurement {{!}}- {{!}} 7 (ELDB) {{!!}} Load an EPC page as blocked {{!}}- {{!}} 8 (ELDU) {{!!}} Load an EPC page as unblocked {{!}}- {{!}} 9 (EBLOCK) {{!!}} Block an EPC page {{!}}- {{!}} A (EPA) {{!!}} Add version array {{!}}- {{!}} B (EWB) {{!!}} Writeback/invalidate EPC page {{!}}- {{!}} C (ETRACK) {{!!}} Activate EBLOCK checks {{!}}- ! colspan="2" {{!}} Added with SGX2 {{!}}- {{!}} D (EAUG) {{!!}} Add page to initialized enclave {{!}}- {{!}} E (EMODPTR) {{!!}} Restrict permissions of EPC page {{!}}- {{!}} F (EMODT) {{!!}} Change type of EPC page {{!}}- ! colspan="2" {{!}} Added with OVERSUB {{!}}- {{!}} 10 (ERDINFO) {{!!}} Read EPC page type/status info {{!}}- {{!}} 11 (ETRACKC) {{!!}} Activate EBLOCK checks {{!}}- {{!}} 12 (ELDBC) {{!!}} Load EPC page as blocked with enhanced error reporting {{!}}- {{!}} 13 (ELDUC) {{!!}} Load EPC page as unblocked with enhanced error reporting {{!}}- ! 
colspan="2" {{!}} Other {{!}}- {{!}} 18 (EUPDATESVN) {{!!}} Update SVN (Security Version Number) after live microcode update<ref>Intel, [https://cdrdv2-public.intel.com/648682/648682%20Runtime_Microcode_Update_with_Intel_SGX_rev1p0.pdf Runtime Microcode Updates with Intel® Software Guard Extensions], sep 2021, order no. 648682 rev 1.0. [https://web.archive.org/web/20230331103022/https://cdrdv2-public.intel.com/648682/648682%20Runtime_Microcode_Update_with_Intel_SGX_rev1p0.pdf Archived] from the original on 31 mar 2023.</ref> {{!)}} Any unsupported value in EAX causes a #GP exception. }}; depending on function, the instruction may take additional input operands in RBX, RCX and RDX. Depending on function, the instruction may return data in RBX and/or an error code in EAX. | {{no|0}} | rowspan="3" | {{glossary}}{{term|SGX1}}{{defn|Skylake,{{efn|SGX is deprecated on desktop/laptop processors from 11th generation (Rocket Lake, Tiger Lake) onwards,<ref>Intel, [https://cdrdv2-public.intel.com/634648/634648-004.pdf 11th Generation Intel® Core™ Processor Desktop Datasheet, Volume 1], may 2022, order no. 634648-004, section 3.5, page 65. [https://web.archive.org/web/20250219182337/https://cdrdv2-public.intel.com/634648/634648-004.pdf Archived] on 19 Feb 2025.</ref> but continues to be available on Xeon-branded server parts.}} |
ENCLU
| {{nowrap| | Perform an SGX User function. The function to perform is given in EAX{{efn|text=The leaf functions defined for {{(!}} class="wikitable sortable" ! EAX !! Function {{!}}- {{!}} 0 (EREPORT) {{!!}} Create a cryptographic report {{!}}- {{!}} 1 (EGETKEY) {{!!}} Create a cryptographic key {{!}}- {{!}} 2 (EENTER) {{!!}} Enter an Enclave {{!}}- {{!}} 3 (ERESUME) {{!!}} Re-enter an Enclave {{!}}- {{!}} 4 (EEXIT) {{!!}} Exit an Enclave {{!}}- ! colspan="2" {{!}} Added with SGX2 {{!}}- {{!}} 5 (EACCEPT) {{!!}} Accept changes to EPC page {{!}}- {{!}} 6 (EMODPE) {{!!}} Extend EPC page permissions {{!}}- {{!}} 7 (EACCEPTCOPY) {{!!}} Initialize pending page {{!}}- {{!}}- {{!}} 8 (EVERIFYREPORT2) {{!!}} Verify a cryptographic report of a trust domain {{!}}- ! colspan="2" {{!}} Added with AEX-Notify<ref>Intel, [https://cdrdv2-public.intel.com/736463/aex-notify-white-paper-public.pdf Asynchronous Enclave Exit Notify and the EDECCSSA User Leaf Function], 30 Jun 2022. [https://web.archive.org/web/20221121073302/https://cdrdv2-public.intel.com/736463/aex-notify-white-paper-public.pdf Archived] on 21 Nov 2022.</ref> {{!}}- {{!}} 9 (EDECCSSA) {{!!}} Decrement TCS.CSSA {{!}}- {{!}}- {{!}} A (EREPORT2) {{!!}} Create a cryptographic report that contains SHA384 measurements {{!}}- {{!}} B (EGETKEY256) {{!!}} Create a 256-bit cryptographic key {{!)}} Any unsupported value in EAX causes a #GP exception. }}; depending on function, the instruction may take additional input operands in RBX, RCX and RDX. Depending on function, the instruction may return data/status information in EAX and/or RCX. | {{yes|3{{efn| |
ENCLV
| {{nowrap| | Perform an SGX Virtualization function. The function to perform is given in EAX{{efn|text=The leaf functions defined for {{(!}} class="wikitable sortable" ! EAX !! Function {{!}}- ! colspan="2" {{!}} Added with OVERSUB {{!}}- {{!}} 0 (EDECVIRTCHILD) {{!!}} Decrement VIRTCHILDCNT in SECS {{!}}- {{!}} 1 (EINCVIRTCHILD) {{!!}} Increment VIRTCHILDCNT in SECS {{!}}- {{!}} 2 (ESETCONTEXT) {{!!}} Set ENCLAVECONTEXT field in SECS {{!)}} Any unsupported value in EAX causes a #GP exception. }} — depending on function, the instruction may take additional input operands in RBX, RCX and RDX. Instruction returns status information in EAX. | {{no|0{{efn| |
colspan="6" | |
{{glossary}}{{term|PTWRITE}}{{defn|Write data to a Processor Trace Packet.}}{{glossary end}}
| | | Read data from register or memory to encode into a PTW packet.{{efn|For | {{yes|3}} | Kaby Lake, |
colspan="6" | |
{{glossary}}{{term|PCONFIG}}{{defn|Platform Configuration, including TME-MK ("Total Memory Encryption – Multi-Key") and TSE ("Total Storage Encryption").}}{{glossary end}}
| | | Perform a platform feature configuration function. The function to perform is specified in EAX{{efn|text=The leaf functions defined for {{(!}} class="wikitable sortable" ! EAX !! Function {{!}}- {{!}} 0 {{!!}} MKTME_KEY_PROGRAM: {{!}}- ! colspan="2" {{!}} Added with TSE {{!}}- {{!}} 1 {{!!}} TSE_KEY_PROGRAM: {{!}}- {{!}} 2 {{!!}} TSE_KEY_PROGRAM_WRAPPED: {{!)}} Any unsupported value in EAX causes a #GP(0) exception.}} - depending on function, the instruction may take additional input operands in RBX, RCX and RDX. If the instruction fails, it will set EFLAGS.ZF=1 and return an error code in EAX. If it is successful, it sets EFLAGS.ZF=0 and EAX=0. | {{no|0}} |
colspan="6" | |
{{glossary}}{{term|CLDEMOTE}}{{defn|Cache Line Demotion Hint.}}{{glossary end}}
| | | Move cache line containing m8 from CPU L1 cache to a more distant level of the cache hierarchy.{{efn|For | {{yes|3}} | (Tremont), |
colspan="6" | |
rowspan="3" | {{glossary}}{{term|WAITPKG}}{{defn|User-mode memory monitoring and waiting.}}{{glossary end}}
| | | Start monitoring a memory location for memory writes. The memory address to monitor is given by the register argument.{{efn|For | {{yes|3}} | rowspan="3" | Tremont, |
UMWAIT r32 UMWAIT r32,EDX,EAX
| | Timed wait for a write to a monitored memory location previously specified with | rowspan="2" {{yes2|Usually 3{{efn|name="waitpkg_cr4tsd"|text= |
TPAUSE r32 TPAUSE r32,EDX,EAX
| | Wait until the Time Stamp Counter reaches the value specified in EDX:EAX.{{efn|name=umwait_ctrl}} The register argument to the ! Bits !! Usage {{!}}- {{!}} 0 {{!!}} Preferred optimization state.
{{!}}- {{!}} 31:1 {{!!}} {{n/a|(Reserved)}} {{!)}}}} |
colspan="6" | |
{{glossary}}{{term|SERIALIZE}}{{defn|Instruction Execution Serialization.}}{{glossary end}}
| | | Serialize instruction fetch and execution.{{efn|While serialization can be performed with older instructions such as e.g. | {{yes|3}} |
colspan="6" | |
{{glossary}}{{term|HRESET}}{{defn|Processor History Reset.}}{{glossary end}}
| | {{nowrap| | Request that the processor reset selected components of hardware-maintained prediction history. A bitmap of which components of the CPU's prediction history to reset is given in EAX (the imm8 argument is ignored).{{efn|text=A bitmap of CPU history components that can be reset through {{(!}} class="wikitable sortable" ! Bit !! Usage {{!}}- {{!}} 0 {{!!}} Intel Thread Director history {{!}}- {{!}} 31:1 {{!!}} {{n/a|(Reserved)}} {{!)}}}} | {{no|0}} |
colspan="6" | |
rowspan="5" | {{glossary}}{{term|UINTR}}{{defn|User Interprocessor interrupt. Available in 64-bit mode only.}}{{glossary end}} | | | Send Interprocessor User Interrupt.{{efn|The register argument to | rowspan="5" {{yes|3}} | rowspan="5" | Sapphire Rapids |
UIRET
| | User Interrupt Return. Pops RIP, RFLAGS and RSP off the stack, in that order.{{efn|text=On Sapphire Rapids processors, the |
TESTUI
| | Test User Interrupt Flag. |
CLUI
| | Clear User Interrupt Flag. |
STUI
| | Set User Interrupt Flag. |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|ENQCMD}}{{defn|Enqueue Store. Part of Intel DSA (Data Streaming Accelerator Architecture).<ref>Intel, [https://cdrdv2-public.intel.com/671116/341204-intel-data-streaming-accelerator-spec.pdf Intel Data Streaming Accelerator Architecture Specification], order no. 341204-004, Sep 2022, pages 13 and 23. [https://web.archive.org/web/20230720233510/https://cdrdv2-public.intel.com/671116/341204-intel-data-streaming-accelerator-spec.pdf Archived] on 20 Jul 2023.</ref> }}{{glossary end}}| | | Enqueue Command. Reads a 64-byte "command data" structure from memory (m512 argument) and writes it atomically to a memory-mapped Enqueue Store device (the register argument provides the memory address of this device, using the ES segment and requiring 64-byte alignment).{{efn|text=For | {{yes|3}} | rowspan="2" | {{nowrap|Sapphire Rapids}} |
{{nowrap|ENQCMDS reg,m512 }}
| | Enqueue Command Supervisor. Differs from | {{no|0}} |
colspan="6" | |
{{glossary}}{{term|WRMSRNS}}{{defn|Non-serializing Write to Model-specific register.}}{{glossary end}}
| | |Write Model-specific register. The MSR to write is specified in ECX, and the data to write is given in EDX:EAX. The instruction differs from the older | {{no|0}} | {{nowrap|Sierra Forest}} |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|MSRLIST}}{{defn|Read/write multiple Model-specific registers. Available in 64-bit mode only.}}{{glossary end}} | | | Read multiple MSRs. RSI points to a table of up to 64 MSR indexes to read (64 bits each), RDI points to a table of up to 64 data items that the MSR read-results will be written to (also 64 bits each), and RCX provides a 64-entry bitmap of which of the table entries to actually perform an MSR read for.{{efn|name=msrlist_align|text=For the | rowspan="2" {{no|0}} | rowspan="2" | {{nowrap|Sierra Forest}} |
WRMSRLIST
| | Write multiple MSRs. RSI points to a table of up to 64 MSR indexes to write (64 bits each), RDI points to a table of up to 64 data items to write into the MSRs (also 64 bits each), and RCX provides a 64-entry bitmap of which of the table entries to actually perform an MSR write for.{{efn|name=msrlist_align}} The MSRs are written in table order. The instruction is not serializing. |
colspan="6" | |
{{glossary}}{{term|CMPCCXADD}}{{defn|Atomically perform a compare - and a fetch-and-add if the condition is met. Available in 64-bit mode only.}}{{glossary end}} | {{nowrap| | {{small|{{nowrap| {{(!}} class="wikitable sortable" ! x !! cc !! Condition (EFLAGS) {{!}}- {{!}} 0 {{!!}} O {{!!}} OF=1: "Overflow" {{!}}- {{!}} 1 {{!!}} NO {{!!}} OF=0: {{nowrap|"Not Overflow"}} {{!}}- {{!}} 2 {{!!}} B {{!!}} CF=1: "Below" {{!}}- {{!}} 3 {{!!}} NB {{!!}} CF=0: {{nowrap|"Not Below"}} {{!}}- {{!}} 4 {{!!}} Z {{!!}} ZF=1: "Zero" {{!}}- {{!}} 5 {{!!}} NZ {{!!}} ZF=0: {{nowrap|"Not Zero"}} {{!}}- {{!}} 6 {{!!}} BE {{!!}} (CF=1 or ZF=1): {{nowrap|"Below or Equal"}} {{!}}- {{!}} 7 {{!!}} NBE {{!!}} (CF=0 and ZF=0): {{nowrap|"Not Below or Equal"}} {{!}}- {{!}} 8 {{!!}} S {{!!}} SF=1: "Sign" {{!}}- {{!}} 9 {{!!}} NS {{!!}} SF=0: {{nowrap|"Not Sign"}} {{!}}- {{!}} A {{!!}} P {{!!}} PF=1: "Parity" {{!}}- {{!}} B {{!!}} NP {{!!}} PF=0: {{nowrap|"Not Parity"}} {{!}}- {{!}} C {{!!}} L {{!!}} SF≠OF: "Less" {{!}}- {{!}} D {{!!}} NL {{!!}} SF=OF: {{nowrap|"Not Less"}} {{!}}- {{!}} E {{!!}} LE {{!!}} (ZF=1 or SF≠OF): {{nowrap|"Less or Equal"}} {{!}}- {{!}} F {{!!}} NLE {{!!}} (ZF=0 and SF=OF): {{nowrap|"Not Less or Equal"}} {{!)}}}}{{efn|Even though the | Read value from memory, then compare to first register operand. If the comparison passes, then add the second register operand to the memory value. The instruction as a whole is performed atomically.
| {{yes|3}} | {{nowrap|Sierra Forest,}} |
colspan="6" | |
{{glossary}}{{term|PBNDKB}}{{defn|Platform Bind Key to Binary Large Object. Part of Intel TSE (Total Storage Encryption), and available in 64-bit mode only. }}{{glossary end}}| | | Bind information to a platform by encrypting it with a platform-specific wrapping key. The instruction takes as input the addresses to two 256-byte-aligned "bind structures" in RBX and RCX, reads the structure pointed to by RBX and writes a modified structure to the address given in RCX. If the instruction fails, it will set EFLAGS.ZF=1 and return an error code in EAX. If it is successful, it sets EFLAGS.ZF=0 and EAX=0. | {{no|0}} |
{{notelist}}{{vpad}}
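The table-and-bitmap interface of the MSRLIST instructions in the table above can be illustrated with a small software model. This is a sketch only: the function names, the <code>read_msr</code>/<code>write_msr</code> callbacks and the sample register contents are all hypothetical, ordinary Python lists stand in for the memory tables addressed by RSI and RDI, and the model walks the slots in table order. It does not model privilege checks, alignment requirements or fault behaviour.

```python
def rdmsrlist_model(index_table, bitmap, read_msr):
    """Read the MSR named in each table slot whose bit is set in the 64-bit bitmap.

    Returns a dict mapping slot number to the value read (stand-in for the
    data table that RDI points to).
    """
    results = {}
    for slot in range(64):
        if (bitmap >> slot) & 1:
            results[slot] = read_msr(index_table[slot])
    return results


def wrmsrlist_model(index_table, data_table, bitmap, write_msr):
    """Write data_table[slot] to the MSR named in index_table[slot], in table order."""
    for slot in range(64):
        if (bitmap >> slot) & 1:
            write_msr(index_table[slot], data_table[slot])


# Hypothetical register file and tables, for illustration only.
fake_msrs = {0xE7: 111, 0xE8: 222, 0x10: 333}
indexes = [0xE7, 0xE8, 0x10]

# Bitmap 0b101 selects slots 0 and 2; slot 1 is skipped.
selected = rdmsrlist_model(indexes, 0b101, fake_msrs.__getitem__)

# Bitmap 0b110 selects slots 1 and 2; the log records the order of the writes.
write_log = []
wrmsrlist_model(indexes, [1, 2, 3], 0b110,
                lambda msr, value: write_log.append((msr, value)))
```

Here <code>selected</code> holds the values for slots 0 and 2 only, and <code>write_log</code> lists the write to slot 1 before the write to slot 2.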
== Added with other AMD-specific extensions ==
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction Set Extension !! Instruction |
colspan="6" | |
---|
rowspan="2" | {{glossary}}{{term|AltMovCr8}}{{defn|Alternative mechanism to access the CR8 control register.{{efn|The standard way to access the CR8 register is to use an encoding that makes use of the REX.R prefix, e.g. {{nowrap|44 0F 20 07 }} ({{nowrap|MOV RDI,CR8 }}). However, the REX.R prefix is only available in 64-bit mode. The AltMovCr8 extension adds an additional method to access CR8, using the F0 (LOCK) prefix instead of REX.R – this provides access to CR8 outside 64-bit mode.}}}}{{glossary end}}
| | | Read the CR8 register. | rowspan="2" {{no|0}} | rowspan="2" | K8{{efn|Support for AltMovCR8 was added in stepping F of the AMD K8, and is not available on earlier steppings.}} |
{{nowrap|MOV CR8,reg }}
| {{nowrap| | Write to the CR8 register. |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|MONITORX}}{{defn|Monitor a memory location for writes in user mode.}}{{glossary end}}
| | | Start monitoring a memory location for memory writes. Similar to older | rowspan="2" {{yes|3}} | rowspan="2" | Excavator |
MWAITX
| | Wait for a write to a monitored memory location previously specified with |
colspan="6" | |
{{glossary}}{{term|CLZERO}}{{defn|Zero out full cache line.}}{{glossary end}}
| {{nowrap| | | Write zeroes to all bytes in a memory region that has the size and alignment of a CPU cache line and contains the byte addressed by DS:rAX.{{efn|For | {{yes|3}} | Zen 1 |
colspan="6" | |
{{glossary}}{{term|RDPRU}}{{defn|Read processor register in user mode.}}{{glossary end}}
| | | Read selected MSRs (mainly performance counters) in user mode. ECX specifies which register to read.{{efn|text=The register numbering used by {{(!}} class="wikitable sortable" ! ECX !! Register {{!}}- {{!}} 0 {{!!}} MPERF (MSR 0E7h: Maximum Performance Frequency Clock Count) {{!}}- {{!}} 1 {{!!}} APERF (MSR 0E8h: Actual Performance Frequency Clock Count) {{!)}} Unsupported values in ECX return 0.}} The value of the MSR is returned in EDX:EAX. | {{yes2|Usually 3{{efn|text=If | Zen 2 |
colspan="6" | |
{{glossary}}{{term|MCOMMIT}}{{defn|Commit Stores To Memory.}}{{glossary end}}
| | | Ensure that all preceding stores in thread have been committed to memory, and that any errors encountered by these stores have been signalled to any associated error logging resources. The set of errors that can be reported and the logging mechanism are platform-specific. | {{yes|3}} | Zen 2 |
colspan="6" | |
rowspan="2" | {{glossary}}{{term|INVLPGB}}{{defn|Invalidate TLB Entries with broadcast.}}{{glossary end}}
| | | Invalidate TLB Entries for a range of pages, with broadcast. The invalidation is performed on the processor executing the instruction, and also broadcast to all other processors in the system. | rowspan="2" {{no|0}} | rowspan="2" | Zen 3 |
TLBSYNC
| | Synchronize TLB invalidations. |
{{notelist}}{{vpad}}
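The CLZERO entry in the table above zeroes the whole cache line that contains the addressed byte, not just the byte itself. The address arithmetic can be sketched as follows. This is a software illustration only: a <code>bytearray</code> stands in for memory, the function name <code>clzero_model</code> is hypothetical, and the 64-byte line size is an assumption (the actual line size is processor-specific).

```python
LINE_SIZE = 64  # assumed cache-line size in bytes; processor-specific in practice


def clzero_model(memory: bytearray, addr: int, line_size: int = LINE_SIZE) -> range:
    """Zero the line-sized, line-aligned region of `memory` that contains `addr`.

    Returns the range of byte offsets that were cleared.
    """
    # Round the address down to the start of its cache line
    # (line_size must be a power of two for this mask to be valid).
    base = addr & ~(line_size - 1)
    memory[base:base + line_size] = bytes(line_size)
    return range(base, base + line_size)


# Example: address 0x47 lies in the line covering offsets 0x40..0x7F.
mem = bytearray(b"\xff" * 256)
cleared = clzero_model(mem, 0x47)
```

After the call, offsets 64 to 127 of <code>mem</code> are zero and the neighbouring lines are untouched.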
x87 floating-point instructions
The x87 coprocessor, if present, provides support for floating-point arithmetic. The coprocessor provides eight data registers, each holding one 80-bit floating-point value (1 sign bit, 15 exponent bits, 64 mantissa bits). These registers are organized as a stack, with the top-of-stack register referred to as "st" or "st(0)", and the other registers referred to as st(1), st(2), ...st(7). It additionally provides a number of control and status registers, including "PC" (precision control, which selects whether floating-point operations are rounded to 24, 53 or 64 mantissa bits), "RC" (rounding control, which selects the rounding mode: round-to-zero, round-to-positive-infinity, round-to-negative-infinity or round-to-nearest-even) and a 4-bit condition code register "CC", whose four bits are individually referred to as C0, C1, C2 and C3. Not all of the arithmetic instructions provided by x87 obey PC and RC.
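The 80-bit register format (1 sign bit, 15 exponent bits, 64 mantissa bits) can be unpacked in software as in the following Python sketch. It assumes the little-endian 10-byte memory layout used by <code>FLD m80</code>/<code>FSTP m80</code>, an exponent bias of 16383, and a mantissa whose top bit is an explicit integer bit. The helper names <code>pack_fields</code> and <code>decode_x87_extended</code> are hypothetical, and the result is converted to a Python float, so precision beyond 53 mantissa bits is lost.

```python
import math


def pack_fields(sign: int, exponent: int, mantissa: int) -> bytes:
    """Assemble the three fields into the 10-byte little-endian memory image."""
    bits = (sign & 1) << 79 | (exponent & 0x7FFF) << 64 | (mantissa & (2**64 - 1))
    return bits.to_bytes(10, "little")


def decode_x87_extended(raw: bytes) -> float:
    """Decode a 10-byte x87 extended-precision value into a Python float."""
    if len(raw) != 10:
        raise ValueError("expected exactly 10 bytes")
    bits = int.from_bytes(raw, "little")
    mantissa = bits & (2**64 - 1)      # bits 63:0, bit 63 is the explicit integer bit
    exponent = (bits >> 64) & 0x7FFF   # bits 78:64, biased by 16383
    sign = -1.0 if bits >> 79 else 1.0
    if exponent == 0x7FFF:
        # All-ones exponent: infinity if the fraction bits are zero, otherwise NaN.
        fraction = mantissa & (2**63 - 1)
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        exponent = 1                   # denormals use the minimum exponent
    try:
        # value = mantissa * 2^(exponent - bias - 63)
        return sign * math.ldexp(mantissa, exponent - 16383 - 63)
    except OverflowError:
        return sign * math.inf         # magnitude too large for a Python float


one = decode_x87_extended(pack_fields(0, 0x3FFF, 0x8000000000000000))
minus_two_and_a_half = decode_x87_extended(pack_fields(1, 0x4000, 0xA000000000000000))
```

In this encoding 1.0 is stored with exponent 0x3FFF and only the integer bit set, which is why the scale factor subtracts 63 in addition to the bias.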
= Original [[8087]] instructions =
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction description ! Mnemonic ! Opcode ! colspan="2" | Additional items | |||
colspan="3" | | colspan="2" | | ||
---|---|---|---|
colspan="3" | x87 Non-Waiting{{efn|x87 coprocessors (other than the 8087) handle exceptions in a fairly unusual way. When an x87 instruction generates an unmasked arithmetic exception, it will still complete without causing a CPU fault – instead of causing a fault, it will record within the coprocessor the information needed to handle the exception (instruction pointer, opcode, data pointer if the instruction had a memory operand) and set an FPU status-word flag to indicate that a pending exception is present. This pending exception will then cause a CPU fault when the next x87, MMX or WAIT instruction is executed. The exception to this is x87's "Non-Waiting" instructions, which will execute without causing such a fault even if a pending exception is present (with some caveats, see application note AP-578<ref>Intel, [https://ardent-tool.com/CPU/docs/Intel/IA/243291-002.pdf Application note AP-578: Software and Hardware Considerations for FPU Exception Handlers for Intel Architecture Processors], order no. 243291-002, February 1997</ref>). These instructions are mostly control instructions that can inspect and/or modify the pending-exception state of the x87 FPU.}} FPU Control Instructions || colspan="2" | Waiting mnemonic{{efn|For each non-waiting x87 instruction whose mnemonic begins with FN, there exists a pseudo-instruction that has the same mnemonic except without the N. These pseudo-instructions consist of a WAIT instruction (opcode 9B) followed by the corresponding non-waiting x87 instruction. For example:
| |||
Initialize x87 FPU
| | | colspan="2" | FINIT | ||
Load x87 Control Word
| | D9 /5 | colspan="2" {{CNone|(none)}} | |
Store x87 Control Word
| | D9 /7 | colspan="2" | FSTCW | |
Store x87 Status Word
| | | colspan="2" | FSTSW | ||
Clear x87 Exception Flags
| | | colspan="2" | FCLEX | ||
Load x87 FPU Environment
| | | colspan="2" {{CNone|(none)}} | ||
Store x87 FPU Environment
| {{nowrap| | | colspan="2" | FSTENV | ||
Save x87 FPU State, then initialize x87 FPU
| {{nowrap| | | colspan="2" | FSAVE | ||
Restore x87 FPU State
| | | colspan="2" {{CNone|(none)}} | ||
Enable Interrupts (8087 only){{efn|name="feni_8087_only"|In the case of an x87 instruction producing an unmasked FPU exception, the 8087 FPU will signal an IRQ some indeterminate time after the instruction was issued. This may not always be possible to handle,<ref>Intel, [https://ardent-tool.com/CPU/docs/Intel/808x/8087/appnotes/AP-113.pdf Application Note AP-113: Getting Started With The Numeric Data Processor], feb 1981, pages 24-25</ref> and so the FPU offers the F(N)DISI and F(N)ENI instructions to set/clear the Interrupt Mask bit (bit 7) of the x87 Control Word,<ref>Intel, [http://www.datasheetcatalog.com/datasheets_pdf/8/0/8/7/8087.shtml 8087 Math Coprocessor], oct 1989, order no. 285385-007, page 3-100, fig 9</ref> to control the interrupt. Later x87 FPUs, from 80287 onwards, changed the FPU exception mechanism to instead produce a CPU exception on the next x87 instruction. This made the Interrupt Mask bit unnecessary, so it was removed.<ref>Intel, [http://www.bitsavers.org/components/intel/_dataSheets/80287_Data_Sheet_Feb83.pdf 80287 80-bit HMOS Numeric Processor Extension], feb 1983, order no. 201920-001, page 14</ref> In later Intel x87 FPUs, the F(N)ENI and F(N)DISI instructions were kept for backwards compatibility, executing as NOPs that do not modify any x87 state.}}
| | DB E0 | colspan="2" | FENI | |
Disable Interrupts (8087 only){{efn|name="feni_8087_only"}}
| | DB E1 | colspan="2" | FDISI | |
colspan="3" | | colspan="2" | | ||
colspan="3" | x87 Floating-point Load/Store/Move Instructions || precision control || rounding control | |||
rowspan="4" | Load floating-point value onto stack
| | D9 /0 | rowspan="4"{{no}} | rowspan="4"{{n/a}} |
FLD m64 | DD /0 | ||
FLD m80 | DB /5 | ||
FLD st(i) | D9 C0+i | ||
rowspan="3" | Store top-of-stack floating-point value to memory or stack register
| | D9 /2 | rowspan="2" {{no}} | rowspan="2" {{yes}} |
FST m64 | DD /2 | ||
FST st(i) {{efn|name="x87_amd_fstp"|FST /FSTP with an 80-bit destination (m80 or st(i)) and an sNaN source value is documented to produce exceptions on AMD but not Intel FPUs.}}
| | {{no}} | {{n/a}} | |
rowspan="6" | Store top-of-stack floating-point value to memory or stack register, then pop
| | D9 /3 | rowspan="2" {{no}} | rowspan="2" {{yes}} |
FSTP m64 | DD /3 | ||
FSTP m80 {{efn|name="x87_amd_fstp"}}
| | rowspan="4" {{no}} | rowspan="4"{{n/a}} | |
rowspan="3" | FSTP st(i) {{efn|name="x87_amd_fstp"}}{{efn|FSTP ST(0) is a commonly used idiom for popping a single register off the x87 register stack.}}
| {{nowrap| | |||
{{unofficial2|align="left"|{{mono|DF D0+i}}{{efn|name="x87_alias"|Intel x87 alias opcode. Use of this opcode is not recommended. On the Intel 8087 coprocessor, several reserved opcodes would perform operations behaving similarly to existing defined x87 instructions. These opcodes were documented for the 8087Intel, [https://ardent-tool.com/CPU/docs/Intel/808x/manuals/210201-001.pdf iAPX86, 88 User's Manual], 1981 (order no. 210201-001), p. 797 and 80287,Intel [http://bitsavers.trailing-edge.com/components/intel/80286/210498-005_80286_and_80287_Programmers_Reference_Manual_1987.pdf 80286 and 80287 Programmers Reference Manual], 1987 (order no. 210498-005), p. 485 but then omitted from later manuals until the October 2017 update of the Intel SDM.Intel [https://kib.kiev.ua/x86docs/Intel/SDMs/253669-064.pdf Software Developer's Manual] volume 3B, revision 064, section 22.18.9 They are present on all known Intel x87 FPUs but unavailable on some older non-Intel FPUs, such as AMD Geode GX/LX, DM&P Vortex86{{Cite web|url=https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37179|title = GCC Bugzilla – 37179 – GCC emits bad opcode 'ffreep'}} and NexGen 586PF.Michael Steil, [https://www.pagetable.com/?p=16 FFREEP – the assembly instruction that never existed]}}}} | |||
{{unofficial2|align="left"|{{mono|DF D8+i}}{{efn|name="x87_alias"}}}} | |||
Push +0.0 onto stack
| | D9 EE | rowspan="2" {{no}} | rowspan="2"{{n/a}} |
Push +1.0 onto stack
| | D9 E8 | ||
Push {{pi}} (approximately 3.14159) onto stack
| | D9 EB | rowspan="5" {{no}} | rowspan="5" {{yes2|387{{efn|name="x87_rounding_387"|On the 8087 and 80287, FBSTP and the load-constant instructions always use the round-to-nearest rounding mode. On the 80387 and later x87 FPUs, these instructions will use the rounding mode specified in the x87 RC register.}}}} |
Push <math>\log_{2}(10)</math> (approximately 3.32193) onto stack
| | D9 E9 | ||
Push <math>\log_{2}(e)</math> (approximately 1.44269) onto stack
| | D9 EA | ||
Push <math>\log_{10}(2)</math> (approximately 0.30103) onto stack
| | D9 EC | ||
Push <math>\ln(2)</math> (approximately 0.69315) onto stack
| | D9 ED | ||
rowspan="3" | Exchange top-of-stack register with other stack register
| rowspan="3" | | | rowspan=3 {{no}} | rowspan=3 {{n/a}} | |||
{{unofficial2|align="left"|{{nowrap|{{mono|DD C8+i}}{{efn|name="x87_alias"}}}}}} | |||
{{unofficial2|align="left"|{{nowrap|{{mono|DF C8+i}}{{efn|name="x87_alias"}}}}}} | |||
colspan="3" | x87 Integer Load/Store Instructions || precision control || rounding control | |||
rowspan="3" | Load signed integer value onto stack from memory, with conversion to floating-point
| | DF /0 | rowspan="3" {{no}} | rowspan="3"{{n/a}} |
FILD m32 | DB /0 | ||
FILD m64 | DF /5 | ||
rowspan="2" | Store top-of-stack value to memory, with conversion to signed integer
| | DF /2 | rowspan="2" {{no}} | rowspan="2" {{yes}} |
FIST m32 | DB /2 | ||
rowspan="3" | Store top-of-stack value to memory, with conversion to signed integer, then pop stack
| | DF /3 | rowspan="3" {{no}} | rowspan="3" {{yes}} |
FISTP m32 | DB /3 | ||
FISTP m64 | DF /7 | ||
Load 18-digit Binary-Coded-Decimal integer value onto stack from memory, with conversion to floating-point{{efn|The result of executing the FBLD instruction on non-BCD data is undefined.}}
| | | {{no}} | {{n/a}} | |
Store top-of-stack value to memory, with conversion to 18-digit Binary-Coded-Decimal integer, then pop stack
| | DF /6 | {{no}} | {{yes2|387{{efn|name="x87_rounding_387"}}}} |
colspan="3" | x87 Basic Arithmetic Instructions || precision control || rounding control | |||
rowspan="4" | Floating-point add
:{{code|dst <- dst + src}} | | D8 /0 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FADD m64 | DC /0 | ||
FADD st,st(i) | D8 C0+i | ||
FADD st(i),st | DC C0+i | ||
rowspan="4" | Floating-point multiply
:{{code|dst <- dst * src}} | | D8 /1 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FMUL m64 | DC /1 | ||
FMUL st,st(i) | D8 C8+i | ||
FMUL st(i),st | DC C8+i | ||
rowspan="4" | Floating-point subtract
:{{code|dst <- dst – src}} | | D8 /4 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FSUB m64 | DC /4 | ||
FSUB st,st(i) | D8 E0+i | ||
FSUB st(i),st | DC E8+i | ||
rowspan="4" | Floating-point reverse subtract
:{{code|dst <- src – dst}} | | D8 /5 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FSUBR m64 | DC /5 | ||
FSUBR st,st(i) | D8 E8+i | ||
FSUBR st(i),st | DC E0+i | ||
rowspan="4" | Floating-point divide{{efn|name="pentium_fdiv"|text=On early Intel Pentium processors, floating-point divide was subject to the Pentium FDIV bug. This also affected instructions that perform divide as part of their operations, such as FPREM and FPATAN.<ref>Dusko Koncaliev, [https://www.cs.earlham.edu/~dusko/cs63/fdiv.html Pentium FDIV Bug]</ref>}}
:{{code|dst <- dst / src}} | | D8 /6 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FDIV m64 | DC /6 | ||
FDIV st,st(i) | D8 F0+i | ||
FDIV st(i),st | DC F8+i | ||
rowspan="4" | Floating-point reverse divide
:{{code|dst <- src / dst}} | | D8 /7 | rowspan="4" {{yes}} | rowspan="4" {{yes}} |
FDIVR m64 | DC /7 | ||
FDIVR st,st(i) | D8 F8+i | ||
FDIVR st(i),st | DC F0+i | ||
rowspan="4" | Floating-point compare
:{{code|CC <- result_of( st(0) – src )}} | | D8 /2 | rowspan="4" {{no}} | rowspan="4"{{n/a}} |
FCOM m64 | DC /2 | ||
rowspan="2" | FCOM st(i) {{efn|name="x87_optarg"}}
| | |||
{{unofficial2|align="left"|{{nowrap|{{mono|DC D0+i}}{{efn|name="x87_alias"}}}}}} | |||
colspan="3" | x87 Basic Arithmetic Instructions with Stack Pop || precision control || rounding control | |||
Floating-point add and pop
| | DE C0+i | {{yes}} | {{yes}} |
Floating-point multiply and pop
| | DE C8+i | {{yes}} | {{yes}} |
Floating-point subtract and pop
| | DE E8+i | {{yes}} | {{yes}} |
Floating-point reverse-subtract and pop
| | DE E0+i | {{yes}} | {{yes}} |
Floating-point divide and pop
| | DE F8+i | {{yes}} | {{yes}} |
Floating-point reverse-divide and pop
| | DE F0+i | {{yes}} | {{yes}} |
rowspan="5" | Floating-point compare and pop
| | D8 /3 | rowspan="5" {{no}} | rowspan="5"{{n/a}} |
FCOMP m64 | DC /3 | ||
rowspan="3" | FCOMP st(i) {{efn|name="x87_optarg"}}
| | |||
{{unofficial2|align="left"|{{nowrap|{{mono|DC D8+i}}{{efn|name="x87_alias"}}}}}} | |||
{{unofficial2|align="left"|{{nowrap|{{mono|DE D0+i}}{{efn|name="x87_alias"}}}}}} | |||
Floating-point compare to st(1), then pop twice
| | DE D9 | {{no}} | {{n/a}} |
colspan="3" | x87 Basic Arithmetic Instructions with Integer Source Argument || precision control || rounding control | |||
rowspan="2" | Floating-point add by integer
| | DE /0 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FIADD m32 | DA /0 | ||
rowspan="2" | Floating-point multiply by integer
| | DE /1 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FIMUL m32 | DA /1 | ||
rowspan="2" | Floating-point subtract by integer
| | DE /4 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FISUB m32 | DA /4 | ||
rowspan="2" | Floating-point reverse-subtract by integer
| | DE /5 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FISUBR m32 | DA /5 | ||
rowspan="2" | Floating-point divide by integer
| | DE /6 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FIDIV m32 | DA /6 | ||
rowspan="2" | Floating-point reverse-divide by integer
| | DE /7 | rowspan="2" {{yes}} | rowspan="2" {{yes}} |
FIDIVR m32 | DA /7 | ||
rowspan="2" | Floating-point compare to integer
| | DE /2 | rowspan="2" {{no}} | rowspan="2"{{n/a}} |
FICOM m32 | DA /2 | ||
rowspan="2" | Floating-point compare to integer, and stack pop
| | | rowspan="2" {{no}} | rowspan="2"{{n/a}} | |
FICOMP m32
| | |||
colspan="3" | x87 Additional Arithmetic Instructions || precision control || rounding control | |||
Floating-point change sign
| | D9 E0 | {{no}} | {{n/a}} |
Floating-point absolute value
| | D9 E1 | {{no}} | {{n/a}} |
Floating-point compare top-of-stack value to 0
| | D9 E4 | {{no}} | {{n/a}} |
Classify top-of-stack st(0) register value. The classification result is stored in the x87 CC register.{{efn|text=The FXAM instruction will set C0, C2 and C3 based on value type in st(0) as follows:
{{(!}} class="wikitable sortable" ! C3 !! C2 !! C0 !! Classification {{!}}- {{!}} 0 {{!!}} 0 {{!!}} 0 {{!!}} Unsupported (unnormal or pseudo-NaN) {{!}}- {{!}} 0 {{!!}} 0 {{!!}} 1 {{!!}} NaN {{!}}- {{!}} 0 {{!!}} 1 {{!!}} 0 {{!!}} Normal finite number {{!}}- {{!}} 0 {{!!}} 1 {{!!}} 1 {{!!}} Infinity {{!}}- {{!}} 1 {{!!}} 0 {{!!}} 0 {{!!}} Zero {{!}}- {{!}} 1 {{!!}} 0 {{!!}} 1 {{!!}} Empty {{!}}- {{!}} 1 {{!!}} 1 {{!!}} 0 {{!!}} Denormal number {{!}}- {{!}} 1 {{!!}} 1 {{!!}} 1 {{!!}} Empty (may occur on 8087/80287 only) {{!)}} C1 is set to the sign-bit of st(0), regardless of whether st(0) is Empty or not. }} | | D9 E5 | {{no}} | {{n/a}} |
Split the st(0) value into two values {{mvar|E}} and {{mvar|M}} representing the exponent and mantissa of st(0). The split is done such that <math>M \cdot 2^{E} = st(0)</math>, where {{mvar|E}} is an integer and {{mvar|M}} is a number whose absolute value is within the range <math>1 \le \vert M \vert < 2</math>. {{efn|For FXTRACT, the behavior that results from st(0) being zero or ±∞ differs between 8087 and 80387:
}} | | D9 F4 | {{no}} | {{n/a}} |
Floating-point partial{{efn|For FPREM, if the quotient {{mvar|Q}} is larger than , then the remainder calculation may have been done only partially – in this case, the FPREM instruction will need to be run again in order to complete the remainder calculation. This is indicated by the instruction setting C2 to 1. If the instruction did complete the remainder calculation, it will set C2 to 0 and set the three bits {C0,C3,C1} to the bottom three bits of the quotient {{mvar|Q}}. On 80387 and later, if the instruction did not complete the remainder calculation, then the computed quotient {{mvar|Q}} used for argument reduction will have been rounded to a multiple of 8 (or larger power-of-2), so that the bottom 3 bits of the quotient can still be correctly retrieved in a later pass that does complete the remainder calculation.}} remainder (not IEEE 754 compliant): | | D9 F8 | {{no}} | {{n/a}}{{efn|The remainder computation done by the FPREM instruction is always exact with no roundoff errors.}} |
Floating-point square root
| | D9 FA | {{yes}} | {{yes}} |
Floating-point round to integer
| | D9 FC | {{no}} | {{yes}} |
Floating-point power-of-2 scaling. Rounds the value of st(1) to integer with round-to-zero, then uses it as a scale factor for st(0):{{efn|For the FSCALE instruction on 8087 and 80287, st(1) is required to be in the range . Also, its absolute value must be either 0 or at least 1. If these requirements are not satisfied, the result is undefined. These restrictions were removed in the 80387.}} | | D9 FD | {{no}} | {{yes|Yes{{efn|For FSCALE, rounding is only applied in the case of overflow, underflow or subnormal result.}}}} |
colspan="3" | | colspan="2" | | ||
colspan="3" | x87 Transcendental Instructions{{efn|The x87 transcendental instructions do not obey PC or RC, but instead compute full 80-bit results. These results are not necessarily correctly rounded (see Table-maker's dilemma) – they may have an error of up to ±1 ulp on Pentium or later, or up to ±1.5 ulps on earlier x87 coprocessors.}} | colspan="2" | Source operand range restriction | ||
Base-2 exponential minus 1, with extra precision for st(0) close to 0:
| | D9 F0
| colspan="2" | 8087: | ||
Base-2 Logarithm and multiply:{{efn|name="x87_fyl2x_error"|1=For the FYL2X and FYL2XP1 instructions, the maximum error bound of ±1 ulp only holds for st(1)=1.0 – for other values of st(1), the error bound is increased to ±1.35 ulps. FYL2X can produce a #Z (divide-by-zero exception) if st(0)=0 and st(1) is a finite nonzero value. FYL2XP1, however, cannot produce #Z.}} followed by stack pop | | | colspan="2" | no restrictions | ||
Partial Tangent: Computes from st(0) a pair of values {{mvar|X}} and {{mvar|Y}}, such that <math>Y/X = \tan(st(0))</math>. The {{mvar|Y}} value replaces the top-of-stack value, and then {{mvar|X}} is pushed onto the stack. On 80387 and later x87, but not original 8087, {{mvar|X}} is always 1.0 | | D9 F2
| colspan="2" | 8087: | ||
Two-argument arctangent with quadrant adjustment:{{efn|For FPATAN , the following adjustments are done as compared to just computing a one-argument arctangent of the ratio :
| | D9 F3
| colspan="2" | 8087: | ||
Base-2 Logarithm plus 1 with extra precision for st(0) close to 0, followed by multiply:{{efn|name="x87_fyl2x_error"}} followed by stack pop | | D9 F9
| colspan="2" | Intel: | ||
colspan="3" | | colspan="2" | | ||
colspan="3" | Other x87 Instructions | colspan="2" | | ||
No operation{{efn|While FNOP is a no-op in the sense that it will leave the x87 FPU register stack unmodified, it may still modify FIP and CC, and it may fault if a pending x87 FPU exception is present.}}
| | D9 D0
| rowspan="7" colspan="2" | | ||
Decrement x87 FPU Register Stack Pointer
| | D9 F6 | ||
Increment x87 FPU Register Stack Pointer
| | D9 F7 | ||
Free x87 FPU Register
| | {{nowrap| | |||
Check and handle pending unmasked x87 FPU exceptions
| | 9B | ||
Floating-point store and pop, without stack underflow exception{{efn|1=If the top-of-stack register st(0) is Empty, then the FSTPNCE instruction will behave like FINCSTP , incrementing the stack pointer with no data movement and no exceptions reported.}}
| {{unofficial2|align=left|{{nowrap|{{mono|FSTPNCE st(i)}}}}}} | {{unofficial2|align=left|{{mono|D9 D8+i}}{{efn|name="x87_alias"}}}} | |||
Free x87 register, then stack pop
| {{unofficial2|align=left|{{nowrap|{{mono|FFREEP st(i)}}}}}} | {{unofficial2|align=left|{{mono|DF C0+i}}{{efn|name="x87_alias"}}}} |
{{notelist}}
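
As an illustration of the FXAM classification listed in the notes above, the following Python sketch decodes a 16-bit x87 status word into a class name and the sign bit. The names <code>decode_fxam</code> and <code>FXAM_CLASSES</code> are hypothetical. The bit positions used (C0 = bit 8, C1 = bit 9, C2 = bit 10, C3 = bit 14) follow the conventional x87 status word layout, which is an assumption not stated in the table itself.

```python
# Illustrative sketch only: decode the condition codes written by FXAM.
# Assumption: C0, C1, C2, C3 sit at bits 8, 9, 10 and 14 of the status word.
C0_BIT, C1_BIT, C2_BIT, C3_BIT = 8, 9, 10, 14

# (C3, C2, C0) -> classification, following the FXAM note above.
FXAM_CLASSES = {
    (0, 0, 0): "unsupported",
    (0, 0, 1): "NaN",
    (0, 1, 0): "normal",
    (0, 1, 1): "infinity",
    (1, 0, 0): "zero",
    (1, 0, 1): "empty",
    (1, 1, 0): "denormal",
    (1, 1, 1): "empty",  # may occur on 8087/80287 only
}


def decode_fxam(status_word):
    """Return (classification, sign_bit) for a status word set by FXAM."""
    c0 = (status_word >> C0_BIT) & 1
    c1 = (status_word >> C1_BIT) & 1  # sign bit of st(0), even if Empty
    c2 = (status_word >> C2_BIT) & 1
    c3 = (status_word >> C3_BIT) & 1
    return FXAM_CLASSES[(c3, c2, c0)], c1


if __name__ == "__main__":
    print(decode_fxam(0x0400))  # C2 only: positive normal number
    print(decode_fxam(0x4200))  # C3 and C1: negative zero
```

A lookup keyed on (C3, C2, C0) mirrors the order of the columns in the note, so the table can be checked against it row by row.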
= x87 instructions added in later processors =
{{sticky header}}
class="wikitable sortable sticky-header"
! Instruction description ! Mnemonic ! Opcode ! Additional items | |||
colspan="3" | | |||
---|---|---|---|
colspan="3" | x87 Non-Waiting Control Instructions added in 80287 | Waiting mnemonic | ||
Notify FPU of entry into Protected Mode{{efn|The x87 FPU needs to know whether it is operating in Real Mode or Protected Mode because the floating-point environment accessed by the F(N)SAVE , FRSTOR , FLDENV and F(N)STENV instructions has different formats in Real Mode and Protected Mode. On 80287, the F(N)SETPM instruction is required to communicate the real-to-protected mode transition to the FPU. On 80387 and later x87 FPUs, real↔protected mode transitions are handled automatically between the CPU and the FPU without the need for any dedicated instructions – therefore, on these FPUs, FNSETPM executes as a NOP that does not modify any FPU state.}} | FNSETPM | DB E4 | FSETPM |
Store x87 Status Word to AX | FNSTSW AX | DF E0 | FSTSW AX |
colspan="3" | | |||
colspan="3" | x87 Instructions added in 80387{{efn|Not including discontinued instructions specific to particular 80387-compatible FPU models.}} | {{nowrap|Source operand}} {{nowrap|range restriction}} | ||
Floating-point unordered compare. Similar to the regular floating-point compare instruction FCOM, except that it will not produce an exception in response to any qNaN operands. | FUCOM st(i) {{efn|name="387_optarg"|For the FUCOM and FUCOMP instructions, x86 assemblers/disassemblers may recognize variants of the instructions with no arguments. Such variants are equivalent to variants using st(1) as their first argument.}} | DD E0+i | rowspan="4" | no restrictions |
Floating-point unordered compare and pop | FUCOMP st(i) {{efn|name="387_optarg"}} | DD E8+i | |
Floating-point unordered compare to st(1), then pop twice | FUCOMPP | DA E9 | |
IEEE 754 compliant floating-point partial remainder.{{efn|The 80387 FPREM1 instruction differs from the older FPREM (D9 F8) instruction in that the quotient {{mvar|Q}} is rounded to integer with round-to-nearest-even rounding rather than the round-to-zero rounding used by FPREM. Like FPREM, FPREM1 always computes an exact result with no roundoff errors. Like FPREM, it may also perform a partial computation if the quotient is too large, in which case it must be run again.}} | FPREM1 | D9 F5 | |
Floating-point sine and cosine. Computes two values <math>S = \sin(k \cdot st(0))</math> and <math>C = \cos(k \cdot st(0))</math>.{{efn|name="x87_inaccurate_sincos"}} Top-of-stack st(0) is replaced with {{mvar|S}}, after which {{mvar|C}} is pushed onto the stack. | FSINCOS | D9 FB
| rowspan="3" | {{efn|1=If st(0) is finite and its absolute value is or greater, then the top-of-stack value st(0) is left unmodified and C2 is set, with no exception raised. This applies to the | |
Floating-point sine.{{efn|name="x87_inaccurate_sincos"|Due to the x87 FPU performing argument reduction for sin/cos with only about 68 bits of precision, the value of {{mvar|k}} used in the calculation of FSIN , FCOS and FSINCOS is not precisely 1.0, but instead given byBruce Dawson, [https://randomascii.wordpress.com/2014/10/09/intel-underestimates-error-bounds-by-1-3-quintillion/ Intel Underestimates Error Bounds by 1.3 quintillion][https://kib.kiev.ua/x86docs/Intel/SDMs/253665-053.pdf Intel SDM, rev 053] and later, describes the exact argument reduction procedure used for FSIN , FCOS , FSINCOS and FPTAN in volume 1, section 8.3.8This argument reduction inaccuracy also affects the FPTAN instruction.}} | FSIN | D9 FE | |
Floating-point cosine.{{efn|name="x87_inaccurate_sincos"}} | FCOS | D9 FF | |
colspan="3" | | |||
colspan="3" | x87 Instructions added in Pentium Pro | {{nowrap|Condition for}} {{nowrap|conditional moves}} | ||
rowspan="8" | Floating-point conditional move to st(0) based on EFLAGS | FCMOVB st(0),st(i) | DA C0+i | below (CF=1) |
FCMOVE st(0),st(i) | DA C8+i | equal (ZF=1) | |
FCMOVBE st(0),st(i) | DA D0+i | below or equal (CF=1 or ZF=1) | |
FCMOVU st(0),st(i) | DA D8+i | unordered (PF=1) | |
FCMOVNB st(0),st(i) | DB C0+i | not below (CF=0) | |
FCMOVNE st(0),st(i) | DB C8+i | not equal (ZF=0) | |
{{nowrap|FCMOVNBE st(0),st(i) }} | DB D0+i | not below or equal (CF=0 and ZF=0) | |
FCMOVNU st(0),st(i) | DB D8+i | not unordered (PF=0) | |
Floating-point compare and set EFLAGS. Differs from the older FCOM floating-point compare instruction in that it puts its result in the integer EFLAGS register rather than the x87 CC register.{{efn|The FCOMI, FCOMIP, FUCOMI and FUCOMIP instructions write their results to the ZF, CF and PF bits of the EFLAGS register. On Intel but not AMD processors, the SF, AF and OF bits of EFLAGS are also zeroed out by these instructions.}} | FCOMI st(0),st(i) | DB F0+i | rowspan=4 | |
Floating-point compare and set EFLAGS , then pop | FCOMIP st(0),st(i) | DF F0+i | |
Floating-point unordered compare and set EFLAGS | FUCOMI st(0),st(i) | DB E8+i | |
Floating-point unordered compare and set EFLAGS , then pop | {{nowrap|FUCOMIP st(0),st(i) }} | DF E8+i | |
colspan="3" | | |||
colspan="3" | x87 Non-Waiting Instructions added in Pentium II, AMD K7 and SSE{{efn|The FXSAVE and FXRSTOR instructions were added in the "Deschutes" revision of Pentium II, and are not present in earlier "Klamath" revision.They are also present in AMD K7. They are also considered an integral part of SSE and are therefore present in all processors with SSE.}} ! 64-bit mnemonic | |||
Save x87, MMX and SSE state to a 512-byte data structure{{efn|name="fxsave_sse"|The FXSAVE and FXRSTOR instructions will save/restore SSE state only on processors that support SSE. Otherwise, they will only save/restore x87 and MMX state. The x87 section of the state saved/restored by FXSAVE(64)/FXRSTOR(64) has a completely different layout than the data structure of the older F(N)SAVE/FRSTOR instructions, enabling faster save/restore by avoiding misaligned loads and stores. FXSAVE and FXRSTOR require their memory argument to be 16-byte aligned.}}{{efn|name="fxsave_cr0em"|When floating-point emulation is enabled with {{code|CR0.EM{{=}}1}}, FXSAVE(64) and FXRSTOR(64) are considered to be x87 instructions and will accordingly produce an {{mono|#NM}} (device-not-available) exception. Other than WAIT, these are the only opcodes outside the D8..DF ESC opcode space that exhibit this behavior. Except on NetBurst (Pentium 4 family) CPUs, all opcodes in D8..DF will produce {{mono|#NM}} if {{code|CR0.EM{{=}}1}}, even for undefined opcodes that would produce {{mono|#UD}} otherwise.}}{{efn|Unlike the older F(N)SAVE instruction, FXSAVE will not initialize the FPU after saving its state to memory, but instead leave the x87 coprocessor state unmodified.}} | FXSAVE m512byte | {{nowrap|NP 0F AE /0 }} | {{nowrap|FXSAVE64 m512byte }}{{efn|name="fxsave_fcs_fds"|1=The FXSAVE64/FXRSTOR64 instructions differ from the FXSAVE/FXRSTOR instructions in that:
This difference also applies to the later |
Restore x87, MMX and SSE state from a 512-byte data structure{{efn|name="fxsave_sse"}}{{efn|name="fxsave_cr0em"}} | {{nowrap|FXRSTOR m512byte }} | {{nowrap|NP 0F AE /1 }} | {{nowrap|FXRSTOR64 m512byte }}{{efn|name="fxsave_fcs_fds"}} |
colspan="3" | | |||
colspan="3" | x87 Instructions added as part of SSE3 | |||
rowspan="3" | Floating-point store integer and pop, with round-to-zero | FISTTP m16 | DF /1 | rowspan="3" | |
FISTTP m32 | DB /1 | ||
FISTTP m64 | DD /1 |
{{notelist}}
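
The difference in quotient rounding between FPREM (round-to-zero) and FPREM1 (round-to-nearest-even), described in the notes above, can be sketched in Python. This is only a model of the final mathematical result on double-precision values: it does not reproduce the x87 80-bit format, the partial-remainder iteration, or the condition-code outputs. The function names <code>fprem_like</code> and <code>fprem1_like</code> are hypothetical; <code>math.fmod</code> and <code>math.remainder</code> are standard-library functions that use truncated and round-to-nearest-even quotients respectively.

```python
import math


def fprem_like(x, y):
    """Remainder x - Q*y with the quotient Q truncated toward zero,
    the rounding rule the notes above attribute to FPREM."""
    return math.fmod(x, y)


def fprem1_like(x, y):
    """IEEE 754 remainder x - Q*y with Q rounded to nearest, ties to even,
    the rounding rule the notes above attribute to FPREM1."""
    return math.remainder(x, y)


if __name__ == "__main__":
    # 5/3 = 1.67: truncation gives Q=1, round-to-nearest gives Q=2.
    print(fprem_like(5.0, 3.0))   # 2.0
    print(fprem1_like(5.0, 3.0))  # -1.0
    # 7/2 = 3.5 is a tie: round-to-nearest-even picks Q=4.
    print(fprem_like(7.0, 2.0))   # 1.0
    print(fprem1_like(7.0, 2.0))  # -1.0
```

With the truncated quotient the result always carries the sign of the dividend, whereas the IEEE 754 remainder has magnitude at most half of the divisor and may take either sign.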
[[SIMD]] instructions
{{Main|x86 SIMD instruction listings}}
Cryptographic instructions
{{Main|List of x86 cryptographic instructions}}
Virtualization instructions
{{Main|List of x86 virtualization instructions}}
Other instructions
{{See also|List of discontinued x86 instructions}}
x86 also includes discontinued instruction sets which are no longer supported by Intel and AMD, and undocumented instructions which execute but are not officially documented.
=Undocumented x86 instructions=
The x86 CPUs contain undocumented instructions which are implemented on the chips but not listed in some official documents. They are described in various sources across the Internet, such as Ralf Brown's Interrupt List and [https://www.sandpile.org/ sandpile.org].
Some of these instructions are widely available across many/most x86 CPUs, while others are specific to a narrow range of CPUs.
== Undocumented instructions that are widely available across many x86 CPUs include ==
{{sticky header}}
== Undocumented instructions that appear only in a limited subset of x86 CPUs include ==
{{sticky header}}
class="wikitable sortable sticky-header"
! Mnemonics ! Opcodes ! Description ! Status |
REP MUL
| | rowspan=2 | On 8086/8088, a |
REP IMUL
| |
REP IDIV
| | On 8086/8088, a |
SAVEALL ,
| | Exact purpose unknown, causes CPU hang (HCF). The only way out is CPU reset.{{cite web | url = http://www.sandpile.org/post/msgs/20004129.htm | archive-url = https://web.archive.org/web/20041106070621/http://www.sandpile.org/post/msgs/20004129.htm | title = Re: Undocumented opcodes (HINT_NOP) | archive-date = 2004-11-06 | access-date = 2010-11-07 }} In some implementations, emulated through BIOS as a halting sequence.{{cite web | url = http://www.sandpile.org/post/msgs/20003986.htm | archive-url = https://web.archive.org/web/20030626044017/http://www.sandpile.org/post/msgs/20003986.htm | title = Re: Also some undocumented 0Fh opcodes | archive-date = 2003-06-26 | access-date = 2010-11-07 }} In [https://forum.vcfed.org/index.php?threads/i-found-the-saveall-opcode.71519/ a forum post at the Vintage Computing Federation], this instruction (with | Only available on 80286. |
LOADALL
| | Loads All Registers from Memory Address 0x000800 | Only available on 80286. Opcode reused for |
LOADALLD
| | Loads All Registers from Memory Address ES:EDI | Only available on 80386. Opcode reused for |
CL1INVMB
| On the Intel SCC (Single-chip Cloud Computer), invalidate all message buffers. The mnemonic and operation of the instruction, but not its opcode, are described in Intel's SCC architecture specification.Intel Labs, [https://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/intel-labs-single-chip-cloud-architecture-brief.pdf SCC External Architecture Specification (EAS), Revision 0.94], p.29. [https://web.archive.org/web/20220522083931/https://www.intel.com/content/dam/www/public/us/en/documents/technology-briefs/intel-labs-single-chip-cloud-architecture-brief.pdf Archived] on May 22, 2022. | Available on the SCC only. |
PATCH2
| | On AMD K6 and later maps to |Only available in Red unlock state ( |
PATCH3
| | Write uarch | Can change RAM part of microcode on Intel |
UMOV r,r/m ,UMOV r/m,r
| | Moves data to/from user memory when operating in ICE HALT mode.Robert R. Collins, [http://www.rcollins.org/secrets/opcodes/UMOV.html Undocumented OpCodes: UMOV]. [https://web.archive.org/web/20010221221425/http://www.rcollins.org/secrets/opcodes/UMOV.html Archived] on Feb 21, 2001. Acts as regular | Available on some 386 and 486 processors only. Opcodes reused for SSE instructions in later CPUs. |
NXOP
| | NexGen hypercode interface.Herbert Oppmann, [https://www.memotech.franken.de/NexGen/Opcode0F55.html NXOP (Opcode 0Fh 55h)] | Available on NexGen Nx586 only. |
(multiple)
| NexGen Nx586 "hyper mode" instructions. The NexGen Nx586 CPU uses "hyper code"Herbert Oppmann, [https://www.memotech.franken.de/NexGen/Bios.html Inside the NexGen Nx586 System BIOS]. [https://web.archive.org/web/20231229134905/https://www.memotech.franken.de/NexGen/Bios.html Archived] on 29 Dec 2023. (x86 code sequences unpacked at boot time and only accessible in a special "hyper mode" operation mode, similar to DEC Alpha's PALcode and Intel's XuCodeIntel, [https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/secure-coding/xucode-implementing-complex-instruction-flows.html XuCode: An Innovative Technology for Implementing Complex Instruction Flows], May 6, 2021. [https://archive.today/20220719155812/https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/secure-coding/xucode-implementing-complex-instruction-flows.html Archived] on Jul 19, 2022.) for many complicated operations that are implemented with microcode in most other x86 CPUs. The Nx586 provides a large number of undocumented instructions to assist hyper mode operation. | Available in Nx586 hyper mode only. |
{{nowrap|PSWAPW mm,mm/m64 }}
| | Undocumented AMD 3DNow! instruction on K6-2 and K6-3. Swaps 16-bit words within 64-bit MMX register.Grzegorz Mazur, [https://web.archive.org/web/20000121143428/http://x86.ddj.com/articles/3dnow/amd_3dnow.htm AMD 3DNow! undocumented instructions]{{cite web |url=http://grafi.ii.pw.edu.pl/gbm/x86/3dundoc.html |title=Undocumented 3DNow! Instructions |website=grafi.ii.pw.edu.pl |access-date=22 February 2022 |archive-url=https://web.archive.org/web/20030130030723/http://grafi.ii.pw.edu.pl/gbm/x86/3dundoc.html |archive-date=30 January 2003 |url-status=dead}} Instruction known to be recognized by MASM 6.13 and 6.14. | Available on K6-2 and K6-3 only. Opcode reused for documented |
{{Unknown}} mnemonic
| | Using the | Available on the UMC Green CPU only. Executes as |
FS: Jcc
| {{nowrap| | On Intel NetBurst (Pentium 4) CPUs, the 64h (FS: segment) instruction prefix will, when used with conditional branch instructions, act as a branch hint to indicate that the branch will be alternating between taken and not-taken.Agner Fog, [https://www.agner.org/optimize/microarchitecture.pdf The Microarchitecture of Intel, AMD and VIA CPUs], section 3.4 "Branch Prediction in P4 and P4E". [https://web.archive.org/web/20240107010216/https://www.agner.org/optimize/microarchitecture.pdf Archived] on 7 Jan 2024. Unlike other NetBurst branch hints (CS: and DS: segment prefixes), this hint is not documented. | Available on NetBurst CPUs only. Segment prefixes on conditional branches are accepted but ignored by non-NetBurst CPUs. |
JMPAI
| | Jump and execute instructions in the undocumented Alternate Instruction Set. | Only available on some x86 processors made by VIA Technologies. |
(FMA4)
| | On AMD Zen1, FMA4 instructions are present but undocumented (missing CPUID flag). The reason for leaving the feature undocumented may or may not have been due to a buggy implementation.Reddit /r/Amd discussion thread: [https://www.reddit.com/r/Amd/comments/68s4bj/ryzen_has_undocumented_support_for_fma4/dh0y353/ Ryzen has undocumented support for FMA4] | Removed from Zen2 onwards. |
{{unknown|{{wrap|(unknown, multiple)}}}}
| | The whitepapers for SandSifter and UISFuzz report the detection of large numbers of undocumented instructions in the 3DNow! opcode range on several different AMD CPUs (at least Geode NX and C-50). Their operation is not known. On at least AMD K6-2, all of the unassigned 3DNow! opcodes (other than the undocumented | Present on some AMD CPUs with 3DNow!. |
MOVDB ,
| {{unknown}} | Microprocessor Report's article "MediaGX Targets Low-Cost PCs" from 1997, covering the introduction of the Cyrix MediaGX processor, lists several new instructions that are said to have been added to this processor in order to support its new "Virtual System Architecture" features, including | {{unknown|{{wrap|Unknown. No specification known to have been published.}}}} |
colspan=5 | |
---|
REP XSHA512
| {{nowrap| | Perform SHA-512 hashing. Supported by OpenSSL{{Cite web|url=https://github.com/openssl/openssl/blob/1aa89a7a3afb053d0c0b7fad8d3ea1b0a5447289/engines/asm/e_padlock-x86.pl#L597|title=Welcome to the OpenSSL Project|website=GitHub|date=21 April 2022|archive-url=https://web.archive.org/web/20220104214039/https://github.com/openssl/openssl/blob/1aa89a7a3afb053d0c0b7fad8d3ea1b0a5447289/engines/asm/e_padlock-x86.pl#L597|archive-date=4 Jan 2022|url-status=live}} as part of its VIA PadLock support, and listed in a Zhaoxin-supplied Linux kernel patch,LKML, [https://lore.kernel.org/lkml/20230802110741.4077-1-TonyWWang-oc@zhaoxin.com/ (PATCH) crypto: Zhaoxin: Hardware Engine Driver for SHA1/256/384/512], 2 Aug 2023. [https://web.archive.org/web/20240117024338/https://lore.kernel.org/lkml/20230802110741.4077-1-TonyWWang-oc@zhaoxin.com/ Archived] on 17 Jan 2024. but not documented by the [https://web.archive.org/web/20100526054140/http://linux.via.com.tw/support/beginDownload.action?eleid=181&fid=261 VIA PadLock Programming Guide]. | rowspan="4" | Only available on some x86 processors made by VIA Technologies and Zhaoxin. |
REP XMODEXP
| | rowspan=2 | Instructions to perform modular exponentiation and random number generation, respectively. Listed in a VIA-supplied patch to add support for VIA Nano-specific PadLock instructions to OpenSSL,Kary Jin, [https://marc.info/?l=openssl-dev&m=130767391615291&w=2 PATCH: Update PadLock engine for VIA C7 and Nano CPUs], openssl-dev mailing list, 10 Jun 2011. [https://web.archive.org/web/20220211130841/https://marc.info/?l=openssl-dev&m=130767391615291&w=2 Archived] on 11 Feb 2022. but not documented by the VIA PadLock Programming Guide. |
XRNG2
| |
{{Unknown}} mnemonic
| {{nowrap| | {{unknown|{{wrap|Detected by CPU fuzzing tools such as SandSifterChristopher Domas, [https://raw.githubusercontent.com/xoreaxeaxeax/sandsifter/dff63246fed84d90118441b8ba5b5d3bdd094427/references/domas_breaking_the_x86_isa_wp.pdf Breaking the x86 ISA], 27 July 2017. [https://web.archive.org/web/20231227000052/https://raw.githubusercontent.com/xoreaxeaxeax/sandsifter/dff63246fed84d90118441b8ba5b5d3bdd094427/references/domas_breaking_the_x86_isa_wp.pdf Archived] on 27 Dec 2023. and UISFuzzXixing Li et al, [https://ieeexplore.ieee.org/abstract/document/8863327 UISFuzz: An Efficient Fuzzing Method for CPU Undocumented Instruction Searching], 9 Oct 2019. [https://archive.today/20231227000943/https://ieeexplore.ieee.org/document/8863327 Archived] on 27 Dec 2023. as executing without causing #UD on several different VIA and Zhaoxin CPUs. Unknown operation, may be related to the documented |
{{Unknown}} mnemonic
| | Zhaoxin SM2 instruction. CPUID flags are listed in a Linux kernel patch for OpenEuler; a description and opcode (but no instruction mnemonic) are provided in a Zhaoxin patent applicationUSPTO/Zhaoxin, [https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20230066718 Patent application US2023/0066718: Processor with a hash cryptographic algorithm and data processing thereof], pages 13 and 45, Mar 2, 2023. [https://web.archive.org/web/20230912063311/https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20230066718 Archived] on Sep 12, 2023. and a Zhaoxin-provided Linux kernel patch.LKML, [https://lore.kernel.org/lkml/20231109094744.545887-1-LeoLiu-oc@zhaoxin.com/t/#u (PATCH) crypto: x86/sm2 -add Zhaoxin SM2 algorithm implementation], 11 Nov 2023. [https://web.archive.org/web/20240117023414/https://lore.kernel.org/lkml/20231109094744.545887-1-LeoLiu-oc@zhaoxin.com/t/#u Archived] on 17 Jan 2024. |
ZXPAUSE
| | Pause the processor until the Time Stamp Counter reaches or exceeds the value specified in EDX:EAX. A low-power processor C-state can be requested in ECX. Listed in an OpenEuler kernel patch.OpenEuler kernel [https://gitee.com/openeuler/kernel/pulls/2602/files pull request 2602: x86/delay: add support for Zhaoxin ZXPAUSE instruction]. Gitee. 26 Oct 2023. [https://web.archive.org/web/20240122224925/https://gitee.com/openeuler/kernel/pulls/2602/files Archived] on 22 Jan 2024. | Present in Zhaoxin KX-7000. |
MONTMUL2
| {{unknown}} | Zhaoxin RSA/"xmodx" instructions. Mnemonics and CPUID flags are listed in a Linux kernel patch for OpenEuler,OpenEuler mailing list, [https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/W6GXBRRO6OKNHVJ3WDDUXSLQGI2GFU4X/ PATCH kernel-4.19 v2 5/6 : x86/cpufeatures: Add Zhaoxin feature bits]. [https://web.archive.org/web/20220409071314/https://mailweb.openeuler.org/hyperkitty/list/kernel@openeuler.org/thread/W6GXBRRO6OKNHVJ3WDDUXSLQGI2GFU4X/ Archived] on 9 Apr 2022. but opcodes and instruction descriptions are not available. | {{Unknown|{{wrap|1=Unknown. Some Zhaoxin CPUsInstLatx64, [http://users.atw.hu/instlatx64/CentaurHauls/CentaurHauls00307B2_KX6000_01_CPUID.txt CPUID dump for Zhaoxin KaiXian KX-6000G] – has the SM2 and xmodx feature bits set (CPUID leaf C0000001:EDX:bits 0 and 29). [https://web.archive.org/web/20230725214628/http://users.atw.hu/instlatx64/CentaurHauls/CentaurHauls00307B2_KX6000_01_CPUID.txt Archived] on Jul 25, 2023. have the CPUID flags for these instructions set.}}}} |
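Two details in the table above lend themselves to a short illustration: the CPUID feature bits reported for the SM2 and xmodx instructions (leaf C0000001, EDX bits 0 and 29), and the 64-bit Time Stamp Counter deadline that ZXPAUSE takes in EDX:EAX. The Python sketch below is not taken from any vendor documentation. The function names are hypothetical, and the assignment of bit 0 to SM2 and bit 29 to xmodx is an assumption read off the order in which the CPUID dump note lists them.

```python
from typing import Dict, Tuple

# CPUID leaf named in the CPUID dump note for the Zhaoxin KX-6000G.
CENTAUR_FEATURE_LEAF = 0xC0000001

# ASSUMPTION: bit 0 = SM2 and bit 29 = xmodx, following the order in which
# the note lists the flags ("SM2 and xmodx ... bits 0 and 29").
SM2_BIT = 0
XMODX_BIT = 29


def decode_zhaoxin_flags(edx: int) -> Dict[str, bool]:
    """Report which of the two assumed feature bits are set in an EDX value."""
    return {
        "sm2": bool((edx >> SM2_BIT) & 1),
        "xmodx": bool((edx >> XMODX_BIT) & 1),
    }


def split_tsc_deadline(deadline: int) -> Tuple[int, int]:
    """Split a 64-bit TSC deadline into the (EDX, EAX) register halves."""
    if not 0 <= deadline < (1 << 64):
        raise ValueError("deadline must fit in 64 bits")
    high = (deadline >> 32) & 0xFFFFFFFF  # upper 32 bits go in EDX
    low = deadline & 0xFFFFFFFF           # lower 32 bits go in EAX
    return high, low
```

The helpers work on plain integers only. Obtaining a real EDX value requires executing CPUID on the target processor, which is outside the scope of this sketch.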
= Undocumented x87 instructions =
class="wikitable sortable"
! Mnemonics ! Opcodes ! Description ! Status |
FENI ,
| | FPU Enable Interrupts (8087) | rowspan="3" | Documented for the Intel 80287. Present on all Intel x87 FPUs from 80287 onwards. For FPUs other than the ones on which they were introduced (8087 for FENI/FDISI and 80287 for FSETPM), they act as NOPs. These instructions and their operation on modern CPUs are commonly mentioned in later Intel documentation, but with opcodes omitted and opcode table entries left blank (e.g. [https://cdrdv2-public.intel.com/671200/325462-sdm-vol-1-2abcd-3abcd.pdf Intel SDM 325462-077, April 2022] mentions them twice without opcodes). The opcodes are, however, recognized by Intel XED.[https://github.com/intelxed/xed/blob/ef19f00de14a9c2c253c1c9b1119e1617280e3f2/datafiles/xed-isa.txt#L916 ISA datafile for Intel XED] (April 17, 2022), lines 916-944 |
FDISI ,
| | FPU Disable Interrupts (8087) |
FSETPM ,
| | FPU Set Protected Mode (80287) |
(no mnemonic)
| {{nowrap| | "Reserved by Cyrix" opcodes | These opcodes are listed as reserved opcodes that will produce "unpredictable results" without generating exceptions on at least Cyrix 6x86,[https://www.ardent-tool.com/CPU/docs/Cyrix/6x86/94175.pdf Cyrix 6x86 processor data book], page 6-34 6x86MX, MII, MediaGX, and AMD Geode GX/LX.[https://www.amd.com/system/files/TechDocs/33234H_LX_databook.pdf AMD Geode LX Processors Data Book], publication 33234H, p.670 (The documentation for each of these CPUs lists the same ten opcodes.) Their actual operation is not known, nor is it known whether their operation is the same on all of these CPUs. |
References
{{reflist}}
- {{cite web |title=Intel 64 and IA-32 Architectures Software Developer's Manual, Combined Volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D and 4 |url=https://cdrdv2.intel.com/v1/dl/getContent/671200 |website=Intel |date=April 2022 |access-date=21 June 2022 |author=Intel Corporation}}
External links
{{Wikibooks|x86 Assembly|X86 Instructions|X86 Instructions}}
- [https://software.intel.com/en-us/articles/intel-sdm Free IA-32 and x86-64 documentation], provided by Intel
- [https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/programmer-references/40332.pdf AMD64 Architecture Programmer's Manual, Volumes 1-5], provided by AMD
- [http://ref.x86asm.net/ x86 Opcode and Instruction Reference]
- [https://www.felixcloutier.com/x86/index.html x86 and amd64 instruction reference]
- [https://www.agner.org/optimize/instruction_tables.pdf Instruction tables: Lists of instruction latencies, throughputs and micro-operation breakdowns for Intel, AMD and VIA CPUs]
- [https://www.nasm.us/doc/nasmdocf.html Netwide Assembler Instruction List] (from Netwide Assembler)
{{x86 assembly topics}}