Below is the full 8086/8088 instruction set of Intel (81 instructions total).[2] These instructions are also available in 32-bit mode, in which they operate on 32-bit registers (eax, ebx, etc.) and values instead of their 16-bit (ax, bx, etc.) counterparts. The updated instruction set is grouped according to architecture (i186, i286, i386, i486, i586/i686) and is referred to as (32-bit) x86 and (64-bit) x86-64 (also known as AMD64).
Original 8086/8088 instructions
This is the original instruction set. In the 'Notes' column, r means register, m means memory address and imm means immediate (i.e. a value).
The 8086/8088 datasheet documents only the base-10 version of the AAD instruction (opcode 0xD5 0x0A), but any other base will work. Intel's later documentation includes the generic form as well. NEC V20 and V30 (and possibly other NEC V-series CPUs) always use base 10 and ignore the argument, causing a number of incompatibilities.
0xD5
AAM
ASCII adjust AX after multiplication
Only the base-10 version (operand 0x0A) is documented; see the notes for AAD.
JA, JAE, JB, JBE, JC (same as JB), JE, JG, JGE, JL, JLE, JNA (same as JBE), JNAE (same as JB), JNB (same as JAE), JNBE, JNC (same as JAE), JNE, JNG (same as JLE), JNGE (same as JL), JNL (same as JGE), JNLE (same as JG), JNO, JNP, JNS, JNZ (same as JNE), JO, JP, JPE (same as JP), JPO (same as JNP), JS, JZ (same as JE)[3]
0x70...0x7F
JCXZ
Jump if CX is zero
JECXZ for ECX instead of CX in 32-bit mode (same opcode).
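The generic-base behaviour of AAD and AAM noted above can be sketched in Python. This is a minimal model under stated assumptions, not an emulator: the function names `aad` and `aam` are illustrative, AX is passed and returned as a plain integer, and flag updates are not modeled.

```python
def aad(ax: int, base: int = 10) -> int:
    """Model AAD (D5 ib): AL = (AL + AH * base) mod 256, AH = 0."""
    al = ax & 0xFF
    ah = (ax >> 8) & 0xFF
    # AH is cleared, so the resulting AX equals the new AL.
    return (al + ah * base) & 0xFF


def aam(ax: int, base: int = 10) -> int:
    """Model AAM (D4 ib): AH = AL div base, AL = AL mod base."""
    if base == 0:
        # On real hardware an immediate of 0 raises a divide error (#DE).
        raise ZeroDivisionError("AAM with an immediate of 0 raises #DE")
    al = ax & 0xFF
    return ((al // base) << 8) | (al % base)
```

With the default base the functions convert between unpacked BCD digits and a binary value; passing `base=16` treats AH and AL as two hexadecimal digits instead.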
Modifies stack for entry to procedure for high level language. Takes two operands: the amount of storage to be allocated on the stack and the nesting level of the procedure.
INSB/INSW
6C
Input from port to string. May be used with a REP prefix to repeat the instruction CX times.
equivalent to:
    IN AL, DX
    MOV ES:[DI], AL
    INC DI ; adjust DI according to operand size and DF
6D
LEAVE
C9
Leave stack frame
Releases the local stack storage created by the previous ENTER instruction.
OUTSB/OUTSW
6E
Output string to port. May be used with a REP prefix to repeat the instruction CX times.
equivalent to:
    MOV AL, DS:[SI]
    OUT DX, AL
    INC SI ; adjust SI according to operand size and DF
6F
POPA
61
Pop all general purpose registers from stack
equivalent to:
    POP DI
    POP SI
    POP BP
    POP AX ; no POP SP here, all it does is ADD SP, 2 (since AX will be overwritten later)
    POP BX
    POP DX
    POP CX
    POP AX
PUSHA
60
Push all general purpose registers onto stack
equivalent to:
    PUSH AX
    PUSH CX
    PUSH DX
    PUSH BX
    PUSH SP ; The value stored is the initial SP value
    PUSH BP
    PUSH SI
    PUSH DI
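The PUSHA and POPA sequences above can be checked against a small Python model. This is only a sketch: the register file is a plain dictionary, memory is a dictionary keyed by word address, and the helper names `pusha` and `popa` are assumptions made for illustration.

```python
PUSH_ORDER = ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI")


def pusha(regs: dict, mem: dict) -> None:
    """Push all eight 16-bit registers; the SP slot holds the initial SP."""
    initial_sp = regs["SP"]
    for name in PUSH_ORDER:
        value = initial_sp if name == "SP" else regs[name]
        regs["SP"] = (regs["SP"] - 2) & 0xFFFF
        mem[regs["SP"]] = value


def popa(regs: dict, mem: dict) -> None:
    """Pop in reverse order; the stored SP value is read but discarded."""
    for name in reversed(PUSH_ORDER):
        value = mem[regs["SP"]]
        regs["SP"] = (regs["SP"] + 2) & 0xFFFF
        if name != "SP":
            regs[name] = value
```

Running `pusha` followed by `popa` on the same dictionaries restores every register, including SP, because the eight pops add back the 16 bytes that the eight pushes subtracted.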
PUSH immediate
6A ib
Push an immediate byte/word value onto the stack
example:
    PUSH 12h
    PUSH 1200h
68 iw
IMUL immediate
6B /r ib
Signed and unsigned multiplication of immediate byte/word value
Note that since the lower half is the same for unsigned and signed multiplication, this version of the instruction can be used for unsigned multiplication as well.
69 /r iw
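The note on IMUL immediate, that the lower half of the product is the same for signed and unsigned multiplication, can be illustrated with a short Python check. The helper names below are hypothetical, and the sketch assumes 16-bit operands.

```python
def to_signed16(x: int) -> int:
    """Interpret the low 16 bits of x as a two's-complement value."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


def imul16_low(a: int, b: int) -> int:
    """Low 16 bits of the signed product of two 16-bit values."""
    return (to_signed16(a) * to_signed16(b)) & 0xFFFF


def mul16_low(a: int, b: int) -> int:
    """Low 16 bits of the unsigned product of two 16-bit values."""
    return ((a & 0xFFFF) * (b & 0xFFFF)) & 0xFFFF
```

The two functions agree for every pair of inputs, because signed and unsigned interpretations of the operands differ only by multiples of 2^16, which vanish when the product is truncated to 16 bits.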
SHL/SHR/SAL/SAR/ROL/ROR/RCL/RCR immediate
C0
Rotate/shift bits with an immediate value greater than 1
Load IDTR (Interrupt Descriptor Table Register) from memory.[b] The IDTR controls not just the address/size of the IDT (Interrupt Descriptor Table) in protected mode, but the IVT (Interrupt Vector Table) in real mode as well.
LMSW r/m16
0F 01 /6
Load MSW (Machine Status Word) from 16-bit register or memory.[c][d]
CLTS
0F 06
Clear task-switched flag in the MSW.
LLDT r/m16
0F 00 /2
Load LDTR (Local Descriptor Table Register) from 16-bit register or memory.[b]
Load access rights byte from the specified segment descriptor. Reads bytes 4-7 of segment descriptor, bitwise-ANDs it with 0x00FxFF00,[i] then stores the bottom 16/32 bits of the result in destination register. Sets EFLAGS.ZF=1 if the descriptor could be loaded, ZF=0 otherwise.[j]
#UD
LSL r,r/m16
0F 03 /r
Load segment limit from the specified segment descriptor. Sets ZF=1 if the descriptor could be loaded, ZF=0 otherwise.[j]
VERR r/m16
0F 00 /4
Verify a segment for reading. Sets ZF=1 if segment can be read, ZF=0 otherwise.
VERW r/m16
0F 00 /5
Verify a segment for writing. Sets ZF=1 if segment can be written, ZF=0 otherwise.[k]
Store all CPU registers to a 102-byte data structure starting at physical address 800h, then shut down CPU.
^ abcdThe descriptors used by the LGDT, LIDT, SGDT and SIDT instructions consist of a 2-part data structure. The first part is a 16-bit value, specifying table size in bytes minus 1. The second part is a 32-bit value (64-bit value in 64-bit mode), specifying the linear start address of the table. For LGDT and LIDT with a 16-bit operand size, the address is ANDed with 00FFFFFFh.
On Intel (but not AMD) CPUs, the SGDT and SIDT instructions with a 16-bit operand size are – as of Intel SDM revision 079, March 2023 – documented to write a descriptor to memory with the last byte being set to 0. However, observed behavior is that bits 31:24 of the descriptor table address are written instead.[4]
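The two-part layout described above (a 16-bit size-minus-1 field followed by a 32-bit linear address) can be sketched with Python's `struct` module. The function names are illustrative, and the sketch covers only the 6-byte form used outside 64-bit mode.

```python
import struct


def pack_pseudo_descriptor(base: int, size_in_bytes: int) -> bytes:
    """Build the 6-byte operand read by LGDT/LIDT (little-endian)."""
    return struct.pack("<HI", size_in_bytes - 1, base)


def load_table_register(blob: bytes, operand_size: int = 32) -> tuple:
    """Model what LGDT/LIDT load: returns (limit, base)."""
    limit, base = struct.unpack("<HI", blob)
    if operand_size == 16:
        # With a 16-bit operand size the address is ANDed with 00FFFFFFh.
        base &= 0x00FFFFFF
    return limit, base
```

For example, a 2048-byte table at linear address 12345678h is described by a limit of 7FFh; loading the same bytes with a 16-bit operand size yields the base 345678h.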
^ abcdThe LGDT, LIDT, LLDT and LTR instructions are serializing on Pentium and later processors.
^The LMSW instruction is serializing on Intel processors from Pentium onwards, but not on AMD processors.
^On 80386 and later, the "Machine Status Word" is the same as the CR0 control register – however, the LMSW instruction can only modify the bottom 4 bits of this register and cannot clear bit 0. The inability to clear bit 0 means that LMSW can be used to enter but not leave x86 Protected Mode. On 80286, it is not possible to leave Protected Mode at all (neither with LMSW nor with LOADALL[5]) without a CPU reset – on 80386 and later, it is possible to leave Protected Mode, but this requires the use of the 80386-and-later MOV to CR0 instruction.
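The LMSW restriction described in this note can be expressed as a small Python model. This is a sketch of the documented behaviour on 80386 and later, with `lmsw` as a hypothetical helper that takes the current CR0 value and the 16-bit source operand.

```python
def lmsw(cr0: int, src16: int) -> int:
    """Return the new CR0 after LMSW: only bits 3:0 change, PE is sticky."""
    low4 = src16 & 0xF
    # Bit 0 (PE) can be set by LMSW but never cleared.
    pe = (cr0 | low4) & 1
    new_low = (low4 & 0xE) | pe
    return (cr0 & ~0xF) | new_low
```

Calling `lmsw` with a source value of 0 on a CR0 that already has bit 0 set leaves bit 0 set, which is why leaving Protected Mode requires MOV to CR0 instead.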
^If CR4.UMIP=1 is set, then the SGDT, SIDT, SLDT, SMSW and STR instructions can only run in Ring 0. These instructions were unprivileged on all x86 CPUs from 80286 onwards until the introduction of UMIP in 2017.[6]
This has been a significant security problem for software-based virtualization, since it enables these instructions to be used by a VM guest to detect that it is running inside a VM.[7][8]
^ abcThe SMSW, SLDT and STR instructions always use an operand size of 16 bits when used with a memory argument. With a register argument on 80386 or later processors, wider destination operand sizes are available and behave as follows:
SMSW: Stores full CR0 in x86-64 long mode, undefined otherwise.
SLDT: Zero-extends 16-bit argument on Pentium Pro and later processors, undefined on earlier processors.
STR: Zero-extends 16-bit argument.
^In 64-bit long mode, the ARPL instruction is not available – the 63 /r opcode has been reassigned to the 64-bit-mode-only MOVSXD instruction.
^The ARPL instruction causes #UD in Real mode and Virtual 8086 Mode – Windows 95 and OS/2 2.x are known to make extensive use of this #UD to use the 63 opcode as a one-byte breakpoint to transition from Virtual 8086 Mode to kernel mode.[9][10]
^Bits 19:16 of this mask are documented as "undefined" on Intel CPUs.[11] On AMD CPUs, the mask is documented as 0x00FFFF00.
^ abFor the LAR and LSL instructions, if the specified segment descriptor could not be loaded, then the instruction's destination register is left unmodified.
^On some Intel CPU/microcode combinations from 2019 onwards, the VERW instruction also flushes microarchitectural data buffers. This enables it to be used as part of workarounds for Microarchitectural Data Sampling security vulnerabilities.[12][13] Some of the microarchitectural buffer-flushing functions that have been added to VERW may require the instruction to be executed with a memory operand.[14]
^ abUndocumented, 80286 only.[5][15][16] (A different variant of LOADALL with a different opcode and memory layout exists on 80386.)
The 80386 added support for 32-bit operation to the x86 instruction set. This was done by widening the general-purpose registers to 32 bits and introducing the concepts of OperandSize and AddressSize – most instruction forms that would previously take 16-bit data arguments were given the ability to take 32-bit arguments by setting their OperandSize to 32 bits, and instructions that could take 16-bit address arguments were given the ability to take 32-bit address arguments by setting their AddressSize to 32 bits. (Instruction forms that work on 8-bit data continue to be 8-bit regardless of OperandSize. Using a data size of 16 bits will cause only the bottom 16 bits of the 32-bit general-purpose registers to be modified – the top 16 bits are left unchanged.)
The default OperandSize and AddressSize to use for each instruction is given by the D bit of the segment descriptor of the current code segment - D=0 makes both 16-bit, D=1 makes both 32-bit. Additionally, they can be overridden on a per-instruction basis with two new instruction prefixes that were introduced in the 80386:
66h: OperandSize override. Will change OperandSize from 16-bit to 32-bit if CS.D=0, or from 32-bit to 16-bit if CS.D=1.
67h: AddressSize override. Will change AddressSize from 16-bit to 32-bit if CS.D=0, or from 32-bit to 16-bit if CS.D=1.
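The interaction between CS.D and the two override prefixes can be summarised in a few lines of Python. This is a minimal sketch for 16/32-bit code segments only (64-bit mode is not modeled), and `effective_sizes` is a name chosen for illustration.

```python
def effective_sizes(cs_d: int, prefixes: bytes) -> tuple:
    """Return (OperandSize, AddressSize) in bits for one instruction."""
    default = 32 if cs_d else 16
    toggled = 16 if cs_d else 32
    operand = toggled if 0x66 in prefixes else default
    address = toggled if 0x67 in prefixes else default
    return operand, address
```

For instance, in a 16-bit code segment (D=0) an instruction prefixed with 66h runs with a 32-bit OperandSize while keeping 16-bit addressing.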
The 80386 also introduced the two new segment registers FS and GS as well as the x86 control, debug and test registers.
The new instructions introduced in the 80386 can broadly be subdivided into two classes:
Pre-existing opcodes that needed new mnemonics for their 32-bit OperandSize variants (e.g. CWDE, LODSD)
New opcodes that introduced new functionality (e.g. SHLD, SETcc)
For instruction forms where the operand size can be inferred from the instruction's arguments (e.g. ADD EAX,EBX can be inferred to have a 32-bit OperandSize due to its use of EAX as an argument), new instruction mnemonics are not needed and not provided.
80386: new instruction mnemonics for 32-bit variants of older opcodes
32-bit interrupt return. Differs from the older 16-bit IRET instruction in that it will pop interrupt return items (EIP,CS,EFLAGS; also ESP[j] and SS if there is a CPL change; and also ES,DS,FS,GS if returning to virtual 8086 mode) off the stack as 32-bit items instead of 16-bit items. Should be used to return from interrupts when the interrupt handler was entered through a 32-bit IDT interrupt/trap gate.
Instruction is serializing.
IRET
^For the 32-bit string instructions, the ±± notation is used to indicate that the indicated register is post-decremented by 4 if EFLAGS.DF=1 and post-incremented by 4 otherwise. For the operands where the DS segment is indicated, the DS segment can be overridden by a segment-override prefix – where the ES segment is indicated, the segment is always ES and cannot be overridden. The choice of whether to use the 16-bit SI/DI registers or the 32-bit ESI/EDI registers as the address registers to use is made by AddressSize, overridable with the 67 prefix.
^The 32-bit string instructions accept repeat-prefixes in the same way as older 8/16-bit string instructions. For LODSD, STOSD, MOVSD, INSD and OUTSD, the REP prefix (F3) will repeat the instruction the number of times specified in rCX (CX or ECX, decided by AddressSize), decrementing rCX for each iteration (with rCX=0 resulting in no-op and proceeding to the next instruction). For CMPSD and SCASD, the REPE (F3) and REPNE (F2) prefixes are available, which will repeat the instruction, decrementing rCX for each iteration, but only as long as the flag condition (ZF=1 for REPE, ZF=0 for REPNE) holds true AND rCX ≠ 0.
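The termination rule for REPE and REPNE described above can be modeled in Python. The sketch below is an assumption-laden simplification: it compares two Python lists element by element in the forward direction (DF=0), and `rep_cmps` is a hypothetical helper that returns the remaining count, the final ZF and the number of elements compared.

```python
def rep_cmps(src, dst, rcx: int, zf: bool, repne: bool = False) -> tuple:
    """Model REPE/REPNE CMPS: returns (rcx, zf, elements_compared)."""
    i = 0
    while rcx != 0:
        zf = src[i] == dst[i]   # CMPS sets ZF from the comparison
        i += 1
        rcx -= 1
        # REPE stops once ZF=0; REPNE stops once ZF=1.
        if zf == repne:
            break
    return rcx, zf, i
```

If rCX is zero on entry, the loop body never runs, so no comparison is made and ZF keeps its previous value.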
^For the INSB/W/D instructions, the memory access rights for the ES:[rDI] memory address might not be checked until after the port access has been performed – if this check fails (e.g. page fault or other memory exception), then the data item read from the port is lost. As such, it is not recommended to use this instruction to access an I/O port that performs any kind of side effect upon read.
^The CWDE instruction differs from the older CWD instruction in that CWD would sign-extend the 16-bit value in AX into a 32-bit value in the DX:AX register pair, whereas CWDE sign-extends AX into the 32-bit EAX register.
^For the E3 opcode (JCXZ/JECXZ), the choice of whether the instruction will use CX or ECX for its comparison (and consequently which mnemonic to use) is based on the AddressSize, not OperandSize. (OperandSize instead controls whether the jump destination should be truncated to 16 bits or not). This also applies to the loop instructions LOOPNE, LOOPE and LOOP (opcodes E0, E1 and E2 respectively). Unlike JCXZ/JECXZ, however, these instructions have not been given new mnemonics for their ECX-using variants.
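The difference between the two sign-extension instructions can be shown with a short Python sketch. The function names mirror the mnemonics but are otherwise illustrative; registers are modeled as plain integers.

```python
def cwd(ax: int) -> tuple:
    """CWD: sign-extend AX into DX:AX. Returns (DX, AX)."""
    ax &= 0xFFFF
    dx = 0xFFFF if ax & 0x8000 else 0x0000
    return dx, ax


def cwde(ax: int) -> int:
    """CWDE: sign-extend AX into EAX. Returns EAX."""
    ax &= 0xFFFF
    return ax | 0xFFFF0000 if ax & 0x8000 else ax
```

Both functions produce the same 32-bit value; they differ only in whether the upper 16 bits end up in DX or in the top half of EAX.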
^For PUSHA(D), the value of SP/ESP pushed onto the stack is the value it had just before the PUSHA(D) instruction started executing.
^For POPA/POPAD, the stack item corresponding to SP/ESP is popped off the stack (performing a memory read), but not placed into SP/ESP.
^The PUSHFD and POPFD instructions will cause a #GP exception if executed in virtual 8086 mode if IOPL is not 3. The PUSHF, POPF, IRET and IRETD instructions will cause a #GP exception if executed in Virtual-8086 mode if IOPL is not 3 and VME is not enabled.
^If IRETD is used to return from kernel mode to user mode (which will entail a CPL change) and the user-mode stack segment indicated by SS is a 16-bit segment, then the IRETD instruction will only restore the low 16 bits of the stack pointer (ESP/RSP), with the remaining bits keeping whatever value they had in kernel code before the IRETD. This has necessitated complex workarounds on both Linux ("ESPFIX")[17] and Windows.[18] This issue also affects the later 64-bit IRETQ instruction.
If the first argument to the instruction is a register operand and/or the second argument is an immediate, then the bit-index in the second argument is taken modulo operand size (16/32/64, in effect using only the bottom 4, 5 or 6 bits of the index.)
If the first argument is a memory operand and the second argument is a register operand, then the bit-index in the second argument is used in full – it is interpreted as a signed bit-index that is used to offset the memory address to use for the bit test.
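The two bit-index rules above can be contrasted in Python. This is a sketch under simplifying assumptions: memory is a `bytes` object addressed little-endian, the selected bit is located by byte rather than by operand-sized unit (which picks the same bit on a little-endian machine), and the helper names are hypothetical.

```python
def bt_register(value: int, index: int, operand_size: int) -> int:
    """Register destination: the index is taken modulo the operand size."""
    return (value >> (index % operand_size)) & 1


def bt_memory(mem: bytes, base: int, index: int, operand_size: int) -> int:
    """Memory destination with a register index: signed bit offset."""
    index &= (1 << operand_size) - 1
    if index >> (operand_size - 1):
        index -= 1 << operand_size      # reinterpret as a signed value
    # Floor division and modulo handle negative offsets correctly.
    byte = mem[base + index // 8]
    return (byte >> (index % 8)) & 1
```

With a 16-bit operand size, an index register holding FFFFh selects bit 7 of the byte immediately before the base address, whereas the register form would simply test bit 15.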
^ abcThe BTS, BTC and BTR instructions accept the LOCK (F0) prefix when used with a memory argument – this results in the instruction executing atomically.
^If the F3 prefix is used with the 0F BC /r opcode, then the instruction will execute as TZCNT on systems that support the BMI1 extension. TZCNT differs from BSF in that TZCNT but not BSF is defined to return operand size if the source operand is zero – for other source operand values, they produce the same result (except for flags).
^ abBSF and BSR set the EFLAGS.ZF flag to 1 if the source argument was all-0s and 0 otherwise. If the source argument was all-0s, then the destination register is documented as being left unchanged on AMD processors, but set to an undefined value on Intel processors.
^If the F3 prefix is used with the 0F BD /r opcode, then the instruction will execute as LZCNT on systems that support the ABM or LZCNT extensions. LZCNT produces a different result from BSR for most input values.
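The relationship between the bit-scan instructions and their counting counterparts can be sketched in Python. The functions below are illustrative models only: for a zero source, `bsf` and `bsr` follow the AMD-documented behaviour of leaving the destination unchanged (Intel documents the result as undefined), and only ZF is modeled.

```python
def bsf(src: int, size: int, old_dest: int) -> tuple:
    """Bit scan forward. Returns (dest, ZF)."""
    src &= (1 << size) - 1
    if src == 0:
        return old_dest, 1
    return (src & -src).bit_length() - 1, 0


def bsr(src: int, size: int, old_dest: int) -> tuple:
    """Bit scan reverse. Returns (dest, ZF)."""
    src &= (1 << size) - 1
    if src == 0:
        return old_dest, 1
    return src.bit_length() - 1, 0


def tzcnt(src: int, size: int) -> int:
    """Count trailing zeros; returns the operand size for a zero source."""
    src &= (1 << size) - 1
    return size if src == 0 else (src & -src).bit_length() - 1


def lzcnt(src: int, size: int) -> int:
    """Count leading zeros; returns the operand size for a zero source."""
    src &= (1 << size) - 1
    return size - src.bit_length()
```

For non-zero inputs `tzcnt` matches `bsf`, while `lzcnt` returns `size - 1 - bsr`, which is why code that relies on BSR cannot transparently run as LZCNT.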
^ abFor SHLD and SHRD, the shift-amount is masked – the bottom 5 bits are used for 16/32-bit operand size and 6 bits for 64-bit operand size. SHLD and SHRD with 16-bit arguments and a shift-amount greater than 16 produce undefined results. (Actual results differ between different Intel CPUs, with at least three different behaviors known.[19])
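The shift-amount masking described above can be illustrated with a Python model of SHLD. This is a sketch, not a reference implementation: `shld` is a hypothetical helper, flags are not modeled, and the undefined case (a masked count greater than the operand size) is rejected rather than guessed at.

```python
def shld(dest: int, src: int, count: int, size: int) -> int:
    """Shift dest left by count, filling low bits from the top of src."""
    count &= 0x3F if size == 64 else 0x1F   # 6 bits for 64-bit, else 5
    if count == 0:
        return dest
    if count > size:
        raise ValueError("result is undefined for count > operand size")
    mask = (1 << size) - 1
    combined = ((dest & mask) << size) | (src & mask)
    return ((combined << count) >> size) & mask
```

Note that a 32-bit SHLD with a count of 36 behaves like a count of 4, while a 16-bit SHLD with a count of 17 passes the 5-bit mask unchanged and lands in the undefined range.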
^ abThe condition codes supported for the SETcc and Jcc near instructions (opcodes 0F 9x /0 and 0F 8x respectively, with the x nibble specifying the condition) are:
^For SETcc, while the opcode is commonly specified as /0 – implying that bits 5:3 of the instruction's ModR/M byte should be 000 – modern x86 processors (Pentium and later) ignore bits 5:3 and will execute the instruction as SETcc regardless of the contents of these bits.
^For LFS, LGS and LSS, the size of the offset part of the far pointer is given by operand size – the size of the segment part is always 16 bits. In 64-bit mode, using the REX.W prefix with these instructions will cause them to load a far pointer with a 64-bit offset on Intel but not AMD processors.
^ abcdefFor MOV to/from the CRx, DRx and TRx registers, the reg part of the ModR/M byte is used to indicate CRx/DRx/TRx register and r/m part the general-register.
Uniquely for the MOV CRx/DRx/TRx opcodes, the top two bits of the ModR/M byte are ignored – these opcodes are decoded and executed as if the top two bits of the ModR/M byte are 11b.
^ abcdFor moves to/from the CRx and DRx registers, the operand size is always 64 bits in 64-bit mode and 32 bits otherwise.
^On processors that support global pages (Pentium and later), global page table entries will not be flushed by a MOV to CR3 − instead, these entries can be flushed by toggling the CR4.PGE bit. On processors that support PCIDs, writing to CR3 while PCIDs are enabled will only flush TLB entries belonging to the PCID specified in bits 11:0 of the value written to CR3 (this flush can be suppressed by setting bit 63 of the written value to 1). Flushing pages belonging to other PCIDs can instead be done by toggling the CR4.PGE bit, clearing the CR4.PCIDE bit, or using the INVPCID instruction.
^On processors prior to Pentium, moves to CR0 would not serialize the instruction stream – in part for this reason, it is usually required to perform a far jump[20] immediately after a MOV to CR0 if such a MOV is used to enable/disable protected mode and/or memory paging. MOV to CR2 is architecturally listed as serializing, but has been reported to be non-serializing on at least some Intel Core-i7 processors.[21] MOV to CR8 (introduced with x86-64) is serializing on AMD but not Intel processors.
^ abThe MOV TRx instructions were discontinued from Pentium onwards.
^The INT1/ICEBP (F1) instruction is present on all known Intel x86 processors from the 80386 onwards,[22] but only fully documented for Intel processors from the May 2018 release of the Intel SDM (rev 067) onwards.[23] Before this release, mention of the instruction in Intel material was sporadic, e.g. AP-526 rev 001.[24] For AMD processors, the instruction has been documented since 2002.[25]
^The operation of the F1(ICEBP) opcode differs from the operation of the regular software interrupt opcode CD 01 in several ways:
In protected mode, CD 01 will check CPL against the interrupt descriptor's DPL field as an access-rights check, while F1 will not.
In virtual-8086 mode, CD 01 will also check CPL against IOPL as an access-rights check, while F1 will not.
In virtual-8086 mode with VME enabled, interrupt redirection is supported for CD 01 but not F1.
^The UMOV instruction is present on 386 and 486 processors only.[22]
^ abThe XBTS and IBTS instructions were discontinued with the B1 stepping of 80386.
They have been used by software mainly for detection of the buggy[26] B0 stepping of the 80386. Microsoft Windows (v2.01 and later) will attempt to run the XBTS instruction as part of its CPU detection if CPUID is not present, and will refuse to boot if XBTS is found to be working.[27]
^ abFor XBTS and IBTS, the r/m argument represents the data to extract/insert a bitfield from/to, the reg argument the bitfield to be inserted/extracted, AX/EAX a bit-offset and CL a bitfield length.[28]
Compare and Exchange. If accumulator (AL/AX/EAX/RAX) compares equal to first operand,[c] then EFLAGS.ZF is set to 1 and the first operand is overwritten with the second operand. Otherwise, EFLAGS.ZF is set to 0, and first operand is copied into the accumulator.
Write Back and Invalidate Cache.[e] Writes back all modified cache lines in the processor's internal cache to main memory and invalidates the internal caches.
^Using BSWAP with 16-bit registers is not disallowed per se (it will execute without producing an #UD or other exceptions) but is documented to produce undefined results – it is reported to produce various different results on 486,[30] 586, and Bochs/QEMU.[31]
^ abOn Intel 80486 stepping A,[32] the CMPXCHG instruction uses a different encoding - 0F A6 /r for 8-bit variant, 0F A7 /r for 16/32-bit variant. The 0F B0/B1 encodings are used on 80486 stepping B and later.[33][34]
^The CMPXCHG instruction sets EFLAGS in the same way as a CMP instruction that uses the accumulator (AL/AX/EAX/RAX) as its first argument would do.
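The compare-and-exchange behaviour of CMPXCHG can be written as a short Python model. It is only a sketch: operand width, the LOCK prefix and flags other than ZF are not modeled, and `cmpxchg` is an illustrative name.

```python
def cmpxchg(accumulator: int, first: int, second: int) -> tuple:
    """Return (accumulator, first_operand, ZF) after CMPXCHG."""
    if accumulator == first:
        # Equal: the first operand is overwritten with the second.
        return accumulator, second, 1
    # Not equal: the first operand is copied into the accumulator.
    return first, first, 0
```

A typical use is a retry loop: software loads the old value into the accumulator, computes a new value, and repeats the exchange until ZF reports that no other writer changed the location in between.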
^INVLPG executes as no-operation if the m8 argument is invalid (e.g. unmapped page or non-canonical address). INVLPG can be used to invalidate TLB entries for individual global pages.
^ abThe INVD and WBINVD instructions will invalidate all cache lines in the CPU's L1 caches. It is implementation-defined whether they will invalidate L2/L3 caches as well. These instructions are serializing – on some processors, they may block interrupts until completion as well.
^Under Intel VT-x virtualization, the INVD instruction will cause a mandatory #VMEXIT. Also, on processors that support Intel SGX, if the PRM (Processor Reserved Memory) has been set up by using the PRMRRs (PRM range registers), then the INVD instruction is not permitted and will cause a #GP(0) exception.[35]
^If the F3 prefix is used with the 0F 09 opcode, then the instruction will execute as WBNOINVD on processors that support the WBNOINVD extension – this will not invalidate the cache.
Integer/system instructions that were not present in the basic 80486 instruction set, but were added in various x86 processors prior to the introduction of SSE. (Discontinued instructions are not included.)
CPU Identification and feature information. Takes as input a CPUID leaf index in EAX and, depending on leaf, a sub-index in ECX. Result is returned in EAX,EBX,ECX,EDX.[e]
Instruction is serializing, and causes a mandatory #VMEXIT under virtualization.
Support for CPUID can be checked by toggling bit 21 of EFLAGS (EFLAGS.ID) – if this bit can be toggled, CPUID is present.
In early processors, the TSC was a cycle counter, incrementing by 1 for each clock cycle (which could cause its rate to vary on processors that could change clock speed at runtime) – in later processors, it increments at a fixed rate that doesn't necessarily match the CPU clock speed.[n]
Undefined Instructions – will generate an invalid opcode (#UD) exception in all operating modes.[aa]
These instructions are provided for software testing to explicitly generate invalid opcodes. The opcodes for these instructions are reserved for this purpose.
^ abcIn 64-bit mode, the RDMSR, RDTSC and RDPMC instructions will set the top 32 bits of RDX and RAX to zero.
^On Intel and AMD CPUs, the WRMSR instruction is also used to update the CPU microcode. This is done by writing the virtual address of the new microcode to upload to MSR 79h on Intel CPUs and MSR C001_0020h[37] on AMD CPUs.
^Writes to the following MSRs are not serializing:[38][39]
Number: Name
48h: SPEC_CTRL
49h: PRED_CMD
10Bh: FLUSH_CMD
122h: TSX_CTRL
6E0h: TSC_DEADLINE
6E1h: PKRS
774h: HWP_REQUEST (non-serializing only if the FAST_IA32_HWP_REQUEST bit is set)
802h to 83Fh: (x2APIC MSRs)
1B01h: UARCH_MISC_CTL
C001_0100h: FS_BASE (non-serializing on AMD Zen 4 and later)[40]
WRMSR to the x2APIC ICR (Interrupt Command Register; MSR 830h) is commonly used to produce an IPI (Inter-processor interrupt) - on Intel[41] but not AMD[42] CPUs, such an IPI can be reordered before an older memory store.
^System Management Mode and the RSM instruction were made available on non-SL variants of the Intel 486 only after the initial release of the Intel Pentium in 1993.
^On some older 32-bit processors, executing CPUID with a leaf index (EAX) greater than 0 may leave EBX and ECX unmodified, keeping their old values. For this reason, it is recommended to zero out EBX and ECX before executing CPUID. Processors noted to exhibit this behavior include Cyrix MII[47] and IDT WinChip 2.[48]
In 64-bit mode, CPUID will set the top 32 bits of RAX, RBX, RCX and RDX to zero.
^On some Intel processors starting from Ivy Bridge, there exist MSRs that can be used to restrict CPUID to ring 0. Such MSRs are documented for at least Ivy Bridge[49] and Denverton.[50] The ability to restrict CPUID to ring 0 also exists on AMD processors supporting the "CpuidUserDis" feature (Zen 4 "Raphael" and later).[51]
^ abCPUID is also available on some Intel and AMD 486 processor variants that were released after the initial release of the Intel Pentium.
^On the Cyrix 5x86 and 6x86 CPUs, CPUID is not enabled by default and must be enabled through a Cyrix configuration register.
^On NexGen CPUs, CPUID is only supported with some system BIOSes. On some NexGen CPUs that do support CPUID, EFLAGS.ID is not supported but EFLAGS.AC is, complicating CPU detection.[52]
^Unlike the older CMPXCHG instruction, the CMPXCHG8B instruction does not modify any EFLAGS bits other than ZF.
^LOCK CMPXCHG8B with a register operand (which is an invalid encoding) will, on some Intel Pentium CPUs, cause a hang rather than the expected #UD exception - this is known as the Pentium F00F bug.
^ abcOn IDT WinChip, Transmeta Crusoe and Rise mP6 processors, the CMPXCHG8B instruction is always supported, however its CPUID bit may be missing. This is a workaround for a bug in Windows NT.[53]
^ abThe RDTSC and RDPMC instructions are not ordered with respect to other instructions, and may sample their respective counters before earlier instructions are executed or after later instructions have executed. Invocations of RDPMC (but not RDTSC) may be reordered relative to each other even for reads of the same counter. In order to impose ordering with respect to other instructions, LFENCE or serializing instructions (e.g. CPUID) are needed.[54]
TSC running at a fixed rate as long as the processor core is not in a deep-sleep (C2 or deeper) mode, but not synchronized between CPU cores. Introduced in Intel Prescott, Yonah and Bonnell. Also present in all Transmeta and VIA Nano[55] CPUs, as well as AMD Geode LX.[56] Does not have a CPUID bit.
Invariant TSC
TSC running at a fixed rate, and remaining synchronized between CPU cores in all P-,C- and T-states (but not necessarily S-states). Present in AMD K10 and later; Intel Nehalem/Saltwell[57] and later; Zhaoxin WuDaoKou[58] and later. Indicated with a CPUID bit (leaf 8000_0007:EDX[8]).
^RDTSC can be run outside Ring 0 only if CR4.TSD=0. On Intel Pentium and AMD K5/K6, RDTSC cannot be run in Virtual-8086 mode.[59][60] Later processors (Pentium Pro, Athlon 64) removed this restriction.
^RDPMC can be run outside Ring 0 only if CR4.PCE=1.
^The RDPMC instruction is not present in VIA processors prior to the Nano.
^The condition codes supported for CMOVcc instruction (opcode 0F 4x /r, with the x nibble specifying the condition) are:
^In 64-bit mode, CMOVcc with a 32-bit operand size will clear the upper 32 bits of the destination register even if the condition is false. For CMOVcc with a memory source operand, the CPU will always read the operand from memory – potentially causing memory exceptions and cache line-fills – even if the condition for the move is not satisfied. (The Intel APX extension defines a set of new EVEX-encoded variants of CMOVcc that will suppress memory exceptions if the condition is false.)
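The upper-half clearing described above can be shown with a one-function Python model. The name `cmov32_in_64bit_mode` is hypothetical, and the sketch models only the register result, not the memory read that occurs regardless of the condition.

```python
def cmov32_in_64bit_mode(dest64: int, src32: int, condition: bool) -> int:
    """Return the 64-bit destination after a 32-bit CMOVcc."""
    chosen = src32 if condition else dest64
    # The upper 32 bits are cleared whether or not the move happens.
    return chosen & 0xFFFFFFFF
```

A false condition therefore still changes the destination register if its upper 32 bits were non-zero beforehand.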
^On pre-Nehemiah VIA C3 variants ("Samuel"/"Ezra"), the reg,reg but not reg,[mem] forms of the CMOVcc instructions have been reported to be present as undocumented instructions.[61]
^Intel's recommended byte encodings for multi-byte NOPs of lengths 2 to 9 bytes in 32/64-bit mode are (in hex):[62]
Length (bytes): Byte sequence
2: 66 90
3: 0F 1F 00
4: 0F 1F 40 00
5: 0F 1F 44 00 00
6: 66 0F 1F 44 00 00
7: 0F 1F 80 00 00 00 00
8: 0F 1F 84 00 00 00 00 00
9: 66 0F 1F 84 00 00 00 00 00
For cases where there is a need to use more than 9 bytes of NOP padding, it is recommended to use multiple NOPs.
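The recommendation to combine multiple NOPs for longer padding can be sketched as a small Python helper. The table reproduces the byte sequences listed above; the 1-byte entry (the plain NOP, 90h) and the greedy longest-first strategy are additions made for this illustration, and `nop_padding` is a hypothetical name.

```python
LONG_NOPS = {
    1: bytes.fromhex("90"),  # plain one-byte NOP, added for completeness
    2: bytes.fromhex("66 90"),
    3: bytes.fromhex("0F 1F 00"),
    4: bytes.fromhex("0F 1F 40 00"),
    5: bytes.fromhex("0F 1F 44 00 00"),
    6: bytes.fromhex("66 0F 1F 44 00 00"),
    7: bytes.fromhex("0F 1F 80 00 00 00 00"),
    8: bytes.fromhex("0F 1F 84 00 00 00 00 00"),
    9: bytes.fromhex("66 0F 1F 84 00 00 00 00 00"),
}


def nop_padding(length: int) -> bytes:
    """Build `length` bytes of padding from as few NOPs as possible."""
    out = bytearray()
    while length > 0:
        chunk = min(length, 9)
        out += LONG_NOPS[chunk]
        length -= chunk
    return bytes(out)
```

For example, 11 bytes of padding come out as one 9-byte NOP followed by the 2-byte form. Real assemblers may split the padding differently.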
^Unlike other instructions added in Pentium Pro, long NOP does not have a CPUID feature bit.
^0F 1F /0 as long-NOP was introduced in the Pentium Pro, but remained undocumented until 2006.[64]
The whole 0F 18..1F opcode range was NOP in Pentium Pro. However, except for 0F 1F /0, Intel does not guarantee that these opcodes will remain NOP in future processors, and has indeed assigned some of these opcodes to other instructions in at least some processors.[65]
^While the 0F 0B opcode was officially reserved as an invalid opcode from Pentium onwards, it only got assigned the mnemonic UD2 from Pentium Pro onwards.[68]
^ abGNU Binutils have used the UD2A and UD2B mnemonics for the 0F 0B and 0F B9 opcodes since version 2.7.[69] Neither UD2A nor UD2B originally took any arguments - UD2B was later modified to accept a ModR/M byte, in Binutils version 2.30.[70]
^The UD2 (0F 0B) instruction will additionally stop subsequent bytes from being decoded as instructions, even speculatively. For this reason, if an indirect branch instruction is followed by something that is not code, it is recommended to place a UD2 instruction after the indirect branch.[71]