Galois Field New Instructions (GFNI) is an x86 instruction set extension from Intel designed to accelerate cryptographic applications. However, its instructions have also proved unexpectedly useful for general bit manipulation, and comparable instructions are present in other ISAs.
Description
The GFNI extension comprises three instructions, VGF2P8AFFINEINVQB, VGF2P8AFFINEQB and VGF2P8MULB. They are useful for cryptography,[1] as they can be used to implement Rijndael-style S-boxes such as those used in AES, Camellia, and SM4. These instructions are also used for bit manipulation in networking and signal processing: bits can be arbitrarily reordered, copied, inverted, cleared, or set with them.[1]
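To illustrate the S-box claim, the following is a minimal scalar sketch in Python of what one byte lane of VGF2P8AFFINEINVQB computes when it is given the AES affine matrix and constant. The helper names (`gf_mul`, `gf_inv`, `affine`, `aes_sbox`) are hypothetical, and the matrix layout (the row for output bit 0 stored in the most significant byte of the 64-bit matrix) is an assumption based on the instruction's documented pseudocode. It is a software model for reasoning, not a use of the instruction itself.

```python
AES_POLY = 0x11B                 # x^8 + x^4 + x^3 + x + 1
AES_MATRIX = 0xF1E3C78F1F3E7CF8  # AES affine map; row for output bit 0 is the top byte
AES_CONST = 0x63                 # constant added (XORed) after the matrix product


def gf_mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo AES_POLY."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= AES_POLY
    return result


def gf_inv(x):
    """Multiplicative inverse in GF(2^8); 0 maps to 0."""
    # x^254 equals x^-1 for non-zero x, since the multiplicative group has order 255.
    result = 1
    for _ in range(254):
        result = gf_mul(result, x)
    return result


def affine(matrix, x, imm8):
    """Output bit i = parity(row_i AND x) XOR bit i of imm8, with row_i in byte 7 - i."""
    out = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF
        parity = bin(row & x).count("1") & 1
        out |= (parity ^ ((imm8 >> i) & 1)) << i
    return out


def aes_sbox(x):
    """Model of one byte lane: invert in GF(2^8), then apply the affine map."""
    return affine(AES_MATRIX, gf_inv(x), AES_CONST)
```

Under these assumptions, `aes_sbox(0x53)` evaluates to `0xED`, matching the worked example in the AES specification.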
GFNI is a standalone instruction set extension and can be enabled separately from AVX or AVX-512. Depending on whether AVX and AVX-512F support is indicated by the CPU, GFNI support enables legacy (SSE), VEX or EVEX-coded instructions operating on 128, 256 or 512-bit vectors.
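The relationship between feature flags and available encodings can be sketched as a small lookup. The function name `gfni_encodings` and the encoding labels are hypothetical; the flag names follow the lowercase spelling used in Linux's `/proc/cpuinfo`. The sketch also assumes that EVEX-coded 128-bit and 256-bit forms additionally require AVX-512VL, a detail the paragraph above does not state.

```python
def gfni_encodings(flags):
    """Return the GFNI instruction encodings implied by a set of CPU feature flags."""
    flags = set(flags)
    if "gfni" not in flags:
        return []
    encodings = ["SSE-128"]                    # legacy encoding needs only GFNI
    if "avx" in flags:
        encodings += ["VEX-128", "VEX-256"]    # VEX forms need AVX as well
    if "avx512f" in flags:
        encodings.append("EVEX-512")           # 512-bit EVEX form needs AVX-512F
        if "avx512vl" in flags:                # assumption: VL gates the shorter EVEX forms
            encodings += ["EVEX-128", "EVEX-256"]
    return encodings
```

For example, `gfni_encodings({"gfni", "avx"})` returns `["SSE-128", "VEX-128", "VEX-256"]`.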
| Instruction | Description |
| --- | --- |
| VGF2P8AFFINEINVQB | Galois field affine transformation inverse |
| VGF2P8AFFINEQB | Galois field affine transformation |
| VGF2P8MULB | Galois field multiply bytes |
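The affine transformation is what makes arbitrary bit reordering, copying, inverting, clearing, and setting possible: each output bit is the parity of a chosen subset of input bits, optionally flipped by an immediate. The following Python sketch models a single byte lane of VGF2P8AFFINEQB. The names `affine_byte` and `matrix_from_rows` are hypothetical, and the row ordering (output bit i taken from byte 7 - i of the 64-bit matrix) is an assumption based on the instruction's documented pseudocode.

```python
def affine_byte(matrix, x, imm8=0):
    """Model of one byte lane: out bit i = parity(row_i AND x) XOR imm8 bit i."""
    out = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF   # row for output bit i
        parity = bin(row & x).count("1") & 1
        out |= (parity ^ ((imm8 >> i) & 1)) << i
    return out


def matrix_from_rows(rows):
    """Pack eight row masks into a 64-bit matrix; rows[i] selects the inputs of output bit i."""
    matrix = 0
    for i, row in enumerate(rows):
        matrix |= (row & 0xFF) << (8 * (7 - i))
    return matrix


# Identity: output bit i copies input bit i.
IDENTITY = matrix_from_rows([1 << i for i in range(8)])

# Reorder: output bit i copies input bit (i + 4) mod 8, which swaps the two nibbles.
SWAP_NIBBLES = matrix_from_rows([1 << ((i + 4) % 8) for i in range(8)])

# Clear: an all-zero row forces the output bit to 0 (or to 1 if the imm8 bit is set).
KEEP_LOW_NIBBLE = matrix_from_rows([1 << i if i < 4 else 0 for i in range(8)])
```

With these definitions, `affine_byte(IDENTITY, x, 0xFF)` inverts every bit of `x`, and `affine_byte(KEEP_LOW_NIBBLE, x, 0x80)` keeps the low nibble, clears bits 4 to 6, and sets bit 7.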
Additional uses
GFNI was originally intended to accelerate GF(2^8) arithmetic such as that used by Rijndael (AES), which specifies an explicit GF(2^8) reducing polynomial of 0x11B. However, a surprising number of additional uses have emerged:
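As a reference point, the GF(2^8) multiplication with reducing polynomial 0x11B can be modelled in a few lines of Python. This is a scalar sketch of the arithmetic for a single pair of bytes, with the hypothetical name `gf2p8mul`; the vector instruction would apply the same operation to every byte lane independently.

```python
REDUCING_POLY = 0x11B  # x^8 + x^4 + x^3 + x + 1, as specified by Rijndael


def gf2p8mul(a, b):
    """Multiply two bytes as polynomials over GF(2), reduced modulo REDUCING_POLY."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a          # addition in GF(2^8) is XOR
        b >>= 1
        a <<= 1                  # multiply a by x
        if a & 0x100:
            a ^= REDUCING_POLY   # reduce once the degree reaches 8
    return result
```

For example, `gf2p8mul(0x57, 0x83)` gives `0xC1`, the multiplication example used in the AES specification.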
An Intel guide lists parallel 5-bit byte-wise sign extension; general bit clear, insert, set, and invert; parallel byte-wise count of leading and trailing zero bits; arbitrary GF(2^N) multiplication; fixed 2-bit packed arithmetic; and byte-wise variable shift, which relies on pre-truncating the inputs to ensure the polynomial reduction is not triggered.
Bit-reversal
SM4, Reed-Solomon coding, RAID 6
Vector bit-reverse
bmatflip and bmatxor are found in the Cray XMT
Power ISA vgbbd: Chapter 6, Vector Facility, Book I, p. 445
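Two of the uses listed above, bit reversal and byte-wise variable shift, can be sketched with scalar Python models of the affine and multiply operations. All function names here are hypothetical, and the bit-reversal matrix constant assumes the row ordering in which output bit i is taken from byte 7 - i of the 64-bit matrix. The shift example shows why pre-truncation matters: once the bits that would overflow are masked off, the product has degree below 8, so the reduction by 0x11B never occurs and the multiply behaves as a plain left shift.

```python
def gf2p8mul(a, b):
    """GF(2^8) multiply of two bytes, reduced modulo 0x11B."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return result


def affine_byte(matrix, x, imm8=0):
    """Out bit i = parity(row_i AND x) XOR imm8 bit i, with row_i in byte 7 - i."""
    out = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF
        parity = bin(row & x).count("1") & 1
        out |= (parity ^ ((imm8 >> i) & 1)) << i
    return out


BIT_REVERSE = 0x8040201008040201  # output bit i copies input bit 7 - i


def reverse_bits(x):
    """Reverse the bit order of one byte with a single affine transformation."""
    return affine_byte(BIT_REVERSE, x)


def shift_left(x, k):
    """Compute (x << k) & 0xFF for 0 <= k <= 7 using a GF(2^8) multiply by 2**k."""
    truncated = x & (0xFF >> k)         # drop the bits that would overflow
    return gf2p8mul(truncated, 1 << k)  # degree stays below 8, so no reduction
```

Because the shift amount is just the second multiplicand, each byte lane of a vector multiply could use a different `k`, which is what makes the shift "variable".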
See also
bit manipulation – Algorithmically modifying data below the word level
AVX512 – Instruction set extension by Intel
AVX2 – Instructions for the x86 microprocessors