JMP and CALL Syntax
The JMP and CALL instructions support a variety of syntaxes to simplify their specific use cases. Some of the following chapters explain how these two instructions interact with various special symbols that NASM uses, and some document non-obvious scenarios regarding differently sized modes of operation.
Near jumps are jumps within a single segment. Probably the most common way to use them is through labels, as explained in section 3.9. APX added a near jump instruction, JMPABS, that allows jumps to any 64-bit address specified with an immediate operand. The instruction works with absolute addresses, and the syntax options are shown in section 4.1.6.
One of the ways to quickly implement an infinite loop is using the $ token, which evaluates to the current position in the code. So a one-line infinite loop can simply look like:
jmp $
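Why this loops forever can be seen from how a relative jump is encoded. The following Python sketch is only an illustration, not NASM output: it assumes the assembler selects the two-byte short form (opcode 0xEB followed by an 8-bit displacement), and the helper name short_jmp_bytes is made up for this example. The displacement is measured from the end of the instruction, so a jump to its own start needs a displacement of -2.

```python
def short_jmp_bytes(instr_addr: int, target_addr: int) -> bytes:
    """Encode a short (rel8) JMP located at instr_addr that jumps to target_addr."""
    instr_len = 2  # one opcode byte plus one displacement byte
    disp = target_addr - (instr_addr + instr_len)
    if not -128 <= disp <= 127:
        raise ValueError("target out of range for a short jump")
    return bytes([0xEB, disp & 0xFF])

# "jmp $": the target is the address of the jump instruction itself.
here = 0x1000  # arbitrary example address
print(short_jmp_bytes(here, here).hex())  # ebfe
```

The result does not depend on the address chosen for `here`, which is why the same two bytes work as an infinite loop anywhere in the code.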
In some special circumstances one might need to jump between 16-bit mode and 32-bit mode. A similar issue is addressing between 16-bit and 32-bit segments. The possible cases and the relevant syntax for both problems are explained in section 12.1 and section 12.2 respectively.
When writing shared libraries it's often necessary to call external code. In the ELF format the WRT keyword takes on a different meaning from its usual one, where it helps reference a segment: it is used to refer to some special symbols (more about it can be found in section 9.6.1). In the case described here, "wrt ..plt" references a PLT (procedure linkage table) entry. It can be used to call external routines in the way explained in section 11.2.5.
FAR Calls and Jumps
NASM supports FAR (inter-segment) calls and jumps by means of the syntax call segment:offset, where segment and offset both represent immediate values. So to call a far procedure, you could code either of
call (seg procedure):procedure
call weird_seg:(procedure wrt weird_seg)
(The parentheses are included for clarity, to show the intended parsing of the above instructions. They are not necessary in practice.)
NASM also supports the syntax call far procedure as a synonym for the first of the above usages. JMP works identically to CALL in these examples.
To declare a far pointer to a data item in a data segment, you must code
dw symbol, seg symbol ; 16 bit
dd symbol, word seg symbol ; 32 bit
NASM supports no convenient synonym for this, though you can always invent one using the macro processor.
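To make the resulting memory layout concrete, here is a small Python sketch of the bytes such far pointers occupy. It assumes the usual x86 layout (offset first, then a 16-bit segment, both little-endian); the function names and the numeric values are invented for illustration, since real values are filled in when the program is linked or loaded.

```python
import struct

def far_pointer_16(offset: int, segment: int) -> bytes:
    """Layout of 'dw symbol, seg symbol': 16-bit offset, then 16-bit segment."""
    return struct.pack("<HH", offset, segment)

def far_pointer_32(offset: int, segment: int) -> bytes:
    """Layout of 'dd symbol, word seg symbol': 32-bit offset, then 16-bit segment."""
    return struct.pack("<IH", offset, segment)

# Hypothetical offset and segment values, chosen only for the example.
print(far_pointer_16(0x1234, 0xB800).hex())      # 341200b8
print(far_pointer_32(0x00401000, 0x0023).hex())  # 001040002300
```

The 16-bit form occupies four bytes and the 32-bit form six, which is why the second line mixes a doubleword offset with a word-sized segment.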
JMPABS
Defined as part of the APX specification, JMPABS is a new near jump instruction that takes a 64-bit absolute address immediate. It is the only direct jump instruction that can jump anywhere in the address space in 64-bit mode.
NASM allows this instruction to be specified either as:
jmpabs target
... or:
jmp abs target
The generated code is identical. The ABS is required regardless of the DEFAULT setting.
Some instructions that use the VEX prefix, mainly AVX ones, use NDS (Non-Destructive Source) or NDD (New Data Destination) operands. Semantically, this works by passing an additional operand to the instruction so that none of the source operands are modified as a result of the operation.
Syntactically, NASM allows both the obvious three-operand format described above and a compact format: if the user passes two operands instead of three, one of them is simply copied to be used as the source or destination. These instructions therefore have exactly the same encoding:
vaddpd xmm0, xmm0, xmm1
vaddpd xmm0, xmm1
Here the XMM0 register is used as the "non-destructive source" even though in this case it will of course be modified.
The moffs operand can be used with the MOV instruction, and only together with the "A" register (AL, AX, EAX, or RAX). For non-64-bit operand sizes it addresses memory at an offset from a segment. For 64-bit operands it simply accesses memory at the specified offset (since segment-based addressing is mostly unavailable in 64-bit mode). The syntax for using 64-bit offsets to address memory is shown in section 13.2.2.
Instructions that use the mib operand (that is, memory addressed with a base register, some offset, and an index register multiplied by a scale factor) can also use the split EA (effective address) form. The new form is mainly intended for MPX instructions that use mib operands, but it can be used for any memory reference. The basic concept of this form is splitting base and index:
mov eax,[ebx+8,ecx*4] ; ebx=base, ecx=index, 4=scale, 8=disp
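The split form only changes how the components are written down. For an ordinary memory reference such as the MOV above, the address is still base + index*scale + displacement. The Python sketch below shows that arithmetic; the function name and the register values are hypothetical, and wraparound to the address size is modelled with a simple mask.

```python
def effective_address(base: int, index: int, scale: int, disp: int, bits: int = 32) -> int:
    """Combine the four components of a base+index*scale+disp memory reference."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & ((1 << bits) - 1)

# [ebx+8,ecx*4] with example values ebx = 0x1000 and ecx = 3
print(hex(effective_address(base=0x1000, index=3, scale=4, disp=8)))  # 0x1014
```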
NASM supports all currently possible forms of the mib syntax:
; bndstx
; next 5 lines are parsed same
; base=rax, index=rbx, scale=1, displacement=3
bndstx [rax+0x3,rbx], bnd0 ; NASM - split EA
bndstx [rbx*1+rax+0x3], bnd0 ; GAS - '*1' indicates an index reg
bndstx [rax+rbx+3], bnd0 ; GAS - without hints
bndstx [rax+0x3], bnd0, rbx ; ICC-1
bndstx [rax+0x3], rbx, bnd0 ; ICC-2
VPTERNLOGD and VPTERNLOGQ are instructions that implement an arbitrary logic function of three inputs. They take three register operands and one immediate value that determines which logic function the instruction implements on execution. Specifically, the output column of the desired logic function is encoded in the 8-bit immediate operand. Three binary inputs can take 8 possible combinations, giving 8 output bits, so the immediate can select any one of 256 possible logic functions. It is therefore not practical to provide dedicated syntax for each possible logic function.
However there are some macro solutions that can help avoid writing out truth tables in order to use the ternary logic instructions. The simple, more manual way is to calculate the logic operation encoding on the fly with a few lines of arithmetic directives:
a equ 0xaa
b equ 0xcc
c equ 0xf0
imm equ a | b & c
Here, the values for "a", "b" and "c" together enumerate all possible bit configurations that the inputs of a 3-input function can take ("a" being the least significant bit and "c" the most significant one). The "imm" value is then calculated by evaluating the desired logic function, in this case "a or b and c", which yields the same output column that one would get by writing out the truth table.
Note that the expression must be written using only the bitwise operators &, |, ^, and ~. Using the boolean operators &&, ||, ^^, ! and ? : will not work correctly.
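The equivalence between the three constants and a hand-written truth table can be checked with a short Python sketch. It is an illustration rather than part of NASM: the helper imm_from_truth_table is invented here, and Python's bitwise operators stand in for NASM's, which have the same relative precedence for &, ^ and |.

```python
A, B, C = 0xAA, 0xCC, 0xF0  # truth-table columns for the inputs a, b and c

def imm_from_truth_table(func) -> int:
    """Build the 8-bit immediate by evaluating func on all 8 input rows."""
    imm = 0
    for row in range(8):
        a = (row >> 0) & 1  # least significant input
        b = (row >> 1) & 1
        c = (row >> 2) & 1  # most significant input
        if func(a, b, c) & 1:
            imm |= 1 << row
    return imm

# "a or b and c", once via the column constants and once row by row
print(hex((A | B & C) & 0xFF))                              # 0xea
print(hex(imm_from_truth_table(lambda a, b, c: a | b & c)))  # 0xea

# "(b xor c) and not a"; the mask keeps the result to 8 bits
print(hex((B ^ C) & ~A & 0xFF))                                   # 0x14
print(hex(imm_from_truth_table(lambda a, b, c: (b ^ c) & ~a)))    # 0x14

# A boolean operator collapses each column to a single truth value,
# so the per-row information is lost:
print(int(bool(A) or (bool(B) and bool(C))))  # 1
```

The last line shows why the boolean operators are unsuitable: they yield a single 0 or 1 instead of an 8-bit output column.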
The vtern standard macro package, section 7.6, allows for these kinds of expressions without introducing the symbols a, b and c into the global namespace:
%use vtern
vpternlogd xmm1, xmm2, xmm3, a | b & c
vpternlogq ymm4, ymm5, ymm6, (b ^ c) & ~a
; a, b, and c are not defined as symbols elsewhere
Intel APX (Advanced Performance Extensions) introduces multiple new features, mostly to existing instructions. APX is only available in 64-bit mode.
There are 16 new general purpose registers, R16 to R31.
Many instructions now support a non-destructive destination operand.
The ability to suppress the setting of the arithmetic flags.
The ability to zero the upper parts of a full 64-bit register for 8- and 16-bit operation size instructions. (This zeroing is always performed for 32-bit operations; this has been the case since 64-bit mode was first introduced.)
New instructions to conditionally set the arithmetic flags to a user-specified value.
Performance-enhanced versions of the PUSH and POP instructions.
A 64-bit absolute jump instruction.
A new REX2 prefix.
See https://www.nasm.us/specs/apx for a link to the APX technical documentation. NASM generally follows the syntax specified in the Assembly Syntax Recommendations for Intel APX document, although some syntax is relaxed; see below.
When it comes to register size, the new registers (R16–R31) work the same way as registers R8–R15 (see also section 13.1): R31 is the 64-bit form of register 31, R31D is the 32-bit form, R31W is the 16-bit form, and R31B is the 8-bit form. The form R31L can also be used if the altreg macro package is used (%use altreg), see section 7.1.
Extended registers require that either a REX2 prefix (the default, if possible) or an EVEX prefix is used.
Some instructions do not support the extended general purpose registers (EGPRs). NASM will generate an error if the new registers are used with such an instruction.
Using the new data destination register (when supported) is specified by adding an additional register in place of the first operand. For example an ADD instruction:
add rax, rbx, rcx
... which would add RBX and RCX and store the result in RAX, without modifying either RBX or RCX.
The {nf} prefix on a supported instruction inhibits the update of the flags, for example:
{nf} add rax, rbx
... will add RAX and RBX together, storing the result in RAX, while leaving the flags register unchanged.
NASM also allows the {nf}
prefix (or any other curly-brace
prefix) to be specified after the instruction mnemonic. Spaces
around curly-brace prefixes are optional:
{nf} add rax, rbx ; Standard syntax
{nf}add rax, rbx ; Prefix without space
add {nf} rax, rbx ; Suffix syntax
add{nf} rax, rbx ; Suffix without space
The {zu} prefix, meaning "zero-upper", disables retaining the upper parts of the register and instead zero-extends the value into the full 64-bit register when the operand size is 8 or 16 bits (this is always done when the operand size is 32 bits, even without APX). For example:
{zu} setb al
... zeroes out bits [63:8] of the RAX register. For this specific instruction, NASM also accepts these alternate syntaxes:
{zu} setb ax
setb {zu} al
setb {zu} ax
setb {zu} eax
setb {zu} rax
setb eax
setb rax
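The difference between a legacy partial-register write and a {zu} write can be modelled in a few lines of Python. This is a simplified sketch under stated assumptions: write_gpr is a hypothetical helper, only writes to the low 8, 16, 32 or 64 bits are modelled (not the high-byte registers such as AH), and the starting value of RAX is arbitrary.

```python
MASK64 = (1 << 64) - 1

def write_gpr(old: int, value: int, size_bits: int, zero_upper: bool = False) -> int:
    """Return the new 64-bit register value after writing its low size_bits bits."""
    low_mask = (1 << size_bits) - 1
    value &= low_mask
    if size_bits in (32, 64) or zero_upper:
        return value  # upper bits are cleared (zero extension)
    # Legacy 8- and 16-bit writes keep the upper bits of the old value.
    return (old & ~low_mask & MASK64) | value

rax = 0x1122334455667788  # arbitrary previous contents
print(hex(write_gpr(rax, 1, 8)))                   # 0x1122334455667701
print(hex(write_gpr(rax, 1, 8, zero_upper=True)))  # 0x1
print(hex(write_gpr(rax, 1, 32)))                  # 0x1
```

The first line corresponds to a plain 8-bit write such as setb al with the condition true, the second to the same write with {zu}, and the third to the zero extension that 32-bit writes have always performed.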
The source condition code (Scc) instructions, CCMPScc and CTESTScc, perform a comparison or test only if the source condition is met; otherwise they set the arithmetic flags to a user-specified default value.
NASM allows the resulting default flags value to be specified either using the {dfv=...} syntax, containing a comma-separated list of zero or more of the CPU flags OF, SF, ZF or CF, or simply as a numeric immediate (with OF, SF, ZF and CF being represented by bits 3 to 0 in that order).
The PF flag is always set to the same value as the CF flag, and the AF flag is always cleared. NASM allows {dfv=pf} as an alias for {dfv=cf}, but do note that it still affects both flags.
NASM allows, but does not require, a comma after the {dfv=} value. When using the immediate syntax, the comma is required. These examples all produce the same instruction:
ccmpl {dfv=of,cf} rdx, r30
ccmpl {dfv=of,cf}, rdx, r30
ccmpl 0x9, rdx, r30 ; Comma required
The immediate syntax also allows the {dfv=} values to be stored in a symbol, or to have arithmetic done on them. Note that when used in an expression, or in contexts other than EQU or one of the Scc instructions, parentheses are required; this is a safety measure (the programmer needs to explicitly indicate that use as an expression is what is intended):
ccmpl ({dfv=of}|{dfv=cf}), rdx, r30 ; Parens, comma required
ocf1 equ {dfv=of,cf} ; Parens not required
ccmpl ocf1, rdx, r30 ; Comma required
ofcf2 equ ({dfv=of,sf,cf} & ~{dfv=sf}) ; Parens required
ccmpl ofcf2, rdx, r30 ; Comma required
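The numeric values behind these expressions follow directly from the bit assignment given earlier (OF, SF, ZF and CF in bits 3 to 0). The Python sketch below reproduces that mapping; the dfv function is a stand-in invented for this example and is not how NASM is implemented.

```python
# Bit positions as documented: OF, SF, ZF, CF occupy bits 3 to 0.
DFV_BITS = {"of": 3, "sf": 2, "zf": 1, "cf": 0}
DFV_BITS["pf"] = DFV_BITS["cf"]  # {dfv=pf} is documented as an alias for {dfv=cf}

def dfv(*flags: str) -> int:
    """Return the immediate value for a list of flag names, like {dfv=...}."""
    value = 0
    for flag in flags:
        value |= 1 << DFV_BITS[flag.lower()]
    return value

print(hex(dfv("of", "cf")))                     # 0x9
print(hex(dfv("of") | dfv("cf")))               # 0x9
print(hex(dfv("of", "sf", "cf") & ~dfv("sf")))  # 0x9
print(hex(dfv()))                               # 0x0
```

All three non-empty expressions evaluate to 0x9, which matches the immediate used in the ccmpl 0x9 example.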
PUSH and POP Extensions
APX adds variations of the PUSH and POP instructions that:
inform the CPU that a specific PUSH and POP constitute a matched pair, allowing the hardware to optimize for this common use case: PUSHP and POPP;
operate on two registers at the same time: PUSH2 and POP2, with paired variants PUSH2P and POP2P.
These extensions only apply to register forms; they are not supported for memory or immediate operands.
The standard syntax for PUSH2(P) and POP2(P) specifies the registers in the order they are to be pushed onto and popped off the stack:
push2p rax, rbx
; rax in [rsp+8]
; rbx is [rsp+0]
pop2p rbx, rax
... would be the equivalent of:
push rax
push rbx
; rax in [rsp+8]
; rbx is [rsp+0]
pop rbx
pop rax
NASM also allows the registers to be specified as a register pair separated by a colon, in which case the order is always high:low and thus is the same for PUSH2 and POP2. This means the order of the operands in the POP2 instruction differs from the standard syntax:
push2p rax:rbx
; rax in [rsp+8]
; rbx is [rsp+0]
pop2p rax:rbx
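The two operand orders can be compared with a toy stack model in Python. It only mirrors the equivalence stated above (a two-register push behaves like two single pushes); the Stack class, its method names and the starting RSP value are all made up for the illustration.

```python
class Stack:
    """Toy model of a downward-growing stack of 8-byte slots."""

    def __init__(self, rsp: int = 0x8000):
        self.rsp = rsp
        self.mem = {}

    def push(self, value):
        self.rsp -= 8
        self.mem[self.rsp] = value

    def pop(self):
        value = self.mem[self.rsp]
        self.rsp += 8
        return value

    def push2(self, first, second):
        """Standard syntax: operands listed in push order."""
        self.push(first)
        self.push(second)

    def pop2(self):
        """Standard syntax: results listed in pop order."""
        first = self.pop()
        second = self.pop()
        return first, second

    def pop2_pair(self):
        """Colon syntax: results listed as (high, low)."""
        low, high = self.pop2()
        return high, low

s = Stack()
s.push2("RAX", "RBX")                    # push2p rax, rbx  /  push2p rax:rbx
print(s.mem[s.rsp + 8], s.mem[s.rsp])    # RAX RBX
print(s.pop2())                          # ('RBX', 'RAX'), as in pop2p rbx, rax

s.push2("RAX", "RBX")
print(s.pop2_pair())                     # ('RAX', 'RBX'), as in pop2p rax:rbx
```

Both pop forms read the same two stack slots; only the order in which the destination registers are written in the source differs.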
When the optimizer is enabled (see section 2.1.24), NASM may apply a number of optimizations, some of which may substitute non-APX instructions for what would otherwise be APX forms. Some examples are:
The {nf} prefix may be ignored on instructions that already don't modify the arithmetic flags.
When the {nf} prefix is specified, NASM may generate another instruction which would not modify the flags register. For example, {nf} ror rax, rcx, 3 can be translated into rorx rax, rcx, 3.
The {zu} prefix may be ignored on instructions that already zero the upper parts of the destination register.
When the {zu} prefix is specified, NASM may generate another instruction which would zero the upper part of the register. For example, {zu} mov ax, cs can be translated into mov eax, cs.
New data destination or nondestructive source operands may be contracted if they are the same (and the semantics are otherwise identical). For example, add eax, eax, edx could be encoded as add eax, edx using legacy encoding. NASM does not perform this optimization as of version 3.00, but it probably will in the future.
APX encoding, using REX2 and EVEX, respectively, can be forced by using the {rex2} or {evex} instruction prefixes.