Each bytecode description has the following form:
add (0x02): a b ⇒ a+b
In this example, add is the name of the bytecode, and (0x02) is the
one-byte value used to encode the bytecode, in hexadecimal. The phrase
“a b ⇒ a+b” shows
the stack before and after the bytecode executes. Beforehand, the stack
must contain at least two values, a and b; since the top of
the stack is to the right, b is on the top of the stack, and
a is underneath it. After execution, the bytecode will have
popped a and b from the stack, and replaced them with a
single value, a+b. There may be other values on the stack below
those shown, but the bytecode affects only those shown.
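The stack notation above can be sketched in C as follows. This is only an illustration under stated assumptions: the `ax_stack` structure, its fixed depth of 32, and the function name `ax_op_add` are hypothetical and are not part of the bytecode definition.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical evaluation stack; the layout and depth are assumptions. */
struct ax_stack {
    int64_t item[32];
    size_t depth;               /* number of items currently on the stack */
};

/* add: a b => a+b.  Returns 0 on success, -1 if fewer than two items exist. */
static int ax_op_add(struct ax_stack *s)
{
    if (s->depth < 2)
        return -1;
    int64_t b = s->item[s->depth - 1];   /* top of the stack */
    int64_t a = s->item[s->depth - 2];   /* the item underneath it */
    s->depth -= 2;                       /* pop a and b */
    /* Add as unsigned so that overflow wraps instead of being undefined. */
    s->item[s->depth++] = (int64_t)((uint64_t)a + (uint64_t)b);
    return 0;
}
```

Items below a and b are never read or written, which matches the rule that the bytecode affects only the values shown.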
Here is another example:
const8 (0x22) n: ⇒ n
In this example, the bytecode const8 takes an operand n directly from
the bytecode stream; the operand follows the const8 bytecode itself. We
write any such operands immediately after the name
of the bytecode, before the colon, and describe the exact encoding of
the operand in the bytecode stream in the body of the bytecode
description.
For the const8 bytecode, there are no stack items given before
the ⇒; this simply means that the bytecode consumes no values
from the stack. If a bytecode consumes no values, or produces no
values, the list on either side of the ⇒ may be empty.
If a value is written as a, b, or n, then the bytecode treats it as an integer. If a value is written as addr, then the bytecode treats it as an address.
We do not fully describe the floating point operations here; although this design can be extended in a clean way to handle floating point values, they are not of immediate interest to the customer, so we avoid describing them, to save time.
float (0x01): ⇒
add (0x02): a b ⇒ a+b
sub (0x03): a b ⇒ a-b
mul (0x04): a b ⇒ a*b
div_signed (0x05): a b ⇒ a/b
div_unsigned (0x06): a b ⇒ a/b
rem_signed (0x07): a b ⇒ a modulo b
rem_unsigned (0x08): a b ⇒ a modulo b
lsh (0x09): a b ⇒ a<<b
rsh_signed (0x0a): a b ⇒ (signed)a>>b
rsh_unsigned (0x0b): a b ⇒ a>>b
log_not (0x0e): a ⇒ !a
bit_and (0x0f): a b ⇒ a&b
Pop a and b, and push their bitwise and.
bit_or (0x10): a b ⇒ a|b
Pop a and b, and push their bitwise or.
bit_xor (0x11): a b ⇒ a^b
Pop a and b, and push their bitwise exclusive or.
bit_not (0x12): a ⇒ ~a
equal (0x13): a b ⇒ a=b
less_signed (0x14): a b ⇒ a<b
less_unsigned (0x15): a b ⇒ a<b
ext (0x16) n: a ⇒ a, sign-extended from n bits
The number of source bits to preserve, n, is encoded as a single
byte unsigned integer following the ext bytecode.
zero_ext (0x2a) n: a ⇒ a, zero-extended from n bits
The number of source bits to preserve, n, is encoded as a single
byte unsigned integer following the zero_ext bytecode.
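As an illustration of what ext and zero_ext compute, here is a minimal sketch in C. It assumes a 64-bit stack value; the function names are hypothetical, and treating n of zero or n of 64 and above as "leave the value unchanged" is an assumption made here, since the text does not define those cases.

```c
#include <stdint.h>

/* ext sketch: keep the low n bits of a and copy bit n-1 into the bits above.
   Assumption: n == 0 or n >= 64 leaves the value unchanged. */
static int64_t ax_sign_extend(uint64_t a, unsigned n)
{
    if (n == 0 || n >= 64)
        return (int64_t)a;
    uint64_t mask = (UINT64_C(1) << n) - 1;    /* the n preserved bits */
    uint64_t sign = UINT64_C(1) << (n - 1);    /* the sign bit of the field */
    a &= mask;
    /* Flipping and then subtracting the sign bit propagates it upward. */
    return (int64_t)((a ^ sign) - sign);
}

/* zero_ext sketch: keep the low n bits of a and clear the bits above.
   Same assumption for n == 0 and n >= 64. */
static uint64_t ax_zero_extend(uint64_t a, unsigned n)
{
    if (n == 0 || n >= 64)
        return a;
    return a & ((UINT64_C(1) << n) - 1);
}
```

For example, with n equal to 8 the value 0xFF sign-extends to -1, while zero-extending it leaves 0xFF.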
ref8 (0x17): addr ⇒ a
ref16 (0x18): addr ⇒ a
ref32 (0x19): addr ⇒ a
ref64 (0x1a): addr ⇒ a
For bytecode refn, fetch an n-bit value from addr, using the
natural target endianness. Push the fetched value as an unsigned
integer.
Note that addr may not be aligned in any particular way; the
refn bytecodes should operate correctly for any address.
If attempting to access memory at addr would cause a processor
exception of some sort, terminate with an error.
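One common way to fetch from an address of unknown alignment in the target's natural byte order is to copy the bytes into a properly aligned temporary. The sketch below assumes the interpreter runs in the target's own address space, so that addr can be read directly; the function names are hypothetical, and detecting a faulting access is not shown.

```c
#include <stdint.h>
#include <string.h>

/* ref16 sketch: fetch 16 bits from a possibly unaligned address. */
static uint64_t ax_ref16(const void *addr)
{
    uint16_t v;
    memcpy(&v, addr, sizeof v);   /* byte-wise copy has no alignment requirement */
    return v;                     /* widened without sign extension */
}

/* ref32 sketch: the same technique for a 32-bit value. */
static uint64_t ax_ref32(const void *addr)
{
    uint32_t v;
    memcpy(&v, addr, sizeof v);
    return v;
}
```

Because the copy preserves the in-memory byte order, the result uses the natural endianness of whatever machine executes the code.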
ref_float (0x1b): addr ⇒ d
ref_double (0x1c): addr ⇒ d
ref_long_double (0x1d): addr ⇒ d
l_to_d (0x1e): a ⇒ d
d_to_l (0x1f): d ⇒ a
dup (0x28): a ⇒ a a
swap (0x2b): a b ⇒ b a
pop (0x29): a ⇒
pick (0x32) n: a ... b ⇒ a ... b a
If n is zero, this is the same as dup; if n is one, it copies
the item under the top item, etc. If n exceeds the number of
items on the stack, terminate with an error.
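The indexing used by pick can be sketched as follows. The `ax_stack` structure and the function name are hypothetical. The sketch also reads "n exceeds the number of items" as "there is no item n places below the top", which is an interpretation, and it adds an overflow check that the text does not mention.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical evaluation stack; the layout and depth are assumptions. */
struct ax_stack {
    int64_t item[32];
    size_t depth;               /* number of items currently on the stack */
};

/* pick n: copy the item n places below the top and push the copy.
   Returns 0 on success, -1 where the interpreter should terminate. */
static int ax_op_pick(struct ax_stack *s, unsigned n)
{
    if (n >= s->depth)          /* no item that far down the stack */
        return -1;
    if (s->depth >= 32)         /* no room to push (assumed limit) */
        return -1;
    s->item[s->depth] = s->item[s->depth - 1 - n];
    s->depth++;
    return 0;
}
```

With n equal to zero the copied item is the top of the stack, which is why that case behaves like dup.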
rot (0x33): a b c ⇒ c a b
if_goto (0x20) offset: a ⇒
Pop a; if it is non-zero, set the pc register to start + offset.
Otherwise, continue with the next bytecode in the stream.
Thus, an offset of zero denotes the beginning of the expression.
The offset is stored as a sixteen-bit unsigned value, immediately
following the if_goto bytecode. It is always stored most significant
byte first, regardless of the target's normal endianness. The offset is
not guaranteed to fall at any particular alignment within the bytecode
stream; thus, on machines where fetching a 16-bit value from an
unaligned address raises an exception, you should fetch the offset one
byte at a time.
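Fetching the offset one byte at a time might look like the sketch below. The function names are hypothetical, the pc is modeled as an index relative to start, and falling through to the byte after the operand when the popped value is zero is an assumption about the untaken branch.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a 16-bit value stored most significant byte first, using only
   single-byte loads, so no alignment is required. */
static uint16_t ax_fetch_u16_be(const unsigned char *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/* if_goto sketch: code[pc] holds the if_goto opcode, a is the popped value.
   Returns the next pc as an index relative to the start of the expression. */
static size_t ax_if_goto(const unsigned char *code, size_t pc, int64_t a)
{
    uint16_t offset = ax_fetch_u16_be(&code[pc + 1]);
    if (a != 0)
        return offset;          /* branch taken: pc = start + offset */
    return pc + 3;              /* skip the opcode and its two operand bytes */
}
```

The same byte-wise fetch works unchanged on both big-endian and little-endian hosts, because it never reinterprets memory as a wider type.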
goto (0x21) offset: ⇒
Set the pc register to start + offset.
The offset is stored in the same way as for the if_goto bytecode.
const8 (0x22) n: ⇒ n
const16 (0x23) n: ⇒ n
const32 (0x24) n: ⇒ n
const64 (0x25) n: ⇒ n
Push the integer constant n on the stack, without sign extension. To
produce a negative value, sign-extend the result using the ext bytecode.
The constant n is stored in the appropriate number of bytes
following the const8, const16, const32, or const64 bytecode. The
constant n is always stored most significant byte first, regardless of
the target's normal endianness. The constant is not guaranteed to fall
at any particular alignment within the bytecode stream; thus, on
machines where fetching a multi-byte value from an unaligned address
raises an exception, you should fetch n one byte at a time.
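A single loop can decode the operand of any of the four const bytecodes. This is a sketch: the function name and the `nbytes` parameter (1, 2, 4, or 8, chosen by the caller from the opcode) are assumptions, and no sign extension is performed.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode an nbytes-long constant stored most significant byte first.
   Only single-byte loads are used, so alignment does not matter. */
static uint64_t ax_fetch_const_be(const unsigned char *p, size_t nbytes)
{
    uint64_t n = 0;
    for (size_t i = 0; i < nbytes; i++)
        n = (n << 8) | p[i];    /* shift earlier bytes up, append the next */
    return n;
}
```

For const8 the loop runs once and simply returns the operand byte.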
reg (0x26) n: ⇒ a
The register number n is encoded as a 16-bit unsigned integer
immediately following the reg bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The register number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value from an unaligned address raises an exception, you should
fetch the register number one byte at a time.
getv (0x2c) n: ⇒ v
The variable number n is encoded as a 16-bit unsigned integer
immediately following the getv bytecode. It is always stored most
significant byte first, regardless of the target's normal endianness.
The variable number is not guaranteed to fall at any particular
alignment within the bytecode stream; thus, on machines where fetching a
16-bit value from an unaligned address raises an exception, you should
fetch the variable number one byte at a time.
setv (0x2d) n: v ⇒ v
The variable number n is encoded as described for getv.
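trace (0x0c): addr size ⇒
trace_quick (0x0d) size: addr ⇒ addr
Behaves like the trace opcode, except that size is an operand in the
bytecode stream and addr remains on the stack.
This bytecode is equivalent to the sequence dup const8 size trace, but
we provide it anyway to save space in bytecode strings.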
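The space saving can be seen by emitting both encodings. In this sketch the opcode values come from the descriptions in this section, while the enumerator and function names are hypothetical; it also assumes that size occupies a single byte after trace_quick, as the const8 equivalence suggests.

```c
#include <stddef.h>

/* Opcode values as listed in this section; the names are illustrative. */
enum {
    AX_TRACE       = 0x0c,
    AX_TRACE_QUICK = 0x0d,
    AX_CONST8      = 0x22,
    AX_DUP         = 0x28
};

/* Emit "dup const8 size trace".  Returns the number of bytes written. */
static size_t ax_emit_long_form(unsigned char *out, unsigned char size)
{
    out[0] = AX_DUP;
    out[1] = AX_CONST8;
    out[2] = size;              /* operand of const8 */
    out[3] = AX_TRACE;
    return 4;
}

/* Emit "trace_quick size".  Returns the number of bytes written. */
static size_t ax_emit_trace_quick(unsigned char *out, unsigned char size)
{
    out[0] = AX_TRACE_QUICK;
    out[1] = size;              /* assumed single-byte operand */
    return 2;
}
```

Under these assumptions the short form takes two bytes where the equivalent sequence takes four.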
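trace16 (0x30) size: addr ⇒ addr
Identical to trace_quick, except that size is a 16-bit unsigned
integer rather than a single byte. This bytecode should probably have
been named trace_quick16, for consistency.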
tracev (0x2e) n: ⇒ a
The variable number n is encoded as described for getv.
tracenz (0x2f): addr size ⇒
printf (0x34) numargs string: ⇒
Do a formatted print, in the style of the C function printf.
The value of numargs is the number of arguments to expect on the
stack, while string is the format string, prefixed with a
two-byte length. The last byte of the string must be zero, and is
included in the length. The format string includes escape sequences
just as they appear in C source, so for instance the format string
"\t%d\n" is six characters long, and the output will consist of
a tab character, a decimal number, and a newline. At the top of the
stack, above the values to be printed, this bytecode will pop a
“function” and “channel”. If the function is nonzero, then the
target may treat it as a function and call it, passing the channel as
a first argument, as with the C function fprintf. If the
function is zero, then the target may simply call a standard formatted
print function of its choice. In all, this bytecode pops 2 +
numargs stack elements, and pushes nothing.
end (0x27): ⇒