Unit 2.6 Introduction to Instruction Format
Instructions are stored in memory. They are data just like
the data created by DC statements. Every time the ASSEMBLER
sees an instruction, it converts that instruction into
hexadecimal, which is then placed in memory. When the
computer executes your program,
the CPU examines each instruction in turn and performs the
appropriate operation. The ASSEMBLER can be viewed as
encoding the instructions into hexadecimal (actually
binary). The execution of a program means that the CPU
decodes each instruction, reversing the process.
When you look at the LISTING file generated by the
ASSEMBLER, you will see several columns of data. The
leftmost column of data indicates the address at which the
data may be found. The next column comes under the heading
"Object Code". This is the code that appears in the
computer. It may be inspected with the debugger just as a
memory location can be. These hexadecimal numbers are read,
interpreted and processed by the CPU of the computer.
Every computer has its own instruction format and its own
way of encoding and decoding instructions. Thus, the binary
for an A 1,A on the IBM 360 architecture (an IBM mainframe)
would correspond to a different hexadecimal than the
equivalent instruction on an IBM PC, or a VAX (Digital
Equipment computer).
This unit will introduce the instruction format and will
teach you how to encode instructions into hexadecimal by
hand. You will be able to perform the same functions as the
ASSEMBLER does. Learning this will give you insight into
what goes on "under the hood" of the computer. In addition,
it sometimes becomes necessary to manually modify the
instructions. The file containing the source, the ASSEMBLE
file, may have disappeared. A large program may simply take
too long to ASSEMBLE. I had several jobs where we routinely
changed instructions manually in order to effect program
changes during debugging. Another extreme case involved
the Viking lander on Mars. Here, they radioed instructions
to change the data comprising the program. This was to get
around a problem in the hardware of the Viking lander.
We will use two instruction formats on the IBM mainframe.
The first is RR, for register to register. It accommodates
such instructions as AR 3,7, BALR 12,0 and SR 7,5, which
perform operations between two registers. These
instructions take only two bytes, which is sixteen bits.
An RR instruction is laid out as follows:
1) first 8 bits, the opcode
2) next 4 bits, the first register
3) last 4 bits, the second register
The opcode tells the CPU which instruction is being
executed, whether this is an AR, an SR, or something else.
The table below will tell you which instruction corresponds
to which hex number.
The next four bits correspond to the register specified in
the first operand. Register numbers run from 0 to 15, so a
register number fits neatly into four bits. Likewise, the
last four bits correspond to the register specified in the
second operand.
Thus, for example, the instruction AR 3,7 would be encoded
as:
1A37
The 1A is the opcode for AR. The 3 in the third hex digit
is register three, the first operand. The 7 in the fourth
hex digit is register seven, the second operand.
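The same packing can be sketched in a few lines of Python.
This is only an illustration of the RR layout described
above; the function name encode_rr and its parameter names
are made up for this example and are not part of the
ASSEMBLER.

```python
def encode_rr(opcode, r1, r2):
    """Pack an RR instruction into four hex digits (two bytes)."""
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    for register in (r1, r2):
        if not 0 <= register <= 15:
            raise ValueError("register numbers run from 0 to 15")
    # 8 bits of opcode, then 4 bits for each register
    word = (opcode << 8) | (r1 << 4) | r2
    return f"{word:04X}"


print(encode_rr(0x1A, 3, 7))  # AR 3,7 gives 1A37
```

Shifting the opcode left by eight bits and the first
register by four is what places each value in its own hex
digit.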
The other type of instruction is the register to memory
instruction, RX. It is somewhat more complicated, taking
four bytes or eight hex digits. The first two hex digits
come from the opcode as before. The register which appears
as the first operand goes into the third hex digit. That is
"field1" in the template below. The remaining five hex
digits encode the memory location.
For our purposes, X2 will always be zero and B2 will always
be "C", the hexadecimal digit for twelve. The offset will
be the address of the memory location minus 2. Consider the
listing of our first program (page 17), and in particular
the first RX, or memory-register, instruction, on line 6.
It loads the value of A into register one. Observe that A
can be found at location 10 (hexadecimal) in memory.
We will learn much later in the course the circumstances
when X2 is not 0 and when the B2 field will be something
other than C. These are important in setting up code to
access arrays and when arrays or programs become large. We
would need to set up multiple base registers for such large
programs or arrays. That is, there would be several "USING"
statements instead of the single one for register 12 that
all our programs have had so far.
The hexadecimal for L 1,A can be found in the second
column of line 6. It is "5810 C00E".
This was generated as follows:
The 58 comes from the opcode for the "L" or load
instruction. The next digit is 1; that comes from the
register one in the first operand. The fourth hex digit is
zero and the fifth is C as stated above. Then we subtract 2
from hexadecimal 10 to get 00E which goes in the last three
hex digits.
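Here is a small Python sketch of the same steps for an RX
instruction. It assumes, as this unit does, that X2 is 0,
B2 is 12, and that 2 is subtracted from the address to get
the offset. The name encode_rx and the parameter
base_address (the 2 that gets subtracted) are made up for
this example.

```python
def encode_rx(opcode, r1, address, x2=0, b2=12, base_address=2):
    """Pack an RX instruction into eight hex digits (four bytes)."""
    offset = address - base_address  # the "subtract 2" step
    if not 0 <= offset <= 0xFFF:
        raise ValueError("offset must fit in three hex digits")
    for field in (r1, x2, b2):
        if not 0 <= field <= 15:
            raise ValueError("4-bit fields run from 0 to 15")
    # opcode (8 bits), field1, X2, B2 (4 bits each), offset (12 bits)
    word = (opcode << 24) | (r1 << 20) | (x2 << 16) | (b2 << 12) | offset
    digits = f"{word:08X}"
    # Split into two groups of four digits, as the listing does
    return digits[:4] + " " + digits[4:]


print(encode_rx(0x58, 1, 0x10))  # L 1,A with A at 10 hex gives 5810 C00E
```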
To summarize, here is a paper algorithm you can carry out
in your head to do these tasks:
Check the operator name (LR, L, SR, etc.) and determine if
the instruction is RR or RX
IF instruction is RR
   Put opcode in first two hex digits
   Convert register from first operand into a hex digit.
   Put it into the third hex digit place
   Convert register from second operand into a hex digit.
   Put it into the fourth hex digit place
ELSE IF instruction is RX
   Put opcode in first two hex digits
   Convert register from first operand into a hex digit.
   Put it into the third hex digit place
   Put zero into the fourth hex digit place
   Put "C" for 12 into the fifth hex digit place
   Determine address of memory location referenced in
   second operand
   Subtract two from this number, getting a hexadecimal
   number
   Put this hex number in the last three hex digits, adding
   zeros at the front to fill all three places
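The paper algorithm can also be written out as a short
Python sketch. It is not how the real ASSEMBLER works
inside. It assumes each statement looks like "L 1,A" or
"AR 3,7", that the second operand of an RX instruction is a
plain label, and that label addresses are supplied in a
dictionary. The names assemble, hex_digit and symbols are
made up for this example; the opcode values are copied from
the opcode list in this unit.

```python
RR_OPCODES = {"AR": 0x1A, "BALR": 0x05, "BCR": 0x07, "CR": 0x19,
              "DR": 0x1D, "LR": 0x18, "MR": 0x1C, "SR": 0x1B}
RX_OPCODES = {"A": 0x5A, "BAL": 0x45, "BC": 0x47, "C": 0x59,
              "D": 0x5D, "L": 0x58, "LA": 0x41, "M": 0x5C,
              "ST": 0x50, "S": 0x5B, "STC": 0x42, "IC": 0x43}


def hex_digit(register_text):
    """Turn a register number such as "12" into one hex digit ("C")."""
    register = int(register_text)
    if not 0 <= register <= 15:
        raise ValueError("register must be 0 to 15")
    return f"{register:X}"


def assemble(statement, symbols):
    """Encode one RR or RX statement as a string of hex digits."""
    mnemonic, operands = statement.split()
    first, second = operands.split(",")
    if mnemonic in RR_OPCODES:
        # opcode, first register, second register
        return f"{RR_OPCODES[mnemonic]:02X}" + hex_digit(first) + hex_digit(second)
    if mnemonic in RX_OPCODES:
        offset = symbols[second] - 2  # address of the label, minus two
        if not 0 <= offset <= 0xFFF:
            raise ValueError("offset must fit in three hex digits")
        # opcode, register, zero for X2, C for B2, offset with leading zeros
        return f"{RX_OPCODES[mnemonic]:02X}" + hex_digit(first) + "0" + "C" + f"{offset:03X}"
    raise ValueError("unknown operator " + mnemonic)


symbols = {"A": 0x10}  # A is at location 10 (hexadecimal)
print(assemble("AR 3,7", symbols))  # 1A37
print(assemble("L 1,A", symbols))   # 5810C00E
```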
Layout
RR
|________|______|______|
 8 bits   4 bits 4 bits
 opcode   field1 field2
RX
|________|______|______|______|____________|
 8 bits   4 bits 4 bits 4 bits 12 bits
 opcode   field1 X2     B2     offset
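Reading the layout in the other direction, the way the CPU
does when it decodes, means pulling each field back out of
the bits. The Python sketch below does this with shifts and
masks. The function names split_rr and split_rx are made up
for this example; the field names are the ones in the
layout above.

```python
def split_rr(hex_text):
    """Break four hex digits into the three RR fields."""
    word = int(hex_text, 16)
    return {"opcode": word >> 8,
            "field1": (word >> 4) & 0xF,
            "field2": word & 0xF}


def split_rx(hex_text):
    """Break eight hex digits into the five RX fields."""
    word = int(hex_text.replace(" ", ""), 16)
    return {"opcode": word >> 24,
            "field1": (word >> 20) & 0xF,
            "X2": (word >> 16) & 0xF,
            "B2": (word >> 12) & 0xF,
            "offset": word & 0xFFF}


# Prints opcode 58, field1 1, X2 0, B2 C, offset E, one per line
for name, value in split_rx("5810 C00E").items():
    print(name, format(value, "X"))
```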
Opcode list
RX           RR
A     5A     AR    1A
BAL   45     BALR  05
BC    47     BCR   07
C     59     CR    19
D     5D     DR    1D
L     58     LR    18
LA    41
M     5C     MR    1C
ST    50
S     5B     SR    1B
STC   42
IC    43
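The table can also be used backwards, to find out which
instruction a piece of object code stands for. The Python
sketch below builds a reverse lookup from the opcode list
above and reports the operator name and format. The names
OPCODES, NAME_BY_OPCODE and describe are made up for this
example, and only the opcodes in this unit's table are
known to it.

```python
OPCODES = {
    "RR": {"AR": 0x1A, "BALR": 0x05, "BCR": 0x07, "CR": 0x19,
           "DR": 0x1D, "LR": 0x18, "MR": 0x1C, "SR": 0x1B},
    "RX": {"A": 0x5A, "BAL": 0x45, "BC": 0x47, "C": 0x59,
           "D": 0x5D, "L": 0x58, "LA": 0x41, "M": 0x5C,
           "ST": 0x50, "S": 0x5B, "STC": 0x42, "IC": 0x43},
}

# Reverse lookup: opcode value -> (operator name, format)
NAME_BY_OPCODE = {code: (name, fmt)
                  for fmt, table in OPCODES.items()
                  for name, code in table.items()}


def describe(hex_text):
    """Return (operator name, format) for a piece of object code."""
    digits = hex_text.replace(" ", "").upper()
    name, fmt = NAME_BY_OPCODE[int(digits[:2], 16)]
    expected = 4 if fmt == "RR" else 8  # RR is 2 bytes, RX is 4 bytes
    if len(digits) != expected:
        raise ValueError(fmt + " needs " + str(expected) + " hex digits")
    return name, fmt


print(describe("1A37"))       # ('AR', 'RR')
print(describe("5810 C00E"))  # ('L', 'RX')
```

Notice that in this table every RR opcode begins with the
hex digit 0 or 1, and every RX opcode begins with 4 or 5.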