This chapter is a general introduction to ARM/Thumb assembly language. It does not provide detailed coverage of every instruction; descriptions of individual instructions can be found in Appendix A Instruction Summary.
Instructions can broadly be placed in one of a number of classes:
We will take a brief look at each of these in turn. Before that, let us examine capabilities that are common to different instruction classes.
5.1 Instruction set basics
There are a number of features common to all parts of the instruction set.
5.1.1 Constant and immediate values
ARM or Thumb assembly language instructions have a length of only 16 or 32 bits. This presents something of a problem: you cannot encode an arbitrary 32-bit value within the opcode.
In the ARM instruction set, because opcode bits are used to specify condition codes, the instruction itself and the registers to be used, only 12 bits are available to specify an immediate value. You have to be somewhat creative in how these 12 bits are used. Rather than enabling a constant in the range -2048 to +2047 to be specified, the 12 bits are divided into an 8-bit constant and a 4-bit rotate value. The rotate value enables the 8-bit constant to be rotated right by a number of places from 0 to 30 in steps of 2, that is, 0, 2, 4, 6, 8...
So, you can have immediate values like 0x23. You can also produce other useful immediate values, for example, addresses of peripherals or blocks of memory. As an example, 0x23000000 can be produced by expressing it as 0x23 ROR 8. But many other constants, like 0x3FF, cannot be produced within a single instruction. For these values, you must either construct them in multiple instructions, or load them from memory. Programmers do not typically concern themselves with this, except where the assembler gives an error complaining about an invalid constant. Instead, you can use assembly language pseudo-instructions to do whatever is necessary to generate the required constant.
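The 8-bit-plus-rotate rule described above can be checked mechanically. The following is a minimal sketch in Python, not part of any ARM tool: the names rotate_right and encode_arm_immediate are illustrative assumptions. It tries each even rotation in turn and reports the 8-bit constant and rotate amount if one exists.

```python
MASK32 = 0xFFFFFFFF


def rotate_right(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def encode_arm_immediate(value):
    """Return (imm8, rotation) if value is an 8-bit constant rotated right
    by an even amount from 0 to 30, otherwise None."""
    value &= MASK32
    for rotation in range(0, 32, 2):
        # Rotating right by (32 - rotation) is a left rotation by `rotation`,
        # which undoes a right rotation of the same size.
        imm8 = rotate_right(value, 32 - rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None


print(encode_arm_immediate(0x23))        # fits with no rotation
print(encode_arm_immediate(0x23000000))  # fits as 0x23 ROR 8
print(encode_arm_immediate(0x3FF))       # ten set bits, does not fit
```

The sketch returns the smallest rotation that works. An assembler may choose a different but equivalent encoding when more than one exists.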
Constant values encoded in an instruction can be one of the following in Thumb:
a constant that can be produced by rotating an 8-bit value by any even number of bits within a 32-bit word
a constant of the form 0x00XY00XY
a constant of the form 0xXY00XY00
a constant of the form 0xXYXYXYXY
where XY is a hexadecimal number in the range 0x00 to 0xFF.
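The three replicated-byte forms in the list above are easy to recognize programmatically. The sketch below is illustrative only: the function name thumb_replicated_form is an assumption, and it deliberately covers just the 0x00XY00XY, 0xXY00XY00 and 0xXYXYXYXY patterns, not the rotated 8-bit form.

```python
def thumb_replicated_form(value):
    """Return the name of the replicated-byte pattern that `value` matches,
    or None if it matches none of the three patterns."""
    value &= 0xFFFFFFFF
    low_byte = value & 0xFF
    second_byte = (value >> 8) & 0xFF

    # Multiplying a byte by these masks copies it into the marked positions.
    if value == low_byte * 0x00010001:
        return "0x00XY00XY"
    if value == second_byte * 0x01000100:
        return "0xXY00XY00"
    if value == low_byte * 0x01010101:
        return "0xXYXYXYXY"
    return None


print(thumb_replicated_form(0x00FF00FF))
print(thumb_replicated_form(0xAB00AB00))
print(thumb_replicated_form(0x12121212))
print(thumb_replicated_form(0x12345678))
```

A plain 8-bit value such as 0x23 returns None here because it belongs to the first entry in the list rather than to a replicated pattern.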
The MOVW instruction (move wide) will move a 16-bit constant into a register, while zeroing the top 16 bits of the target register. MOVT (move top) will move a 16-bit constant into the top half of a given register, without changing the bottom 16 bits. This permits a MOV32 pseudo-instruction that is able to construct any 32-bit constant. The assembler provides some more help here. The prefixes :upper16: and :lower16: enable you to extract the corresponding half from a 32-bit constant:
MOVW R0, #:lower16:label
MOVT R0, #:upper16:label
Although this requires two instructions, it does not require any extra space to store the constant, and there is no requirement to read a data item from memory.
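The effect of the MOVW and MOVT pair on a register can be modelled in a few lines. This is a sketch under stated assumptions: the helpers lower16, upper16, movw, movt and mov32 are hypothetical Python functions that mimic the behavior described above, with a register represented as a plain integer.

```python
def lower16(value):
    """Bottom half of a 32-bit constant, as :lower16: would extract it."""
    return value & 0xFFFF


def upper16(value):
    """Top half of a 32-bit constant, as :upper16: would extract it."""
    return (value >> 16) & 0xFFFF


def movw(imm16):
    """Model of MOVW: write the low 16 bits and zero the top 16 bits."""
    return imm16 & 0xFFFF


def movt(register, imm16):
    """Model of MOVT: write the top 16 bits and keep the bottom 16 bits."""
    return ((imm16 & 0xFFFF) << 16) | (register & 0xFFFF)


def mov32(constant):
    """Build any 32-bit constant with one MOVW followed by one MOVT."""
    register = movw(lower16(constant))
    return movt(register, upper16(constant))


print(hex(mov32(0x12345678)))
```

The order matters in this model just as it does on hardware: running MOVW second would clear the top half that MOVT had just written.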
You can also use the pseudo-instructions LDR Rn, =<constant> or LDR Rn, =label. This was the only option for older processors that lacked MOVW and MOVT. The assembler will then use the best sequence to generate the constant in the specified register (one of MOV, MVN or an LDR from a literal pool). A literal pool is an area of constant data held within the code section, typically after the end of a function and before the start of another. If it is necessary to manually control literal pool placement, this can be done with an assembler directive: LTORG for armasm, or .ltorg when using GNU tools. The register loaded could be the program counter, which would cause a branch.
This can be useful for absolute addressing or for references outside the current section; obviously this will result in position-dependent code. The value of the constant can be determined either by the assembler, or by the linker.
ARM tools also provide the related pseudo-instruction ADR Rn, label. This uses a PC-relative ADD or SUB to place the address of the label into the specified register, using a single instruction. If the address is too far away to be generated this way, the ADRL pseudo-instruction is used. This requires two instructions, but gives a better range. This can be used to generate addresses for position-independent code, but only within the same code section.
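To make the range limit of ADR concrete, the sketch below models the choice between a PC-relative ADD and SUB in ARM state. It rests on two assumptions stated here and in the comments: that the PC reads as the address of the current instruction plus 8 in ARM state, and that the offset must fit the 8-bit-plus-rotate immediate form described in 5.1.1. The function names are hypothetical, and Thumb and ADRL are not modelled.

```python
MASK32 = 0xFFFFFFFF


def fits_arm_immediate(value):
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= MASK32
    for rotation in range(0, 32, 2):
        # Rotate left by `rotation` to undo a right rotation of the same size.
        candidate = ((value << rotation) | (value >> (32 - rotation))) & MASK32
        if candidate <= 0xFF:
            return True
    return False


def adr_operation(instruction_address, label_address):
    """Return ("ADD" or "SUB", offset) for a single-instruction ADR,
    or None if the label is out of range for one instruction."""
    # Assumption: in ARM state the PC reads as the instruction address + 8.
    pc_value = instruction_address + 8
    offset = label_address - pc_value
    if offset >= 0 and fits_arm_immediate(offset):
        return "ADD", offset
    if offset < 0 and fits_arm_immediate(-offset):
        return "SUB", -offset
    return None


print(adr_operation(0x8000, 0x8010))            # label just ahead
print(adr_operation(0x8000, 0x7F00))            # label behind
print(adr_operation(0x8000, 0x8008 + 0x12345))  # offset not encodable
```

When the function returns None, the offset cannot be expressed in a single ADD or SUB, which is the situation in which the two-instruction ADRL form is needed.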
5.1.2 Conditional execution
A feature of the ARM instruction set is that nearly all instructions can be conditional. On most other architectures, only branches or jumps can be executed conditionally. This can be useful in avoiding conditional branches in small if/then/else constructs or for compound comparisons.
As an example of this, consider code to find the smaller of two values, in registers R0 and R1, and place the result in R2:
@ Code using branches
CMP R0, R1
BLT .Lsmaller @ if R0<R1 jump over
MOV R2, R1 @ R1 is less than or equal to R0
B .Lend @ finish
.Lsmaller:
MOV R2, R0 @ R0 is less than R1
.Lend:
Now look at the same code written using conditional MOV instructions, rather than branches.