Computer Architecture Lab/WS2007/OhHaHeTa/ThreeISAs

Intel i960
The i960 (a.k.a. 80960) architecture originates from BiiN, a joint venture of Intel and Siemens (1989-1990) to design a fast, fault-tolerant, object-oriented computer system. Because no market could be found for it, the project was abandoned. Intel later built various versions of the CPU for (high-end) embedded applications, which were quite successful until about 2000.

The i960 is a RISC-based design with some special extensions not found in many other embedded processors (see below). The embedded versions do not share all of the instructions implemented in the original BiiN processor (i960MX). Some don't have an MMU or FPU, or they lack several instructions (e.g. atmod), depending on the intended market.

Registers and Memory
 * 32b/4GB physical address space
 * 32b/4GB virtual address space for each process, which can be partitioned further into domains with enforced object-based protection

Data in memory and in registers is stored in little endian (least significant byte at base address of an object/register), although there are big endian versions as well (i960CA and CF).

There are 16 general-purpose 32b registers (g0...g15, with g15 reserved for the frame pointer), 4 80b floating-point registers (fp0...fp3) and 3 32b control registers (arithmetic (condition code etc.), process and trace). The processor provides a fresh set of 16 local registers (r0...r15) after every call, if possible without spilling the old register values to main memory. This method of increasing performance is known as register windowing. The number of register sets depends on the implementation; it is 4 for the i960MX processor.
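
The interplay between calls, returns and spilling can be illustrated with a toy model. The following is only a sketch: the class name, the "spill the oldest set first" policy and the "reload on demand" policy are assumptions made for illustration, not a description of the actual hardware.

```python
class LocalRegisterCache:
    """Toy model of on-chip local register sets (r0..r15) with spilling."""

    def __init__(self, num_sets=4):
        self.num_sets = num_sets   # 4 on the i960MX according to the text
        self.on_chip = []          # register sets currently held on chip
        self.spilled = []          # register sets written out to (simulated) memory

    def call(self):
        # A call hands the callee a fresh set of 16 local registers.
        if len(self.on_chip) == self.num_sets:
            # No free set left: the oldest set is spilled (assumed policy).
            self.spilled.append(self.on_chip.pop(0))
        self.on_chip.append([0] * 16)

    def ret(self):
        # A return discards the callee's set and exposes the caller's again.
        self.on_chip.pop()
        if not self.on_chip and self.spilled:
            # The caller's set was spilled earlier, so it is reloaded.
            self.on_chip.append(self.spilled.pop())

    @property
    def current(self):
        # The register set visible to the running procedure.
        return self.on_chip[-1]
```

With four sets, the first four nested calls cause no memory traffic in this model; only the fifth one forces a spill.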

Data Types

 * signed and unsigned integer (8, 16, 32, 64 bits)
 * real (32, 64, 80 bits; if FPU present)
 * ASCII encoded decimal digits(!)
 * bits, bit-strings (consecutive bits) (in a register only)
 * byte strings (contiguous sequence of bytes (in memory only))
 * triple and quadwords (96 and 128 bits)
 * literals (0-31: 5b; +0.0, +1.0 in FPU instructions)

Other Integral Parts/Features
Pipelining is aided by:
 * instruction cache (512B in BiiN/i960MX)
 * Register Scoreboarding
 * write buffering

Instruction Set
There are 4 types of instruction encoding (REG, COBR, CTRL and MEM), where the MEM format exists in two variants (MEMA and MEMB). All instructions are word aligned (on 32b/4B boundaries), and all except the second MEM variant are 4B long.

All instructions can be easily distinguished by the opcode located in their highest byte, except for the REG instructions. These make up the majority of i960 instructions, which is why they need further opcode bits outside the highest byte to be differentiated. They use only values from registers, or literals (in which case m1/m2 is set), as operands. Instructions in the COBR format are primarily the compare-and-branch instructions: source_1 can be a literal or a register, source_2 is a register, and the displacement is used to jump to IP + 4*displ. when the branch is taken. The CTRL format collects the branches that need only a target address. MEMA and MEMB operations are distinguished by bit 12, where 0 encodes the MEMA format. They compute memory addresses and cover load, store and lda as well as some other instructions.
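
The decoding steps described above can be sketched as follows. This is a minimal illustration, not a decoder taken from any manual: the function names are hypothetical, and the opcode ranges used to tell the formats apart are an assumption that should be checked against the processor documentation. Only the position of the opcode (highest byte), the MEMA/MEMB selection by bit 12 and the target computation IP + 4*displ. follow the text.

```python
def classify(word):
    """Return the format name of a 32-bit instruction word."""
    opcode = (word >> 24) & 0xFF       # the opcode sits in the highest byte
    top = opcode >> 4                  # upper nibble, used for a coarse split
    # ASSUMPTION: the ranges below are illustrative, not authoritative.
    if top in (0x0, 0x1):
        return "CTRL"
    if top in (0x2, 0x3):
        return "COBR"
    if top in (0x4, 0x5, 0x6, 0x7):
        return "REG"
    # MEM format: bit 12 selects the variant, 0 encodes MEMA.
    return "MEMB" if (word >> 12) & 1 else "MEMA"


def branch_target(ip, displacement):
    """Target of a taken branch: IP + 4 * displacement, wrapped to 32 bits."""
    return (ip + 4 * displacement) & 0xFFFFFFFF
```

For example, `branch_target(0x1000, -2)` yields `0xFF8`, i.e. two instruction words before the branch.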

Data Movement

 * loading and storing bytes, shorts and (double-, triple-, quad-)words from/to memory with automatic sign and zero extension. Certain register alignment rules need to be followed. Real numbers need to be transferred to integer registers before loading/storing them.
 * moving (double-, triple-, quad-)words between registers
 * special commands for all the above operations to be used with virtual addressing.
 * lda to load big constants immediately or from memory.
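
What "automatic sign and zero extending" means for a byte or short load can be shown with a small sketch. The helper names are hypothetical and Python integers stand in for 32b registers.

```python
def zero_extend(value, bits):
    """Keep the low `bits` bits; all upper bits of the result are 0."""
    return value & ((1 << bits) - 1)


def sign_extend(value, bits):
    """Interpret the low `bits` bits as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    # Flipping the sign bit and subtracting it again propagates the sign.
    return (value ^ sign) - sign
```

Loading the byte 0xFF as an unsigned value gives 255, while loading it as a signed value gives -1.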

Arithmetic

 * add, subtract, multiply, divide, remainder with signed and unsigned integers.
 * add, subtract w/ carry with unsigned integers.
 * extended multiply and divide with unsigned longs
 * modulo with signed integer
 * shift left, right with signed and unsigned integers
 * rotate left with unsigned
 * shift right dividing integer (shrdi): unlike a plain arithmetic right shift, the result equals a division by a power of two even for negative values
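
The difference between a plain arithmetic right shift and the "dividing" variant only shows up for negative operands. A minimal sketch, with hypothetical function names and assuming that "dividing" means rounding toward zero as in ordinary integer division:

```python
def shift_right_plain(value, count):
    """Arithmetic right shift: rounds toward negative infinity."""
    return value >> count


def shift_right_dividing(value, count):
    """Right shift that rounds toward zero, like dividing by 2**count."""
    quotient = value >> count
    # For negative values with discarded non-zero bits, correct the rounding.
    if value < 0 and value & ((1 << count) - 1):
        quotient += 1
    return quotient
```

For -7 shifted by one position, the plain shift gives -4, whereas the dividing shift gives -3, which matches -7 / 2 truncated.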

Bit, Bit Field and Byte String

 * set, clear, toggle(notbit) a bit
 * chkbit sets the condition code according to the bit
 * alterbit sets the bit according to the condition code
 * find the most significant set or clear bit
 * extract converts a bit field into an unsigned integer (== shift + zero fill)
 * modify copies the masked contents of one register into another
 * movstr moves a byte string in memory (there is a fast mode and a non-overwriting mode for overlapping locations).
 * fill copies an ordinal repeatedly into a byte string.
 * cmpstr checks if two strings are equal
 * scanbyte checks if any two corresponding bytes are equal.
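
The bit-field operations extract and modify from the list above can be written down in a few lines. This is a sketch with hypothetical Python signatures; the operand order of the real instructions is not taken from the text.

```python
MASK32 = 0xFFFFFFFF


def extract(value, bitpos, length):
    """Bit field -> unsigned integer: shift right, then fill with zeros."""
    return (value >> bitpos) & ((1 << length) - 1)


def modify(mask, src, dst):
    """Copy the bits of src selected by mask into dst, keep the rest of dst."""
    return ((src & mask) | (dst & ~mask)) & MASK32
```

For example, extracting 8 bits starting at bit 8 from 0xABCD1234 gives 0x12, and modifying 0xAAAAAAAA with source 0x12345678 under mask 0x0000FF00 gives 0xAAAA56AA.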

Comparison

 * cmpi, cmpo compare two signed or unsigned integers
 * concmpi, concmpo are similar to the instructions above, but check the condition code before comparing. They can be used to optimize two-sided checks (A >= x >= B).
 * compare and increment/decrement, designed for checks in loops.
 * test instructions match the condition code against one of several masks and store 0 or 1 in the destination register.
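
How a conditional compare helps with a two-sided check can be modelled as follows. This is a simplified sketch under explicit assumptions: the 3-bit condition-code encoding (less, equal, greater), the rule that the conditional compare is skipped when the "less" bit is set, and the operand order are all assumptions for illustration and should be verified against the processor manual.

```python
# ASSUMED condition-code encoding, one bit per outcome.
LESS, EQUAL, GREATER = 0b100, 0b010, 0b001


def compare(a, b):
    """Plain compare: returns a new condition code."""
    if a < b:
        return LESS
    return EQUAL if a == b else GREATER


def conditional_compare(cc, a, b):
    """Compare only if the previous result was not 'less' (assumed rule)."""
    if cc & LESS:
        return cc                      # keep the earlier verdict
    return EQUAL if a <= b else GREATER


def in_range(x, low, high):
    """Two-sided check low <= x <= high with a single final test."""
    cc = compare(x, low)               # LESS means x is below the range
    cc = conditional_compare(cc, x, high)
    return cc == EQUAL                 # EQUAL now means 'inside the range'
```

In this model the final condition code is "less" below the range, "equal" inside it and "greater" above it, so a single conditional branch after the two compares is sufficient.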

Branches
Most of the branch instructions specify the target IP with a signed displacement to be added to the current IP. Extended branch instructions specify a memory address which contains the target IP using one of the addressing modes described above.

unconditional branches
 * b, bx jump to the specified IP
 * bal, balx "branch and link", used for an alternative implementation of procedure calls

conditional branches
These test the condition code and jump iff it "matches" according to the instruction. The following matches are possible: [not] equal, less [or equal], greater [or equal].

compare and branch
These instructions test two (un)signed integers and branch only if they "match" (same matches as above). bbs and bbc check a bit and branch if it is set/clear.

Call/Return
Calls and returns automatically save and restore the local registers and set up the stack frames.

FPU
Typical floating point operations like add, subtract, multiply, divide, remainder, square root, comparing and conversion to/from integer and to different floating-point types are supported in some incarnations of i960.

Other
Debugging
To aid debugging, exceptions can be triggered with explicit commands (mark, fmark) or by enabling generic trace faults per process, which fire, for example, when a return instruction is executed. The latter are controlled through the trace-control word, which can be modified with modtc. Another possibility to trigger faults is to use comparing instructions like "fault if equal".

Atomic
There are three instructions to do an atomic read-modify-write operation:
 * atadd : adds the operand to the value in memory
 * atmod : replaces the value at the destination with the source value where the mask bits are set.
 * atrep : like atmod but without a mask.
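
The effect of atadd and atmod on a memory word can be sketched in software. This is only a model: the class, the method signatures and the choice to return the old value are assumptions, and a lock stands in for whatever mechanism the hardware uses to make the read-modify-write indivisible.

```python
import threading

MASK32 = 0xFFFFFFFF


class Memory:
    """Word-addressed toy memory with atomic read-modify-write helpers."""

    def __init__(self):
        self.words = {}
        self._lock = threading.Lock()   # stands in for the hardware lock

    def atadd(self, addr, operand):
        """Add operand to the word at addr; returns the old value (assumed)."""
        with self._lock:
            old = self.words.get(addr, 0)
            self.words[addr] = (old + operand) & MASK32
            return old

    def atmod(self, addr, mask, src):
        """Replace the bits selected by mask with the matching bits of src."""
        with self._lock:
            old = self.words.get(addr, 0)
            self.words[addr] = ((src & mask) | (old & ~mask)) & MASK32
            return old
```

With the word 0xFFFF0000 in memory, an atmod with mask 0x00FF00FF and source 0x12345678 leaves 0xFF340078 behind.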

Process Management
The ISA supports:
 * hardware scheduling, where a process can be added to a dispatching queue with 32 priority levels and round-robin decisions within one priority class (schedprcs)
 * saving the current state of a process to memory (saveprcs)
 * switching over to another process (resumprcs)
 * interprocess communication in form of semaphores (wait, condwait, signal) and message passing similar to (but simpler than) unix message queues (receive, condrec, send, sendserv)
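
The scheduling policy named above (32 priority levels, round robin within a level) can be illustrated with a small queue model. This is a sketch, not the hardware algorithm: the class and method names are hypothetical, and it assumes that a higher number means a higher priority.

```python
from collections import deque


class DispatchQueue:
    """Toy dispatching queue: 32 priority levels, FIFO within each level."""

    LEVELS = 32

    def __init__(self):
        self.queues = [deque() for _ in range(self.LEVELS)]

    def schedule(self, process, priority):
        # Appending at the tail gives round-robin order within one level.
        self.queues[priority].append(process)

    def dispatch(self):
        # Pick the first process of the highest non-empty level
        # (ASSUMPTION: larger number = higher priority).
        for priority in range(self.LEVELS - 1, -1, -1):
            if self.queues[priority]:
                return self.queues[priority].popleft(), priority
        return None
```

A process whose time slice ends would be passed to `schedule` again and thereby moves behind all other processes of the same priority.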


Decimal
 * dmovt: moves a decimal from one register to another and checks if it really is a decimal
 * daddc, dsubc: add/subtract decimals w/ carry.
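
How digit-wise addition with carry works on ASCII-encoded decimal digits can be sketched as follows. The function names merely mirror the mnemonics; the exact register-level behaviour of the real instructions (for instance what happens for invalid digits) is not modelled and any detail beyond the list above is an assumption.

```python
def is_decimal(byte):
    """True if the byte is an ASCII-encoded decimal digit ('0'..'9')."""
    return 0x30 <= byte <= 0x39


def daddc(a, b, carry):
    """Add two ASCII digits plus carry; returns (ASCII digit, carry out)."""
    total = (a & 0x0F) + (b & 0x0F) + carry
    carry_out = 1 if total > 9 else 0
    digit = total - 10 if carry_out else total
    return 0x30 | digit, carry_out


def add_decimal(x, y):
    """Add two equal-length decimal strings digit by digit, right to left."""
    carry = 0
    out = []
    for a, b in zip(reversed(x.encode("ascii")), reversed(y.encode("ascii"))):
        digit, carry = daddc(a, b, carry)
        out.append(chr(digit))
    return "".join(reversed(out)), carry
```

Adding "0199" and "0001" this way produces "0200" with no final carry, and the carry produced by one digit position is simply fed into the next.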