Computer Architecture Lab/WS2007/SHWH/Processor Comparison

If you prefer to print or download this document, the same content is also available as a [[media:SHWH_ProcessorComparison.pdf|PDF version]].

MOS Technology 6502

History
The MOS Technology 6502 is an 8-bit microprocessor that was designed by Chuck Peddle for MOS Technology in 1975. It has roughly 3,500 transistors (about 4,500 when the depletion-mode pull-up transistors are also counted).

The internal logic runs at the same speed as the external clock. Despite its slow clock rates of between 20 kHz and 2 MHz, the 6502 was competitive with other CPUs that used significantly faster clocks. This is partly because its simple state machine is implemented in combinational logic to a greater extent than in many other designs, so the two-phase clock can control the whole machine cycle directly.

When the 6502 was introduced, it was the least expensive full-featured CPU on the market by a considerable margin, costing less than one-sixth the price of competing designs from larger companies such as Motorola and Intel.

Its main competitor was the Zilog Z80.

The 6502 was the first microprocessor with a one-step instruction pipeline: while one instruction is still being executed, the next instruction can already be fetched.

One of the first uses for the design was the Apple I computer. The 6502 was next used in the Apple II, and the Commodore PET. It was later used in the Atari home computers, the BBC Micro family, and a huge number of other designs. Bender, a fictional android "industrial robot" and a main character in the animated TV series Futurama, was revealed to have a 6502 as his "brain".

Registers
The 6502 has very few registers: an 8-bit accumulator (A), two 8-bit index registers (X and Y), an 8-bit stack pointer (S), an 8-bit status register (P), and a 16-bit program counter (PC).

The accumulator is the main register for arithmetic and logic operations. Unlike the index registers X and Y, it has a direct connection to the Arithmetic and Logic Unit (ALU).

The stack occupies the addresses 0x0100 to 0x01FF. The stack pointer (S) is an 8-bit offset into this stack page. In other words, whenever a byte is pushed onto the stack, it is stored at the address 0x0100+S, and S is then decremented, so the stack grows downward.
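The push behavior can be sketched in a few lines of Python. This is a minimal model rather than an emulator: the class name `Stack6502`, the flat `memory` array, and the starting value 0xFF for S are assumptions made for illustration (on real hardware, software initializes S). The sketch also assumes the usual 6502 convention that S is decremented after a push and incremented before a pull.

```python
STACK_PAGE = 0x0100  # the stack lives in 0x0100..0x01FF


class Stack6502:
    """Toy model of the 6502 stack (hypothetical helper, not a full CPU)."""

    def __init__(self):
        self.memory = bytearray(0x10000)  # 64 KB address space
        self.s = 0xFF  # assumed initial stack pointer

    def push(self, value):
        # Store at 0x0100 + S, then decrement S (wrapping within 8 bits).
        self.memory[STACK_PAGE + self.s] = value & 0xFF
        self.s = (self.s - 1) & 0xFF

    def pull(self):
        # Increment S first, then read the byte at 0x0100 + S.
        self.s = (self.s + 1) & 0xFF
        return self.memory[STACK_PAGE + self.s]
```

Because S is only 8 bits wide, the stack can never leave its 256-byte page; pushing more than 256 bytes simply wraps around and overwrites older entries.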

The bits of the status register are called flags. From bit 7 down to bit 0 they are Negative, Overflow, Unused, Break, Decimal mode, Interrupt disable, Zero, and Carry.
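As a rough illustration, the flags can be written as bit masks, assuming the list above runs from bit 7 (Negative) down to bit 0 (Carry). The constant names and the helper `set_zero_negative` are hypothetical; the helper shows how a result byte would typically update the Zero and Negative flags.

```python
# Bit masks for the status register, bit 0 (Carry) up to bit 7 (Negative).
CARRY = 0x01
ZERO = 0x02
INTERRUPT_DISABLE = 0x04
DECIMAL = 0x08
BREAK = 0x10
UNUSED = 0x20
OVERFLOW = 0x40
NEGATIVE = 0x80


def set_zero_negative(status, value):
    """Return the status byte with Zero and Negative updated for value."""
    status &= ~(ZERO | NEGATIVE) & 0xFF  # clear both flags first
    if (value & 0xFF) == 0:
        status |= ZERO
    if value & 0x80:  # bit 7 of the result is the sign bit
        status |= NEGATIVE
    return status
```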

Instruction Set
The 6502 microprocessor uses a variable-length instruction encoding (one to three bytes per instruction), and its byte order is little-endian. Every instruction needs between 2 and 7 clock cycles. The instruction set comprises 56 instructions.
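The following sketch illustrates both properties on a tiny byte sequence. The length table covers only three opcodes (NOP, LDA with an immediate operand, and JMP with an absolute address) and is nowhere near a complete decoder; the function names are hypothetical.

```python
# Instruction lengths in bytes for three well-known opcodes.
INSTRUCTION_LENGTH = {
    0xEA: 1,  # NOP
    0xA9: 2,  # LDA #immediate
    0x4C: 3,  # JMP absolute
}


def read_word_le(memory, address):
    """Read a 16-bit little-endian value: low byte first, then high byte."""
    return memory[address] | (memory[address + 1] << 8)


def split_instructions(code):
    """Split a byte sequence into instructions using the length table."""
    instructions = []
    pc = 0
    while pc < len(code):
        length = INSTRUCTION_LENGTH[code[pc]]
        instructions.append(bytes(code[pc:pc + length]))
        pc += length
    return instructions


# LDA #$05 ; JMP $1234 ; NOP
program = bytes([0xA9, 0x05, 0x4C, 0x34, 0x12, 0xEA])
```

Here `read_word_le(program, 3)` reads the two operand bytes of the JMP instruction, 0x34 followed by 0x12, as the address 0x1234.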

Addressing Modes
The chip uses its index and stack registers effectively through several addressing modes, including a fast "direct page" or "zero page" mode that accesses memory locations 0 to 255 with a single 8-bit address. Altogether the 6502 has 13 addressing modes.
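To make the difference between zero-page and full 16-bit addressing concrete, here is a small sketch of two indexed modes. The function names are hypothetical, and only the effective-address arithmetic is modeled: in the zero-page indexed mode the sum wraps around within the first 256 bytes, while the absolute indexed mode uses the whole 16-bit address space.

```python
def zero_page_x(operand, x):
    """Effective address for zero page,X: the sum stays within 0x00..0xFF."""
    return (operand + x) & 0xFF


def absolute_x(low, high, x):
    """Effective address for absolute,X: 16-bit base (little-endian) plus X."""
    base = (high << 8) | low
    return (base + x) & 0xFFFF
```

A zero-page instruction needs only one operand byte instead of two, which is what makes this mode shorter and faster than absolute addressing.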

The 6502 has a 64 KByte address space.

Arithmetic Instructions
The available instructions cover addition and subtraction, bitwise logic operations (AND, OR, exclusive OR), and rotate/shift operations.
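As an example of the arithmetic group, the sketch below models an 8-bit add with carry in binary mode. The function name `adc` and the tuple it returns are assumptions made for illustration; decimal mode is deliberately left out.

```python
def adc(a, operand, carry_in):
    """Add operand and carry to a; return (result, carry_out, overflow)."""
    total = a + operand + carry_in
    result = total & 0xFF
    carry_out = total > 0xFF  # unsigned overflow out of bit 7
    # Signed overflow: both inputs share a sign that differs from the result.
    overflow = bool(~(a ^ operand) & (a ^ result) & 0x80)
    return result, carry_out, overflow
```

The carry output is what allows multi-byte additions: the carry from the low byte is fed into the addition of the next higher byte.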

Compare Instructions
The compare instructions set or clear three of the status flags: Carry, Zero, and Negative.

The three types of compare instructions are CMP (Compare Memory and Accumulator), CPX (Compare Memory and Index X), and CPY (Compare Memory and Index Y).
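A compare can be thought of as a subtraction whose result is discarded and only the flags are kept. The sketch below assumes that view; the function name `compare` and the dictionary it returns are hypothetical.

```python
def compare(register, memory_value):
    """Return the Carry, Zero, and Negative flags of register - memory_value."""
    difference = (register - memory_value) & 0xFF
    return {
        "carry": register >= memory_value,  # no borrow was needed
        "zero": register == memory_value,
        "negative": bool(difference & 0x80),  # bit 7 of the difference
    }
```

The same function covers CMP, CPX, and CPY; they differ only in which register supplies the first argument.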

Register Instructions
These instructions include load and store, register transfer, flag operations, and push/pull instructions.

Jump Instructions
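The 6502 has conditional branches, each of which tests a single status flag (for example BEQ, BNE, BCC, and BCS), and unconditional jumps (JMP, and JSR for subroutine calls).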
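Conditional branches on the 6502 are two bytes long and carry a signed 8-bit offset relative to the address of the following instruction. The sketch below computes the branch target under that assumption; the function name is hypothetical.

```python
def branch_target(pc, offset_byte):
    """pc is the address of the branch opcode; offset_byte is its operand."""
    # Interpret the operand as a signed 8-bit value (-128..127).
    offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    # The offset is relative to the instruction after the 2-byte branch.
    return (pc + 2 + offset) & 0xFFFF
```

For example, an offset byte of 0xFE (that is, -2) makes a branch jump back to itself.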

Interrupt Instructions
There are exactly four instructions in this group: SEI, CLI, BRK, and NOP.

Pins
The 6502 has a 16-bit address bus and an 8-bit data bus. Because memory was faster than the CPU at the time, it made sense to optimize the CPU for memory access.

Altogether the 6502 microprocessor has 40 pins.

DLX

History
The DLX processor is a RISC processor designed by John Hennessy and David Patterson, with the main objective of providing a fully pipelined processor for pedagogical purposes. It was first described in their textbook "Computer Architecture: A Quantitative Approach" (first edition, 1990). The pipeline of the DLX processor has 5 stages, namely fetch, decode, execute, memory, and writeback.

This processor is a load/store machine that emphasizes a simple, easily decoded instruction set, a design for pipelining efficiency, and efficiency as a compiler target.
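The five stages can be visualized with a small scheduling sketch. It assumes an ideal pipeline with no stalls or hazards, which a real program would not always achieve; the function name `pipeline_schedule` is hypothetical.

```python
STAGES = ["fetch", "decode", "execute", "memory", "writeback"]


def pipeline_schedule(instruction_count):
    """Return one row per instruction; row[c] is its stage in cycle c."""
    total_cycles = instruction_count + len(STAGES) - 1
    schedule = []
    for i in range(instruction_count):
        row = [None] * total_cycles  # None means "not in the pipeline"
        for s, stage in enumerate(STAGES):
            row[i + s] = stage  # instruction i enters the pipeline in cycle i
        schedule.append(row)
    return schedule
```

Under these assumptions, n instructions finish in n + 4 cycles instead of 5n, because a new instruction enters the pipeline every cycle.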

Registers
The DLX processor has 32 general-purpose registers (GPRs), each of which is 32 bits long. Register r0 is a special register that always has the value 0.
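The special role of r0 is easy to express in code. The class below is a minimal sketch of such a register file; the class and method names are hypothetical, and writes to r0 are simply ignored.

```python
class RegisterFile:
    """Toy model of 32 general-purpose 32-bit registers with r0 fixed at 0."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        if index != 0:  # r0 always reads as 0, so writes to it are dropped
            self._regs[index] = value & 0xFFFFFFFF  # keep 32 bits
```

A constant zero register is convenient because it allows, for instance, a register copy to be expressed as an addition with r0.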

Furthermore, the processor has 32 floating-point registers (FPRs), which can be used as 32 single-precision 32-bit registers or as 16 even-odd pairs holding double-precision values.

Last but not least, the DLX processor has a 32-bit program counter (PC) and 31 special-purpose registers.

Instruction Set
The DLX processor has a fixed-length instruction encoding: each instruction is encoded in a 32-bit word. It uses a big-endian addressing scheme. There are 3 instruction formats, namely I-type, R-type, and J-type. The I-type format is generally used for arithmetic and logic instructions that have an immediate operand, and for branch instructions. The R-type format is used for arithmetic and logic instructions that operate entirely on data in registers. The J-type format is used for unconditional jump instructions.
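The three formats can be decoded with a handful of shifts and masks. The sketch below assumes the field layout commonly given for DLX, which is not spelled out in this text: a 6-bit opcode in the top bits, followed by two 5-bit registers and a 16-bit immediate (I-type), three 5-bit registers and an 11-bit function field (R-type), or a 26-bit offset (J-type). The function names and the opcode numbers used in any example are hypothetical.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value ^ sign_bit) - sign_bit


def decode_i(word):
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs1": (word >> 21) & 0x1F,
        "rd": (word >> 16) & 0x1F,
        "immediate": sign_extend(word & 0xFFFF, 16),
    }


def decode_r(word):
    return {
        "opcode": (word >> 26) & 0x3F,
        "rs1": (word >> 21) & 0x1F,
        "rs2": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "func": word & 0x7FF,
    }


def decode_j(word):
    return {
        "opcode": (word >> 26) & 0x3F,
        "offset": sign_extend(word & 0x3FFFFFF, 26),
    }
```

Because the opcode always sits in the same six bits, a decoder can determine the format before it looks at any other field, which is part of what makes the instruction set easy to decode.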

64 basic instructions are supported by the DLX processor, though it can also support extended instructions, as long as those instructions work purely on registers.

Addressing Modes
The DLX processor has only 3 addressing modes, namely immediate, displacement, and register.
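Of the three modes, only displacement addressing involves an address calculation: the effective address is the content of a base register plus a signed displacement. The sketch below assumes 32-bit wraparound and a register list in which entry 0 holds the value 0; the function name is hypothetical.

```python
def effective_address(registers, base, displacement):
    """Displacement mode: base register content plus a signed displacement."""
    return (registers[base] + displacement) & 0xFFFFFFFF


registers = [0] * 32       # registers[0] stands for r0, which is always 0
registers[2] = 0x1000
```

With r0 as the base register, the displacement alone is the address, so absolute addressing of low memory needs no separate mode.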

The processor has a 4 GByte address space.

Data Transfer Instructions
The load and store instructions are the only means of transferring data between the CPU and memory.

Arithmetic and Logic Instructions
ALU instructions never access memory: they operate on registers (or, in the I-type format, on a register and an immediate operand) and include addition, subtraction, multiplication, division, comparison, and so on.

Control Transfer Instructions
Control transfer instructions are mainly branches and jumps. There are also instructions that deal with exceptions and interrupts.

Pins
The DLX processor has a 32-bit address bus and a 32-bit data bus. Together with clk, reset, and so on, the processor has 73 pins.

4stack

History
The 4stack processor, designed by Bernd Paysan, is still a research project aimed at high-performance, low-cost computing. It uses stack-based instructions for a four-way VLIW processor.

The 4stack processor has 4 Arithmetic Logic Units and 4 Stacks. Each stack has its own ALU. In addition to the 4 ALUs, two memory units allow parallel load and store.

Each stack stores 32-bit values, and 64-bit data is represented by two stacks. Stack instructions operate on 32-bit or 64-bit signed or unsigned integers or bit patterns, or on 32-bit single-precision or 64-bit double-precision floats. Memory instructions load and store bytes, half words, words, and double words.

Fewer than 500k transistors are required for the core, leaving more space for caches. Furthermore, the stack paradigm greatly increases instruction density. The 4stack processor encodes up to 8 operations in 64 bits.

Registers
Each stack has 10 registers, including the stack pointer and the status register. There are 4 additional global registers for special purposes. For memory access there are a further 32 registers.

Instruction Set
The 4stack processor is a 32-bit machine. Each instruction word is 64 bits long and can contain several operation fields for the independent execution units. All these operations are performed simultaneously.

The load instruction takes 2 cycles, whereas the store instruction takes only 1 cycle. In case of a cache miss, a wait instruction has to be inserted.

The instruction encoding is big-endian and has 5 main instruction formats.
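The idea of one instruction word driving four stacks at once can be illustrated with a deliberately simplified toy model. Everything in it is an assumption made for illustration: the operation names `push`, `add`, and `nop`, the tuple encoding, and the restriction that each operation touches only its own stack. It does not reflect the real 4stack opcodes or encoding.

```python
def execute_word(stacks, operations):
    """Apply one operation per stack and return the four new stacks."""
    if len(stacks) != 4 or len(operations) != 4:
        raise ValueError("expected four stacks and four operations")
    new_stacks = []
    for stack, op in zip(stacks, operations):
        stack = list(stack)  # copy, so the input stacks stay unchanged
        if op[0] == "push":
            stack.append(op[1] & 0xFFFFFFFF)  # 32-bit stack entries
        elif op[0] == "add":
            b, a = stack.pop(), stack.pop()
            stack.append((a + b) & 0xFFFFFFFF)
        elif op[0] != "nop":
            raise ValueError(f"unknown operation: {op[0]}")
        new_stacks.append(stack)
    return new_stacks
```

A single call models one instruction word: the four stack operations belong together and take effect as one step, rather than as four consecutive instructions.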

Stack Operations
Stack operations are divided into ALU operations and immediate number operations. The immediate number operations are intended to push small numbers onto the stack. The ALU operations are used for general-purpose and floating-point calculations.

Data Move Operations
Data move operations are divided into load/store, address update and immediate offset operations. An immediate offset operation in one data move field is added to the computed address in the other data move field.

Flow Control Operations
Flow control operations are divided into conditional branches, calls/jumps, counted loops, returns, and indirect calls/jumps.