Computer Architecture Lab/HOWTO

A collection of some tips for your microprocessor design

Memory
You will probably need some memory for your processor. The simplest way to start is to use the small on-chip memory blocks in the FPGA. You can instantiate the vendor-specific components (e.g. altsyncram). However, to get a vendor-independent design it is also possible to describe the memory in plain VHDL.

Note: In current FPGAs the on-chip memories cannot be used in asynchronous mode. At a minimum, the address, data input, and write enable are registered.
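To make the consequence of registered inputs concrete, here is a minimal behavioural sketch in Python (not VHDL, and not tied to any vendor's memory block). The class name SyncRom and its methods are made up for illustration. The model assumes that only the address is registered and that the output follows the registered address, which is how the ROM described in the next section behaves.

```python
class SyncRom:
    """Cycle-level model of a ROM with a registered address (illustrative only)."""

    def __init__(self, contents):
        self.contents = list(contents)
        self.areg = 0  # address register, sampled only on a rising clock edge

    @property
    def q(self):
        # The output is combinational, but it depends on the *registered* address.
        return self.contents[self.areg]

    def rising_edge(self, address):
        # The address input is captured here and nowhere else.
        self.areg = address


rom = SyncRom([0x80, 0x80, 0xC0, 0x80])
before = rom.q        # address 2 is applied, but no clock edge has occurred yet
rom.rising_edge(2)    # the edge captures the address
after = rom.q         # only now does the output show the word at address 2
print(hex(before), hex(after))
```

The point for your pipeline design is that the data for an address becomes visible only after the clock edge that captured that address, so the address has to be presented one cycle before the data is needed.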

ROM
A ROM can be described by a simple VHDL case statement:

--
-- rom.vhd
--
-- generic VHDL version of ROM
--
--     DONT edit this file!
--     it is automatically generated
--

library ieee;
use ieee.std_logic_1164.all;

entity rom is
generic (width : integer; addr_width : integer);   -- for compatibility
port (
    clk         : in std_logic;
    address     : in std_logic_vector(9 downto 0);
    q           : out std_logic_vector(7 downto 0)
);
end rom;

architecture rtl of rom is

    signal areg    : std_logic_vector(9 downto 0);
    signal data    : std_logic_vector(7 downto 0);

begin

    process(clk) begin

        if rising_edge(clk) then
            areg <= address;
        end if;

    end process;

    q <= data;

    process(areg) begin

        case areg is

            when "0000000000" => data <= "10000000";
            when "0000000001" => data <= "10000000";
            when "0000000010" => data <= "11000000";
            when "0000000011" => data <= "10000000";
            when "0000000100" => data <= "00011011";
            when "0000000101" => data <= "11000001";
            ....
            when "1111011011" => data <= "10000000";
            when "1111011100" => data <= "10000000";
            when "1111011101" => data <= "10000000";
            when "1111011110" => data <= "10000000";

            when others => data <= "00000000";
        end case;
    end process;

end rtl;

Write a small program to generate this VHDL file from your assembler output, or even let your assembler write its output as VHDL (see Jopa.java for an example).

Another tool to generate a ROM VHDL file (romgen.cpp) is available in pacman_003.zip.
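If you prefer to write your own generator, the following is a minimal sketch in Python. It is not Jopa.java or romgen.cpp, and the function names bits and rom_to_vhdl are made up for illustration. It assumes that your assembler output is already available as a list of integers, one per ROM word, starting at address 0, and it emits a file in the same shape as the rom.vhd shown above.

```python
def bits(value, width):
    """Return value as a binary string of exactly `width` digits."""
    if not 0 <= value < 2 ** width:
        raise ValueError(f"{value} does not fit into {width} bits")
    return format(value, f"0{width}b")


def rom_to_vhdl(words, addr_width=10, data_width=8):
    """Return the text of rom.vhd for a list of integer memory words."""
    if len(words) > 2 ** addr_width:
        raise ValueError("program does not fit into the ROM")
    # One 'when' branch per program word; all other addresses read as zero.
    cases = "\n".join(
        f'            when "{bits(addr, addr_width)}" => data <= "{bits(word, data_width)}";'
        for addr, word in enumerate(words)
    )
    return f"""--
-- rom.vhd
--
-- generic VHDL version of ROM
--
--     DONT edit this file!
--     it is automatically generated
--

library ieee;
use ieee.std_logic_1164.all;

entity rom is
generic (width : integer; addr_width : integer);   -- for compatibility
port (
    clk     : in std_logic;
    address : in std_logic_vector({addr_width - 1} downto 0);
    q       : out std_logic_vector({data_width - 1} downto 0)
);
end rom;

architecture rtl of rom is

    signal areg : std_logic_vector({addr_width - 1} downto 0);
    signal data : std_logic_vector({data_width - 1} downto 0);

begin

    process(clk) begin
        if rising_edge(clk) then
            areg <= address;
        end if;
    end process;

    q <= data;

    process(areg) begin
        case areg is
{cases}
            when others => data <= "{bits(0, data_width)}";
        end case;
    end process;

end rtl;
"""


if __name__ == "__main__":
    # Hypothetical program: the first six words of the listing above.
    print(rom_to_vhdl([0x80, 0x80, 0xC0, 0x80, 0x1B, 0xC1]))
```

Redirect the output to rom.vhd and add the file to your project. Reading the word list from your assembler's output file is left out here because the file format depends on your assembler.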

RAM
Even dual-ported memories can now (with careful VHDL coding) be inferred from plain VHDL, as the following example shows:

--
--  sdpram.vhd
--
--  Simple dual port ram with read and write port
--      and independent clocks
--
--  When using different clocks following warning is generated:
--      Functionality differs from the original design.
--  Read during write at the same address is undefined.
--
--  Author: Martin Schoeberl (mschoebe@mail.tuwien.ac.at)
--
--  2006-08-03  adapted from simulation only version
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sdpram is
generic (width : integer := 32; addr_width : integer := 7);
port (
    wrclk       : in std_logic;
    data        : in std_logic_vector(width-1 downto 0);
    wraddress   : in std_logic_vector(addr_width-1 downto 0);
    wren        : in std_logic;

    rdclk       : in std_logic;
    rdaddress   : in std_logic_vector(addr_width-1 downto 0);
    dout        : out std_logic_vector(width-1 downto 0)
);
end sdpram ;

architecture rtl of sdpram is

    signal reg_dout     : std_logic_vector(width-1 downto 0);

    subtype word is std_logic_vector(width-1 downto 0);
    constant nwords : integer := 2 ** addr_width;
    type ram_type is array(0 to nwords-1) of word;

    signal ram : ram_type;

begin

    process (wrclk) begin
        if rising_edge(wrclk) then
            if wren='1' then
                ram(to_integer(unsigned(wraddress))) <= data;
            end if;
        end if;
    end process;

    process (rdclk) begin
        if rising_edge(rdclk) then
            reg_dout <= ram(to_integer(unsigned(rdaddress)));
            dout <= reg_dout;
        end if;
    end process;

end rtl;

Connect a single clock to both wrclk and rdclk in your design. With different clocks the synthesis tool issues the warning quoted in the header comment.

External Memory
For the 32-bit 1 MB SRAM on the FPGA board you can use the SimpCon-compatible memory interface (sc_sram32.vhd) that is provided with JOP.

General
The XST-based RAM implementations below (sp_ram, dp_ram_1w and dp_ram) enable you to use the Xilinx Block RAMs within your design. They are based on the code from the Xilinx XST Manual. For small RAMs (e.g. an address bus width of 3) distributed RAM is used, since for such small memories it needs fewer logic cells. The code was synthesised using ISE Webpack 8.2.03i and tested on a Micromodule from Trenz Elektronik (carrying an XC3S1000-4). The test routine fills the RAM using a serial port and reads back the RAM contents. It is not included in the Wiki since it only works with additional hardware (a serial port). Additionally, the code was synthesised using Quartus II 6.0 SP1 WebPack and tested on the lab hardware.

Implementation Summary
TBD...to be done

Simple Dual-Port RAM
The following code is synthesized into on-chip memory, which is preferred for the register file.

--
--
--  This file is a part of JOP, the Java Optimized Processor
--
--  Copyright (C) 2006, Martin Schoeberl (martin@jopdesign.com)
--
--  This program is free software: you can redistribute it and/or modify
--  it under the terms of the GNU General Public License as published by
--  the Free Software Foundation, either version 3 of the License, or
--  (at your option) any later version.
--
--  This program is distributed in the hope that it will be useful,
--  but WITHOUT ANY WARRANTY; without even the implied warranty of
--  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
--  GNU General Public License for more details.
--
--  You should have received a copy of the GNU General Public License
--  along with this program. If not, see .
--

--
--  sdpram.vhd
--
--  Simple dual port ram with read and write port
--      and independent clocks
--  Read and write address, write data is registered. Output is not
--  registered. Read enable gates the read address. Is compatible
--  with SimpCon.
--
--  When using different clocks following warning is generated:
--      Functionality differs from the original design.
--  Read during write at the same address is undefined.
--
--  If read enable is used a discrete output register is synthesized.
--  Without read enable the
--
--  Author: Martin Schoeberl (martin@jopdesign.com)
--
--  2006-08-03  adapted from simulation only version
--  2008-03-02  added read enable
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sdpram is
generic (width : integer := 32; addr_width : integer := 7);
port (
    wrclk       : in std_logic;
    data        : in std_logic_vector(width-1 downto 0);
    wraddress   : in std_logic_vector(addr_width-1 downto 0);
    wren        : in std_logic;

    rdclk       : in std_logic;
    rdaddress   : in std_logic_vector(addr_width-1 downto 0);
    rden        : in std_logic;
    dout        : out std_logic_vector(width-1 downto 0)
);
end sdpram ;

architecture rtl of sdpram is

    signal reg_dout     : std_logic_vector(width-1 downto 0);

    subtype word is std_logic_vector(width-1 downto 0);
    constant nwords : integer := 2 ** addr_width;
    type ram_type is array(0 to nwords-1) of word;

    signal ram : ram_type;

begin

    process (wrclk) begin
        if rising_edge(wrclk) then
            if wren='1' then
                ram(to_integer(unsigned(wraddress))) <= data;
            end if;
        end if;
    end process;

    process (rdclk) begin
        if rising_edge(rdclk) then
            if rden='1' then
                reg_dout <= ram(to_integer(unsigned(rdaddress)));
            end if;
        end if;
    end process;

    dout <= reg_dout;

end rtl;

Single Port RAM
--
-- Filename: sp_ram.vhd
-- =========
--
-- Short Description:
-- ==================
--  Single port ram.
--
-- Description:
-- ============
--  Implementation of a single port ram.
--  Based on the ram code described in the Xilinx XST Manual.
--
--  The memory uses low active control signals. The following
--  combinations are possible:
--
--   nwr | ncs | Description
--  -----+-----+--------------
--    0  |  0  | Write access
--    1  |  0  | Read access
--    X  |  1  | Inactive
--
--  The address and data bus width can be specified using
--  the generic map. The default is eight bit data bus and
--  eight bit address bus.
--
-- Verification:
-- =============
--  Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--  Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--  Tested on a XC3S1000-4
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity sp_ram is
generic (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
);
port (
    clk : in std_logic;
    addr : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr : in std_logic;
    ncs : in std_logic
);
end sp_ram;

architecture beh of sp_ram is
    subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
    type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
    signal ram : ram_type;
begin
    process(clk) begin
        if rising_edge(clk) then
            if ncs = '0' then
                if nwr = '0' then
                    ram(to_integer(unsigned(addr))) <= data_w;
                else
                    data_r <= ram(to_integer(unsigned(addr)));
                end if;
            end if;
        end if;
    end process;
end beh;

Dual Port RAM with one Write Port
--
-- Filename: dp_ram_1w.vhd
-- =========
--
-- Short Description:
-- ==================
--  Dual port ram with only one writing port.
--
-- Description:
-- ============
--  Implementation of a dual port ram with only one writing port.
--  Based on the ram code described in the Xilinx XST Manual.
--
--  Both ports are using the same clock!
--  The memory uses low active control signals. The following
--  combinations are possible:
--
--  Port 1:
--   nwr1 | ncs1 | Description
--  ------+------+--------------
--     0  |   0  | Write access
--     1  |   0  | Read access
--     X  |   1  | Inactive
--
--  Port 2:
--   ncs2 | Description
--  ------+--------------
--     0  | Read access
--     1  | Inactive
--
--  The address and data bus width can be specified using
--  the generic map. The default is eight bit data bus and
--  eight bit address bus.
--
-- Verification:
-- =============
--  Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--  Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dp_ram_1w is
generic (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
);
port (
    clk : in std_logic;
    addr1 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w1 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r1 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr1 : in std_logic;
    ncs1 : in std_logic;
    addr2 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_r2 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    ncs2 : in std_logic
);
end dp_ram_1w;

architecture beh of dp_ram_1w is
    subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
    type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
    signal ram : ram_type;
begin
    -- Port 1: read/write
    process(clk) begin
        if rising_edge(clk) then
            if ncs1 = '0' then
                if nwr1 = '0' then
                    ram(to_integer(unsigned(addr1))) <= data_w1;
                else
                    data_r1 <= ram(to_integer(unsigned(addr1)));
                end if;
            end if;
        end if;
    end process;

    -- Port 2: read only
    process(clk) begin
        if rising_edge(clk) then
            if ncs2 = '0' then
                data_r2 <= ram(to_integer(unsigned(addr2)));
            end if;
        end if;
    end process;
end beh;

Dual Port RAM with two Write Ports
--
-- Filename: dp_ram.vhd
-- =========
--
-- Short Description:
-- ==================
--  Dual port ram with two writing ports.
--
-- Description:
-- ============
--  Implementation of a dual port ram with two writing ports.
--  Based on the ram code described in the Xilinx XST Manual.
--
--  Both ports are using the same clock!
--  The memory uses low active control signals. The following
--  combinations are possible:
--
--   nwrx | ncsx | Description
--  ------+------+--------------
--     0  |   0  | Write access
--     1  |   0  | Read access
--     X  |   1  | Inactive
--
--  The address and data bus width can be specified using
--  the generic map. The default is eight bit data bus and
--  eight bit address bus.
--
-- Verification:
-- =============
--  Simulated using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--  Synthesized using ISE Webpack 8.2.03i running on Ubuntu Linux 6.10.
--

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity dp_ram is
generic (
    DATA_WIDTH : integer := 8;
    ADDR_WIDTH : integer := 8
);
port (
    clk : in std_logic;
    addr1 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w1 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r1 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr1 : in std_logic;
    ncs1 : in std_logic;
    addr2 : in std_logic_vector(ADDR_WIDTH - 1 downto 0);
    data_w2 : in std_logic_vector(DATA_WIDTH - 1 downto 0);
    data_r2 : out std_logic_vector(DATA_WIDTH - 1 downto 0);
    nwr2 : in std_logic;
    ncs2 : in std_logic
);
end dp_ram;

architecture beh of dp_ram is
    subtype ram_entry is std_logic_vector(DATA_WIDTH - 1 downto 0);
    type ram_type is array(0 to (2 ** ADDR_WIDTH) - 1) of ram_entry;
    -- A shared variable lets both processes write to the same memory.
    shared variable ram : ram_type;
begin
    -- Port 1: read/write
    process(clk) begin
        if rising_edge(clk) then
            if ncs1 = '0' then
                if nwr1 = '0' then
                    ram(to_integer(unsigned(addr1))) := data_w1;
                else
                    data_r1 <= ram(to_integer(unsigned(addr1)));
                end if;
            end if;
        end if;
    end process;

    -- Port 2: read/write
    process(clk) begin
        if rising_edge(clk) then
            if ncs2 = '0' then
                if nwr2 = '0' then
                    ram(to_integer(unsigned(addr2))) := data_w2;
                else
                    data_r2 <= ram(to_integer(unsigned(addr2)));
                end if;
            end if;
        end if;
    end process;
end beh;

Automation
When your processor build process gets more complex, a batch-oriented build will come in handy, and make will be your friend. The following listing shows how to use Quartus II from within a Makefile:

#
#	Quartus build process
#		called by jopser, jopusb,...
#
qsyn:
	echo $(QBT)
	echo "building $(QBT)"
	-rm -r quartus/$(QBT)/db
	-rm quartus/$(QBT)/jop.sof
	-rm jbc/$(QBT).jbc
	-rm rbf/$(QBT).rbf
	quartus_map quartus/$(QBT)/jop
	quartus_fit quartus/$(QBT)/jop
	quartus_asm quartus/$(QBT)/jop
	quartus_tan quartus/$(QBT)/jop

The example is taken from the Makefile for JOP.

JOP as an Example
If you want to see a Java processor (JOP) running on the board, follow the instructions in Jopwiki Getting started. The processor is open source, and you can borrow some ideas from it.

Latex2wiki
For those who prefer LaTeX over wiki markup for documentation, this tool might be interesting: Latex2wiki.