# COE608 Computer Organization and Architectures Winter 2017

**Lab 6: The Complete CPU (Overall Project)**

**Due Date: Lab 6 Parts I & II - Week 12 (During the Lab Session)**
Bonus - Week 13

# 1. Overview

In this final lab project, a complete CPU will be implemented. Its main components, the datapath and the control unit, were designed and implemented in the previous Labs 4b and 5. Students are to combine the control unit and datapath with a reset circuit (more on this below). When complete, the overall design will be able to implement the features described in the <u>CPU specification document</u>. Students are encouraged to consult this specification document while testing the CPU.

The instruction memory unit (VHDL) and the overall CPU testing block diagram files needed to complete this lab are provided as follows:

1) The files and specifics for testing and setting up the CPU for simulation can be found in a document located in .../courses/coe608/labs/lab6/\*
2) For bonus marks involving CPU hardware implementation and emulation, the specifications and documentation can be found in .../courses/coe608/labs/bonus/\*

The rest of this lab presents the reset circuit needed for the CPU.

# 2. Part I - CPU Reset Circuitry

For the CPU developed here to work properly, it must incorporate a reset circuit. The block diagram of the reset circuit is illustrated in Figure 1.

Figure 1: Reset Circuit

The reset circuit works as follows. When the RESET signal goes high, ENABLE\_PD goes low, which forces the control unit into state T0, and CLR\_PC goes high, which clears the Program Counter. We know that the CPU program starts in memory at location 0x00000000. When RESET goes low, ENABLE\_PD and CLR\_PC remain low and high, respectively, for 4 clock cycles. This allows the data surrounding the CPU to stabilize before its operation begins. The reset circuit is required to keep track (count) of the three clock cycles (T0, T1, and T2).
This reset circuit can be implemented either asynchronously or synchronously. The synchronous waveform is shown in Figure 2.

Figure 2: Reset Circuit Operation

# 3. VHDL Implementation

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY reset_circuit IS
PORT (
    Reset     : IN  STD_LOGIC;
    Clk       : IN  STD_LOGIC;
    Enable_PD : OUT STD_LOGIC;
    Clr_PC    : OUT STD_LOGIC
);
END reset_circuit;

ARCHITECTURE description OF reset_circuit IS
BEGIN
    -- you fill in what goes here.
END description;
```

Students are free to implement the ARCHITECTURE section however they see fit.

# 4. Part I - What to Hand In

Students must submit the following to obtain full marks for Lab 6, Part I:

- A hard-copy listing of your VHDL source code implementation.
- A hard-copy printout of the timing simulation results for the reset circuit.

The lab instructor will quiz you on both Part I (reset circuit) and Part II (final CPU) for the Lab 6 demo.

# 5. Part II - The Complete CPU System

Once the reset circuit is implemented, all the CPU sub-systems are complete and the CPU can be assembled. The final CPU will consist of one instance each of the datapath, the control unit, and the reset circuit. Interconnecting them appropriately is up to the students; however, supporting files for VHDL interconnection and setup may be found in the course directory .../courses/coe608/labs/lab6/ in the cpu1 module.

To better understand the expectations and functionality of the final CPU, students are encouraged to refer to the **CPU Testing** document available in the course directory and website.
This document will help you to:

- Generate the system's instruction memory unit (MegaCore RAM block (.mif) implementation)
- Assemble a top-level working file (with reset circuit, instruction memory, a datapath, and control unit)
- Include and map other supporting files
- Simulate and demo your working CPU for Lab 6
- Optionally, emulate the CPU for the Lab 6 Bonus

Students are advised to also refer to the CPU Specification document to ensure that all operations and specifications have been fulfilled by their final CPU.

# 6. Part II - What to Hand In

To obtain full marks for the complete CPU project (i.e. Lab 6), students must demonstrate the correct operation of the CPU circuit through simulation. This means that students have to present timing simulation results that show the CPU correctly loading <u>ALL</u> the instructions and data, performing addition, load upper immediate, etc. In addition, students must submit the VHDL code for their CPU, as well as timing simulation results for the complete CPU. To properly simulate the CPU operation, the *CPU\_testing* document should be consulted, and the Memory Module files provided should be used.

Your lab supervisor will quiz you during the demo for both Parts I and II.

# 7. Bonus Project

To obtain bonus marks, students are required to demonstrate the operation of the processor through emulation on the DE2 boards found in the laboratory. To properly implement the CPU, the **CPU\_testing** document should be consulted, and the system memory, seven-segment, display-unit, decoder, and all the other VHDL files provided in the following course directory should be used:

.../courses/coe608/labs/bonus/\*

The following problems are provided as a bonus. Solving them will result in additional marks being added to your course mark.

# **Problem 1:**

The processor developed throughout this course includes various branch and jump operations used for conditional statements.
Using the built-in mnemonics/instruction set, find a way to implement the following:

# **Problem 2:**

Implement the following code using your CPU and its instruction set:

```
a = 1;
for (i = 1; i < 6; i++) {
    a = a * 2^{i};
}
```

To obtain the bonus, the assembly code for both of the problems above must be submitted, and the programs in question must be demonstrated on the DE2 boards in the laboratory.

# Table of Contents

- Introduction
- Reset Circuit
- Core VHDL Modules
- Instruction Simulations and MIF Files
  - LDAI, STA, CLRA, LDA
  - LUI
  - LDBI, STB, CLRB, LDB
  - JMP
  - ANDI
  - ADDI
  - ORI
  - ADD
  - SUB
  - DECA
  - INCA
  - ROL
  - ROR
  - BEQ
  - BNE Instruction Simulation
- VHDL Implementation of CPU
- Conclusion
- References

#### Introduction

The goal of Lab 6 was to build and simulate a fully functional Semi-RISC CPU using VHDL. This project involved combining all previously developed modules (datapath, control unit, and reset circuit) into a cohesive system capable of executing a basic instruction set. Each CPU instruction was loaded through a Memory Initialization File (.mif) and validated in Quartus II by inspecting the simulation waveforms.

This report presents the finalized VHDL modules, the reset mechanism, and simulation results for each instruction. The .mif values were used to populate instruction memory, and the functionality was confirmed by comparing the waveforms against the expected behavior.

#### **Reset Circuit**

```
library ieee;
use ieee.std_logic_1164.all;
use ieee.std_logic_arith.all;
use ieee.std_logic_unsigned.all;

entity reset_circuit is
    port (
        reset     : in  std_logic;
        clk       : in  std_logic;
        enable_PD : out std_logic := '1';
        clr_PC    : out std_logic
    );
end reset_circuit;

architecture Behavior of reset_circuit is
    -- Counts the clock cycles that follow the release of reset.
    type clkNum is (clk0, clk1, clk2, clk3);
    signal present_clk : clkNum;
begin
    process(clk)
    begin
        if rising_edge(clk) then
            if reset = '1' then
                -- Hold the CPU in reset and restart the count.
                clr_PC      <= '1';
                enable_PD   <= '0';
                present_clk <= clk0;
            elsif present_clk = clk0 then
                present_clk <= clk1;
            elsif present_clk = clk1 then
                present_clk <= clk2;
            elsif present_clk = clk2 then
                present_clk <= clk3;
            elsif present_clk = clk3 then
                -- Fourth edge after reset went low: release the CPU.
                clr_PC    <= '0';
                enable_PD <= '1';
            end if;
        end if;
    end process;
end Behavior;
```

Figure 1: reset_circuit.vhd

This figure shows the VHDL code for the synchronous reset circuit. When the Reset signal is asserted high, the Enable\_PD signal is driven low and Clr\_PC is driven high, resetting the program counter. Once Reset goes low, the circuit counts clock edges and, on the fourth rising edge, drives Clr\_PC low and Enable\_PD high, so the CPU starts executing only after its inputs have had time to stabilize.
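The release timing described above can be checked with a small self-checking testbench. The following is a minimal sketch, not part of the submitted design: the testbench name `reset_circuit_tb`, the 20 ns clock period, and the `done` flag are assumptions made for illustration, and it expects the `reset_circuit` entity from Figure 1 to be compiled into the `work` library of a VHDL simulator that runs testbenches (for example ModelSim).

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical testbench; names and the clock period are illustrative only.
entity reset_circuit_tb is
end reset_circuit_tb;

architecture sim of reset_circuit_tb is
    constant CLK_PERIOD : time := 20 ns;  -- assumed value
    signal clk       : std_logic := '0';
    signal reset     : std_logic := '0';
    signal enable_PD : std_logic;
    signal clr_PC    : std_logic;
    signal done      : boolean := false;
begin
    -- Device under test: the reset circuit shown in Figure 1.
    dut : entity work.reset_circuit
        port map (reset => reset, clk => clk,
                  enable_PD => enable_PD, clr_PC => clr_PC);

    -- Free-running clock that stops once the checks are finished.
    clk <= not clk after CLK_PERIOD / 2 when not done else '0';

    stimulus : process
    begin
        -- Assert reset across one rising edge.
        reset <= '1';
        wait until rising_edge(clk);
        wait until falling_edge(clk);
        assert clr_PC = '1' and enable_PD = '0'
            report "outputs not in reset state while reset is high"
            severity error;

        -- Release reset: the outputs must hold for three more rising edges.
        reset <= '0';
        for i in 1 to 3 loop
            wait until rising_edge(clk);
            wait until falling_edge(clk);
            assert clr_PC = '1' and enable_PD = '0'
                report "CPU released too early"
                severity error;
        end loop;

        -- The fourth rising edge releases the CPU.
        wait until rising_edge(clk);
        wait until falling_edge(clk);
        assert clr_PC = '0' and enable_PD = '1'
            report "CPU not released on the fourth edge"
            severity error;

        done <= true;
        wait;
    end process;
end sim;
```

Sampling on the falling edge keeps each check half a period away from the edge that updates the outputs, which avoids depending on delta-cycle ordering in the simulator.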
Figure 2: Reset Circuit Simulation Waveform The waveform confirms proper operation: Enable\_PD is low and Clr\_PC is high during the first 4 cycles after reset is asserted, and the CPU begins execution once the reset phase completes. #### **Core VHDL Modules** ``` library ieee; use ieee.std_logic_1164.all; Bentity Control_New is Eport( clk, mclk: in std_logic; enable: in std_logic; statusC, statusZ: in std_logic; statusC, statusZ: in std_logic; INST: in std_logic_vector(31 downto 0); Amux, B Mux: out std_logic; IM_MUX1, REG_MUX: out std_logic; IM_MUX2, DATA_Mux: out std_logic; IM_ECC, ld_PC: out std_logic_vector(1 downto 0); ALU_op: out std_logic_vector(2 downto 0); If inc_PC, ld_PC: out std_logic; If_IR, ld_IR: out std_logic; If_IR, ld_IR: out std_logic; If_IR, ld_R, ld_C, ld_Z: out std_logic; If_IR, ld_R, ld_C, ld_Z: out std_logic; If_IR, ld_R, ld_C, ld_Z: out std_logic; If_IR, ld_R, ld_C, ld_Z: out std_logic; If_IR, ld_R, ld_C, ld_Z: std_logic; If_IR, ld_R, ld_C, ld_Z: std_logic; If_IR, ld_R, ld_C, ld_Z: std_logic; If_IR, ld_R, ld_C, ld_Z: std_logic, ld_R, ``` Figure 3: Control New.vhd This file contains the finite state machine (FSM) logic that drives control signals for the datapath. It interprets instructions stored in IR, managing transitions through the T0–T3 states and issuing the necessary control signals for memory, register loading, and ALU operations. 
``` LIBRARY leee; USE leee.std_logic_li64.all; USE leee.std_logic_arith.all; USE leee.std_logic_unsigned.all; ERNTITY cpul IS ERNTITY cpul IS ERNTITY cpul IS ERNTITY cpul IS Colk: IN STD_LOGIC; mem_clk: IN STD_LOGIC; mem_clk: IN STD_LOGIC; mem_clk: IN STD_LOGIC; mem_clk: IN STD_LOGIC; mem_clk: IN STD_LOGIC vector(31 DOWNTO 0); dataOut: OUT STD_LOGIC_VECTOR(31 DOWNTO 0); dataOut: OUT STD_LOGIC_VECTOR(31 DOWNTO 0); douth: OUT STD_LOGIC_VECTOR(31 DOWNTO 0); douth: OUT STD_LOGIC_VECTOR(31 DOWNTO 0); douth: OUT STD_LOGIC vector(31 DOWNTO 0); douth: OUT STD_LOGIC vector(31 DOWNTO 0); douth: OUT STD_LOGIC vector(31 DOWNTO 0); douth: OUT STD_LOGIC; douth: OUT STD_LOGIC vector(31 DOWNTO 0); wen. out std_Logic. wen.mem: OUT STD_LOGIC vector(31 DOWNTO 0); wen.mem: OUT STD_LOGIC; outh: OUT STD_LOGIC; an.mem: OUT STD_LOGIC; an.mem: OUT STD_LOGIC; coll. outh: Outh Std_LOGIC; coll. outh: Outh Std_Logic vector(31 DOWNTO 0); coll. outh: Outh Std_Logic; Outh ``` ``` 81 dOutIR => outIR. dOutPC => outPC, 82 outT => T_Info, 83 84 wen mem => wen mem, 85 en mem => en mem 86 ); 87 88 addrOut <= add_from_cpu(5 downto 0);</pre> 89 wEn <= wen_from_cpu; 90 memDataOut <= mem_to_cpu;</pre> 91 memDataIn <= cpu_to_mem;</pre> 92 END behavior; ``` Figure 4: cpu1.vhd This file instantiates and wires together the datapath, control unit, and reset circuit. It maps internal signals and interfaces with the testbench and instruction memory, acting as the main processor entity. ``` use ieee.std_logic_1164.all; □ENTITY CPU_TEST_Sim IS PORT ( cpuClk : in std_logic; memClk : in std_logic; rst : in std_logic; -- Debug data. outA, outB : out std_logic_vector(31 downto 0); outC, outZ : out std_logic; outIR : out std_logic_vector(31 downto 0); outPC : out std_logic_vector(31 downto 0); 10 11 12 13 -- Processor-Inst Memory Interface. 
addrOut : out std_logic_vector(5 downto 0); 14 15 16 17 wEn : out std logic; memDataOut : out std_logic_vector(31 downto 0); 18 memDataIn : out std_logic_vector(31 downto 0); 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 -- Processor State T_Info : out std_logic_vector(2 downto 0); --data Memory Interface wen_mem, en_mem : out std_logic); END CPU_TEST_Sim; ⊟ARCHITECTURE behavior OF CPU_TEST_Sim IS COMPONENT system memory PORT ( address : IN STD_LOGIC_VECTOR (5 DOWNTO 0); clock : IN STD_LOGIC; data : IN STD_LOGIC_VECTOR (31 DOWNTO 0); wren : IN STD_LOGIC; q : OUT STD_LOGIC_VECTOR (31 DOWNTO 0) END COMPONENT; ``` Figure 5: CPU\_TEST\_Sim.vhd This testbench-style top-level file connects cpu1 and the system\_memory. It provides external clocks and data ports to simulate the system. ``` address_a : IN STD_LOGIC_VECTOR (5 DOWNTO 0); clock0 : IN STD_LOGIC; data_a : IN STD_LOGIC_VECTOR (31 DOWNTO 0); wren_a : IN STD_LOGIC; LIBRARY altera_mf; USE altera_mf.all; : OUT STD LOGIC VECTOR (31 DOWNTO 0) □ENTITY system_memory IS | PORT □ ( 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 83 address : IN STD_LOGIC_VECTOR (5 DOWNTO 0); clock : IN STD_LOGIC := '1'; data : IN STD_LOGIC_VECTOR (31 DOWNTO 0); wren : IN STD_LOGIC_VECTOR (31 DOWNTO 0) , q : OUT STD_LOGIC_VECTOR (31 DOWNTO 0) ), Devotes END COMPONENT; 85 86 BEGIN <= sub_wire0(31 DOWNTO 0); 88 89 90 altsyncram_component : altsyncram GENERIC MAP ( END system_memory; NERIC MAP ( clock_enable_input_a => "BYPASS", clock_enable_output_a => "BYPASS", init_file => "system_memory.mif", intended_device_family => "cyclone II", lpm_hint => "ENABLE_RUNTIME_MOD=NO", lpm_type => "altsyncram", numwords_a => 64, 92 ⊟ARCHITECTURE SYN OF system_memory IS 94 SIGNAL sub wire0 : STD LOGIC VECTOR (31 DOWNTO 0); 96 97 operation_mode => "SINGLE_PORT", outdata_aclr_a => "NONE", outdata_reg_a => "CLOCKO", COMPONENT altsyncram 98 99 clock_enable_input_a : STRING; clock_enable_output_a : STRING; init_file -: STRING; intended_device_family : STRING; 
power_up_uninitialized => "FALSE", 01 widthad_a => 6, width_a => 32, width_byteena_a => 1 105 PORT MAP ( 106 address_a => address, clock0 => clock, outdata_reg_a : STRING; power_up_unintialized : S' widthad_a : NATURAL; width_a : NATURAL; width_byteena_a : NATURAL 108 data_a => data, wren_a => wren, q_a => sub_wire0 PORT ( ( address_a : IN STD_LOGIC_VECTOR (5 DOWNTO 0); clock0 : IN STD_LOGIC; data_a : IN STD_LOGIC_VECTOR (31 DOWNTO 0); wren_a : IN STD_LOGIC; q_a : OUT STD_LOGIC_VECTOR (31 DOWNTO 0) END SYN;
```

Figure 6: system_memory.vhd

This memory unit connects to cpu1 and uses .mif files to preload instructions for simulation. Each test rewrites the memory contents with specific opcodes to verify CPU behavior.

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY add IS
PORT (
    A : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- Input data (32 bits)
    B : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)   -- Output (A + 1)
);
END add;

ARCHITECTURE Behavior OF add IS
BEGIN
    B <= A + 1;  -- Add 1 to the input A and output the result to B
END Behavior;
```

Figure 7: add.vhd (Updated)

Modified to add 1 instead of 4, this version helps test increment operations and is used in simulations involving INCA, ADDI, and other arithmetic instructions.

#### **Instruction Simulations and MIF Files**

Each of the following instructions was tested by updating the .mif file with the corresponding opcodes and running a functional simulation. The results were observed using Quartus II's waveform viewer.

# LDAI, STA, CLRA, LDA

Figure 8: Waveform Simulation

Confirms that LDAI loads an immediate into register A, STA stores it into memory, CLRA clears A, and LDA reloads the stored value.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 0000AAAA;
    1       : 20000001;
    2       : 75000000;
    3       : 90000001;
    [4..63] : 00000000;
END;
```

Figure 9: MIF Explanation

The instruction memory includes four key opcodes at addresses 0 through 3.
These drive register and memory interaction patterns.

# LUI

Figure 10: Waveform Simulation

LUI loads an upper immediate value into register A. The waveform shows a correct load into A without affecting the lower bits.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 4000AAAA;  -- LUI instruction
    [1..63] : 00000000;
END;
```

Figure 11: MIF Explanation

The opcode at address 0 sets the upper 16 bits of A, validating LUI functionality.

# LDBI, STB, CLRB, LDB

Figure 12: Waveform Simulation

*Immediate value is loaded into B, stored to memory, cleared, and then reloaded from memory.*

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 1000BBBB;  -- LDBI instruction
    1       : 30000001;  -- STB instruction
    2       : 76000000;  -- CLRB instruction
    3       : A0000001;  -- LDB instruction
    [4..63] : 00000000;
END;
```

Figure 13: MIF Explanation

*Memory addresses 0–3 are loaded with opcodes representing the four instructions above.*

#### **JMP**

Figure 14: Waveform Simulation

Confirms that the Program Counter jumps to a new instruction address when JMP is executed.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 5000AAAA;  -- JMP instruction
    [1..63] : 00000000;
END;
```

Figure 15: MIF Explanation

*Memory address 0 contains a JMP opcode that modifies the PC directly.*

#### **ANDI**

Figure 16: Waveform Simulation

ANDI performs a logical AND between A and the immediate, storing the result in C. The waveform shows changes in C with the Zero flag set appropriately.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000006;  -- LDAI: load immediate 0x0006 into A
    1       : 7900000B;  -- ANDI operation with immediate value 0x000B
    [2..63] : 00000000;
END;
```

Figure 17: MIF Explanation

Memory holds an immediate load followed by the ANDI opcode, so the bitwise result can be checked.

# **ADDI**

Figure 18: Waveform Simulation

ADDI performs immediate addition and updates the result in C.
The waveform shows carry/zero status and updated output.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000006;  -- LDAI: load immediate 0x0006 into A
    1       : 7100000B;  -- ADDI operation with immediate value 0x000B
    [2..63] : 00000000;
END;
```

Figure 19: MIF Explanation

The test uses two opcodes: an immediate load into A followed by ADDI with the immediate 0x000B.

# **ORI**

Figure 20: Waveform Simulation

Performs bitwise OR with an immediate value. Waveform shows OR result populating C.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000006;  -- LDAI: load immediate 0x0006 into A
    1       : 7D00000B;  -- ORI operation with immediate value 0x000B
    [2..63] : 00000000;
END;
```

Figure 21: MIF Explanation

*Memory holds immediate and ORI instruction opcodes.*

#### **ADD**

Figure 22: Waveform Simulation

Standard addition between registers A and B. Result is shown on output C.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000005;  -- LDAI: load immediate 0x0005 into register A
    1       : 10000003;  -- LDBI instruction to load value into register B
    2       : 70000000;  -- ADD instruction
    [3..63] : 00000000;
END;
```

Figure 23: MIF Explanation

Tested with preload of values in A and B, with ALU ADD opcode issued.

# **SUB**

Figure 24: Waveform Simulation

Subtracts B from A and shows result in C with corresponding zero/carry status.
```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000005;  -- LDAI: load immediate 0x0005 into A
    1       : 10000003;  -- LDBI to load value into B
    2       : 72000000;  -- SUB instruction
    [3..63] : 00000000;
END;
```

Figure 25: MIF Explanation

*Program loads values and performs a subtract via ALU.*

# **DECA**

Figure 26: Waveform Simulation

*Decrements A and shows updated result.*

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 0000AAAA;  -- Load immediate value into register A
    1       : 7E000000;  -- DECA instruction
    2       : 70000000;  -- ADD instruction to verify result (can be viewed as NOP here)
    [3..63] : 00000000;
END;
```

Figure 27: MIF Explanation

*Instruction sequence loads a value, decrements it, and then issues an ADD so the result can be observed.*

# **INCA**

Figure 28: Waveform Simulation

*Increments A. ALU output is verified along with zero flag.*

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 0000AAAA;  -- Load immediate value into register A
    1       : 73000000;  -- INCA instruction
    2       : 70000000;  -- ADD instruction (used here to observe changes or as NOP)
    [3..63] : 00000000;
END;
```

Figure 29: MIF Explanation

*Value is loaded and incremented through INCA opcode.*

# **ROL**

Figure 30: Waveform Simulation

*Performs rotate-left operation on register A.*

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000008;  -- Load immediate value into register A
    1       : 74000000;  -- ROL instruction
    2       : 70000000;  -- ADD instruction (or NOP for observing result)
    [3..63] : 00000000;
END;
```

Figure 31: MIF Explanation

*The program loads a predefined value into A and applies ROL to it.*

# **ROR**

Figure 32: Waveform Simulation

*Performs rotate-right operation on register A.*

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 00000008;  -- Load immediate value into register A
    1       : 7F000000;  -- ROR instruction
    2       : 70000000;  -- ADD instruction (or NOP to view result)
    [3..63] : 00000000;
END;
```

Figure 33: MIF Explanation

Instruction memory
loaded to show ROR effect on bit positions.

# **BEQ**

Figure 34: Waveform Simulation

Simulates branch-if-equal using flags from ALU. Jump is taken based on Z flag.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 0000AAAA;  -- LDAI (Load Immediate into A)
    1       : 1000AAAA;  -- LDBI (Load Immediate into B)
    2       : 600000F0;  -- BEQ to address offset (usually means jump if A == B)
    [3..63] : 00000000;
END;
```

Figure 35: MIF Explanation

Memory programmed to test conditional branching. BEQ occurs based on equality check.

# **BNE Instruction Simulation**

Figure 36: Waveform for BNE Execution

This waveform illustrates the behavior of the BNE (Branch if Not Equal) instruction. Register A is loaded with AAAA, and Register B with BBBB. Since the two values differ, the CPU correctly performs a branch to the address F0, as shown by the updated program counter (PC) in the waveform.

```
WIDTH=32;
DEPTH=64;
ADDRESS_RADIX=UNS;
DATA_RADIX=HEX;

CONTENT BEGIN
    0       : 0000AAAA;  -- LDAI (Load Immediate into A)
    1       : 1000AAAA;  -- LDBI (Load Immediate into B)
    2       : 600000F0;  -- BEQ to address offset (usually means jump if A == B)
    [3..63] : 00000000;
END;
```

Figure 37: MIF Setup for BNE Instruction

This MIF setup initializes the instruction memory to test BNE. It loads values into A and B, followed by the branch instruction. The remaining memory is filled with zeros.

# VHDL Implementation of CPU

Figure 1: VHDL Code for Control Unit

The Control.vhd module defines the logic for the CPU control unit, ensuring correct sequencing of the fetch, decode, and execute phases. The FSM transitions through states T0, T1, and T2, setting control signals accordingly. The operation decoder determines execution steps based on the instruction opcode, and the memory signal generator enables the correct read/write operations. Together, these blocks coordinate the datapath, memory, and program counter so that each instruction executes in the intended order.
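The three-state sequencing described above can be illustrated in isolation. The following is a minimal sketch under stated assumptions, not the Control.vhd listing itself: the entity name `t_state_sketch`, the one-hot encoding of the `T` output, and the synchronous handling of `enable` are all hypothetical choices made for the example. The only behavior taken from the lab handout is that a low enable forces the control unit into T0.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical timing-state generator; names and encoding are assumptions.
entity t_state_sketch is
    port (
        clk    : in  std_logic;
        enable : in  std_logic;                    -- low forces state T0
        T      : out std_logic_vector(2 downto 0)  -- one-hot: T0, T1, T2
    );
end t_state_sketch;

architecture rtl of t_state_sketch is
    type state_t is (T0, T1, T2);
    signal state : state_t := T0;
begin
    -- State register: cycle T0 -> T1 -> T2 -> T0 while enabled.
    process(clk)
    begin
        if rising_edge(clk) then
            if enable = '0' then
                state <= T0;
            else
                case state is
                    when T0 => state <= T1;
                    when T1 => state <= T2;
                    when T2 => state <= T0;
                end case;
            end if;
        end if;
    end process;

    -- Decode the current state into a one-hot vector for observation.
    with state select
        T <= "001" when T0,
             "010" when T1,
             "100" when T2;
end rtl;
```

In a full control unit, the per-state control signals (register loads, multiplexer selects, ALU operation) would be decoded from this state together with the opcode held in IR; the sketch leaves that decoding out.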
```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- 4-bit Adder
ENTITY adder4 IS
PORT (
    Cin  : IN  STD_LOGIC;                     -- Carry-in
    X, Y : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);  -- 4-bit inputs
    S    : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);  -- 4-bit sum output
    Cout : OUT STD_LOGIC                      -- Carry-out
);
END adder4;

ARCHITECTURE behavior OF adder4 IS
    COMPONENT fulladd
    PORT (
        Cin, x, y : IN  STD_LOGIC;
        s, Cout   : OUT STD_LOGIC
    );
    END COMPONENT;

    SIGNAL C : STD_LOGIC_VECTOR(1 TO 3);  -- Internal carry signals
BEGIN
    -- Instantiate four 1-bit full adders to form a 4-bit ripple-carry adder
    stage0: fulladd PORT MAP(Cin,  X(0), Y(0), S(0), C(1));
    stage1: fulladd PORT MAP(C(1), X(1), Y(1), S(1), C(2));
    stage2: fulladd PORT MAP(C(2), X(2), Y(2), S(2), C(3));
    stage3: fulladd PORT MAP(C(3), X(3), Y(3), S(3), Cout);
END behavior;
```

Figure 3. Adder4.vhd

*A 4-bit adder module that forms part of the ALU's arithmetic capabilities.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- 16-bit Adder
ENTITY adder16 IS
PORT (
    Cin  : IN  STD_LOGIC;                      -- Carry-in
    X, Y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);  -- 16-bit inputs
    S    : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);  -- 16-bit sum output
    Cout : OUT STD_LOGIC                       -- Carry-out
);
END adder16;

ARCHITECTURE behavior OF adder16 IS
    COMPONENT adder4
    PORT (
        Cin  : IN  STD_LOGIC;
        X, Y : IN  STD_LOGIC_VECTOR(3 DOWNTO 0);
        S    : OUT STD_LOGIC_VECTOR(3 DOWNTO 0);
        Cout : OUT STD_LOGIC
    );
    END COMPONENT;

    SIGNAL C : STD_LOGIC_VECTOR(1 TO 3);  -- Internal carry signals
BEGIN
    -- Instantiate four 4-bit adders to form a 16-bit ripple-carry adder
    stage0: adder4 PORT MAP(Cin,  X(3 DOWNTO 0),   Y(3 DOWNTO 0),   S(3 DOWNTO 0),   C(1));
    stage1: adder4 PORT MAP(C(1), X(7 DOWNTO 4),   Y(7 DOWNTO 4),   S(7 DOWNTO 4),   C(2));
    stage2: adder4 PORT MAP(C(2), X(11 DOWNTO 8),  Y(11 DOWNTO 8),  S(11 DOWNTO 8),  C(3));
    stage3: adder4 PORT MAP(C(3), X(15 DOWNTO 12), Y(15 DOWNTO 12), S(15 DOWNTO 12), Cout);
END behavior;
```

Figure 4. Adder16.vhd

A 16-bit adder that enables larger arithmetic operations within the CPU.
```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- 32-bit Adder
ENTITY adder32 IS
PORT (
    Cin  : IN  STD_LOGIC;                      -- Carry-in
    X, Y : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- 32-bit inputs
    S    : OUT STD_LOGIC_VECTOR(31 DOWNTO 0);  -- 32-bit sum output
    Cout : OUT STD_LOGIC                       -- Carry-out
);
END adder32;

ARCHITECTURE behavior OF adder32 IS
    COMPONENT adder16
    PORT (
        Cin  : IN  STD_LOGIC;
        X, Y : IN  STD_LOGIC_VECTOR(15 DOWNTO 0);
        S    : OUT STD_LOGIC_VECTOR(15 DOWNTO 0);
        Cout : OUT STD_LOGIC
    );
    END COMPONENT;

    SIGNAL C : STD_LOGIC;  -- Internal carry signal
BEGIN
    -- Instantiate two 16-bit adders to form a 32-bit ripple-carry adder
    stage0: adder16 PORT MAP(Cin, X(15 DOWNTO 0),  Y(15 DOWNTO 0),  S(15 DOWNTO 0),  C);
    stage1: adder16 PORT MAP(C,   X(31 DOWNTO 16), Y(31 DOWNTO 16), S(31 DOWNTO 16), Cout);
END behavior;
```

Figure 5. Adder32.vhd

A 32-bit adder responsible for performing full-width arithmetic computations.

Figure 6. Alu.vhd

The Arithmetic Logic Unit (ALU) processes arithmetic and logical operations based on control signals.
``` 97 B FORT ( 98 S: IN STD LOGIC, VECTOR (31 DOWNTO 0); 99 STD COMPONENT; 100 STD LOGIC, VECTOR (31 DOWNTO 0); 101 STD LOGIC, VECTOR (31 DOWNTO 0); 102 STD COMPONENT; 103 STD LOGIC, VECTOR (11 DOWNTO 0); 104 STD COMPONENT; 105 STD COMPONENT; 106 STD COMPONENT; 107 STD LOGIC, VECTOR (11 DOWNTO 0); 108 STD COMPONENT; 109 STD COMPONENT; 110 STD LOGIC, VECTOR (31 DOWNTO 0); 111 STD LOGIC, VECTOR (31 DOWNTO 0); 111 STD LOGIC, VECTOR (31 DOWNTO 0); 112 STD COMPONENT; 113 STD LOGIC, VECTOR (31 DOWNTO 0); 114 STD COMPONENT; 115 STD COMPONENT; 116 STD COMPONENT; 117 STD LOGIC, VECTOR (31 DOWNTO 0); 118 STD LOGIC, VECTOR (31 DOWNTO 0); 119 STD LOGIC, VECTOR (31 DOWNTO 0); 120 STGNAL IR OUT, data bus s: STD LOGIC, VECTOR (31 DOWNTO 0); 121 STGNAL IR OUT, data bus s: STD LOGIC, VECTOR (31 DOWNTO 0); 122 STGNAL LEE OUT PC, LEE OUT A MUX, LEE OUT B MUX: STD LOGIC, VECTOR (31 SIGNAL IR MUX) GUT, FERNAL TIPE DOWNTO 0); 123 STGNAL LEE OUT PC, LEE OUT A MUX, LEE OUT B B OUT: STD LOGIC, VECTOR (31 DOWNTO 0); 124 STGNAL LEE OUT, STD LOGIC, VECTOR (31 DOWNTO 0); 125 STGNAL LEE MUXI GUT, HUM MUXI GUT, HUM MUXI GUT, HUM MUXI GUT, HUM MUXI GUT, STD LOGIC, VECTOR (31 DOWNTO 0); 126 STGNAL LEE MUXI GUT, HUM MUXI GUT, HUM MUXI GUT, STD LOGIC, VECTOR (31 DOWNTO 0); 127 STGNAL LEE MUXI GUT, HUM MUXI GUT, STD LOGIC, VECTOR (31 DOWNTO 0); 128 STGNAL LEE STD LOGIC, VECTOR (31 DOWNTO 0); 129 STGNAL LEE STD LOGIC, VECTOR (31 DOWNTO 0); 130 STGNAL LEE STD LOGIC, VECTOR (31 DOWNTO 0); 131 STGNAL LEE STD LOGIC, VECTOR (31 DOWNTO 0); 132 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 133 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 134 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 135 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 136 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 137 STGNAL LEED STD LOGIC, VECTOR (31 DOWNTO 0); 138 STGNAL LEED STD LOGIC, VECTOR (32 DOWNTO 0); 139 STGNAL LEED STD LOGIC, VECTOR (32 DOWNTO 0); 130 STGNAL LEED STD LOGIC, VECTOR (32 DOWNTO 0); 131 STGNAL LEED STD LOGIC, VECTOR 
(32 DOWNTO 0); 131 STGNAL LEED STD LOGIC, VECTOR (3
```

Figure 7: Data_path.vhd

Defines the data path, including registers, ALU, and multiplexers, ensuring correct data movement.

Figure 8: Data_mem.vhd

*Implements the memory storage unit, supporting read and write operations.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- 1-bit Full Adder
ENTITY fulladd IS
PORT (
    Cin, x, y : IN  STD_LOGIC;  -- Inputs: Carry-in, x, and y
    s, Cout   : OUT STD_LOGIC   -- Outputs: Sum (lowercase s) and Carry-out
);
END fulladd;

ARCHITECTURE behavior OF fulladd IS
BEGIN
    -- Sum computation using XOR gates
    s <= x XOR y XOR Cin;

    -- Carry-out computation using OR-AND logic
    Cout <= (x AND y) OR (Cin AND x) OR (Cin AND y);
END behavior;
```

Figure 9. Fulladd.vhd

A full-adder module utilized within arithmetic operations.

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY LZE IS
PORT (
    LZE_in  : IN  std_logic_vector(31 DOWNTO 0);  -- Input signal
    LZE_out : OUT std_logic_vector(31 DOWNTO 0)   -- Output signal
);
END ENTITY LZE;

ARCHITECTURE Behavior OF LZE IS
    SIGNAL zeros : std_logic_vector(15 DOWNTO 0) := (OTHERS => '0');  -- 16 zero bits
BEGIN
    -- Concatenates 16 zeros with the lower 16 bits of LZE_in
    LZE_out <= zeros & LZE_in(15 DOWNTO 0);
END Behavior;
```

Figure 10. LZE.vhd

Zero-extension module: it keeps the lower 16 bits of the input and fills the upper 16 bits with zeros, so a 16-bit immediate becomes a 32-bit operand. No sign extension is performed.

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY mux2to1 IS
PORT (
    s  : IN  STD_LOGIC;                      -- Select signal
    w0 : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- Input 0
    w1 : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- Input 1
    f  : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)   -- Output
);
END mux2to1;

ARCHITECTURE Behavior OF mux2to1 IS
BEGIN
    WITH s SELECT
        f <= w0 WHEN '0',
             w1 WHEN OTHERS;
END Behavior;
```

Figure 11.
Mux2to1.vhd

*A two-input multiplexer used for selecting data sources.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

ENTITY mux4to1 IS
PORT (
    s              : IN  std_logic_vector(1 DOWNTO 0);   -- 2-bit selector
    X1, X2, X3, X4 : IN  std_logic_vector(31 DOWNTO 0);  -- 4 input data lines
    f              : OUT std_logic_vector(31 DOWNTO 0)   -- Output line
);
END ENTITY mux4to1;

ARCHITECTURE Behavior OF mux4to1 IS
BEGIN
    -- Selects one of the four inputs based on selector 's'.
    -- OTHERS covers "11" and the non-binary std_logic values,
    -- so every possible choice is handled.
    WITH s SELECT
        f <= X1 WHEN "00",
             X2 WHEN "01",
             X3 WHEN "10",
             X4 WHEN OTHERS;
END Behavior;
```

Figure 12. Mux4to1.vhd

*A four-input multiplexer used for complex data routing.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY pc IS
PORT (
    clr : IN  STD_LOGIC;                      -- Clear signal
    clk : IN  STD_LOGIC;                      -- Clock signal
    ld  : IN  STD_LOGIC;                      -- Load/Enable signal
    inc : IN  STD_LOGIC;                      -- Increment signal
    d   : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- Data input (PC value)
    q   : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)   -- Output (PC value)
);
END pc;

ARCHITECTURE Behavior OF pc IS
    COMPONENT add
    PORT (
        A : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
        B : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
    );
    END COMPONENT;

    COMPONENT mux2to1
    PORT (
        s  : IN  STD_LOGIC;
        w0 : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
        w1 : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
        f  : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
    );
    END COMPONENT;

    COMPONENT register32
    PORT (
        d   : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);
        ld  : IN  STD_LOGIC;
        clr : IN  STD_LOGIC;
        clk : IN  STD_LOGIC;
        Q   : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)
    );
    END COMPONENT;

    SIGNAL add_out : STD_LOGIC_VECTOR(31 DOWNTO 0);
    SIGNAL mux_out : STD_LOGIC_VECTOR(31 DOWNTO 0);
    SIGNAL q_out   : STD_LOGIC_VECTOR(31 DOWNTO 0);
BEGIN
    -- Add block to increment the PC (add.vhd adds 1 in the updated version)
    add0 : add PORT MAP (q_out, add_out);

    -- 2-to-1 multiplexer to select between the data input `d` and the incremented PC
    mux0 : mux2to1 PORT MAP (inc, d, add_out, mux_out);

    -- 32-bit register to hold the PC value
    reg0 : register32 PORT MAP (mux_out, ld, clr, clk, q_out);

    -- Output assignment
    q <= q_out;
END Behavior;
```

Figure 13. Pc.vhd

*The Program Counter module responsible for maintaining instruction execution order.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY RED IS
PORT (
    RED_in  : IN  std_logic_vector(31 DOWNTO 0);  -- 32-bit input
    RED_out : OUT unsigned(7 DOWNTO 0)            -- 8-bit output
);
END ENTITY RED;

ARCHITECTURE Behavior OF RED IS
BEGIN
    -- Extracts the lower 8 bits of RED_in and converts them to an unsigned value
    RED_out <= unsigned(RED_in(7 DOWNTO 0));
END Behavior;
```

Figure 14.
RED.vhd *Extracts address and control signals from instructions.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.std_logic_arith.ALL;
USE ieee.std_logic_unsigned.ALL;

ENTITY register32 IS
    PORT (
        d   : IN  STD_LOGIC_VECTOR(31 DOWNTO 0);  -- Input data (32 bits)
        ld  : IN  STD_LOGIC;                      -- Load/Enable signal
        clr : IN  STD_LOGIC;                      -- Clear signal
        clk : IN  STD_LOGIC;                      -- Clock signal
        Q   : OUT STD_LOGIC_VECTOR(31 DOWNTO 0)   -- Output data (32 bits)
    );
END register32;

ARCHITECTURE Behavior OF register32 IS
BEGIN
    PROCESS(ld, clr, clk)
    BEGIN
        IF clr = '1' THEN
            Q <= (OTHERS => '0');  -- Asynchronous clear: reset the register to 0
        ELSIF (clk'event AND clk = '1' AND ld = '1') THEN
            Q <= d;  -- Load the value of `d` on the rising clock edge when ld is high
        END IF;
    END PROCESS;
END Behavior;
```

Figure 15. Register32.vhd *Defines 32-bit registers used for storing operands and computation results.*

```
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;
USE ieee.numeric_std.ALL;

ENTITY UZE IS
    PORT (
        UZE_in  : IN  std_logic_vector(31 DOWNTO 0);  -- Input signal
        UZE_out : OUT std_logic_vector(31 DOWNTO 0)   -- Output signal
    );
END ENTITY UZE;

ARCHITECTURE Behavior OF UZE IS
    SIGNAL zeros : std_logic_vector(15 DOWNTO 0) := (OTHERS => '0');
BEGIN
    -- Places the lower 16 bits of UZE_in in the upper half of the output
    -- and fills the lower half with 16 zeros
    UZE_out <= UZE_in(15 DOWNTO 0) & zeros;
END Behavior;
```

Figure 16. UZE.vhd *Upper Zero Extension module for handling upper immediate values.*

#### **Conclusion**

This lab demonstrated the full implementation of a Semi-RISC CPU in VHDL. With the datapath, control unit, and reset circuit integrated, the system could interpret and execute 16 or more instructions. Simulations that loaded instructions from .mif files confirmed correct behavior through waveform verification. Each category of opcode behavior (arithmetic, logic, memory, and branching) was successfully emulated.
This lab served as a culmination of the concepts learned throughout the course and provided a foundation for more complex CPU architectures and digital systems.

# References

- [1] Geurkov, V. (2017). *COE608: Computer Organization and Architecture Lab Manual* (Winter 2017). Toronto Metropolitan University.
- [2] IEEE. (2008). *IEEE Standard VHDL Language Reference Manual*. IEEE.
- [3] "VHDL Documentation and Tutorials." (n.d.). Retrieved January 2025, from https://www.vhdl.org