A specialized 32-bit CPU architecture featuring a 5-stage FSM and custom ISA, developed for high-efficiency FPGA resource utilization.
As system complexity grows in the AI and cloud era, custom centralized controllers are increasingly necessary. Soft processors like MicroBlaze are common, but they often consume a large share of FPGA resources.
I built this custom CPU to balance complexity against resource consumption, so that it can handle data processing and control independently without the overhead of a standard hard processor.
The processor is built on a 16-bit data path with a 32-bit instruction width, utilizing a Harvard-style approach with integrated Program and Data memory.
To ensure stable timing and memory synchronization, I implemented a robust Finite State Machine (FSM):
- Idle: System reset and initialization of the Program Counter (PC).
- Fetch: Loading the instruction from `inst_mem` into the Instruction Register (IR).
- Decode/Execute: Processing the opcode and driving the execution unit.
- Delay/Next: A critical delay state to meet resource utilization constraints and ensure data stability before the next fetch.
- Sense Halt: Managing termination logic without hanging the system bus.
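The state sequence above can be sketched as a small behavioral model. This is a Python model rather than RTL. The transition order (Idle, Fetch, Decode/Execute, Delay/Next, Sense Halt, then back to Fetch unless halted) is an assumption read off the order of the list, and the `State` names and `halt` argument are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    FETCH = auto()
    DECODE_EXECUTE = auto()
    DELAY_NEXT = auto()
    SENSE_HALT = auto()


def next_state(state: State, halt: bool = False) -> State:
    """Return the assumed successor of `state`.

    Assumption: states advance in the order they are listed, and
    SENSE_HALT either holds (halt asserted) or loops back to FETCH.
    """
    if state is State.IDLE:
        return State.FETCH
    if state is State.FETCH:
        return State.DECODE_EXECUTE
    if state is State.DECODE_EXECUTE:
        return State.DELAY_NEXT
    if state is State.DELAY_NEXT:
        return State.SENSE_HALT
    # SENSE_HALT: stay put when halted, otherwise fetch the next instruction.
    return State.SENSE_HALT if halt else State.FETCH


def trace(cycles: int) -> list:
    """Walk the FSM from IDLE for `cycles` steps with halt never asserted."""
    state, seen = State.IDLE, []
    for _ in range(cycles):
        seen.append(state.name)
        state = next_state(state)
    return seen
```

Under these assumptions, one instruction takes four states after reset (Fetch through Sense Halt) before the next fetch begins.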
One of my key implementation notes: since a 16-bit multiplication results in a 32-bit value, I engineered a Special General Purpose Register (SGPR) to capture the upper bits, preventing data loss.
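To illustrate why the SGPR is needed, here is a minimal Python sketch of how a 32-bit product splits into two 16-bit halves. It assumes unsigned operands and that the lower half lands in the destination GPR; the function name `mul16` is hypothetical and not taken from the RTL.

```python
MASK16 = 0xFFFF


def mul16(a: int, b: int) -> tuple:
    """Multiply two unsigned 16-bit values and return (sgpr, gpr).

    Assumption: the lower 16 bits go to the destination GPR and the
    upper 16 bits go to the SGPR.
    """
    product = (a & MASK16) * (b & MASK16)   # up to 32 bits wide
    gpr = product & MASK16                  # lower 16 bits
    sgpr = (product >> 16) & MASK16         # upper 16 bits
    return sgpr, gpr
```

For example, `mul16(0x0100, 0x0100)` yields a product of `0x10000`: the GPR half is zero, and only the SGPR half shows that the result is nonzero.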
I developed a custom ISA that supports both register-direct and immediate addressing modes via a dedicated `imm_mode` bit.
| Category | Opcodes | Description |
|---|---|---|
| Arithmetic | ADD, SUB, MUL, MOV | Handles core computations and data movement. |
| Logical | OR, AND, XOR, NAND, NOT | Bitwise operations for signal processing. |
| Control | JUMP, JCarry, JZero, JOverflow | Multi-condition branching for algorithm control. |
| Memory | Storedin, Storereg, Sendreg | Synchronizing data between memory and the GPR. |
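The role of the `imm_mode` bit can be sketched as an operand selector. The bit positions below are a hypothetical layout chosen only for illustration; the actual field encoding of the 32-bit instruction is not documented here, and `select_operand_b` is not a name from the RTL.

```python
def select_operand_b(instr: int, regs: list) -> int:
    """Pick the second ALU operand based on the imm_mode bit.

    Hypothetical 32-bit layout (NOT taken from the project):
      [16]    imm_mode  (1 = immediate, 0 = register-direct)
      [15:0]  immediate value when imm_mode = 1
      [15:11] source register index when imm_mode = 0
    """
    imm_mode = (instr >> 16) & 0x1
    if imm_mode:
        # Immediate addressing: the operand is embedded in the instruction.
        return instr & 0xFFFF
    # Register-direct addressing: the operand is read from the register file.
    src_index = (instr >> 11) & 0x1F
    return regs[src_index] & 0xFFFF
```

With one mode bit, the same opcode can serve both addressing modes, so the ISA does not need separate immediate variants of each instruction.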
Targeting Validation and Verification roles, I focused heavily on the integrity of hardware status flags. The CPU monitors four critical condition flags used for branching:
- Carry Flag: Set when the 17-bit `temp_sum` detects an arithmetic carry-out.
- Zero Flag: Monitored across both the GPR and SGPR to ensure accurate zero-detection.
- Sign Flag: Tracks the MSB of the result for signed arithmetic.
- Overflow Flag: Implemented logic to detect two's complement arithmetic violations:
$$Overflow = (\neg A_{msb} \wedge \neg B_{msb} \wedge Out_{msb}) \vee (A_{msb} \wedge B_{msb} \wedge \neg Out_{msb})$$
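The flag definitions above can be checked with a small Python model of a 16-bit addition. This is a sketch, not the RTL: `add_flags` and its `sgpr` argument are hypothetical, and treating the zero flag as "both GPR result and SGPR are zero" is one assumed reading of the zero-detection rule.

```python
def add_flags(a: int, b: int, sgpr: int = 0) -> dict:
    """Model a 16-bit ADD and the four condition flags."""
    a &= 0xFFFF
    b &= 0xFFFF
    temp_sum = a + b                     # fits in 17 bits
    out = temp_sum & 0xFFFF              # 16-bit result
    a_msb, b_msb, out_msb = a >> 15, b >> 15, out >> 15

    carry = (temp_sum >> 16) & 1         # bit 16 of the 17-bit sum
    # Assumption: zero requires both the result and the SGPR to be zero.
    zero = int(out == 0 and (sgpr & 0xFFFF) == 0)
    sign = out_msb
    # Same-sign operands producing an opposite-sign result.
    overflow = ((1 - a_msb) & (1 - b_msb) & out_msb) | (a_msb & b_msb & (1 - out_msb))

    return {"result": out, "carry": carry, "zero": zero,
            "sign": sign, "overflow": overflow}
```

Adding `0x7FFF` and `0x0001` sets overflow without carry, while adding `0xFFFF` and `0x0001` sets carry without overflow, which is why the two flags are tracked separately.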
- `/src`: Contains `top.v`, the primary RTL design.
- `/tb`: Contains the functional testbench and `inst_data.mem`.
- `/docs`: FSM mapping and block diagrams.
Arjun Chati Electrical & Computer Engineering, The University of Texas at Austin
LinkedIn | GitHub