This repository was archived by the owner on May 28, 2026. It is now read-only.
Merged
83 changes: 83 additions & 0 deletions .claude/rules/architecture/cpu.md
@@ -0,0 +1,83 @@
---
paths: hdl/cpu/**
---

# CPU Architecture

**Last updated**: 2026-01-05
**Sources**: [cpu.v](hdl/cpu/cpu.v), [cpu_core_params.vh](hdl/cpu/cpu_core_params.vh)

RV32I soft core, 3-stage pipeline. No M/F/D extensions, no multiplication.

## Pipeline Stages

**Stage 1: Fetch/Decode/Execute**
- Fetch instruction (AXI), decode, execute ALU/comparator
- Outputs: `w_Alu_Result`, `w_Compare_Result`, `w_Instruction_Valid`

**Stage 2: Memory/Wait**
- Issue AXI read/write for loads/stores
- Pipeline registers: `r_S2_*` (Valid, Alu_Result, Load_Data, Rd, Write_Enable)
- Stalls while memory operations complete

**Stage 3: Writeback**
- Write ALU result, load data, immediate, or PC+4 to register file
- Writeback mux selects source based on `r_S3_Wb_Src`

## Timing

**Cycles per instruction**: Variable
- S1: 1 cycle (ALU/decode)
- S2: 0 cycles (no memory) or 2-4 cycles (load/store AXI transaction)
- S3: 1 cycle (writeback)

Tests use `PIPELINE_CYCLES` from [tests/cpu/constants.py](tests/cpu/constants.py) as a conservative wait.
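
As a rough illustration of the per-stage counts above, the following Python sketch adds them up for one instruction. It assumes the stage latencies simply add with no overlap and ignores debug and calibration stalls; `instruction_latency` is a hypothetical helper, not part of the test suite, and the result is not the actual value of `PIPELINE_CYCLES`.

```python
# Per-stage cycle counts taken from the Timing list above.
S1_CYCLES = 1
S3_CYCLES = 1
S2_MEM_CYCLES = (2, 4)  # (min, max) for a load/store AXI transaction


def instruction_latency(is_memory_op):
    """Return (min, max) cycles for one instruction.

    Assumption: stage latencies add up with no overlap between stages.
    """
    s2_min, s2_max = S2_MEM_CYCLES if is_memory_op else (0, 0)
    return (S1_CYCLES + s2_min + S3_CYCLES,
            S1_CYCLES + s2_max + S3_CYCLES)
```

Under these assumptions an ALU instruction takes 2 cycles and a load/store takes 4 to 6, which is why a fixed conservative wait is simpler for tests than tracking each instruction.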

## Stall Logic

```verilog
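// Stage 1 stalls on a debug halt, before DDR3 calibration completes,
// or while a load/store in stage 2 is still waiting on AXI.
assign w_Stall_S1 = w_Debug_Stall
                 || !i_Init_Calib_Complete
                 || (r_S2_Valid && (w_S2_Is_Load || w_S2_Is_Store)
                     && !(w_Mem_Read_Done || w_Mem_Write_Done));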
```

CPU stalls when:
- `w_Debug_Stall`: Debug peripheral halted CPU
- `!i_Init_Calib_Complete`: DDR3 MIG not ready
- Memory op in progress: S2 has valid load/store waiting for AXI completion
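
For reasoning about the stall conditions in a testbench, the expression can be mirrored in Python. This is a sketch only: the argument names follow the Verilog signals above, but `stall_s1` itself is a hypothetical helper.

```python
def stall_s1(debug_stall, init_calib_complete, s2_valid,
             s2_is_load, s2_is_store, mem_read_done, mem_write_done):
    """Software mirror of the w_Stall_S1 expression shown above."""
    # A valid load/store in stage 2 that has not yet seen a done pulse.
    mem_op_pending = (s2_valid
                      and (s2_is_load or s2_is_store)
                      and not (mem_read_done or mem_write_done))
    return bool(debug_stall or not init_calib_complete or mem_op_pending)
```

Note that, as written, either done signal releases the stall regardless of whether the pending operation is a load or a store.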

## Hazards

**Status**: No hazard detection or forwarding implemented.

**Workaround**: Tests insert NOPs or wait `PIPELINE_CYCLES` between dependent instructions.
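
The NOP workaround can be sketched as a blanket padding pass over a program. This is an illustration, not the test suite's actual helper: `pad_with_nops` and its `nops_between` parameter are hypothetical, and the number of NOPs required is not stated here, so it is left to the caller. The NOP encoding itself (`addi x0, x0, 0`) is the standard RV32I one.

```python
NOP = 0x00000013  # addi x0, x0, 0: the canonical RV32I NOP encoding


def pad_with_nops(program, nops_between):
    """Return a new instruction list with NOPs after every instruction.

    This pads unconditionally; it does not analyse dependencies.
    """
    padded = []
    for word in program:
        padded.append(word)
        padded.extend([NOP] * nops_between)
    return padded
```

Padding every instruction is wasteful but safe, which fits a core without hazard detection; a dependency-aware pass would only be worthwhile if test run time became a problem.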

## PC (Program Counter)

- **Normal**: `PC += 4` after instruction completes
- **Branch taken**: `PC = PC + immediate`
- **Jump**: `PC = target address`
- **Reset**: `PC = 0`

Mux control: `w_Pc_Alu_Mux_Select` chooses between `PC+4` and `w_Alu_Result`
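
A minimal Python model of the next-PC selection, for illustration. It assumes that a set select picks the ALU result (the polarity of `w_Pc_Alu_Mux_Select` is not stated here) and that the PC wraps at 32 bits; `next_pc` is a hypothetical name.

```python
MASK32 = 0xFFFFFFFF


def next_pc(pc, alu_result, pc_alu_mux_select, reset=False):
    """Model of the PC mux: reset, then ALU result, then PC + 4."""
    if reset:
        return 0
    if pc_alu_mux_select:
        # Assumed polarity: select high routes w_Alu_Result to the PC.
        return alu_result & MASK32
    return (pc + 4) & MASK32
```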

## Register File

32 registers × 32 bits (XLEN=32)
- Read ports: Rs1, Rs2 (from `instruction[19:15]`, `instruction[24:20]`)
- Write port: Rd (`r_S3_Rd`), enabled by `w_Wb_Enable`
- Sources: ALU, comparator, immediate, PC+4, load data
- Register 0 always reads 0 (RISC-V spec)

See [register_file.v](hdl/cpu/register_file/register_file.v)
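
The register file behaviour described above can be modelled in a few lines of Python, which is handy as a reference when checking test expectations. This is a sketch: `RegisterFile` and `decode_sources` are hypothetical names, and whether the hardware discards writes to register 0 or masks them on read is an assumption here (the model discards them).

```python
class RegisterFile:
    """32 x 32-bit register file with register 0 hardwired to zero."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return 0 if index == 0 else self._regs[index]

    def write(self, index, value, wb_enable=True):
        # Assumption: writes to x0 are dropped rather than masked on read.
        if wb_enable and index != 0:
            self._regs[index] = value & 0xFFFFFFFF


def decode_sources(instruction):
    """Extract (rs1, rs2) from instruction[19:15] and instruction[24:20]."""
    rs1 = (instruction >> 15) & 0x1F
    rs2 = (instruction >> 20) & 0x1F
    return rs1, rs2
```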

## Memory Interface

Two separate AXI4-Lite masters:
1. **Instruction memory**: Fetch-only (read)
2. **Data memory**: Loads/stores

No error handling - assumes all transactions succeed.

See [memory.md](memory.md) for AXI protocol details.
65 changes: 65 additions & 0 deletions .claude/rules/architecture/memory.md
@@ -0,0 +1,65 @@
---
paths:
- hdl/cpu/memory/**
- hdl/cpu/instruction_memory/**
---

# Memory Architecture

**Last updated**: 2026-01-05
**Sources**: [memory_axi.v](hdl/cpu/memory/memory_axi.v), [memory.vh](hdl/cpu/memory/memory.vh)

AXI4-Lite memory interface for CPU instruction/data access.

## Memory Map

| Region | Start | End | Size | Backing | Notes |
|--------|-------|-----|------|---------|-------|
| ROM | `0x0000` | `0x0FFF` | 4 KB | BRAM | Bootstrap, read-only |
| RAM | `0x1000` | (varies) | 256 MB | DDR3 (MIG) | Main memory, stack, heap |
| Peripherals | TBD | TBD | TBD | Memory-mapped | Debug UART (future) |

**ROM boundary**: `ROM_BOUNDARY_ADDR = 0x1000` - see [memory.vh](hdl/cpu/memory/memory.vh) and [tests/cpu/constants.py](tests/cpu/constants.py)
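
A small Python sketch of the address decode implied by the table. `region_for` is a hypothetical helper; the peripheral range is still TBD, so only the ROM/RAM split is modelled.

```python
ROM_BOUNDARY_ADDR = 0x1000  # value stated above


def region_for(addr):
    """Classify an address as ROM (below the boundary) or RAM (at or above)."""
    return "ROM" if addr < ROM_BOUNDARY_ADDR else "RAM"
```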

## AXI State Machine

- **Read**: `IDLE` → `READ_SUBMITTING` → `READ_AWAITING` → `READ_SUCCESS`
- **Write**: `IDLE` → `WRITE_SUBMITTING` → `WRITE_AWAITING` → `WRITE_SUCCESS`

**Latency**: 2-4 cycles (BRAM fast, DDR3 slower)
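
The two state sequences can be sketched as a Python state machine. The state names come from the list above; everything else is an assumption made for illustration: the `start_read`, `start_write`, and `handshake` inputs are hypothetical stand-ins for the real AXI signals, and the return to `IDLE` after a `*_SUCCESS` state is assumed rather than documented here.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    READ_SUBMITTING = auto()
    READ_AWAITING = auto()
    READ_SUCCESS = auto()
    WRITE_SUBMITTING = auto()
    WRITE_AWAITING = auto()
    WRITE_SUCCESS = auto()


# Linear progress within each sequence, as listed above.
_NEXT = {
    State.READ_SUBMITTING: State.READ_AWAITING,
    State.READ_AWAITING: State.READ_SUCCESS,
    State.WRITE_SUBMITTING: State.WRITE_AWAITING,
    State.WRITE_AWAITING: State.WRITE_SUCCESS,
}


def next_state(state, start_read=False, start_write=False, handshake=False):
    """Advance one clock cycle (inputs are hypothetical signal stand-ins)."""
    if state is State.IDLE:
        if start_read:
            return State.READ_SUBMITTING
        if start_write:
            return State.WRITE_SUBMITTING
        return State.IDLE
    if state in _NEXT:
        # Assumed: each step waits for the relevant ready/valid handshake.
        return _NEXT[state] if handshake else state
    # Assumed: *_SUCCESS lasts one cycle, then the machine returns to IDLE.
    return State.IDLE
```

With a handshake on every cycle, this model spends three cycles outside `IDLE` per transaction, which sits inside the 2-4 cycle latency range quoted above.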

## Load/Store Types

**Supported**:
- `LW/SW`: 32-bit word
- `LH/LHU/SH`: 16-bit halfword (signed/unsigned)
- `LB/LBU/SB`: 8-bit byte (signed/unsigned)

**Byte alignment**: AXI write strobes (`wstrb`) enable byte-level writes without read-modify-write. Load data extraction uses `i_Addr[1:0]` offset with sign-extension for LB/LH.

See [memory_axi.v](hdl/cpu/memory/memory_axi.v) for alignment logic.
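
To make the strobe and extraction rules concrete, here is a Python sketch of both directions on a 32-bit little-endian bus. It is not a transcription of `memory_axi.v`: `write_strobe`, `extract_load`, and the `"B"`/`"H"`/`"W"` size keys are hypothetical, and accesses that would cross a word boundary are simply truncated, consistent with the lack of alignment checks.

```python
SIZE_BYTES = {"B": 1, "H": 2, "W": 4}


def write_strobe(addr, size):
    """Return the 4-bit wstrb mask for a store of the given size."""
    offset = addr & 0x3                      # i_Addr[1:0]
    mask = (1 << SIZE_BYTES[size]) - 1       # 0b1, 0b11 or 0b1111
    return (mask << offset) & 0xF            # misaligned bits fall off the top


def extract_load(word, addr, size, signed):
    """Pick the addressed byte/halfword out of a 32-bit word and extend it."""
    offset = addr & 0x3
    bits = 8 * SIZE_BYTES[size]
    value = (word >> (8 * offset)) & ((1 << bits) - 1)
    if signed and value & (1 << (bits - 1)):
        value -= 1 << bits                   # sign-extend (LB/LH)
    return value & 0xFFFFFFFF                # back to a 32-bit register value
```

For example, `write_strobe(0x1002, "H")` enables the upper two byte lanes, and a signed byte load of `0x80` comes back as `0xFFFFFF80`.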

## Access Patterns

**Instruction fetch**:
- Address < 0x1000: Fast BRAM access
- Address >= 0x1000: AXI transaction to DDR3
- Interface: `s_instruction_memory_axil_*` (read-only)

**Data load/store**:
- Typically RAM (ROM is read-only)
- Interface: `s_data_memory_axil_*` (read/write)

## Constants

Constants defined in `.vh` files:
- [memory.vh](hdl/cpu/memory/memory.vh): `LS_TYPE_*`, state machine states, ROM boundary
- [cpu_core_params.vh](hdl/cpu/cpu_core_params.vh): Register widths, control signal widths

Python mirror: [tests/cpu/constants.py](tests/cpu/constants.py) - must stay in sync with `.vh` files.
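
One way to catch drift between the `.vh` files and the Python mirror is a small comparison script. The sketch below assumes the headers use `` `define NAME value `` lines with plain decimal or unsigned sized literals (for example `32'h1000`); if they use `localparam` instead, the regular expression would need to change. `parse_verilog_literal` and `find_mismatches` are hypothetical helpers.

```python
import re

# Assumed header style: `define NAME <literal>
_DEFINE = re.compile(r"^\s*`define\s+(\w+)\s+(\S+)", re.MULTILINE)
_BASES = {"h": 16, "d": 10, "b": 2, "o": 8}


def parse_verilog_literal(text):
    """Parse literals such as 32'h1000, 3'b010, 8'd12, or plain decimals."""
    text = text.replace("_", "")
    if "'" in text:
        _, rest = text.split("'", 1)
        return int(rest[1:], _BASES[rest[0].lower()])
    return int(text)


def find_mismatches(vh_source, python_constants):
    """Return {name: (vh_value, python_value)} for constants that differ.

    A constant missing from the header is reported with vh_value None.
    """
    raw = dict(_DEFINE.findall(vh_source))
    mismatches = {}
    for name, expected in python_constants.items():
        actual = parse_verilog_literal(raw[name]) if name in raw else None
        if actual != expected:
            mismatches[name] = (actual, expected)
    return mismatches
```

Run against the text of `memory.vh` and a dict of the values in `constants.py`, an empty result would mean the two agree for every name checked.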

## Current Status

- DDR3 operational @ 81.25 MHz (MIG initialized 2026-01-04)
- No memory protection (the CPU can issue writes to ROM addresses; the slave may ignore them)
- No alignment checks (misaligned loads/stores may behave unexpectedly)
138 changes: 138 additions & 0 deletions .claude/rules/architecture/mig-vivado.md
@@ -0,0 +1,138 @@
---
paths:
- hdl/reset_timer.v
- config/arty-s7-50.xdc
---

# MIG DDR3 Configuration (Arty S7-50)

**Last updated**: 2026-01-05
**Status**: ✅ MIG CALIBRATION SUCCESSFUL - DDR3 functional @ 81.25 MHz

## Critical Success Factors

**MUST HAVE** for DDR3 calibration:
1. **200 MHz reference clock** to MIG `clk_ref_i` (MANDATORY for IDELAYCTRL - won't calibrate without it)
2. **Bank 34 only** for all DDR3 signals (SSTL135 @ 1.35V)
3. **200µs reset hold time** for MIG `sys_rst` (20,000 cycles @ 100 MHz)
4. **CPU reset from `ui_clk_sync_rst`**, NOT `proc_sys_reset_0/peripheral_reset` (which stays HIGH)

**Vivado project**: NOT in repository (binary files, too large). Recreate from notes below if needed.

## Working MIG Configuration

**Memory part**: MT41K128M16XX-15E
- 16-bit DDR3L, 2 Gb (128 Meg × 16 = 256 MB), -15E speed grade, **1.35V operation**
- I/O standard: **SSTL135** (NOT SSTL15)
- Bank: **Bank 34 only** (all byte groups: DQ[0-15], Address/Ctrl)
- Internal Vref: ENABLED (0.675V for Bank 34)

**MIG Parameters**:
- AXI interface: 128-bit data width (SmartConnect converts from CPU's 32-bit)
- Address width: 28-bit
- Input clock period: 10000 ps (100 MHz) → `sys_clk_i`
- Memory clock: 3077 ps (324.99 MHz, MIG-generated internally)
- Reference clock: **200 MHz (5000 ps)** → `clk_ref_i` ⚠️ CRITICAL
- PHY ratio: 4:1
- **UI clock: 81.25 MHz** (324.99 MHz ÷ 4) - CPU runs at this speed
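
The clock and address figures above follow from simple arithmetic, reproduced here in Python as a sanity check. The helper name `period_ps_to_mhz` is hypothetical; the input numbers are the ones listed in the parameters.

```python
def period_ps_to_mhz(period_ps):
    """Convert a clock period in picoseconds to a frequency in MHz."""
    return 1e6 / period_ps


MEMORY_CLOCK_MHZ = period_ps_to_mhz(3077)   # about 324.99 MHz
UI_CLOCK_MHZ = MEMORY_CLOCK_MHZ / 4         # 4:1 PHY ratio, about 81.25 MHz
ADDRESSABLE_BYTES = 1 << 28                 # 28-bit address = 256 MB
```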

## Clock Architecture

**Input**: 12 MHz from board oscillator (pin F14, LVCMOS33)

**Clock Wizard** (MMCM):
- VCO: 12 MHz × 50 = 600 MHz
- Output 1: **100 MHz** (÷6) → MIG `sys_clk_i` + reset_timer
- Output 2: **200 MHz** (÷3) → MIG `clk_ref_i` ⚠️ CRITICAL

**MIG-generated**:
- Memory interface: 324.99 MHz (internal)
- UI clock: **81.25 MHz** (CPU domain)
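
The Clock Wizard settings can be checked the same way. This sketch only reproduces the multiply/divide arithmetic listed above; `mmcm_outputs` is a hypothetical helper and does not validate the result against device VCO limits.

```python
def mmcm_outputs(input_mhz, multiplier, dividers):
    """Return (vco_mhz, [output_mhz, ...]) for an MMCM configuration."""
    vco_mhz = input_mhz * multiplier
    return vco_mhz, [vco_mhz / divider for divider in dividers]


# 12 MHz board oscillator, x50 multiplier, outputs divided by 6 and 3.
vco, outputs = mmcm_outputs(12, 50, [6, 3])
```

Here `vco` is 600 MHz and `outputs` holds the 100 MHz `sys_clk_i` and 200 MHz `clk_ref_i` frequencies.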

## Reset Architecture

**Custom reset timer** ([reset_timer.v](hdl/reset_timer.v)):
- Counts **20,000 cycles @ 100 MHz = 200µs**
- Holds MIG `sys_rst` LOW during startup (ACTIVE-LOW reset)
- Releases when count completes
- Parameters: `COUNTER_WIDTH=15`, `HOLD_CYCLES=20000`
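
The timer parameters can be derived rather than memorised. The Python sketch below recomputes the hold count and the minimum counter width, and models the active-low output; the function names are hypothetical and the model is not a transcription of `reset_timer.v`.

```python
import math


def hold_cycles(hold_time_us, clock_mhz):
    """Clock cycles needed to cover hold_time_us at clock_mhz."""
    return math.ceil(hold_time_us * clock_mhz)


def min_counter_width(cycles):
    """Smallest counter width (in bits) that can represent `cycles`."""
    return cycles.bit_length()


def sys_rst_level(cycle_count, hold=20000):
    """ACTIVE-LOW sys_rst: 0 (in reset) until `hold` cycles have elapsed."""
    return 0 if cycle_count < hold else 1
```

For 200 µs at 100 MHz this gives 20,000 cycles and a 15-bit counter, matching `HOLD_CYCLES=20000` and `COUNTER_WIDTH=15`.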

**CPU reset**:
- Connected to MIG's `ui_clk_sync_rst` (ACTIVE-HIGH, synchronized to ui_clk)
- ❌ **NOT using** `proc_sys_reset_0/peripheral_reset` (stays perpetually HIGH - known issue)

## Bank Selection - CRITICAL

**Why Bank 34 only**:
- Bank 34: Powered at **1.35V** for DDR3L (SSTL135)
- Bank 15: Has RGB LEDs requiring **3.3V** (LVCMOS33) - voltage conflict with DDR3
- Bank 14: UART signals (**3.3V** LVCMOS33)
- **Separate banks = independent VCCO rails** = no voltage conflict

**All DDR3 signals must be on Bank 34**:
- DQ[0-7] (Byte Group T0)
- DQ[8-15] (Byte Group T1)
- Address/Control-0 (Byte Group T2)
- Address/Control-1 (Byte Group T3)

## Key Lessons

1. **200 MHz ref_clk is MANDATORY**: DDR3 WILL NOT calibrate without it (IDELAYCTRL requirement)
2. **Bank voltage isolation**: Check board schematic for VCCO rail voltages before assigning pins
3. **SSTL135 for DDR3L**: Use SSTL135 (1.35V), NOT SSTL15 (1.5V) - wrong I/O standard prevents calibration
4. **Reset timing matters**: MIG requires minimum 200µs reset hold time
5. **ui_clk_sync_rst for AXI resets**: Use the inverted `ui_clk_sync_rst` (via NOT gate `util_vector_logic_1`) for SmartConnect `aresetn`, MIG `aresetn`, and VDMA `axi_resetn`. `proc_sys_reset_0`'s `peripheral_aresetn` stays stuck LOW (never deasserts), so the MIG AXI slave never responds. The inverted `ui_clk_sync_rst` is already in the 81.25 MHz domain (no CDC issue) and deasserts properly after MIG calibration.

## Vivado Block Diagram Components

**If recreating from scratch**:

1. **Clock Wizard**:
   - Input: 12 MHz
   - Outputs: 100 MHz (sys_clk), 200 MHz (ref_clk)

2. **Reset Timer** (custom Verilog):
   - Input: 100 MHz clock, Clock Wizard `locked`
   - Output: ACTIVE-LOW reset to MIG `sys_rst`
   - Hold: 20,000 cycles

3. **MIG 7-Series**:
   - Part: MT41K128M16XX-15E
   - Clocks: 100 MHz sys_clk_i, 200 MHz clk_ref_i
   - AXI: 128-bit interface
   - Bank: 34 (SSTL135)
   - Internal Vref: ENABLED

4. **AXI SmartConnect**:
   - Masters: CPU instruction + data (32-bit each)
   - Slave: MIG (128-bit)
   - Handles width conversion

5. **Processor System Reset**:
   - Generates AXI reset signals for MIG/SmartConnect
   - **Do NOT use for CPU reset** (use ui_clk_sync_rst instead)

## Troubleshooting

**Calibration fails**:
- Check 200 MHz ref_clk connected to MIG `clk_ref_i`
- Verify Bank 34 for all DDR3 pins
- Verify SSTL135 I/O standard (not SSTL15)
- Check reset hold time (minimum 200µs)

**Wrong data/corruption**:
- Verify AXI connections (SmartConnect to MIG)
- Check ui_clk domain crossing
- Verify CPU reset from ui_clk_sync_rst

**Build errors**:
- Vivado project not in repo - must recreate block diagram
- Constraint file: [arty-s7-50.xdc](config/arty-s7-50.xdc) has pin assignments

## Reference

**Board**: Arty S7-50 (xc7s50-csga324, speed grade -1)
**Memory**: 256 MB DDR3L @ 1.35V (MT41K128M16XX-15E)
**Oscillator**: 12 MHz (pin F14)

See Arty S7 reference manual for schematic and VCCO rail assignments.