102 changes: 102 additions & 0 deletions rtl/video/# fb_ctrl - Module Brief (v0.1).md
@@ -0,0 +1,102 @@
# fb_ctrl - Module Brief (v0.1)

**Owner:** Jay Merveilleux
**RTL:** `rtl/video/fb_ctrl.sv`

The `fb_ctrl` module implements the display scanout engine for the video subsystem by autonomously reading pixel data from a framebuffer stored in system memory and delivering it to the video output pipeline in precise synchronization with VGA timing. A framebuffer is a memory-resident image buffer in which each location represents the color of a screen pixel; `fb_ctrl` continuously scans this memory in raster order (left to right, top to bottom) using DMA-style AXI reads driven by the active display timing. The module supports configurable base address and stride, enabling flexible memory layout and efficient, cache-aligned access to frame data. To prevent visual artifacts such as tearing, `fb_ctrl` employs double buffering, displaying one framebuffer while another is updated and performing buffer swaps only at vertical sync boundaries. In addition, the controller expands indexed pixel formats via palette or colormap lookup and raises a VSYNC interrupt to coordinate safe rendering and buffer management with software.
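
The raster-order scan with base address and stride can be illustrated with a small Python model. This is a sketch, not the RTL: the function names, the example base address, and the 1024-byte stride are assumptions chosen for illustration, and it computes one address per pixel rather than modeling AXI bursts.

```python
def pixel_address(base, stride, x, y, bytes_per_pixel=1):
    """Byte address of pixel (x, y) in a row-major framebuffer."""
    return base + y * stride + x * bytes_per_pixel


def raster_addresses(base, stride, h_res, v_res, bytes_per_pixel=1):
    """Yield pixel addresses left to right, then top to bottom."""
    for y in range(v_res):
        for x in range(h_res):
            yield pixel_address(base, stride, x, y, bytes_per_pixel)


# Hypothetical layout: 640 visible bytes per row, padded to a 1024-byte stride.
BASE = 0x8000_0000
STRIDE = 1024

addrs = list(raster_addresses(BASE, STRIDE, h_res=640, v_res=2))
print(hex(addrs[639]), hex(addrs[640]))  # 0x8000027f 0x80000400
```

The last pixel of row 0 and the first pixel of row 1 are not adjacent in memory: the address jumps by the stride, not by the visible row width.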

---

## Parameters

- **H_RES** (default: 640)
Defines the horizontal resolution of the active display area and the number of pixels scanned per line.

- **V_RES** (default: 480)
Defines the vertical resolution of the active display area and the total number of visible lines per frame.

- **Pixel format / data width**
Number of bits per pixel fetched from framebuffer memory. The default configuration uses 8-bit indexed color expanded to RGB via palette/COLORMAP logic.

- **Address width**
Width of framebuffer base address and internal address counters, defining the maximum addressable framebuffer size in system memory.

- **Stride support**
Programmable line stride allowing each framebuffer row to begin at a configurable byte offset, supporting padded or cache-line-aligned layouts.

- **AXI burst length / beat size**
Controls AXI read burst sizing during scanout to maximize sustained memory bandwidth and minimize bus overhead.

- **Double-buffer enable**
Enables front/back framebuffer operation with swaps applied only at VSYNC boundaries.

- **Scaling mode**
Optional pixel replication used for resolution upscaling (e.g., 320×200 source scaled to 640×480 output).
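
As a sketch of how pixel replication maps output coordinates back to source coordinates, the Python model below uses a nearest-neighbor mapping. The brief does not say how the RTL handles the non-integer 200-to-480 vertical ratio, so the mapping formula and both function names are assumptions for illustration only.

```python
def source_coord(dst, src_res, dst_res):
    """Map an output coordinate to the source coordinate it replicates."""
    return (dst * src_res) // dst_res


def replication_counts(src_res, dst_res):
    """How many output pixels (or lines) each source pixel (or line) covers."""
    counts = [0] * src_res
    for dst in range(dst_res):
        counts[source_coord(dst, src_res, dst_res)] += 1
    return counts


h_counts = replication_counts(320, 640)  # every source pixel shown twice
v_counts = replication_counts(200, 480)  # source lines shown 2 or 3 times
print(set(h_counts), v_counts[:5])
```

With an integer ratio (320 to 640) every source pixel is repeated the same number of times. With the 2.4x vertical ratio the repeat count alternates between 2 and 3 lines, which is one plausible way to fill all 480 output lines.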

---

## Interfaces (Ports)

| Signal | Dir | Width | Description |
|------------------|-----|--------|---------------------------------------------------------------------------|
| `clk_i` | in | 1 | System clock driving framebuffer control logic and DMA request generation |
| `rst_ni` | in | 1 | Active-low reset; clears internal state and disables scanout |
| `enable_i` | in | 1 | Enables framebuffer scanout |
| `fb_base_i` | in | ADDR_W | Base address of active framebuffer in DDR |
| `fb_stride_i` | in | STR_W | Byte stride between successive framebuffer rows |
| `fb_swap_i` | in | 1 | Requests front/back framebuffer swap (applied at VSYNC) |
| `pixel_x_i` | in | X_W | Horizontal pixel coordinate from VGA timing |
| `pixel_y_i` | in | Y_W | Vertical pixel coordinate from VGA timing |
| `active_video_i` | in | 1 | High during visible display region |
| `pixel_index_o` | out | PIX_W | Indexed pixel fetched from framebuffer |
| `pixel_rgb_o` | out | RGB_W | Expanded RGB pixel output |
| `axi_ar_*` | out | - | AXI read address channel |
| `axi_r_*` | in | - | AXI read data channel |
| `vsync_irq_o` | out | 1 | VSYNC interrupt signaling frame boundary |

---

## Reset / Initialization

The `fb_ctrl` module uses an active-low reset (`rst_ni`) to return all internal state to a known idle condition. When reset is asserted, scanout is disabled, internal registers are cleared, and no AXI memory transactions are issued. Pixel output and VSYNC interrupt generation are suppressed during reset. After reset is deasserted, software programs framebuffer base address, stride, and buffer configuration before enabling scanout. Normal operation begins once enabled, with any buffer swap requests applied on the next VSYNC boundary.

---

## Behavior & Timing

The `fb_ctrl` module operates synchronously on the system clock (`clk_i`) as a continuously running scanout engine. When enabled, it autonomously issues AXI read transactions to fetch pixel data from the active framebuffer in raster order, synchronized to VGA timing. Framebuffer addresses advance horizontally across each line and jump by the programmed stride at line boundaries. Memory fetches are gated during blanking intervals using `active_video_i`. Double-buffer swaps are latched during operation and applied atomically at VSYNC, where an optional interrupt signals frame completion.
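
The latch-then-apply behavior of buffer swaps can be modeled in a few lines of Python. This is a behavioral sketch under assumptions: the class name, the method names, and the idea that both buffer addresses are held inside the controller are hypothetical, since the port list only exposes `fb_base_i` and `fb_swap_i`.

```python
class SwapLatch:
    """Remembers a swap request and applies it only at VSYNC."""

    def __init__(self, front_base, back_base):
        self.front = front_base   # buffer currently being scanned out
        self.back = back_base     # buffer software is drawing into
        self.pending = False

    def request_swap(self):
        """Model of a swap request arriving mid-frame: only latch it."""
        self.pending = True

    def on_vsync(self):
        """Apply a latched swap at the frame boundary; return True if swapped."""
        if not self.pending:
            return False
        self.front, self.back = self.back, self.front
        self.pending = False
        return True


latch = SwapLatch(front_base=0x8000_0000, back_base=0x8010_0000)
latch.request_swap()
mid_frame_front = latch.front      # unchanged until VSYNC
swapped = latch.on_vsync()
print(hex(mid_frame_front), hex(latch.front), swapped)
```

Because the visible base address changes only inside `on_vsync`, a request that arrives mid-frame cannot cause scanout to mix lines from two buffers.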

---

## Programming Model

The `fb_ctrl` module is configured through memory-mapped control registers defined in `specs/registers/video.yaml`. Software programs framebuffer base address, stride, pixel format, palette configuration, and optional front/back buffers before enabling scanout. Once enabled, hardware autonomously performs pixel fetch, format expansion, and display synchronization. A VSYNC interrupt allows software to safely update frame data or request buffer swaps, which are applied only at VSYNC.
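
The format expansion step for the default 8-bit indexed mode amounts to a table lookup. The Python sketch below assumes a 256-entry palette of `(r, g, b)` tuples; the palette contents, the tuple representation, and the function names are illustrative assumptions, not the register layout in `specs/registers/video.yaml`.

```python
def make_grayscale_palette(entries=256):
    """Hypothetical starting palette: index i maps to gray level i."""
    return [(i, i, i) for i in range(entries)]


def expand_pixel(index, palette):
    """Expand one indexed pixel to its (r, g, b) color."""
    return palette[index]


def expand_line(indices, palette):
    """Expand a run of indexed pixels fetched from the framebuffer."""
    return [expand_pixel(i, palette) for i in indices]


palette = make_grayscale_palette()
palette[1] = (255, 0, 0)            # software reprograms entry 1 to red

fetched = bytes([0, 1, 255])        # three 8-bit indices from memory
rgb = expand_line(fetched, palette)
print(rgb)  # [(0, 0, 0), (255, 0, 0), (255, 255, 255)]
```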

---

## Errors / IRQs

The `fb_ctrl` module does not implement internal error detection for invalid configuration parameters or memory access failures. It assumes valid framebuffer addresses and reliable AXI read responses. No explicit error flags are generated for out-of-bounds access or underruns. An optional VSYNC interrupt is generated at the end of each frame and cleared via control/status registers as defined in `specs/registers/video.yaml`.
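
Since the hardware performs no bounds checking, a driver may want to validate its configuration before enabling scanout. The Python sketch below shows one possible software-side check; the function name, the exclusive `mem_end` convention, and the example memory window are assumptions for illustration.

```python
def validate_framebuffer(base, stride, h_res, v_res, bytes_per_pixel,
                         mem_start, mem_end):
    """Return a list of problems; an empty list means the layout looks safe.

    mem_end is exclusive: the first byte address past the usable window.
    """
    problems = []
    row_bytes = h_res * bytes_per_pixel
    if stride < row_bytes:
        problems.append("stride smaller than one visible row")
    # One past the last byte that scanout will read.
    fb_end = base + (v_res - 1) * stride + row_bytes
    if base < mem_start or fb_end > mem_end:
        problems.append("framebuffer extends outside the memory window")
    return problems


# Hypothetical 512 KiB window starting at 0x8000_0000.
ok = validate_framebuffer(0x8000_0000, 1024, 640, 480, 1,
                          mem_start=0x8000_0000, mem_end=0x8008_0000)
bad = validate_framebuffer(0x8000_0000, 512, 640, 480, 1,
                           mem_start=0x8000_0000, mem_end=0x8008_0000)
print(ok, bad)
```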

---

## Performance Targets

- Sustains continuous scanout at one pixel per pixel clock
- Operates at standard video pixel clocks (e.g., 25-40+ MHz for VGA modes)
- AXI bandwidth provisioned for worst-case resolution and pixel format
- Bounded, deterministic pixel latency through fixed pipeline depth
- Tear-free full-frame updates via double buffering
- No jitter introduced into active display timing
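
A rough estimate of the read bandwidth behind these targets can be worked out as follows. The Python sketch assumes the default 640×480 mode, one byte per pixel, a 60 Hz refresh, and a 25 MHz pixel clock; it ignores AXI protocol overhead and burst rounding, so treat the numbers as a lower bound.

```python
def scanout_bandwidth(h_res, v_res, bytes_per_pixel, refresh_hz, pixel_clk_hz):
    """Return (average, peak) framebuffer read bandwidth in bytes per second.

    average: visible bytes per frame times frames per second.
    peak:    rate needed while a visible line is being scanned, when one
             pixel is consumed on every pixel clock.
    """
    average = h_res * v_res * bytes_per_pixel * refresh_hz
    peak = pixel_clk_hz * bytes_per_pixel
    return average, peak


avg, peak = scanout_bandwidth(640, 480, 1, 60, 25_000_000)
print(avg, peak)  # 18432000 25000000
```

The gap between the two figures comes from blanking: no pixels are consumed during porches and sync, so buffering fetched data ahead of the display can smooth demand toward the average.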

---

## Dependencies

Depends on system clock (`clk_i`), reset (`rst_ni`), AXI memory fabric, and backing DDR. Requires VGA timing signals from `vga_timing.sv`. Software must program valid framebuffer parameters via `specs/registers/video.yaml`. The AXI interconnect arbitrates memory access against other masters and must provide sufficient bandwidth for worst-case scanout.

---

## Verification Links

Verified using directed simulation testbenches validating raster-order scanout, stride handling, blanking behavior, and VSYNC-synchronized buffer swaps. System-level video simulations and test applications validate sustained operation under memory contention. AXI memory models observe read access patterns and burst alignment. Known limitations include reliance on software for bounds checking and the absence of formal verification under extreme contention.
60 changes: 60 additions & 0 deletions rtl/video/VGA_timing_op.md
@@ -0,0 +1,60 @@
The purpose of the VGA timing module is to generate the horizontal and vertical synchronization pulses and pixel coordinate signals required for standard VGA resolutions (e.g., 640×480 @ 60 Hz). It defines when each pixel should be drawn, when blanking intervals occur, and when sync pulses are active, essentially acting as the heartbeat of the display pipeline. Other video blocks, such as the framebuffer controller or DAC driver, use these timing signals to know when to fetch and output video data.

Parameters

| Name | Default | Description |
| ----------| ------- | --------------------------------------------|
| H_RES | 640 | Active horizontal pixels per line |
| V_RES | 480 | Active vertical pixels per frame |
| H_FP | 16 | Horizontal front porch (pixels) |
| H_SYNC | 96 | Horizontal sync pulse width (pixels) |
| H_BP | 48 | Horizontal back porch (pixels) |
| V_FP | 10 | Vertical front porch (lines) |
| V_SYNC | 2 | Vertical sync pulse width (lines) |
| V_BP      | 33      | Vertical back porch (lines)                 |
| PIXEL_CLK | 25000000 | Pixel clock frequency (Hz) for 640×480 @ 60 Hz |

Interfaces (Ports)

| Signal | Dir | Width | Description |
| ---- | ---- | ---- |-------------------------------------------------------------------------------------------------------------|
| clk | Input | 1 | Main pixel clock. Ensures timing logic and display pipeline stay synchronized |
| reset | Input | 1 | Active-low synchronous reset |
| hsync | Output | 1 | Horizontal sync pulse. Signals the end of a line (start of a new line) |
| vsync | Output | 1 | Vertical sync pulse. Signals the end of a frame (start of a new refresh) |
| x | Output | 10 | Horizontal pixel counter (0-639 during active display) |
| y | Output | 10 | Vertical line counter (0-479 during active display) |
| active_video | Output | 1 | High during visible display time; low during blanking intervals (used to gate pixel output or blank screen) |

Reset/Initialization

- On reset (reset = 0), all internal counters (x, y) reset to zero and both sync outputs (hsync, vsync) are deasserted
- The module begins normal operation as soon as reset is released and a valid clk is present
- No external configuration sequence is required; timing parameters are static or parametrized at synthesis

Behavior and Timing

- The module implements two nested counters:
  - The horizontal counter (x) increments every clock cycle
  - When x reaches the total pixels per line, it resets to zero and increments the vertical counter (y)
- hsync is asserted low for H_SYNC cycles after the active + front porch interval
- vsync is asserted low for V_SYNC lines after the active + front porch period
- The signal active_video is high only when both x and y are within the active display area
- With a 25 MHz pixel clock and standard 640×480 timing, the resulting refresh rate is approximately 60 Hz
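
The counter and sync behavior described above can be sketched as a small Python model. It is illustrative only and is not derived from the RTL. The porch and sync widths are the commonly published 640×480 @ 60 Hz values (16/96/48 pixels horizontally, 10/2/33 lines vertically) and are hard-coded here as an assumption. The class and function names are hypothetical, and the wrap of y at the end of the frame is implied rather than stated in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VgaTiming:
    h_res: int = 640
    h_fp: int = 16
    h_sync: int = 96
    h_bp: int = 48
    v_res: int = 480
    v_fp: int = 10
    v_sync: int = 2
    v_bp: int = 33  # assumption: standard vertical back porch

    @property
    def h_total(self):
        return self.h_res + self.h_fp + self.h_sync + self.h_bp

    @property
    def v_total(self):
        return self.v_res + self.v_fp + self.v_sync + self.v_bp


def signals(t, x, y):
    """Return (hsync, vsync, active_video); both syncs are active low."""
    hs_start = t.h_res + t.h_fp
    vs_start = t.v_res + t.v_fp
    hsync = 0 if hs_start <= x < hs_start + t.h_sync else 1
    vsync = 0 if vs_start <= y < vs_start + t.v_sync else 1
    active = 1 if (x < t.h_res and y < t.v_res) else 0
    return hsync, vsync, active


def step(t, x, y):
    """Advance the nested counters by one pixel clock."""
    x += 1
    if x == t.h_total:
        x = 0
        y += 1
        if y == t.v_total:
            y = 0
    return x, y


def refresh_hz(t, pixel_clk_hz):
    """Frames per second: pixel clock divided by clocks per frame."""
    return pixel_clk_hz / (t.h_total * t.v_total)


t = VgaTiming()
print(t.h_total, t.v_total, round(refresh_hz(t, 25_000_000), 2))  # 800 525 59.52
```

Under these assumed values a frame takes 800 × 525 = 420,000 pixel clocks, so an exact 25 MHz clock gives about 59.5 Hz. The refresh rate depends directly on the porch totals as well as on the clock.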

Errors / IRQs

- This module does not generate interrupts or error signals
- It operates continuously as long as a valid clock is provided
- Any display synchronization or VSYNC interrupt is usually handled by the framebuffer controller

Dependencies

- Clock: Requires a stable pixel clock (typically 25 MHz).
- Reset: Synchronous, active-low (reset).
- Upstream IP: Clock generation block (PLL or divider).
- Downstream IP: Framebuffer controller or video DAC/encoder that consumes timing signals.

Summary

- The VGA timing generator defines the temporal structure of a video frame by driving sync pulses, counters, and valid video windows. It forms the foundation for raster-scan display logic and provides synchronization for all downstream video pipeline modules.
94 changes: 94 additions & 0 deletions rtl/video/audio_dma_one_pager.md
@@ -0,0 +1,94 @@
# audio_dma — Module Brief (v0.1)

**Owner:** Jay Merveilleux
**RTL:** `rtl/audio/audio_dma.sv`

The `audio_dma` module implements the streaming direct memory access (DMA) engine for the GamingCPU audio subsystem. It autonomously transfers audio sample data from system memory into a local buffering interface that feeds downstream audio output blocks such as `i2s_tx` or `pwm_audio`. Designed around a ring-buffer model, `audio_dma` allows software to continuously produce audio samples into memory while hardware consumes them at a deterministic rate, decoupling real-time audio playback from CPU scheduling and instruction execution. By issuing burst-based AXI read transactions, the module sustains audio throughput with minimal bus overhead and predictable latency.

---

## Parameters

- **Address width**
Width of the AXI address bus and internal address counters, defining the maximum addressable audio buffer size in system memory.

- **Data width**
Width of AXI read data beats, typically aligned to the system memory and interconnect configuration.

- **Ring buffer size**
Defines the total size of the circular audio buffer in bytes or samples.

- **Burst length**
Controls the maximum AXI read burst size used when fetching audio data, balancing latency and bus efficiency.

- **FIFO depth**
Depth of internal buffering between AXI read responses and the downstream audio sink.

---

## Interfaces (Ports)

| Signal | Dir | Width | Description |
|-------------------|-----|-------|----------------------------------------------------------------------|
| `clk_i` | in | 1 | System clock driving DMA control logic |
| `rst_ni` | in | 1 | Active-low reset; clears internal state and halts DMA operation |
| `enable_i` | in | 1 | Enables audio DMA operation |
| `buf_base_i` | in | ADDR | Base address of audio ring buffer in system memory |
| `buf_size_i` | in | SIZE | Total size of ring buffer |
| `rd_ptr_i` | in | ADDR | Read pointer supplied by software or internal state |
| `axi_ar_*` | out | — | AXI read address channel |
| `axi_r_*` | in | — | AXI read data channel |
| `sample_*` | out | — | Audio sample stream output to downstream audio blocks |
| `sample_valid_o` | out | 1 | Indicates valid audio sample data |
| `sample_ready_i` | in | 1 | Backpressure from downstream consumer |
| `underrun_o` | out | 1 | Indicates ring buffer underrun condition |
| `irq_o` | out | 1 | Optional interrupt signaling underrun or threshold events |

---

## Reset / Initialization

The `audio_dma` module uses an active-low reset (`rst_ni`) to return all internal state to a known idle condition. When reset is asserted, all AXI transactions are halted, internal FIFOs are flushed, address counters are cleared, and no audio samples are emitted. Underrun status is cleared, and interrupt outputs are deasserted. After reset deassertion, software programs the ring buffer base address, buffer size, and initial read/write pointers before enabling DMA operation.

---

## Behavior & Timing

The `audio_dma` module operates synchronously on the system clock (`clk_i`) as a continuously running streaming DMA engine. When enabled, it issues AXI read transactions to fetch audio data from the configured ring buffer in memory. Address generation advances sequentially through the buffer and wraps automatically at the end of the configured buffer region, implementing circular addressing semantics.
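
Circular addressing can be sketched in Python as follows. The function names and the choice to split a burst that would cross the end of the buffer are assumptions for illustration; the brief does not state how the RTL handles a burst that straddles the wrap point.

```python
def advance(addr, nbytes, buf_base, buf_size):
    """Advance a read address by nbytes, wrapping inside the ring buffer."""
    offset = (addr - buf_base + nbytes) % buf_size
    return buf_base + offset


def split_burst(addr, nbytes, buf_base, buf_size):
    """Split a read of nbytes (<= buf_size) into pieces that do not cross
    the end of the buffer. Returns a list of (address, length) tuples."""
    buf_end = buf_base + buf_size
    first = min(nbytes, buf_end - addr)
    pieces = [(addr, first)]
    if first < nbytes:
        pieces.append((buf_base, nbytes - first))
    return pieces


# Hypothetical 256-byte ring at 0x1000, read pointer 16 bytes from the end.
BUF_BASE, BUF_SIZE = 0x1000, 256
pieces = split_burst(0x10F0, 32, BUF_BASE, BUF_SIZE)
next_addr = advance(0x10F0, 32, BUF_BASE, BUF_SIZE)
print(pieces, hex(next_addr))
```

Here the 32-byte read becomes 16 bytes up to the end of the buffer followed by 16 bytes from the base, and the read pointer ends up 16 bytes into the buffer.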

Read data returned over the AXI interface is queued into an internal FIFO, decoupling memory access timing from the fixed consumption rate of the downstream audio output block. Data is presented to the consumer using a valid/ready handshake. AXI read bursts are sized to maximize sustained bandwidth while minimizing arbitration overhead on the shared interconnect.
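
The decoupling role of the FIFO and the valid/ready handshake can be modeled with a short Python sketch. The class and its method names are hypothetical. The model captures only the handshake rule (a sample transfers on a cycle where valid and ready are both high), not clocking or AXI response timing.

```python
from collections import deque


class SampleFifo:
    """Bounded FIFO between the memory side and the audio consumer."""

    def __init__(self, depth):
        self.depth = depth
        self._q = deque()

    def can_accept(self):
        """Memory side may push only while there is free space."""
        return len(self._q) < self.depth

    def push(self, sample):
        if not self.can_accept():
            raise OverflowError("FIFO full; the read should not have been issued")
        self._q.append(sample)

    @property
    def valid(self):
        """Mirrors sample_valid_o: high whenever a sample is available."""
        return len(self._q) > 0

    def cycle(self, ready):
        """One consumer-side cycle; returns the sample transferred, or None."""
        if self.valid and ready:
            return self._q.popleft()
        return None


fifo = SampleFifo(depth=4)
for s in (100, 200):
    fifo.push(s)

stalled = fifo.cycle(ready=False)   # backpressure: nothing moves
first = fifo.cycle(ready=True)
second = fifo.cycle(ready=True)
empty = fifo.cycle(ready=True)      # valid is low, so no transfer
print(stalled, first, second, empty)  # None 100 200 None
```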

---

## Programming Model

The `audio_dma` module is configured through memory-mapped control registers defined in the audio/DMA register specification. Software initializes the audio ring buffer in memory and programs buffer base address, buffer size, and control flags before enabling DMA operation. During playback, software advances the producer write pointer independently, while `audio_dma` advances the consumer read pointer autonomously. Status registers allow software to monitor buffer occupancy and detect underrun conditions.
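
Buffer occupancy follows from the two pointers with modular arithmetic. The Python sketch below assumes both pointers are byte offsets from the buffer base and that equal pointers mean "empty", which forces the producer to leave one unit unused. Both conventions and the function names are assumptions, since the register specification is not reproduced here.

```python
def occupancy(wr_off, rd_off, size):
    """Bytes produced but not yet consumed."""
    return (wr_off - rd_off) % size


def free_space(wr_off, rd_off, size):
    """Bytes the producer may write without overtaking the reader.

    One byte is kept unused so that wr == rd always means 'empty'.
    """
    return size - 1 - occupancy(wr_off, rd_off, size)


SIZE = 4096
print(occupancy(1024, 512, SIZE))    # 512: writer ahead of reader
print(occupancy(256, 3840, SIZE))    # 512: writer has wrapped, reader has not
print(free_space(256, 3840, SIZE))   # 3583
```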

---

## Errors / IRQs

The primary error condition detected by `audio_dma` is a buffer underrun, which occurs when the DMA engine attempts to fetch audio data beyond the available produced samples. In this condition, the module asserts an underrun status flag and may generate an interrupt to notify software. Depending on configuration, the DMA engine may stall, continue issuing reads that return invalid data, or output zero-valued samples downstream until the buffer is refilled. Recovery is software-driven and involves replenishing the ring buffer and clearing the underrun status.
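
Of the behaviors listed, the zero-fill option is the simplest to illustrate. The Python sketch below models only that policy, with sample-granular pointers and equal pointers meaning "empty". The function name and return shape are hypothetical, and the stall and continue-reading options are not modeled.

```python
def fetch_samples(buffer, rd, wr, count):
    """Consume up to `count` samples from a ring buffer.

    Returns (samples, new_rd, underrun). When the reader catches up with
    the writer, zeros are emitted and the underrun flag is set.
    """
    size = len(buffer)
    out = []
    underrun = False
    for _ in range(count):
        if rd == wr:            # nothing left that software has produced
            out.append(0)
            underrun = True
        else:
            out.append(buffer[rd])
            rd = (rd + 1) % size
    return out, rd, underrun


ring = [10, 20, 30, 40]
samples, new_rd, underrun = fetch_samples(ring, rd=2, wr=0, count=3)
print(samples, new_rd, underrun)  # [30, 40, 0] 0 True
```

The read pointer stops at the write pointer instead of running past it, so playback resumes from the correct position once software refills the buffer.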

---

## Performance Targets

- Sustains continuous audio streaming at the configured sample rate
- Supports burst-based AXI reads aligned to cache-line boundaries
- Maintains deterministic sample delivery to downstream audio blocks
- Tolerates short-term memory latency via internal buffering
- No audible glitches during steady-state operation
- Graceful handling of underrun conditions
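
To put rough numbers on these targets, the Python sketch below computes the sustained byte rate and how long a given FIFO depth can cover a memory stall. The 48 kHz stereo 16-bit format and the 64-frame FIFO depth are example values, not parameters taken from the design.

```python
def audio_byte_rate(sample_rate_hz, channels, bytes_per_sample):
    """Sustained read bandwidth in bytes per second."""
    return sample_rate_hz * channels * bytes_per_sample


def fifo_cover_time_us(fifo_depth_frames, sample_rate_hz):
    """How long a full FIFO can feed the consumer with no new data, in
    microseconds. One frame is one sample per channel."""
    return fifo_depth_frames * 1_000_000 / sample_rate_hz


rate = audio_byte_rate(48_000, channels=2, bytes_per_sample=2)
cover = fifo_cover_time_us(64, 48_000)
print(rate, round(cover, 1))  # 192000 1333.3
```

At these example settings the stream needs under 0.2 MB/s, so the practical constraint is worst-case memory latency rather than raw bandwidth.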

---

## Dependencies

Depends on the system clock (`clk_i`), reset (`rst_ni`), AXI memory fabric, and backing system memory. Requires a downstream audio consumer such as `i2s_tx` or `pwm_audio`. Software must configure valid ring buffer parameters via the audio/DMA register specification. The AXI interconnect must provide sufficient bandwidth to sustain real-time audio fetches under worst-case contention.

---

## Verification Links

Verified using directed simulation testbenches validating ring buffer wraparound behavior, AXI burst alignment, and backpressure handling. System-level audio simulations confirm sustained playback under memory contention and proper underrun detection. AXI memory models observe read access patterns and FIFO behavior. Known limitations include reliance on software for correct buffer sizing and the absence of formal verification for extreme arbitration scenarios.