From 3730173567d74a3e9ef98e61ebe3d37a7e17a077 Mon Sep 17 00:00:00 2001
From: sebasthechill
Date: Tue, 18 Nov 2025 01:11:33 -0500
Subject: [PATCH 1/2] Add correct Markdown documentation for MMU

---
 docs/ip-briefs/ptw One Pager (v1.0).md      | 296 ++++++++++++++++++++
 docs/ip-briefs/sv32_mmu One Pager (v1.0).md | 283 +++++++++++++++++++
 docs/ip-briefs/tlb One Pager (v1.0).md      | 172 ++++++++++++
 3 files changed, 751 insertions(+)
 create mode 100644 docs/ip-briefs/ptw One Pager (v1.0).md
 create mode 100644 docs/ip-briefs/sv32_mmu One Pager (v1.0).md
 create mode 100644 docs/ip-briefs/tlb One Pager (v1.0).md

diff --git a/docs/ip-briefs/ptw One Pager (v1.0).md b/docs/ip-briefs/ptw One Pager (v1.0).md
new file mode 100644
index 0000000..4aa4a83
--- /dev/null
+++ b/docs/ip-briefs/ptw One Pager (v1.0).md
@@ -0,0 +1,296 @@
+ptw — Module Brief (v1.0) RTL: rtl/cpu/mmu/ptw.sv
+
+#### **Purpose \& Role**
+
+The Page Table Walker (PTW) performs hardware page-table traversal for Sv32 virtual memory. When the TLB and sv32\_mmu detect a miss, the PTW fetches page-table entries (PTEs) from memory using AXI4-Lite read requests. It implements the two-level Sv32 walk (VPN\[1] -> VPN\[0]) and returns either a valid PTE or a page-fault condition.
+
+The PTW is a standalone external module; it does not manage the TLB directly.
+
+sv32\_mmu requests page walks -> ptw fetches PTEs -> sv32\_mmu interprets the results and completes permission checks -> sv32\_mmu handles TLB insertion after the walk completes.
+
+**Parameters**
+**Name Default Description**
+---
+
+  TIMEOUT\_CYCLES 256 Maximum cycles to wait for a memory response before timeout.
+
+  ADDR\_WIDTH 32 Width of the virtual and PTE addresses generated.
+
+  DATA\_WIDTH 64 Width of the fetched PTE data.
+
+  PPN\_WIDTH 32 Physical page number width used to form the next-level base address.
+
+#### **Interfaces (Ports)**
+
+##### **Signal Dir Width Description**
+
+  clk\_i In 1 Clock input.
+
+  rst\_ni In 1 Active-low reset.
+
+  walk\_req\_valid\_i In 1 Request to start a page walk (from sv32\_mmu).
+
+  walk\_req\_addr\_i In 32 Base physical address of the L1 page table.
+
+  walk\_req\_vpn\_i In 20 VPN\[19:0], split internally into VPN\[1]/VPN\[0].
+
+  walk\_rsp\_valid\_o Out 1 PTW response valid.
+
+  walk\_rsp\_pte\_o Out 64 Returned PTE data (valid or faulting).
+
+  walk\_rsp\_error\_o Out 1 Signals a page-fault or timeout error.
+
+  axi\_ar\_valid\_o Out 1 AXI4-Lite read address valid.
+
+  axi\_ar\_addr\_o Out 32 Address for the PTE fetch.
+
+  axi\_r\_valid\_i In 1 Memory read data valid.
+
+  axi\_r\_data\_i In 64 PTE data returned from memory.
+
+#### **Protocols**
+
+* Walk interface: single outstanding request; valid/ready managed by sv32\_mmu.
+* AXI4-Lite: ar\_valid/ar\_ready address handshake; r\_valid returns 64-bit PTE data.
+* MMU sequencing: the PTW begins a walk only when walk\_req\_valid\_i is asserted; interleaved walks are not supported (one at a time).
+
+#### **Behavior \& Timing**
+
+* Performs the two-level Sv32 lookup: fetch the L1 PTE using SATP.PPN + VPN\[1]; if it is a valid leaf, respond; if it is a pointer, fetch the L2 PTE using the next-level PPN + VPN\[0].
+* Detects and reports: invalid PTE, misaligned PTE, timeout.
+* Walk latency depends on the memory response: typically 8–20 cycles, at most TIMEOUT\_CYCLES before an error is raised.
+* One translation may be in flight at a time.
+* The PTW does not modify A/D bits (sv32\_mmu handles that if required).
+
+#### **Programming Model**
+
+The PTW is not software-visible. It is indirectly controlled through sv32\_mmu and CSRs:
+
+* SATP.PPN provides the root page-table pointer.
+* SFENCE.VMA flushes PTW state through the MMU.
+* No memory-mapped registers.
+* No CSR interface inside the PTW.
+
+#### **Errors \& IRQs**
+
+#####   **Condition Description Handling**
+
+  Timeout Memory read response exceeds TIMEOUT\_CYCLES. walk\_rsp\_error\_o asserted.
+
+  Invalid PTE PTE has illegal or reserved values. Error returned to sv32\_mmu.
+
+  Misaligned PTE Invalid alignment of the PTE address. Treated as page fault.
+
+  Access fault AXI bus returns an error. Error raised to sv32\_mmu.
+
+The PTW does not generate standalone interrupts; all exceptions are handled by sv32\_mmu and the core trap logic.
+
+#### **Performance Targets**
+
+#####   **Metric Target Notes**
+
+  Walk latency <= 40 cycles typical Two-level walk under average memory timing.
+
+  Throughput 1 walk at a time Backpressure via walk\_req\_valid\_i.
+
+  Frequency 500 MHz Same as the MMU domain.
+
+#### **Dependencies**
+
+* sv32\_mmu (requests + responses), AXI4-Lite interconnect (memory PTE fetches).
+* Clocks: clk\_i / rst\_ni shared with the MMU.
+* Inputs: SATP root pointer (via the MMU).
+* Must be coordinated with the TLB insertions performed by sv32\_mmu.
+
+#### **Verification Links**
+
+Unit tests: verification/mmu/test\_ptw.py
+
+Integration: verification/core/system\_paging.sv
+
+Coverage: cov/ptw\_cov.html
+
+Known limitations:
+
+* No superpage support (4 MiB pages).
+* No multi-walk concurrency.
+* Timeout behavior not cycle-accurate with all DRAM models.
+
+#### **Definitions \& Acronyms**
+
+AXI4-Lite:
+
+Advanced eXtensible Interface; lightweight subset of the ARM AXI4 protocol used for memory-mapped control and status register accesses.
+
+A/D bits:
+
+Accessed and Dirty bits within a page-table entry (PTE). The MMU sets these when a page is read or written for the first time.
+
+ASID:
+
+Address Space Identifier; field in the SATP register distinguishing virtual-memory contexts.
+
+CDC:
+
+Clock-Domain Crossing; logic used to safely transfer signals between different clock domains.
+
+CPU:
+
+Central Processing Unit.
+
+CSR:
+
+Control and Status Register; RISC-V architectural registers that configure privilege behavior, MMU mode, and interrupts.
+
+MMU:
+
+Memory Management Unit; hardware responsible for translating virtual addresses to physical addresses and enforcing protection.
+
+OS:
+
+Operating System.
+
+PA:
+
+Physical Address; the translated address output from the MMU (its width is set by PADDR\_WIDTH).
+
+PTE:
+
+Page Table Entry; 32-bit or 64-bit descriptor in memory describing one virtual-to-physical mapping and its permissions.
+
+PTW:
+
+Page Table Walker; module that fetches PTEs from memory on a TLB miss.
+
+RAM:
+
+Random-Access Memory; main system memory where program data and page tables reside.
+
+R/W/X:
+
+Read, Write, and Execute permission bits inside a PTE.
+
+RV32 / RV32I: 32-bit RISC-V base integer instruction set architecture.
+
+SATP:
+
+Supervisor Address Translation and Protection register; enables paging and provides the root page-table pointer and ASID.
+
+SFENCE.VMA:
+
+Supervisor Fence for Virtual-Memory Area; RISC-V instruction that invalidates TLB entries.
+
+S-mode / U-mode / M-mode: Supervisor, User, and Machine privilege levels defined by the RISC-V privilege specification.
+
+SoC:
+
+System-on-Chip; integrated design including CPU, MMU, caches, interconnect, and peripherals.
+
+Sv32:
+
+RISC-V 32-bit virtual-memory scheme using two-level page tables with 4 KB pages.
+
+TLB:
+
+Translation Lookaside Buffer; cache storing recently used PTEs to accelerate address translation.
+
+VPN:
+
+Virtual Page Number; upper bits of a virtual address that index the page table.
+
+CSR\_FILE:
+
+Hardware block managing the RISC-V control/status registers used by the CPU and MMU.
+
+AXI Crossbar:
+
+On-chip interconnect fabric (rtl/bus/axi/axi\_crossbar.sv) that routes AXI transactions between masters (CPU, PTW) and slaves (memory, peripherals).
+
+BootROM:
+
+Read-only memory code executed on reset to initialize hardware and enable the MMU/OS.
+
+IRQ:
+
+Interrupt Request; hardware signal used to notify the processor of asynchronous events.
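The two-level address arithmetic described under Behavior \& Timing can be captured in a small reference model. The Python below is illustrative only (not part of the RTL); the 8-byte PTE slot size follows this design's DATA_WIDTH = 64, while the standard Sv32 PTE is 4 bytes, so treat the slot size as a design assumption.

```python
# Hypothetical reference model of the two-level Sv32 walk addressing.
# PTE slot size mirrors DATA_WIDTH = 64 (8 bytes per entry).
PTE_BYTES = 8  # DATA_WIDTH / 8

def l1_pte_addr(l1_base: int, vpn: int) -> int:
    """Address of the level-1 PTE: table base + VPN[1] * PTE size."""
    vpn1 = (vpn >> 10) & 0x3FF
    return l1_base + vpn1 * PTE_BYTES

def l2_pte_addr(l1_pte: int, vpn: int) -> int:
    """Address of the level-2 PTE: next-level base (PPN field of the L1
    pointer PTE, shifted to a page boundary) + VPN[0] * PTE size."""
    ppn = (l1_pte >> 10) & 0x3FFFFF   # PPN field of a 32-bit Sv32 PTE
    vpn0 = vpn & 0x3FF
    return (ppn << 12) + vpn0 * PTE_BYTES
```

For example, with the root table at 0x8000_0000 and VPN = {VPN[1]=1, VPN[0]=2}, the L1 fetch lands at 0x8000_0008, and a pointer PTE with PPN 0x80001 sends the L2 fetch to 0x8000_1010.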

diff --git a/docs/ip-briefs/sv32_mmu One Pager (v1.0).md b/docs/ip-briefs/sv32_mmu One Pager (v1.0).md
new file mode 100644
index 0000000..02b0d05
--- /dev/null
+++ b/docs/ip-briefs/sv32_mmu One Pager (v1.0).md
@@ -0,0 +1,283 @@
+sv32\_mmu — Module Brief (v1.0) RTL: rtl/cpu/mmu/sv32\_mmu.sv
+
+#### **Purpose \& Role**
+
+Sv32 virtual-memory translation unit. It handles TLB management, page-table walks (delegated to an external Page Table Walker module), and access-permission enforcement for S-mode and U-mode memory accesses. It sits between the CPU memory stage and the memory subsystem, translating virtual addresses into physical addresses. It uses a Translation Lookaside Buffer (TLB) for cached translations and issues page-table walk requests to the external PTW module on a TLB miss. It enforces R/W/X and U/S permissions according to the Sv32 specification, propagating page faults to the core trap logic, and thereby ensures isolation between privilege levels and correct virtual-memory operation across the CPU pipeline.
+
+**Parameters**
+**Name Default Description**
+---
+
+  TLB\_ENTRIES 16 Number of cached page-table entries.
+
+  PAGE\_SIZE 4 KB Base page size per the Sv32 specification.
+
+  PTW\_TIMEOUT\_CYCLES 256 Maximum cycles to wait for the external PTW.
+
+  ADDR\_WIDTH 32 Virtual address width (fixed for RV32).
+
+  PADDR\_WIDTH 34 Physical address width to the memory subsystem.
+
+#### **Interfaces (Ports)**
+
+##### **Signal Dir Width Description**
+
+  clk\_i In 1 Clock input.
+
+  rst\_ni In 1 Active-low reset.
+
+  va\_i In 32 Virtual address input.
+
+  pa\_o Out 34 Physical address output.
+
+  valid\_i In 1 Request valid.
+
+  ready\_o Out 1 Ready for the next request.
+
+  ptw\_req\_valid\_o Out 1 Request external PTW initiation.
+
+  ptw\_req\_addr\_o Out 32 Base address of the L1 page table (root pointer derived from SATP).
+
+  ptw\_rsp\_valid\_i In 1 External PTW response valid.
+
+  ptw\_rsp\_data\_i In 64 Returned PTE data from the external PTW.
+
+  satp\_i In 32 SATP register value.
+
+  priv\_i In 2 Current privilege level (U/S/M).
+
+#### **Protocols**
+
+* CPU side: in-order valid/ready handshake.
+* PTW side: sv32\_mmu issues requests; the PTW handles the memory access and returns results via a valid/ready exchange.
+
+#### **Behavior \& Timing**
+
+* TLB hit -> 1-cycle translation.
+* TLB miss -> sv32\_mmu issues a PTW request via the ptw\_req\_\* signals to the external PTW.
+* The PTW performs the actual AXI/DRAM access and returns PTE data through the ptw\_rsp\_\* signals.
+* Permission check for R/W/X and U/S enforcement.
+* In-order request handling; one translation in flight.
+* Sv32 two-level walk: root PPN from SATP -> VPN\[1] -> VPN\[0].
+* Single clock domain (clk\_i). No CDC. One pipeline stage for TLB lookup. Stalls on misses.
+
+#### **Programming Model**
+
+Controlled by CSRs: SATP (enable/ASID/root PPN), SFENCE.VMA (invalidate), and the SUM/MXR bits from mstatus/sstatus. Refer to csr\_spec.yaml. No memory-mapped registers.
+
+#### **Errors \& IRQs**
+
+#####   **Condition Description Handling**
+
+  Page fault Invalid PTE or permission violation. CPU exception (load/store/instruction).
+
+  PTW timeout No response from the external PTW within the timeout window. Sets error flag; retry or exception.
+
+  Misaligned PTE Invalid alignment from the PTW. Treated as page fault.
+
+There are no standalone IRQ outputs. Exceptions propagate to the core trap logic.
+
+#### **Performance Targets**
+
+#####   **Metric Target Notes**
+
+  TLB hit latency 1 cycle No-stall translation.
+
+  TLB miss latency <= 40 cycles Two-level walk average.
+
+  Throughput 1 translation/cycle When not PTW-stalled.
+
+  Clock frequency 500 MHz CPU-domain default minimum.
+
+#### **Dependencies**
+
+* Modules: tlb (lookup + insertion), ptw (external page-table walker module).
+* Clocks/Resets: clk\_i, rst\_ni (shared with the CPU).
+* Software: SATP must be configured before enable; SFENCE.VMA after a context switch.
+* The PTW performs the AXI4-Lite/AXI memory reads; the MMU only provides the address and receives PTE data.
+* The MMU receives privilege and CSR configuration from csr\_file.
+
+#### **Verification Links**
+
+Unit tests: verification/mmu/test\_sv32\_mmu.py
+
+Integration: verification/core/system\_paging.sv
+
+Coverage: cov/mmu\_cov.html
+
+Known limitations: No superpage support (4 MiB pages). PTW timeout error path unverified.
+
+#### **Definitions \& Acronyms**
+
+AXI4-Lite:
+
+Advanced eXtensible Interface; lightweight subset of the ARM AXI4 protocol used for memory-mapped control and status register accesses.
+
+A/D bits:
+
+Accessed and Dirty bits within a page-table entry (PTE). The MMU sets these when a page is read or written for the first time.
+
+ASID:
+
+Address Space Identifier; field in the SATP register distinguishing virtual-memory contexts.
+
+CDC:
+
+Clock-Domain Crossing; logic used to safely transfer signals between different clock domains.
+
+CPU:
+
+Central Processing Unit.
+
+CSR:
+
+Control and Status Register; RISC-V architectural registers that configure privilege behavior, MMU mode, and interrupts.
+
+MMU:
+
+Memory Management Unit; hardware responsible for translating virtual addresses to physical addresses and enforcing protection.
+
+OS:
+
+Operating System.
+
+PA:
+
+Physical Address; the translated address output from the MMU (its width is set by PADDR\_WIDTH).
+
+PTE:
+
+Page Table Entry; 32-bit or 64-bit descriptor in memory describing one virtual-to-physical mapping and its permissions.
+
+PTW:
+
+Page Table Walker; external module that fetches PTEs from memory on a TLB miss.
+
+RAM:
+
+Random-Access Memory; main system memory where program data and page tables reside.
+
+R/W/X:
+
+Read, Write, and Execute permission bits inside a PTE.
+
+RV32 / RV32I: 32-bit RISC-V base integer instruction set architecture.
+
+SATP:
+
+Supervisor Address Translation and Protection register; enables paging and provides the root page-table pointer and ASID.
+
+SFENCE.VMA:
+
+Supervisor Fence for Virtual-Memory Area; RISC-V instruction that invalidates TLB entries.
+
+S-mode / U-mode / M-mode: Supervisor, User, and Machine privilege levels defined by the RISC-V privilege specification.
+
+SoC:
+
+System-on-Chip; integrated design including CPU, MMU, caches, interconnect, and peripherals.
+
+Sv32:
+
+RISC-V 32-bit virtual-memory scheme using two-level page tables with 4 KB pages.
+
+TLB:
+
+Translation Lookaside Buffer; cache storing recently used PTEs to accelerate address translation.
+
+VPN:
+
+Virtual Page Number; upper bits of a virtual address that index the page table.
+
+CSR\_FILE:
+
+Hardware block managing the RISC-V control/status registers used by the CPU and MMU.
+
+AXI Crossbar:
+
+On-chip interconnect fabric (rtl/bus/axi/axi\_crossbar.sv) that routes AXI transactions between masters (CPU, PTW) and slaves (memory, peripherals).
+
+BootROM:
+
+Read-only memory code executed on reset to initialize hardware and enable the MMU/OS.
+
+IRQ:
+
+Interrupt Request; hardware signal used to notify the processor of asynchronous events.

diff --git a/docs/ip-briefs/tlb One Pager (v1.0).md b/docs/ip-briefs/tlb One Pager (v1.0).md
new file mode 100644
index 0000000..7cc76c5
--- /dev/null
+++ b/docs/ip-briefs/tlb One Pager (v1.0).md
@@ -0,0 +1,172 @@
+tlb — Module Brief (v1.0) RTL: rtl/cpu/mmu/tlb.sv
+
+#### **Purpose \& Role**
+
+The TLB (Translation Lookaside Buffer) provides cached virtual-to-physical mappings for the Sv32 MMU. It stores recently used page-table entries to accelerate address translation and minimize page-table walks. It supports separate instruction (I) and data (D) lookup paths with a shared shootdown domain.
+It is implemented as a fully associative cache built on a content-addressable memory (CAM) structure, with a selectable least-recently-used (LRU) or pseudo-LRU replacement policy. On a miss, the request is reported to the MMU, which forwards it to the external Page Table Walker (PTW) for a page-table fetch. When address spaces change, global invalidations through SFENCE.VMA and SATP writes ensure outdated translations are removed.
+
+**Parameters**
+**Name Default Description**
+---
+
+  ENTRIES 16 Number of cached TLB entries per instance.
+
+  PAGE\_SIZE 4 KB Base page size per the Sv32 specification.
+
+  ASSOCIATIVE FULL Fully associative lookup organization.
+
+  REPL\_POLICY LRU Replacement policy: LRU or pseudo-LRU select.
+
+  ADDR\_WIDTH 32 Virtual address width for tag comparison.
+
+  PADDR\_WIDTH 34 Physical address width for stored entries.
+
+#### **Interfaces (Ports)**
+
+##### **Signal Dir Width Description**
+
+  clk\_i In 1 Clock input.
+
+  rst\_ni In 1 Active-low reset.
+
+  lookup\_va\_i In 32 Virtual address to translate.
+
+  lookup\_hit\_o Out 1 Indicates a translation hit.
+
+  lookup\_pa\_o Out 34 Physical address result on a hit.
+
+  lookup\_valid\_i In 1 Lookup request valid.
+
+  lookup\_ready\_o Out 1 Ready for the next lookup.
+
+  insert\_valid\_i In 1 Request to insert an entry (from sv32\_mmu).
+
+  insert\_vpn\_i In 20 Virtual page number to cache.
+
+  insert\_ppn\_i In 22 Physical page number to cache.
+
+  insert\_perm\_i In 8 Permission bits (R/W/X/U/S/A/D).
+
+  flush\_i In 1 Global flush signal (SFENCE.VMA/SATP).
+
+  miss\_o Out 1 Asserted when a lookup misses all existing entries.
+
+#### **Protocols**
+
+* The lookup interface uses a sequential valid/ready handshake with the MMU pipeline.
+* The insert interface is triggered by MMU/PTW completion and must not overlap with an active flush.
+* Flush is synchronous and clears all entries within one cycle after assertion.
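The lookup/insert/flush protocol above can be captured in a small behavioral model, useful as a scoreboard next to the cocotb unit tests. This Python sketch is hypothetical (the RTL is the reference): it models only the fully associative mapping and LRU order, not CAM timing or the valid/ready handshake.

```python
from collections import OrderedDict

class TlbModel:
    """Behavioral sketch of the fully associative, LRU-replaced TLB:
    lookup by VPN, insert on walk completion, global flush."""

    def __init__(self, entries: int = 16):
        self.entries = entries
        self.map = OrderedDict()          # vpn -> (ppn, perm), LRU-first order

    def lookup(self, vpn):
        if vpn in self.map:
            self.map.move_to_end(vpn)     # refresh LRU order on a hit
            ppn, perm = self.map[vpn]
            return True, ppn, perm
        return False, None, None          # miss -> MMU starts a PTW walk

    def insert(self, vpn, ppn, perm):
        if vpn not in self.map and len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the least recently used entry
        self.map[vpn] = (ppn, perm)
        self.map.move_to_end(vpn)

    def flush(self):
        self.map.clear()                  # SFENCE.VMA / SATP write
```

A hit refreshes the entry's LRU position, so a recently touched translation survives the next insertion while the coldest entry is evicted.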
+
+#### **Behavior \& Timing**
+
+* CAM-based associative lookup performs the tag comparison in one cycle.
+* On hit -> return the cached physical address; no pipeline stall.
+* On miss -> raise miss\_o, prompting the MMU to initiate a PTW fetch.
+* LRU or pseudo-LRU replacement selects the victim entry for new insertions.
+* Supports an optional shared shootdown across I/D TLBs when enabled.
+* Single clock domain (clk\_i). No clock-domain crossings.
+
+#### **Programming Model**
+
+Indirectly controlled through MMU CSRs and instructions:
+
+* SFENCE.VMA: flush all or ASID-specific entries.
+* SATP writes: trigger a global flush and context switch.
+* Privilege level (via CSRs) determines the permission bits cached with each entry.
+* Refer to csr\_spec.yaml for CSR definitions.
+
+#### **Errors \& IRQs**
+
+#####   **Condition Description Handling**
+
+  Parity error CAM parity or ECC error. Entry invalidated; reloaded on next access.
+
+  Invalid insert Insertion without a valid PTE. Ignored; triggers MMU retry.
+
+There are no standalone IRQ outputs. Exceptions propagate to the core trap logic.
+
+#### **Performance Targets**
+
+#####   **Metric Target Notes**
+
+  Lookup latency 1 cycle No stall on hit.
+
+  Insert latency 1 cycle Tag and data write.
+
+  Flush latency 1-2 cycles Depends on entry count.
+
+  Throughput 1 lookup/cycle Continuous pipeline operation.
+
+#### **Dependencies**
+
+* Connected to: sv32\_mmu for lookups, misses, and entry insertions.
+* Page-table misses resolved indirectly through the external PTW (handled by sv32\_mmu).
+* Receives: SFENCE.VMA and SATP-write flush controls through the CSR subsystem.
+* Clocks/Resets: clk\_i, rst\_ni (shared with the MMU).
+* Integration: shares the shootdown domain across instruction/data TLB instances.
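Since the permission bits are cached with each entry but enforced by the MMU, a sketch of that check may help. The bit layout of insert_perm_i below is an assumption: the brief names R/W/X/U/S/A/D without pinning positions, so the Sv32 PTE ordering (R, W, X, U in the low bits) is assumed, and mstatus.SUM/MXR refinements are omitted.

```python
# Hypothetical bit positions within the 8-bit insert_perm_i field
# (assumed to follow the Sv32 PTE bit order; not confirmed by the brief).
PERM_R, PERM_W, PERM_X, PERM_U = 0x01, 0x02, 0x04, 0x08

def access_allowed(perm: int, is_write: bool, is_exec: bool,
                   user_mode: bool) -> bool:
    """Sketch of the R/W/X and U/S check applied against cached permissions."""
    if user_mode and not (perm & PERM_U):
        return False                      # U-mode cannot touch S-only pages
    if is_exec:
        return bool(perm & PERM_X)
    if is_write:
        return bool(perm & PERM_W)
    return bool(perm & PERM_R)
```

For instance, a user-mode store to a page whose cached entry lacks the W bit fails the check and surfaces as a store page fault in the core trap logic.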
+
+#### **Verification Links**
+
+Unit tests: verification/mmu/test\_tlb.py
+
+Integration: verification/mmu/test\_sv32\_mmu.sv
+
+Coverage: cov/tlb\_cov.html
+
+Known limitations: No superpage (4 MiB) entries. Pseudo-LRU accuracy is not verified under concurrent insertions.
+
+#### **Definitions \& Acronyms**
+
+TLB: Translation Lookaside Buffer. Cache storing recently used page-table entries.
+
+CAM: Content-Addressable Memory. Memory allowing associative lookup based on tag comparison.
+
+LRU: Least Recently Used. Replacement policy that evicts the entry unused for the longest time.
+
+PTW: Page Table Walker. External module that fetches page-table entries on TLB misses.
+
+MMU: Memory Management Unit. Performs address translation and permission checks.
+
+PTE: Page Table Entry. Descriptor defining a mapping between virtual and physical pages.
+
+SFENCE.VMA: Supervisor Fence Virtual-Memory Area. Instruction used to flush TLB entries.
+
+SATP: Supervisor Address Translation and Protection register. Defines the root page table and ASID.
+
+ASID: Address Space Identifier. Distinguishes virtual-memory address spaces.
+
+CSR: Control and Status Register. Holds configuration and privilege-control data.
+
+AXI4-Lite: Simplified version of the ARM AXI4 bus protocol used for memory access.
+
+SoC: System-on-Chip. Integrated CPU, MMU, cache, and peripheral components.
+ From 9f6e22809f0e9ca51ee4adfc5b34a3d731d9392c Mon Sep 17 00:00:00 2001 From: sebasthechill Date: Wed, 26 Nov 2025 23:18:45 -0500 Subject: [PATCH 2/2] MMU - ptw.sv --- rtl/cpu/mmu/ptw.sv | 318 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 318 insertions(+) diff --git a/rtl/cpu/mmu/ptw.sv b/rtl/cpu/mmu/ptw.sv index e69de29..c7fc274 100644 --- a/rtl/cpu/mmu/ptw.sv +++ b/rtl/cpu/mmu/ptw.sv @@ -0,0 +1,318 @@ +module ptw #( + // main parameters for widths and timeouts + parameter int TIMEOUT_CYCLES = 256, + parameter int ADDR_WIDTH = 32, + parameter int DATA_WIDTH = 64, + parameter int PPN_WIDTH = 32 +)( + input logic clk_i, + input logic rst_ni, + + // flush from sfence.vma to clear any in-flight walk + input logic flush_i, + + // walk request from mmu + input logic walk_req_valid_i, + output logic walk_req_ready_o, + input logic [ADDR_WIDTH-1:0] walk_req_addr_i, // l1 table base address + input logic [19:0] walk_req_vpn_i, // full vpn {vpn1, vpn0} + + // walk response back to mmu + output logic walk_rsp_valid_o, + output logic [DATA_WIDTH-1:0] walk_rsp_pte_o, + output logic walk_rsp_error_o, + + // axi-lite read interface for pte fetches + output logic axi_ar_valid_o, + output logic [ADDR_WIDTH-1:0] axi_ar_addr_o, + input logic axi_ar_ready_i, + input logic axi_r_valid_i, + input logic [DATA_WIDTH-1:0] axi_r_data_i, + input logic [1:0] axi_r_resp_i +); + + // two-level walk fsm + typedef enum logic [2:0] { + IDLE, + SEND_L1, + WAIT_L1, + SEND_L2, + WAIT_L2, + DONE, + ERROR + } ptw_state_e; + + ptw_state_e state_q, state_d; + + // latched request info + logic [ADDR_WIDTH-1:0] base_addr_q; + logic [19:0] vpn_q; + + // split vpn for l1 / l2 table indexing + logic [9:0] vpn_l1; + logic [9:0] vpn_l2; + + // pte registers + logic [DATA_WIDTH-1:0] pte_l1_q; + logic [DATA_WIDTH-1:0] pte_l2_q; + + // timeout handling + logic [31:0] timeout_cnt_q, timeout_cnt_d; + logic timeout_expired; + + // computed pte addresses + logic [ADDR_WIDTH-1:0] l1_addr; + logic 
[ADDR_WIDTH-1:0] l2_base_addr; + logic [ADDR_WIDTH-1:0] l2_addr; + + // alignment helpers based on pte size + localparam int PTE_SIZE_BYTES = DATA_WIDTH / 8; + localparam int PTE_ALIGN_BITS = (PTE_SIZE_BYTES > 1) ? $clog2(PTE_SIZE_BYTES) : 1; + + logic l1_addr_misaligned; + logic l2_addr_misaligned; + + // pte helper functions + // basic sv32 legality checks + function automatic logic pte_invalid(input logic [DATA_WIDTH-1:0] pte); + logic v, r, w; + begin + v = pte[0]; + r = pte[1]; + w = pte[2]; + + // invalid if v = 0, w = 1 while r = 0, or upper bits are set + pte_invalid = (!v) || + (!r && w) || + ((DATA_WIDTH > 32) && (|pte[DATA_WIDTH-1:32])); + end + endfunction + + // pte points to next level + function automatic logic pte_is_pointer(input logic [DATA_WIDTH-1:0] pte); + logic v, r, x; + begin + v = pte[0]; + r = pte[1]; + x = pte[3]; + pte_is_pointer = v && !r && !x; + end + endfunction + + // pte is a valid leaf mapping + function automatic logic pte_is_leaf(input logic [DATA_WIDTH-1:0] pte); + logic v, r, x; + begin + v = pte[0]; + r = pte[1]; + x = pte[3]; + pte_is_leaf = v && (r || x); + end + endfunction + + // check the A bit (ptw doesn't set A/D, so treat A=0 as fault) + function automatic logic pte_has_ad_fault(input logic [DATA_WIDTH-1:0] pte); + begin + pte_has_ad_fault = !pte[6]; + end + endfunction + + // superpage alignment check for l1 leaf + function automatic logic pte_superpage_misaligned(input logic [DATA_WIDTH-1:0] pte); + logic [PPN_WIDTH-1:0] ppn; + begin + ppn = pte[PPN_WIDTH-1:10]; + pte_superpage_misaligned = |ppn[9:0]; + end + endfunction + + // detect axi read access faults + function automatic logic axi_access_fault(input logic [1:0] resp); + begin + axi_access_fault = (resp != 2'b00); + end + endfunction + + // vpn splits + address generation + assign vpn_l1 = vpn_q[19:10]; + assign vpn_l2 = vpn_q[9:0]; + + assign timeout_expired = (timeout_cnt_q >= TIMEOUT_CYCLES); + + // l1 pte address + assign l1_addr = + base_addr_q + 
{{(ADDR_WIDTH-($bits(vpn_l1)+PTE_ALIGN_BITS)){1'b0}}, + vpn_l1, {PTE_ALIGN_BITS{1'b0}}}; + + // l2 table base address (from l1 ppn) + assign l2_base_addr = {pte_l1_q[PPN_WIDTH-1:10], 12'b0}; + + // l2 pte address + assign l2_addr = + l2_base_addr + {{(ADDR_WIDTH-($bits(vpn_l2)+PTE_ALIGN_BITS)){1'b0}}, + vpn_l2, {PTE_ALIGN_BITS{1'b0}}}; + + assign l1_addr_misaligned = |l1_addr[PTE_ALIGN_BITS-1:0]; + assign l2_addr_misaligned = |l2_addr[PTE_ALIGN_BITS-1:0]; + + // fsm + outputs + always_comb begin + state_d = state_q; + + walk_req_ready_o = (state_q == IDLE); + + walk_rsp_valid_o = 1'b0; + walk_rsp_pte_o = '0; + walk_rsp_error_o = 1'b0; + + axi_ar_valid_o = 1'b0; + axi_ar_addr_o = '0; + + timeout_cnt_d = timeout_cnt_q; + + // timeout increases only while waiting for mem + if (state_q == WAIT_L1 || state_q == WAIT_L2) begin + if (!timeout_expired) + timeout_cnt_d = timeout_cnt_q + 1; + end else begin + timeout_cnt_d = '0; + end + + unique case (state_q) + + IDLE: begin + // accept request only when ready + if (walk_req_valid_i && walk_req_ready_o) + state_d = SEND_L1; + end + + SEND_L1: begin + // l1 address must be aligned + if (l1_addr_misaligned) begin + state_d = ERROR; + end else begin + axi_ar_valid_o = 1'b1; + axi_ar_addr_o = l1_addr; + if (axi_ar_ready_i) + state_d = WAIT_L1; + end + end + + WAIT_L1: begin + if (axi_r_valid_i) begin + if (axi_access_fault(axi_r_resp_i)) + state_d = ERROR; + else if (pte_invalid(axi_r_data_i)) + state_d = ERROR; + else if (pte_is_leaf(axi_r_data_i)) begin + // superpage case + if (pte_superpage_misaligned(axi_r_data_i) || + pte_has_ad_fault(axi_r_data_i)) + state_d = ERROR; + else + state_d = DONE; + end else if (pte_is_pointer(axi_r_data_i)) begin + // go to l2 + state_d = SEND_L2; + end else begin + state_d = ERROR; + end + end else if (timeout_expired) begin + state_d = ERROR; + end + end + + SEND_L2: begin + // l2 address must be aligned + if (l2_addr_misaligned) begin + state_d = ERROR; + end else begin + axi_ar_valid_o = 
1'b1; + axi_ar_addr_o = l2_addr; + if (axi_ar_ready_i) + state_d = WAIT_L2; + end + end + + WAIT_L2: begin + if (axi_r_valid_i) begin + if (axi_access_fault(axi_r_resp_i)) + state_d = ERROR; + else if (pte_invalid(axi_r_data_i)) + state_d = ERROR; + else if (pte_is_leaf(axi_r_data_i)) begin + if (pte_has_ad_fault(axi_r_data_i)) + state_d = ERROR; + else + state_d = DONE; + end else begin + state_d = ERROR; + end + end else if (timeout_expired) begin + state_d = ERROR; + end + end + + DONE: begin + walk_rsp_valid_o = 1'b1; + walk_rsp_pte_o = (pte_l2_q != '0) ? pte_l2_q : pte_l1_q; + walk_rsp_error_o = 1'b0; + state_d = IDLE; + end + + ERROR: begin + walk_rsp_valid_o = 1'b1; + walk_rsp_pte_o = '0; + walk_rsp_error_o = 1'b1; + state_d = IDLE; + end + endcase + + // flush overrides the walk immediately + if (flush_i) begin + state_d = IDLE; + walk_rsp_valid_o = 1'b0; + walk_rsp_error_o = 1'b0; + axi_ar_valid_o = 1'b0; + end + end + + // sequential pipeline regs + always_ff @(posedge clk_i or negedge rst_ni) begin + if (!rst_ni) begin + state_q <= IDLE; + base_addr_q <= '0; + vpn_q <= '0; + pte_l1_q <= '0; + pte_l2_q <= '0; + timeout_cnt_q <= '0; + end else if (flush_i) begin + // reset everything on flush + state_q <= IDLE; + base_addr_q <= '0; + vpn_q <= '0; + pte_l1_q <= '0; + pte_l2_q <= '0; + timeout_cnt_q <= '0; + end else begin + state_q <= state_d; + timeout_cnt_q <= timeout_cnt_d; + + // latch new walk request + if (state_q == IDLE && walk_req_valid_i && walk_req_ready_o) begin + base_addr_q <= walk_req_addr_i; + vpn_q <= walk_req_vpn_i; + pte_l1_q <= '0; + pte_l2_q <= '0; + end + + // latch ptes on valid read + if (state_q == WAIT_L1 && axi_r_valid_i && !axi_access_fault(axi_r_resp_i)) + pte_l1_q <= axi_r_data_i; + + if (state_q == WAIT_L2 && axi_r_valid_i && !axi_access_fault(axi_r_resp_i)) + pte_l2_q <= axi_r_data_i; + end + end + +endmodule \ No newline at end of file
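For scoreboard use alongside verification/mmu/test_ptw.py, the PTE helper functions in ptw.sv above translate naturally to Python. This mirror is a sketch under the same assumptions as the RTL (DATA_WIDTH = 64, upper 32 bits of a legal PTE must be zero); it is not itself part of the patch.

```python
# Python mirror of the PTE classification helpers in ptw.sv,
# assuming DATA_WIDTH = 64 as parameterized above.
DATA_WIDTH = 64

def pte_invalid(pte: int) -> bool:
    """Invalid if V=0, W set without R, or any bit above 31 is set."""
    v, r, w = pte & 1, (pte >> 1) & 1, (pte >> 2) & 1
    upper_set = (DATA_WIDTH > 32) and ((pte >> 32) != 0)
    return bool((not v) or (w and not r) or upper_set)

def pte_is_pointer(pte: int) -> bool:
    """Points to the next level when V=1 and both R and X are clear."""
    v, r, x = pte & 1, (pte >> 1) & 1, (pte >> 3) & 1
    return bool(v and not r and not x)

def pte_is_leaf(pte: int) -> bool:
    """Leaf mapping when V=1 and at least one of R/X is set."""
    v, r, x = pte & 1, (pte >> 1) & 1, (pte >> 3) & 1
    return bool(v and (r or x))
```

Driving randomized PTE values through both this model and the RTL functions (e.g. via cocotb) checks that the walker's ERROR/DONE/SEND_L2 branching matches the intended Sv32 legality rules.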