@HeatCrab
Collaborator

@HeatCrab HeatCrab commented Dec 13, 2025

User mode tasks require kernel stack isolation to prevent malicious or corrupted user stack pointers from compromising kernel memory during interrupt handling. Without this protection, a user task could set its stack pointer to an invalid or controlled address, causing the ISR to write trap frames to arbitrary memory locations.

Stack isolation is implemented by using the mscratch register as a discriminator between machine mode and user mode execution contexts. The ISR entry performs a blind swap with mscratch: for machine mode tasks (mscratch=0), the swap is immediately undone to restore the kernel stack pointer. For user mode tasks, the swap provides the kernel stack while preserving the user stack pointer in mscratch. The interrupt frame structure is extended to 36 words with frame[33] dedicated to stack pointer storage. Task initialization configures mscratch appropriately during the first dispatch by checking the MPP field in mstatus.
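
The entry logic described above can be modelled in plain C. This is only a sketch of the decision the ISR makes, not the actual assembly in boot.c: `struct cpu_model` and `isr_entry_swap` are hypothetical names, and the swap stands in for a single `csrrw sp, mscratch, sp` instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two registers involved in the swap. */
struct cpu_model {
    uintptr_t sp;       /* stack pointer at trap time */
    uintptr_t mscratch; /* 0 while in M-mode, kernel stack top while in U-mode */
};

/* Models the ISR entry. Returns true if the trap came from U-mode. */
bool isr_entry_swap(struct cpu_model *cpu)
{
    /* Blind swap, the equivalent of "csrrw sp, mscratch, sp". */
    uintptr_t tmp = cpu->sp;
    cpu->sp = cpu->mscratch;
    cpu->mscratch = tmp;

    if (cpu->sp == 0) {
        /* M-mode: undo the swap so sp is the kernel stack again
         * and mscratch goes back to 0. */
        cpu->sp = cpu->mscratch;
        cpu->mscratch = 0;
        return false;
    }

    /* U-mode: sp now holds the kernel stack, mscratch holds the user sp. */
    return true;
}
```

In this model a U-mode trap with `sp = 0xDEADBEEF` leaves the bogus value parked in `mscratch` and never dereferences it, which is the property the test output below exercises.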

[umode] Phase 1: Testing Kernel Stack Isolation

[umode] Test 1a: sys_tid() with normal SP
[umode] PASS: sys_tid() returned 2

[umode] Test 1b: sys_tid() with malicious SP
[umode] PASS: sys_tid() succeeded, ISR correctly used kernel stack

[umode] Test 1c: sys_uptime() with normal SP
[umode] PASS: sys_uptime() returned 4

[umode] Phase 1 Complete: Kernel stack isolation validated

[umode] ========================================

[umode] Phase 2: Testing Security Isolation

[umode] Action: Attempting to read 'mstatus' CSR from U-mode.
[umode] Expect: Kernel Panic with 'Illegal instruction'.

[EXCEPTION] Illegal instruction epc=0x800002F0

The test code and its output validate that system calls succeed even when invoked with a malicious stack pointer (0xDEADBEEF), confirming that the ISR uses the kernel stack from mscratch rather than the user-controlled stack pointer. All existing tests continue to pass, demonstrating that the isolation mechanism does not affect machine mode task execution.

Related to #53

@cubic-dev-ai cubic-dev-ai bot left a comment

No issues found across 5 files

@HeatCrab HeatCrab marked this pull request as draft December 13, 2025 13:28
@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch 2 times, most recently from 9801050 to afcfd14 Compare December 13, 2025 14:15
@HeatCrab HeatCrab marked this pull request as ready for review December 13, 2025 14:21

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 7 files

Prompt for AI agents (all 1 issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="arch/riscv/boot.c">

<violation number="1" location="arch/riscv/boot.c:333">
P3: Comment claims `ISR_CONTEXT_SIZE` is "used inline" but the macro is not actually used - the value `144` is hardcoded. Consider either using the macro via extended asm operands (as the original code did) or updating the comment to reflect the actual implementation.</violation>
</file>

Reply to cubic to teach it or ask questions. Re-run a review with @cubic-dev-ai review this PR

@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from afcfd14 to fe3574c Compare December 13, 2025 14:42
@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from fe3574c to e31f722 Compare December 18, 2025 09:11
@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch 2 times, most recently from 1930417 to a903be2 Compare December 27, 2025 09:08
@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch 2 times, most recently from 98ec3ef to 6411a74 Compare January 4, 2026 09:44
@HeatCrab
Collaborator Author

HeatCrab commented Jan 4, 2026

This PR now implements per-task kernel stack allocation for U-mode tasks. Previously, all U-mode tasks shared a single global kernel stack (_stack), which caused trap frame corruption when multiple U-mode tasks were scheduled concurrently.

Regarding validation, umode.c has already verified that syscalls work correctly with a corrupted user SP, confirming that the ISR uses the kernel stack from mscratch. The multi-task isolation scenario will be validated by the PMP test suite in PR #32, which depends on this infrastructure.
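
A minimal sketch of the per-task bookkeeping, assuming a TCB that records the kernel stack base and size. `struct tcb_model`, `kernel_stack_top`, and `switch_kernel_stack` are hypothetical names for illustration; `current_kernel_stack_top` is the global referred to in the review comments, and its exact type in the kernel is an assumption here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of the task control block. */
struct tcb_model {
    uint8_t *kernel_stack;    /* NULL for M-mode tasks */
    size_t kernel_stack_size; /* size of the allocation in bytes */
};

/* Global consulted by the ISR exit path (type assumed). */
void *current_kernel_stack_top;

/* Stacks grow downward, so the usable top is base + size. */
void *kernel_stack_top(const struct tcb_model *t)
{
    return t->kernel_stack ? t->kernel_stack + t->kernel_stack_size : NULL;
}

/* Called on every context switch, before returning to the next task. */
void switch_kernel_stack(const struct tcb_model *next)
{
    current_kernel_stack_top = kernel_stack_top(next);
}
```

Because each U-mode task gets its own top-of-stack value, a trap frame written for one task can no longer overlap the frame of another, which is the corruption the shared `_stack` allowed.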

@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from 6411a74 to 28dcddd Compare January 4, 2026 09:58
@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Jan 14, 2026

@cubic-dev-ai cubic-dev-ai bot left a comment

2 issues found across 11 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="arch/riscv/hal.c">

<violation number="1" location="arch/riscv/hal.c:84">
P3: The ISR frame size comment is now outdated. With `FRAME_SP = 33` added, the frame contains 34 words (indices 0-33), not 33 as documented.</violation>

<violation number="2" location="arch/riscv/hal.c:866">
P2: U-mode mscratch initialization uses global `_stack` instead of `current_kernel_stack_top`. This means the first trap after initial dispatch will use the global stack rather than the task's per-task kernel stack, which is inconsistent with boot.c's ISR exit path design. Should load from `current_kernel_stack_top` with fallback to `_stack` if NULL.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 11 files

Prompt for AI agents (all issues)

Check if these issues are valid — if so, understand the root cause of each and fix them.


<file name="Documentation/hal-calling-convention.md">

<violation number="1" location="Documentation/hal-calling-convention.md:234">
P2: Trap entry description incorrectly claims the hardware swaps SP with `mscratch`; the swap is performed manually in `_isr` via `csrrw`, so the documentation should describe it as an ISR software operation, not a hardware feature.</violation>
</file>

<file name="kernel/syscall.c">

<violation number="1" location="kernel/syscall.c:400">
P1: `sys_tputs` can block the scheduler indefinitely because it keeps the timer interrupt disabled while emitting an unbounded user-controlled string, allowing a U-mode caller to freeze all other tasks.</violation>
</file>

<file name="arch/riscv/hal.c">

<violation number="1" location="arch/riscv/hal.c:866">
P1: Kernel stack isolation bypassed on initial dispatch. For U-mode tasks, `mscratch` is set to `_stack` (global stack) instead of `current_kernel_stack_top` (per-task kernel stack). This causes the first trap after initial dispatch to use the wrong kernel stack, inconsistent with boot.c's ISR restore path which correctly loads `current_kernel_stack_top`.</violation>
</file>
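
For the hal.c issue, the selection the reviewer asks for can be sketched as a pure function. This is an illustration under assumptions, not the code in hal.c: `initial_mscratch` and its parameters are hypothetical, S-mode is not modelled, and only the position of the MPP field (bits 12:11 of mstatus) comes from the RISC-V privileged specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MSTATUS_MPP_MASK (3u << 11) /* mstatus.MPP occupies bits 12:11 */
#define MSTATUS_MPP_U    (0u << 11) /* previous privilege: user */
#define MSTATUS_MPP_M    (3u << 11) /* previous privilege: machine */

/* Returns the value to write to mscratch before the first mret. */
uintptr_t initial_mscratch(uint32_t mstatus,
                           void *task_kstack_top,
                           void *global_stack_top)
{
    if ((mstatus & MSTATUS_MPP_MASK) != MSTATUS_MPP_U)
        return 0; /* M-mode task: 0 marks "already on the kernel stack" */

    /* U-mode task: prefer the per-task kernel stack and fall back to
     * the global stack only when no per-task stack is set yet. */
    return (uintptr_t)(task_kstack_top ? task_kstack_top : global_stack_top);
}
```

With this ordering the first trap taken by a U-mode task lands on the same stack that the ISR restore path would load on every later return to user mode.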


@jserv
Contributor

jserv commented Jan 27, 2026

A 512-byte kernel stack with a 144-byte ISR frame leaves only 368 bytes for trap handler execution. If do_trap() calls syscall_handler(), which then calls any function with local variables, a stack overflow is possible.

Consider: #define KERNEL_STACK_SIZE 1024 for safety margin.
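
The arithmetic behind this suggestion, as a small sketch. It assumes RV32 (4-byte words) and the 36-word frame from the PR description; `handler_headroom` is a hypothetical helper, not a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define ISR_FRAME_WORDS  36u                    /* frame size from the PR description */
#define ISR_CONTEXT_SIZE (ISR_FRAME_WORDS * 4u) /* 144 bytes with 4-byte words */

/* Bytes left for the trap handler once the ISR frame has been pushed. */
unsigned handler_headroom(unsigned kernel_stack_size)
{
    assert(kernel_stack_size >= ISR_CONTEXT_SIZE);
    return kernel_stack_size - ISR_CONTEXT_SIZE;
}
```

`handler_headroom(512)` gives 368 and `handler_headroom(1024)` gives 880, so doubling the stack more than doubles the space available to the call chain below the frame.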

@jserv
Contributor

jserv commented Jan 27, 2026

The initial dispatch path restores from frame but doesn't handle frame[33] (FRAME_SP) for the first task. Looking at the code:

  "addi   sp, sp, %0\n"  /* Deallocate frame - wrong for U-mode first dispatch */

For U-mode, this sets SP to kernel stack top (wrong).
Should restore SP from frame[FRAME_SP] before mret when MPP=U.

This contradicts _isr's U-mode restore path which correctly loads SP from frame[33].

Impact: First U-mode task starts executing with SP pointing to kernel stack instead of user stack. Security violation and likely crash.
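
The difference between the two restore strategies can be shown with a small model. This is a sketch assuming RV32 addresses and a frame built at the top of the kernel stack; `sp_after_frame_pop` and `sp_from_frame` are hypothetical helpers, while `FRAME_SP` and `ISR_CONTEXT_SIZE` follow the names used in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define ISR_FRAME_WORDS  36u
#define ISR_CONTEXT_SIZE (ISR_FRAME_WORDS * 4u) /* 144 bytes */
#define FRAME_SP         33                     /* slot holding the saved sp */

/* Flagged behaviour: "addi sp, sp, ISR_CONTEXT_SIZE" only pops the frame,
 * so sp ends up wherever the frame was built. */
uint32_t sp_after_frame_pop(uint32_t frame_addr)
{
    return frame_addr + ISR_CONTEXT_SIZE;
}

/* Suggested behaviour: load the saved stack pointer from the frame itself. */
uint32_t sp_from_frame(const uint32_t frame[ISR_FRAME_WORDS])
{
    return frame[FRAME_SP];
}
```

When the frame sits on the kernel stack, the first function returns the kernel stack top, while the second returns whatever user stack pointer was stored in the frame during task initialization.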

@jserv
Contributor

jserv commented Jan 28, 2026

Location: arch/riscv/boot.c, _entry function

Problem: The blind-swap relies on mscratch == 0 to detect M-mode. At reset, mscratch contains undefined garbage. If non-zero, the first M-mode interrupt misidentifies as U-mode entry, causing:

  • SP becomes garbage value from mscratch
  • Frame allocation corrupts arbitrary memory
  • System crash

Fix: Add to _entry before enabling traps:

  "csrw   mideleg, zero\n"
  "csrw   medeleg, zero\n"
  "csrw   mscratch, zero\n"   /* ADD THIS */

@jserv
Contributor

jserv commented Jan 28, 2026

Location: kernel/task.c, task_spawn_impl

Problem: Task is added to kcb->tasks list, but kernel_stack is allocated later. If another context (ISR, concurrent task) calls mo_task_cancel() on the new task between these lines:

  1. mo_task_cancel frees the TCB
  2. task_spawn_impl resumes and writes to freed TCB → Use-After-Free

Current flow:

  CRITICAL_LEAVE();
  // <-- Window: task visible but kernel_stack not allocated
  tcb->kernel_stack = malloc();  // UAF if TCB freed

Fix: Move list_pushback after all allocations complete:

  /* Allocate kernel stack BEFORE adding to task list */
  if (user_mode) {
      tcb->kernel_stack = malloc(KERNEL_STACK_SIZE);
      if (!tcb->kernel_stack) {
          free(tcb->stack);
          free(tcb);
          panic(ERR_STACK_ALLOC);
      }
      tcb->kernel_stack_size = KERNEL_STACK_SIZE;
  }

  CRITICAL_ENTER();
  node = list_pushback(kcb->tasks, tcb);  // Now safe
  // ...
  CRITICAL_LEAVE();

Contributor

@jserv jserv left a comment

Rebase latest main branch.

@HeatCrab
Collaborator Author

HeatCrab commented Jan 29, 2026

The initial dispatch path restores from frame but doesn't handle frame[33] (FRAME_SP) for the first task. Looking at the code:

  "addi   sp, sp, %0\n"  /* Deallocate frame - wrong for U-mode first dispatch */

For U-mode, this sets SP to kernel stack top (wrong). Should restore SP from frame[FRAME_SP] before mret when MPP=U.

This contradicts _isr's U-mode restore path which correctly loads SP from frame[33].

Impact: First U-mode task starts executing with SP pointing to kernel stack instead of user stack. Security violation and likely crash.

Fixed. The initial dispatch now restores SP from frame[33], which contains the correct SP value for both M-mode and U-mode.

@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from 185b4a0 to 0a6cd74 Compare January 29, 2026 07:47
@HeatCrab
Collaborator Author

Location: kernel/task.c, task_spawn_impl

Problem: Task is added to kcb->tasks list, but kernel_stack is allocated later. If another context (ISR, concurrent task) calls mo_task_cancel() on the new task between these lines:

  1. mo_task_cancel frees the TCB
  2. task_spawn_impl resumes and writes to freed TCB → Use-After-Free

Current flow:

  CRITICAL_LEAVE();
  // <-- Window: task visible but kernel_stack not allocated
  tcb->kernel_stack = malloc();  // UAF if TCB freed

Fix: Move list_pushback after all allocations complete:

  /* Allocate kernel stack BEFORE adding to task list */
  if (user_mode) {
      tcb->kernel_stack = malloc(KERNEL_STACK_SIZE);
      if (!tcb->kernel_stack) {
          free(tcb->stack);
          free(tcb);
          panic(ERR_STACK_ALLOC);
      }
      tcb->kernel_stack_size = KERNEL_STACK_SIZE;
  }

  CRITICAL_ENTER();
  node = list_pushback(kcb->tasks, tcb);  // Now safe
  // ...
  CRITICAL_LEAVE();

Fixed. Moved kernel_stack allocation before list_pushback to eliminate the UAF race window.

@HeatCrab
Collaborator Author

Location: arch/riscv/boot.c, _entry function

Problem: The blind-swap relies on mscratch == 0 to detect M-mode. At reset, mscratch contains undefined garbage. If non-zero, the first M-mode interrupt misidentifies as U-mode entry, causing:

  • SP becomes garbage value from mscratch
  • Frame allocation corrupts arbitrary memory
  • System crash

Fix: Add to _entry before enabling traps:

  "csrw   mideleg, zero\n"
  "csrw   medeleg, zero\n"
  "csrw   mscratch, zero\n"   /* ADD THIS */

Fixed. Added csrw mscratch, zero in _entry before enabling traps.

@HeatCrab HeatCrab requested a review from jserv January 29, 2026 07:52
@HeatCrab
Collaborator Author

HeatCrab commented Jan 29, 2026

Rebase latest main branch.

Finished.

@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from 0a6cd74 to 32dbe79 Compare January 29, 2026 08:06
app/umode.c Outdated
umode_printf("\n");

/* Test 1: sys_tid() - Simplest read-only syscall. */
/* Test 1a: Baseline - Syscall with normal SP */
Contributor

Instead of 1a and 1b, use numbers.

Collaborator Author

@HeatCrab HeatCrab Jan 29, 2026

Now test 1 is renumbered to test 1-1, 1-2, 1-3.

Contributor

Now test 1 is renumbered to test 1-1, 1-2, 1-3.

No, simply count Test 1, Test 2, and so on.

@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from 32dbe79 to fac7e93 Compare January 29, 2026 12:57
User mode tasks require kernel stack isolation to prevent malicious or
corrupted user stack pointers from compromising kernel memory during
interrupt handling. Without this protection, a user task could set its
stack pointer to an invalid or controlled address, causing the ISR to
write trap frames to arbitrary memory locations.

This commit implements stack isolation using the mscratch register as a
discriminator between machine mode and user mode execution contexts. The
ISR entry performs a blind swap with mscratch: for machine mode tasks
(mscratch=0), the swap is immediately undone to restore the kernel stack
pointer. For user mode tasks (mscratch=kernel_stack), the swap provides
the kernel stack while preserving the user stack pointer in mscratch.

Each user mode task is allocated a dedicated 1024-byte kernel stack to
ensure complete isolation between tasks and prevent stack overflow
attacks. The task control block is extended to track per-task kernel
stack allocations. A global pointer references the current task's kernel
stack and is updated during each context switch. The ISR loads this
pointer to access the appropriate per-task kernel stack through
mscratch, replacing the previous approach of using a single global
kernel stack shared by all user mode tasks.

The interrupt frame structure is extended to include dedicated storage
for the stack pointer. Task initialization zeroes the entire frame and
correctly sets the initial stack pointer to support the new restoration
path. For user mode tasks, the initial ISR frame is constructed on the
kernel stack rather than the user stack, ensuring the frame is protected
from user manipulation. Enumeration constants replace magic number usage
for improved code clarity and consistency.

The ISR implementation now includes separate entry and restoration paths
for each privilege mode. The M-mode path maintains mscratch=0 throughout
execution. The U-mode path saves the user stack pointer from mscratch
immediately after frame allocation and restores mscratch to the current
task's kernel stack address before returning to user mode, enabling the
next trap to use the correct per-task kernel stack.

Task initialization was updated to configure mscratch appropriately
during the first dispatch. The dispatcher checks the current privilege
level and sets mscratch to zero for machine mode tasks. For user mode
tasks, it loads the current task's kernel stack pointer if available,
with a fallback to the global kernel stack for initial dispatch before
the first task switch. The main scheduler initialization ensures the
first task's kernel stack pointer is set before entering the scheduling
loop.

The user mode output system call was modified to bypass the asynchronous
logger queue and implement task-level synchronization. Direct output
ensures strict FIFO ordering for test output clarity, while preventing
task preemption during character transmission avoids interleaving when
multiple user tasks print concurrently. This ensures each string is
output atomically with respect to other tasks.

A test helper function was added to support stack pointer manipulation
during validation. Following the Linux kernel's context switching
pattern, this provides precise control over stack operations without
compiler interference. The validation harness uses this to verify
syscall stability under corrupted stack pointer conditions.

Documentation updates include the calling convention guide's stack layout
section, which now distinguishes between machine mode and user mode task
stack organization with detailed diagrams of the dual-stack design. The
context switching guide's task initialization section reflects the
updated function signature for building initial interrupt frames with
per-task kernel stack parameters.

Testing validates that system calls succeed even when invoked with a
malicious stack pointer (0xDEADBEEF), confirming the ISR correctly uses
the per-task kernel stack from mscratch rather than the user-controlled
stack pointer.
@HeatCrab HeatCrab force-pushed the u-mode/basic-support branch from fac7e93 to a41d543 Compare January 29, 2026 13:17
@jserv jserv merged commit 869c914 into sysprog21:main Jan 29, 2026
6 checks passed
@jserv
Contributor

jserv commented Jan 29, 2026

Thank @HeatCrab for contributing!


4 participants