Analyze unimplemented interfaces and implement functionality #1

Open
sonald wants to merge 1 commit into master from cursor/bc-7043deeb-09d5-4beb-aea9-fe74647f4452-4474

@sonald (Owner) commented Jul 3, 2025

Implement core C99 code generation features to resolve 'not implemented' panics and improve compiler completeness.

This PR addresses the initial phase of implementing missing functionalities identified in the compiler project. It focuses on the code generation module, specifically:

  • Core expression handling (e.g., sizeof, ternary operator ?:, type casting, character literals).
  • All compound assignment operators (-=, *=, /=, %=, &=, |=, ^=, <<=, >>=).
  • Support for compound literals and initialization lists.

These implementations eliminate several 'not implemented' panics and advance the compiler's ability to generate LLVM IR for common C99 constructs.

Summary by Sourcery

Implement missing C99 code generation features in LLVM backend to eliminate "not implemented" panics and improve compiler completeness

New Features:

  • Add codegen support for character literals, sizeof operator, conditional (?:) operator, and type casts
  • Implement all compound assignment operators (e.g., -=, *=, /=, %=, &=, |=, ^=, <<=, >>=)
  • Support compound literals and initialization lists in code generation

Build:

  • Introduce go.mod for project module definition

Documentation:

  • Add IMPLEMENTATION_TASKS.md and PROGRESS_SUMMARY.md outlining project tasks and progress

Tests:

  • Add basic C test file exercising newly implemented features

Co-authored-by: sonaldcao <sonaldcao@gmail.com>
sourcery-ai bot commented Jul 3, 2025

    Reviewer's Guide

    This PR extends the LLVM code generator by fully implementing previously unimplemented AST walkers to support core C99 constructs—covering character literals, sizeof, ternary expressions, type casts, compound assignments, compound literals, and initialization lists—and adds metadata and test files to validate these features.
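The guide says sizeof is computed using target data with fallbacks. As a rough illustration of what such a fallback could look like when no target data layout is available, the sketch below models a few C types and computes sizes with natural alignment. The `CType` model, the `sizeOf` and `alignOf` helpers, and the LP64-style sizes are all assumptions made for this example; none of them come from the PR.

```go
package main

import "fmt"

// Kind enumerates the hypothetical type categories used in this sketch.
type Kind int

const (
	Char Kind = iota
	Int
	Long
	Pointer
	Array
	Struct
)

// CType is a simplified stand-in for a compiler's type representation.
type CType struct {
	Kind   Kind
	Elem   *CType  // element type, for Array
	Len    int     // element count, for Array
	Fields []CType // members, for Struct
}

// alignOf returns the assumed natural alignment in bytes (LP64-style).
func alignOf(t CType) int {
	switch t.Kind {
	case Char:
		return 1
	case Int:
		return 4
	case Long, Pointer:
		return 8
	case Array:
		return alignOf(*t.Elem)
	case Struct:
		a := 1
		for _, f := range t.Fields {
			if fa := alignOf(f); fa > a {
				a = fa
			}
		}
		return a
	}
	return 1
}

// roundUp rounds n up to the next multiple of align.
func roundUp(n, align int) int {
	return (n + align - 1) / align * align
}

// sizeOf returns the size in bytes, padding struct members to their alignment.
func sizeOf(t CType) int {
	switch t.Kind {
	case Char:
		return 1
	case Int:
		return 4
	case Long, Pointer:
		return 8
	case Array:
		return t.Len * sizeOf(*t.Elem)
	case Struct:
		off := 0
		for _, f := range t.Fields {
			off = roundUp(off, alignOf(f)) + sizeOf(f)
		}
		return roundUp(off, alignOf(t)) // tail padding
	}
	return 0
}

func main() {
	// struct { char a; int b; char c; }
	s := CType{Kind: Struct, Fields: []CType{{Kind: Char}, {Kind: Int}, {Kind: Char}}}
	fmt.Println(sizeOf(s)) // prints 12
}
```

In a real backend the target's data layout should take precedence, since sizes and alignments differ across ABIs; a table like this is only reasonable as a last resort.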

    Sequence diagram for ternary operator (?:) code generation

    sequenceDiagram
        participant AstWalker as AstWalker
        participant Builder as LLVMBuilder
        participant Context as WalkContext
        AstWalker->>AstWalker: WalkConditionalOperation(cond ? true : false)
        AstWalker->>Builder: Create condition value
        Builder-->>AstWalker: cond (i1)
        AstWalker->>Builder: CreateCondBr(cond, trueBB, falseBB)
        AstWalker->>AstWalker: WalkAst(trueExpr)
        AstWalker->>Builder: CreateLoad if pointer
        AstWalker->>AstWalker: WalkAst(falseExpr)
        AstWalker->>Builder: CreateLoad if pointer
        AstWalker->>Builder: CreateBr(mergeBB) for both branches
        AstWalker->>Builder: CreatePHI(resultType, [trueVal, falseVal])
        Builder-->>Context: PHI node (merged value)
        Context-->>AstWalker: Set ctx.Value = PHI
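To make the shape of the generated code concrete, here is a minimal sketch that prints textual LLVM IR for `cond ? a : b` on `i32` values. It builds strings rather than calling the LLVM bindings, and the `emitTernary` helper and the block labels are hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// emitTernary returns textual LLVM IR lines for `result = cond ? a : b`.
// cond, a and b name SSA values assumed to be defined already. In a real
// walker, the code for each arm would be emitted inside its own block,
// before that block's branch to the merge block.
func emitTernary(result, cond, a, b string) []string {
	return []string{
		fmt.Sprintf("  br i1 %%%s, label %%cond.true, label %%cond.false", cond),
		"cond.true:",
		"  br label %cond.end",
		"cond.false:",
		"  br label %cond.end",
		"cond.end:",
		fmt.Sprintf("  %%%s = phi i32 [ %%%s, %%cond.true ], [ %%%s, %%cond.false ]", result, a, b),
	}
}

func main() {
	fmt.Println(strings.Join(emitTernary("t", "c", "x", "y"), "\n"))
}
```

One detail worth checking in any real implementation: each PHI incoming block must be the block that actually branches to the merge block. If an arm contains nested control flow, that is the builder's current block after walking the arm, not necessarily the block created for the arm.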
    

    Class diagram for extended AST walkers in LLVM codegen

    classDiagram
        class AstWalker {
            +WalkCharLiteralExpr()
            +WalkSizeofExpr()
            +WalkConditionalOperation()
            +WalkCastExpr()
            +WalkCompoundAssignExpr()
            +WalkCompoundLiteralExpr()
            +WalkInitListExpr()
        }
        class WalkContext {
            +Value: llvm.Value
        }
        class llvm.Value
        class llvm.Type
        AstWalker --> WalkContext
        WalkContext --> llvm.Value
        AstWalker ..> llvm.Type : uses
        AstWalker ..> llvm.Value : uses
    

    Class diagram for compound assignment operator handling

    classDiagram
        class AstWalker {
            +WalkCompoundAssignExpr()
        }
        class lexer {
            <<enumeration>>
            MINUS_ASSIGN
            MUL_ASSIGN
            DIV_ASSIGN
            MOD_ASSIGN
            AND_ASSIGN
            OR_ASSIGN
            XOR_ASSIGN
            LSHIFT_ASSIGN
            RSHIFT_ASSIGN
        }
        AstWalker ..> lexer : uses
    

    Class diagram for compound literal and initializer list support

    classDiagram
        class AstWalker {
            +WalkCompoundLiteralExpr()
            +WalkInitListExpr()
        }
        class llvm.Value
        class llvm.Type
        AstWalker ..> llvm.Value : uses
        AstWalker ..> llvm.Type : uses
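As an illustration of the constant array an initializer list might lower to, the sketch below formats a textual LLVM IR global for an `i32` array, zero-filling elements the initializer omits (C zero-initializes the remaining elements of a partially initialized array). The `constArray` helper and the `private` linkage are assumptions for this example; the PR may choose a different linkage or representation.

```go
package main

import (
	"fmt"
	"strings"
)

// constArray renders a global constant [n x i32] in textual LLVM IR.
// Elements beyond len(vals) are zero-filled; values beyond n are ignored
// here, although a C front end would reject excess initializers earlier.
func constArray(name string, n int, vals []int32) string {
	elems := make([]string, n)
	for i := range elems {
		var v int32 // zero unless the initializer supplies a value
		if i < len(vals) {
			v = vals[i]
		}
		elems[i] = fmt.Sprintf("i32 %d", v)
	}
	return fmt.Sprintf("@%s = private constant [%d x i32] [%s]",
		name, n, strings.Join(elems, ", "))
}

func main() {
	// Roughly what `int a[4] = {1, 2};` could become.
	fmt.Println(constArray("a", 4, []int32{1, 2}))
}
```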
    

    File-Level Changes

    Change Details Files
    Implement core expression walkers
    • Convert char literals to 8-bit integer constants
    • Fully implement sizeof (type or expression) using target data with fallbacks
    • Generate conditional branches, loads, and PHI nodes for ternary operator
    • Handle loads and context for CastExpr with comprehensive integer, pointer, and float conversions
    codegen/llvm_cg.go
    Add all compound assignment operators
    • Implement load-if-pointer, operation, and store for each operator
    • Support -=, *=, /=, %=, <<=, >>=, &=, |=, ^=
    • Replace generic panic with operator-specific error for unknown cases
    codegen/llvm_cg.go
    Support compound literals and initializer lists
    • Allocate storage and zero-initialize or copy for CompoundLiteralExpr
    • Collect, load, and create constant arrays for InitListExpr
    • Define global constant array and linkage for initializer lists
    codegen/llvm_cg.go
    Add documentation and basic C test
    • Add IMPLEMENTATION_TASKS.md to track feature tasks
    • Add PROGRESS_SUMMARY.md to summarize completed work
    • Include test_basic.c to validate new codegen features
    • Introduce go.mod for module definition
    IMPLEMENTATION_TASKS.md
    PROGRESS_SUMMARY.md
    test_basic.c
    go.mod
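The integer, pointer, and float conversions listed for CastExpr come down to picking the right LLVM cast instruction from the source and destination types. The following sketch shows one plausible selection table. The `Ty` model and the `castOp` helper are hypothetical, and the PR's actual logic may differ; only the instruction names are LLVM's.

```go
package main

import "fmt"

// Class is a hypothetical coarse type category for this sketch.
type Class int

const (
	IntClass Class = iota
	FloatClass
	PtrClass
)

// Ty is a simplified scalar type: category, bit width, and signedness.
type Ty struct {
	Class  Class
	Bits   int
	Signed bool // only meaningful for IntClass
}

// castOp returns the LLVM cast instruction name for converting from -> to.
// An empty name with ok == true means no instruction is needed;
// ok == false means this sketch does not handle the pair.
func castOp(from, to Ty) (name string, ok bool) {
	switch {
	case from.Class == IntClass && to.Class == IntClass:
		switch {
		case to.Bits < from.Bits:
			return "trunc", true
		case to.Bits > from.Bits && from.Signed:
			return "sext", true // widening keeps the sign of the source
		case to.Bits > from.Bits:
			return "zext", true
		}
		return "", true
	case from.Class == IntClass && to.Class == FloatClass:
		if from.Signed {
			return "sitofp", true
		}
		return "uitofp", true
	case from.Class == FloatClass && to.Class == IntClass:
		if to.Signed {
			return "fptosi", true
		}
		return "fptoui", true
	case from.Class == FloatClass && to.Class == FloatClass:
		switch {
		case to.Bits < from.Bits:
			return "fptrunc", true
		case to.Bits > from.Bits:
			return "fpext", true
		}
		return "", true
	case from.Class == PtrClass && to.Class == IntClass:
		return "ptrtoint", true
	case from.Class == IntClass && to.Class == PtrClass:
		return "inttoptr", true
	case from.Class == PtrClass && to.Class == PtrClass:
		return "bitcast", true // typed pointers; opaque pointers need no cast
	}
	return "", false
}

func main() {
	name, _ := castOp(Ty{IntClass, 8, true}, Ty{IntClass, 32, true})
	fmt.Println(name) // prints sext
}
```

Note that widening uses the signedness of the source type, while float-to-integer conversion uses the signedness of the destination.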

    @sourcery-ai sourcery-ai bot left a comment

    Hey @sonald - I've reviewed your changes and they look great!

    Prompt for AI Agents
    Please address the comments from this code review:
    ## Individual Comments
    
    ### Comment 1
    <location> `src/github.com/sonald/sc/codegen/llvm_cg.go:850` </location>
    <code_context>
    +				walker.Info.builder.CreateStore(val, lhs)
    +				op = val
    
    +			case lexer.MINUS_ASSIGN:
    +				var l, r llvm.Value = lhs, rhs
    +				if l.Type().TypeKind() == llvm.PointerTypeKind {
    </code_context>
    
    <issue_to_address>
    Consider extracting repeated pointer loading and operator logic in binary and compound assignment walkers into reusable helper functions and operator maps.
    
    Here are a few quick wins to collapse the enormous `WalkBinaryOperation`/`WalkCompoundAssignExpr` plumbing into small, reusable helpers.  All existing functionality is preserved.
    
    1. Pull your “load if pointer” pattern into one helper:
    
    ```go
    // loadIfPtr loads *v if it's a pointer, otherwise returns v unchanged.
    func loadIfPtr(v llvm.Value, b llvm.Builder) llvm.Value {
        if v.Type().TypeKind() == llvm.PointerTypeKind {
            return b.CreateLoad(v, "")
        }
        return v
    }
    ```
    
    2. Predeclare your operator‐to‐builder maps at package scope (rather than in each walker):
    
    ```go
    var intArithOps = map[lexer.Kind]func(b llvm.Builder, l, r llvm.Value) llvm.Value{
        lexer.PLUS:  func(b llvm.Builder, l, r llvm.Value) llvm.Value { return b.CreateAdd(l, r, "") },
        lexer.MINUS: func(b llvm.Builder, l, r llvm.Value) llvm.Value { return b.CreateSub(l, r, "") },
        lexer.MUL:   func(b llvm.Builder, l, r llvm.Value) llvm.Value { return b.CreateMul(l, r, "") },
        lexer.DIV:   func(b llvm.Builder, l, r llvm.Value) llvm.Value { return b.CreateSDiv(l, r, "") },
        lexer.MOD:   func(b llvm.Builder, l, r llvm.Value) llvm.Value { return b.CreateSRem(l, r, "") },
    }
    
    var intCmpOps = map[lexer.Kind]llvm.IntPredicate{
        lexer.GREAT: llvm.IntSGT,
        lexer.GE:    llvm.IntSGE,
        lexer.LESS:  llvm.IntSLT,
        lexer.LE:    llvm.IntSLE,
        lexer.NE:    llvm.IntNE,
        lexer.EQUAL: llvm.IntEQ,
    }
    ```
    
    3. Replace the huge switch in `WalkBinaryOperation` with:
    
    ```go
    walker.WalkBinaryOperation = func(ws ast.WalkStage, e *ast.BinaryOperation, ctx *ast.WalkContext) bool {
        if ws != ast.WalkerPropagate || InSwitchCaseCounting() {
            return true
        }
    
        b := walker.Info.builder
        lhs := loadIfPtr(ast.WalkAst(e.LHS, walker, ctx).(llvm.Value), b)
        rhs := loadIfPtr(ast.WalkAst(e.RHS, walker, ctx).(llvm.Value), b)
    
        // arithmetic
        if fn, ok := intArithOps[e.Op]; ok {
            ctx.Value = fn(b, lhs, rhs)
            return false
        }
        // comparisons
        if pred, ok := intCmpOps[e.Op]; ok {
            ctx.Value = b.CreateICmp(pred, lhs, rhs, "")
            return false
        }
        // TODO: logical, bitwise, comma, assign, etc. can use the same pattern:
        //   1) map lexer.Kind to a small func
        //   2) call it here
        //   3) return false
    
        panic("unhandled binary op " + e.Op.String())
    }
    ```
    
    4. Do the same in `WalkCompoundAssignExpr`:
    
    ```go
    walker.WalkCompoundAssignExpr = func(ws ast.WalkStage, e *ast.CompoundAssignExpr, ctx *ast.WalkContext) bool {
        if ws != ast.WalkerPropagate {
            return true
        }
        b := walker.Info.builder
        lhsPtr := ast.WalkAst(e.LHS, walker, ctx).(llvm.Value)
        rhsPtr := ast.WalkAst(e.RHS, walker, ctx).(llvm.Value)
        l := loadIfPtr(lhsPtr, b)
        r := loadIfPtr(rhsPtr, b)
    
        // e.g. for +=
        switch e.Op {
        case lexer.PLUS_ASSIGN:
            val := intArithOps[lexer.PLUS](b, l, r)
            b.CreateStore(val, lhsPtr)
            ctx.Value = val
        default:
            panic("not implemented: " + e.Op.String())
        }
        return false
    }
    ```
    
    Once you’ve extracted these patterns you can remove all the repeated `if TypeKind() == Pointer` blocks and the huge inline maps, and focus each walker on just orchestrating the right helpers.
    </issue_to_address>
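The switch in step 4 only covers `+=`. One way to extend the same idea to every compound operator is a second map from each compound-assignment token to its underlying binary operator. The sketch below models this on plain `int32` values instead of `llvm.Value` so that it runs without the LLVM bindings. The `Kind` type stands in for `lexer.Kind`; the `*_ASSIGN` names follow the class diagram in this PR, while `AND`, `OR`, `XOR`, `LSHIFT`, `RSHIFT`, and the `compoundAssign` helper are assumed names.

```go
package main

import "fmt"

// Kind stands in for the project's lexer.Kind in this sketch.
type Kind int

const (
	PLUS Kind = iota
	MINUS
	MUL
	DIV
	MOD
	AND
	OR
	XOR
	LSHIFT
	RSHIFT
	PLUS_ASSIGN
	MINUS_ASSIGN
	MUL_ASSIGN
	DIV_ASSIGN
	MOD_ASSIGN
	AND_ASSIGN
	OR_ASSIGN
	XOR_ASSIGN
	LSHIFT_ASSIGN
	RSHIFT_ASSIGN
)

// baseOp maps each compound-assignment token to its binary operator.
var baseOp = map[Kind]Kind{
	PLUS_ASSIGN: PLUS, MINUS_ASSIGN: MINUS, MUL_ASSIGN: MUL,
	DIV_ASSIGN: DIV, MOD_ASSIGN: MOD, AND_ASSIGN: AND,
	OR_ASSIGN: OR, XOR_ASSIGN: XOR,
	LSHIFT_ASSIGN: LSHIFT, RSHIFT_ASSIGN: RSHIFT,
}

// intOps plays the role of the review's intArithOps, on int32 values.
// Shift counts are masked to 0..31 here; C leaves larger counts undefined.
var intOps = map[Kind]func(l, r int32) int32{
	PLUS:   func(l, r int32) int32 { return l + r },
	MINUS:  func(l, r int32) int32 { return l - r },
	MUL:    func(l, r int32) int32 { return l * r },
	DIV:    func(l, r int32) int32 { return l / r },
	MOD:    func(l, r int32) int32 { return l % r },
	AND:    func(l, r int32) int32 { return l & r },
	OR:     func(l, r int32) int32 { return l | r },
	XOR:    func(l, r int32) int32 { return l ^ r },
	LSHIFT: func(l, r int32) int32 { return l << (uint32(r) & 31) },
	RSHIFT: func(l, r int32) int32 { return l >> (uint32(r) & 31) },
}

// compoundAssign performs load, operate, store on *cell and returns the
// stored value, mirroring the load-if-pointer / op / store sequence.
func compoundAssign(op Kind, cell *int32, rhs int32) (int32, error) {
	base, ok := baseOp[op]
	if !ok {
		return 0, fmt.Errorf("not a compound assignment: %d", op)
	}
	val := intOps[base](*cell, rhs)
	*cell = val
	return val, nil
}

func main() {
	x := int32(10)
	v, _ := compoundAssign(MINUS_ASSIGN, &x, 3)
	fmt.Println(v, x) // prints 7 7
}
```

With a table like this, the walker needs a single code path for all compound operators, and an unknown token yields an error that names the operator instead of a generic panic.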


    @sonald sonald added the jules label Jul 3, 2025