Add symbol table and AST extension APIs for semantic analysis #3013

Draft
chmodshubham wants to merge 4 commits into SonarOpenCommunity:master from chmodshubham:symbol-support

chmodshubham commented Feb 10, 2026

Summary

The sonar-cryptography plugin (based on SonarQube) requires four essential classes for language support, as documented here. sonar-cxx already had three of these classes, but Symbol class support was missing. It is one of the important classes needed for parsing and traversing C/C++ files.

I explored which methods and classes sonar-cryptography needs by studying its Java and Python language support. Then, using the sonar-java and sonar-python libraries as reference implementations, I replicated similar functionality for the sonar-cxx plugin.

Changes

  • Added Symbol, SymbolTable, Type, SourceCodeSymbol classes
  • Added AST node extensions for traversal and type resolution
  • Added PreciseIssue, CxxBaseDetectionRule, CxxCustomRuleRepository for rule management
  • Added CxxAstNodeHelper utility
  • Enhanced SquidAstVisitorContext and SquidCheck
  • Comprehensive unit tests for all new components

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>



This commit introduces symbol table management and AST node extension APIs
to enable semantic code analysis in the CXX squid bridge framework.

New symbol table components:
- Symbol: Represents a code symbol with name, type, scope, and qualifiers
- SymbolTable: Manages symbol registration, lookup, and scope hierarchy
- Type: Represents type information including primitive and user-defined types
- SourceCodeSymbol: Links symbols to their source code declarations
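
To make the scope hierarchy described above concrete, here is a minimal sketch of scoped symbol registration and lookup. It is not the code from this PR: `SketchSymbol` and `SketchScope` are hypothetical names chosen for illustration, and the real Symbol/SymbolTable classes may be shaped quite differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only; not the PR's Symbol class.
final class SketchSymbol {
    final String name;
    final String type;
    // Lines on which the symbol is referenced (filled in by a population pass).
    final List<Integer> usageLines = new ArrayList<>();

    SketchSymbol(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

// Hypothetical illustration only; not the PR's SymbolTable class.
final class SketchScope {
    private final SketchScope parent; // null for the outermost (file) scope
    private final Map<String, SketchSymbol> symbols = new HashMap<>();

    SketchScope(SketchScope parent) {
        this.parent = parent;
    }

    // Registers a declaration in this scope and returns the new symbol.
    SketchSymbol declare(String name, String type) {
        SketchSymbol symbol = new SketchSymbol(name, type);
        symbols.put(name, symbol);
        return symbol;
    }

    // Looks the name up in this scope first, then walks outwards,
    // so an inner declaration shadows an outer one with the same name.
    Optional<SketchSymbol> lookup(String name) {
        for (SketchScope scope = this; scope != null; scope = scope.parent) {
            SketchSymbol found = scope.symbols.get(name);
            if (found != null) {
                return Optional.of(found);
            }
        }
        return Optional.empty();
    }
}
```

With this shape, a lookup of `key` inside a function scope returns the function-local declaration, while the same lookup in the file scope returns the global one.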

New AST node extension APIs:
- AstNodeExtensions: Provides extension methods for AstNode operations
- AstNodeSymbolExtension: Adds symbol resolution and lookup to AST nodes
- AstNodeTraversal: Implements tree traversal patterns for AST analysis
- AstNodeTypeExtension: Provides type checking and type resolution for nodes

New rule and issue management:
- PreciseIssue: Records issues with primary and secondary code locations
- CxxBaseDetectionRule: Abstract base for pattern detection rules
- CxxCustomRuleRepository: Interface for registering custom analysis rules

New utility:
- CxxAstNodeHelper: Helper methods for common AST node operations

Modified classes:
- SquidAstVisitorContext: Added symbol table storage, root tree tracking,
  and precise issue collection. Implemented methods for symbol lookup,
  type resolution, and issue management.
- SquidCheck: Updated to work with new infrastructure components

Test coverage:
- Added unit tests for all new API classes and utilities
- Tests cover symbol management, AST traversal, type checking, and issue
  reporting functionality

Build configuration:
- Updated .gitignore to exclude IDE settings and generated dependency POM

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

guwirth (Collaborator) commented Feb 11, 2026

Hi @chmodshubham,

Thanks for providing this.
We will review it.

Regards,
Günter

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>

guwirth (Collaborator) commented Feb 13, 2026

@chmodshubham, does this SymbolTable have something to do with #1401?

chmodshubham (Author) commented Feb 14, 2026

No, but thanks for pointing this out. That issue is about symbol highlighting support; however, I missed the symbol population part. The methods and infrastructure for symbol creation, usage tracking, AST node binding, and feeding data into SonarQube's NewSymbolTable API are all there, but the actual symbol population visitor, which traverses the AST, detects declarations, resolves references, classifies usages, and populates the symbol table, is completely missing. CxxHighlighterVisitor.java only handles syntax-level token classification (keywords, strings, comments, etc.); it doesn't interact with symbols right now.

I will check and follow up on this.

chmodshubham (Author) commented

Hi @guwirth, the automatic symbol population turned out to be more complex than I expected. It would require either building a full C++ semantic analyzer (a lot of work) or integrating an existing solution such as the Eclipse CDT C/C++ AST, as also mentioned in #1401 (comment), which is the better approach.

Fortunately, I was able to identify some other missing methods that are required by sonar-cryptography, and I am working on them (I will push them soon, once tested). By the way, sonar-cryptography also implements its own visitor mechanism (for reference).

guwirth (Collaborator) commented Feb 16, 2026

> turned out to be more complex

As I understand it, this SymbolTable is only used in the UI to show/highlight all locations of a symbol when you select one. Maybe a simple solution could highlight identical symbols within a single method/function only? The maximum scope is the file anyway?

chmodshubham (Author) commented

For reference, sonar-java's SonarSymbolTableVisitor.java simply calls tree.symbol().usages() on each declaration. The symbols and their references are already fully resolved before highlighting runs, so the visitor does no name matching itself.

In sonar-cxx, by contrast, there is no symbol population mechanism yet (no visitor implementation). It would be better to implement symbol population via a visitor first and then focus on the highlighter, so that we end up with a working model similar to sonar-java's.

Just a conjecture:
A basic name-matching highlighter without proper symbol population might be possible, but I guess it would be limited to simple cases and would produce incorrect highlighting. In any case, that work can be part of another PR.
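
To illustrate the conjecture, here is a rough sketch of what a purely name-based highlighter restricted to one function might look like. Everything in it is hypothetical (`IdentifierToken`, `NameMatchingHighlighter`, and the idea that each token already knows its enclosing function); it is neither sonar-cxx nor sonar-java code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token model: an identifier, the function it appears in, and its line.
final class IdentifierToken {
    final String name;
    final String function;
    final int line;

    IdentifierToken(String name, String function, int line) {
        this.name = name;
        this.function = function;
        this.line = line;
    }
}

// Hypothetical highlighter that groups identifiers by (function, name) only.
final class NameMatchingHighlighter {
    static Map<String, List<Integer>> groupByFunctionAndName(List<IdentifierToken> tokens) {
        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (IdentifierToken token : tokens) {
            // No symbol resolution happens here: two tokens land in the same
            // group whenever their spelling and enclosing function agree.
            String key = token.function + "::" + token.name;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(token.line);
        }
        return groups;
    }
}
```

The limitation shows up as soon as a nested block redeclares a name: an `i` declared in an inner `for` loop is a different variable in C++, but this grouping would still highlight it together with the outer `i` of the same function.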

guwirth added this to the 2.3.0 milestone Feb 17, 2026

Introduce two new utility classes to cxx-squid to support semantic
analysis in C++ check rules:

- CxxConstantUtils: resolves AST expression nodes to compile-time
  constant values (integers, longs, strings, booleans, characters).
  Handles literals, const variables, enum constants, unary/binary
  expressions, and parenthesised sub-expressions. Adapted from
  sonar-java's ExpressionUtils for C++ grammar and the symbol table
  extension API.

- CxxMethodMatcher: fluent builder API for matching C++ function and
  method calls by owner type, name, and parameter types. Mirrors
  sonar-java's MethodMatchers pattern while integrating with the
  CxxSquidbridgeSymbol/Type APIs.
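
As an illustration of the constant resolution described for CxxConstantUtils, the following sketch folds a tiny expression tree into a compile-time integer value. It is an assumption-laden toy, not the PR's implementation: the `Expr` node classes and `SketchConstantResolver` are hypothetical, only integer values are handled, and known constants are passed in as a plain map instead of coming from a symbol table.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical expression nodes; the real code works on cxx-squid AST nodes.
interface Expr {}

final class IntLiteral implements Expr {
    final long value;
    IntLiteral(long value) { this.value = value; }
}

final class Identifier implements Expr {
    final String name;
    Identifier(String name) { this.name = name; }
}

final class Parenthesized implements Expr {
    final Expr inner;
    Parenthesized(Expr inner) { this.inner = inner; }
}

final class Unary implements Expr {
    final char op;
    final Expr operand;
    Unary(char op, Expr operand) { this.op = op; this.operand = operand; }
}

final class Binary implements Expr {
    final char op;
    final Expr left;
    final Expr right;
    Binary(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
}

final class SketchConstantResolver {
    // Returns the constant value of the expression, or empty if any part of it
    // is not a known constant (or the operator is not supported by this sketch).
    static Optional<Long> resolve(Expr e, Map<String, Long> constants) {
        if (e instanceof IntLiteral) {
            return Optional.of(((IntLiteral) e).value);
        }
        if (e instanceof Identifier) {
            // Stands in for "const variable or enum constant" lookup.
            return Optional.ofNullable(constants.get(((Identifier) e).name));
        }
        if (e instanceof Parenthesized) {
            return resolve(((Parenthesized) e).inner, constants);
        }
        if (e instanceof Unary) {
            Unary u = (Unary) e;
            Optional<Long> v = resolve(u.operand, constants);
            if (!v.isPresent()) return Optional.empty();
            switch (u.op) {
                case '-': return Optional.of(-v.get());
                case '+': return v;
                case '~': return Optional.of(~v.get());
                default: return Optional.empty();
            }
        }
        if (e instanceof Binary) {
            Binary b = (Binary) e;
            Optional<Long> l = resolve(b.left, constants);
            Optional<Long> r = resolve(b.right, constants);
            if (!l.isPresent() || !r.isPresent()) return Optional.empty();
            switch (b.op) {
                case '+': return Optional.of(l.get() + r.get());
                case '-': return Optional.of(l.get() - r.get());
                case '*': return Optional.of(l.get() * r.get());
                case '/': return r.get() == 0 ? Optional.empty() : Optional.of(l.get() / r.get());
                default: return Optional.empty();
            }
        }
        return Optional.empty();
    }
}
```

Returning an empty `Optional` for anything unresolved lets a rule distinguish "known constant" from "cannot tell", which matters when a check should only fire on values it can prove.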

Both classes include full unit-test coverage (CxxConstantUtilsTest,
CxxMethodMatcherTest) with parametrised and edge-case tests.
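
For readers unfamiliar with the fluent matcher pattern mentioned for CxxMethodMatcher, here is a minimal sketch of the general idea. The class `SketchCallMatcher` and its method names (`forOwner`, `named`, `withParameters`, `withAnyParameters`) are hypothetical and are not taken from this PR or from sonar-java; types are represented as plain strings for simplicity.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical fluent matcher; not the PR's CxxMethodMatcher API.
final class SketchCallMatcher {
    private final String ownerType;
    private final String name;
    private final List<String> parameterTypes; // null means "any parameters"

    private SketchCallMatcher(String ownerType, String name, List<String> parameterTypes) {
        this.ownerType = ownerType;
        this.name = name;
        this.parameterTypes = parameterTypes;
    }

    static Builder forOwner(String ownerType) {
        return new Builder(ownerType);
    }

    // True when owner type, call name, and (if constrained) argument types all agree.
    boolean matches(String owner, String calledName, List<String> argumentTypes) {
        return ownerType.equals(owner)
            && name.equals(calledName)
            && (parameterTypes == null || parameterTypes.equals(argumentTypes));
    }

    static final class Builder {
        private final String ownerType;
        private String name;
        private List<String> parameterTypes;

        private Builder(String ownerType) {
            this.ownerType = ownerType;
        }

        Builder named(String name) {
            this.name = name;
            return this;
        }

        Builder withParameters(String... types) {
            this.parameterTypes = Arrays.asList(types);
            return this;
        }

        Builder withAnyParameters() {
            this.parameterTypes = null;
            return this;
        }

        SketchCallMatcher build() {
            if (name == null) {
                throw new IllegalStateException("a call name is required");
            }
            return new SketchCallMatcher(ownerType, name, parameterTypes);
        }
    }
}
```

A rule would build the matcher once, for example `SketchCallMatcher.forOwner("Cipher").named("init").withParameters("const char*", "int").build()`, and then test every call expression it visits against it.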

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>

guwirth (Collaborator) commented Feb 27, 2026

Hi @chmodshubham,

What is the status of this PR? Is it ready for review, or are you still working on it?

Regards,

chmodshubham (Author) commented

Hi @guwirth, it is ready for review. I have completed my testing and everything is working fine on my end.

guwirth (Collaborator) commented Mar 4, 2026

Hi @chmodshubham, did you verify the code coverage of the new code? It should be >80%. SonarCloud is not working for PRs, so I can only verify this after the merge.

chmodshubham (Author) commented

Hi @guwirth, no, I think I need to add more tests. The current coverage is around 60% or so.

chmodshubham marked this pull request as draft March 6, 2026 17:45