Add symbol table and AST extension APIs for semantic analysis #3013
chmodshubham wants to merge 4 commits into SonarOpenCommunity:master
This commit introduces symbol table management and AST node extension APIs to enable semantic code analysis in the CXX squid bridge framework.

New symbol table components:
- Symbol: Represents a code symbol with name, type, scope, and qualifiers
- SymbolTable: Manages symbol registration, lookup, and scope hierarchy
- Type: Represents type information including primitive and user-defined types
- SourceCodeSymbol: Links symbols to their source code declarations

New AST node extension APIs:
- AstNodeExtensions: Provides extension methods for AstNode operations
- AstNodeSymbolExtension: Adds symbol resolution and lookup to AST nodes
- AstNodeTraversal: Implements tree traversal patterns for AST analysis
- AstNodeTypeExtension: Provides type checking and type resolution for nodes

New rule and issue management:
- PreciseIssue: Records issues with primary and secondary code locations
- CxxBaseDetectionRule: Abstract base for pattern detection rules
- CxxCustomRuleRepository: Interface for registering custom analysis rules

New utility:
- CxxAstNodeHelper: Helper methods for common AST node operations

Modified classes:
- SquidAstVisitorContext: Added symbol table storage, root tree tracking, and precise issue collection. Implemented methods for symbol lookup, type resolution, and issue management.
- SquidCheck: Updated to work with new infrastructure components

Test coverage:
- Added unit tests for all new API classes and utilities
- Tests cover symbol management, AST traversal, type checking, and issue reporting functionality

Build configuration:
- Updated .gitignore to exclude IDE settings and generated dependency POM

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
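To make the "registration, lookup, and scope hierarchy" part of the commit message concrete, here is a minimal sketch of a scoped symbol table. It is an illustration only: the class and method names (`ScopedSymbolTableSketch`, `Scope`, `declare`, `lookup`) are hypothetical and are not the API added by this PR, and types are reduced to plain strings.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch; these are NOT the classes introduced by the PR. */
public class ScopedSymbolTableSketch {

    /** A declared name, with its type kept as plain text for simplicity. */
    record Symbol(String name, String type, int line) {}

    /** One lexical scope; lookups fall back to the enclosing scope. */
    static final class Scope {
        private final Scope parent;
        private final Map<String, Symbol> symbols = new HashMap<>();

        Scope(Scope parent) {
            this.parent = parent;
        }

        /** Registers a symbol; returns false if the name already exists in this scope. */
        boolean declare(Symbol symbol) {
            return symbols.putIfAbsent(symbol.name(), symbol) == null;
        }

        /** Resolves a name, searching the innermost scope first. */
        Optional<Symbol> lookup(String name) {
            for (Scope s = this; s != null; s = s.parent) {
                Symbol found = s.symbols.get(name);
                if (found != null) {
                    return Optional.of(found);
                }
            }
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        Scope global = new Scope(null);
        global.declare(new Symbol("count", "int", 1));

        Scope function = new Scope(global);
        function.declare(new Symbol("count", "long", 5)); // shadows the global declaration

        System.out.println(function.lookup("count").map(Symbol::type).orElse("?"));
        System.out.println(global.lookup("count").map(Symbol::type).orElse("?"));
        System.out.println(function.lookup("missing").isPresent());
    }
}
```

The parent link is what gives the hierarchy: an inner declaration shadows an outer one without removing it, and a name that is not declared anywhere resolves to an empty result instead of throwing.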
Force-pushed from cb96fcd to 33d0802
|
Hi @chmodshubham, Thanks for providing this. Regards, |
Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>
|
@chmodshubham, does this SymbolTable have anything to do with #1401?
|
No, thanks for pointing this out. That issue is about symbol highlighting support; however, I missed the symbol population part. The methods and infrastructure for symbol creation, usage tracking, AST node binding, and feeding data into SonarQube's NewSymbolTable API are all there, but the actual symbol population visitor, which traverses the AST, detects declarations, resolves references, classifies usages, and populates the symbol table, is completely missing. CxxHighlighterVisitor.java only handles syntax-level token classification (keywords, strings, comments, etc.); it doesn't interact with symbols right now. I will check and follow up on this.
|
Hi @guwirth, the automatic symbol population turned out to be more complex than I expected. It would require either building a full C++ semantic analyzer (a lot of work) or integrating an existing solution such as the Eclipse CDT C/C++ AST, as also mentioned in #1401 (comment) (the better approach). Fortunately, I was able to identify some other missing methods that are required by sonar-cryptography, and I am working on them (I will push them soon, once tested). By the way, sonar-cryptography also implements its own visitor mechanism (for reference).
As I understand it, this SymbolTable is only used in the UI to show/highlight all locations of a symbol when you select one. Maybe a simple solution could highlight identical symbols within a method/function only? The maximum scope is a file anyway?
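The "same symbols within one function" idea can be sketched as a purely name-based grouping. This is a rough illustration under strong assumptions: the `Occurrence` record and `groupByName` helper are hypothetical, the identifier occurrences of a single function body are assumed to be already extracted from the AST, and shadowing in nested blocks is ignored.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of name-based grouping; not part of sonar-cxx. */
public class FunctionLocalHighlightSketch {

    /** One identifier token inside a function body. */
    record Occurrence(String name, int line, int column) {}

    /** Groups the identifier occurrences of ONE function body by name, in source order. */
    static Map<String, List<Occurrence>> groupByName(List<Occurrence> identifiersInFunction) {
        Map<String, List<Occurrence>> groups = new LinkedHashMap<>();
        for (Occurrence o : identifiersInFunction) {
            groups.computeIfAbsent(o.name(), k -> new ArrayList<>()).add(o);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Occurrence> body = List.of(
            new Occurrence("key", 3, 8),
            new Occurrence("len", 4, 8),
            new Occurrence("key", 5, 12),
            new Occurrence("key", 7, 4),
            new Occurrence("len", 7, 9));

        // Treat the first occurrence as the declaration and the rest as references.
        groupByName(body).forEach((name, occurrences) -> {
            Occurrence declaration = occurrences.get(0);
            System.out.println(name + ": declared at line " + declaration.line()
                + ", " + (occurrences.size() - 1) + " reference(s)");
        });
    }
}
```

In a real sensor, each group could then be reported through SonarQube's `NewSymbolTable` (one `newSymbol(...)` for the first occurrence, one `newReference(...)` per remaining occurrence). The limitation is that two unrelated variables with the same name in different nested blocks would be highlighted together.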
|
For reference, sonar-java's SonarSymbolTableVisitor.java simply calls the [...], whereas sonar-cxx has no symbol population mechanism yet (no visitor implementation). It would be better to first implement symbol population via a visitor and then focus on the highlighter, so that we have a working model similar to sonar-java's. Just a conjecture:
Introduce two new utility classes to cxx-squid to support semantic analysis in C++ check rules:

- CxxConstantUtils: resolves AST expression nodes to compile-time constant values (integers, longs, strings, booleans, characters). Handles literals, const variables, enum constants, unary/binary expressions, and parenthesised sub-expressions. Adapted from sonar-java's ExpressionUtils for C++ grammar and the symbol table extension API.
- CxxMethodMatcher: fluent builder API for matching C++ function and method calls by owner type, name, and parameter types. Mirrors sonar-java's MethodMatchers pattern while integrating with the CxxSquidbridgeSymbol/Type APIs.

Both classes include full unit-test coverage (CxxConstantUtilsTest, CxxMethodMatcherTest) with parametrised and edge-case tests.

Signed-off-by: Shubham Kumar <chmodshubham@gmail.com>
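As an illustration of what resolving an expression to a compile-time constant involves, the following sketch folds a tiny hand-built expression tree. It is not the CxxConstantUtils implementation: the `Expr` node types and the `resolve` signature are hypothetical, only integer values are handled, and known constants are passed in as a plain map rather than looked up in a symbol table.

```java
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of constant folding; NOT the PR's CxxConstantUtils. */
public class ConstantFoldingSketch {

    interface Expr {}
    record IntLiteral(long value) implements Expr {}
    record Name(String name) implements Expr {}
    record Unary(char op, Expr operand) implements Expr {}
    record Binary(char op, Expr left, Expr right) implements Expr {}

    /** Returns the constant value of an expression, or empty if it is not a known constant. */
    static Optional<Long> resolve(Expr e, Map<String, Long> constants) {
        if (e instanceof IntLiteral lit) {
            return Optional.of(lit.value());
        }
        if (e instanceof Name n) {
            // Stands in for const variables and enum constants known to the caller.
            return Optional.ofNullable(constants.get(n.name()));
        }
        if (e instanceof Unary u) {
            Optional<Long> v = resolve(u.operand(), constants);
            if (v.isEmpty()) {
                return Optional.empty();
            }
            switch (u.op()) {
                case '-': return Optional.of(-v.get());
                case '+': return v;
                default: return Optional.empty();
            }
        }
        if (e instanceof Binary b) {
            Optional<Long> l = resolve(b.left(), constants);
            Optional<Long> r = resolve(b.right(), constants);
            if (l.isEmpty() || r.isEmpty()) {
                return Optional.empty();
            }
            long x = l.get();
            long y = r.get();
            switch (b.op()) {
                case '+': return Optional.of(x + y);
                case '-': return Optional.of(x - y);
                case '*': return Optional.of(x * y);
                case '/':
                    if (y == 0) {
                        return Optional.empty(); // division by zero is not a constant
                    }
                    return Optional.of(x / y);
                default: return Optional.empty();
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Models the C++ expression (KEY_BITS / 8) * 2 with `const int KEY_BITS = 256;`
        Expr expr = new Binary('*', new Binary('/', new Name("KEY_BITS"), new IntLiteral(8)), new IntLiteral(2));
        System.out.println(resolve(expr, Map.of("KEY_BITS", 256L))); // prints Optional[64]
        System.out.println(resolve(expr, Map.of()));                 // prints Optional.empty
    }
}
```

Parenthesised sub-expressions need no special case here because grouping is already encoded in the tree shape. Returning an empty `Optional` for anything unresolved lets a check rule skip the node rather than report on a guessed value.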
|
Hi @chmodshubham, what is the status of this PR? Ready to review or are you still working on it? Regards, |
|
Hi @guwirth, it is ready for review. I have completed my testing and everything is working fine on my end. |
|
Hi @chmodshubham, did you verify the code coverage of the new code? It should be >80%. SonarCloud does not work for PRs, so I can only verify this after the merge.
|
Hi @guwirth, no, I think I need to add more tests. The current coverage is around 60% or so. |
Summary
The sonar-cryptography plugin (based on SonarQube) requires four essential classes for language support, as documented here. sonar-cxx already had three of these classes, but Symbol class support was missing. This is one of the important classes needed for parsing and traversing C/C++ files.
I explored which methods and classes sonar-cryptography needs, based on its Java and Python language support. Then, using the sonar-java and sonar-python implementations as references, I replicated similar functionality for the sonar-cxx plugin.
Changes
Signed-off-by: Shubham Kumar chmodshubham@gmail.com