@rolfbjarne rolfbjarne commented Jan 7, 2026

The current implementation of clangsharp_Cursor_getDecl makes enumerating all Decls an O(N^2) operation: getting the Decl at index X means looping through all the previous indices 0 to X-1 first.

This is rather slow when dealing with hundreds of thousands of Decl instances. Here's a screenshot from Instruments showing 74% of the time spent in clangsharp_Cursor_getDecl:

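<img width="921" height="348" alt="Screenshot 2026-01-07 at 12 47 26" src="https://github.com/user-attachments/assets/c96d5611-683e-4f0d-9878-cc551ba60c71" />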

So I added a way in LazyList to use an existing item to get the next item, and then used Decl.NextDeclInContext to take advantage of this, effectively making it an O(N) algorithm (about 2N steps in practice, because we iterate over all the Decls first to count them).

My main scenario now runs in ~51 seconds instead of 3m05s, so less than 1/3 of the time.
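
To illustrate the difference in cost described above, here is a minimal, self-contained sketch. It does not use ClangSharp at all: `Node`, `DeclWalkDemo`, and every method name are hypothetical stand-ins for a chain of declarations, and the code only counts how many links are followed by each strategy.

```csharp
#nullable enable
using System;

// Hypothetical stand-in for a declaration inside a context: a singly linked node.
sealed class Node
{
    public Node(int id, Node? next) { Id = id; Next = next; }
    public int Id { get; }
    public Node? Next { get; }
}

static class DeclWalkDemo
{
    // Builds a chain of `count` nodes with ids 0..count-1 and returns its head.
    public static Node? BuildChain(int count)
    {
        Node? head = null;
        for (int id = count - 1; id >= 0; id--)
        {
            head = new Node(id, head);
        }
        return head;
    }

    // Indexed lookup that restarts from the head on every call,
    // which is the access pattern the description calls N^2.
    public static Node? GetByIndex(Node? head, int index, ref long steps)
    {
        Node? current = head;
        for (int i = 0; i < index && current != null; i++)
        {
            current = current.Next;
            steps++;
        }
        return current;
    }

    // Visits every node through indexed lookup and returns the links followed.
    public static long StepsWithIndexedLookup(Node? head, int count)
    {
        long steps = 0;
        for (int i = 0; i < count; i++)
        {
            GetByIndex(head, i, ref steps);
        }
        return steps;
    }

    // Visits every node by following Next from the previous node
    // and returns the links followed.
    public static long StepsWithNextPointer(Node? head)
    {
        long steps = 0;
        for (Node? current = head; current?.Next != null; current = current.Next)
        {
            steps++;
        }
        return steps;
    }
}
```

For a chain of 1,000 nodes, `StepsWithIndexedLookup` follows 499,500 links (N(N-1)/2), while `StepsWithNextPointer` follows 999 (N-1).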

@tannergooding
Member

> Storing the C++ iterator in C# to keep iterating on it is beyond my C++ knowledge (in particular any iterator lifetime management didn't look trivial).

In this particular case, the decl_iterator is actually rather trivial, it is effectively:

int i = 0;
CXCursor currentDecl = Handle.GetDecl(i);

do
{
    yield return TranslationUnit.GetOrCreate<Decl>(currentDecl);
    currentDecl = currentDecl.NextDeclInContext;
}
while (!currentDecl.IsNull);

If LazyList changed the signature of valueFactory from Func<int, T> to Func<int, ReadOnlySpan<T>, T>, then each iteration could pass in the already initialized elements, i.e. for index 0 the span is empty, while for index 2 it contains elements 0 and 1.

This would allow the implementation to be something like this:

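_decls = LazyList.Create<Decl>(Handle.NumDecls, (i, initializedDecls) => {
    if (i == 0)
    {
        // First element: wrap the raw cursor, as in the loop above.
        return TranslationUnit.GetOrCreate<Decl>(Handle.GetDecl(0));
    }
    else
    {
        // Later elements: follow the link from the last initialized Decl.
        return initializedDecls[^1].NextDeclInContext;
    }
});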
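
As a rough, self-contained sketch of how such a factory signature could work, the following uses entirely hypothetical types (`SketchLazyList<T>`, `ValueFactory<T>`, `ChainNode`, `LazyListDemo`); it is not ClangSharp's actual LazyList. A custom delegate is declared instead of `Func<int, ReadOnlySpan<T>, T>` because `ReadOnlySpan<T>` is a ref struct, which older C# versions do not accept as a generic type argument.

```csharp
#nullable enable
using System;

// Hypothetical factory shape: receives the index plus all elements created so far.
delegate T ValueFactory<T>(int index, ReadOnlySpan<T> initialized);

// Hypothetical minimal lazy list; not ClangSharp's LazyList.
sealed class SketchLazyList<T> where T : class
{
    private readonly T[] _items;              // slots at or past _initializedCount are still null
    private readonly ValueFactory<T> _valueFactory;
    private int _initializedCount;

    public SketchLazyList(int count, ValueFactory<T> valueFactory)
    {
        _items = new T[count];
        _valueFactory = valueFactory;
    }

    public int Count => _items.Length;

    public T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_items.Length)
            {
                throw new ArgumentOutOfRangeException(nameof(index));
            }

            // Fill strictly in order, so the factory for element i
            // always sees exactly elements 0..i-1.
            while (_initializedCount <= index)
            {
                int i = _initializedCount;
                _items[i] = _valueFactory(i, new ReadOnlySpan<T>(_items, 0, i));
                _initializedCount = i + 1;
            }

            return _items[index];
        }
    }
}

// Hypothetical linked element, standing in for a Decl with a "next" link.
sealed class ChainNode
{
    public ChainNode(int id) { Id = id; }
    public int Id { get; }
    public ChainNode? Next { get; set; }
}

static class LazyListDemo
{
    // Element 0 is the head; every later element is the previous element's Next.
    public static SketchLazyList<ChainNode> FromChain(ChainNode head, int count)
    {
        return new SketchLazyList<ChainNode>(count, (i, initialized) =>
            i == 0 ? head : initialized[^1].Next!);
    }
}
```

Because the sketch always fills slots in order, asking for index 3 first creates elements 0 through 3 exactly once each, and later reads of any of those indices are served from the backing array without calling the factory again.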

@rolfbjarne
Member Author

I tried your suggestion, and it works! Speed increase is quite comparable (not sure if it's worse or better, because I'm not using the exact same test case, but it's very much in the same ballpark).

Also it's a fully managed solution, so existing tests work fine.

@tannergooding tannergooding merged commit 56982e5 into dotnet:main Jan 12, 2026
14 checks passed