@ibbem
Collaborator

@ibbem ibbem commented Sep 8, 2025

The line graph exporter tries to merge unchanged children. However, when a child was moved, it cannot be classified as an unchanged edge when linearized (the same problem as representing moves in line diffs). This resulted in a bug that made the line graph exporter non-injective. This was fixed in 3a778b5.

All other commits in this PR are just refactorings I had lying around. All of these commits should probably be reviewed independently but I'm too lazy and thus I'm opening only one PR 😅

@ibbem
Collaborator Author

ibbem commented Oct 1, 2025

In addition to addressing your feedback, I also discovered some bugs and added some tests.

There is one issue that I didn't address: The GumTreeDiff doesn't set the line numbers in the DiffLinesLabel correctly. However, although this is a bug in all current usages, the implementation is generic in the label and thus cannot generically merge/split labels. There are two possible fixes I can imagine:

  • Add merge (and split) function arguments, similar to what I did in DiffNode.join, to fix this issue.
  • Add merge (and split) functions to the Label interface.

What do you think?
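
For illustration, the second option could look roughly like the sketch below. All names in it (`MergeableLabel`, `LinesLabel`, `NumberedLine`) are hypothetical stand-ins and not the project's actual `Label` or `DiffLinesLabel` API; the point is only that a label carrying line numbers can keep them correct when labels are merged or split.

```java
import java.util.ArrayList;
import java.util.List;

class LabelMergeSketch {
    /** Hypothetical label interface; NOT the project's actual Label API. */
    interface MergeableLabel<L> {
        /** Combines this label with the label that follows it. */
        L merge(L other);

        /** Splits this label into two parts at the given line index. */
        List<L> split(int index);
    }

    /** A line of text together with its line number in the source file. */
    record NumberedLine(int lineNumber, String text) {}

    /** Hypothetical line-based label that keeps the line numbers of its lines. */
    record LinesLabel(List<NumberedLine> lines) implements MergeableLabel<LinesLabel> {
        @Override
        public LinesLabel merge(LinesLabel other) {
            // Concatenate the lines; every line keeps its own line number.
            List<NumberedLine> all = new ArrayList<>(lines);
            all.addAll(other.lines);
            return new LinesLabel(List.copyOf(all));
        }

        @Override
        public List<LinesLabel> split(int index) {
            // Lines [0, index) go to the first part, the rest to the second.
            return List.of(
                new LinesLabel(List.copyOf(lines.subList(0, index))),
                new LinesLabel(List.copyOf(lines.subList(index, lines.size()))));
        }
    }

    public static void main(String[] args) {
        LinesLabel head = new LinesLabel(List.of(new NumberedLine(1, "#if A")));
        LinesLabel body = new LinesLabel(List.of(new NumberedLine(2, "foo();")));
        System.out.println(head.merge(body));
    }
}
```

With the functions on the interface, generic code such as a tree differ could call `merge` and `split` without knowing the concrete label type, which is the property the first option obtains by passing the functions as arguments.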

@pmbittner
Member

I like the idea of adding merge and split functions to the label interface. Would that break anything? Do both functions have meaning for our current implementations?

ibbem added 12 commits October 3, 2025 16:46
When importing line graphs, we consider the order of the edge lines as
the definitive child ordering even if we export the child order using
`ChildOrderEdgeFormat`. However, the previous algorithm blindly exported
all `DiffType.NON` edges, disregarding the fact that other edges might
need to be exported first. As the order of edge lines acts as a diff
(before edges act as deletions and after edges as insertions), we need
to compute a line diff of the child orders at both times to know which
edges can be exported as existing at both times (unchanged).

Note that some edges will be exported as inserted and deleted instead of
unmodified, although both the child and the parent are the same, because
line diffs cannot encode moves. Consumers of line graphs should not
depend on which edges are exported as unchanged because the diffing
algorithm needs to apply heuristics to select these edges.
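
To illustrate the idea in this commit message, here is a minimal sketch that diffs the child order before and after the edit using a longest common subsequence. It is not the exporter's actual code: `ChildOrderDiffSketch`, `Kind`, `Edge`, and `diff` are hypothetical names, and children are represented by plain strings for brevity.

```java
import java.util.ArrayList;
import java.util.List;

class ChildOrderDiffSketch {
    /** Hypothetical edge kinds: existing only before, only after, or at both times. */
    enum Kind { BEFORE, AFTER, BOTH }

    record Edge(Kind kind, String child) {}

    /** Linearizes the two child orders into one edge list via an LCS-based line diff. */
    static List<Edge> diff(List<String> before, List<String> after) {
        int n = before.size(), m = after.size();

        // lcs[i][j] = length of the longest common subsequence of
        // before[i..] and after[j..].
        int[][] lcs = new int[n + 1][m + 1];
        for (int i = n - 1; i >= 0; i--) {
            for (int j = m - 1; j >= 0; j--) {
                lcs[i][j] = before.get(i).equals(after.get(j))
                    ? lcs[i + 1][j + 1] + 1
                    : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
            }
        }

        // Walk both orders; children on the LCS become unchanged edges,
        // all others become deletions (BEFORE) or insertions (AFTER).
        List<Edge> edges = new ArrayList<>();
        int i = 0, j = 0;
        while (i < n && j < m) {
            if (before.get(i).equals(after.get(j))) {
                edges.add(new Edge(Kind.BOTH, before.get(i)));
                i++;
                j++;
            } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
                edges.add(new Edge(Kind.BEFORE, before.get(i++)));
            } else {
                edges.add(new Edge(Kind.AFTER, after.get(j++)));
            }
        }
        while (i < n) edges.add(new Edge(Kind.BEFORE, before.get(i++)));
        while (j < m) edges.add(new Edge(Kind.AFTER, after.get(j++)));
        return edges;
    }

    public static void main(String[] args) {
        // Child "a" is moved from the front to the back.
        diff(List.of("a", "b", "c"), List.of("b", "c", "a")).forEach(System.out::println);
    }
}
```

In this example the moved child `a` is emitted twice, once as a before edge and once as an after edge, while `b` and `c` are emitted as unchanged. Which children end up on the common subsequence depends on the tie-breaking in the walk, which is why consumers should not rely on it.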

This pattern should be more understandable than the previous array-based
implementation. Furthermore, it is now much easier to add new
time-dependent fields.

This is not necessary because the state is (and must be) only used for
parsing one variation diff. Furthermore, if this were needed, it
would be buggy. In particular, this cleanup code is not executed in case
an exception is thrown.

This version will emit an error if `GraphFormat` is changed and doesn't
require an explicit exception for missing cases.
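
One way to get this behavior in Java is a switch expression over the enum without a `default` branch; whether the commit does exactly this is an assumption. In the sketch below, `GraphFormatSketch`, its constants, and `header` are made up for illustration and do not mirror the real `GraphFormat`.

```java
class ExhaustiveSwitchSketch {
    /** Hypothetical stand-in for the real enum; the constants are invented. */
    enum GraphFormatSketch { GRAPH, TREE }

    /** Maps each format to a (made-up) header string. */
    static String header(GraphFormatSketch format) {
        // A switch expression over an enum must cover every constant.
        // There is no default branch and no explicit exception for
        // missing cases: an uncovered constant is a compile error.
        return switch (format) {
            case GRAPH -> "graph";
            case TREE -> "tree";
        };
    }

    public static void main(String[] args) {
        System.out.println(header(GraphFormatSketch.TREE));
    }
}
```

If a third constant is later added to the enum, this method stops compiling until a matching case is written, so the omission is caught at build time rather than at run time.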

Previously, all tests would run with the default GumTree matcher instead
of the matcher stated in the test case file name.

As stated in my bachelor thesis, some matchers fail to return a valid
matching. Hence, some test cases need to be removed. In particular, the
theta matcher contains a bug that results in matchings that are
inconsistent (e.g., A->B but B->C). Furthermore, the
gumtree-partition-id matcher tests have been removed. This matcher has
already been removed in a newer GumTree version.

This makes it more discoverable and consistent with `DiffNode.split`.
@ibbem
Collaborator Author

ibbem commented Oct 7, 2025

I added such functions in ec82fd4. I'm not sure about their names though.

Should we write tests for the line numbers in the DiffLinesLabel? Currently, there are only tests for the line numbers in DiffNode.

@pmbittner
Member

pmbittner commented Oct 7, 2025

I clicked on the commit hash link in your comment and made some comments on that commit. Unfortunately, my comments do not appear here, so here are links to them:

@pmbittner
Member

There seems to be a type error making CI fail. After fixing that, feel free to merge.

@ibbem ibbem merged commit 64f242c into develop Oct 7, 2025
2 checks passed
@ibbem ibbem deleted the refactorings branch October 7, 2025 21:32
@pmbittner pmbittner mentioned this pull request Nov 3, 2025
2 tasks