Skip to content

fix(server): tx leak when stopping the graph server#2791

Merged
VGalaxies merged 4 commits intoapache:masterfrom
hahahahbenny:fix_tx_leak
Jun 16, 2025
Merged

fix(server): tx leak when stopping the graph server#2791
VGalaxies merged 4 commits intoapache:masterfrom
hahahahbenny:fix_tx_leak

Conversation

@hahahahbenny
Copy link
Contributor

Purpose of the PR

Main Changes

  • I added two lines of code under the restoreTasks
        this.graph.graphTransaction().commit();
        this.graph.closeTx();
  • added a concurrent set in the TinkerPopTransaction class to show which threads have not closed the transaction
  private final ConcurrentHashMap.KeySetView<String, Boolean> openedThreads =
                ConcurrentHashMap.newKeySet();
  • insert key when open tx and remove key when close tx
       private void setOpened() {
            // The backend tx may be reused, here just set a flag
            assert !this.opened.get();
            this.opened.set(true);
            this.transactions.get().openedTime(DateUtil.now().getTime());
            this.openedThreads.add(Thread.currentThread().getName());
            this.refs.incrementAndGet();
        }

        private void setClosed() {
            // Just set flag opened=false to reuse the backend tx
            if (this.opened.get()) {
                this.opened.set(false);
                this.openedThreads.remove(Thread.currentThread().getName());
                this.refs.decrementAndGet();
            }
        }
  • show threads have not close tx before E.checkState
        if(!this.tx.closed()){
            for (String key : this.tx.openedThreads) {
                LOG.warn("thread [{}] did not close transaction", key);
            }
        }
        E.checkState(this.tx.closed(),
                     "Ensure tx closed in all threads when closing graph '%s'",
                     this.name);

    }

Verifying these changes

  • Trivial rework / code cleanup without any test coverage. (No Need)
  • Already covered by existing tests, such as (please modify tests here).
  • Need tests and can be verified as follows:
  • add closetx under the restoreTasks , the shutdown log , did not warn Ensure tx closed in all threads when closing graph 'hugegraph'
2025-06-09 20:32:49 [gremlin-server-shutdown] [INFO] o.a.t.g.s.GremlinServer - Shutting down OpProcessor[]
2025-06-09 20:32:49 [gremlin-server-shutdown] [INFO] o.a.t.g.s.GremlinServer - Shutting down OpProcessor[cypher]
2025-06-09 20:32:49 [hugegraph-server-shutdown] [INFO] o.a.h.d.HugeGraphServer - HugeGraphServer stopping
2025-06-09 20:32:49 [gremlin-server-shutdown] [INFO] o.a.t.g.s.GremlinServer - Shutting down OpProcessor[session]
2025-06-09 20:32:49 [hugegraph-server-shutdown] [INFO] o.a.h.d.MemoryMonitor - Memory monitoring stopped.
2025-06-09 20:32:49 [gremlin-server-shutdown] [INFO] o.a.t.g.s.GremlinServer - Shutting down OpProcessor[traversal]
2025-06-09 20:32:49 [gremlin-server-shutdown] [INFO] o.a.t.g.s.GremlinServer - Shutting down thread pools.
2025-06-09 20:32:49 [gremlin-server-stop] [INFO] o.a.t.g.s.GremlinServer - Executing shutdown LifeCycleHook
2025-06-09 20:32:49 [gremlin-server-stop] [INFO] o.a.t.g.s.GremlinServer - Executed once at shutdown of Gremlin Server.
2025-06-09 20:32:51 [gremlin-server-stop] [INFO] o.a.h.StandardHugeGraph - Close graph standardhugegraph[hugegraph]
2025-06-09 20:32:51 [gremlin-server-stop] [INFO] o.a.h.t.ServerInfoManager - Remove server info: server-1
2025-06-09 20:32:51 [gremlin-server-stop] [INFO] o.a.t.g.s.GremlinServer - Closed Graph instance [hugegraph]
2025-06-09 20:32:51 [gremlin-server-stop] [INFO] o.a.t.g.s.GremlinServer - Gremlin Server - shutdown complete
2025-06-09 20:32:51 [hugegraph-server-shutdown] [INFO] o.a.h.d.HugeGraphServer - HugeGremlinServer stopped
2025-06-09 20:32:51 [hugegraph-server-shutdown] [INFO] o.a.h.d.HugeGraphServer - HugeRestServer stopped
2025-06-09 20:32:51 [hugegraph-server-shutdown] [INFO] o.a.h.HugeFactory - HugeFactory shutdown
2025-06-09 20:32:51 [hugegraph-server-shutdown] [INFO] o.a.h.d.HugeGraphServer - HugeGraph stopped
2025-06-09 20:32:51 [hugegraph-server-shutdown] [INFO] o.a.h.d.HugeGraphServer - HugeGraphServer stopped
  • log show which thread have not closed tx
025-06-09 20:30:43 [gremlin-server-stop] [INFO] o.a.h.t.ServerInfoManager - Remove server info: server-1
2025-06-09 20:30:43 [gremlin-server-stop] [WARN] o.a.h.StandardHugeGraph - thread [main] did not close transaction
2025-06-09 20:30:43 [gremlin-server-stop] [WARN] o.a.t.g.s.GremlinServer - Exception while closing Graph instance [hugegraph]
java.lang.IllegalStateException: Ensure tx closed in all threads when closing graph 'hugegraph'
	at com.google.common.base.Preconditions.checkState(Preconditions.java:534) ~[guava-31.0.1-android.jar:?]
	at org.apache.hugegraph.util.E.checkState(E.java:64) ~[classes/:?]
	at org.apache.hugegraph.StandardHugeGraph.close(StandardHugeGraph.java:1011) ~[classes/:?]
	at org.apache.tinkerpop.gremlin.server.GremlinServer.lambda$null$7(GremlinServer.java:307) ~[gremlin-server-3.5.1.jar:3.5.1]
	at java.util.concurrent.ConcurrentHashMap$KeySetView.forEach(ConcurrentHashMap.java:4696) ~[?:?]
	at org.apache.tinkerpop.gremlin.server.GremlinServer.lambda$stop$8(GremlinServer.java:304) ~[gremlin-server-3.5.1.jar:3.5.1]
	at java.lang.Thread.run(Thread.java:829) [?:?]

Does this PR potentially affect the following parts?

Documentation Status

  • Doc - TODO
  • Doc - Done
  • Doc - No Need

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. bug Something isn't working labels Jun 9, 2025
@imbajin imbajin requested a review from Copilot June 10, 2025 13:54
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR addresses a transaction leak by ensuring graph transactions are committed and closed after task restoration, tracking open threads, and warning if any threads leave transactions open at shutdown.

  • Commit and close the graph transaction in restoreTasks to prevent TX leaks.
  • Add a concurrent set to track threads that open transactions in TinkerPopTransaction.
  • Log any threads with unclosed transactions before enforcing closure in StandardHugeGraph.close().

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
StandardTaskScheduler.java Added calls to commit and close the transaction after restoring tasks.
StandardHugeGraph.java Introduced openedThreads tracking, warning logs for unclosed threads, and a stricter check on TX closure.
Comments suppressed due to low confidence (2)

hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/task/StandardTaskScheduler.java:158

  • Add unit or integration tests for restoreTasks to verify that transactions are correctly committed and closed, including failure scenarios during commit.
this.graph.graphTransaction().commit();

hugegraph-server/hugegraph-core/src/main/java/org/apache/hugegraph/StandardHugeGraph.java:1007

  • [nitpick] The loop variable key is ambiguous; consider renaming it to threadName to clarify that it represents a thread identifier.
for (String key : this.tx.openedThreads) {

@imbajin imbajin changed the title Fix tx leak fix(server): tx leak when stopping the graph server Jun 12, 2025
imbajin
imbajin previously approved these changes Jun 12, 2025
Copy link
Member

@imbajin imbajin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

THX

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Jun 12, 2025
@imbajin imbajin requested a review from VGalaxies June 16, 2025 06:31
@VGalaxies VGalaxies merged commit dce7995 into apache:master Jun 16, 2025
13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working lgtm This PR has been approved by a maintainer size:S This PR changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug] tx leak when stopping server

4 participants