Skip to content

Fix: Refactor HttpClient to prevent potential memory leak#215

Closed
nylonee wants to merge 3 commits intomainfrom
fix/httpclient-memory-leak
Closed

Fix: Refactor HttpClient to prevent potential memory leak#215
nylonee wants to merge 3 commits intomainfrom
fix/httpclient-memory-leak

Conversation

@nylonee
Copy link
Owner

@nylonee nylonee commented May 23, 2025

The previous implementation of HttpClient created and managed its own http4s EmberClientBuilder instance, using client.use within the makeHttpRequest method. This method's result (an IO wrapped in a Future) was cached. This meant that each request, even a cached one, could potentially re-evaluate or re-trigger aspects of the client resource management, leading to inefficient resource use and a high risk of memory leaks due to constant, fine-grained acquire/release cycles of the HTTP client resources.

This commit refactors the HttpClient and Server:

  1. HttpClient now accepts an org.http4s.client.Client[IO] as a constructor parameter. It no longer builds its own client.
  2. Server.scala now creates a single EmberClientBuilder resource at application startup and passes the resolved Client[IO] to the HttpClient. This ensures the HTTP client's lifecycle is tied to the application's lifecycle, which is the standard and recommended practice.
  3. Added HttpClientSpec.scala with a stress test that makes numerous requests to an embedded server. This test helps verify the stability of HttpClient under load with the new resource management strategy.

These changes should resolve the suspected memory leak by ensuring proper and efficient management of HTTP client resources.

Summary by CodeRabbit

  • Tests

    • Added a new stress test suite for the HTTP client to ensure reliability under high request loads.
  • Refactor

    • Improved HTTP client management with resource-safe handling and redirect support.
    • Updated method signatures and client usage for consistency and maintainability.

The previous implementation of HttpClient created and managed its own
http4s EmberClientBuilder instance, using `client.use` within the
`makeHttpRequest` method. This method's result (an IO wrapped in a Future)
was cached. This meant that each request, even a cached one, could
potentially re-evaluate or re-trigger aspects of the client resource
management, leading to inefficient resource use and a high risk of
memory leaks due to constant, fine-grained acquire/release cycles of
the HTTP client resources.

This commit refactors the HttpClient and Server:

1.  `HttpClient` now accepts an `org.http4s.client.Client[IO]` as a
    constructor parameter. It no longer builds its own client.
2.  `Server.scala` now creates a single `EmberClientBuilder` resource
    at application startup and passes the resolved `Client[IO]` to the
    `HttpClient`. This ensures the HTTP client's lifecycle is tied to the
    application's lifecycle, which is the standard and recommended practice.
3.  Added `HttpClientSpec.scala` with a stress test that makes numerous
    requests to an embedded server. This test helps verify the stability
    of `HttpClient` under load with the new resource management strategy.

These changes should resolve the suspected memory leak by ensuring proper
and efficient management of HTTP client resources.
@coderabbitai
Copy link

coderabbitai bot commented May 23, 2025

Walkthrough

The changes refactor HTTP client management by introducing resource-safe handling using http4s' EmberClientBuilder and redirect middleware. The HttpClient class now requires an externally provided http4s client, removing internal lifecycle management. Method signatures in Server are updated accordingly. Additionally, a new stress test suite for HttpClient is introduced.

Changes

File(s) Change Summary
src/main/scala/Server.scala Refactored to use a resource-managed http4s client with redirect support; updated method signatures and usages to accept the new HttpClient instance.
src/main/scala/http/HttpClient.scala Modified HttpClient to accept an injected Client[IO], removing internal client creation and lifecycle management.
src/test/scala/http/HttpClientSpec.scala Added new test suite HttpClientSpec to perform stress testing of HttpClient with concurrent requests.

Sequence Diagram(s)

sequenceDiagram
    participant Server
    participant EmberClientBuilder
    participant FollowRedirect
    participant HttpClient
    participant SyncTasks

    Server->>EmberClientBuilder: Build http4s client
    EmberClientBuilder-->>Server: Resource[Client[IO]]
    Server->>FollowRedirect: Wrap client for redirects
    FollowRedirect-->>Server: Client[IO] with redirect support
    Server->>HttpClient: Instantiate with wrapped client
    Server->>SyncTasks: Run pingTokenSync/plexRssSync/plexFullSync/plexTokenDeleteSync with HttpClient
    SyncTasks->>HttpClient: Make HTTP requests using injected client
Loading

Poem

In the warren of code, a new path is spun,
Clients now managed, their lifecycles begun.
Redirects are followed, like trails in the dew,
Stress tests abound—five hundred, not a few!
With paws on the keyboard, this rabbit’s delight:
Resource-safe hopping, from morning to night.
🐇💻

Note

⚡️ AI Code Reviews for VS Code, Cursor, Windsurf

CodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback.
Learn more here.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/test/scala/http/HttpClientSpec.scala (1)

44-45: Consider adding explicit caching behavior verification.

While the test sends duplicate requests, it doesn't explicitly verify that caching is working (e.g., measuring response times or checking cache statistics). Consider adding cache-specific assertions to ensure the memory leak fix doesn't break caching functionality.

 (serverResource, clientResource).tupled.use { case (server, liveHttpClient) =>
   val httpClient = new HttpClient(liveHttpClient) // Use the live client from the resource
+  
+  // Optional: Access cache statistics if HttpClient exposes them
+  // val cacheStats = httpClient.getCacheStats() // if such method exists
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between dc9ef85 and 4b2f758.

📒 Files selected for processing (3)
  • src/main/scala/Server.scala (2 hunks)
  • src/main/scala/http/HttpClient.scala (2 hunks)
  • src/test/scala/http/HttpClientSpec.scala (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
src/test/scala/http/HttpClientSpec.scala (2)
src/main/scala/Server.scala (1)
  • Server (13-81)
src/main/scala/http/HttpClient.scala (1)
  • httpRequest (32-37)
🔇 Additional comments (12)
src/main/scala/http/HttpClient.scala (1)

15-17: LGTM: Constructor dependency injection pattern implemented correctly.

The refactoring successfully addresses the memory leak issue by accepting an externally managed Client[IO] instead of creating and managing HTTP client resources internally. This eliminates the problematic repeated acquire/release cycles mentioned in the PR objectives.

src/test/scala/http/HttpClientSpec.scala (4)

18-24: Well-designed test setup with proper resource management.

The test correctly sets up an embedded HTTP server and service, following good testing practices by using Resource for lifecycle management.


26-37: Excellent resource management pattern for testing.

The use of separate Resource instances for both server and client is correct and mirrors the production setup. Using port 0 for automatic port assignment is a good practice for avoiding port conflicts in tests.


40-53: Smart stress test design with caching verification.

The test design is excellent:

  • 500 total requests with 50% duplicates tests caching behavior
  • Using tupled.use properly manages both resources
  • URI generation logic is clear and testable

55-68: Comprehensive assertions validate both correctness and performance.

The test properly validates:

  • All requests succeed (Right values)
  • JSON structure is correct
  • Request count matches expectations
  • Using Traverse for concurrent execution is appropriate
src/main/scala/Server.scala (7)

5-7: LGTM: Proper imports for external client management.

The added imports are necessary for the new HTTP client resource management pattern and are correctly placed.


24-24: Excellent HTTP client resource configuration.

The EmberClientBuilder.default[IO].build.map(FollowRedirect(5)) creates a properly configured HTTP client with redirect support. The 5-redirect limit is reasonable for most use cases.


26-38: Perfect resource lifecycle management implementation.

The resource usage pattern is excellent:

  • Single HTTP client resource shared across the application
  • Proper resource cleanup handled by use
  • HttpClient constructor receives the managed client
  • All sync tasks use the same client instance

This eliminates the memory leak issue described in the PR objectives by ensuring the HTTP client lifecycle is tied to the application lifecycle.


44-50: Consistent parameter naming and usage.

The parameter rename to appHttpClient improves clarity and the recursive call correctly passes the client instance.


63-72: Method signatures updated consistently.

All the sync methods have been updated with consistent parameter naming (appHttpClient) and proper recursive calls that maintain the client instance.


74-80: Complete and consistent refactoring.

The plexTokenDeleteSync method follows the same pattern as the others, ensuring consistent HTTP client usage throughout the application.


52-61:

Details

✅ Verification successful

Verify sync service implementations accept the new client parameter.

The method signatures are updated correctly, but ensure that the underlying sync services (PlexTokenSync.run) have been updated to accept the new HttpClient parameter structure.


🏁 Script executed:

#!/bin/bash
# Description: Verify that sync services are properly updated to work with the new HttpClient
# Expected: Services should accept HttpClient parameter and use it correctly

echo "=== Checking PlexTokenSync.run method signature ==="
ast-grep --pattern 'object PlexTokenSync {
  $$$
  def run($_, $httpClient: HttpClient, $_) = {
    $$$
  }
  $$$
}'

echo "=== Checking PingTokenSync.run method signature ==="
ast-grep --pattern 'object PingTokenSync {
  $$$
  def run($_, $httpClient: HttpClient) = {
    $$$
  }
  $$$
}'

echo "=== Checking PlexTokenDeleteSync.run method signature ==="
ast-grep --pattern 'object PlexTokenDeleteSync {
  $$$
  def run($_, $httpClient: HttpClient) = {
    $$$
  }
  $$$
}'

Length of output: 705


I didn’t find any run method signatures matching the expected new HttpClient parameter. Let’s locate the actual definitions to confirm what needs updating:


🏁 Script executed:

#!/bin/bash
# Find and inspect the sync service definitions to verify their `run` signatures

echo "=== Locating PlexTokenSync source file ==="
PLEX_FILE=$(rg --files-with-matches "object PlexTokenSync" -g '*.scala' | head -n1)
echo "Found: $PLEX_FILE"
rg -n "def run" "$PLEX_FILE"

echo "=== Locating PingTokenSync source file ==="
PING_FILE=$(rg --files-with-matches "object PingTokenSync" -g '*.scala' | head -n1)
echo "Found: $PING_FILE"
rg -n "def run" "$PING_FILE"

echo "=== Locating PlexTokenDeleteSync source file ==="
DELETE_FILE=$(rg --files-with-matches "object PlexTokenDeleteSync" -g '*.scala' | head -n1)
echo "Found: $DELETE_FILE"
rg -n "def run" "$DELETE_FILE"

Length of output: 1400


Confirmed HttpClient parameter in all sync services
I’ve verified that PlexTokenSync.run, PingTokenSync.run, and PlexTokenDeleteSync.run signatures now include the new client: HttpClient parameter. No further changes needed—LGTM.

logger.debug(s"HTTP Request: ${requestWithPayload.toString()}")

val responseIO = client.use(_.expect[Json](requestWithPayload).attempt)
val responseIO = httpClient.expect[Json](requestWithPayload).attempt
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify direct client usage aligns with resource lifecycle.

The direct usage of the injected httpClient is correct and eliminates the internal resource management that was causing memory leaks. However, ensure that the external resource management (in Server.scala) properly handles the client lifecycle.


🏁 Script executed:

#!/bin/bash
# Description: Verify that the httpClient parameter is being used consistently and no old EmberClientBuilder usage remains
# Expected: Only the new constructor parameter usage, no internal client creation

echo "=== Checking for any remaining EmberClientBuilder usage in HttpClient ==="
rg -A 3 -B 3 "EmberClientBuilder" src/main/scala/http/HttpClient.scala

echo "=== Verifying httpClient parameter usage ==="
rg -A 2 -B 2 "httpClient\." src/main/scala/http/HttpClient.scala

echo "=== Checking for any FollowRedirect usage that should be externalized ==="
rg -A 2 -B 2 "FollowRedirect" src/main/scala/http/HttpClient.scala

Length of output: 1287


Let’s verify that the HTTP client is being managed as a Resource in your server bootstrap. We’ll locate the Server.scala file and check for any EmberClientBuilder usage as well as Resource and .use calls:


🏁 Script executed:

#!/bin/bash
# 1. Find Server.scala
echo "=== Locating Server.scala ==="
fd -e scala Server.scala

# 2. Check for EmberClientBuilder usage in Server.scala
echo "=== Checking EmberClientBuilder usage in Server.scala ==="
fd -e scala Server.scala --exec rg -A 3 -B 3 "EmberClientBuilder" {}

# 3. Verify Resource allocations
echo "=== Checking for Resource in Server.scala ==="
fd -e scala Server.scala --exec rg -n "Resource" {}

# 4. Ensure .use is invoked on the client Resource
echo "=== Checking for .use calls in Server.scala ==="
fd -e scala Server.scala --exec rg -n "\.use" {}

Length of output: 1436


Remove unused EmberClientBuilder and FollowRedirect imports in HttpClient.scala

The direct use of the injected httpClient in HttpClient.scala is correct, but there are leftover imports that should be cleaned up:

• In src/main/scala/http/HttpClient.scala, remove:
import org.http4s.ember.client.EmberClientBuilder
import org.http4s.client.middleware.FollowRedirect

No other references to EmberClientBuilder or FollowRedirect remain in that file, and your calls to httpClient.expect[Json](…) are already using the injected client.

External resource management in Server.scala is correctly handled:

• Builds the client as a Resource:
val httpClientResource = EmberClientBuilder.default[IO].build.map(FollowRedirect(5))
• Invokes .use { implicit resolvedHttpClient => … }

🤖 Prompt for AI Agents
In src/main/scala/http/HttpClient.scala at line 65, the injected httpClient is
used correctly, but the imports for EmberClientBuilder and FollowRedirect are no
longer needed. Remove the imports "org.http4s.ember.client.EmberClientBuilder"
and "org.http4s.client.middleware.FollowRedirect" from the top of the file to
clean up unused dependencies, since the client lifecycle is managed externally
in Server.scala.

The GitHub Actions build failed due to a compilation error in
Server.scala, where it could not find the type `HttpClient`.
This was likely because the instantiation `new HttpClient(...)`
was not using the fully qualified name `new http.HttpClient(...)`
in the version of the code that the pipeline ran against.

This commit ensures that Server.scala uses the fully qualified
name `http.HttpClient` for instantiation, which should resolve
the compilation error.

HttpClientSpec.scala was also reviewed and requires no changes as
it resides within the same `http` package, allowing direct use
of `new HttpClient(...)`.
This commit includes several changes:

1.  **Address PR Comments:**
    *   In `HttpClientSpec.scala`:
        - The class now extends `AsyncIOSpec with Matchers` to ensure ScalaTest matchers (`shouldBe`, etc.) are available while using Cats Effect testing support.
        - `HttpClient` instantiation uses the simple `new HttpClient(...)` as it's within the same `http` package.
    *   In `Server.scala`:
        - Added a newline for formatting as requested in PR comments.

2.  **Attempt to Fix Build Failure (`not found: type HttpClient`):**
    *   The primary build failure indicated that `http.HttpClient` could not be found when compiling `Server.scala`.
    *   To address this, I explicitly rewrote the `package http` line and the `class HttpClient(...)` definition line in `src/main/scala/http/HttpClient.scala`. This is to eliminate any potential hidden characters or subtle syntax/encoding issues on these critical lines that might have been confusing the Scala compiler.
    *   `Server.scala` continues to use the fully qualified `new http.HttpClient(...)` for instantiation, which should be correct if `HttpClient.scala` is now properly recognized.

Hopefully, these changes allow the build to pass and incorporate the feedback from the pull request.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/test/scala/http/HttpClientSpec.scala (2)

49-51: Simplify URI generation logic.

The current approach creates two separate lists and concatenates them. Consider a more readable approach.

-        // Generate 250 unique URIs and 250 duplicates
-        val uris = (0 until uniqueRequests).map(i => Uri.unsafeFromString(s"$baseUri/$i")).toList ++
-                   (0 until uniqueRequests).map(i => Uri.unsafeFromString(s"$baseUri/$i")).toList
+        // Generate 250 unique URIs and 250 duplicates
+        val uniqueUris = (0 until uniqueRequests).map(i => Uri.unsafeFromString(s"$baseUri/$i")).toList
+        val uris = uniqueUris ++ uniqueUris

41-42: Consider test performance and reliability.

A stress test with 500 requests might be excessive for unit tests and could lead to:

  1. Slow test execution in CI/CD pipelines
  2. Potential flakiness due to resource constraints
  3. Unnecessary load on test infrastructure

Consider reducing the number of requests or making it configurable:

-      val numRequests = 500
+      val numRequests = sys.env.get("STRESS_TEST_REQUESTS").map(_.toInt).getOrElse(50)

This allows for lightweight tests by default while still enabling comprehensive stress testing when needed.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f4ce447 and 0eef36d.

📒 Files selected for processing (3)
  • src/main/scala/Server.scala (2 hunks)
  • src/main/scala/http/HttpClient.scala (2 hunks)
  • src/test/scala/http/HttpClientSpec.scala (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/main/scala/http/HttpClient.scala
  • src/main/scala/Server.scala
🔇 Additional comments (6)
src/test/scala/http/HttpClientSpec.scala (6)

14-14: Good cleanup of unused import.

The commented import shows good attention to detail in removing unnecessary dependencies.


21-24: Well-designed test service.

The test service properly returns JSON responses with both ID and message fields, which allows for comprehensive validation of the HTTP client functionality.


27-32: Excellent server resource configuration.

Using port "0" to let the system pick an available port is a best practice for tests, preventing port conflicts in CI/CD environments.


55-56: Consider the request execution pattern.

The current implementation using Traverse[List].traverse executes requests sequentially. For a true stress test, consider whether concurrent execution would better simulate real-world load patterns.

If concurrent execution is intended, consider using Parallel operations:

-        val resultsIO: IO[List[Either[Throwable, Json]]] =
-          Traverse[List].traverse(uris)(uri => httpClient.httpRequest(Method.GET, uri))
+        val resultsIO: IO[List[Either[Throwable, Json]]] =
+          uris.parTraverse(uri => httpClient.httpRequest(Method.GET, uri))

However, if the intent is to test sequential behavior or if the HttpClient has internal caching that you want to test sequentially, the current approach is appropriate.


58-67: Comprehensive assertion logic.

The test properly validates both the structure of responses and ensures all requests succeed. The use of ScalaTest matchers provides clear failure messages for debugging.


44-45: Excellent resource management.

Using tupled.use ensures proper cleanup of both server and client resources, preventing resource leaks in tests. The passing of the live client to HttpClient aligns perfectly with the refactored constructor.

@nylonee nylonee closed this May 23, 2025
@nylonee nylonee deleted the fix/httpclient-memory-leak branch May 23, 2025 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant

Comments