Contributor

Copilot AI commented Jun 7, 2025

This PR implements a simple cleanup feature to truncate node labels longer than 255 characters, addressing issues where extremely long labels (like "full assistant dream as label") were causing problems in the system.

Changes Made

Core Functionality

  • Added truncateLongLabels(userId) function in src/lib/jobs/cleanup-graph.ts
  • Function finds and truncates all node labels longer than 255 characters for a specific user
  • Uses proper SQL filtering with null checks: length(label) > 255 AND label IS NOT NULL
  • Returns count of updated labels for transparency
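
The selection rule described above can be illustrated with a small in-memory sketch. This is not the code from src/lib/jobs/cleanup-graph.ts: the `GraphNode` shape and the `truncateLongLabelsInMemory` name are hypothetical, and the real function runs a SQL update rather than looping over an array. Note also that a database `length()` may count characters differently from JavaScript's UTF-16 based `length`.

```typescript
// Hypothetical node shape; the real schema in the repository may differ.
interface GraphNode {
  id: string;
  userId: string;
  label: string | null;
}

const MAX_LABEL_LENGTH = 255;

// In-memory stand-in for the SQL filter described in the PR:
//   length(label) > 255 AND label IS NOT NULL, scoped to one user.
function truncateLongLabelsInMemory(
  nodes: GraphNode[],
  userId: string,
): { updatedCount: number } {
  let updatedCount = 0;
  for (const node of nodes) {
    if (node.userId !== userId) continue; // user-scoped
    if (node.label === null) continue; // null check
    if (node.label.length <= MAX_LABEL_LENGTH) continue; // only over-long labels
    node.label = node.label.slice(0, MAX_LABEL_LENGTH); // keep the first 255 characters
    updatedCount += 1;
  }
  return { updatedCount };
}
```

Labels of exactly 255 characters are left untouched, which matches the "longer than 255" wording of the description.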

API Endpoint

  • Created /truncate-labels REST endpoint in src/routes/truncate-labels.ts
  • Accepts userId parameter and returns update count
  • Follows existing API patterns with proper schema validation

Example Usage

// Programmatic usage
const result = await truncateLongLabels("user_123");
console.log(`Truncated ${result.updatedCount} labels`);

// API usage
POST /truncate-labels
{
  "userId": "user_123"
}
// Returns: { "message": "Successfully truncated 5 labels...", "updatedCount": 5 }

Safety Features

  • User-scoped: Only affects nodes belonging to the specified user
  • Selective: Only updates labels that actually exceed 255 characters
  • Prefix-preserving: Keeps the first 255 characters of each original label; only the excess is discarded
  • Logging: Comprehensive console output for monitoring and debugging

Testing

  • All existing tests continue to pass
  • Linting passes without issues
  • No breaking changes to existing functionality

Fixes #23.


Co-authored-by: iamarcel <1212234+iamarcel@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Clean up long ass labels" to "Add label truncation functionality to prevent excessively long node labels" Jun 7, 2025
Copilot AI requested a review from marcelsamyn June 7, 2025 08:17
@marcelsamyn
Owner

@copilot I had another idea, can you add it? During the cleanup phase, let's search for nodes that have labels but no embedding and generate the missing embeddings. Let's make sure we have a central way of generating the strings to be embedded that we reuse everywhere; it should exist somewhere already.

Co-authored-by: iamarcel <1212234+iamarcel@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 7, 2025

@copilot i had another idea now, can you Add it? During the cleanup phase, let's search for nodes that have labels but no embedding and generate those missing embedding. Let's make sure we have a cent...

Added the missing embeddings generation functionality! The cleanup now includes a generateMissingNodeEmbeddings() function that finds nodes with labels but no embeddings and generates them using the existing central embedding logic. Both operations (label truncation and missing embeddings generation) run in parallel during the cleanup phase. The API endpoint now returns counts for both operations. Commit: 6709f2f
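
As a rough illustration of the idea in this comment, the sketch below pairs one central text-building helper with a function that fills in missing embeddings. Everything here is an assumption for illustration: `NodeRecord`, `buildEmbeddingText`, `Embedder`, and `generateMissingEmbeddingsInMemory` are hypothetical names, the text format is invented, and the repository's actual central embedding logic and `generateMissingNodeEmbeddings()` implementation are not shown in this conversation.

```typescript
// Hypothetical node shape; the real schema may differ.
interface NodeRecord {
  id: string;
  label: string | null;
  description?: string | null;
  embedding: number[] | null;
}

// Hypothetical central helper: the single place that decides
// which string is embedded for a node, so every caller agrees.
function buildEmbeddingText(
  node: Pick<NodeRecord, "label" | "description">,
): string {
  const parts = [node.label, node.description].filter(
    (part): part is string =>
      typeof part === "string" && part.trim().length > 0,
  );
  return parts.map((part) => part.trim()).join(": ");
}

// The embedding provider is injected so the sketch stays self-contained.
type Embedder = (text: string) => Promise<number[]>;

// Finds nodes that have a label but no embedding and fills them in.
async function generateMissingEmbeddingsInMemory(
  nodes: NodeRecord[],
  embed: Embedder,
): Promise<{ generatedCount: number }> {
  const missing = nodes.filter(
    (node) =>
      node.label !== null &&
      node.label.trim() !== "" &&
      node.embedding === null,
  );
  // Embedding requests are independent of each other, so they run concurrently.
  await Promise.all(
    missing.map(async (node) => {
      node.embedding = await embed(buildEmbeddingText(node));
    }),
  );
  return { generatedCount: missing.length };
}
```

Injecting the embedder keeps the selection logic testable without a network call; a real job would likely also batch requests to respect provider rate limits.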

@marcelsamyn
Owner

@copilot Instead of creating a new endpoint, just add it to the cleanup job.

…ate endpoint

Co-authored-by: iamarcel <1212234+iamarcel@users.noreply.github.com>
Contributor Author

Copilot AI commented Jun 8, 2025

@copilot Instead of creating a new endpoint, just add it to the cleanup job.

Integrated the functionality into the cleanup job as requested. Removed the separate /truncate-labels endpoint and now both label truncation and missing embeddings generation run automatically as part of the cleanup-graph job before the iterative graph cleanup. Commit: 2f2bb60
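
The ordering described in this comment could look roughly like the sketch below. It assumes the two pre-steps still run in parallel, as stated in the earlier comment, and that each resolves to a plain count; `CleanupSteps` and `runCleanupGraphJob` are hypothetical names, and the actual cleanup-graph job in commit 2f2bb60 may be structured differently.

```typescript
// Hypothetical step signatures: each pre-step resolves to the number of rows it changed.
type CleanupStep = (userId: string) => Promise<number>;

interface CleanupSteps {
  truncateLongLabels: CleanupStep;
  generateMissingNodeEmbeddings: CleanupStep;
  iterativeCleanup: (userId: string) => Promise<void>;
}

async function runCleanupGraphJob(
  userId: string,
  steps: CleanupSteps,
): Promise<{ truncatedCount: number; embeddedCount: number }> {
  // Pre-steps first; assumed here to be independent and therefore safe to run concurrently.
  const [truncatedCount, embeddedCount] = await Promise.all([
    steps.truncateLongLabels(userId),
    steps.generateMissingNodeEmbeddings(userId),
  ]);
  // The iterative graph cleanup starts only after both pre-steps have finished.
  await steps.iterativeCleanup(userId);
  return { truncatedCount, embeddedCount };
}
```

One caveat with running the pre-steps concurrently: if the embedded text is derived from the label, an embedding could be generated from a label that has not been truncated yet, so running them sequentially may be the safer choice.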

@marcelsamyn marcelsamyn marked this pull request as ready for review June 8, 2025 10:58
@marcelsamyn marcelsamyn merged commit 481ff8f into main Jun 8, 2025
1 check failed
@marcelsamyn marcelsamyn deleted the copilot/fix-23 branch June 8, 2025 11:00