feat: update CustomIngestionPipeline to accept transformations parameter #33
Updated the version in setup.py to 1.4.8. Enhanced the CustomIngestionPipeline class to allow users to specify a list of transformations during the ingestion process, improving flexibility in document processing.
Walkthrough
Version bumped to 1.4.8.
Sequence Diagram(s)
sequenceDiagram
    autonumber
    actor Client
    participant CIP as CustomIngestionPipeline
    participant IP as IngestionPipeline
    participant Split as SemanticSplitterNodeParser
    participant Embed as EmbeddingComponent
    Client->>CIP: run_pipeline(docs, transformations?)
    alt transformations provided
        CIP->>CIP: use provided transformations list
    else no transformations
        CIP->>Split: instantiate default splitter
        CIP->>Embed: instantiate default embedding
        Note right of CIP: build default transformations_pipeline
    end
    CIP->>IP: construct IngestionPipeline(with transformations_pipeline)
    IP->>IP: run pipeline on docs
    IP-->>CIP: return nodes
    CIP-->>Client: return nodes
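The branch in the diagram can be sketched in plain Python. This is a minimal illustration, not the code from the PR: the `Transformation` alias, `default_splitter`, and `default_embedder` are hypothetical stand-ins for `TransformComponent`, `SemanticSplitterNodeParser`, and the embedding component, and the `is None` check is an assumption about how "no transformations" is detected.

```python
from typing import Callable, List, Optional

# Assumption: a transformation is modeled as a callable over a list of nodes,
# so the sketch runs without the real TransformComponent type.
Transformation = Callable[[List[str]], List[str]]


def default_splitter(nodes: List[str]) -> List[str]:
    """Hypothetical stand-in for the default splitter: split on periods."""
    return [part.strip() for node in nodes for part in node.split(".") if part.strip()]


def default_embedder(nodes: List[str]) -> List[str]:
    """Hypothetical stand-in for the default embedding step: tag each node."""
    return [f"embedded:{node}" for node in nodes]


def run_pipeline(
    docs: List[str],
    transformations: Optional[List[Transformation]] = None,
) -> List[str]:
    """Apply the caller's transformations, or the defaults when none are given."""
    if transformations is None:
        # Backward-compatible path: build the default transformations pipeline.
        transformations_pipeline = [default_splitter, default_embedder]
    else:
        # Caller-supplied path: use the provided list unchanged.
        transformations_pipeline = transformations

    nodes = list(docs)
    for transform in transformations_pipeline:
        nodes = transform(nodes)
    return nodes
```

With this shape, existing callers that pass only `docs` keep the default behavior, while a caller can pass its own list to replace the defaults. Under the `is None` assumption, an empty list means "apply nothing" and does not fall back to the defaults.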
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Pre-merge checks and finishing touches
✅ Passed checks (3 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
setup.py (1 hunks)
tc_hivemind_backend/ingest_qdrant.py (4 hunks)
🧰 Additional context used
🪛 Ruff (0.14.0)
tc_hivemind_backend/ingest_qdrant.py
92-92: PEP 484 prohibits implicit Optional. Convert to T | None (RUF013)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: ci / test / Test
🔇 Additional comments (4)
setup.py (1)
9-9: LGTM! Version bump to 1.4.8 appropriately reflects the feature addition in this PR.
tc_hivemind_backend/ingest_qdrant.py (3)
11-11: LGTM! Import of TransformComponent is necessary for the type annotation in the updated run_pipeline method signature.
102-103: LGTM! Documentation clearly describes the new transformations parameter, improving the method's usability.
127-138: LGTM! The implementation correctly provides backward compatibility by using default transformations when none are specified, while allowing users to supply custom transformations for flexible document processing.