fix: Make content_all search work with words containing dashes#1162
Open
You can access the deployment of this PR at https://renku-ci-ds-1162.dev.renku.ch
7f2f13b to
2cab73b
Compare
olevski
requested changes
Jan 9, 2026
Pull Request Test Coverage Report for Build 20951512889
💛 - Coveralls
df44217 to
75112e0
Compare
olevski
previously approved these changes
Jan 14, 2026
57d7a30 to
5346e09
Compare
WIP: the re-indexing after migration on start doesn't work anymore
Using `app.ctx` doesn't work anymore, and neither did any other in-memory variant I tried.
It is not needed and I couldn't find it in the documentation. It is now consistent with the other calls.
- on reprovision, first do migration, then delete+insert so the correct schema is ensured
- remove the fuzzy operator which doesn't play well with multiple tokens in a query
- don't split words on numbers, retaining those unique words like `a56bd3e` used in our tests
5346e09 to
16de543
Compare
Changes the configuration of the text fields `content_all`, `name` and `description`. It uses a simple `whitespace` tokenizer and does the more complex splitting via the `wordDelimiterGraphFilter` (docs).

This adds more variants of concatenated and split phrases to the index (for example, it splits camelCase and hyphenated words but also includes the concatenated and original versions).
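To illustrate which variants end up in the index, here is a rough Python approximation of the splitting described above. It is only a sketch: it is not how Solr implements the filter, the function name `word_variants` is made up for this example, and the exact filter options used in the schema are not reproduced here.

```python
import re


def word_variants(token: str) -> list[str]:
    """Approximate the index-time variants for one whitespace-delimited token.

    Illustration only: mimics splitting on delimiters such as hyphens and on
    lower-to-upper case changes, while keeping the original and the
    concatenated form. Digits are not a split point, so `a56bd3e` stays whole.
    """
    # Split on runs of non-alphanumeric characters, or at a camelCase boundary.
    parts = [p for p in re.split(r"[^0-9A-Za-z]+|(?<=[a-z])(?=[A-Z])", token) if p]
    # Original token, then the parts, then the parts glued back together.
    variants = [token, *parts, "".join(parts)]
    # Drop duplicates while preserving order.
    return list(dict.fromkeys(variants))


if __name__ == "__main__":
    for word in ("my-project", "camelCase", "a56bd3e"):
        print(word, "->", word_variants(word))
```

With this approximation, a hyphenated word such as `my-project` can be found by `my`, `project`, `myproject` or the original spelling, which is the behaviour the PR title is after.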
While testing I noticed that the reindexing after a migration that requires it no longer happened. I tried a lot of things but couldn't pass data from the `main_process_start` handler to the `after_server_start` handler, and only the latter can submit tasks. This change now writes a temp file to communicate across these hooks.

/deploy extra-values=enableInternalGitlab=false
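The temp-file handoff between the two startup hooks could look roughly like the sketch below. The marker path and the function names `request_reindex` and `consume_reindex_request` are assumptions made for this example; the actual file name and wiring in the PR may differ.

```python
import tempfile
from pathlib import Path

# Hypothetical marker location; the real path used by the PR is not shown here.
MARKER = Path(tempfile.gettempdir()) / "reindex-requested.marker"


def request_reindex(marker: Path = MARKER) -> None:
    """Record that a reindex is needed.

    Intended to be called from the hook that runs the migration
    (for example the `main_process_start` handler).
    """
    marker.touch()


def consume_reindex_request(marker: Path = MARKER) -> bool:
    """Return True if a reindex was requested, and clear the request.

    Intended to be called from the hook that can submit tasks
    (for example the `after_server_start` handler).
    """
    try:
        # Deleting the file both checks for and clears the request.
        marker.unlink()
    except FileNotFoundError:
        return False
    return True
```

Because the check and the cleanup are a single `unlink` call, only one caller sees `True` for a given request, which matters if several worker processes run the second hook.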