Description
I have a solr cloud setup with 16 shards.
I've set up the request sanitizer to limit rows to 1000 with the following in solrconfig.xml:
<str name="sanitize">rows=>1000:1000</str>
This works as expected and limits rows to 1000. However, the rows limit also appears to affect the start request parameter.
When I query this URL I see a valid response containing documents:
http://solr-901:8983/solr/journals_dev/select?fl=id&fq=doc_type:full&q=*:*&rows=1000&start=15000&wt=json
However, when I query this URL I see a response containing no documents:
http://solr-901:8983/solr/journals_dev/select?fl=id&fq=doc_type:full&q=*:*&rows=1000&start=16000&wt=json
Notice that the only difference is the start value.
I have determined that this behavior depends on the number of shards multiplied by the rows limit set in the sanitizer. In my case, 16 shards x 1000 row limit = 16,000, so I get no results when I query with start >= 16,000 (start=15000 still returns documents, start=16000 does not).
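To make the arithmetic concrete, here is a minimal Python sketch of the cutoff I am observing. It assumes that each shard contributes at most the sanitized row limit to the merged result, which matches what I see but is not something I have confirmed in the Solr source. The function names are hypothetical and only used for illustration.

```python
def max_reachable_docs(num_shards: int, row_limit: int) -> int:
    """Upper bound on merged results, ASSUMING each shard returns
    at most row_limit documents (my observation, not confirmed)."""
    return num_shards * row_limit


def page_is_empty(start: int, num_shards: int, row_limit: int) -> bool:
    """True if a page beginning at `start` falls past the observed cutoff."""
    return start >= max_reachable_docs(num_shards, row_limit)


# My setup: 16 shards, rows sanitized to 1000.
print(max_reachable_docs(16, 1000))
print(page_is_empty(15000, 16, 1000))  # start=15000 still returns documents
print(page_is_empty(16000, 16, 1000))  # start=16000 returns no documents
```

With these numbers the cutoff lands at 16,000, which lines up with the two URLs above.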
Is this expected behavior, and is there any way I can work around it? We use paging on our website and this will affect any searches that go beyond result 16,000. We still need to limit rows, though.
Thanks!