2 changes: 1 addition & 1 deletion README.md
@@ -138,7 +138,7 @@ If you processed the corpora yourself, please verfify that you have the right pa

### Baselines
* [Indri search interface](http://boston.lti.cs.cmu.edu/Services/treccast19) - We provide an Indri index of the CAsT collection. See the [help page](http://boston.lti.cs.cmu.edu/Services/treccast19/help-db.html) for details on indexing parameters and statistics. It includes a standard [batch search](http://boston.lti.cs.cmu.edu/Services/treccast19_batch/) API limited to 50 queries per batch.)
-* Baseline retrieval - We provide the queries and run files in [trec eval](https://github.com/usnistgov/trec_eval) format: [train queries](https://github.com/daltonj/treccastweb/blob/master/2019/data/training/train_topics.query), [train run file](http://boston.lti.cs.cmu.edu/vaibhav2/cast/train_topics.teIn), [test queries](https://github.com/daltonj/treccastweb/blob/master/2019/data/test_topics.query), [test run file](http://boston.lti.cs.cmu.edu/vaibhav2/cast/test_topics.teIn) - We provide an Indri baseline run with Query Likelihood run, including both the topics and run files. Queries are generated by running AllenNLP coreference resolution to perform rewriting and stopwords are removed using the Indri stopword list.
+* Baseline retrieval - We provide the queries and run files in [trec eval](https://github.com/usnistgov/trec_eval) format: [train queries](https://github.com/daltonj/treccastweb/blob/master/2019/data/training/train_topics.query), [train run file](https://huggingface.co/datasets/macavaney/trec-cast-files/resolve/main/train_topics.teIn), [test queries](https://github.com/daltonj/treccastweb/blob/master/2019/data/test_topics.query), [test run file](https://huggingface.co/datasets/macavaney/trec-cast-files/resolve/main/test_topics.teIn) - We provide an Indri baseline run with Query Likelihood run, including both the topics and run files. Queries are generated by running AllenNLP coreference resolution to perform rewriting and stopwords are removed using the Indri stopword list.

### Collection
* The corpus is a combination of three standard TREC collections: MARCO Ranking passages, Wikipedia (TREC CAR), and News (Washington Post)