This tool scrapes a page and dumps its contents into a MySQL table called tsearch.
The file create_tsearch.sql is used to create the table.
In MySQL, run
\. create_tsearch.sql
to create the table.
Run
./scraper url
where url is the address of the page you want to add to the table.
This inserts a line in the insert_tsearch.sql file.
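As a rough illustration of what one such line might look like, here is a minimal Python sketch. It assumes the table has url and body columns (the only two named in the query example in this README) and that the scraper escapes quotes itself; the function names are hypothetical and the real scraper may build its statements differently.

```python
def escape_sql_string(value: str) -> str:
    """Escape backslashes and single quotes for a MySQL string literal.

    Sketch only: assumes MySQL's default backslash-escape mode.
    """
    return value.replace("\\", "\\\\").replace("'", "\\'")


def build_insert_line(url: str, body: str) -> str:
    """Return one INSERT statement for tsearch (assumed columns: url, body)."""
    # Collapse all whitespace runs so each statement stays on a single line.
    flat_body = " ".join(body.split())
    return "INSERT INTO tsearch (url, body) VALUES ('{}', '{}');".format(
        escape_sql_string(url), escape_sql_string(flat_body)
    )
```

Keeping one statement per line means the file can be inspected or trimmed with ordinary text tools before it is loaded.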
Once the insert_tsearch.sql file is fully populated, run
\. insert_tsearch.sql
in MySQL to insert all rows into the tsearch table.
To scrape several pages at once, run
./file_scraper filename
where filename is a text file listing the URLs you want to scrape, one URL per line.
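The expected file format can be sketched in Python as follows. This is not the file_scraper source; read_url_list is a hypothetical helper, and skipping blank lines is an assumption about how a tolerant reader might behave.

```python
from pathlib import Path


def read_url_list(path):
    """Return the URLs in a text file, one per line, ignoring blank lines."""
    urls = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        url = line.strip()  # drop surrounding whitespace and stray \r
        if url:
            urls.append(url)
    return urls
```

Each returned URL would then be processed the same way as a single ./scraper url call.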
The tsearch table is set up with a full-text index, so instead of using the LIKE keyword we can use MATCH and AGAINST in MySQL queries.
For example,
SELECT url
FROM tsearch
WHERE
MATCH(body)
AGAINST('query goes here');
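If the query is issued from a program rather than the MySQL prompt, the search terms are better passed as parameters than pasted into the SQL text. The sketch below assumes a Python DB-API driver that uses %s placeholders (PyMySQL and MySQL Connector/Python both do). The build_search_query helper and the score alias are illustrative names, not part of this tool.

```python
def build_search_query(terms: str):
    """Return (sql, params) for a full-text search on tsearch.body.

    The same terms are bound twice: once to expose the relevance
    score and once to filter the rows.
    """
    sql = (
        "SELECT url, MATCH(body) AGAINST(%s) AS score "
        "FROM tsearch "
        "WHERE MATCH(body) AGAINST(%s) "
        "ORDER BY score DESC"
    )
    return sql, (terms, terms)
```

With an open cursor, this would be run as cursor.execute(*build_search_query("query goes here")).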