Name: Kwok Ying Kwan Kasey
- Download Apache Tomcat 10.1.8 (`apache-tomcat-10.1.8`) from https://tomcat.apache.org/download-10.cgi and store it in the root directory.
- Create the directory `/COMP4321WebApp/src/main/webapp/WEB-INF/` and add a `lib` and a `classes` directory to the `WEB-INF` directory.
  - Download `sqlite-jdbc-3.41.0.0.jar` from https://github.com/xerial/sqlite-jdbc/releases/tag/3.41.0.0 and store it in `lib`.
  - Download `jsoup-1.15.4.jar` from https://jsoup.org/download and store it in `lib`.
  - Download the package `htmlparser1_6_20060610.zip` from https://sourceforge.net/projects/htmlparser/files/htmlparser/1.6/htmlparser1_6_20060610.zip/download?use_mirror=altushost-swe&modtime=1149940066&big_mirror=0, extract it, and store `htmlparser.jar` in `lib`.
- Set up the environment variables:
  - `CATALINA_HOME = "your tomcat path in this root directory"`
  - `JAVA_HOME = "Your jdk path (OpenJDK 11 is used in this project)"`
- Download "Eclipse IDE for Enterprise Java and Web Developers" from https://www.eclipse.org/downloads/packages/release/2022-12/r if you have not already done so, then open the application.
- Create a new workspace for this project.
- Click File > Import, then choose General > Existing Projects into Workspace and click Next
- Click Browse and select the directory where `/COMP4321WebApp` is located. You should see the project "WebApp" appear in the Projects box. Select "WebApp" and click Finish.
- In the Servers view, click the link "No servers are available. Click this link to create a new server...".
- Choose Apache > Tomcat v10.1 Server, then click Next.
- Click Browse and select the Tomcat directory you downloaded into the root directory, then click Finish. A new Tomcat server should now appear in the Servers view.
- Locate `Spider.java` at `/COMP4321WebApp/src/main/java/com/comp4321`. Right click > Run As > Java Application. It will start crawling the pages. The database file should be located at `/COMP4321WebApp/comp4321.db`.
  - If it is not, right click `Spider.java` > Run As > Run Configurations. Choose Java Application > Spider, then switch to the Arguments tab and change the working directory to `${workspace_loc:WebApp}`. This setting was used on macOS; it has not been verified on Linux or Windows. Click Apply and Run.
- Locate the `search.jsp` file in `/COMP4321WebApp/src/main/webapp`. Right click > Run As > Run On Server. Then choose Tomcat and check "Always use this server when running this project".
- Right click `search.jsp` > Run As > Run Configurations. Choose Apache Tomcat > the Tomcat server. Switch to the Arguments tab and change the working directory by selecting "Other" and filling in `${workspace_loc:WebApp}`. This setting was used on macOS; it has not been verified on Linux or Windows. Click Apply and Run.
- The website can be accessed at http://localhost:8080/WebApp/search.jsp
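
Both run configurations above change the working directory, which suggests that the database file is looked up relative to it. The sketch below is a hypothetical stand-alone check, not part of the project: it assumes `comp4321.db` is referenced by a relative path (the steps above do not confirm this) and prints where that name resolves from the current working directory. The names `WorkingDirCheck` and `resolveDb` are illustrative only.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of the project: shows how a relative
// database path depends on the working directory of the JVM.
class WorkingDirCheck {

    // Resolve a file name against a given working directory.
    static Path resolveDb(Path workingDir, String fileName) {
        return workingDir.resolve(fileName).toAbsolutePath().normalize();
    }

    public static void main(String[] args) {
        // "user.dir" holds the directory the JVM was started in.
        Path workingDir = Paths.get(System.getProperty("user.dir"));
        // Assumption: the database is opened by the relative name "comp4321.db".
        Path db = resolveDb(workingDir, "comp4321.db");
        System.out.println("Working directory: " + workingDir);
        System.out.println("Expected database: " + db);
        System.out.println("Exists: " + Files.exists(db));
    }
}
```

If this is run with the same working directory setting as `Spider`, the reported path should match the location where the database file is expected.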
Enter a query in the input box, then click the magnifying glass button to search. The results will be displayed.
Click the "Get Similar Pages" button on a result to submit a new query, formed by adding that result's 5 most frequent keywords to the original query that produced the result (not the text currently in the input box, if it has been changed since). The similar pages are then displayed.
- Click "View All Stemmed Words" to see all the stemmed words.
  - Click "Add" to add a keyword to the query.
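
The "Get Similar Pages" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class and method names are hypothetical, the keyword frequencies are assumed to be available as a map from stemmed keyword to count, ties are broken alphabetically, and the expanded query is assumed to be a space-separated string.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of query expansion for "Get Similar Pages".
class SimilarPagesSketch {

    // Return up to n keywords with the highest frequency.
    // Assumption: ties are broken alphabetically to keep the result deterministic.
    static List<String> topKeywords(Map<String, Integer> freq, int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(freq.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < entries.size() && i < n; i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    // Append the 5 most frequent keywords of a result page to the
    // original query that produced that result.
    static String expandQuery(String originalQuery, Map<String, Integer> freq) {
        List<String> parts = new ArrayList<>();
        parts.add(originalQuery.trim());
        parts.addAll(topKeywords(freq, 5));
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        // Made-up frequencies for illustration only.
        Map<String, Integer> freq = new HashMap<>();
        freq.put("movi", 12);
        freq.put("film", 9);
        freq.put("actor", 7);
        freq.put("award", 4);
        freq.put("year", 3);
        freq.put("page", 1);
        System.out.println(expandQuery("hong kong", freq));
    }
}
```

Expanding the original query rather than the current input box contents keeps the new search tied to the result the button belongs to, as described above.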
.
├── COMP4321WebApp
│   ├── comp4321.db
│   ├── src
│   │   └── main
│   │       ├── java
│   │       │   └── com
│   │       │       └── comp4321
│   │       │           ├── Database.java
│   │       │           ├── Indexer.java
│   │       │           ├── Page.java
│   │       │           ├── Parser.java
│   │       │           ├── Porter.java
│   │       │           ├── SearchEngine.java
│   │       │           ├── Spider.java
│   │       │           ├── URLQueue.java
│   │       │           └── Utility.java
│   │       └── webapp
│   │           ├── META-INF
│   │           │   └── MANIFEST.MF
│   │           ├── WEB-INF
│   │           │   ├── classes
│   │           │   │   └── com
│   │           │   │       └── comp4321
│   │           │   │           ├── Database.class
│   │           │   │           ├── Indexer.class
│   │           │   │           ├── NewString.class
│   │           │   │           ├── Page.class
│   │           │   │           ├── Parser.class
│   │           │   │           ├── Porter.class
│   │           │   │           ├── SearchEngine.class
│   │           │   │           ├── Spider.class
│   │           │   │           ├── URLQueue.class
│   │           │   │           └── Utility.class
│   │           │   └── lib
│   │           │       ├── htmlparser.jar
│   │           │       ├── jsoup-1.15.4.jar
│   │           │       └── sqlite-jdbc-3.41.0.0.jar
│   │           └── search.jsp
│   └── stopwords.txt
├── apache-tomcat-10.1.8 (A directory)
├── readme.md
└── report.pdf
- Report: `report.pdf`
- Readme file for instructions: `readme.md`
- Source folder: `COMP4321WebApp`