- Annotation Data Embedding Generation - Converts annotation data into vectors using the Sentence Transformer model (distiluse-base-multilingual-cased-v2, from Hugging Face).
- Search and Save Image Results - Searches for the top 3 annotations most similar to the input query and saves the corresponding images in the result/ directory.
- Similarity-Based Search - The search is based on semantic similarity, so relevant results are returned even if the query does not exactly match any annotation.
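To illustrate the similarity-based search, here is a minimal sketch that ranks stored vectors by cosine similarity to a query vector and keeps the top 3. It uses plain Python and toy 3-dimensional vectors purely for illustration; the project lists faiss and sentence-transformers as dependencies, so its real search code likely differs. The helper names cosine_similarity and top_k_similar are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k_similar(query_vec, embeddings, k=3):
    """Return (index, score) pairs for the k embeddings closest to query_vec."""
    scored = [(i, cosine_similarity(query_vec, emb)) for i, emb in enumerate(embeddings)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy stand-ins for annotation embeddings (real ones would come from the model).
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
toy_query = [1.0, 0.05, 0.0]

for index, score in top_k_similar(toy_query, toy_embeddings, k=3):
    print(index, round(score, 3))
```

Because the ranking depends on vector direction rather than exact wording, a query can match an annotation that shares its meaning but not its words.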
The following are required to run this project:
- Python 3.7 or higher
- numpy
- faiss
- sentence-transformers
- scikit-learn
You can install the necessary packages with the following command:
pip install -r requirements.txt
- Prepare Data and Generate Embeddings
  Load the annotation data and generate sentence embeddings. This step creates the embeddings.npy file.
  python data_preparation.py
- Run the Main Program
  The main program takes a query from the terminal, performs the search, and outputs the results.
  python main.py
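As a rough illustration of the embedding step, the sketch below pulls the annotation text out of a list of records, encodes it with the Sentence Transformer model, and saves the vectors to embeddings.npy. The input file name annotations.json, the record layout (inferred from the fields in the example output), and all function names are assumptions; the actual data_preparation.py may be organized differently.

```python
import json

MODEL_NAME = "distiluse-base-multilingual-cased-v2"


def extract_annotation_texts(records):
    """Return the 'annotation' string of every record, in order."""
    return [record["annotation"] for record in records]


def load_records(path):
    """Read a JSON file containing a list of annotation records."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def generate_embeddings(texts, output_path="embeddings.npy"):
    """Encode texts with the Sentence Transformer model and save them as .npy."""
    # Imported lazily so the helpers above work without these packages installed.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    embeddings = model.encode(texts, convert_to_numpy=True)
    np.save(output_path, embeddings)
    return embeddings


def main():
    # "annotations.json" is a hypothetical file name used for illustration.
    records = load_records("annotations.json")
    generate_embeddings(extract_annotation_texts(records))
```

Calling main() would download the model on first use, so it is left uncalled here.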
Example:
Enter your search query: 고양이상 눈매, 날렵한 턱선, 오똑한 코
For this query (roughly "cat-like eyes, sharp jawline, prominent nose"), the program searches for the 3 most similar annotations, copies the corresponding images to the result/ directory, and prints the full details of the matched annotations to the terminal.
Search results are saved in the result/ directory, which contains the image files for the matched annotations. The detailed information for each match is printed to the terminal.
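The saving step described above could look roughly like the following sketch, which copies each matched image into a result directory and prints the record as JSON. The function name save_results is hypothetical, and rewriting the image field to point at the copied file is an assumption based on the result/ path shown in the example output. The demo at the end uses a temporary directory and an empty placeholder file instead of real images.

```python
import json
import shutil
import tempfile
from pathlib import Path


def save_results(matches, result_dir="result"):
    """Copy each matched record's image into result_dir and print the record."""
    out_dir = Path(result_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for record in matches:
        source = Path(record["image"])
        target = out_dir / source.name
        shutil.copy(source, target)
        copied.append(target)
        # Show the copied location in the printed record (an assumption).
        shown = dict(record, image=target.as_posix())
        # ensure_ascii=False keeps non-ASCII annotation text readable.
        print(json.dumps(shown, ensure_ascii=False, indent=2))
    return copied


# Small demo with a placeholder file standing in for a real image.
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "image02.png"
    image.write_bytes(b"")
    demo_matches = [{"id": 1, "image": str(image), "name": "best_female_image02"}]
    demo_copied = save_results(demo_matches, result_dir=Path(tmp) / "result")
    demo_exists = [path.exists() for path in demo_copied]
```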
Example output:
{
"id": 1,
"category": "best",
"image": "result/image02.png",
"annotation": "강아지상 눈매, 도회적 이미지, 차가운 이미지, 도톰한 애교살, 계란형, 뭉툭한 턱 끝, 도톰한 콧망울, 적당한 콧볼, 선명한 쌍커풀, 도톰한 아랫입술, 하얀 피부, 자기주도적 성향 이미지, 여성미가 있는 얼굴, 소녀같은 이미지",
"name": "best_female_image02"
}
- Be sure to run data_preparation.py first to generate the embeddings.npy file.
- Verify that the image paths and annotation data are correctly specified.
- Run main.py from the project's root directory to ensure correct file paths.
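The checks in these notes can be automated with a small pre-flight helper. The sketch below is one possible approach, not part of the project: the function name missing_inputs is hypothetical, and it assumes each record carries an image path in its image field, as in the example output.

```python
from pathlib import Path


def missing_inputs(records, embeddings_path="embeddings.npy"):
    """Return a list of problems found; an empty list means everything is in place."""
    problems = []
    if not Path(embeddings_path).is_file():
        problems.append(f"{embeddings_path} not found; run data_preparation.py first")
    for record in records:
        if not Path(record["image"]).is_file():
            problems.append(f"image not found: {record['image']}")
    return problems


# Demo with paths that are not expected to exist.
demo_problems = missing_inputs(
    [{"image": "no_such_dir/no_such_image.png"}],
    embeddings_path="no_such_dir/embeddings.npy",
)
for problem in demo_problems:
    print(problem)
```

Relative paths are resolved against the current working directory, which is why running from the project's root directory matters.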