feat(llm): add AGENTS.md as new document standard #299
Merged
imbajin merged 41 commits into apache:main on Sep 11, 2025
This workflow will be triggered when a pull request is opened. It will then post a comment "@codecov-ai-reviewer review" to help with automated AI code reviews. It will use the `peter-evans/create-or-update-comment` action to create the comment.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Introduced detailed AI coding guidelines for developers in README.md and hugegraph-llm/README.md, referencing structured documentation and analysis standards. Updated rules/README.md and rules/prompts/project-general.md for improved clarity and formatting in project and documentation standards.
Expanded .gitignore to exclude various AI coding assistant prompt files. Updated contributor guidelines in README files to clarify the AI-assisted development workflow and usage of module context files for LLMs.
Co-authored-by: imbajin <jin@apache.org>
Added explicit instructions for creating a .env configuration file under hugegraph-llm if it does not exist, to help new users avoid setup issues.
## Changes

This PR introduces performance optimizations for vector index building and querying by implementing parallel text embedding generation.

### Key Improvements

1. Added new utility module `embedding_utils.py` with parallel batch processing capabilities
   - Implements `get_embeddings_parallel` function for efficient batch processing
   - Uses asyncio with semaphore for controlled concurrency
   - Supports batch size of 1000 with max 10 concurrent tasks
2. Refactored all index operation classes to use parallel processing:
   - `BuildGremlinExampleIndex`
   - `BuildSemanticIndex`
   - `BuildVectorIndex`
   - `GremlinExampleIndexQuery`
   - `SemanticIdQuery`
   - `VectorIndexQuery`
3. Unified embedding generation approach:
   - Replaced individual `get_text_embedding` calls with batch `get_texts_embeddings`
   - Removed duplicate parallel processing code
   - Improved code reusability and maintainability

---------

Co-authored-by: imbajin <jin@apache.org>
…e#295) Reworked get_embeddings_parallel to use asyncio.gather for batch processing, ensuring output order matches input. Added a helper for batch progress updates and improved progress bar accuracy.
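The batching approach described in these two commits can be sketched roughly as follows. This is a minimal illustration, not the code from `embedding_utils.py`: the function name `embed_in_batches`, the `embed_batch` callback, and the stand-in `fake_embed_batch` are hypothetical. Only the ideas come from the commit messages: fixed-size batches, a semaphore to cap concurrency, and `asyncio.gather` so that output order matches input order.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical type: an async callable that embeds one batch of texts.
EmbedBatch = Callable[[List[str]], Awaitable[List[List[float]]]]


async def embed_in_batches(
    texts: List[str],
    embed_batch: EmbedBatch,
    batch_size: int = 1000,      # batch size mentioned in the commit message
    max_concurrency: int = 10,   # concurrency cap mentioned in the commit message
) -> List[List[float]]:
    """Embed texts in batches, at most max_concurrency batches at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run(batch: List[str]) -> List[List[float]]:
        # The semaphore limits how many batches are in flight at once.
        async with semaphore:
            return await embed_batch(batch)

    # gather returns results in the order the awaitables were passed in,
    # so the flattened output lines up with the input texts.
    results = await asyncio.gather(*(run(b) for b in batches))
    return [vector for batch_result in results for vector in batch_result]


async def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    """Stand-in embedder (assumption): one-dimensional 'embedding' = text length."""
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]
```

Calling `asyncio.run(embed_in_batches(texts, fake_embed_batch))` returns one vector per input text, in input order, regardless of which batch finishes first.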
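close apache#260

- Add prompt language indicator
  <img width="1878" height="89" alt="image" src="https://github.com/user-attachments/assets/a0040611-8c5d-45c5-8aa7-580fd4e5367d" />
  <img width="1853" height="52" alt="image" src="https://github.com/user-attachments/assets/49201c7b-e6e9-403c-bc76-7e2b6e03ac8d" />
- Add query examples for CN/EN support
  <img width="1233" height="407" alt="image" src="https://github.com/user-attachments/assets/4c06cd78-a6c3-4749-b132-a9db42a29429" />
- Update README with prompt language support details
- Support switching the prompt language between EN and CN

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **New Features**
  * Added support for Chinese and English prompts. The language can be switched through the `LANGUAGE` setting in the `.env` file and defaults to English.
  * Added a `language` field to the configuration file, supporting two options: "EN" and "CN".
  * Automatically detects and synchronizes the language setting between the `.env` configuration and the YAML prompt file so that prompt content matches the current language; prompts are regenerated automatically when the language changes.
  * Renamed prompt attributes to distinguish language versions and added a language parameter to prompt generation.
  * Added a language indicator to the demo UI that dynamically shows the prompt language currently in use.
  * Query examples automatically load the Chinese or English version according to the language.
  * Added a Chinese query example resource file with a varied set of graph database query cases.
* **Documentation**
  * Added a "Language Support" section to the README explaining how to switch between Chinese and English prompts.
  * Added a detailed configuration guide, `CONFIGURATION.md`, covering system, prompt, deployment, and project configuration, with complete parameter descriptions and usage examples.
* **Style**
  * Improved formatting in parts of the README for readability.
  * Adjusted the pylint configuration to better handle dynamic attributes and compatibility warnings.
  * Added styles for the language indicator to support visual display of the UI language state.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: imbajin <jin@apache.org>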
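As a rough illustration of the language switch described in this commit, the sketch below reads a `LANGUAGE` value and picks a prompt accordingly. The helper names (`resolve_language`, `select_prompt`) and the `PROMPTS` table are hypothetical and are not taken from the project. Falling back to "EN" for unrecognized values and accepting lowercase input are also assumptions; the commit message only states that English is the default.

```python
import os
from typing import Mapping, Optional

SUPPORTED_LANGUAGES = ("EN", "CN")  # the two options named in the commit message

# Hypothetical prompt table, for illustration only.
PROMPTS = {
    "EN": "Answer the question using the graph data.",
    "CN": "请根据图数据回答问题。",
}


def resolve_language(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured prompt language, defaulting to "EN"."""
    env = os.environ if env is None else env
    value = env.get("LANGUAGE", "EN").strip().upper()
    # Assumption: unknown values fall back to English rather than raising.
    return value if value in SUPPORTED_LANGUAGES else "EN"


def select_prompt(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick the prompt text that matches the resolved language."""
    return PROMPTS[resolve_language(env)]
```

Passing a plain dict such as `{"LANGUAGE": "CN"}` instead of relying on `os.environ` keeps the helper easy to exercise without touching the real environment.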
…pache#265)

## Overview

This PR implements independent data directory management when switching between different embedding models. Each embedding model will store its results in a separate directory, and during queries, the corresponding model’s embedding data will be automatically identified and loaded, ensuring that multiple models can coexist without data overwriting.

## Main Changes

1. **`vector_index` Module Method Adjustments**
   - Updated methods such as `from_index_file`, `to_index_file`, and `clean` to include a `model_name` parameter, enabling model-based independent storage paths.
   - Backward compatibility is maintained: if `model_name` is not provided, the original behavior remains unchanged.
2. **Index Operation Class Path Concatenation Refactoring**
   - In classes like `BuildSemanticIndex`, modified directory concatenation so that the current `embedding_type` is automatically used to generate subdirectories without manual specification.
3. **Embedding Class Name Unification**
   - Unified the model name field across all embedding classes as `model_name`, and updated related invocation logic to ensure consistency with the index module.

---------

Co-authored-by: imbajin <jin@apache.org>
Co-authored-by: Frui Guo <153489642+Gfreely@users.noreply.github.com>
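The per-model storage idea above might look something like the following. This is a sketch under stated assumptions, not the `vector_index` implementation: `index_dir` is a hypothetical helper, and the sanitizing of path separators in the model name is an added assumption. The part taken from the PR text is the behavior: a `model_name` selects a separate subdirectory, and omitting it leaves the original path unchanged.

```python
import os
from typing import Optional


def index_dir(base_dir: str, model_name: Optional[str] = None) -> str:
    """Return the directory in which an index for the given model is stored.

    Without a model name the base directory is returned unchanged, which
    mirrors the backward-compatible behavior described in the PR.
    """
    if not model_name:
        return base_dir
    # Assumption: model names may contain path separators (e.g. "org/model"),
    # so they are flattened to keep each model in a single subdirectory.
    safe_name = model_name.replace("/", "_").replace(os.sep, "_")
    return os.path.join(base_dir, safe_name)
```

With this layout, two models such as `"model-a"` and `"model-b"` resolve to sibling directories under the same base path, so building an index for one does not overwrite the other.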
Co-authored-by: jinsong04 <jinsong04@MacBook-Pro-2.local>
Co-authored-by: Seanium <Seanium@foxmail.com>
Co-authored-by: imbajin <jin@apache.org>
imbajin reviewed on Sep 9, 2025
imbajin approved these changes on Sep 11, 2025