
feat(llm): add AGENTS.md as new document standard #299

Merged
imbajin merged 41 commits into apache:main from
fantasy-lotus:agentic-rules
Sep 11, 2025

Conversation

@fantasy-lotus
Contributor

No description provided.

imbajin and others added 30 commits June 11, 2025 19:36
This workflow will be triggered when a pull request is opened. It will then post a comment "@codecov-ai-reviewer review" to help with automated AI code reviews.

It will use the `peter-evans/create-or-update-comment` action to create the comment.

Co-authored-by: google-labs-jules[bot] <161369871+google-labs-jules[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Introduced detailed AI coding guidelines for developers in README.md and hugegraph-llm/README.md, referencing structured documentation and analysis standards. Updated rules/README.md and rules/prompts/project-general.md for improved clarity and formatting in project and documentation standards.
Expanded .gitignore to exclude various AI coding assistant prompt files. Updated contributor guidelines in README files to clarify the AI-assisted development workflow and usage of module context files for LLMs.
weijinglin and others added 8 commits September 9, 2025 16:27
Added explicit instructions for creating a .env configuration file under
hugegraph-llm if it does not exist, to help new users avoid setup
issues.
## Changes
This PR introduces performance optimizations for vector index building
and querying by implementing parallel text embedding generation.

### Key Improvements
1. Added a new utility module, `embedding_utils.py`, with parallel batch processing capabilities:
   - Implements the `get_embeddings_parallel` function for efficient batch processing
   - Uses asyncio with a semaphore for controlled concurrency
   - Supports a batch size of 1000 with at most 10 concurrent tasks

2. Refactored all index operation classes to use parallel processing:
   - `BuildGremlinExampleIndex`
   - `BuildSemanticIndex`
   - `BuildVectorIndex`
   - `GremlinExampleIndexQuery`
   - `SemanticIdQuery`
   - `VectorIndexQuery`

3. Unified embedding generation approach:
   - Replaced individual `get_text_embedding` calls with batch `get_texts_embeddings`
   - Removed duplicate parallel processing code
   - Improved code reusability and maintainability
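
For readers who have not opened the diff, the semaphore-bounded batching described in item 1 might look roughly like the sketch below. This is not the code from `embedding_utils.py`: the `embed_batch` callable, the parameter names, and `fake_embed_batch` are hypothetical stand-ins, and only the batch size of 1000 and the limit of 10 concurrent tasks come from the description above.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical type: an async callable that embeds one batch of texts.
EmbedBatch = Callable[[List[str]], Awaitable[List[List[float]]]]


async def get_embeddings_parallel(
    embed_batch: EmbedBatch,
    texts: List[str],
    batch_size: int = 1000,      # batch size quoted in the PR description
    max_concurrency: int = 10,   # concurrency limit quoted in the PR description
) -> List[List[float]]:
    """Embed texts in batches, running at most max_concurrency batches at once."""
    semaphore = asyncio.Semaphore(max_concurrency)
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run_one(batch: List[str]) -> List[List[float]]:
        # The semaphore caps how many embedding calls are in flight.
        async with semaphore:
            return await embed_batch(batch)

    # gather returns results in the order the awaitables were passed in.
    per_batch = await asyncio.gather(*(run_one(b) for b in batches))
    return [vec for batch_vecs in per_batch for vec in batch_vecs]


async def fake_embed_batch(batch: List[str]) -> List[List[float]]:
    """Stand-in embedder (assumption): one-dimensional vector holding the text length."""
    await asyncio.sleep(0)
    return [[float(len(text))] for text in batch]
```

Calling `asyncio.run(get_embeddings_parallel(fake_embed_batch, texts))` returns one vector per input text. A real embedder would replace `fake_embed_batch` with a call to the embedding service.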

---------

Co-authored-by: imbajin <jin@apache.org>
…e#295)

Reworked get_embeddings_parallel to use asyncio.gather for batch
processing, ensuring output order matches input. Added a helper for
batch progress updates and improved progress bar accuracy.
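
To illustrate why `asyncio.gather` gives the ordering guarantee mentioned here, a small sketch follows. The function name, the `on_batch_done` progress callback, and the simulated delays are assumptions made for the example and are not taken from the commit.

```python
import asyncio
from typing import Callable, List


async def embed_batches_in_order(
    batches: List[List[str]],
    on_batch_done: Callable[[int], None],
) -> List[List[float]]:
    """Embed batches concurrently; the output follows input order, not completion order."""

    async def embed(index: int, batch: List[str]) -> List[List[float]]:
        # Stand-in for a real embedding call: earlier batches sleep longer,
        # so they finish last.
        await asyncio.sleep(0.01 * (len(batches) - index))
        # Hypothetical progress hook: report how many texts this batch covered.
        on_batch_done(len(batch))
        return [[float(index), float(len(text))] for text in batch]

    # gather preserves argument order in its result list.
    per_batch = await asyncio.gather(*(embed(i, b) for i, b in enumerate(batches)))
    return [vec for batch_vecs in per_batch for vec in batch_vecs]
```

Passing `progress_bar.update` from a progress-bar library as `on_batch_done` would advance the bar by the number of texts in each finished batch, which is one way to keep the bar accurate when batches finish out of order.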
close apache#260
  - Add prompt language indicator
  
<img width="1878" height="89" alt="image"
src="https://github.com/user-attachments/assets/a0040611-8c5d-45c5-8aa7-580fd4e5367d"
/>
<img width="1853" height="52" alt="image"
src="https://github.com/user-attachments/assets/49201c7b-e6e9-403c-bc76-7e2b6e03ac8d"
/>

  - Add query examples for CN/EN support
  
<img width="1233" height="407" alt="image"
src="https://github.com/user-attachments/assets/4c06cd78-a6c3-4749-b132-a9db42a29429"
/>

- Update README with prompt language support details
- Support switching the prompt language between EN and CN

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

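* **New Features**
  * Added support for Chinese and English prompts. The language can be switched via the `LANGUAGE` setting in the `.env` file; English is the default.
  * Added a `language` field to the configuration file, supporting two options: "EN" and "CN".
  * The language settings in `.env` and the YAML prompt file are automatically detected and synchronized so that prompt content matches the current language; prompts are regenerated automatically when the language changes.
  * Renamed prompt attributes to distinguish language versions, and added a language parameter to prompt generation.
  * Added a language indicator to the demo UI that dynamically shows the prompt language currently in use.
  * Query examples now load the matching Chinese or English version automatically based on the language.
  * Added a Chinese query example resource file with a diverse set of graph database query cases.

* **Documentation**
  * Added a "Language Support" section to the README explaining how to switch between Chinese and English prompts.
  * Added a detailed configuration guide, `CONFIGURATION.md`, covering system, prompt, deployment, and project configuration, with full parameter descriptions and usage examples.

* **Style**
  * Improved the formatting of parts of the README for readability.
  * Adjusted the pylint configuration to better handle dynamic-attribute and compatibility warnings.
  * Added styles for the language indicator to visually show the UI language state.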
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
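
As a rough illustration of the language switch summarized above, the sketch below resolves the prompt language from a `LANGUAGE` setting and decides whether prompts need regenerating. The function names are hypothetical, and the fallback to English for unrecognized values is an assumption; the summary only states that English is the default and that "EN" and "CN" are supported.

```python
import os
from typing import Mapping, Optional

SUPPORTED_LANGUAGES = ("EN", "CN")  # the two options named in the summary
DEFAULT_LANGUAGE = "EN"             # English is the stated default


def resolve_prompt_language(env: Optional[Mapping[str, str]] = None) -> str:
    """Read LANGUAGE from the environment (or a supplied mapping)."""
    source = os.environ if env is None else env
    value = source.get("LANGUAGE", DEFAULT_LANGUAGE).strip().upper()
    # Assumption: unsupported values fall back to the default instead of raising.
    return value if value in SUPPORTED_LANGUAGES else DEFAULT_LANGUAGE


def needs_regeneration(env_language: str, yaml_language: Optional[str]) -> bool:
    """Hypothetical check: regenerate prompts when the YAML file's language differs."""
    return yaml_language != env_language
```

With `LANGUAGE=CN` in `.env` and a YAML prompt file still marked "EN", `needs_regeneration("CN", "EN")` is true, which corresponds to the automatic regeneration on language change described in the summary.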

---------

Co-authored-by: imbajin <jin@apache.org>
…pache#265)

## Overview
This PR implements independent data directory management when switching
between different embedding models. Each embedding model will store its
results in a separate directory, and during queries, the corresponding
model’s embedding data will be automatically identified and loaded,
ensuring that multiple models can coexist without data overwriting.

## Main Changes
1. **`vector_index` Module Method Adjustments**
   - Updated methods such as `from_index_file`, `to_index_file`, and `clean` to include a `model_name` parameter, enabling model-based independent storage paths.
   - Backward compatibility is maintained: if `model_name` is not provided, the original behavior remains unchanged.

2. **Index Operation Class Path Concatenation Refactoring**
   - In classes like `BuildSemanticIndex`, modified directory concatenation so that the current `embedding_type` is automatically used to generate subdirectories without manual specification.

3. **Embedding Class Name Unification**
   - Unified the model name field across all embedding classes as `model_name`, and updated related invocation logic to ensure consistency with the index module.
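
The per-model storage idea can be sketched as a small path helper. This is an illustration only: `index_dir` and the character sanitizing are hypothetical and are not the actual `vector_index` implementation. The only behavior taken from the description is that a `model_name` selects a separate directory and that omitting it keeps the original path.

```python
from pathlib import Path
from typing import Optional


def index_dir(base_dir: str, model_name: Optional[str] = None) -> Path:
    """Return the directory that holds index files for one embedding model."""
    base = Path(base_dir)
    if not model_name:
        # Backward compatible: no model name means the original location.
        return base
    # Assumption: replace characters that would create nested or invalid paths.
    safe_name = model_name.replace("/", "_").replace(":", "_")
    return base / safe_name
```

Under this scheme two models write to sibling directories such as `base/model-a` and `base/model-b`, so building an index for one model does not overwrite the other's data.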

---------

Co-authored-by: imbajin <jin@apache.org>
Co-authored-by: Frui Guo <153489642+Gfreely@users.noreply.github.com>
Co-authored-by: jinsong04 <jinsong04@MacBook-Pro-2.local>
Co-authored-by: Seanium <Seanium@foxmail.com>
@github-actions github-actions bot added the llm label Sep 9, 2025
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Sep 9, 2025
@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. documentation Improvements or additions to documentation and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Sep 9, 2025
Member

README.md also needs to be updated.

@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:XS This PR changes 0-9 lines, ignoring generated files. labels Sep 9, 2025
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Sep 11, 2025
@imbajin imbajin merged commit 9e76c2a into apache:main Sep 11, 2025
10 checks passed

Labels

documentation Improvements or additions to documentation lgtm This PR has been approved by a maintainer llm size:M This PR changes 30-99 lines, ignoring generated files.

7 participants