【Feature】Support Vbench eval benchmark by GaoHuaZhang · Pull Request #152 · AISBench/benchmark

GaoHuaZhang · 2026-02-25T08:29:20Z

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help you get feedback more easily. If you do not understand some items, don't worry: just open the pull request and ask the maintainers for help.

PR Type

Related Issue
Fixes #(issue ID) / Relates to #(issue ID)

🔍 Motivation

Please describe the motivation of this PR and the goal you want to achieve through this PR.

📝 Modification

Please briefly describe what modification is made in this PR.

📐 Associated Test Results

Please provide links to the related test results, such as CI pipelines, test reports, etc.

⚠️ BC-breaking (Optional)

Does the modification introduce changes that break the backward compatibility of the downstream repositories? If so, please describe how it breaks the compatibility and how the downstream projects should modify their code to keep compatibility with this PR.

⚠️ Performance degradation (Optional)

If the modification introduces performance degradation, please describe the impact of the performance degradation and the expected performance improvement.

🌟 Use cases (Optional)

If this PR introduces a new feature, it is better to list some use cases here and update the documentation.

✅ Checklist

Before PR:

Pre-commit or other linting tools are used to fix the potential lint issues.
Bug fixes are fully covered by unit tests; the case that causes the bug should be added to the unit tests.
The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
All relevant documentation (API docs, docstrings, example tutorials) has been updated to reflect these changes.

After PR:

If the modification has potential influence on downstream or other related projects, this PR should be tested with those projects.
CLA has been signed and all committers have signed the CLA in this PR.

👥 Collaboration Info

Suggested Reviewers: @xxx
Relevant Module Owners: @xxx
Other Collaboration Notes:

🌟 Useful CI Command

Command	Introduction
`/gemini review`	Has Gemini perform a code review of the pull request in its current state.
`/gemini summary`	Has Gemini provide a summary of the pull request in its current state.
`/gemini help`	Displays a list of available Gemini commands.
`/readthedocs build`	Triggers a Read the Docs documentation build for the pull request in its current state.

gemini-code-assist · 2026-02-25T08:46:25Z

Summary of Changes

Hello @GaoHuaZhang, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for the VBench eval benchmark within AISBench. It includes new tasks, dataset configurations, and necessary documentation to facilitate video and image quality assessments on both GPU and NPU, while maintaining compatibility with existing AISBench workflows.

Highlights

Feature Addition: Introduced support for VBench eval benchmark, enabling video/image quality assessment on both GPU and NPU.
New Task Integration: Added a new task, VBenchEvalTask, to handle VBench 1.0 evaluation, including dependency management and result wrapping.
Dataset Configurations: Created dataset configurations for various VBench dimensions, allowing for standard and custom input evaluations.
Compatibility: Ensured compatibility with existing AISBench workflows, maintaining CLI usage and automatically selecting VBenchEvalTask when VBench datasets are detected.
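
The automatic task selection mentioned in the last highlight could look roughly like the sketch below. This is an illustration under assumptions, not the code in this PR: the helper names `is_vbench_dataset` and `select_eval_task`, the `DefaultEvalTask` name, and the convention of matching "vbench" in the dataset `type` are all hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical task names, used only for illustration.
VBENCH_TASK = "VBenchEvalTask"
DEFAULT_TASK = "DefaultEvalTask"


def is_vbench_dataset(dataset_cfg: Dict[str, Any]) -> bool:
    """Return True when the config's `type` names a VBench dataset (assumed convention)."""
    type_value = dataset_cfg.get("type", "")
    # `type` may be a class or a string; compare on its printable name.
    name = getattr(type_value, "__name__", str(type_value))
    return "vbench" in name.lower()


def select_eval_task(dataset_cfgs: List[Dict[str, Any]]) -> str:
    """Pick the eval task name: VBench task if any dataset is a VBench dataset."""
    if any(is_vbench_dataset(cfg) for cfg in dataset_cfgs):
        return VBENCH_TASK
    return DEFAULT_TASK


print(select_eval_task([{"type": "VBenchDataset"}]))
```

Keeping the check in one small predicate means the CLI entry point stays unchanged for non-VBench datasets, which matches the compatibility goal stated above.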

Changelog

ais_bench/benchmark/cli/workers.py
- Modified to include VBenchEvalTask in the evaluation workflow, dynamically selecting the appropriate eval task based on the dataset configuration.
ais_bench/benchmark/configs/datasets/vbench/README.md
- Added a README file providing instructions and documentation for using the VBench 1.0 integration.
ais_bench/benchmark/configs/datasets/vbench/vbench_aesthetic_quality.py
- Added dataset configuration for VBench aesthetic quality evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_appearance_style.py
- Added dataset configuration for VBench appearance style evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_background_consistency.py
- Added dataset configuration for VBench background consistency evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_color.py
- Added dataset configuration for VBench color evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_custom.py
- Added dataset configuration for VBench custom input evaluation, supporting prompts from files or filenames.
ais_bench/benchmark/configs/datasets/vbench/vbench_dynamic_degree.py
- Added dataset configuration for VBench dynamic degree evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_human_action.py
- Added dataset configuration for VBench human action evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_imaging_quality.py
- Added dataset configuration for VBench imaging quality evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_motion_smoothness.py
- Added dataset configuration for VBench motion smoothness evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_multiple_objects.py
- Added dataset configuration for VBench multiple objects evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_object_class.py
- Added dataset configuration for VBench object class evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_overall_consistency.py
- Added dataset configuration for VBench overall consistency evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_scene.py
- Added dataset configuration for VBench scene evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_spatial_relationship.py
- Added dataset configuration for VBench spatial relationship evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_standard.py
- Added dataset configuration for VBench standard prompt evaluation, requiring a video directory and optional full_info JSON.
ais_bench/benchmark/configs/datasets/vbench/vbench_subject_consistency.py
- Added dataset configuration for VBench subject consistency evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_temporal_flickering.py
- Added dataset configuration for VBench temporal flickering evaluation.
ais_bench/benchmark/configs/datasets/vbench/vbench_temporal_style.py
- Added dataset configuration for VBench temporal style evaluation.
ais_bench/benchmark/configs/models/vbench_eval/vbench_eval.py
- Added a placeholder model configuration for VBench 1.0 eval-only, used for task naming and result paths.
ais_bench/benchmark/datasets/__init__.py
- Modified to include the new vbench module, making the VBenchDataset available.
ais_bench/benchmark/datasets/vbench.py
- Added a placeholder dataset for VBench evaluation, ensuring that the LOAD_DATASET.build function does not fail.
ais_bench/benchmark/tasks/__init__.py
- Modified to include the new vbench_eval module, making the VBenchEvalTask available.
ais_bench/benchmark/tasks/vbench_eval.py
- Added a new task, VBenchEvalTask, to handle VBench 1.0 evaluation, including dependency management and result wrapping.
ais_bench/third_party/detectron2/.clang-format
- Added .clang-format file for code formatting.
ais_bench/third_party/detectron2/.flake8
- Added .flake8 file for code linting.
ais_bench/third_party/detectron2/.gitignore
- Added .gitignore file to exclude unnecessary files from version control.
ais_bench/third_party/detectron2/GETTING_STARTED.md
- Added GETTING_STARTED.md file to provide a brief intro of the usage of builtin command-line tools in detectron2.
ais_bench/third_party/detectron2/INSTALL.md
- Added INSTALL.md file to provide installation instructions for detectron2.
ais_bench/third_party/detectron2/LICENSE
- Added LICENSE file to specify the licensing information for detectron2.
ais_bench/third_party/detectron2/MODEL_ZOO.md
- Added MODEL_ZOO.md file to document a large collection of baselines trained with detectron2.
ais_bench/third_party/detectron2/README.md
- Added README.md file to provide a brief overview of the detectron2 library.
ais_bench/third_party/detectron2/detectron2/__init__.py
- Added __init__.py file to initialize the detectron2 package.
ais_bench/third_party/detectron2/detectron2/checkpoint/__init__.py
- Added __init__.py file to initialize the detectron2.checkpoint package.
ais_bench/third_party/detectron2/detectron2/checkpoint/c2_model_loading.py
- Added c2_model_loading.py file to provide functions for loading Caffe2 models.
ais_bench/third_party/detectron2/detectron2/checkpoint/catalog.py
- Added catalog.py file to store mappings from names to third-party models.
ais_bench/third_party/detectron2/detectron2/checkpoint/detection_checkpoint.py
- Added detection_checkpoint.py file to provide a class for handling model checkpoints.
ais_bench/third_party/detectron2/detectron2/config/__init__.py
- Added __init__.py file to initialize the detectron2.config package.
ais_bench/third_party/detectron2/detectron2/config/compat.py
- Added compat.py file to provide backward compatibility for configs.
ais_bench/third_party/detectron2/detectron2/config/config.py
- Added config.py file to define the CfgNode class and related functions.
ais_bench/third_party/detectron2/detectron2/config/defaults.py
- Added defaults.py file to define the default configurations for detectron2.
ais_bench/third_party/detectron2/detectron2/config/instantiate.py
- Added instantiate.py file to provide functions for instantiating objects from configurations.
ais_bench/third_party/detectron2/detectron2/config/lazy.py
- Added lazy.py file to provide classes for lazy loading of configurations.
ais_bench/third_party/detectron2/detectron2/data/__init__.py
- Added __init__.py file to initialize the detectron2.data package.
ais_bench/third_party/detectron2/detectron2/data/benchmark.py
- Added benchmark.py file to provide common benchmarks for dataloaders.
ais_bench/third_party/detectron2/detectron2/data/build.py
- Added build.py file to provide functions for building dataloaders.
ais_bench/third_party/detectron2/detectron2/data/catalog.py
- Added catalog.py file to provide classes for managing datasets and metadata.
ais_bench/third_party/detectron2/detectron2/data/common.py
- Added common.py file to provide common dataset classes.
ais_bench/third_party/detectron2/detectron2/data/dataset_mapper.py
- Added dataset_mapper.py file to provide a class for mapping dataset dicts to model inputs.
ais_bench/third_party/detectron2/detectron2/data/datasets/README.md
- Added README.md file to provide information about the datasets.
ais_bench/third_party/detectron2/detectron2/data/datasets/__init__.py
- Added __init__.py file to initialize the detectron2.data.datasets package.
ais_bench/third_party/detectron2/detectron2/data/datasets/builtin.py
- Added builtin.py file to register pre-defined datasets.
ais_bench/third_party/detectron2/detectron2/data/datasets/builtin_meta.py
- Added builtin_meta.py file to define metadata for built-in datasets.
ais_bench/third_party/detectron2/detectron2/data/datasets/cityscapes.py
- Added cityscapes.py file to provide functions for loading Cityscapes dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/cityscapes_panoptic.py
- Added cityscapes_panoptic.py file to provide functions for loading Cityscapes panoptic dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/coco.py
- Added coco.py file to provide functions for loading COCO dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/coco_panoptic.py
- Added coco_panoptic.py file to provide functions for loading COCO panoptic dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/lvis.py
- Added lvis.py file to provide functions for loading LVIS dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/pascal_voc.py
- Added pascal_voc.py file to provide functions for loading Pascal VOC dataset.
ais_bench/third_party/detectron2/detectron2/data/datasets/register_coco.py
- Added register_coco.py file to register COCO datasets.
ais_bench/third_party/detectron2/detectron2/data/samplers/__init__.py
- Added __init__.py file to initialize the detectron2.data.samplers package.
ais_bench/third_party/detectron2/detectron2/data/samplers/distributed_sampler.py
- Added distributed_sampler.py file to provide distributed samplers.
ais_bench/third_party/detectron2/detectron2/data/samplers/grouped_batch_sampler.py
- Added grouped_batch_sampler.py file to provide a grouped batch sampler.
ais_bench/third_party/detectron2/detectron2/data/transforms/__init__.py
- Added __init__.py file to initialize the detectron2.data.transforms package.
ais_bench/third_party/detectron2/detectron2/data/transforms/augmentation.py
- Added augmentation.py file to define augmentation classes.
ais_bench/third_party/detectron2/detectron2/data/transforms/augmentation_impl.py
- Added augmentation_impl.py file to implement specific augmentation techniques.
ais_bench/third_party/detectron2/detectron2/data/transforms/transform.py
- Added transform.py file to define transform classes.
ais_bench/third_party/detectron2/detectron2/engine/__init__.py
- Added __init__.py file to initialize the detectron2.engine package.
ais_bench/third_party/detectron2/detectron2/engine/defaults.py
- Added defaults.py file to provide default training and evaluation components.
ais_bench/third_party/detectron2/detectron2/engine/hooks.py
- Added hooks.py file to define training hooks.
ais_bench/third_party/detectron2/detectron2/engine/launch.py
- Added launch.py file to provide functions for launching multi-gpu training.
ais_bench/third_party/detectron2/detectron2/engine/train_loop.py
- Added train_loop.py file to define the training loop.
ais_bench/third_party/detectron2/detectron2/export/__init__.py
- Added __init__.py file to initialize the detectron2.export package.
ais_bench/third_party/detectron2/detectron2/export/api.py
- Added api.py file to define the export API.
ais_bench/third_party/detectron2/detectron2/export/c10.py
- Added c10.py file to provide Caffe2-compatible implementations of detectron2 components.
ais_bench/third_party/detectron2/detectron2/export/caffe2_export.py
- Added caffe2_export.py file to provide functions for exporting models to Caffe2 format.
ais_bench/third_party/detectron2/detectron2/export/caffe2_modeling.py
- Added caffe2_modeling.py file to define Caffe2-compatible meta architectures.
ais_bench/third_party/detectron2/detectron2/export/flatten.py
- Added flatten.py file to provide functions for flattening and unflattening data structures.
ais_bench/third_party/detectron2/detectron2/export/shared.py
- Added shared.py file to provide shared utilities for exporting models.
ais_bench/third_party/detectron2/detectron2/export/torchscript.py
- Added torchscript.py file to provide functions for exporting models to TorchScript format.
ais_bench/third_party/detectron2/detectron2/export/torchscript_patch.py
- Added torchscript_patch.py file to provide patches for TorchScript compatibility.

Activity

The pull request introduces a new feature to support VBench eval benchmark.
The pull request adds new dataset configurations for various VBench dimensions.
The pull request modifies the evaluation workflow to include VBenchEvalTask.
The pull request adds a placeholder model configuration for VBench 1.0 eval-only.
The pull request includes a third_party/detectron2 directory with necessary files for VBench (GRiT) support.

¹ Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

gemini-code-assist

Code Review

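This PR adds support for the VBench evaluation benchmark, which is a great feature. The code is clearly structured, with the main logic concentrated in the new VBenchEvalTask. The new dataset configuration files make it easy to invoke each of VBench's evaluation dimensions.
I found a few things that could be improved:

Several dataset configuration files hardcode user-specific absolute paths, which hurts portability.
vbench_eval.py reads a JSON file without specifying an encoding, which may cause compatibility problems across platforms.
vbench_eval.py defines a default dimension list that duplicates the one in vbench_standard.py; consider extracting it to a shared location for easier maintenance.
One configuration file is missing a trailing newline.
Apart from these minor issues, the overall implementation quality is high.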

gemini-code-assist · 2026-02-25T08:48:51Z

ais_bench/benchmark/configs/datasets/vbench/vbench_aesthetic_quality.py

+    load_ckpt_from_local=True,
+)
+
+_BASE_PATH = '/data/zhanggaohua/datasets/vbench/CogVideoX-5B-mini'


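This configuration file hardcodes an absolute path, _BASE_PATH, which appears to contain personal user information (/data/zhanggaohua/...). This makes the configuration inconvenient for other users working in different environments. Consider changing it to a placeholder path such as '/path/to/your/video/folder' and noting in a comment that users need to modify it. The same issue exists in most of the vbench dataset configuration files added in this PR.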

Suggested change

_BASE_PATH = '/data/zhanggaohua/datasets/vbench/CogVideoX-5B-mini'

_BASE_PATH = '/path/to/your/video/folder' # Replace with your video directory
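
As a complement to the placeholder, one possible way to avoid editing the config file per machine is to read the directory from an environment variable and fall back to the placeholder. This is only a sketch: the variable name `VBENCH_VIDEO_DIR` and the helper `resolve_base_path` are hypothetical and do not appear in the PR.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable name; the PR does not define it.
_ENV_VAR = "VBENCH_VIDEO_DIR"
# Placeholder taken from the review suggestion.
_PLACEHOLDER = "/path/to/your/video/folder"


def resolve_base_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the video directory from the environment, else the placeholder."""
    env = os.environ if env is None else env
    # An unset or empty variable falls back to the placeholder.
    return env.get(_ENV_VAR) or _PLACEHOLDER


_BASE_PATH = resolve_base_path()
```

With this approach a user would set the variable before launching the benchmark instead of committing a machine-specific path.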

gemini-code-assist · 2026-02-25T08:48:52Z

ais_bench/benchmark/configs/datasets/vbench/vbench_subject_consistency.py

+            dimension_list=['subject_consistency'],
+        ),
+    )
+]


The file is missing a trailing newline. Per PEP 8, all source files should end with a single newline.
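
A quick way to find every file with this problem, rather than only the one flagged here, is a small check like the following. It is a standalone sketch; the function name `missing_final_newline` is made up for illustration, and a pre-commit hook or linter would normally do the same job.

```python
from pathlib import Path
from typing import Iterable, List


def missing_final_newline(paths: Iterable[Path]) -> List[Path]:
    """Return the non-empty files whose last byte is not a newline."""
    offenders = []
    for path in paths:
        data = path.read_bytes()
        # Empty files are fine; non-empty files must end with "\n".
        if data and not data.endswith(b"\n"):
            offenders.append(path)
    return offenders
```

For example, `missing_final_newline(Path("ais_bench/benchmark/configs/datasets/vbench").glob("*.py"))` would list the affected config files in that directory.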

gemini-code-assist · 2026-02-25T08:48:52Z

ais_bench/benchmark/tasks/vbench_eval.py

+            if not dimension_list:
+                dimension_list = [
+                    'subject_consistency', 'background_consistency', 'aesthetic_quality',
+                    'imaging_quality', 'object_class', 'multiple_objects', 'color',
+                    'spatial_relationship', 'scene', 'temporal_style', 'overall_consistency',
+                    'human_action', 'temporal_flickering', 'motion_smoothness', 'dynamic_degree',
+                    'appearance_style',
+                ]


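This default dimension_list has the same contents as the VBENCH_DEFAULT_DIMENSIONS list defined in ais_bench/benchmark/configs/datasets/vbench/vbench_standard.py. To follow the DRY (Don't Repeat Yourself) principle and ease maintenance, consider defining the list in one shared location (for example, vbench_standard.py) and importing it here. This avoids potential problems caused by the two copies drifting out of sync.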
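
A minimal sketch of that refactor, assuming the list is kept as a single module-level constant and the task falls back to it through a small helper. The helper name `resolve_dimensions` is hypothetical; in the PR the constant would be imported from wherever it ends up living rather than redefined.

```python
from typing import List, Optional, Sequence

# Single source of truth for the default dimensions (same 16 entries as in the diff).
VBENCH_DEFAULT_DIMENSIONS = (
    'subject_consistency', 'background_consistency', 'aesthetic_quality',
    'imaging_quality', 'object_class', 'multiple_objects', 'color',
    'spatial_relationship', 'scene', 'temporal_style', 'overall_consistency',
    'human_action', 'temporal_flickering', 'motion_smoothness', 'dynamic_degree',
    'appearance_style',
)


def resolve_dimensions(dimension_list: Optional[Sequence[str]]) -> List[str]:
    """Use the configured dimensions, or the shared defaults when none are given."""
    if dimension_list:
        return list(dimension_list)
    # Return a fresh list so callers cannot mutate the shared constant.
    return list(VBENCH_DEFAULT_DIMENSIONS)
```

Storing the defaults as a tuple and copying on use guards against one caller accidentally modifying the list for everyone else.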

gemini-code-assist · 2026-02-25T08:48:52Z

ais_bench/benchmark/tasks/vbench_eval.py

+            prompt_list = dataset_cfg.get('prompt_list') or eval_cfg.get('prompt_list') or []
+            prompt_file = eval_cfg.get('prompt_file')
+            if prompt_file and osp.isfile(prompt_file):
+                with open(prompt_file, 'r') as f:


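When reading the JSON file, explicitly specify utf-8 as the encoding to avoid encoding problems across system environments and to stay consistent with the file-write operations elsewhere in this file.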

Suggested change

with open(prompt_file, 'r') as f:

with open(prompt_file, 'r', encoding='utf-8') as f:
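
For context, the suggested change could sit inside a helper along these lines. This is a sketch under assumptions: the function name `load_prompts` is hypothetical, and it assumes the prompt file is plain JSON read with `json.load`, as the review comment implies.

```python
import json
import os.path as osp
from typing import Any, List, Optional


def load_prompts(prompt_file: Optional[str], fallback: Optional[List[Any]] = None) -> Any:
    """Load prompts from a JSON file as UTF-8, or return the fallback list."""
    if prompt_file and osp.isfile(prompt_file):
        # An explicit encoding avoids depending on the platform default,
        # which matters for non-ASCII prompts.
        with open(prompt_file, 'r', encoding='utf-8') as f:
            return json.load(f)
    return fallback or []
```

Without the explicit encoding, `open` uses the locale's preferred encoding, so a prompt file containing Chinese text may fail to decode on a system whose default is not UTF-8.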

GaoHuaZhang added 3 commits February 25, 2026 08:26

add third party

f94b596

adapter vbench support

1ca04e6

add config

f1e07e6

gemini-code-assist bot reviewed Feb 25, 2026

GaoHuaZhang added 4 commits February 26, 2026 03:46

add barrier

406d24d

auto chose device

af49417

adapter vbench

14366b9

use custom config

26b6ab9

GaoHuaZhang wants to merge 7 commits into AISBench:master from GaoHuaZhang:vbench
