feat: add file scanner utility with demo program #3568

Merged
deepin-bot[bot] merged 1 commit into linuxdeepin:master from Johnson-zs:master
Feb 7, 2026

Johnson-zs commented Feb 7, 2026

  1. Implement FileScanner class for asynchronous file system statistics collection
  2. Add ScannerWorker with optimized local file traversal using fts(3) API
  3. Support both local filesystem and other protocols (SMB, SFTP) via InfoFactory
  4. Include hard link deduplication using device+inode tracking
  5. Add progress reporting with throttling
  6. Create test/demo program with GUI for directory statistics
  7. Add CMake option OPT_ENABLE_BUILD_TESTS to control test program compilation

Log: Added file scanner utility for directory statistics with demo program

Influence:

  1. Test FileScanner with various directory structures and file types
  2. Verify hard link handling and deduplication functionality
  3. Test progress reporting and throttling behavior
  4. Verify both local filesystem and protocol-based scanning
  5. Test demo program with single and multiple directory selections
  6. Verify scanner stops correctly during application shutdown
  7. Test with special system files and symlink handling

Summary by Sourcery

Introduce an asynchronous file scanning utility with directory statistics support and a GUI-based demo application, guarded by a new optional tests build flag.

New Features:

  • Add FileScanner API and ScannerWorker backend for asynchronous directory statistics over local filesystems and remote protocols.
  • Introduce hard-link-aware size accounting, progress reporting, and scan options such as single-depth and symlink handling.
  • Provide a GUI statistics dialog and main window demo to explore and test directory scanning interactively.

Build:

  • Add OPT_ENABLE_BUILD_TESTS CMake option and wire up an optional tests/ subtree for building demo/test programs, including the FileScanner demo binary output to the tests directory.

Tests:

  • Add a FileScanner GUI demo under tests/filescanner for manual verification of scanning behavior, progress updates, and cancellation.

@deepin-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Johnson-zs

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

sourcery-ai bot commented Feb 7, 2026

Reviewer's Guide

Implements a new asynchronous FileScanner utility with a threaded ScannerWorker for directory statistics (including hard-link deduplication and throttled progress reporting), integrates it into dfm-base, and adds an optional GUI demo/test application wired via a new OPT_ENABLE_BUILD_TESTS CMake option.

Sequence diagram for asynchronous FileScanner lifecycle

sequenceDiagram
actor User
participant DemoMainWindow
participant FileScanner
participant FileScannerPrivate
participant WorkerThread as QThread_workerThread
participant ScannerWorker
participant LocalFS
participant InfoFactory

User->>DemoMainWindow: select_directories()
DemoMainWindow->>FileScanner: setOptions(options)
DemoMainWindow->>FileScanner: start(urls)
FileScanner->>FileScannerPrivate: startWorker(urls)

alt worker_thread_not_created
  FileScannerPrivate->>WorkerThread: create()
end
alt worker_not_created
  FileScannerPrivate->>ScannerWorker: create()
  FileScannerPrivate->>ScannerWorker: moveToThread(WorkerThread)
end
FileScannerPrivate->>ScannerWorker: setUrls(urls)
FileScannerPrivate->>ScannerWorker: setOptions(options)
FileScannerPrivate->>WorkerThread: start()
FileScannerPrivate->>ScannerWorker: start() queued

activate ScannerWorker
ScannerWorker->>ScannerWorker: reset_state()

alt first_url_is_local_file_scheme
  ScannerWorker->>LocalFS: scanLocalPaths_with_fts()
else other_protocols
  ScannerWorker->>InfoFactory: scanOtherProtocols_via_InfoFactory()
end

loop periodic_progress
  ScannerWorker-->>FileScannerPrivate: resultReady(currentResult,false)
  FileScannerPrivate->>FileScanner: onWorkerResultReady(result,false)
  FileScanner-->>DemoMainWindow: progressChanged(result)
end

ScannerWorker-->>FileScannerPrivate: resultReady(finalResult,true)
ScannerWorker-->>FileScannerPrivate: finished()
deactivate ScannerWorker

FileScannerPrivate->>FileScanner: onWorkerFinished()
FileScanner-->>DemoMainWindow: finished(lastResult,true)

rect rgb(230,230,230)
QCoreApplication->>FileScanner: aboutToQuit()
FileScanner->>FileScannerPrivate: stopWorker()
FileScannerPrivate->>ScannerWorker: stop()
end

Class diagram for FileScanner and ScannerWorker utility

classDiagram
class FileScanner {
  +ScanOption enum
  +ScanResult struct
  +FileScanner(QObject_parent)
  +~FileScanner()
  +void setOptions(ScanOptions_options)
  +ScanOptions options()
  +ScanResult result()
  +bool isRunning()
  +void start(QList_QUrl_urls)
  +void stop()
  +progressChanged(ScanResult_result)
  +finished(ScanResult_result,bool_success)
  -QScopedPointer_FileScannerPrivate_d
}

class FileScannerPrivate {
  +FileScannerPrivate(FileScanner_qq)
  +~FileScannerPrivate()
  +void startWorker(QList_QUrl_urls)
  +void stopWorker()
  +void onWorkerResultReady(FileScanner_ScanResult_result,bool_isFinal)
  +void onWorkerFinished()
  -FileScanner_q
  -QThread_workerThread
  -ScannerWorker_worker
  -FileScanner_ScanResult_lastResult
  -FileScanner_ScanOptions_options
}

class FileScanner_ScanResult {
  +qint64 totalSize
  +qint64 progressSize
  +int fileCount
  +int directoryCount
  +bool isValid()
  +void clear()
}

class ScannerWorker {
  +ScannerWorker(QObject_parent)
  +~ScannerWorker()
  +void setUrls(QList_QUrl_urls)
  +void setOptions(FileScanner_ScanOptions_options)
  +void start()
  +void stop()
  +resultReady(FileScanner_ScanResult_result,bool_isFinal)
  +finished()
  -void scanLocalPaths()
  -void scanOtherProtocols()
  -void processFileWithStat(QUrl_url,stat64_pointer_statBuf,bool_followSymlink)
  -bool shouldStop() const
  -bool isInodeProcessed(quint64_device,quint64_inode)
  -void markInodeProcessed(quint64_device,quint64_inode)
  -void emitProgress(bool_force)
  -bool isSpecialSystemFile(QString_path) const
  -QList_QUrl_urls
  -FileScanner_ScanOptions_options
  -FileScanner_ScanResult_currentResult
  -atomic_bool_stopped
  -QElapsedTimer_progressTimer
  -qint64 lastEmittedSize
  -QHash_quint64_QSet_quint64_processedInodes
  -static QSet_QString_kSpecialSystemFiles
}

FileScanner o-- FileScannerPrivate
FileScannerPrivate --> ScannerWorker
FileScanner ..> FileScanner_ScanResult
ScannerWorker ..> FileScanner_ScanResult

Flow diagram for ScannerWorker scanning logic with protocol selection and deduplication

flowchart TD
  A_start["Start scan in ScannerWorker.start"] --> B_checkScheme{First_URL_scheme_is_file}

  B_checkScheme -->|Yes| C_local["scanLocalPaths using fts"]
  B_checkScheme -->|No| D_remote["scanOtherProtocols using InfoFactory"]

  subgraph Local_scan
    C_local --> C1_iterFTS["Iterate entries via fts_read"]
    C1_iterFTS --> C2_type{Entry_type}

    C2_type -->|Directory| C_dir["directoryCount++ and progressSize+=4096"]
    C_dir --> C3_singleDepth{SingleDepth_option}
    C3_singleDepth -->|Yes| C_skip["fts_set FTS_SKIP for children"]
    C3_singleDepth -->|No| C1_iterFTS

    C2_type -->|Regular_file| C_file["Check st_nlink and size"]
    C_file --> C4_nlink{st_nlink>1}
    C4_nlink -->|Yes and not_processed| C5_mark["markInodeProcessed and add_size"]
    C4_nlink -->|Yes and processed| C6_countOnly["fileCount++ only"]
    C4_nlink -->|No| C7_add["fileCount++ and add_size"]
    C5_mark --> C8_progress["progressSize+=st_size"]
    C6_countOnly --> C8_progress
    C7_add --> C8_progress

    C2_type -->|Symlink| C_sym["fileCount++ (optionally follow target)"]
    C_sym --> C9_emitLocal["emitProgress_if_throttling_allows"]

    C8_progress --> C9_emitLocal
    C9_emitLocal --> C1_iterFTS
  end

  subgraph Remote_scan
    D_remote --> D1_initQueue["Initialize directoryQueue with source_urls"]
    D1_initQueue --> D2_loop{directoryQueue_not_empty_and_not_stopped}
    D2_loop -->|Dir| D_dir["directoryCount++ and progressSize+=4096"]
    D_dir --> D3_singleDepth{SingleDepth_option}
    D3_singleDepth -->|Yes| D4_emitRemote["emitProgress_if_throttling_allows"]
    D3_singleDepth -->|No| D5_iterDir["Create AbstractDirIterator and enqueue children"]
    D5_iterDir --> D4_emitRemote

    D2_loop -->|File| D_file["fileCount++ and add childInfo.size"]
    D_file --> D4_emitRemote

    D4_emitRemote --> D2_loop
  end

  C1_iterFTS -->|fts_read_null_or_stopped| E_postLocal["Adjust directoryCount by excluding source dirs unless IncludeSource"]
  D2_loop -->|Queue_empty_or_stopped| E_postRemote["Adjust directoryCount by excluding source dirs unless IncludeSource"]

  E_postLocal --> F_finish["emitProgress(force=true) and emit finished"]
  E_postRemote --> F_finish

File-Level Changes

Change: Add asynchronous FileScanner API and ScannerWorker implementation for directory statistics, with options, hard-link deduplication, and throttled progress signals.
Details:
  • Introduce FileScanner public QObject API with ScanOptions, ScanResult struct, start/stop slots, isRunning/result accessors, and progressChanged/finished signals.
  • Implement FileScannerPrivate to own a QThread-backed ScannerWorker, manage its lifecycle, forward results, and hook into QCoreApplication::aboutToQuit for graceful shutdown.
  • Implement ScannerWorker to traverse local paths via fts(3) and remote/protocol URLs via InfoFactory/DirIteratorFactory, counting files/dirs, summing sizes, and updating progressSize.
  • Add hard-link deduplication using a QHash of device to inode sets, only tracking entries with st_nlink > 1, and skip special system files via a static QSet path list.
  • Implement progress throttling using QElapsedTimer and lastEmittedSize, emitting resultReady only every 500ms or after 10MB size delta or on force, then forwarding to FileScanner::progressChanged/finished.
Files:
  • src/dfm-base/utils/filescanner.h
  • src/dfm-base/utils/filescanner.cpp

Change: Provide a GUI demo application to exercise FileScanner with single/multi-directory selection and live statistics display.
Details:
  • Add StatisticsDialog dialog that wraps a FileScanner instance, shows formatted file/dir counts and size metrics, and provides stop and close controls while reacting to progressChanged/finished.
  • Add MainWindow with a QFileSystemModel-backed QTreeView supporting extended selection, context menu actions to run combined or per-directory stats, and launching non-modal StatisticsDialog instances.
  • Add a minimal QApplication entry point that constructs and shows MainWindow.
Files:
  • tests/filescanner/statisticsdialog.h
  • tests/filescanner/statisticsdialog.cpp
  • tests/filescanner/mainwindow.h
  • tests/filescanner/mainwindow.cpp
  • tests/filescanner/main.cpp

Change: Wire up build-system support for the FileScanner demo via a new tests tree guarded by an option.
Details:
  • Introduce OPT_ENABLE_BUILD_TESTS option in top-level CMakeLists.txt and conditionally add the tests subdirectory.
  • Add tests/CMakeLists.txt to host demo/test programs, currently adding the filescanner demo when present.
  • Create tests/filescanner/CMakeLists.txt to build a Qt6/Dtk6-based test-filescanner-demo executable (aliased as dfm-filescanner-demo), link it against dfm6-base and Qt/Dtk components, and place the binary under the top-level build tests directory with appropriate include paths.
Files:
  • CMakeLists.txt
  • tests/CMakeLists.txt
  • tests/filescanner/CMakeLists.txt
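
The hard-link deduplication described in the first change (a map from device to a set of inodes, consulted only when st_nlink > 1) can be illustrated as follows. This is a sketch under assumptions: `InodeTracker`, `SizeTotals`, and `addFile` are hypothetical names, standard containers stand in for QHash/QSet, and counting every link name as a file while adding its size only once is one reading of the flowchart, not confirmed behavior of the PR.

```cpp
#include <cstdint>
#include <unordered_map>
#include <unordered_set>

// Remembers which (device, inode) pairs have already been accounted for.
class InodeTracker {
public:
    // Returns true the first time a pair is seen and false on every repeat.
    bool markIfNew(std::uint64_t device, std::uint64_t inode)
    {
        return seen_[device].insert(inode).second;
    }

private:
    // device id -> inodes already counted on that device
    std::unordered_map<std::uint64_t, std::unordered_set<std::uint64_t>> seen_;
};

// Hypothetical totals for this sketch.
struct SizeTotals {
    std::int64_t totalSize = 0;
    std::int64_t fileCount = 0;
};

// Accounts for one regular file. Files with a single link skip the tracker,
// which keeps the map small for the common case.
void addFile(SizeTotals &totals, InodeTracker &tracker,
             std::uint64_t device, std::uint64_t inode,
             std::uint64_t linkCount, std::int64_t size)
{
    ++totals.fileCount;
    if (linkCount > 1 && !tracker.markIfNew(device, inode))
        return;  // another name for data whose size was already added
    totals.totalSize += size;
}
```

Keying on the device as well as the inode matters because inode numbers are only unique within one filesystem.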

sourcery-ai bot left a comment

Hey - I've found 5 issues, and left some high level feedback:

  • In `FileScanner::~FileScanner` you dereference `d->workerThread` without a null check (`QThread::currentThread() != d->workerThread` followed by `d->workerThread->quit()`), which will crash if no worker thread was ever started; guard the whole block with an `if (d->workerThread)` check.
  • The `isFinal`/`success` semantics are currently inconsistent: `ScannerWorker::emitProgress` passes the `force` flag as `isFinal` and `FileScannerPrivate::onWorkerResultReady` ignores it, while `FileScannerPrivate::onWorkerFinished` always emits `finished(..., true)` even on stop; either wire `isFinal`/`success` correctly (including cancellation) or remove these parameters to avoid confusion.
  • Thread and worker lifetime management is split between `FileScannerPrivate::~FileScannerPrivate` and `FileScanner::~FileScanner` (both attempting to stop/quit/wait/delete the same QThread/worker), which risks double management; consider centralizing ownership and shutdown logic in one place to make the lifecycle clear and safe.

## Individual Comments

### Comment 1
<location> `src/dfm-base/utils/filescanner.cpp:67-71` </location>
<code_context>
+
+FileScannerPrivate::~FileScannerPrivate()
+{
+    if (workerThread) {
+        workerThread->quit();
+        workerThread->wait(5000);
+        delete worker;
+        delete workerThread;
+    }
+}
</code_context>

<issue_to_address>
**issue (bug_risk):** Avoid double-deleting workerThread by relying on QObject parenting or changing the parent.

`workerThread` is constructed with `q` as its parent (`workerThread = new QThread(q);`), but is also deleted here. When `FileScanner` is destroyed, it will delete its children (including `workerThread`), so this risks a double delete. Either construct `workerThread` with `this` as the parent and keep the manual delete, or keep the parent as `q` and remove `delete workerThread;`. Review `worker`’s ownership as well to avoid the same issue.
</issue_to_address>

### Comment 2
<location> `src/dfm-base/utils/filescanner.cpp:152-153` </location>
<code_context>
+{
+    d->stopWorker();
+
+    if (QThread::currentThread() != d->workerThread) {
+        d->workerThread->quit();
+        d->workerThread->wait(5000);   // wait at most 5 seconds
+    }
</code_context>

<issue_to_address>
**issue (bug_risk):** Guard against null workerThread when tearing down in FileScanner destructor.

`d->workerThread` is dereferenced here without a null check. If no scan was started, it stays null and this is UB; combined with the earlier manual deletion risk, this could also be a use-after-free. Please guard with something like `if (d->workerThread && QThread::currentThread() != d->workerThread)` or centralize thread shutdown in `FileScannerPrivate` and avoid touching `workerThread` here.
</issue_to_address>

### Comment 3
<location> `src/dfm-base/utils/filescanner.cpp:78-79` </location>
<code_context>
+void FileScannerPrivate::startWorker(const QList<QUrl> &urls)
+{
+    // If a worker thread is already running, stop it first
+    if (workerThread && workerThread->isRunning()) {
+        stopWorker();
+    }
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Starting a new scan while the previous worker is still shutting down can race and duplicate signal connections.

On `startWorker`, when `workerThread` is running you call `stopWorker()` (which only sets `stopped = true`) and then immediately reuse the same `worker`/`workerThread` and attach new `resultReady`/`finished` connections. The previous run may still be unwinding and emit `finished()` later, so the old `onWorkerFinished` can run concurrently with the new task and you can end up with multiple active connections to the same signals. Please ensure the previous thread is fully stopped before reuse (e.g. `quit()` + `wait()` or a state flag), or create a new worker/thread per start and dispose the old pair on completion.

Suggested implementation:

```cpp
void FileScannerPrivate::startWorker(const QList<QUrl> &urls)
{
    // If a worker thread is already running, stop it first
    if (workerThread && workerThread->isRunning()) {
        // Ask the scan loop to exit, then make sure the previous thread has fully
        // finished, avoiding duplicate signal connections and concurrent callbacks
        stopWorker();
        workerThread->quit();
        workerThread->wait();
    }

```

This change assumes the rest of `startWorker` follows immediately after the block shown. The `stopWorker()` call stays in place so that the scan loop sees the stop flag and returns: `quit()` only takes effect once the worker's slot returns to the thread's event loop, so without the flag `wait()` would block until the previous scan had run to completion.
</issue_to_address>
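
One way to rule out the overlap described in this comment is to stop and join the previous run before launching the next one. The sketch below deliberately swaps QThread for `std::thread` so that it stays self-contained; `RestartableWorker` and `Task` are hypothetical names, and the real class would additionally need to manage its signal connections.

```cpp
#include <atomic>
#include <functional>
#include <thread>

// Runs one task at a time on a background thread and guarantees that a
// previous run has completely finished before the next one starts.
class RestartableWorker {
public:
    // The task must poll stopFlag and return soon after it becomes true.
    using Task = std::function<void(const std::atomic<bool> &stopFlag)>;

    ~RestartableWorker() { stopAndJoin(); }

    void start(Task task)
    {
        stopAndJoin();          // the previous run can no longer emit anything
        stopped_.store(false);  // safe: no thread is reading the flag now
        thread_ = std::thread([this, task] { task(stopped_); });
    }

    // Requests cooperative cancellation and blocks until the thread exits.
    void stopAndJoin()
    {
        stopped_.store(true);
        if (thread_.joinable())
            thread_.join();
    }

private:
    std::atomic<bool> stopped_{false};
    std::thread thread_;
};
```

Because `start()` joins first, at most one task is ever active, at the cost of blocking the caller until the old task notices the flag.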

### Comment 4
<location> `src/dfm-base/utils/filescanner.cpp:124` </location>
<code_context>
+    disconnect(worker, &ScannerWorker::resultReady, this, &FileScannerPrivate::onWorkerResultReady);
+    disconnect(worker, &ScannerWorker::finished, this, &FileScannerPrivate::onWorkerFinished);
+
+    emit q->finished(lastResult, true);
+}
+
</code_context>

<issue_to_address>
**suggestion (bug_risk):** `finished(success)` always reports success=true, even on user stop or traversal errors.

Since `FileScanner::finished` exposes a `success` flag, hardcoding it to `true` means callers can’t distinguish a complete traversal from one that hit `FTS_ERR/FTS_DNR/FTS_NS`, was explicitly stopped, or ended early based on `isFinal` in `resultReady`. Please track a success state in the worker (e.g., set to false on errors or `stop()`), and emit that value here instead of always `true`.

Suggested implementation:

```cpp
void FileScannerPrivate::onWorkerFinished()
{
    // Disconnect the worker's signals
    disconnect(worker, &ScannerWorker::resultReady, this, &FileScannerPrivate::onWorkerResultReady);
    disconnect(worker, &ScannerWorker::finished, this, &FileScannerPrivate::onWorkerFinished);

    const bool success = worker ? worker->isSuccess() : true;
    emit q->finished(lastResult, success);
}
```

To fully implement the behavior you described, `ScannerWorker` needs to track and expose its success state:

1. In the `ScannerWorker` class (likely in `filescanner.h/.cpp` or a related worker file):
   - Add a private member, initialized to `true` at the start of each scan:
     - `bool m_success { true };`
   - Add a public const accessor:
     - `bool isSuccess() const { return m_success; }`
2. Ensure `m_success` is set to `false` in these situations:
   - When traversal hits `FTS_ERR`, `FTS_DNR`, `FTS_NS`, or analogous error conditions in your implementation.
   - When the user explicitly stops the scan via `stop()` or equivalent API.
   - When early termination is triggered based on `isFinal` logic, if you consider that a non-successful completion.
3. Reset `m_success` to `true` at the beginning of each new traversal (e.g., in the worker’s `start()` / `run()` / setup method) so each scan reports success independently.

After these changes, `FileScanner::finished(lastResult, success)` will correctly distinguish a complete traversal from one that ended due to errors or early termination.
</issue_to_address>
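
The worker-side steps listed in this comment could look roughly like the following. It is a plain C++ sketch without Qt: `ScanOutcome` and every method name except `isSuccess()` (taken from the suggestion above) are hypothetical, and treating a user stop as unsuccessful is an assumption about the desired semantics.

```cpp
#include <atomic>

// Tracks whether the current scan ran to completion without problems.
class ScanOutcome {
public:
    // Call at the start of every scan so each run reports independently.
    void beginScan()
    {
        stopped_.store(false);
        hadError_ = false;
    }

    // May be called from another thread to cancel the scan.
    void requestStop() { stopped_.store(true); }

    // Call from the scan loop on FTS_ERR, FTS_DNR, FTS_NS or similar failures.
    void noteTraversalError() { hadError_ = true; }

    // Polled by the scan loop to decide whether to exit early.
    bool shouldStop() const { return stopped_.load(); }

    // True only if the scan was neither cancelled nor hit a traversal error.
    bool isSuccess() const { return !hadError_ && !stopped_.load(); }

private:
    std::atomic<bool> stopped_{false};  // written by the controlling thread
    bool hadError_ = false;             // touched only by the scanning thread
};
```

Whether a partially failed traversal should count as unsuccessful, or carry a separate flag, is a design decision this sketch does not settle.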

### Comment 5
<location> `src/dfm-base/utils/filescanner.h:139` </location>
<code_context>
+     * @brief Progress update signal
+     * @param result Current statistics
+     *
+     * Emission frequency: every 200 ms or every 1000 files processed
+     */
+    void progressChanged(const ScanResult &result);
</code_context>

<issue_to_address>
**nitpick:** The documented progress signal frequency does not match the implementation.

The comment says `progressChanged` is emitted every 200ms or every 1000 files, but `emitProgress()` currently uses a 500ms interval and a 10MB size-delta threshold. This discrepancy can confuse callers relying on the documented behavior. Please update either the implementation or the comment (e.g., to describe the 500ms/10MB strategy) so they match.
</issue_to_address>
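
For reference, the 500 ms / 10 MB strategy mentioned in this comment amounts to a small decision function. The sketch below is an illustration rather than the PR's `emitProgress()`: `ProgressThrottle` and `shouldEmit` are hypothetical names, the caller supplies the elapsed time in milliseconds instead of reading a QElapsedTimer, and reading "10MB" as 10 × 1024 × 1024 bytes is an assumption.

```cpp
#include <cstdint>

// Decides whether a progress update is due.
struct ProgressThrottle {
    static constexpr std::int64_t kMinIntervalMs = 500;
    static constexpr std::int64_t kMinSizeDelta = 10LL * 1024 * 1024;  // assumed 10 MiB

    std::int64_t lastEmitMs = 0;       // timestamp of the last emitted update
    std::int64_t lastEmittedSize = 0;  // byte total reported by that update

    // Returns true if an update should be emitted now and records it.
    // force bypasses both thresholds, for example for the final result.
    bool shouldEmit(std::int64_t nowMs, std::int64_t currentSize, bool force)
    {
        const bool due = force
            || nowMs - lastEmitMs >= kMinIntervalMs
            || currentSize - lastEmittedSize >= kMinSizeDelta;
        if (due) {
            lastEmitMs = nowMs;
            lastEmittedSize = currentSize;
        }
        return due;
    }
};
```

Passing the time in explicitly keeps the rule easy to unit-test and makes the documented thresholds and the implemented ones a single pair of constants.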


Johnson-zs force-pushed the master branch 5 times, most recently from a545f79 to de702a9 on February 7, 2026 04:13
@deepin-ci-robot
Contributor

deepin pr auto review

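Git Diff Code Review Report

Overall Assessment

This code implements an asynchronous file scanner that counts the files in a directory and totals their size, and it includes a test program. The overall design is sound: it uses Qt's signal/slot mechanism and threading model and is clearly structured, but some points could be improved.

Detailed Review

1. CMakeLists.txt

Syntax and logic

  • No obvious problems; the structure is clear

Code quality

  • Consider adding a comment that explains what OPT_ENABLE_BUILD_TESTS is for
  • Consider managing the build conditions for the test directories in one place

Performance

  • Nothing here directly affects performance

Safety

  • No obvious safety issues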

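2. filescanner.h

Syntax and logic

  • The overall structure is sound and uses the Pimpl pattern
  • The ScanResult struct does not declare copy/move constructors or assignment operators; the defaults are sufficient, but declaring them explicitly would be clearer

Code quality

  • The finished signal of FileScanner has no success parameter, yet the documentation comment mentions one
  • The resultReady signal of ScannerWorker has an isFinal parameter that does not appear to be used
  • Consider adding more options to the ScanOption enum, such as FollowSymlinks and SkipHidden

Performance

  • ScanResult stores its counts as int, which may overflow for very large directories; qint64 is recommended
  • processedInodes uses a QHash<quint64, QSet<quint64>>, which may consume considerable memory when there are many files

Safety

  • Using std::atomic<bool> for the stopped flag in ScannerWorker is correct
  • processedInodes has no protection mechanism; it is only used from a single thread today, but adding protection is suggested in case of future extension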

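3. filescanner.cpp

Syntax and logic

  • The implementation is correct overall and the logic is clear
  • In FileScannerPrivate::startWorker, a worker thread that is already running is stopped and cleaned up first, which is the right approach
  • In FileScannerPrivate::onWorkerFinished, the logic that disconnects the signals and clears the pointers is correct

Code quality

  • Creating a new ScannerWorker on every call to FileScannerPrivate::startWorker is good practice and avoids state-reset problems
  • ScannerWorker::scanLocalPaths traverses with fts(3), which is efficient
  • ScannerWorker::scanOtherProtocols creates file information through InfoFactory, which is the correct approach
  • The throttling logic in ScannerWorker::emitProgress is reasonable

Performance

  • ScannerWorker::scanLocalPaths deduplicates files with st_nlink > 1, which is a good optimization
  • ScannerWorker::scanLocalPaths skips special system files, which is good practice
  • ScannerWorker::scanOtherProtocols uses QQueue and QSet for traversal and deduplication, which is reasonable
  • The throttling conditions in ScannerWorker::emitProgress are reasonable, but a file-count threshold could be added

Safety

  • FileScannerPrivate::~FileScannerPrivate waits up to 5 seconds for the thread to finish, which may not be enough
  • FileScanner::~FileScanner waits for the thread to finish without a timeout, which could hang the application
  • The check for special system files in ScannerWorker::scanLocalPaths is good practice
  • ScannerWorker::scanOtherProtocols does not handle special files; this may need to be added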

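4. Test code

Syntax and logic

  • The implementation is correct overall and the logic is clear
  • The implementations of MainWindow and StatisticsDialog are reasonable

Code quality

  • MainWindow::onCustomContextMenu offers two statistics modes, which is a good design
  • StatisticsDialog::formatSize formats sizes in a reasonable way
  • StatisticsDialog::onProgressChanged updates the UI in a reasonable way

Performance

  • No obvious performance problems

Safety

  • StatisticsDialog::~StatisticsDialog stops the scanner and deletes it, which is correct
  • StatisticsDialog::onStopClicked stops the scanner, which is correct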

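Suggested Improvements

1. Code quality improvements

  1. Add explicit copy/move constructors and assignment operators to the ScanResult struct
  2. Add a success parameter to the finished signal of FileScanner, or update the documentation comment
  3. Add more options to the ScanOption enum, such as FollowSymlinks and SkipHidden
  4. Add a comment explaining the purpose of OPT_ENABLE_BUILD_TESTS

2. Performance improvements

  1. Change the type of fileCount and directoryCount in the ScanResult struct to qint64
  2. Consider optimizing the data structure behind processedInodes to reduce memory use
  3. Add a file-count threshold in ScannerWorker::emitProgress to reduce how often signals are sent

3. Safety improvements

  1. Increase the thread wait timeout in FileScannerPrivate::~FileScannerPrivate
  2. Add a thread wait timeout to FileScanner::~FileScanner
  3. Handle special files in ScannerWorker::scanOtherProtocols
  4. Consider adding a protection mechanism for processedInodes in preparation for possible future extension

4. Other improvements

  1. Consider adding unit tests that cover more scenarios
  2. Consider special handling for network file systems, such as timeouts and retries
  3. Consider special handling for large files, such as chunked reads and streaming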

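Summary

Overall, the code is of good quality, with a sound design and a correct implementation. The main areas for improvement are:

  1. Improve robustness, especially thread management and error handling
  2. Optimize performance, especially memory use and signal frequency
  3. Extend functionality, for example more scan options and handling of special files
  4. Improve documentation and tests

These improvements can be implemented step by step in later development to raise code quality and improve the user experience.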

@Johnson-zs
Contributor Author

/forcemerge

deepin-bot bot commented Feb 7, 2026

This pr force merged! (status: blocked)

deepin-bot bot merged commit 7e0eb0c into linuxdeepin:master on Feb 7, 2026
21 checks passed