📖 Getting Started: Open Daily and weekly tasks to start work on maintenance tasks.
This repository contains scripts that help monitor code repos for changes that could break documentation builds and help identify when documentation updates are needed.
When sample code from a code repo is referenced in documentation, certain changes in the code repos can break the documentation build:
- File deletion - Referenced files are removed
- File renaming - Referenced file paths change
- Content changes - Named sections or code blocks are modified or removed
Also, when a code file is updated, the document won't reflect the changes until a rebuild of the markdown file is triggered.
The scripts in this repository help prevent problems and identify necessary document updates.
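The three failure modes above line up with the per-file `status` values (`removed`, `renamed`, `modified`) that the GitHub REST API reports for a pull request. The following is a minimal sketch of that mapping, not the logic the repository's scripts actually use; `classify_changes`, the `referenced_paths` input, and the risk labels are hypothetical names chosen for illustration.

```python
def classify_changes(changed_files, referenced_paths):
    """Return (path, risk) pairs for PR files that documentation references.

    changed_files: dicts shaped like entries from GitHub's "list pull request
        files" response (keys: filename, status, previous_filename).
    referenced_paths: set of repo-relative paths the docs point at
        (hypothetical input for this sketch).
    """
    findings = []
    for changed in changed_files:
        status = changed.get("status")
        # For a rename, the docs still reference the old path.
        if status == "renamed":
            path = changed.get("previous_filename")
        else:
            path = changed.get("filename")
        if path not in referenced_paths:
            continue  # not referenced by any article, so no doc impact
        if status == "removed":
            findings.append((path, "deleted"))
        elif status == "renamed":
            findings.append((path, "renamed"))
        elif status == "modified":
            # Named sections may have changed; the article also needs a rebuild.
            findings.append((path, "content changed"))
    return findings
```

Files that are newly added, or that no article references, produce no finding, so the result contains only changes worth a closer look.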
NEW! This repository now includes automation for daily, weekly, and monthly maintenance tasks. The automation is a work in progress and is not currently enabled.
- Daily Merge Docs - Could be used to monitor merged content in the code repos and create doc PRs that update metadata, allowing the articles to rebuild with updated code.
The rest were created by an AI Agent, and still have some issues that keep us from using them:
- Daily PR Monitor - Automatically monitors PRs across 3 code repositories (Azure/azureml-examples, Azure-AI-Foundry/foundry-samples, Azure-Samples/azureai-samples), analyzes them for documentation impact, and auto-approves safe PRs (Mon-Fri 7 AM EST)
- Weekly Snippet Scanner - Scans docs and updates CODEOWNERS files (Mon 6 AM EST)
- Monthly Reports - Generates statistics and health reports (1st of month)
👉 See automation/README.md for setup and usage
The automation could reduce manual effort by ~80% while maintaining documentation quality through automated validation and safety checks.
- Setup and overview
- Create token for authentication
- Daily and weekly tasks
- Fix the problem
- Automation Guide ⭐ NEW
All code repo configurations are centralized in config.yml. This file contains:
- Repository details (owner, repo name, team assignments)
- Search paths for each repository
- File naming patterns and output directories
- Default settings for various scripts
To add or modify repositories, edit the config.yml file.
Note: The docs repo is currently hardcoded as MicrosoftDocs/azure-ai-docs in the scripts.
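To make the shape of a repository entry concrete, here is a minimal sketch of how parsed configuration data could be validated. It assumes the YAML has already been loaded into a Python mapping; the key names (`owner`, `repo`, `team`, `search_paths`), the `RepoConfig` class, and the sample values are assumptions for illustration and may not match the real `config.yml` or `config.py`.

```python
from dataclasses import dataclass, field


@dataclass
class RepoConfig:
    """One monitored code repository (field names are hypothetical)."""
    owner: str
    repo: str
    team: str
    search_paths: list = field(default_factory=list)

    @property
    def full_name(self):
        return f"{self.owner}/{self.repo}"


def parse_repos(raw):
    """Build RepoConfig objects from an already-parsed mapping.

    Raises ValueError when an entry lacks a required key, so that a typo
    in the config surfaces before any script runs against GitHub.
    """
    repos = {}
    for key, entry in raw.items():
        missing = [k for k in ("owner", "repo", "team") if k not in entry]
        if missing:
            raise ValueError(f"{key}: missing {', '.join(missing)}")
        repos[key] = RepoConfig(
            owner=entry["owner"],
            repo=entry["repo"],
            team=entry["team"],
            search_paths=list(entry.get("search_paths", [])),
        )
    return repos


# Sample data; the team name and search path are made up.
sample = {
    "azureml-examples": {
        "owner": "Azure",
        "repo": "azureml-examples",
        "team": "example-team",
        "search_paths": ["articles/example-path"],
    }
}
repos = parse_repos(sample)
```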
Scripts in this repository can be run locally or in a GitHub Codespace.
- Clone the repository:
    git clone <repo-url>
    cd content-maintenance
- Create a virtual environment (recommended):
    python -m venv .venv
    # On Windows:
    .venv\Scripts\activate
    # On macOS/Linux:
    source .venv/bin/activate
- Install dependencies:
    pip install -r requirements.txt
- Set up GitHub authentication:
    export GH_ACCESS_TOKEN=your_github_token_here
  See instructions in Daily and weekly tasks for details.
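A script can fail fast when the token is missing rather than hitting an authentication error partway through a run. The sketch below reads `GH_ACCESS_TOKEN` and builds the standard headers for the GitHub REST API; `github_headers` is a hypothetical helper, and how `gh_auth.py` handles this in practice is not shown here.

```python
import os


def github_headers(env=None):
    """Return request headers for the GitHub REST API.

    env defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    token = env.get("GH_ACCESS_TOKEN", "").strip()
    if not token:
        raise RuntimeError(
            "GH_ACCESS_TOKEN is not set. Export it before running the scripts."
        )
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
```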
find-prs.py - Find PRs requiring team review
- Identifies pull requests across multiple repositories that need review from documentation team members
- Generates a markdown report (pr-review-report-DATE.md) with clickable links
- Usage: python find-prs.py
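As an illustration of the report format, the following sketch builds a dated filename and a list of markdown links. It assumes `DATE` is an ISO date (`YYYY-MM-DD`) and uses a hypothetical PR record with `repo`, `number`, `title`, and `url` keys; the real script may format both differently.

```python
from datetime import date


def report_filename(day=None):
    """Build the report name, assuming DATE is an ISO date (YYYY-MM-DD)."""
    day = day or date.today()
    return f"pr-review-report-{day.isoformat()}.md"


def render_report(prs):
    """Render one clickable markdown link per PR (hypothetical record shape)."""
    lines = ["# PRs requiring review", ""]
    for pr in prs:
        lines.append(f"- [{pr['repo']}#{pr['number']}]({pr['url']}) {pr['title']}")
    return "\n".join(lines) + "\n"


# Sample record; the title is made up for the example.
sample_prs = [
    {
        "repo": "Azure/azureml-examples",
        "number": 91,
        "title": "Example title",
        "url": "https://github.com/Azure/azureml-examples/pull/91",
    }
]
report = render_report(sample_prs)
```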
find-snippets.py - Scan documentation for code references
- Creates the refs-found.csv file used by other scripts
- Generates CODEOWNERS files for each repository
- Usage: python find-snippets.py
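The scan itself can be pictured as a pattern match over each article followed by a CSV write. The sketch below assumes articles reference code with the `:::code ... source="..." id="...":::` syntax; that syntax, the column names, and the sample path are assumptions, and the real `find-snippets.py` may recognize other reference forms and write different columns.

```python
import csv
import io
import re

# Assumed reference syntax:
#   :::code language="python" source="~/repo/path/file.py" id="section_name":::
CODE_BLOCK = re.compile(r":::code\s+(.*?):::", re.DOTALL)
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def find_refs(doc_path, text):
    """Return one row per code reference found in a single article."""
    rows = []
    for match in CODE_BLOCK.finditer(text):
        attrs = dict(ATTRIBUTE.findall(match.group(1)))
        if "source" in attrs:
            rows.append(
                {"doc": doc_path, "source": attrs["source"], "id": attrs.get("id", "")}
            )
    return rows


def to_csv(rows):
    """Serialize rows to CSV text (column names are hypothetical)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["doc", "source", "id"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


article = (
    "Load the data:\n\n"
    ':::code language="python" source="~/azureml-examples-main/sdk/example.py" id="load_data":::\n'
)
rows = find_refs("how-to-example.md", article)
```

Recording the `id` alongside the path matters because a change to a named section can break a reference even when the file itself still exists.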
pr-report.py - Analyze specific PR impact on documentation
- Evaluates whether a specific PR will cause documentation build issues
- Supports repository-specific arguments for targeting different code repos
- Usage:
    python pr-report.py 91 (for azureml-examples)
    python pr-report.py 169 ai (for foundry-samples)
    python pr-report.py 267 ai2 (for azureai-samples)
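The usage lines above imply a small mapping from the optional second argument to a repository: no argument selects azureml-examples, `ai` selects foundry-samples, and `ai2` selects azureai-samples. A minimal sketch of that parsing follows; `parse_args` and `REPO_BY_ARG` are hypothetical names, and the real script may read the mapping from `config.yml` instead.

```python
# Mapping inferred from the usage examples; None means "no second argument".
REPO_BY_ARG = {
    None: "azureml-examples",
    "ai": "foundry-samples",
    "ai2": "azureai-samples",
}


def parse_args(argv):
    """Turn ['169', 'ai'] into (169, 'foundry-samples')."""
    if not argv:
        raise SystemExit("usage: pr-report.py PR_NUMBER [ai|ai2]")
    try:
        pr_number = int(argv[0])
    except ValueError:
        raise SystemExit(f"PR number must be an integer, got {argv[0]!r}")
    repo_arg = argv[1] if len(argv) > 1 else None
    if repo_arg not in REPO_BY_ARG:
        raise SystemExit(f"unknown repository argument {repo_arg!r}")
    return pr_number, REPO_BY_ARG[repo_arg]
```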
merge-report.py - Review recent merged PRs
- Shows PRs merged in the last N days (default: 8) that may require documentation updates
- Usage: python merge-report.py or python merge-report.py 14
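The "last N days" window can be sketched as a cutoff timestamp compared against each PR's `merged_at` value, which the GitHub API returns as a UTC string such as `2024-01-15T12:00:00Z`. The helper names below are hypothetical, and the real script may compute the window differently.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_DAYS = 8  # default stated in the script description


def parse_days(argv):
    """Use the first CLI argument as the day count, else the default."""
    return int(argv[0]) if argv else DEFAULT_DAYS


def merged_since(prs, days, now=None):
    """Keep PRs whose merged_at falls within the last `days` days."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    recent = []
    for pr in prs:
        merged_at = pr.get("merged_at")
        if not merged_at:
            continue  # still open, or closed without merging
        when = datetime.strptime(merged_at, "%Y-%m-%dT%H:%M:%SZ").replace(
            tzinfo=timezone.utc
        )
        if when >= cutoff:
            recent.append(pr)
    return recent
```

A default of 8 days rather than 7 gives a weekly run one day of overlap, so a PR merged just before the previous run is less likely to be missed.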
These files provide functions used by the main scripts:
- config.py - Reads repository configurations from config.yml
- helpers.py - Common functions for snippet processing and file operations
- gh_auth.py - GitHub authentication and API interaction functions
- find_pr_files.py - Functions for analyzing PR file changes and documentation impact
- Initial Setup: Run find-snippets.py to scan documentation and create the reference database
- Regular Monitoring: Use find-prs.py to identify PRs requiring review
- PR Analysis: Use pr-report.py to evaluate specific PRs before approval
- Post-Merge Review: Use merge-report.py to identify documentation that may need updates after PRs are merged
The monitored repositories are defined in config.yml:
- foundry-samples (microsoft-foundry/foundry-samples)
- azureai-samples (Azure-Samples/azureai-samples)
- azureml-examples (Azure/azureml-examples)
Each repository configuration includes team assignments, search paths, and service categorizations for automated processing.