Internal repository with custom GitHub actions for organization

dpaia/infrastructure

GitHub Actions Workflows

This repository contains GitHub Actions workflows that automate issue management, dataset generation, and repository maintenance. Each workflow is described below.

Dataset Generation Process

Generated datasets, including datasets that consist of multiple files, are stored in the "dataset" repository. The generation process is as follows:

  1. Individual dataset items are generated using the "Update Issue dataset data" workflow, which creates a pull request with the generated file to the "dataset" repository.
  2. Multiple dataset items can be generated at once using the "Generate Dataset Data" workflow, which processes issues matching a GitHub search query.
  3. Dataset items are automatically generated by a scheduled task in the "Generate Dataset Data" workflow for issues matching the query "-label:Epic label:Verified".
  4. Created pull requests in the "dataset" repository must be verified by users and merged into the main branch.
  5. The "Export Dataset" workflow collects already generated dataset items into one file and creates a pull request with the resulting dataset to the "ee-dataset" repository.

Available Workflows

1. Update Issue dataset data

File: .github/workflows/update-issue-data.yml

Purpose: Generates a single benchmark dataset item from issue commits and creates a pull request to the "dataset" repository.

Trigger: Manual (workflow_dispatch)

Inputs:

  • generator: Profile to use (default: 'java')
  • organization: GitHub organization name (default: 'dpaia')
  • repository: Repository name
  • issue_id: Issue number
  • auto_merge: Automatically merge data updates (default: false)

Description: This workflow generates one benchmark dataset item at a time. It extracts data from the specified issue, retrieves the commit information for that issue, and generates structured benchmark data. The generated item is submitted to the "dataset" repository as a pull request, which must be verified by users and merged into the main branch.

2. Generate Dataset Data

File: .github/workflows/generate-dataset-data.yml

Purpose: Generates dataset items for multiple issues matching a GitHub search query and creates pull requests to the "dataset" repository.

Trigger: Manual (workflow_dispatch) or Scheduled (daily at 2:00 AM UTC)

Inputs:

  • organization: GitHub organization name (default: 'dpaia')
  • topic: Repositories topic (default: 'Java')
  • generator: Profile to use (default: 'java')
  • search_query: GitHub issue search query (default: '-label:Epic label:Verified')
  • update: Update mode (options: create-new, update-outdated, force-update)
  • auto_merge: Automatically merge data updates (default: false)

Description: This workflow automates the generation of multiple dataset items by processing issues that match a specified search query. It can create new dataset items, update outdated ones, or force update all matching issues. The workflow runs automatically on a daily schedule with the default search query "-label:Epic label:Verified", generating dataset items that must be verified by users and merged into the main branch of the "dataset" repository. This workflow is ideal for batch processing multiple issues at once.

Update Mode Options:

  • create-new: Processes only issues that haven't been marked as "Done" in the project. This option is useful for generating dataset items for newly added issues without modifying existing ones.
  • update-outdated: Checks if the latest commit for an issue matches the commit stored in the project. If they don't match or if the commit field is empty, it processes the issue to update the dataset. This option is useful for updating dataset items when the underlying issue has been updated with new commits.
  • force-update: Processes all issues matching the search query regardless of their current status or commit information. This option is useful when you need to regenerate all dataset items, such as after making changes to the data generation process.
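The three update modes amount to a per-issue decision about whether to regenerate the dataset item. The following is a minimal sketch of that decision, not the workflow's actual code: the `IssueState` type, its field names, and the `should_process` function are hypothetical and only mirror the descriptions above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IssueState:
    """Hypothetical view of an issue's project fields (names are assumptions)."""
    status: Optional[str]         # project status, e.g. "Done"
    stored_commit: Optional[str]  # commit recorded in the project, may be empty
    latest_commit: str            # latest commit found for the issue


def should_process(mode: str, issue: IssueState) -> bool:
    """Return True if the issue should be (re)generated under the given update mode."""
    if mode == "create-new":
        # Only issues not yet marked as "Done" in the project.
        return issue.status != "Done"
    if mode == "update-outdated":
        # Empty commit field, or stored commit differs from the latest one.
        return not issue.stored_commit or issue.stored_commit != issue.latest_commit
    if mode == "force-update":
        # Everything matching the search query, regardless of status or commit.
        return True
    raise ValueError(f"unknown update mode: {mode}")
```

For example, an issue marked "Done" whose stored commit equals its latest commit is skipped by both create-new and update-outdated, and processed only by force-update.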

3. Export Dataset

File: .github/workflows/export-dataset.yml

Purpose: Collects already generated dataset items into one file and creates a pull request with the resulting dataset to the "ee-dataset" repository.

Trigger: Manual (workflow_dispatch)

Inputs:

  • organization: GitHub organization name (default: 'dpaia')
  • search_query: GitHub issue search query (default: '-label:Epic label:Verified')
  • output_file: Export file name (default: "dataset.json")
  • datasets_repository: Repository for exported dataset (default: "dpaia/ee-dataset")
  • create_pull_request: Create a pull request to the result datasets repository (default: false)

Description: This workflow aggregates already generated dataset items from the "dataset" repository into a single dataset file. It searches for issues matching the provided query, reads each issue's data from the "Data" field, and combines the results into one JSON file (output_file). When create_pull_request is enabled, it then opens a pull request with the resulting dataset against the repository given by datasets_repository ("dpaia/ee-dataset" by default). This is the final step in the dataset generation process: it collects the benchmark data produced by the "Update Issue dataset data" and "Generate Dataset Data" workflows into a unified dataset for software engineering benchmarks, analysis, reporting, or machine learning.
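The aggregation step can be pictured as collecting each issue's "Data" value into one JSON document. The sketch below is illustrative only: it assumes the "Data" field holds a JSON string and that the export is a JSON array, neither of which is confirmed here, and `export_dataset` is a hypothetical name.

```python
import json
from typing import Any, Dict, List


def export_dataset(issues: List[Dict[str, Any]]) -> str:
    """Combine per-issue "Data" values into a single JSON array (assumed layout)."""
    items = []
    for issue in issues:
        raw = issue.get("Data")
        if not raw:
            # Issues without generated data are skipped (an assumption).
            continue
        items.append(json.loads(raw))
    return json.dumps(items, indent=2)
```

The returned string would then be written to the file named by output_file ("dataset.json" by default).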

4. Sync Labels to Repositories

File: .github/workflows/sync-labels.yml

Purpose: Synchronizes GitHub issue labels across multiple repositories.

Trigger: Manual (workflow_dispatch)

Inputs:

  • profiles: Comma-separated list of label profiles (e.g., common,spring)
  • topics: Repository topics to filter by
  • repositories: Optional specific repositories to target

Description: This workflow helps maintain consistent issue labels across multiple repositories in the organization. It can target repositories based on topics or a specific list, and apply different label profiles to them.

5. Add Issues to Project

File: .github/workflows/add-issues-to-project.yml

Purpose: Automatically adds GitHub issues to a specified project board.

Trigger: Manual (workflow_dispatch) or Scheduled (daily at midnight UTC)

Inputs:

  • organization: GitHub organization name (default: 'dpaia')
  • project_number: Project number (default: '2')
  • search_query: GitHub issue search query (default: '-label:Epic is:issue')

Description: This workflow automates the process of adding GitHub issues to a project board. It searches for issues matching the specified query and adds them to the designated project. The workflow consists of three main jobs: finding issues that match the search criteria, retrieving the project data, and adding each matching issue to the project.

6. Share Custom Workflows

File: .github/workflows/share-custom-workflows.yml

Purpose: Shares GitHub workflow files between repositories by creating pull requests.

Trigger: Manual (workflow_dispatch)

Inputs:

  • organization: GitHub organization name (default: 'dpaia')
  • topic: Repository topic filter (default: empty, which means all repositories)
  • workflow_path: Path to GitHub workflow file to share (relative to shared/)

Description: This workflow automates the process of sharing GitHub workflow files across multiple repositories within an organization. It finds all repositories matching the specified organization and topic criteria, then creates pull requests to add the specified workflow file to each repository. The workflow consists of three main jobs: finding repositories that match the criteria, creating pull requests for each repository, and summarizing the results.

The workflow handles various scenarios gracefully:

  • If the workflow file already exists in a target repository, it skips creating a pull request
  • If a pull request already exists for the workflow file, it uses the existing PR URL
  • If there's an error creating a pull request, it captures the error and continues with other repositories

After completion, the workflow generates a summary report that categorizes the results into "Pull Requests Created", "Repositories with No Changes Needed", and "Failed Pull Requests", making it easy to see the outcome at a glance. This workflow is particularly useful for maintaining consistent CI/CD processes across multiple repositories in an organization.
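The summary report can be thought of as grouping per-repository outcomes under the three headings named above. This is a minimal sketch under stated assumptions: the outcome keys ("created", "existing", "unchanged", "failed") and the `summarize` function are hypothetical, and listing an already existing pull request under "Pull Requests Created" is a guess rather than documented behavior.

```python
from typing import Dict, List, Tuple

# Hypothetical outcome keys mapped to the report headings from the description.
HEADINGS = {
    "created": "Pull Requests Created",
    "existing": "Pull Requests Created",  # assumption: existing PR URL is reported here
    "unchanged": "Repositories with No Changes Needed",
    "failed": "Failed Pull Requests",
}


def summarize(results: List[Tuple[str, str, str]]) -> Dict[str, List[str]]:
    """Group (repository, outcome, detail) tuples under the report headings."""
    summary: Dict[str, List[str]] = {h: [] for h in dict.fromkeys(HEADINGS.values())}
    for repo, outcome, detail in results:
        # detail is a PR URL or an error message; it may be empty.
        entry = f"{repo}: {detail}" if detail else repo
        summary[HEADINGS[outcome]].append(entry)
    return summary
```

Because a failure is recorded as just another outcome, one failing repository does not stop the remaining repositories from being processed.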

7. Shared Collect and Process Tests

File: .github/workflows/shared-collect-process-tests.yml

Purpose: Collects and processes test information from issues to identify tests that should change from FAIL to PASS and tests that should remain PASS.

Trigger: Reusable workflow (workflow_call)

Inputs:

  • issue-number: Optional issue number to extract test names from (default: '')

Secrets:

  • github-token: Required GitHub token for API access

Outputs:

  • fail_to_pass: Comma-separated list of tests that should change from FAIL to PASS
  • pass_to_pass: Comma-separated list of tests that should remain PASS
  • tests: Comma-separated list of all tests to run
  • comment_id: ID of the comment where FAIL_TO_PASS or PASS_TO_PASS was manually described

Description: This reusable workflow collects test names from issues and processes them to identify tests that should change from FAIL to PASS and tests that should remain PASS. It consists of several jobs: collecting issue numbers based on event type, extracting test names from issues, combining test results from all issues, and checking if FAIL_TO_PASS or PASS_TO_PASS were found. This workflow is designed to be called by other workflows, such as the Shared Run Tests Maven workflow.
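The "combining test results from all issues" job can be sketched as merging per-issue lists into the comma-separated outputs listed above. The code below is an illustration, not the workflow's implementation: `combine_tests` is a hypothetical name, and treating `tests` as the de-duplicated union of the two lists is an assumption.

```python
from typing import Dict, List


def combine_tests(per_issue: List[Dict[str, List[str]]]) -> Dict[str, str]:
    """Merge per-issue test lists into comma-separated output strings."""

    def merge(key: str) -> List[str]:
        # De-duplicate while preserving first-seen order.
        names: List[str] = []
        for result in per_issue:
            names.extend(result.get(key, []))
        return list(dict.fromkeys(names))

    fail_to_pass = merge("fail_to_pass")
    pass_to_pass = merge("pass_to_pass")
    # Assumption: "tests" is every test from both lists, without duplicates.
    tests = list(dict.fromkeys(fail_to_pass + pass_to_pass))
    return {
        "fail_to_pass": ",".join(fail_to_pass),
        "pass_to_pass": ",".join(pass_to_pass),
        "tests": ",".join(tests),
    }
```

A caller such as the Shared Run Tests Maven workflow would then pass the `tests` value to the test runner.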

8. Shared Run Tests Maven

File: .github/workflows/shared-run-tests-maven.yml

Purpose: Runs Maven tests for a project, focusing on tests that should change from FAIL to PASS and tests that should remain PASS.

Trigger: Reusable workflow (workflow_call)

Inputs:

  • java-version: Java version to set up (default: '24')
  • distribution: Java distribution to use (default: 'temurin')
  • pom-file: Path to the pom.xml file (default: 'pom.xml')
  • issue-number: Issue number to extract test names from (default: '')

Secrets:

  • github-token: Required GitHub token for API access

Description: This reusable workflow runs Maven tests for a project, focusing on tests that should change from FAIL to PASS and tests that should remain PASS. It uses the Shared Collect and Process Tests workflow to collect and process tests, creates a placeholder comment on the issue, sets up Java/Maven and runs the tests, and updates the issue comment with the final test status. This workflow is designed to be called by other workflows that need to run Maven tests.

9. Shared Maven Workflow

File: shared/.github/workflows/maven.yml

Purpose: Standalone workflow, meant to be copied into other repositories, that runs Maven tests for a project, focusing on tests that should change from FAIL to PASS and tests that should remain PASS.

Trigger:

  • Push to branches: main, scenario/*, eval/*, feature/*
  • Pull requests to branches: main, scenario/*, eval/*, feature/*
  • Issue comments (created)

Description: This shared workflow is designed to be distributed to other repositories using the "Share Custom Workflows" workflow. It runs Maven tests for a project, focusing on tests that should change from FAIL to PASS and tests that should remain PASS. The workflow collects and processes test information from issues, creates a placeholder comment on the issue, sets up Java/Maven and runs the tests, and updates the issue comment with the final test status. Unlike the "Shared Run Tests Maven" workflow, this is a standalone workflow rather than a reusable workflow, making it easier to share across repositories.
