feat: Added comprehensive unit testing and github action to run tests on new pull requests #8

Open
dklawren wants to merge 13 commits into main from unit-tests

Conversation

@dklawren
Contributor

No description provided.


Copilot AI left a comment


Pull request overview

Adds a comprehensive pytest-based test suite and GitHub Actions workflow to automatically run linting, unit tests, and integration tests for the GitHub ETL pipeline.

Changes:

  • Added a large pytest test suite covering main.py functions (extraction, transformation, loading, orchestration).
  • Added pytest/coverage configuration and a testing guide documenting local + CI workflows.
  • Added a GitHub Actions workflow to run linting/tests on pull requests; expanded dependencies to include lint/format tools.

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 8 comments.

| File | Description |
| --- | --- |
| test_main.py | New comprehensive unit/integration test suite for main.py. |
| requirements.txt | Adds dev tooling dependencies (black/flake8/mypy/isort) alongside existing test deps. |
| pytest.ini | Configures pytest discovery, verbosity, and coverage reporting. |
| TESTING.md | Documents how to run tests/linting locally and in CI, plus docker-based integration testing. |
| .github/workflows/tests.yml | Adds CI workflow for linting, pytest runs (unit + all), coverage artifacts, and docker-compose integration job. |



Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 1 comment.




Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 6 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

main.py:536

  • GITHUB_REPOS is split on commas but entries aren’t stripped/validated. Values like "owner/repo, owner/repo" (note the space) or a trailing comma will produce repo strings with leading whitespace or empty entries, which will break API URLs. Consider stripping whitespace and filtering out empty repo names before iterating.
    github_repos = []
    github_repos_str = os.getenv("GITHUB_REPOS")
    if github_repos_str:
        github_repos = github_repos_str.split(",")
    else:
        raise SystemExit(
            "Environment variable GITHUB_REPOS is required (format: 'owner/repo,owner/repo')"
        )
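A sketch of the suggested fix; the helper name and its placement are illustrative, not from the PR:

```python
import os


def parse_github_repos(env_value):
    """Split a comma-separated GITHUB_REPOS value, stripping whitespace
    and dropping empty entries such as those left by a trailing comma."""
    if not env_value:
        raise SystemExit(
            "Environment variable GITHUB_REPOS is required (format: 'owner/repo,owner/repo')"
        )
    repos = [repo.strip() for repo in env_value.split(",") if repo.strip()]
    if not repos:
        raise SystemExit("GITHUB_REPOS contained no repository names")
    return repos


# At startup this would replace the bare split() shown above:
# github_repos = parse_github_repos(os.getenv("GITHUB_REPOS"))
```

With this, a value like `"owner/repo, other/repo,"` parses to `["owner/repo", "other/repo"]` instead of producing a whitespace-prefixed name and an empty entry.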


Member

@cgsheeh cgsheeh left a comment


I haven't looked over the tests in full yet, but wanted to leave comments about the higher-level items I noticed off the bat.

@dklawren dklawren requested a review from cgsheeh January 23, 2026 23:22
Member

@cgsheeh cgsheeh left a comment


There is a lot of content in this PR, it might take a few iterations to review it fully.

We could land some of this code faster if we split it into smaller chunks. For example, we could add linting support and tests in one PR, and deal with the tests themselves in another.

@dklawren
Contributor Author

dklawren commented Feb 2, 2026

> There is a lot of content in this PR, it might take a few iterations to review it fully.
>
> We could land some of this code faster if we split it into smaller chunks. For example, we could add linting support and tests in one PR, and deal with the tests themselves in another.

Good idea, and sorry for putting so much in one PR. I created #9 just now, which only includes the GitHub Action and formatting parts of this PR; we can land it first and then add the unit tests in a separate PR.

dklawren added a commit that referenced this pull request Feb 2, 2026
- Broke up all of the tests into individual files based on function to
  make for easier review.
@dklawren dklawren requested review from cgsheeh and Copilot February 6, 2026 21:50

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 17 changed files in this pull request and generated 3 comments.



Member

@cgsheeh cgsheeh left a comment


This PR has the right intent, but it has execution issues that come down to common pitfalls when using AI tools.

The main issue is tests which test nothing. Many of the tests in this PR follow this pattern:

 mock_session.get.return_value.json.return_value = [{"id": 456}]
 result = main.extract_comments(mock_session, ...)
 assert result[0]["id"] == 456  # just reading back our own mock value

Since the method we're testing (extract_comments in this example) simply returns the value from the mock, the assertion will never catch a regression in our code. We need to ensure that mocking the method causes the function to go through behaviour we wish to test, and that we assert an appropriate value such that we would catch a regression.
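To make the point concrete, here is a minimal sketch of a test that would catch a regression. The `extract_comments` below is a hypothetical stand-in, since the real function's signature isn't shown in this thread:

```python
from unittest.mock import Mock


def extract_comments(session, repo):
    # Hypothetical stand-in for main.extract_comments: builds the API URL,
    # checks the HTTP status, and returns the decoded JSON body.
    response = session.get(f"https://api.github.com/repos/{repo}/issues/comments")
    response.raise_for_status()
    return response.json()


def test_extract_comments_builds_url_and_checks_status():
    mock_session = Mock()
    mock_session.get.return_value.json.return_value = [{"id": 456}]

    result = extract_comments(mock_session, "owner/repo")

    # Assert on behaviour our code is responsible for, not just the mock echo:
    mock_session.get.assert_called_once_with(
        "https://api.github.com/repos/owner/repo/issues/comments"
    )
    mock_session.get.return_value.raise_for_status.assert_called_once()
    assert result == [{"id": 456}]
```

A typo in the URL or a dropped `raise_for_status()` call now fails the test, whereas the echo-only assertion would still pass.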

There are also references to things which don't exist, overly verbose generated docs, dead code, etc.

Some suggestions I would make for the future:

  • Use Claude/Codex in the terminal instead of the GitHub Copilot integration; local terminal-based agents provide much better results.
  • Go slowly with the AI agent, ask it to "make as few changes as possible", and verify changes it makes.
  • Ask the agent to do a code review before submitting. In this case you could ask it to verify if there were missing test coverage of important functionality, if the tests could be reduced or simplified in any way, or if the tests were correct.

README.md Outdated

The project includes a comprehensive test suite using pytest. Tests are organized in the `test/` directory and include both unit and integration tests.

#### Setting Up the Development Environment
Member


The Dockerfile should do all of this for us. We should update these docs to demonstrate how to set up the environment and run the tests using Docker.

@dklawren
Contributor Author

> This PR has the right intent, but it has execution issues that come down to common pitfalls when using AI tools.
>
> The main issue is tests which test nothing. Many of the tests in this PR follow this pattern:
>
>     mock_session.get.return_value.json.return_value = [{"id": 456}]
>     result = main.extract_comments(mock_session, ...)
>     assert result[0]["id"] == 456  # just reading back our own mock value
>
> Since the method we're testing (extract_comments in this example) simply returns the value from the mock, the assertion will never catch a regression in our code. We need to ensure that mocking the method causes the function to go through behaviour we wish to test, and that we assert an appropriate value such that we would catch a regression.
>
> There are also references to things which don't exist, overly verbose generated docs, dead code, etc.
>
> Some suggestions I would make for the future:
>
> * Use Claude/Codex in the terminal instead of the GitHub Copilot integration; local terminal-based agents provide much better results.
> * Go slowly with the AI agent, ask it to "make as few changes as possible", and verify changes it makes.
> * Ask the agent to do a code review before submitting. In this case you could ask it to verify if there were missing test coverage of important functionality, if the tests could be reduced or simplified in any way, or if the tests were correct.

Thanks for taking the time for this review. I am still trying to get my head around using AI to help with some of the more mundane tasks such as generating tests. In particular, I need to learn more about how much to depend on the tool and when not to go too far with it. I will go through the tests more thoroughly and also address your change suggestions. For full disclosure, this was all done (as with my other pull requests) using our Mozilla account with Claude Code. I only use GitHub Copilot for first-round reviews to catch more of my glaring mistakes before bringing you in as a reviewer. Thanks again, and I will get this fixed up in the next day or so.

dkl


Copilot AI left a comment


Pull request overview

Copilot reviewed 14 out of 15 changed files in this pull request and generated 2 comments.



Comment on lines +176 to +181
Run with coverage:

```bash
pytest --cov=. --cov-report=html
```


Copilot AI Mar 9, 2026


README suggests running pytest --cov=. --cov-report=html, but pytest-cov is not listed in the dev dependencies. Either add pytest-cov to [project.optional-dependencies].dev or adjust the documentation to a command that works with the declared dependencies.

Suggested change
Run with coverage:
```bash
pytest --cov=. --cov-report=html
```

        run: |
          docker compose up --build --abort-on-container-exit --exit-code-from github-etl
      - name: Cleanup
        run: docker compose down -v

Copilot AI Mar 9, 2026


The cleanup step won’t run if docker compose up fails, because subsequent steps are skipped on failure by default. Mark the cleanup step with if: always() (and optionally add docker compose down -v --remove-orphans) so containers/volumes are reliably cleaned up even on failing integration tests.

Suggested change
-        run: docker compose down -v
+        if: always()
+        run: docker compose down -v --remove-orphans
