@maykathm
Contributor

@maykathm maykathm commented Feb 17, 2025

This introduces the ability to collect and process logs during spread runs and extract features of interest.

It requires a version of spread that supports both project-level artifacts and a switch to use gzip instead of xz, so that the data can be downloaded from core.

The following tests are incompatible with feature tagging; the reasons are explained in the Jira ticket:

  • tests/core/persistent-journal
  • tests/core/persistent-journal-namespace
  • tests/nested/manual/snapd-refresh-from-old
  • tests/nested/manual/preseed
  • tests/nested/manual/core20-install-mode-shutdown-via-hook

Currently, the only "feature" available is a fake feature called "all" that matches the full content of every line. It will be removed in favor of real features. Note that it currently yields nothing, since the log entries are not in JSON format.

As an example, here's the command to get the fake feature for the tests/main/ack test:

SPREAD_TAG_FEATURES="all" spread -artifacts feature-artifacts google:tests/main/ack

After running the command, one would then find in the specified feature-artifacts folder:

$ tree feature-artifacts
feature-artifacts/
└── feature-tags
    ├── google:ubuntu-14.04-64:tests_main_ack
    ├── google:ubuntu-16.04-64:tests_main_ack
    ├── google:ubuntu-18.04-64:tests_main_ack
    ├── google:ubuntu-20.04-64:tests_main_ack
    ├── google:ubuntu-secboot-20.04-64:tests_main_ack
    ├── google:ubuntu-22.04-64:tests_main_ack
    ├── google:ubuntu-24.04-64:tests_main_ack
    ├── google:ubuntu-24.10-64:tests_main_ack
    └── google:ubuntu-25.04-64:tests_main_ack
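
The file names under feature-tags appear to encode the backend, the system, and the test name, separated by colons. A minimal Python sketch for splitting such a name, assuming that layout holds; the helper `parse_artifact_name` and the `ArtifactName` type are hypothetical and not part of this PR:

```python
from typing import NamedTuple


class ArtifactName(NamedTuple):
    """Hypothetical container for the parts of a feature-tags file name."""
    backend: str
    system: str
    test: str


def parse_artifact_name(name: str) -> ArtifactName:
    # Split at most twice so that anything after the second colon
    # (for example a variant suffix) stays inside the test part.
    backend, system, test = name.split(":", 2)
    return ArtifactName(backend, system, test)
```

For example, `parse_artifact_name("google:ubuntu-24.04-64:tests_main_ack")` yields the backend `google`, the system `ubuntu-24.04-64`, and the test `tests_main_ack`.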

@github-actions

github-actions bot commented Feb 17, 2025

Thu Mar 27 14:32:18 UTC 2025
The following results are from: https://github.com/canonical/snapd/actions/runs/14101686381

Failures:

Preparing:

  • openstack:debian-sid-64
  • openstack:debian-sid-64
  • openstack:debian-sid-64
  • openstack:debian-sid-64
  • openstack:debian-sid-64
  • openstack:debian-sid-64
  • google-nested:ubuntu-24.04-64:tests/nested/manual/uc20-install-in-initrd:secureboot
  • google-nested:ubuntu-24.04-64:tests/nested/manual/uc20-install-in-initrd:hook
  • google-nested:ubuntu-24.04-64:tests/nested/manual/core20-fault-inject-on-install-component:kernel_panic_prepare_kernel_components
  • google-nested:ubuntu-24.04-64:tests/nested/manual/kernel-modules-components:plain
  • google-nested:ubuntu-24.04-64:tests/nested/manual/uc20-install-in-initrd:none
  • google-nested:ubuntu-24.04-64:tests/nested/manual/build-with-kernel-modules-components:plain
  • google-nested:ubuntu-24.04-64:tests/nested/manual/build-with-kernel-modules-components:encrypted
  • google-nested:ubuntu-24.04-64:tests/nested/manual/uc20-install-in-initrd:both
  • google-nested:ubuntu-24.04-64:tests/nested/manual/remodel-to-installed-kernel
  • google-nested:ubuntu-24.04-64:tests/nested/manual/kernel-modules-components:encrypted
  • google-nested:ubuntu-24.04-64:tests/nested/manual/update-snapd-seed-and-factory-reset:tpm
  • google-nested:ubuntu-24.04-64:tests/nested/manual/core-factory-reset-new-secboot:hook
  • google-nested:ubuntu-24.04-64:tests/nested/manual/core-factory-reset-new-secboot:tpm
  • google-nested:ubuntu-24.04-64:tests/nested/manual/hybrid-fde-dbx
  • google:ubuntu-25.04-64:tests/main/snap-ns-forward-compat

Executing:

  • google-distro-1:amazon-linux-2023-64:tests/main/snap-confine-undesired-mode-group
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-real:encrypted
  • google-nested:ubuntu-24.04-64:tests/nested/manual/split-refresh
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-real:seeded
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-real:plain
  • google-nested:ubuntu-24.04-64:tests/nested/manual/component-recovery-system-offline
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-real:partial
  • google-nested:ubuntu-24.04-64:tests/nested/manual/component-recovery-system
  • google-nested:ubuntu-24.04-64:tests/nested/manual/remodel-with-components
  • google-nested:ubuntu-24.04-64:tests/nested/manual/remodel-with-components-offline
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-core:passphrase_auth
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-core:install_optional_all
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-core:install_optional_snap
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-core:plain
  • google-nested:ubuntu-24.04-64:tests/nested/manual/muinstaller-core:install_optional_snap_and_comp
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-pre-download:close
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-pre-download:close_mid_restart
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-pre-download:restart
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh:regular
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-gating
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-pre-download:ignore
  • openstack:opensuse-tumbleweed-64:tests/main/snap-refresh-hold
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh:parallel
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-backoff
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-gating-from-snap
  • openstack:opensuse-tumbleweed-64:tests/main/auto-refresh-retry
  • openstack:opensuse-tumbleweed-64:tests/main/refresh-app-awareness-notify
  • google:ubuntu-25.04-64:tests/main/microk8s-smoke:edge

Restoring:

  • openstack:opensuse-tumbleweed-64:tests/main/refresh-app-awareness-notify

@codecov

codecov bot commented Feb 17, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Please upload report for BASE (master@1621103). Learn more about missing BASE report.
Report is 449 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff            @@
##             master   #15091   +/-   ##
=========================================
  Coverage          ?        0           
=========================================
  Files             ?        0           
  Lines             ?        0           
  Branches          ?        0           
=========================================
  Hits              ?        0           
  Misses            ?        0           
  Partials          ?        0           


@maykathm maykathm requested a review from bboozzoo February 17, 2025 18:11
@maykathm maykathm force-pushed the SNAPDENG-34442-collect-logs-from-tasks branch from 2b75f48 to 218f9ec Compare February 18, 2025 10:18
@maykathm maykathm marked this pull request as ready for review March 10, 2025 14:18
@maykathm maykathm requested a review from andrewphelpsj March 10, 2025 14:35
if [ -n "$TAG_FEATURES" ]; then
# If feature tagging is enabled, then we need to enable debug logging
remote.exec "sudo mkdir -p /etc/systemd/system/snapd.service.d"
remote.exec "echo -e '[Service]\nEnvironment=SNAPD_DEBUG_HTTP=7 SNAPD_DEBUG=1 SNAPPY_TESTING=1' | sudo tee /etc/systemd/system/snapd.service.d/local.conf"
Contributor

Suggested change
remote.exec "echo -e '[Service]\nEnvironment=SNAPD_DEBUG_HTTP=7 SNAPD_DEBUG=1 SNAPPY_TESTING=1' | sudo tee /etc/systemd/system/snapd.service.d/local.conf"
remote.exec "printf '[Service]\nEnvironment=SNAPD_DEBUG_HTTP=7 SNAPD_DEBUG=1 SNAPPY_TESTING=1\n' | sudo tee /etc/systemd/system/snapd.service.d/99-feature-tags.conf"

aren't nested tests already doing this?

Contributor Author

I updated with your suggestion.

As to your question, at least some nested tests do not seem to have debug logging enabled; without this addition, they did not produce debug logs. If that is an error and nested tests should have it enabled, I can move the debug logging setup outside the feature tagging check.

@maykathm maykathm force-pushed the SNAPDENG-34442-collect-logs-from-tasks branch from 5bf824f to 975119e Compare March 14, 2025 15:10
@maykathm maykathm requested a review from bboozzoo March 18, 2025 07:57
Member

@andrewphelpsj andrewphelpsj left a comment

Thank you! Talked with Katie about some potential simplifications to journal-analyzer.py that leverage our expectation of structured logs from snapd, rather than using regular expressions to extract them.
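
As a rough illustration of relying on structured logs rather than regular expressions, the sketch below decodes one log line as JSON and returns nothing for lines that are not JSON objects. The function name `parse_log_line` is hypothetical, and the real journal-analyzer.py may handle malformed lines differently. The annotations assume Python 3.9 or later.

```python
import json
from typing import Any, Optional


def parse_log_line(line: str) -> Optional[dict[str, Any]]:
    """Return the decoded entry, or None when the line is not a JSON object."""
    try:
        entry = json.loads(line)
    except json.JSONDecodeError:
        # Plain-text lines are skipped instead of being matched with a regex.
        return None
    # Valid JSON that is not an object (a bare number, a list) is also skipped.
    return entry if isinstance(entry, dict) else None
```

With this shape, the caller only ever sees dictionaries, so later checks can use ordinary key lookups.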

@maykathm maykathm force-pushed the SNAPDENG-34442-collect-logs-from-tasks branch from d394931 to 7f6d860 Compare March 19, 2025 17:39
@maykathm maykathm requested a review from andrewphelpsj March 19, 2025 17:41
Member

@andrewphelpsj andrewphelpsj left a comment

Thanks!

spread.yaml Outdated
"$TESTSLIB"/prepare-restore.sh --prepare-suite
prepare-each: |
"$TESTSLIB"/prepare-restore.sh --prepare-suite-each
"$TESTSLIB"/analyze-features.sh --before-non-nested-task
Collaborator

I think this could be the last step of the prepare-suite-each

Collaborator

Still not sure if we should tag/find the whole test execution, including prepare and restore

Contributor Author

Initially we said that it's ok to tag all three steps. Probably eventually we will want to only tag execution.

@maykathm maykathm force-pushed the SNAPDENG-34442-collect-logs-from-tasks branch from b5d6380 to 835c4bb Compare March 20, 2025 10:21
@maykathm maykathm force-pushed the SNAPDENG-34442-collect-logs-from-tasks branch from 835c4bb to b87ce8e Compare March 20, 2025 10:22
Contributor

@bboozzoo bboozzoo left a comment

Some minor things, but overall looks ok

try:
line_json = json.loads(line)
for feature_class in feature_classes:
feature_class.maybe_add_feature(feature_dict, line_json, state_json)
Contributor

I guess later on new classes will add features depending on the contents of the log entry?

Contributor Author

Yes exactly. They would check keys and values and see if the log entry is relevant to them. If it's relevant, then they would grab the information they need and, if necessary, query the state.json for any extra info.
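
The flow described above could look roughly like the following. This is only a sketch: the class `CommandFeature`, the log keys `cmd` and `change-id`, and the `changes`/`kind` layout assumed for state.json are all invented for illustration and are not taken from snapd or from this PR.

```python
class CommandFeature:
    """Hypothetical feature class; all key names here are assumptions."""

    name = "cmd"

    @staticmethod
    def maybe_add_feature(feature_dict, json_entry, state_json):
        # Relevance check: ignore entries without the key this feature needs.
        if "cmd" not in json_entry:
            return
        feature = {"cmd": json_entry["cmd"]}
        # Optional enrichment: look up extra details in the state data.
        change_id = json_entry.get("change-id")
        if change_id is not None:
            change = state_json.get("changes", {}).get(change_id)
            if change is not None:
                feature["change-kind"] = change.get("kind")
        # Collect every matching entry under this feature's name.
        feature_dict.setdefault(CommandFeature.name, []).append(feature)
```

Each feature class stays independent of the others: the loop over `feature_classes` hands every decoded entry to each class, and a class that finds nothing relevant simply returns.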


restore_suite_each() {
if not tests.nested is-nested; then
"$TESTSLIB"/collect-features.sh --after-non-nested-task
Collaborator

we should do this just when SPREAD_TAG_FEATURES is set

Contributor Author

The collect-features.sh script checks if TAG_FEATURES is set before actually calling the function. I thought that was easier than checking in all the places the script is called. If you think the check should be outside, I can move it outside.

Collaborator

@sergiocazzolato sergiocazzolato left a comment

Thanks

Contributor

@bboozzoo bboozzoo left a comment

LGTM


@staticmethod
def maybe_add_feature(feature_dict, json_entry, state_json):
def maybe_add_feature(feature_dict: dict, json_entry: dict, state_json: dict):
Contributor

you'll need to bring in typing.Any

Suggested change
def maybe_add_feature(feature_dict: dict, json_entry: dict, state_json: dict):
def maybe_add_feature(feature_dict: dict[str, list[Any]], json_entry: dict[str, Any], state_json: dict[str, Any]):
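
Subscripting the built-in `dict` and `list` in annotations, as in the suggestion, works on Python 3.9 and later. If older interpreters had to be supported, one option would be the `typing` aliases, optionally with named type aliases to keep the signature short. The alias names `FeatureDict` and `JsonObject` and the placeholder body below are hypothetical:

```python
from typing import Any, Dict, List

# Hypothetical aliases; they mean the same as dict[str, list[Any]] and dict[str, Any].
FeatureDict = Dict[str, List[Any]]
JsonObject = Dict[str, Any]


def maybe_add_feature(feature_dict: FeatureDict,
                      json_entry: JsonObject,
                      state_json: JsonObject) -> None:
    # Placeholder body for illustration: record every entry under "all".
    feature_dict.setdefault("all", []).append(json_entry)
```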

parser.add_argument('-o', '--output', help='Output file', required=True)
parser.add_argument(
'-f', '--features', help='Features to extract from journal in a comma-separated list {all}', required=True)
'-f', '--features', help='Features to extract from journal {all}', nargs='+')
Contributor

since we're collecting them now, maybe --feature ?

Suggested change
'-f', '--features', help='Features to extract from journal {all}', nargs='+')
'-f', '--features', help='Features to extract from journal {all}, can be repeated multiple times', nargs='+')
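
One detail worth keeping in mind with `argparse`: `nargs='+'` collects several values after a single `-f`, whereas repeating the flag only accumulates values with `action='append'`. The sketch below contrasts the two; `build_parser` and its `repeatable` switch are hypothetical and only serve the comparison, and the option names in the merged script may differ.

```python
import argparse


def build_parser(repeatable: bool) -> argparse.ArgumentParser:
    """Build a parser using one of the two ways to accept several features."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-o', '--output', help='Output file', required=True)
    if repeatable:
        # action='append' adds one value per occurrence of -f.
        parser.add_argument('-f', '--feature', dest='features', action='append',
                            help='Feature to extract from journal {all}, can be repeated')
    else:
        # nargs='+' takes one or more values after a single -f.
        parser.add_argument('-f', '--features', nargs='+',
                            help='Features to extract from journal {all}')
    return parser
```

With `nargs='+'`, passing `-f a -f b` keeps only the values from the last `-f`, because the default store action overwrites the earlier list. With `action='append'`, the same command line yields both values.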

@ernestl ernestl merged commit 711f96e into canonical:master Mar 27, 2025
70 of 75 checks passed
@maykathm maykathm deleted the SNAPDENG-34442-collect-logs-from-tasks branch March 27, 2025 15:32
maykathm added a commit to maykathm/snapd that referenced this pull request Mar 28, 2025
* tests: add feature analyzer using spread

* tests: removed unnecessary VM removal

* tests: put back xz-utils removal given the switch to gzip sergiocazzolato/spread#6

* tests: spelling correction and small optimizations

* tests: changed forward slash replacement to backslash

* tests: address review comments and add optional cursor input to journal-analyzer

* tests: use json instead of regex for feature extraction

* tests: minor corrections and use cursor instead of timestamp

* tests: only copy journal and state.json during restore

* tests: grep for keyword in logs and minor fixes to python

* tests: expand dict type hints and make argument name and help more precise
maykathm added a commit to maykathm/snapd that referenced this pull request Apr 8, 2025
5 participants