
label/metric name validation: use prometheus legacy validation scheme, avoid global variable #11629

Closed
juliusmh wants to merge 4 commits into main from juliusmh/legacy_name_validation

Conversation

@juliusmh
Contributor

@juliusmh juliusmh commented Jun 4, 2025

What this PR does

Mimir does not support Prometheus' UTF8Validation name validation scheme. The validation helpers
in the Prometheus common/model package rely on a global variable, NameValidationScheme, which
defaults to UTF8Validation.

This PR replaces direct calls to Prometheus' LabelName.IsValid and IsValidMetricName with
a validation wrapper that always uses the LegacyValidation scheme and does not rely on the global
variable.
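A minimal sketch of what such a wrapper could look like, assuming the legacy rules are "label names match `[a-zA-Z_][a-zA-Z0-9_]*`, and metric names additionally allow `:`". The helper names `isLegacyNameByte` and `isLegacyName` are made up for illustration, and the bodies are not the code in this PR. The point is only that the check reads no package-level state.

```go
package main

import "fmt"

// isLegacyNameByte reports whether b may appear at index i of a legacy name.
// allowColon is true for metric names, which may also contain ':'.
// (Hypothetical helper, not part of Prometheus or Mimir.)
func isLegacyNameByte(b byte, i int, allowColon bool) bool {
	switch {
	case b >= 'a' && b <= 'z', b >= 'A' && b <= 'Z', b == '_':
		return true
	case b >= '0' && b <= '9':
		return i > 0 // digits are not allowed in the first position
	case b == ':':
		return allowColon
	}
	return false
}

// isLegacyName applies the legacy rules to the whole string.
func isLegacyName(s string, allowColon bool) bool {
	if s == "" {
		return false
	}
	for i := 0; i < len(s); i++ {
		if !isLegacyNameByte(s[i], i, allowColon) {
			return false
		}
	}
	return true
}

// IsValidLabelName always applies the legacy label name rules.
func IsValidLabelName(name string) bool { return isLegacyName(name, false) }

// IsValidMetricName always applies the legacy metric name rules.
func IsValidMetricName(name string) bool { return isLegacyName(name, true) }

func main() {
	for _, name := range []string{"job", "(export)", "job:up:sum", "1abc"} {
		fmt.Printf("%q label=%v metric=%v\n",
			name, IsValidLabelName(name), IsValidMetricName(name))
	}
}
```

Because the functions take no scheme argument and consult no global, callers get the same answer in unit tests and in a running binary.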

Where I need feedback:

Many of Prometheus' internal functions still rely on the NameValidationScheme variable. Mimir
uses those functions in the distributor, mimirtool, querier and ruler. Hence, I'm not sure whether
removing the global overrides (see below) introduces side effects.

func init() {
    model.NameValidationScheme = model.LegacyValidation
}

Which issue(s) this PR fixes or relates to

Fixes #11503

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@juliusmh juliusmh force-pushed the juliusmh/legacy_name_validation branch from 303af79 to b62fd81 Compare June 4, 2025 13:34
@CLAassistant

CLAassistant commented Jun 4, 2025

CLA assistant check
All committers have signed the CLA.

@juliusmh
Contributor Author

juliusmh commented Jun 4, 2025

  • The failing integration test TestAlertmanagerClassicMode is caused by validateSilence in the prometheus/alertmanager/silence package. This is due to this piece of code, which expects prometheus/common v0.61.0 (which defaults to legacy mode), but Mimir uses v0.64.0 (which defaults to UTF8 mode).

  • The unit test in streamingpromql expects UTF8Validation to be enabled:

# Passes due to UTF-8 validation of source labels

@aknuds1 aknuds1 requested a review from Copilot June 4, 2025 16:22

This comment was marked as outdated.

aknuds1
aknuds1 previously requested changes Jun 4, 2025
Contributor

@aknuds1 aknuds1 left a comment


Looks like a promising start, although I only had the time for a quick review. Please follow the Mimir project standard of testify assertions in tests though.

Also, we probably want a faillint rule to make sure that your validation functions are used instead of the Prometheus library equivalents.

@juliusmh juliusmh force-pushed the juliusmh/legacy_name_validation branch 2 times, most recently from d668023 to 251c8d6 Compare June 4, 2025 20:37
@aknuds1 aknuds1 requested a review from Copilot June 5, 2025 07:04
Contributor

Copilot AI left a comment


Pull Request Overview

This PR replaces Prometheus’s global NameValidationScheme overrides by introducing explicit validation wrappers for legacy and UTF-8 schemes, then updates callers to use the new wrappers.

  • Add IsValidLabelName, IsValidMetricName, and IsValidUTF8LabelName in pkg/util/validation
  • Replace direct uses of model.LabelName.IsValid and model.IsValidMetricName with the new wrappers
  • Remove all init blocks that set model.NameValidationScheme = model.LegacyValidation

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| pkg/util/validation/labels.go | New wrappers for legacy and UTF-8 label/metric validation |
| pkg/util/validation/labels_test.go | Tests for the new validation functions |
| pkg/streamingpromql/operators/functions/label.go | Updated label_join and label_replace to use validation APIs |
| pkg/streamingpromql/operators/aggregations/count_values.go | Switched aggregation label checks to validation APIs |
| pkg/querier/stats_renderer_test.go | Removed legacy validation override in tests |
| pkg/mimir/promexts.go | Removed legacy validation override |
| pkg/frontend/querymiddleware/request_validation.go | Removed legacy validation override |
| pkg/distributor/validate.go | Replaced model validation with wrappers for labels and metrics |
| pkg/distributor/distributor.go | Removed legacy validation override |
| pkg/cardinality/request.go | Updated label_names extraction to use validation wrapper |
Comments suppressed due to low confidence (2)

pkg/streamingpromql/operators/functions/label.go:32

  • The error for an invalid source label prints dst instead of the src value. Update the format argument to use src.
return nil, fmt.Errorf("invalid source label name in label_join(): %s", dst)

pkg/util/validation/labels_test.go:38

  • TestIsValidLabelName is missing a case for names starting with a digit (e.g., "123label"), which should be invalid under the legacy scheme. Consider adding it.
for _, tc := range testCases {

Contributor

@aknuds1 aknuds1 left a comment


I don't understand the use of IsValidUTF8LabelName in some code paths here. Could you explain the reason?

Also, could you please add a faillint rule (in Makefile) to ensure we always use validation.IsValidLabelName and validation.IsValidMetricName?

@juliusmh
Contributor Author

juliusmh commented Jun 5, 2025

I don't understand the use of IsValidUTF8LabelName in some code paths here. Could you explain the reason?

The pkg/streamingpromql expects source and destination labels to be UTF-8 validated, rather than "legacy". Using legacy validation breaks the unit test:

# Passes due to UTF-8 validation of source labels
eval range from 0 to 10m step 5m label_join(series, "export", ",", "(opt)", "this")
series{export=",", label="c", opt="3"} NaN
series{export=",that", label="a", opt="1", this="that"} 1.1 5
series{export=",there", label="b", opt="2", this="there"} 2.9 {{sum:4 count:23 buckets:[1 2 4]}}
# Passes due to UTF-8 validation of dest labels
eval range from 0 to 10m step 5m label_join(series, "(export)", "-", "label", "opt")
series{"(export)"="a-1", label="a", opt="1", this="that"} 1.1 5.0
series{"(export)"="b-2", label="b", opt="2", this="there"} 2.9 {{sum:4 count:23 buckets:[1 2 4]}}
series{"(export)"="c-3", label="c", opt="3"} NaN _
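
To illustrate why these cases depend on the scheme, here is a small sketch contrasting the two checks on the label names from the test above. It assumes the UTF-8 scheme accepts any non-empty, valid UTF-8 string and the legacy scheme requires `[a-zA-Z_][a-zA-Z0-9_]*`. The helper names `legacyOK` and `utf8OK` are made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// legacyRE encodes the assumed legacy label name rule.
var legacyRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// legacyOK reports whether name passes the legacy check.
func legacyOK(name string) bool { return legacyRE.MatchString(name) }

// utf8OK reports whether name passes the assumed UTF-8 check:
// non-empty and valid UTF-8.
func utf8OK(name string) bool { return name != "" && utf8.ValidString(name) }

func main() {
	// Source and destination labels taken from the label_join test cases.
	for _, name := range []string{"export", "(opt)", "this", "(export)", "label"} {
		fmt.Printf("%-10q legacy=%-5v utf8=%v\n", name, legacyOK(name), utf8OK(name))
	}
}
```

Under these assumptions, the parenthesised names are the only ones on which the two schemes disagree, which is why the two test cases above are scheme-dependent.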

@aknuds1
Contributor

aknuds1 commented Jun 5, 2025

The pkg/streamingpromql expects source and destination labels to be UTF-8 validated, rather than "legacy".

That must mean that model.NameValidationScheme is UTF8Validation in the streaming PromQL engine, right? That was probably unintended when I forced legacy validation in prometheus/common.

I guess it's not important though, as what matters is we don't ingest anything but "legacy" metric and label names.

@aknuds1
Contributor

aknuds1 commented Jun 5, 2025

@charleskorn Can you shed some light on why pkg/streamingpromql explicitly expects the UTF-8 metric/label name validation scheme? Is it because of compatibility tests copied from upstream?

@aknuds1 aknuds1 requested a review from charleskorn June 5, 2025 08:45
@aknuds1
Contributor

aknuds1 commented Jun 5, 2025

Many of Prometheus' internal functions still rely on the NameValidationScheme variable. Mimir
uses those functions in the distributor, mimirtool, querier and ruler. Hence, I'm not sure whether
removing the global overrides (see below) introduces side effects.

Can you provide a list of Prometheus internal functions using the NameValidationScheme variable, which affect Mimir? Wondering whether we'd need to patch them in our Prometheus fork (mimir-prometheus).

@juliusmh
Contributor Author

juliusmh commented Jun 5, 2025

Also, could you please add a faillint rule (in Makefile) to ensure we always use validation.IsValidLabelName and validation.IsValidMetricName?

We cannot add a rule for LabelName.IsValid: faillint only supports function and type declarations, see fatih/faillint#18. We can add one for IsValidMetricName, though.

That must mean that model.NameValidationScheme is UTF8Validation in the streaming PromQL engine, right?

Yes, during unit tests. When running Mimir, this behavior is overridden by setting model.NameValidationScheme in pkg/mimir/promexts.go and other packages. Always using UTF-8 validation decouples this package from the global variable and aligns the testing and runtime environments.
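
A toy illustration of that mismatch, under stated assumptions: `globalScheme` below only stands in for a package-level variable like model.NameValidationScheme, and `scheme`, `validWith` and `validGlobal` are invented for this sketch. It shows the same name being accepted or rejected depending on whether an override has run, while a check with an explicit scheme does not change.

```go
package main

import (
	"fmt"
	"regexp"
	"unicode/utf8"
)

// scheme is a hypothetical stand-in for a validation scheme enum.
type scheme int

const (
	utf8Scheme   scheme = iota // the default when nothing overrides it
	legacyScheme
)

// globalScheme mimics a package-level variable that any init() may assign.
var globalScheme = utf8Scheme

var legacyNameRE = regexp.MustCompile(`^[a-zA-Z_][a-zA-Z0-9_]*$`)

// validWith checks name against an explicitly chosen scheme.
func validWith(s scheme, name string) bool {
	if s == legacyScheme {
		return legacyNameRE.MatchString(name)
	}
	return name != "" && utf8.ValidString(name)
}

// validGlobal depends on whoever last assigned globalScheme.
func validGlobal(name string) bool { return validWith(globalScheme, name) }

func main() {
	name := "(export)"

	// A package's unit tests: no override has run.
	fmt.Println("global, no override:  ", validGlobal(name))

	// The full binary: some init() elsewhere switches the scheme.
	globalScheme = legacyScheme
	fmt.Println("global, after override:", validGlobal(name))

	// An explicit scheme is unaffected by the global either way.
	fmt.Println("explicit legacy:       ", validWith(legacyScheme, name))
}
```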

Can you provide a list of Prometheus internal functions using the NameValidationScheme variable, which affect Mimir? Wondering whether we'd need to patch them in our Prometheus fork (mimir-prometheus).

Sure, this is a list of all the references I found that are also used by mimir:

Call hierarchy with depth=2
  • prometheus/common/model#LabelName.IsValid

    • prometheus/alertmanager/matchers/compat#InitFromFlags (link, link)
    • prometheus/alertmanager/silence#validateSilence (link)
    • prometheus/client_golang/prometheus#checkLabelName (link)
    • prometheus/prometheus/web/api/v1#API.labelValues (link)
    • prometheus/prometheus/model/relabel#relabel (link)
    • prometheus/prometheus/model/rulefmt#Rule.Validate (link, link)
    • prometheus/prometheus/model/rulefmt#RuleGroups.Validate (link)
    • prometheus/prometheus/storage/remote#validateLabelsAndMetricName (link)
  • prometheus/common/model#IsValidMetricName

    • prometheus/prometheus/client_golang/prometheus#v2.NewDesc
    • prometheus/prometheus/model/rulefmt#Rule.Validate
    • prometheus/prometheus/storage/remote#validateLabelsAndMetricName (link)

@aknuds1
Contributor

aknuds1 commented Jun 5, 2025

We cannot add a rule for LabelName.IsValid: faillint only supports function and type declarations

I wonder whether this rule can be implemented in golangci-lint instead. I think one of the included linters has the same type of functionality as faillint.

@charleskorn
Contributor

The pkg/streamingpromql expects source and destination labels to be UTF-8 validated, rather than "legacy".

That must mean that model.NameValidationScheme is UTF8Validation in the streaming PromQL engine, right? That was probably unintended when I forced legacy validation in prometheus/common.

I think this is true by accident - I would expect pkg/streamingpromql to use the same name validation scheme as everywhere else.

@charleskorn Can you shed some light on why pkg/streamingpromql explicitly expects the UTF-8 metric/label name validation scheme? Is it because of compatibility tests copied from upstream?

Very possibly - try switching to the classic scheme and see what tests in the testdata/upstream directory break.

I also don't understand why you use IsValidUTF8LabelName instead of IsValidLabelName here. Can you explain?

I think we should change the affected tests to work with either the classic or UTF-8 scheme, and then use IsValidLabelName here.

@juliusmh juliusmh force-pushed the juliusmh/legacy_name_validation branch from 251c8d6 to d330f06 Compare June 7, 2025 13:38
@juliusmh
Contributor Author

juliusmh commented Jun 7, 2025

Thanks for the feedback @charleskorn, @aknuds1. Force pushed a couple of changes:

  • adjusted MQE to use validation.IsValidLabelName as suggested, removed conflicting tests, added new ones
  • added lints to prevent LabelName.IsValid and IsValidMetricName

@aknuds1 aknuds1 dismissed their stale review June 7, 2025 15:49

Stale

promqlext.ExtendPromQL()
// Mimir doesn't support Prometheus' UTF-8 metric/label name scheme yet.
// nolint:staticcheck
model.NameValidationScheme = model.LegacyValidation
Contributor


I think we need to retain this for Prometheus code that relies on the global variable (e.g. Prometheus' query engine).

@juliusmh juliusmh force-pushed the juliusmh/legacy_name_validation branch from 495dfad to b8a3aeb Compare June 18, 2025 13:59
@juliusmh juliusmh force-pushed the juliusmh/legacy_name_validation branch from b8a3aeb to d10316a Compare June 18, 2025 14:52
@juliusmh
Contributor Author

juliusmh commented Aug 7, 2025

Duplicate of: #11848

@juliusmh juliusmh closed this Aug 7, 2025
@juliusmh juliusmh deleted the juliusmh/legacy_name_validation branch August 7, 2025 15:26


Development

Successfully merging this pull request may close these issues.

Implement classical name validation mode without disabling through global variable

4 participants