Consumer status is unpredictable when multiple topics are consumed #796

@ashi009

Description

```go
for topic, partitions := range topics {
	for partitionID, partition := range partitions {
		partitionStatus := evaluatePartitionStatus(partition, module.minimumComplete, module.allowedLag)
		partitionStatus.Topic = topic
		partitionStatus.Partition = int32(partitionID)
		partitionStatus.Owner = partition.Owner
		partitionStatus.ClientID = partition.ClientID

		if partitionStatus.Status > status.Status {
			// If the partition status is greater than StatusError, we just mark it as StatusError
			if partitionStatus.Status > protocol.StatusError {
				status.Status = protocol.StatusError
			} else {
				status.Status = partitionStatus.Status
			}
		}

		if (status.Maxlag == nil) || (partitionStatus.CurrentLag > status.Maxlag.CurrentLag) {
			status.Maxlag = partitionStatus
		}
		if partitionStatus.Complete == 1.0 {
			completePartitions++
		}
		status.Partitions[count] = partitionStatus
		count++
	}
}
```

This piece of code loops over a map of topics, and if the last topic's last partition reports OK, the consumer status will be OK.

Given that map iteration in Go is randomized, the consumer status is unpredictable.
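To illustrate the point, here is a minimal, hypothetical sketch (the names and constants are ours, not Burrow's) of an order-independent aggregation: taking the maximum partition status yields the same consumer status no matter which order Go happens to iterate the map in.

```go
package main

import "fmt"

// Hypothetical status constants, loosely mirroring Burrow's protocol package.
const (
	StatusOK    = 1
	StatusWarn  = 2
	StatusError = 3
)

// worstStatus aggregates per-partition statuses order-independently:
// the result is the maximum status seen, capped at StatusError, so it
// does not depend on Go's randomized map iteration order.
func worstStatus(partitions map[int32]int) int {
	worst := StatusOK
	for _, s := range partitions {
		if s > StatusError {
			s = StatusError
		}
		if s > worst {
			worst = s
		}
	}
	return worst
}

func main() {
	partitions := map[int32]int{0: StatusOK, 1: StatusWarn, 2: StatusOK}
	fmt.Println(worstStatus(partitions)) // prints 2 (StatusWarn)
}
```

Any aggregation where the final value depends on which partition was visited last will instead flip between runs, which is the behavior described below.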

The following are real-world effects of this:

  1. The metric from Burrow for a consumer when scraping at a 2m interval: [screenshot]

  2. The metric from burrow-exporter, which queries Burrow at a 30s interval and is then scraped at a 2m interval: [screenshot]

The more frequently we query (Burrow uses a 30s cache expiration by default), the more likely we are to see a non-OK consumer status.
