-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Refactor: extract common python/bash functions used in docker-telemetry-sidecar for better reuse in future #24996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR refactors common Python and Bash functions from docker-telemetry-sidecar into shared libraries for reuse across future sidecar containers. The refactoring extracts file synchronization, CONFIG_DB operations, and Kubernetes pod control logic into common modules.
Key Changes:
- Created new
sonic_py_common.sidecar_commonPython module with utilities for file sync, CONFIG_DB operations, and subprocess management - Extracted Kubernetes pod control logic into a shared
k8s_pod_control.shscript that can be parameterized by service name - Refactored
systemd_stub.pyto use the new common library instead of duplicated code
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.
Show a summary per file
| File | Description |
|---|---|
src/sonic-py-common/sonic_py_common/sidecar_common.py |
New common library providing file sync, CONFIG_DB operations, and process execution utilities |
src/sonic-py-common/tests/test_sidecar_common.py |
Unit tests for the new common library |
files/scripts/k8s_pod_control.sh |
New shared Kubernetes pod control script parameterized by service name |
dockers/docker-telemetry-sidecar/systemd_stub.py |
Refactored to import and use functions from sidecar_common instead of defining them locally |
dockers/docker-telemetry-sidecar/systemd_scripts/telemetry.sh |
Simplified to delegate to shared k8s_pod_control.sh script |
dockers/docker-telemetry-sidecar/cli-plugin-tests/test_systemd_stub.py |
Updated to work with refactored code and patched common library functions |
dockers/docker-telemetry-sidecar/Dockerfile.j2 |
Added copying and permissions for new k8s_pod_control.sh script |
rules/scripts.mk |
Added k8s_pod_control.sh to build system |
rules/docker-telemetry-sidecar.mk |
Added k8s_pod_control.sh to container files and added unit test target |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def db_hget(key: str, field: str) -> Optional[str]: | ||
| """Get a single field value from CONFIG_DB hash.""" | ||
| db = _get_config_db() | ||
| if db is None: | ||
| return None | ||
| try: | ||
| table, entry_key = _split_redis_key(key) | ||
| entry: Dict[str, str] = db.get_entry(table, entry_key) | ||
| except Exception as e: | ||
| logger.log_error(f"db_hget failed for {key} field {field}: {e}") | ||
| return None | ||
|
|
||
| val = entry.get(field) | ||
| if val is None or val == "": | ||
| return None | ||
| return val | ||
|
|
||
|
|
||
| def db_hgetall(key: str) -> Dict[str, str]: | ||
| """Get all field-value pairs from CONFIG_DB hash.""" | ||
| db = _get_config_db() | ||
| if db is None: | ||
| return {} | ||
| try: | ||
| table, entry_key = _split_redis_key(key) | ||
| entry: Dict[str, str] = db.get_entry(table, entry_key) | ||
| return entry or {} | ||
| except Exception as e: | ||
| logger.log_error(f"db_hgetall failed for {key}: {e}") | ||
| return {} | ||
|
|
||
|
|
||
| def db_hset(key: str, field: str, value: str) -> bool: | ||
| """Set a single field value in CONFIG_DB hash.""" | ||
| db = _get_config_db() | ||
| if db is None: | ||
| return False | ||
| try: | ||
| table, entry_key = _split_redis_key(key) | ||
| entry: Dict[str, str] = db.get_entry(table, entry_key) | ||
| entry[field] = value | ||
| db.set_entry(table, entry_key, entry) | ||
| return True | ||
| except Exception as e: | ||
| logger.log_error(f"db_hset failed for {key} field {field}: {e}") | ||
| return False | ||
|
|
||
|
|
||
| def db_del(key: str) -> bool: | ||
| """Delete an entry from CONFIG_DB.""" | ||
| db = _get_config_db() | ||
| if db is None: | ||
| return False | ||
| try: | ||
| table, entry_key = _split_redis_key(key) | ||
| # In ConfigDBConnector, setting an empty dict deletes the key | ||
| db.set_entry(table, entry_key, {}) | ||
| return True | ||
| except Exception as e: | ||
| logger.log_error(f"db_del failed for {key}: {e}") | ||
| return False |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The db_hget, db_hgetall, db_hset, and db_del functions lack test coverage in the new test suite. While these functions are tested indirectly through the systemd_stub tests, the common library should have dedicated tests for these database operations to ensure they work correctly in isolation and handle edge cases properly.
| def run(args: List[str], *, text: bool = True, input_bytes: Optional[bytes] = None) -> Tuple[int, str | bytes, str | bytes]: | ||
| """Run a subprocess command and return (returncode, stdout, stderr).""" | ||
| logger.log_debug("Running: " + " ".join(args)) | ||
| p = subprocess.Popen( | ||
| args, | ||
| text=text, | ||
| stdin=subprocess.PIPE if input_bytes is not None else None, | ||
| stdout=subprocess.PIPE, | ||
| stderr=subprocess.PIPE, | ||
| ) | ||
| out, err = p.communicate(input=input_bytes if input_bytes is not None else None) | ||
| return p.returncode, out, err |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The run function lacks test coverage. Although run_nsenter is tested indirectly through mocking, the base run function should have its own tests to ensure proper command execution, error handling, and stdin/stdout/stderr handling work correctly.
| def _get_config_db() -> Optional[ConfigDBConnector]: | ||
| """Get or create ConfigDBConnector instance (singleton pattern).""" | ||
| global _config_db | ||
| if _config_db is None: | ||
| try: | ||
| db = ConfigDBConnector() | ||
| db.connect() | ||
| _config_db = db | ||
| logger.log_info("Connected to CONFIG_DB via ConfigDBConnector") | ||
| except Exception as e: | ||
| logger.log_error(f"Failed to connect to CONFIG_DB: {e}") | ||
| _config_db = None | ||
| # Do not cache failed connection; try again next time | ||
| return _config_db |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _get_config_db function has problematic error handling in the singleton pattern. When a connection fails, it sets _config_db = None but the comment says "Do not cache failed connection; try again next time". However, since _config_db is checked with if _config_db is None at the start, it will attempt to reconnect on the next call. But the exception is caught broadly with except Exception as e:, which could mask critical errors. Consider using more specific exception handling or adding a cooldown mechanism to avoid repeated connection attempts on persistent failures.
| cmd_start() { | ||
| if command -v systemd-cat >/dev/null 2>&1; then | ||
| # background + pipe to journald with distinct priorities | ||
| ( kill_pods ) \ | ||
| > >(systemd-cat -t "${SERVICE_NAME}-start" -p info) \ | ||
| 2> >(systemd-cat -t "${SERVICE_NAME}-start" -p err) | ||
| else | ||
| # background + pipe to syslog via logger in case systemd-journald is masked/disabled | ||
| ( kill_pods ) \ | ||
| > >(logger -t "${SERVICE_NAME}-start" -p user.info) \ | ||
| 2> >(logger -t "${SERVICE_NAME}-start" -p user.err) | ||
| fi & | ||
| disown | ||
| exit 0 | ||
| } |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cmd_start function backgrounds the kill_pods operation and immediately exits with code 0, which means it reports success before the pods are actually deleted. If kill_pods fails (returns non-zero), the calling code will never know. Consider either making start blocking or adding a mechanism to track and report background operation failures.
| COPY ["systemd_stub.py", "/usr/bin/"] | ||
| COPY ["systemd_scripts/", "/usr/share/sonic/systemd_scripts/"] | ||
| COPY ["files/container_checker", "/usr/share/sonic/systemd_scripts/container_checker"] | ||
| COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"] |
Copilot
AI
Jan 8, 2026
•
edited by qiluo-msft
Loading
edited by qiluo-msft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The COPY command on line 23 copies k8s_pod_control.sh to /usr/share/sonic/scripts/k8s_pod_control.sh, but there's no explicit creation of the /usr/share/sonic/scripts/ directory. While COPY will create intermediate directories, it's better practice to explicitly create the directory first with RUN mkdir -p /usr/share/sonic/scripts before copying files to it, especially for clarity and to ensure proper permissions.
| COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"] | |
| COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"] | |
| RUN mkdir -p /usr/share/sonic/scripts | |
| ``` #Resolved |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
COPY will create intermediate directories, no harm.
files/scripts/k8s_pod_control.sh
Outdated
| shift | ||
| fi | ||
|
|
||
| # SERVICE_NAME must be set by caller (e.g., "telemetry", "telemetry") |
Copilot
AI
Jan 8, 2026
•
edited by qiluo-msft
Loading
edited by qiluo-msft
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment on line 19 mentions "telemetry" twice, which appears to be a copy-paste error. The second instance should likely be a different service name or removed entirely to make it a general comment.
| # SERVICE_NAME must be set by caller (e.g., "telemetry", "telemetry") | |
| # SERVICE_NAME must be set by caller (e.g., "telemetry") | |
| ``` #Resolved |
| return False | ||
| try: | ||
| table, entry_key = _split_redis_key(key) | ||
| entry: Dict[str, str] = db.get_entry(table, entry_key) |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The db_hset function has a potential bug where it retrieves the current entry using get_entry, modifies the returned dict by adding a field, then sets it back. However, if get_entry returns None (which can happen when the entry doesn't exist), the code will fail with an AttributeError when trying to set entry[field]. The function should initialize entry as an empty dict when get_entry returns None.
| entry: Dict[str, str] = db.get_entry(table, entry_key) | |
| entry: Dict[str, str] = db.get_entry(table, entry_key) or {} |
| """Set a field in a CONFIG_DB hash.""" | ||
| if key not in config_db: | ||
| config_db[key] = {} | ||
| config_db[key][field] = value |
Copilot
AI
Jan 8, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the fake_db_hset function, there's a missing return value. The function should return True to indicate success (matching the signature of the real db_hset function which returns a boolean). This could cause tests to fail or behave unexpectedly.
| config_db[key][field] = value | |
| config_db[key][field] = value | |
| return True |
|
/azp run Azure.sonic-buildimage |
|
Azure Pipelines successfully started running 1 pipeline(s). |
make1980
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
Why I did it
Many future exist container's docker-*-sidecar will potential reuse the code inside docker-telemetry-sidecar folder. So I extract common python/bash functions used in docker-telemetry-sidecar into a common python library repo and buildimage script folder.
Work item tracking
How I did it
How to verify it
Passing unit test
Which release branch to backport (provide reason below if selected)
Tested branch (Please provide the tested image version)
Description for the changelog
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)