
Conversation

@qiluo-msft
Collaborator

Why I did it

Many docker-*-sidecar containers created in the future will potentially reuse the code inside the docker-telemetry-sidecar folder. So this change extracts the common Python/Bash functions used in docker-telemetry-sidecar into the common Python library (sonic-py-common) and the buildimage scripts folder.

Work item tracking
  • Microsoft ADO (number only):

How I did it

How to verify it

Passing unit tests.

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

Signed-off-by: Qi Luo <qiluo-msft@users.noreply.github.com>
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@qiluo-msft marked this pull request as ready for review January 8, 2026 22:42
Copilot AI review requested due to automatic review settings January 8, 2026 22:42
Contributor

Copilot AI left a comment

Pull request overview

This PR refactors common Python and Bash functions from docker-telemetry-sidecar into shared libraries for reuse across future sidecar containers. The refactoring extracts file synchronization, CONFIG_DB operations, and Kubernetes pod control logic into common modules.

Key Changes:

  • Created new sonic_py_common.sidecar_common Python module with utilities for file sync, CONFIG_DB operations, and subprocess management (see the usage sketch after this list)
  • Extracted Kubernetes pod control logic into a shared k8s_pod_control.sh script that can be parameterized by service name
  • Refactored systemd_stub.py to use the new common library instead of duplicated code
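
For orientation, a minimal usage sketch of the new module follows; the CONFIG_DB key "FEATURE|telemetry", its "TABLE|entry" separator, and the command passed to run() are illustrative assumptions, not taken from this PR.

# Hypothetical consumer of sonic_py_common.sidecar_common; the key format and
# the example command are assumptions for illustration only.
from sonic_py_common import sidecar_common

# Read one field from CONFIG_DB; returns None if the field or DB is unavailable.
state = sidecar_common.db_hget("FEATURE|telemetry", "state")

# Write a field back; db_hset returns True on success, False on failure.
if state is None:
    sidecar_common.db_hset("FEATURE|telemetry", "state", "enabled")

# Run a command and inspect (returncode, stdout, stderr).
rc, out, err = sidecar_common.run(["ls", "/etc/sonic"])
if rc != 0:
    print(f"listing failed: {err}")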

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 8 comments.

Show a summary per file
File - Description
src/sonic-py-common/sonic_py_common/sidecar_common.py - New common library providing file sync, CONFIG_DB operations, and process execution utilities
src/sonic-py-common/tests/test_sidecar_common.py - Unit tests for the new common library
files/scripts/k8s_pod_control.sh - New shared Kubernetes pod control script parameterized by service name
dockers/docker-telemetry-sidecar/systemd_stub.py - Refactored to import and use functions from sidecar_common instead of defining them locally
dockers/docker-telemetry-sidecar/systemd_scripts/telemetry.sh - Simplified to delegate to shared k8s_pod_control.sh script
dockers/docker-telemetry-sidecar/cli-plugin-tests/test_systemd_stub.py - Updated to work with refactored code and patched common library functions
dockers/docker-telemetry-sidecar/Dockerfile.j2 - Added copying and permissions for new k8s_pod_control.sh script
rules/scripts.mk - Added k8s_pod_control.sh to build system
rules/docker-telemetry-sidecar.mk - Added k8s_pod_control.sh to container files and added unit test target


Comment on lines +92 to +152
def db_hget(key: str, field: str) -> Optional[str]:
    """Get a single field value from CONFIG_DB hash."""
    db = _get_config_db()
    if db is None:
        return None
    try:
        table, entry_key = _split_redis_key(key)
        entry: Dict[str, str] = db.get_entry(table, entry_key)
    except Exception as e:
        logger.log_error(f"db_hget failed for {key} field {field}: {e}")
        return None

    val = entry.get(field)
    if val is None or val == "":
        return None
    return val


def db_hgetall(key: str) -> Dict[str, str]:
    """Get all field-value pairs from CONFIG_DB hash."""
    db = _get_config_db()
    if db is None:
        return {}
    try:
        table, entry_key = _split_redis_key(key)
        entry: Dict[str, str] = db.get_entry(table, entry_key)
        return entry or {}
    except Exception as e:
        logger.log_error(f"db_hgetall failed for {key}: {e}")
        return {}


def db_hset(key: str, field: str, value: str) -> bool:
    """Set a single field value in CONFIG_DB hash."""
    db = _get_config_db()
    if db is None:
        return False
    try:
        table, entry_key = _split_redis_key(key)
        entry: Dict[str, str] = db.get_entry(table, entry_key)
        entry[field] = value
        db.set_entry(table, entry_key, entry)
        return True
    except Exception as e:
        logger.log_error(f"db_hset failed for {key} field {field}: {e}")
        return False


def db_del(key: str) -> bool:
    """Delete an entry from CONFIG_DB."""
    db = _get_config_db()
    if db is None:
        return False
    try:
        table, entry_key = _split_redis_key(key)
        # In ConfigDBConnector, setting an empty dict deletes the key
        db.set_entry(table, entry_key, {})
        return True
    except Exception as e:
        logger.log_error(f"db_del failed for {key}: {e}")
        return False

Copilot AI Jan 8, 2026

The db_hget, db_hgetall, db_hset, and db_del functions lack test coverage in the new test suite. While these functions are tested indirectly through the systemd_stub tests, the common library should have dedicated tests for these database operations to ensure they work correctly in isolation and handle edge cases properly.
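
For illustration, a sketch (not part of this PR) of what dedicated db_hget tests could look like; _get_config_db and _split_redis_key are patched, so no real CONFIG_DB connection or key-format assumption is needed.

# Hypothetical pytest-style tests; the "FEATURE|telemetry" key is only a label
# because _split_redis_key is stubbed out.
from unittest import mock

from sonic_py_common import sidecar_common


def test_db_hget_returns_field_value():
    fake_db = mock.MagicMock()
    fake_db.get_entry.return_value = {"state": "enabled"}
    with mock.patch.object(sidecar_common, "_get_config_db", return_value=fake_db), \
         mock.patch.object(sidecar_common, "_split_redis_key",
                           return_value=("FEATURE", "telemetry")):
        assert sidecar_common.db_hget("FEATURE|telemetry", "state") == "enabled"


def test_db_hget_returns_none_for_missing_field():
    fake_db = mock.MagicMock()
    fake_db.get_entry.return_value = {}
    with mock.patch.object(sidecar_common, "_get_config_db", return_value=fake_db), \
         mock.patch.object(sidecar_common, "_split_redis_key",
                           return_value=("FEATURE", "telemetry")):
        assert sidecar_common.db_hget("FEATURE|telemetry", "state") is None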

Comment on lines +44 to +55
def run(args: List[str], *, text: bool = True, input_bytes: Optional[bytes] = None) -> Tuple[int, str | bytes, str | bytes]:
    """Run a subprocess command and return (returncode, stdout, stderr)."""
    logger.log_debug("Running: " + " ".join(args))
    p = subprocess.Popen(
        args,
        text=text,
        stdin=subprocess.PIPE if input_bytes is not None else None,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = p.communicate(input=input_bytes if input_bytes is not None else None)
    return p.returncode, out, err

Copilot AI Jan 8, 2026

The run function lacks test coverage. Although run_nsenter is tested indirectly through mocking, the base run function should have its own tests to ensure proper command execution, error handling, and stdin/stdout/stderr handling work correctly.
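
As a sketch (again, not in the PR), direct tests for run() could exercise it against trivial commands, assuming POSIX echo and cat are available in the test environment.

# Hypothetical direct tests for run(); they rely only on echo/cat being present.
from sonic_py_common import sidecar_common


def test_run_captures_stdout_and_returncode():
    rc, out, err = sidecar_common.run(["echo", "hello"])
    assert rc == 0
    assert out.strip() == "hello"
    assert err == ""


def test_run_passes_input_bytes_to_stdin():
    rc, out, _err = sidecar_common.run(["cat"], text=False, input_bytes=b"payload")
    assert rc == 0
    assert out == b"payload"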

Comment on lines +67 to +80
def _get_config_db() -> Optional[ConfigDBConnector]:
    """Get or create ConfigDBConnector instance (singleton pattern)."""
    global _config_db
    if _config_db is None:
        try:
            db = ConfigDBConnector()
            db.connect()
            _config_db = db
            logger.log_info("Connected to CONFIG_DB via ConfigDBConnector")
        except Exception as e:
            logger.log_error(f"Failed to connect to CONFIG_DB: {e}")
            _config_db = None
            # Do not cache failed connection; try again next time
    return _config_db

Copilot AI Jan 8, 2026

The _get_config_db function has problematic error handling in the singleton pattern. When a connection fails, it sets _config_db = None but the comment says "Do not cache failed connection; try again next time". However, since _config_db is checked with if _config_db is None at the start, it will attempt to reconnect on the next call. But the exception is caught broadly with except Exception as e:, which could mask critical errors. Consider using more specific exception handling or adding a cooldown mechanism to avoid repeated connection attempts on persistent failures.
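
One hedged way to realize the suggested cooldown is sketched below; the _last_connect_failure and _RETRY_COOLDOWN_SEC names are invented for this sketch, and ConfigDBConnector and logger are assumed to be imported as in the PR's module.

# Sketch of a connection-retry cooldown, not the PR's implementation.
import time

_config_db = None             # cached connector, as in the PR
_last_connect_failure = None  # hypothetical: monotonic time of last failed connect
_RETRY_COOLDOWN_SEC = 30      # hypothetical cooldown interval


def _get_config_db():
    global _config_db, _last_connect_failure
    if _config_db is not None:
        return _config_db
    # Skip reconnect attempts while a recent failure is still cooling down.
    if (_last_connect_failure is not None
            and time.monotonic() - _last_connect_failure < _RETRY_COOLDOWN_SEC):
        return None
    try:
        db = ConfigDBConnector()
        db.connect()
        _config_db = db
    except Exception as e:  # broad catch kept as in the PR; only the cooldown is new
        _last_connect_failure = time.monotonic()
        logger.log_error(f"Failed to connect to CONFIG_DB: {e}")
    return _config_db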

Comment on lines +111 to +125
cmd_start() {
    if command -v systemd-cat >/dev/null 2>&1; then
        # background + pipe to journald with distinct priorities
        ( kill_pods ) \
            > >(systemd-cat -t "${SERVICE_NAME}-start" -p info) \
            2> >(systemd-cat -t "${SERVICE_NAME}-start" -p err)
    else
        # background + pipe to syslog via logger in case systemd-journald is masked/disabled
        ( kill_pods ) \
            > >(logger -t "${SERVICE_NAME}-start" -p user.info) \
            2> >(logger -t "${SERVICE_NAME}-start" -p user.err)
    fi &
    disown
    exit 0
}

Copilot AI Jan 8, 2026

The cmd_start function backgrounds the kill_pods operation and immediately exits with code 0, which means it reports success before the pods are actually deleted. If kill_pods fails (returns non-zero), the calling code will never know. Consider either making start blocking or adding a mechanism to track and report background operation failures.

COPY ["systemd_stub.py", "/usr/bin/"]
COPY ["systemd_scripts/", "/usr/share/sonic/systemd_scripts/"]
COPY ["files/container_checker", "/usr/share/sonic/systemd_scripts/container_checker"]
COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"]

Copilot AI Jan 8, 2026

The COPY command on line 23 copies k8s_pod_control.sh to /usr/share/sonic/scripts/k8s_pod_control.sh, but there's no explicit creation of the /usr/share/sonic/scripts/ directory. While COPY will create intermediate directories, it's better practice to explicitly create the directory first with RUN mkdir -p /usr/share/sonic/scripts before copying files to it, especially for clarity and to ensure proper permissions.

Suggested change
COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"]
COPY ["files/telemetry.sh", "/usr/share/sonic/systemd_scripts/telemetry_v1.sh"]
RUN mkdir -p /usr/share/sonic/scripts

@qiluo-msft
Collaborator Author

COPY will create intermediate directories, no harm.

shift
fi

# SERVICE_NAME must be set by caller (e.g., "telemetry", "telemetry")

Copilot AI Jan 8, 2026

The comment on line 19 mentions "telemetry" twice, which appears to be a copy-paste error. The second instance should likely be a different service name or removed entirely to make it a general comment.

Suggested change
# SERVICE_NAME must be set by caller (e.g., "telemetry", "telemetry")
# SERVICE_NAME must be set by caller (e.g., "telemetry")

        return False
    try:
        table, entry_key = _split_redis_key(key)
        entry: Dict[str, str] = db.get_entry(table, entry_key)

Copilot AI Jan 8, 2026

The db_hset function has a potential bug where it retrieves the current entry using get_entry, modifies the returned dict by adding a field, then sets it back. However, if get_entry returns None (which can happen when the entry doesn't exist), the code will fail with an AttributeError when trying to set entry[field]. The function should initialize entry as an empty dict when get_entry returns None.

Suggested change
entry: Dict[str, str] = db.get_entry(table, entry_key)
entry: Dict[str, str] = db.get_entry(table, entry_key) or {}

"""Set a field in a CONFIG_DB hash."""
if key not in config_db:
config_db[key] = {}
config_db[key][field] = value

Copilot AI Jan 8, 2026

In the fake_db_hset function, there's a missing return value. The function should return True to indicate success (matching the signature of the real db_hset function which returns a boolean). This could cause tests to fail or behave unexpectedly.

Suggested change
config_db[key][field] = value
config_db[key][field] = value
return True

Copilot uses AI. Check for mistakes.
@mssonicbld
Collaborator

/azp run Azure.sonic-buildimage

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@make1980 left a comment

:shipit:
