Skip to content

Add --stream-logs flag to run-errand for live log tailing via SSH#713

Open
abg wants to merge 2 commits intocloudfoundry:mainfrom
abg:bosh-run-errand-stream-logs
Open

Add --stream-logs flag to run-errand for live log tailing via SSH#713
abg wants to merge 2 commits intocloudfoundry:mainfrom
abg:bosh-run-errand-stream-logs

Conversation

@abg
Copy link
Member

@abg abg commented Mar 5, 2026

When --stream-logs is set, the errand is started asynchronously and task events are polled to discover running instances. SSH sessions are established to each instance to tail errand log files in real time, interleaved with the standard task event output.

Requirements:

  • The CLI must have SSH access to the BOSH VMs where errands run (directly or via a gateway configured with --gw-* flags).
  • Errands must write output to log files under /var/vcap/sys/log/. By default, the CLI tails /var/vcap/sys/log//.{stdout,stderr}.log. This adhere to common bpm conventions. Errands that write to non-standard paths will not have their output streamed unless --stream-log-path is specified.
  • Use --stream-log-path to override the default log location, e.g. --stream-log-path "my-job/*.log". The path is relative to /var/vcap/sys/log/ and supports shell glob and brace expansion.

@abg abg force-pushed the bosh-run-errand-stream-logs branch from f3c0e0a to 263088b Compare March 5, 2026 23:46
When --stream-logs is set, the errand is started asynchronously and
task events are polled to discover running instances. SSH sessions
are established to each instance to tail errand log files in real
time, interleaved with the standard task event output.

Requirements:
- The CLI must have SSH access to the BOSH VMs where errands run
  (directly or via a gateway configured with --gw-* flags).
- Errands must write output to log files under /var/vcap/sys/log/.
  By default, the CLI tails /var/vcap/sys/log/<errand>/<errand>.{stdout,stderr}.log.
  Errands that write to non-standard paths will not have their output
  streamed unless --stream-log-path is specified.
- Use --stream-log-path <path> to override the default log location,
  e.g. --stream-log-path "my-job/*.log". The path is relative to
  /var/vcap/sys/log/ and supports shell glob and brace expansion.
@abg abg force-pushed the bosh-run-errand-stream-logs branch from 263088b to 1850e8d Compare March 6, 2026 14:57
Copy link
Member

@selzoc selzoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite the Rube Goldberg machine! However, without an involved refactoring of how errands work in the BOSH ecosystem, I think this is pretty close to the only way to do it.

Unsure how fragile using SSH streaming will be over long errands.

}

func (c RunErrandCmd) runWithStreaming(opts RunErrandOpts) error {
if c.nonIntSSHRunner == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a circumstance in which this could happen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this was silly defensive code.

LogsDirectory DirOrCWDArg `long:"logs-dir" description:"Destination directory for logs" default:"."`

StreamLogs bool `long:"stream-logs" description:"Stream errand log files via SSH while errand runs"`
StreamLogPath string `long:"stream-log-path" description:"Log file path to tail (default: /var/vcap/sys/log/$ERRAND/$ERRAND.{stdout,stderr}.log)"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is implemented as being relative to /var/vcap/sys/log? Should we restrict it to a subdirectory for security reasons (i.e. ../../../etc/passwd or whatever should be disallowed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps we just say "we'll stream everything in /var/vcap/sys/log/$ERRAND"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I had thought about this one earlier. I see in my local history was initially rejecting paths with ".." in them - next to the regex safety checks in BuildErrandTailCmd but it seems to have been dropped from the commit. Whoops.

This is definitely something to guard against.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or perhaps we just say "we'll stream everything in /var/vcap/sys/log/$ERRAND"

Yeah, I had started with that concept early on, but there can be a lot of nonsense in the errand directory that's not pretty to look at. bpm.log for instance is probably not terribly informative. Maybe we could filter that out though.

I think I like the idea of dropping --stream-log-path though. I had put it in initially because there are some really oddly named errands in the world that do not necessarily map to the job name or similar.

If this could enforce the bpm convention of /var/vcap/sys/log/$ERRAND/$ERRAND.{stdout,stderr}.log that simplifies some things including these security concerns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather take the approach of starting very narrow, and then expanding as needed to keep thing simpler if we can

// arguments with spaces before passing them to the remote shell. Without
// quotes, "bash -c <script>" is interpreted as "bash -c for" with the
// remaining tokens becoming positional parameters.
return []string{"sudo", "bash", "-c", "'" + tailScript + "'"}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need sudo if we're looking at stuff in /var/vcap/sys/log?

Copy link
Member Author

@abg abg Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same sudo wrapping bosh logs --follow does ref.

I think in particular bpm creates logs scoped purely to the vcap user, not the vcap group (0600/-rw-------).

The bosh ssh user is random and part of the vcap group, but not the vcap user so must sudo to read those logs.

Example:

$ bosh ssh ...
$ whoami
bosh_9564f3b73d69470
$ groups
bosh_9564f3b73d69470 admin vcap bosh_sshers bosh_sudoers
$ tail /var/vcap/sys/log/proxy/proxy.stdout.log
tail: cannot open '/var/vcap/sys/log/proxy/proxy.stdout.log' for reading: Permission denied
$ sudo tail ...
# success!

sshCancel()
sshWg.Wait()

for _, slug := range sessions {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will we possibly leak ssh sessions here if an error occurs before we reach this point? Any call for a defer of the cleanup to make sure it runs?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I suppose I would defend this by suggesting there is no early return that would skip the cleanup.

But the code could be improved here.

return err
}

// Suppress stdout/stderr in the summary since we already streamed the logs.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe if any of the ssh streams fail we can fall back to printing these?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

@github-project-automation github-project-automation bot moved this from Inbox to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group Mar 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Waiting for Changes | Open for Contribution

Development

Successfully merging this pull request may close these issues.

2 participants