Merged
10 changes: 5 additions & 5 deletions Cargo.lock

Some generated files are not rendered by default.

74 changes: 55 additions & 19 deletions README.md
@@ -12,34 +12,70 @@ to implement and maintain ad-hoc.
`fs-dir-cache` aims to be a simple to use utility from inside
other scripts and programs taking care of the details.

## Use case
## Example

### CI cache

Imagine that you have a CI runner that can persist files between runs,
and you'd like to utilize it to reuse and speed up some things:
This is an example where a CI runner can persist files between runs,
and fs-dir-cache is used to reuse build artifacts between builds that are
likely producing the same output, speeding everything up:


```bash
#!/usr/bin/env bash

set -euo pipefail

FS_DIR_CACHE_ROOT="$HOME/.cache/fs-dir-cache" # directory to hold all cache (sub)directories
FS_DIR_CACHE_LOCK_ID="pid-$$-rnd-$RANDOM" # acquire lock based on the current pid and something random (just in case pid gets reused)
FS_DIR_CACHE_KEY_NAME="build-project-x" # the base name of our key
FS_DIR_CACHE_LOCK_TIMEOUT_SECS="600" # unlock after timeout in case our job fails miserably

fs-dir-cache gc unused --seconds "$((7 * 24 * 60 * 60))" # delete caches not used in more than a week
job_name="${1:-}"

if [ -z "$job_name" ]; then
>&2 echo "error: no job name"
exit 1
fi
shift 1

export FS_DIR_CACHE_LOCK_TIMEOUT_SECS="$((60 * 30))" # unlock after timeout in case our job fails miserably and/or hangs

export FS_DIR_CACHE_ROOT="$HOME/.cache/fs-dir-cache" # directory to hold all cache (sub)directories
export FS_DIR_CACHE_LOCK_ID="pid-$$-rnd-$RANDOM" # acquire lock based on the current pid and something random (just in case pid gets reused)
export FS_DIR_CACHE_KEY_NAME="$job_name" # the base name of our key

log_file="$FS_DIR_CACHE_ROOT/log"

fs-dir-cache gc unused --seconds "$((5 * 24 * 60 * 60))" # delete caches not used in more than 5 days

export log_file # log when each job started and ended
export job_name
src_dir=$(pwd)
export src_dir

# This bash command will be executed with a CWD set to the allocated directory
function run_in_cache() {
echo "$(date --rfc-3339=seconds) RUN job=$job_name dir=$(pwd)" >> "$log_file"
>&2 echo "$(date --rfc-3339=seconds) RUN job=$job_name dir=$(pwd)"
CARGO_BUILD_TARGET_DIR="$(pwd)"
export CARGO_BUILD_TARGET_DIR
cd "$src_dir"

function on_exit() {
local exit_code=$?

echo "$(date --rfc-3339=seconds) END job=$job_name code=$exit_code" >> "$log_file"
>&2 echo "$(date --rfc-3339=seconds) END job=$job_name code=$exit_code"

exit $exit_code
}
trap on_exit EXIT

# create/reuse cache (sub-directory) and lock it (wait if already locked)
cache_dir=$(fs-dir-cache lock --key-file Cargo.toml)
# unlock it when the script finishes
trap "fs-dir-cache unlock --dir ${cache_dir}" EXIT
"$@"
}
export -f run_in_cache

# 'cache_dir' will now be something like '/home/user/.cache/fs-dir-cache/build-project-x-8jg9hsadjfkaj9jkfljdfsd'
# and the script has up to 600s to use it exclusively

# build project
cargo build --target-dir="${cache_dir}/target"
fs-dir-cache exec \
--key-file Cargo.lock \
--key-str "${CARGO_PROFILE:-dev}" \
--key-file flake.lock \
-- \
bash -c 'run_in_cache "$@"' _ "$@"
```
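As a usage sketch (the script name and invocation are illustrative, not taken from the PR): if the script above were saved as `ci-cache.sh`, a CI step could call `./ci-cache.sh build-workspace cargo build --locked`. The first argument becomes the job name (and thus the base of the cache key), and everything after it is the command that `run_in_cache` executes with `CARGO_BUILD_TARGET_DIR` pointing at the locked cache directory.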

Using just one tool, it's easy to get correct and practical caching including:
8 changes: 3 additions & 5 deletions src/main.rs
@@ -1,15 +1,14 @@
mod root;
mod util;

use std::os::unix::net::{UnixListener, UnixStream};
use std::path::{Path, PathBuf};
use std::{ffi, fs, io, process};

use anyhow::{bail, format_err, Context, Result};
use chrono::Utc;
use clap::{Args, Parser, Subcommand};
use rand::distributions::{Alphanumeric, DistString};
use root::Root;
use root::{mk_lock, Root};
use tracing::{debug, error, warn};
use tracing_subscriber::EnvFilter;

@@ -154,17 +153,16 @@ fn run_exec(ExecOpts { opts, exec }: ExecOpts) -> Result<()> {

let sock_path = root.join(PathBuf::from(format!(
"lock-{}",
Alphanumeric.sample_string(&mut rand::thread_rng(), 10)
Alphanumeric.sample_string(&mut rand::thread_rng(), 16)
)));

debug!(
target: LOG_TARGET,
sock_path = %sock_path.display(),
"Binding liveness socket"
);
let _socket = UnixListener::bind(&sock_path)?;

assert!(UnixStream::connect(&sock_path).is_ok());
let _lock = mk_lock(&sock_path)?;

let exec_dir = lock(None, opts, Some(sock_path.clone()))?;

80 changes: 74 additions & 6 deletions src/root.rs
@@ -1,8 +1,9 @@
mod dto;

use std::collections::btree_map::Entry;
use std::io::{self, Read as _};
use std::os::unix::net::UnixStream;
use std::io::{self};
#[cfg(not(target_os = "macos"))]
use std::os::unix::net::{UnixListener, UnixStream};
use std::path::{Path, PathBuf};
use std::time::Duration;
use std::{fs, thread};
@@ -175,18 +176,22 @@ impl<'a> LockedRoot<'a> {
}
Entry::Occupied(mut e) => {
if let Some(prev_sock_path) = e.get().socket_path.as_ref() {
if let Ok(mut s) = UnixStream::connect(prev_sock_path) {
if let Ok(s) = try_lock(prev_sock_path) {
info!(
target: LOG_TARGET,
key,
lock_id,
sock_path = %prev_sock_path.display(),
"Previous lock holder still alive"
"Previous lock holder still alive (potentially)"
);
had_to_wait |= true;
self.r#yield_with(|| {
// we are just waiting to get disconnected here
let _ = s.read(&mut [0]);
let _ = clear_lock(s, prev_sock_path).inspect_err(|err| {
info!(
%err,
"Error during waiting for / clearing the old lock"
)
});
})?;
} else {
debug!(
@@ -289,3 +294,66 @@ fn rm_prev_sock_path(prev_sock_path: &Path) {
}
}
}

// On Darwin, Unix sockets are not automatically removed and linger,
// with processes that try to connect to them just hanging. This makes them
// unsuitable for our needs, so just use a file that we lock exclusively.
#[cfg(target_os = "macos")]
pub fn mk_lock(path: &Path) -> Result<fs::File> {
let lock_file = fs::File::create(path)?;
lock_file.lock_exclusive()?;
Ok(lock_file)
}

// On Linux we can use Unix Sockets as they disappear automatically,
// which is nice.
#[cfg(not(target_os = "macos"))]
pub fn mk_lock(path: &Path) -> Result<UnixListener> {
use std::os::unix::net::UnixStream;

let socket = UnixListener::bind(path)?;

assert!(UnixStream::connect(path).is_ok());

Ok(socket)
}

#[cfg(target_os = "macos")]
pub fn try_lock(path: &Path) -> Result<fs::File> {
let lock_file = fs::File::open(path)?;
Ok(lock_file)
}

#[cfg(not(target_os = "macos"))]
pub fn try_lock(path: &Path) -> Result<UnixStream> {
let socket = UnixStream::connect(path)?;

Ok(socket)
}

#[cfg(target_os = "macos")]
pub fn clear_lock(file: fs::File, path: &Path) -> Result<()> {
file.lock_exclusive()?;

// We want to unlock the file, even if we failed to remove it
let rm_res = fs::remove_file(path);

file.unlock()?;

// if removing failed, report it now
rm_res?;

Ok(())
}

#[cfg(not(target_os = "macos"))]
pub fn clear_lock(mut s: UnixStream, _path: &Path) -> Result<()> {
// we are just waiting to get disconnected here

use std::io::Read as _;

// ignore, we *will* disconnect, nothing interesting about it
let _ = s.read(&mut [0]);
Ok(())
}

Review discussion on the `s.read(&mut [0])` line of `clear_lock`:

@maan2003: @dpc maybe calling read was the reason it was hanging on macOS? just drop(s) feels like the better implementation to me 🤷

@dpc (Owner, Jun 5, 2025): It is supposed to hang. We're waiting for whoever is holding the lock to disconnect.

It was hanging on Darwin, because on Darwin when the process creating/binding the Unix socket is gone, the socket remains (unlike on Linux), and attempts to connect to it do not disconnect but just wait forever until something new binds on it.

@maan2003 (Jun 5, 2025): so clear_lock() is for waiting for the lock?

@dpc (Owner, Jun 5, 2025): It waits for the lock to clear? Or it waits to clear the lock? Or it waits for me to get better at naming stuff? :D
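To make the discussion above concrete, here is a minimal, hypothetical sketch of how the Linux (`#[cfg(not(target_os = "macos"))]`) helpers are meant to compose: a waiter probes the previous holder's liveness socket with `try_lock`, `clear_lock` simply blocks until that holder exits and the kernel drops the connection, and the waiter then binds its own socket with `mk_lock`. The function name `take_over_lock`, the paths, and the omission of stale-socket cleanup are all illustrative; this is not code from the PR.

```rust
// A sketch only, assuming it lives inside this crate and uses the
// non-macOS variants of the helpers shown in the diff above.
use std::os::unix::net::UnixListener;
use std::path::Path;

use anyhow::Result;

use crate::root::{clear_lock, mk_lock, try_lock};

fn take_over_lock(prev_sock_path: &Path, my_sock_path: &Path) -> Result<UnixListener> {
    // If connecting succeeds, the previous holder may still be alive...
    if let Ok(conn) = try_lock(prev_sock_path) {
        // ...so block until its process exits; only then does the kernel
        // drop the connection and the read() inside clear_lock() return.
        clear_lock(conn, prev_sock_path)?;
    }
    // The previous holder is gone; bind our own liveness socket so that
    // later waiters can detect us the same way. Keep the returned
    // listener alive for as long as the lock should be held.
    mk_lock(my_sock_path)
}
```

On macOS the same three calls hand around an exclusively locked `fs::File` instead, which is why `clear_lock` takes the path as a second argument even though the Linux variant ignores it.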