feat: Add Python Bridge for Modern I/O (POSCAR/ASE) by Liam-Deacon · Pull Request #52 · Liam-Deacon/CLEED

Liam-Deacon · 2025-12-20T11:39:16Z

Modern Input/Output Bridge for CLEED

This PR implements the "Python Bridge" strategy discussed in Issue #51 to enable CLEED to consume modern crystallography formats (POSCAR, CIF, XYZ) via ASE.

Features

1. `cleed-io` Python Package

A new lightweight package in tools/ that provides:

slicer.py: An algorithm to automatically "slice" a 3D periodic slab (from VASP/ASE) into the "Bulk" and "Overlayer" components required by CLEED.
writer.py: Generates valid .bul and .inp files from the sliced atomic structures.
cli.py: A command-line tool cleed-convert for easy batch processing.

2. Supported Workflow

Users can now generate a structure in Python/ASE or VASP and immediately convert it for LEED simulation:

# Convert a VASP POSCAR
cleed-convert --input POSCAR --output my_surface

This generates my_surface.bul and my_surface.inp ready for csearch.

3. Documentation

Added docs/modern-io.md explaining the motivation, installation, and usage of the new tools.

Technical Details

Alignment: The slicer automatically rotates arbitrary input cells so the surface vectors lie in the $xy$-plane (CLEED convention).
Bulk Detection: Currently uses a heuristic (bottom $N$ layers are bulk). Future work will implement rigorous periodicity detection.
Dependencies: Requires ase (Atomic Simulation Environment).
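
The bottom-$N$ heuristic described above can be sketched without ASE. This is a minimal illustration, not the code in slicer.py: the helper name `split_bulk_overlayer`, the plain list-of-z-coordinates input, and the `ValueError` on too few layers are assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def cluster_layers(z_coords: Sequence[float], tolerance: float = 0.2) -> List[List[int]]:
    """Group atom indices into layers, ordered from lowest to highest z."""
    order = sorted(range(len(z_coords)), key=lambda i: z_coords[i])
    layers: List[List[int]] = []
    for idx in order:
        # Compare against the previous atom in sorted order: a gap larger
        # than the tolerance starts a new layer.
        if layers and z_coords[idx] - z_coords[layers[-1][-1]] <= tolerance:
            layers[-1].append(idx)
        else:
            layers.append([idx])
    return layers


def split_bulk_overlayer(
    z_coords: Sequence[float], bulk_layers_count: int = 2, tolerance: float = 0.2
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: the bottom N layers are bulk, the rest overlayer."""
    layers = cluster_layers(z_coords, tolerance)
    if len(layers) < bulk_layers_count:
        raise ValueError("slab has fewer layers than bulk_layers_count")
    bulk = [i for layer in layers[:bulk_layers_count] for i in layer]
    overlayer = [i for layer in layers[bulk_layers_count:] for i in layer]
    return bulk, overlayer


z = [0.0, 0.05, 2.0, 2.03, 4.1, 5.3]
bulk, overlayer = split_bulk_overlayer(z, bulk_layers_count=2)
print(bulk, overlayer)
```

Because each atom is compared with its predecessor rather than with a layer mean, a strongly buckled layer can chain into one cluster; a stricter variant would compare against the running layer average.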

Related Issues

Closes #51

Summary by Sourcery

Introduce a Python-based I/O bridge to convert modern crystallography structures into CLEED bulk/overlayer input files.

New Features:

Add slicer utilities to align ASE slabs and split them into bulk and overlayer regions for CLEED.
Add writers to generate CLEED .bul and .inp files from sliced atomic structures, including basic default parameters.
Provide a cleed-convert CLI for batch conversion of POSCAR/CIF/XYZ files into CLEED inputs.
Package the new tools as a lightweight cleed-io Python package with an installable entry point script.

Build:

Add a pyproject.toml in tools/ to define the cleed-io package metadata, dependencies, and console script entry point.

Documentation:

Add modern-io documentation describing installation and usage of the new cleed-io tools for converting modern structure formats to CLEED files.

Summary by CodeRabbit

New Features
- Added a cleed-convert CLI and supporting I/O bridge for bidirectional conversion between CLEED and common structure formats (POSCAR/CIF/XYZ), with configurable bulk-layer slicing and round-trip read/write support.
Documentation
- Added a comprehensive modern I/O guide covering installation, usage examples, supported formats, options, and integration notes.
Tests
- Added unit tests validating slab slicing and end-to-end conversion/round-trip behavior.
Chores
- Added packaging metadata and a console entry point for the new CLI.

Implements the 'cleed-convert' tool to slice 3D slabs into CLEED .bul/.inp files. Resolves #51

sourcery-ai · 2025-12-20T11:39:22Z

Reviewer's Guide

Introduces a new Python-based cleed-io bridge package (with slicer, writer, and CLI) plus documentation, enabling conversion of ASE-readable structures (e.g., POSCAR/CIF/XYZ) into CLEED .bul and .inp files.

Sequence diagram for cleed-convert structure conversion pipeline

sequenceDiagram
    actor User
    participant CLI as cleed_convert_CLI
    participant ASE as ASE_io
    participant Slicer as cleed_io_slicer
    participant Writer as cleed_io_writer
    participant FS as File_system

    User->>CLI: cleed-convert -i POSCAR -o ni111_co --bulk-layers 2
    CLI->>ASE: read(input, format)
    ASE-->>CLI: Atoms_object

    CLI->>Slicer: slice_slab(Atoms_object, bulk_layers_count)
    Slicer->>Slicer: align_surface(Atoms)
    Slicer->>Slicer: cluster_layers(Atoms)
    Slicer-->>CLI: bulk_atoms, overlayer_atoms

    CLI->>Writer: write_bul(output_basename.bul, bulk_atoms)
    Writer->>FS: open(.bul, write)
    Writer->>FS: write lattice_vectors, bulk_atoms, parameters
    FS-->>Writer: close(.bul)

    CLI->>Writer: write_inp(output_basename.inp, overlayer_atoms, bulk_atoms)
    Writer->>FS: open(.inp, write)
    Writer->>FS: write superstructure, overlayer_atoms, search_parameters
    FS-->>Writer: close(.inp)

    Writer-->>CLI: success
    CLI-->>User: Done.

Class diagram for cleed-io package modules and key functions

classDiagram
    class CleedIOPackage {
      +string __version__
    }

    class SlicerModule {
      +align_surface(atoms, vacuum_direction, tolerance) Atoms
      +cluster_layers(atoms, tolerance) list
      +slice_slab(atoms, bulk_layers_count) tuple
    }

    class WriterModule {
      +format_vector(label, v) string
      +format_atom(prefix, atom, phase_map, vib_default) string
      +write_bul(filename, bulk_atoms, phase_map) void
      +write_inp(filename, overlayer_atoms, bulk_atoms, phase_map) void
    }

    class CliModule {
      +main() void
    }

    class ExternalASEAtoms {
      +positions
      +cell
      +pbc
      +rotate(axis_vector, axis_name, rotate_cell)
      +get_cell()
      +get_pbc()
      +set_cell(cell)
      +get_chemical_formula()
    }

    CleedIOPackage <|-- SlicerModule
    CleedIOPackage <|-- WriterModule
    CleedIOPackage <|-- CliModule

    SlicerModule ..> ExternalASEAtoms : uses
    WriterModule ..> ExternalASEAtoms : uses
    CliModule ..> SlicerModule : calls_slice_slab
    CliModule ..> WriterModule : calls_writers
    CliModule ..> ExternalASEAtoms : obtains_from_ASE_read

File-Level Changes

Change	Details	Files
Implements slab alignment and heuristic slab slicing into bulk and overlayer components using ASE `Atoms`.	Adds `align_surface` to rotate arbitrary ASE cells so surface vectors lie in the xy-plane with surface normal ~+z, assuming slab-like input. Adds `cluster_layers` to group atoms into z-layer stacks using a simple distance-based tolerance on z-coordinates. Adds `slice_slab` to call alignment and layer clustering, split atoms into bulk vs overlayer by bottom N layers, and construct a heuristic bulk c-vector from inter-layer spacing.	`tools/cleed_io/slicer.py`
Adds writers for CLEED `.bul` and `.inp` formats based on sliced ASE structures.	Adds helpers to format lattice vectors and atoms with optional phase label mapping and default vibrational parameters. Implements `write_bul` that writes lattice vectors, bulk atoms, and default VR/VI parameters from a bulk `Atoms` object. Implements `write_inp` that writes a default 1x1 superstructure, overlayer atoms, and default search energy parameters, using the bulk cell for in-plane periodicity.	`tools/cleed_io/writer.py`
Introduces a CLI tool `cleed-convert` to read modern structure files via ASE and emit CLEED input files.	Parses CLI arguments for input path, output basename, bulk-layer count, and optional ASE format override. Reads input with `ase.io.read`, slices the slab via `slice_slab`, and writes `.bul` and `.inp` using the writer helpers. Adds simple progress and error reporting to stdout/stderr and non-zero exit codes on failure.	`tools/cleed_io/cli.py`
Defines the `cleed-io` Python package metadata and entry point, and exposes a top-level package API.	Adds `pyproject.toml` under `tools/` configuring setuptools build-backend, core dependencies (`ase`, `numpy`, `pyyaml`), Python version requirement, and console script entry point `cleed-convert=cleed_io.cli:main`. Adds a minimal `__init__.py` with package docstring and version constant.	`tools/pyproject.toml` `tools/cleed_io/__init__.py`
Documents the modern I/O workflow and usage of the new conversion tools.	Adds `modern-io.md` describing motivation, installation, example commands, bulk-layer configuration, and underlying algorithm steps (alignment, slicing, separation).	`docs/modern-io.md`

Assessment against linked issues

Issue	Objective	Addressed
#51	Implement a Python-based preprocessor/bridge that uses ASE to read modern structure files (e.g., POSCAR, CIF, XYZ) and prepare data for CLEED.	✅
#51	Provide an automated slicing algorithm that converts a 3D slab/supercell into CLEED-style Bulk and Overlayer components and outputs corresponding .bul and .inp files.	✅
#51	Add documentation describing the motivation, installation, and usage of the new Python bridge workflow for modern I/O.	✅

coderabbitai · 2025-12-20T11:39:29Z

Walkthrough

Adds a Python-based CLEED I/O bridge: documentation, an ASE-backed CLI (cleed-convert), slab alignment and layer-slicing logic, readers/writers for CLEED .bul/.inp files, package metadata/entry point, and unit tests exercising slicing and round-trip IO.

Changes

Cohort / File(s)	Summary
Documentation `docs/modern-io.md`	New guide describing the modern I/O workflow, `cleed-convert` usage, supported formats (via ASE), options (e.g., `--bulk-layers`), and internal slicing/alignment behaviour and conventions.
Package Init `tools/cleed_io/__init__.py`	New package initializer with module docstring and `__version__ = "0.1.0"`.
CLI `tools/cleed_io/cli.py`	New CLI (`cleed-convert`) using argparse; supports mutually exclusive `--to-cleed`/`--from-cleed` modes, reads/writes via ASE, invokes slicer and reader/writer helpers, prints progress and exits on errors; exposes `main()`.
Geometry processing `tools/cleed_io/slicer.py`	New slicer module: `align_surface(atoms)`, `cluster_layers(atoms, tolerance=0.2)`, and `slice_slab(atoms, bulk_layers_count=2)` to orient slabs, group atoms into layers by z, and split bulk vs overlayer (contains TODOs for robust a3/unit-cell derivation).
CLEED file writers `tools/cleed_io/writer.py`	New writer helpers: `format_vector()`, `format_atom()`, `write_bul(filename, bulk_atoms, phase_map=None)`, and `write_inp(filename, overlayer_atoms, bulk_atoms, phase_map=None)` that emit `.bul` and `.inp` text files from ASE Atoms.
CLEED file reader `tools/cleed_io/reader.py`	New reader utilities: `parse_vector(line)`, `parse_atom(line)`, and `read_cleed_files(bul_file, inp_file)` to parse `.bul`/`.inp` and reconstruct an ASE Atoms object; raises on missing lattice vectors.
Project manifest & entry point `tools/pyproject.toml`	New pyproject defining build-system, metadata, dependencies (`ase>=3.22.0`, `numpy`, `pyyaml`), Python >=3.8, console entry `cleed-convert = cleed_io.cli:main`, and optional `phaseshifts` group.
Tests `tools/tests/test_io.py`	New unit tests `TestCleedIO` with `test_slice_slab` and `test_round_trip` that build a Ni(111)+O slab, validate slicing, write/read `.bul`/`.inp`, and assert round-trip atom counts/formula and lattice alignment.
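
The round-trip behaviour that the tests exercise depends on the writer and reader agreeing on a line layout. A minimal sketch of that pairing follows; the `label: x y z` layout, the four-decimal precision, and the `(label, vector)` return shape are assumptions for illustration and are not taken from writer.py or reader.py.

```python
from typing import Sequence, Tuple


def format_vector(label: str, v: Sequence[float]) -> str:
    """Render a lattice vector as 'label: x y z' (layout assumed here)."""
    return f"{label}: {v[0]:10.4f} {v[1]:10.4f} {v[2]:10.4f}"


def parse_vector(line: str) -> Tuple[str, Tuple[float, ...]]:
    """Inverse of format_vector: split off the label, then read three floats."""
    label, sep, rest = line.partition(":")
    parts = rest.split()
    if not sep or len(parts) < 3:
        raise ValueError(f"not a vector line: {line!r}")
    return label.strip(), tuple(float(p) for p in parts[:3])


line = format_vector("a1", (1.245, -2.1564, 0.0))
print(parse_vector(line))
```

Fixed precision in the writer means a round trip only reproduces coordinates to that precision, so comparisons in tests should use a tolerance rather than exact equality for arbitrary inputs.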

Sequence Diagram(s)

sequenceDiagram
    participant User as User/CLI
    participant CLI as cleed_io.cli
    participant ASE as ASE (reader/writer)
    participant Slicer as cleed_io.slicer
    participant Writer as cleed_io.writer
    participant Reader as cleed_io.reader
    participant FS as File System

    User->>CLI: cleed-convert input.poscar -o output --to-cleed
    CLI->>ASE: read(input.poscar)
    ASE-->>CLI: Atoms object
    CLI->>Slicer: slice_slab(Atoms, bulk_layers_count)
    Slicer-->>CLI: (bulk_atoms, overlayer_atoms)
    CLI->>Writer: write_bul(output.bul, bulk_atoms)
    Writer->>FS: create/write output.bul
    CLI->>Writer: write_inp(output.inp, overlayer_atoms, bulk_atoms)
    Writer->>FS: create/write output.inp
    FS-->>CLI: success
    CLI-->>User: print success message

    alt round-trip
      User->>CLI: cleed-convert --from-cleed output.bul output.inp -o out.poscar
      CLI->>Reader: read_cleed_files(output.bul, output.inp)
      Reader-->>CLI: Atoms object
      CLI->>ASE: write(out.poscar, Atoms)
      ASE-->>FS: write out.poscar
      CLI-->>User: print success message
    end
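
The two modes in the diagram above could be wired up with argparse roughly as follows. This is a sketch under assumptions, not the contents of cli.py: the positional `inputs` argument, the default of treating an unflagged call as `--to-cleed`, and the input-count checks are all illustrative choices.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="cleed-convert")
    # The two conversion directions cannot be combined in one call.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--to-cleed", action="store_true",
                      help="structure file -> .bul/.inp (assumed default)")
    mode.add_argument("--from-cleed", action="store_true",
                      help=".bul + .inp -> structure file")
    parser.add_argument("inputs", nargs="+", help="one structure file, or a .bul and an .inp")
    parser.add_argument("-o", "--output", required=True)
    parser.add_argument("--bulk-layers", type=int, default=2)
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    expected = 2 if args.from_cleed else 1
    if len(args.inputs) != expected:
        # parser.error prints usage and exits with status 2.
        parser.error(f"expected {expected} input file(s), got {len(args.inputs)}")
    return args


print(parse_args(["input.poscar", "-o", "output", "--to-cleed"]))
print(parse_args(["--from-cleed", "output.bul", "output.inp", "-o", "out.poscar"]))
```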

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

ASE/POSCAR adapter + slicer port from CLEED phaseshifts#149 — Implements the same ASE/POSCAR ↔ CLEED I/O bridge (slicer and reader/writer logic), likely covering overlapping objectives.

Suggested labels

documentation, enhancement

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 53.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: introducing a Python bridge for modern I/O format conversion (POSCAR/ASE to CLEED), which is the primary objective of the PR.
Linked Issues check	✅ Passed	The PR implements all core objectives from #51: ASE-based format conversion, automated slab slicing algorithm, .bul/.inp file generation, CLI tool, documentation, and support for multiple file formats through ASE.
Out of Scope Changes check	✅ Passed	All changes are directly related to the Python Bridge feature: documentation, core conversion logic (slicer, reader, writer), CLI tool, package metadata, and integration tests. No unrelated modifications detected.

codacy-production · 2025-12-20T11:40:20Z

Codacy's Analysis Summary

28 new issues (≤ 0 issue)
0 new security issue
59 complexity
0 duplications

sourcery-ai

Hey - I've found 5 issues, and left some high level feedback:

In align_surface, the vacuum_direction and tolerance parameters are unused and the negative-c case is explicitly ignored (pass), which makes the function’s behavior hard to reason about; consider either implementing these options or removing/renaming them and making the z-orientation handling explicit and deterministic.
The bulk cell construction in slice_slab uses dz * len(layers) for the a3 vector even though the comment describes a3 as the inter-layer spacing; if the intent is to represent the bulk periodicity, this should likely be just dz (or a clearly derived periodic translation), otherwise the .bul output will encode the full slab thickness instead of the bulk repeat.
In the CLI, ase.io.read can return a list of Atoms objects for multi-frame inputs (e.g., trajectories); it may be safer to explicitly handle that case (e.g., select the first frame or error out) so downstream slicing logic always receives a single Atoms instance.
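
One way to act on the multi-frame point is a small guard between reading and slicing. The helper below is hypothetical (`ensure_single_frame` is not part of the PR) and deliberately library-agnostic: it only checks whether it was handed a list or tuple. ASE's documented default for `ase.io.read` is to return the last configuration, with a list returned when `index` is a slice such as `":"`, so the guard matters mainly when a caller overrides `index`.

```python
from typing import Any


def ensure_single_frame(structure: Any, frame: int = 0) -> Any:
    """Return one structure whether given a single object or a list of frames.

    `frame` selects which entry to keep when a list is passed; 0 picks the
    first frame, as suggested in the review comment.
    """
    if isinstance(structure, (list, tuple)):
        if not structure:
            raise ValueError("input contains no frames")
        return structure[frame]
    return structure


# Strings stand in for Atoms objects here.
print(ensure_single_frame(["frame0", "frame1"]))
print(ensure_single_frame("single"))
```

Raising on an empty list keeps the failure at the read step instead of surfacing later inside the slicer.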

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `tools/cleed_io/slicer.py:39-44` </location>
<code_context>
+    atoms.rotate(normal, 'z', rotate_cell=True)
+    
+    # Ensure c is positive z
+    final_cell = atoms.get_cell()
+    if final_cell[2][2] < 0:
+        # Flip z? No, that mirrors the structure. 
+        # Just ensure the slab is oriented correctly.
+        # ViPErLEED: "positive z-direction pointing away from the surface into the vacuum"
+        pass
+
+    return atoms
</code_context>

<issue_to_address>
**issue (bug_risk):** The negative z-component of `c` is detected but never corrected, which can violate the stated CLEED convention.

Right now, when `final_cell[2][2] < 0` we detect the issue but do nothing, so the returned structure can violate the documented CLEED convention (+z into vacuum) without any signal to the caller. Consider either applying a proper 180° rotation around x or y to fix the sign while preserving handedness, raising an error instead of silently continuing, or clearly documenting that the orientation is not enforced.
</issue_to_address>

### Comment 2
<location> `tools/cleed_io/slicer.py:125-132` </location>
<code_context>
+    if len(layers) >= 2:
+        z1 = np.mean(aligned.positions[layers[0], 2])
+        z2 = np.mean(aligned.positions[layers[1], 2])
+        dz = z2 - z1
+        # This assumes vertical stacking.
+        # Ideally, we find the translation vector between layer 0 and layer 1.
+        
+        # Heuristic: Use the slab cell for a1, a2.
+        # Construct a3 from the layer spacing.
+        cell = aligned.get_cell()
+        bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3
+        
+    return bulk_atoms, overlayer_atoms
</code_context>

<issue_to_address>
**suggestion (bug_risk):** Using `dz * len(layers)` for the bulk a3 vector is a questionable heuristic and likely misrepresents the bulk periodicity.

Here `dz` is the spacing between the first two layers, but a3 is scaled by `len(layers)` (including any overlayer). This will often make a3 larger than the true bulk repeat and tie it to whether an overlayer is present. If this is only a placeholder, consider using `dz` (or `dz * bulk_layer_count` if appropriate) and making that explicit in the API (e.g., flag/metadata), or allow the caller to provide the bulk a3 explicitly.

Suggested implementation:

```python
        # Heuristic: Use the slab cell for a1, a2.
        # Construct a3 from the inter-layer spacing between the first two bulk-like layers.
        # NOTE: This is a placeholder for the bulk c vector; callers should supply the
        # true bulk a3 if known.
        cell = aligned.get_cell()
        bulk_c = abs(dz)
        bulk_atoms.set_cell([cell[0], cell[1], [0, 0, bulk_c]])  # Dummy a3 based on first-layer spacing

```

To fully address the concern and make the heuristic explicit in the API:
1. Consider adding an optional parameter (e.g. `bulk_a3=None` or `bulk_c=None`) to this function so callers can provide the correct bulk periodicity when available.
2. If such a parameter is added, use it here instead of the placeholder when provided, and fall back to the `abs(dz)` heuristic only when it is not given.
3. Optionally attach a flag/metadata attribute on `bulk_atoms` (e.g. `bulk_atoms.info["a3_is_placeholder"] = True`) when using the heuristic, so downstream code can detect and handle this.
</issue_to_address>

### Comment 3
<location> `tools/cleed_io/slicer.py:5-14` </location>
<code_context>
+from ase import Atoms
+from ase.geometry import cell_to_cellpar
+
+def align_surface(atoms: Atoms, vacuum_direction='z', tolerance=1e-3) -> Atoms:
+    """
+    Aligns the atoms object such that the periodic surface vectors (a, b)
+    lie in the xy-plane (z=0) and the surface normal points along +z.
+    
+    This matches the CLEED convention where z is the surface normal.
+    """
+    # 1. Identify the non-periodic direction or the longest vector if 3D periodic
+    # For a standard slab, we assume c is the vacuum direction.
+    
+    cell = atoms.get_cell()
+    pbc = atoms.get_pbc()
+    
+    # We assume standard ASE slab convention: 3rd vector is out-of-plane
</code_context>

<issue_to_address>
**suggestion:** `vacuum_direction` (and `pbc`) are unused, which could confuse callers about supported orientations.

The implementation assumes the third cell vector is out-of-plane and always maps `cell[0]` to x, so the current behavior doesn’t reflect the `vacuum_direction`/`pbc` arguments. Either add logic to honor these parameters or remove them from the API for now so callers aren’t misled about what orientations are actually supported.

Suggested implementation:

```python
def align_surface(atoms: Atoms) -> Atoms:
    """
    Aligns the atoms object such that the periodic surface vectors (a, b)
    lie in the xy-plane (z=0) and the surface normal points along +z.

    This matches the CLEED convention where z is the surface normal.

    Note:
        The current implementation assumes the standard ASE slab convention
        where the third cell vector is the out-of-plane (vacuum) direction.
        Other orientations are not yet supported.
    """
    # 1. Identify the non-periodic direction or the longest vector if 3D periodic
    # For a standard slab, we assume c is the vacuum direction.

    cell = atoms.get_cell()

```

1. All call sites of `align_surface` must be updated to drop the `vacuum_direction` and `tolerance` arguments, e.g. `align_surface(atoms)` instead of `align_surface(atoms, vacuum_direction='z')`.
2. If the `pbc` variable is used later in this function (outside the snippet shown), reintroduce it at the point of first use; otherwise, keep it removed to avoid unused-variable warnings.
</issue_to_address>

### Comment 4
<location> `tools/cleed_io/writer.py:39-19` </location>
<code_context>
+    """
+    Writes the CLEED .inp file.
+    """
+    cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually
+    
+    with open(filename, 'w') as f:
+        f.write(f"c: Generated by cleed-convert\n")
+        
</code_context>

<issue_to_address>
**nitpick:** `cell` is computed but never used in `write_inp`, suggesting either dead code or a missing use.

If the overlayer `.inp` truly doesn’t need lattice vectors, remove this assignment. If you expect to use the bulk 2D periodicity later (e.g., for superstructure or overlayer positioning), either hook `cell` into that logic now or add a clear TODO so its purpose is explicit.
</issue_to_address>

### Comment 5
<location> `docs/modern-io.md:7` </location>
<code_context>
+
+## Motivation
+
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 
+
+The `cleed-convert` tool bridges this gap by automatically slicing a 3D slab into the required LEED components.
</code_context>

<issue_to_address>
**nitpick (typo):** Consider using the official capitalization "Quantum ESPRESSO".

Please update "Quantum Espresso" to "Quantum ESPRESSO" to match the official project name used in their documentation and common usage.

```suggestion
Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 
```
</issue_to_address>

sourcery-ai · 2025-12-20T11:40:34Z

tools/cleed_io/slicer.py

+    final_cell = atoms.get_cell()
+    if final_cell[2][2] < 0:
+        # Flip z? No, that mirrors the structure. 
+        # Just ensure the slab is oriented correctly.
+        # ViPErLEED: "positive z-direction pointing away from the surface into the vacuum"
+        pass


issue (bug_risk): The negative z-component of c is detected but never corrected, which can violate the stated CLEED convention.

Right now, when final_cell[2][2] < 0 we detect the issue but do nothing, so the returned structure can violate the documented CLEED convention (+z into vacuum) without any signal to the caller. Consider either applying a proper 180° rotation around x or y to fix the sign while preserving handedness, raising an error instead of silently continuing, or clearly documenting that the orientation is not enforced.
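
The first option, a 180° rotation about x, can be sketched on plain tuples. This is an illustration under assumptions: `enforce_positive_c` is a hypothetical name, and the cell and positions are taken as lists of 3-tuples rather than an ASE `Atoms` object. A rotation is a proper transformation, so handedness is unchanged, and vectors already in the $xy$-plane stay in it.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, float, float]


def rotate_180_about_x(v: Sequence[float]) -> Vector:
    """Rotate a vector by 180 degrees about the x-axis: (x, y, z) -> (x, -y, -z)."""
    x, y, z = v
    return (x, -y, -z)


def enforce_positive_c(
    cell: Sequence[Sequence[float]], positions: Sequence[Sequence[float]]
) -> Tuple[List[Vector], List[Vector]]:
    """Hypothetical helper: rotate cell and atoms together if c points along -z."""
    if cell[2][2] >= 0:
        return [tuple(v) for v in cell], [tuple(p) for p in positions]
    return (
        [rotate_180_about_x(v) for v in cell],
        [rotate_180_about_x(p) for p in positions],
    )


cell = [(2.5, 0.0, 0.0), (1.25, 2.165, 0.0), (0.0, 0.0, -20.0)]
positions = [(0.0, 0.0, -1.0), (1.25, 0.72, -3.0)]
new_cell, new_positions = enforce_positive_c(cell, positions)
print(new_cell[2], new_positions)
```

This only fixes the sign of the z-component of c. Whether the adsorbate side of the slab ends up facing +z still has to be checked separately, for example by comparing where the overlayer species sit relative to the slab centre.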

sourcery-ai · 2025-12-20T11:40:34Z

tools/cleed_io/slicer.py

+        dz = z2 - z1
+        # This assumes vertical stacking.
+        # Ideally, we find the translation vector between layer 0 and layer 1.
+
+        # Heuristic: Use the slab cell for a1, a2.
+        # Construct a3 from the layer spacing.
+        cell = aligned.get_cell()
+        bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3


suggestion (bug_risk): Using dz * len(layers) for the bulk a3 vector is a questionable heuristic and likely misrepresents the bulk periodicity.

Here dz is the spacing between the first two layers, but a3 is scaled by len(layers) (including any overlayer). This will often make a3 larger than the true bulk repeat and tie it to whether an overlayer is present. If this is only a placeholder, consider using dz (or dz * bulk_layer_count if appropriate) and making that explicit in the API (e.g., flag/metadata), or allow the caller to provide the bulk a3 explicitly.

Suggested implementation:

        # Heuristic: Use the slab cell for a1, a2.
        # Construct a3 from the inter-layer spacing between the first two bulk-like layers.
        # NOTE: This is a placeholder for the bulk c vector; callers should supply the
        # true bulk a3 if known.
        cell = aligned.get_cell()
        bulk_c = abs(dz)
        bulk_atoms.set_cell([cell[0], cell[1], [0, 0, bulk_c]])  # Dummy a3 based on first-layer spacing

To fully address the concern and make the heuristic explicit in the API:

Consider adding an optional parameter (e.g. bulk_a3=None or bulk_c=None) to this function so callers can provide the correct bulk periodicity when available.

If such a parameter is added, use it here instead of the placeholder when provided, and fall back to the abs(dz) heuristic only when it is not given.

Optionally attach a flag/metadata attribute on bulk_atoms (e.g. bulk_atoms.info["a3_is_placeholder"] = True) when using the heuristic, so downstream code can detect and handle this.
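
The three follow-up points can be combined in one small function. The sketch below is an assumption-laden illustration, not the PR's code: `choose_bulk_a3` is a hypothetical name, it takes mean layer heights as plain floats, and it returns the placeholder flag alongside the vector instead of writing to `bulk_atoms.info`.

```python
from typing import List, Optional, Sequence, Tuple


def choose_bulk_a3(
    layer_heights: Sequence[float],
    bulk_a3: Optional[Sequence[float]] = None,
) -> Tuple[List[float], bool]:
    """Return (a3, is_placeholder).

    A caller-supplied bulk_a3 always wins. Otherwise fall back to the spacing
    between the two lowest layers, which assumes vertical stacking.
    """
    if bulk_a3 is not None:
        return [float(c) for c in bulk_a3], False
    if len(layer_heights) < 2:
        raise ValueError("need at least two layers to estimate a spacing")
    dz = abs(layer_heights[1] - layer_heights[0])
    return [0.0, 0.0, dz], True


print(choose_bulk_a3([0.0, 2.03, 4.06]))
print(choose_bulk_a3([0.0, 2.03], bulk_a3=(1.2, 0.7, 2.0)))
```

Returning the flag makes the heuristic visible at the call site; the caller can then store it in metadata or warn the user that the `.bul` repeat vector was estimated.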

tools/cleed_io/slicer.py

sourcery-ai · 2025-12-20T11:40:34Z

tools/cleed_io/writer.py

+    """
+    cell = bulk_atoms.get_cell()
+
+    with open(filename, 'w') as f:


nitpick: cell is computed but never used in write_inp, suggesting either dead code or a missing use.

If the overlayer .inp truly doesn’t need lattice vectors, remove this assignment. If you expect to use the bulk 2D periodicity later (e.g., for superstructure or overlayer positioning), either hook cell into that logic now or add a clear TODO so its purpose is explicit.

sourcery-ai · 2025-12-20T11:40:34Z

docs/modern-io.md

+
+## Motivation
+
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 


nitpick (typo): Consider using the official capitalization "Quantum ESPRESSO".

Please update "Quantum Espresso" to "Quantum ESPRESSO" to match the official project name used in their documentation and common usage.

Suggested change

-Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).
+Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).

Adds 'reader.py' to parse legacy CLEED files and updates CLI to support '--from-cleed' export to POSCAR/CIF. Includes unit tests.

coderabbitai

Actionable comments posted: 11

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d5ba80a and 79aa2d2.

📒 Files selected for processing (6)

docs/modern-io.md (1 hunks)
tools/cleed_io/__init__.py (1 hunks)
tools/cleed_io/cli.py (1 hunks)
tools/cleed_io/slicer.py (1 hunks)
tools/cleed_io/writer.py (1 hunks)
tools/pyproject.toml (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-12-20T09:06:56.375Z

Learnt from: CR
Repo: Liam-Deacon/CLEED PR: 0
File: doc/AGENTS.md:0-0
Timestamp: 2025-12-20T09:06:56.375Z
Learning: Applies to doc/CLEED_Manual.pdf : Keep `doc/CLEED_Manual.pdf` tracked in git as the canonical reference manual for the original CLEED program suite

Applied to files:

docs/modern-io.md

🧬 Code graph analysis (1)

tools/cleed_io/cli.py (2)

tools/cleed_io/slicer.py (1)

slice_slab (75-134)

tools/cleed_io/writer.py (2)

write_bul (13-33)

write_inp (35-56)

🪛 GitHub Check: Codacy Static Code Analysis

docs/modern-io.md

[notice] 30-30: docs/modern-io.md#L30
Expected: 1; Actual: 3

[notice] 30-30: docs/modern-io.md#L30
Lists should be surrounded by blank lines

[notice] 31-31: docs/modern-io.md#L31
Expected: 1; Actual: 3

[notice] 44-44: docs/modern-io.md#L44
Expected: 1; Actual: 2

[notice] 45-45: docs/modern-io.md#L45
Expected: 1; Actual: 2

[notice] 46-46: docs/modern-io.md#L46
Expected: 1; Actual: 2

[notice] 47-47: docs/modern-io.md#L47
Expected: 1; Actual: 3

[notice] 48-48: docs/modern-io.md#L48
Expected: 1; Actual: 3

[notice] 53-53: docs/modern-io.md#L53
Expected: 1; Actual: 3

[notice] 53-53: docs/modern-io.md#L53
Lists should be surrounded by blank lines

[notice] 54-54: docs/modern-io.md#L54
Expected: 1; Actual: 3

tools/cleed_io/cli.py

[notice] 38-38: tools/cleed_io/cli.py#L38
expected 2 blank lines after class or function definition, found 1 (E305)

tools/cleed_io/slicer.py

[warning] 3-3: tools/cleed_io/slicer.py#L3
'ase.geometry.cell_to_cellpar' imported but unused (F401)

[warning] 3-3: tools/cleed_io/slicer.py#L3
Unused cell_to_cellpar imported from ase.geometry

[notice] 16-16: tools/cleed_io/slicer.py#L16
Unused variable 'pbc'

[warning] 16-16: tools/cleed_io/slicer.py#L16
local variable 'pbc' is assigned to but never used (F841)

tools/cleed_io/writer.py

[warning] 1-1: tools/cleed_io/writer.py#L1
'numpy as np' imported but unused (F401)

[warning] 1-1: tools/cleed_io/writer.py#L1
Unused numpy imported as np

[notice] 39-39: tools/cleed_io/writer.py#L39
Unused variable 'cell'

[warning] 39-39: tools/cleed_io/writer.py#L39
local variable 'cell' is assigned to but never used (F841)

[warning] 42-42: tools/cleed_io/writer.py#L42
f-string is missing placeholders (F541)

🪛 markdownlint-cli2 (0.18.1)

docs/modern-io.md

7-7: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

🪛 Ruff (0.14.8)

tools/cleed_io/cli.py

34-34: Do not catch blind exception: Exception

(BLE001)

tools/cleed_io/slicer.py

5-5: Unused function argument: vacuum_direction

(ARG001)

5-5: Unused function argument: tolerance

(ARG001)

16-16: Local variable pbc is assigned to but never used

Remove assignment to unused variable pbc

(F841)

96-96: Avoid specifying long messages outside the exception class

(TRY003)

tools/cleed_io/writer.py

39-39: Local variable cell is assigned to but never used

Remove assignment to unused variable cell

(F841)

42-42: f-string without any placeholders

Remove extraneous f prefix

(F541)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Codacy Static Code Analysis
GitHub Check: windows-latest

🔇 Additional comments (7)

tools/cleed_io/__init__.py (1)

1-6: LGTM!

Clean package initializer with appropriate version metadata.

docs/modern-io.md (1)

1-54: Excellent documentation.

The documentation clearly explains the motivation, installation, usage, and technical details of the modern I/O bridge. The examples are practical and helpful.

tools/cleed_io/cli.py (2)

34-36: Broad exception catch is acceptable for CLI error handling.

The static analyzer flags the blind `except Exception` (BLE001), but a catch-all is appropriate in a CLI tool's top-level error handler, where every failure should be caught and reported to the user rather than surfacing as a traceback.
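A minimal sketch of the pattern being accepted here, assuming a single catch-all at the outermost entry point. The `run` worker is a hypothetical stand-in for the conversion logic, and the sketch is not the PR's actual `cli.py`; only the `cleed-convert` name and the `--input` flag come from the PR description.

```python
import argparse
import sys


def run(args):
    """Hypothetical worker standing in for the real conversion logic."""
    if not args.input:
        raise ValueError("no input file given")
    return f"converted {args.input}"


def main(argv=None):
    parser = argparse.ArgumentParser(prog="cleed-convert")
    parser.add_argument("--input", default="")
    args = parser.parse_args(argv)
    try:
        print(run(args))
    except Exception as exc:  # deliberate catch-all at the CLI boundary only
        # Report a short message on stderr instead of a traceback.
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

Returning an exit code from `main` instead of calling `sys.exit` inside the handler keeps the function easy to test; a caller can still wrap it as `sys.exit(main())`. Library code below this boundary would normally catch narrower exception types.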

7-33: Well-structured CLI implementation.

The CLI follows best practices with clear argument parsing, progress messages, and proper error handling. The workflow is straightforward and user-friendly.

tools/cleed_io/writer.py (1)

4-11: Well-designed formatting utilities.

The format_vector and format_atom functions provide clean, reusable formatting with appropriate defaults and optional customization.

tools/cleed_io/slicer.py (2)

48-73: Solid layer clustering implementation.

The cluster_layers function correctly groups atoms by z-coordinate with appropriate tolerance handling and edge cases (empty structure, single atom).
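For readers without the diff at hand, tolerance-based layer grouping can be sketched as follows. This is an illustrative stand-in, not the PR's `cluster_layers`: the function name `cluster_by_z`, its signature, and the default tolerance are assumptions.

```python
def cluster_by_z(z_coords, tolerance=0.1):
    """Group atom indices into layers of similar z, lowest layer first."""
    if not z_coords:
        return []  # empty structure: no layers
    # Visit atoms from the lowest z to the highest.
    order = sorted(range(len(z_coords)), key=lambda i: z_coords[i])
    layers = [[order[0]]]
    for idx in order[1:]:
        # Compare against the atom most recently added to the current layer.
        if z_coords[idx] - z_coords[layers[-1][-1]] <= tolerance:
            layers[-1].append(idx)
        else:
            layers.append([idx])
    return layers
```

Because each atom is compared with its predecessor rather than with the layer mean, a chain of closely spaced atoms can merge into one thick layer; a production implementation may prefer comparing against a running layer average.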

25-36: The rotation logic is correctly implemented and matches the behavior of ASE's rotate() method. When a is given as a vector, rotate(a, v) rotates a into the direction v (rather than rotating around v), which is exactly what the code intends. Line 26 aligns the cell[0] vector with the x-axis, and line 36 aligns the surface normal with the z-axis, standardizing the slab orientation for CLEED. The comments and logic match the surface crystallography convention. No critical issues found.

Likely an incorrect or invalid review comment.
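To make the "rotate a into v" semantics concrete, here is a dependency-free sketch based on Rodrigues' rotation formula. It illustrates the geometry only and is not ASE's implementation; the helper names (`rotate_into`, `_cross`, `_dot`, `_unit`) are hypothetical.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(a):
    norm = math.sqrt(_dot(a, a))
    return tuple(x / norm for x in a)


def rotate_into(point, a, v):
    """Apply to `point` the rotation that carries direction a onto direction v."""
    a, v = _unit(a), _unit(v)
    axis = _cross(a, v)
    s = math.sqrt(_dot(axis, axis))  # sine of the angle between a and v
    c = _dot(a, v)                   # cosine of the same angle
    if s < 1e-12:
        if c > 0:
            return tuple(point)      # already aligned, nothing to do
        raise ValueError("antiparallel vectors: rotation axis is ambiguous")
    k = tuple(x / s for x in axis)   # unit rotation axis
    k_cross_p = _cross(k, point)
    k_dot_p = _dot(k, point)
    # Rodrigues: p*cos + (k x p)*sin + k*(k . p)*(1 - cos)
    return tuple(p * c + kp * s + ki * k_dot_p * (1 - c)
                 for p, kp, ki in zip(point, k_cross_p, k))
```

For example, `rotate_into((0, 1, 1), (0, 1, 1), (0, 0, 1))` sends a tilted surface normal onto the z-axis while preserving its length, which is the kind of alignment the slicer performs before writing CLEED input.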

coderabbitai · 2025-12-20T11:43:41Z

docs/modern-io.md

+
+## Motivation
+
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 


🧹 Nitpick | 🔵 Trivial

Remove trailing space.

Markdownlint detected a trailing space at the end of this line.

🔎 Proposed fix

-Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).

🤖 Prompt for AI Agents

In docs/modern-io.md around line 7, remove the trailing whitespace at the end of the line that reads 'Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).' by deleting the extra space character at the end so the line ends with the period; re-save the file to ensure markdownlint no longer flags a trailing space.

tools/cleed_io/slicer.py

tools/cleed_io/writer.py

coderabbitai · 2025-12-20T11:43:41Z

tools/pyproject.toml

+dependencies = [
+    "ase>=3.22.0",
+    "numpy",
+    "pyyaml"
+]


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read the pyproject.toml file in tools directory
find . -path "*/tools/pyproject.toml" -type f | head -1 | xargs cat -n

Repository: Liam-Deacon/CLEED

Length of output: 667

Add version constraints to numpy and pyyaml dependencies.

The numpy and pyyaml dependencies lack version constraints, which could lead to unexpected breakage if major versions introduce incompatible changes (e.g., numpy's 1.x to 2.x transition). Consider specifying minimum and ideally maximum version bounds for these dependencies to ensure stability.

🤖 Prompt for AI Agents

In tools/pyproject.toml around lines 12 to 16, numpy and pyyaml are listed without version constraints. Add version bounds to both entries: at minimum a `>=` lower bound, and preferably a `<` upper bound below the next major version, so the project avoids breaking changes from major releases. Update the dependencies array to include those constraints and run dependency resolution to confirm compatibility.

coderabbitai · 2025-12-20T11:43:41Z

tools/pyproject.toml

+[project.optional-dependencies]
+phaseshifts = ["phaseshifts"]


🧹 Nitpick | 🔵 Trivial

Add version constraint for optional dependency.

The phaseshifts optional dependency has no version constraint, which could lead to compatibility issues.

🤖 Prompt for AI Agents

In tools/pyproject.toml around lines 22-23, the optional dependency entry "phaseshifts" has no version constraint; update the project.optional-dependencies entry to include a semantic version specifier (for example use a range like a minimum supported version and an upper bound to avoid breaking changes, or pin to a specific tested version), commit the change, and run your dependency resolver (poetry/pip-tools) to verify the constraint is valid and the lockfile updates accordingly.

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (2)

tools/cleed_io/cli.py (1)
64-65: Add blank line before module guard.

PEP 8 recommends two blank lines before top-level code following function definitions.
🔎 Proposed fix
         sys.exit(1)
 

 if __name__ == "__main__":
     main()
docs/modern-io.md (1)
7-7: Fix trailing space and capitalization.

Line 7 has a trailing space and uses "Quantum Espresso" instead of the official "Quantum ESPRESSO" capitalization.
🔎 Proposed fix
-Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). 
+Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).

📜 Review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79aa2d2 and a417fae.

📒 Files selected for processing (4)

docs/modern-io.md (1 hunks)
tools/cleed_io/cli.py (1 hunks)
tools/cleed_io/reader.py (1 hunks)
tools/tests/test_io.py (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-12-20T09:06:56.375Z

Learnt from: CR
Repo: Liam-Deacon/CLEED PR: 0
File: doc/AGENTS.md:0-0
Timestamp: 2025-12-20T09:06:56.375Z
Learning: Applies to doc/CLEED_Manual.pdf : Keep `doc/CLEED_Manual.pdf` tracked in git as the canonical reference manual for the original CLEED program suite

Applied to files:

docs/modern-io.md

🧬 Code graph analysis (2)

tools/cleed_io/cli.py (3)

tools/cleed_io/slicer.py (1)

slice_slab (75-134)

tools/cleed_io/writer.py (2)

write_bul (13-33)

write_inp (35-56)

tools/cleed_io/reader.py (1)

read_cleed_files (49-114)

tools/tests/test_io.py (3)

tools/cleed_io/slicer.py (1)

slice_slab (75-134)

tools/cleed_io/reader.py (1)

read_cleed_files (49-114)

tools/cleed_io/writer.py (2)

write_bul (13-33)

write_inp (35-56)

🪛 GitHub Check: Codacy Static Code Analysis

tools/cleed_io/cli.py

[warning] 3-3: tools/cleed_io/cli.py#L3
'os' imported but unused (F401)

[warning] 3-3: tools/cleed_io/cli.py#L3
Unused import os

[notice] 64-64: tools/cleed_io/cli.py#L64
expected 2 blank lines after class or function definition, found 1 (E305)

docs/modern-io.md

[notice] 22-22: docs/modern-io.md#L22
Expected: 1; Actual: 3

[notice] 22-22: docs/modern-io.md#L22
Lists should be surrounded by blank lines

[notice] 23-23: docs/modern-io.md#L23
Expected: 1; Actual: 3

[notice] 24-24: docs/modern-io.md#L24
Expected: 1; Actual: 3

[notice] 25-25: docs/modern-io.md#L25
Expected: 1; Actual: 3

[notice] 26-26: docs/modern-io.md#L26
Expected: 1; Actual: 3

[notice] 27-27: docs/modern-io.md#L27
Expected: 1; Actual: 3

[notice] 40-40: docs/modern-io.md#L40
Expected: 1; Actual: 3

[notice] 40-40: docs/modern-io.md#L40
Lists should be surrounded by blank lines

[notice] 41-41: docs/modern-io.md#L41
Expected: 1; Actual: 3

[notice] 44-44: docs/modern-io.md#L44
Expected: 1; Actual: 3

[notice] 44-44: docs/modern-io.md#L44
Lists should be surrounded by blank lines

[notice] 58-58: docs/modern-io.md#L58
Expected: 1; Actual: 2

[notice] 59-59: docs/modern-io.md#L59
Expected: 1; Actual: 2

[notice] 60-60: docs/modern-io.md#L60
Expected: 1; Actual: 2

[notice] 61-61: docs/modern-io.md#L61
Expected: 1; Actual: 3

[notice] 62-62: docs/modern-io.md#L62
Expected: 1; Actual: 3

[notice] 67-67: docs/modern-io.md#L67
Expected: 1; Actual: 3

[notice] 67-67: docs/modern-io.md#L67
Lists should be surrounded by blank lines

[notice] 68-68: docs/modern-io.md#L68
Expected: 1; Actual: 3

tools/cleed_io/reader.py

[warning] 49-49: tools/cleed_io/reader.py#L49
Method read_cleed_files has a cyclomatic complexity of 18 (limit is 8)

🪛 markdownlint-cli2 (0.18.1)

docs/modern-io.md

7-7: Trailing spaces
Expected: 0 or 2; Actual: 1

(MD009, no-trailing-spaces)

68-68: Files should end with a single newline character

(MD047, single-trailing-newline)

🪛 Ruff (0.14.8)

tools/cleed_io/cli.py

60-60: Do not catch blind exception: Exception

(BLE001)

tools/tests/test_io.py

35-35: Use a regular assert instead of unittest-style assertEqual