feat: Add Python Bridge for Modern I/O (POSCAR/ASE)#52
feat: Add Python Bridge for Modern I/O (POSCAR/ASE)#52Liam-Deacon wants to merge 3 commits intomasterfrom
Conversation
Implements the 'cleed-convert' tool to slice 3D slabs into CLEED .bul/.inp files. Resolves #51
Reviewer's GuideIntroduces a new Python-based Sequence diagram for cleed-convert structure conversion pipelinesequenceDiagram
actor User
participant CLI as cleed_convert_CLI
participant ASE as ASE_io
participant Slicer as cleed_io_slicer
participant Writer as cleed_io_writer
participant FS as File_system
User->>CLI: cleed-convert -i POSCAR -o ni111_co --bulk-layers 2
CLI->>ASE: read(input, format)
ASE-->>CLI: Atoms_object
CLI->>Slicer: slice_slab(Atoms_object, bulk_layers_count)
Slicer->>Slicer: align_surface(Atoms)
Slicer->>Slicer: cluster_layers(Atoms)
Slicer-->>CLI: bulk_atoms, overlayer_atoms
CLI->>Writer: write_bul(output_basename.bul, bulk_atoms)
Writer->>FS: open(.bul, write)
Writer->>FS: write lattice_vectors, bulk_atoms, parameters
FS-->>Writer: close(.bul)
CLI->>Writer: write_inp(output_basename.inp, overlayer_atoms, bulk_atoms)
Writer->>FS: open(.inp, write)
Writer->>FS: write superstructure, overlayer_atoms, search_parameters
FS-->>Writer: close(.inp)
Writer-->>CLI: success
CLI-->>User: Done.
Class diagram for cleed-io package modules and key functionsclassDiagram
class CleedIOPackage {
+string __version__
}
class SlicerModule {
+align_surface(atoms, vacuum_direction, tolerance) Atoms
+cluster_layers(atoms, tolerance) list
+slice_slab(atoms, bulk_layers_count) tuple
}
class WriterModule {
+format_vector(label, v) string
+format_atom(prefix, atom, phase_map, vib_default) string
+write_bul(filename, bulk_atoms, phase_map) void
+write_inp(filename, overlayer_atoms, bulk_atoms, phase_map) void
}
class CliModule {
+main() void
}
class ExternalASEAtoms {
+positions
+cell
+pbc
+rotate(axis_vector, axis_name, rotate_cell)
+get_cell()
+get_pbc()
+set_cell(cell)
+get_chemical_formula()
}
CleedIOPackage <|-- SlicerModule
CleedIOPackage <|-- WriterModule
CleedIOPackage <|-- CliModule
SlicerModule ..> ExternalASEAtoms : uses
WriterModule ..> ExternalASEAtoms : uses
CliModule ..> SlicerModule : calls_slice_slab
CliModule ..> WriterModule : calls_writers
CliModule ..> ExternalASEAtoms : obtains_from_ASE_read
File-Level Changes
Assessment against linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
WalkthroughAdds a Python-based CLEED I/O bridge: documentation, an ASE-backed CLI ( Changes
Sequence Diagram(s)sequenceDiagram
participant User as User/CLI
participant CLI as cleed_io.cli
participant ASE as ASE (reader/writer)
participant Slicer as cleed_io.slicer
participant Writer as cleed_io.writer
participant Reader as cleed_io.reader
participant FS as File System
User->>CLI: cleed-convert input.poscar -o output --to-cleed
CLI->>ASE: read(input.poscar)
ASE-->>CLI: Atoms object
CLI->>Slicer: slice_slab(Atoms, bulk_layers_count)
Slicer-->>CLI: (bulk_atoms, overlayer_atoms)
CLI->>Writer: write_bul(output.bul, bulk_atoms)
Writer->>FS: create/write output.bul
CLI->>Writer: write_inp(output.inp, overlayer_atoms, bulk_atoms)
Writer->>FS: create/write output.inp
FS-->>CLI: success
CLI-->>User: print success message
alt round-trip
User->>CLI: cleed-convert --from-cleed output.bul output.inp -o out.poscar
CLI->>Reader: read_cleed_files(output.bul, output.inp)
Reader-->>CLI: Atoms object
CLI->>ASE: write(out.poscar, Atoms)
ASE-->>FS: write out.poscar
CLI-->>User: print success message
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related issues
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codacy's Analysis Summary28 new issues (≤ 0 issue) Review Pull Request in Codacy →
|
There was a problem hiding this comment.
Hey - I've found 5 issues, and left some high level feedback:
- In
align_surface, thevacuum_directionandtoleranceparameters are unused and the negative-ccase is explicitly ignored (pass), which makes the function’s behavior hard to reason about; consider either implementing these options or removing/renaming them and making the z-orientation handling explicit and deterministic. - The bulk cell construction in
slice_slabusesdz * len(layers)for thea3vector even though the comment describesa3as the inter-layer spacing; if the intent is to represent the bulk periodicity, this should likely be justdz(or a clearly derived periodic translation), otherwise the.buloutput will encode the full slab thickness instead of the bulk repeat. - In the CLI,
ase.io.readcan return a list ofAtomsobjects for multi-frame inputs (e.g., trajectories); it may be safer to explicitly handle that case (e.g., select the first frame or error out) so downstream slicing logic always receives a singleAtomsinstance.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `align_surface`, the `vacuum_direction` and `tolerance` parameters are unused and the negative-`c` case is explicitly ignored (`pass`), which makes the function’s behavior hard to reason about; consider either implementing these options or removing/renaming them and making the z-orientation handling explicit and deterministic.
- The bulk cell construction in `slice_slab` uses `dz * len(layers)` for the `a3` vector even though the comment describes `a3` as the inter-layer spacing; if the intent is to represent the bulk periodicity, this should likely be just `dz` (or a clearly derived periodic translation), otherwise the `.bul` output will encode the full slab thickness instead of the bulk repeat.
- In the CLI, `ase.io.read` can return a list of `Atoms` objects for multi-frame inputs (e.g., trajectories); it may be safer to explicitly handle that case (e.g., select the first frame or error out) so downstream slicing logic always receives a single `Atoms` instance.
## Individual Comments
### Comment 1
<location> `tools/cleed_io/slicer.py:39-44` </location>
<code_context>
+ atoms.rotate(normal, 'z', rotate_cell=True)
+
+ # Ensure c is positive z
+ final_cell = atoms.get_cell()
+ if final_cell[2][2] < 0:
+ # Flip z? No, that mirrors the structure.
+ # Just ensure the slab is oriented correctly.
+ # ViPErLEED: "positive z-direction pointing away from the surface into the vacuum"
+ pass
+
+ return atoms
</code_context>
<issue_to_address>
**issue (bug_risk):** The negative z-component of `c` is detected but never corrected, which can violate the stated CLEED convention.
Right now, when `final_cell[2][2] < 0` we detect the issue but do nothing, so the returned structure can violate the documented CLEED convention (+z into vacuum) without any signal to the caller. Consider either applying a proper 180° rotation around x or y to fix the sign while preserving handedness, raising an error instead of silently continuing, or clearly documenting that the orientation is not enforced.
</issue_to_address>
### Comment 2
<location> `tools/cleed_io/slicer.py:125-132` </location>
<code_context>
+ if len(layers) >= 2:
+ z1 = np.mean(aligned.positions[layers[0], 2])
+ z2 = np.mean(aligned.positions[layers[1], 2])
+ dz = z2 - z1
+ # This assumes vertical stacking.
+ # Ideally, we find the translation vector between layer 0 and layer 1.
+
+ # Heuristic: Use the slab cell for a1, a2.
+ # Construct a3 from the layer spacing.
+ cell = aligned.get_cell()
+ bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3
+
+ return bulk_atoms, overlayer_atoms
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Using `dz * len(layers)` for the bulk a3 vector is a questionable heuristic and likely misrepresents the bulk periodicity.
Here `dz` is the spacing between the first two layers, but a3 is scaled by `len(layers)` (including any overlayer). This will often make a3 larger than the true bulk repeat and tie it to whether an overlayer is present. If this is only a placeholder, consider using `dz` (or `dz * bulk_layer_count` if appropriate) and making that explicit in the API (e.g., flag/metadata), or allow the caller to provide the bulk a3 explicitly.
Suggested implementation:
```python
# Heuristic: Use the slab cell for a1, a2.
# Construct a3 from the inter-layer spacing between the first two bulk-like layers.
# NOTE: This is a placeholder for the bulk c vector; callers should supply the
# true bulk a3 if known.
cell = aligned.get_cell()
bulk_c = abs(dz)
bulk_atoms.set_cell([cell[0], cell[1], [0, 0, bulk_c]]) # Dummy a3 based on first-layer spacing
```
To fully address the concern and make the heuristic explicit in the API:
1. Consider adding an optional parameter (e.g. `bulk_a3=None` or `bulk_c=None`) to this function so callers can provide the correct bulk periodicity when available.
2. If such a parameter is added, use it here instead of the placeholder when provided, and fall back to the `abs(dz)` heuristic only when it is not given.
3. Optionally attach a flag/metadata attribute on `bulk_atoms` (e.g. `bulk_atoms.info["a3_is_placeholder"] = True`) when using the heuristic, so downstream code can detect and handle this.
</issue_to_address>
### Comment 3
<location> `tools/cleed_io/slicer.py:5-14` </location>
<code_context>
+from ase import Atoms
+from ase.geometry import cell_to_cellpar
+
+def align_surface(atoms: Atoms, vacuum_direction='z', tolerance=1e-3) -> Atoms:
+ """
+ Aligns the atoms object such that the periodic surface vectors (a, b)
+ lie in the xy-plane (z=0) and the surface normal points along +z.
+
+ This matches the CLEED convention where z is the surface normal.
+ """
+ # 1. Identify the non-periodic direction or the longest vector if 3D periodic
+ # For a standard slab, we assume c is the vacuum direction.
+
+ cell = atoms.get_cell()
+ pbc = atoms.get_pbc()
+
+ # We assume standard ASE slab convention: 3rd vector is out-of-plane
</code_context>
<issue_to_address>
**suggestion:** `vacuum_direction` (and `pbc`) are unused, which could confuse callers about supported orientations.
The implementation assumes the third cell vector is out-of-plane and always maps `cell[0]` to x, so the current behavior doesn’t reflect the `vacuum_direction`/`pbc` arguments. Either add logic to honor these parameters or remove them from the API for now so callers aren’t misled about what orientations are actually supported.
Suggested implementation:
```python
def align_surface(atoms: Atoms) -> Atoms:
"""
Aligns the atoms object such that the periodic surface vectors (a, b)
lie in the xy-plane (z=0) and the surface normal points along +z.
This matches the CLEED convention where z is the surface normal.
Note:
The current implementation assumes the standard ASE slab convention
where the third cell vector is the out-of-plane (vacuum) direction.
Other orientations are not yet supported.
"""
# 1. Identify the non-periodic direction or the longest vector if 3D periodic
# For a standard slab, we assume c is the vacuum direction.
cell = atoms.get_cell()
```
1. All call sites of `align_surface` must be updated to drop the `vacuum_direction` and `tolerance` arguments, e.g. `align_surface(atoms)` instead of `align_surface(atoms, vacuum_direction='z')`.
2. If the `pbc` variable is used later in this function (outside the snippet shown), reintroduce it at the point of first use; otherwise, keep it removed to avoid unused-variable warnings.
</issue_to_address>
### Comment 4
<location> `tools/cleed_io/writer.py:39-19` </location>
<code_context>
+ """
+ Writes the CLEED .inp file.
+ """
+ cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually
+
+ with open(filename, 'w') as f:
+ f.write(f"c: Generated by cleed-convert\n")
+
</code_context>
<issue_to_address>
**nitpick:** `cell` is computed but never used in `write_inp`, suggesting either dead code or a missing use.
If the overlayer `.inp` truly doesn’t need lattice vectors, remove this assignment. If you expect to use the bulk 2D periodicity later (e.g., for superstructure or overlayer positioning), either hook `cell` into that logic now or add a clear TODO so its purpose is explicit.
</issue_to_address>
### Comment 5
<location> `docs/modern-io.md:7` </location>
<code_context>
+
+## Motivation
+
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).
+
+The `cleed-convert` tool bridges this gap by automatically slicing a 3D slab into the required LEED components.
</code_context>
<issue_to_address>
**nitpick (typo):** Consider using the official capitalization "Quantum ESPRESSO".
Please update "Quantum Espresso" to "Quantum ESPRESSO" to match the official project name used in their documentation and common usage.
```suggestion
Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
| final_cell = atoms.get_cell() | ||
| if final_cell[2][2] < 0: | ||
| # Flip z? No, that mirrors the structure. | ||
| # Just ensure the slab is oriented correctly. | ||
| # ViPErLEED: "positive z-direction pointing away from the surface into the vacuum" | ||
| pass |
There was a problem hiding this comment.
issue (bug_risk): The negative z-component of c is detected but never corrected, which can violate the stated CLEED convention.
Right now, when final_cell[2][2] < 0 we detect the issue but do nothing, so the returned structure can violate the documented CLEED convention (+z into vacuum) without any signal to the caller. Consider either applying a proper 180° rotation around x or y to fix the sign while preserving handedness, raising an error instead of silently continuing, or clearly documenting that the orientation is not enforced.
| dz = z2 - z1 | ||
| # This assumes vertical stacking. | ||
| # Ideally, we find the translation vector between layer 0 and layer 1. | ||
|
|
||
| # Heuristic: Use the slab cell for a1, a2. | ||
| # Construct a3 from the layer spacing. | ||
| cell = aligned.get_cell() | ||
| bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3 |
There was a problem hiding this comment.
suggestion (bug_risk): Using dz * len(layers) for the bulk a3 vector is a questionable heuristic and likely misrepresents the bulk periodicity.
Here dz is the spacing between the first two layers, but a3 is scaled by len(layers) (including any overlayer). This will often make a3 larger than the true bulk repeat and tie it to whether an overlayer is present. If this is only a placeholder, consider using dz (or dz * bulk_layer_count if appropriate) and making that explicit in the API (e.g., flag/metadata), or allow the caller to provide the bulk a3 explicitly.
Suggested implementation:
# Heuristic: Use the slab cell for a1, a2.
# Construct a3 from the inter-layer spacing between the first two bulk-like layers.
# NOTE: This is a placeholder for the bulk c vector; callers should supply the
# true bulk a3 if known.
cell = aligned.get_cell()
bulk_c = abs(dz)
bulk_atoms.set_cell([cell[0], cell[1], [0, 0, bulk_c]]) # Dummy a3 based on first-layer spacingTo fully address the concern and make the heuristic explicit in the API:
- Consider adding an optional parameter (e.g.
bulk_a3=Noneorbulk_c=None) to this function so callers can provide the correct bulk periodicity when available. - If such a parameter is added, use it here instead of the placeholder when provided, and fall back to the
abs(dz)heuristic only when it is not given. - Optionally attach a flag/metadata attribute on
bulk_atoms(e.g.bulk_atoms.info["a3_is_placeholder"] = True) when using the heuristic, so downstream code can detect and handle this.
| """ | ||
| cell = bulk_atoms.get_cell() | ||
|
|
||
| with open(filename, 'w') as f: |
There was a problem hiding this comment.
nitpick: cell is computed but never used in write_inp, suggesting either dead code or a missing use.
If the overlayer .inp truly doesn’t need lattice vectors, remove this assignment. If you expect to use the bulk 2D periodicity later (e.g., for superstructure or overlayer positioning), either hook cell into that logic now or add a clear TODO so its purpose is explicit.
|
|
||
| ## Motivation | ||
|
|
||
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). |
There was a problem hiding this comment.
nitpick (typo): Consider using the official capitalization "Quantum ESPRESSO".
Please update "Quantum Espresso" to "Quantum ESPRESSO" to match the official project name used in their documentation and common usage.
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). | |
| Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). |
Adds 'reader.py' to parse legacy CLEED files and updates CLI to support '--from-cleed' export to POSCAR/CIF. Includes unit tests.
There was a problem hiding this comment.
Actionable comments posted: 11
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
docs/modern-io.md(1 hunks)tools/cleed_io/__init__.py(1 hunks)tools/cleed_io/cli.py(1 hunks)tools/cleed_io/slicer.py(1 hunks)tools/cleed_io/writer.py(1 hunks)tools/pyproject.toml(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-20T09:06:56.375Z
Learnt from: CR
Repo: Liam-Deacon/CLEED PR: 0
File: doc/AGENTS.md:0-0
Timestamp: 2025-12-20T09:06:56.375Z
Learning: Applies to doc/CLEED_Manual.pdf : Keep `doc/CLEED_Manual.pdf` tracked in git as the canonical reference manual for the original CLEED program suite
Applied to files:
docs/modern-io.md
🧬 Code graph analysis (1)
tools/cleed_io/cli.py (2)
tools/cleed_io/slicer.py (1)
slice_slab(75-134)tools/cleed_io/writer.py (2)
write_bul(13-33)write_inp(35-56)
🪛 GitHub Check: Codacy Static Code Analysis
docs/modern-io.md
[notice] 30-30: docs/modern-io.md#L30
Expected: 1; Actual: 3
[notice] 30-30: docs/modern-io.md#L30
Lists should be surrounded by blank lines
[notice] 31-31: docs/modern-io.md#L31
Expected: 1; Actual: 3
[notice] 44-44: docs/modern-io.md#L44
Expected: 1; Actual: 2
[notice] 45-45: docs/modern-io.md#L45
Expected: 1; Actual: 2
[notice] 46-46: docs/modern-io.md#L46
Expected: 1; Actual: 2
[notice] 47-47: docs/modern-io.md#L47
Expected: 1; Actual: 3
[notice] 48-48: docs/modern-io.md#L48
Expected: 1; Actual: 3
[notice] 53-53: docs/modern-io.md#L53
Expected: 1; Actual: 3
[notice] 53-53: docs/modern-io.md#L53
Lists should be surrounded by blank lines
[notice] 54-54: docs/modern-io.md#L54
Expected: 1; Actual: 3
tools/cleed_io/cli.py
[notice] 38-38: tools/cleed_io/cli.py#L38
expected 2 blank lines after class or function definition, found 1 (E305)
tools/cleed_io/slicer.py
[warning] 3-3: tools/cleed_io/slicer.py#L3
'ase.geometry.cell_to_cellpar' imported but unused (F401)
[warning] 3-3: tools/cleed_io/slicer.py#L3
Unused cell_to_cellpar imported from ase.geometry
[notice] 16-16: tools/cleed_io/slicer.py#L16
Unused variable 'pbc'
[warning] 16-16: tools/cleed_io/slicer.py#L16
local variable 'pbc' is assigned to but never used (F841)
tools/cleed_io/writer.py
[warning] 1-1: tools/cleed_io/writer.py#L1
'numpy as np' imported but unused (F401)
[warning] 1-1: tools/cleed_io/writer.py#L1
Unused numpy imported as np
[notice] 39-39: tools/cleed_io/writer.py#L39
Unused variable 'cell'
[warning] 39-39: tools/cleed_io/writer.py#L39
local variable 'cell' is assigned to but never used (F841)
[warning] 42-42: tools/cleed_io/writer.py#L42
f-string is missing placeholders (F541)
🪛 markdownlint-cli2 (0.18.1)
docs/modern-io.md
7-7: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
🪛 Ruff (0.14.8)
tools/cleed_io/cli.py
34-34: Do not catch blind exception: Exception
(BLE001)
tools/cleed_io/slicer.py
5-5: Unused function argument: vacuum_direction
(ARG001)
5-5: Unused function argument: tolerance
(ARG001)
16-16: Local variable pbc is assigned to but never used
Remove assignment to unused variable pbc
(F841)
96-96: Avoid specifying long messages outside the exception class
(TRY003)
tools/cleed_io/writer.py
39-39: Local variable cell is assigned to but never used
Remove assignment to unused variable cell
(F841)
42-42: f-string without any placeholders
Remove extraneous f prefix
(F541)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: windows-latest
🔇 Additional comments (7)
tools/cleed_io/__init__.py (1)
1-6: LGTM!Clean package initializer with appropriate version metadata.
docs/modern-io.md (1)
1-54: Excellent documentation.The documentation clearly explains the motivation, installation, usage, and technical details of the modern I/O bridge. The examples are practical and helpful.
tools/cleed_io/cli.py (2)
34-36: Bare exception catch is acceptable for CLI error handling.While the static analyzer flags the bare
Exceptioncatch, this is appropriate for a CLI tool's top-level error handler where all exceptions should be caught and reported to the user.
7-33: Well-structured CLI implementation.The CLI follows best practices with clear argument parsing, progress messages, and proper error handling. The workflow is straightforward and user-friendly.
tools/cleed_io/writer.py (1)
4-11: Well-designed formatting utilities.The
format_vectorandformat_atomfunctions provide clean, reusable formatting with appropriate defaults and optional customization.tools/cleed_io/slicer.py (2)
48-73: Solid layer clustering implementation.The
cluster_layersfunction correctly groups atoms by z-coordinate with appropriate tolerance handling and edge cases (empty structure, single atom).
25-36: The rotation logic is correctly implemented and aligns with ASE'srotate()method behavior. The method signaturerotate(a, v)aligns vectorawith directionv(not rotates aroundv), which is exactly what the code intends. Line 26 correctly aligns thecell[0]vector with the x-axis, and line 36 correctly aligns the surface normal with the z-axis to standardize the slab orientation for CLEED. The comments and logic match the surface crystallography convention. No critical issues found.Likely an incorrect or invalid review comment.
|
|
||
| ## Motivation | ||
|
|
||
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove trailing space.
Markdownlint detected a trailing space at the end of this line.
🔎 Proposed fix
-Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).
+Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). | |
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). |
🧰 Tools
🪛 markdownlint-cli2 (0.18.1)
7-7: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
🤖 Prompt for AI Agents
In docs/modern-io.md around line 7, remove the trailing whitespace at the end of
the line that reads 'Standard DFT workflows (VASP, Quantum Espresso) use
"Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered"
definition (Semi-infinite Bulk + Finite Overlayer).' by deleting the extra space
character at the end so the line ends with the period; re-save the file to
ensure markdownlint no longer flags a trailing space.
| dependencies = [ | ||
| "ase>=3.22.0", | ||
| "numpy", | ||
| "pyyaml" | ||
| ] |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find and read the pyproject.toml file in tools directory
find . -path "*/tools/pyproject.toml" -type f | head -1 | xargs cat -nRepository: Liam-Deacon/CLEED
Length of output: 667
Add version constraints to numpy and pyyaml dependencies.
The numpy and pyyaml dependencies lack version constraints, which could lead to unexpected breakage if major versions introduce incompatible changes (e.g., numpy's 1.x to 2.x transition). Consider specifying minimum and ideally maximum version bounds for these dependencies to ensure stability.
🤖 Prompt for AI Agents
In tools/pyproject.toml around lines 12 to 16, numpy and pyyaml are listed
without version constraints; add appropriate version bounds (at minimum a
minimum version and preferably an upper bound) to both entries (for example pin
numpy to a compatible >= and < major bump and pyyaml similarly) so the project
avoids breaking changes from major releases; update the dependencies array to
include those constraints and run dependency resolution to confirm
compatibility.
| [project.optional-dependencies] | ||
| phaseshifts = ["phaseshifts"] |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add version constraint for optional dependency.
The phaseshifts optional dependency has no version constraint, which could lead to compatibility issues.
🤖 Prompt for AI Agents
In tools/pyproject.toml around lines 22-23, the optional dependency entry
"phaseshifts" has no version constraint; update the
project.optional-dependencies entry to include a semantic version specifier (for
example use a range like a minimum supported version and an upper bound to avoid
breaking changes, or pin to a specific tested version), commit the change, and
run your dependency resolver (poetry/pip-tools) to verify the constraint is
valid and the lockfile updates accordingly.
There was a problem hiding this comment.
Actionable comments posted: 5
♻️ Duplicate comments (2)
tools/cleed_io/cli.py (1)
64-65: Add blank line before module guard.PEP 8 recommends two blank lines before top-level code following function definitions.
🔎 Proposed fix
sys.exit(1) if __name__ == "__main__": main()docs/modern-io.md (1)
7-7: Fix trailing space and capitalization.Line 7 has a trailing space and uses "Quantum Espresso" instead of the official "Quantum ESPRESSO" capitalization.
🔎 Proposed fix
-Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). +Standard DFT workflows (VASP, Quantum ESPRESSO) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer).
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (4)
docs/modern-io.md(1 hunks)tools/cleed_io/cli.py(1 hunks)tools/cleed_io/reader.py(1 hunks)tools/tests/test_io.py(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-20T09:06:56.375Z
Learnt from: CR
Repo: Liam-Deacon/CLEED PR: 0
File: doc/AGENTS.md:0-0
Timestamp: 2025-12-20T09:06:56.375Z
Learning: Applies to doc/CLEED_Manual.pdf : Keep `doc/CLEED_Manual.pdf` tracked in git as the canonical reference manual for the original CLEED program suite
Applied to files:
docs/modern-io.md
🧬 Code graph analysis (2)
tools/cleed_io/cli.py (3)
tools/cleed_io/slicer.py (1)
slice_slab(75-134)tools/cleed_io/writer.py (2)
write_bul(13-33)write_inp(35-56)tools/cleed_io/reader.py (1)
read_cleed_files(49-114)
tools/tests/test_io.py (3)
tools/cleed_io/slicer.py (1)
slice_slab(75-134)tools/cleed_io/reader.py (1)
read_cleed_files(49-114)tools/cleed_io/writer.py (2)
write_bul(13-33)write_inp(35-56)
🪛 GitHub Check: Codacy Static Code Analysis
tools/cleed_io/cli.py
[warning] 3-3: tools/cleed_io/cli.py#L3
'os' imported but unused (F401)
[warning] 3-3: tools/cleed_io/cli.py#L3
Unused import os
[notice] 64-64: tools/cleed_io/cli.py#L64
expected 2 blank lines after class or function definition, found 1 (E305)
docs/modern-io.md
[notice] 22-22: docs/modern-io.md#L22
Expected: 1; Actual: 3
[notice] 22-22: docs/modern-io.md#L22
Lists should be surrounded by blank lines
[notice] 23-23: docs/modern-io.md#L23
Expected: 1; Actual: 3
[notice] 24-24: docs/modern-io.md#L24
Expected: 1; Actual: 3
[notice] 25-25: docs/modern-io.md#L25
Expected: 1; Actual: 3
[notice] 26-26: docs/modern-io.md#L26
Expected: 1; Actual: 3
[notice] 27-27: docs/modern-io.md#L27
Expected: 1; Actual: 3
[notice] 40-40: docs/modern-io.md#L40
Expected: 1; Actual: 3
[notice] 40-40: docs/modern-io.md#L40
Lists should be surrounded by blank lines
[notice] 41-41: docs/modern-io.md#L41
Expected: 1; Actual: 3
[notice] 44-44: docs/modern-io.md#L44
Expected: 1; Actual: 3
[notice] 44-44: docs/modern-io.md#L44
Lists should be surrounded by blank lines
[notice] 58-58: docs/modern-io.md#L58
Expected: 1; Actual: 2
[notice] 59-59: docs/modern-io.md#L59
Expected: 1; Actual: 2
[notice] 60-60: docs/modern-io.md#L60
Expected: 1; Actual: 2
[notice] 61-61: docs/modern-io.md#L61
Expected: 1; Actual: 3
[notice] 62-62: docs/modern-io.md#L62
Expected: 1; Actual: 3
[notice] 67-67: docs/modern-io.md#L67
Expected: 1; Actual: 3
[notice] 67-67: docs/modern-io.md#L67
Lists should be surrounded by blank lines
[notice] 68-68: docs/modern-io.md#L68
Expected: 1; Actual: 3
tools/cleed_io/reader.py
[warning] 49-49: tools/cleed_io/reader.py#L49
Method read_cleed_files has a cyclomatic complexity of 18 (limit is 8)
🪛 markdownlint-cli2 (0.18.1)
docs/modern-io.md
7-7: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
68-68: Files should end with a single newline character
(MD047, single-trailing-newline)
🪛 Ruff (0.14.8)
tools/cleed_io/cli.py
60-60: Do not catch blind exception: Exception
(BLE001)
tools/tests/test_io.py
35-35: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
36-36: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
37-37: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
38-38: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
51-51: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
56-56: Use a regular assert instead of unittest-style assertEqual
Replace assertEqual(...) with assert ...
(PT009)
tools/cleed_io/reader.py
66-66: Unnecessary mode argument
Remove mode argument
(UP015)
69-69: Call startswith once with a tuple
Merge into a single startswith call
(PIE810)
88-88: Unnecessary mode argument
Remove mode argument
(UP015)
91-91: Call startswith once with a tuple
Merge into a single startswith call
(PIE810)
101-101: Avoid specifying long messages outside the exception class
(TRY003)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: windows-latest
🔇 Additional comments (9)
tools/cleed_io/reader.py (3)
4-16: LGTM: Simple and correct vector parsing.The function correctly extracts the first three numeric values from a line, ignoring comments and non-numeric tokens.
26-32: Fix the coordinate parsing loop logic.Line 32 increments
idxunconditionally, even whenfloat()raisesValueError. This could skip over valid coordinate values if a non-numeric token appears between coordinates.🔎 Proposed fix
coords = [] idx = 2 while len(coords) < 3 and idx < len(parts): try: coords.append(float(parts[idx])) + idx += 1 except ValueError: - pass # Should not happen for coords - idx += 1 + idx += 1 # Skip non-numeric tokenLikely an incorrect or invalid review comment.
34-46: The symbol extraction logic is correct for all observed CLEED label formats.The code extracts all contiguous alphabetic characters from labels like "Ni_BVH" → "Ni" and "O" → "O", which works correctly across all example CLEED files (Ca, Au, Cl, Ag, Cu, Ni, O, C). The fallback to "X" handles edge cases, though it's never triggered in practice. No CLEED files use numeric-prefixed labels like "123Ni", making the stated concern about such formats hypothetical. While optional validation using ASE's
ase.data.chemical_symbolsis feasible (ASE is already a project dependency), the current heuristic is sufficient and robust for real-world CLEED data.Likely an incorrect or invalid review comment.
tools/tests/test_io.py (4)
11-18: LGTM: Clean test setup.The test fixture creates a representative Ni(111) slab with an O adsorbate, providing good coverage for the slicing and I/O operations.
20-24: LGTM: Proper test cleanup.Teardown correctly removes generated test files.
26-38: LGTM: Well-structured slicing test.The test correctly validates atom counts and element symbols for the sliced bulk and overlayer components.
40-64: LGTM: Comprehensive round-trip validation.The test correctly validates that the write-read cycle preserves atom count, composition, and in-plane lattice vectors. The comment on line 59 correctly notes that
a3may differ due to the slicer's heuristic bulk cell reconstruction.Note: The test validates composition but not atomic position preservation. Consider adding position validation in a future enhancement:
# After reconstruction, validate positions are preserved np.testing.assert_array_almost_equal( sorted(self.slab.positions[:, :2]), # xy coordinates sorted(reconstructed.positions[:, :2]), decimal=3 )tools/cleed_io/cli.py (2)
9-26: LGTM: Well-structured argument parser.The mutually exclusive group for mode selection is appropriate, and argument definitions are clear with helpful descriptions.
28-62: LGTM: Clean conversion logic with appropriate error handling.The to-cleed and from-cleed workflows are well-structured with proper validation and user feedback. The broad exception catch on line 60 is acceptable for a CLI tool to provide user-friendly error messages.
| # Modern Input/Output for CLEED | ||
|
|
||
| This guide explains how to use the `cleed-io` tools to convert modern crystallography files (POSCAR, CIF, XYZ) into CLEED-compatible `.bul` and `.inp` files, and vice-versa. | ||
|
|
||
| ## Motivation | ||
|
|
||
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). | ||
|
|
||
| The `cleed-convert` tool bridges this gap by automatically slicing a 3D slab into the required LEED components. | ||
|
|
||
| ## Installation | ||
|
|
||
| The tools require Python 3.8+ and the Atomic Simulation Environment (ASE). | ||
|
|
||
| ```bash | ||
| pip install tools/ | ||
| ``` | ||
|
|
||
| ## Supported Formats | ||
|
|
||
| Via the **ASE** backend, `cleed-convert` supports reading and writing over 30 formats, including: | ||
| * **VASP:** `POSCAR`, `CONTCAR`, `XDATCAR` | ||
| * **Crystallography:** `.cif` | ||
| * **XYZ:** `.xyz`, `.extxyz` | ||
| * **Quantum Espresso:** `.in`, `.out` | ||
| * **Gaussian:** `.log` | ||
| * **FHI-aims:** `.in` | ||
|
|
||
| ## Usage | ||
|
|
||
| ### 1. Import: POSCAR $\to$ CLEED | ||
|
|
||
| Convert a VASP POSCAR to CLEED files: | ||
|
|
||
| ```bash | ||
| cleed-convert --to-cleed --input POSCAR --output my_surface | ||
| ``` | ||
|
|
||
| This generates: | ||
| * `my_surface.bul`: Contains the bulk lattice vectors and atoms. | ||
| * `my_surface.inp`: Contains the overlayer atoms. | ||
|
|
||
| **Options:** | ||
| * `--bulk-layers N`: Specify how many bottom layers constitute the bulk (default: 2). | ||
|
|
||
| ### 2. Export: CLEED $\to$ POSCAR | ||
|
|
||
| Convert CLEED simulation files back to a format suitable for visualization (VESTA, Ovito): | ||
|
|
||
| ```bash | ||
| cleed-convert --from-cleed --input my_surface.inp --bulk my_surface.bul --output RECONSTRUCTED_POSCAR --format vasp | ||
| ``` | ||
|
|
||
| This reconstructs the full 3D slab geometry. | ||
|
|
||
| ## How it Works | ||
|
|
||
| 1. **Alignment:** The tool rotates the input structure so the lattice vectors $\vec{a}$ and $\vec{b}$ lie in the $xy$-plane ($z=0$), matching LEED conventions. | ||
| 2. **Slicing:** Atoms are clustered into layers based on their $z$-coordinate. | ||
| 3. **Separation:** | ||
| * The bottom $N$ layers are extracted to form the `.bul` file. | ||
| * All remaining layers (surface + adsorbate) are written to the `.inp` file. | ||
|
|
||
| ## Integration with ViPErLEED | ||
|
|
||
| This tool supports the conventions used by ViPErLEED: | ||
| * Slab is asymmetric (Bulk at bottom). | ||
| * Vacuum is along $+z$. No newline at end of file |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
LGTM: Comprehensive and well-structured documentation.
The documentation clearly explains the motivation, installation, usage, and technical details of the CLEED I/O bridge. The examples are practical and the integration notes with ViPErLEED conventions are helpful.
One minor improvement: add a final newline at the end of the file (line 68) to comply with POSIX standards:
* Slab is asymmetric (Bulk at bottom).
* Vacuum is along $+z$.
+📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Modern Input/Output for CLEED | |
| This guide explains how to use the `cleed-io` tools to convert modern crystallography files (POSCAR, CIF, XYZ) into CLEED-compatible `.bul` and `.inp` files, and vice-versa. | |
| ## Motivation | |
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). | |
| The `cleed-convert` tool bridges this gap by automatically slicing a 3D slab into the required LEED components. | |
| ## Installation | |
| The tools require Python 3.8+ and the Atomic Simulation Environment (ASE). | |
| ```bash | |
| pip install tools/ | |
| ``` | |
| ## Supported Formats | |
| Via the **ASE** backend, `cleed-convert` supports reading and writing over 30 formats, including: | |
| * **VASP:** `POSCAR`, `CONTCAR`, `XDATCAR` | |
| * **Crystallography:** `.cif` | |
| * **XYZ:** `.xyz`, `.extxyz` | |
| * **Quantum Espresso:** `.in`, `.out` | |
| * **Gaussian:** `.log` | |
| * **FHI-aims:** `.in` | |
| ## Usage | |
| ### 1. Import: POSCAR $\to$ CLEED | |
| Convert a VASP POSCAR to CLEED files: | |
| ```bash | |
| cleed-convert --to-cleed --input POSCAR --output my_surface | |
| ``` | |
| This generates: | |
| * `my_surface.bul`: Contains the bulk lattice vectors and atoms. | |
| * `my_surface.inp`: Contains the overlayer atoms. | |
| **Options:** | |
| * `--bulk-layers N`: Specify how many bottom layers constitute the bulk (default: 2). | |
| ### 2. Export: CLEED $\to$ POSCAR | |
| Convert CLEED simulation files back to a format suitable for visualization (VESTA, Ovito): | |
| ```bash | |
| cleed-convert --from-cleed --input my_surface.inp --bulk my_surface.bul --output RECONSTRUCTED_POSCAR --format vasp | |
| ``` | |
| This reconstructs the full 3D slab geometry. | |
| ## How it Works | |
| 1. **Alignment:** The tool rotates the input structure so the lattice vectors $\vec{a}$ and $\vec{b}$ lie in the $xy$-plane ($z=0$), matching LEED conventions. | |
| 2. **Slicing:** Atoms are clustered into layers based on their $z$-coordinate. | |
| 3. **Separation:** | |
| * The bottom $N$ layers are extracted to form the `.bul` file. | |
| * All remaining layers (surface + adsorbate) are written to the `.inp` file. | |
| ## Integration with ViPErLEED | |
| This tool supports the conventions used by ViPErLEED: | |
| * Slab is asymmetric (Bulk at bottom). | |
| * Vacuum is along $+z$. | |
| # Modern Input/Output for CLEED | |
| This guide explains how to use the `cleed-io` tools to convert modern crystallography files (POSCAR, CIF, XYZ) into CLEED-compatible `.bul` and `.inp` files, and vice-versa. | |
| ## Motivation | |
| Standard DFT workflows (VASP, Quantum Espresso) use "Supercell" geometries (3D periodic boxes). CLEED requires a specific "Layered" definition (Semi-infinite Bulk + Finite Overlayer). | |
| The `cleed-convert` tool bridges this gap by automatically slicing a 3D slab into the required LEED components. | |
| ## Installation | |
| The tools require Python 3.8+ and the Atomic Simulation Environment (ASE). | |
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[notice] 22-22: docs/modern-io.md#L22
Expected: 1; Actual: 3
[notice] 22-22: docs/modern-io.md#L22
Lists should be surrounded by blank lines
[notice] 23-23: docs/modern-io.md#L23
Expected: 1; Actual: 3
[notice] 24-24: docs/modern-io.md#L24
Expected: 1; Actual: 3
[notice] 25-25: docs/modern-io.md#L25
Expected: 1; Actual: 3
[notice] 26-26: docs/modern-io.md#L26
Expected: 1; Actual: 3
[notice] 27-27: docs/modern-io.md#L27
Expected: 1; Actual: 3
[notice] 40-40: docs/modern-io.md#L40
Expected: 1; Actual: 3
[notice] 40-40: docs/modern-io.md#L40
Lists should be surrounded by blank lines
[notice] 41-41: docs/modern-io.md#L41
Expected: 1; Actual: 3
[notice] 44-44: docs/modern-io.md#L44
Expected: 1; Actual: 3
[notice] 44-44: docs/modern-io.md#L44
Lists should be surrounded by blank lines
[notice] 58-58: docs/modern-io.md#L58
Expected: 1; Actual: 2
[notice] 59-59: docs/modern-io.md#L59
Expected: 1; Actual: 2
[notice] 60-60: docs/modern-io.md#L60
Expected: 1; Actual: 2
[notice] 61-61: docs/modern-io.md#L61
Expected: 1; Actual: 3
[notice] 62-62: docs/modern-io.md#L62
Expected: 1; Actual: 3
[notice] 67-67: docs/modern-io.md#L67
Expected: 1; Actual: 3
[notice] 67-67: docs/modern-io.md#L67
Lists should be surrounded by blank lines
[notice] 68-68: docs/modern-io.md#L68
Expected: 1; Actual: 3
🪛 markdownlint-cli2 (0.18.1)
7-7: Trailing spaces
Expected: 0 or 2; Actual: 1
(MD009, no-trailing-spaces)
68-68: Files should end with a single newline character
(MD047, single-trailing-newline)
🤖 Prompt for AI Agents
In docs/modern-io.md around lines 1 to 68 the file is missing a trailing newline
(POSIX requires files end with a newline); fix by editing the file and adding a
single newline character at the end of the file so the last line ends with a LF.
| def read_cleed_files(bul_file, inp_file): | ||
| """ | ||
| Reads CLEED .bul and .inp files and reconstructs the full unit cell. | ||
|
|
||
| Args: | ||
| bul_file (str): Path to .bul file. | ||
| inp_file (str): Path to .inp file. | ||
|
|
||
| Returns: | ||
| Atoms: ASE Atoms object containing Bulk + Overlayer. | ||
| """ | ||
|
|
||
| # 1. Parse Bulk File | ||
| a1 = a2 = a3 = None | ||
| bulk_atoms = [] | ||
| bulk_symbols = [] | ||
|
|
||
| with open(bul_file, 'r') as f: | ||
| for line in f: | ||
| line = line.strip() | ||
| if not line or line.startswith('#') or line.startswith('c:'): | ||
| continue | ||
|
|
||
| if line.lower().startswith('a1:'): | ||
| a1 = parse_vector(line) | ||
| elif line.lower().startswith('a2:'): | ||
| a2 = parse_vector(line) | ||
| elif line.lower().startswith('a3:'): | ||
| a3 = parse_vector(line) | ||
| elif line.lower().startswith('pb:'): | ||
| sym, pos = parse_atom(line) | ||
| bulk_symbols.append(sym) | ||
| bulk_atoms.append(pos) | ||
|
|
||
| # 2. Parse Input File (Overlayer) | ||
| over_atoms = [] | ||
| over_symbols = [] | ||
| # m1, m2 could be parsed here for superstructures | ||
|
|
||
| with open(inp_file, 'r') as f: | ||
| for line in f: | ||
| line = line.strip() | ||
| if not line or line.startswith('#') or line.startswith('c:'): | ||
| continue | ||
|
|
||
| if line.lower().startswith('po:'): | ||
| sym, pos = parse_atom(line) | ||
| over_symbols.append(sym) | ||
| over_atoms.append(pos) | ||
|
|
||
| # 3. Construct Unit Cell | ||
| if a1 is None or a2 is None or a3 is None: | ||
| raise ValueError("Lattice vectors (a1, a2, a3) not found in .bul file") | ||
|
|
||
| cell = np.array([a1, a2, a3]) | ||
|
|
||
| # Combine atoms | ||
| all_symbols = bulk_symbols + over_symbols | ||
| all_positions = bulk_atoms + over_atoms | ||
|
|
||
| if not all_positions: | ||
| # Create empty cell if no atoms | ||
| return Atoms(cell=cell, pbc=True) | ||
|
|
||
| atoms = Atoms(symbols=all_symbols, positions=all_positions, cell=cell, pbc=True) | ||
| return atoms |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider refactoring to reduce cyclomatic complexity.
The function has a cyclomatic complexity of 18 (threshold is typically 8-10), as it handles file parsing, lattice vector extraction, atom parsing, and Atoms construction in one function.
Consider splitting into helper functions:
_parse_bul_file(bul_file)→ returns(a1, a2, a3, bulk_symbols, bulk_atoms)_parse_inp_file(inp_file)→ returns(over_symbols, over_atoms)read_cleed_filesorchestrates and constructs the Atoms object
This would improve testability and maintainability.
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[warning] 49-49: tools/cleed_io/reader.py#L49
Method read_cleed_files has a cyclomatic complexity of 18 (limit is 8)
🪛 Ruff (0.14.8)
66-66: Unnecessary mode argument
Remove mode argument
(UP015)
69-69: Call startswith once with a tuple
Merge into a single startswith call
(PIE810)
88-88: Unnecessary mode argument
Remove mode argument
(UP015)
91-91: Call startswith once with a tuple
Merge into a single startswith call
(PIE810)
101-101: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
In tools/cleed_io/reader.py around lines 49-114 the read_cleed_files function is
too large and has high cyclomatic complexity; refactor by extracting two
helpers: _parse_bul_file(bul_file) that opens bul_file, extracts and returns
(a1, a2, a3, bulk_symbols, bulk_atoms) preserving current parsing logic and
error raise if lattice vectors missing, and _parse_inp_file(inp_file) that opens
inp_file and returns (over_symbols, over_atoms) using the same parsing rules;
then simplify read_cleed_files to call these helpers, build cell =
np.array([a1,a2,a3]), combine symbols/positions, and return Atoms exactly as
before (including the empty-cell branch), ensuring no change in behavior or
exceptions.
| bulk_atoms = [] | ||
| bulk_symbols = [] | ||
|
|
||
| with open(bul_file, 'r') as f: |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove unnecessary mode='r' argument.
Python's open() defaults to read mode, so the explicit mode='r' is redundant.
🔎 Proposed fix
- with open(bul_file, 'r') as f:
+ with open(bul_file) as f:
for line in f:- with open(inp_file, 'r') as f:
+ with open(inp_file) as f:
for line in f:Also applies to: 88-88
🧰 Tools
🪛 Ruff (0.14.8)
66-66: Unnecessary mode argument
Remove mode argument
(UP015)
🤖 Prompt for AI Agents
In tools/cleed_io/reader.py around lines 66 and 88, remove the redundant
explicit mode='r' argument from the open(...) calls (open(bul_file, 'r') ->
open(bul_file)) since open defaults to read mode; update both occurrences to
call open with just the file path and keep the existing context manager and
handling unchanged.
| with open(bul_file, 'r') as f: | ||
| for line in f: | ||
| line = line.strip() | ||
| if not line or line.startswith('#') or line.startswith('c:'): |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Optimize startswith checks with tuple.
Multiple startswith calls can be merged into a single call with a tuple for better performance and readability.
🔎 Proposed fix
- if not line or line.startswith('#') or line.startswith('c:'):
+ if not line or line.startswith(('#', 'c:')):
continueApply the same pattern on line 91.
Also applies to: 91-91
🧰 Tools
🪛 Ruff (0.14.8)
69-69: Call startswith once with a tuple
Merge into a single startswith call
(PIE810)
🤖 Prompt for AI Agents
In tools/cleed_io/reader.py around lines 69 and 91, the code uses multiple
sequential startswith checks (e.g., line.startswith('#') or
line.startswith('c:')), which should be consolidated into a single startswith
call with a tuple for performance and readability; update both occurrences to
use line.startswith(('#', 'c:')) (and similarly include any other prefixes used
on line 91) so the condition becomes one startswith call with a tuple of
prefixes.
|
There was a problem hiding this comment.
Actionable comments posted: 9
🤖 Fix all issues with AI agents
In @tools/cleed_io/cli.py:
- Around line 63-64: Add two blank lines before the module-level execution block
so PEP 8 is satisfied: ensure there are exactly two blank lines separating the
preceding function definitions and the "if __name__ == '__main__':" block that
calls main(), i.e., insert a blank line before and another blank line above the
"if __name__ == '__main__':" line that invokes main().
- Line 9: The ArgumentParser description is too narrow; update the
argparse.ArgumentParser(...) call where parser is created to reflect support for
"POSCAR/CIF/XYZ" and that many other formats are supported via ASE (e.g., change
description to something like "Convert between structure files (POSCAR, CIF, XYZ
and other ASE-supported formats) and CLEED (.bul/.inp)"); modify the description
string passed to parser to mention XYZ and ASE-supported formats so the help
text matches the PR scope.
- Around line 28-46: Before calling slice_slab in the args.to_cleed branch,
validate the input structure returned by read(...) to ensure it is a slab:
verify the cell/periodicity indicates periodicity in at least two directions
(e.g., cell vectors / periodic flags), compute atom z-range (max z - min z) and
compare to the cell z length to confirm a vacuum region (cell_z_length
significantly larger than atom_z_range), and if these checks fail call
parser.error or print a clear error and exit; update the to-cleed flow around
args.to_cleed/read/slice_slab/write_bul/write_inp to perform these checks and
emit a concise error like "input does not appear to be a slab: requires >=2
periodic directions and vacuum along z" before proceeding.
In @tools/cleed_io/slicer.py:
- Around line 119-129: The bulk unit cell a3 is being set incorrectly by
multiplying dz by len(layers); instead compute the inter-layer spacing between
the first two detected layers (dz = z2 - z1) and set the bulk cell's third
lattice vector to represent a single bulk repeat along z (i.e., use [0, 0, dz]
rather than [0, 0, dz * len(layers)]). Update the call to bulk_atoms.set_cell to
use cell[0], cell[1], and the corrected a3, and if necessary add a parameter or
fallback to allow callers to override the bulk repeat vector for multi-layer
unit cells; reference aligned.get_cell(), aligned.positions, layers, dz, and
bulk_atoms.set_cell when locating and changing the code.
- Around line 45-70: The clustering compares each atom z to the first atom of
the current layer (current_z), which fails when z drifts; inside cluster_layers
update the layer reference z after adding an atom (either set current_z to the
newly added atom's z or recompute the layer mean z) so subsequent comparisons
use the running z (use the last added z for simplicity or mean for robustness),
keeping the rest of the logic (sorted_indices, tolerance, layers, current_layer)
intact.
- Around line 4-43: The if-block in align_surface checks final_cell[2][2] < 0
but does nothing; fix by ensuring the slab's out-of-plane vector points +z by
applying a 180° rotation about an in-plane axis (e.g., use atoms.rotate(180,
'x', rotate_cell=True)) when final_cell[2][2] is negative, so the cell[2]
z-component becomes positive without mirroring the slab; reference
align_surface, final_cell[2][2], atoms.rotate, and normal to locate and
implement this change (or remove the dead if-block if you prefer no-op
behavior).
In @tools/cleed_io/writer.py:
- Line 41: The string passed to f.write uses an unnecessary f-string; remove the
f prefix so the call uses a regular string literal. Locate the write call on the
file object named f that writes "c: Generated by cleed-convert\n" and change it
from using an f-string to a plain string literal (i.e., use f.write(...) ->
f.write("c: Generated by cleed-convert\n")).
- Line 38: Remove the unused local variable assignment "cell =
bulk_atoms.get_cell()" in tools/cleed_io/writer.py (or if the cell is intended
for validation/documentation, replace the assignment with a direct use such as
asserting expected periodicity or adding a brief comment using
bulk_atoms.get_cell() result), i.e., eliminate the dead variable or consume the
result via an assertion/log to justify keeping the call.
📜 Review details
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
tools/cleed_io/cli.pytools/cleed_io/slicer.pytools/cleed_io/writer.py
🧰 Additional context used
🧬 Code graph analysis (1)
tools/cleed_io/cli.py (3)
tools/cleed_io/slicer.py (1)
slice_slab(72-131)tools/cleed_io/writer.py (2)
write_bul(12-32)write_inp(34-55)tools/cleed_io/reader.py (1)
read_cleed_files(49-114)
🪛 GitHub Check: Codacy Static Code Analysis
tools/cleed_io/writer.py
[notice] 38-38: tools/cleed_io/writer.py#L38
Unused variable 'cell'
[warning] 38-38: tools/cleed_io/writer.py#L38
local variable 'cell' is assigned to but never used (F841)
[warning] 41-41: tools/cleed_io/writer.py#L41
f-string is missing placeholders (F541)
tools/cleed_io/cli.py
[notice] 63-63: tools/cleed_io/cli.py#L63
expected 2 blank lines after class or function definition, found 1 (E305)
🪛 Ruff (0.14.10)
tools/cleed_io/writer.py
38-38: Local variable cell is assigned to but never used
Remove assignment to unused variable cell
(F841)
41-41: f-string without any placeholders
Remove extraneous f prefix
(F541)
tools/cleed_io/slicer.py
93-93: Avoid specifying long messages outside the exception class
(TRY003)
tools/cleed_io/cli.py
59-59: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Codacy Static Code Analysis
- GitHub Check: windows-latest
🔇 Additional comments (2)
tools/cleed_io/writer.py (2)
12-32: LGTM - write_bul implementation is solid.The function correctly writes the CLEED
.bulformat with:
- Appropriate header and comments
- Lattice vectors using format_vector
- Bulk atoms using format_atom
- Default parameters with clear comments for user adjustment
The implementation aligns well with the CLEED format requirements and provides good user guidance.
6-10: Fix potential AttributeError when phase_map is None.Line 8 calls
phase_map.get(symbol, symbol)without checking ifphase_mapis None, which will raise anAttributeError.🔧 Proposed fix
def format_atom(prefix, atom, phase_map=None, vib_default="dr3 0.025 0.025 0.025"): symbol = atom.symbol - label = phase_map.get(symbol, symbol) if phase_map else symbol + label = phase_map.get(symbol, symbol) if phase_map is not None else symbol pos = atom.position return f"{prefix}: {label:<6s} {pos[0]:10.4f} {pos[1]:10.4f} {pos[2]:10.4f} {vib_default}\n"Note: The original code has the conditional in the right place, but
if phase_mapwill beFalsefor an empty dict{}, whileif phase_map is not Nonecorrectly handles bothNoneand empty dict cases.Likely an incorrect or invalid review comment.
| from cleed_io.reader import read_cleed_files | ||
|
|
||
| def main(): | ||
| parser = argparse.ArgumentParser(description="Convert between structure files (POSCAR, CIF) and CLEED (.bul/.inp)") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Expand format examples in help text to match PR scope.
The description mentions only "POSCAR, CIF" but the PR objectives state support for "POSCAR/CIF/XYZ" and "many file formats through ASE". Consider updating the help text to be more inclusive.
📝 Updated description
- parser = argparse.ArgumentParser(description="Convert between structure files (POSCAR, CIF) and CLEED (.bul/.inp)")
+ parser = argparse.ArgumentParser(description="Convert between structure files (POSCAR, CIF, XYZ, and more via ASE) and CLEED (.bul/.inp)")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| parser = argparse.ArgumentParser(description="Convert between structure files (POSCAR, CIF) and CLEED (.bul/.inp)") | |
| parser = argparse.ArgumentParser(description="Convert between structure files (POSCAR, CIF, XYZ, and more via ASE) and CLEED (.bul/.inp)") |
🤖 Prompt for AI Agents
In @tools/cleed_io/cli.py at line 9, The ArgumentParser description is too
narrow; update the argparse.ArgumentParser(...) call where parser is created to
reflect support for "POSCAR/CIF/XYZ" and that many other formats are supported
via ASE (e.g., change description to something like "Convert between structure
files (POSCAR, CIF, XYZ and other ASE-supported formats) and CLEED
(.bul/.inp)"); modify the description string passed to parser to mention XYZ and
ASE-supported formats so the help text matches the PR scope.
| if args.to_cleed: | ||
| if not args.input: | ||
| parser.error("--to-cleed requires --input") | ||
|
|
||
| print(f"Reading {args.input}...") | ||
| atoms = read(args.input, format=args.format) | ||
|
|
||
| print(f"Slicing structure ({len(atoms)} atoms)...") | ||
| bulk, over = slice_slab(atoms, bulk_layers_count=args.bulk_layers) | ||
|
|
||
| bul_file = f"{args.output}.bul" | ||
| inp_file = f"{args.output}.inp" | ||
|
|
||
| print(f"Writing {bul_file}...") | ||
| write_bul(bul_file, bulk) | ||
|
|
||
| print(f"Writing {inp_file}...") | ||
| write_inp(inp_file, over, bulk) | ||
|
|
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add validation for slab structure in to-cleed mode.
The to-cleed workflow reads an arbitrary structure file but doesn't validate that it's actually a slab (i.e., has vacuum, correct periodicity). This could lead to confusing errors downstream in slice_slab or produce invalid output.
Consider adding basic validation:
- Check that the structure has periodicity in at least 2 directions
- Check for the presence of vacuum (large z-dimension compared to atom distribution)
- Provide a clear error message if the structure doesn't appear to be a slab
💡 Example validation
print(f"Reading {args.input}...")
atoms = read(args.input, format=args.format)
+
+ # Validate structure is a slab
+ pbc = atoms.get_pbc()
+ if not (pbc[0] and pbc[1]):
+ print("Warning: Structure may not be a proper slab (missing xy periodicity)", file=sys.stderr)
+
+ cell = atoms.get_cell()
+ z_extent = atoms.positions[:, 2].ptp() # peak-to-peak
+ if cell[2][2] < z_extent * 1.5:
+ print("Warning: Structure may not have sufficient vacuum", file=sys.stderr)
print(f"Slicing structure ({len(atoms)} atoms)...")
bulk, over = slice_slab(atoms, bulk_layers_count=args.bulk_layers)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In @tools/cleed_io/cli.py around lines 28 - 46, Before calling slice_slab in the
args.to_cleed branch, validate the input structure returned by read(...) to
ensure it is a slab: verify the cell/periodicity indicates periodicity in at
least two directions (e.g., cell vectors / periodic flags), compute atom z-range
(max z - min z) and compare to the cell z length to confirm a vacuum region
(cell_z_length significantly larger than atom_z_range), and if these checks fail
call parser.error or print a clear error and exit; update the to-cleed flow
around args.to_cleed/read/slice_slab/write_bul/write_inp to perform these checks
and emit a concise error like "input does not appear to be a slab: requires >=2
periodic directions and vacuum along z" before proceeding.
| if __name__ == "__main__": | ||
| main() |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add blank line before module-level code.
PEP 8 requires 2 blank lines before module-level code following a function definition.
📝 Formatting fix
print(f"Error: {e}", file=sys.stderr)
sys.exit(1)
+
if __name__ == "__main__":
main()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if __name__ == "__main__": | |
| main() | |
| print(f"Error: {e}", file=sys.stderr) | |
| sys.exit(1) | |
| if __name__ == "__main__": | |
| main() |
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[notice] 63-63: tools/cleed_io/cli.py#L63
expected 2 blank lines after class or function definition, found 1 (E305)
🤖 Prompt for AI Agents
In @tools/cleed_io/cli.py around lines 63 - 64, Add two blank lines before the
module-level execution block so PEP 8 is satisfied: ensure there are exactly two
blank lines separating the preceding function definitions and the "if __name__
== '__main__':" block that calls main(), i.e., insert a blank line before and
another blank line above the "if __name__ == '__main__':" line that invokes
main().
| def align_surface(atoms: Atoms) -> Atoms: | ||
| """ | ||
| Aligns the atoms object such that the periodic surface vectors (a, b) | ||
| lie in the xy-plane (z=0) and the surface normal points along +z. | ||
|
|
||
| This matches the CLEED convention where z is the surface normal. | ||
| """ | ||
| # 1. Identify the non-periodic direction or the longest vector if 3D periodic | ||
| # For a standard slab, we assume c is the vacuum direction. | ||
|
|
||
| cell = atoms.get_cell() | ||
| # We assume standard ASE slab convention: 3rd vector is out-of-plane | ||
| # Check if we need to rotate. | ||
|
|
||
| # Vector a (cell[0]) should be along x | ||
| # Vector b (cell[1]) should be in xy plane | ||
| # Vector c (cell[2]) points up | ||
|
|
||
| # Standardize orientation: a along x | ||
| atoms.rotate(cell[0], 'x', rotate_cell=True) | ||
|
|
||
| # Now rotate around x to put b in xy plane | ||
| # We need the normal of (a, b) to be along z | ||
|
|
||
| current_cell = atoms.get_cell() | ||
| # Cross product of a and b | ||
| normal = np.cross(current_cell[0], current_cell[1]) | ||
|
|
||
| # Align this normal to z | ||
| atoms.rotate(normal, 'z', rotate_cell=True) | ||
|
|
||
| # Ensure c is positive z | ||
| final_cell = atoms.get_cell() | ||
| if final_cell[2][2] < 0: | ||
| # Flip z? No, that mirrors the structure. | ||
| # Just ensure the slab is oriented correctly. | ||
| # ViPErLEED: "positive z-direction pointing away from the surface into the vacuum" | ||
| pass | ||
|
|
||
| return atoms |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, find and examine the file
fd "slicer.py" --type fRepository: Liam-Deacon/CLEED
Length of output: 85
🏁 Script executed:
#!/bin/bash
# Read the actual file to see the current state
cat -n tools/cleed_io/slicer.py | head -50Repository: Liam-Deacon/CLEED
Length of output: 2074
🌐 Web query:
ASE Atoms.rotate() method signature and documentation
💡 Result:
Signature:
atoms.rotate(a, v, center=(0, 0, 0), rotate_cell=False)
Description (from ASE docs/source):
- Rotates the Atoms object either by an angle a around axis v, or rotates vector a into vector v.
- a: angle in degrees (number) or a vector (then a is rotated into v).
- v: axis vector (can be e.g. (0,0,1)) or axis name string ('x','-x','y', ...).
- center: point kept fixed during rotation; a 3-tuple or one of 'COM' (center of mass), 'COP' (center of positions), 'COU' (center of cell).
- rotate_cell: if True, the unit cell is rotated as well.
Examples:
- atoms.rotate(90, 'z') — rotate 90° about z.
- atoms.rotate('x', 'y') — rotate x-axis into y-axis.
- atoms.rotate((1,0,0), (0,1,0)) — rotate vector (1,0,0) into (0,1,0).
Sources: ASE documentation and source code. [1][2]
References:
[1] ASE Atoms documentation — rotate method.
[2] ASE source implementation of Atoms.rotate.
Remove incomplete if-block or implement the orientation check logic.
Lines 37-41 contain a conditional check for negative z-component in the final cell vector, but the if-block only contains a pass statement with a comment. Either implement the necessary logic to handle cases where final_cell[2][2] < 0 (e.g., by flipping the cell orientation, rotating around the surface normal, or documenting why no action is needed) or remove the dead code entirely.
The rotation logic itself correctly uses the ASE API to align cell[0] to the x-axis and then aligns the normal vector to the z-axis, which properly positions the surface vectors in the xy-plane.
🤖 Prompt for AI Agents
In @tools/cleed_io/slicer.py around lines 4 - 43, The if-block in align_surface
checks final_cell[2][2] < 0 but does nothing; fix by ensuring the slab's
out-of-plane vector points +z by applying a 180° rotation about an in-plane axis
(e.g., use atoms.rotate(180, 'x', rotate_cell=True)) when final_cell[2][2] is
negative, so the cell[2] z-component becomes positive without mirroring the
slab; reference align_surface, final_cell[2][2], atoms.rotate, and normal to
locate and implement this change (or remove the dead if-block if you prefer
no-op behavior).
| def cluster_layers(atoms: Atoms, tolerance=0.2): | ||
| """ | ||
| Groups atoms into layers based on Z-coordinate. | ||
| Returns a list of lists of atom indices. | ||
| """ | ||
| z_coords = atoms.positions[:, 2] | ||
| sorted_indices = np.argsort(z_coords) | ||
|
|
||
| layers = [] | ||
| if len(sorted_indices) == 0: | ||
| return layers | ||
|
|
||
| current_layer = [sorted_indices[0]] | ||
| current_z = z_coords[sorted_indices[0]] | ||
|
|
||
| for idx in sorted_indices[1:]: | ||
| z = z_coords[idx] | ||
| if abs(z - current_z) < tolerance: | ||
| current_layer.append(idx) | ||
| else: | ||
| layers.append(current_layer) | ||
| current_layer = [idx] | ||
| current_z = z | ||
|
|
||
| layers.append(current_layer) | ||
| return layers |
There was a problem hiding this comment.
Fix the layer clustering logic to handle intra-layer z-coordinate drift.
The current implementation compares each atom's z-coordinate against the first atom in the current layer (current_z), which is set only when starting a new layer (line 58, 67). This causes issues when atoms within a single layer have gradually varying z-coordinates.
Example failure case:
- Atoms with z-coordinates: [0.0, 0.15, 0.25] and tolerance=0.2
- First atom (0.0) starts layer, current_z=0.0
- Second atom (0.15): |0.15 - 0.0| = 0.15 < 0.2 ✓ added to layer
- Third atom (0.25): |0.25 - 0.0| = 0.25 > 0.2 ✗ starts new layer incorrectly
The third atom should be in the same layer since |0.25 - 0.15| = 0.1 < 0.2.
🔧 Proposed fix to track running z-coordinate
Option 1: Update current_z to the last added atom's z-coordinate:
def cluster_layers(atoms: Atoms, tolerance=0.2):
"""
Groups atoms into layers based on Z-coordinate.
Returns a list of lists of atom indices.
"""
z_coords = atoms.positions[:, 2]
sorted_indices = np.argsort(z_coords)
layers = []
if len(sorted_indices) == 0:
return layers
current_layer = [sorted_indices[0]]
current_z = z_coords[sorted_indices[0]]
for idx in sorted_indices[1:]:
z = z_coords[idx]
if abs(z - current_z) < tolerance:
current_layer.append(idx)
+ current_z = z # Update to track drift
else:
layers.append(current_layer)
current_layer = [idx]
current_z = z
layers.append(current_layer)
return layersOption 2: Use the mean z-coordinate of the current layer:
def cluster_layers(atoms: Atoms, tolerance=0.2):
"""
Groups atoms into layers based on Z-coordinate.
Returns a list of lists of atom indices.
"""
z_coords = atoms.positions[:, 2]
sorted_indices = np.argsort(z_coords)
layers = []
if len(sorted_indices) == 0:
return layers
current_layer = [sorted_indices[0]]
- current_z = z_coords[sorted_indices[0]]
for idx in sorted_indices[1:]:
z = z_coords[idx]
+ current_z = np.mean(z_coords[current_layer])
if abs(z - current_z) < tolerance:
current_layer.append(idx)
else:
layers.append(current_layer)
current_layer = [idx]
- current_z = z
layers.append(current_layer)
return layersOption 1 is simpler and handles gradual drift; Option 2 is more robust for noisy data.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| def cluster_layers(atoms: Atoms, tolerance=0.2): | |
| """ | |
| Groups atoms into layers based on Z-coordinate. | |
| Returns a list of lists of atom indices. | |
| """ | |
| z_coords = atoms.positions[:, 2] | |
| sorted_indices = np.argsort(z_coords) | |
| layers = [] | |
| if len(sorted_indices) == 0: | |
| return layers | |
| current_layer = [sorted_indices[0]] | |
| current_z = z_coords[sorted_indices[0]] | |
| for idx in sorted_indices[1:]: | |
| z = z_coords[idx] | |
| if abs(z - current_z) < tolerance: | |
| current_layer.append(idx) | |
| else: | |
| layers.append(current_layer) | |
| current_layer = [idx] | |
| current_z = z | |
| layers.append(current_layer) | |
| return layers | |
| def cluster_layers(atoms: Atoms, tolerance=0.2): | |
| """ | |
| Groups atoms into layers based on Z-coordinate. | |
| Returns a list of lists of atom indices. | |
| """ | |
| z_coords = atoms.positions[:, 2] | |
| sorted_indices = np.argsort(z_coords) | |
| layers = [] | |
| if len(sorted_indices) == 0: | |
| return layers | |
| current_layer = [sorted_indices[0]] | |
| current_z = z_coords[sorted_indices[0]] | |
| for idx in sorted_indices[1:]: | |
| z = z_coords[idx] | |
| if abs(z - current_z) < tolerance: | |
| current_layer.append(idx) | |
| current_z = z # Update to track drift | |
| else: | |
| layers.append(current_layer) | |
| current_layer = [idx] | |
| current_z = z | |
| layers.append(current_layer) | |
| return layers |
🤖 Prompt for AI Agents
In @tools/cleed_io/slicer.py around lines 45 - 70, The clustering compares each
atom z to the first atom of the current layer (current_z), which fails when z
drifts; inside cluster_layers update the layer reference z after adding an atom
(either set current_z to the newly added atom's z or recompute the layer mean z)
so subsequent comparisons use the running z (use the last added z for simplicity
or mean for robustness), keeping the rest of the logic (sorted_indices,
tolerance, layers, current_layer) intact.
| bulk_indices = [] | ||
| for i in range(bulk_layers_count): | ||
| bulk_indices.extend(layers[i]) | ||
|
|
||
| overlayer_indices = [] | ||
| for i in range(bulk_layers_count, len(layers)): | ||
| overlayer_indices.extend(layers[i]) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider using list comprehensions for cleaner code.
The index collection loops can be simplified using list comprehensions:
♻️ Optional refactor
- bulk_indices = []
- for i in range(bulk_layers_count):
- bulk_indices.extend(layers[i])
-
- overlayer_indices = []
- for i in range(bulk_layers_count, len(layers)):
- overlayer_indices.extend(layers[i])
+ bulk_indices = [idx for i in range(bulk_layers_count) for idx in layers[i]]
+ overlayer_indices = [idx for i in range(bulk_layers_count, len(layers)) for idx in layers[i]]📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| bulk_indices = [] | |
| for i in range(bulk_layers_count): | |
| bulk_indices.extend(layers[i]) | |
| overlayer_indices = [] | |
| for i in range(bulk_layers_count, len(layers)): | |
| overlayer_indices.extend(layers[i]) | |
| bulk_indices = [idx for i in range(bulk_layers_count) for idx in layers[i]] | |
| overlayer_indices = [idx for i in range(bulk_layers_count, len(layers)) for idx in layers[i]] |
| if len(layers) >= 2: | ||
| z1 = np.mean(aligned.positions[layers[0], 2]) | ||
| z2 = np.mean(aligned.positions[layers[1], 2]) | ||
| dz = z2 - z1 | ||
| # This assumes vertical stacking. | ||
| # Ideally, we find the translation vector between layer 0 and layer 1. | ||
|
|
||
| # Heuristic: Use the slab cell for a1, a2. | ||
| # Construct a3 from the layer spacing. | ||
| cell = aligned.get_cell() | ||
| bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3 |
There was a problem hiding this comment.
Fix the bulk unit cell a3 vector calculation.
The calculation of the bulk unit cell's a3 vector is incorrect:
bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3Issues:
- Multiplying by
len(layers)(total layers in slab) instead of the bulk periodicity makes the bulk cell artificially large - For a bulk unit cell,
a3should represent the 3D periodicity of the bulk crystal, typically the inter-layer spacing:[0, 0, dz] - The comment acknowledges this is "Dummy a3" and wrong, but the error will propagate to the
.bulfile and cause incorrect CLEED calculations
Impact: The .bul file will contain an incorrect bulk unit cell, leading to wrong LEED calculations. This is a critical correctness issue for the core functionality of this converter.
🔧 Proposed fix for bulk unit cell
if len(layers) >= 2:
z1 = np.mean(aligned.positions[layers[0], 2])
z2 = np.mean(aligned.positions[layers[1], 2])
dz = z2 - z1
- # This assumes vertical stacking.
- # Ideally, we find the translation vector between layer 0 and layer 1.
- # Heuristic: Use the slab cell for a1, a2.
- # Construct a3 from the layer spacing.
+ # Set a3 to the inter-layer spacing (bulk periodicity in z)
+ # For more complex bulk structures, this may need user input
cell = aligned.get_cell()
- bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz * len(layers)]]) # Dummy a3
+ bulk_atoms.set_cell([cell[0], cell[1], [0, 0, dz]])
+ else:
+ # Fallback: single layer, use original a3 (may be incorrect)
+ cell = aligned.get_cell()
+ bulk_atoms.set_cell([cell[0], cell[1], cell[2]])
return bulk_atoms, overlayer_atomsNote: For complex bulk structures (e.g., multi-layer bulk unit cells), the inter-layer spacing may not equal the bulk repeat vector. Consider adding a parameter for users to specify the bulk unit cell explicitly.
🤖 Prompt for AI Agents
In @tools/cleed_io/slicer.py around lines 119 - 129, The bulk unit cell a3 is
being set incorrectly by multiplying dz by len(layers); instead compute the
inter-layer spacing between the first two detected layers (dz = z2 - z1) and set
the bulk cell's third lattice vector to represent a single bulk repeat along z
(i.e., use [0, 0, dz] rather than [0, 0, dz * len(layers)]). Update the call to
bulk_atoms.set_cell to use cell[0], cell[1], and the corrected a3, and if
necessary add a parameter or fallback to allow callers to override the bulk
repeat vector for multi-layer unit cells; reference aligned.get_cell(),
aligned.positions, layers, dz, and bulk_atoms.set_cell when locating and
changing the code.
| """ | ||
| Writes the CLEED .inp file. | ||
| """ | ||
| cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually |
There was a problem hiding this comment.
Remove unused variable assignment.
The variable cell is assigned but never used. Either remove the assignment or use it for validation/documentation purposes.
🔧 Proposed fixes
Option 1: Remove the unused assignment
def write_inp(filename, overlayer_atoms: Atoms, bulk_atoms: Atoms, phase_map=None):
"""
Writes the CLEED .inp file.
"""
- cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually
with open(filename, 'w') as f:Option 2: Use it for validation or documentation
def write_inp(filename, overlayer_atoms: Atoms, bulk_atoms: Atoms, phase_map=None):
"""
Writes the CLEED .inp file.
"""
- cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually
+ # Note: overlayer uses bulk 2D periodicity (a1, a2)
+ # The superstructure matrix below assumes 1x1, could be generalized
with open(filename, 'w') as f:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually | |
| def write_inp(filename, overlayer_atoms: Atoms, bulk_atoms: Atoms, phase_map=None): | |
| """ | |
| Writes the CLEED .inp file. | |
| """ | |
| with open(filename, 'w') as f: |
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[notice] 38-38: tools/cleed_io/writer.py#L38
Unused variable 'cell'
[warning] 38-38: tools/cleed_io/writer.py#L38
local variable 'cell' is assigned to but never used (F841)
🪛 Ruff (0.14.10)
38-38: Local variable cell is assigned to but never used
Remove assignment to unused variable cell
(F841)
🤖 Prompt for AI Agents
In @tools/cleed_io/writer.py at line 38, Remove the unused local variable
assignment "cell = bulk_atoms.get_cell()" in tools/cleed_io/writer.py (or if the
cell is intended for validation/documentation, replace the assignment with a
direct use such as asserting expected periodicity or adding a brief comment
using bulk_atoms.get_cell() result), i.e., eliminate the dead variable or
consume the result via an assertion/log to justify keeping the call.
| cell = bulk_atoms.get_cell() # Overlayer uses bulk 2D periodicity usually | ||
|
|
||
| with open(filename, 'w') as f: | ||
| f.write(f"c: Generated by cleed-convert\n") |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Remove unnecessary f-string prefix.
The f-string at line 41 has no placeholders, so the f prefix is unnecessary.
📝 Fix
- f.write(f"c: Generated by cleed-convert\n")
+ f.write("c: Generated by cleed-convert\n")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| f.write(f"c: Generated by cleed-convert\n") | |
| f.write("c: Generated by cleed-convert\n") |
🧰 Tools
🪛 GitHub Check: Codacy Static Code Analysis
[warning] 41-41: tools/cleed_io/writer.py#L41
f-string is missing placeholders (F541)
🪛 Ruff (0.14.10)
41-41: f-string without any placeholders
Remove extraneous f prefix
(F541)
🤖 Prompt for AI Agents
In @tools/cleed_io/writer.py at line 41, The string passed to f.write uses an
unnecessary f-string; remove the f prefix so the call uses a regular string
literal. Locate the write call on the file object named f that writes "c:
Generated by cleed-convert\n" and change it from using an f-string to a plain
string literal (i.e., use f.write(...) -> f.write("c: Generated by
cleed-convert\n")).



Modern Input/Output Bridge for CLEED
This PR implements the "Python Bridge" strategy discussed in Issue #51 to enable CLEED to consume modern crystallography formats (POSCAR, CIF, XYZ) via ASE.
Features
1.
cleed-ioPython PackageA new lightweight package in
tools/that provides:slicer.py: An algorithm to automatically "slice" a 3D periodic slab (from VASP/ASE) into the "Bulk" and "Overlayer" components required by CLEED.writer.py: Generates valid.buland.inpfiles from the sliced atomic structures.cli.py: A command-line toolcleed-convertfor easy batch processing.2. Supported Workflow
Users can now generate a structure in Python/ASE or VASP and immediately convert it for LEED simulation:
# Convert a VASP POSCAR cleed-convert --input POSCAR --output my_surfaceThis generates
my_surface.bulandmy_surface.inpready forcsearch.3. Documentation
Added
docs/modern-io.mdexplaining the motivation, installation, and usage of the new tools.Technical Details
ase(Atomic Simulation Environment).Related Issues
Closes #51
Summary by Sourcery
Introduce a Python-based I/O bridge to convert modern crystallography structures into CLEED bulk/overlayer input files.
New Features:
Build:
Documentation:
Summary by CodeRabbit
New Features
Documentation
Tests
Chores
✏️ Tip: You can customize this high-level summary in your review settings.