Conversation
There was a problem hiding this comment.
Pull Request Overview
Adds support for pandas DataFrame input to enable non-factorial (row-wise) parametric designs alongside existing dict-based factorial Cartesian product generation.
- Extends generate_variable_combinations to accept DataFrames and return one case per row.
- Adds comprehensive tests and documentation/examples differentiating factorial dict vs non-factorial DataFrame usage.
- Updates README and adds a detailed example guide for DataFrame-driven designs.
Reviewed Changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| fz/helpers.py | Implements DataFrame handling in generate_variable_combinations with optional pandas import and logging. |
| tests/test_dataframe_input.py | Adds unit and integration tests for DataFrame vs dict behavior and input validation. |
| examples/dataframe_input.md | New extensive guide on using DataFrames for non-factorial designs with multiple sampling patterns. |
| README.md | Updates feature list and documents factorial vs non-factorial input variable formats with examples. |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
||
|
|
||
| def generate_variable_combinations(input_variables: Dict) -> List[Dict]: | ||
| def generate_variable_combinations(input_variables: Union[Dict, Any]) -> List[Dict]: |
There was a problem hiding this comment.
The type hint Union[Dict, Any] effectively collapses to Any and advertises acceptance of all types, while the function raises TypeError for non-dict/non-DataFrame inputs. Narrow the annotation to accepted types only, e.g. Union[Dict[str, Any], 'pd.DataFrame'] guarded by a TYPE_CHECKING block or a Protocol to improve static analysis.
| var_combinations = [] | ||
| for _, row in input_variables.iterrows(): | ||
| var_combinations.append(row.to_dict()) |
There was a problem hiding this comment.
Using iterrows is relatively slow and may coerce dtypes; you can replace this block with var_combinations = input_variables.to_dict(orient='records') for a vectorized, faster conversion that preserves dtypes.
| var_combinations = [] | |
| for _, row in input_variables.iterrows(): | |
| var_combinations.append(row.to_dict()) | |
| var_combinations = input_variables.to_dict(orient='records') |
| | **Example** | `{"x": [1,2], "y": [3,4]}` → 4 cases | `pd.DataFrame({"x":[1,2], "y":[3,4]})` → 2 cases | | ||
| | **Constraints** | Cannot handle constraints | Can handle constraints | | ||
| | **Sampling** | Grid-based | Any sampling method | | ||
|
|
There was a problem hiding this comment.
Each line has a double leading pipe '||', which will render an extra empty column or break the table. Remove one leading pipe per line so the table starts with a single | (e.g. | Aspect | Dict (Factorial) | DataFrame (Non-Factorial) |).
| model = { | ||
| "formulaprefix": "@", | ||
| "delim": "{}", | ||
| "commentline": "#", | ||
| "output": { | ||
| "result": "grep 'result:' output.txt | awk '{print $2}'" | ||
| } | ||
| } |
There was a problem hiding this comment.
[nitpick] The same model dict is duplicated across multiple tests (e.g., lines 171–178, 201–208, 257–263). Consider extracting it into a fixture or a class attribute to reduce repetition and ease future changes.
5077247 to
6b43c56
Compare
3b8c6c5 to
28d619f
Compare
Resolves PermissionError on Windows during temporary directory cleanup by restoring the original working directory before the TemporaryDirectory context manager exits. On Windows, you cannot delete a directory that is the current working directory. The tests were calling os.chdir(tmpdir) and then attempting to clean up the directory when the context exited, causing: - PermissionError: [WinError 32] The process cannot access the file because it is being used by another process - PermissionError: [WinError 5] Access is denied Solution: Wrap test logic in try/finally blocks that save and restore the original working directory, allowing Windows to successfully delete temporary directories during cleanup. Fixes #40 (Windows CI failure in test_dict_flattening.py)
Resolves PermissionError on Windows during temporary directory cleanup by restoring the original working directory before the TemporaryDirectory context manager exits. On Windows, you cannot delete a directory that is the current working directory. The tests were calling os.chdir(tmpdir) and then attempting to clean up the directory when the context exited, causing: - PermissionError: [WinError 32] The process cannot access the file because it is being used by another process - PermissionError: [WinError 5] Access is denied Solution: Wrap test logic in try/finally blocks that save and restore the original working directory, allowing Windows to successfully delete temporary directories during cleanup. Fixes #40 (Windows CI failure in test_dict_flattening.py) Co-authored-by: Claude <noreply@anthropic.com>
Squash-merge of implement-algorithms branch onto current main (v0.9.1). Resolved conflicts in: - fz/__init__.py: Added both fzl and fzd exports - fz/core.py: Merged imports, kept callbacks from main + added fzd functions - fz/helpers.py: Kept main's format_time (Windows bash moved to shell.py) - fz/cli.py: Kept fzl list command + added algorithm install subcommands - fz/shell.py: Kept main's safer regex replacement from #56 - README.md: Listed both fzl and fzd, kept improved env var docs - tests/test_cli_commands.py: Added new test methods from PR - Removed CLAUDE.md (moved to claude/ dir on main) - Removed setup.py (replaced by pyproject.toml on main)
877028c to
4819568
Compare
… flake8 errors (undefined variable names in fzd code):\n display_dict→analysis_dict, processed_final_analysis→processed_final_display,\n tmp_display_processed→tmp_analysis_processed, iter_display_processed→iter_analysis_processed\n- Remove duplicate 'list' subparser conflicting with fzl's list command\n- Remove shadowing local _resolve_calculators_arg (use imported version from helpers.py)\n- Update tests for new fzl-style 'list' command (no more 'list models' subcommand)"
This commit adds support for using pandas DataFrames as input_variables,
enabling non-factorial parametric study designs alongside the existing
dict-based factorial (Cartesian product) approach.
Implementation (fz/helpers.py):
Key features:
Example: {"x": [1,2], "y": [3,4]} → 4 cases
Example: pd.DataFrame({"x":[1,2], "y":[3,4]}) → 2 cases (rows)
Use cases for DataFrames:
Tests (tests/test_dataframe_input.py):
Documentation:
Backward compatibility:
Example usage: