# Contribution guidelines, tailored for LLM agents

## Testing

We use `nox` to run our tests.

- To test your changes, run unit tests with `nox`:

  ```bash
  nox -r -s unit
  ```

- To run a single unit test:

  ```bash
  nox -r -s unit-3.14 -- -k <name of test>
  ```

- Ignore this step if you lack access to Google Cloud resources. To run system
  tests, you can execute:

  ```bash
  # Run all system tests
  nox -r -s system

  # Run a single system test
  nox -r -s system-3.14 -- -k <name of test>
  ```

- Each change must leave the codebase with better test coverage than before.
  You can measure coverage via `nox -s unit system cover` (takes a long
  time). Omit `system` if you lack access to cloud resources.

## Code Style

- We use the automatic code formatter `black`, which eliminates many lint
  errors. Run it via the nox session `format`:

  ```bash
  nox -r -s format
  ```

- PEP8 compliance is required, with exceptions defined in the linter
  configuration. If you have `nox` installed, you can check that you have not
  introduced any non-compliant code via:

  ```bash
  nox -r -s lint
  ```

- When writing tests, use the idiomatic "pytest" style.

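As an illustration of that style, pytest tests are plain module-level functions using bare `assert` statements, with no `unittest.TestCase` classes. The function under test below is a made-up toy, not part of the BigFrames codebase:

```python
# Illustrative only: a toy function and pytest-style tests for it. pytest
# discovers functions named `test_*` and rewrites bare asserts to produce
# rich failure messages, so no assertion-helper methods are needed.


def truncate(text: str, limit: int) -> str:
    """Toy function under test: cut a string and append an ellipsis."""
    return text if len(text) <= limit else text[:limit] + "..."


def test_short_string_is_unchanged():
    assert truncate("abc", 10) == "abc"


def test_long_string_gets_ellipsis():
    assert truncate("abcdefgh", 3) == "abc..."
```

For expected exceptions, idiomatic pytest uses the `pytest.raises` context manager rather than try/except bookkeeping.
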
## Documentation

If a method or property implements the same interface as a third-party
package such as pandas or scikit-learn, place the relevant docstring in the
corresponding `third_party/bigframes_vendored/package_name` directory, not in
the `bigframes` directory. Implementations may still be placed in the
`bigframes` directory.

### Testing code samples

Code samples are very important for accurate documentation. We use the
"doctest" framework to ensure the samples function as expected. After adding a
code sample, verify it is correct by running doctest. To run the doctests for
just a single method, refer to the following example:

```bash
pytest --doctest-modules bigframes/pandas/__init__.py::bigframes.pandas.cut
```

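For reference, a doctest-style code sample lives inside the docstring itself. The function below is a hypothetical example, not part of BigFrames; the doctest runner executes each `>>>` line and compares what it prints against the text that follows:

```python
# Illustrative only: a minimal doctest-style code sample in a docstring.
def clip(value: int, low: int, high: int) -> int:
    """Clamp ``value`` to the closed interval [low, high].

    **Examples:**

        >>> clip(5, 0, 10)
        5
        >>> clip(-3, 0, 10)
        0
        >>> clip(42, 0, 10)
        10
    """
    return max(low, min(high, value))


if __name__ == "__main__":
    import doctest

    doctest.testmod()  # reports any example whose output does not match
```
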
## Tips for implementing common BigFrames features

### Adding a scalar operator

For an example, see commit
[c5b7fdae74a22e581f7705bc0cf5390e928f4425](https://github.com/googleapis/python-bigquery-dataframes/commit/c5b7fdae74a22e581f7705bc0cf5390e928f4425).

To add a new scalar operator, follow these steps:

1. **Define the operation dataclass:**
   - In `bigframes/operations/`, find the relevant file (e.g., `geo_ops.py` for geography functions) or create a new one.
   - Create a new dataclass inheriting from `base_ops.UnaryOp` for unary
     operators, `base_ops.BinaryOp` for binary operators, `base_ops.TernaryOp`
     for ternary operators, or `base_ops.NaryOp` for operators with many
     arguments. Note that these categories count the number of column-like
     arguments. A function that takes only a single column but several literal
     values would still be a `UnaryOp`.
   - Define the `name` of the operation and any parameters it requires.
   - Implement the `output_type` method to specify the data type of the result.

2. **Export the new operation:**
   - In `bigframes/operations/__init__.py`, import your new operation dataclass and add it to the `__all__` list.

3. **Implement the user-facing function (pandas-like):**

   - Identify the canonical function from pandas, geopandas, Awkward Array, or
     another popular Python package that this operator implements.
   - Find the corresponding class in BigFrames. For example, the implementation
     for most `geopandas.GeoSeries` methods is in
     `bigframes/geopandas/geoseries.py`. Pandas `Series` methods are implemented
     in `bigframes/series.py` or one of the accessors, such as `StringMethods`
     in `bigframes/operations/strings.py`.
   - Create the user-facing function that will be called by users (e.g., `length`).
   - If the SQL method differs from pandas or geopandas in a way that can't be
     reconciled, raise a `NotImplementedError` with an appropriate message and
     a link to the feedback form.
   - Add the docstring to the corresponding file in
     `third_party/bigframes_vendored`, modeled after pandas / geopandas.

4. **Implement the user-facing function (SQL-like):**

   - In `bigframes/bigquery/_operations/`, find the relevant file (e.g., `geo.py`) or create a new one.
   - Create the user-facing function that will be called by users (e.g., `st_length`).
   - This function should take a `Series` for any column-like inputs, plus any other parameters.
   - Inside the function, call `series._apply_unary_op`,
     `series._apply_binary_op`, or similar, passing the operation dataclass you
     created.
   - Add a comprehensive docstring with examples.
   - In `bigframes/bigquery/__init__.py`, import your new user-facing function and add it to the `__all__` list.

5. **Implement the compilation logic:**
   - In `bigframes/core/compile/scalar_op_compiler.py`:
     - If the BigQuery function has a direct equivalent in Ibis, you can often reuse an existing Ibis method.
     - If not, define a new Ibis UDF using `@ibis_udf.scalar.builtin` to map to the specific BigQuery function signature.
   - Create a new compiler implementation function (e.g., `geo_length_op_impl`).
   - Register this function to your operation dataclass using `@scalar_op_compiler.register_unary_op` or `@scalar_op_compiler.register_binary_op`.
   - This implementation translates the BigQuery DataFrames operation into the appropriate Ibis expression.

6. **Add tests:**
   - Add system tests in the `tests/system/` directory to verify the end-to-end
     functionality of the new operator. Test various inputs, including edge cases
     and `NULL` values.

     Where possible, run the same test code against pandas or GeoPandas and
     verify that the outputs are the same (except for dtypes where BigFrames
     differs from pandas).
   - If you are overriding a pandas or GeoPandas property, add a unit test to
     ensure the correct behavior (e.g., raising `NotImplementedError` if the
     functionality is not supported).

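The data flow of steps 1 through 4 can be sketched with stand-in stubs. Everything below is a simplification: `UnaryOp`, `Series`, and `_apply_unary_op` mirror the names of the real BigFrames internals but have different signatures and behavior, and `GeoLengthOp` / `st_length` are hypothetical:

```python
# A highly simplified sketch: the real classes build expression trees;
# these stubs only model the resulting dtype.
import dataclasses
import typing


@dataclasses.dataclass(frozen=True)
class UnaryOp:
    """Stand-in for bigframes.operations.base_ops.UnaryOp."""

    name: typing.ClassVar[str] = "unary"

    def output_type(self, input_type: str) -> str:
        raise NotImplementedError


# Step 1: define the operation dataclass with a `name` and an `output_type`.
@dataclasses.dataclass(frozen=True)
class GeoLengthOp(UnaryOp):
    name: typing.ClassVar[str] = "geo_length"

    def output_type(self, input_type: str) -> str:
        # A length is numeric regardless of the input geography type.
        return "float64"


class Series:
    """Stand-in for bigframes.series.Series."""

    def __init__(self, dtype: str):
        self.dtype = dtype

    def _apply_unary_op(self, op: UnaryOp) -> "Series":
        # The real method appends a node to the expression tree; here we
        # only propagate the operation's output dtype.
        return Series(op.output_type(self.dtype))


# Step 4: the user-facing, SQL-like function forwards to
# `_apply_unary_op`, passing the operation dataclass.
def st_length(series: Series) -> Series:
    """Return the length of each geography value (sketch only)."""
    return series._apply_unary_op(GeoLengthOp())
```

With these stubs, `st_length(Series("geography")).dtype` evaluates to `"float64"`, mirroring how the operation's `output_type` determines the result dtype.
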
## Constraints

- Only add git commits. Do not change git history.
- Follow the spec file for development.
  - Check off items in the "Acceptance criteria" and "Detailed steps" sections
    with `[x]` as they are completed.
  - Refer back to the spec after each step.