51 changes: 51 additions & 0 deletions docs/cookbooks/01-common_setup.md
@@ -45,3 +45,54 @@ monolog:
max_files: 10
channels: ['cleverage_process_task', 'cleverage_process_transformer']
```

Example: lightweight file import
--------------------------------

To give more context on how configuration ties to the PHP code, here is a minimal file-import workflow built with classes already shipped in `src/Task` and `src/Task/File`.

```yaml
clever_age_process:
  configurations:
    app.file_import:
      default_error_strategy: stop
      tasks:
        read_csv:
          service: '@CleverAge\ProcessBundle\Task\File\InputFileReaderTask'
          options:
            file_path: '%kernel.project_dir%/data/products.csv'
            format: csv
          outputs: [split_rows]

        split_rows:
          service: '@CleverAge\ProcessBundle\Task\File\Csv\CsvSplitterTask'
          options:
            delimiter: ';'
          outputs: [transform]

        transform:
          service: '@CleverAge\ProcessBundle\Task\TransformerTask'
          options:
            transformers:
              mapping:
                mapping:
                  id: { code: '[id]' }
                  slug:
                    code:
                      - '[name]'
                      - '[category]'
                    transformers:
                      implode:
                        separator: '-'
          outputs: [write_csv]

        write_csv:
          service: '@CleverAge\ProcessBundle\Task\File\Csv\CsvWriterTask'
          options:
            file_path: '%kernel.project_dir%/var/output/products_prepared.csv'
            headers:
              - id
              - slug
```

Each task above maps to a concrete class, such as `InputFileReaderTask`, `CsvSplitterTask`, and `CsvWriterTask`, found under `src/Task/File`. Setting `default_error_strategy` explicitly tells the code in `src/Configuration/ProcessConfiguration.php` how to propagate task failures.
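To make the data flow of the configuration above easier to follow, here is a minimal Python sketch of what the `transform` mapping is meant to produce for each row. It is an illustration only, not the bundle's implementation: the function names, the sample rows, and the `;` delimiter on the output side are assumptions.

```python
import csv
import io


def prepare_rows(source_text, delimiter=";"):
    """Mimic the read -> split -> transform steps on CSV text."""
    reader = csv.DictReader(io.StringIO(source_text), delimiter=delimiter)
    for row in reader:
        # Same idea as the `transform` task: keep `id`, and build `slug`
        # by joining `name` and `category` with "-".
        yield {"id": row["id"], "slug": "-".join([row["name"], row["category"]])}


def write_rows(rows, delimiter=";"):
    """Mimic the `write_csv` step, returning CSV text instead of writing a file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "slug"], delimiter=delimiter, lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data standing in for data/products.csv.
sample = "id;name;category\n1;lamp;home\n2;desk;office\n"
print(write_rows(prepare_rows(sample)))
```

Each input row yields exactly one output row, which is why the workflow can stream rows from the splitter to the writer without buffering the whole file.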
52 changes: 52 additions & 0 deletions docs/cookbooks/etl_aggregate_reports.md
@@ -0,0 +1,52 @@
ETL report aggregation
======================

This example shows an ETL path that reads several JSON log files, averages the `duration` field per service, and writes a CSV summary that can feed a dashboard.

```yaml
clever_age_process:
  configurations:
    app.etl_report_aggregate:
      default_error_strategy: stop
      tasks:
        list_sources:
          service: '@CleverAge\ProcessBundle\Task\File\FolderBrowserTask'
          options:
            folder: '%kernel.project_dir%/data/logs'
            filter: '*.json'
          outputs: [read_log]

        read_log:
          service: '@CleverAge\ProcessBundle\Task\File\JsonStream\JsonStreamReaderTask'
          options:
            file_path: '[file]' # value injected from FolderBrowserTask
            iterator: items
          outputs: [map_log]

        map_log:
          service: '@CleverAge\ProcessBundle\Task\TransformerTask'
          options:
            transformers:
              mapping:
                mapping:
                  service: { code: '[service]' }
                  duration: { code: '[duration]' }
                  status: { code: '[status]' }
          outputs: [group_reports]

        group_reports:
          service: '@CleverAge\ProcessBundle\Task\GroupByAggregateIterableTask'
          options:
            group_by: service
            aggregate:
              duration: { type: 'avg' }
          outputs: [write_summary]

        write_summary:
          service: '@CleverAge\ProcessBundle\Task\File\Csv\CsvWriterTask'
          options:
            file_path: '%kernel.project_dir%/var/exports/report_summary.csv'
            headers: [service, duration]
```

The `FolderBrowserTask`, `JsonStreamReaderTask`, `TransformerTask`, and `GroupByAggregateIterableTask` classes live under `src/Task/File` and `src/Task`. Together they form a reusable read/transform/aggregate path that can be adapted to other log or report sources.
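The aggregation step is the least obvious part of this recipe, so the following Python sketch shows the intended result of grouping by `service` and averaging `duration`. It only illustrates the logic; the function name and the sample records are assumptions, and it does not reflect how `GroupByAggregateIterableTask` is written.

```python
from collections import defaultdict


def average_duration_by_service(records):
    """Group records by `service` and average their `duration`."""
    totals = defaultdict(lambda: [0.0, 0])  # service -> [sum of durations, count]
    for record in records:
        entry = totals[record["service"]]
        entry[0] += float(record["duration"])
        entry[1] += 1
    # One summary row per service, matching the `service, duration` CSV headers.
    return [
        {"service": service, "duration": total / count}
        for service, (total, count) in sorted(totals.items())
    ]


# Hypothetical records, shaped like the output of the `map_log` step.
records = [
    {"service": "api", "duration": 120, "status": 200},
    {"service": "api", "duration": 80, "status": 500},
    {"service": "worker", "duration": 300, "status": 200},
]
summary = average_duration_by_service(records)
print(summary)
```

Note that the grouping needs every record before it can emit a result, so this step is the point where the flow stops being row-by-row.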
57 changes: 57 additions & 0 deletions docs/cookbooks/etl_file_sync.md
@@ -0,0 +1,57 @@
File synchronization ETL
========================

This recipe describes a typical ETL flow: read a CSV file, enrich the data with transformations, then write to another file while keeping a statistics log.

```yaml
clever_age_process:
  configurations:
    app.etl_file_sync:
      default_error_strategy: stop
      tasks:
        load_source:
          service: '@CleverAge\ProcessBundle\Task\File\InputFileReaderTask'
          options:
            file_path: '%kernel.project_dir%/data/catalog.csv'
            format: csv
          outputs: [split_rows]

        split_rows:
          service: '@CleverAge\ProcessBundle\Task\File\Csv\CsvSplitterTask'
          options:
            delimiter: ';'
          outputs: [normalize]

        normalize:
          service: '@CleverAge\ProcessBundle\Task\TransformerTask'
          options:
            transformers:
              mapping:
                mapping:
                  sku: { code: '[sku]' }
                  price: { code: '[price]', transformers: { cast: { type: float } } }
                  sent_at:
                    code: 'format_datetime("[updated_at]", "Y-m-d")'
          outputs: [deduplicate]

        deduplicate:
          service: '@CleverAge\ProcessBundle\Task\FilterTask'
          options:
            unique_field: sku
          outputs: [write_target]

        write_target:
          service: '@CleverAge\ProcessBundle\Task\File\Csv\CsvWriterTask'
          options:
            file_path: '%kernel.project_dir%/var/exports/catalog_normalized.csv'
            headers: [sku, price, sent_at]
          outputs: [log_stats]

        log_stats:
          service: '@CleverAge\ProcessBundle\Task\Reporting\StatCounterTask'
          options:
            increment:
              rows: 1
```

The `FilterTask` in the sequence corresponds to `src/Task/FilterTask.php`, and the CSV tasks come from `src/Task/File/Csv`. The final counter uses `src/Task/Reporting/StatCounterTask.php` to show how to append a simple metric to the logs.
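As a rough illustration of what the `normalize`, `deduplicate`, and `log_stats` steps are meant to achieve together, here is a small Python sketch. It is not the bundle's code: the function names and sample rows are made up, and keeping the first occurrence of each `sku` is an assumption about the intended deduplication.

```python
from datetime import datetime


def normalize(row):
    """Cast `price` to float and reduce `updated_at` to a Y-m-d date."""
    return {
        "sku": row["sku"],
        "price": float(row["price"]),
        "sent_at": datetime.fromisoformat(row["updated_at"]).strftime("%Y-%m-%d"),
    }


def sync(rows):
    """Normalize rows, keep the first row per sku, and count what is written."""
    seen = set()
    output = []
    for row in rows:
        item = normalize(row)
        if item["sku"] in seen:
            continue  # assumption: later duplicates are dropped
        seen.add(item["sku"])
        output.append(item)
    # The counter increases by one per row that reaches the writer.
    return output, {"rows": len(output)}


# Hypothetical rows standing in for data/catalog.csv after splitting.
rows = [
    {"sku": "A1", "price": "9.90", "updated_at": "2024-03-01T10:15:00"},
    {"sku": "A1", "price": "9.50", "updated_at": "2024-03-02T08:00:00"},
    {"sku": "B2", "price": "4", "updated_at": "2024-03-03T12:30:00"},
]
written, stats = sync(rows)
print(written, stats)
```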
5 changes: 5 additions & 0 deletions docs/cookbooks/memory_usage_graph.md
@@ -38,3 +38,8 @@ To graph the output of this process, use this Gnuplot command in your host envir
```bash
$ gnuplot -e 'while(1) {plot "memory.dat" using 0:1 with lines; pause 1; reread}'
```

Alternative: log memory per process phase
-------------------------------------------

If you want more granular analysis, add `TransformerTask` steps that call `memory_get_usage` immediately before and after a critical step of your process (e.g., around the `transform` task of the file import example in `01-common_setup.md`). Write each value to `memory_phase.dat` with a `CsvWriterTask` that outputs `phase` and `memory_usage` columns, so you can compare how much memory each phase consumes.
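Once `memory_phase.dat` exists, comparing phases comes down to subtracting consecutive measurements. The Python sketch below shows one way to do that; it assumes the file has a header row with `phase` and `memory_usage` columns separated by `;`, and the phase names and byte values are invented for the example.

```python
import csv
import io


def phase_deltas(dat_text, delimiter=";"):
    """Return the memory growth (in bytes) between consecutive phases."""
    rows = list(csv.DictReader(io.StringIO(dat_text), delimiter=delimiter))
    deltas = []
    for previous, current in zip(rows, rows[1:]):
        growth = int(current["memory_usage"]) - int(previous["memory_usage"])
        deltas.append((f'{previous["phase"]} -> {current["phase"]}', growth))
    return deltas


# Hypothetical content of memory_phase.dat.
sample = (
    "phase;memory_usage\n"
    "before_transform;5242880\n"
    "after_transform;7340032\n"
)
for label, growth in phase_deltas(sample):
    print(f"{label}: {growth / 1024 / 1024:.2f} MiB")
```

A negative delta simply means memory was released between the two measurements.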
13 changes: 12 additions & 1 deletion docs/cookbooks/performances_monitoring.md
@@ -46,4 +46,15 @@ CPU Time n/a
Memory 5.35MB
Network n/a n/a n/a
SQL n/a n/a
```

Example: measuring CSV processing throughput
------------------------------------------------

When you run `cleverage:process:execute`, you already benefit from the `ProcessLogger` (under `src/Logger`), which records task timing stats. To capture a Blackfire profile of the file import process, run the command through `blackfire run`:

```bash
blackfire run php bin/console --env=prod cleverage:process:execute app.file_import
```

After the JSON output, look for the `ProcessLogger` section (e.g., `timing.cleverage_process.task`) and correlate it with the CSV reader/writer tasks you configured in `docs/cookbooks/01-common_setup.md`. Use those duration metrics to decide whether to parallelize `CsvSplitterTask` with `TransformerTask`, or to revisit buffering logic in `CsvWriterTask`.
100 changes: 51 additions & 49 deletions docs/index.md
@@ -6,6 +6,8 @@
- [Advanced workflow](04-advanced_workflow.md)
- Cookbooks
- [Common Setup](cookbooks/01-common_setup.md)
- [ETL file synchronization](cookbooks/etl_file_sync.md)
- [ETL report aggregation](cookbooks/etl_aggregate_reports.md)
- [Transformations]
- [Flow manipulation]
- [Dummy tasks]
@@ -32,18 +34,18 @@
- Data manipulation and transformations
- [DenormalizerTask](reference/tasks/denormalizer_task.md)
- [NormalizerTask](reference/tasks/normalizer_task.md)
- [DeserializerTask]
- [SerializerTask]
- [DeserializerTask](reference/tasks/deserializer_task.md)
- [SerializerTask](reference/tasks/serializer_task.md)
- [PropertyGetterTask](reference/tasks/property_getter_task.md)
- [PropertySetterTask](reference/tasks/property_setter_task.md)
- [ObjectUpdaterTask]
- [SplitJoinLineTask]
- [ObjectUpdaterTask](reference/tasks/object_updater_task.md)
- [SplitJoinLineTask](reference/tasks/split_join_line_task.md)
- [TransformerTask](reference/tasks/transformer_task.md)
- [ValidatorTask]
- [ValidatorTask](reference/tasks/validator_task.md)
- File/CSV
- [CsvReaderTask](reference/tasks/csv_reader_task.md)
- [CsvWriterTask](reference/tasks/csv_writer_task.md)
- [CSVSplitterTask]
- [CsvSplitterTask](reference/tasks/csv_splitter_task.md)
- [InputCsvReaderTask](reference/tasks/input_csv_reader_task.md)
- File/JsonStream
- [JsonStreamReaderTask](reference/tasks/json_stream_reader_task.md)
@@ -52,14 +54,14 @@
- [XmlReaderTask](reference/tasks/xml_reader_task.md)
- [XmlWriterTask](reference/tasks/xml_writer_task.md)
- File/Yaml
- [YamlReaderTask]
- [YamlWriterTask]
- [YamlReaderTask](reference/tasks/yaml_reader_task.md)
- [YamlWriterTask](reference/tasks/yaml_writer_task.md)
- File
- [FileMoverTask]
- [FileMoverTask](reference/tasks/file_mover_task.md)
- [FileReaderTask](reference/tasks/file_reader_task.md)
- [FileRemoverTask]
- [FileRemoverTask](reference/tasks/file_remover_task.md)
- [FileSplitterTask](reference/tasks/file_splitter_task.md)
- [FileWriterTask]
- [FileWriterTask](reference/tasks/file_writer_task.md)
- [FolderBrowserTask](reference/tasks/folder_browser_task.md)
- [InputFileReaderTask](reference/tasks/input_file_reader_task.md)
- [InputFolderBrowserTask](reference/tasks/input_folder_browser_task.md)
@@ -69,62 +71,62 @@
- [AggregateIterableTask](reference/tasks/aggregate_iterable_task.md)
- [InputAggregatorTask](reference/tasks/input_aggregator_task.md)
- [InputIteratorTask](reference/tasks/input_iterator_task.md)
- [ArrayMergeTask]
- [ColumnAggregatorTask]
- [RowAggregatorTask]
- [FilterTask]
- [GroupByAggregateIterableTask]
- [SimpleBatchTask]
- [IterableBatchTask]
- [SkipEmptyTask]
- [StopTask]
- [ArrayMergeTask](reference/tasks/array_merge_task.md)
- [ColumnAggregatorTask](reference/tasks/column_aggregator_task.md)
- [RowAggregatorTask](reference/tasks/row_aggregator_task.md)
- [FilterTask](reference/tasks/filter_task.md)
- [GroupByAggregateIterableTask](reference/tasks/group_by_aggregate_iterable_task.md)
- [SimpleBatchTask](reference/tasks/simple_batch_task.md)
- [IterableBatchTask](reference/tasks/iterable_batch_task.md)
- [SkipEmptyTask](reference/tasks/skip_empty_task.md)
- [StopTask](reference/tasks/stop_task.md)
- Process
- [CommandRunnerTask]
- [ProcessExecutorTask]
- [ProcessLauncherTask]
- [CommandRunnerTask](reference/tasks/command_runner_task.md)
- [ProcessExecutorTask](reference/tasks/process_executor_task.md)
- [ProcessLauncherTask](reference/tasks/process_launcher_task.md)
- Reporting
- [AdvancedStatCounterTask]
- [AdvancedStatCounterTask](reference/tasks/advanced_stat_counter_task.md)
- [LoggerTask](reference/tasks/logger_task.md)
- [StatCounterTask]
- [StatCounterTask](reference/tasks/stat_counter_task.md)
- Transformers
- Basic and debug
- [CachedTransformer]
- [CallbackTransformer]
- [CastTransformer]
- [ConstantTransformer]
- [ConvertValueTransformer]
- [DebugTransformer]
- [DefaultTransformer]
- [CachedTransformer](reference/transformers/cached_transformer.md)
- [CallbackTransformer](reference/transformers/callback_transformer.md)
- [CastTransformer](reference/transformers/cast_transformer.md)
- [ConstantTransformer](reference/transformers/constant_transformer.md)
- [ConvertValueTransformer](reference/transformers/convert_value_transformer.md)
- [DebugTransformer](reference/transformers/debug_transformer.md)
- [DefaultTransformer](reference/transformers/default_transformer.md)
- [GenericTransformer]
- [EvaluatorTransformer]
- [ExpressionLanguageMapTransformer]
- [EvaluatorTransformer](reference/transformers/evaluator_transformer.md)
- [ExpressionLanguageMapTransformer](reference/transformers/expression_language_map_transformer.md)
- [MappingTransformer](reference/transformers/mapping_transformer.md)
- [MultiReplaceTransformer](reference/transformers/multi_replace_transformer.md)
- [PregFilterTransformer]
- [PregFilterTransformer](reference/transformers/preg_filter_transformer.md)
- [RulesTransformer](reference/transformers/rules_transformer.md)
- [TypeSetterTransformer]
- [UnsetTransformer]
- [WrapperTransformer]
- [TypeSetterTransformer](reference/transformers/type_setter_transformer.md)
- [UnsetTransformer](reference/transformers/unset_transformer.md)
- [WrapperTransformer](reference/transformers/wrapper_transformer.md)
- Array
- [ArrayElementTransformer]
- [ArrayElementTransformer](reference/transformers/array_element_transformer.md)
- [ArrayFilterTransformer](reference/transformers/array_filter_transformer.md)
- [ArrayFirstTransformer]
- [ArrayLastTransformer]
- [ArrayFirstTransformer](reference/transformers/array_first_transformer.md)
- [ArrayLastTransformer](reference/transformers/array_last_transformer.md)
- [ArrayMapTransformer](reference/transformers/array_map_transformer.md)
- [ArrayUnsetTransformer]
- [ArrayUnsetTransformer](reference/transformers/array_unset_transformer.md)
- Date
- [DateFormatTransformer](reference/transformers/date_format.md)
- [DateParserTransformer](reference/transformers/date_parser.md)
- Object
- [InstantiateTransformer]
- [PropertyAccessorTransformer]
- [RecursivePropertySetterTransformer]
- [InstantiateTransformer](reference/transformers/instantiate_transformer.md)
- [PropertyAccessorTransformer](reference/transformers/property_accessor_transformer.md)
- [RecursivePropertySetterTransformer](reference/transformers/recursive_property_setter_transformer.md)
- Serialization
- [DenormalizeTransformer]
- [NormalizeTransformer]
- [DenormalizeTransformer](reference/transformers/denormalize_transformer.md)
- [NormalizeTransformer](reference/transformers/normalize_transformer.md)
- String
- [ExplodeTransformer]
- [HashTransformer]
- [ExplodeTransformer](reference/transformers/explode_transformer.md)
- [HashTransformer](reference/transformers/hash_transformer.md)
- [ImplodeTransformer](reference/transformers/implode_transformer.md)
- [PregMatchTransformer](reference/transformers/preg_match_transformer.md)
- [SlugifyTransformer](reference/transformers/slugify_transformer.md)
46 changes: 46 additions & 0 deletions docs/reference/tasks/advanced_stat_counter_task.md
@@ -0,0 +1,46 @@
AdvancedStatCounterTask
=======================

Log performance statistics at regular intervals during iteration. Displays time between iterations, items per
second rate, total items processed, and total elapsed time.

Task reference
--------------

* **Service**: `CleverAge\ProcessBundle\Task\Reporting\AdvancedStatCounterTask`

Accepted inputs
---------------

`any`: input is not used, only the iteration count matters

Possible outputs
----------------

No output is set (the task skips most iterations). Statistics are logged via the logger.

Options
-------

| Code | Type | Required | Default | Description |
|--------------|-----------|:--------:|---------|-------------------------------------------------------------------|
| `num_items` | `integer` | | `1` | Number of logical items per iteration (multiplier for rate calc) |
| `skip_first` | `integer` | | `0` | Number of initial iterations to skip before tracking |
| `show_every` | `integer` | | `1` | Display statistics every N iterations |

Example
-------

```yaml
# Task configuration level
stats:
  service: '@CleverAge\ProcessBundle\Task\Reporting\AdvancedStatCounterTask'
  options:
    show_every: 100
    num_items: 1
```

Output in logs (every 100 iterations):
```
Last iteration 00:00:12 ago - 1,50 items/s - 500 items processed in 00:05:33
```
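To help read the log line above, here is a Python sketch that rebuilds it from three numbers. It is a reconstruction based only on the sample line, not the task's actual code: it assumes the rate is total items divided by total elapsed time (500 items in 333 s gives about 1.50 items/s, which matches the sample) and that the decimal separator is a comma. The function names are hypothetical.

```python
def format_duration(seconds):
    """Format a number of seconds as HH:MM:SS."""
    hours, remainder = divmod(int(seconds), 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def format_stats(items_processed, elapsed_seconds, since_last_seconds):
    """Build a statistics line shaped like the sample log output."""
    # Assumption: rate = items processed so far / total elapsed time.
    rate = items_processed / elapsed_seconds if elapsed_seconds else 0.0
    rate_text = f"{rate:.2f}".replace(".", ",")
    return (
        f"Last iteration {format_duration(since_last_seconds)} ago - "
        f"{rate_text} items/s - {items_processed} items processed in "
        f"{format_duration(elapsed_seconds)}"
    )


print(format_stats(500, 333, 12))
```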