Initialize the taskflow dialect for multi-CGRA/spatial accelerator scenarios #232
This PR does the following.
The Taskflow Dialect
We introduce the taskflow dialect, which contains the following ops to build a computation abstraction for both scale-out and scale-up spatial architectures:
- taskflow.graph op: wraps computation-intensive workloads in its region for acceleration on a multi-CGRA system.
- taskflow.task op: wraps a specific operation in its body for a single CGRA (with affine controller and tile array).
- taskflow.channel op: carries the data dependencies between two different tasks. We can further add resource binding attributes (e.g., streaming, sequential, coarse-grained pipeline) to this op to denote how data is transferred between two tasks that work along with the affine controller.
- taskflow.drive op: carries the control dependencies between two different tasks. This is mainly used to partition irregular workloads on multi-CGRA systems.
- packet data type in the taskflow dialect: this data type is carried by the taskflow.drive op and contains metadata for each task (e.g., iteration space, task-level execution conditions).
taskflow.task is the node of the taskflow.graph, while taskflow.channel and taskflow.drive are the edges of the graph.

The convert-linalg-to-taskflow Pass
We implement an initial conversion pass that produces the taskflow representation of a simple ResNet block generated by PyTorch.
The reason I implemented the linalg-to-taskflow conversion is that almost all ML workloads have only inter-task data dependencies, so we don't need to consider control flow.

Features to Support
Compiler Level:
- taskflow.task-level fusion, to enable multiple kernels to run on a CGRA
RTL Level:
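
For readers new to the abstraction, the node/edge structure described in the Taskflow Dialect section can be sketched outside MLIR as a plain Python model. This is only an illustration and not the dialect's implementation or syntax: the class names (Task, Edge, Graph), the task names (conv, relu, add), and the schedule helper are all hypothetical. Tasks are nodes, and each edge is tagged either "channel" (data dependency) or "drive" (control dependency).

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Task:
    """Hypothetical stand-in for a taskflow.task node."""
    name: str


@dataclass(frozen=True)
class Edge:
    """Hypothetical stand-in for a taskflow.channel or taskflow.drive edge."""
    src: str
    dst: str
    kind: str  # "channel" = data dependency, "drive" = control dependency


@dataclass
class Graph:
    """Hypothetical stand-in for a taskflow.graph region."""
    tasks: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_task(self, name):
        self.tasks[name] = Task(name)

    def connect(self, src, dst, kind="channel"):
        if kind not in ("channel", "drive"):
            raise ValueError(f"unknown edge kind: {kind}")
        if src not in self.tasks or dst not in self.tasks:
            raise KeyError("both endpoints must be tasks in the graph")
        self.edges.append(Edge(src, dst, kind))

    def schedule(self):
        """Return task names in dependency order (Kahn's algorithm)."""
        indegree = {name: 0 for name in self.tasks}
        for edge in self.edges:
            indegree[edge.dst] += 1
        ready = deque(name for name in self.tasks if indegree[name] == 0)
        order = []
        while ready:
            name = ready.popleft()
            order.append(name)
            for edge in self.edges:
                if edge.src == name:
                    indegree[edge.dst] -= 1
                    if indegree[edge.dst] == 0:
                        ready.append(edge.dst)
        if len(order) != len(self.tasks):
            raise ValueError("dependency cycle between tasks")
        return order


def build_example():
    """A residual-style block with only data dependencies (channel edges)."""
    graph = Graph()
    for name in ("conv", "relu", "add"):
        graph.add_task(name)
    graph.connect("conv", "relu")  # data flows conv -> relu
    graph.connect("relu", "add")   # data flows relu -> add
    graph.connect("conv", "add")   # skip connection
    return graph


if __name__ == "__main__":
    print(build_example().schedule())
```

In this sketch an ML-style workload uses only channel edges, which matches the motivation given for starting with the linalg conversion. A drive edge would be added with connect(src, dst, kind="drive") when a control dependency has to be expressed.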