Open
Description
Hi,
I've been reading the code in simulator/accelerator.py, and I'm confused about how compute_cycles is calculated.
In get_core_compute_cycles:
    """
    Compute instruction
    args:
        ic: Input Channels
        oc: Output Channels
        ow: Output Width
        oh: Output Height
        kw: Kernel Width
        kh: Kernel Height
        b: Batch Size
        im2col: boolean. If true, we assume the cpu does im2col. Otherwise,
                we do convolutions channel-wise
    """
    overhead = 0
    if im2col:
        ni = kw * kh * ic
        no = oc
        batch = b * oh * ow
        compute_cycles = batch * ceil_a_by_b(no, self.M) * \
                (ceil_a_by_b(ni, self.N) + overhead)
    else:
        compute_cycles = b * ceil_a_by_b(oc, self.M) * \
                ow * oh * kw * kh * \
                (ceil_a_by_b(ic, self.N) + overhead)
    return compute_cycles
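To check my reading of the formula, here is a minimal standalone sketch of it. It assumes that ceil_a_by_b is plain ceiling division and replaces self.M and self.N with ordinary arguments m and n. The function name core_compute_cycles and its signature are my own, not the repository's.

```python
def ceil_a_by_b(a, b):
    # Assumed meaning of the helper: ceiling division on positive integers.
    return -(-a // b)


def core_compute_cycles(ic, oc, ow, oh, kw, kh, b, m, n,
                        im2col=True, overhead=0):
    # Hypothetical standalone version of the method body quoted above;
    # m and n stand in for self.M and self.N.
    if im2col:
        ni = kw * kh * ic          # inputs reduced per output value
        no = oc                    # output channels
        batch = b * oh * ow        # every output pixel is treated as a batch item
        return batch * ceil_a_by_b(no, m) * (ceil_a_by_b(ni, n) + overhead)
    return (b * ceil_a_by_b(oc, m) * ow * oh * kw * kh
            * (ceil_a_by_b(ic, n) + overhead))


# 3x3 kernel, 4 input channels, 3 output channels, 2x2 output, m = n = 3
shape = dict(ic=4, oc=3, ow=2, oh=2, kw=3, kh=3, b=1, m=3, n=3)
print(core_compute_cycles(**shape, im2col=True))   # 4 * 1 * 12 = 48
print(core_compute_cycles(**shape, im2col=False))  # 1 * 1 * 36 * 2 = 72
```

With overhead = 0, both branches are just a product of tile counts, so nothing in the result depends on how long a partial sum takes to travel through the array.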
My questions are:
- In a systolic array, the partial sums produced by the PEs need to propagate downward to the bottom each cycle. Is the forwarding latency considered (for example, in a 3×3 systolic array, the first output would need to wait three cycles, corresponding to the array’s height)?
- If the above assumption is correct, does the overhead account for this? If not, what exactly is the purpose of the overhead?
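To make the two questions concrete, the sketch below compares the im2col formula with overhead set to the array height against a hypothetical model that charges the fill latency only once. The function names, the fill parameter, and the one-time-fill model are illustrative assumptions on my part, not anything taken from the simulator.

```python
def ceil_div(a, b):
    # Ceiling division on positive integers.
    return -(-a // b)


def cycles_with_overhead(batch, no, ni, m, n, overhead):
    # Same shape as the im2col branch: overhead is added inside the
    # parentheses, so it is paid once per (batch item, output tile) pair.
    return batch * ceil_div(no, m) * (ceil_div(ni, n) + overhead)


def cycles_one_time_fill(batch, no, ni, m, n, fill):
    # Hypothetical alternative: a fully pipelined array where the
    # forwarding latency is paid only once, before the first output.
    return batch * ceil_div(no, m) * ceil_div(ni, n) + fill


batch, no, ni, m, n = 4, 3, 36, 3, 3
print(cycles_with_overhead(batch, no, ni, m, n, overhead=0))  # 48
print(cycles_with_overhead(batch, no, ni, m, n, overhead=3))  # 60
print(cycles_one_time_fill(batch, no, ni, m, n, fill=3))      # 51
```

If overhead were meant to model the forwarding latency, setting it to the array height would charge that latency on every output tile rather than once, which is why I am unsure what it is intended to represent.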