Articles
Decomposing Systolic Arrays
How systolic arrays accelerate matrix multiplications in modern ML accelerator ASICs like TPUs.
Building Softmax In Hardware
Building a custom hardware pipeline for softmax, the nonlinear function used in transformer attention.