Projects

TinyTPU

2025

Built a minimal TPU from scratch that performs on-chip inference and training for a 2 -> 2 -> 1 MLP that solves the XOR problem.

Softmax in Hardware

2025

Implemented the non-linear softmax function in hardware using SystemVerilog.

Systolic Array

2025

Built a 2x2 systolic array in SystemVerilog to accelerate matrix multiplication in hardware. Systolic arrays are the core compute unit of many modern ML accelerator ASICs.

Hack Computer

2025

Built a 16-bit programmable computer in HDL. Also built an assembler and a compiler to convert Java-like code to bytecode, and bytecode to machine code. I built this as part of the NAND2Tetris course.