Speculative Speculative Decoding: Hiding Draft Latency Through Asynchronous Speculation on Multi-GPU Systems
Course project for CMU 15-418 Parallel Computer Architecture and Programming by Marcus Alenius and William Chien.
Spring 2026