Together AI interview question

Code multi-head attention from scratch, explain how to implement speculative decoding, etc.
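
For reference, here is a rough NumPy sketch of the two named topics. Nothing in it comes from the actual interview; it is one plausible way to answer, under stated assumptions. The function names (`multi_head_attention`, `speculative_step`), the single-sequence (unbatched) shapes, the causal mask, and the `draft_probs` / `target_probs` callables are all hypothetical. The speculative decoding part follows the commonly described accept/reject scheme: the draft model proposes k tokens, each is accepted with probability min(1, p/q), and the first rejection is replaced by a sample from the normalized residual max(0, p - q).

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(x, w_q, w_k, w_v, w_o, num_heads, causal=True):
    """x: (seq, d_model); every weight matrix: (d_model, d_model)."""
    seq, d_model = x.shape
    assert d_model % num_heads == 0, "d_model must be divisible by num_heads"
    d_head = d_model // num_heads

    def split(t):
        # (seq, d_model) -> (heads, seq, d_head)
        return t.reshape(seq, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(x @ w_q), split(x @ w_k), split(x @ w_v)
    # Scaled dot-product scores per head: (heads, seq, seq)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    if causal:
        # Position i may not attend to positions j > i.
        mask = np.triu(np.ones((seq, seq), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                   # (heads, seq, d_head)
    out = out.transpose(1, 0, 2).reshape(seq, d_model)  # concatenate heads
    return out @ w_o

def speculative_step(prefix, draft_probs, target_probs, k, rng):
    """One speculative decoding step; returns the newly accepted tokens.

    draft_probs / target_probs are hypothetical callables that map a list of
    token ids to a next-token probability vector (1-D array summing to 1).
    """
    # 1. The cheap draft model proposes k tokens autoregressively.
    ctx, drafted, q_dists = list(prefix), [], []
    for _ in range(k):
        q = draft_probs(ctx)
        tok = int(rng.choice(len(q), p=q))
        drafted.append(tok)
        q_dists.append(q)
        ctx.append(tok)

    # 2. The target model scores every drafted position. A real system would
    #    do this in a single batched forward pass; here it is a plain loop.
    p_dists = [target_probs(list(prefix) + drafted[:i]) for i in range(k + 1)]

    # 3. Accept each drafted token with probability min(1, p/q).
    accepted = []
    for i, tok in enumerate(drafted):
        p, q = p_dists[i], q_dists[i]
        if rng.random() < min(1.0, p[tok] / q[tok]):
            accepted.append(tok)
        else:
            # First rejection: resample from the residual and stop.
            residual = np.maximum(p - q, 0.0)
            residual = residual / residual.sum()
            accepted.append(int(rng.choice(len(residual), p=residual)))
            return accepted

    # All k drafts accepted: take one extra token from the target for free.
    accepted.append(int(rng.choice(len(p_dists[k]), p=p_dists[k])))
    return accepted
```

Each call to `speculative_step` yields between 1 and k + 1 tokens, which is where the speedup would come from when the draft model agrees with the target often. Points an interviewer might probe further (KV caching, batching, dropout, padding masks) are left out of this sketch.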