Belgrade, Serbia



In this role, you will implement state-of-the-art ML models on Tenstorrent hardware using Python and C++, with a focus on improving both accuracy and inference speed. You will work hands-on with Tenstorrent’s open-source software stack (tt-metalium, tt-nn, tt-llk), taking models from framework to silicon and iterating on performance. You will own a well-defined engineering project under the guidance of a dedicated mentor, with direct impact on how real workloads run on our chips. This role requires a minimum commitment of 3 months, with the potential for extension to 6 months.