









This role is hybrid and based in Belgrade, Serbia.

Who You Are
* Final-year BSc or MSc student in Computer Science, Software Engineering, Electrical Engineering, or a related technical field
* Strong programming fundamentals in Python; familiarity with C++ is a plus
* Interested in backend systems, API design, and how ML models are deployed in production environments
* Curious about performance optimization techniques such as batching, caching, and model parallelism
* Motivated to learn and contribute in a collaborative engineering environment

What We Need
* Contribute to backend features and APIs that support AI inference workloads
* Assist in deploying, testing, and benchmarking models running on Tenstorrent hardware
* Analyze inference performance and help identify optimization opportunities
* Write clean, maintainable code with guidance from senior engineers
* Collaborate with the team to improve the reliability, usability, and performance of the inference server stack

What You Will Learn
* How end-to-end ML inference is optimized on custom AI hardware
* How scalable backend systems are designed to serve real-world AI applications
* How APIs and infrastructure shape the developer experience for AI workloads
* Practical performance analysis techniques in production-like environments
* How modern AI software stacks integrate models, runtimes, and hardware