Theofilos is an ML architect specializing in serving large language models, with a focus on scalability and performance optimization. He has a strong background in ML infrastructure and MLOps. As a maintainer of the KServe project and a contributor to Kubeflow, he is actively involved in advancing the field of model serving, and his experience with Kubernetes, GPU optimization, and open-source tooling informs his approach to hosting custom GPT-based models. Attend his talk for best practices and practical guidance on scaling and optimizing your language models.