Engineering and algorithmic interventions for multimodal post-training at Microsoft scale
Engineering at Microsoft
Aditya Challapally leads post-training research and infrastructure for Copilot agent capabilities that process millions of multimodal interactions. This post builds on the diagnostics from "Diagnosing instability in production-scale agent reinforcement learning" and describes the engineering and algorithmic interventions we developed to get the best results out of post-training at scale. Post-training multimodal agents at scale […]