VentureBeat

Baseten Launches Training Platform to Own Model Weights


Baseten, the San Francisco-based AI infrastructure company that recently reached a $2.15 billion valuation, is making a bold pivot from inference into training. The company launched Baseten Training, a fully managed platform that lets enterprises fine-tune open-source models without wrestling with GPU clusters, multi-node orchestration, or cloud capacity planning. According to CEO Amir Haghighat, the move responds to persistent customer demand for a lighter-weight solution that keeps the full AI deployment lifecycle in one place while giving users full control over their code, data and, most importantly, the model weights.

What sets Baseten apart is its multi-cloud GPU orchestration and sub-minute scheduling. The platform supports multi-node training on NVIDIA H100 or B200 GPUs, automatic checkpointing for node failures, and a proprietary Multi-Cloud Management system that dynamically provisions capacity across AWS, Azure, Google Cloud and other providers. This gives customers the flexibility of a hyperscaler without long-term contracts. Early adopters report 84% cost savings and 50% latency improvements on custom models. Companies such as Oxen AI and Parsed have leveraged the platform to run hundreds of training jobs, spin up HIPAA-compliant deployments in 48 hours, and cut end-to-end transcription latency in half.
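To make the checkpointing feature concrete: the general pattern behind "automatic checkpointing for node failures" is to persist training state after each step so a restarted job resumes where it left off rather than from scratch. The sketch below is a minimal, generic illustration of that pattern; the file name, state shape, and `train` loop are hypothetical and are not Baseten's actual API.

```python
import json
import os
import tempfile

# Hypothetical checkpoint location; a real system would write to durable
# network storage so a replacement node can read it.
CKPT = os.path.join(tempfile.gettempdir(), "train_ckpt.json")

def save_checkpoint(step, state):
    # Write to a temp file, then rename atomically, so a crash mid-write
    # can never leave a corrupted checkpoint behind.
    tmp = CKPT + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"step": step, "state": state}, f)
    os.replace(tmp, CKPT)

def load_checkpoint():
    # Resume from the last saved step, or start fresh if none exists.
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            ckpt = json.load(f)
        return ckpt["step"], ckpt["state"]
    return 0, {"loss": None}

def train(total_steps=10):
    step, state = load_checkpoint()  # picks up after a node failure
    while step < total_steps:
        # Stand-in for a real training step updating model state.
        state = {"loss": 1.0 / (step + 1)}
        step += 1
        save_checkpoint(step, state)
    return step, state

if os.path.exists(CKPT):
    os.remove(CKPT)  # start the demo from a clean slate
print(train(10))
```

If the process dies partway through, calling `train` again reloads the last checkpoint and completes only the remaining steps, which is the property that makes multi-node jobs survivable when individual nodes fail.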

Baseten’s strategy also blurs the line between training and inference. The company’s inference team uses Baseten Training to create draft models for speculative decoding, a cutting-edge technique that can double inference speed. The platform’s open-source ML Cookbook and support for reinforcement learning, supervised fine-tuning and other advanced methods mean Baseten can keep pace with the rapid evolution of open-source models. While the market is crowded with hyperscalers, GPU-focused providers and vertically integrated platforms, Baseten’s focus on developer experience, performance optimization and weight ownership gives it a defensible niche in the enterprise AI stack.
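The speculative-decoding idea mentioned above can be sketched in a few lines: a cheap draft model proposes several tokens at a time, and the expensive target model verifies the whole batch in one pass, keeping the matching prefix. The toy below uses made-up numeric "models" (a deterministic target and a sometimes-wrong draft) purely to show the accept/reject mechanics; it is not Baseten's implementation.

```python
import random

def target_model(tokens):
    # Hypothetical "large" model: next token is the sum of the last two, mod 10.
    return (tokens[-1] + tokens[-2]) % 10

def draft_model(tokens):
    # Hypothetical cheap draft: agrees with the target most of the time.
    guess = (tokens[-1] + tokens[-2]) % 10
    return guess if random.random() < 0.8 else (guess + 1) % 10

def speculative_decode(prompt, n_tokens, k=4):
    """Generate n_tokens: the draft proposes k tokens per round, the target
    verifies them together and keeps the prefix it agrees with."""
    tokens = list(prompt)
    target_calls = 0
    while len(tokens) - len(prompt) < n_tokens:
        # Draft proposes k tokens autoregressively (cheap).
        proposal = list(tokens)
        for _ in range(k):
            proposal.append(draft_model(proposal))
        # Target verifies all k proposals (one batched pass in practice).
        target_calls += 1
        accepted = list(tokens)
        for t in proposal[len(tokens):]:
            expected = target_model(accepted)
            if t == expected:
                accepted.append(t)
            else:
                accepted.append(expected)  # fix the first mismatch, stop
                break
        tokens = accepted
    return tokens[:len(prompt) + n_tokens], target_calls

random.seed(0)
out, calls = speculative_decode([1, 2], 12, k=4)
print(out, calls)  # typically far fewer target passes than 12 sequential calls
```

Because every accepted token either matched the target's prediction or was replaced by it, the output is identical to what the target model would produce alone; the speedup comes from verifying several tokens per expensive pass, which is why a good draft model can roughly double inference speed.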

Key takeaway: Baseten’s strategy of combining low-level, multi-cloud training infrastructure with full weight ownership positions it as a compelling alternative to hyperscalers for enterprises seeking to customize and deploy open-source AI models at scale.

