preview 1.2
the ai-native compute platform
run serverless or dedicated. deploy ai agents, inference endpoints, and data pipelines on infrastructure that adapts to your workload.
capabilities
ai-first, full-stack
everything you need to build, deploy, and operate ai applications at production scale.
ai inference
deploy models to gpu-backed endpoints. auto-scaling from zero to thousands of concurrent requests. built-in support for vllm, tgi, and custom runtimes.
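a scale-to-zero policy like the one described can be sketched in a few lines. everything here is illustrative — the thresholds and function names are assumptions, not the platform's actual api:

```python
import math

def desired_replicas(concurrent_requests: int,
                     requests_per_replica: int = 32,
                     max_replicas: int = 1000) -> int:
    """Illustrative scale-to-zero policy: size the replica count to
    current concurrency, clamped to an upper bound."""
    if concurrent_requests == 0:
        return 0  # no traffic, no replicas
    return min(max_replicas,
               math.ceil(concurrent_requests / requests_per_replica))

# idle endpoints cost nothing; a burst of 100 concurrent requests
# brings up ceil(100 / 32) = 4 replicas
```

the same policy covers both ends of the range: zero replicas when idle, and a hard cap so a traffic spike can't scale without bound.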
adaptive compute
run serverless for bursty workloads or reserve dedicated instances for sustained throughput. switch modes per-service, not per-account.
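the serverless-vs-dedicated choice comes down to a break-even point: per-request billing wins at low sustained load, hourly billing wins at high load. the prices below are made-up placeholders, not real rates:

```python
def cheaper_mode(avg_requests_per_hour: float,
                 serverless_cost_per_request: float = 0.0005,
                 dedicated_cost_per_hour: float = 0.90) -> str:
    """Illustrative break-even: serverless bills per request,
    dedicated bills per hour regardless of load. Placeholder prices."""
    serverless_hourly = avg_requests_per_hour * serverless_cost_per_request
    return "serverless" if serverless_hourly < dedicated_cost_per_hour else "dedicated"

# at these placeholder rates the crossover sits at 1,800 requests/hour:
# bursty services below it stay serverless, sustained ones go dedicated
```

because the mode is per-service, each service can sit on its own side of the crossover instead of the whole account picking one.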
microservices
deploy independent services with isolated runtimes. automatic discovery, load balancing, and zero-downtime rollouts baked in.
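a zero-downtime rollout is usually a rolling update: replace replicas in small batches so most of the fleet keeps serving traffic at every step. a toy simulation of that invariant (the real rollout would drain connections and wait on health checks, elided here):

```python
def rolling_update(replicas, new_version, max_unavailable=1):
    """Illustrative rolling update: swap replicas to new_version in
    batches, never taking more than max_unavailable out of service
    at once. Yields the fleet state after each batch."""
    updated = list(replicas)
    for i in range(0, len(updated), max_unavailable):
        # in a real rollout: drain the batch, start the new version,
        # and pass health checks before moving on
        for j in range(i, min(i + max_unavailable, len(updated))):
            updated[j] = new_version
        yield list(updated)

steps = list(rolling_update(["v1", "v1", "v1"], "v2"))
# intermediate states keep at least two replicas serving:
# ["v2","v1","v1"] → ["v2","v2","v1"] → ["v2","v2","v2"]
```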
data layer
managed postgres, redis, and object storage. serverless or provisioned. branching for dev, replicas for prod.
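database branching behaves like copy-on-write: a dev branch reads through to its parent but keeps writes local, so experiments never touch prod data. a toy key-value version of the semantics (not the managed service's actual mechanism):

```python
class Branch:
    """Toy copy-on-write branch: reads fall through to the parent,
    writes stay local to the branch."""
    def __init__(self, parent=None):
        self.parent = parent
        self.data = {}

    def get(self, key):
        if key in self.data:
            return self.data[key]
        return self.parent.get(key) if self.parent else None

    def set(self, key, value):
        self.data[key] = value

main = Branch()
main.set("plan", "pro")
dev = Branch(parent=main)  # branch for dev
dev.set("plan", "trial")   # write is local to the branch
# dev sees "trial"; main still sees "pro"
```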
agent orchestration
chain models, tools, and data sources into autonomous agents. built-in memory, tool calling, and human-in-the-loop controls.
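the agent loop described — model proposes a step, a tool runs it, the result goes back into memory, with a human gate in between — can be sketched as below. `model`, `tools`, and `approve` are stand-ins, not a real sdk:

```python
def run_agent(task, model, tools, approve):
    """Minimal agent loop sketch: the model proposes a tool call,
    a human-in-the-loop gate approves it, and the observation is
    appended to memory until the model signals it is done."""
    memory = [task]
    while True:
        action = model(memory)  # e.g. {"tool": ..., "arg": ...} or {"done": ...}
        if "done" in action:
            return action["done"]
        if not approve(action):  # human-in-the-loop control
            memory.append({"error": "rejected by reviewer"})
            continue
        result = tools[action["tool"]](action["arg"])
        memory.append(result)    # built-in memory: accumulate observations

def stub_model(memory):
    # hypothetical model: propose one tool call, then finish with its result
    return {"tool": "double", "arg": 21} if len(memory) == 1 else {"done": memory[-1]}

answer = run_agent("demo task", stub_model,
                   tools={"double": lambda x: x * 2},
                   approve=lambda action: True)
```

swapping `approve` for a function that pauses and asks a reviewer is all it takes to put a human in the loop without touching the rest of the chain.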
observability
traces, metrics, and logs across every service and model call. cost attribution per-request. no third-party sdk required.
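per-request cost attribution amounts to rolling every traced span up under its request id. the span fields below are assumptions for illustration, not the platform's actual trace schema:

```python
from collections import defaultdict

def attribute_costs(spans):
    """Illustrative per-request cost attribution: sum the cost of
    every traced service/model call under its request id."""
    costs = defaultdict(float)
    for span in spans:
        costs[span["request_id"]] += span["cost_usd"]
    return dict(costs)

spans = [
    {"request_id": "r1", "service": "api", "cost_usd": 0.0001},
    {"request_id": "r1", "service": "llm", "cost_usd": 0.0042},
    {"request_id": "r2", "service": "api", "cost_usd": 0.0001},
]
per_request = attribute_costs(spans)
# r1 rolls up the api hop plus the model call; r2 only the api hop
```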