Replicate
Run any open-source model via API with pay-per-second billing. The catalog lists 50K+ models, including LLMs, image generation (FLUX, Stable Diffusion), video (Wan 2.2, Kling 2.6 Pro with audio), speech (Whisper), and community fine-tunes. Custom models are deployed via Cog containers. Predictions can carry deadlines for automatic cancellation, and webhooks are signed for security. Replicate is being acquired by Cloudflare. Pricing varies by model and hardware, typically $0.0001-0.005 per second of compute.
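To make the per-second pricing concrete, here is a minimal sketch that turns the quoted range into a rough cost bracket. The function name `estimate_cost` and the example workload (1,000 runs of 8 seconds each) are illustrative assumptions, not figures from Replicate; actual rates depend on the model and hardware chosen.

```python
from decimal import Decimal

# Per-second price range quoted in this listing (USD).
# Assumption: real rates vary per model and hardware tier.
LOW_RATE = Decimal("0.0001")
HIGH_RATE = Decimal("0.005")


def estimate_cost(seconds, runs=1):
    """Return (low, high) USD estimates for `runs` predictions of `seconds` each."""
    billed_seconds = Decimal(str(seconds)) * runs
    return billed_seconds * LOW_RATE, billed_seconds * HIGH_RATE


# Hypothetical workload: 1,000 predictions at 8 seconds of compute each.
low, high = estimate_cost(seconds=8, runs=1000)
print(f"${low:.2f} to ${high:.2f}")
```

The spread between the two bounds is a factor of 50, so the estimate is only useful once the specific model's rate is known.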
Overview
| Category | AI Image |
| Compliance | SOC2 |
| Self-Hostable | No |
| On-Prem | No |
| Best For | hobby, startup, growth |
| Last Verified | 2026-02-12 |
Strengths & Weaknesses
Strengths:
- Developer experience (DX)
- Cost
- Customization
Weaknesses:
- Cold starts can be 10-30 seconds for infrequently used models
- Less optimized LLM inference than specialized providers (Groq, Cerebras)
- Being acquired by Cloudflare, so the platform's future may change
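Cold starts of 10-30 seconds mean a client should wait with an explicit time budget rather than assume an immediate result. The sketch below is a generic polling helper, not Replicate's client library: `wait_for_prediction`, the `get_status` callable, and the status strings are all assumptions chosen for illustration, and a real integration would fetch the status through the provider's API.

```python
import time


def wait_for_prediction(get_status, deadline_s=60.0, interval_s=2.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until a terminal state or until deadline_s elapses.

    get_status is a hypothetical callable returning a status string.
    The terminal state names below are assumptions, not a documented contract.
    """
    terminal = {"succeeded", "failed", "canceled"}
    start = clock()
    while True:
        status = get_status()
        if status in terminal:
            return status
        if clock() - start >= deadline_s:
            return "timed_out"
        sleep(interval_s)


# Simulated cold start: two "starting" polls before the model begins work.
fake_states = iter(["starting", "starting", "processing", "succeeded"])
result = wait_for_prediction(lambda: next(fake_states), sleep=lambda _: None)
print(result)
```

Setting `deadline_s` comfortably above the 30-second upper bound leaves room for both the cold start and the prediction itself.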
When to Use
Best when:
- Need to run niche or community fine-tuned models
- Want to deploy custom models without managing GPUs
- Image/video/audio generation with 50K+ model options
- Prototyping with many different models
Avoid when:
- Need lowest latency for LLM inference
- Production workloads requiring consistent performance
- Enterprise compliance requirements
Known Issues (1)
- [low] `require function is used in a way in which dependencies cannot be statically extracted`