Cerebrium

Cerebrium is a B2B analytics company with offices in New York and London and a fully remote-friendly team. It specializes in GPU compute, automated infrastructure setup, and detailed cost management for machine learning model deployment.

Company Locations

Cerebrium operates out of New York, NY, USA, and London, England, United Kingdom. Additionally, the company supports remote work, making it accessible to a global team and client base.

Industry and Sub-industry

Cerebrium operates in the B2B sector, with a focus on analytics. It sells to other businesses, offering GPU compute and automated infrastructure setup for their machine learning workloads.

Services Offered

Cerebrium offers a choice of GPU types, automated infrastructure environments, volume storage, hot reload on GPU containers, and real-time streaming endpoints. The platform also provides detailed cost breakdowns, real-time logging, and performance profiling. Users can deploy models on their own infrastructure and apply their AWS or GCP credits.

Pricing Plans

Cerebrium offers three pricing plans: Hobby, Standard, and Enterprise, with custom options for large organizations. GPU compute is billed per second at a rate that depends on the GPU model and its VRAM. CPU compute is priced per core per second, and memory per GB per second. Storage costs $0.30 per GB per month. New users receive a $10 credit, and no credit card is required to claim it.
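The billing model described above can be sketched as a small calculation. This is only an illustration of how the per-second charges might add up: the storage price and the $10 credit come from the description above, while the GPU, CPU, and memory rates, the `Rates` class, and both function names are hypothetical placeholders rather than Cerebrium's published prices or API.

```python
from dataclasses import dataclass

STORAGE_USD_PER_GB_MONTH = 0.30  # storage price stated in the profile
NEW_USER_CREDIT_USD = 10.00      # new-user credit stated in the profile


@dataclass
class Rates:
    """Per-second rates in USD. Values passed in are assumptions, not real prices."""
    gpu_per_second: float
    cpu_per_core_second: float
    memory_per_gb_second: float


def estimate_monthly_cost(rates, seconds, cpu_cores, memory_gb, storage_gb):
    """Sum per-second compute charges and one month of storage."""
    per_second = (
        rates.gpu_per_second
        + cpu_cores * rates.cpu_per_core_second
        + memory_gb * rates.memory_per_gb_second
    )
    compute = seconds * per_second
    storage = storage_gb * STORAGE_USD_PER_GB_MONTH
    return compute + storage


def apply_credit(cost, credit=NEW_USER_CREDIT_USD):
    """Subtract the credit, never going below zero."""
    return max(0.0, cost - credit)


if __name__ == "__main__":
    # Hypothetical rates chosen only to make the arithmetic easy to follow.
    example_rates = Rates(
        gpu_per_second=0.0005,
        cpu_per_core_second=0.00001,
        memory_per_gb_second=0.000005,
    )
    # One hour of runtime, 2 CPU cores, 8 GB of memory, 10 GB of storage.
    cost = estimate_monthly_cost(example_rates, 3600, 2, 8, 10)
    print(f"Estimated cost: ${cost:.2f}")
    print(f"After new-user credit: ${apply_credit(cost):.2f}")
```

Because every compute charge is per second, the estimate scales linearly with runtime, and storage is the only component that accrues while nothing is running.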

Compliance and Reliability

Cerebrium is SOC 2 compliant, which sets the standard for how it manages customer data. Its architecture is distributed across three regions to minimize downtime, maintaining 99.99% uptime with a request failure rate below 0.01%.
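To put the reliability figures in concrete terms, the short sketch below converts an uptime percentage into a downtime budget and a failure-rate bound into a maximum number of failed requests. The function names and the request volume in the example are illustrative assumptions, and the calculation assumes the percentages apply uniformly over the period.

```python
def downtime_budget_minutes(uptime_percent, period_days):
    """Minutes of downtime allowed over a period at the given uptime."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


def max_failed_requests(total_requests, failure_rate_percent):
    """Upper bound on failed requests at the given failure rate."""
    return total_requests * failure_rate_percent / 100


if __name__ == "__main__":
    # 99.99% uptime over a year allows about 52.56 minutes of downtime.
    print(round(downtime_budget_minutes(99.99, 365), 2))
    # Over a 30-day month the budget is about 4.32 minutes.
    print(round(downtime_budget_minutes(99.99, 30), 2))
    # At a 0.01% bound, 1,000,000 requests (an illustrative volume)
    # would see roughly 100 failures at most.
    print(round(max_failed_requests(1_000_000, 0.01)))
```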

Use Cases and Clientele

Cerebrium is used by companies and engineers from Twilio, RudderStack, and Matterport, among others. The platform supports applications such as transcribing podcasts, streaming model outputs, and generating logos. Clients can scale models to more than 10,000 requests per minute with minimal engineering effort.