Cerebrium
Cerebrium is a B2B analytics company with offices in New York and London and a fully remote option. It specializes in GPU compute, automated infrastructure setup, and detailed cost management for machine learning model deployment.
Company Locations
Cerebrium operates out of New York, NY, USA, and London, England, United Kingdom. The company also supports remote work, so it can hire and serve clients globally.
Industry and Sub-industry
Cerebrium operates in the B2B sector, with analytics as its sub-industry. It sells to other businesses, offering GPU compute and automated infrastructure setup.
Services Offered
Cerebrium offers a choice of GPU types, automated infrastructure environments, volume storage, hot reload on GPU containers, and real-time streaming endpoints. The platform also provides detailed cost breakdowns, real-time logging, and performance profiling. Users can deploy models on their own infrastructure and apply AWS or GCP credits.
Pricing Plans
Cerebrium offers three pricing plans: Hobby, Standard, and Enterprise, with custom options for large organizations. GPU compute is billed per second, at a rate that depends on the GPU model and its VRAM. CPU compute is priced per core per second, and memory per GB per second. Storage costs $0.30 per GB per month. New users receive a $10 credit, and no credit card is required to claim it.
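The per-second billing model described above can be illustrated with a small cost estimate. This is a minimal sketch, not Cerebrium's billing logic: the only figure taken from the pricing description is the $0.30 per GB per month storage rate. The GPU, CPU, and memory rates below are made-up placeholders, and the names `Rates`, `compute_cost`, and `storage_cost` are hypothetical helpers introduced for illustration.

```python
from dataclasses import dataclass

# Storage rate taken from the pricing description: $0.30 per GB per month.
STORAGE_USD_PER_GB_MONTH = 0.30


@dataclass
class Rates:
    """Per-second rates in USD. Values must come from the real price list."""
    gpu_per_second: float        # depends on GPU model and VRAM
    cpu_per_core_second: float   # per CPU core
    memory_per_gb_second: float  # per GB of RAM


def compute_cost(rates: Rates, seconds: float, cpu_cores: float,
                 memory_gb: float, gpus: int = 1) -> float:
    """Sum the GPU, CPU, and memory charges for one run of `seconds`."""
    gpu = rates.gpu_per_second * gpus * seconds
    cpu = rates.cpu_per_core_second * cpu_cores * seconds
    memory = rates.memory_per_gb_second * memory_gb * seconds
    return gpu + cpu + memory


def storage_cost(gb: float, months: float = 1.0) -> float:
    """Storage charge for `gb` gigabytes held for `months` months."""
    return STORAGE_USD_PER_GB_MONTH * gb * months


# ASSUMPTION: placeholder rates for illustration only, not real prices.
example_rates = Rates(gpu_per_second=0.0005,
                      cpu_per_core_second=0.00001,
                      memory_per_gb_second=0.000005)

# One hour on one GPU with 2 CPU cores and 8 GB of RAM.
hourly = compute_cost(example_rates, seconds=3600, cpu_cores=2, memory_gb=8)
monthly_storage = storage_cost(gb=50)
print(f"compute: ${hourly:.3f}, storage: ${monthly_storage:.2f}")
```

Because every compute charge scales linearly with seconds, an idle deployment that scales to zero accrues only the storage charge under this model.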
Compliance and Reliability
Cerebrium is SOC 2 compliant in its handling of customer data. Its architecture is distributed across three regions to minimize downtime, and the company reports 99.99% uptime with a request failure rate below 0.01%.
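To put the 99.99% uptime figure in concrete terms, it can be converted into a downtime budget. The sketch below assumes the percentage is measured as availability over a calendar period, which the description does not specify. The function name `downtime_budget_minutes` is a hypothetical helper.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(uptime_percent: float, days: float) -> float:
    """Maximum minutes of downtime allowed over `days` at the given uptime."""
    unavailable_fraction = 1 - uptime_percent / 100
    return unavailable_fraction * days * MINUTES_PER_DAY


# At 99.99% uptime: about 52.56 minutes per 365-day year,
# or about 4.32 minutes per 30-day month.
per_year = downtime_budget_minutes(99.99, days=365)
per_month = downtime_budget_minutes(99.99, days=30)
print(f"{per_year:.2f} min/year, {per_month:.2f} min/month")
```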
Use Cases and Clientele
Cerebrium is used by companies and engineers at Twilio, Rudderstack, and Matterport, among others. The platform supports applications such as transcribing podcasts, streaming model outputs, and generating logos. Clients can scale models to more than 10,000 requests per minute with minimal engineering effort.