What is MaaS
Model-as-a-Service is the layer that sits between your code and the GPU clouds running your models. One client. Any provider.
AI models live in many places: HuggingFace, Replicate, vendor APIs, and your own GPU cluster. Each has its own SDK, authentication scheme, and billing model. If you run models on more than one provider, you carry the cost of that fragmentation: multiple auth flows, multiple retry patterns, multiple billing dashboards, and a separate body of provider-specific client code for each. MaaS hides that fragmentation behind one client, so your application code does not care where the inference actually runs.
- Single SDK across providers — the same Python and JS client works against the supported providers. Today: RunPod EU. Scaleway and Azure are coming soon. Switching providers is a config change, not a rewrite.
- Multi-cloud routing — choose the provider and region per deployment from one client. Today RunPod EU is the live target; Scaleway and Azure expand this once they come online.
- Active-only billing — you pay only for the time a job spends actively running (PROCESSING state). Cold starts, queue time, idle time, and failed attempts are not charged. See Billing for the full breakdown.
- EU residency by default — new projects default to EU regions. You opt in to non-EU placement, not the other way around.
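The active-only billing rule above can be illustrated with a small calculation. This is a minimal sketch, not the Socaity billing implementation: only the PROCESSING state name comes from the text, while the other state names (QUEUED, COLD_START, IDLE), the StateInterval type, and the billable_seconds helper are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass

# PROCESSING is the only billable state named in the text.
BILLABLE_STATES = {"PROCESSING"}


@dataclass(frozen=True)
class StateInterval:
    """One stretch of time a job spent in a single state (hypothetical type)."""
    state: str                    # e.g. "QUEUED", "PROCESSING" (names other than PROCESSING are assumed)
    seconds: float                # duration of the interval
    attempt_failed: bool = False  # True if this interval belongs to a failed attempt


def billable_seconds(intervals):
    """Sum only the time spent in PROCESSING during attempts that did not fail."""
    return sum(
        i.seconds
        for i in intervals
        if i.state in BILLABLE_STATES and not i.attempt_failed
    )


job = [
    StateInterval("QUEUED", 12.0),                          # queue time: not charged
    StateInterval("COLD_START", 30.0),                      # cold start: not charged
    StateInterval("PROCESSING", 4.5, attempt_failed=True),  # failed attempt: not charged
    StateInterval("PROCESSING", 8.0),                       # successful run: charged
    StateInterval("IDLE", 60.0),                            # idle time: not charged
]

print(billable_seconds(job))  # 8.0
```

Of the 114.5 seconds of wall-clock time in this example, only the 8 seconds of successful processing count toward the bill. See Billing for how the real charges are computed.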
Your code talks to one endpoint. Socaity MaaS routes the request to the provider and region you configured in apipod.json. The routing decision happens at the gateway — your service code never changes.
The provider column is the deploy target you set in apipod.json. MaaS does not change the destination at runtime — it enforces your config.
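To make the "enforce, do not re-route" behavior concrete, here is a rough sketch of how a gateway could resolve a deploy target from configuration. It is an illustration under assumptions, not the real gateway or the real apipod.json schema: the keys provider, region, and allow_non_eu, the region string "eu", and the resolve_target function are all hypothetical. The provider list reflects the text (RunPod EU live; Scaleway and Azure not yet available).

```python
import json

# Per the text: RunPod EU is the only live target today.
LIVE_TARGETS = {("runpod", "eu")}
# Announced but not yet available, per the text.
PLANNED_PROVIDERS = {"scaleway", "azure"}


def resolve_target(config_text: str) -> tuple[str, str]:
    """Return the (provider, region) named in the config, or raise ValueError.

    The config keys used here are assumptions, not the real apipod.json schema.
    """
    config = json.loads(config_text)
    provider = config["provider"].lower()
    # EU residency by default: a missing region falls back to an EU region.
    region = config.get("region", "eu").lower()

    # Non-EU placement requires an explicit opt-in (hypothetical flag).
    if not region.startswith("eu") and not config.get("allow_non_eu", False):
        raise ValueError(f"region {region!r} is outside the EU and allow_non_eu is not set")

    if provider in PLANNED_PROVIDERS:
        raise ValueError(f"provider {provider!r} is not live yet")
    if (provider, region) not in LIVE_TARGETS:
        raise ValueError(f"unsupported target {provider!r}/{region!r}")

    # The configured target is returned unchanged: no runtime re-routing.
    return provider, region


print(resolve_target('{"provider": "runpod"}'))  # ('runpod', 'eu')
```

The function never substitutes a different provider when the configured one is unavailable. It either returns exactly what the config names or fails loudly, which mirrors the statement that MaaS enforces your config rather than changing the destination at runtime.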
- Not a fine-tuning platform — MaaS runs inference. For fine-tuning, use Replicate or HuggingFace AutoTrain directly and deploy the resulting checkpoint via APIPod.
- Not a model registry — the Socaity catalog is a curated set of hosted models. The authoritative source for model weights, tokenizers, and model cards is HuggingFace Hub.
- Not a marketplace in the commercial sense — models in the catalog are not listed by third parties for sale. They are deployed by Socaity and billed at pass-through GPU cost plus a routing fee. There is no model storefront.
- Multi-provider choice — you want to be able to switch providers without rewriting client code.
- EU residency requirement — you need data to stay in the EEA and want that enforced at the routing layer, not managed in application code.
- Active-only billing — you run bursty or unpredictable workloads and want zero cost when idle.
- Single SDK across model families — you call image generation, transcription, and text models from the same codebase and do not want three separate client libraries.