Wrap Your Own Model
Take any PyTorch (or pure Python) model, wrap it with APIPod, and deploy it as a serverless GPU API without writing a single line of server code.
APIPod is the packaging and deployment CLI. Install it separately from the consumer SDK.
pip install apipod

Create a Python file that defines your model logic. Decorate the function you want to expose with @app.endpoint. APIPod reads this decorator to generate the HTTP API, handle serialisation, and schedule GPU allocation.
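To make the decorator mechanism more concrete, here is a toy sketch of how a decorator can register a function under a path and derive a request spec from its type hints. `ToyApp`, `describe`, and `dispatch` are hypothetical names invented for illustration; this is not APIPod's implementation and says nothing about its internals.

```python
import inspect
from typing import Any, Callable, Dict


class ToyApp:
    """Hypothetical stand-in for a decorator-based app (not APIPod's class)."""

    def __init__(self) -> None:
        self.routes: Dict[str, Callable[..., Any]] = {}

    def endpoint(self, path: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
        """Return a decorator that records the function under the given path."""
        def register(func: Callable[..., Any]) -> Callable[..., Any]:
            self.routes[path] = func
            return func  # the function itself is left unchanged
        return register

    def describe(self, path: str) -> Dict[str, Dict[str, Any]]:
        """Build a simple parameter spec from the function signature."""
        spec: Dict[str, Dict[str, Any]] = {}
        for name, param in inspect.signature(self.routes[path]).parameters.items():
            required = param.default is inspect.Parameter.empty
            spec[name] = {
                "type": getattr(param.annotation, "__name__", str(param.annotation)),
                "required": required,
            }
            if not required:
                spec[name]["default"] = param.default
        return spec

    def dispatch(self, path: str, payload: Dict[str, Any]) -> Any:
        """Bind a decoded JSON body to the parameters and call the function."""
        func = self.routes[path]
        bound = inspect.signature(func).bind(**payload)  # TypeError on bad keys
        bound.apply_defaults()
        return func(*bound.args, **bound.kwargs)


toy = ToyApp()


@toy.endpoint("/generate")
def toy_generate(prompt: str, steps: int = 20) -> dict:
    # Placeholder logic standing in for a real model call.
    return {"output": prompt.upper(), "steps": steps}


print(toy.describe("/generate"))
print(toy.dispatch("/generate", {"prompt": "hello"}))
```

A real framework layers HTTP handling, validation, and scheduling on top of this idea. The service file that follows uses the actual APIPod decorator.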
# service.py
import os
import torch
from apipod import APIPod
app = APIPod()
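# Load your model once at startup (not inside the function).
# PyTorch 2.6+ defaults to weights_only=True, which rejects fully pickled
# modules; pass weights_only=False only for checkpoints you trust.
model = torch.load("my_model.pt", weights_only=False).eval()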
@app.endpoint("/generate")
def generate(prompt: str, steps: int = 20) -> dict:
    """Generate output from a text prompt."""
    with torch.no_grad():
        output = model.run(prompt, num_steps=steps)
    return {"output": output.to_base64(), "steps": steps}

The gpu parameter tells APIPod which GPU class to request at runtime. Supported values: "T4", "A10G", "A100", "H100". Omit it to use the platform default.

Run the service on your machine to verify it works before deploying. APIPod spins up a local HTTP server and hot-reloads on file changes.
apipod --start

Test it with curl while the server is running:
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "hello world", "steps": 10}'

apipod --build packages your code, dependencies, and model weights into a container image. It auto-detects your requirements.txt or pyproject.toml and resolves GPU-compatible CUDA versions.
apipod --build service.py

Push the built image and register it as a serverless endpoint. The platform handles auto-scaling, cold-start optimisation, and pay-per-call billing.
# Live today: uses the apipod CLI
apipod --build
# Coming soon (the unified socaity CLI will mirror this):
# socaity deploy --serverless --provider runpod

Once deployed, you will receive a permanent endpoint URL:
✓ Image pushed to registry
✓ Endpoint registered
✓ Live at: https://api.socaity.ai/endpoints/my-model-v1/generate

Any deployed APIPod service is immediately callable via HTTP. Use requests or any HTTP client. The endpoint URL is printed after a successful deploy.
import os
import requests
# Call your deployed APIPod endpoint directly via HTTP
endpoint_url = "https://api.socaity.ai/endpoints/my-model-v1/generate"
headers = {"Authorization": f"Bearer {os.getenv('SOCAITY_API_KEY')}"}
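resp = requests.post(
    endpoint_url,
    json={"prompt": "a glowing crystal", "steps": 30},
    headers=headers,
    timeout=30,  # avoid hanging indefinitely on a stalled connection
)
resp.raise_for_status()  # surface HTTP errors (e.g. 401) before parsing
job = resp.json()  # {"job_id": "...", ...}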
# Poll for the result (or use the Socaity SDK job polling helpers)
print(job)

A minimal APIPod project needs only two files, the service and a requirements file:
my-model/
├── service.py         # your APIPod service
├── requirements.txt   # Python dependencies
├── my_model.pt        # model weights (or download at runtime)
└── apipod.json        # optional advanced config

For advanced configuration, drop an apipod.json in your project root:
{
  "project": {
    "name": "my-model",
    "version": "1.0.0"
  },
  "build": {
    "base_image": ""
  },
  "runtime": {
    "gpu": "A10G",
    "memory_gb": 24
  },
  "queue": {
    "max_workers": 1,
    "queue_size": 500,
    "job_ttl": 3600
  }
}

- Installed APIPod and wrote a @app.endpoint service
- Tested locally with apipod --start
- Built a GPU-optimised container image with apipod --build
- Deployed serverless to RunPod EU with apipod --build (the unified socaity deploy CLI is coming soon)
- Called the live endpoint via HTTP using requests
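The HTTP example earlier stops at printing the job handle and leaves polling to the reader. A minimal polling loop might look like the sketch below. It takes any callable that returns the current job state, so it makes no claim about the platform's real status route or response schema: the `"status"` key and the `"finished"` / `"failed"` values are assumptions, and `poll_until_done` is a hypothetical helper, not part of APIPod or the Socaity SDK.

```python
import time
from typing import Any, Callable, Dict, Tuple


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    done_states: Tuple[str, ...] = ("finished", "failed"),  # assumed values
) -> Dict[str, Any]:
    """Call fetch_status repeatedly until it reports a terminal state."""
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        # Assumption: the job state lives under a "status" key.
        if status.get("status") in done_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job not finished after {timeout_s} seconds")
        time.sleep(interval_s)


# Offline demonstration with a fake status source instead of a real endpoint.
fake_states = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "finished", "result": "done"},
])
final = poll_until_done(lambda: next(fake_states), interval_s=0.0)
print(final)
```

Against a live deployment, `fetch_status` could wrap a `requests.get` call to whatever status URL the platform returns for the job. Check the platform documentation for the real route and field names, or use the Socaity SDK polling helpers instead.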