Socaity Docs

Wrap Your Own Model

Intermediate
15 min

Take any PyTorch (or pure Python) model, wrap it with APIPod, and deploy it as a serverless GPU API without writing a single line of server code.

Step 1: Install APIPod

APIPod is the packaging and deployment CLI. Install it separately from the consumer SDK.

terminal
pip install apipod

Step 2: Write Your Service File

Create a Python file that defines your model logic. Decorate the function you want to expose with @app.endpoint. APIPod reads this decorator to generate the HTTP API, handle serialisation, and schedule GPU allocation.

python
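# service.py
import torch
from apipod import APIPod

app = APIPod()

# Load the model once at import time so every request reuses it.
# Recent PyTorch releases default to weights_only=True, which rejects a fully
# pickled model object; pass weights_only=False only for files you trust.
model = torch.load("my_model.pt", weights_only=False).eval()

@app.endpoint("/generate")
def generate(prompt: str, steps: int = 20) -> dict:
    """Generate output from a text prompt."""
    # model.run() and .to_base64() stand in for your own model's interface.
    with torch.no_grad():
        output = model.run(prompt, num_steps=steps)
    return {"output": output.to_base64(), "steps": steps}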
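
Because the endpoint body is an ordinary Python function, its request and response shape can be checked without a GPU or real weights. The following is a minimal sketch under stated assumptions, not APIPod code: `FakeModel`, `FakeOutput`, and `generate_with` are hypothetical names that mimic only the `run(...)` and `to_base64()` calls used in the service file.

```python
import base64


class FakeOutput:
    """Hypothetical stand-in for whatever your model returns."""

    def __init__(self, payload: bytes):
        self.payload = payload

    def to_base64(self) -> str:
        # Encode raw bytes as an ASCII-safe string for JSON transport.
        return base64.b64encode(self.payload).decode("ascii")


class FakeModel:
    """Hypothetical stand-in exposing the same run() call as the service."""

    def run(self, prompt: str, num_steps: int) -> FakeOutput:
        return FakeOutput(f"{prompt}:{num_steps}".encode("utf-8"))


def generate_with(model, prompt: str, steps: int = 20) -> dict:
    """Same body as the endpoint, with the model passed in explicitly."""
    output = model.run(prompt, num_steps=steps)
    return {"output": output.to_base64(), "steps": steps}


if __name__ == "__main__":
    print(generate_with(FakeModel(), "hello world", steps=10))
```

Passing the model in as an argument is one possible design choice: it keeps the function easy to exercise in a unit test, while the real service can still use the module-level model.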

Step 3: Serve Locally

Run the service on your machine to verify it works before deploying. APIPod spins up a local HTTP server and hot-reloads on file changes.

terminal
apipod --start

Test it with curl while the server is running:

terminal
curl -X POST http://localhost:8000/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "hello world", "steps": 10}'
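
The same request can be sent from Python, which is handy for a quick smoke test script. This sketch uses only the standard library and assumes the local server listens on `http://localhost:8000` as in the curl command; `build_generate_request` and `post_generate` are hypothetical helper names.

```python
import json
import urllib.request


def build_generate_request(prompt: str, steps: int = 20,
                           base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build the same POST request as the curl command."""
    body = json.dumps({"prompt": prompt, "steps": steps}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_generate(prompt: str, steps: int = 20) -> dict:
    """Send the request; requires the server from `apipod --start` to be running."""
    req = build_generate_request(prompt, steps)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `post_generate("hello world", steps=10)` should return the parsed JSON body of the response.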

Step 4: Build a Container Image

apipod --build packages your code, dependencies, and model weights into a container image. It auto-detects your requirements.txt or pyproject.toml and resolves GPU-compatible CUDA versions.

terminal
apipod --build service.py

Step 5: Deploy to Socaity Cloud

Push the built image and register it as a serverless endpoint. The platform handles auto-scaling, cold-start optimisation, and pay-per-call billing.

terminal
# Live today: uses the apipod CLI
apipod --build

# Coming soon: the unified socaity CLI will mirror this:
# socaity deploy --serverless --provider runpod

Once deployed, you will receive a permanent endpoint URL:

terminal
✓ Image pushed to registry
✓ Endpoint registered
✓ Live at: https://api.socaity.ai/endpoints/my-model-v1/generate

Step 6: Call Your Deployed Endpoint

Any deployed APIPod service is immediately callable via HTTP. Use requests or any HTTP client. The endpoint URL is printed after a successful deploy.

python
import os
import requests

# Call your deployed APIPod endpoint directly via HTTP
endpoint_url = "https://api.socaity.ai/endpoints/my-model-v1/generate"
# os.environ[...] raises KeyError if the key is unset instead of sending "Bearer None"
headers = {"Authorization": f"Bearer {os.environ['SOCAITY_API_KEY']}"}

resp = requests.post(
    endpoint_url,
    json={"prompt": "a glowing crystal", "steps": 30},
    headers=headers,
    timeout=30,
)
resp.raise_for_status()  # surface HTTP errors before parsing the body
job = resp.json()   # {"job_id": "...", ...}

# Poll for the result (or use the Socaity SDK job polling helpers)
print(job)
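
The call returns a job object rather than the final result, so the client has to poll. The sketch below shows one generic polling loop. It rests on assumptions this guide does not confirm: the status payload is assumed to carry a `"status"` field with terminal values `"finished"` or `"failed"`, and `fetch_status` is a hypothetical callable you would write to query the job by its `job_id`.

```python
import time
from typing import Callable


def poll_until_done(fetch_status: Callable[[], dict],
                    interval_s: float = 2.0,
                    timeout_s: float = 300.0) -> dict:
    """Call fetch_status() repeatedly until the job reports a terminal state.

    ASSUMPTION: the payload has a "status" field whose terminal values are
    "finished" or "failed". Adjust to the fields your endpoint returns.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status()
        if payload.get("status") in ("finished", "failed"):
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not reach a terminal state in time")
        time.sleep(interval_s)
```

Taking the status lookup as a callable keeps the loop independent of any particular URL scheme; the Socaity SDK polling helpers mentioned in the code comment remain the supported route.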

Project Layout

A minimal APIPod project needs only two files, the service and a requirements file. Bundled model weights and apipod.json are optional additions:

terminal
my-model/
├── service.py          # your APIPod service
├── requirements.txt    # Python dependencies
├── my_model.pt         # model weights (or download at runtime)
└── apipod.json         # optional advanced config
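
Before building, it can help to confirm that the project folder matches this layout. The following is a small pre-flight sketch, not part of APIPod: `check_project` is a hypothetical helper, and the file names it looks for are simply the ones used in this guide.

```python
import tempfile
from pathlib import Path


def check_project(root: str, service_file: str = "service.py") -> list[str]:
    """Return a list of problems; an empty list means the layout looks fine."""
    base = Path(root)
    problems = []
    if not (base / service_file).is_file():
        problems.append(f"missing {service_file}")
    # Either dependency file is accepted, mirroring what the build step detects.
    if not any((base / name).is_file() for name in ("requirements.txt", "pyproject.toml")):
        problems.append("missing requirements.txt or pyproject.toml")
    return problems


if __name__ == "__main__":
    # Demo against a throwaway directory that holds only a service file.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "service.py").write_text("# placeholder\n")
        print(check_project(tmp))
```

Model weights and apipod.json are deliberately not checked, since the layout marks both as optional.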

apipod.json: Configuration Reference

For advanced configuration, drop an apipod.json in your project root:

json
{
  "project": {
    "name": "my-model",
    "version": "1.0.0"
  },
  "build": {
    "base_image": ""
  },
  "runtime": {
    "gpu": "A10G",
    "memory_gb": 24
  },
  "queue": {
    "max_workers": 1,
    "queue_size": 500,
    "job_ttl": 3600
  }
}
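
A misspelled key in a JSON config is easy to miss. The sketch below flags keys that do not appear in the example, as a rough typo check. It is an illustration only: the key list is copied from the example config, and whether APIPod accepts other keys or requires all of these is an assumption. `find_unknown_keys` and `load_config` are hypothetical helpers.

```python
import json

# Sections and keys copied from the example config; treating this list as
# complete is an ASSUMPTION made for this sketch.
EXPECTED_KEYS = {
    "project": {"name", "version"},
    "build": {"base_image"},
    "runtime": {"gpu", "memory_gb"},
    "queue": {"max_workers", "queue_size", "job_ttl"},
}


def find_unknown_keys(config: dict) -> list[str]:
    """List dotted paths of keys that do not appear in the example config."""
    unknown = []
    for section, values in config.items():
        if section not in EXPECTED_KEYS:
            unknown.append(section)
            continue
        if not isinstance(values, dict):
            continue  # nothing to inspect inside a non-object value
        for key in values:
            if key not in EXPECTED_KEYS[section]:
                unknown.append(f"{section}.{key}")
    return unknown


def load_config(path: str = "apipod.json") -> dict:
    """Read the config file from the project root."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)
```

For example, `find_unknown_keys(load_config())` run from the project root returns an empty list for the config shown in this section.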

What You Built

  • Installed APIPod and wrote a @app.endpoint service
  • Tested locally with apipod --start
  • Built a GPU-optimised container image with apipod --build
  • Deployed serverless to RunPod EU with apipod --build (the unified socaity deploy CLI is coming soon)
  • Called the live endpoint via HTTP using requests