Endpoints

An endpoint is the addressable HTTP surface for a function on your APIPod app. The @app.endpoint decorator registers the route, introspects the function signature for typed parameters and media files, and wires the function into the backend that matches your orchestrator and compute settings.

@app.endpoint

Decorate any function with @app.endpoint to expose it as a route. APIPod handles request parsing, file uploads, queueing when the backend supports it, and response serialisation. The decorator works the same on sync and async functions.

from apipod import APIPod, ImageFile

app = APIPod()

# path is positional; queue_size caps the in-memory backlog when a queue is active.
@app.endpoint("/generate", queue_size=500)
def generate(prompt: str, width: int = 512, height: int = 512) -> ImageFile:
    # my_model is a placeholder for your own inference call.
    image = my_model(prompt, width=width, height=height)
    return image
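
To make the signature introspection step more concrete, here is a minimal stand-alone sketch. It is not APIPod's implementation: ToyImageFile, MEDIA_TYPES, and describe_parameters are hypothetical names, and the sketch relies only on the standard-library inspect and typing modules.

```python
import inspect
from typing import get_type_hints


class ToyImageFile:
    """Hypothetical stand-in for a media type such as ImageFile."""


# Assumption: the framework keeps a set of annotations it treats as media.
MEDIA_TYPES = (ToyImageFile,)


def describe_parameters(func):
    """Split a function's parameters into media inputs and plain typed inputs."""
    hints = get_type_hints(func)
    media, plain = {}, {}
    for name, param in inspect.signature(func).parameters.items():
        annotation = hints.get(name, str)  # assumption: untyped means str
        required = param.default is inspect.Parameter.empty
        target = media if annotation in MEDIA_TYPES else plain
        target[name] = {"type": annotation.__name__, "required": required}
    return {"media": media, "plain": plain}


def upscale(image: ToyImageFile, factor: int = 2) -> ToyImageFile:
    return image


spec = describe_parameters(upscale)
```

In this sketch, image lands in the media group as a required input and factor lands in the plain group as an optional one, which is the kind of split a decorator needs before it can build a request schema.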

APIPod also ships two convenience wrappers: @app.get(path=None, queue_size=100, ...) and @app.post(path=None, queue_size=100, ...). Both call endpoint(...) with the matching HTTP method. Note the shortcuts default queue_size to 100, while the raw endpoint(...) decorator defaults it to 500.
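
The layering of the shortcuts over endpoint(...) can be pictured with a toy registry. ToyApp is a hypothetical class written for this illustration, not APIPod code. The fallback to the function name when path is None is an assumption of the sketch; this page does not state what path=None resolves to.

```python
class ToyApp:
    """Hypothetical registry that mimics the decorator layering, nothing more."""

    def __init__(self):
        self.routes = {}

    def endpoint(self, path, methods=None, queue_size=500):
        def decorator(func):
            self.routes[path] = {
                "func": func,
                "methods": methods or ["GET"],
                "queue_size": queue_size,
            }
            return func  # hand the function back unchanged
        return decorator

    def get(self, path=None, queue_size=100):
        return self._shortcut(path, ["GET"], queue_size)

    def post(self, path=None, queue_size=100):
        return self._shortcut(path, ["POST"], queue_size)

    def _shortcut(self, path, methods, queue_size):
        def decorator(func):
            # Assumption: with path=None the route falls back to the function name.
            route = path if path is not None else "/" + func.__name__
            return self.endpoint(route, methods=methods, queue_size=queue_size)(func)
        return decorator


toy = ToyApp()


@toy.endpoint("/generate")
def generate(prompt):
    return prompt


@toy.post("/caption")
def caption(text):
    return text


@toy.get()
def health():
    return "ok"
```

The point to take away is the differing defaults: the route registered through endpoint(...) carries a queue_size of 500, while the two registered through the shortcuts carry 100 unless you pass a value explicitly.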

Decorator Parameters

Parameter	Type	Default	Description
`path`	`str`	`required`	URL route for this endpoint, normalised to start with "/".
`methods`	`list[str] | None`	`None`	HTTP methods accepted by this route. When None, FastAPI infers GET. Use ["POST"] for file uploads, or use the @app.post shortcut.
`max_upload_file_size_mb`	`int | None`	`None`	Per-endpoint cap on multipart upload size. None means no cap beyond the router default.
`queue_size`	`int`	`500`	Maximum number of jobs waiting in the in-memory queue. The @app.get and @app.post shortcuts default this to 100 instead.
`use_queue`	`bool | None`	`None`	Override queue auto-detection. Pass False to force a synchronous endpoint even when the router has a queue attached.
`*args, **kwargs`		`n/a`	Forwarded to the underlying FastAPI route registration (tags, summary, responses, dependencies, and similar).
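
Two rows of the table describe simple rules that are easy to state in code. The sketch below is illustrative only: normalise_path and exceeds_upload_cap are hypothetical helpers, collapsing repeated leading slashes is an assumption, and so is treating one megabyte as 1024 * 1024 bytes.

```python
from typing import Optional


def normalise_path(path: str) -> str:
    """Return the route with a single leading slash."""
    # Assumption: repeated leading slashes collapse to one.
    return "/" + path.strip().lstrip("/")


def exceeds_upload_cap(size_bytes: int, max_upload_file_size_mb: Optional[int]) -> bool:
    """Return True when an upload is larger than the per-endpoint cap."""
    if max_upload_file_size_mb is None:
        return False  # no per-endpoint cap in this sketch
    # Assumption: 1 MB is counted as 1024 * 1024 bytes.
    return size_bytes > max_upload_file_size_mb * 1024 * 1024
```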

Supported Media File Types

APIPod re-exports the media-toolkit types, so you can import them from apipod directly. Annotate a parameter with one of these types and APIPod accepts base64 strings, public URLs, or multipart uploads, then materialises the typed object for your function.

Type Hint	Accepts	Python Object
`MediaFile`	base64, URL, file path, bytes	`media_toolkit.MediaFile`
`ImageFile`	base64, URL, file path, PIL.Image	`media_toolkit.ImageFile`
`AudioFile`	base64, URL, file path, bytes	`media_toolkit.AudioFile`
`VideoFile`	base64, URL, file path, bytes	`media_toolkit.VideoFile`
`MediaList`	JSON array of media payloads	`media_toolkit.MediaList`
`MediaDict`	JSON object mapping keys to media payloads	`media_toolkit.MediaDict`

Plain Python types (str, int, float, bool) and Pydantic models pass through FastAPI's standard validation. Raw bytes and FastAPI's UploadFile are also accepted on the same code path.

from apipod import APIPod, ImageFile, AudioFile

app = APIPod()

@app.endpoint("/swap-face", methods=["POST"])
def swap_face(source: ImageFile, target: ImageFile) -> ImageFile:
    # ImageFile materialises the upload into something with .to_pil(), .to_bytes(), etc.
    return face_swap_model(source.to_pil(), target.to_pil())

@app.endpoint("/transcribe", methods=["POST"])
def transcribe(audio: AudioFile) -> dict:
    text = whisper_model(audio.to_bytes())
    return {"transcript": text}
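
Because one annotated parameter can arrive in several encodings, something has to decide which one it is looking at. The following sketch shows one plausible heuristic using only the standard library. classify_media_payload is a hypothetical function, and the real detection logic in APIPod or media-toolkit may differ.

```python
import base64
import binascii
from urllib.parse import urlparse


def classify_media_payload(value):
    """Guess how a media argument was supplied: bytes, url, base64, or path."""
    if isinstance(value, (bytes, bytearray)):
        return "bytes"
    if not isinstance(value, str):
        raise TypeError(f"unsupported payload type: {type(value).__name__}")
    if urlparse(value).scheme in ("http", "https"):
        return "url"
    # Data URIs carry the base64 text after the first comma.
    candidate = value.partition(",")[2] if value.startswith("data:") else value
    try:
        base64.b64decode(candidate, validate=True)
        return "base64"
    except (binascii.Error, ValueError):
        # Assumption: anything that is not strict base64 is treated as a path.
        return "path"
```

The heuristic is deliberately simple and has a known weakness: a short path made up only of base64 alphabet characters would be misread as base64, so a production implementation would need a stricter check.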

Reporting Progress with JobProgress

Add a parameter annotated JobProgress (or named job_progress) and APIPod injects a reporter object. Call set_status(progress, message) from inside the function to update the running job. progress is a float between 0.0 and 1.0, not a 0 to 100 percentage.

from apipod import APIPod, JobProgress, ImageFile

# A queue is attached for compute="serverless" + provider="localhost".
app = APIPod(compute="serverless", provider="localhost")

@app.endpoint("/generate", methods=["POST"])
def generate(prompt: str, steps: int = 50, job_progress: JobProgress = None) -> ImageFile:
    for i in range(steps):
        image = my_pipeline_step(prompt, i)
        # progress is a 0.0..1.0 float, not a 0-100 percentage.
        job_progress.set_status(progress=(i + 1) / steps, message=f"Step {i + 1}/{steps}")
    return image

Progress updates are visible through the job status endpoint and through the .get_progress() calls a client makes between polls. Clients that only call .get_result() block until the final value and never see intermediate updates.
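
The relationship between the job writing progress and a client reading it between polls can be sketched without APIPod at all. ToyJobProgress below is a hypothetical stand-in for the injected reporter; clamping values to the 0.0 to 1.0 range is an assumption of the sketch, not documented behaviour.

```python
import threading


class ToyJobProgress:
    """Hypothetical reporter: the job writes, a poller reads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._progress = 0.0
        self._message = ""

    def set_status(self, progress, message=""):
        # Assumption: out-of-range values are clamped to 0.0..1.0.
        with self._lock:
            self._progress = max(0.0, min(1.0, float(progress)))
            self._message = message

    def get_progress(self):
        with self._lock:
            return self._progress, self._message


def run_steps(steps, reporter):
    """Report after each step and record what a poller would have seen."""
    seen = []
    for i in range(steps):
        reporter.set_status((i + 1) / steps, f"Step {i + 1}/{steps}")
        seen.append(reporter.get_progress())
    return seen


history = run_steps(4, ToyJobProgress())
```

If your pipeline reports a percentage, divide by 100 before calling set_status so the value stays a fraction.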

Queue Behaviour

APIPod auto-detects whether to queue a request. If the backend has an in-memory JobQueue attached, every @app.endpoint registers as a background task that returns a job_id immediately. The client polls /status?job_id=... for the result. If no queue exists on the router, the endpoint runs synchronously and returns the function's return value in the same response.
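
The submit-then-poll flow described above can be illustrated with a small in-memory queue. This is a sketch under stated assumptions, not APIPod's JobQueue: ToyJobQueue, its status strings, and its single daemon worker are all hypothetical.

```python
import queue
import threading
import uuid


class ToyJobQueue:
    """Hypothetical in-memory queue: submit returns a job_id at once."""

    def __init__(self, queue_size=500):
        self._pending = queue.Queue(maxsize=queue_size)
        self._jobs = {}
        self._lock = threading.Lock()
        threading.Thread(target=self._work, daemon=True).start()

    def submit(self, func, **kwargs):
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"status": "queued", "result": None}
        try:
            self._pending.put_nowait((job_id, func, kwargs))
        except queue.Full:
            with self._lock:
                del self._jobs[job_id]  # backlog is full, reject the job
            raise
        return job_id

    def status(self, job_id):
        with self._lock:
            return dict(self._jobs[job_id])

    def wait_idle(self):
        self._pending.join()  # block until every queued job was processed

    def _set(self, job_id, **fields):
        with self._lock:
            self._jobs[job_id].update(fields)

    def _work(self):
        while True:
            job_id, func, kwargs = self._pending.get()
            self._set(job_id, status="running")
            try:
                self._set(job_id, status="finished", result=func(**kwargs))
            except Exception as exc:
                self._set(job_id, status="failed", result=repr(exc))
            finally:
                self._pending.task_done()


def add(a, b):
    return a + b


jobs = ToyJobQueue(queue_size=2)
job_id = jobs.submit(add, a=1, b=2)  # returns immediately
jobs.wait_idle()                     # a real client would poll instead
final = jobs.status(job_id)
```

A real client would call the status route repeatedly rather than block on wait_idle; the blocking call is used here only to keep the example deterministic.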

The queue is attached automatically for compute="serverless" with provider="localhost", and for the combination of the socaity orchestrator, dedicated compute, and the localhost provider used in local testing. Production RunPod deploys use the RunPod backend, which handles queueing on the RunPod side. Pass use_queue=False on a single endpoint to force a synchronous response even when a queue is available.
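
The interaction between auto-detection and the use_queue override reduces to a small decision function. should_queue is a hypothetical helper written for this page. Its handling of use_queue=True on a router without a queue (falling back to synchronous execution) is an assumption, since the text above only documents the False case.

```python
from typing import Optional


def should_queue(use_queue: Optional[bool], router_has_queue: bool) -> bool:
    """Decide whether an endpoint runs as a background job."""
    if use_queue is None:
        return router_has_queue  # auto-detect: follow the router
    # Assumption: asking for a queue that does not exist cannot enable one.
    return use_queue and router_has_queue
```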

The FastAPI backend runs a single in-process worker thread by default. The Dockerfile generated by socaity build launches uvicorn without --workers, so concurrency is bounded by that one worker unless you override the command. Each additional uvicorn worker is a separate process that loads its own copy of the model, so increase the worker count only after benchmarking VRAM usage on your model.

Return Types

APIPod serialises the function's return value to JSON. Objects from media-toolkit are encoded to base64; plain dicts, lists, and scalars pass through unchanged. Returning a generator turns the endpoint into a streaming endpoint. Attaching Pydantic request/response models to the function registers it as an LLM endpoint with an OpenAI-compatible schema.

from apipod import APIPod, ImageFile

app = APIPod()

@app.endpoint("/generate", methods=["POST"])
def generate(prompt: str) -> ImageFile:
    img = my_model(prompt)
    return img  # encoded to base64 in the JSON response

@app.endpoint("/info")
def info() -> dict:
    return {"model": "my-model", "version": "1.0"}  # passes through as JSON

@app.endpoint("/stream", methods=["POST"])
def stream(prompt: str):
    # Returning a generator turns this into a streaming endpoint.
    for chunk in my_model.stream(prompt):
        yield chunk
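
The three return paths shown above (media, plain JSON, generator) can be summarised as one dispatch function. The sketch is illustrative and makes no claim about APIPod internals: ToyMedia and serialise_result are hypothetical, and the ("json", ...) and ("stream", ...) tuples are just a convenient way to show which path was taken.

```python
import base64
import inspect
import json


class ToyMedia:
    """Hypothetical stand-in for a media-toolkit object holding raw bytes."""

    def __init__(self, data: bytes):
        self.data = data

    def to_base64(self) -> str:
        return base64.b64encode(self.data).decode("ascii")


def serialise_result(value):
    """Return ("stream", iterator) for generators, otherwise ("json", text)."""
    if inspect.isgenerator(value):
        # Streaming path: chunks are handed on lazily, one at a time.
        return "stream", (str(chunk) for chunk in value)
    if isinstance(value, ToyMedia):
        value = value.to_base64()  # media becomes a base64 string
    return "json", json.dumps(value)  # dicts, lists, scalars pass through
```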

Next steps

Build. Package the service as a container with socaity build.
Deploy serverless. Ship the built image to RunPod and let APIPod hand off queueing.
Getting Started. Revisit the end-to-end flow from pip install apipod to a running endpoint.
Python SDK reference. Call your deployed endpoint from the Socaity SDK.
