Socaity Docs

Multi-Model Pipeline

Intermediate
10 min

Submit FLUX image generation and SpeechCraft voice narration jobs simultaneously — no waiting between them — and combine the results into a paired image-and-audio output.

Uses: flux_schnell + speechcraft.

How Parallel Jobs Work

Every SDK call returns a job object immediately; it does not block. You can submit as many jobs as you like before calling .get_result() on any of them. The platform runs the jobs concurrently on separate GPUs, so total wall-clock time is roughly that of the slowest job rather than the sum of all jobs.
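
The submit-first, block-later idea can be tried locally without the platform. The following is a minimal sketch that uses Python's standard concurrent.futures module, with time.sleep standing in for a remote GPU job. The names fake_job and run_parallel are illustrative and not part of the Socaity SDK, and the durations are scaled down so the script finishes quickly.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_job(seconds: float) -> float:
    """Stand-in for a remote GPU job: sleeps, then returns its own duration."""
    time.sleep(seconds)
    return seconds


def run_parallel(durations):
    """Submit every job first, then block on the results.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        # submit() returns a Future immediately, like the job objects above
        futures = [pool.submit(fake_job, d) for d in durations]
        # Blocking happens only here, after everything is already running
        results = [f.result() for f in futures]
    return results, time.perf_counter() - start


results, elapsed = run_parallel([0.3, 0.2])
print(f"elapsed {elapsed:.2f}s, slowest {max(results):.2f}s, sum {sum(results):.2f}s")
```

The measured time should land close to the slowest job (0.3 s here) rather than the 0.5 s sum, which is the same effect the platform provides at GPU scale.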

Step 1 — Initialise Both Models

Import and instantiate FLUX Schnell and SpeechCraft. Both share the same API key.

python
import os
from socaity import flux_schnell
from socaity import speechcraft

flux = flux_schnell(api_key=os.getenv("SOCAITY_API_KEY"))
sc   = speechcraft(api_key=os.getenv("SOCAITY_API_KEY"))
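
Note that os.getenv returns None when the variable is unset, so a missing key would only surface later, inside the SDK. One way to fail early is a small guard like the sketch below. require_env is a hypothetical helper written for this tutorial, not an SDK function.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Usage (assumes the key has been exported in your shell):
# api_key = require_env("SOCAITY_API_KEY")
```

The returned string can then be passed as api_key to both model constructors.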

Step 2 — Submit Jobs in Parallel

Call both models before blocking on either result. The two GPU jobs start simultaneously on the platform.

python
prompt = "A lone explorer on a neon-lit alien planet, cinematic"

# Submit both jobs — neither blocks here
image_job = flux(prompt=prompt, num_outputs=1)
audio_job = sc.text2voice(text=prompt, voice="en_male_calm")

print("Both jobs submitted — running in parallel on the cloud...")

Step 3 — Collect Results

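Call .get_result() on each job. Each call blocks only until that job finishes. Because both jobs have been running since submission, the second call returns immediately if its job completed while you were waiting on the first; otherwise it blocks only for the time remaining.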

python
# Block on image first (typically slower)
images = image_job.get_result()
audio  = audio_job.get_result()   # may already be done

# Save the paired outputs
images[0].save("scene.png")
audio.save("narration.mp3")
print("Saved scene.png + narration.mp3")
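
To see why the second .get_result() call is usually cheap, here is a local sketch with a hypothetical FakeJob class. Only the get_result() method name mirrors the tutorial; the threading internals are an assumption made for illustration and say nothing about how the real SDK is implemented.

```python
import threading
import time


class FakeJob:
    """Hypothetical stand-in for an SDK job object."""

    def __init__(self, name: str, seconds: float):
        self._result = None
        self._thread = threading.Thread(target=self._run, args=(name, seconds))
        self._thread.start()  # work starts at submission, not at get_result()

    def _run(self, name: str, seconds: float) -> None:
        time.sleep(seconds)  # simulated processing time
        self._result = f"{name} done"

    def get_result(self) -> str:
        self._thread.join()  # waits only for whatever time is still remaining
        return self._result


start = time.perf_counter()
image_job = FakeJob("image", 0.3)  # the slower job
audio_job = FakeJob("audio", 0.2)  # finishes while we wait on the image

image = image_job.get_result()
after_image = time.perf_counter() - start
audio = audio_job.get_result()
after_audio = time.perf_counter() - start

print(f"image ready after {after_image:.2f}s, audio ready after {after_audio:.2f}s")
```

Both timestamps should be nearly identical: by the time the slower image job returns, the audio job has already finished, so its get_result() adds almost no extra wait.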

Full Example — Batch of Prompts

Scale the pattern to a list of prompts. All image and audio jobs are submitted in a single loop before any .get_result() is called.

python
import os
from socaity import flux_schnell
from socaity import speechcraft

flux = flux_schnell(api_key=os.getenv("SOCAITY_API_KEY"))
sc   = speechcraft(api_key=os.getenv("SOCAITY_API_KEY"))

prompts = [
    "A misty mountain range at dawn",
    "A cyberpunk street market at night",
    "An underwater coral city, bioluminescent",
]

# 1. Submit ALL jobs before blocking on any
image_jobs = [flux(prompt=p, num_outputs=1) for p in prompts]
audio_jobs = [sc.text2voice(text=p, voice="en_male_calm") for p in prompts]

# 2. Collect results — cloud processes all in parallel
for i, (img_job, aud_job) in enumerate(zip(image_jobs, audio_jobs)):
    images = img_job.get_result()
    audio  = aud_job.get_result()
    images[0].save(f"scene_{i}.png")
    audio.save(f"narration_{i}.mp3")
    print(f"Saved pair {i}")

print("All done!")
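
In a larger batch, a single failed job would abort the collection loop above. The sketch below shows one way to keep going and report failures at the end. StubJob and collect_pairs are hypothetical names, and because this tutorial does not document which exceptions the SDK raises, the sketch catches Exception broadly.

```python
class StubJob:
    """Hypothetical stand-in for an SDK job: get_result() returns a value or raises."""

    def __init__(self, value=None, error=None):
        self._value = value
        self._error = error

    def get_result(self):
        if self._error is not None:
            raise self._error
        return self._value


def collect_pairs(image_jobs, audio_jobs):
    """Collect (index, image, audio) tuples; record failures instead of aborting."""
    successes, failures = [], []
    for i, (img_job, aud_job) in enumerate(zip(image_jobs, audio_jobs)):
        try:
            successes.append((i, img_job.get_result(), aud_job.get_result()))
        except Exception as exc:  # assumption: the SDK's exception types are unknown here
            failures.append((i, exc))
    return successes, failures


# Simulated batch in which the second image job fails
image_jobs = [StubJob("img0"), StubJob(error=RuntimeError("GPU error")), StubJob("img2")]
audio_jobs = [StubJob("aud0"), StubJob("aud1"), StubJob("aud2")]

successes, failures = collect_pairs(image_jobs, audio_jobs)
print(f"{len(successes)} pairs collected, {len(failures)} failed")
```

With real job objects, the successful tuples can be saved exactly as in the loop above, and the failed indices can be resubmitted.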

JavaScript Alternative

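In JavaScript, the same pattern would use Promise.all to await both jobs together. The JavaScript SDK does not yet expose high-level model classes, however, so the snippet below only sets the API key and lists the available models: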

javascript
// The JavaScript SDK is in early development.
// High-level model classes are not yet available.
// For now, use the Python SDK for multi-model workflows.

import { socaity } from 'socaity'

socaity.setApiKey(process.env.SOCAITY_API_KEY)

// Discover available models
const models = await socaity.getAvailableModels()
console.log('Available models:', models)

Timing Comparison

Strategy                   FLUX        SpeechCraft   Total
Sequential (naive)         6 s         4 s           ~10 s
Parallel (this tutorial)   6 s         4 s           ~6 s
Batch of 3 (parallel)      6 s each    4 s each      ~7 s
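
The arithmetic behind these totals is a sum for sequential execution and a maximum for ideal parallel execution. The sketch below reproduces it using the per-job durations from the table. The function names are illustrative, and the model ignores any queueing or network overhead.

```python
def sequential_total(durations):
    """Each job waits for the previous one, so the times add up."""
    return sum(durations)


def parallel_total(durations):
    """All jobs run at once, so the slowest one sets the total (ideal case)."""
    return max(durations)


pair = [6, 4]     # FLUX and SpeechCraft durations in seconds
batch = pair * 3  # three prompts, two jobs each

print(sequential_total(pair))   # 10
print(parallel_total(pair))     # 6
print(sequential_total(batch))  # 30
print(parallel_total(batch))    # 6
```

The ideal figure for the batch of three is 6 s. The ~7 s quoted for that row presumably reflects some scheduling overhead on top of this lower bound, although the page does not state the cause.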

What You Built

  • Submitted FLUX and SpeechCraft jobs concurrently without blocking
  • Collected both results after both GPUs finished
  • Scaled the pattern to a batch of prompts
  • Understood the wall-clock time savings of parallel vs. sequential execution