Multi-Model Pipeline
Submit FLUX image generation and SpeechCraft voice narration jobs simultaneously — no waiting between them — and combine the results into a paired image-and-audio output.
Uses: flux_schnell + speechcraft.
Every SDK call returns a job object immediately; it does not block. You can submit as many jobs as you like before calling .get_result() on any of them. The platform runs all jobs concurrently on separate GPUs, so total wall-clock time is roughly that of the slowest job, not the sum of all jobs.
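The submit-first, collect-later idea can be tried locally without any GPU. The following is a minimal sketch that assumes nothing about the SDK's internals: it uses Python's standard `concurrent.futures` as a stand-in, where `fake_job` and `run_submit_then_collect` are hypothetical names and `time.sleep` simulates remote work. A `Future` plays the role of the job object, and `.result()` plays the role of `.get_result()`.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_job(name, seconds):
    """Hypothetical stand-in for a remote GPU job: sleep, then return a label."""
    time.sleep(seconds)
    return f"{name} done"


def run_submit_then_collect(durations):
    """Submit every job first, then block on the results.

    `durations` maps a job name to its simulated run time in seconds.
    Returns (results in submission order, elapsed wall-clock seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(durations)) as pool:
        # submit() returns a Future right away, similar to a job object
        futures = [pool.submit(fake_job, name, secs)
                   for name, secs in durations.items()]
        # Only now do we block, analogous to calling .get_result()
        results = [future.result() for future in futures]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = run_submit_then_collect({"image": 0.3, "audio": 0.2})
    print(results, f"elapsed: {elapsed:.2f}s")
```

Because both simulated jobs run at the same time, the elapsed time should land near the longer of the two durations rather than their sum.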
Import and instantiate FLUX Schnell and SpeechCraft. Both share the same API key.
import os
from socaity import flux_schnell
from socaity import speechcraft
flux = flux_schnell(api_key=os.getenv("SOCAITY_API_KEY"))
sc = speechcraft(api_key=os.getenv("SOCAITY_API_KEY"))

Call both models before blocking on either result. The two GPU jobs start simultaneously on the platform.
prompt = "A lone explorer on a neon-lit alien planet, cinematic"
# Submit both jobs — neither blocks here
image_job = flux(prompt=prompt, num_outputs=1)
audio_job = sc.text2voice(text=prompt, voice="en_male_calm")
print("Both jobs submitted — running in parallel on the cloud...")

Call .get_result() on each job. A job that has already finished returns immediately; otherwise the call blocks until that job completes.
# Block on image first (typically slower)
images = image_job.get_result()
audio = audio_job.get_result() # may already be done
# Save the paired outputs
images[0].save("scene.png")
audio.save("narration.mp3")
print("Saved scene.png + narration.mp3")

Scale the pattern to a list of prompts. All image and audio jobs are submitted before any .get_result() is called.
import os
from socaity import flux_schnell
from socaity import speechcraft
flux = flux_schnell(api_key=os.getenv("SOCAITY_API_KEY"))
sc = speechcraft(api_key=os.getenv("SOCAITY_API_KEY"))
prompts = [
    "A misty mountain range at dawn",
    "A cyberpunk street market at night",
    "An underwater coral city, bioluminescent",
]
# 1. Submit ALL jobs before blocking on any
image_jobs = [flux(prompt=p, num_outputs=1) for p in prompts]
audio_jobs = [sc.text2voice(text=p, voice="en_male_calm") for p in prompts]
# 2. Collect results — cloud processes all in parallel
for i, (img_job, aud_job) in enumerate(zip(image_jobs, audio_jobs)):
    images = img_job.get_result()
    audio = aud_job.get_result()
    images[0].save(f"scene_{i}.png")
    audio.save(f"narration_{i}.mp3")
    print(f"Saved pair {i}")
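print("All done!")

In JavaScript, Promise.all would achieve the same effect. The JavaScript SDK does not yet expose high-level model classes, however, so the snippet below only lists the available models: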
// The JavaScript SDK is in early development.
// High-level model classes are not yet available.
// For now, use the Python SDK for multi-model workflows.
import { socaity } from 'socaity'
socaity.setApiKey(process.env.SOCAITY_API_KEY)
// Discover available models
const models = await socaity.getAvailableModels()
console.log('Available models:', models)

| Strategy | FLUX (s) | SpeechCraft (s) | Total (s) |
|---|---|---|---|
| Sequential (naive) | 6 | 4 | ~10 |
| Parallel (this tutorial) | 6 | 4 | ~6 |
| Batch of 3 (parallel) | 6 each | 4 each | ~7 |
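The arithmetic behind the table can be written as a small idealized model: sequential time is the sum of the job durations, and parallel time is the longest single duration. This is only a sketch; the function names are hypothetical, and it ignores queueing, network, and scheduling overhead.

```python
def sequential_time(durations):
    """Idealized wall-clock time when jobs run one after another."""
    return sum(durations)


def parallel_time(durations):
    """Idealized wall-clock time when all jobs run at once on separate GPUs."""
    return max(durations)


flux_s, speech_s = 6, 4  # per-job timings taken from the table above

print(sequential_time([flux_s, speech_s]))  # 10
print(parallel_time([flux_s, speech_s]))    # 6

# Batch of 3 prompts: three image jobs and three audio jobs, all in parallel
print(parallel_time([flux_s] * 3 + [speech_s] * 3))  # 6
```

For the batch of three, the idealized model gives 6 s, while the table reports about 7 s. The difference is presumably overhead that this simple model leaves out.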
- Submitted FLUX and SpeechCraft jobs concurrently without blocking
- Collected both results after both GPUs finished
- Scaled the pattern to a batch of prompts
- Understood the wall-clock time savings of parallel vs. sequential execution