Chat
Last updated
The Asynchronous Chat Completion API lets you submit chat-based LLM tasks that are processed in the background and retrieved later. This custom implementation queues chat requests via a POST endpoint and returns results through a unique task ID. It supports robust job tracking, delayed responses, and retry-safe workflows, making it well suited for batch processing, long-running prompts, high-latency or large-scale workloads, and serverless environments where real-time responses aren't required.
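The submit-then-poll workflow described above can be sketched in miniature with an in-memory task store. This is an illustrative simulation of the pattern, not the real API: the function names (`submit_chat_task`, `get_task_result`), the status values, and the echoed result are all assumptions made for the example.

```python
import threading
import time
import uuid

# In-memory stand-in for the async workflow: a POST-style submit
# queues the task, a background worker processes it, and the client
# polls for the result by task ID. Names and statuses are illustrative.
_tasks = {}

def submit_chat_task(messages):
    """Queue a chat task and return a unique task ID immediately."""
    task_id = str(uuid.uuid4())
    _tasks[task_id] = {"status": "pending", "result": None}
    threading.Thread(target=_process, args=(task_id, messages)).start()
    return task_id

def _process(task_id, messages):
    # Simulate a long-running LLM call by sleeping, then store a
    # placeholder result (a real worker would call the model here).
    time.sleep(0.1)
    _tasks[task_id]["result"] = f"echo: {messages[-1]['content']}"
    _tasks[task_id]["status"] = "completed"

def get_task_result(task_id):
    """Look up a task by ID; callers poll until status is 'completed'."""
    return _tasks.get(task_id, {"status": "not_found", "result": None})

if __name__ == "__main__":
    tid = submit_chat_task([{"role": "user", "content": "hello"}])
    while get_task_result(tid)["status"] != "completed":
        time.sleep(0.05)
    print(get_task_result(tid)["result"])
```

Because the task ID is returned immediately, the client can persist it and safely retry the `get_task_result` lookup later, which is what makes the pattern a fit for batch jobs and serverless functions.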