# Async API

{% hint style="info" %}
This page is currently under development. If you experience issues, please contact our support team. Your feedback helps us improve our documentation for everyone.
{% endhint %}

The Asynchronous Inference API provides a unified interface for submitting and managing long-running AI tasks, such as chat completions, image generations, and other model inferences, without blocking your application. Tasks are queued and processed in the background, and you fetch the results later using a unique task ID. This architecture is well suited to scaling inference workloads, handling variable latency, and integrating AI into event-driven or serverless systems. Whether you're running compute-intensive models or absorbing bursty traffic, the async API gives you flexibility, reliability, and control across all supported task types.
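The sketch below illustrates the general submit-then-poll flow: submit a task, receive a task ID, and poll until the result is ready. The base URL, endpoint paths, request and response field names, and polling interval are illustrative assumptions, not the documented contract; refer to the endpoint pages linked below for the actual request and response shapes.

```python
import os
import time

import requests

BASE_URL = "https://api.distribute.ai/v1"  # hypothetical base URL
HEADERS = {"Authorization": f"Bearer {os.environ['DISTRIBUTE_API_KEY']}"}

# Submit a chat completion as an async task. The endpoint path and the
# "task_id" field in the response are assumptions for illustration.
submit = requests.post(
    f"{BASE_URL}/async/chat/completions",
    headers=HEADERS,
    json={
        "model": "your-model-id",  # placeholder model identifier
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
submit.raise_for_status()
task_id = submit.json()["task_id"]

# Poll for the result using the task ID until the task finishes.
# The status endpoint and status values are likewise assumed.
while True:
    status = requests.get(f"{BASE_URL}/async/tasks/{task_id}", headers=HEADERS)
    status.raise_for_status()
    body = status.json()
    if body["status"] in ("completed", "failed"):
        break
    time.sleep(2)  # wait between polls to avoid hammering the queue

print(body)
```

Because the client holds only a task ID between requests, the same pattern works from serverless functions or webhook handlers, where no long-lived connection to the API is possible.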

{% content-ref url="async-api/chat" %}
[chat](https://docs.distribute.ai/distribute-for-enterprise/enterprise-inference-api/async-api/chat)
{% endcontent-ref %}

{% content-ref url="async-api/images" %}
[images](https://docs.distribute.ai/distribute-for-enterprise/enterprise-inference-api/async-api/images)
{% endcontent-ref %}

{% content-ref url="async-api/speech" %}
[speech](https://docs.distribute.ai/distribute-for-enterprise/enterprise-inference-api/async-api/speech)
{% endcontent-ref %}
