Describe your workflows once, then let Exosphere run them in the background at up to 75% lower cost. Built for jobs that must run reliably at scale.
Connect tools, models, and APIs to automate complex async jobs.
PyTorch model weights containing the trained parameters of a deep learning model for natural language processing.
Dataset containing labeled examples used for training machine learning models, including features and target variables.
Configuration file defining hyperparameters, model architecture, and training settings for AI model deployment.
Documentation for the API endpoints and their usage, including request parameters, response formats, and examples.
Performance metrics and evaluation results of AI models, including accuracy, precision, recall, and F1 scores.
Upload, process, and analyze files directly in your workflows. From PDFs to CSVs, bring your own data.
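As a rough sketch of what that looks like, reusing the run_workflow shape from the snippet further down this page (the pipeline name and file names here are placeholders, not a documented example):

from exosphere import run_workflow

# Hypothetical example: point a workflow at your own files.
result = run_workflow(
    "csv_analysis_pipeline.yaml",  # placeholder pipeline name
    input={"bucket": "user-bucket", "files": ["sales_q3.csv"]},
    sla_minutes=60,
)
print(result)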
6-hour SLA
Set job deadlines. Exosphere manages batching, retries, and cost optimization.
Orbit SDK
Define Python functions. Orbit handles retries, scaling, and orchestration.
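This page doesn't show the Orbit SDK itself, so the sketch below only illustrates the idea: the orbit module, the @orbit.task decorator, and its parameters are assumptions, not the documented API.

import orbit  # hypothetical module name

@orbit.task(retries=3, timeout_minutes=30)  # assumed decorator and options
def summarize(document_text: str) -> str:
    # An ordinary Python function; in this model, Orbit supplies the
    # retries, scaling, and orchestration around it.
    return document_text[:200]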
Trade off latency for price. Define SLAs and let Exosphere optimize cost.
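As a sketch of what that trade-off could look like in code, reusing the run_workflow call shown further down this page (pipeline name and inputs are taken from that snippet; the pricing effect is the product's claim, not something the code enforces):

from exosphere import run_workflow

# Same workflow, two deadlines: a looser SLA gives the scheduler more
# room to batch and defer work, which is where the savings come from.
urgent = run_workflow(
    "gemma_gpu_pipeline.yaml",
    input={"bucket": "user-bucket", "files": ["doc1.pdf"]},
    sla_minutes=30,   # low latency, higher price
)
relaxed = run_workflow(
    "gemma_gpu_pipeline.yaml",
    input={"bucket": "user-bucket", "files": ["doc1.pdf"]},
    sla_minutes=360,  # 6-hour SLA, optimized for cost
)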
Say goodbye to managing your own code to orchestrate AI workflows.
import time
import requests

def load_files_from_s3(bucket, keys):
    print("Loading files from S3...")
    res = requests.post("https://api.example.com/s3/generate-urls", json={
        "bucket": bucket,
        "keys": keys
    })
    res.raise_for_status()
    return res.json()["urls"]

def send_inference_request(deployment_id, file_urls):
    print("Sending inference request to Gemma...")
    res = requests.post("https://api.example.com/gemma/infer", json={
        "deployment_id": deployment_id,
        "files": file_urls
    })
    res.raise_for_status()
    return res.json()["job_id"]

def poll_for_completion(job_id, interval=5, max_retries=20):
    print("Polling for job completion...")
    for attempt in range(max_retries):
        res = requests.get(f"https://api.example.com/gemma/status/{job_id}")
        res.raise_for_status()
        status = res.json()["status"]
        if status == "completed":
            return True
        elif status == "failed":
            raise Exception("Job failed")
        time.sleep(interval)
    raise TimeoutError("Polling timed out")

def fetch_inference_result(job_id):
    print("Fetching inference result...")
    res = requests.get(f"https://api.example.com/gemma/result/{job_id}")
    res.raise_for_status()
    return res.json()

def main(bucket, keys, deployment_id):
    # with_retries, run_ocr_on_images, and summarize_output are helpers
    # this snippet assumes; every step has to be wrapped by hand to
    # survive transient failures.
    file_urls = with_retries(load_files_from_s3, bucket, keys)
    job_id = with_retries(send_inference_request, deployment_id, file_urls)
    if with_retries(poll_for_completion, job_id):
        result = with_retries(fetch_inference_result, job_id)
        ocr_result = with_retries(run_ocr_on_images, result["images"])
        summary = summarize_output(ocr_result)
        return summary

if __name__ == "__main__":
    # Example arguments; the deployment id is a placeholder.
    main("user-bucket", ["doc1.pdf", "doc2.pdf"], "gemma-deployment")
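The with_retries helper used above is never defined on this page. A minimal sketch of such a wrapper, assuming simple exponential backoff, might look like:

import time

def with_retries(fn, *args, attempts=3, base_delay=1.0, **kwargs):
    # Hypothetical helper: call fn, retrying with exponential backoff.
    for attempt in range(attempts):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))

This kind of plumbing is exactly what the Exosphere version below removes.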
from exosphere import run_workflow

def main():
    result = run_workflow(
        "gemma_gpu_pipeline.yaml",
        input={
            "bucket": "user-bucket",
            "files": ["doc1.pdf", "doc2.pdf"]
        },
        sla_minutes=360
    )
    print("Final Summary:", result)

if __name__ == "__main__":
    main()
Save up to 60% thanks to batching optimizations
For workloads that always need to be running
Book a call with our team to discuss how we can help you build and optimize your AI workflows for maximum efficiency.