API Documentation

Access GPU rental pricing data and self-hostable LLM planning data programmatically. No authentication required.

GET /api/compare

Latest snapshot per GPU per provider. Powers the main comparison table.

Response
{
  "providers": {
    "vast.ai": { "name": "Vast.ai", "color": "#00d4aa" },
    "aws": { "name": "AWS", "color": "#ff9900" },
    ...
  },
  "gpus": {
    "H100 SXM": {
      "vast.ai": {
        "on-demand": {
          "min": 1.85, "median": 2.10, "mean": 2.25, "max": 3.50,
          "num_offers": 42, "fetched_at": "2026-03-15T06:00:00"
        },
        "spot": { ... }
      },
      "aws": { ... }
    }
  },
  "gpu_specs": {
    "H100 SXM": { "vram_gb": 80, "vram_type": "HBM3", "generation": "Hopper", "tier": "Flagship" }
  }
}
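The nested providers → pricing types layout is convenient for rendering but awkward for analysis. A minimal sketch of flattening it into rows; the `sample` dict below mirrors the documented shape with illustrative values, not live data:

```python
# Flatten the nested /api/compare response into (gpu, provider, type, median) rows.
# `sample` mirrors the documented response shape; values are illustrative only.
sample = {
    "gpus": {
        "H100 SXM": {
            "vast.ai": {
                "on-demand": {"min": 1.85, "median": 2.10, "mean": 2.25,
                              "max": 3.50, "num_offers": 42},
            },
        },
    },
}

rows = [
    (gpu, provider, pricing_type, stats["median"])
    for gpu, providers in sample["gpus"].items()
    for provider, pricing in providers.items()
    for pricing_type, stats in pricing.items()
]
print(rows)  # [('H100 SXM', 'vast.ai', 'on-demand', 2.1)]
```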
GET /api/current

Current prices, optionally filtered by provider.

provider (string, optional): Filter by provider key (e.g., aws, vast.ai, gcp)
Example
GET /api/current?provider=aws
Response
[
  {
    "gpu_name": "H100 SXM",
    "provider": "aws",
    "pricing_type": "on-demand",
    "num_offers": 1,
    "min_dph_per_gpu": 3.25,
    "median_dph_per_gpu": 3.25,
    "mean_dph_per_gpu": 3.25,
    "max_dph_per_gpu": 3.25,
    "fetched_at": "2026-03-15T06:00:00"
  },
  ...
]
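A small helper for building the filtered request URL. `BASE_URL` is a placeholder for your deployment's host, not part of the API:

```python
from urllib.parse import urlencode

BASE_URL = "https://example.com"  # placeholder; substitute your deployment's host

def current_prices_url(provider=None):
    """Return the /api/current URL, appending ?provider=... when a filter is given."""
    url = f"{BASE_URL}/api/current"
    if provider:
        url += "?" + urlencode({"provider": provider})
    return url

print(current_prices_url("aws"))  # https://example.com/api/current?provider=aws
```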
GET /api/llms

Curated self-hostable LLM catalog with inferred VRAM requirements and live hosting cost estimates based on current GPU pricing.

Response
{
  "pricing_type": "on-demand",
  "research_checked_at": "2026-03-16",
  "summary": {
    "model_count": 9,
    "single_gpu_models": 8,
    "multi_gpu_models": 1,
    "serverless_ready_models": 6,
    "lowest_hourly_estimate": 0.39,
    "largest_min_vram_gb": 640
  },
  "serverless_patterns": [
    {
      "name": "RunPod Serverless + vLLM",
      "source_url": "https://docs.runpod.io/serverless/vllm/get-started"
    }
  ],
  "latest_discovery_feed": {
    "source": "huggingface",
    "fetched_at": "2026-03-16T03:30:00Z",
    "models_seen": 24,
    "authors_queried": ["meta-llama", "Qwen", "mistralai", "deepseek-ai", "google", "microsoft"],
    "relevant_models": 10,
    "models_returned": 12,
    "selection_strategy": "tracked-authors-and-relevance"
  },
  "latest_discoveries": [
    {
      "model_name": "Qwen2.5-Coder-7B-Instruct",
      "release_kind": "Instruction tune",
      "serverless_fit": "possible",
      "quality_tier": "good",
      "quality_label": "Promising",
      "relevance_score": 15,
      "relevance_summary": "trusted org, qwen, primary release, practical size, usage signal",
      "watchouts_summary": "too new for broad field feedback",
      "source_url": "https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct"
    }
  ],
  "models": [
    {
      "name": "Llama 3.1 8B Instruct",
      "params_billions": 8,
      "context_window": 128000,
      "license": "Llama Community License",
      "source_label": "Meta model card",
      "source_url": "https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct",
      "quality_tier": "good",
      "quality_label": "Strong default",
      "quality_note": "A meaningful quality jump over 3B-class models without crossing into premium GPU cost.",
      "serverless_fit": "good",
      "serverless_note": "Good fit for scale-to-zero APIs if you cache weights and accept cold starts.",
      "min_total_vram_gb": 20,
      "deployment_label": "1x 20GB+ GPU",
      "cheapest_tracked_setup": {
        "gpu_name": "RTX 4090",
        "provider_name": "Vast.ai",
        "estimated_hourly": 0.39,
        "estimated_monthly": 284.70
      }
    }
  ],
  "assumptions": [
    "Estimates assume inference hosting, not training or fine-tuning.",
    "Costs use the current median on-demand GPU price and scale linearly for multi-GPU setups.",
    "Quality reads are directional editorial guidance for planning, not formal benchmark rankings."
  ]
}
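The `estimated_monthly` figures appear to be `estimated_hourly` multiplied by 730 hours per month (284.70 / 0.39 = 730); that multiplier is inferred from the sample values, not stated by the API. A sketch under that assumption, including the documented linear scaling for multi-GPU setups:

```python
HOURS_PER_MONTH = 730  # inferred from the sample (284.70 / 0.39); not stated by the API

def monthly_estimate(hourly, num_gpus=1):
    # Per the documented assumptions, costs scale linearly with GPU count.
    return round(hourly * num_gpus * HOURS_PER_MONTH, 2)

print(monthly_estimate(0.39))  # 284.7
```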
GET /api/history/{gpu_name}

Historical price data for a specific GPU across all providers.

gpu_name (path, required): GPU name (e.g., H100 SXM, RTX 4090)
provider (query, optional): Filter to a single provider
Example
GET /api/history/H100%20SXM?provider=vast.ai
Response
{
  "gpu_name": "H100 SXM",
  "history": [
    {
      "fetched_at": "2024-01-15T06:00:00",
      "num_offers": 38,
      "min": 1.90, "median": 2.15, "mean": 2.30, "max": 3.80,
      "provider": "vast.ai",
      "pricing_type": "on-demand"
    },
    ...
  ]
}
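GPU names contain spaces, so the path segment must be percent-encoded (as in the `H100%20SXM` example above). A sketch using the standard library; `BASE_URL` is a placeholder host:

```python
from urllib.parse import quote

BASE_URL = "https://example.com"  # placeholder; substitute your deployment's host

def history_url(gpu_name, provider=None):
    """Build the /api/history URL, percent-encoding the GPU name path segment."""
    url = f"{BASE_URL}/api/history/{quote(gpu_name)}"
    if provider:
        url += f"?provider={quote(provider)}"
    return url

print(history_url("H100 SXM", "vast.ai"))
# https://example.com/api/history/H100%20SXM?provider=vast.ai
```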
GET /api/history_all

All historical pricing data, keyed by gpu_name|provider|pricing_type.

Response
{
  "H100 SXM|vast.ai|on-demand": [
    { "fetched_at": "2024-01-15", "median": 2.15, "min": 1.90, "num_offers": 38, ... },
    ...
  ],
  "H100 SXM|aws|on-demand": [ ... ],
  ...
}
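Each top-level key packs three fields separated by `|`, so consumers need to split it back apart. A minimal parser:

```python
def parse_key(key):
    """Split a /api/history_all composite key into its three documented fields."""
    gpu_name, provider, pricing_type = key.split("|")
    return {"gpu_name": gpu_name, "provider": provider, "pricing_type": pricing_type}

print(parse_key("H100 SXM|vast.ai|on-demand"))
# {'gpu_name': 'H100 SXM', 'provider': 'vast.ai', 'pricing_type': 'on-demand'}
```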
GET /api/providers

Provider metadata, last snapshot freshness, and latest collector run status.

Response
{
  "vast.ai": {
    "name": "Vast.ai",
    "color": "#00d4aa",
    "url": "https://vast.ai",
    "last_fetched": "2026-03-15T06:00:00",
    "snapshots": 1284,
    "last_run": {
      "started_at": "2026-03-15T06:00:00",
      "completed_at": "2026-03-15T06:00:04",
      "status": "ok",
      "prices": 128,
      "gpu_types": 12,
      "message": null
    }
  },
  ...
}
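One common use of this endpoint is checking snapshot freshness. A sketch that parses the documented timezone-less ISO-8601 `last_fetched` timestamp; the reference time below is illustrative:

```python
from datetime import datetime

def hours_since(last_fetched, now):
    """Age of a provider snapshot in hours, given the documented ISO-8601 format."""
    delta = now - datetime.fromisoformat(last_fetched)
    return delta.total_seconds() / 3600

now = datetime(2026, 3, 15, 18, 0)  # illustrative reference time
print(hours_since("2026-03-15T06:00:00", now))  # 12.0
```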
GET /api/offers/{gpu_name}

Individual offer listings for a GPU. Currently only available for Vast.ai.

gpu_name (path, required): GPU name (e.g., H100 SXM)
Response
{
  "gpu_name": "H100 SXM",
  "offers": [
    {
      "gpu_name": "H100 SXM",
      "num_gpus": 8,
      "gpu_ram_gb": 80.0,
      "price_per_gpu_hr": 1.85,
      "price_total_hr": 14.80,
      "cpu": "AMD EPYC 9454",
      "cpu_cores": 48.0,
      "ram_gb": 512.0,
      "disk_gb": 2000,
      "location": "US",
      "reliability": 0.9980,
      "tflops": 3958.8
    },
    ...
  ]
}
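Since each offer carries both a price and a reliability score, a typical consumer filters on reliability before picking the cheapest listing. A sketch over a trimmed-down offers list with illustrative values:

```python
# Pick the cheapest per-GPU offer at or above a reliability floor.
# The list mirrors the documented /api/offers shape; values are illustrative.
offers = [
    {"price_per_gpu_hr": 1.85, "reliability": 0.9980, "location": "US"},
    {"price_per_gpu_hr": 1.60, "reliability": 0.9100, "location": "EU"},
]

def cheapest_reliable(offers, min_reliability=0.99):
    eligible = [o for o in offers if o["reliability"] >= min_reliability]
    return min(eligible, key=lambda o: o["price_per_gpu_hr"], default=None)

print(cheapest_reliable(offers)["price_per_gpu_hr"])  # 1.85
```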
Providers

Provider Key   Name                    Auth Required
vast.ai        Vast.ai                 No
aws            Amazon Web Services     No
azure          Microsoft Azure         No
gcp            Google Cloud Platform   No (API key on server)
lambda         Lambda Labs             No (API key on server)
runpod         RunPod                  No (API key on server)
vultr          Vultr                   No
Tracked GPUs

H100 SXM, H100 NVL, H100 PCIE, H200, H200 NVL, A100 SXM4, A100 PCIE, B200, RTX 4090, RTX 5090, L40, RTX 6000Ada