
Deployment Models

System Architecture for GPUaaS Pools

This diagram shows the system architecture for GPUaaS Pools, with both a Ray.io service and direct SSH connections used as interfaces to the virtual GPUs, alongside user namespaces, task schedulers, and GPU pools.
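As a rough illustration of the Ray.io path in the diagram, the sketch below connects to a pool's Ray endpoint from a client machine and inspects the GPU resources it advertises. The hostname and port are placeholders, not values taken from this documentation; substitute the endpoint provided for your pool.

    import ray

    # Connect to the pool's Ray head node via the Ray Client protocol.
    # The hostname and port below are placeholders; use the endpoint
    # provided for your GPUaaS Pool.
    ray.init(address="ray://gpu-pool.example.com:10001")

    # Inspect the GPU resources the pool advertises to this namespace.
    total = ray.cluster_resources()
    free = ray.available_resources()
    print(f"GPUs in pool: {total.get('GPU', 0)}, free: {free.get('GPU', 0)}")

    ray.shutdown()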

Network Architecture

This diagram illustrates the network architecture, showing the connections between the customer GPU resource namespaces, a customer VM, and the public and internal networks.

Deployment Models Overview

hosted.ai offers several deployment models for managing GPU resources. Understanding how these models differ will help you choose the best option for your use case.

GPUaaS Passthrough

In a GPUaaS Passthrough deployment, each GPU is dedicated to a single user or team. That user or team has exclusive access to the hardware, which is ideal for workloads that need dedicated, predictable performance.
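As a hedged illustration only, the sketch below shows one way to confirm, from inside a passthrough VM, that the dedicated GPU is visible to your software stack. It assumes PyTorch with CUDA support is installed in the VM; neither the tooling nor the device names are prescribed by hosted.ai.

    import torch

    # In a passthrough deployment the whole physical GPU is exposed to this
    # VM, so it should appear as an ordinary CUDA device.
    if torch.cuda.is_available():
        for i in range(torch.cuda.device_count()):
            props = torch.cuda.get_device_properties(i)
            print(f"GPU {i}: {props.name}, {props.total_memory / 1e9:.1f} GB")
    else:
        print("No CUDA device visible; check the passthrough attachment.")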

GPUaaS Pool

A GPUaaS Pool deployment lets multiple users share a common pool of GPU resources. Each user's resources remain isolated from the others, which provides flexibility and maximizes utilization. This model suits environments with varied workloads and multiple teams.
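As a sketch of how shared pool capacity might be consumed through the Ray.io interface, the example below submits a task that requests one GPU; Ray schedules it on whichever pool GPU is free and reserves that GPU for the task's duration. The endpoint and the workload function are hypothetical.

    import ray

    ray.init(address="ray://gpu-pool.example.com:10001")  # placeholder endpoint

    @ray.remote(num_gpus=1)
    def run_on_pool_gpu():
        # Ray reserves one GPU from the shared pool for the duration of the
        # task and exposes it to the process via ray.get_gpu_ids().
        return ray.get_gpu_ids()

    print("Task ran on pool GPU(s):", ray.get(run_on_pool_gpu.remote()))
    ray.shutdown()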

Hybrid Approach

You can combine the GPUaaS Passthrough and GPUaaS Pool models. Users can allocate dedicated GPUs for critical workloads while covering the rest of their demand through flexible pool subscriptions. This approach balances cost efficiency with dedicated performance.