
#Install

#Requirements

#Hardware

  • At least two nodes with 16 cores and 32 GB of memory in total.
  • Additional resources for runtime serving are determined by your actual workload scale: for example, running 10 concurrent 7B-parameter LLM inference instances requires at least 10 GPUs plus the corresponding CPU, memory, disk storage, and object storage.
  • 200 GB of free disk space on each worker node.
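
To confirm that your nodes meet these minimums before proceeding, you can inspect node capacity with kubectl. The commands below are a minimal sketch, assuming kubectl is already configured against the target cluster; the disk path checked is illustrative and depends on where your container runtime stores data.

```bash
# List each node's allocatable CPU and memory as seen by the scheduler.
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu,MEMORY:.status.allocatable.memory

# On each worker node, verify free disk space. The path below is an
# assumption; check wherever your container runtime and images live.
df -h /var/lib
```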

#Software

  • CUDA Toolkit Version: 12.6 or higher.
INFO

If your GPU does not support CUDA 12.6, you can still use a lower version of the CUDA Toolkit. However, after deploying Alauda AI, you must add a custom inference runtime adapted to the older CUDA version, because the built-in vLLM inference runtime only supports CUDA 12.6 or later. Refer to Extend Inference Runtimes for the procedure.
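
You can check which CUDA version your driver supports directly on a GPU node using the standard NVIDIA tooling; a minimal sketch:

```bash
# Print the driver version and the maximum CUDA version the driver
# supports (shown in the header of the nvidia-smi output).
nvidia-smi

# Or query the driver version alone in machine-readable form.
nvidia-smi --query-gpu=driver_version --format=csv,noheader
```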

#Installing

Installing Alauda AI involves the following high-level tasks:

  1. Confirm and configure your cluster to meet all requirements. Refer to Pre-installation Configuration.
  2. Install Alauda AI Essentials. Refer to Install Alauda AI Essentials.
  3. Install Alauda AI. Refer to Install Alauda AI.

After completing these tasks, the core capabilities of Alauda AI are deployed. To get a quick feel for the product, refer to the Quick Start.
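
Once installation finishes, you can sanity-check the deployment from the command line. This is a minimal sketch, assuming the Alauda AI components run in a namespace named alauda-ai; the actual namespace in your installation may differ.

```bash
# Confirm the Alauda AI operator and component pods are Running.
# The namespace name below is an assumption; substitute your own.
kubectl get pods -n alauda-ai

# Watch rollout progress until all pods report Ready.
kubectl get pods -n alauda-ai -w
```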

Pre-installation Configuration

  • Deploy Service Mesh
  • Preparing the GitLab Service

Install Alauda AI Essentials

  • Downloading
  • Uploading
  • Installing

Install Alauda AI

  • Downloading
  • Uploading
  • Installing Alauda AI Operator
  • Creating Alauda AI Instance
  • Enabling Serverless Functionality
  • Replace GitLab Service After Installation