If your GPU does not support CUDA 12.6, you can still use an older version of the CUDA Toolkit. However, because the built-in vLLM inference runtime only supports CUDA 12.6 or later, after deploying Alauda AI you must add a custom inference runtime adapted for the older CUDA version. For details, refer to Extend LLM Inference Runtimes.
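Before deciding whether you need a custom runtime, you can check which CUDA version your node's NVIDIA driver supports. The sketch below is a hypothetical pre-install check (not part of the Alauda AI installer); it relies only on the standard `nvidia-smi` tool, whose banner reports the highest CUDA version the driver supports.

```shell
# Hypothetical pre-install check: find the CUDA version the host's NVIDIA
# driver supports. The built-in vLLM runtime requires CUDA 12.6 or later;
# older drivers need a custom inference runtime instead.
if command -v nvidia-smi >/dev/null 2>&1; then
  HAS_GPU=yes
  # The nvidia-smi banner reports the highest CUDA version the driver supports.
  CUDA_VERSION_LINE=$(nvidia-smi | grep -o 'CUDA Version: [0-9.]*')
  echo "$CUDA_VERSION_LINE"
else
  HAS_GPU=no
  echo "nvidia-smi not found; run this on a GPU node"
fi
```

Run this on each GPU node that will serve models. If the reported version is below 12.6, plan to add a custom inference runtime as described above.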
Installing Alauda AI involves the following high-level tasks:
After completing these tasks, the core capabilities of Alauda AI are deployed. To quickly explore the product, refer to the Quick Start.