
[DOC]: Environment installation failed #6066

Open
eccct opened this issue Sep 21, 2024 · 3 comments
Labels
documentation Improvements or additions to documentation

Comments

eccct commented Sep 21, 2024

📚 The doc issue

Windows 11, with Ubuntu 24.04 installed as a WSL2 subsystem.
Followed the guide at https://colossalai.org/zh-Hans/docs/get_started/installation
The exact steps were as follows:
export CUDA_INSTALL_DIR=/usr/local/cuda-12.1
export CUDA_HOME=/usr/local/cuda-12.1
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
export PATH="$CUDA_HOME/bin:$PATH"
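A quick sanity check that the exports above took effect can help before going further; this is a sketch assuming the toolkit lives at /usr/local/cuda-12.1 as in the steps above:

```shell
# Re-export the CUDA paths, then confirm they resolved as intended.
export CUDA_HOME=/usr/local/cuda-12.1
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:$LD_LIBRARY_PATH"
export PATH="$CUDA_HOME/bin:$PATH"

echo "CUDA_HOME=$CUDA_HOME"
# nvcc only resolves once the toolkit is actually installed at that path
command -v nvcc >/dev/null 2>&1 || echo "nvcc not on PATH yet"
```

If nvcc is still not found after the runfile install below, the version directory in CUDA_HOME likely does not match the installed one.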

conda create -n colo01 python=3.10
conda activate colo01
export PATH=~/miniconda3/envs/colo01/bin:$PATH

sudo apt update
sudo apt install gcc-10 g++-10
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 60
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 60
sudo update-alternatives --config gcc
gcc --version
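Note a possible mismatch in the steps above: apt installs gcc-10/g++-10, but update-alternatives registers gcc-12/g++-12, which may not exist on the system. A sketch that detects which major version is actually present (the candidate list 12/11/10 is an assumption) and prints the matching registration commands:

```shell
# Find an installed gcc major version and show the update-alternatives
# lines for it (printed rather than run, since they need sudo).
for v in 12 11 10; do
    if command -v "gcc-$v" >/dev/null 2>&1; then
        echo "found gcc-$v; register it with:"
        echo "  sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-$v 60"
        echo "  sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-$v 60"
        break
    fi
done
# Confirm which compiler is currently selected
command -v gcc >/dev/null 2>&1 && gcc --version | head -n1
```

Registering a gcc path that does not exist leaves /usr/bin/gcc dangling, which can silently break CUDA extension builds later.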

wget https://developer.download.nvidia.com/compute/cuda/12.1.0/local_installers/cuda_12.1.0_530.30.02_linux.run
sudo sh cuda_12.1.0_530.30.02_linux.run
Verify the CUDA installation: nvidia-smi

conda install nvidia/label/cuda-12.1.0::cuda-toolkit
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
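Before building ColossalAI it is worth confirming that the conda-installed PyTorch imports and sees CUDA; a minimal sketch, guarded so it only prints a message where torch is not importable:

```shell
# Print the torch version and whether torch can see a CUDA device.
python -c 'import torch; print(torch.__version__, torch.cuda.is_available())' \
    2>/dev/null || echo "torch not importable in this shell"
```

If torch.cuda.is_available() prints False under WSL2, the extension build and the benchmark below will both fail regardless of the rest of the setup.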

git clone https://github.com/hpcaitech/ColossalAI.git
cd ColossalAI

pip install -r requirements/requirements.txt
CUDA_EXT=1 pip install .

Install the related development libraries
pip install transformers
pip install xformers
pip install datasets tensorboard

Run the benchmark
Step 1: change directory
cd examples/language/llama/scripts/benchmark_7B
Edit gemini.sh
bash gemini.sh

Execution fails with the following error:
[rank0]: ImportError: FlashAttention2 has been toggled on, but it cannot be used due to the following error: the package flash_attn seems to be not installed. Please refer to the documentation of https://huggingface.co/docs/transformers/perf_infer_gpu_one#flashattention-2 to install Flash Attention 2.

Then installed FlashAttention-2 successfully:
pip install packaging
pip install ninja
ninja --version
echo $?
conda install -c conda-channel attention2
pip install flash-attn --no-build-isolation
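Since transformers only raises the FlashAttention2 error at runtime, a direct import check (a sketch, guarded against missing python) confirms whether flash-attn actually built against the active torch/CUDA pair:

```shell
# flash_attn fails at import time if it was built against a mismatched
# torch/CUDA; check it directly rather than via the benchmark script.
python -c 'import flash_attn; print(flash_attn.__version__)' \
    2>/dev/null || echo "flash_attn import failed (rebuild against the current torch)"
```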

Running bash gemini.sh again still fails. Please take a look based on the attached log file; it would also be great if the installation docs could be improved accordingly. Thanks!
gcc_nvidia-smi_pytorch_python
log.txt

eccct added the documentation label Sep 21, 2024

eccct (Author) commented Sep 21, 2024

Ran the benchmark again; after bash gemini.sh the system stalls at the compilation stage for a long time.
(screenshot: bash gemini.sh)
Used colossalai check -i to check the version compatibility and CUDA extension status in the current environment.
(screenshot: colossalai check output)
Please help analyze the cause, thanks!

Edenzzzz (Contributor) commented

You should use BUILD_EXT=1 pip install . and see if that compiles.
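A sketch of the suggested rebuild, assuming it is run from the ColossalAI repo root (guarded so it is skipped elsewhere); the BUILD_EXT flag appears to have replaced CUDA_EXT in newer ColossalAI versions:

```shell
# Rebuild ColossalAI with the CUDA extensions compiled at install time.
if [ -f setup.py ] && grep -qi colossalai setup.py 2>/dev/null; then
    pip uninstall -y colossalai || true   # drop the previous CUDA_EXT build first
    BUILD_EXT=1 pip install .             # compile CUDA kernels during install
    colossalai check -i                   # then verify the extension status
else
    echo "not in the ColossalAI source tree; skipping rebuild"
fi
```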
