
bug: some models failed to load if many GPU are selected #3662

Open
1 of 3 tasks
thonore75 opened this issue Sep 15, 2024 · 5 comments
Assignees
Labels
category: model running needs info This doesn't seem right, more information is requested os: windows Windows issues type: bug Something isn't working

Comments

@thonore75

Jan version

0.5.3

Describe the Bug

I imported many models, and some of them fail to load if I select both of my graphics cards (RTX 3060 12 GB).
If I unselect one of them, the model loads.

It would be great if the models list could indicate whether each model supports multi-GPU.

Steps to Reproduce

  1. Go to Settings -> Advanced Settings
  2. In "Choose device(s)", select 2 GPUs
  3. Go to "My Models"
  4. Select "Meta-Llama-3.1-8B-Instruct-128k-Q4_0" and start it -> NOT loaded!
  5. Go back to Advanced Settings
  6. Unselect one GPU in "Choose device(s)"
  7. Go to "My Models"
  8. Select "Meta-Llama-3.1-8B-Instruct-128k-Q4_0" and start it -> loaded!
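To see whether the second GPU ever receives any layers before the load fails in step 4, it may help to watch VRAM usage on both cards while the model starts. A minimal sketch, assuming the NVIDIA driver tools are on PATH:

```shell
# Poll both GPUs once per second while the model loads.
# memory.used should rise on BOTH GPUs if layers are being split;
# if one card stays flat just before the failure, that narrows things down.
nvidia-smi --query-gpu=index,name,memory.used,memory.total \
           --format=csv -l 1
```

Run this in a separate terminal before clicking "start" on the model, and stop it with Ctrl+C.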

Screenshots / Logs

No response

What is your OS?

  • MacOS
  • Windows
  • Linux
@thonore75 thonore75 added the type: bug Something isn't working label Sep 15, 2024
@imtuyethan imtuyethan self-assigned this Sep 18, 2024
@imtuyethan imtuyethan added the os: windows Windows issues label Sep 18, 2024
@imtuyethan
Contributor

imtuyethan commented Sep 19, 2024

Tested on

114 (windows-dev-tensorRT-LLM)
OS: Windows 11 Pro (Version 23H2, build 22631.4037)
CPU: AMD Ryzen Threadripper PRO 5955WX (16 cores)
RAM: 32 GB
GPU 1: NVIDIA GeForce RTX 3090
GPU 2: NVIDIA GeForce RTX 3090
Storage: 599 GB local disk (C:)

Results

  • I was able to run Mistral 8x7B Instruct Q4 (~24GB) with 2 GPUs turned on.
Screen.Recording.2024-09-19.at.4.20.40.PM.mov
  • I was able to run Aya 23 35B Q4 (~20GB) when using 2 GPUs as well:
Screen.Recording.2024-09-19.at.5.07.32.PM.mov
Screen.Recording.2024-09-19.at.4.23.42.PM.mov

Here are my app logs:

Screenshot 2024-09-19 at 4 29 43 PM

@imtuyethan imtuyethan assigned louis-jan and unassigned imtuyethan Sep 19, 2024
@imtuyethan imtuyethan added the needs info This doesn't seem right, more information is requested label Sep 19, 2024
@imtuyethan
Contributor

Quick check @thonore75: which models cannot be run on your end?

@thonore75
Author

thonore75 commented Sep 19, 2024

Here are the models I can launch with 1 GPU but not with 2:

  • CodeLlama-13b-Instruct-hf.Q8_0
  • CodeLlama-70b-Instruct-hf.i1-IQ4_XS
  • gpt4all-13b-snoozy-q4_0
  • gpt4all-falcon-newbpe-q4_0
  • Meta-Llama-3.1-8B-Claude-F16
  • Meta-Llama-3.1-8B-Instruct-128k-Q4_0
  • Meta-Llama-3.1-8B-Instruct.Q4_0
  • mistral-7b-openorca2.Q4_0.gguf
  • Nous-Hermes-2-Mistral-7B-DPO.Q4_0
  • orca-2-7b.Q4_0
  • Phi-3-mini-4k-instruct.Q4_0

app.log
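If Jan's inference backend here is llama.cpp (an assumption on my part), it could help to rule the backend in or out by loading one of the failing GGUF files directly with llama.cpp's `llama-cli`, toggling the split mode. A rough sketch; the model path is a placeholder for wherever the imported file lives:

```shell
# Single GPU only (placeholder model path -- substitute your own).
CUDA_VISIBLE_DEVICES=0 llama-cli -m Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf \
    -ngl 99 -p "Hello" -n 16

# Both GPUs, splitting layers across them.
CUDA_VISIBLE_DEVICES=0,1 llama-cli -m Meta-Llama-3.1-8B-Instruct-128k-Q4_0.gguf \
    -ngl 99 -sm layer -p "Hello" -n 16
```

If the second command also fails while the first succeeds, the problem is likely in the backend's multi-GPU path rather than in Jan itself.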

@thonore75
Author

thonore75 commented Sep 19, 2024

After my tests, I tried to play a video you posted here (in Google Chrome), but it would not play.
Jan was running with no model loaded; my last test was a model that failed to load.
Once I stopped Jan, I was able to play your videos.

@thonore75
Author

Jan Compatibility.xlsx
app - 1_GPU_1.log
app - 2_GPUs.log
app - CPU.log
app - 1_GPU_0.log

I did some extra tests!
Before each tested configuration, the log was cleaned so that each run has a separate log.
4 tested configurations:

  • CPU
  • GPU 0 selected
  • GPU 1 selected
  • GPU 0 & 1 selected

For some models, loading sometimes failed right after a loading issue with the previously tested model; but after first loading a working model, the previously failing model then loads.
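To compare the four logs quickly, a simple grep helper may be enough. The keyword list is a guess and should be tuned to whatever Jan actually writes in app.log:

```shell
# Print numbered lines that look like load failures in a log file.
# The keyword list is an assumption -- adjust to the real log wording.
scan_log() {
  grep -inE 'fail|error|cuda|out of memory' "$1"
}
```

Running `scan_log "app - 2_GPUs.log"` and `scan_log "app - 1_GPU_0.log"` side by side should make it easier to spot what differs between the single-GPU and dual-GPU runs.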

Projects
Status: Need Investigation