We have not yet adapted ChatGLM, but we will adapt these general models in the future.
Can I get an update or plan of the support of ChatGLM2/3? I mainly need inference instead of pretrain/finetune.
Is there an existing issue for this bug?
🐛 Describe the bug
I failed to run ChatGLM model with ColossalAI 0.3.6.
The backtrace is here:
KeyError Traceback (most recent call last)
Cell In[4], line 112
110 else:
111 print('Skip launch colossalai')
--> 112 benchmark_inference(
113 model_id,
114 "fp16",
115 max_input_len=max_input_len,
116 max_output_len=max_seq_len,
117 tp_size=tp_size,
118 batch_size=batch_size)
121 recorder.print()
Cell In[4], line 75, in benchmark_inference(model_id, dtype, max_input_len, max_output_len, tp_size, batch_size)
63 model = model.to(torch.bfloat16)
65 inference_config = InferenceConfig(
66 dtype=dtype,
67 max_batch_size=batch_size,
(...)
73 use_cuda_kernel=True,
74 )
---> 75 engine = InferenceEngine(model, tokenizer, inference_config, verbose=False)
77 generation_config = GenerationConfig(
78 pad_token_id=tokenizer.pad_token_id,
79 max_length=max_input_len + max_output_len,
80 # max_new_tokens=args.max_output_len,
81 )
82 tokens=gen_tokens(tokenizer, dataset, dataset_format)
File [~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:75], in InferenceEngine.__init__(self, model_or_path, tokenizer, inference_config, verbose, model_policy)
72 self.verbose = verbose
73 self.logger = get_dist_logger(name)
---> 75 self.init_model(model_or_path, model_policy)
77 self.generation_config = inference_config.to_generation_config(self.model_config)
79 self.tokenizer = tokenizer
File [~/.local/lib/python3.10/site-packages/colossalai/inference/core/engine.py:148], in InferenceEngine.init_model(self, model_or_path, model_policy)
146 else:
147 model_type = "nopadding_" + self.model_config.model_type
--> 148             model_policy = model_policy_map[model_type]
150 pg_mesh = ProcessGroupMesh(self.inference_config.pp_size, self.inference_config.tp_size)
151 tp_group = pg_mesh.get_group_along_axis(TP_AXIS)
KeyError: 'nopadding_chatglm'
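The failure mode is visible in the last frame: the engine builds a lookup key by prefixing the Hugging Face model type with "nopadding_" and indexes an internal policy map, so any model type not registered in that map raises a bare KeyError. A minimal sketch of that pattern, using a plain dict with hypothetical entries (not ColossalAI's actual map or policy names), plus a defensive lookup that produces a clearer error:

```python
# Illustrative sketch only -- the map contents and policy names below are
# hypothetical, not ColossalAI's real registry.
model_policy_map = {
    "nopadding_llama": "NoPaddingLlamaModelInferPolicy",
    "nopadding_baichuan": "NoPaddingBaichuanModelInferPolicy",
}

def resolve_policy(model_type: str) -> str:
    """Look up the inference policy for a model type, failing with a
    readable message instead of a bare KeyError."""
    key = "nopadding_" + model_type
    policy = model_policy_map.get(key)
    if policy is None:
        supported = ", ".join(sorted(model_policy_map))
        raise ValueError(
            f"Model type '{model_type}' has no inference policy; "
            f"registered policies: {supported}"
        )
    return policy

print(resolve_policy("llama"))       # found: returns the registered policy
try:
    resolve_policy("chatglm")        # missing: raises with a clear message
except ValueError as exc:
    print(exc)
```

Checking `model_policy_map` (or the library's documented list of supported models) before constructing `InferenceEngine` would let a benchmark script skip unsupported models instead of crashing.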
Environment
ColossalAI 0.3.6
PyTorch 2.3.1
CUDA 12.1
NV driver 545