Second request fails with an error #19

Closed
JKYtydt opened this issue Jul 18, 2024 · 3 comments


JKYtydt commented Jul 18, 2024

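Hello, the first request produces normal output, but the second request fails with an error, and the service on the master node also terminates.
Command used to run the worker node: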

CUDA_VISIBLE_DEVICES=3 ./cake-cli --model /sdc/pre_trained_model/Llama3-Chinese-8B-Instruct --mode worker --name worker0 --topology /sdc/jky/cake/topology.yml --address 0.0.0.0:10128

Command used to run the master node:

CUDA_VISIBLE_DEVICES=3,4,5,6,7 ./cake-cli --model /home/pre_trained_model/Llama3-Chinese-8B-Instruct --api 0.0.0.0:8080 --topology /home/jky/cake/topology.yml

The error is as follows:

thread 'tokio-runtime-worker' panicked at /sdc/jky/cake/cake-core/src/cake/worker.rs:215:26:
called `Result::unwrap()` on an `Err` value: cannot broadcast [29, 29] to [1, 32, 29, 170]
   0: candle_core::error::Error::bt
   1: candle_core::layout::Layout::broadcast_as
   2: candle_core::tensor::Tensor::broadcast_as
   3: cake_core::models::llama3::cache::Cache::apply_attention_mask
   4: cake_core::models::llama3::attention::CausalSelfAttention::forward
   5: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   6: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   7: tokio::runtime::task::core::Core<T,S>::poll
   8: tokio::runtime::task::harness::Harness<T,S>::poll
   9: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
  10: tokio::runtime::scheduler::multi_thread::worker::Context::run
  11: tokio::runtime::context::set_scheduler
  12: tokio::runtime::context::runtime::enter_runtime
  13: tokio::runtime::scheduler::multi_thread::worker::run
  14: tokio::runtime::task::core::Core<T,S>::poll
  15: tokio::runtime::task::harness::Harness<T,S>::poll
  16: tokio::runtime::blocking::pool::Inner::run
  17: std::sys_common::backtrace::__rust_begin_short_backtrace
  18: core::ops::function::FnOnce::call_once{{vtable.shim}}
  19: std::sys::pal::unix::thread::Thread::new::thread_start
  20: <unknown>
  21: <unknown>


Stack backtrace:
   0: anyhow::error::<impl core::convert::From<E> for anyhow::Error>::from
   1: <cake_core::models::llama3::transformer::Transformer as cake_core::cake::Forwarder>::forward::{{closure}}
   2: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   3: tokio::runtime::task::core::Core<T,S>::poll
   4: tokio::runtime::task::harness::Harness<T,S>::poll
   5: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run
   7: tokio::runtime::context::set_scheduler
   8: tokio::runtime::context::runtime::enter_runtime
   9: tokio::runtime::scheduler::multi_thread::worker::run
  10: tokio::runtime::task::core::Core<T,S>::poll
  11: tokio::runtime::task::harness::Harness<T,S>::poll
  12: tokio::runtime::blocking::pool::Inner::run
  13: std::sys_common::backtrace::__rust_begin_short_backtrace
  14: core::ops::function::FnOnce::call_once{{vtable.shim}}
  15: std::sys::pal::unix::thread::Thread::new::thread_start
  16: <unknown>
  17: <unknown>
stack backtrace:
   0: rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::result::unwrap_failed
   3: cake_core::cake::worker::Worker<G>::run::{{closure}}::{{closure}}
   4: tokio::runtime::task::core::Core<T,S>::poll
   5: tokio::runtime::task::harness::Harness<T,S>::poll
   6: tokio::runtime::scheduler::multi_thread::worker::Context::run_task
   7: tokio::runtime::scheduler::multi_thread::worker::Context::run
   8: tokio::runtime::context::set_scheduler
   9: tokio::runtime::context::runtime::enter_runtime
  10: tokio::runtime::scheduler::multi_thread::worker::run
  11: tokio::runtime::task::core::Core<T,S>::poll
  12: tokio::runtime::task::harness::Harness<T,S>::poll
  13: tokio::runtime::blocking::pool::Inner::run
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
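
The panic message `cannot broadcast [29, 29] to [1, 32, 29, 170]` describes a shape mismatch. Under the usual right-aligned broadcasting rule, each source dimension must either equal the matching target dimension or be 1, and here the last dimension is 29 on the mask side but 170 on the target side. One plausible reading, which is an assumption and not confirmed anywhere in this thread, is that the target's last dimension still counts tokens held in the KV cache from the first request, while the mask was built only for the 29 tokens of the new prompt. The sketch below does not call candle; `can_broadcast` is a hypothetical helper that reproduces only the shape check, written in Rust to match cake.

```rust
/// Hypothetical helper (not part of candle or cake): checks whether `src`
/// can be broadcast to `dst` under the usual right-aligned rule. Each source
/// dimension must equal the target dimension or be 1; missing leading
/// dimensions in `src` are treated as 1.
fn can_broadcast(src: &[usize], dst: &[usize]) -> Result<(), String> {
    // A source with more dimensions than the target can never be broadcast.
    if src.len() > dst.len() {
        return Err(format!("cannot broadcast {:?} to {:?}", src, dst));
    }
    // Compare the shapes starting from the last dimension.
    for (s, d) in src.iter().rev().zip(dst.iter().rev()) {
        if s != d && *s != 1 {
            return Err(format!("cannot broadcast {:?} to {:?}", src, dst));
        }
    }
    Ok(())
}
```

With the shapes from the log, `can_broadcast(&[29, 29], &[1, 32, 29, 170])` returns an `Err`, while a mask of shape `[29, 170]` would pass the same check.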

evilsocket (Owner) commented

Duplicate of #13.


angive commented Jul 24, 2024

Hello, I'm also encountering the same issue. The second request causes both the worker and master to crash.
[Two attached screenshots, not reproduced here]

evilsocket (Owner) commented

@angive #13
