How can I set parameters in llama (load model, create context, create batch, etc.) for best performance on Android? #9349
Unanswered
FranzKafkaYu asked this question in Q&A
Recently I updated the Android example from b3130 to b3400, and there are a lot of changes in the Android example.
I noticed that model loading changed: the original example (b3130) sets up `gpt_params`, while the new one doesn't, and performance dropped dramatically. With the same model and the same input, the original one takes about 1.5 s for inference, while the new one takes about 2.5 s.
So what are the right parameters to set for good performance when using the CPU backend on Android armv8?
Is there any detailed guidance for this?