Non-Printable Characters in Response from llama-server #9212
abhijeet-adarsh
Hi,
I’m encountering an issue with llama-server when I start it with the following command:
llama-server --hf-repo TheBloke/LlamaGuard-7B-GGUF --model llamaguard-7b.Q4_K_M.gguf -c 2048
I then make a POST request using curl:
curl -i --request POST \
  --url http://localhost:8080/v1/chat/completions \
  --header "Content-Type: application/json" \
  --data '{
    "messages": [
      {"role": "user", "content": "Building a website can be done in 10 simple steps:"}
    ],
    "n_predict": 128
  }'
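For anyone who wants to reproduce this without curl, here is a minimal Python sketch of the same request using only the standard library. It assumes the server is listening on the URL from the curl command above; the helper names `build_request` and `send` are hypothetical and exist only for this illustration.

```python
import json
import urllib.request

# Assumption: same endpoint as in the curl command above.
URL = "http://localhost:8080/v1/chat/completions"


def build_request(prompt, n_predict=128):
    """Build the POST request with the same JSON body as the curl call."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "n_predict": n_predict,
    }
    return urllib.request.Request(
        URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the server running, `send(build_request("Building a website can be done in 10 simple steps:"))` should return the parsed response, and printing `repr(...)` of the message content makes any escape sequences visible.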
The response I receive contains non-printable characters:
HTTP/1.1 200 OK
Access-Control-Allow-Origin:
Content-Length: 1058
Content-Type: application/json; charset=utf-8
Keep-Alive: timeout=5, max=5
Server: llama.cpp
{"choices":[{"finish_reason":"stop","index":0,"message":{"content":"\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c\u001c","role":"assistant"}}],"created":1724811142,"model":"gpt-3.5-turbo-0613","object":"chat.completion","usage":{"completion_tokens":128,"prompt_tokens":21,"total_tokens":149},"id":"chatcmpl-t9umdREs8QKw7gFdsfg5lCHJgpg5BkMa"}%
The content field in the response consists entirely of the repeated control character \u001c (U+001C), so the reply contains no readable text.
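To confirm which characters are involved, a small check like the following can be run on the raw response body. This is only a sketch: it assumes the response has the shape shown above (`choices[0].message.content`), and the function names `control_characters` and `check_reply` are made up for this example.

```python
import json
import unicodedata


def control_characters(text):
    """Return the distinct control characters (Unicode category Cc) in text,
    ignoring ordinary whitespace controls (newline, carriage return, tab)."""
    allowed = {"\n", "\r", "\t"}
    return sorted(
        {ch for ch in text if unicodedata.category(ch) == "Cc" and ch not in allowed}
    )


def check_reply(raw_json):
    """Parse a chat completion response and list control characters in the
    first choice's message content as code points such as 'U+001C'."""
    body = json.loads(raw_json)
    content = body["choices"][0]["message"]["content"]
    return [f"U+{ord(ch):04X}" for ch in control_characters(content)]
```

An empty list means the content holds no control characters other than newline, carriage return, and tab; a non-empty list names the offending code points.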
Why is the response including non-printable characters?
How can I fix this issue to ensure that the response contains readable content?
Are there specific server or configuration settings that need to be adjusted to address this?
Configuration Details:
Here is the relevant portion of my config.yml file:
models:
  - type: main
    engine: openai
    model: gpt-3.5-turbo-instruct
  - type: llama_guard
    engine: huggingface_hub
    parameters:
      repo_id: "TheBloke/LlamaGuard-7B-GGUF"
      huggingfacehub_api_token: "hf_"
rails:
  input:
    flows:
      - llama guard check input
  output:
    flows:
      - llama guard check output
I’d appreciate any guidance or suggestions on resolving this issue. Thanks in advance!