How to combine model and mmproj? #2122
Hi, I've been playing with Qwen3VL-8B-Instruct-Q8_0.gguf on a Win11 + conda + CUDA setup. Could anyone show me how to combine the mmproj GGUF with the model? This is what I tried:

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./Qwen3VL-8B-Instruct-Q8_0.gguf",
    mmproj_path="./mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf",
    n_ctx=1000,
    n_gpu_layer=-1,
    verbose=True,
)
```

Environment: llama-cpp-python 0.3.23.

Thank you in advance.
I guess you're using JamePeng's fork, since the version is 0.3.23. The interfaces of abetlen's main branch and the fork are incompatible, so you'll need to modify a few things to make it work. Anyway, the usage of mmproj is described in `class Llava15ChatHandler` in `llama_cpp/llama_chat_format.py`:

```python
class Llava15ChatHandler:
    # The constructor takes the path to the `mmproj.gguf` file.
    def __init__(self, clip_model_path, verbose):
        ...

    # The core logic communicating with libmtmd on the C++ side is defined here.
    def __call__(self, *args, **kwargs):
        ...
```

Basically, you can use mmproj by copy-and-pasting the core part of it, or by defining a class that inherits from it. The `__call__` method does the actual communication with libmtmd.
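For the inheritance route, here is a minimal untested sketch, assuming abetlen-style `Llava15ChatHandler` where subclasses (such as `Llava16ChatHandler`) override the Jinja2 `CHAT_FORMAT` class attribute; the subclass name and the template below are placeholders, not Qwen3VL's real chat template:

```python
# Sketch only: assumes Llava15ChatHandler exposes a Jinja2 CHAT_FORMAT
# class attribute that subclasses may override (as Llava16ChatHandler does
# in abetlen's llama_cpp/llama_chat_format.py).
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler


class Qwen3VLChatHandler(Llava15ChatHandler):  # hypothetical subclass name
    # Placeholder template for illustration; substitute the model's actual
    # chat template (image placeholders included) before real use.
    CHAT_FORMAT = (
        "{% for message in messages %}"
        "<|im_start|>{{ message.role }}\n"
        "{% if message.content is string %}{{ message.content }}{% endif %}"
        "<|im_end|>\n"
        "{% endfor %}"
        "<|im_start|>assistant\n"
    )


llm = Llama(
    model_path="./Qwen3VL-8B-Instruct-Q8_0.gguf",
    chat_handler=Qwen3VLChatHandler(
        clip_model_path="./mmproj-Qwen3VL-8B-Instruct-Q8_0.gguf"
    ),
    n_ctx=4096,
    n_gpu_layers=-1,
)
```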
A short code snippet to use Llava15ChatHandler (I haven't tested it):

```python
import base64

from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler


def image_to_base64(image_path):
    # Read the image file and encode it as a base64 string for a data URI.
    with open(image_path, "rb") as f:
        return base64.b64encode(f.read()).decode("utf-8")


def main():
    # The chat handler loads the mmproj (vision projector) model.
    chat_handler = Llava15ChatHandler(
        clip_model_path="./llm/mmproj-model-f16.gguf"
    )
    llm = Llama(
        model_path="./llm/gemma-3-4b-it-Q4_K_M.gguf",
        chat_handler=chat_handler,
        n_ctx=4096,
        n_gpu_layers=-1,
        logits_all=True,  # some versions need this for llava-style handlers
    )

    image_path = "./sample_image.png"
    image_base64 = image_to_base64(image_path)

    # Pass the image as a base64 data URI inside an OpenAI-style message.
    messages = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What is in the image?"},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_base64}"},
                },
            ],
        }
    ]

    print("Assistant: ", end="", flush=True)
    stream = llm.create_chat_completion(
        messages=messages,
        stream=True,
    )
    for chunk in stream:
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            print(delta["content"], end="", flush=True)
    print("\n")
    llm.close()


if __name__ == "__main__":
    main()
```
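If you don't need token-by-token output, the same setup works with a plain non-streaming call (reusing `llm` and `messages` from the snippet above):

```python
# Non-streaming variant: create_chat_completion returns the whole
# OpenAI-style response dict at once.
result = llm.create_chat_completion(messages=messages)
print(result["choices"][0]["message"]["content"])
```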