
Token limit exceeded on Qwen2.5 VL 7B Instruct


I am running inference with Qwen2.5 VL 7B like this, but when I encode the image as base64, the request exceeds the model's token limit (the base64 string is quite long).

from huggingface_hub import InferenceClient
import pyautogui
import base64
import pathlib

client = InferenceClient(
    api_key=api_key  # api_key is assumed to be defined elsewhere
)

path = pathlib.Path("screenshot.jpg")  # hypothetical local path for the screenshot
im1 = pyautogui.screenshot()
im1.save(path)

with open(path, "rb") as f:
    image = base64.b64encode(f.read()).decode("utf-8")

image = f"data:image/jpeg;base64,{image}"  # embed the image as a data URL

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": image
                }
            }
        ]
    }
]

stream = client.chat.completions.create(
    model="Qwen/Qwen2.5-VL-7B-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True
)

for chunk in stream:  # print the response as it streams in
    print(chunk.choices[0].delta.content or "", end="")

I couldn't pass a file path as the image input either, and I really don't want to upload the image to a hosting service just to get a URL. Is there a way to quickly make a vision call with this inference client?
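
For reference, here is a minimal sketch of one possible workaround, assuming the overflow comes from the sheer size of the base64 payload: downscale and recompress the screenshot with Pillow before encoding, so the resulting data URL is much shorter. The helper name and the size/quality values are illustrative, not part of any API.

from io import BytesIO
import base64

from PIL import Image

def to_small_data_url(path, max_side=1024, quality=70):
    # Hypothetical helper: shrink the image before base64-encoding it.
    img = Image.open(path).convert("RGB")
    img.thumbnail((max_side, max_side))  # downscale in place, preserving aspect ratio
    buf = BytesIO()
    img.save(buf, format="JPEG", quality=quality)  # recompress as JPEG
    b64 = base64.b64encode(buf.getvalue()).decode("utf-8")
    return f"data:image/jpeg;base64,{b64}"

image = to_small_data_url(path)  # use in place of the raw data URL built above

A smaller, recompressed JPEG can cut the base64 length by an order of magnitude, which may be enough to stay under the limit.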
