gemma - "PreTrainedTokenizerFast._batch_encode_plus() got an unexpected keyword argument 'images'"

I am trying to run the mlx-collection/gemma-3-4b-it-4bit model with the mlx-vlm library to do multi-image inference, but I get the traceback below and cannot figure out how to fix it.

I tried both single-image and multi-image inference, and the same error occurs in both cases.

Traceback (most recent call last):
  File "/Users/Administrator/Documents/create/controllers/VLM_on_Robotics/main.py", line 52, in <module>
    prediction, comp_time = phi3.generate(prompt, [images_PIL[0]])
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/controllers/VLM_on_Robotics/Llava_Phi3/phi3_mlx.py", line 60, in generate
    prediction = generate(self.model, self.processor, formatted_prompt, images, verbose=False)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/mlx_vlm/utils.py", line 1117, in generate
    for response in stream_generate(model, processor, prompt, image, **kwargs):
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/mlx_vlm/utils.py", line 1018, in stream_generate
    inputs = prepare_inputs(
             ^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/mlx_vlm/utils.py", line 814, in prepare_inputs
    inputs = processor(
             ^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2877, in __call__
    encodings = self._call_one(text=text, text_pair=text_pair, **all_kwargs)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 2987, in _call_one
    return self.encode_plus(
           ^^^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/transformers/tokenization_utils_base.py", line 3063, in encode_plus
    return self._encode_plus(
           ^^^^^^^^^^^^^^^^^^
  File "/Users/Administrator/Documents/create/venv/lib/python3.11/site-packages/transformers/tokenization_utils_fast.py", line 613, in _encode_plus
    batched_output = self._batch_encode_plus(
                     ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: PreTrainedTokenizerFast._batch_encode_plus() got an unexpected keyword argument 'images'

Since the traceback refers to an unexpected argument, I tried removing the images argument and running text-only inference, and the script works.

Does this mean there is a bug in how Gemma is implemented, such that vision tasks are not supported?
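In the traceback, the call made in mlx_vlm's prepare_inputs lands directly in transformers/tokenization_utils_base.py, which suggests the object being used as the processor is a plain PreTrainedTokenizerFast rather than an image-aware processor. A minimal sketch for checking this before calling generate is shown here. The helper names describe_processor and looks_text_only are hypothetical, and the check assumes that multimodal processors expose an image_processor attribute while plain tokenizers do not.

```python
def describe_processor(processor):
    """Summarize what kind of object was loaded as the 'processor'."""
    # Assumption: multimodal processors in transformers usually expose an
    # `image_processor` attribute, while a plain tokenizer does not.
    return {
        "class_name": type(processor).__name__,
        "has_image_processor": hasattr(processor, "image_processor"),
    }


def looks_text_only(processor):
    """Return True when the object appears unable to accept images."""
    return not describe_processor(processor)["has_image_processor"]
```

Printing describe_processor(self.processor) just before the generate call would show whether the class name is a tokenizer class. If it is, the images keyword has nowhere to go, which would be consistent with text-only inference working.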

asked Mar 24 at 16:33 by Tommaso Tubaldo, edited Mar 25 at 1:36 by President James K. Polk
  • Please provide enough code so others can better understand or reproduce the problem. – Community Bot, commented Mar 26 at 3:18

1 Answer

Problem solved: the issue is related to the transformers dependency.

See: https://github.com/Blaizzy/mlx-vlm/issues/274
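Because the fix concerns the transformers dependency, it may help to confirm which versions are actually installed in the virtual environment that runs the script. The following is a small sketch using only the standard library. The function names are hypothetical, and the distribution names "transformers" and "mlx-vlm" are assumed to match how the packages were installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


def report_versions(package_names=("transformers", "mlx-vlm")):
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in package_names}


if __name__ == "__main__":
    for name, found in report_versions().items():
        print(f"{name}: {found or 'not installed'}")
```

The reported versions can then be compared with the ones discussed in the linked issue. If transformers turns out to be outdated, upgrading it inside the same environment (for example with pip install -U transformers) is the usual next step.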
