I am trying to use a multimodal model from the Hugging Face Hub, specifically `maya-multimodal/maya`. This is the code I use to load the model:

```python
from llama_index.multi_modal_llms.huggingface import HuggingFaceMultiModal
from llama_index.core.schema import ImageDocument

model = HuggingFaceMultiModal.from_model_name("maya-multimodal/maya")
```
While loading the model I get the error shown below. I have two questions:
- How can I solve this error?
- What are some good quantized multimodal models on Hugging Face that I can use for image-to-text?
The error:

```
ValueError: The checkpoint you are trying to load has model type `llava_cohere` but Transformers does not recognise this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
```
I tried both of the suggested fixes:
- `pip install --upgrade transformers`
- `pip install git+https://github.com/huggingface/transformers.git`

but neither of them worked.
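To narrow things down, I also ran a quick check like the following (a sketch; it assumes `CONFIG_MAPPING` is importable from `transformers`, which is the case in recent releases) to see whether my installed Transformers build registers the `llava_cohere` model type at all:

```python
# Sketch: check whether the installed Transformers build knows the
# "llava_cohere" model type. If it is not registered, this install
# cannot load the checkpoint regardless of how it is downloaded.
import transformers
from transformers import CONFIG_MAPPING  # maps model_type -> config class

print("transformers version:", transformers.__version__)
print("llava_cohere registered:", "llava_cohere" in CONFIG_MAPPING)
```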