I am trying to extract a table from a PDF.
I was able to use the Le Chat feature of Mistral and get a super great result, but when I try to use the API to programmatically get the same result, I am not able to replicate it. I tried using the OCR API but it did not return anything and the chat completion feature does not seem to be able to recognize my PDF uploads.
I have tried the following code snippet:
from mistralai import Mistral
from mistralai.models import File
import os

api_key = "API_KEY"
client = Mistral(api_key=api_key)

uploaded_pdf = await client.files.upload_async(
    file=File(
        file_name="table.pdf",
        content=open("./table.pdf", "rb").read(),
    ),
    purpose="ocr",
)
signed_url = client.files.get_signed_url(file_id=uploaded_pdf.id)
ocr_response = client.ocr.process(
    model="mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": signed_url.url,
    },
)
The response contains no extracted text: the markdown is just a single image reference. Additionally, I tried this code snippet as well:
from mistralai import Mistral
from mistralai.models import File
import os

api_key = "API_KEY"
client = Mistral(api_key=api_key)

uploaded_pdf = await client.files.upload_async(
    file=File(
        file_name="table.pdf",
        content=open("./table.pdf", "rb").read(),
    ),
    purpose="ocr",
)
signed_url = client.files.get_signed_url(file_id=uploaded_pdf.id)

# Define the messages for the chat
messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Can you extract the data from this table from the PDF given."
            },
            {
                "type": "document_url",
                "document_url": signed_url.url,
            }
        ]
    }
]

# Get the chat response
chat_response = client.chat.complete(
    model="mistral-large-latest",
    messages=messages,
)
And the response I get is something along the lines of: I'm unable to directly access or view documents from URLs.
Can someone please let me know what I am doing wrong?
Asked Mar 9 at 15:12 by Shelly Liu

2 Answers
You are absolutely right.
When calling the OCR method using the 'mistral-ocr-latest' model, if you send an image or a PDF with embedded images, all you get in the markdown property is a "".
I've tried everything imaginable, from running in batch, using the upload method and even sending them as base64 strings. Added payment info and changed to the pay-as-you-go plan. No luck.
I guess there's something wrong with Mistral OCR, a bug or it's simply a scam.
The mistral-ocr API response does include the images for each page, as image_base64. Please check the response object you get from the API call, and make sure you set the parameter include_image_base64=True in the call. My API code snippet is below:
ocr_response = client.ocr.process(
    model=ocr_model,
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    },
    include_image_base64=True
)
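To check what actually came back, it can help to walk the pages and see whether each one holds real text or only an image placeholder. The following is a minimal sketch, not part of the SDK: it assumes the response has been turned into a plain dict (for example with `ocr_response.model_dump()`, assuming the SDK object is a pydantic model), and that each page carries `index`, `markdown`, and an `images` list whose items have `id` and `image_base64`. The helper names `page_is_image_only`, `decode_image`, and `summarize` are hypothetical.

```python
import base64
import re

# Matches markdown image placeholders such as ![img-0.jpeg](img-0.jpeg)
IMAGE_REF = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def page_is_image_only(markdown: str) -> bool:
    """True when the page markdown has no text beyond image placeholders."""
    return IMAGE_REF.sub("", markdown).strip() == ""


def decode_image(image_base64: str) -> bytes:
    """Decode an image string, with or without a 'data:...;base64,' prefix."""
    if image_base64.startswith("data:"):
        # Assumption: data-URI form, so the payload follows the first comma.
        image_base64 = image_base64.split(",", 1)[1]
    return base64.b64decode(image_base64)


def summarize(response: dict) -> list[dict]:
    """Report, per page, whether it is image-only and how many images carry data."""
    summary = []
    for page in response.get("pages", []):
        images = page.get("images") or []
        summary.append({
            "index": page.get("index"),
            "image_only": page_is_image_only(page.get("markdown", "")),
            "images_with_data": sum(1 for img in images if img.get("image_base64")),
        })
    return summary


if __name__ == "__main__":
    # Hand-built stand-in for a response dict; real responses come from the API.
    fake_image = "data:image/jpeg;base64," + base64.b64encode(b"fake").decode()
    sample = {
        "pages": [
            {
                "index": 0,
                "markdown": "![img-0.jpeg](img-0.jpeg)",
                "images": [{"id": "img-0.jpeg", "image_base64": fake_image}],
            }
        ]
    }
    for row in summarize(sample):
        print(row)
```

If a page shows up as image-only, the table was probably returned as a picture rather than as markdown text, and the decoded bytes from `decode_image` can be written to a file for inspection.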
Here's the response object: