ocr - I'm having trouble trying to convert image to text in python

I'm trying to extract the text from the attached image using the pytesseract and OpenCV libraries in Python, but the result is unsatisfactory: many characters are recognized incorrectly. Is there a way to make the output match the image exactly? The image I'm trying to convert is attached, and the code I'm using is below.

import pytesseract
import cv2

# Load the image
imagem = cv2.imread("imagem.png")

# Extract the text from the image (no preprocessing is applied here)
texto = pytesseract.image_to_string(imagem)

print(texto)

This is the output Python returns:

16x 16.86x 3.20x 4.35x 11.74x 1.24 44.33x 20.48x 1.48x 4.10x 134.28x 103K 2.04x 113x 115K 1.46x 1.81x

11.30x 14.31 133x 1.00x 185x 34.26x 16.41x 2.58x 3.34x 179x 789x 102K 1.96x 1.61x 2.07x 17.56x 2.44x

131K 145x 9.73x 11.74x 2.38x 437K 4.67x 2.35x 108K 6.05x 3.90x 5.71x 29212x 4.50x 2.07x 1.00x 3.42x

1004x 3.02x 1.22x 1.00x 2.37x 152x 5.27x 32.68x 1.52x

I hope to get a solution to my problem.


Asked Mar 19 at 15:18 by Cristi Garcia. Edited Mar 19 at 16:39 by Christoph Rackwitz.

Post your input image. Welcome to Stack Overflow. Please take the tour (stackoverflow/tour) and read the information guides in the help center (stackoverflow/help), in particular, "How to Ask A Good Question" (stackoverflow/help/how-to-ask), "What Are Good Topics" (stackoverflow/help/on-topic) and "How to create a Minimal, Reproducible Example" (stackoverflow/help/minimal-reproducible-example). – fmw42, Mar 19 at 15:25
Show us the picture. – Christoph Rackwitz, Mar 19 at 16:40

1 Answer

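Try preprocessing the image before passing it to Tesseract, and restrict the recognized characters with a whitelist: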

import pytesseract
import cv2
import numpy as np

# Load image
image = cv2.imread("imagem.png")

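# Preprocessing: grayscale, denoise, sharpen, then binarize
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blurred = cv2.medianBlur(gray, 5)  # 5x5 median filter to reduce speckle noise
kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]])  # 3x3 sharpening kernel
sharpened = cv2.filter2D(blurred, -1, kernel)
# Adaptive Gaussian threshold with block size 11 and constant C = 2
thresh = cv2.adaptiveThreshold(sharpened, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)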

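# Extract text: --psm 6 assumes a single uniform block of text, --oem 3 selects the
# default engine mode, and the whitelist limits output to digits, '.', 'x' and 'K'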
text = pytesseract.image_to_string(thresh, config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789.xK')

print(text)
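
Whatever preprocessing you end up with, it may help to check the OCR output automatically. The output posted in the question contains tokens such as 1.24 and 14.31 with no trailing x, which suggests a dropped character. The following is only a minimal sketch, assuming every value in the image is a number followed by x or K (an assumption based on the posted output, since the image itself was not shown). The names TOKEN_RE and find_suspicious_tokens are hypothetical helpers, not part of pytesseract.

```python
import re

# Assumed token shape (not confirmed by the image): digits, an optional
# decimal part, then a trailing 'x' or 'K'.
TOKEN_RE = re.compile(r"\d+(?:\.\d+)?[xK]")


def find_suspicious_tokens(ocr_text):
    """Return the whitespace-separated tokens that do not match TOKEN_RE."""
    return [tok for tok in ocr_text.split() if not TOKEN_RE.fullmatch(tok)]


# Start of the first output line posted in the question
sample = "16x 16.86x 3.20x 4.35x 11.74x 1.24 44.33x"
print(find_suspicious_tokens(sample))  # ['1.24']
```

This only catches tokens with the wrong shape. A value such as 113x, where a decimal point may have been lost, still matches the pattern, so compare those against the image by eye.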
