image processing - How do I extract faint dotted text using OpenCV python

I am trying to extract a serial number from a cue case. The serial number is embossed as dotted lines and is not very visible. (See the linked image sample.) I was able to extract the serial number, but the results were not reliable.

Basically, I am using edge detection to find the dotted lines and dilating them before feeding the result to EasyOCR.

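from typing import List

import cv2
import easyocr
import numpy as np

reader = easyocr.Reader(["en"], model_storage_directory="./model")

def process(image: bytes) -> List[str]:
    # Wrap the raw bytes so OpenCV can decode them.
    buff = np.frombuffer(image, np.uint8)
    # IMREAD_GRAYSCALE is the imdecode flag; COLOR_BGR2GRAY is a cvtColor code.
    img = cv2.imdecode(buff, cv2.IMREAD_GRAYSCALE)

    # Detect edges, then apply a morphological opening.
    edges = cv2.Canny(img, 100, 134)
    # NOTE: (8, 8) is a plain tuple used as the kernel itself, not an 8x8
    # structuring element; np.ones((8, 8), np.uint8) may be what was intended.
    edges = cv2.morphologyEx(edges, cv2.MORPH_OPEN, (8, 8))

    # Thicken the remaining edges so the dots merge into strokes.
    kernel = np.ones((4, 4), np.uint8)
    dilate = cv2.dilate(edges, kernel, iterations=4)

    # Each EasyOCR result is (bounding box, text, confidence); keep the text.
    result = reader.readtext(dilate)
    return [item[1] for item in result]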

Here is the result for the above code. It only partially works in this case. The serial number should be "2384A8K0280".

asked Mar 18 at 7:00 by Ikhwanul Labib; edited Mar 18 at 10:21 by Christoph Rackwitz

Those are 3D surface features, so you need to use light and shadow to make them visible. Then you could use gradient filters (Laplacian, Sobel), or, if you know the direction of the light, a specifically shaped filter that resembles those dents in the surface. I'd recommend that you don't use Canny here. Canny is a beginner trap. – Christoph Rackwitz, commented Mar 18 at 10:22

1 Answer

If you're just testing some ideas for a school project, you could create a vision system using a flashlight, cardboard, the camera on your phone, and such. If this were a project to inspect cue cases in a factory, or if it were very important not to misread serial numbers on cue cases that could be confused for one another, that would be a lot more work, and it could be relatively expensive.

I'll second Christoph Rackwitz's comment that the right light will help your application.

Your problem is similar to OCR applications for industrial automation, so I'll lean toward treating your cue case ID problem as one of those problems. A semi-formal approach may seem like extra work, but is likely to save you time and effort over the duration of the project.

First I'll describe a general process for tackling a problem like yours. Then I'll point out some problems with OCR applications to read serial numbers.

In order, try the following:

1. Consider whether there's some other way to solve the problem. OCR is a classical problem in that it's classic how irritating it can be to make robust. Your example is by no stretch the worst I've seen. However, if you want to solve the problem robustly with vision, then it'd be nice to have a 2D barcode accompany the human-readable text. That's the standard for many applications in a number of industries. More about this below. Don't make the problem harder than it really needs to be.
2. Write down your specifications. Don't just think about them, or say "uh-huh" and write code, but take the time to put the specifications in writing. Not just in industry, but in personal projects, a lack of concrete goals and specs can easily lead to failure and/or disappointment. For OCR, take a guess at how good your solution could possibly be. Set that as a stretch goal. Then set your immediate goal lower.
3. Try different types of lighting. You can do this even if the only light you have handy is your phone's built-in light or a flashlight. Cardboard, black paper, white paper, etc., will also help. The following is one of many online references about choosing appropriate lighting for an application: https://www.keyence/products/vision/vision-sys/resources/vision-sys-resources/basics-of-lighting-selection.jsp I hope that's enough to get you started.
4. Do NOT (yet) read about how others have solved the problem. You've already made a good start, and you have a solution that can be improved. Take a few suggestions posted here, then try your own solutions, and then read about how others have solved the problem. (There remains a strong bias in the field toward the "study a lot, then solve problems" approach, which I would claim has helped lead to some very expensive failures. YMMV.)
5. Establish baseline performance for your current solution. In step 2 you documented what your specs and goals are. Now, without changing your existing code, document how it performs. Document each attempted improvement. This doesn't have to be extensive documentation, but keep track. Don't judge your code or yourself harshly if something fails: just document the result, reflect on it, and move on.
6. Work through a list of algorithmic ideas. There are several ways to address a problem like yours. You'll get better advice if you provide the specifics I mentioned above: the purpose of your project, your goals, how many different cue cases there are to inspect, and so on. In any case, think of several different approaches to this problem. Write those down. (Don't just try to remember them.) Work through them.
7. Focus on improving the quality of the image processed by OCR. It's a good intuition that indentations can be identified as collections of "edge pixels," which suggests further that some sort of edge-finding algorithm would be useful. However, that isn't necessarily the best approach. Once you have good lighting, it may be sufficient to use morphology, or a kernel-based operator, or some more involved technique to render the characters more readable by whatever OCR algorithm/model you use. It depends on what your OCR algorithm/model can handle. With alphanumeric characters created as indented dots, a number of approaches have worked in the past. Which algorithmic approach works best will depend on your specs and goals, and also on how much variation in character quality you'll see across many samples (which could be hundreds or thousands of cases).
8. Restrict processing to just the image region with characters to read. If you can exclude any portion of the image certain not to include characters, do so. In OpenCV you can read about the many ways to define ROIs (regions of interest) and/or have an algorithm operate on a subimage. But for a truly robust solution, you'd want some method that could locate (and possibly rotate) a region of interest relative to other features, especially if the cue case isn't located in precisely the same position and orientation each time.
9. If there are any rules for the serial numbers, use those rules as logic in your code. For example, if there is at most one alphabetical character, and if that character will only occur in certain positions, then at some point you should encode that rule. "G" and "6" can be confused for one another. The numeral "0" and the letter "O" are easily confused in serial numbers of arbitrary alphanumeric characters.
10. Consider using multiple OCR algorithms/models. This would be something to try after you're reasonably satisfied with the results of an algorithmic chain of image cleanup + morphology (or whatever) + OCR processing. You can try different types of image cleanup, and feed each result into the same OCR function call or model, then write code to decide how to handle the results. You can read online about "voting schemes" that mix and match results from multiple algorithms. You might merely read about this step rather than implement it, as it can take experience with multiple projects before you develop a sense of when this technique would be appropriate.
11. When you reach your goals, stop. If you think you won't reach your goals, document the best you were able to do. If it's a work project and you need more help, consider hiring a consultant. One regular poster here works as a consultant and his company has its own vision library, but his name slips my mind at the moment. There is likely a good vision consultant working close-ish to wherever you live.
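
The serial-number rules and the voting idea from the list above can be sketched in a few lines of plain Python. Everything specific here is an assumption for illustration: the format (11 characters, letters only at indices 4 and 6) is guessed from the single sample "2384A8K0280" and is not a known rule, the look-alike tables are a starting point rather than a measured confusion list, and the names `apply_serial_rules` and `vote` are hypothetical.

```python
import string
from collections import Counter
from typing import List, Optional

# Hypothetical format, guessed from the one sample "2384A8K0280".
SERIAL_LENGTH = 11
LETTER_POSITIONS = {4, 6}

# Look-alike characters (assumed; extend after reviewing real misreads).
TO_DIGIT = {"O": "0", "Q": "0", "D": "0", "I": "1", "L": "1",
            "Z": "2", "S": "5", "G": "6", "B": "8"}
TO_LETTER = {"0": "O", "1": "I", "2": "Z", "5": "S", "6": "G", "8": "B"}

ALLOWED = set(string.ascii_uppercase + string.digits)


def apply_serial_rules(text: str) -> Optional[str]:
    """Return the read coerced to the assumed format, or None if it cannot fit."""
    # Drop spaces and punctuation that OCR may insert between characters.
    chars = [c for c in text.upper() if c in ALLOWED]
    if len(chars) != SERIAL_LENGTH:
        return None
    fixed = []
    for i, c in enumerate(chars):
        if i in LETTER_POSITIONS:
            c = TO_LETTER.get(c, c)
            valid = c in string.ascii_uppercase
        else:
            c = TO_DIGIT.get(c, c)
            valid = c in string.digits
        if not valid:
            return None  # Reject rather than guess.
        fixed.append(c)
    return "".join(fixed)


def vote(reads: List[str]) -> Optional[str]:
    """Per-position majority vote over the reads that pass the format rules.

    Ties go to the earliest read, because Counter keeps insertion order.
    """
    valid = [r for r in map(apply_serial_rules, reads) if r is not None]
    if not valid:
        return None
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*valid))
```

For example, `vote` could be given the text returned by the same OCR call on several differently preprocessed versions of one image. Returning `None` for a read that does not fit the format is a deliberate choice in this sketch: a failed read is usually cheaper than a confident misread.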

All that said, here are some things to keep in mind.

Reading alphanumeric serial numbers that are etched or cut or created by some means other than printing is a tough application for OCR. Yes, many such applications can be "solved" to some satisfactory degree, and even sold to a demanding customer (if that's the goal). However, it's typically easy to demonstrate how that solution will fail. Whenever relevant and possible, try to recast the problem so that 2D barcodes can be used. One advantage of 2D barcodes with built-in error correction is that they will either read correctly or fail to read, but not (in your lifetime) generate a misread. Long story. Data Matrix and QR Code are examples of such barcodes, but there are others.
If you or a customer expects better than 99% accuracy (by some definition) for an application like this, consider walking away from the project. The project may be feasible, but could prove painful. (I realize this may just be a hobby project, but this is an important lesson to learn early.) For a hobby project, 99% sounds great. For many industrial applications, 99% accuracy / success means the system may be worse than useless. Failure can be very expensive.
To be clear: it can take effort to make even barcode reading work 99.9% of the time or better in real-world conditions. Getting image processing to work in the lab can be easy. It's really helpful to figure out how many different ways a vision system can fail, and then try to make the system robust against those failures.
Accuracy for reading serial numbers could mean different things. Be sure to define what you mean clearly. For example, if it's a failure if any one character in a serial number is read incorrectly, then that's more stringent than (say) trying to read 99% of all characters across all cases correctly. And if two cases could differ by just one character--ouch.
Don't be afraid of making "mistakes." There's lots to learn. Problems like this one can be challenging even for people with some years of experience. Entire companies working on robotics + vision have wasted millions of U.S. dollars on approaches that are provably--and sometimes quickly demonstrated to be--really poor. But if they don't go bankrupt, they may end up with good technology.
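
The gap between per-character and per-serial accuracy mentioned above is easy to check numerically. The following is a small sketch under stated assumptions: the function name `accuracy` is hypothetical, and a read whose length differs from the ground truth is counted as having every character wrong, which is only one of several reasonable conventions.

```python
from typing import List, Tuple


def accuracy(pairs: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (serial-level accuracy, character-level accuracy).

    Each pair is (ground truth, OCR read). A serial counts as correct only
    if every character matches.
    """
    if not pairs:
        raise ValueError("need at least one (truth, read) pair")
    serial_hits = 0
    char_hits = 0
    char_total = 0
    for truth, read in pairs:
        serial_hits += truth == read
        char_total += len(truth)
        # Assumed convention: a length mismatch scores zero characters.
        if len(read) == len(truth):
            char_hits += sum(t == r for t, r in zip(truth, read))
    return serial_hits / len(pairs), char_hits / char_total


# Ten reads of an 11-character serial, five of them with one wrong character.
truth = "2384A8K0280"
pairs = [(truth, truth)] * 5 + [(truth, "2384A8K0230")] * 5
serial_acc, char_acc = accuracy(pairs)
print(f"serial-level: {serial_acc:.1%}, character-level: {char_acc:.1%}")
```

In this made-up data set, 105 of 110 characters are read correctly, yet only half of the serial numbers are, which is why the definition needs to be written into the specs.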

If you have sufficient interest, you could dig a bit and find references to "machine vision" systems for OCR. Look for white papers, brochures, and web pages about "direct part marks" created via dot peen, etching, various laser techniques, pin stamping, and so on.

Prior to the past decade, many such systems would have been called "machine vision" rather than "computer vision." (Long story.) Problems similar to yours have been solved well, and you might find inspiration by digging up old machine vision solutions.

You can also check out current hardware and/or software by companies like Cognex, Keyence, Microscan, MVTec, National Instruments, and so on. Sometimes you might use high-level commercial freeware to solve a problem first, then try to duplicate the solution with your own code using Python + OpenCV.

All this is TMI for your project, perhaps, but I've found over decades that some projects call for a long write-up rather than a few quick pointers. If you could get up to speed faster than was possible when I started out (before OpenCV existed), that'd be cool.

image processing - How do I extract faint dotted text using OpenCV python - Stack Overflow