I need to recognize text on car parts. I chose ML Kit for this, and it works perfectly with printed symbols and numbers (Photo 1).
Photo 1
However, when we tried to detect text on parts where the numbers have a somewhat non-standard appearance, it stopped working, even though the text is clearly visible (Photo 2).
Photo 2
Here is my code:
    AndroidView(
        factory = { ctx ->
            PreviewView(ctx).apply {
                implementationMode = PreviewView.ImplementationMode.COMPATIBLE
            }
        },
        modifier = Modifier.fillMaxSize()
    ) { previewView ->
        cameraProvider?.let { provider ->
            if (preview == null || imageAnalyzer == null) {
                preview = Preview.Builder()
                    .setTargetResolution(android.util.Size(1280, 720))
                    .build()
                imageAnalyzer = ImageAnalysis.Builder()
                    .setTargetResolution(android.util.Size(1280, 720))
                    .setBackpressureStrategy(ImageAnalysis.STRATEGY_KEEP_ONLY_LATEST)
                    .setOutputImageFormat(ImageAnalysis.OUTPUT_IMAGE_FORMAT_YUV_420_888)
                    .build()
                    .also { analysis ->
                        analysis.setAnalyzer(cameraExecutor) { imageProxy ->
                            val mediaImage: Image? = imageProxy.image
                            if (mediaImage != null) {
                                val image = InputImage.fromMediaImage(
                                    mediaImage,
                                    imageProxy.imageInfo.rotationDegrees
                                )
                                textRecognizer.process(image)
                                    .addOnSuccessListener { visionText ->
                                        // Keep only the blocks inside the scanning area.
                                        // boundingBox is nullable, so skip blocks without one
                                        // instead of forcing it with !!.
                                        val detectedText = visionText.textBlocks
                                            .filter { block ->
                                                block.boundingBox?.let { box ->
                                                    isInScanningArea(
                                                        box,
                                                        imageProxy.width,
                                                        imageProxy.height
                                                    )
                                                } ?: false
                                            }
                                            .joinToString(" ") { it.text }
                                        // Look for hyphen-separated alphanumeric codes.
                                        val regex = Regex("\\b[A-Za-z0-9]+(-[A-Za-z0-9]+)+\\b")
                                        val extractedText = regex.find(detectedText)?.value
                                        if (!extractedText.isNullOrEmpty() && !findResultScan) {
                                            findResultScan = true
                                            resultScan = extractedText
                                        }
                                    }
                                    // Release the frame whether recognition succeeded or failed.
                                    .addOnCompleteListener { imageProxy.close() }
                            } else {
                                // Without this, a frame with no image is never closed
                                // and the analyzer stops receiving new frames.
                                imageProxy.close()
                            }
                        }
                    }
                try {
                    provider.unbindAll()
                    camera = provider.bindToLifecycle(
                        lifecycleOwner,
                        CameraSelector.DEFAULT_BACK_CAMERA,
                        preview,
                        imageAnalyzer
                    )
                    preview?.surfaceProvider = previewView.surfaceProvider
                } catch (e: Exception) {
                    e.printStackTrace()
                }
            }
        }
    }
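For reference, here is a minimal standalone sketch of what the regex in the analyzer accepts. The sample strings are made up for illustration and are not real recognizer output. It only shows that the pattern requires at least one hyphen between alphanumeric groups, so a code that comes back with spaces instead of hyphens would be dropped at this step even when recognition itself succeeds.

```kotlin
// Same pattern as in the analyzer code.
val partNumberRegex = Regex("\\b[A-Za-z0-9]+(-[A-Za-z0-9]+)+\\b")

// Returns the first hyphen-separated code found in the text, or null.
fun extractPartNumber(detectedText: String): String? =
    partNumberRegex.find(detectedText)?.value

fun main() {
    // Made-up sample strings, not actual ML Kit results.
    println(extractPartNumber("PART 8K0-941-003 LH")) // 8K0-941-003
    println(extractPartNumber("PART 8K0 941 003 LH")) // null
}
```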
isInScanningArea: here I'm trying to find text specifically within the scanning area. But even if I print everything the camera sees (visionText.text), it still couldn't detect what I need.
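The body of isInScanningArea is not shown in the question, so the following is only a sketch of what such a check might look like, under several assumptions: the scanning area is a centered window covering a fraction of the frame, Box is a stand-in for android.graphics.Rect so the snippet runs without Android, and the rotationDegrees, widthFraction and heightFraction parameters are hypothetical. It also assumes that ML Kit reports bounding boxes in the upright (rotated) image, which would mean width and height have to be swapped when the rotation is 90 or 270 degrees.

```kotlin
// Stand-in for android.graphics.Rect so the sketch runs without Android.
data class Box(val left: Int, val top: Int, val right: Int, val bottom: Int)

// Hypothetical check: true when the box lies entirely inside a centered
// window covering widthFraction x heightFraction of the upright frame.
fun isInScanningArea(
    box: Box?,
    imageWidth: Int,
    imageHeight: Int,
    rotationDegrees: Int = 0,
    widthFraction: Double = 0.8,
    heightFraction: Double = 0.4
): Boolean {
    if (box == null) return false
    // Assumption: boxes are in upright coordinates, so swap the
    // buffer dimensions when the frame is rotated by 90 or 270 degrees.
    val rotated = rotationDegrees == 90 || rotationDegrees == 270
    val w = if (rotated) imageHeight else imageWidth
    val h = if (rotated) imageWidth else imageHeight
    // Edges of the centered scanning window.
    val left = w * (1 - widthFraction) / 2
    val top = h * (1 - heightFraction) / 2
    val right = w - left
    val bottom = h - top
    return box.left >= left && box.top >= top &&
        box.right <= right && box.bottom <= bottom
}

fun main() {
    val box = Box(400, 300, 800, 400)
    println(isInScanningArea(box, 1280, 720, rotationDegrees = 0))  // true
    println(isInScanningArea(box, 1280, 720, rotationDegrees = 90)) // false
}
```

If the real helper compares boxes against imageProxy.width and imageProxy.height without considering rotation, a mismatch like the second call could filter out valid blocks, though that would not explain visionText.text itself being empty.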
I tried looking for other solutions and even tested ready-made solutions on ML Kit that specialize in text recognition, but without success.
Things are better on iOS: it was able to detect the text in Photo 2. Even Google Translate on Android managed to recognize it.