
MBS Plugin Advent calendar: 5 - WindowsOCR

Door 5 - WindowsOCR

Fact of the day
If you want to program a cross-platform solution or need text recognition for older Windows versions, you can use functions of the Tesseract component from the MBS FileMaker Plugin.

In Door 2 we already introduced a way to perform text recognition on the Mac. Text recognition in the MBS FileMaker Plugin is not limited to the Mac, though: the plugin also lets you use Windows' own OCR functions. These WindowsOCR functions are available on Windows 10 and 11.

To test whether the functions can be used on the current operating system, call the WindowsOCR.Available function. If they are available, first create a new OCR engine with WindowsOCR.New. This function returns a reference number that you pass to the other functions to address the OCR engine. You can optionally specify which language the engine should recognize. If you omit this parameter, the language returned by the WindowsOCR.CurrentInputMethodLanguageTag function is used automatically. Which languages can be recognized depends on which languages are installed on your system. You can get a list of the languages currently usable on your system with WindowsOCR.AvailableRecognizerLanguages.

Show Custom Dialog [ "Supported languages" ; MBS( "WindowsOCR.AvailableRecognizerLanguages" ) ]

You can now choose whether to recognize text from an image in a container or from an image file. For an image in a container, use the WindowsOCR.Recognize function and pass the reference number and the container that holds the image as parameters.

If [ MBS( "WindowsOCR.Available" ) ] 
	Set Variable [ $OCRen ; Value: MBS( "WindowsOCR.New" ; "en-US" ) ] 
	...
	Set Variable [ $r ; Value: MBS( "WindowsOCR.Recognize"; $OCRen; DoorFive::Container ) ] 
...
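
The steps described so far can be combined into one script. The following is only a sketch under a few assumptions: that WindowsOCR.AvailableRecognizerLanguages returns a return-separated list of language tags, that errors are detected with the plugin's generic MBS( "IsError" ) check, and that the engine is freed with WindowsOCR.Release, which this article does not cover. The field DoorFive::Container is taken from the example above.

```filemaker
If [ MBS( "WindowsOCR.Available" ) ≠ 1 ]
	Show Custom Dialog [ "WindowsOCR" ; "WindowsOCR is not available on this system." ]
	Exit Script [ Text Result:  ]
End If
# Assumption: the list contains one language tag per line
Set Variable [ $languages ; Value: MBS( "WindowsOCR.AvailableRecognizerLanguages" ) ]
If [ IsEmpty ( FilterValues ( $languages ; "en-US" ) ) ]
	Show Custom Dialog [ "WindowsOCR" ; "The language en-US is not installed." ]
	Exit Script [ Text Result:  ]
End If
Set Variable [ $OCRen ; Value: MBS( "WindowsOCR.New" ; "en-US" ) ]
Set Variable [ $r ; Value: MBS( "WindowsOCR.Recognize" ; $OCRen ; DoorFive::Container ) ]
If [ MBS( "IsError" ) ]
	# On failure, $r holds the plugin's error message
	Show Custom Dialog [ "Recognition failed" ; $r ]
End If
# Query the result here, before the engine is released
# Assumption: WindowsOCR.Release frees the engine when it is no longer needed
Set Variable [ $r ; Value: MBS( "WindowsOCR.Release" ; $OCRen ) ]
```

Checking the language list first avoids creating an engine for a language that is not installed.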

Alternatively, you can recognize the text of an image stored in a file. To do this, use the WindowsOCR.RecognizeFile function and pass the path to the file instead of the container.

...
	Set Variable [ $r ; Value: MBS( "WindowsOCR.RecognizeFile"; $OCRen; DoorFive::Path ) ] 
	...
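
If a record may hold either an image in the container or only a file path, a script could choose between the two calls. This is a minimal sketch that assumes $OCRen already holds an engine reference and reuses the DoorFive::Container and DoorFive::Path fields from the examples above. Preferring the container over the path is an arbitrary choice made for illustration.

```filemaker
If [ not IsEmpty ( DoorFive::Container ) ]
	# Prefer the image stored in the container
	Set Variable [ $r ; Value: MBS( "WindowsOCR.Recognize" ; $OCRen ; DoorFive::Container ) ]
Else If [ not IsEmpty ( DoorFive::Path ) ]
	# Otherwise fall back to the image file on disk
	Set Variable [ $r ; Value: MBS( "WindowsOCR.RecognizeFile" ; $OCRen ; DoorFive::Path ) ]
Else
	Show Custom Dialog [ "WindowsOCR" ; "No image or file path available." ]
End If
```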

You can now query the result, which is available in two forms. If you only want the plain text, use WindowsOCR.Text; it returns the text, which you can then store in a field, for example. If you need more than the plain text, use the WindowsOCR.Result function. It returns a detailed JSON object that lists the individual lines and words with their position and size.

...
	Set Variable [ $Text ; Value: MBS( "WindowsOCR.Text"; $OCRen ) ] 
	...
	Set Variable [ $JSON ; Value: MBS( "WindowsOCR.Result"; $OCRen ) ] 
	...

Here is the JSON generated for a sample image:

{
	"Text":	"WindowsOCR: Functions for OCR in Windows 10 or 11.",
	"TextAngle":	0,
	"LineCount":	1,
	"Lines":	[
		{
			"Text":	"WindowsOCR: Functions for OCR in Windows 10 or 11.",
			"WordCount":	9,
			"X":	19,
			"Y":	9,
			"Width":	2139,
			"Height":	61,
			"Words":	[
				{
					"Text":	"WindowsOCR:",
					"X":	19,
					"Y":	9,
					"Width":	538,
					"Height":	61
				}, 
				{
					"Text":	"Functions",
					"X":	604,
					"Y":	11,
					"Width":	364,
					"Height":	59
				}, 
				{
					"Text":	"for",
					"X":	1000,
					"Y":	9,
					"Width":	107,
					"Height":	61
				}, 
				{
					"Text":	"OCR",
					"X":	1137,
					"Y":	10,
					"Width":	167,
					"Height":	60
				}, 
				{
					"Text":	"in",
					"X":	1337,
					"Y":	11,
					"Width":	58,
					"Height":	58
				}, 
				{
					"Text":	"Windows",
					"X":	1432,
					"Y":	9,
					"Width":	343,
					"Height":	61
				}, 
				{
					"Text":	"10",
					"X":	1815,
					"Y":	10,
					"Width":	84,
					"Height":	60
				}, 
				{
					"Text":	"or",
					"X":	1935,
					"Y":	24,
					"Width":	78,
					"Height":	46
				}, 
				{
					"Text":	"11.",
					"X":	2049,
					"Y":	11,
					"Width":	109,
					"Height":	58
				}
			]
		}
	],
	"TextLines":	"WindowsOCR: Functions for OCR in Windows 10 or 11.\r"
}
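
Individual values can be read from this JSON with FileMaker's built-in JSONGetElement function. The sketch below walks over the words of the first line and collects each word with its position. It assumes $JSON holds a result structured like the one above; the variable names $i, $path, $wordCount and $words are made up for illustration.

```filemaker
Set Variable [ $wordCount ; Value: JSONGetElement ( $JSON ; "Lines[0].WordCount" ) ]
Set Variable [ $i ; Value: 0 ]
Loop
	Exit Loop If [ $i ≥ $wordCount ]
	# Path to the current word object, e.g. Lines[0].Words[3]
	Set Variable [ $path ; Value: "Lines[0].Words[" & $i & "]" ]
	Set Variable [ $word ; Value: JSONGetElement ( $JSON ; $path & ".Text" ) ]
	Set Variable [ $x ; Value: JSONGetElement ( $JSON ; $path & ".X" ) ]
	Set Variable [ $y ; Value: JSONGetElement ( $JSON ; $path & ".Y" ) ]
	# Collect one tab-separated entry per word: text, X, Y
	Set Variable [ $words ; Value: $words & $word & Char ( 9 ) & $x & Char ( 9 ) & $y & ¶ ]
	Set Variable [ $i ; Value: $i + 1 ]
End Loop
```

For the sample JSON, the loop runs nine times, once per word. To process several lines, an outer loop over LineCount could wrap this one in the same way.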

The JSON also contains the angle of the text. You can query this text angle separately with WindowsOCR.TextAngle, for example to straighten the text.

...
	Set Variable [ $Angle ; Value: MBS( "WindowsOCR.TextAngle"; $OCRen ) ] 
...

I hope you enjoyed this article. Have fun recognizing your texts.


05 12 23 - 08:37