Tesseract.js Brings Image OCR to Browsers

OCR still isn’t perfect, but it has improved dramatically over the past few years. Leading the way is Tesseract, an open source OCR engine written in C++.

While this is an incredible library, it’s restricted to native software. Thankfully, someone ported Tesseract to JavaScript as Tesseract.js. It supports up to 60 languages, and while it’s certainly not perfect, it does the job well.

Installation and setup are a breeze: you can target any image element on the page and pass it to the Tesseract.recognize() function. This can take any type of image, and it’ll automatically process it and recognize the text right in the browser.

You can get a lot more complicated, but the beauty is that you can run OCR with a single line of code.

Check out the Tesseract.js landing page if you want to see a live demo. This works right in the browser, where you can drag & drop any scanned image of text to get the recognized text automatically.

You can also download this example locally from the GitHub page, or you can build your own app by including the Tesseract.js script right from a CDN.

The simplest code example looks like the following, where myImage is a direct reference to an HTML image element:

Tesseract.recognize(myImage).then(function(result){
    console.log(result)
});
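If you only want the recognized text and some basic error handling, a small wrapper along these lines may help. This is a rough sketch rather than anything from the Tesseract.js docs: the helper names extractText and recognizeText are made up for illustration, and it assumes the result object carries the text either as result.text or as result.data.text, which can differ between library versions.

```javascript
// Hypothetical helper: pull the plain text out of an OCR result object.
// Assumption: the text lives at result.text or result.data.text,
// depending on the library version in use.
function extractText(result) {
  if (result && typeof result.text === 'string') {
    return result.text.trim();
  }
  if (result && result.data && typeof result.data.text === 'string') {
    return result.data.text.trim();
  }
  return '';
}

// Hypothetical helper: run OCR and resolve with just the text.
// `recognize` is passed in as a parameter so the helper is not tied to a
// global; it should be a function that takes an image and returns a result
// or a promise of one.
function recognizeText(recognize, image) {
  return new Promise(function (resolve) {
    // Wrapping the call here also captures synchronous throws.
    resolve(recognize(image));
  })
    .then(extractText)
    .catch(function (err) {
      console.error('OCR failed:', err);
      return ''; // fall back to an empty string on failure
    });
}
```

In the browser you could call it as recognizeText(function (img) { return Tesseract.recognize(img); }, myImage) and then use the resolved string however you like. Wrapping the call in a function keeps Tesseract.recognize attached to its own object.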

Either way, this library is a big help for getting started with OCR on the web. It’s far from perfect, but it’s also the best resource for web developers who want dynamic in-page OCR functionality.

To learn more, visit the Tesseract.js GitHub page, where you can check out a live demo and browse through the online documentation.
