Script

Do you ever wonder if AI understands comic books? We did. But first, we need to understand how AI reads text and, most importantly, how it sees and understands images.

How they see

AI can see and make sense of the images in a comic, which are separated by gutters, by using computer vision to track the elements that make up a character. It can attempt to understand the context of those images through a degree of inference, although not at the level we humans are capable of. With enough trial and error, it gets faster at interpreting the images presented to it. But it isn’t easy. There is a lot for AI to observe and comprehend in a single image, such as a character’s facial expression, the setting and location, and the kinds of movement visible within the panel.
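To make the idea of panels "separated by gutters" concrete, here is a toy sketch. It assumes a page is a small grid of pixels where 1 is white and 0 is ink, and that a gutter is any column that is white from top to bottom. The function name find_panels and the sample page are made up for illustration. Real computer vision systems rely on learned models, not a rule this simple.

```python
def find_panels(page):
    """Return (first_column, last_column) for each panel on the page.

    Assumption: a gutter is a column whose pixels are all white (1).
    """
    width = len(page[0])
    is_gutter = [all(row[x] == 1 for row in page) for x in range(width)]

    panels, start = [], None
    for x, gutter in enumerate(is_gutter):
        if not gutter and start is None:
            start = x                      # a panel begins here
        elif gutter and start is not None:
            panels.append((start, x - 1))  # the panel ended just before the gutter
            start = None
    if start is not None:                  # a panel that runs to the page edge
        panels.append((start, width - 1))
    return panels


# Hypothetical 3x6 page: column 2 is entirely white, so it acts as a gutter.
page = [
    [0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0, 0],
]
print(find_panels(page))  # [(0, 1), (3, 5)]
```

Once the page is split this way, each panel could be handed to a separate recognition step, such as one that looks for faces or text.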

In 2012, researchers improved AI’s ability to interpret images by using a neural network trained on an immense number of images from ImageNet. The model learned to recognize images from this single database (Facebook’s automatic tagging feature is a result of this research effort).

Source: https://www.wired.com/story/ai-can-recognize-images-but-understand-headline/

How they read

In essence, the way an AI learns to read begins with associating each letter of the alphabet with a binary code. By feeding the AI the contents of multiple dictionaries along with ASCII values (ASCII, the American Standard Code for Information Interchange, assigns each character a number), it can start to learn where words are placed and how sentences are formed. In 2018, Microsoft Research Asia announced that, using the Stanford Question Answering Dataset, its own machine reading comprehension model was able to match human reading comprehension (source: https://blogs.microsoft.com/ai/microsoft-creates-ai-can-read-document-answer-questions-well-person/).
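The letter-to-binary step described above can be sketched in a few lines of Python. This is only an illustration of ASCII encoding, not of how a reading comprehension model is trained. The function name to_ascii_bits and the sample word are chosen for this example.

```python
def to_ascii_bits(text):
    """Return (character, ASCII code, 8-bit binary string) for each character."""
    return [(ch, ord(ch), format(ord(ch), "08b")) for ch in text]


# A sound effect you might find in a comic panel.
for ch, code, bits in to_ascii_bits("POW!"):
    print(ch, code, bits)
# P 80 01010000
# O 79 01001111
# W 87 01010111
# ! 33 00100001
```

Every character becomes a number, and every number becomes a pattern of ones and zeros, which is the form a computer actually stores.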

How they see and read comic books

Thanks to Mohit Iyyer and his research team (https://arxiv.org/abs/1611.05118), we finally have an idea of how AI reads and understands comic books. In their research they tried three OCR systems (optical character recognition is basically the conversion of images of text, or printed text, into machine-encoded text): the open-source Tesseract and Ocular, and Google’s Cloud Vision OCR. They found that Google’s Cloud Vision performed better at reading comic text. After they fed their AI about 4,000 comic books published between the late 1930s and the mid-1950s as a reference, it began to develop a better understanding of how to read other comics.

AI isn’t perfect, but at the rate technology is advancing, it will eventually learn to master this, just as we humans have been able to program our current AI and let it gradually improve at tracking, recognition, and basic reading.
