Imagine an artificial intelligence (AI) model that can observe and understand moving images with the subtlety of a human brain. Now, scientists at Scripps Research have made that a reality by creating MovieNet: an innovative AI that processes videos the same way our brains interpret real-life scenes as they unfold over time.
This brain-inspired AI model, detailed in a study published in the Proceedings of the National Academy of Sciences on November 19, 2024, perceives moving scenes by simulating how neurons – or brain cells – make sense of the world in real time. Conventional AI excels at recognizing still images, but MovieNet introduces a method for machine learning models to recognize complex, changing scenes, a breakthrough that could transform areas from medical diagnostics to autonomous driving, where detecting subtle changes over time is crucial. MovieNet is also more accurate and more environmentally sustainable than conventional AI.
“The brain doesn’t just see still images; it creates a continuous visual narrative. Recognizing static images has come a long way, but the brain’s ability to process fluid scenes, like watching a movie, requires a much more sophisticated form of pattern recognition. By studying how neurons capture these sequences, we were able to apply similar principles to AI.”

Hollis Cline, Ph.D., senior author, director of the Dorris Neuroscience Center and Hahn Professor of Neuroscience at Scripps Research
To create MovieNet, Cline and first author Masaki Hiramoto, a researcher at Scripps Research, examined how the brain processes real-world scenes in short sequences, similar to movie clips. Specifically, the researchers studied how neurons in tadpoles responded to visual stimuli.
“Tadpoles have a very good visual system and we know that they can detect and respond effectively to moving stimuli,” says Hiramoto.
He and Cline identified neurons that respond to film-like features, such as changes in brightness and image rotation, and can recognize objects as they move and change. Located in the visual processing region of the brain known as the optic tectum, these neurons assemble parts of a moving image into a coherent sequence.
Think of this process as a lenticular puzzle: each piece alone may not make sense, but together they form a complete moving picture. Different neurons process various “puzzle pieces” of a real-life moving image, which the brain then integrates into a continuous scene.
The researchers also found that neurons in the tadpoles’ optic tectum distinguished subtle changes in visual stimuli over time, capturing information in dynamic clips of about 100 to 600 milliseconds rather than in still images. These neurons are very sensitive to patterns of light and shadow, and each neuron’s response to a specific part of the visual field helps construct a detailed map of a scene to form a “video clip.”
Cline and Hiramoto trained MovieNet to emulate this brain-like processing and encode video clips as a series of small, recognizable visual cues. This allowed the AI model to distinguish subtle differences between dynamic scenes.
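The study does not publish the model's code here, but the core idea of encoding a video as a sequence of short, recognizable cues can be sketched roughly as follows. This is a minimal illustration, not the authors' actual model: the frame representation, the window length, and the brightness-change cue are all assumptions, loosely echoing the 100–600 millisecond clips the tadpole neurons were found to respond to.

```python
# Sketch: encode a "video" (a list of frames, each a grid of brightness
# values) as a sequence of short temporal windows, then summarize each
# window with simple cues. Window size and cue choice are illustrative
# assumptions, not the published MovieNet architecture.

def frame_brightness(frame):
    """Average brightness of one frame (a grid of 0-255 values)."""
    values = [v for row in frame for v in row]
    return sum(values) / len(values)

def encode_clip(frames, window=4):
    """Turn a frame sequence into a list of (mean, delta) cues,
    one per non-overlapping temporal window."""
    cues = []
    for start in range(0, len(frames) - window + 1, window):
        chunk = [frame_brightness(f) for f in frames[start:start + window]]
        mean = sum(chunk) / len(chunk)
        delta = chunk[-1] - chunk[0]  # net brightness change over the window
        cues.append((round(mean, 2), round(delta, 2)))
    return cues

# Toy example: an 8-frame "clip" of 1x1 frames whose brightness ramps up.
clip = [[[10 * t]] for t in range(8)]
print(encode_clip(clip))  # → [(15.0, 30.0), (55.0, 30.0)]
```

A downstream classifier could then compare these compact cue sequences instead of raw pixels, which is one way the data and compute savings described below could arise.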
To test MovieNet, the researchers showed it video clips of tadpoles swimming in different conditions. Not only did MovieNet achieve an accuracy of 82.3 percent in distinguishing normal from abnormal swimming behavior, but it also exceeded the capabilities of trained human observers by approximately 18 percent. It even outperformed existing AI models such as Google’s GoogLeNet, which achieved only 72 percent accuracy despite its extensive training and processing resources.
“That’s where we saw real potential,” Cline emphasizes.
The team determined that MovieNet was not only better than current AI models at understanding scene changes, but also required less data and processing time. MovieNet’s ability to simplify data without sacrificing accuracy also sets it apart from conventional AI. By breaking down visual information into essential sequences, MovieNet compresses data efficiently, much as a zipped file retains its critical details.
Beyond its high accuracy, MovieNet is an environmentally friendly AI model. Conventional AI processing demands immense energy, leaving a heavy environmental footprint. MovieNet’s reduced data requirements offer a greener alternative that conserves energy while meeting high performance standards.
“By mimicking the brain, we have managed to make our AI far less demanding, paving the way for models that are not only powerful but sustainable,” says Cline. “This efficiency also opens the door to scaling up AI in fields where conventional methods are costly.”
Additionally, MovieNet has the potential to reshape medicine. As the technology advances, it could become a valuable tool for identifying subtle changes in early-stage conditions, such as irregular heart rhythms or early signs of neurodegenerative diseases like Parkinson’s. For example, small motor changes related to Parkinson’s disease, often too subtle for the human eye to discern, could be flagged early by AI, giving clinicians valuable time to intervene.
Moreover, MovieNet’s ability to perceive changes in tadpoles’ swimming behavior when they are exposed to chemicals could lead to more precise drug-screening techniques, because scientists could study dynamic cellular responses instead of relying on static snapshots.
“Current methods miss critical changes because they can only analyze images captured at intervals,” notes Hiramoto. “Observing cells over time means MovieNet can track the most subtle changes during drug testing.”
Looking ahead, Cline and Hiramoto plan to continue refining MovieNet’s ability to adapt to different environments, improving its versatility and potential applications.
“Drawing inspiration from biology will continue to be a fertile area for advancing AI,” Cline says. “By designing models that think like living organisms, we can achieve levels of efficiency that are simply not possible with conventional approaches.”
Journal reference:
Hiramoto, M., & Cline, H. T. (2024). Identifying movie-coding neurons enables movie recognition AI. Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.2412260121