Google researchers said on Friday that they had discovered their first vulnerability using a large language model.
In a blog post, Google said it believed the bug to be the first public example of an AI tool finding a previously unknown, exploitable memory-safety issue in widely used real-world software.
The vulnerability was found in SQLite, an open source database engine popular among developers.
Google researchers reported the vulnerability to the SQLite developers in early October, and they fixed it the same day. The issue was found before it appeared in an official release, so SQLite users were not affected. Google hailed the development as an example of “the immense potential AI can have for cyber defenders”.
“We believe this work has tremendous defensive potential,” the Google researchers said. “Finding vulnerabilities in software before it's even released means there's no scope for attackers to compete: the vulnerabilities are fixed before attackers even have a chance to use them.”
The effort is part of a project called Big Sleep, a collaboration between Google Project Zero and Google DeepMind. It evolved from an earlier project that began working on vulnerability research assisted by large language models.
Google noted that at the DEF CON security conference in August, cybersecurity researchers tasked with building AI-assisted vulnerability research tools discovered another issue in SQLite, which inspired their team to see whether they could find a more serious vulnerability.
Fuzzing variants
Many companies, including Google, use a process called “fuzzing”, in which software is tested by feeding it random or invalid data designed to expose vulnerabilities, trigger errors or crash the program.
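As a rough illustration of the concept (a minimal sketch only, not Google's actual tooling, which relies on coverage-guided fuzzers such as OSS-Fuzz), a naive fuzzer simply feeds random byte strings to a target function and records any input that crashes it:

```python
import random

def naive_fuzz(target, iterations=10_000, max_len=256):
    """Repeatedly feed random byte strings to `target` and collect crashing inputs."""
    crashes = []
    for _ in range(iterations):
        length = random.randint(0, max_len)
        data = bytes(random.getrandbits(8) for _ in range(length))
        try:
            target(data)  # the function under test, e.g. a parser
        except Exception as exc:  # any unhandled error counts as a finding
            crashes.append((data, exc))
    return crashes

# Example: fuzz a toy parser that mishandles empty input.
def toy_parse(data: bytes) -> int:
    return data[0] + data[-1]  # raises IndexError when data is empty

print(len(naive_fuzz(toy_parse)))  # number of crashing inputs found
```

Real fuzzers are far more sophisticated, mutating inputs based on which code paths they exercise, but the core loop is this one: throw malformed data at the program until something breaks.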
But Google said fuzzing does not do enough to “help defenders find the bugs that are difficult (or impossible) to find”, adding that they are “hopeful that AI can narrow this gap”.
“We believe this is a promising path towards finally turning the tables and achieving an asymmetric advantage for defenders,” they said.
“The vulnerability itself is quite interesting, along with the fact that the existing testing infrastructure for SQLite (both through OSS-Fuzz and the project's own infrastructure) did not find the issue, so we did some further investigation.”
Google said one of the main motivations behind Big Sleep is the persistent problem of vulnerability variants. One of the more concerning findings Google made in 2022 was that more than 40% of the zero-days observed were variants of vulnerabilities that had already been reported.
More than 20% of the bugs were also variants of previous zero-days, the researchers added.
Google said it continues to discover exploits for variants of previously found and patched vulnerabilities.
“As this trend continues, it's clear that fuzzing is not succeeding at catching such variants, and that for attackers, manual variant analysis is a cost-effective approach,” the researchers said.
“We also believe that this variant-analysis task is a better fit for current LLMs than the more open-ended problem of vulnerability research in general. By providing a starting point – such as the details of a previously fixed vulnerability – we remove a lot of the ambiguity from vulnerability research, and start from a concrete, well-founded theory: ‘This was a previous bug; there is probably another similar one somewhere.’”
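To make the seeded-workflow idea concrete (a hypothetical sketch only; the prompt and function names below are assumptions, not Big Sleep's actual implementation), variant analysis amounts to handing the model a fixed patch as its starting point and asking it to hunt for the same flaw pattern elsewhere:

```python
# Hypothetical sketch of LLM-driven variant analysis: seed the model with a
# previously fixed bug so it searches for similar flaws, rather than asking
# it to find vulnerabilities open-endedly.
VARIANT_PROMPT = """A bug in this codebase was recently fixed by the patch below.
There is probably a similar bug somewhere else. Review the source files and
report any code path that repeats the same flawed pattern.

Fixed patch:
{patch}

Source under review:
{source}
"""

def find_variants(llm_complete, patch: str, source: str) -> str:
    # `llm_complete` stands in for any text-completion call; it is an
    # assumption for illustration, not an API from Google's tooling.
    return llm_complete(VARIANT_PROMPT.format(patch=patch, source=source))
```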
The project is still in its early stages, and they are only using small programs with known vulnerabilities to evaluate progress, they added.
They cautioned that while this is a moment of validation and success for their team, these remain “highly experimental results”.
“When provided with the right tools, current LLMs can perform vulnerability research,” they said.
“The Big Sleep team's position is that at present, a target-specific fuzzer would likely be at least as effective (at finding vulnerabilities). We hope that in the future this effort will lead to a significant advantage for defenders – with the potential not only to find crashing test cases, but also to provide high-quality root-cause analysis, so that triaging and fixing issues could be much cheaper and more effective in the future.”
Several cybersecurity researchers agreed that the results were promising. Bugcrowd founder Casey Ellis said the research into language models is promising, and specifically highlighted its use on variants as “really clever”.
“It plays to the strengths of LLM training, fills some of the gaps left by fuzzing and, most importantly, mimics the economics and clustering tendencies of real-world security research,” he said.