OpenAI released its most advanced AI model to date, called o1, on Thursday for its paying users. The launch kicked off the company’s “12 Days of OpenAI”: a dozen consecutive releases to celebrate the end-of-year holidays.
OpenAI touted o1’s “complex reasoning” capabilities, and announced Thursday that unlimited access to the model would cost $200 per month. In a video the company posted to show off the model’s strengths, a user uploads a photo of a wooden birdhouse and asks the model for advice on how to build a similar one. The model “thinks” for a short time, then spits out what at first glance appears to be a complete set of instructions.
Close examination reveals that the instructions are almost useless. The model measures the amount of paint, glue, and sealant required for the task in inches. It gives the dimensions for the front panel of the birdhouse, but no others. It recommends cutting a piece of sandpaper to a different set of dimensions, for no apparent reason. And in a separate part of the instruction list, it says “the exact dimensions are as follows…” and then gives no exact dimensions.
“You would know just as much about building the birdhouse from the image as you would from the text, which defeats the whole purpose of the AI tool,” says James Filus, director of the Institute of Carpenters, a U.K.-based trade body, in an email. He notes that the list of materials includes nails, but the list of required tools does not include a hammer, and that the cost of building the simple birdhouse would be “nowhere near” o1’s estimate of $20 to $50. “Simply saying ‘install a small hinge’ doesn’t really cover what is perhaps the most complex part of the design,” he adds, referring to another part of the video that purports to explain how to add an opening roof to the birdhouse.
OpenAI did not immediately respond to a request for comment.
This is just the latest example of an AI product demonstration doing the opposite of its intended purpose. Last year, a Google ad for an AI-powered search tool falsely claimed that the James Webb Space Telescope had made a discovery it had not, a blunder that sent the company’s stock price plummeting. More recently, an updated version of a similar Google tool told early users that it was safe to eat rocks, and that they could use glue to stick cheese to their pizza.
OpenAI’s o1, which according to public benchmarks is its best-performing model to date, takes a different approach to answering questions than ChatGPT. It is still essentially a very advanced next-word predictor, trained using machine learning on billions of words of text from the internet and beyond. But instead of immediately spitting out words in response to a prompt, it uses a technique called “chain of thought” reasoning to essentially “think” about an answer for a while behind the scenes, and only then gives its answer. That technique often yields more accurate answers than having a model reflexively spit out a response, and OpenAI has touted o1’s reasoning abilities, particularly when it comes to math and coding. It can accurately answer 78% of PhD-level science questions, according to data OpenAI published alongside a preview version of the model released in September.
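For readers curious what chain-of-thought prompting looks like in practice, here is a minimal sketch using OpenAI’s publicly documented Python client. It illustrates the general technique of asking a model to reason step by step before answering, not a reproduction of o1’s hidden internal reasoning (which OpenAI does not expose); the model name, prompts, and sample question are placeholders chosen for this example.

```python
# Minimal sketch of chain-of-thought prompting, for illustration only.
# Assumes the official OpenAI Python client (pip install openai) and an
# API key in the OPENAI_API_KEY environment variable. o1 performs this
# kind of reasoning internally and hides it; here we simply ask an
# ordinary chat model to reason out loud before committing to an answer.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = (
    "A birdhouse front panel is 6 inches wide and 8 inches tall. "
    "What is its area in square inches?"
)

# Direct prompt: the model answers immediately.
direct = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought prompt: the model is asked to work through the
# steps first, which often improves accuracy on reasoning tasks.
cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question
        + " Think through the steps one at a time, "
          "then state the final answer on its own line.",
    }],
)

print(direct.choices[0].message.content)
print(cot.choices[0].message.content)
```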
But it is clear that the model can still make certain fundamental logical errors.