The second day of OpenAI’s 12 Days of OpenAI turned to less flashy, more professional interests than the first day’s general rollout of the OpenAI o1 model in ChatGPT.
Instead, OpenAI announced plans to release Reinforcement Fine-Tuning (RFT), a way for developers to customize its AI models for specific types of tasks, particularly more complex ones. The release marks a clear shift toward enterprise applications after the first day’s consumer-focused updates. You can think of RFT as a method for improving how AI models reason their way to answers. Given a dataset and a developer’s grading rubric, OpenAI’s platform can train a specialized model without a lot of costly follow-up reinforcement learning experiments.
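To make the "dataset plus rubric" idea concrete, here is a minimal Python sketch of what a developer-written grader might look like. It is an illustration only, not OpenAI's API: the `Example` class, the `grade` and `average_score` functions, and the 1/rank scoring rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Example:
    """One hypothetical dataset entry: a prompt and its known correct answer."""
    prompt: str
    correct_answer: str


def grade(ranked_answers: list[str], correct_answer: str) -> float:
    """Hypothetical rubric: full credit if the correct answer is ranked first,
    1/rank credit if it appears lower in the list, and zero if it is missing."""
    target = correct_answer.strip().lower()
    for rank, answer in enumerate(ranked_answers, start=1):
        if answer.strip().lower() == target:
            return 1.0 / rank
    return 0.0


def average_score(dataset: list[Example], outputs: list[list[str]]) -> float:
    """Average the rubric score over a dataset of model outputs."""
    scores = [grade(out, ex.correct_answer) for ex, out in zip(dataset, outputs)]
    return sum(scores) / len(scores) if scores else 0.0
```

In a setup like this, the score is the signal that training would try to push up: answers that rank the correct result higher earn more credit than answers that bury it or miss it entirely.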
RFT could be a boon for AI tools used in law and science. In its livestream, OpenAI showed the CoCounsel AI assistant that Thomson Reuters built with RFT, as well as how RFT helps researchers at Berkeley Lab who study rare genetic diseases. However, commercial partnerships like these won’t make much difference in the short term to average users of ChatGPT or other OpenAI products.
“Today we are announcing reinforcement fine-tuning, which makes it much easier to create expert models in specific domains with very little training data. Livestream in progress; program starting now, public launch in the first quarter.” (December 6, 2024)
Business or consumer
If you’re more interested in the consumer side, don’t despair. While the enterprise focus contrasts with day one, it’s easy to imagine that OpenAI wants to cover as wide a range of news as possible over the course of the 12 days. There will surely be plenty more consumer news to come, perhaps on alternating days or in some other pattern.
Still, OpenAI’s closing joke was at least a little funnier than yesterday’s. The setup is that self-driving vehicles are popular in San Francisco, and Santa wants to build a self-driving sleigh to get in on the trend. The trouble is that it keeps hitting trees. Why? He didn’t pine-tune his models. Perhaps the image ChatGPT created for TechRadar editor Lance Ulanoff will sell the humor better.