- "Open source" is a term buzzing around the generative AI community
- For certain GenAI applications, open-source models present benefits like cost savings, community input and transparency
- But the true definition of open-source GenAI is still being debated, and these models also present some challenges
Open source has been a game-changer for software development since the early days of coding. Now, as generative artificial intelligence (GenAI) reshapes industries, open-source GenAI is drawing attention. So, what unique benefits—and risks—does open-source GenAI bring?
GenAI, powered by large language models (LLMs), has already revolutionized communication, coding and customer support with tools like ChatGPT and Copilot. Traditionally, many LLMs have been proprietary, restricting public access to their inner workings and modifications.
Open-source GenAI models, on the other hand, provide developers with code transparency, the ability to customize models and, often, lower costs. These models can be fine-tuned on popular cloud platforms like AWS, Google Cloud and Microsoft Azure, which can increase model accuracy by 5-10% in specific applications, according to GitHub.
In the case of GenAI, open-source models are especially attractive to enterprises that want to adapt AI tools for niche applications or specialized tasks. Since anyone can examine and modify the underlying code, open-source models also bring a level of transparency that many AI ethicists champion, as this openness allows for more thorough external audits for security and bias issues.
Many vendors are touting the advantages of open-source GenAI since it tends to afford more transparency into models, training data and other parameters, as well as other advantages such as efficiency, modularity and access to source code for customization. That said, open-source models don’t come without risks. Unlike proprietary GenAI models, which are supported and updated by vendors, open-source models raise some data security questions, and require users to self-manage security updates and ensure alignment with industry standards.
Who’s opening the door?
Companies including Red Hat, Intel and IBM have advocated for the future of GenAI being open.
IBM’s latest generation of Granite AI models was released under the Apache 2.0 license, for instance. The permissive Apache 2.0 license supports integration with non-open-source code and allows for modification and distribution of the software free of royalties, “which can have some advantages in terms of scalability and cost, particularly in GenAI,” said Jay Lyman, senior research analyst at S&P Global Market Intelligence.
In May, IBM and Red Hat together launched InstructLab, an open-source project that gives communities the tools to create and merge changes to LLMs without having to retrain the model from scratch.
“We believe the future of AI is open,” said Maryam Ashoori, IBM Director of Product Management for watsonx.ai, at this year’s Gartner IT Symposium. Ashoori noted IBM is using InstructLab to “leverage the power of community” to enhance the base Granite model.
The idea of an open-source AI model built specifically for telcos has become “a common talking point in the telco sector,” noted Patrick Kelly, founder of Appledore Research Group. However, Kelly said training, developing and sustaining such a model would be costly, and “without a strong alliance and fortitude to commit resources and funding on a long-term basis from telco alliance members, the success of it is risky.”
Open-source semantics
A big question still surrounding all of this is who, and what exactly, defines open-source GenAI.
To help answer this, the Open Source Initiative last week released its first definition of open source specific to AI. “It’s not perfect, but it’s a starting point and an opening of the discussion about what open-source GenAI is,” Lyman said.
Although analysts agree that there is an open-source nature to the Granite models, IBM Director of Research Darío Gil said he couldn’t promise the latest model meets the Open Source Initiative’s definition of open source. “There is an active dialog around that, and the industry is going to need to clarify for everybody. It’s very useful for us as an industry to start being more precise,” he said at the launch of the Granite 3.0 series.
Even GenAI giants like Meta and xAI have faced skepticism for releasing model weights and architectures and calling them “open source” while restricting access to certain data. One common complaint is that xAI hasn’t revealed all of the code or training data for its Grok model.
Abu Dhabi’s Technology Innovation Institute also launched Falcon, with its smaller Falcon 40B model released under the Apache 2.0 license, which is considered open source. But Falcon’s larger model, Falcon 180B, carries some restrictions, raising questions about what qualifies as truly open source.
Will the future really be open source?
As open-source GenAI gains momentum, industry analysts are weighing its long-term potential. Could open-source models really become the new norm for GenAI? It’s still up in the air.
Open-source models are improving in quality and speed, said Arun Chandrasekaran, distinguished VP analyst at Gartner, but questions remain about how well they can compete with closed-source models in the long term.
According to Chandrasekaran, the topic of open models has seen intense debate over the last year as model providers strive to balance model access and customizability against the secrecy they maintain for competitive reasons. “History is littered with companies that started as open source and then went closed source due to inability to monetize open source, a problem likely to be acute in this space due to the high cost of building models and their quick depreciation period,” he told Fierce Network.
A recent survey from S&P Global Market Intelligence showed that most enterprises today are using a mix of commercially licensed and open-source models. Lyman expects that will continue, given that each has different advantages depending on the setting. “I think things are likely to play out similar to what we’ve seen in enterprise software, where open source has come to represent a mainstay of development and deployment, but where proprietary software and commercial off-the-shelf software are often still the norm,” he said.
Companies must therefore carefully assess the trade-offs between the flexibility of open-source models and the stability of proprietary ones. As competition among AI providers heats up, only time will reveal what part open models will play in the future of GenAI.