AUSTIN, Texas — The policy-related programming at this year’s South by Southwest is largely over, but the technologists building the future are still hashing out their plans on the conference stage. And given the laissez-faire status quo for U.S. tech regulation (notwithstanding today’s blow to TikTok in Congress), what they decide in Austin or Silicon Valley will likely matter more than anything being discussed in Washington right now.

That’s why I was fascinated by one particular debate that played out this week at SXSW: What, exactly, qualifies as “open source” AI, and what are the benefits and drawbacks of building it?

This is not an idle, abstract question in the tech world right now. Open and closed AI systems are often described as a stark dichotomy: proponents of open models say the closed camp wants to choke off public access to AI systems to make a profit, while the closed camp says open models will endanger the world. Elon Musk went to court to accuse OpenAI of abandoning its original mission to build open source AI, and he announced this week that his own company’s AI chatbot will be open source. In Europe, Mistral AI, the French company seen as the continent’s great hope for a globally competitive AI firm, just released a new flagship model that has been lauded as rivaling American competitors like OpenAI’s GPT-4 and Anthropic’s Claude 3.

The AI community, regulators, and watchdogs also criticized Mistral harshly a couple of weeks ago for announcing a partnership with Microsoft, just weeks after the French company pointed to its open source development philosophy to receive lenient treatment under the European Union’s AI Act. It’s an instructive case study for the open source conversation, where questions about the risk or reward of making AI source code open to the public often mask the raw power and profit dynamics at the heart of the decisions these companies make.
“Open source is an interesting, complicated question, because with most of these companies, we don't know anything” about how their models are trained, said researcher and all-around AI gadfly Gary Marcus when I spoke with him this week on a sunny bench ahead of his SXSW talk on “AI and the Future of Truth.”

“I can understand an argument that says, ‘Hey, we're a commercial company, our secret sauce is the way we build our model,’” Marcus said. “But because society is affected deeply by this data, it’s being used to determine people's livelihoods, whether or not they get a job, and these companies are not really handling the negative externalities that they're causing, the scientific community needs to understand what's going on and we need some openness there.”

Whether any given AI system is open source is not exactly a binary choice; it’s more a matter of where it falls on a spectrum. When Meta touted its Llama 2 large language model as open source last year, it also noted in its documentation that it was attaching strings to who could use it. That led the nonprofit Open Source Initiative to argue that the license’s restrictions on commercial use mean Llama 2 doesn’t meet its definition of open source.

Whatever its limitations, Meta stands nearly alone among the big Silicon Valley tech companies in taking the open source approach to AI. Another major player taking that route is the New York-based IBM, which sponsored SXSW’s AI programming track this year and whose director of research, Dario Gil, moderated a panel on “Why the Future Should Be Open” that included Meta’s generative AI product director Joseph Spisak and other pro-open source voices.
Rebecca Finlay, CEO of the nonprofit Partnership on AI, lauded what she described as a recent “step back from the binary choice between closed and open models.”

“There’s been a lot of work done to understand the spectrum of release approaches… frankly, from my perspective, I think [choosing an approach is] a business decision. Companies are making decisions about the business model that is most appropriate for them, and for the products that they want to release and the work that they're doing with their customers.”

Those commercial concerns aside, companies running closed systems like OpenAI and AI critics share a common worry about open source: Going too far toward that end of the spectrum could empower bad actors to use these powerful tools for nefarious purposes, like fraud and spreading misinformation. But as with so much of the rest of the hope, anxiety, and hype surrounding AI right now, the ultimate impact of releasing sophisticated AI source code into the world is unknowable until the dust settles.

“Anybody who tells you they know for sure that we should open source or we shouldn't, which is 95 percent of the people on one side of that issue or the other, is actually kind of full of it,” Marcus told me. “Because the truth is, we don't really know what these models are going to be used for.”