OpenAI’s policies hinder reproducible…

Mar 22, 2023

LLMs have become privately-controlled research infrastructure

10 Comments

Mar 23, 2023·edited Mar 23, 2023·Liked by Sayash Kapoor, Arvind Narayanan

It's not merely researchers that this would impact. It also seems like a problem for commercial applications built on a particular model that was good enough for their purposes. I'm new to LLMs and haven't built anything using them yet, but I gather they can be used to get a vector embedding for a piece of text, for use in things like classification. Being forced to change models would presumably produce a different embedding vector for everything they've classified, and so a need to re-classify everything from scratch, even if the old model was good enough. It's unclear whether they simply aren't thinking about the needs of other commercial users since they have Microsoft's money or, more cynically, whether Microsoft would prefer that they scare off other commercial users of OpenAI (so that those users have to wait and see if Microsoft offers the model on its own systems, where Microsoft gets paid directly, perhaps with the model tweaked so that it has different embeddings).
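
To make the embedding concern concrete, here is a minimal sketch in Python. It does not call any real API: `MODEL_V1` and `MODEL_V2` are hypothetical toy stand-ins for two versions of an embedding model that encode the same information in different vector positions, and the labels and texts are made up for illustration. Under those assumptions, it shows why vectors stored from one model cannot be meaningfully compared with query vectors from another, so everything has to be re-embedded after a forced model change.

```python
import math

# Hypothetical stand-ins for two versions of an embedding model. Each maps
# words to vector positions, but the two versions use different layouts.
MODEL_V1 = ["refund", "invoice", "crash", "bug"]
MODEL_V2 = ["bug", "crash", "invoice", "refund"]


def embed(text, model):
    """Return a unit-length word-count vector laid out according to `model`."""
    words = text.lower().split()
    vec = [float(words.count(term)) for term in model]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (their dot product)."""
    return sum(x * y for x, y in zip(a, b))


def classify(vector, index):
    """Return the label whose stored vector is most similar to `vector`."""
    return max(index, key=lambda label: cosine(vector, index[label]))


def build_index(examples, model):
    """Embed one example text per label with the given model."""
    return {label: embed(text, model) for label, text in examples.items()}


# Made-up labelled examples and query, purely for illustration.
EXAMPLES = {"billing": "refund invoice", "bugs": "crash bug"}
QUERY = "please refund my invoice"

index_v1 = build_index(EXAMPLES, MODEL_V1)

# Same model for stored vectors and query: the result is sensible.
same_model = classify(embed(QUERY, MODEL_V1), index_v1)

# New model for the query, old stored vectors: the comparison is meaningless.
mixed_models = classify(embed(QUERY, MODEL_V2), index_v1)

# Re-embedding every stored item with the new model restores consistency.
index_v2 = build_index(EXAMPLES, MODEL_V2)
after_reindex = classify(embed(QUERY, MODEL_V2), index_v2)

print(same_model, mixed_models, after_reindex)
```

Run as written, this prints `billing bugs billing`: the query is classified correctly only when the stored vectors and the query vector come from the same model. Real embedding models differ in far less tidy ways than a reordering of positions, but the practical consequence is the same.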


Al’s Newsletter

Mar 23, 2023

It should surprise nobody. This is the company, after all, that started with this mission statement: “As a non-profit, our aim is to build value for everyone rather than shareholders. Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.”

And then promptly became a closed-source, very much for-profit arm of Microsoft.


Vamsi

Mar 23, 2023·edited Mar 23, 2023

It's very sad that people just fall for this trap. Don't even get me started on people who are trying to build companies based on this closed tech by a company called "Open AI".

Thank you to people like you for calling out "Open AI" on this.

Edit: people are begging on Twitter not to have their access to the API cut off.


direwolff

Mar 23, 2023

How can any researcher or application developer rely on OpenAI’s tech, given their opaque and proprietary stance on something that both they and many others see as important foundational technology for so many uses? If we’ve learned anything from platform providers who appear to act benevolently when it suits them (think Facebook, Twitter, Microsoft during the Windows hegemony), it is that as soon as it doesn’t, they can crush an entire ecosystem without thinking twice about it. More important, however, is that these LLMs put everyone at the mercy of one company’s interpretation of the world. Scary stuff for sure. I really don’t love so many researchers building their research on this platform; it feels destined for failure.


Chris Varen

Mar 23, 2023

You didn't mention that Azure is still offering Codex: https://learn.microsoft.com/en-us/azure/cognitive-services/openai/how-to/work-with-code

Or is that not suitable for researchers?


Mahesh Kumar

Mahesh’s Substack

Mar 22, 2023

Looks like they heard you and some others

sama is saying they will keep it available for researchers: https://twitter.com/sama/status/1638420361397309441?s=46&t=k3-sVtLFYwKwRcDUV323pA


Fionntán

Mar 23, 2023

My half-baked take is that there should be independent bodies, funded and set up with the sole job of developing and giving access to the data, code and models.

There are already loads of institutions that do this at the data level, so why the hell can’t we do it at the LLM (or any other large model) level? https://www.theodi.org/article/the-data-institutions-register/

Now whether that’s a semi-state body that operates across lots of domains, or it’s done at the sector level (health, legal, energy), the national level, or the international level (EU, a CERN for LLMs) are all really interesting unanswered questions.

People have been getting together and publishing models for decades, centuries even, but it’s probably only the state, as part of industrial policy, that can really compete with big tech private interests in terms of size, cost, compute etc.

But key to these are sustainability and openness: open access to models, but also openness in documentation and in how the model-sausage is made.


Austin

Mar 29, 2023

Would love to hear your opinions on the recent letter released by the Future of Life Institute calling for a temporary pause on the training of LLMs!


AI Snake Oil
