GPT-5 is now Trademarked by OpenAI: What Does That Say About the Future of ChatGPT?
Last Updated on August 12, 2023 by Editorial Team
Author(s): Aditya Anil
Originally published on Towards AI.
What is it hinting at? ChatGPT-5, perhaps?
I. The Trademark of GPT-5
In a 2014 BBC interview, Stephen Hawking said the following:
The development of full artificial intelligence could spell the end of the human race.
The state of AI in 2014 was different from today. AI was just picking up interest in the corporate world. That year, Google bought DeepMind, a machine learning startup, for over $600 million. A year later, DeepMind created AlphaGo, which went on to beat Fan Hui, the European Go champion. Facebook, meanwhile, was building a system that could predict whether two pictures showed the same person.
Deep learning was in its golden age. A small startup named OpenAI was founded a year later, in December 2015. Now, nearly eight years on, after what feels like a century of advancement in AI, OpenAI has filed a trademark application for “GPT-5” on July 14 with the United States Patent and Trademark Office (USPTO).
This move by OpenAI sparked plenty of speculation. Many say it hints at a new version of the company’s language model beyond GPT-4.
The news surfaced in a Twitter/X post by trademark attorney Josh Gerben on July 31.
The trademarking of GPT-5 came as a surprise to many of us.
What is it hinting at?
II. OpenAI’s Code Interpreter: A Stealthy Launch Bridging GPT-4.5 and GPT-5?
Not too long ago, OpenAI released ChatGPT’s newest feature: Code Interpreter. It was by far the most impressive addition to GPT-4-powered ChatGPT. With Code Interpreter, you can run Python programs inside ChatGPT, and upload and even download files. It can even work with images to some extent.
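To get a feel for what Code Interpreter actually does, here is a minimal sketch of the kind of Python script a session might generate and run when you upload a data file and ask for summary statistics. The file contents and column names are invented for illustration; in a real session, the model would read the uploaded file from disk rather than an inlined string.

```python
import csv
import io
import statistics

# Hypothetical uploaded file, inlined here so the sketch is self-contained.
# In a real Code Interpreter session this would be e.g. open("/mnt/data/temps.csv").
uploaded = io.StringIO("city,temp\nDelhi,31\nMumbai,29\nChennai,33\n")

# Parse the CSV into dictionaries keyed by column name.
rows = list(csv.DictReader(uploaded))
temps = [float(r["temp"]) for r in rows]

# Compute the summary the user asked for.
summary = {
    "count": len(temps),
    "mean": statistics.mean(temps),
    "max": max(temps),
}
print(summary)  # the chat renders this result back to the user
```

The point is not the script itself but the loop around it: ChatGPT writes code like this, executes it in a sandbox, inspects the output, and folds the result into its reply.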
In a Latent Space podcast episode (July 11), Simon Willison, Alex Volkov, Aravind Srinivas, and Alex Graveley argued that Code Interpreter is actually GPT-4.5. Of course, OpenAI hasn’t said whether it is indeed GPT-4.5. This wouldn’t be new, though: we saw similar behavior when OpenAI quietly released GPT-3.5.
This time, however, OpenAI might simply not have announced GPT-4.5, in keeping with OpenAI CEO Sam Altman’s stated intention to honor the spirit of the six-month pause letter.
When asked about the viral open letter urging a six-month pause in AI development, Altman said the following:
“There are parts of the thrust that I really agree with… We spent more than six months after we finished training GPT-4 before we released it, so taking the time to really study the safety of the model, to really try to understand what’s going on and mitigate as much as you can is important.”
In the same conversation, Altman’s comment on GPT-5 development was:
“[OpenAI] are not, won’t for some time [develop new versions of gpt], so in that sense [the six-month pause] was sort of silly.”
This talk was held at MIT in March of this year. You can watch the short clip here.
Based on this, many of us became convinced that a GPT-5 release any time soon was unlikely. The long gap between finishing training and releasing GPT-4 suggested that work on GPT-5 hadn’t even started yet.
At least, that was what was expected.
However, OpenAI trademarking GPT-5 changes things. Could it be that OpenAI is already developing GPT-5? Or is it a marketing tactic to hype up AGI, a hypothetical AI that can do any task without human help?
Squinting our eyes, we can find the clues in the trademark application itself.
III. Trademarking Tomorrow: GPT-5’s Odyssey into the Multimodal Frontier
Going into a bit more detail, the GPT-5 trademark application refers to ‘downloadable computer programs and computer software related to language models.’ In other words, the trademark covers programs and software related to LLMs.
GPT-5 could turn out to be an LLM that upcoming iterations of ChatGPT utilize.
Additionally, the crux of the hint comes from the items highlighted above. The trademark application covers software for generating speech and text, language processing, and machine learning. It also covers software for voice and speech recognition, converting audio files to text, and more.
Does that ring a bell? A chatbot that can, apart from generating text responses, work with images, voice, speech, and so on?
Ha! The multimodality of GPT.
Multimodality refers to the ability to work with more than one type of input: images, text, audio, and so on. People anticipated the release of GPT-4 with ‘the future is here’ placards all over the internet. The anticipation grew when we learned that GPT-4 could ‘presumably’ work with images in the near future. During the GPT-4 demo livestream four months ago, we saw many impressive capabilities of the model, including the ability to interpret memes and images and describe their various elements.
Greg Brockman, president and co-founder of OpenAI, demonstrated how he created a website using GPT-4: he fed it a photo of an idea sketched in his notebook, and GPT-4 generated the code for the website. That was pretty impressive. We were convinced the future was indeed near.
But how near is it? As of now, the closest multimodal experience I’ve had is with Bing Chat, which runs on GPT-4. You can, in theory, search online using images and get results based on them. However, Bing still feels rough and needs work. An experiment by Roboflow tested how good Bing’s multimodal capability really is.
Here are some noteworthy findings as covered in the report –
“…The model was subpar at counting the number of people that were present in an image. Surprisingly, asking the model for a simple structured format (in the form of a JSON) worked much better than most other prompts. With that said Bing could not extract exact locations or bounding boxes, either producing fabricated bounding boxes or no answer at all…”
Roboflow concluded the strengths and weaknesses –
One strength of the underlying Bing Chat model is its ability to recognize qualitative characteristics, such as the context and nuances of a situation, in a given image…
There are notable limitations to how Bing’s new features can be used, especially in use cases where quantitative data is important.
Certainly, you cannot use it to build a website from a sketch, as Brockman did in the demo, which makes Bing ‘nearly multimodal’ at best. I fed it some memes myself, and it couldn’t explain the humour in them the way the livestream demo did. Either this feature needs refinement, or my taste in memes is just bad. In my case, both are equally likely (I’m not a big fan of memes).
Right now, only Bing Search, based on GPT-4, lets you search using images. But the responses are not up to the mark, it seems.
In the case of ChatGPT, especially GPT-4, you can loosely associate multimodality with Code Interpreter. It lets you work with documents and images alongside the power of ChatGPT. Feeding in a document or image is indeed a ‘new input’ that differs from text, which places GPT-4 under multimodality. So it would be wrong to say GPT-4 isn’t multimodal at all.
Code Interpreter gives some taste of multimodality. It sets the expectation of future capabilities on ChatGPT.
Judging from the phrase ‘artificial production of human speech and text’ in the trademark, GPT-5, if ever released, would likely lean heavily on multimodality: a ChatGPT that works with text plus images, speech, documents, and so on.
So does that mean a GPT-5 release is around the corner? Not really, if we are to believe Altman. Saying GPT-5 will arrive any time soon would contradict his statement: back in April, he confirmed that the company wasn’t working on GPT-5.
If that’s true, trademarking GPT-5 seems to be about securing the rights to the next iteration of GPT models in advance. This would keep other companies at bay and reduce ‘competition’. GPT-5 may or may not be the AGI many anticipate, and experts seem to suggest that AGI isn’t yet within reach.
However, there is another way to see this trademark move: through the lens of hype and hope. And OpenAI seems to have mastered that lens early on.
IV. Hype, Hope, and Dreams of AGI
In a blog post, Altman declared that his company’s artificial general intelligence (AGI) will benefit humanity and “has the potential to give everyone incredible new capabilities.”
But we are nowhere near AGI. Is it even possible? We don’t know.
The ‘experienced experts’ believe we are far from AGI. The ‘AI doomers’ believe we are close to it. And the ‘AI influencers’ don’t care either way, as long as there is content to post. All these people hold varied opinions on the future of AI, but one thread ties them together: deep down, they are all rowing in the stream of hype. Some row against it, some flow with it. And OpenAI seems to direct the flow.
Reporter Karen Hao, who wrote an extensive report on OpenAI’s company culture in 2020, suggests that OpenAI’s internal culture has come to reflect less a commitment to safe, research-driven AI and more a drive to get ahead of everyone else. She has thus accused the company of fueling the “AI hype cycle.”
Here’s an excerpt from the post.
But OpenAI’s media campaign with GPT-2 also followed a well-established pattern that has made the broader AI community leery. Over the years…splashy research announcements have been repeatedly accused of fueling the AI hype cycle…critics have also accused the lab of talking up its results to the point of mischaracterization. For these reasons, many in the field have tended to keep OpenAI at arm’s length.
But let’s assume the hype and rumors are true: OpenAI is building GPT-5 in its secret dungeon.
The narrative says GPT-5 will be so impressive that it makes us question whether ChatGPT has reached AGI. The future is here, once again.
Going by the narratives and hype, GPT-5 or ChatGPT 5 would bring the following to the table:
- Multimodal capabilities: GPT-4 can already handle image and text inputs, and that’s a good start. But there is still scope for audio and video inputs. Companies like Google and Meta have already demonstrated various text-to-speech and text-to-music tools. Google also experimented with multimodal AI while developing the PaLM 2 language model. But these capabilities remain fragmented. If the rumors are true, the next ChatGPT would be a culmination of all these multimodal features: an all-in-one ChatGPT, if possible. And of course, competition in generative AI pushes OpenAI and other AI companies to build something close to AGI. That is the expectation set by this hype-driven AI race.
- Improved accuracy: While it is very hard to eliminate hallucination (the tendency of AI to make up facts), we have seen improvements in newer GPT versions. According to OpenAI, GPT-4 is 60% less likely to make things up. Successive AI models try to be more accurate than their predecessors, as we saw from GPT-3 to GPT-4, Llama to Llama 2, and even Claude to Claude 2, where accuracy improved noticeably. A future version of GPT might expand its training dataset to fix inaccuracies. However, that would make it resource-heavy: even the current ChatGPT reportedly costs $700,000 per day to run. Without a better way to make the model more accurate and less resource-demanding, GPT-5 would remain far off.
- Artificial general intelligence (AGI): This is the final destination that every AI research company is heading toward. Whether it is achievable is still under debate, but it is reasonable to say that AGI is unattainable any time soon. AGI, in theory, is an AI that can do anything on its own; how to approach it in practice is where the roadblock lies. Computers are not out in the world, and to do tasks for humans they need to interact with the environment. How to go about that? Nobody quite knows, but the answers seem to lie at the junction of neuroscience and deep learning. If GPT-5 turns out to be AGI, which is highly speculative, it would be yet another milestone, not just for AI but for the whole of technology. Conjuring a living, thinking mind out of algorithms would undoubtedly be marvelous.
V. Forging the AGI dream
As I write this, the GPT-5 trademark application is awaiting examination. Whenever news like this grabs headlines, it sparks curiosity, as well as speculation, in the AI community. The crowd always splits in two: those who view it skeptically and those who view it optimistically. One class believes in the facts of yesterday, the other in the hopes of tomorrow. Nonetheless, both are equally important, especially when it comes to governing AI.
With tighter regulations and laws, the likes of the EU AI Act and the US AI Bill, it is getting harder for AI companies to chase breakthroughs unchecked. But are such strict measures justified? I believe they are.
If you look at the developments of the past few years in AI, the growth has been exponential.
But safety, strained by growing competition in the corporate world, is a matter of concern. OpenAI became a for-profit company, and investors started pouring money into any company turning ‘AI-powered’, making the AI race intensely competitive.
Just progress isn’t enough. We need safe progress — safe progress in the development of NLP, multimodality, and artificial general intelligence.
But pushing for trademarks — as a way to protect intellectual property, or for a marketing strategy to create hype and anticipation — doesn’t lower competition. It just increases it.
That said, if GPT-5 lives up to our expectations, it will undoubtedly be a game-changer in the field of AI yet again. But only if it ever becomes something close to AGI, if not complete AGI.
Yet even in our wildest dreams, if we DO get to AGI, safety and regulation have to be the priority. Otherwise our pursuit of AGI in the AI race could, in Hawking’s words, spell the end of the human race.
AGI in the wild can do wonders, destructive ones included.
Are you interested in keeping up with the latest events in tech, science and AI?
Then you won’t want to miss my free weekly newsletter on Substack, where I share insights, news, and analysis on all things tech and AI.
Creative Block | Aditya Anil | Substack
Published via Towards AI