Last Updated on August 23, 2023 by Editorial Team
Author(s): Towards AI Editorial Team
Originally published on Towards AI.
What happened this week in AI by Louie
In recent months, we have continued to see large language model (LLM) advancements and a gradual introduction of novel techniques, but we haven’t yet seen a competitor directly aiming to displace GPT-4 as the most advanced (and training compute-intensive) model. Google consolidated its AI efforts by merging Google Brain and DeepMind earlier this year and has also been rapidly scaling up its training compute resources. Its Gemini model is likely to be the first major release from this merged effort. Google has gradually been generating anticipation for Gemini by revealing information through interviews and controlled media releases. The company appears to be positioning Gemini as its response to GPT-4, with ambitions for it to outperform GPT-4 in certain capabilities.
With the release now reportedly gearing up for “this fall,” we are excited to see what new innovations and capabilities Gemini brings and how it stacks up against GPT-4. The effort is being led by Oriol Vinyals and Koray Kavukcuoglu, alongside Jeff Dean, who oversees hundreds of employees in Gemini’s development. We have heard several things about the model, both via direct quotes from management and via media leaks. “Gemini combines the strengths present in AlphaGo-type systems with the exceptional language capabilities inherent in large models,” — Demis Hassabis. We also know that the model started training sometime before May: “We’re already at work on Gemini — our next model created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning. Gemini is still in training, but it’s already exhibiting multimodal capabilities never before seen in prior models.” — Google CEO blog, May 2023. The model is anticipated to be multimodal, with full image generation capabilities similar to Midjourney’s. We have also heard Google is being careful about its training set and that it might have integrated video and audio data from YouTube into Gemini.
The ongoing rivalry between Google and OpenAI is exciting to witness, and it will be intriguing to observe how these developments unfold — particularly which techniques from AlphaGo DeepMind is integrating into LLMs. It looks like we are set to see even more exciting evolution in LLMs this year!
– Louie Peters — Towards AI Co-founder and CEO
This issue is brought to you thanks to OVHcloud:
OVHcloud is offering GPUs at unbeatable prices to drive your AI needs. This includes a selection of NVIDIA cloud instances up to 60% off regular prices — while stocks last. Designed to accelerate data processing with the guarantee of complete data reversibility and resource flexibility, the OVHcloud AI portfolio also features bare metal servers and open-source ML solutions such as AI Notebooks, AI Training, and AI Deploy, all benefiting from OVHcloud’s water-cooling technology to achieve the lowest energy consumption.
Meta’s next AI release will reportedly be a coding machine. This new model, dubbed “Code Llama,” will be open-source and available free online. It could see a release as soon as next week. This is consistent with the company’s strategy so far of releasing widely available AI software that makes developing new customizable AI models much easier.
OpenAI proposed the utilization of GPT-4 for the development of content policies and the facilitation of content moderation decisions on digital platforms. This approach aims to relieve humans of the burden involved, thereby enabling more uniform labeling of content and expediting feedback loops.
Microsoft’s recent Azure+Databricks offering gives Databricks users the ability to employ any AI model, including open-source LLMs, to train on their data within the Azure platform. This could potentially lead to fewer companies procuring licenses for OpenAI models to fulfill similar use cases.
Top Google AI researchers, whose work contributed to the development of AI systems like OpenAI’s ChatGPT, Google Bard, Stability AI, Midjourney, and DALL-E, plan to establish their own independent AI development studio. The new venture, Sakana AI, is spearheading efforts to create a proprietary generative AI model, i.e., software that can create text, images, code, and more.
According to a new paper by researchers at the University of Cambridge, bias in the collection of data on which AI programs depend can limit their usefulness to climate scientists in predicting future scenarios and guiding global action. The paper concludes that human-guided technology remains instrumental in the development of socially responsible AI.
Five 5-minute reads/videos to keep you learning
Navigating the current AI hype can be challenging, making it difficult to discern what is truly substantial. This compilation comprises a thoughtfully curated collection of foundational papers, interesting open questions, and a guide for gaining deeper insights into the space.
This post elaborates on the reasons why fine-tuning might not be essential for your application. It delves into a comprehensive explanation of what fine-tuning entails and explores potential alternatives. The content is targeted toward those focused on building LLM applications.
The swift growth of AI has given rise to novel research directions. This article brings together the prevailing challenges within LLM research such as multimodality, alternatives to GPUs, innovative architectures, and more.
AI2 Dolma is a dataset of 3 trillion tokens drawn from a diverse array of sources, including web content, academic publications, code repositories, books, and encyclopedic materials. Its primary objective is to provide researchers with the means to investigate the impact of data at scale. It is readily accessible for download on the Hugging Face Hub.
This post shares how to build a standardized process for writing NLP papers. It presents essential components, including content structuring, language precision, comprehensive literature reviews, accurate citations, and more. While certain pointers are tailored to NLP research, the principles can be effectively employed across niches.
Papers & Repositories
This paper presents a scalable method to build a high-quality instruction-following language model by automatically labeling human-written text with corresponding instructions. It starts with a language model fine-tuned on a small amount of seed data and a given web corpus.
This paper presents Neuralangelo, which combines the representation power of multi-resolution 3D hash grids with neural surface rendering. The two key ingredients that enable this approach are numerical gradients for computing higher-order derivatives and coarse-to-fine optimization on the hash grids.
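Neuralangelo’s numerical-gradient ingredient can be illustrated in one dimension with finite differences (this is a generic sketch on a hypothetical toy function, not the paper’s actual surface-rendering code): central differences approximate first derivatives, and a three-point stencil approximates second derivatives, with the step size acting as the kind of smoothing knob that makes coarse-to-fine optimization possible.

```python
def central_diff(f, x, eps):
    """First derivative of f at x via central differences (O(eps^2) error)."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

def second_diff(f, x, eps):
    """Second derivative via the standard three-point stencil."""
    return (f(x + eps) - 2 * f(x) + f(x - eps)) / (eps ** 2)

f = lambda x: x ** 3  # toy function; analytically f'(2) = 12 and f''(2) = 12

print(central_diff(f, 2.0, 1e-4))  # ≈ 12.0
print(second_diff(f, 2.0, 1e-4))   # ≈ 12.0
```

A larger `eps` averages the function over a wider neighborhood (coarse), while shrinking `eps` over the course of optimization recovers fine detail.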
This work developed Sliced Anti-symmetric Decomposition (SAD), a new model for collaborative filtering based on implicit feedback. SAD produces the most consistent personalized preferences while maintaining a top level of accuracy in personalized recommendations.
This paper shows how the problem of neural text generation can be constructively reformulated in terms of transitions between the states of a finite-state machine. It guides text generation with regular expressions and context-free grammar by allowing the construction of an index over a language model’s vocabulary.
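The core idea can be sketched with a toy example (a hand-built DFA and a hypothetical token vocabulary, not the paper’s implementation): each generation step, the finite-state machine’s current state determines which vocabulary tokens keep the output a valid prefix of the target pattern, and everything else is masked out.

```python
# Hand-built DFA for the regex \d+\.\d+ (e.g. "3.14").
# States: 0 = start, 1 = integer digits seen, 2 = dot seen, 3 = fractional digits (accepting).
DEAD = -1

def step(state, ch):
    """Advance the DFA by one character; DEAD means no transition exists."""
    if ch.isdigit():
        return {0: 1, 1: 1, 2: 3, 3: 3}.get(state, DEAD)
    if ch == ".":
        return 2 if state == 1 else DEAD
    return DEAD

def run(state, token):
    """Run the DFA over a whole vocabulary token."""
    for ch in token:
        state = step(state, ch)
        if state == DEAD:
            return DEAD
    return state

VOCAB = ["0", "7", "42", ".", "3.", "ab", "1.5"]  # hypothetical token vocabulary

def allowed(state, vocab=VOCAB):
    """The mask for this step: tokens whose characters keep the DFA alive."""
    return [t for t in vocab if run(state, t) != DEAD]

print(allowed(0))  # from the start, only tokens beginning with digits survive
print(allowed(1))  # after some digits, a dot (or more digits) is also permitted
```

In the paper this state-to-allowed-tokens mapping is precomputed as an index over the entire vocabulary, so the per-step masking cost does not depend on scanning every token at generation time.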
Txtai is an all-in-one open-source embedding database for semantic search, LLM orchestration, and language model workflows. It can be set up in minutes, runs locally, has a low footprint, and works with everything from micromodels up to large language models.
Enjoy these papers and news summaries? Get a daily recap in your inbox!
The Learn AI Together Community section!
Meme of the week!
Meme shared by rucha8062
Featured Community post from the Discord
Marcklingen has recently introduced “langfuse,” an open-source observability and analytics tool designed for LLM applications. It helps users accelerate application development by providing detailed execution traces covering quality, cost, and latency. Currently, Langfuse Analytics is in a closed alpha phase, during which the core team is collaborating with a user group to build the most useful analytics platform for LLM applications. Check it out on GitHub and support a fellow community member. Share your feedback and questions in the thread here.
TAI Curated section
Article of the week
In this article, you will learn how to leverage large language models, state-of-the-art text and speech analytics tools, and vector databases to build an end-to-end audio recommendation solution. This solution will suggest top videos based on users’ interests.
Our must-read articles
If you want to publish with Towards AI, check our guidelines and sign up. We will publish your work to our network if it meets our editorial policies and standards.
Interested in sharing a job opportunity here? Contact [email protected].
Join over 80,000 data leaders on the AI newsletter and keep up to date with the latest developments in AI, from research to projects and ideas. If you are building an AI startup, an AI-related product, or a service, we invite you to consider becoming a sponsor.
Published via Towards AI