🗞️AI highlights from this week (1/27/23)
Generative music, language models as backend servers, AI Family Guy and more…
[Update: the previous version of this post had an incorrect subheader]
Here are my highlights from the last week in AI!
P.S. Don’t forget to hit subscribe if you’re new to AI and want to learn more about the space.
1/ Google makes leap in Generative Music
One of the areas that has most excited me about AI is its ability to democratize the creative process. As a musician myself, when I first started playing with generative AI products like DALL-E, my immediate thought was “This would be amazing for music”.
There have been a few different projects attempting generative music, including Harmonai, which generates new music that sounds like the input music, and Riffusion, which does short text-to-audio generation using Stable Diffusion by turning audio into images. OpenAI also published a paper on a model they call Jukebox that generates music in particular genres and styles.
In my opinion though, the holy grail is for a user to describe any kind of music or sound and for a model to generate it, and it looks like Google just achieved this with MusicLM!
Check out their research website, where they shared lots of examples of MusicLM in action, including longer songs, audio journeys with multiple parts, turning paintings into music and even generating specific instrument sounds! As yet, there are no tools for trying out MusicLM with your own prompts, but here’s hoping that this research will be available in a Google product later this year.
2/ Using a Large-scale Language Model as a backend
Last weekend, Scale AI [1] hosted an AI hackathon in San Francisco. The winning team’s project, “GPT is all you need for backend” [2], might pique the curiosity of any engineers reading this post, as the team showed how a large-scale language model, in this case GPT, could be used instead of a traditional database and server-based backend [3].
Here’s how one of the team members described what they were aiming for:
What is so impressive about what the team achieved is that they completely removed the need for a server or database to store data for their example application, a To Do app. Instead, they just told GPT [4] what app they were building and how it should respond to requests, and gave it examples of the type of data the frontend of the To Do app might request, e.g. a list of to-do items. Once this is done, the frontend can just describe the functions it wants to call, without them ever being defined!
Here’s a more detailed description of how “backend-GPT” works from their Github Repository:
We basically used GPT to handle all the backend logic for a todo-list app. We represented the state of the app as a json with some prepopulated entries which helped define the schema. Then we pass the prompt, the current state, and some user-inputted instruction/API call in and extract a response to the client + the new state. So the idea is that instead of writing backend routes, the LLM can handle all the basic CRUD logic for a simple app so instead of writing specific routes, you can input commands like add_five_housework_todos() or delete_last_two_todos() or sort_todos_alphabetically() . It tends to work better when the commands are expressed as functions/pseudo function calls but natural language instructions like delete last todos also work.
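To make that loop concrete, here is a minimal Python sketch of the pattern as I understand it from the description above. It is not the team's actual code: the prompt wording, the `handle_call` helper, and the `fake_complete` stand-in (which replaces a real GPT call so the example runs offline) are all my own assumptions.

```python
import json
from typing import Any, Callable, Dict, Tuple

# Hypothetical prompt wording; the real project's prompt is not shown in this post.
PROMPT_TEMPLATE = (
    "You are the backend of a to-do list app.\n"
    "Current state (JSON): {state}\n"
    "API call: {call}\n"
    'Reply with one JSON object with the keys "response" and "new_state".'
)


def handle_call(
    complete: Callable[[str], str], state: Dict[str, Any], call: str
) -> Tuple[Any, Dict[str, Any]]:
    """Send the prompt, current state and API call to the model,
    then split the reply into a client response and the new state."""
    prompt = PROMPT_TEMPLATE.format(state=json.dumps(state), call=call)
    reply = json.loads(complete(prompt))
    return reply["response"], reply["new_state"]


def fake_complete(prompt: str) -> str:
    """Offline stand-in for a real language model call (an assumption).
    It only understands delete_last_todo(); a real model would infer
    the behavior from the function name alone."""
    state = json.loads(prompt.split("Current state (JSON): ")[1].split("\n")[0])
    call = prompt.split("API call: ")[1].split("\n")[0]
    if call == "delete_last_todo()":
        state["todos"] = state["todos"][:-1]
    return json.dumps({"response": "ok", "new_state": state})


# The prepopulated entries double as the schema definition.
state = {
    "todos": [
        {"id": 1, "title": "Buy milk", "done": False},
        {"id": 2, "title": "Walk the dog", "done": False},
    ]
}
response, state = handle_call(fake_complete, state, "delete_last_todo()")
print(response, state)
```

To try this against a real model, you would swap `fake_complete` for a function that sends the prompt to your language model of choice and returns its text reply. The frontend keeps the returned state and passes it back in with the next call.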
In previous posts, I’ve discussed the concept of emergent behavior, whereby a language model trained on a large enough dataset is able to carry out tasks and perform logic that it was never explicitly trained to do. This idea of a large-scale language model acting as a general-purpose backend is a great example of emergent behavior!
3/ Atomic AI raises $35M to use AI for RNA-based drug discovery
With all the hype around chatbots and generative art, it’s great to hear that AI companies are being created to save lives too. One such company is Atomic AI, a biotech startup that raised $35M in Series A funding to do generative AI-based drug discovery focused on RNA molecules. Here’s how Raphael Townshend, CEO of Atomic AI, describes the opportunity his startup is going after in an interview with TechCrunch:
“There’s this central dogma that DNA goes to RNA, which goes to proteins. But it’s emerged in recent years that it does much more than just encode information,… If you look at the human genome, about 2% becomes protein at some point. But 80 percent becomes RNA. And it’s doing… who knows what? It’s vastly underexplored.”
Check out Michael Spencer’s post for more on Atomic AI and the intersection of AI and biotech:
4/ Yann LeCun throws shade on ChatGPT!
The legendary AI researcher Yann LeCun, who was one of a few researchers pushing forward advancements in deep learning during the 70s-90s [5], tweeted that he thought ChatGPT was overhyped:
I think Yann might be overestimating the general public’s understanding of deep learning, AI and the progress we’ve made in the last few decades. Until ChatGPT, most people simply had not experienced AI in a tangible and impressive product, as I shared in AI: Don’t believe the hype?:
Unlike its predecessors (e.g. Google Assistant, Echo, Siri), ChatGPT is really the first time an AI assistant truly seems like it could pass the Turing Test. There have been many impressive examples of ChatGPT in action and if you haven’t tried it yourself you should. ChatGPT successfully wrote a blog post for me and turned it into a Twitter thread, gave me a recipe for pancakes that tasted delicious and helped me pick a Christmas present for my wife!
OpenAI are capturing attention not because of the sophistication of their models but because they are shipping great products, as pointed out by Dr. Jim Fan, an AI scientist who previously worked at OpenAI and Google:
It’s also hard not to take Yann’s sentiment with a grain of salt given that he leads AI research at Meta. Maybe Yann should spend less time throwing shade and more time persuading Zuck to burn the virtual boats and join the AI race?
Or, maybe we should all just be friends and work on this together…
5/ Family Guy and generative AI
Wrapping up with this fun take on what Family Guy might have looked like as an 80s live-action sitcom, using images created with Midjourney!
Finally, in case you missed it, I also shared Part 3 of my series on the origins of Deep Learning:
That’s all for this week!
Thanks for reading The Hitchhikers Guide to AI! Subscribe for free to receive new posts and support my work.
[1] Scale AI provides infrastructure and resources to label large datasets for machine learning for many different use cases including robotics, AR/VR, AI and autonomous vehicles.
[2] The project’s title, “GPT is all you need for backend”, is a play on “Attention is all you need,” the famous Google research paper that introduced the Transformer architecture used by large-scale language models. If you want to learn more about what Transformers are, read my latest post on the origins of Deep Learning.
[3] A “backend” is the part of a web application that stores and serves data to the “frontend” that you interact with as a user. For example, this web page is the frontend of Substack and the backend is what stores and serves all the text in this post.
[4] GPT, or Generative Pre-trained Transformer, is OpenAI’s large-scale language model that powers ChatGPT.