Generative AI is advancing exponentially. What is happening at the frontier of research and application and how are novel techniques and approaches changing the risks and opportunities linked to frontier, generative AI models?
Speakers:
Yann LeCun, Silver Professor of Data Science, Computer Science, Neural Science and Electrical Engineering, New York University
Nicholas Thompson, Chief Executive Officer, The Atlantic
Kai-Fu Lee, Founder, 01.AI Pte. Ltd.
Daphne Koller, Founder and Chief Executive Officer, Insitro Inc
Andrew Ng, Founder, DeepLearning.AI, LLC
Aidan Gomez, Co-Founder and Chief Executive Officer, Cohere Inc.
This is the full audio from a session at the World Economic Forum’s Annual Meeting 2024
Watch the session here: https://www.weforum.org/events/world-economic-forum-annual-meeting-2024/sessions/the-expanding-universe-of-generative-models
Follow all the action from Davos at wef.ch/wef24 and across social media using the hashtag #WEF24.
Check out all our podcasts on wef.ch/podcasts:
Podcast transcript
This transcript has been generated using speech recognition software and may contain errors. Please check its accuracy against the audio.
Nicholas Thompson, Chief Executive Officer, The Atlantic: All right. Hello, everybody. Good afternoon. I'm Nicholas Thompson. I'm CEO of The Atlantic. I'm the moderator here. I was told this panel sold out in 30 seconds. I know that's not because of me, because I moderate a lot of panels and that doesn't normally happen. I am extremely excited for this group. We have an incredible group here. In fact, I'm so excited that when I was walking here, I was looking at my questions, getting ready, and I was walking down the promenade and I see kind of a group of people ahead of me, and I'm reading my questions, walking, and it's kind of slowing me down. And so I'm doing the New Yorker thing and I move in between them, and then I notice they all have machine guns.
And so then suddenly two guys with machine guns, look at this little guy who's squirrelling through them. It's John Kerry's entourage. But I was very eager to get here. I'm going to introduce our panellists in chronological order from their first major contribution to the field of AI and machine learning. What's amazing about this group of five is that they all have done incredible and different things in the past and are doing incredible and different things right now.
So on my left, Yann LeCun, Turing Award winner, inventor of the convolutional neural network architecture, who runs AI at Meta. We have Kai-Fu Lee, machine learning for speech recognition, he now runs 01.AI, which is booming up the Hugging Face charts. We have Andrew Ng, he first learned to scale GPUs for AI, of his many affiliations right now he runs AI Fund.
We have Daphne Koller, using Bayesian models in AI for the first time, she now runs Insitro. And we have Aidan Gomez, who is one of the people who came up with the transformer architecture, the T in GPT is transformer, and he now runs Cohere.
So: amazing panel, what they did, what they're doing. This should be great. First question. The thing I most want to understand from all of you when I leave this room, I want to understand what the rate of change in AI will be in the years to come. I want to know whether the crazy innovations we've had in the last two years will continue, will scaling laws continue, will we continue to go faster than Moore's Law? Or are we approaching a plateau of some sort? Will things start to slow down? So I'm going to take that question and I'm gonna give it to you, Kai-Fu Lee. Tell me whether the rate of change will increase or slow down in the years to come and why.
Kai-Fu Lee, Founder, 01.AI Pte. Ltd: I think it will slow down a little bit, but I think it'll still be at incredible rates. If you look at just in the last two years how much better these models have become, you know, two years ago the MMLU score, which is roughly a measure of intelligence, was in the 40s and 50s, now it's 90 and there's more room to grow.
Obviously, there are gains from adding more compute and data, but there's also room for tuning and improving different aspects as more and more entrepreneurs and large companies get into the game. We were very late to the game and we were able to make substantial improvements over existing models, not by going after scaling, but by doing, you know, different things in the tweaking of the data, the training, the infrastructure and so on. So I'm optimistic.
Nicholas Thompson, The Atlantic: You're optimistic it will stay both because you think the recent scaling laws will hold and also because there will be innovations?
Kai-Fu Lee, 01.AI Pte. Ltd: They will hold, but they will obviously slow down. There's a diminishing return to everything that one does, but it's definitely not at the plateau.
Nicholas Thompson, The Atlantic: Andrew, what do you think?
Andrew Ng, Founder, DeepLearning.AI, LLC: Yeah, so scaling gets harder and harder, but I feel like the pace will feel like it's still accelerating to most of us, because of the number of innovations and algorithmic advances. So some quick examples: we saw the text revolution happen last year, kind of; I think this year we'll see the image processing revolution take place. It's kind of here already with GPT-4V and Gemini Ultra, but really computers will see much better.
I'm seeing a lot of innovation in autonomous agents: rather than prompting an LLM that gives you a response, you can give an LLM an instruction and it will go off and do work for you for half an hour, browse web pages and do a lot of research and come back. This is not totally working right now, but a lot of people are working on it and it's another innovation.
Edge AI: you know, we're used to running LLMs in the cloud, but because of open source and other things, I think we'll be running a lot more large language models on our own devices in the future. So with all of these vectors of innovation, I'm actually optimistic that it will feel like the field is continuing to move forward.
Nicholas Thompson, The Atlantic: Aidan, let me ask you this variation on the question, which is: if you are able to double the amount of compute applied to large language models right now, will you double their power? Will you, you know, cube their power? What will happen as we increase the amount of compute?
Aidan Gomez, Co-Founder and Chief Executive Officer, Cohere Inc.: I mean, I think I would say that I agree it's going to keep pace or I would even go as far as to say it's going to start to accelerate. There are huge bottlenecks to what we have today, we know the limitations of the architectures that we have, the methods that we're using and I think that's going to get easier. At the same time, the hardware platforms are getting better and better. So the next generation of GPUs are going to be a big step over the generation we have today. And that unlocks new scale, much more expensive algorithms and methods to run.
So I'm very optimistic things are going to accelerate. In terms of the specific question of if you double the compute capacity, you know, what sort of decisions do you make? I would say at the moment we're not done with scaling. We still need to push up the scale; pretty much all of us building large language models are, and so I would double my model size.
Nicholas Thompson, The Atlantic: Whoa, I think everybody would double their model size if it was free. Well, Daphne, let me ask you: part of the reason I'm curious about this is that if the relationship holds and, you know, the better the GPUs, the more compute and electricity you have, the better your models, then it means that power will consolidate with the small number of companies that have access to it. If there's more of a plateau, it makes it a more competitive market, right?
Daphne Koller, Founder and Chief Executive Officer, Insitro Inc.: I'm actually going to turn your question in a slightly different direction.
Nicholas Thompson, The Atlantic: Tweak it however you want.
Daphne Koller, Insitro Inc.: And say that in your taxonomy of the enabling forces you mentioned compute, you mentioned electricity, you did not mention data. And I think that has been probably the single biggest enabler to the incredible amount of progress that we've seen today. And I think we're only starting to scratch the surface of the data that are going to become available to models over time.
So right now, for example, yes, we train on all the web scale data and that's amazing and that's incredible, but these agents are not yet embodied. They don't yet interact with the world. And so as we start to, you know, carry these things around with augmented reality, as we start to get more data from self-driving cars, as we start to tap into data modalities, such as biology and health care and all sorts of data that are currently hidden in silos, I think those models will develop new levels of capability that they don't currently have today. And, so I think that's going to be a major contributor to the expansion of capabilities of these models.
Nicholas Thompson, The Atlantic: So interesting, because I've had conversations in Davos where people have said, well, we're running out of data, right? There's not that much more on the web, Yann, where are you?
Yann LeCun, Silver Professor of Data Science, Computer Science, Neural Science and Electrical Engineering, New York University: I totally agree with Daphne. So I think, I mean, certainly if we concentrate on the paradigm of LLM, it's saturating. There's no question it is saturating. Indeed, we're running out of data, we're basically using all the public data on the internet.
They're trained with what, 10 trillion tokens? Okay, that's about two bytes per token, so it's about 2×10^13 bytes of training data. It would take most of us here between 150,000 and 200,000 years to read this. Now think about what a child sees through vision and try to put a number on how much information a four-year-old child has seen during his or her life: it's about 20MB per second going through the optic nerve, for 16,000 waking hours in the first four years of life and 3,600 seconds per hour. You do the calculation, and that's about 10^15 bytes. So what that tells you is that a four-year-old child has seen 50 times more information than the biggest LLMs that we have.
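A rough back-of-envelope check of those figures (a sketch in Python using the numbers quoted above; the 20MB-per-second optic-nerve rate and the 10-trillion-token count are the speaker's estimates):

```python
llm_tokens = 10e12                       # ~10 trillion training tokens (speaker's figure)
bytes_per_token = 2                      # ~2 bytes per token
llm_bytes = llm_tokens * bytes_per_token                # ~2e13 bytes of text

optic_nerve_rate = 20e6                  # ~20 MB per second through the optic nerve (speaker's figure)
waking_hours = 16_000                    # waking hours in the first four years of life
child_bytes = optic_nerve_rate * waking_hours * 3_600   # ~1.15e15 bytes

print(f"LLM training data:    {llm_bytes:.1e} bytes")
print(f"Child's visual input: {child_bytes:.1e} bytes")
print(f"Ratio: ~{child_bytes / llm_bytes:.0f}x")        # on the order of 50x
```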
And the four-year-old child is way smarter than the biggest LLMs that we have. The amount of knowledge it's accumulated is apparently smaller because it's in a different form, but in fact, the four-year-old child has an enormous amount of knowledge about how the world works. And we can't do this with LLMs today. And so we're missing some essential science and new architectures to take advantage of sensory inputs that, you know, future systems would be capable of taking advantage of, and this will require a few scientific and technological breakthroughs which may happen in the next year or three years, five years, ten years, we don't know.
Nicholas Thompson, The Atlantic: I want to make sure I understand you here, Yann. So the amount of text data that's available will grow, but not infinitely, but the amount of visual data that we could potentially put into these machines is massive?
Yann LeCun, New York University: Well, the 16,000 hours of video I was telling you about, that is 30 minutes of uploads on YouTube. I mean, we have way more data that we can deal with. The question is, how do we get machines to learn from video? We don't know.
Nicholas Thompson, The Atlantic: Right, so what is the new architecture that is needed if the next step is going to be video inputs? Obviously, a large language model isn't exactly right the way it's been constructed and optimised for it. What do we have to build now?
Yann LeCun, New York University: Okay, so large language models are trained in one way. You take a piece of text, you corrupt it and you train some gigantic neural net to reconstruct the full text, to predict the words that are missing, basically, you corrupt it by removing some of the words. LLMs like ChatGPT and others you train them by just removing the last word. I mean, technically, it's more complicated, but it's basically what they do, right?
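A minimal sketch of that objective in code (assuming a PyTorch-style setup; `model` here is a hypothetical network that maps token IDs to next-token logits, not any specific system):

```python
import torch.nn.functional as F

def next_token_loss(model, tokens):
    """Autoregressive objective: hide each 'next' word and train the network
    to predict it from the preceding context."""
    inputs = tokens[:, :-1]     # every token except the last
    targets = tokens[:, 1:]     # the same sequence shifted by one position
    logits = model(inputs)      # shape: (batch, seq_len - 1, vocab_size)
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)), targets.reshape(-1))
```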
So you train the system to reconstruct missing information about the input. So, of course, the obvious idea is why don't we do this with images? So take an image, corrupt it by removing or degrading some pieces, and then train some big neural net to recover the image. And that doesn't work, or it doesn't work very well. There is a whole thread of efforts in that direction that has been going on for a while, and it doesn't really work very well. It doesn't work for video either. I've been working for nine years on video prediction: you know, you show a piece of video to a system and then train it to predict what's going to happen next. And if the system is capable of doing this, it probably has understood something about the underlying nature of the world, the same way a text system that is trying to predict the next words, you know, captures something about the meaning of the sentence. But that doesn't work either.
Nicholas Thompson, The Atlantic: And, so what you mean is, you take a video and you have me going like this and dropping it and it will predict that the pen will fall, but right now, a machine can't do that.
Yann LeCun, New York University: So the question is you know your pen has a particular configuration when you drop it, it's going to follow a particular trajectory. Most of us cannot predict exactly what the trajectory is, but we can predict that the object is going to fall. It takes babies about nine months to figure out that an object that is not supported falls. Okay. How do we do this with machines?
Nicholas Thompson, The Atlantic: Wait. Okay, but sorry, this is a dumb question, but I don't understand: in the future these things are going to work and be continually revolutionary because they're going to understand video, because that's where the data is. But we don't know how to learn from video. How do you square that?
Yann LeCun, New York University: So the potential solution is, there is no real solution yet, but the things that are most promising at the moment, at least the things that work for image recognition, and I'm going to surprise everybody, are not generative. Okay, so the models that work best do not generate images. They do not reconstruct. They do not predict. What they do is they predict in a space of abstract representation. So the same way I cannot predict exactly how the pen will fall in your hand, I can predict that it will fall. So at some abstract level of, you know, a pen being here or there, without the details of exactly what its configuration is, I can make that prediction. So what's necessary would be to make predictions in abstract representation space as opposed to pixel space. That's why, you know, all the prediction in pixel space has failed so far: it's just too complicated.
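One way to read "predict in a space of abstract representation" is a joint-embedding setup along these lines (a sketch; `encoder` and `predictor` are hypothetical neural network modules, not any published model):

```python
import torch
import torch.nn.functional as F

def representation_prediction_loss(encoder, predictor, frame_t, frame_t_plus_1):
    """Instead of reconstructing pixels, predict the embedding of the next frame
    from the embedding of the current one; the loss lives in representation space."""
    z_t = encoder(frame_t)                    # abstract representation of the current frame
    with torch.no_grad():
        z_next = encoder(frame_t_plus_1)      # target representation (gradient stopped)
    z_pred = predictor(z_t)                   # predicted representation of the next frame
    return F.mse_loss(z_pred, z_next)         # no pixel-level reconstruction anywhere
```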
Daphne Koller, Insitro Inc.: But I think it's more than just video. I think the other thing that babies learn is the notion of cause and effect, which they learn by intervening in the world and seeing what happens. And we have not yet done that at all with LLMs. I mean, they are entirely predictive engines. They're just doing associations. Getting to causality is so critical when one tries to cross the chasm between bits and atoms, and that's a huge capability that's missing in current-day models. It's missing in models that are embodied, it's missing in the ability of our computers to do common-sense reasoning, missing when we try to go to other applications, whether it's manufacturing or biology or anything that interacts with the physical world.
Yann LeCun, New York University: Well, in embodied systems, that's actually kind of working. So, I mean, some of those systems have world models. You know, here's a representation of the state of the world at time T, here is an action I might take: tell me the state of the world at time T plus one.
So that's called a world model and if you have this kind of world model, you can plan a sequence of actions to arrive at a particular goal. And we don't have many AI systems based on this principle at the moment, except very simple, kind of robotic-like systems that don't learn very fast. And so once we can scale this kind of model up, we'll have systems that can understand the world, understand the physical world, they can plan, they can reason and they can understand causality because they understand what effect an action will have.
And they will be goal-oriented, objective-oriented, because we can give them goals to satisfy with this planning. So that's the future architecture of AI systems and, in my opinion, once you figure out how to make this work, nobody in their right mind would use autoregressive LLMs anymore.
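A sketch of the planning loop that a learned world model enables (all names here are hypothetical placeholders, not an existing system):

```python
def plan(world_model, state, candidate_action_sequences, goal_cost):
    """Roll each candidate action sequence through a learned state-transition model
    and keep the sequence whose predicted outcome best satisfies the goal."""
    best_actions, best_cost = None, float("inf")
    for actions in candidate_action_sequences:
        predicted_state = state
        for action in actions:
            # world_model: state at time T plus an action -> predicted state at time T+1
            predicted_state = world_model(predicted_state, action)
        cost = goal_cost(predicted_state)     # how far the predicted outcome is from the goal
        if cost < best_cost:
            best_actions, best_cost = actions, cost
    return best_actions
```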
Nicholas Thompson, The Atlantic: All right. Well, I don't understand how to make a neural network, but I do know a little bit about children because I have raised three of them and I think you guys overrate babies. So, cause and effect: when my youngest was nine months old, I remember him standing at the side of the crib hollering and then he flipped over. Thankfully he landed on his butt, but I don't think he understood the notion of objects falling and exactly what cause and effect are. Kai-Fu, is Yann correct on what needs to happen? Is he pursuing it the right way or is he falling short because his ideas are wrong?
Kai-Fu Lee, 01.AI Pte. Ltd: No, Yann is always right!
Yann LeCun, New York University: Thank you.
Kai-Fu Lee, 01.AI Pte. Ltd: However, we shouldn't lose sight of the incredible commercial value that exists in the text-based LLMs, right? I mean, they give an incredible pretence of logical reasoning, even common sense. They solve real problems, they can generate content, they dramatically improve our productivity, they're being deployed everywhere.
So, you know, putting on my more entrepreneurial hat, I just see so much value that remains to be reaped. On this opportunity to have a world model, I think that's a great thing for researchers to work on. But I think for me as a start-up company, that's something that's a bit farther out. And we'd love to have academia and, you know, large company research labs make the discoveries. Then we'll follow.
Nicholas Thompson, The Atlantic: So your view is that even if we stick with text-based large language models, we don't move on to all this crazy stuff that Yann is talking about here on my left, the world will still be turned upside down?
Kai-Fu Lee, 01.AI Pte. Ltd: We're already seeing it, absolutely. I mean, we're seeing basically content generation and emulation of people and creation of interesting experiences and making better search engines and basically everywhere you can imagine: office productivity, creating PowerPoints, content. We have way too many things for us to think about and work on when we think about what can make the most money and produce the most value for users today.
Nicholas Thompson, The Atlantic: Aidan, do you agree with Kai-Fu?
Aidan Gomez, Cohere Inc.: Yeah, I definitely agree about the market opportunity and the value that exists today and will be coming very, very soon, even if we don't make it all the way to AGI.
I think Yann answered a really interesting question, which may have been different than the one asked. I think you were asking: if we just keep doing what we're currently doing with autoregressive models, will we make it? And I would agree the answer is no, and for the reasons that both of you say, which is like grounding and the ability to actually experience the real world gives you those causal insights. I do believe those aren't insurmountable hurdles and I think you both believe that as well.
So I think people are working on that, so far, the dumb strategy has worked so well that we've just been able to be like, build a bigger supercomputer, scale up the data and we get performance, we get extraordinary performance. Yeah, I think we know what's next. We need online experience. We need to be able to actually interact with the real world.
The way these models are deployed today, we do all of this offline training, and offline means there's no interaction with a person, there's no interaction with the environment, and we deploy them, we put them into a product. It's static. It doesn't learn. It's fixed from there. Nothing you do changes that model. And so that needs to change in order for these things to continuously learn.
And the other big hurdle is we humans, we learn through debate like this. Right? We discover new ideas. We explore the space of truth and what's possible knowledge. Models need to be able to do that amongst themselves as well.
So this idea of self-play and self-improvement: right now, a major bottleneck, I'm sure Kai-Fu feels that as well, I'm sure we all feel it, is getting access to data that is smarter than our model as it is. So before you could just, like, pull anyone off the street and be like, please teach my model to speak properly, and it would improve your model, it would increase the score. They're starting to get really, really smart. And so you can't go to the average person off the street, you need to go to a master's student in mathematics. You need to go to a bio student and then a PhD. And then who? And so humanity and its knowledge is kind of a limit, an upper limit, to the current strategy. And so we need to break through that.
Nicholas Thompson, The Atlantic: Okay. So hold on. So eventually, you know, we've got the man off the street and then we've got the PhD and then we've got the smartest person in the world and then eventually the machines are just talking to each other. They're training each other, creating all this synthetic data. They're training on each other's synthetic data. We meagre humans, you know, have no idea what's going on. How do we know that they're not corrupting themselves, polluting themselves?
Aidan Gomez, Cohere Inc.: I would just interrupt. I think Daphne would tell you, that it's not just synthetic data and them interacting with themselves in this little box of isolation. They need access to the real world to run those experiments and experience that, to form a hypothesis, test a hypothesis, fail a thousand times and succeed once, just like humans do, to discover new things. Sorry I cut you off. What was your question?
Nicholas Thompson, The Atlantic: No, that was awesome. That was great.
Andrew Ng, DeepLearning.AI, LLC: Can I jump in and comment on the technology limitation? I don't know if Yann and I agree or disagree on this, but people talk about how bad large language models are right now. If you ask one to predict the next character after multiplying two large numbers, it often gives the wrong answer. But it turns out humans are also really bad at that. If you give a human, you know, pencil and paper, if you give me a pencil and paper, I'm much better at math than if you force me to just spit out the answer without working it through. And I think large language models are like that, too.
So I think one of the ways to overcome some of these limitations is to give a large language model the equivalent of a scratchpad; you could actually get it to be much better at math. And, in fact, large language models are using all sorts of tools, such as a scratchpad, to actually make things happen in the world, to browse web pages. And these tools are one way, in the short term, with probably more fundamental things to be done in the long term, to break through some of these limitations of purely autoregressive models.
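A sketch of the scratchpad idea (hypothetical: `llm_generate` stands in for any text-completion call, and the CALC(...) convention is purely illustrative, not a real API):

```python
import re

def answer_with_scratchpad(llm_generate, question):
    """Rather than forcing the model to spit out the product of two large numbers
    directly, let it write the calculation on a 'scratchpad' that is evaluated
    exactly, then feed the result back in."""
    draft = llm_generate(f"{question}\nIf arithmetic is needed, write CALC(<expression>).")
    match = re.search(r"CALC\(([\d\s+\-*/.]+)\)", draft)
    if match:
        result = eval(match.group(1), {"__builtins__": {}})  # exact arithmetic, for the sketch only
        return llm_generate(f"{question}\nThe calculation gives {result}. Final answer:")
    return draft
```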
Daphne Koller, Insitro Inc: But I think more broadly, and coming back to the points that Aidan attributed to me, and thank you for that, we do not have the ability at this point to create an in silico model of the world. The world is really complicated and the ability that we have to experiment with the world and see what happens and learn from it, I think is absolutely critical to what makes for human intelligence.
So, if we want these machines to grow, we need to give them the ability not just to talk to each other in silico and kind of stew in the juice of their own little universe, but really to experiment with the world and generate the kind of data that helps them continue to grow and develop. And I think the big differentiator as we move forward is giving the computers access to designing experiments, whether it's simple experiments like what happens when you drop the pen, or more complex experiments, which is what happens when I put these five chemicals together in a cell. What happens to the cell? What happens to the human? Those are the kinds of experiments that are going to teach the computer about this incredible complexity of the world and allow us to really go beyond what a person can currently teach you, once you've kind of plateaued at the math expert or the biology expert.
Nicholas Thompson, The Atlantic: It is so interesting. I feel like I'm getting a lot smarter.
Kai-Fu Lee, 01.AI Pte. Ltd.: Can I just jump in and say that yes, those are all great aspirational goals, but there is so much engineering work, and you could call it a patch if you want, to make it better. Like the problem that, you know, it creates things out of the blue: you can correct that by gluing a RAG search engine to it. With mathematical problems, theoretically, you could glue a Wolfram Alpha to it. Various companies have attempted to do that. And I think a lot of the issues that we see today can be addressed by that. I know that's not elegant. One could argue GPT-4V is a smart way to glue things as well, but I think we have a lot more engineering gluing that can cover up a lot of the issues today as researchers aspire to build embodied AI and advanced things.
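A sketch of that kind of gluing, here retrieval-augmented generation (hypothetical: `search` and `llm_generate` stand in for a search engine and a language model; this illustrates the pattern, not any company's implementation):

```python
def rag_answer(search, llm_generate, question, top_k=3):
    """Ground the model's answer in retrieved passages instead of letting it
    make things up out of the blue."""
    passages = search(question, top_k=top_k)
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the passages below; "
        "if the answer is not there, say so.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )
    return llm_generate(prompt)
```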
Nicholas Thompson, The Atlantic: I want to jump back to something that Aidan said, where you're talking about AGI, and then you talked about making it. And so artificial general intelligence is a term, I don't wanna get stuck in a definition, but superintelligent machines or, you know, the goal of OpenAI, the stated goal, is to build a machine that is smarter than humans at everything. So, I want to ask all of you, is this the proper goal for so many AI researchers? Should we be trying to build machines that are better than humans in general ways?
Yann LeCun, New York University: Okay, should we build aeroplanes that go faster than birds? The purpose of building aeroplanes was not to go faster than birds, but to figure out how you fly. And, I think, you know, the problem of, say, AI research at the moment is, you know, discovering the underlying principles behind intelligence and learning, whether it applies to humans or not. And the best way to do this is to actually build machines that are intelligent. You can't, you know, preserve sanity, I guess, without actually building those things.
So there's a scientific question, you know: what is intelligence? What are the required components to make machines intelligent? One of them is learning. We know about that. We made some progress on this, but we so far have not been able to reproduce the type of learning that allows a ten-year-old to learn to clear the dinner table and fill the dishwasher in one shot, or allows a 17-year-old to learn to drive in 20 hours. We still don't have level-five self-driving cars. So what is it, you know, what type of learning takes place in humans and animals, by the way?
Nicholas Thompson, The Atlantic: But Yann, I would push back on your metaphor, because what we did is we took one thing that a bird does, which is it flies, and we got way better at it. We didn't try to create a better bird. We said, oh, look, a bird, right? So I would think that maybe the parallel would be AI researchers saying: instead of trying to make a human mind that's better at everything, let's figure out how AI can best serve human biology, for example.
Daphne Koller, Insitro Inc: So I'm going to, since you asked us to disagree, I'm going to disagree with Yann.
Nicholas Thompson, The Atlantic: No one ever disagrees with Yann, as you can tell from Twitter.
Daphne Koller, Insitro Inc: Well, I'm going to try to disagree with Yann, which is to say, I'm not convinced that the current way in which we're designing AI systems is teaching us about human intelligence. It's teaching us about an intelligence. We're certainly learning about how to build an intelligence. I'm not sure that building planes has taught us how birds fly. I'm not sure that building AI has taught us how people reason.
So I think it is a very worthy goal. I'm not sure that the path that we're on is necessarily going to lead us to that specific goal of understanding human intelligence. On the other hand, I'm going to now agree with Kai-Fu, which is to say there are multiple other endeavours that one can undertake in this world of AI, which is to solve really hard, important, aspirational problems that people maybe are not capable of solving. And we will not get there by trying to replicate how people reason. The problems that you alluded to that I'm working on, which are problems in biology and health: people are actually really bad at solving those problems.
And I think that is because people are really bad at finding these subtle patterns in very large amounts of heterogeneous, multimodal data that just don't fit together into the kind of natural patterns that a baby learns when they're, you know, interacting with the world. And so you can imagine that a different aspirational goal, not to diminish Yann's goal, is to just build computers that are capable of addressing really hard societal problems, whether it's in health or in agriculture and the environment, climate change and so on, where people are just not going to get there on their own. And so I think those are kind of two different paths, if you will.
Yann LeCun, New York University: So it's absolutely the case that we understand how birds fly because we built aeroplanes. It's absolutely the case that we understand much better about, for example, human perception because of neural nets. There are literally thousands of papers in neuroscience that use convolutional nets as a model for the visual cortex.
Daphne Koller, Insitro Inc: We understand it better, but it's not the same.
Yann LeCun, New York University: It's as good as it gets. It's much better than whatever we were using before, which was basically template matching. However, people are trying to do the equivalent using LLMs to explain how the human brain understands stories, for example. That doesn't work.
Daphne Koller, Insitro Inc: So we agree.
Nicholas Thompson, The Atlantic: I want to understand AGI. I think it's leading to some very interesting places. Andrew, what do you think of it as a goal?
Andrew Ng, DeepLearning.AI, LLC: Just to share, I think AGI is a great goal, but AI, I think, is big enough. We shouldn't all have one goal. I think we should have some of us saying let's work on AGI, some of us working on climate change, some of us working on life sciences. And let's do all of these things and let's just celebrate them all.
Kai-Fu Lee, 01.AI Pte. Ltd.: Well, when I was 18, I had AGI as my dream, and that's to figure out how human cognition works. And while I can't disagree with Yann's analogy about how understanding human cognition helped the state of the art, I have to side more with Daphne in that today's LLM is a really different kind of animal: you teach it differently, it's really good at some things, it's really bad at some other things, and I think it's admirable to try to fix some of the problems with it, but its upside is totally unrealized right now. So there's so much you can do that will make our LLMs create more value for the world without necessarily making them emulate or learn or beat humans.
It's like when people invented the automobile: no one ever wanted to teach it to walk, right? But cars became trucks, became all kinds of other things, and engines started making aeroplanes. And so I think we're at that phase with Henry Ford; I guess you're kind of the Henry Ford who invented the technology, and we're now trying to build on top and say there are so many more different kinds of things you can build once you have the engine. And whether that engine is or isn't like the human brain, I'm less interested in that. I think realising the value also causes us to want to find ways to make that new LLM engine create more value. So that's what I'm focussed on. But I think, you know, other people can love AGI and try to fix problems; I think scientifically that's interesting. But I think the fastest path to value creation is to take this great engine we have and build automobiles, trucks and aeroplanes from it.
Nicholas Thompson, The Atlantic: So, Aidan, when you were talking about AGI, it was about a minute after that and you said something along the lines of if we're going to make it. And I wanted to pause and ask you what it meant, make it, but you had such an interesting answer going that I didn't. But now I want to ask you: when you said make it, did you mean AGI or did you mean continuously improving AI that serves humanity in some broader way? Because there are a lot of people in the field for whom it's almost like AGI is the end zone. How do you view this?
Aidan Gomez, Cohere Inc.: No, it's continuous. Right? I don't think it's discrete. I think you can be superhuman in a specific domain, right? And so the models might be superhuman in particular aspects. We have models that play Go extremely well, beyond any human player. So I think it's a continuum. Maybe there is some point where models are better than humans at everything, but I think it's probably an ill-specified definition. And this is much more of a continuous change.
On the question of AGI and whether it's the right call, I mean, for me personally, I just want to create value for the world. And so I don't care if we stopped short of AGI, if we fall short, there's a lot to be done with that technology. And there's a ton of value to be created and I'll be happy with it, but of course, I hope that we achieve the maximum value possible. And I think that does necessitate having a tool that we can lean on, which is as powerful as possible. I think that's what all of us here are trying to build.
Nicholas Thompson, The Atlantic: I want to switch to another topic.
Andrew Ng, DeepLearning.AI, LLC: One funny thing about the definition of AGI is there's been the biological path to intelligence, you know, say human brains, and then there's been the digital path to intelligence, the AI models we have, and they have different strengths and weaknesses. The funny thing about the definition of AGI is we're forcing the digital path to intelligence to be benchmarked against the biological path. And it's worth doing, but that makes it really tough to create AGI, even as clearly we're building some incredibly valuable digital intelligence.
Nicholas Thompson, The Atlantic: I love that point. That said.
Yann LeCun, New York University: I think a big difference here is, you know, in the progression of science and technology there is the frontier of it, but then you can sort of milk an advance to make practical solutions. And, you know, people in this room are either investors or technologists or CEOs of companies who need to do this in the short term because, you know, that's your purpose, or it could be long term.
And, you know, all of us were inspired by the science of making progress towards AI. To some extent, you have abandoned it, but I'm still 18 years old.
Andrew Ng, DeepLearning.AI, LLC: I have not abandoned it.
Yann LeCun, New York University: Okay, great.
Nicholas Thompson, The Atlantic: Now I want to talk about one of the most controversial ideas in AI, which is open source and the role of open source in future AI. So as we've seen, by following the conversation in the United States and the United States government, there is a lot of fear of open source.
There is a fear that if there's open source AI that people can modify, that they can use, that can be shared, lots of bad guys will be able to do bad things with AI, and not only that, horrible things might happen. Like you might end up with a Chinese-based large language model that was based partly on the infrastructure of an American-based model built by a French guy; they took the infrastructure, built a different model, different, you know, data. You might end up with something terrible like that that would shoot up the Hugging Face rankings.
So a lot of fear about open source. But in this panel, I suspect we're going to lean somewhat towards open source. Yann, you are one of the chief advocates of open source for everything in AI, so why don't we start with you: present your view and then let's push back and see where we go.
Yann LeCun, New York University: So it's been the case in the history of technology that when something is viewed as a basic, common platform, it ends up being open source, right? That's true for the software infrastructure of the internet. And the reason is that it's safer and it progresses faster. And the reason why we've seen such fast progress in AI over the last decade or so is because people practice open research. Because as soon as you get a result, you put it on arXiv and you open source your code and everybody can reuse it.
And we have common software platforms like PyTorch that everybody uses. And so it makes the entire field accelerate because you have a lot of people working on it, you know, they are self-selected. That's the magic of open research and open source. And so the way to keep that progress fast is to keep the research open. So if you legislated open source out of existence because of fears, you slow down progress, and the bad guys are going to get access to it anyway, and they are just going to catch up with the rest of the world.
Nicholas Thompson, The Atlantic: So you are in favour of open source all the way. So even when, you know, people are building, you know, I don't know, sex bots and people no longer relate to each other?
Yann LeCun: Okay, so there is a big question there, which is what do you consider acceptable behaviour for the intelligence system? And all answers would be different in this room, particularly if you are from, you know, countries with different languages, cultures, values or centres of interest. So it's not like we're going to have the possibility of having AI systems that cater to all our interests if they are produced by a handful of companies on the West Coast of the US.
The only way we can have AI systems, you know, all of our interaction with the digital world in a somewhat not too distant future is going to be mediated by AI systems. They're going to live in our glasses or smart phones or whatever device and they're going to be like human assistants assisting us in our daily lives. All of our digital diet will be mediated by AI systems.
You do not want this to be under the control of a small number of private companies. So the basic foundation models will have to be open source, and then on top of this, you will have systems that are built for particular cultures, languages, values, centres of interest and whatever, by groups that can fine-tune those systems for that purpose. And for the same reason we need a diverse press, we will need a diverse population of AI systems to cater to all interests. And so, you know, this is not going to be legislated. It should not be legislated.
Nicholas Thompson, The Atlantic: So does everybody here agree with the Chief AI researcher at Facebook on the desire to maximally devolve power from the large West Coast tech companies?
Aidan Gomez, Cohere Inc.: I was gonna say I'm in favour of that. I like that, but there are points I want to push back on.
Nicholas Thompson, The Atlantic: Identify the points.
Aidan Gomez, Cohere Inc.: Yeah. I was just going to say that I don't think it's a binary. I don't think you have to choose between open source and closed source, and I completely agree, you should not regulate it and create policy that forbids open source. I mean, I don't know a lot of people who are proposing that, but for anyone who is, I do think there's a middle ground here. And I think that there is some category of models that can be open source and should be open sourced, and fuel the creators, right? Like the hackers who want to build new experiences, experiment, try to subvert the large West Coast tech companies, they should absolutely have access to technology to do that.
At the same time, there are organizations who are trying to build businesses around this. We want to create a competitive advantage. We should have the right to keep our models closed and that should be okay. And so there should be this hybrid ecosystem, as opposed to one or the other. And I know middle answers are boring and I should probably take the extreme to make this a little bit more interesting.
Nicholas Thompson, The Atlantic: There are a lot who are not directly trying to legislate open source out of existence, but the Biden administration issued an executive order which requires, you know, massive legal teams to submit red-team results to the United States government. It's not as though some open source AI company can comply with that. I mean, there are legislative proposals that would have the effect of consolidating power in a relatively small number of companies.
Andrew Ng, DeepLearning.AI, LLC: That's true.
Aidan Gomez, Cohere Inc.: Yeah, I believe that's true.
Nicholas Thompson, The Atlantic: So does anybody want to push back or augment Yann's arguments about open source?
Andrew Ng, DeepLearning.AI, LLC: So I agree with Yann on some things and disagree on one thing. So I think that to me is a question of: do we think the world is better off with more intelligence? Right, and up until now that has primarily been human intelligence; now we also have artificial intelligence. I think intelligence tends to make societies wealthier and make people better off, and with open source we can make it available to more people, acknowledging that, yes, intelligence can, in some cases, be used for nefarious purposes, but on average, I think it actually makes us all much better off.
There is one thing where I may have a different perspective than Yann, which is that infrastructure naturally lends itself to open source. I wish it were true, but today we all build on, you know, Nvidia and AMD and Intel chips, we all build on the clouds, but the semiconductor designs are very closed. And to me, I think it's actually up to us to work with governments for positive regulations. It is actually up to us collectively how open or closed a society we want, and there are definitely powerful forces at this moment that to some extent have succeeded in pushing forward regulatory proposals to put in place very burdensome requirements on open source. And I think we're in a very dangerous moment where there's a significantly greater risk of overregulation than underregulation. And frankly, when I'm in Washington, DC, interacting with people, I can almost feel the tens or hundreds of millions of dollars of lobbyists attempting to regulate AI with a significant policy agenda to, you know, shutter open source, as some companies would maybe rather not have to compete with it.
Nicholas Thompson, The Atlantic: Kai-Fu, are you maximally open source?
Kai-Fu Lee, 01.AI Pte. Ltd.: I'm not maximally, but I do agree with almost everything that Yann said. I think I would amplify one point, which is that one of the issues of one or a few companies having all the most power and dominating the models is that it creates tremendous inequality, and not just for people who are less wealthy and for less wealthy countries, but also for professors, researchers, students, entrepreneurs, hobbyists. If there were no open source, what would they do to learn? Because they might be the next creator, inventor or developer of applications, and to force them to develop on one or two platforms, I think, will dramatically limit innovation.
Nicholas Thompson, The Atlantic: Is that necessarily true? Last night I heard Sam Altman at the Innovators Dinner, I believe you were there, make the argument that essentially with OpenAI, the best position would be for OpenAI to build sort of the perfect machine that exceeds AGI, and then to give it away for free in order to reduce income inequality. So you don't actually need open source to reduce income inequality.
Kai-Fu Lee, 01.AI Pte. Ltd.: Well, that's the kind of altruism that I think not everyone will naturally believe in. Right? So I think it's much better to make technologies available to the most people possible, and I'm absolutely against regulation. And one point I would say is I'm not sure I would agree with Yann that the foundational model has to be totally open source. I think for some companies, they might want to create a competitive advantage at the foundation model level, but I think both are viable possibilities.
Yann LeCun, New York University: Well, you said Sam Altman said that, but OpenAI does not have a monopoly on good ideas. They are not going to get to AGI, whatever they mean by that, by themselves. In fact, they're using PyTorch. They're using Transformers. I mean, they're using stuff that was published by, you know, many of us, infrastructure, and they just clammed up recently, you know, because of competitive advantage. And that's the only way to generate revenue, but they're profiting from that open research landscape that I was talking about earlier.
Kai-Fu Lee, 01.AI Pte. Ltd.: And the two largest companies in this space are limiting publication more than I've ever seen in any field that's growing.
Daphne Koller, Insitro Inc: But continuing on this theme, I mean, the models that are constructed by these companies might be great for certain applications, but might not be the right foundation for other applications. And so if you're restricting the set of available foundations to those of one or two or three, you know, companies that are designing towards a particular use case, it might not be the right foundation for the next innovator to build their new application that uses their new creative idea or new data modality that wasn't used in the original conception of the models that were developed by these companies.
I think you're stifling innovation if you do not create a strong foundation of open-source models on which other people can build. Now, I agree with Aidan and with Kai-Fu that not every model anyone generates has to be open source. I think that, you know, people can certainly have a competitive advantage with a new model architecture, with a new type of data modality, but there needs to be a strong foundation of open-source models for the community to build on.
Nicholas Thompson, The Atlantic: All right. So we have about a minute and a half left. So I want to do a quick round. One of the things Kai-Fu talked about is that open source is essential to make the world more equal. The last revolution in technology seemed more or less to make the world less equal. What has to happen in AI to make the world more equal? Just quickly, one thing you want to see happen.
Yann LeCun, New York University: On the question of equality: of course, you know, open-source dissemination is a big factor for equality. Now, for economic equality, it's a fiscal question, a fiscal policy question. It's not a technological question.
Andrew Ng, DeepLearning.AI, LLC: I think training. When I look at AI adoption in enterprises, sometimes one of the biggest bottlenecks in many corporations is training. The ideas and the options are there; you need to upskill the workforce, not just to build AI apps, but also just to use bots, as that drives productivity gains really quickly.
Daphne Koller, Insitro Inc: So maybe not surprisingly, because Andrew and I co-founded Coursera, I'm going to twist Andrew's answer a little bit and say: education. One of the biggest differentiators between people who are able to use this technology and ones who are not is structured thinking. Structured thinking is not something that we teach our kids nearly well enough. And I think that is a skill that we need to start at the elementary level, because that will be the big differentiator between people who successfully use this technology and ones who do not.
Kai-Fu Lee, 01.AI Pte. Ltd.: Global competition. I think not only is it dangerous to have one company dominate everything, because that stifles innovation and more competition will help address that, but also global, because we have to be careful that this is the first time a technological platform comes with its own ideology, values and biases. So we have to let different people, cultures, countries, religions develop their own models and compete equally globally, and not let any one country dominate it.
Aidan Gomez, Cohere Inc.: Yeah, I think it's access. I think access to the technology and making sure that everyone is able to contribute to it, that it speaks their language, that it knows their culture is going to be crucial to equality.
Nicholas Thompson, The Atlantic: All right. Amazing panel. Thank you, you guys are all great. Thanks.