Transcript
Claims
  • Unknown A
    We're going to have these models that are going to be a backbone of our services, and we're going to be constantly doing inference with them and training future versions. When you think about the amount of compute we'll need by 2030 to accommodate all these use cases, where does the Fermi estimate get you?
    (0:00:00)
  • Unknown B
    You're going to want a lot of inference compute; that's the rough, highest-level view of these capable models. Because if one of the techniques for improving their quality is scaling up the amount of inference compute you use, then all of a sudden what's currently one request to generate some tokens becomes 50 or 100 or 1000 times as computationally intensive, even though it's producing the same amount of output. You're also going to see tremendous scaling up of the uses of these services, because not everyone in the world has discovered these chat-based conversational interfaces where you can get them to do all kinds of amazing things. Probably 10% or 20% of the computer users in the world have discovered that today. As that pushes towards 100% and people make heavier use of it, that's going to be another order of magnitude or two of scaling.
    (0:00:13)
  • Unknown B
    And so you're now going to have two orders of magnitude from that, an order of magnitude or two from that, and the models are probably going to be bigger, so you'll get another order of magnitude or two from that. That's a lot of inference compute you'll want, so you want extremely efficient inference hardware for the models you care about.
    (0:01:12)
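The order-of-magnitude stacking above can be written down as a quick Fermi product. The factor ranges are the rough ones mentioned in the conversation; the exact endpoints are illustrative assumptions, not precise claims:

```python
# Fermi sketch of total inference-compute growth by stacking the
# multiplicative factors from the discussion. Endpoints are assumptions.
factors = {
    "inference compute per request": (100, 1000),  # "50 or 100 or 1000 times"
    "user adoption and heavier use": (10, 100),    # "order of magnitude or two"
    "bigger models": (10, 100),                    # "order of magnitude or two"
}

low = high = 1
for lo, hi in factors.values():
    low *= lo
    high *= hi

print(f"Combined scaling: {low:,}x to {high:,}x")  # 10,000x to 10,000,000x
```

Even the low end of this range, four orders of magnitude, is the crux of the argument for extremely efficient inference hardware.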
  • Unknown A
    Flops, total global inference in 2030.
    (0:01:28)
  • Unknown C
    I think more is always going to be better. If you just think about it: okay, what fraction of world GDP will people decide to spend on AI at that point? And then, what do the AI systems look like? Well, maybe it's some sort of personal-assistant-like thing that is in your glasses, can see everything around you, and has access to all your digital information and the world's digital information. And maybe it's like you're Joe Biden and you have the earpiece and the cabinet that can advise you about anything in real time, solve problems for you, and give you helpful pointers. Or you could talk to it, and it wants to analyze anything it sees around you for any potential useful impact on you. So say it's your personal assistant or your personal cabinet, and every time you spend 2x as much money on compute, the thing gets like 5 or 10 IQ points smarter.
    (0:01:34)
  • Unknown C
    And okay, would you rather spend $10 a day and have an assistant, or $20 a day and have a smarter assistant? And not only is it an assistant in life, but an assistant in getting your job done better, because now it takes you from a 10x engineer to a 100x or 10-million-x engineer. Okay, so let's see, from first principles: people are going to want to spend some fraction of world GDP on this thing. World GDP is almost certainly going to go way, way up, orders of magnitude higher than it is today, due to the fact that we have all of these artificial engineers working on improving things. Probably we will have solved the energy and carbon issues by that point, so we should be able to have lots of energy, and we should be able to have millions to billions of robots building us data centers.
    (0:02:55)
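The GDP step of this estimate can be made concrete with placeholder numbers. All three inputs below are assumptions for illustration; the speaker only says "some fraction" of a GDP that is "orders of magnitude higher":

```python
# Hypothetical inputs for the "fraction of world GDP" Fermi step.
world_gdp_today = 100e12   # ~$100 trillion/year, roughly today's figure
gdp_growth_factor = 10     # assumption: "orders of magnitude higher"
ai_spend_fraction = 0.10   # assumption: fraction of GDP spent on AI

annual_ai_spend = world_gdp_today * gdp_growth_factor * ai_spend_fraction
print(f"Annual AI spend: ${annual_ai_spend:.0e}/year")  # $1e+14/year
```

Under those placeholder inputs, AI spend alone would equal today's entire world GDP, which is the flavor of answer the "more is always better" argument leads to.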
  • Unknown C
    Let's see, the sun is what, 10 to the 26 watts or something like that? I'm guessing that the amount of compute being used for AI to help each person will be astronomical.
    (0:04:03)
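The "10 to the 26 watts" aside can be pushed one step further: a rough conversion from a solar-scale power budget to per-person compute. The efficiency figure below is a deliberately optimistic assumption, not a real hardware number:

```python
# Back-of-envelope: compute per person under a solar-scale power budget.
SOLAR_OUTPUT_W = 3.8e26  # the Sun's total output, the "10 to the 26 watts"
FLOPS_PER_WATT = 1e12    # assumed future efficiency; today's accelerators
                         # are closer to 1e10-1e11 FLOP/s per watt
POPULATION = 1e10        # ~10 billion people

flops_per_person = SOLAR_OUTPUT_W * FLOPS_PER_WATT / POPULATION
print(f"{flops_per_person:.1e} FLOP/s per person")  # 3.8e+28 FLOP/s per person
```

That is roughly ten orders of magnitude more than today's largest data centers deliver in total, which is what "astronomical" cashes out to here.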
  • Unknown B
    I mean, I would add on to that. I'm not sure I agree completely, but it's a pretty interesting thought experiment to go in that direction, and even if you get partway there, it's definitely going to be a lot of compute. This is why it's super important to have as cheap a hardware platform as possible for using these models and applying them to the problems Noam described, so that you can make these capabilities accessible to everyone in some form, at as low a cost as you possibly can. And I think that's achievable: by focusing on hardware and model co-design, we should be able to make these things much, much more efficient than they are today.
    (0:04:24)
  • Unknown A
    Is Google's data center build out plan over the next few years aggressive enough given this increase in demand you're expecting?
    (0:05:12)
  • Unknown B
    I'm not going to comment on our future capital spending, as our CEO and CFO would undoubtedly prefer. But I will say you can look at our past capital expenditures over the last few years and see that we're definitely investing in this area, because we think it's important. And we are continuing to build new and interesting, innovative hardware that we think really helps give us an edge in deploying these systems to more and more people, both for training the models and for making them usable for inference.
    (0:05:21)
  • Unknown A
    Obviously you put in language, you get language out, and obviously it's multimodal. But the Pathways blog post talks about so many different use cases, ones that are not obviously of this autoregressive nature, going through the same model. So could you imagine Google as a company where every product goes through this: Google Search goes through it, Google Images goes through it, Gmail goes through it? The entire server is just this one huge, specialized mixture of experts?
    (0:05:57)
  • Unknown B
    I mean, you're starting to see some of this already, with a lot of uses of Gemini models across Google that are not necessarily fine-tuned. They're just given instructions for a particular use case and feature in a particular product setting. So I definitely see a lot more sharing of what the underlying models are capable of across more and more services. I do think that's a pretty interesting direction to go, for sure.
    (0:06:27)
  • Unknown A
    I feel like people listening might not register how interesting a prediction this is about where AI is going. It's sort of like getting Noam on a podcast in 2018 and having him say, yeah, I think language models will be a thing. This is where things go. That's incredibly interesting.
    (0:06:56)
  • Unknown B
    Yeah. And I think you might see that there might be a big base model, and then you might want customized versions of that model with different modules added onto it for different settings, maybe with access restrictions. Like maybe we have an internal one for Google employees, where we've trained some modules on internal data: we don't allow anyone else to use those modules, but we can make use of them. And maybe for other companies you add on other modules that are useful for that company's setting and serve it in our cloud APIs.
    (0:07:11)
  • Unknown A
    What is the bottleneck to making this sort of system viable? Is it like systems engineering? Is it ML? Is it?
    (0:07:43)
  • Unknown B
    I mean, it's a pretty different way of operating than our current Gemini development. So I think we will explore these kinds of areas and make some progress on them, but we need to really see evidence that it's the right way, that it has a lot of benefits. Some of those benefits may be improved quality; some may be less concretely measurable, like this ability to have lots of parallel development of different modules. But that's still a pretty exciting improvement, because it would enable us to make faster progress on improving the model's capabilities in lots of distinct areas.
    (0:07:51)
  • Unknown C
    I mean, even the data-control modularity stuff seems really cool, because then you could have a piece of the model that's trained just for me and knows all my private data.
    (0:08:34)
  • Unknown B
    Like a personal module for you would be useful. Another thing might be you can use certain data in some settings, but not in other settings. And maybe we have some YouTube data that's only usable in a YouTube product surface, but not in other settings. So we could have a module that is trained on that data for that particular purpose.
    (0:08:43)
  • Unknown C
    We're going to need like a million automated researchers to invent all of this stuff.
    (0:09:03)
  • Unknown B
    Yeah, it's going to be great.
    (0:09:08)