[gentle music]
Devyn McIlraith: Hello and welcome to the Neville Public Museum of Brown County. My name is Devyn McIlraith, and I am the exhibitions manager at The Neville. Tonight, we are welcoming Kyle Cranmer, the David R. Anderson Director of the UW-Madison Data Science Institute, who is here through Badger Talks, the speakers’ bureau for the UW-Madison campus. For more information on this great free resource for the state, please visit badgertalks.wisc.edu.
Now, I’m pleased to introduce Kyle Cranmer, who will be presenting “How is AI Revolutionizing Science?” Kyle Cranmer is the David R. Anderson Director of the UW-Madison Data Science Institute, where he leads efforts to advance discoveries that benefit society through cutting-edge data science research. A professor of physics with affiliate appointments in computer sciences and statistics, Cranmer played a pivotal role in the discovery of the Higgs boson by developing collaborative statistical methods that enabled thousands of scientists to analyze massive datasets. His work has since expanded to fields including astrophysics, neuroscience, and evolutionary biology. Committed to interdisciplinary collaboration and the Wisconsin Idea, Cranmer aims to broaden campus-wide engagement in data science while addressing its social impacts, particularly on marginalized communities. Before joining UW-Madison in 2022, he spent 15 years at NYU and led the Moore-Sloan Data Science Environment. An award-winning researcher and alumnus of UW-Madison, Cranmer continues to shape the future of data science in academia and beyond. Please join me in welcoming Kyle Cranmer.
[audience applauds]
Kyle Cranmer: Hello. All right, well, thank you very much for having me. It’s a pleasure to be here tonight to talk about an exciting subject. We have two, you know, big ticket items. There’s science, which we all know and love, and then the new kid on the block is AI. And we’re gonna be talking a little bit about how these two interact, and how you could either ask it as a question, you know, “How is AI revolutionizing science?” or you could just make the assertion that AI is revolutionizing science. And I think hopefully by the end, you might agree that that’s actually the case. So let’s jump in.
This is a plot at the very beginning that just shows the number of papers in different science areas–so there’s materials science, chemistry, and physics–that use or talk about AI, or machine learning. I’ll also use that word. So AI, artificial intelligence, but I’ll just be saying AI for most of the talk. And you see, the left part of the plot is 2000 to 2020. And somewhere around, you know, 2016 or so, you see this very strong uptick in the number of papers that are referring to or using AI, or machine learning. So, I’m gonna try to talk about, you know, why is this the case. What does this mean? How did we get there? That’s kind of the subject of the talk.
But before we get going, I want to kind of ground everything and do a little level-setting on some terms and some concepts and things. So, we’re gonna be talking about AI and science. And I thought, to get going, we could do a little bit of word association and just think about what kind of words come up. And so, I’m gonna start with “science,” ’cause hopefully that one’s a little bit more familiar. And it would be fun to do it as a crowd participation, but with the audio, I don’t know that that’s gonna work. So I want you to just think to yourself, what kind of words come to mind when you think about science? And maybe more specifically, think about the process of doing science, like the scientific method.
Okay, so, I’ve asked a few people this question, and one kind of word you might think of is to observe, right, a classic thing. Another might be hypothesis, or a near-synonym of hypothesis is a theory. Maybe a theory has, you know, stood the test of time a little bit more than your average hypothesis. And then, there’s the idea of making predictions, right? So you’re gonna make a prediction. Hopefully, your theory tells you what might happen. You maybe do an experiment. You collect some data. And then, there’s the idea, you know, you hear oftentimes, that this is scientifically proven, you know? So I put “proven” in quotes, but maybe a better word, which I think is what people kind of mean, is that you’re testing that hypothesis, right? Or a more sophisticated idea might be that you’re looking at some data and you’re trying to infer something, right? So, maybe I’ll just do a show of hands. Am I on base? Do these seem like familiar terms? Okay, so, great.
Now, we’re gonna do the same thing for AI, but AI is a little trickier. It’s the new kid on the block, and, you know, we didn’t all learn about this when we were in grade school and such. So, we’re gonna get to that, but before I do, I’m gonna just make the point that the terms in science kind of organize themselves a little bit. On the left, you have these kind of ideas of hypothesis and theory. On the right, you have more like doing experiments and collecting data. Going from the left to the right is that kind of predictive direction. It’s one part of what it means to do science, to be able to make those predictions. And then the other direction, looking at data and trying to infer something, is kind of the other mode, right? And so, that’s just part of the scientific process or the scientific method: you’re kind of constantly going back and forth between these two modes of doing science. And that’s hopefully, like, a pretty familiar idea.
Now, with AI, what are some of the words that come to mind there? Like, maybe there aren’t any words, or maybe a few words. But I think, you know, going back a few years–maybe it’s less words and more imagery–you always see pictures of robots when you talk about AI. Maybe you think about The Terminator, you know, the Arnold Schwarzenegger movie back in the day. And that has pretty negative connotations about what AI is doing for us. These days, probably the most, you know, familiar word for people about what’s going on with modern AI might be ChatGPT. Have you heard of ChatGPT? This is familiar? Yeah? Good, so I’m reasonably on base. Another idea might be magic. It certainly seems like magic. There’s not a lot of understanding of, like, what’s going on necessarily. A more practical perspective might be efficiency. Like, you know, you hear lots of advertisements. Why are companies interested in AI? ‘Cause somehow, they have a bunch of data. They wanna go through it, and they wanna do something, and they hope that AI will make that more efficient. And there’s also this very similar idea of automation. Maybe you can automate something that previously people were doing. And, you know, maybe that’s good or maybe that’s bad. And that often leads people to think about what’s going on with jobs and job displacement.
If you have kids that are in school, or if you’re, you know, a professor, you also might be thinking about cheating or, you know, what it’s doing to education. It’s certainly changing things very rapidly. There’s also this idea of hallucinations, which is not the hallucinations from the 1960s and ’70s, but it’s more the idea that if you ask ChatGPT a question, sometimes it gives you a good answer, and sometimes it basically makes something up, and you can’t tell the difference between them, so you can’t always trust what you see. And so, if you’re writing a scientific report, it might, for instance, make up references, and that would be bad, right? So that speaks to this bigger issue of trust. How do we trust these things? That is true about AI and society in general, but it’s also definitely gonna be true in science. If we’re gonna use AI to do science, how do we know that we can trust what’s coming out, right? Data was present when we talked about science. It’s definitely also relevant when you’re talking about AI. When you have these AI systems, they’re trained on lots and lots of data. And then, when you start thinking about that, some people might even be worried about issues of, for instance, privacy. So, these are a collection of different topics and thoughts that are kind of swirling around when we talk about AI. Not all of them are necessarily super relevant for science. Some of them are. But this is just to kind of get the show started. These are the ideas that I have in mind, and I think people are probably familiar with them.
Now, I’d like to tell you a little bit of brief history and terminology before we get into, you know, how it’s impacting science. So bear with me as we go through that. But I think probably many people will appreciate it, ’cause it does seem like AI is this very new thing, but the term has actually been around for a while–since the ’50s, in fact. This is a proposal that was made by Marvin Minsky and John McCarthy, who made this pretty bold claim that’s in red there. They said, “To proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can, in principle, be so precisely described that a machine can be made to simulate it.” So, like, that’s a bold claim back then. It’s still a bold claim today. But they were thinking, you know, very far, very deeply about what it is that computers could do, and what does it mean to be intelligent? And one of the other organizers, the bottom one there, Claude Shannon, was at Bell Labs. I don’t know if you know that name, but he was responsible for creating information theory, which kind of, like, powers the whole digital world. And, you know, when you talk about bits of information, he introduced the concept of a bit. So pretty foundational work.
So this goes back to the ’50s, and from the ’50s through kind of the mid-’60s, there was that first wave of AI research. And actually, you had the first kind of chatbots. There was this thing called the Turing test, which was the idea that if you could talk to some sort of artificial intelligence and you couldn’t tell whether it was human or not, then it would, like, pass this Turing test. It’s not a very well-defined test, but, you know, that kind of heuristically was the idea. You also see the rise of some, you know, computers, like, robotic systems being deployed. And there was some progress there, but it’s a very hard and ambitious problem. And around, you know, the late ’60s, they kind of hit a wall, and that first wave of artificial intelligence research kind of stopped, and people talk about the AI winter. So it wasn’t like nuclear winter, but it was just a winter in terms of research ideas. And for a little bit there, you wouldn’t wanna try to get tenure at a university by saying you were an AI researcher. It wasn’t a very popular topic for some time.
Now, zoom ahead to the kind of ’80s and ’90s, and there was another wave of progress that was made, but it had a different feel to it. And that’s when people often use the word “machine learning.” And so, this is a paper actually from 2001 by Leo Breiman, which I’ll come back to in a little bit, but I wanna make some quotes from this paper. He’s actually looking back. It’s a retrospective talking about what was going on in the mid-’80s. So he said there were two new powerful algorithms that had been developed for, you know, “fitting data” is the phrase you use. One of them is called neural networks. I don’t know if you’ve heard of neural networks. We don’t need to know much about it. It was just a class of algorithms that were inspired originally by the brain. And then decision trees, which, again, is some other algorithm. We’re not gonna go into those details. But the researchers that were using this came from, you know, a collection of different areas. So he talks about computer scientists and physicists and engineers, plus some statisticians. And they started working on a very different kind of problem than what statisticians had previously been able to deal with. And so, for instance, he calls them complex prediction problems. And the examples he gives are speech recognition, image recognition, handwriting recognition, and prediction in financial markets, right? So, speech recognition would be like, listen to my voice and try to turn it into, like, the words that I’m saying and figure out how a computer could do that. Like, that was a very, very, very hard problem, and people first started to make some progress in the kind of mid-’80s. And then, he goes on to talk about how, with this group of people, the way they were seeing the world was that the only thing that really mattered was the predictive accuracy. Like, how good is the black box at, say, listening to this speech and transcribing it into words or something like that. And he talks about these new emerging communities. So machine learning is that term that you see there at the bottom. There’s a journal associated with it. And then there was also a really big conference called Neural Information Processing Systems, or at the time, it was called NIPS. Now it’s called NeurIPS. And so, he’s just identifying those communities, and I’ll come back to them a little later.
Now, this is a video from 1989 of my colleague from NYU, Yann LeCun, who’s very well-known for the foundational research that he did in AI and machine learning. He was at Bell Labs at one point, and he just put some handwritten digits underneath a camera. And the computer is, like, looking at the images, and it’s identifying those digits. And today, this is very easy to take for granted, but in 1989, this was a big deal. And the idea was, then, you could start to roll these things out to either read the numbers on a check when you deposit it into a machine, or to read the zip code on, you know, a piece of mail that you’re sending. So, instead of having humans reading that, you start being able to automate that with machines that could do it. And that was, like, that was a really big deal. So that, you know, started to work. And Yann had a vision that took, you know, more than another decade to come to fruition, which goes under the name of “deep learning.” So this is like the next phase of this saga of machine learning and AI, called deep learning. And I’m not gonna go into too many details here, but the idea in this example–and this is one of Yann’s slides–is that you’re giving as input some complicated data. In this case, the data is an image, okay? So you give it a photograph, and your task is to try to say, you know, basically what’s in the image. Is it a horse? No. Is it an airplane? No. Is it a car? Yes. Right, so that’s called a classification problem. And so, one part of that machine-learning, you know, algorithm is over here on the right, this box. And it’s labeled as a trainable classifier. So there’s some machine learning, AI something happening here. But for the input to that, you didn’t give it the image directly. You would have some kind of hand-engineered, you know, what’s called feature extraction. So you would take this image, and then experts would think about, like, what are the important properties of images? And they would, I don’t know, look at, like, how much yellow is in the image, some things that describe some shapes that are floating around. And somehow, it would process the raw image into some other things that are called features. And then, those features would go into this classifier, and the classifier would try to say if it’s a car or a horse or an airplane or something like that.
Now, those kinds of systems work pretty well, but the bottleneck was basically how good are these features? So humans come up with good ideas, but maybe they’re missing a lot of important information. And then, as time went on, they started coming up with ways of learning more of the features in, like, a staged approach. And the idea of deep learning was that you would stack many of these stages together, and you’d kind of take the human out of the picture. The machine-learning, AI thing would figure out what the important features are at the lowest level, like textures on the grass or edges and things like that, some mid-level features that are a little bit more complicated, and then some high-level features. And that’s pretty abstract, so here’s a visual example. So, this is the same idea. These are actual, real examples from the time. So, these little patches are things that you can kind of tell are, like, edges. Basically, it’s looking for something that looks like this in the image. And so, whenever it finds an edge, it’ll say, “I found an edge over here.” And then you assemble a bunch of those together, and you can make some more complicated things which are pretty hard to make sense of. But okay, there are some kind of weird motifs of some sort, visual motifs. And then, you assemble those together, and you get something like this. And at this stage, if you look carefully, some of these things start to look like wheels. This thing looks like maybe the grille of the hood of a car or a honeycomb or something, I don’t know. There are some things that look like birds and a nose of a dog or something. I don’t know, some weird things that come up. But no human came up with these things. The computer, the AI system, said these are interesting features, and if we can identify these features, then we can make really good classifiers that can tell you what’s in this image. And so, that idea was not very popular for a while, but Yann and company, like, stuck to their guns and they really pushed on it. And it was a little bit later, around 2010, 2012, when that really first started to take off.
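To make the idea of stacked, learned feature stages concrete, here is a minimal sketch in Python using the PyTorch library. The layer sizes and the three-way car/horse/airplane task are illustrative stand-ins, not taken from any system mentioned in the talk:

```python
import torch
import torch.nn as nn

# A minimal convolutional network: stacked, learned feature-extraction
# stages replace hand-engineered features, with a small trainable
# classifier on top. All layer sizes here are illustrative.
class TinyConvNet(nn.Module):
    def __init__(self, num_classes=3):  # e.g., car / horse / airplane
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),   # low level: edges
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),  # mid level: motifs
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1),  # high level: parts
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        h = self.features(x)                 # learned features, no hand design
        return self.classifier(h.flatten(1))

# One training step on a fake batch of 32x32 RGB images:
model = TinyConvNet()
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 3, (8,))
loss = nn.CrossEntropyLoss()(model(images), labels)
loss.backward()  # gradients flow through every feature stage at once
```

The point of the sketch is that the feature stages and the classifier are trained together from raw pixels; no human decides in advance what an edge detector or a wheel detector should look like.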
So, the recap on the language here is that AI, artificial intelligence, is the big idea. And then, within that, there was a kind of specific set of algorithms and approaches referred to as machine learning that emerged in the 1980s. And then, within that, there was an even more specialized set of approaches that got dubbed “deep learning” that took off in the 2010s. And so, here’s an example from 2012. This is, like, one of the milestone papers, where things finally started to really work. And it was a combination of having a lot of data, really fast computers, and then some algorithmic tricks. And so, here are some–You give it some images of a dog or a cat, and this machine learning, AI system is saying, “I think this is a dog, I think this is a cat,” with some high confidence. And again, this is easy to take for granted now, but, like, this wasn’t that long ago. This was 2012. But within just a few years, these systems were actually beating humans. So it was very, very fast progress. It went from, like, “Can you tell that there’s a car in the picture?” to better than humans. And that’s already kind of a weird idea. What does it mean to be better than humans at identifying images? You know, it’s kind of a subtle question. But here are some kind of fun examples. You have to quickly tell me, is that a labradoodle or fried chicken? [laughs] Some of these are dogs, and some are fried chicken, right? Okay, and you have to do it quickly. Here’s another one. Chihuahuas or blueberry muffins? [audience laughs] Yeah, and you don’t wanna eat the Chihuahua, yeah. And here’s another one: sloths and chocolate croissants, right? So these problems can get pretty difficult visually, and these AI systems were doing a good job at it. These are mainly for fun, but they’re also kind of half-serious examples.
Now, in this story–I don’t wanna spend the whole time talking about AI–there was another period of progress that happened fairly recently, and it’s related to the question of what you are using the AI system to do. What do you want it to do? Previously, the idea was these so-called prediction problems. So, you know, one was I give you an image and you try to tell me what’s in it, or I try to, you know, predict what’s gonna happen in the future in the financial markets or something. So, those are these predictive models. And they had shown quite a bit of success by the kind of, you know, 2015 era. And then, there started to be progress in what’s called generative models. And I don’t know, maybe on the radio, you’ve heard about, like, generative AI. Has anyone heard of that phrase? Maybe not. I hear it on radio ads sometimes, so it’s getting out there, but it’s, like, the new kid on the block. And I’ll just say a little bit about it, because that’s part of where a lot of the excitement is.
So here’s an example from 2016 where, instead of consuming audio or consuming an image, the goal is to try to produce audio. So, instead of, you know, taking audio and turning it into words, I’m gonna give it words and say, “Can you produce the audio?” Okay, and so, here, I’ll play this example. Hopefully it will come through the speaker system. So, someone gave it text and said, “Produce the waveform,” almost like on a record player, right? “Produce the waveform so that you can hear the sound.”
AI Voice: The avocado is a pear-shaped fruit with leathery skin, smooth, edible flesh, and a large stone.
Kyle Cranmer: So, that sounds, I mean, it sounds like a person talking. You know, you can probably tell it’s not a real person, but this stuff has improved a lot. This was in 2016. And there were speech-synthesis systems at the time, but they really took, you know, like, little segments of someone talking and pieced them together. And this didn’t work that way at all. It really is, like, making the waveform from scratch. And it didn’t really have any expert, you know, intervention in there. So, it really was a new capability, and it got people quite excited. The same year, there was some progress in making images. And, again, no one had ever seen anything like this really. So, these are images of birds and ants, a monastery, volcanoes. And, you know, they look recognizable. They don’t look great. You know, I mean, these birds look pretty funny. The ants look like they got stepped on. The monastery, you know, looks a little bit rough. But the volcanoes look okay. But you could think of this maybe as like a child learning to draw or something, okay? It’s like, you know, “Good job, keep it up,” you know? But this was 2016. Within two years, the images looked like this. These models were given lots of examples of people’s faces, of celebrities, and then you say, basically, “Make me another image of a celebrity.” But these people don’t exist. These people never walked the Earth, you know? So, basically, it’s, like, drawing what it imagines a celebrity might look like, right? And it looks very, very good. You’re hard-pressed to tell that these are not real people. And the way these models work is just very, very different than anything that was done before. So this got people quite excited.
And then, around the same time, there was the first progress–well, the first significant progress, or kind of a milestone–in modeling text, okay? So, not images and not sound, but just, like, what you would type on, you know, a word processor or something. And this really, you know, got the attention of the world pretty quickly. So, you really started to be able to have these things write sentences or paragraphs. And at first, they weren’t great. You know, I remember around, like, 2018, 2019, you would ask it to write a short story, and it was, you know, sort of passable, but it would kind of lose the plot halfway through, and it would turn into something that wasn’t very coherent. But very quickly, that changed, and they started training these models on essentially all the text that you could find on the Internet. And then, in November of 2022, which was, you know, not very long ago, that’s when ChatGPT was launched, and basically everything changed. Those models were just so, so good at working with text that it got basically everybody’s attention, and AI became, like, something that everybody’s talking about now. And so, here’s just one plot that shows different services that you’ve maybe heard of–Facebook, Instagram, TikTok–and ChatGPT. These are plots of how long, in months, it took them to get 100 million users. And so, you know, Facebook was, like, 45 months, and ChatGPT was two. So, I mean, it was like a shot heard around the world when this was released. And that’s, you know, partially why you’re here tonight.
So, if I were to just add the last little layer of terminology, we started with AI as it was described in the 1950s. And then you go to machine learning in the 1980s, deep learning that started to take off in about 2010, and then there’s this kind of confusing thing that within deep learning, there’s this generative AI, which has taken off in the last few years, but many people just call this innermost thing AI. So, that’s partially one of the things I wanted to leave you with, is that AI is kind of a confusing term right now, because it’s both the really big idea that we can do general purpose things, and it’s also this very specific technology that’s really good at making images of people or audio or text. And so, it’s a little bit confusing about how to communicate, but this term generative AI is a little bit more specific, and so, that’s why I put that there.
All right, so that’s my kind of quick tour. So, I’m gonna just tell you a little bit now about my personal journey briefly, and then some other things that were happening before I kind of tee up to what’s exciting about AI for science that’s happening now and as we look to the future. So, as for myself, I was in high school when I first heard about neural networks and machine learning. And I had a good friend, and he gave me this notebook, and I was trying to teach myself how to program a computer. And so, as luck would have it, I programmed my first neural network when I was in high school in 1993. So, I’ve been at it for a while, but that’s not what I did during, like, college and graduate school. Instead, I went to be a physicist, a particle physicist, and I worked on this gargantuan machine that’s like the size of an 11-story building in Geneva, Switzerland, at the Large Hadron Collider, searching for the most fundamental building blocks of nature. And in 2012, we discovered this particle called the Higgs boson, which is a pretty big deal, and it led to the Nobel Prize in 2013. And that was a very, very exciting time. I mean, it was, you know, a high point for my scientific career was to be part of this whole thing. And as luck would have it, or, you know, kind of in July 2012, when we announced the discovery of the Higgs boson to the world, that same month was when this milestone paper in deep learning happened. So, like, totally coincidentally, these two important things happened. And right after the Higgs discovery, I was thinking about, like, “What am I gonna do next with my career?” You know, I was pretty young, and it was a pretty big deal. And I knew that machine learning and deep learning were taking off, and I had a sabbatical. And I thought to myself, “I need to figure out how we can take advantage of deep learning, but not, like, throw away everything that we do as scientists, but somehow leverage it to do better science. And what does that look like, and how do you even formulate that question?” And then, so, I thought about that. As luck would have it, I was honored, you know, to be invited to give a keynote a few years later at this big machine learning AI conference called NeurIPS, which was held in Barcelona that year. There were a lot of people. It was a big, big, big conference. So this was, like, the biggest AI machine-learning conference in the world, and I was talking about, you know, how we could use this to improve science. And, of course, I wasn’t the only one thinking about it. And over the next few years, that’s when you see this huge explosion in papers. And it’s largely because deep learning allowed us to work with much more complicated data and with less kind of expert intervention than what was previously required. And so, that’s what really is, like, driving what’s going on.
Now, one of the things that I kind of realized, you know, over those years is that while I was busy working on the Large Hadron Collider–you know, between when I programmed my first neural network in ’93 and the Higgs discovery–of course, other things were going on in the world. And one of the things that was happening is this term “big data.” So, have you heard, you’re familiar with that phrase, big data? Previously, if you thought about, you know, who is helping scientists analyze data, what profession is that? You would probably think statistics. Like, statisticians, that’s largely what they would do. And in the timeline of statistics, if you zoom in here to 1997, it talks about this being the first time “big data” was used. And so, this was kind of a pivot in how statisticians were focusing on things. Before, it was, like, pretty small amounts of data that they would deal with. And now, there was something changing. And in this timeline, they also, interestingly, end the timeline with the discovery of the Higgs boson. So, very convenient for my storytelling, thank you. And also, around that time was when the term “data science” first emerged, kind of around 2010 or so.
So what was going on during this time? Well, one, this is when Leo Breiman wrote this article that I was telling you about, “The Two Cultures.” And what he’s actually talking about in “The Two Cultures” is, kind of, like, how you model data and, for instance, how you help with science. And he’s saying the two cultures were statisticians and this new machine-learning crowd. And so, he goes on to talk about it, and it’s kind of a criticism of the statistics community and how they had been doing business up until that point. So, in this picture–these are figures from his paper–the left box is, like, nature. And there’s some kind of complicated relationship between some input and output that’s, like, graphically shown in this figure down here, and that’s supposed to be some complicated, you know, thing about nature. So, that could either be a relationship between, say, you know, I don’t know, voltage and current, or some kind of, like, temperature and pressure, or, I don’t know, pick something that you like, or, like, something in chemistry or biology or what have you. But when these Xs and Ys start being something like the text and the sound of the audio, it becomes a very, very complicated relationship, right? And the classical statistical-modeling approaches used to just be very simple mathematical models for, you know, describing something like this. And he was saying, you know, you look in the typical statistics paper, and they start with a phrase like, “Assume that the data are generated by the following model,” and then there’s a simple equation. And then he criticizes, saying, “This enterprise has at its heart the belief that a statistician”–I think you could equivalently say physicist here–“by imagination and by looking at the data, can invent a reasonably good model for a complex mechanism devised by nature.” Okay, so, you know, that had worked pretty well for Newton. It worked pretty well for Einstein. But he’s saying, “You know, when you get into images and speech and financial markets, this just doesn’t really work so well anymore.”
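To make the contrast between the two cultures concrete, here is a minimal sketch in Python using scikit-learn. The data is synthetic and stands in for some “complex mechanism devised by nature”:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.ensemble import RandomForestRegressor

# Synthetic "nature": y depends on x through a messy, nonlinear mechanism.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * x[:, 0]) + 0.3 * x[:, 0] ** 2 + rng.normal(0, 0.2, 500)

# Data-modeling culture: "assume the data are generated by the following
# model" -- here, a simple linear one. Interpretable, but it can badly
# mismatch the true mechanism.
linear = LinearRegression().fit(x, y)

# Algorithmic-modeling culture: a black box judged only by predictive accuracy.
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(x, y)

x_test = rng.uniform(-3, 3, size=(200, 1))
y_test = np.sin(2 * x_test[:, 0]) + 0.3 * x_test[:, 0] ** 2
print("linear R^2:", round(linear.score(x_test, y_test), 2))  # poor fit
print("forest R^2:", round(forest.score(x_test, y_test), 2))  # accurate, opaque
```

Breiman’s complaint, roughly, was that the assumed model on the left is easy to reason about but can be badly wrong, while the black box on the right predicts well but explains nothing.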
But as a physicist–I was trained as a physicist–usually, when you have a complicated system, what you’re trained to do is figure out how to simplify it into something that captures the essence of the system. And we have a joke in physics where we talk about the spherical cow. You know, there’s a problem in a physics textbook about, what’s the probability that, you know, a cow gets struck by lightning? And that cow is too complicated, so the first thing you do is approximate the cow with a sphere. And then, you do some calculations, and, you know, you get roughly the right answer. So that’s where that joke comes from. And so, statisticians and physicists and various kinds of scientists have a long tradition of coming up with good simplifying assumptions. But he’s basically saying, “That’s over. You know, instead, we should just ditch that, forget trying to understand things, and use this new machine-learning, black box thing. And as long as you have a lot of data, I can train it, and it’s really good at making predictions. And so, we should just forget about understanding stuff, and we should just be happy with the black box.”
Now, George Box, who was a famous statistician at UW-Madison, has this famous quote that “All models are wrong, but some are useful.” So, that’s like the spherical cow, right? The spherical cow, maybe it’s not perfect, but it’s useful. You can reason about it. You get pretty good approximations and things like that. But as big data was taking off and machine learning was taking off, there’s the rejoinder quote from Peter Norvig at Google, who says, “All models are wrong, and increasingly, you can succeed without them.” It’s really, like, this big change about what we mean by doing science. The attitude at this time was essentially, you can give up on understanding and just replace things with a black box and a bunch of data. And there was even this paper from 2008 saying that the data deluge is making the scientific method obsolete. This was in Wired magazine, so it was a big deal. And so, I think we’re living with the legacy of that big data era, those 10 years of big data, and this idea that just a bunch of data is gonna replace humans and replace theory and everything. And I will just say strongly that I don’t agree with that at all, okay? But it’s hard to go into a talk like this without thinking that that’s essentially gonna be my message, because it’s the continuation. You still have AI, and you still have science. But I wanna draw a distinction with the way that people were thinking about it in the kind of 2000s.
So, my claim is that data and AI are definitely not enough. And you might ask, you know, why aren’t they enough? And the short version of it is, as I’m sure many of you already know, that correlation is not causation, okay? And when you really get into science, causation is critically important. And what was happening in these models, with the AI models and a bunch of data–at least what was happening at the time, in the, you know, 2010s, in this kind of big data time–was essentially association, which is like correlation. You just look at a bunch of data, and you look at, you know, relationships between things. And there’s, you know, a very well-known researcher named Judea Pearl. This is a figure from his book called The Book of Why, talking all about how, you know, causation is really important. And so, he has this ladder of causality, and the lowest rung of the ladder he calls association. It’s sort of seeing or observing. It’s just that you see that, for instance, cancer and smoking are related to each other. But to be able to say that there’s something causal happening is different. Like, you know, the leaves are blowing and it’s windy. And if you ask a little kid, you know, “Why is it windy?” occasionally, kids will say, “Because the leaves are shaking,” right? Like, maybe not. That’s not the causal story, right? So higher in this ladder of thinking is the idea of being able to do interventions. So, you have higher-level kinds of questions, like, “What would happen if I did the following? You know, what would happen if I went on a diet or I took an aspirin or something?” There, you’re intervening in the system, and so you’re really changing things, and it’s a different kind of question. And then there are even higher-level questions called counterfactuals. It’s a big, $1,000 word, but it’s very similar to what you talk about in, like, the basis of law. You’re asking what would have happened if you had done something different. So, counter to what you actually did–counterfactual. And so, these are an even more difficult kind of question to ask. But they’re, again, at the heart of doing science. And so, if we want to be able to do these kinds of things, you’re not gonna just solve it with data alone.
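Here is a minimal toy sketch in Python of that leaves-and-wind point: a hidden common cause produces a strong correlation that observation alone cannot distinguish from causation, while a simulated intervention reveals there is no causal link. All the numbers are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hidden common cause: wind drives BOTH leaf motion and flag motion.
wind = rng.normal(0, 1, 10_000)
leaves = 2.0 * wind + rng.normal(0, 0.5, 10_000)
flags = 1.5 * wind + rng.normal(0, 0.5, 10_000)

# Rung 1, association: the two are strongly correlated (~0.9), and no
# amount of passive observation can tell you which, if either, is a cause.
print(np.corrcoef(leaves, flags)[0, 1])

# Rung 2, intervention: force the leaf motion by hand, independent of
# the wind. The flags are unaffected, so the association vanishes --
# the leaves never caused anything.
leaves_forced = rng.normal(0, 1, 10_000)
flags_after = 1.5 * wind + rng.normal(0, 0.5, 10_000)
print(np.corrcoef(leaves_forced, flags_after)[0, 1])  # ~0
```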
So, I’m gonna now switch into a more modern view of what AI for science is. This is my, you know, my view, but I’m not the only one. There are other people that are thinking along these lines. Of course, there’s, you know, a diversity of opinions out there. But one thing I’ll just say is that if you go back before all of this AI stuff, you know, the way we approached problems was with human intelligence, right? We had interpretable mechanisms describing what was happening in biology or physics or chemistry or something. And the way that we approached science was guided by expert knowledge and theoretical insights, and we had a lot of handcrafted solutions where we knew what was going on, but it was very labor intensive and slow, right? And then the pendulum swung way over to the other side with big data paired with what was going on with machine learning, where the idea was, we’re gonna be data-driven–and I’m all for data. I mean, I’m the director of a data science institute. I’m not anti-data. But it was more the second bullet point, that you’re going to eschew expert knowledge. It’s like, “We don’t need expert knowledge. We don’t need theory. We’re just gonna solve it all with a bunch of data,” which I don’t believe in. “And we’re gonna try to replace this with kind of black box, end-to-end learning.” And what’s, I think, fortunately happening is that now the pendulum is swinging back, and we have some mix of these two. And so, my message, kind of generally, is that it’s gonna be the combination of human and artificial intelligence, where there’s really a lot of synergy, and that’s gonna be key to tackling all sorts of grand challenges. And so, I’m gonna end with some examples and kind of give you a feel for what’s going on now.
So, one of the points that I wanna make is that, you know, when you think about theory, you probably think about, like, Einstein or Newton and equations, maybe, or something like that. And that’s certainly true, but modern theories are much, much more complicated than just a simple equation. If you wanna describe a complicated phenomenon like air flowing around an airplane wing, or how an epidemic spreads, or how the neurons in a brain, you know, signal each other to do something, or particle physics or, you know, the evolution of the universe or how, you know, proteins fold–these get to be quite complicated. But we still have mechanisms, you know, scientific theories that describe them. But more and more, they get put on computers. You code them up, you describe what you wanna do, and you can simulate them. So, really, simulators are kind of the modern version of theories. You know, conceptually the same ideas as a theory, just with a lot more complexity, a lot more moving parts, and they allow experts with different kinds of expertise to come together to describe the same phenomena. And so, I’ll just make the statement at the bottom that the forefront of scientific knowledge is often encapsulated in these simulators, but unfortunately, the simulators are really poorly suited for a lot of downstream tasks that you wanna do. So, for instance, if you wanna do statistical inference, it’s difficult to use a simulator directly. The nice thing about writing those simple equations was that you could manipulate them mathematically. Simulators are a lot harder to work with. Another point is, if you wanna design an experiment, in principle, you should be able to use the simulators to help you design the experiment. In practice, that’s hard. And there are other kinds of things, like decision-making. For instance, you wanna model an epidemic, that’s great. But you wanna use that model to drive your decision-making about, should you close the airport or something like that–that’s harder. And there’s a quote attributed to Paul Dirac, who’s a famous physicist responsible for, you know, a lot of quantum mechanics and things like that. He says, “The underlying physical laws necessary for the mathematical theory of a large part of physics and the whole of chemistry are thus completely known, and the difficulty is only that the exact application of these laws leads to equations much too complicated to be soluble.” So his point there is basically, we understand the rules of the game for chemistry, but that doesn’t mean, like, we’ve solved all the things you can do with chemistry, ’cause it’s really, really hard still.
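To see why simulators are awkward for statistical inference, here is a minimal sketch in Python. The toy epidemic model below is an invented stand-in, not any real simulator; the point is that it is easy to run forward, but there is no formula for the probability of the data, so inference falls back on brute-force comparison of simulations to observations (a bare-bones form of simulation-based inference):

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_epidemic(infection_rate, days=30):
    # Toy stochastic simulator: trivial to run forward, but there is no
    # closed-form likelihood you could manipulate mathematically.
    infected, counts = 1.0, []
    for _ in range(days):
        new = rng.poisson(infection_rate * infected)
        recovered = rng.binomial(int(infected), 0.1)
        infected = max(infected + new - recovered, 0.0)
        counts.append(infected)
    return np.array(counts)

observed = simulate_epidemic(0.3)  # pretend this is real experimental data

# Crude inference: guess parameter values, run the simulator, and keep the
# guesses whose simulated outbreaks look most like the observed one.
thetas = rng.uniform(0.0, 0.8, 2000)
distances = [np.mean((simulate_epidemic(t) - observed) ** 2) for t in thetas]
best = thetas[np.argsort(distances)[:100]]
print("inferred infection rate ~", round(best.mean(), 2), "(true value: 0.3)")
```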
And so, this is where we’re now getting to: there’s this idea that, if you look at the history of science, there have been, you know, important paradigms. And the first one was this very empirical one, where you’re just observing the world. You know, you’re keeping track of, like, different kinds of, you know, species and things like that. Then, there was a theoretical period where, you know, people like Newton, et cetera, up to Einstein are, like, coming up with great theories describing the world with mathematics. And then, that kind of got supercharged with this computational era, which is basically still theoretical. You just supercharge it with things that you wouldn’t be able to do with a pencil and paper. And then, there was this kind of, I’d say, intermediate stage, which was not quite right, but was on the way. There was even a book called The Fourth Paradigm, about what’s called data-intensive scientific discovery. So that’s that big data era. And the problem with that is it kind of devalued the role of theory and, you know, the thousands of years of knowledge that humans have built up, you know, through experimentation and interaction with the world. But this fifth paradigm kind of recenters the importance of theory. And so, you’re saying you have data. Data is important. You know, that’s doing experiments; data is a foundational part of empirical science. The simulation, you can think of as essentially just theory, but, like, turbocharged theory that takes advantage of computational approaches. But that on its own was difficult. When you pair it with AI and machine learning, now we’re seeing that those three things together are allowing us to do all sorts of stuff that we’ve wanted to be able to do for decades, but never knew how to do. And that’s gonna lead to just, like, super duper rapid advancements in science. And so, I’m personally very, very excited. Microsoft Research has this article where they’re talking about this AI for science, and they’re calling it “the fifth paradigm of scientific discovery.” They’re investing a lot in that. And there are lots of companies around the world that believe in one version or another of this insight, and they’re investing in it heavily.
So, one example I’m gonna show you: this video, on the left, is a traditional numerical simulation of, you know, droplets, like fluids moving around, and dropping some balls and things like that. And what happened is, on the right, we’re gonna use an AI model that at the beginning doesn’t know any physics, but it’s gonna watch a whole bunch of examples from this simulator, and then it’s gonna learn whether it can basically mimic it. Can I approximate what the simulator is doing? So it didn’t know any physics before, but it, like, learned the physics by watching a bunch of examples. And then after you’ve trained it, you get something like this. So these are pretty complicated phenomena. On the left is the traditional numerical, like, physics simulator, and on the right is this AI model. And you see, it’s, like, doing a very good job, right? So these AI models can essentially learn the physics from the simulator. You’re, like, transferring it onto the AI model. But the advantage is the AI models are way faster. And they have a few other advantages, which are too technical for me to get into. But that same idea is being applied, for instance, to numerical weather forecasting. You know, you can run supercomputers to describe, you know, the atmosphere and the oceans and things like this, but it’s really, really expensive to do this on big, big, you know, supercomputers. And now, AI models are learning basically from those simulations, and they’re getting very accurate results. And they’re, like, tens of thousands of times faster. So you can do all sorts of things that you couldn’t do before. And they can also start to learn from real-world data. So they actually have some avenue where they’re even improving on the numerical weather forecasts, which is exciting.
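Here is a minimal sketch in Python of that surrogate idea, using scikit-learn. A cheap one-dimensional function stands in for the expensive simulator; in reality each simulator call might be hours of supercomputer time:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Stand-in for an expensive simulator (a toy damped oscillation).
def slow_simulator(x):
    return np.exp(-0.5 * x) * np.cos(3 * x)

# 1. Run the expensive simulator to build a training set of examples.
x_train = np.linspace(0, 5, 400).reshape(-1, 1)
y_train = slow_simulator(x_train).ravel()

# 2. Train a neural-network surrogate to mimic the simulator.
surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=5000,
                         random_state=0).fit(x_train, y_train)

# 3. Query the surrogate instead of the simulator: thousands of cheap,
#    fast evaluations, e.g., inside a control loop or a design search.
x_query = np.random.default_rng(3).uniform(0, 5, (5, 1))
print("simulator:", np.round(slow_simulator(x_query).ravel(), 3))
print("surrogate:", np.round(surrogate.predict(x_query), 3))
```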
So before, we talked about how there’s reality, nature, the real complicated thing–the cow, ’cause we’re in Wisconsin–and then the physics approach, or the kind of traditional approach that a lot of statisticians would also use, was to simplify it to some very mathematically tractable model. What we’re seeing with AI and machine learning now is something like the thing on the right. It’s a more high-fidelity version of the cow. It’s still an approximation, but it captures a lot of the features–it, you know, looks more like a cow–but it’s still fast, and it still allows you to work with it in a way where you can do all sorts of things that you wouldn’t have been able to do before. So, for example, if you’re interested in fusion reactors and clean energy, you have this very high-temperature plasma that you’re trying to control with magnets and electric currents and stuff like that. And that’s the bottleneck to being able to get fusion to work, basically: how hot can you make that plasma, and how long can you confine it so that you can extract energy from it? You’d use traditional engineering approaches to control that system, and you get so far. And now, they’re using AI models that have been trained in simulation of what the system would look like. And they get much more accurate control, and then they deploy it in the real world. And they can actually control these so-called tokamaks, which look like a big doughnut. It’s, like, one of the leading technologies for fusion. And so, they’re making great progress there. So that’s exciting.
So that’s one example. And I’ll make this point that was in the abstract of the talk: AI is, like, quickly raising the ambitions of scientists kind of across the board. But the way it’s influencing them–the capabilities that AI enables–varies a lot across different fields or subfields, or across what kind of thing they’re trying to do. So the example I gave is one example, but it looks very different in other fields. So, I’m gonna give some names to them. The one that I just described with the fusion example is that you have a simulation that you believe in–you know, like numerical weather or something like that–but it’s too slow to work with. And then you train an AI model to be a very fast approximation of it. Some people call it a surrogate or something like that. And then you can use that thing to, you know, make decisions or to control your engineering system or something like that. So I gave it this little label down here: using AI for decisions, control, and design. So, that’s one pattern that we see about how AI is, you know, influencing science and engineering.
But there are other examples. So, let’s talk about molecules and materials briefly. Here’s an example. These are the proteins related to the COVID virus, you know, sitting on, like, a cell wall–I think this is a lipid layer of a cell. And they’re all jiggling around. And it’s a really fun video to watch. And you can do these kinds of simulations of what’s gonna happen with all these atoms moving around, ’cause it’s just chemistry, but it requires, like, a supercomputer running for, you know, a year or something to be able to make that little video. And if it takes a long time for this protein to, like, do the important biologically relevant thing, it might not be feasible. And so, what’s happening now is that we’re figuring out how to use generative AI techniques so that instead of making images of people’s faces or audio, it’s basically making the videos of these kinds of things that are grounded in, you know, the laws of chemistry. And you kind of pair them together in a way so that you can maintain a kind of correctness, so that you know the calculation is doing what you want. And when you put those two together, you make things go faster by, you know, orders of magnitude–10,000 times, 100,000 times faster. So, there’s all sorts of things that you can do now that were previously not tractable. And so, some people are calling this new capability a computational microscope, ’cause it’s sort of like you’re looking at this microscopic thing that you couldn’t actually look at with a real microscope, but, you know, you can simulate it in a computer. And AI is accelerating those simulations dramatically.
Okay, another place where this generative AI is happening is in things like drug discovery and materials design. So in this example, instead of giving it text describing an image that you want, like, you know, “Make me an image of a teddy bear on a skateboard in Times Square,” or something, the prompt isn’t text. The prompt is this molecule, which is insulin. And you’re saying, try to generate me some new thing, some new protein or something, that will bind to insulin. And the AI model, by looking at a bunch of other examples, is, like, now dreaming up a new molecule that it thinks might actually fold up in the way that’s shown and bind to the insulin. And so, this was work that was done by David Baker’s lab, covered in The New York Times. And he won the Nobel Prize for that last year. So that stuff is, like, working, which is pretty impressive.
Now, the thing about it that I wanna really stress–there are a few things that are really important, especially since they just won the Nobel Prize–is, one, like, how do you trust this, right? We talked about trust, right? In this situation, the AI model could be wrong. The thing is, you’re going to follow up afterwards. You know, this is like a proposal: I think that this is a new protein that will actually fold up as shown and bind to insulin. But, you know, it could be wrong, right? So you need to actually synthesize that protein and stick it in a lab and do follow-up experiments to confirm that it actually has the properties that you want, okay? And so, you would never, like, roll it out in people or something like that until you’ve done all this other, you know, kind of testing. But the first question is, how do you find this thing in the first place, right? The design space of possible molecules is huge. And so, just having a suggestion–here’s a candidate, maybe it works–if you can do that more efficiently and it accelerates the process of drug discovery, then you still win. Okay, so that’s kind of the idea. It’s okay if the predictions are wrong, as long as it accelerates this discovery process. And then you confirm that it actually has those properties in the kind of old-fashioned way. The same sort of story is happening in materials science, where people are trying to come up with new materials that might be good for batteries or good for carbon capture, or, you know, that have any of a number of desired material properties. That’s a very hard problem. And generative AI is accelerating that discovery process as well. Okay, so I call this pattern exploration with confirmation.
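Here is a purely schematic sketch in Python of that exploration-with-confirmation pattern. Every piece is a toy stand-in, not a real API: a generator that proposes candidates, a cheap but imperfect screening score, and a slow, trusted “lab” measurement reserved for the shortlist:

```python
import random

random.seed(4)

def propose_candidate():            # stand-in for a generative model
    return [random.uniform(-1, 1) for _ in range(8)]

def predicted_score(candidate):     # cheap, noisy in-silico screen
    return sum(candidate) + random.gauss(0, 0.5)

def lab_measurement(candidate):     # expensive, trusted confirmation
    return sum(candidate)

# 1. Exploration: propose many candidates. Most are duds, and that's
#    fine -- a wrong proposal costs almost nothing at this stage.
candidates = [propose_candidate() for _ in range(10_000)]

# 2. Screen cheaply; only the top few earn an expensive real test.
shortlist = sorted(candidates, key=predicted_score, reverse=True)[:20]

# 3. Confirmation: only candidates that pass the trusted measurement
#    count as discoveries.
confirmed = [c for c in shortlist if lab_measurement(c) > 4.0]
print(len(confirmed), "of 20 shortlisted candidates confirmed")
```

The asymmetry is the whole trick: proposals are nearly free and allowed to be wrong, because nothing counts until the old-fashioned confirmation step.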
The other thing that I really want to reinforce is that the AI is not doing this on its own. It’s not in isolation. The only way that the AI works, in addition to the confirmation, is that the AI was trained on real data, and that data either came from a simulation that, like, traditional chemists made, or it came from experiments that traditional chemists did. If you took them out of the loop, none of this stuff would work. Okay, so you can’t just replace scientists with AI at this stage. But these are the Nobel Prize winners in chemistry. David Baker is trained as a chemist, and Demis Hassabis and John Jumper are actually coming more from the AI, machine learning side. They’re from Google DeepMind, and they kind of solved the protein folding problem. So, they came at it from the kind of computer science side. But nevertheless, there’s been tremendous progress recently, and it’s a very, you know, exciting moment. Okay, so I already made the point that AI didn’t do it alone. Some of the things that are out there: there’s, for instance, something called the Protein Data Bank, which scientists all around the world have contributed to for decades. The AI is trained on the data in that database. You take that database away, none of this would have happened. And similarly, if you’re looking for, like, new materials for batteries or carbon capture, there’s something called the Materials Project, which materials scientists, too, have been contributing to for decades. So, I just wanna reinforce: AI is getting all the attention, but it’s the two together that are really making it work.
Now, in contrast, one of the other patterns–and then we’ll start to wrap it up here–is the way that AI is being used in areas like experimental physics or cosmology or astrophysics. In these situations, you have some kind of complicated experiment. This is like the particle collider, or you can imagine it’s a telescope or what have you; it’s collecting some data. And the machine learning and AI algorithms are part of the data analysis pipeline. So, they’re collecting the data, they’re analyzing the data. It’s sort of like the souped-up version of, like, is this a car or a horse or something? You’re looking at the data, and you’re trying to identify what’s going on in the data to help automate and improve the data analysis pipeline that then leads to claims of scientific discovery. There’s no additional, like, follow-up. I mean, this is, like, part of the actual process of claiming the discovery. So in this situation, mistakes really matter. And so, you’ve got to be much more careful. And you really need to be able to calibrate these systems or understand how to quantify the uncertainty of using them. And I won’t go into any details, but there’s a lot of progress that’s been made there. And so, if you wanna trust science, if you wanna trust AI more, having a bunch of scientists that are skeptical about the use of it and figuring out, “How do you convince me that it’s not doing something wrong?” is, like, a good place to start. So, scientists are really contributing to the trustworthy AI space. So, that’s this kind of box here I call data analysis and inference.
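Here is a minimal sketch in Python of one of the most basic trust checks: comparing a classifier’s claimed confidence against how often it is actually right, a simple reliability (calibration) check. The model and data are synthetic stand-ins for a real analysis pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(5)

# Synthetic stand-in for an analysis task: two noisy features weakly
# separate "signal" events from "background" events.
X = rng.normal(0, 1, (20_000, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 20_000) > 0).astype(int)
X_train, y_train = X[:10_000], y[:10_000]
X_test, y_test = X[10_000:], y[10_000:]

clf = LogisticRegression().fit(X_train, y_train)
probs = clf.predict_proba(X_test)[:, 1]

# Reliability check: within each bin of claimed probability, does the
# observed signal fraction match? Large gaps mean the stated confidences
# can't be trusted inside a scientific measurement.
bins = np.linspace(0, 1, 11)
for lo, hi in zip(bins[:-1], bins[1:]):
    mask = (probs >= lo) & (probs < hi)
    if mask.sum() > 0:
        print(f"claimed {lo:.1f}-{hi:.1f}: observed "
              f"{y_test[mask].mean():.2f} ({mask.sum()} events)")
```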
And I’ll just make the point that there, I was talking about, you know, things like astronomy and particle physics. But here, I’m going from astronomy to agriculture, because the same issues apply when you wanna use AI and machine learning in the context of agriculture. So these are two, you know, journal covers that were about this issue of using machine learning and AI in the context of agriculture. And there’s too much to read here, but I’ll just make the point that they were using some of these machine-learning techniques to try to predict nitrogen emissions from soil. And they got these systems to work. They worked pretty well, but they trained them on one kind of agricultural system, and then they tried to apply them to a different one–it says it was applied to “a corn-soybean-wheat rotation,” which is a different type of farm than it was originally trained on. And then it didn’t really work very well. It went from 89% to 38% accuracy, right? So this issue of trust is absolutely important for everyone, and definitely important for farmers if you’re gonna try to use these things. And so, they’re pointing to the same sets of issues that you see in essentially all these scientific contexts that are using it for data analysis. But the good news is we are making progress on it. But it’s still a hard problem.
Now, I’m gonna kind of wrap it up with this quote from Freeman Dyson. He’s another famous physicist. So he says, “New directions in science are launched by new tools much more often than by new concepts. And the effect of a concept-driven revolution is to explain old things in new ways, and the effect of a tool-driven revolution is to discover new things that have yet to be explained.” So, I think it’s a great quote, but I think a lot of what–I use this quote because AI is essentially a new tool that’s helping us do all sorts of things and hopefully discover new things that have yet to be explained. And so, it’s forward-looking and exciting, but there’s a–With time, after giving this talk and thinking about it more, I’m realizing that we’re really going through both kinds of revolutions. There are two different storylines. One is that we have this tool-driven revolution, but the other part is that AI is really changing how we think about, you know, doing science. And so, it is also a concept-driven revolution. And I’ll end with this. It’s an article that I like a lot called “On scientific understanding with AI.” So, I’ll just read directly from it. It says, “Imagine an oracle that correctly predicts the outcomes of every particle physics experiment, the product of every chemical reaction, or the function of every protein. Such an oracle would revolutionize science and technology as we know them. However, as scientists, we would not be satisfied with the oracle itself. We want more. We want to comprehend how the oracle conceived of these predictions, and this feat, denoted as scientific understanding, has frequently been recognized as the essential aim of science.” So, I think it’s really important with everything that’s happening now. It’s also true about teaching and the role of AI and everything. We really need to go back to first principles and think about why are we doing this stuff in the first place. And it’s kind of going back to, you don’t just wanna throw away everything and replace it with a black box just because it’s good at making predictions. You wanna understand, right? And then it–But they go on in a sort of forward-looking way of saying, with the ever-growing power of computers and AI, there’s the ultimate question of how can advanced artificial systems contribute to scientific understanding or achieve it autonomously? Like, could you actually have, essentially, robot scientists and what does that mean, and how does that change what we think about when we talk about doing science?
So, with that in mind, you know, think about science in the broadest picture of the scientific method: we are asking questions, generating theories, doing experiments, making hypotheses, testing them, thinking of interesting questions, asking why. It’s a much richer area. And the last thing I’m gonna end on is, I just want to also reinforce that AI and science is not just about AI coming to the rescue and having applications to science. When you think about research as a whole, sometimes people talk about research in this plane where, you know, on one axis you have “Is it practically useful?” and on another axis, you have “the quest for fundamental understanding.” And so, people often think of applied research in this corner over here. And applied research is certainly important. But, you know, from the point of view of AI, if you’ve figured out AI and you’re just using it and applying it to science, it’s not very interesting from the point of view of making progress in AI. Pure research, on the other hand, is sort of like, “I don’t care if it’s practically useful. I’m just gonna follow my curiosity and do this pure basic research.” And that’s very important. But there’s this other style of research called use-inspired research, where you basically ground your questions–I mean, your research–in real questions. So the statement is basically that when you’re inspired by the context and peculiarities of real applied problems, that often leads to foundational advances. And I think that’s very, very true. Much of the progress in AI was inspired by real scientific problems. So it’s not just a one-way street. It’s definitely an interaction. And that was definitely true in the case of the Nobel Prize in Physics. Those advances that were originally motivated by, for instance, biology, et cetera, led to advances in AI. And you see that all across the board. This is a great figure that just talks about federally-funded basic research and the impact that it’s had on our real-world lives. So pure research and applied research and use-inspired research, they’re all very important. And I just kind of want to end with a call to say it’s just so important that we fund federal research. I mean, it enriches our lives. And none of the stuff that I was talking about today would be happening if we didn’t have a robust federal research program. And it’s definitely worth its money and, you know, pays back dividends. So with that, I will end. And thank you very much.
[audience applauds]