SPEAKER_01: Welcome to Unconfuse Me. I'm Bill Gates. Let's talk about AI. You and I have both been lucky enough over the last six months or so to have engagement with the mix of both Microsoft and OpenAI and have early access. And I remember some of the best examples of how to get the AI to do fun things were when you came up and saw me and we were brainstorming about this. You were the one who said, hey, if you tell it to write a speech like various politicians, including Trump and others, it's stunning how it captures the voice of different people. So, you know, tell us how you first were using the AI. And then, you know, this is super timely because you've just recently come out with Khanmigo. I'm saying that right?
SPEAKER_00: That's right. Very good. Yeah. I mean, the OpenAI folks reached out and they said, hey, you know, we're a couple of weeks away from having, you know, the first training of our model. And they wanted to reach out to Khan Academy for two reasons. One was they said, we want to make this really good at AP Biology. And I only found out later, I don't know if this is true, they told me you gave them the challenge.
SPEAKER_01: That's right. In June, they kept showing me this thing. And I was like, yeah, it's kind of an idiot savant. I don't think it's practical. Why don't you see if it can do the AP Biology exam? And I'm not going to pay any attention until you can, like, get a five. And I thought, okay, that'll give me three years to work on HIV and malaria and these guys. And then it was so bizarre because Sam Altman and Greg Brockman in late August said, hey, we want to come show you this thing. And so it was early September when there were like 30 people at my house. And, you know, I've said it's the most stunning demo I've ever seen in my life. I mean, right up there with seeing the Xerox PARC graphical user interface that, you know, set the agenda for Microsoft for about 15 years.
You know, this demo was so surprising to me. The emergent depth, as they scaled up the training set, its fluency and, you have to say, understanding. You know, computers could not read in the sense humans do, and they couldn't write in the sense humans do. And now, you know, with lots of footnotes about hallucination and things like that, I'm still personally in a state of shock at, wow, it is so good. And okay, therefore, you know, let's see where we can put it to good use.
SPEAKER_00: Yeah, 100%. And so thank you for that challenge to them. I think they came to us because we have a large library of AP Biology questions, et cetera. Like, hey, can we use that to either evaluate or train? And at the time, I was like, well, what's in it for Khan Academy? And they're like, well, you know, maybe you get access to the model. And I too was skeptical. I'd seen GPT-3 at that point and it was cool, but I didn't see how we were going to apply it. Two weeks later, they showed us the AP Bio question. He said, so Sal, what's the answer to this? And I was like, okay, I think it's C. And it said, oh, the answer's C. And I was like, oh, that's interesting. I started getting a few goosebumps. And then I said, well, ask it why that's the answer. It explained it. Oh, yeah, it's so good at that. Oh, yeah. And I mean, I think what folks need to realize, because, you know, everyone had that moment with ChatGPT, is that this was like that, but more, because GPT-4 is even better.
SPEAKER_01: Way better.
SPEAKER_00: And I said, can you say why the other answers aren't correct? Did it. Then, you know, I was almost shaking. Then I said, can you write 10 more questions like this? Bam, bam, bam. And the first 10 I saw were all pretty good. Like, I'm like, yeah, yeah, yeah, that's all legit. And then, you know, the implications for Khan Academy started to go through my mind.
And then we just started to get into some of its hallucinations and some of its math errors in those early days. But that weekend they gave me and our chief learning officer access on Slack. And I couldn't sleep. I was just having these down-the-rabbit-hole conversations with it. And then we had a hackathon for our team. We got about 40 people on our team under NDA. And we said, like, just come up with stuff. We were having the debates that everyone was having around, well, you know, the information is not 100% airtight. The math isn't great right now. The costs are not trivial. You know, it can introduce bias. What's the safety? What about the use of people's information, et cetera? But then we were starting to get it to work well as something that just helps you answer questions while you're watching a video, to work well as a tutor. And then, honestly, every 10 minutes we thought about it, it was like, wait, it could also do this. It could also do that. And so we said, well, what if we could make it so you could talk to any historical character? What if you can make it so it gets into a debate with you? What if you can make it so it doesn't write your paper, but it writes it with you? What if it can do lesson plans for teachers? It could be the end of static curricula. I mean, the imagination just kept going. And by December, January, we had our team in full rapid prototyping mode. And yeah, just, you know, recently we launched Khanmigo and, you know, so far we're starting to titrate it out to the world, giving people access to it. But the feedback is very promising. What we're hearing overwhelmingly from social media and the press is that they're really happy that we've engaged in this and that we're taking a safe approach where parents and teachers can monitor it. We have a moderation filter.
OpenAI has also gone to great pains to make sure that, you know, things stay appropriate. You and I have talked extensively about the math issue, and we've done some things that I think make it quite robust on top of the things that the model does. And the costs are coming down dramatically.
SPEAKER_01: That's really impressive.
SPEAKER_00: Even ChatGPT isn't bad, and then GPT-4 is dramatically better. I mean, it makes mistakes, to be clear. They both make mistakes. But one of the things we realized is, when you do math, especially if I'm tutoring you, let's say you do some work, I don't just immediately say, correct, incorrect. I say, well, let me see this. Okay, let me see what he did. Okay, okay, okay. Oh, okay, okay. Yeah, yeah, good job. And one of the hacks, I don't think it's actually a hack, it's actually, I think, a principle we're applying, is this: we weren't getting good results when we just asked it to decide, when it acts as a tutor, whether a student is right or wrong. But as soon as we said, you know what? Construct your thought, and these thoughts are private to you. Write that down first and then evaluate the student's response against your thoughts and then say something publicly to the student, then the math improved dramatically. So, you know, it is funny. One day you think, oh, this is so not like a human being. And then the next, you're like, wow, that's kind of how we operate. We kind of need that thought before we can talk.
SPEAKER_01: The way I think of it is it's like a human that's not very good with context. The math context, we know, is a special context of "check your answer." And it's also, sometimes, even in conversations: if you get it into a mode where it's telling jokes, with humans we have kind of a look, or, you know, the new question is quite different. It thinks it's supposed to just keep telling jokes, and you almost have to do a reset to get it out of this, hey, everything is a joke type mode.
So it's like kind of a very naive person in terms of all these different contexts we're in. And my favorite one is where it tries to do Sudoku, which it can't do. And you point out, hey, that's not a good solution. And it says, oh, I must have mistyped. And it's like, where is the typewriter? You have a typewriter? I like it. It's defensive. It gets defensive. It sees how humans deal with being accused of, you know, "my dog ate my homework" or something.
Subscribe to Unconfuse Me wherever you listen to podcasts.