OpenAI DevDay 2025: Opening Keynote with Sam Altman
Sam Altman opens DevDay 2025 with a keynote focused on new tools for developers building with AI. Attendees can expect announcements, live demonstrations, and insights into how developers are shaping the future through artificial intelligence. Join us to explore groundbreaking ideas that will challenge conventional thinking in the tech industry.

Thank you. Good morning, and welcome to Dev Day. Thanks for being here in San Francisco, the city where we started and where we are committed to building the future of AI. It's been almost two years since our first Dev Day. We, but most importantly, all of you, have come a long way since then. Back in 2023, we had 2 million weekly developers and 100 million weekly ChatGPT users. We were processing about 300 million tokens per minute on our API. And that felt like a lot to us, at least at the time. Today, 4 million developers have built with OpenAI.
More than 800 million people use ChatGPT every week, and we process over 6 billion tokens per minute on the API, thanks to all of you. AI has gone from something people play with to something people build with every day. Before we get started with all of today's announcements, we want to do something fun. On the screen behind me are the names of the developers in the room who have built apps on our platforms that have crossed some big milestones. 10 billion tokens processed, 100 billion, even 1 trillion. Let's give them a round of applause. On behalf of all of us at OpenAI, thank you for doing such incredible work.
You are the ones pushing the future forward, and seeing what you've already done makes us so excited about what comes next. While it's exciting to celebrate how far you all have come, we are still so early on this journey. So today we're going to focus on what matters most to you all, which is making it easier to build with AI. We've been listening to developers, hearing where you get stuck, what you want us to build next so that you can build more things. We've got four things for you today. We're going to show you how we're making it possible to build apps inside of ChatGPT and how we can help you get a lot of distribution.
We're going to show you how building agents is going to be much faster and better. You'll see how we're making it easier to write software, taking on the repetitive parts of coding so you can focus on systems and creativity. And underneath all of this, we'll give you updates to models and APIs to support everything you want to build. We think this is the best time in history to be a builder. It has never been faster to go from idea to product. You can really feel the acceleration at this point. So to get started, let's take a look at apps inside of ChatGPT.
We want ChatGPT to be a great way for people to make progress, to be more productive, more inventive, to learn faster, to do whatever they're trying to do in their lives better. We have been continuously amazed by the creative ways that people use it. Since our first Dev Day, we've been working to try to figure out how to open up ChatGPT to developers. And we've tried things like GPTs, we've adopted standards like MCP, and we've made it possible for developers to connect ChatGPT to more and more applications. Some of this stuff has worked, some of it hasn't, but we've learned a lot along the way.
And today we're going to open up ChatGPT for developers to build real apps inside of ChatGPT. This will enable a new generation of apps that are interactive, adaptive, and personalized, that you can chat with. So to build them, today we're launching the Apps SDK, which will be available in preview starting today. With the Apps SDK, you get the full stack. You can connect your data, trigger actions, render a fully interactive UI, and more. The Apps SDK is built on MCP. You get full control over your back-end logic and front-end UI. We've published the standards so that anyone can integrate the Apps SDK.
When you build with the Apps SDK, your apps can reach hundreds of millions of ChatGPT users. We hope this will be a big deal for helping developers rapidly scale products. Thank you. You're welcome. If a user is already subscribed to your existing product, they'll be able to log in right from the conversation. And in the future, we're going to support many ways to monetize, including the new Agentic Commerce Protocol that offers instant checkout right inside of ChatGPT. So let's take a look at a few examples. When someone's using ChatGPT, you'll be able to find an app by asking for it by name.
For example, you could sketch out a product flow for ChatGPT and then say, Figma, turn the sketch into a workable diagram. The Figma app will take over, respond, and complete the action. You can also then launch FigJam from ChatGPT if you want to iterate further. We're also making apps discoverable right in the conversation. So when a user asks for something, we can surface a relevant app as a recommendation. Maybe a user says they need a playlist for their party this weekend. ChatGPT could then recommend building it in Spotify. It's an easy way to find the right app or have the right app presented to you at the right time.
And there will be a whole bunch of new ways for developers to get discovered. So now, rather than just talk about this, I'd like to invite Alexi to the stage, and we will show you a live demo.
Hi, I'm Alexi, a software engineer on ChatGPT working on Apps SDK. I'm super excited to showcase some of the first apps users will be able to interact with today. The magic of these apps is combining their rich interactive visuals with the power of ChatGPT. Let's start with Coursera. Let's say I don't spend enough time thinking about machine learning at work and I want to learn more. I can ask the Coursera app in ChatGPT to help me learn more about this. I can say, Coursera, can you teach me something about machine learning? Since this is my first time using Coursera in ChatGPT, I'll need to consent to connect.
The next time I use it, I'll be able to dive in immediately. You'll notice I asked ChatGPT for the Coursera app directly, but ChatGPT can also suggest apps if relevant to the conversation. Apps in ChatGPT, by default, are displayed inline and can support anything that you can render on the web, like this video shown here. Apps SDK also supports picture-in-picture or expanding to a full-screen layout. So now that I have my course up, let's play the video. Playing the video immediately pins it to the top of my screen, which is really helpful for something like Coursera because you can access your conversation while watching the video.
Let's skip ahead a little bit and say I want to go a little bit deeper on something mentioned in the video. I can ask ChatGPT, can you explain more about what they're saying right now? Apps SDK provides an API to expose context back to ChatGPT from your app, ensuring that the model always knows exactly what your user is interacting with. We're calling it Talking to Apps, and it's really part of the magic here. I'm super excited about how learning with ChatGPT, one of our top use cases, is continuing to get better. And with apps and the Apps SDK, you can unlock richer educational experiences for users around the world.
So here ChatGPT responded and explained that the instructor is talking about data preparation steps before training a machine learning model. And then it breaks it down in simple terms for me. I don't need to explain what I'm seeing in the video; ChatGPT sees it right away. So here I was able to connect Coursera's app, discover it, start playing a course, and directly engage with the video through text, all within my existing ChatGPT conversation. Pretty cool. Users also love being creative in ChatGPT. Here I have a conversation where I've been brainstorming some ideas to help my younger sibling's dog walking business.
We've gone back and forth a few times, and now it's time to make this into a reality. I'm pretty happy with some of the names here. So let's take this Walk This Wag name, and now I'm going to ask Canva to turn that into a poster. I can say, Canva, can you make me a poster with the Walk This Wag name? I want it to be colorful, whimsical, right. And I prefer, oops, I asked Coursera for a typing course, too, maybe. I prefer sans serif fonts. Cool. Send that off, and now in the background, Canva is generating posters based on the context from my conversation.
Canva is great at creating assets like this, and now you can kick it off directly from ChatGPT. Whether you're making professional marketing assets for OpenAI or just a fun demo for Dev Day, Canva is right there in the conversation as you work. Apps SDK, as Sam mentioned, is built on MCP, an open standard we've loved building on at OpenAI. And if you have an existing MCP server, it's really quick to enhance it with the Apps SDK. All you have to do is add a resource that returns HTML, and your app will be available everywhere ChatGPT is distributed, across web and mobile.
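To make that concrete, here is a minimal sketch of what an MCP server with an HTML resource could look like, using the open-source MCP TypeScript SDK. The resource URI, metadata, and HTML payload are illustrative assumptions; the Apps SDK documentation defines the actual contract for how ChatGPT discovers and renders the widget.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

// Sketch: an MCP server that, alongside its tools, exposes an HTML resource
// a client could render as an app UI. The "ui://poster-widget" URI and the
// inline HTML are illustrative, not the Apps SDK's actual conventions.
const server = new McpServer({ name: "poster-app", version: "0.1.0" });

server.resource("poster-widget", "ui://poster-widget", async (uri) => ({
  contents: [
    {
      uri: uri.href,
      mimeType: "text/html",
      text: "<div id=\"root\"><h1>Walk This Wag</h1></div>",
    },
  ],
}));

// Existing tools keep working exactly as before.
server.tool("create_poster", async () => ({
  content: [{ type: "text", text: "Poster created" }],
}));

await server.connect(new StdioServerTransport());
```

The point is that the interactive UI is just one more resource served by the same server; your back-end logic and tools don't have to change.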
As you can see, this is a live demo, so we're experiencing a little bit of latency. But here we go. Canva has returned four poster examples to us. We see them inline, just like the Coursera video, as well as ChatGPT kind of explaining what it's done for us. But we can explore another modality in Apps SDK, which is full screen. I click an asset and the app requests full screen. We are able to focus on a specific asset. And from here, I can see it in more detail. I can ask ChatGPT to request changes, maybe visual tweaks, just like with our image generation experience.
But since we're in San Francisco and it's Dev Day, let's ask Canva to convert this into a pitch deck. I can say, Canva, can you please make this poster into a pitch deck? We're trying to raise a seed round for dog walking. I'll send that off. And since Canva is going to make us a couple of slide decks, it may take a moment. So in the interest of time, we'll move on to show one more demo while that loads. Say the dog-walking business is going really well and we want to expand to another city.
I can ask ChatGPT, based on our conversations, what would be a good city to expand the dog walking to? ChatGPT, of course, knows what we've been talking about. Super enthusiastically says Pittsburgh. Great. Now I can invoke the Zillow app to say, please show me some homes for sale there. And now ChatGPT is talking to Zillow to fetch the latest housing data, and we'll get an interactive map embedded in ChatGPT, and we'll explore how the full-screen experience goes there. So we have our map, gorgeous loading state, and boom, we have a bunch of homes here.
It also looks like our slide decks are done, so we'll go back to those in a moment. But this map is a little hard to see in the inline view, so I can click a specific home and open it full screen. And now we have most of the Zillow experience embedded in ChatGPT. You can request a tour, all the actions that you would expect from Zillow. But we have a lot of options here, and it's kind of hard to parse. So I can ask ChatGPT, can you filter this to just be three-bedroom homes with a yard?
For the dog, of course. ChatGPT will talk to Zillow again, and because the app is in full screen, it can now just update the data that's provided to it without needing to create a new instance. We see ChatGPT came back. We get the message overlaid here. If I click that, I can view my conversation over the Zillow app and even open it to the full height. Cool, let's zoom in and find a specific home we may be interested in. Now, because the Zillow app is exposing context back to ChatGPT, it knows what I'm looking at. I can ask for more information about this home, like how close is this to a dog park?
ChatGPT is able to compose the context from Zillow with other tools at its disposal, like search. So it's able to give more information about this home. From here, I could invoke other Zillow tools, maybe find out the affordability of it, but it'll provide the best answer every time. This is a great example of how dynamic an experience with Apps SDK can be. This all started from an inline map, and now we're able to go back and forth between talking to the app, asking ChatGPT questions, or just using the Zillow experience. Let's check back in on those slide decks. So if I pop back over to this conversation, we see Canva has given us a few options here. I like the look of this blue one, so if we open that up, we now see the slides in full screen, and I can see all the beautiful slides that Canva has generated for me. Just like the posters, I could ask for follow-up edits. And when I'm ready, I can open this in Canva to get the real slides out and hopefully close the seed round. So there it is, the magic of apps in ChatGPT, conversations that unite the intelligence of ChatGPT with your favorite products, resulting in truly novel experiences.
I'm so excited to keep building this with all of you. Can't wait to see what you do with it. Now, back to Sam for more on apps. That was awesome.
Thanks, Lexi. That was great. It's very hard to type and talk in front of a bunch of people at the same time. So you did very well. We're excited for you to try out the apps that you saw in the demos, along with a few more from these launch partners. They'll be available in ChatGPT today. And this is just the beginning. We're going to roll out more apps from partners in the weeks ahead. For developers, the Apps SDK is available in preview to start building with today. Our goal is to get this in your hands early, hear your feedback, and build it together with you.
And then later this year, developers will be able to submit apps for review and publication. We'll also release a directory that users can browse. In addition to discovery in conversation, any apps that meet the standards provided in our developer guidelines will be eligible to be listed. Apps that meet higher standards for design and functionality will get featured more prominently, including in the directory and suggested as apps in conversations. We've published a draft of the developer guidelines along with the Apps SDK, so you'll know what to expect. And we'll share more on monetization for apps soon.
We'd also love your feedback about what you'd want. This should be an exciting new chapter for developers and for ChatGPT users. So that was apps. And we hope everyone loves it. Thank you. So next we want to talk about building agents and how we're going to make this simpler and more effective. AI has moved in the last couple of years from systems that you can ask anything to, to systems that you can ask to do anything for you. And we're starting to see this through agents, software that can take on tasks with context, tools, and trust.
But for all the excitement around agents and all the potential, very few are actually making it into production and into major use. It's hard to know where to start, what frameworks to use, and there's a lot of work. There's orchestration, eval loops, connecting tools, building a good UI. And each of these layers adds a lot of complexity before you know what's really going to work. Clearly, there's a ton of energy and the opportunity is very real. So we've talked to thousands of teams, many of them in this room, who are building agents to reimagine how work gets done.
And we've asked what we can do to make agents much easier to build. So today, we're going to launch something to help with that. The goal here is something for every builder that wants to go from idea to agent faster and easier. So we're excited to introduce a new thing called AgentKit. AgentKit is a complete set of building blocks available in the OpenAI platform designed to help you take agents from prototype to production. It is everything you need to build, deploy, and optimize agentic workflows with way less friction. Our hope is that everyone from individual developers to large enterprises will get a lot of value from this.
And we'll talk about a few of the core capabilities now. So the first one is Agent Builder. This is a canvas to build agents. It's a fast visual way to design the logic steps, test the flows, and ship ideas. It's built on top of the Responses API that hundreds of thousands of developers already use. So most of you who have used our platform before should be familiar with the foundation. The second thing is ChatKit. We've heard this one loud and clear, and we're making it easy to bring great chat experiences right into your own apps. You get a simple, embeddable chat interface that you can make your own.
You can bring your own brand, your own workflows, whatever makes your own product unique. And you can see in the video here how chat can work across each agent node and call on tools to form the best response. And then finally, evals for agents. We're shipping new features dedicated to measuring the performance of agents. You get trace grading to help you understand agent decisions step by step. You get data sets so you can assess individual agent nodes. You get automated prompt optimization. And you can even now run evals on external models directly from the OpenAI platform.
This is all the stuff that we wished we had when we were trying to build our first agents. And of course, agents need access to data. So with OpenAI's Connector Registry, you can securely connect agents to your internal tools and third-party systems through an admin control panel while keeping everything safe and under your control. So let's look at a couple of examples. Albertsons runs over 2,000 grocery stores across the US. More than 37 million people shop there each week. And each store is like its own little economy. Managers have to make all these constant decisions, you know, tweaking this promotion or that product mix, resetting the displays, working with a bunch of vendors.
It's like a lot of stuff. So Albertsons built an agent using agent kit. So now imagine a situation where sales are unexpectedly down for ice cream, down 32%. Before this would have kicked off a long process of reporting. There'd be spreadsheets and meetings, a lot of panic, they sell a lot of ice cream, very important to them. Now an associate can just ask the agent what's going on. The agent will look at the full context of everything it can discover, seasonality, historical trends, external factors, and it'll give a recommendation. Maybe it's time to adjust the display or to run a local ad.
So let's take a look at another agent. HubSpot is a customer platform used by hundreds of thousands of organizations around the world. And they used AgentKit to improve the responses of Breeze, their AI tool, using the custom responses widget. So in this example, a HubSpot customer called Luma Plants gets a question about why a plant is not doing so well in Arizona. It then uses the Breeze Assistant to search its own knowledge base, look up local treatments for the state's low humidity, pull in policy details, and put everything together. It then offers multiple ideas and a recommendation.
So this is how we imagine intelligence working across many different sources, all operating together to deliver smart, useful answers to customers. And it's a great example of the thing that you can build with AgentKit. We have a bunch of great agent launch partners that have already scaled agents using AgentKit. And it's available to everyone starting today. So let's do a live demo, and I will pass it off to Christina.
Thanks, Sam. Hi, everyone. I'm Christina, and I work on the team building AgentKit. Today, I want to show you how AgentKit helps developers create agents faster than ever before. So you may have already seen our Dev Day website. It's the site here that all of you have access to and has everything about today's schedule. But right now, it's just a static page. What if it could actually help you navigate the day and point you to the sessions that are most relevant to you? We're OpenAI. We need to have AI in our Dev Day website. So that's what we're going to build together, an agent powered by AgentKit deployed right here inside this site.
And to make this interesting, I'm going to give myself eight minutes to build and ship an agent right here in front of you. You just heard how hard it is to build an agent, so this is going to be a bit of a challenge, and I'm going to start the clock now to keep me honest. OK, we have a clock going. So I'm starting in the workflow builder in the OpenAI platform. And instead of starting with code, we can actually wire nodes up visually. Agent Builder helps you model really complex workflows in an easy and visual way using the common patterns that we've learned from building agents ourselves.
So here on the left, we've already extracted the common building blocks. For example, tools like file search and MCP, guardrails, human in the loop, and other logical nodes. Today, I'm planning on building a workflow that uses two specialized agents. The first will be a sessions agent, which will return information about the schedule. And the second will be a more generic Dev Day information agent. So I'm starting off with a categorizing agent to just help route and categorize the type of message coming in, whether it's asking about a specific session or something more generic. And then I've added in an if/else node to route behavior based on that classifier.
Next, I'll create the session agent. Here I'll drag and drop an agent node. I'll call this session agent. I'll give it the context about kind of grabbing information about a session. And then I can add in various tools here. Today I already have a doc with all the information about sessions. So I'll simply drop that in. Let's call this sessions and attach it. So this agent now has all the information needed to answer my questions. But showing the schedule should also be fun and visually interesting, not just plain text. So I'll also create a widget for them.
I'll head over to our widget builder. Here I could create a widget from scratch. I can browse the gallery to learn about other widgets and reuse them. But for today, I've actually already designed a widget for this use case. In this case, it's an onboarding session widget for Froze, one of our Dev Day friends that you'll see around the venue, who's holding a 101 onboarding session in Golden Gate Park. So we can simply download this and then head back over to our agent and just attach it in as, I don't think I clicked download, so let me go back and actually click the button.
Download. There we go. Great. So head over and attach it as an output format for the sessions agent that we just created. Drop that in. We can preview it to make sure we added in the right widget, and everything looks ready to go. So this session agent is now done. Next, I'll create the general dev day agent. So once again, I'll drag in an agent node. Let's call this the Dev Day agent. Once again, give it some context about what it's doing. And then we'll also make it speak in the style of Froze just to make it really on-brand with the day.
We'll add in a file once again. So we have a file with all of the information about the day. Call this Dev Day. Attach it. This agent is ready to go as well, and we'll attach that here. Now, it looks like I have a couple more minutes, so let's add in some additional security with one of the pre-built guardrails. So one of the most important things when building agents is being able to trust them, and guardrails help you have that confidence, protecting against hallucinations, adding moderation, blocking PII. In this case, we already have a couple pre-built guardrails.
I'll turn one on for PII, and then I'll just include name as well so I can easily verify its behavior. I'll attach this to the beginning of the workflow to make sure Froze is really protected against PII, and then I'll add in an additional agent to handle cases when this information is passed in. So again, I'll make it speak in the style of Froze to stay consistent. And I'll remind it that it cannot help with questions that contain sensitive information and remove kind of the context. Great, so I think this workflow is ready to go.
I can also configure the output to determine what shows up to the end user. In this case, I can also turn off file search sources if that is kind of more internal. And I think that's it, let's test it out. I can preview this directly from our Agent Builder. So here I can ask what session to attend to learn about building agents. And I can see this message moving its way through that workflow we just created, checking the guardrail, categorizing intent, pulling information from the file of sessions that I just added in, finding the right session, using the widget that I added, and determining, you know, that Orchestrating Agents at Scale at 11:15 with James and Rohan is like the best session for me to go to to learn more about this.
And then I see a couple of ribbits because this is actually Froze talking to me and ribbiting at me. So, okay, I think this agent looks good, need to watch the time. So we just built a few specialized agents using tools, we added in guardrails, we customized them using some widgets, and then we also tested out the workflow and preview. The one thing we haven't yet done is a full set of evals, and we can also do that directly from the agent builder to make sure that everything behaves exactly as expected before going live. But right now I've got a giant clock chasing me and Dev Day is waiting, so let's publish this.
Hit publish here. Let's call this ask-froze. Hit publish. And I now have a fully deployed, published agent in production with a workflow ID that I can use to run it directly. On the right, we also have code export in case I want to run this in my own environment, on my own servers. But you can see this is quite a bit of code to write. And so I'm just going to stick with using the workflow ID that we just created and then head over to my site. So here in my Dev Day site, I'm first going to create a ChatKit session using the workflow that we just created.
I'll simply drop in that workflow ID. I'll add in the ChatKit React component using that client secret that we just created in our own server, and then adding in visual customization as well to, again, make this really Froze themed. In this case, it's going to be called askFroze. It's going to continue to ribbit in the placeholder, and it'll have some Froze specific colors and starter prompts. I'll add this Froze chat in a bottom sheet, so it'll come up from the bottom of the page. And then finally, I'll add in a link to Ask Froze at the top of the site so that it's really front and center on our website.
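As a rough sketch of the shape of that client-side integration, the embed might look something like the following. The component name, its props, and the /api/chatkit/session endpoint are hypothetical placeholders, not the real ChatKit API; the actual package exposes its own component and configuration.

```tsx
import { useEffect, useState } from "react";

// Placeholder stand-in for the real ChatKit React component; the actual
// component name and props come from the ChatKit package, not from here.
declare function HypotheticalChatKit(props: {
  clientSecret: string;
  theme?: { accentColor?: string };
  placeholder?: string;
  starterPrompts?: string[];
}): JSX.Element;

export function AskFroze() {
  const [clientSecret, setClientSecret] = useState<string | null>(null);

  useEffect(() => {
    // Our own backend mints a ChatKit session for the published workflow ID
    // (hypothetical endpoint) and returns a short-lived client secret.
    fetch("/api/chatkit/session", { method: "POST" })
      .then((res) => res.json())
      .then((data) => setClientSecret(data.client_secret));
  }, []);

  if (!clientSecret) return null;

  return (
    <HypotheticalChatKit
      clientSecret={clientSecret}
      theme={{ accentColor: "#3a7d44" }} // Froze-style green
      placeholder="Ribbit... ask me about Dev Day"
      starterPrompts={["What session should I attend to learn about building agents?"]}
    />
  );
}
```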
So let's go back to our site. There it is. Ask Froze, top of the site. Let's try it out. So what session to attend to learn about building agents? And again, this is running through the exact same workflow we just created, checking for guardrails, categorizing the message, pulling from tools from file search, using the widget that we designed, and then, again, deciding Orchestrating Agents at Scale is the right session for me to go to and continuing to ribbit in the style of Froze. So, okay. Great. We've done it. The agent is ready. We can stop the clock.
The agent is ready with 49 seconds to spare. And I can keep iterating on this agent directly in the Visual Builder and also deploy these changes directly to my site without making any code changes at all. This includes adding new tools, adding new widgets for other use cases, adding new guardrails, and I can even wire it up to client-side tools to take actions directly in my website. So, in just a few minutes, we've designed an agent workflow visually. We added in some tools and widgets, we previewed it, we deployed it, we tested it, and now you all can use it.
This is actually live now in your Dev Day site. You can tap your badge and you should be able to see it and use it and find the sessions that are best for you. So we're looking forward to using it and meeting Froze and also seeing all of the new experiences that you'll now be able to build using AgentKit. Thanks and back to Sam.
Thank you, Christina. I think that's so cool. I can't wait to see what you'll all build with it. So we've looked at AI apps, agents, but now let's shift to one more thing that's just as important, how we write software. One of the most exciting things happening with AI is that we're entering a new era that changes how software gets written. Anyone with an idea can build apps for themselves, their families, or their communities. And before we talk about Codex, we want to show you a few examples. In Japan, an 89-year-old retiree taught himself to code with the help of ChatGPT.
He's now built 11 iPhone apps for elderly users. He's turning a lifetime of wisdom into tools that help others live more independently. In Spain, Pau Garcia and members of Domestic Data Streamers are helping people reconnect with memories using ChatGPT, image generation, and Sora. At ASU, med students needed a better way to practice the kinds of difficult human conversations they'll have as doctors. So they built a virtual patient app with our models where they can try, fail, and get better before they step into a real exam room. And at Versailles in France, visitors can now walk the palace and talk to it.
They built an experience where you have a live discussion with art and sculptures with our real-time API. History becomes a conversation. It's awesome to see what people are building, and this is why we're so excited to give developers even more tools to build faster. So, earlier this year, we launched a research preview of Codex, OpenAI's software engineering agent, built to work alongside developers and speed up how software gets created. Since then, Codex has become very loved and grown into a much more capable collaborator. It works everywhere you code now: your IDE, the terminal, GitHub, and in the cloud.
Your ChatGPT account connects everything so you can move your work seamlessly between your tools. We've released a ton of new features for Codex, and it now runs on the new GPT-5 Codex model, a version of GPT-5 purposely trained for Codex and agentic coding. This model is better at tasks like code refactoring and code review, and it can dynamically adjust its thinking time for the complexity of the task. Developers love this new model, and Codex usage has gone up really fast. One of our key metrics for looking at this is daily messages, the number of tasks and conversations that developers have with Codex each day.
Since early August, daily messages are up 10x across Codex. And this rapid usage has also helped GPT-5 Codex become one of our fastest growing models ever. Since its release, we have served over 40 trillion tokens from the model. Internally, at this point, Codex is everywhere we build. Almost all new code written at OpenAI today is written by Codex users. Our engineers that use Codex complete 70% more pull requests each week, and nearly every OpenAI PR goes through a Codex review. And from that, people get more depth than they'd expect, even from a very senior engineer.
Starting today, Codex is officially out of Research Preview and into GA. And while... Thank you. And while Codex already has a lot of traction with individual developers, today we're introducing a new set of features to make Codex more helpful for engineering teams. First, we have a Slack integration. This has been very much requested so that you can ask Codex to write code or answer questions directly from team conversations in Slack. Second, a new Codex SDK, so that you can extend and automate Codex in your team's own workflows. And third, new admin tools and reporting, including environment controls, monitoring, analytics dashboards, and more, so that enterprises can better manage Codex.
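To give a feel for that second item, here is a hedged sketch of driving Codex from a team workflow through the SDK. The package name, class, and method calls below are my best reading of the announcement, so treat them as assumptions and verify against the published Codex SDK reference.

```typescript
// Hedged sketch: run a focused Codex task from your own automation.
// Package, class, and method names are assumptions based on the keynote.
import { Codex } from "@openai/codex-sdk";

async function reviewChange(diffSummary: string) {
  const codex = new Codex(); // assumed to pick up auth from the environment

  // Start an agent thread and hand it one well-scoped task.
  const thread = codex.startThread();
  const result = await thread.run(
    `Review this change and list likely bugs or risky edits:\n${diffSummary}`
  );

  // The final message could be posted back to Slack, CI, or a dashboard.
  return result.finalResponse;
}
```

The same shape is what would let a Slack integration or a CI job kick off Codex runs without anyone opening an editor.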
Expect to see a lot more Codex improvements coming soon. One of the things that's been really inspiring to us is to see the breadth of people using Codex, from developers building side projects on weekends to high-growth startups to big global enterprises. Cisco rolled out Codex across its entire engineering org. They're now able to get through code reviews 50% faster and have reduced the average project timeline from weeks to days. So for our next demo, we're going to show you something fun. We're going to show you how you can use the new Codex and our APIs to turn anything around you into workable software.
And for that, please welcome Ramon to the stage.
Good morning, everyone. Last year, we built an iPhone app from scratch, and we even programmed a mini drone live on stage using o1, our first reasoning model. It was kind of vibe coding before we even had a name for it, frankly. But the progress since has been incredible. Codex is now a teammate that understands your context, it works alongside you, and it can reliably take on work for your team. And we thought about how do we best show you all of the cool things that Codex can now do. We had a lot of ideas. But one that we kind of kept coming back to is what about building something that we could all experience and see together here in this room right now.
So that's our challenge. If you look up here, some of you might notice a camera that's mounted above the stage, and I thought maybe we could start there. And so earlier, I asked Codex CLI to create a very simple control panel interface: the camera feed on the left, some buttons on the right. And if we can bring my laptop on screen, you'll see what Codex came up with. Initially, it did it really well, but then I also added the Figma branding from the Dev Day event so it could pull the exact colors and components to actually render it perfectly matching our design.
Great, so that's our place to get started. Now, I haven't written a single line of code, but let's dive in and see what we can do on top of that. So, skipping over here to my terminal, you can see we have Codex CLI. It's logged in with my ChatGPT account, and it's powered by GPT-5 Codex, the new model that Sam mentioned. Now, let's start by asking a question that I'm sure many of you have not asked a coding agent before.
How do you control a Sony FR7 camera in Node? And honestly, I did not know how to get started. I saw there was a C++ SDK, and I thought maybe Codex would want to use that and turn it into JavaScript. But then it had a much better idea than that. I realized that there's a VISCA protocol, apparently, to control these cameras. So, you know, as you can see, Codex can respond pretty fast for questions like this. And this all seemed promising to me. So basically I went ahead and sent this over to Codex to completely scaffold an integration using the VISCA protocol and wire it up to that control panel.
Now, Codex is becoming harder and harder to demo, by the way, because it can really work tirelessly on your task. I've seen it work for up to seven hours on a very big refactoring, for instance, and get it right, which is pretty outstanding. Here, if we switch over, and if we scroll back up, it updated its plan along the way, wrote a lot of code and everything, and this was the final result. As you can see on the screen, it worked for over 13 minutes on that one task, but it did everything I wanted it to do.
So let's take a closer look. If I jump over to VS Code now, you can see on the right side of the screen, we also have our Codex integration in the IDE. And these are the files that the Codex CLI came up with for the camera control. So you can see it built a Node server. It also figured out all of the UDP packets to send to this camera. Imagine the time it would have taken me to actually learn this protocol that's over 30 years old, by the way. Codex even figured out that there were some very specific headers to send for this particular camera.
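For a sense of what that protocol-level code involves, here is a minimal sketch of sending a single VISCA-style pan command over UDP from Node. The address, port, and byte values are illustrative, and cameras like the FR7 layer additional framing on top of plain VISCA, so treat this as a shape rather than a drop-in.

```typescript
import { createSocket } from "node:dgram";

// Illustrative VISCA-over-UDP sketch; not the code Codex generated.
const CAMERA_IP = "192.168.1.50"; // hypothetical camera address
const VISCA_PORT = 52381;         // commonly used VISCA-over-IP port

// Classic VISCA pan/tilt drive packet: 8x 01 06 01 VV WW 0p 0q FF,
// where VV/WW are pan/tilt speeds and 01 03 means "pan left, tilt stop".
const panLeft = Buffer.from([
  0x81, 0x01, 0x06, 0x01, 0x08, 0x08, 0x01, 0x03, 0xff,
]);

const socket = createSocket("udp4");
socket.send(panLeft, VISCA_PORT, CAMERA_IP, (err) => {
  if (err) console.error("Failed to send VISCA command:", err);
  socket.close();
});
```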
All right. So with these UI components now wired up, the server's running. So we can take a quick look and try this. So here, if I turn on the camera, there we go. We have the camera. That's all of you. Awesome. And let's try the controls. Boom. I can actually control the camera now from this interface. It's pretty cool. All right. But controlling with buttons is nice, but I think we can do something better. So I'm going to try to send another task, but live this time inside our ID extension. So check this out. Wire up an Xbox wireless controller to control the camera.
So I'm going to send this one right now. I was backstage earlier, and I found this Xbox controller. I wasn't sure who's playing back there, but I thought this could be something we could try. So we're going to leave it here. And as you can see now, Codex made a plan. Apparently, three tasks have to be completed. It's now exploring the files. It's figuring out how to wire up this gamepad. And what's interesting here is you can see in the IDE we also have this concept of auto context. And what that means is that your prompts can be pretty short because Codex will kind of understand your intent.
It will see like the recent files that you've used and really kind of adjust accordingly. So as you can see, we're now at task number two. This one will probably take another minute or so. So we'll leave it run in the background. So in the meantime, what else could we do? Well, I thought one exciting interface is voice. So to save us a few minutes, I've already asked Codex to integrate with our real-time API and our agents SDK. And I wanted to wire up all of this into the app right here on this green little dot at the bottom right of the screen.
But what's great about the real-time API is that it brings natural speech-to-speech into your app, but it also connects to any MCP server in the context of that conversation. And so that really got me thinking, what else could we show you in this room and turn into an MCP server? And then I thought, wait, we have a lighting system, so maybe we could wire up the lighting system of the venue over to an MCP server. So let me check out this task that I sent to Codex, but this time in Codex Cloud. So you can see here my prompt.
I asked Codex to wire up this MCP server for this very specific model of lighting system. I gave it the reference docs that I found, and I gave it the exact interface that I wanted to have for my UI to work. But what's fascinating to me is, like, if you look at the logs, for instance, like, that's really the magic of the agentic behavior of Codex. Like, I could have asked a teammate to do that, but because the task was very specific, now Codex is my teammate. And if you look at how it went through the process, it actually figured at some point that it needed to kind of find new information about command 8 to move forward, so then it went ahead to fetch the GitHub docs again to kind of really like operate and call tools along the way.
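As a toy illustration of the kind of MCP server being described, a lighting tool might be wired up roughly like this. The tool name, parameters, and sendToLightingConsole helper are hypothetical, not what Codex actually generated for the venue's console.

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Toy lighting MCP server; names and parameters are hypothetical.
const server = new McpServer({ name: "venue-lights", version: "0.1.0" });

async function sendToLightingConsole(command: string): Promise<void> {
  // Placeholder: a real server would speak the console's network protocol.
  // Logged to stderr so stdout stays clean for the stdio transport.
  console.error(`lighting command: ${command}`);
}

server.tool(
  "set_lights",
  {
    color: z.string().describe("Color name or hex value"),
    intensity: z.number().min(0).max(100).describe("Brightness, 0-100"),
  },
  async ({ color, intensity }) => {
    await sendToLightingConsole(`set color=${color} intensity=${intensity}`);
    return {
      content: [{ type: "text", text: `Lights set to ${color} at ${intensity}%` }],
    };
  }
);

await server.connect(new StdioServerTransport());
```

Once something like this is registered with the voice agent as an MCP server, "shine the lights towards the audience" becomes just another tool call in the conversation.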
And that's really the magic of GPT-5 Codex. You can really let it go off, do some complex tasks, and you'll see this agentic behavior emerge. And then after that, I did one more turn to ask Codex to include the README information so I can understand how to run it, and sure enough, here is my MCP server. Now, going back to VS Code, what's also very interesting, it sounds like the Xbox task just finished; we'll see that in a moment. Awesome. And now going back here, you can see that you have all of your local but also your cloud tasks right here inside the VS Code extension.
So if I click on the build MCP server, for instance, this is the task I just showed you on Codex Cloud. And here I can actually apply all of the changes locally if I want to test them. So I can just go ahead and click apply and now all of the changes with this MCP server are now included locally. Now, I didn't want to let you watch me do NPM install here on stage. I just took that MCP server already, and I started to install all of the dependencies to make it run. So let's now put all of that to the test.
But first of all, if we go back to our app right here, let's see if we can actually control this camera now with this Xbox controller. There we go. It works. That's pretty awesome. What's funny is I didn't even specify to Codex which button should do what, but it figured that the joystick was probably a good idea. So that's actually perfect. But let's do the real test now. Let's enter voice mode and see what happens. Sorry in advance AV team, but we'll try to see if Codex figured this out to control the lights. Hi there.
Hello?
Hi there, I can hear you loud and clear. Let me know what you need. Yeah, what do you see on the camera? I see a large audience in front of you, with rows of attendees seated and ready. The room is well lit, and the crowd stretches far back, filling the space. That's right. Could you shine the lights towards the audience? There we go. I've just lit up the audience with a nice bright wash. They should be perfectly highlighted now. Can you greet everyone here in the room now? Absolutely. The audience lights are up. Let's welcome this fantastic crowd.
Hello, amazing developers at Fort Mason. We are so excited to have you here for an incredible day. Awesome.
Could you now do something fun with the lights and say hi to people tuning in on the live stream?
We've got the fun lights rolling now, some dynamic, colorful effects in motion. And to everyone watching the live stream, thanks for joining us. This is going to be an amazing experience. Cool. All right. Let's go back to normal now.
Awesome. So I still have not written a single line of code, by the way, to make this happen. But there's one more thing I wanted to show you. Last but not least, Sam mentioned that we're launching the Codex SDK today. And so I wanted to finish up with something that gives you a little bit of a glimpse into what might be the future of software engineering. So let me go back into voice mode and try this. Hi, could you ask Codex to show a credits overlay like at the end of the movie, but this time the cast is all Dev Day attendees?
I'm running that with Codex now. I'll let you know when it's ready. Great.
In the meantime, could you start a countdown and take a photo of all of us? Oh, and there we go. So to explain what just happened here, when I set up the voice agent, I also added the Codex SDK as a tool. And what that means is that now, on the fly, I can reprogram this app in real time and instantly adapt it to user needs or any kind of feedback they have. So in this case, when I asked to create a credits overlay, it was able to go ahead and edit the code inside this React app, hot-reload it, find what it needed to complete the task, and now the credits are rolling.
It's pretty amazing. So with all that, we took voice, we took a sketch, we took some devices around us, and we turned all of this into workable software. And all of that without writing a single line of code by hand. So really give Codex your most ambitious ideas, give Codex your most complex coding problems, and see what happens. I think you'll be as amazed as we are every day. The only limit now is your imagination. Thank you so much. Back to you, Sam.
Thanks, Ramon. This is the biggest change to how software gets created that I have ever seen, and we can't wait to see what you all will do with it. I think the future is going to be very bright. We've covered a lot today, but obviously, models matter a lot too. So I want to share a few model updates. Back in August, we launched GPT-5. We trained it to be really good at steering agents and end-to-end coding, and GPT-5 has delivered on that. Leading coding startups like Cursor, Windsurf, and Vercel are using GPT-5 to change how software gets written and shipped in their apps.
And then after that, we released GPT-5 Pro, the most intelligent model that we've ever shipped. Today, we're launching GPT-5 Pro in the API. It's available to all developers, and we hope you enjoy it. GPT-5 Pro is great for assisting with really hard tasks in domains like finance, legal, health care, and much more, where you need high accuracy and depth of reasoning. We're also releasing a smaller voice model in the API with GPT Realtime Mini. It is a smaller, 70% cheaper version of the advanced voice model we shipped two months ago, with the same voice quality and expressiveness.
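As a hedged sketch of how a server-side app might talk to that model over the Realtime API's WebSocket interface: the model ID string and the exact event payloads are assumptions, so check the current Realtime reference before relying on them.

```typescript
import WebSocket from "ws";

// Hedged sketch of a Realtime API connection; the model ID and event
// shapes are assumptions taken from the keynote and earlier Realtime docs.
const ws = new WebSocket(
  "wss://api.openai.com/v1/realtime?model=gpt-realtime-mini",
  {
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "OpenAI-Beta": "realtime=v1",
    },
  }
);

ws.on("open", () => {
  // Ask for a short spoken greeting; audio streams back as server events.
  ws.send(
    JSON.stringify({
      type: "response.create",
      response: { instructions: "Greet the Dev Day audience in one sentence." },
    })
  );
});

ws.on("message", (raw) => {
  const event = JSON.parse(raw.toString());
  console.log("server event:", event.type); // audio deltas, transcripts, done, ...
});
```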
Personally, I think that voice is gonna become one of the primary ways that people interact with AI. And this is a big leap towards that reality. Now I wanna shift gears and talk about what's new for creators. And this has been a hotly requested one, so we hope you'll like it. We've been seeing incredible work from filmmakers, designers, game developers, educators, and more, using AI as part of their creative process. Today we're releasing a preview of Sora 2 in the API. You now have access to the same model that powers Sora 2's stunning video outputs, right in your own app.
One of the biggest jumps that we've made with this model is how controllable it is. You can give it detailed instructions, and it holds onto the state while delivering results that feel stylized, accurate, and composed. For example, you can take the iPhone view and prompt Sora to expand it into a sweeping cinematic wide shot. But one of the most exciting things that we've been working on is how well this new model can pair sound with visuals. Not just speech, but rich soundscapes, ambient audio, synchronized effects that are grounded in what you're seeing. So here's an example in this kayak video.
You can also bring pieces of the real world into Sora 2. For example, you could take a picture of your dog and give your dog some new friends.
Look who's coming, buddy. Here they all are. Good pups, come on. That's it, everybody together.
Happy dogs. Sora 2 is also great for concept development. You can just describe a general vibe or a product, and Sora will give you a visual starting point. So here we're going to use it to generate concepts for an e-commerce ad.
When your new place feels like a blank canvas, find the pieces that make it yours. Browse, customize, and check out in minutes. Delivered fast to your door.
We hope that now with Sora 2 Preview in the API, you will generate the same high-quality videos directly inside your products, complete with realistic and synchronized sound, and find all sorts of great new things to build. Just like our other modalities, it's built for flexibility. You get to control video length, aspect ratio, and resolution, and easily remix videos. Mattel has been a great partner working with us to test Sora 2 in the API and seeing what they can do to bring product ideas to life more quickly. So one of their designers can now start with a sketch and then turn these early concepts into something that you can see and share and react to.
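As a rough sketch of what requesting one of those clips from your own backend might look like: the endpoint path, model name, and parameter names below are assumptions drawn from the capabilities mentioned here (length, aspect ratio, resolution), and the published API reference is authoritative.

```typescript
// Hedged sketch of kicking off a Sora 2 render; endpoint, model name,
// and parameters are assumptions, not the confirmed request shape.
async function createKayakClip(): Promise<string> {
  const response = await fetch("https://api.openai.com/v1/videos", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "sora-2",
      prompt:
        "A kayaker paddling through whitewater, sweeping cinematic wide shot, " +
        "with synchronized paddle splashes and ambient river sound",
      seconds: 8,       // assumed knob for clip length
      size: "1280x720", // assumed knob for resolution / aspect ratio
    }),
  });

  const job = await response.json();
  return job.id; // generation is asynchronous; poll or use a webhook for the finished video
}
```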
So let's take a look at how this works. That is a very cool new way to build toys. And it's incredible to watch with AI how fast ideas can turn into shareable, workable designs. So we're excited to see what else you'll come up with as you use Sora 2 in your own products. We hope that today gave you a few ideas of new things to build. We want OpenAI to be a great platform for this new era of building. We think things are going to get pretty incredible pretty soon. All of our announcements today are aimed to support you in this work.
The Apps SDK for building native apps inside of ChatGPT. AgentKit, so you can deploy agents wherever you'd like, easily and with more confidence. A more powerful Codex, changing the way software gets written, helping your team ship faster. And new models in the API, GPT-5 Pro, Sora 2, and Realtime Mini, that expand what's possible. We're watching something new happen here, I think. Software used to take months or years to build. You saw that it can take minutes now. To build with AI, you don't need a huge team. You need a good idea, and you can just sort of bring it to reality faster than ever before.
So thank you all for being here, and thank you for building. One second, I'm almost done. Our goal is to make AI useful for everyone. And that goal won't happen without all of you. So we're very grateful that you're here to build with us. And also a huge thanks to the team that made today possible. A huge amount of work went into this. And there's a lot more work, there's a lot more happening throughout the day. So enjoy the sessions and we'll see you later. Thank you very much.