ChatGPT Atlas deep dive: The browser that knows too much?

Download MP3

[JINGLE]

All right, we are back after another long hiatus. This time for different reasons. Brad, how's it going?

Pretty good. Back from Korea, back from New York. Lots of travel, but good travel. I ate so much food when I was in Korea—just cafés, shopping, and food in a repeat cycle. It was good, but I was kind of plagued by bad weather while I was out there. Other than that, pretty good. Back in the grind, back home, feeling good.

Was it cold? Was it cold rain, or was it kind of tropical?

It was kind of humid, so at times it was a little too warm to be raining, in my book. I wish it was slightly colder, but then when it gets cold and it rains, it sucks. You can't win when it's raining, honestly.

Yeah, yeah. In Arizona, we don't get much rain, so I'll take some rain. We had some really heavy rain probably a week or two ago now, and it even caused some flooding in certain areas. But when you're on vacation, you probably don't want rain. That's usually not why you go somewhere to relax and then just be rained in for the whole trip.

Yeah, we attended a wedding and were hoping that it wasn't going to rain during the ceremony, but it did. So we got clear umbrellas that we were able to use so it didn't ruin the pictures. I thought that was a cute vibe, but I'm sure they didn't want it, and it kind of changed the plans afterward. I think there was supposed to be dining outside, but then it was moved inside.

Did you see something? Yeah, we're good. Hey, we'll leave it in there. Brad's microphone froze for a second. Hey, it's all right. It's a podcast; it's show business.

So it was good, it was fun, but I was ready to be home. I think it was a long trip, so by the end of it, I was ready. We actually changed hotels every two days; that was a little painful. We went to Seoul, which we've been to before, then Busan, then Jeju, which is like the Hawaii of Korea, and then back to Seoul. It was a hotel change every two to three days, so it was constant checking in and checking out. If I ever do it again, I'll definitely stay at one place longer because that can really take a toll on you.

Yeah, that sounds expensive. When you were saying that, my wallet was hurting a little bit, but it sounds like you had a good time.

It was a good time.

Yeah, okay. Do you get a benefit from the US dollar being strong over there or no?

Oh yeah, everything is really cheap over there.

Yeah, okay, that helps. Cool. Awesome, well, enough about your fun travels. One thing that's been going on in my world, I guess, is if you're on my other YouTube channel—the one that's just me talking about boring accounting stuff—then you probably saw I updated the picture. There's a picture of a little kitten because last Saturday, a little kitten wandered into our backyard in some bad shape and needed help. We got it all situated, and we ended up fostering it for a couple of days. I'm happy to say that it found a forever home. It was a cute little guy. But that took a lot of my time and energy, just taking care of him. You know, he's a little kitten, like six to eight weeks old, I think. And so we were getting the cat tower and litter box and all that stuff for just a five-day stay with us. But, you know, it was cute, and I was sad to see him go. But I was definitely ready to get my routine back and just some normalcy because I have a dog. For those that don't know, you might have seen her in the background here and there, and she does not mess with cats.

I'm glad there were no altercations there.

Yeah, she did pretty well, but you know, we couldn't really let them be together.

I'm surprised you know it was six to eight weeks. I was wondering how old that cat was.

Yeah, we took him to the vet because when we first saw him, he was just in bad shape and really little. That's why I was like, I can't just leave him there because he was so small. But yeah, they estimated him to be like six to eight weeks. I think maybe six to ten, might have been six to ten, but yeah, small. A lot of life has been happening, so that's why we've been a little slow to get back to the pod, but here we are, and we're finally back. So Brad, what's been going on? Where should we start first?

Yeah, honestly, there's been a lot, and we say this every episode, but one that was actually pretty unique is that DeepSeek released a new OCR model. I think DeepSeek is known for being an innovative lab that has really, really smart people. I think the background of DeepSeek people is that they were quant researchers brought into the AI world, so they're definitely intellectually smart people. But the TLDR on this new model is that they came out with a new solution to essentially make LLMs more efficient.

So when you think of an LLM, it's text-in and text-out. What they released in their research paper is essentially a way to take text, create an image from that text, and use that image inside the LLM to then take the image back to text and process it. To break that down, text requires a lot of tokens, but when you take an image of text, they actually have an optimized image encoder and reader. If you represent that text in an image, you essentially save about, I think it was like 10x compression.

That's a big deal because as we get to more complicated systems and longer chats, a lot of times we're looking for a longer context window. So how much can we stuff in this chat window and have the AI still be good at it? This method of taking text, making it an image, and sending it into the LLM for it to process as an image and pull the text back out makes things a lot more efficient. So you can have a longer context window. I think the cool part about this is that it even allows you to kind of downgrade the long-term memory. So when we think about a long chat, there are tokens at the start and tokens at the end. If you think about making that chat become a series of images, you can make that final image a little bit blurrier. That means that maybe the LLM doesn't extract as much text from the final image in the chat, so it has a natural loss of information based on images.

Honestly, it's pretty cool. I was shocked that this came out because, like all things DeepSeek, they kind of just pop out of nowhere. And it seems like this could actually change the game. I think there were a lot of people on Twitter mentioning they had thought of this method, like people trying to say they had the idea first. But essentially, they came out with the idea, and it looks very promising. It parses lots of PDFs and does OCR in general. But I think their research on converting text to an image and then processing text in this kind of middle layer within the LLM is pretty awesome. I'm hoping that this really inspires others to find new approaches to make these context windows at least 5 to 10 times longer.

Yeah, that's cool. I did see that news somewhere on X or something like that. I don't remember the exact example, but they had said that the amount of text that the image can represent is some huge amount, but the image file itself is tiny. I'm not an engineer, so I don't know all the details like Brad does. But it was funny, and as you say it out loud, it's such a weird way to handle things. You have text, and you need to convert it to an image just to re-extract it as text. But obviously, there is that context window problem that you run into at a certain point.

The other thing is, I have an agent I kind of prototype and work with, and I've noticed that the longer the chat conversation runs, the worse the performance of that agent is. This is a pretty common thing people have realized. For the folks in finance that may not be familiar, it stores everything in memory. I don't have any kind of smart release of context windows. And so I've noticed if I run a process three, four, or five times, that fourth and fifth time is less true to the prompt I gave it than the first time. If I feel like I'm getting bad performance, I just shut it down and spin it back up. It has no memory anymore, but if I rerun that process or tool, it's back to being really good.

That's what you should do, too. I feel like there's been a lot of discussion about how to efficiently use AI tools, and I think that is, in essence, one of the major skills. Prompting the right amount and also clearing your chat over time. I think people just get lazy and say, "Oh, do this for me," and then, "Now do something completely different that has nothing to do with that chat." But in these coding tools released by Anthropic and OpenAI, they frequently have this "context compaction," as they call it. There's a certain limit—at which, maybe there are 250k tokens available—and once you're at 150k, it starts warning you that you're getting close to this limit. At a certain point, it will take all your chat, summarize it, and then start a new chat with that summary. I think this exists inside coding tools, but not in normal chat using ChatGPT and Claude. I would say if you're using these tools, be very mindful about keeping a short chat window because you do get the best performance.

With DeepSeek, I guess that could extend a little bit further, where your usability window could be, instead of 250k tokens, maybe two million tokens due to the 10x compression. But it is what they call lossy compression. I think they said if they take a photo of text, it's about 97% accurate, so it doesn't mean it's going to get all of that image back to text correctly. But about 97% is really, really high, and I'm sure they can optimize it further going forward. But yeah, really cool release from them. At its heart, it's an OCR model, but I think their research really showcases that there's a bigger unlock under this model. People are really getting their hands dirty, and hopefully, we'll see more AI labs come out with something that just extends the context window.

Yeah, it's interesting because a lot of times, for accountants, the way that AI agents have been framed is as a helpful intern. And in some ways, that is true. If you give it the right tools and prompting, you can give it work, and it'll do a decent job, but you kind of need to check it and have that human in the loop. But that kind of breaks down if you think about long-term memory with agents. Just as you said, it's better to start a new chat, but people will say, "Well, I talked to my agent about this a week ago. I want it to remember that conversation." And I haven't seen this done really well in practice, but there is a difference between short-term memory, which is in the context window, and long-term memory. If it doesn't know something, it can go do a semantic search. It will go look up a longer-term database of chat history with that agent and then get that context and pull it into the chat. I've seen that in diagrams; I think LangGraph published something about AI agents having short-term and long-term memory. But it's funny how people's expectations of using the tool and thinking about AI agents differ from the intern analogy. You can tell an intern something, and two weeks later, he or she should remember that conversation. But with an AI agent, it's a little different, so you can't really hold that analogy true across the board.

Yeah, I feel like it's good to remind them. It's like talking to someone every day; if you had a coworker sitting next to you every day, you'd just remind them of the most important things. And if you tell them too much, they're going to be overloaded. That's the exact mental model you can apply to chat: if you tell them too much and you keep going, they're going to be overwhelmed. But every day is a reset; start a new chat, do something. Unless you're actually continuing your work, then definitely use that same chat. But if it's something that's not related or only slightly related, definitely create a new chat and prime that chat with as much context as you need. That's the best way, I would say, to use AI chat, but I think a lot of these things will evolve over time. Maybe the problem today honestly won't be a problem in six months because of larger context windows.

Yeah, I just thought of that image you put in my head of telling someone every single day to do something and reminding them. For some reason, I thought of a meme coin trader saying that to their agent, being like, "Buy this meme coin, do not sell." I don't know why that entered my head—sorry to go off on a tangent—but with all the weird terminology like, "Oh, that's gas," you know, all that stuff. But to be serious, yeah, I agree with that, and I think people will get used to working with agents and changing their expectations. You do need to constantly refresh it and then maybe re-prompt it or remind it. You know, say your job is on the line when you prompt it, because that always seems to make it perform better.

Threats are good. Well, speaking of memory, OpenAI just released ChatGPT Atlas. So it's kind of weird. Again, OpenAI is not that great at naming; they've historically had complicated names. If I were them, I would just name it Atlas, not ChatGPT Atlas, but either way, they released a browser. It's centralized around ChatGPT being the entry point into the browser. So when you open up a new tab, there's a composer in the middle, and by default, it goes to ChatGPT instead of something like Google or some other search engine.

I think it was kind of a surprise as well. They announced it maybe a day before on Twitter. During their live demo of about 25 minutes, they showcased their agent mode. When you look at this new browser, it looks very similar to Chrome. I think it actually uses Chrome's rendering engine under the hood, but it's a separate macOS app. So it's their own UI skeleton, but what's actually getting rendered inside the browser is as if you were rendering it within Chrome. But again, it doesn't have some of the Chrome features like translation or other browser features. It's very much a bare-bones, "let's take a browser, let's add ChatGPT, let's make it agentic" approach. I think it's pretty exciting to have that, but it kind of opens up this can of worms with security.

They demoed finding a recipe, creating a list of things the guy needed, taking it to Instacart, and adding it to his cart. The big thing that presents itself here is that these browsers are only as useful if you're logged in. You know, if you want it to buy things from Instacart, you need to be logged in already. So I use Chrome; I'm logged into pretty much every service on Chrome. When you boot up Atlas, it actually asks to import things from Chrome. I said no because, again, I don't trust the security yet, but I think it could get there. It would be very useful to have something that's already logged in and can navigate the browser. How it actually works is it'll take a screenshot, read the HTML on the page, and try to figure out, as if it were using an accessibility screen reader, how to interact and navigate the page.

The demos were cool. I downloaded it, tried it out, and asked it a few things. I think it's in very early stages, and it really sets up the play for OpenAI to get a lot of user data on how to navigate a browser. Memories is a big thing for them, and security is a huge thing for this browser. So honestly, there's a lot to talk about here. I think they've released what is very much a V0 in this category.

Yeah, I thought about the data element too, of them getting everyone's data. But the other part I thought about was when I saw some people kind of talk badly about the release, saying, "Oh, it's just a Chrome extension. They reskinned Chrome and put Gemini as a sidebar," and of course, they did it with ChatGPT. To me, it's more about the data, like you said, but then also the ecosystem part of it. Because for browsing the web, you would of course have to go to Google Chrome, which is probably the biggest browser.

Certainly not Internet Explorer or Safari, yeah.

Right, and Google is one of their main competitors, so it makes sense to me that they see Google Chrome, which is going to have some Gemini capabilities built in. I don't know, to be honest with you; I should know that, but it will at some point, right? And I think they want to make that ecosystem play. The other thing that they released recently was company knowledge within ChatGPT for enterprise customers. Basically, I don't know the technical details, but it indexes your company files, and you can basically chat and RAG over your company data. So if you have a file and you're like, "Oh, where's that file I saved that talked about some memo?" it can go grab that file and summarize it for you or whatever.

I think OpenAI is definitely trying to make that ecosystem play as well and get the data from users browsing everything. Because to me, it seems like they want to be as sticky as they can. If they've got you using ChatGPT enterprise and you're using the Atlas browser, then they can kind of... we've talked about this before, where at some point are they going to start walling themselves off from each other? So where you can't use different models with these different tools and stuff like that. You know, you can't use an MCP to connect to Google Drive if you're using ChatGPT. Who knows where that goes, but that's the other part I thought about when I was thinking about that ChatGPT Atlas release—just the ecosystem piece.

Yeah, I thought it was genius because in the onboarding, they actually ask you to set it as your default browser, which is a pretty standard practice for most browsers. But they offered to let you set it as your default browser to get higher limits on ChatGPT. So there's an incentive there to say, "Use our software, and you'll get more queries to ChatGPT." I thought, wow, that is a really smart offering. I could see Google doing the same because they have Google One storage to power Google Photos and Google Drive. I don't know if the scale makes sense because Google is much larger than OpenAI, but the fact that they're offering an almost invisible quota boost was pretty interesting.

How I read these browser wars is that Perplexity has Comet, which I think was released probably in July but wasn't really open until about a month ago. All these browsers are trying to be agentic. And I think one of the biggest demos that I was very taken aback by was the idea that this new browser, ChatGPT Atlas, could have memories. So they described it as, like if you're planning a trip, maybe you have a tab open for flights and a tab open for hotels, and you're doing all that. You come back a day later, and it saw those tabs you were on and suggests new tabs, for example, for events or activities in that space. It could even suggest booking something for you.

For me, I'll have different projects within my browser tabs, and it'd be great if something could just take a look at that and kind of read the room. You know, "I'm trying to book travel to this location. I need to do these four or five things that aren't crazy hard; I just need to do them." And I think when they showcased the fact that it could have memories and look at what you did and suggest new things—whether that's a news article for the day or planning your trip or doing whatever you do—it's really fascinating how they could take that data and do more with it. Because that's where I see the value coming from: having more information about you based on the sites that you use, hopefully in a privacy-first way, but also making it actually useful.

It kind of reminds me of advertising on Apple platforms, where a few years ago they had this massive shift. Before, each advertiser could kind of track you across apps. Now they have this popup that essentially says, "Would you allow this app to track you outside this current app?" And usually, users say no to that. But this kind of feels like that from ChatGPT Atlas, where you're browsing multiple sites, and ChatGPT is taking data across all these actions and creating a holistic picture. It's using pretty smart intelligence to figure out what you're doing.

I think Chrome could get there, but to me, it's like if I were to use Atlas, I would go all in. I'd bring all my Chrome stuff over; I would full-send on doing everything within Atlas. I think that experience would be great, but then we get to the second issue of security, where they actually have a pretty cool approach. So they have an agent mode, and for those who haven't seen it, agent mode is essentially it navigating your browser, clicking on things, and actually showing you. It has the browser pane on the left and the chat on the right.

They have two modes: one is logged in, and one is logged out. And as you can probably tell, you want to log into various sites and services on the web browser, and if you use a logged-in version, it's much more powerful. But with that power comes that security risk. So they actually tell you, "Hey, be very mindful about which mode you choose." Which I think is the first point: people have to be educated. A lot of people aren't that educated about security, so that's tough. And two, it's nice because they give you that logged-out option where you could say, "Go do something for me." I know that you're not going to need my data because I know the task at hand and what you'd have to do, and that gives me that kind of isolation or sandbox from having any security incidents.

So it's pretty interesting. I would love to give all my data to OpenAI if I trusted them. But then you end up in that spot where to make the browser extremely useful, it would need to be logged in most of the time. And if you're logged in most of the time, your security risk jumps to the moon. Currently, I'm not fully convinced that they're there yet to have an agentic browser model that's not going to do anything malicious.

Yeah, during the demo that they put out, they also mentioned something, and I didn't pick up on this, so I'm just curious if you understood what they were saying. They talked about logged-in versus logged-out mode, but then they also said if you need to, you can browse something incognito, I think is what they said. Do you remember when they talked about that at all? If not, we can move on.

I think that's more about not collecting data. But I think on a secondary note, they do have a "watch mode," which I think is for when ChatGPT realizes this is a sensitive website, for example, navigating to your bank. As much as I hope their agent wouldn't get tricked by being on a bank website, I still want to be able to monitor things. I believe watch mode was like, "Hey, it can interact with this site, but you have to actually look at it while it does things." And I think you either approve it or have the browser window open. Whereas oftentimes for agent mode, you can ask it to do something and go work somewhere else—you could be in a different tab, a different application, have it minimized—and it'll still go. But I think there was a watch mode that says if I'm on a sensitive site and I'm logged in, you must watch what I do in case I screw up.

I think part of their demo of doing a shopping cart transaction, from recipe to Instacart, was to actually get you to that final state where you had to confirm checkout. I think that was pretty important because when you have these important actions, you need to bring the human back in to say, "Now it's your time to review it." Like, "Here is a giant button to check out, but we're not going to hit that because we want you to make sure that we did the right thing." So it does 90% of the job that's boring, and that last 10% is maybe user modification.

But I think the agent mode is good. I'm just worried because when we talk about MCP servers, there's a risk in adding those, but you're manually choosing and installing them, so there's a lot of friction. When you navigate throughout the web, you are opening up so much possibility for not just the site that you're interacting with to be hacked. Like, someone could hack paypal.com, for example. It could be hacked. They could put malicious JavaScript that's like, "Hey, do this," and ChatGPT could follow it. So you could be going to the legit site, but it could just be hacked for five minutes, and you go there at the wrong time. And that still exists for the normal browser, but it's easier to trick an agent than it is to trick a human, in my book. And I think we're getting to a point where you can put a lot of instructions in various places to influence the agent. So they have to be really smart about how they parse and process and how they do what OpenAI calls "red teaming." They have these malicious actors try to break down this agent as much as they can, and then they fix those issues and move on. But again, that's a cat-and-mouse game, so they can't get too far ahead.

Yeah, no, the security element is definitely important, and it's good that it looks like they're thinking about it as they release this product. They made sure to mention it, and they have this logged-in versus logged-out mode. But it reminds me of when—I think Meta and OpenAI had this—there was a point in time where I think on Meta you could see other people's chats and what they were chatting with the AI. And so you could see, you know, Janet Jones or whatever asking the AI some personal thing that you would only discuss with a therapist or, I guess, an AI if you thought no one was listening. And then I think the same thing happened for ChatGPT chats where you could share them publicly, like the chat history. But then it turned out that people were just exposing too much personal information, and I think they turned that off. Do you remember that at all? I can't remember exactly when that was.

I remember the ChatGPT one where people were sharing things and it was being indexed by Google, so people's chats would show up if they shared them, but they didn't realize that they were being indexed by Google. So I think they turned off that part of it and made a more explicit UI saying, "Hey, you're sharing this." So yeah, it definitely was a period of confusion.

And I think that's like, we've talked about it before, where people are so eager because the products are honestly so good and so useful. So you're eager to jump in and use them, but it's almost like the appreciation and understanding of the security elements haven't quite caught up for everybody. There will probably be a period where some people get burned pretty badly, and then people will know across the board, "Okay, that can happen." But last time we were recording, one of the predictions that we had for the rest of the year was that there's going to be some big security breach with one of these models that would be in the news. I do think we will get something big like that. Someone will get tricked, an agent will get tricked into exposing a bunch of information, and then someone's stock price is going to pay the price for it, basically.

Yeah, and if you think about a browser change, I don't know about you, but I have not changed browsers in many, many years. So when you look at Chrome versus ChatGPT Atlas, they have to really deliver on AI across the board and really convince me that that's the way forward. You know, we love AI, I love AI, I use it all the time, but at its current state, my read on it is that it's very early. I think they'll spend a lot of time refining the product experience. And what I want to do in the back of my head is think, if I were using Atlas over the next few weeks, could I have automated this in a way that this agentic browser could have done things for me?

It's kind of how I look at some of the agentic coding tools that I use: could I do this manually, or could I do this with a CLI tool that can write code faster than me? And right now with Atlas, I asked it a few things like, "What schools did the Breakeven Brothers podcast hosts go to?" And the weird quirk is sometimes it actually uses ChatGPT entirely without the browser when I ask it to do something. Other times it would actually navigate throughout the browser, so it's a little confusing. As someone who is technical, I wasn't sure where it was going to go and why, and I still couldn't really figure that out. I think you have to open a new tab and direct it immediately through that versus their starter tab, which is just open. But either way, I really need to be convinced on this browser, and I think they're going to make it a lot more compelling. Even during the live stream, I think the memories part was really big for me.

But yeah, changing a browser... I really use Chrome for development, and I want to think, could Atlas do something for me that I've been doing manually? I think there are tasks that stick out, like these obvious live-stream-example tasks, but outside of that, I do want to think, could this have saved me 15 minutes? Because prompting takes a lot of skill, too, and I don't want ambiguity to pop up. So do I type for two minutes about the task, or do I just click around for five minutes and get the job done? Something to think about.

Yeah. I'm curious too, there's a whole ecosystem of Google Chrome extensions because people add functionality to their browser that they need. I wonder with something like Atlas, will that kind of thing exist, or will it be that the agent can just do everything that you need it to? Because I'm just trying to think of all the different product releases they've had. They had the app SDK release, and then within that is the agent kit, and within that is the embeddable chat pane that you can put on your web application.

So I'm thinking if you are making a web app in 2026 and people are in Atlas, on your website, would you still want that chat kit, or would you just use the browser pane? Do they kind of cannibalize each other in terms of functionality, or is the chat kit just specific to your web app and the chat pane is a separate, more generic thing? I don't know, I'm just thinking out loud, but that's something they're doing a lot right now. And I think it's good. People were saying the browser is really not that cool. But I think it's cool in the sense that it's an all-in-one shop. It's a one-stop-shop where you can get web browsing, AI, chat, and ChatGPT. But that part I'm curious about: how would that look when people are actually building web apps, and you're on, say, my expenses app in the Atlas browser? Does it change at all, or is it the exact same?

Maybe it doesn't change anything. Yeah, and they don't have extensions. Again, they're missing quite a few things like translations, extensions—some pretty core browser features. But I think if they did ship that, it wouldn't be too late; it would just take longer. And so I think if we step back and look at OpenAI's releases for the past three to six months, they've done a whole bunch of stuff. In my head, I'm like, good for them for getting there. I think their internal coding models, like Codex and GPT-5, really power that. They've done a great job.

But I am slightly concerned that they're going to have too many things and too many people and internal chaos that will slow them down. Because as a small company, you're working on fewer things; you know everybody. As a company that's ballooning, where every engineer in the Bay Area wants to work there, there are a lot of people there doing a lot of cool stuff. Are they able to hone in, focus, drive value to their core products, and kill things that aren't working? Because we had—I forgot what they called it—but it was essentially like a ChatGPT marketplace where people would come in and say, "This is my secret sauce prompt, come chat with my character," or whatever.

The GPTs, right?

Yeah, the GPTs. I think that has maybe officially been killed off or is in the process of it. I thought I saw something recently about it, but either way, I think that's not really the focus anymore. And as we talk about them with this app SDK, they're really going after Apple, they're going after Google. Their vision is grand, and they're working on a hardware project that's coming out later next year, so it's kind of insane what they're aiming at. I really hope they do well, but it'll be interesting to see their focus over the next six months on which products take off. Because I've seen stuff on Twitter that's like, "Oh, I used Atlas for one hour, then I'm back to Chrome and I'm never looking back." Because again, the browser war is tough. To get someone to change a browser and give up all their data is really, really hard. So you know, hats off to them for releasing it. It's a tough challenge ahead, but if they can deliver innovative features, you know, I'll be there.

Yeah. The only thing I'll say on browsers, and we can probably move on, is that I view browsers as a utility. To me, I don't care that much. I don't use all the different features of Chrome. I go to a website, I log in, do what I need to do, and log out. Whether it's Safari, Internet Explorer, or Edge—that awful one—I don't care that much about it. So to me, you know, it's not like I'm beholden to Chrome. I could go to Atlas in two seconds. But I also just don't care that much. It's like an electricity provider; you don't care, as long as you get electricity. I don't really care what my browser is if it gets me to the right web page.

Yeah, maybe it's the developer in me just using the Chrome developer tools. They are top-notch. That's something that Google did really well.

Yeah, yeah. Well cool, okay. Atlas, nailed it.

Yeah. So, on to the next topic. I also went to Claude Code Anonymous in San Francisco two days ago. So we were potentially going to record the podcast about two days ago, and I was telling Ben, "Hold up, I'm going to this conference"—a meetup, whatever you want to call it. Essentially, it's put together by Peter. He puts together a meetup just to talk about Claude Code.

I thought the origin story, which he described, was kind of funny. He and his friends were using Claude so much that they would have bags under their eyes. And he was describing, "Oh, we're like addicted to Claude as a slot-machine coding tool, so we need a place to talk about it because we're all addicts." And so that's why it's Claude Code Anonymous. So to give some people some background, this was essentially like a three-hour meetup. You got food, you listened to talks, and then you chatted afterward.

My takeaway was, Simon Willison was there, Peter was there, and there was a guy who created Conductor, which is a macOS app that allows you to run multiple Claude instances. So there were a few people that I was like, okay, these are really intelligent speakers in the AI community, on AI Twitter. So those people were great, and then there was a handful of other folks who I hadn't really talked to or heard of but who also delivered good talks.

From the hour of talks that were allocated, two things stood out to me. One is writing a really good `claude.md` file. So when you use Claude, there's a `claude.md` file which essentially describes how your Claude agent should act in the codebase. One of the talks from the Conductor guy mentioned walking through his 700-line `claude.md` file and describing his entire project, how to write things, code style, and structure. To me, it's not rocket science; it wasn't really new knowledge, but the way he had encapsulated and written it was like an extremely well-written technical document.

And one of the funny parts that he had written was, "Don't ever do these five things." Essentially, these were coding patterns that he had described as things to "never ever do." After that chunk in his description, he had mentioned, "If you want to do one of these things, please mention this comment: 'I really want to do a bad thing, would you let me do it?'" And so essentially, when he runs Conductor, which runs a bunch of Claude instances in parallel, he says every so often Claude will ask him, "Can I do a really bad thing?" because it's trying to do one of those five things. He always tells it no, but he finds it really entertaining to have Claude beckoning him to ask to do something that's not allowed on his list.

But long story short, I think I need to do a better job of writing a good `claude.md` doc because it's one of those things where, when you go inside your coding repo, there's a `/init` command with Claude that will look at your repo and auto-generate this file. I think it does a good job, but if you take a human and spend 30 minutes on it... like, how would I sit down next to someone with my codebase and get them on board most efficiently? Kind of like the intern we talked about, that is the best way to package it. So that was one big takeaway: how can we write a better `claude.md` file?

And the second one is using Claude on the web. So both Codex and Claude are now enabled on the web, which means you can open up the ChatGPT app or the Claude app, go to their code provider—which are Codex and Claude—and you can actually spawn them to do something in the background for you. So I've used these tools extensively on my Mac laptop, but I have not really used them much in the cloud. What the cloud does is it just clones your code, runs Claude in the cloud, and then makes a GitHub pull request back. A lot of people in this meetup were talking about how they use it, and I thought, oh damn, it's one of those things I've heard about but have never really used. So last night, I spent time setting up my Laravel app to be used on this sandbox container that OpenAI set up.

So I think those are the two big takeaways. And like all things, it moves really fast. It was great to meet a bunch of people. I met Evan Bacon from React Native, really nice guy. So yeah, I met a few great people and had some good food and talks. I think at the end of the day, this stuff is moving fast, and it was cool to be in a room with a lot of people who are extremely passionate about it. It was a nerdy crowd, but in a good way, where people were just excited to talk about AI. I think there are a lot of conferences that I go to, like iOS conferences or web conferences like for Laravel, where people are definitely nerdy, but it's a different energy. I think people are really, really excited to talk about AI, whereas at these other conferences, sometimes you're sponsored by your company and you just show up. I feel like everyone there was dying to be there, which was a cool vibe.

Yeah, I have to say as you were talking—I didn't want to interrupt you—but you're such an engineer, dude. You're talking about the MD file and you had your hands like this, you're like, "an extremely well-written technical document," and I was like, this dude is such an engineer. But that's cool. I think only in SF can that kind of thing probably get that kind of crowd, which is cool. I mean, there's such a superpower to being in a room with a bunch of smart people. You know, I couldn't live in SF; that's just not for me and my family. But there is something to be said about being in a room full of people that are all really excited and smart, as a matter of fact. You just get so much, like, secondhand-smoke energy from that. I'm sure you left that event, and even if you feel like you learned these things and that's going to be helpful for you, it also probably just challenged you to think about things a bit differently as well. Even besides the specific takeaways that you have, if you continue to be in those crowds and learn from those kinds of situations, it's just going to make you smarter, too, with how you use these AI tools.

Yeah, definitely the right crowd to soak up info. Every time I talk to people out there in real life, I'm always like, I recognize people from their Twitter avatars. I'm like, "Could you show me yours?" And then I see them, and I'm like, "Oh." And I also have the reverse, where someone will say, "Oh, that's this person," and I'm like, "Oh, I had no idea that was that person," because they got a haircut, they shaved, whatever. And it's extremely hard to identify people. So my advice for all these meetups and conferences, at least for me, is like, put a giant Twitter image or something on your badge. Because when we walked in, it was, "Write your name and write your favorite coding tool." And I think most of the room was Claude because it's a Claude meetup, but I think Codex is picking up some steam. But yeah, it would be much nicer to have a QR code to open something and an image from a social profile. Because I think most people were on tech Twitter, AI Twitter in that area. And so it's much easier to be like, "Oh, I know you because I've seen your profile picture for like a year and a half now." So maybe just some feedback for next time.

Yeah, no, for sure. Cool, awesome. Well, some good stuff, Brad. I think we can probably wrap it up there and jump into our bookmarks. Let's do it.

Yeah, do it.

I'll go first, but I have to give you credit because you sent this to me. It ties into what we were talking about just a second ago and what we've talked about before on the pod. The article that I have is from TechRadar, and it's about Meta banning rival AI chatbots from its WhatsApp. So let me give credit where credit is due. Brad, you sent me the tweet from Tanish Matthew Abraham. He retweeted an OpenAI tweet, okay? The OpenAI tweet announced that Meta changed its policies so ChatGPT won't work, and then I went and googled it and found the TechRadar article. That was a long way to get to what I was trying to say.

Anyways, third-party AI agents are out of Meta's app, and of course, Meta owns WhatsApp. Going back to the "walling off" that we talked about before, I think there are more dominoes to fall, where ChatGPT is going to be like, "Hey, you can't use Google Drive connectors." Or Gemini is going to say, "Hey, you can't use this model," or whatever. So that was interesting, and I think there's more to come on that kind of walling off.

Yeah, that'll be a big day. My bookmark is on Cursor's plan mode. So Cursor recently released a plan mode. Claude has a plan mode, now Cursor has it. It does almost the exact same thing where you type in a prompt, it then goes and searches the code up front, finds all the information to write good code, and then starts writing the code. I think there's been a lot of feedback for Claude recently where you ask it to do something and it just jumps right in. So plan mode is almost required to hold the model back, get all the world knowledge, and then start executing. I was happy to see that Cursor added one in their latest release, so I'll share the bookmark. But it's an awesome release. I've used their plan mode on Cursor quite a few times already, and it's much, much better than just writing stuff directly. So if you're not using plan mode, definitely give it a try. It takes more time, but you'll get better quality output, which I think everyone would consider a huge win.

Yeah, I saw that announcement. I haven't used plan mode yet, but it seems like one of those things where it's like, of course you'd want to plan it. You know, when you actually write code, you don't just wing it usually. I mean, you iterate, but you have this idea of like, okay, I'm going to start my Laravel app, I'm going to get the model set up first, and then you kind of have your own plan. So it makes sense from an AI builder perspective that you need to give it some general path, or it needs to come up with some general path, and then you can proceed from there. So yeah, I want to try it; I just haven't tried it yet.

Yeah, it's one of those things that I think people were working around by doing it manually, and there are even people that I've worked with who do that. And I thought, that sounds like a good idea, just a lot of effort. Now it's built in so that you just ask something and it makes a plan for you. It's like 10 times better because you wanted it, but it was just too much effort. So I'm glad they added that, and I think they're shipping a bunch of stuff recently, so it's pretty exciting. I think Cursor has won the AI IDE war and they're here to stay, whereas I think Winsor is trying to play catch-up. So maybe in the next few months, we'll see how they shape up.

It's a tough world out there.

It is, tons of competition.

All right, well good stuff. Happy to be back, Brad, and yeah, we'll get this one posted. I'll catch you next time.

Cool, sounds good. See ya.

See ya.

[MUSIC]

All views and opinions by Bradley and Bennett are solely their own and unaffiliated with any external parties.

[JINGLE]

ChatGPT Atlas deep dive: The browser that knows too much?
Broadcast by