Are AI agents your next co-workers?
I'll be orchestrating the AI tools. I'll be a better prompt engineer. I'll be hooking up these tools internally. Like, there's so much in my day-to-day that I could automate and make better. And I wouldn't do that going forward without AI. It's just like, if I have a tool for the job that makes me faster, why not use the tool?

Hello, and welcome to the Breakeven Brothers podcast, a podcast by two brothers talking all about how to leverage the latest and most advanced AI tools to supercharge your productivity and keep you on the cutting edge of your profession. If you are looking for real, practical knowledge about the latest AI capabilities that you can apply in your day-to-day job, then get ready and tune in.

Welcome back, everybody, to episode 17 of the Breakeven Brothers podcast. Bradley, how's everything going? Super. Absolutely great. Monday. Can't complain. How about you? Cool. Yeah, same. Yeah, had a good weekend. No daylight savings in Arizona, so I was unfazed by the time change while it sounds like everyone else had a hard time with it. I did think I had slept in a lot, and then I realized, oh, okay, it's not just me, it's all my clocks. Yeah, yeah. It's weird. For some reason, my computer, though, I thought I had it on the automatic time zone setting, but it must have been on California, because it was telling me I was an hour ahead or something. I had to go back and fix it. But I don't know if you knew this, but I did see this on LinkedIn recently: the PST versus PDT thing, which is Pacific Standard Time and Pacific Daylight Time. I've always just used PST; I knew there's PT and PST, and I'd seen PDT, but I never really spent any effort on the difference. And then someone posted on LinkedIn, like, you know, with daylight savings this year, here's the difference. And if I remember correctly, now we're in PDT and we were in PST, but do you know if that's right? I don't know, but I'll take useless information for 500, Alex. Yeah. Yeah, that's what I thought when I saw it. I was like, oh, okay, I'm going to forget this. And, yeah, here I am, I forgot it. So, yeah. The one I always get confused on is whenever something is expressed in UTC, I'm like, I don't know what time that actually is. Oh, yeah, that's coding stuff. Yeah, yeah, right. It's like a coding thing. It's like, oh, 17 UTC, and I'm like, what does that even mean? I've got to break out my calculator to figure that out. Coordinated Universal Time, I think. It's zero offset, where PDT is minus seven hours (and PST is minus eight), so UTC is plus zero, which makes things easy. So is there somewhere on Earth that is... Yeah, it's Greenwich time. Greenwich. Greenwich Mean Time. GMT, I think. I think GMT and UTC are basically equivalent. Interesting. We got off on a tangent. Yes. UTC is equivalent to GMT for everyday purposes; historically, astronomical GMT was measured from midday, where UTC is measured from midnight. Yeah, confusing as heck, but I do know
small random time zone facts from fighting time zones as a coder, because that sucks. Yeah, well, that's actually an excellent segue, because maybe not even AI can really explain time zones very well.

What we wanted to talk about, and maybe shift our focus on this podcast a little bit toward, is that there are so many new developments in AI that are really shifting how people work, especially people like myself, who's in accounting, Brad, who's a software engineer, people that are working with their skills and with their knowledge, not with physical labor with their hands. And I think AI has really come in and changed how that looks, pretty dramatically and pretty quickly. And I know for me personally, it can be overwhelming, all the new news: if you're not using this, you're behind, and if you're not on this new model, you're behind, right? And so we thought it would be a great opportunity for this show to kind of narrow its focus on what you need to know in this kind of new AI world as a coder or as an accountant, but even just beyond those two specific roles, more as a knowledge worker. What do I need to know? What should I be looking out for? And hopefully out of this you can take some little nuggets and just think differently about some of the tasks that you're doing or how you add value to your org, because that's going to change pretty significantly, I think, over the next, I mean, it already has. And it's just going to continue to change over the next six months, 12 months, 18 months.

I was thinking, we really need to coin a phrase for AI thinking. There's been vibe coding. There have been so many different AI terms that have popped up recently, probably in the past few months. Maybe on the podcast, not this episode, I won't put the pressure on us in real time, but maybe on a future episode, we can have a term for AI thinking. Like, reshifting your thinking to be AI first, and we can coin that as some phrase. And that could be ours to own, because vibe coding, I just love that term. And there are other various terms for AI stuff where I'm just thinking, whoever thought of that nailed it. And now is the right time for everyone to kind of stake their ground in AI terminology. So yeah, if you have any good suggestions for AI thinking and you're listening to the pod, please reach out. We'll think on something.

But I think that's the focus going forward. I found, as we looked for topics to talk about, Ben and I always kind of leaned on AI tooling. It just felt like it kept coming up and had a higher and higher focus. And it's really enjoyable. And it's permeating throughout the whole workplace. So we thought, why not spend a little bit more time, focus, and energy here to up-level everybody with AI. And like I said in past podcast episodes, AI is really darn good today. It's all going to get better. The opportunities are endless. And it's our job to keep you informed and tell you what you need to know, so you're not bewildered by what is out there today and how you can apply it to your domain. So that's what we're looking to do going forward. Really excited about it. And I think we have a lot coming up in our content schedule. So get ready. Absolutely.

Cool. Well, in that new frame and that new kind of focus, I think what we wanted to talk about today was agents. And so, you know, there's a lot there.
I think first it came out with ChatGPT and, you know, just kind of better thinking and creative products. Like, if you wanted to change an email or write something a little better, that was kind of the first widespread iteration, or at least when I came into ChatGPT and AI. But now there's agents. And so I guess, Brad, you are so close to this stuff. What should people know about agents, or what are agents as a concept? Maybe we just start there and kind of work our way down.

Yeah, I see agents as something that's autonomous. So it's using AI, but when you're using chat, the original chat interface, for example ChatGPT, you provide a context and that gives you one answer. I think the difference with agents is that you kind of insert an AI operator within the context that you work in. So, for example, for coding agents, you can take an AI model that can read your code, that can write to a code file, that can run terminal commands on your behalf. And so I think the difference between the chat interface for AI and agents is that an agent is brought to your domain, given the tools that you use, and given the context that you would see on your screen or context that you would have in your file system and your organization, anything to make an AI feel closer to a human. So I think when you think of the agent approach, you think of something that can self-execute, make progress, make a plan, and execute on all these subtasks, for example. So for coding, a lot of the time for AI agents, you'll say, hey, I want this new feature, this new feature has these three requirements, go. And a lot of times, for example with one popular agent, Claude Code, it will say, okay, I want to create something new, I'm going to go see what exists today, see how I can meet the requirements, and then basically create an internal plan for itself and iteratively make progress on a larger goal. So I think when people think about agents, it can come in a lot of different flavors. I think the general consensus is that agents use the latest AI models, but in a different realm where it's more inside your context, with more human-like access, where a lot of the good agents look at what humans do and try to distill that into a list of tools so that an AI could almost act as a replacement human being, whether that's a software engineer or an accountant. Giving it the tools that a software engineer would use or an accountant would use makes the AI model much more powerful. So that's kind of the general concept. But yeah, it exists in a lot of different flavors and domains. And some are really good, some are very basic. There are different tools that can become more complicated than others. It really depends on your organization and your workflow, what works out for you. But yeah, I would say that's more or less the high level.

Yeah. And it seems super cool, because it's almost like people baby-stepped naturally into this, where at first you'd use AI to augment something that you were doing. Like, if you had to write an email, you would manually go and say, hey, can you change this email to make it sound more coherent, or whatever, you copy-paste it in, then you copy-paste it back into your email. And it was still just a manual step you would do.
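To make the agent idea Brad just described a little more concrete, here is a minimal sketch of the pattern: a model, a small set of tools, and a loop that keeps going until the goal is met. This is not how Cursor or Claude Code are actually implemented, just an illustration; the call_llm function is a placeholder for whichever model API you would really use, and the two tools are stand-ins for the kind of file and terminal access a coding agent gets.

```python
# Minimal agent loop: the model proposes a tool call, we run it, feed the
# result back as an observation, and repeat until the model says it's done.
import json
import subprocess
from pathlib import Path

def read_file(path: str) -> str:
    """Tool: return the contents of a file in the project."""
    return Path(path).read_text()

def run_command(cmd: str) -> str:
    """Tool: run a shell command and return its combined output (use with care)."""
    result = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    return result.stdout + result.stderr

TOOLS = {"read_file": read_file, "run_command": run_command}

def call_llm(messages: list[dict]) -> dict:
    """Placeholder: send the conversation to your model of choice and get back
    either {"tool": name, "args": {...}} or {"done": final_answer}."""
    raise NotImplementedError("wire up your preferred chat model API here")

def run_agent(goal: str, max_steps: int = 10) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_steps):
        decision = call_llm(messages)
        if "done" in decision:
            return decision["done"]
        # Execute the requested tool and append the observation so the next
        # model call can react to what actually happened.
        observation = TOOLS[decision["tool"]](**decision["args"])
        messages.append({"role": "assistant", "content": json.dumps(decision)})
        messages.append({"role": "user", "content": f"Tool output:\n{observation}"})
    return "Ran out of steps before finishing."
```

Everything agent-flavored in this episode, plan, act, observe, repeat, is some more sophisticated version of that loop.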
And now, in my experience with agents, we'll talk about some specific use cases, but you kind of give it a set of instructions, and then you can see it in real time do these things and take it beyond that first direction. You know, before, you'd say, hey, can you create me an Excel file that has this data filled out, or whatever, and it would do it and it would stop there. But now you can give it a much more complicated context. And we'll talk about some specific projects that have been making the rounds on Twitter and causing a lot of hype right now. But you can do a lot. And I think they call it one-shotting, right? Where you just give it a set of instructions for some really complicated task, and then you hit submit on the prompt, and then it just gets it from start to finish, all the way through.

One area that I was kind of tinkering with before we got on here, and earlier today, was I was making a web application, and it was in Django, no surprise. But, you know, I was trying to have it work through this error message I got, and you could see it test itself and then say, this still didn't work, we need to change this over here. And I hadn't touched my keyboard in like five minutes, and it did all these changes, got it to work, and then, boom, it was done. It had solved that issue it had run into on its own. There are still other issues; I think that's just the nature of making a project or an application, there are so many different moving pieces, and it's still not a perfect product yet. But it's astounding how much it can work through on its own and just go start to finish on something.

Yeah. I think the real unlock here is that when you're within the chat interface, you only get so much done. When you ask an agent to do something, you give them the goal, and they have this autonomous mechanism that lets them keep making progress on that goal. And then if you zoom out of it, you can imagine a company started in 2025, 2026, having an army of these AI agents. And so maybe you start a startup in Silicon Valley and you're building the next, I don't know, cloud storage company. There are tons of them. But instead of hiring five super talented engineers, you're hiring 10 Claude AI agents that are just chugging through, making code. And so you can really see the promise: if these things work well, then you can scale them. Scaling people is really hard. People only have, you know, a single-threaded model. They can work on one thing. Context switching is hard. AI agents, on the other hand, can get carved off a certain area of the project and go work on that. And then you can kind of clone them across various areas and surfaces. This could be coding, we could have 10 coding agents. This could be accounting, it could be crunching your financials. Literally, it could be anything, as a template and a profile, and then you can scale that to an infinite degree. And of course, we're not there yet, but the progress we're seeing in the past few months has been really, really good. And I, for one, have been trying to use it as much as I can. On nights and weekends, I've been plugging away on Split My Expenses. And it's a little bit different. I've been using Cursor, which we've talked about tons on this podcast, and Cursor does a good job.
I think Cursor is great in the fact that you can provide a prompt, and it has all the tools to go read your files, modify the files, think about the changes that it's making, and make multi-file changes. Whereas if you take that multi-file change and bring it back to the chat world, it's really hard to do. You have to go copy your code files, bring them into the chat, say, can you modify this file and this file, and it just doesn't make a whole lot of sense. So it was a natural evolution to move from chat to agent. And then I think the next natural evolution we'll see is from agent to agents, where, hey, I'll have an agent that writes the code, I'll have an agent that does SEO marketing, I'll have an agent that's writing a blog post. These things might interact with each other, they might talk to each other, they might not. But I think at the end of the day, people see it almost like an automated employee, and I think that's the win. And right now we're controlling one, and that's doing pretty well. In the future, you can imagine controlling five and just letting them run amok on your code base. And hopefully they're good enough that they're better than a senior staff software engineer at that point. But we're not there yet. And, yeah, we've seen a lot on Twitter, we've been sending stuff back and forth to each other, where people are building some really cool things. So walk us through something you've seen on Twitter recently.

Yeah. So for folks that maybe aren't on Twitter as much as Brad and I, first of all, lucky you. But second of all, what's been making the rounds has been full-blown video games that are being built, I think, in either Cursor or a newer one, I think it's called Windsurf. But basically people are saying that they are one-shotting these, you know, again, just putting in a prompt and asking Cursor to make these games, and it makes them. The one that I think got really famous and kind of went viral was, and I'm just giving the title that I saw, a realistic dogfighting airplane simulator. And if you look at it, I'm not sure where the "realistic" comes from, maybe that's the physics or something like that, because it still looks very rudimentary, you know. But that's not the point. The point is that you give it these simple instructions, or you give it these instructions, and it can make this thing that works. Which, for someone like myself who's not a technical person, to be able to do that is pretty amazing, because there's so much that goes into just making a simple Python script that reads Excel files and does analysis. Now we're talking about video game assets that move in a 3D space and have hit mechanics. It's pretty incredible that it can do all that. And so we've been seeing, I think there was a sailing boat game, there was, again, the airplane fighting game, and a soccer game, too, I think there was one. But that one, I think, was a fake one.

It's funny that you bring that up. I had seen that actually this morning, and I was dying. Because, yeah, like Ben said, there have been lots of people posting their AI games, and usually they either don't have the best graphics or have pretty rudimentary graphics. That's how you can tell, hey, it's, you know, some AI game. And this morning, this guy posted a troll tweet.
I can't remember the username, but he mentioned, hey, I built this in two hours, vibe coding on Cursor, which, if you go on Twitter or X and search vibe coding, you're going to see a ton of stuff and people just explosively coding with these AI agents. And he posted a video of, like, FIFA 24, some modern FIFA game, where he's like, oh, I built this in two hours using these coding frameworks. And I actually watched the video, because for the first five seconds, I was like, damn, this is really good. And then I realized, I hadn't played FIFA in a few years, but, you know, you recognize the game when you see it. And I thought, oh, okay, this is a joke, clearly. This guy didn't build FIFA. And he even tagged, I think, ElevenLabs, which is the AI voice company, and he said, I built this in two hours, the announcers are with ElevenLabs. And I think the ElevenLabs Twitter account, the company account, commented on it and played along, just as a joke, because they also realized it wasn't true.

But yeah, I think there have been lots of pretty impressive games that have come out where, again, previously, if we think of the chat interface, it was really hard to make that work, because when you're chatting with AI, you have a context window. We've talked about this a lot, but to reiterate, you only have so many words that the AI can look at and effectively work with. So if you're talking to your coworker to get them up to speed on a new project, the more information you tell them, the better. But if you tell them too much, they're going to be overwhelmed. The same thing exists for AI: if you tell it too much, and then you tell it to modify a needle in that haystack that you walked it through, good luck getting there. And the difference with Cursor and AI agents is that they're able to create their own context window with only the code that they need to look at. So, for example, if you're building a game engine, there are image assets, there are character movements, there's communication like chat protocols. Each of these almost exists as a sub-feature within the game. As you're building out the game, when you're working with Cursor and you want to work on only the chat feature, Cursor is intelligent enough to build the context window, or the conversation, only around that code and that code only, because AI is more effective with a smaller context window, or prompt length, and that prompt is intelligently created. And so imagine Cursor is able to work on this feature, find the right code that it needs to look at, modify the right code that it needs to look at, and then the loop just keeps on going and going. And so I think the big unlock with Cursor and other AI agents is that they're able to do what you would do as a human, like we talked about. But the main underpinning technology is that it's able to craft the right prompt. And the reason it's able to craft the right prompt is because it's able to do metadata analysis, or almost human-level searching, across your code, as if you asked your coworker software engineer to go make a change in this area of the code base. They'd probably do the same thing: they wouldn't look at all the code, they'd look at the code related to that feature and load only that into their working memory. That's how Cursor pulls it off. And I think once people realize that it's possible, it's off to the races, and they're doing a bunch of cool stuff. I think this is only the beginning.
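Here's a rough sketch of that "only pull in the relevant code" step. Real tools like Cursor use embeddings and code-aware indexing rather than anything this crude; this toy version just scores files by keyword overlap with the request and packs the best matches into a character budget, but the shape of the idea, build a small focused prompt instead of dumping the whole repo, is the same.

```python
# Toy version of building a focused context window: score project files
# against the request, then pack the highest-scoring ones into a budget.
from pathlib import Path

def score(request: str, text: str) -> int:
    """Count how many distinctive words from the request appear in the file."""
    words = {w.lower() for w in request.split() if len(w) > 3}
    lowered = text.lower()
    return sum(1 for w in words if w in lowered)

def build_context(request: str, project_dir: str, char_budget: int = 12_000) -> str:
    files = [p for p in Path(project_dir).rglob("*.py") if p.is_file()]
    ranked = sorted(
        files,
        key=lambda p: score(request, p.read_text(errors="ignore")),
        reverse=True,
    )
    chunks, used = [], 0
    for path in ranked:
        text = path.read_text(errors="ignore")
        if used + len(text) > char_budget:
            continue  # skip files that would blow the budget
        chunks.append(f"# file: {path}\n{text}")
        used += len(text)
    return "\n\n".join(chunks)

# The returned string becomes the code portion of the prompt,
# alongside the user's request itself.
```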
I say that on every episode, but really, I'm excited to see the new AI tooling come out that supports agents in ways that we're just not even ready for yet.

Yeah. Well, I kind of liken it to, I'm not sure if you've had this experience in your working world, but when you offload a task, either to your team member, someone who works under you, or to an offshore team, it just unlocks a bunch of time for you. I've had those moments before, you know, when you have really strong team members, or again, you just offload it to an offshore team. Offshoring is pretty common in accounting. And when that happens and it goes well, you feel the instant uplift in your productivity, because you're not doing this task that was just taking up time and wasn't the most value-add. You're not going to offshore something that's really high value; that's what you want to work on. You're going to offshore or delegate things that are lower in value, so you can take on and prioritize things that are higher value. And I think with any job, I don't care what level you're at, there are certain things that you're doing in your day-to-day that can just be done by an agent now, or can just be automated, that don't require your direct involvement. And so I think that's going to be probably the next hurdle: how can agents be integrated into large companies so they can really get the benefit of this? Because obviously there comes a whole slew of security concerns, and how to check them and make sure they're doing the right thing.

One funny thing I was going to mention, and I forgot, but I'm glad I'm coming back to it, is on that soccer game troll tweet. We'll find the tweet, because he should get credit for it, and we'll put it in the show notes. But, you know, he had said, oh, it was built with Grok, and he had tagged Grok, like, see, it's legit. Because Grok on Twitter, which obviously is a bot, came in and said, oh, yeah, he did build this with Grok. I'm not sure if you saw that. Yeah. Yeah. Grok is Twitter's AI model, so kind of like ChatGPT, for those unfamiliar. Yeah. And so he tagged Grok and said, I built this with Grok, and Grok was like, yeah, he did. And then someone replied on that tweet and said, show me your sources. And it was like, well, my source is that he said he built it with Grok. And it was like, well, that doesn't mean anything. So obviously it's a silly example, but the point being, you know, especially in software engineering too, but speaking purely from my world, we have to be exact in accounting. It's dollars and cents. We can't have hallucinations like that, like Grok just stating something that's wrong. And so whatever company, whatever accounting department, in my case, can get that right, where we can move work over to agents without sacrificing controllership, once we can do that, it's golden, because there's so much work like that. And people shouldn't think of that as a scary thing, like, oh, my job is going to go.
Like, no, you're just going to get unlocked to be able to do way more of the high-value stuff that you went to school for and learned and got all that experience with. You didn't do all that to roll forward an Excel file every month and fix the basic formatting. That should be done by anybody but you, you know.

Yeah. And when you're using some of these tools, you have vibe debugging. I don't know if you've seen that as a term that comes up, but yeah, it's one that maybe appeared in the past few days. You spend so much time on the vibe coding cycle that AI has written a ton of code, and now things just start to break. Again, it's not perfect. When you have an error and you've only spoken in natural language text to the model, you don't really know, or you might not have the skills to figure out, where that error is. And so then you start feeding the error in. Maybe it works, maybe it doesn't. But again, you end up in this situation where sometimes you almost create too much and you don't know how to manage it. And that comes back to the craft of software engineering, where, yeah, you might be able to build a new project from the ground up at decently fast velocity and have something that's pretty good, but once things start to break down, then it becomes, how do I actually fix these issues if Cursor can't figure it out? Where do I even begin to understand why my game is not allowing other players to join, or my chat messages aren't going through? I have no idea what to look at, because Cursor only exists within a code editor. It might not be able to open up Google Chrome and take a look at console logs or any other developer tools to get more insight into why things are broken.

So I don't know if you've seen another kind of hype cycle, but to add to the agents, there is a thing called MCP, which is Model Context Protocol. So, yeah, AI Twitter has been booming recently, and it's half vibe coding, half MCP. And what MCP essentially is, it's a way for you to define a way for an AI agent to use a tool. So, for example, if I was doing my end-of-year taxes, I could define an MCP server. An MCP server is essentially encapsulating a tool. And that MCP server could potentially log into Vanguard, or log into my other financial accounts, and pull specific tax forms. So one, how do you log into Vanguard? Two, how do you navigate their UI? Three, how do you click download PDF? That could all be encapsulated within an MCP server. And these MCP servers, there could be multiple, can be integrated into the Claude Desktop app. So I know that was a lot. Just to recap: open up Claude Desktop, there's an MCP server, it's very technical right now, unfortunately. People are building app stores for MCP, but essentially it's a way for the AI agent to do something, and that something is whatever your heart imagines. It could be, again, downloading your tax forms. It could be migrating your MySQL database. Literally whatever you want, because it's all managed through this interface, which is the MCP server. And then when you're in Claude Desktop and you've added these MCP servers, for example one that downloads your tax documents, you could say, hey, I'm doing my taxes for the 2024 year, I have a Vanguard login, could you please log into Vanguard and pull my tax forms? So you describe what you want in natural language, and the AI is able to look at the list of available tools.
These tools are represented by MCP servers. And it's able to invoke a tool and say, I have access to Vanguard, and I could go fetch these forms. I don't know how it's done, but my tool can do it, so I'm going to go tell the tool to do it, and I'm going to wait for it to come back. And so that's the whole interface to essentially make AI capable. And this is exactly how you can push Cursor, Claude, any of these agentic and smart models, to be even better, because you just provide the tooling and the context. So if you find the right domain, figure out what that person would do with their various tools, create an MCP server for each tool, in theory, and then chat to get a task done. Hopefully it works, but it's a little rough right now.

Yeah. Well, I think that's a good point, because one thing, and this has pretty much been true ever since AI really came on the scene, but it's getting even more true, is prompting. You have to be able to prompt it really well and give it instructions. I mean, everyone knows that it'll struggle if you don't give it really clear instructions on what you want it to do. And it's gotten better over time as the models have improved. But if there are gaps, it will either try to fill those gaps, which can be dangerous, especially in the world of software if you're building a critical app, or if you're in the accounting department where you work, or it will just kind of stop and say, you didn't give me enough insight, and ask a follow-up question, which is really good, because that's what I think you want it to do. So the point being, the more specific you are, the better results you'll get. Within reason, you know; if you're writing out every single detail, then you're probably not even really reaping the benefit of using it. But the more specific the instructions you can give it, the better it will do. And honestly, too, just from a professional development standpoint, you don't really know how to instruct something unless you know it yourself. So it's one of those things where, even for me, I'm not a video game developer, so there'd probably be things I'd struggle with in just writing a prompt to make a video game if it didn't ask me clarifying questions, right? And so you still need, and again, this goes back to the whole job security thing, you still need your expertise to be able to convey to the AI agent what you need it to do. And I think that's still something that is super important. And as you use these tools more, you'll get better at it. You might go, oh, I'm not actually that clear. I think I'm clear, because that's how I think about it, but the AI has given me back questions like, well, what do you think about this, and how should I handle this? Then you kind of go, okay, maybe I wasn't that clear, actually. And that just improves over time, the more and more you do it. And I think that's just more of a call to
action: just try these tools, get out there and try them and stumble around a little bit, and as you do it more and more and get more success, you kind of learn what works. You know, I think some of the prompts that I was tinkering with recently were three or four paragraphs, because it's just like, boom, I need this, then I need this, then I need this. And even then it still asked me six or seven clarifying questions, you know.

Yeah, you end up being almost a manager. Yeah, you really have to define the context of the problem, give it a head start on where you're looking for it to attack. Especially in the coding domain, you want to have it look at the minimum files possible and just give it a starting point. Because I think you need to approach it as if you're talking to a human, and again, you're onboarding them onto a system or a task. How much would you tell them? Specifically, how much would you not tell them, and what would they need to know so they don't run into issues in the future, so that the AI has enough information to get going? And I think this means just as much for agents as it does for chat, because chat wasn't able, or at least previously wasn't able, to go act and use tools. Now, with the agentic workflow, it has opportunities to go get more data based on its own reasoning, where it thinks it needs to do something. And I think, again, the better your prompt is, the better your results will be. But prompting can feel, and be, very hard, in terms of understanding how a certain area of the prompt modifies the output. There's tons of research right now on how to actually understand how what you typed in maps to this output and why it took this decision. It's a little bit like peeling back the hood of LLMs and understanding, if a bad actor or somebody else were to use this incorrectly, how could we stop that? So it's a little bit on the safety and security side, which is why they're doing this research. But again, it's almost like a black box at times, where I ask it to code me a video game, I have no clue how it was able to pull that knowledge, but it came out and it's doing it. But should I know how it came up with that? Maybe. I mean, one could argue it's probably pretty important, but other people who are just trying to put a product out there couldn't care less. So it really depends on what you're looking at.

But I agree with Ben's take about getting close to the tools. I've heard lots of people around me say, oh, it's going to replace software engineering, it's the end of software engineers. And a few months ago, I thought, that's total BS, you know, no chance. And over the past few weeks, I've thought, things are going to look different. I'm not going to be replaced, because I think I have lots of domain expertise. It's not that things are going to become dramatically simple, but I think it's going to evolve in a way where I'll be orchestrating the AI tools, I'll be a better prompt engineer, I'll be hooking up these tools internally. There's so much in my day-to-day, 9-to-5 and also outside 9-to-5 work, that I could automate and make better. And I wouldn't do that going forward without AI. And I wouldn't even call it reliance on AI. It's just, if I have a tool for the job that makes me faster, why not use the tool? And it's not perfect. Plenty of times it writes bad code. It can break things. By no means is it the end-all be-all. But for the 80% case, it gets you really far.
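One low-tech way to act on that "onboard it like a new teammate" advice is to keep a saved project preamble, the same context you'd give a new hire, and prepend it to every request. A rough sketch below; the preamble text is made up for illustration (loosely echoing the stack Brad describes later in the episode), and the OpenAI client is just one example of a chat API you could call.

```python
# Reuse a saved "onboarding" preamble so every prompt starts with the same
# context you'd give a new teammate, then append the specific task.
from openai import OpenAI  # assumes the OpenAI Python SDK; any chat API works

PROJECT_PREAMBLE = """\
You are helping on a bill-splitting web app.
Backend: Laravel with MySQL. Frontend: Vue. Mobile: React Native.
Follow the existing code style, keep changes minimal, and ask clarifying
questions if any requirement is ambiguous."""

def ask(task: str, model: str = "gpt-4o") -> str:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": PROJECT_PREAMBLE},
            {"role": "user", "content": task},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(ask("Draft a plan for adding multi-currency support to expense groups."))
```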
And if you're hesitant about adopting AI, or you feel like it's going to replace you, get closer to the tools. Use them every day. Try Gemini, try ChatGPT, try Claude. Pretty much all of these chat AI services are looking to get you onboarded onto their product. They're offering a generous free tier right now, because it's just a full arms race. So take the same prompt, paste it into each one, and see which one gives you better output. It'll differ. Maybe it's a coding task and Claude does well. Maybe it's a reasoning task and OpenAI, or ChatGPT, does well. And so you really have to find your use case, your niche, and just tinker with the tools. Because, for example, Gemini came out recently in Google Docs, and I've tried that. It's been a total hit and miss on various things. I mean examples like, highlight the text that matches these conditions, or format the text to do this. Because sometimes writing the content is hard, but other times making it look pretty, whether that's a doc or a sheet or a slide, takes some creative thinking, or just too many button clicks within the UI. So why not let Gemini do that for me? But as I found recently, when I tried doing those things, Gemini says, I can't help you with that, I don't know how to do that, et cetera, et cetera. And I kind of throw my hands up, because I've used the Gemini API model, so I've used the raw thinking that Google provides. They just haven't integrated it as tightly into their product, so the AI doesn't yet understand how to use the product to get what I want. They'll get there, they're just not there yet. And so, yeah, throw your use case into Claude, throw a use case into Gemini in Google Docs, Sheets, et cetera, it's the little star icon on the top right if you have access to it, and let us know what you run into, because all these use cases are changing, and Google's probably shipping updates on a weekly basis to make these things better. My use cases, I thought, were pretty straightforward, and I was a little bit shocked they weren't supported. So I'm curious to see how it evolves over the coming months with all the agent hype.

Yeah. I think we should probably table a future session on, I'd say, enterprise suite products, because I think we could drill down into Gemini and how that works with Google Docs, Google Sheets, Google Slides, and then also Microsoft, you know, whatever it's called, Copilot AI. Because it depends on where you work, but I'll just say for accounting and finance, the overwhelming majority of companies use Microsoft. And so there's a ton of value if we can use Copilot to read our directories and get us information quickly. But, you know, there are, again, security concerns with that: you're potentially allowing Microsoft to read your data. Then obviously Google, for firms that use Google Drive; if you have a lot of things living in Google Drive, then maybe Gemini is really good. So that's something we could probably talk about on a future episode.

But one thing that you said, and I'll probably just try to distill it into two things, or the way that I think about these tools, is: one, you should be trying to reduce the amount of time it takes you to do something.
And that sounds super obvious, but sometimes, if you just don't think about it, you're going to keep doing things the old way. And you might just go, well, hold on, wait, I can put this in chat in two seconds and it'll pop something out that's usable. Like, if you have to write a memo about the work you do, instead of having to sit there and think, oh, this is how a memo should look, just put it in chat. Hey, I need to write a memo for a bad debt policy, to put it in accounting terms, can you just give me a sample memo? And then it writes 80% of it, and you can go in there and tweak it, but that's still so much more efficient than starting from scratch or taking a prior example and trying to make it your own. So one, just always be looking for those kinds of opportunities, just make yourself more efficient, because, again, that's where we're going to be able to drill down and add value.

But then two, a lot of times these tools, with prompt engineering, kind of teach you to question what you're doing a bit more. And I think that's a really good thing, because then you're focused on better understanding what exactly you're doing and why you're doing it. A lot of times people get really focused on how they do something: okay, I need to take this form and put it over here on this form. But why are you doing it that way? What's causing that? Why do we have to do that? And so with these tools you're building that muscle as well, which is super important, because that's, again, where knowledge workers can tap in, right? It's being able to understand what we're doing. Where I think AI still has a lot of room to grow is when there's any kind of gap. So if there's data and it's missing some of that data, it doesn't really know what to do with it. It goes, okay, I'm not sure what this means. We could probably drill more into that in a different topic, but spreadsheets, for example: if you have a bunch of data and a row of data is missing, AI might go, okay, this is just missing, I'm going to continue with my analysis or whatever. But you might go, hey, that needs to be there, something must have gone wrong when I pulled this data. Just that inference and that logic, I guess, is missing right now. And so that's something I hope they can work out in the future. But in the meantime, where things stand today, it's those two things: being more efficient, searching for those opportunities to find more efficiency, and then questioning what you're doing and why you're doing it. AI prompting forces you to do that, and I think it's a super net positive for your own development.

On efficiency, something I try to do often is, I'll spend a lot of time on a manual task and then think, how could AI pull this off? And then I'll try to take exactly what I had written or done and chuck it into AI, whether that's an AI tool, AI agent, et cetera. And sometimes it doesn't work, and I think, oh, is it my prompt engineering? Is the AI model not capable enough? Other times it does work. And so it really gives you a feel
of how you can take maybe a large part of your workflow, how to chunk it up into steps, and how to feed it into AI. It gives you almost a bounding box of capabilities, such that you understand better, when you run into certain tasks in the future that you're doing manually now, what is actually manageable by AI, which I don't think a ton of people have a good grasp on. They think AI is this nebulous thing: it can create some crazy video games, but it can't even answer how many R's are in strawberry. It leaves people with a very weird feeling of, I don't know how well it's going to do on my task. And I think the answer to that is, try it on your task and see what it does. If it doesn't work, iterate on your prompt, try a different version. I think what you'll end up doing is getting to kind of a second-order level of AI usage, where if I'm automating a task, I might say, hey, like you mentioned, draft an email to this person describing XYZ. When AI generates it, it might be super wordy and might have no clue what the context is. And you think, oh, I can't tell it to do something so briefly, I need to give it a lot more context. Then you end up in this domain of, how do I give it context, and how do I save that context? And so I end up in the spot where, if you take a look at my Apple Notes, I have tons of AI queries for Split My Expenses that literally describe the project, the coding project, the web app, the product, et cetera, from start to finish. Like, hey, it's a bill-splitting app, these are my core MySQL tables, my front end's in Vue, my back end's in Laravel. Again, you're onboarding a teammate, onboarding an engineer. You need to tell them the information they need to know, the high-level bits, and then zoom in on the part that you really need to get done. I think as people start to automate their tasks and get closer to AI agents and AI chat, the same thing goes. If you want to draft an email and you're head of finance at a new startup, maybe you include that in your prompt: hey, I'm Brad, I'm head of finance at, you know, Grok 2.0, I have 20 reports, my domain in the company is XYZ, draft an email to this person and describe a recent feature that we launched, or a certain cost for a feature, whatever. Literally, the more you provide, the better you get out of it. How do I provide a lot? Well, I don't want to type a lot every time. So how about I spend a one-time effort to type a bunch of stuff, save it in my Apple Notes, and just pull it in on every query. It can go a long way. And I think getting closer to the AI tools is where you start unlocking: how capable is it, how can I make myself more efficient using the AI tools? Then it compounds and compounds, and it gets to the spot where it's fun and exciting, and you're just tinkering with the latest and greatest, and you're getting a feel for how successful these results are. And from that success, it usually just propels you further. So that's something I've been going down a cycle of: really validating certain paths of automating tasks in my day-to-day job.

Yeah. Well, and one thing, too, I think this is more applicable to accounting than it is to coding, because coding, it's all written software, right? But for accounting, a lot of times people are doing something, they're doing a task, and that should be written down. You should have process documentation. And so you already have those things ready to go.
You just take them off the shelf and give them to these agent models. And if you have good documentation, the agent will understand exactly what to do. Maybe there are a couple of things it needs to understand better, but then that's an opportunity for you to upgrade your documentation as well. If you don't have documentation, well, that's just on you, and you need to get some documentation. And now there's a dual benefit, because it's like, okay, I can onboard new team members, and I can also maybe put this through an agent and see what we can get out of it. And so, yeah, there's just a case for documentation, because it was already super important, but it was often neglected. Now it's even more important, because there's this added benefit: now it can do something for me. It's not just housekeeping. It's going to really do something for me.

Yeah, it's the same thing with code, too. Having good documentation makes the coding agent so much better. And the same thing exists for normal processes. In the future, we're going to have AI agents that are able to interact with multiple windows, your Chrome browser, Google Sheets, et cetera, more of a high-level agent. If you can program or describe a certain process with very little ambiguity on the exact steps, you know, maybe there's fuzziness and the LLM can figure it out, but in general, the better you can describe a process and make it understandable with all the context, the better a human is going to do, maybe a new joiner on the team, and the better an AI agent is going to do. And again, it serves a dual purpose there. You can almost think of the AI agent as the new joiner if you're struggling to write docs for a bot or an AI agent. But yeah, I'm totally plus-one on that.

I think, again, the more you use these things, the more you figure out where the problems are, and then you're creating solutions around those problems. I think AI for coding has gone off the deep end on this, which is: we have Cursor because people wanted to write code, and they were getting code out of chat AI, but they wanted more. How can we automate that and create a loop? Boom, we have things like Cursor and Claude Code, which are AI agents that push it to the extreme. And I think on the finance and accounting side, we're inching there with Gemini, and we have Google Sheets as an entry point into Gemini. I've found that it's a mild success, but you can imagine a world in which this is only the first iteration. You describe what you want, and then, like Cursor does in the code editor, it jumps around to certain areas and suggests inline edits. I can see the same thing done for Excel, where you modify one sheet that's imported from another, and it jumps around for you and does all of those things. So maybe not the best description, but you get a scenario in which these tools get built because people see problems. And I think coders feel self-empowered to build solutions to those problems. On the accounting and finance side, which is less technology-forward, people are inching in there, and they're seeing a glimpse of, how can I make Cursor for spreadsheets? How can I make the equivalent of that XYZ coding AI tool for finance or for accounting? I think we're not there yet. I think we're going to get there very, very soon.
And so I would say, get close to these AI models, because I imagine there'll be some AI action widget, builder, tool, whatever, within the next tool you're going to use for finance or accounting, and you're going to be a lot more well equipped, both within that tool and on your own, because maybe you won't even need a tool; you can do it by yourself using an existing chat AI tool. So it's truly a new landscape. I think people who are scared of it just need to jump in. And once you're in there, you'll be like, wow, you know, it's actually not that bad. And it's kind of fun to tinker with it.

Yeah. Well, again, it's just feeling the benefit on work that you used to do. I think we're all a little naturally lazy in different ways, and so we don't want to waste time on things that aren't adding a ton of value. But just to put a bow on this conversation: in accounting, one of our typical processes is that at month end we have close. We close the books. We record all the entries necessary to get a monthly financial statement together, an income statement, a balance sheet, and close that period. And so we know we're going to have some need for that work during that time. Maybe a month-end close in the future looks like you have six or seven agents. You might have an agent that handles accruals: it knows about the contracts that you have with your different vendors, and it knows how much in services you've incurred with them, because you've been receiving invoices against those POs. You might have another one that handles fixed assets, that does the depreciation entries, and maybe does amortization entries if you have any prepaids. So you can easily imagine that in the future you might have really specific agents that handle these different month-end close aspects, that you just turn on for the first four days of the following month. And, you know, to tie into what you already said about MCP, the Model Context Protocol, right, that's what MCP stands for: if you're using a general ledger system like QuickBooks or NetSuite or Workday, and I'm not as familiar with MCP servers, just full disclosure, but I did see what Brad was talking about and what did go viral, maybe you need to create an MCP server for whatever system you're using. And that way your agent knows, okay, in order to get fixed asset data out of QuickBooks, I need to run this report and I can pull it this way. That's not that far away if we just break it down like that. And so, you know, there are, again, security and accuracy issues that still probably need to be worked through. Can you just take that as gospel? No, you probably need to check it, so you're not going to be completely hands-off. But that's just on the horizon. That's right there. You can almost see it, just based on what we're talking about already today. And that's super exciting.

Yeah, honestly, just talking about it, like, man, maybe I'll set up a new little project tonight and get to work with Cursor. Yeah. You know, two hours later, I'll be done with FIFA. So, yeah, yeah, yeah. But yeah, this has been great.
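To ground that "MCP server for your ledger system" idea, here's roughly what a single tool definition can look like. This is a hedged sketch, not a real integration: it assumes the official MCP Python SDK's FastMCP helper, and the fixed-asset report contents are placeholder data; a real server would authenticate against the actual QuickBooks or NetSuite API.

```python
# Hypothetical MCP server exposing one accounting tool. Assumes the official
# MCP Python SDK (FastMCP); the report contents are a stand-in, since a real
# implementation would call the general ledger's API with proper credentials.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("ledger-tools")

@mcp.tool()
def fixed_asset_report(period: str) -> str:
    """Pull the fixed asset register for a close period, e.g. '2025-02'."""
    # Placeholder rows so the tool is demonstrable without credentials.
    return (
        f"period,{period}\n"
        "asset_id,cost,accumulated_depreciation\n"
        "A-100,12000,4000\n"
        "A-101,8000,2500"
    )

if __name__ == "__main__":
    # An MCP client such as Claude Desktop launches this server and lets the
    # model call fixed_asset_report() when a prompt needs that data.
    mcp.run()
```

Registered in a client like Claude Desktop, a prompt such as "book the February depreciation entries" could then trigger this tool to fetch the data it needs, with a human still reviewing the output before anything gets posted.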
I think if you're listening to the pod, we'd love to hear from you. As we've been getting deeper into AI tools, AI agents, all things AI workflows and AI engineering: what are you building with AI agents? Because I think there are tons of people out there doing cool stuff, and not everyone is on Twitter. We would love to connect with you and see what you're building in your free time, or what success you've found using AI agents and AI workflows, and whether your workplace has adopted any new cool tools. Because, like Ben mentioned, it's usually Microsoft or Google integrating the AI, and we just get access to it. But there are tools and companies that are pushing the bounds a little bit faster than them, and they have cool solutions. So if your company is integrating something, or you are working on something that maybe companies could adopt, we'd love to hear from you. Feel free to leave a comment in the YouTube section. We can take a look, and maybe we can talk about it on the next pod. So that'd be super awesome.

Yeah, absolutely. And also, just a quick plug: we'd love to get your reviews. So if you're enjoying what we're talking about today, finding it interesting, finding value in it, definitely give us a five-star review, because they mean a lot to us and do a lot for the show. So yeah, just a quick plug there. But cool. And subscribe on YouTube. Yeah, of course, subscribe. Subscribe on YouTube, Spotify, Apple Podcasts, wherever else you get your podcasts from. Cool.

Should we end it with our bookmarks? Kind of what we've got penciled in. I'll kick it off, since mine's AI coding, like every other one. There is a post from Steve Yegge, I'm not sure I pronounced that right, on X, and he is describing Claude Code as the new AI agent. He writes maybe 10 paragraphs describing what Devin was. Devin was the original AI software engineer that came out maybe six or eight months ago and charged $500 a month, and it writes code as if it's a software engineer. What Steve mentions in his X post is that Claude Code is Devin, but way better and way cheaper and way more accessible. And I kind of scoffed at this initially, but after reading his post, I thought, you know, why not try Claude Code and just vibe code with it? And again, vibe coding is just giving AI full control to write code in your code base. So actually, today I vibe coded with Claude Code on Split My Expenses to solve a React Native issue that I had, and it got pretty darn far. And if you read his post, I'd summarize it as drinking the Kool-Aid, because he goes full throttle and says, hey, Claude Code is the end-all be-all, this is an incredible piece of software that is doing so much for you under the hood and writing all this code, and people don't even realize it. Like, imagine AGI was here tomorrow and it was embodied in Claude Code. He's almost describing it as this big thing that people aren't treating like a big thing. And so that's why I say, drink the Kool-Aid. If you're on X, on Twitter, the AI hype cycle can feel like this monumental thing, and I included his tweet both as a half joke but also as realistic, because, again, I've spent time with it, and it's done a pretty damn good job, better than Cursor would do. But it does cost a little bit more money.
But either way, a pretty interesting post; it kind of motivated me to go a little bit further with Claude Code. And if you are an engineer looking to try something, and you've tried Cursor, definitely try Claude Code, which is an npm package and essentially an agent in your terminal. So yeah, read Steve's post, get inspired, and then just jump into Claude Code. Awesome. Yeah, it sounds interesting.

Mine is a bookmark, and basically there have been a couple of stories, I have one linked here, but it all falls into the same theme, that Microsoft is looking to either distance itself from OpenAI or start to build its own backup plan. So for those that don't know, Microsoft is an investor, I believe, in OpenAI, and that partnership has served them both very well. But the article I linked is actually a quote from Marc Benioff, who's the Salesforce CEO, and he is basically predicting what Microsoft and OpenAI's partnership is going to look like in the future. He kind of alludes that the two, the Microsoft CEO and the OpenAI CEO, are getting frosty, they're "not really best of friends," I think, is what the article actually says. And other articles are showing that Microsoft is trying to start building its own models that would rival OpenAI's. So it's really interesting, because I think it goes back to something we've talked about on the show, platform risk. If you build all your infrastructure on OpenAI, and then suddenly that's no longer the one to be on, and Microsoft pivots and makes its own that's ten times better, you have some switching costs, which isn't unique to AI, every platform has switching costs, but it's just something to think about. But then, too, I think it goes to show that more competition in the space is always a good thing, because the more tools that are available, the less a departure of OpenAI, or OpenAI becoming less and less relevant, might affect you if you're already using, say, Claude or DeepSeek or Mistral. I think I pronounced that right, Mistral, which is really making a lot of moves recently. So it's this constant pendulum swing: now OpenAI has the leg up, then Claude has the leg up, then DeepSeek has the leg up, and that benefits everybody, right? But yeah, I thought it was interesting to see that Microsoft, from what's reported, is kind of looking at OpenAI a little bit sideways recently, and we'll see what that means for the future.

I think Sam has definitely done a bit of a dance around the profit structure of OpenAI. And I think recently Anthropic had a $3.5 billion round, and Amazon has been a key partner there to provide AI infrastructure for Anthropic, which is Claude's creator. So, yeah, it's an interesting space, seeing all these companies with massive valuations, where the initial investments maybe were huge back then, but they have a different audience now to scoop up a lot more money, and maybe not from who they previously wanted, but from whoever they choose, given that their valuations are so high. So we'll have to see. Yeah. It'll be interesting. Cool. All right. Well, that'll wrap it up, Brad. Good stuff. I learned a lot about agents.
And, again, folks, hope that you found some of this stuff valuable and can take it back to your own world, whatever your day-to-day is. So, yeah, until next time. Cool. Sounds good. Bye. Adios.

Thank you for listening to the Breakeven Brothers podcast. Note that all views and opinions expressed are solely those of the hosts and are not affiliated with any other person or entity. If you enjoyed the show, please leave us a five-star review and subscribe. It means a lot to us. Take care.