Levelling Up to Claude Code

So I mentioned at the end of my last post about AI that switching to Claude unlocked a new level of what I was doing with it.

It started small. Someone at work mentioned they’d been using Claude Code to build actual apps, and they’re not a developer either. Just someone with ideas and enough curiosity to see what happens. That stuck with me.

I had a specific problem I wanted to solve. I have some VoIP numbers through my internet provider, actual UK mobile numbers I can receive texts on. Useful for giving out to LinkedIn contacts or anyone I don’t want having my main number. The problem was sending replies meant logging into a clunky website every time. Nobody wants to do that.

So I asked Claude about it. One thing led to another and it said it could help me build a web app for that. When I asked if we could run it in Docker it said of course. I already run Docker on a few systems at home so that made sense. It built me a container, I ran it on my Synology, got it working for one number, cloned it for a second. Then I thought, wait, can we just have one container with a dropdown to switch between numbers? Of course we can. And it did it.

That was the moment I thought, if it can build me an app with this level of input and time, what else can it really do?

So I set up Claude Code properly, gave it access to my Raspberry Pi setup, and had it take the SMS app further. It interrogated my existing config, made some enhancements, deployed everything from the command line. Straightforward, but impressive.

From there I got more ambitious. My blog had been running on YunoHost, which is a self-hosting server platform. Decent enough, but it’s always on an older version of Debian because the open source volunteers take what feels like a long time to update it, and if an app isn’t in their package store you’re out of luck. I’d always wanted to run my own properly configured stack but never wanted to put in the time to care for and feed it.

I had a spare VPS sitting around doing nothing. So I asked Claude, can we design and build my entire website on this thing. WordPress in Docker, proper backups, the works. It said yes.

First I designed the whole thing with Claude, got a proper design document together, then imported that into Claude Code and let it build. About $25 to $40 in API costs later I had a website. And not just a website. Automated daily backups following a daily, weekly, monthly cadence with pruning built in, all replicating to a second location with an immutable copy at the end. Backup infrastructure honestly better than some hosting providers I’ve heard about. I then migrated my site over to it, wiped the old setup, and migrated it back again just to prove I could restore it. All worked.
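
For anyone wondering what a daily, weekly, monthly cadence with pruning boils down to, here is a minimal Python sketch. It is not the actual setup described above, just an illustration of the idea. The function names and the retention counts (7 daily, 4 weekly, 6 monthly) are assumptions I picked for the example.

```python
from datetime import date, timedelta


def newest_per_period(newest_first, period_key, limit):
    """Pick the newest backup in each of the `limit` most recent periods."""
    chosen = {}
    for d in newest_first:
        key = period_key(d)
        if key not in chosen:
            if len(chosen) == limit:
                break  # dates are newest first, so everything after this is older
            chosen[key] = d
    return set(chosen.values())


def plan_pruning(dates, daily=7, weekly=4, monthly=6):
    """Split backup dates into (keep, prune) lists.

    The retention counts are illustrative assumptions, not a recommendation.
    """
    newest_first = sorted(set(dates), reverse=True)
    keep = set(newest_first[:daily])  # the most recent daily backups
    # Newest backup in each of the most recent ISO weeks.
    keep |= newest_per_period(
        newest_first, lambda d: tuple(d.isocalendar())[:2], weekly
    )
    # Newest backup in each of the most recent calendar months.
    keep |= newest_per_period(newest_first, lambda d: (d.year, d.month), monthly)
    prune = [d for d in newest_first if d not in keep]
    return sorted(keep), sorted(prune)


if __name__ == "__main__":
    # Pretend there is one backup per day for the last 90 days.
    today = date(2026, 3, 31)
    backups = [today - timedelta(days=i) for i in range(90)]
    keep, prune = plan_pruning(backups)
    print(f"keeping {len(keep)} backups, pruning {len(prune)}")
```

The point of the layering is that recent history stays fine grained while older history thins out, so storage doesn’t grow forever. A real tool would delete the files in the prune list. This sketch only decides which ones they are.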

Then I got bold. I had the spare VPS now freed up, so I used it to build a set of personal tools I’d always wanted but never had the time to set up properly. An open source SSO system with passkey authentication for me and my wife. A Searx search instance sitting behind the SSO. Network monitoring. I didn’t even know I could set up my own SSO until Claude walked me through it. Some API costs and an afternoon later it was running.

Around the same time I set up a proper monitoring stack. My external VPS now watches my home internet connection and my main Raspberry Pi, and sends push notifications to my phone if anything goes down. Not email, because I’m not going to look at an email. An actual push notification. I also have a speed test running every hour on a gigabit ethernet connection straight into the router, and my internet provider, who I’ve never had a bad word to say about, is consistently delivering around 305 megabit on a 350 megabit plan. Seeing that graphed over time is genuinely satisfying.

All of that cost me some API charges, probably a couple of dollars’ worth, and £4 for a push notification app I now own outright. Not a subscription. Just mine.

Then I built a router.

I’d looked at this before. A router is basically just a computer with extra network cards, and I’d used open source router software in the past. But I’d always bought the manufacturer’s hardware because I didn’t want to deal with building and maintaining my own. This time I asked Claude to help me design and deploy it, and then help me maintain it going forward.

Of course it said yes. It can be a yes man/lady/person a lot.

I researched the hardware with Claude’s help and landed on a Protectli VP2430, a fanless little box with four 2.5 gigabit network ports and an Intel N150 processor. 16 gigs of RAM and a 256 gig SSD. Overkill for a router, which means it’ll last a long time. Then I designed the whole OPNsense configuration with Claude before touching any hardware.

Deployment was more painful than I expected. The biggest issue was needing the new router connected to my laptop while also needing live internet on the same laptop to use Claude to programme it. The address space decisions I’d made complicated things further. I spent six hours one evening and an entire Saturday on it before rolling back. Then I worked out which hurdle had actually stopped me, made one design adjustment, tried again the following week, and had it running in about two hours. It’s been stable since.

I’m not a software developer. But I’ve spent years building and supporting data centres, call centres, and large scale applications, so I understand enough to know when something looks wrong and push back on it. I understand some scripting and basic fundamentals. What I can do is explain what I want clearly, spot when the output doesn’t smell right, and interrogate it until it does. That combination, it turns out, is enough to build some pretty serious stuff.

The things I’ve been able to do aren’t things I couldn’t have done before in theory. But the time to care and feed a self-hosted setup was always more than I was willing to put in. Now I have AI that helps me design it right, build it right, and fix it when something goes wrong. The barrier that used to stop me isn’t really there anymore.

So now I’m looking at what’s next. I have mockups for a couple of actual apps I want to build. Things I want for myself that don’t exist quite the way I want them. A colleague at work just builds whatever he thinks of now. I’m getting there.

The biggest current headache is cost. Claude Code API charges add up fast when you’re doing serious work, and last month I had enough overages that I’m now trialling whether running both Claude Code and ChatGPT’s Codex together is actually cheaper than paying for the next Claude tier up. Early signs are interesting.

And separately, with my new MacBook Air running 32 gigs of RAM, I’m finally in a position to run proper local models. I’ve started downloading and testing, and the early results suggest I might be able to replace some of what I’m using Lumo for with a local model that’s actually more private and possibly better. That’s an ongoing experiment.

It’s a lot. But honestly, talking to people now, whether friends, colleagues, or people in the industry, so many are either just scratching the surface or not looking at it at all. I was over a year late getting serious about this. I don’t think I’m late anymore.

Switching to Claude

I’ve been on this AI journey for a while now, and for most of it ChatGPT was my main tool. But in February 2026 that changed.

Someone whose opinion I trust recommended I give Claude a try after I voiced my frustrations with ChatGPT. When I actually sat down to evaluate it I did something that in hindsight was pretty funny. I asked Claude directly why I should use Claude over ChatGPT.

It told me not to.

Based on what I told Claude I wanted, it said I’d probably get better results from ChatGPT. So naturally I didn’t trust that answer and kept pushing. As I interrogated it further it started explaining that it was slower and more thoughtful, and from its previous read of what I was looking for, it figured I wanted fast straight answers. And that’s when something clicked for me.

One of my biggest frustrations with ChatGPT was exactly that. I’d ask it something and it would just fire back an answer. Fast, confident, and often not what I actually asked for. Like if I asked for specific instructions on how to do something on an Apple product, it would give me generic steps that didn’t even exist in the actual interface. I’d have to stop it and say don’t give me fluff, give me the actual thing. Then it would have to go look it up and either admit it didn’t know or finally give me something useful.

What Claude was describing as a weakness, slower and more considered, was exactly what I wanted. So I signed up and started using both in parallel.

Early on I was genuinely impressed with what was coming out of Claude. I was using Sonnet, the middle tier model, and the difference in output quality was noticeable pretty quickly. The concern then was whether I’d end up paying for two services. I was already paying for Venice and Lumo on the privacy side and the last thing I wanted was more sprawl.

But it became clear fairly fast that Claude was where I wanted to be. Which meant I had to migrate everything I’d built in ChatGPT over the previous six months or so. Custom GPTs, saved prompts, all of it. I had to extract everything, make sure I had backups, build a little text based database of all my prompts, and systematically move it all across.

I got it done within a month and managed to avoid paying for both services at the same time for more than one month. Then I downgraded ChatGPT to the free tier and haven’t looked back.

From a reliability standpoint Claude is better. Not perfect, and everything I said in the last post about not trusting it still applies. But it’s a meaningful improvement. And honestly, switching to it unlocked a whole new level of what I started doing with AI. Which is what the next post is about.

The Sunday Job My Dad Casually Had

Today would’ve been my dad’s 81st birthday, and since I had written this already I felt like it was an appropriate day to post it. I miss you, Dad.

In another post I talked about my dad’s work schedule when I was growing up, and I mentioned how he used to pick up an extra shift at Rikers Island on Wednesdays to earn some additional money. That was one of his side gigs. But he had another one that was even more unexpected, and this one happened in the early nineteen nineties.

To set the stage, my dad was a trained physician assistant. A physician assistant is not a doctor, but at least in New York State at the time, you went through two years of medical training and then did about ninety five percent of what a doctor could do under the supervision of a physician. Or at least that is how my dad always described it, and it sounded close enough to the truth that I never questioned it.

He worked in trauma, and he was calm under pressure, steady handed, and very good with anything involving scalpels, needles, or anything sharp.

Which brings me to his second side hustle: piercing.

And not ear piercing. He specifically described it as “below the neck” piercing. That is all we are going to say here.

This was the early nineties. Piercing culture existed, but it was nowhere near as common or mainstream as it is today. My dad somehow got connected with a jewelry shop in the Village in New York City. It catered to a particular clientele who wanted this service, and the fact that he was medically trained, used sterile equipment, and could offer local anesthesia made him the right person for the job.

He only worked by appointment and only on Sundays. He would drive into the city (he never took public transportation, except once for me, but that may be another story) and set up in the back of the shop. He had the same portable television he used to bring to Rikers. This was before phones, before streaming, before anything digital, so that little TV was his entertainment. If it was winter, it was football. He would do a piercing, then sit back and watch the game, then another appointment, then more football. A very strange rhythm to imagine now, but that was his Sunday routine.

He told me once that almost no one ever used the anesthesia. He always offered it, had it ready, and every time the person said no. Apparently people just wanted to get in, get it done, and get out.

I remember once, when some college friends were visiting, we stumbled across his instructional video. Not his own tape, but the training material he was given when he started. Let us just say it covered a very broad range of below the neck locations. We did not watch the whole thing. We did not want to. But it definitely confirmed that my dad’s side gig was… let us call it unique.

He did that job for years, purely for extra money. It helped when my sister was in college and then when we both were. When he no longer needed the income, he stopped. He never seemed emotionally attached to the job. It was work. It paid well. It did what it needed to do.

Today, piercing is everywhere. You can walk into a studio in almost any city and find people who specialize in it. Back in the early nineties, though, it was niche, edgy, and far less common. Which means, in a very unexpected way, my dad was absolutely ahead of the trend.

At least in that particular area.

Use AI Like It’s Lying To You, Because It Is

I’ve touched on this in passing across a few posts now, but it deserves its own space. Because as useful as AI has been for me, it is not sunshine and rainbows.

AI is a great tool. I genuinely believe that. But I also think the future gets pretty dystopian if people don’t use these tools with their eyes open. And right now, a lot of people aren’t.

One thing I read recently that stuck with me: it can do super advanced calculations but it can’t tell time right. That sounds like a joke but it isn’t. Some of the things I’ve asked it, it will be absolutely insistent it’s correct. You interrogate it because something smells off, and eventually it folds. Oh yeah, you’re right, I was wrong. I’ve had situations where I knew it was wrong, kept pushing, and it took a surprisingly long time before it admitted it.

So what I tell people, my kids, colleagues at work, is this. Use it. But do not trust it. Assume it’s going to lie to you. If you go in with that mindset and you scrutinise the output, it can be really good. But you have to be able to scrutinise it. That’s the part people skip.

That’s also the part that makes it genuinely dangerous in the wrong hands. I can ask it something about cybersecurity and I’ll know pretty quickly if the answer looks right or completely off. But I can’t ask it to do my taxes. I don’t know tax law. So if it tells me I can do something, I have no idea if it’s true. That’s a problem. And it’s why you see things like lawyers getting caught by a judge after submitting court filings with citations that don’t exist, because they used AI and didn’t check the output. People just throw stuff in and take whatever comes out.

I ran into this myself about a year or so ago when I was doing some budget planning. Nothing super sensitive, just the savings pots I set aside for predictable expenses throughout the year so I’m not hit with an inconsistent spend later. I do it all in a spreadsheet and it gets involved. I figured let me see if AI can handle it.

It did it, and then it didn’t. The numbers were inconsistent. Flat out wrong in places. I tried for a few months and I just could not rely on it. So I stopped and went back to my spreadsheet.

More recently I tried again with a different model, and I’ll get into that in the next post. But it’s actually working now. Two months in and the output is consistently accurate. I’ve also gotten smarter about how I prompt it, asking it to show its work and export the data in a way I can verify. So it’s a combination of the models improving and me getting better at using them.
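
To give a flavour of what verifying the export can look like, here is a rough Python sketch. It assumes the AI has exported the savings pots as CSV with three columns, pot, annual_cost and monthly_amount. Those column names, the function name and the rounding allowance are all made up for the example. The idea is simply to redo the arithmetic independently instead of taking the stated figures on trust.

```python
import csv
import io
from decimal import Decimal


def check_budget(csv_text, claimed_monthly_total):
    """Recompute the figures independently and list anything that doesn't add up.

    Column names (pot, annual_cost, monthly_amount) are hypothetical.
    """
    problems = []
    monthly_sum = Decimal("0")
    for row in csv.DictReader(io.StringIO(csv_text)):
        annual = Decimal(row["annual_cost"])
        monthly = Decimal(row["monthly_amount"])
        monthly_sum += monthly
        # Twelve payments should cover the annual cost, give or take
        # a few pence of rounding (the 0.12 allowance is an assumption).
        if abs(monthly * 12 - annual) > Decimal("0.12"):
            problems.append(f"{row['pot']}: 12 x {monthly} does not match {annual}")
    # The total the AI states should equal the sum of the rows it exported.
    if monthly_sum != Decimal(claimed_monthly_total):
        problems.append(
            f"total: rows sum to {monthly_sum}, claimed {claimed_monthly_total}"
        )
    return problems


if __name__ == "__main__":
    export = (
        "pot,annual_cost,monthly_amount\n"
        "car insurance,600.00,50.00\n"
        "holiday,1200.00,90.00\n"
    )
    for problem in check_budget(export, "140.00"):
        print(problem)
```

An empty list means the numbers at least agree with each other. It does not mean they are right, only that this particular kind of mistake isn’t there.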

But the overall lesson hasn’t changed. The reading of the tea leaves is the hard part. Sometimes the output is exactly what you wanted and better than you could have done yourself. Sometimes it’s close but slightly off in a way that’s easy to miss. And sometimes it’s just wrong and completely confident about it.

The tool is getting better. That’s real. But so is the risk of people treating it like it’s infallible. It isn’t. Not even close.