ZDNET's key takeaways
Spammers found ways in and flooded my database.
Claude and Codex became my emergency coding team.
The 4,700-line fix added stronger defenses and cleanup tools.
About a month ago, my main website was on the receiving end of a new attack. Spammers were using the username field as a message carrier, stuffing it with a fake domain and crypto bait such as "check balance," "withdraw funds," "BTC transfer" and "action required." WordPress then helpfully forwarded that payload to me in thousands of "new user registration" emails.
At that time, my server was using a commercially purchased security product that was supposed to protect my WordPress website from registration spam. That product clearly wasn't up to the task.
I'm the developer of a WordPress security plugin that is designed to help users restrict access to their websites. Since the registration spam security product I had been paying for wasn't working, I decided to build a spam security capability into my existing plugin.
I quickly took screenshots of my Gmail inbox showing a few hundred spam emails, fed those into Codex, and asked it to write a mitigation routine I could quickly deploy live within my existing tool. Once Codex finished, I deployed the enhanced plugin to users and to my own website.
The problem went from active attack to completely hushed in under an hour. That was at the beginning of June. Then, last week, the attacks came roaring back like a lion.
Over the years, I've noticed that spammers tend to escalate. They put out feelers to sites and try to find easy vulnerabilities. If they find one, they exploit it. But once you put in mitigation, the attacks don't go away. They keep probing the site, looking for new ways in. AI, I'm sure, is now being deployed by the bad guys to increase the depth of those probes.
Friday evening, my hosting provider told me my site database had grown to more than 39,000 user accounts with more than 700,000 user meta records. They were seeing thousands of constant registration bounces. I saw them as well, because my inbox and my spam folder were receiving multiple variations at a fairly fast rate. The user account dashboard was so clogged, you couldn't even load the page.
I was politely informed I needed to clean my database and stop this from happening. The unspoken subtext of the message from my provider was that if I didn't stop the attacks from infecting its database infrastructure, my website would become persona non grata.
This article is about how I spent the weekend using Claude Cowork and OpenAI Codex to fight back against the spam, building far more brutally robust mitigation features into my security product to deploy against the attacks.
Setting the stage
As I mentioned, as a side project, I have a fairly powerful security product that protects WordPress websites. Last year, I used Codex to substantially increase its capabilities.
At the time, I upgraded Codex to the $200-a-month ChatGPT Pro tier. After those add-ons were shipped, I dropped back to the $20-a-month ChatGPT Plus tier. I am actively developing a series of Apple ecosystem products. For that, I'm using Claude Code on the $100-a-month Max tier.
I use both AIs so I can report back to you about them, but I keep each AI out of the other's code. So Codex does the WordPress product, while Claude Code does the Apple products.
Since the attack was on my website, which is protected by the WordPress product, that was a project for Codex. That said, I really didn't want to spend the extra money to sign up for the $100-a-month ChatGPT tier to fix this. So I decided to use Claude Cowork for the parts of the project that didn't involve writing code, because my Claude plan has a much larger usage allowance, and to use Codex primarily for code writing.
To say this mix of services worked well would be a vast understatement.
Letting Claude Cowork loose on my website
My fightback began with a cybersecurity game of whack-a-mole. How, exactly, were the bad guys getting in?
I'd blocked the user registration page in my previous mitigation. I'd even detected spammy signals (machine-generated or gibberish usernames and malformed email addresses), plus used honeypot fields to trap bots, blocked registrations without valid MX records, and checked registrations against the StopForumSpam blocklist.
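As a rough illustration of how signals like these can be combined, here is a minimal Python sketch. It is not the plugin's actual code, which isn't shown in this article. The function name, the signal labels, and the regular expressions are all assumptions made for the example, and the bait phrases are simply the ones quoted at the top of the article. The MX-record and StopForumSpam checks are left out because they need network lookups.

```python
import re

# Bait phrases quoted earlier in the article; a real list would be much longer.
BAIT_PHRASES = ("check balance", "withdraw funds", "btc transfer", "action required")

# Deliberately loose patterns, for illustration only. DOMAIN_RE would also
# flag an innocent username such as "john.smith".
DOMAIN_RE = re.compile(r"\b[a-z0-9-]+\.[a-z]{2,}\b", re.IGNORECASE)
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def registration_signals(username: str, email: str, honeypot: str = "") -> list[str]:
    """Return the names of the (hypothetical) spam signals a registration trips."""
    signals = []
    lowered = username.lower()
    if any(phrase in lowered for phrase in BAIT_PHRASES):
        signals.append("bait_phrase_in_username")
    if DOMAIN_RE.search(username):
        signals.append("domain_in_username")
    if not EMAIL_RE.match(email):
        signals.append("malformed_email")
    if honeypot.strip():
        # Humans never see the honeypot field, so any value suggests a bot.
        signals.append("honeypot_filled")
    return signals
```

Returning the list of tripped signals, rather than a single yes or no, leaves it to the caller to decide how many signals are enough to block a registration.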
Yet, somehow, the spammers were back in force. I spent an hour or so paging through my site, and couldn't find any weak points. So I decided to deploy an AI.
Technically, the security product belongs to ChatGPT Codex. But because I was on the $20 Plus tier, I didn't want to give it more work than my usage limit would allow. Since Claude had a much larger usage window, I decided that I'd split my effort. I'd use Claude to diagnose and review, and Codex to write my code. As it turned out, this proved to be a heck of a tag team.
I explained the problem to Cowork and set it loose. At first, it wanted an administrative login, but I explained that spammers were finding exploits without admin access. The AI seemed to understand, and then set off to hammer on my site.
After about 40 minutes of chugging, it identified a number of problems. The most pronounced was that although my user registration page had a CAPTCHA, spammers could submit URLs that would initiate registration without requesting a CAPTCHA. That needed to be fixed.
All told, it found eight different flaws that gave spammers access to registration entries. Even though my security tool was testing submissions, these exploits bypassed those tests.
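One common way to close this class of bypass is to route every pathway that can create an account through a single deny-by-default gate. The Python sketch below shows that idea only. The pathway labels, the function name, and the parameters are assumptions for illustration, not the plugin's real code or WordPress APIs.

```python
# Hypothetical labels for the public pathways that can create an account.
KNOWN_ENTRY_POINTS = frozenset(
    {"registration_form", "rest_api", "xmlrpc", "admin_ajax", "custom_form"}
)


def may_register(entry_point: str, captcha_passed: bool, signals: list[str]) -> bool:
    """Single gate that every pathway calls before creating an account."""
    if entry_point not in KNOWN_ENTRY_POINTS:
        return False  # unknown pathway: deny by default
    if not captcha_passed:
        return False  # no pathway may skip the CAPTCHA
    return not signals  # any tripped spam signal blocks the registration
```

The point of the design is that a newly discovered entry point fails closed until someone explicitly adds it to the list.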
The next thing I did was export my site database and feed that into Claude Cowork. I told it to glean whatever information it could about identifying spammy accounts and spam practices, based on what historically made it through protections.
Cowork found a bunch of signals that many accounts were spammy. It also noticed that spammers were dumping URLs into the bio field (not the URL field).
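A check for the bio-field trick can be very small. The following Python sketch assumes a plain-text bio and a deliberately simple URL pattern. Both the pattern and the function name are illustrative assumptions, not what Cowork or the plugin actually uses.

```python
import re

# Simplified pattern: anything starting with http://, https://, or www.
URL_RE = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def urls_in_bio(bio: str) -> list[str]:
    """Return every URL-like string found in a user's bio text."""
    return URL_RE.findall(bio)
```

An account whose bio returns a non-empty list is not necessarily spam, so a result like this works better as one signal among several than as a verdict on its own.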
Claude helped me identify the vulnerability points on the site, and specified new features to add to the plugin. I then asked Claude to write a prompt for me that I could feed into Codex, so it could implement fixes for the vulnerabilities Claude identified.
On its first draft, Claude screwed up and gave me a prompt that would have resulted in incredibly destructive code. After a careful read-through, I explained the problem to Claude, and it rewrote the prompt. This time it was helpful, not destructive. You must double-check everything the AIs produce, especially prompts for other AIs.
I was ready to turn over the project to Codex.
How far will $20 get you?
Codex, OpenAI's coding agent, is available from within the $20-a-month Plus tier of ChatGPT. In one of my previous coding sessions, I found Codex to be very powerful, but the amount of work it would do was fairly limited without upgrading. Back then, you had to go to the $200-a-month Pro tier (which I did, for a month). Now, there are various upgrade thresholds.
I wanted to see if I could build the entire block of code needed to mitigate the spam attacks, just using my existing ChatGPT Plus subscription.
TL;DR: I did, but barely.
I used Codex to build three main systems. First, I added to the signals it would use to detect spam. Second, I added a registration CAPTCHA to every open pathway where something could try to register, including the standard WordPress registration form and other public entry points, such as REST API, XML-RPC, admin-ajax, and custom registration forms.
Finally, I used Codex to add a massive, multi-stage spam account cleanup tool that uses all the spam account signal analysis features to determine whether a user account is spam. This involved adding a whole new user interface section, complete with resumable browser-driven batch analysis and deletion.
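The "resumable" part of a batch cleanup usually comes down to remembering the last record processed, so that an interrupted run picks up where it stopped. Here is a minimal Python sketch of that pattern, assuming users are dictionaries with a numeric "id". The class, the function, and the in-memory state are hypothetical. A real tool would persist the state between browser requests and would review flagged accounts before deleting anything.

```python
from dataclasses import dataclass, field


@dataclass
class CleanupState:
    """Checkpoint that survives between batches (kept in memory here)."""
    last_id: int = 0
    flagged: list = field(default_factory=list)


def run_batch(users, state, is_spam, batch_size=100):
    """Analyze the next batch after state.last_id; return True if work remains."""
    pending = sorted(
        (u for u in users if u["id"] > state.last_id), key=lambda u: u["id"]
    )
    batch = pending[:batch_size]
    for user in batch:
        if is_spam(user):
            state.flagged.append(user["id"])
        state.last_id = user["id"]  # advance the checkpoint after each user
    return len(pending) > len(batch)
```

A driver, such as a browser page polling the server, would call run_batch repeatedly until it returns False, then act on state.flagged.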
This was a weekend intensive coding push. For every hour this tool went undeployed, more and more user accounts were being created. I was in a race against time to stop it before either the spammers or my hosting provider shut down my server.
I got shut out of Codex twice on Saturday. The first time, I only had to wait a short time for it to reset. I took that as an excuse to have lunch. But the second time was going to be for hours, which I didn't really have. Codex displayed this message:
Clicking Upgrade gives you the option to buy more usage credits. But I had no idea how credits related to the amount of work I was asking Codex to do.
Then there was that "Reset usage" option. I hadn't seen that before. I decided to click it to see what happened. Here's the message Codex displayed.
So I tried it. I hit Reset Usage and was back in Codex, writing more code. I used two of those resets on Saturday, pushing hard until it was time to go to sleep. Each reset got me about 45 minutes more coding time.
Sunday was more about testing than coding. Remember, I had more than 39,000 user accounts with more than 700,000 user meta records to remove. I moved a copy of that database from my server to my local development machine and ran my account cleanup tool on it. With the callouts to the remote StopForumSpam account clearinghouse, each test run took about two hours.
I used my third reset to get 45 minutes of coding changes after my first account cleanup test run. By the time I was cut off again, I was ready to do another two-hour cleanup test run. That (plus lunch) got me over the mandatory wait time for my final coding push.
By late afternoon on Sunday, I was ready to deploy the new modules to my server. I uploaded the new build. I haven't seen any account spam since then. After running the cleanup process, I deleted 15,069 of 39,314 user accounts. I also deleted 275,567 out of 723,799 user meta records.
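To put those cleanup numbers in proportion, here is the arithmetic as a short Python snippet. It uses only the counts reported above, and the helper name is just for illustration.

```python
def share_removed(deleted: int, total: int) -> float:
    """Percentage of records removed, rounded to one decimal place."""
    return round(100 * deleted / total, 1)


accounts_pct = share_removed(15_069, 39_314)
meta_pct = share_removed(275_567, 723_799)
accounts_left = 39_314 - 15_069
meta_left = 723_799 - 275_567

print(f"Accounts removed: {accounts_pct}% ({accounts_left:,} remain)")
print(f"Meta records removed: {meta_pct}% ({meta_left:,} remain)")
```

In both cases, a little over 38 percent of the records were removed.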
Not only did this make my hosting provider happy, it also made it possible to once again access the user account dashboard.
Understanding Codex usage
Let me be very clear. I am amazed at how much work I got done on my $20-a-month ChatGPT Plus account. I was willing to buy more usage credits or do whatever I needed to protect my server, but I didn't have to.
But why? After finishing up the push to get this new release out (and all these features are now going to my users for free), I had time to think about Codex a bit more. What are resets and how do they work? What are usage credits? How much do they buy you?
On June 11, OpenAI announced the reset feature in an X post. An OpenAI spokesperson told me:
They don't accrue through ordinary usage and are separate from purchased credits. Eligible Plus and Pro users received one free reset when the feature launched. There isn't just one reason we grant resets. Sometimes when we're working through a bug. Sometimes, to celebrate a milestone. And other times, they're just for fun! Banked resets generally expire 30 days after they're granted.
Credits, on the other hand, are also kind of unclear. Usage is measured based on tokens, not credits. But usage is sold by credits (and subscriptions), not tokens. Maybe if this stuff were a little clearer, we wouldn't need an AI.
I asked OpenAI, and they suggested I ask Codex. Here are the prompts I used:
Inspect ~/.codex/sessions, total the input, cached-input and output tokens from that coding window, and apply the current rate card.
How does that token count compare to the current rate for buying credits?
Codex told me that my entire weekend coding run used 166,806,884 tokens. If I had just paid using API rates, that would have cost $146. Honestly, not at all bad for the amount of work I asked it to do. In terms of credits, Codex estimates I would have used 3,651 credits.
But that was for my entire weekend of work, including the work that came with my ChatGPT Plus plan. Based on my session log, Codex estimates that the ChatGPT short-window allowance was about 500 credits, so I would have needed to buy about 3,100 credits to keep going without waiting.
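The relationship between those figures is easier to see as arithmetic. This Python snippet uses only the numbers Codex reported above, which are Codex's estimates rather than official OpenAI pricing. The variable names are illustrative, and the per-token figure is a blended average across input, cached-input, and output tokens, not a rate-card price.

```python
TOKENS_USED = 166_806_884      # total tokens Codex reported for the weekend
API_COST_USD = 146.0           # Codex's estimate at API rates
CREDITS_USED = 3_651           # Codex's estimate in credits
PLAN_ALLOWANCE_CREDITS = 500   # Codex's estimate of the Plus short-window allowance

credits_to_buy = CREDITS_USED - PLAN_ALLOWANCE_CREDITS
usd_per_credit = API_COST_USD / CREDITS_USED
usd_per_million_tokens = API_COST_USD / (TOKENS_USED / 1_000_000)

print(f"Credits to buy: {credits_to_buy:,}")
print(f"Implied cost per credit: ${usd_per_credit:.2f}")
print(f"Blended cost per million tokens: ${usd_per_million_tokens:.2f}")
```

On these estimates, a credit works out to roughly four cents, and the exact shortfall is 3,151 credits, in line with the "about 3,100" figure.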
In that situation, the right idea would have been to upgrade to the $100-a-month Pro tier. Codex provided this estimate:
$100/mo Pro likely gives you roughly 40K Codex credits per longer usage window, based on your observed Plus usage, but OpenAI does not publicly guarantee that as a fixed credit grant.
To be honest, whether it would have cost an extra hundred bucks or $146, it would have been well worth it. The fact that it didn't, and I accomplished all I did with just my regular $20 plan, is nothing short of amazing.
How amazing is it?
Over the weekend, working with Codex, I added 4,700 lines of code and deleted 170 lines, for a net gain of 4,530 lines across 138 new procedural functions. That doesn't count all the CSS and HTML code, the validation and testing routines, and all of the UI work that was accomplished in the same period.
For a full-time, experienced programmer, this work likely would have taken 25 to 45 engineering days, or as long as nine working weeks. Claude's database analysis work would probably add another four to five days on top of that.
But I got all of that done over a weekend, as part of the $120 I spend monthly on Claude and ChatGPT. If I had to do it all alone, there would have been no way I could allocate two full months to nothing but mitigation coding.
Let me be clear, though. This kind of vibe coding is not sitting back and just letting the AIs do all the work. I'm constantly bouncing back and forth between AIs, working one while the other is thinking. I put in 12 hours on Saturday and at least eight more on Sunday. My active oversight and involvement were absolutely essential to the process.
During the process, Codex made numerous errors that needed to be fixed. Claude made one mistake that would have nuked my server, and at another point it wasted three hours going down the completely wrong rabbit hole, only to discover too late that you couldn't get there from here.
But do not underestimate the work that was accomplished. I produced a massive intervention in a weekend that otherwise would have taken a full-time dedicated programmer months.
There's also a big, intangible benefit. In years past, I've been alone when trying to solve server problems. Often, my servers have been running my own code, and dealing with attacks and issues fell on me to solve. Getting help would have involved a hefty consulting cost. Even if one or more of my friends had been willing and able to help, getting them up to speed would have taken days.
So I fought off attacks all alone. My wife can attest to how demoralizing and overwhelming that feeling was.
Not this time. This time, I was able to collaborate with my little team of Codex and Cowork. I cannot overstate what a big difference that is. Yes, they can be a bit obtuse. And yes, this wasn't what I planned to do over the weekend. And yes, there was stress because every hour without a solution meant more incursions.
But, in the end, I wasn't alone. I had two capable partners who were able to come up to speed in minutes. They helped me diagnose, develop, test, and deploy a massive intervention in two days. I asked questions and (mostly) got back constructive, helpful answers.
I asked Claude, "Can you provide a clarifying prompt for Codex?" and asked Codex, "Do you have any analysis assignments you want me to give Claude?" And they did. Each AI was made aware of the other on the team, and neither pushed back on working alongside a competing product.
I know we regularly talk about the job compression AI is causing, especially among programmers and writers. But as a solo practitioner who shoulders the entire tech debt burden of my tiny company (and the more than 20,000 server installations for my users), having the help of the AIs was eye-opening, even this far into generative AI.
I'm tired and a bit fried after a very intense weekend, but I came out of this attack experience with a feeling I've never, ever had while managing a cyberattack crisis before: it was actually fun.
How wild is that? It was actually fun.
You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, on Bluesky at @DavidGewirtz.com, and on YouTube at YouTube.com/DavidGewirtzTV.