DEW #160 - Quant + Detection Engineering, Fable and Mythos banned & Detections for Claude Compliance API
gimme the (detection) loot
Welcome to Issue #160 of Detection Engineering Weekly!
✍️ Musings from the life of Zack:
I swear to God, maybe it’s because I’m in my 30s, but I am noticing so much more plant and flower life everywhere! I saw a huge bunch of flowers on the side of my yard, and when I asked Seek to classify them, they turned out to be blackberry bushes! Now I have blueberries and blackberries to pick when they start to produce :D
I am taking the next week off for much-needed PTO, so the next DEW issue will go out on July 1!
I am gauging interest for a Detection & Response Happy Hour during Black Hat week. Something lowkey, on the strip, where folks in D&R can meet without being barraged by vendors. If you are interested, hit the “Yes” option on the poll below so I can figure out seat count. I’m aiming for Tuesday night before the craziness starts!
💎 Detection Engineering Gem 💎
Detection Engineering’s Quant Era by Gabriel Abdelgawad
For those unfamiliar with the term Quant, it’s short for Quantitative Analyst. These analysts apply rigorous statistical methodologies to financial trading and sit at the intersection of business, mathematics, and market uncertainty. I’ve met several throughout my career, especially during my MBA. They enable massive trading firms on Wall Street to manage large trading portfolios that account for all kinds of risk. And, as we know about risk in security, they try to understand the uncertainty of events in the world, such as the Iran War, to find arbitrage opportunities or hedge against risks to minimize loss when a disaster happens. Sound familiar?
In this post, Abdelgawad surveys the history of quantitative analysis as it evolved from paper to computer spreadsheets, and now high-frequency trading. He compares and contrasts this evolution with that of security operations, especially given current AI capabilities. According to Abdelgawad, the human-capital cost of both writing rules and triaging alerts is dropping. A well-harnessed LLM can perform research, learn your detection stack, and build rules at a faster velocity than a human can. That same agentic system can triage alerts, analyze the event, and present a D&R engineer with its decision and reasoning.
So the question becomes: what do we do when the expensive parts are no longer typing rules or grinding through every alert by hand? Abdelgawad’s answer is that we are not being replaced, but we are being pushed up a layer. The work starts to look less like hand‑building each detection and more like portfolio management: deciding which rules belong in the book, which are brittle, which really work in your environment, and where the blind spots in the overall method are. He frames this “squeeze” with an analogy to the front, middle, and back offices in finance.
Once the front office (authoring) and the back office (false-positive minimization and risk acceptance) become cheap, the middle office becomes the constraint you optimize to achieve success in detection and response. So your “book” is your ruleset, detection pipeline, incident response playbooks, and your knowledge bases. To draw another comparison, the ship has sailed for writing most of your code by hand, as Claude Code has taken over. The same logic applies to detection: if agents can draft rules and triage alerts, the valuable human work is not clinging to manual authoring; it is managing the portfolio and the risk it represents.
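To make the portfolio framing concrete, here is a minimal sketch of reviewing a rule "book" by triage outcomes. It is not from Abdelgawad's post: the `RuleStats` type, the `review_book` function, the rule names, and the 0.2 precision threshold are all hypothetical, and a real review would weigh coverage and severity as well as precision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RuleStats:
    """Triage outcomes for one detection rule over a review window (hypothetical shape)."""
    name: str
    true_positives: int
    false_positives: int

    @property
    def alerts(self) -> int:
        return self.true_positives + self.false_positives

    @property
    def precision(self) -> Optional[float]:
        # Precision is undefined for a rule that never fired.
        return self.true_positives / self.alerts if self.alerts else None


def review_book(rules, min_precision=0.2):
    """Sort rules into keep / tune / silent buckets (threshold is an assumption)."""
    report = {"keep": [], "tune": [], "silent": []}
    for rule in rules:
        if rule.alerts == 0:
            # Never fired: could be a blind spot or a broken log source.
            report["silent"].append(rule.name)
        elif rule.precision < min_precision:
            report["tune"].append(rule.name)
        else:
            report["keep"].append(rule.name)
    return report
```

The "silent" bucket is arguably the interesting one: a rule that never fires is either covering a quiet corner of the environment or is quietly broken, and only a human (or a test) can tell which.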
I want to keep reiterating that the cost has shifted, and I’m hopeful because it frees up more time for detection engineers to focus on the important research and engineering parts of our job. Here are some ways I’ve seen this implemented already at my day job, where I run an org with dozens of security engineers doing detection & response:
Teams move more into the threat hunting space, where they spend more time discovering gaps in coverage, telemetry, and infrastructure
Projects emerge that help monitor drift in pipelines and telemetry and will flag when a rule or log source becomes unreliable
Detection & response engineers spend more time with threat intelligence teams to help understand the world outside of the company
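The drift-monitoring item in the list above can be sketched in a few lines. This is only an illustration under assumptions: the function name, the per-source daily event counts, the 7-day window, and the 50% threshold are all made up for the example, and a production monitor would also account for weekly seasonality.

```python
from statistics import median


def flag_unreliable_sources(daily_counts, min_ratio=0.5, min_history=7):
    """Flag log sources whose latest daily volume dropped below
    min_ratio of their trailing median (thresholds are assumptions)."""
    flagged = {}
    for source, counts in daily_counts.items():
        if len(counts) <= min_history:
            continue  # not enough history to form a baseline
        *history, latest = counts
        baseline = median(history[-min_history:])
        if baseline > 0 and latest < min_ratio * baseline:
            flagged[source] = (latest, baseline)
    return flagged
```

Using the median rather than the mean keeps a single noisy day in the history from skewing the baseline.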
This has been one of my favorite posts to read in months. I highly recommend taking time to read and comprehend Abdelgawad’s narrative around this topic because we are going to be expected to take on more with the help of agentic tooling.
🔬 State of the Art
Statement on the US government directive to suspend access to Fable 5 and Mythos 5 by Anthropic
About a week after releasing the “Mythos-class” models Fable 5 and Mythos 5, Anthropic pulled access to them from all customers as well as from non-U.S. citizens inside Anthropic. This blog post explains why: the U.S. Government issued an export control directive requiring Anthropic to remove access for non-U.S. citizens.
A lot of work goes into releasing these models, including extensive internal and external red teaming to battle-test the jailbreak-prevention defenses that foundation model labs add to them. According to Anthropic, they followed their normal testing procedures and added even more to make sure that the guardrails held up against adversarial prompts. They explained the intent behind the testing on their blog, since Mythos has been touted as an advanced, cybersecurity-capable model.
The blog reads as if they disagree with the directive altogether. According to the post, the government’s only evidence was a single report of a jailbreak. This isn’t Anthropic’s first clash with the current Administration: the Secretary of Defense previously threatened to label them a supply chain risk.
My personal opinion is that this is likely a stretch of an argument by the U.S., and I think this ban will be lifted in the coming months. That being said, Anthropic has leaned into marketing their Mythos model as something that should be regulated, so I hope they aren’t too surprised that something like this happened.
Detecting Misuse with the Claude Compliance API: The Threat Is in the Content by Andrew Byford
I’m excited to see investment from detection engineers and researchers investigating how audit logs and compliance APIs work for AI tools like Claude & OpenAI. The problem with this particular threat surface, as Byford writes in this blog, is separating the threat model into SaaS detections and prompt-and-response detections. These APIs contain audit events similar to what we see across the industry: permissions or API keys being added to an account, MFA devices registered or removed and suspicious logins. But, if the value of these tools is token usage, how do you find threats in unstructured data?
Byford’s solution to this, which he also open-sourced, is a pipeline that splits events into control plane events under “Activity Feed” and content events. SIEMs, which are built on structured matching against structured data, perform poorly when matching on unstructured, non-deterministic data. This is a situation where the problem (unstructured token usage) and the solution (unstructured token usage) are identical. Byford’s content pipeline uses a combination of prefiltering tools and an LLM judge to pick out interesting or malicious prompts before they ever hit a SIEM.
I do see some scale issues with this, but only in the sense that we will all likely face tradeoffs on volume vs precision:
The cost of indexing prompts is shifted from the SIEM to the LLM judge. For every prompt your company generates, you need to use another prompt to evaluate its content
Byford calls out the Judge being susceptible to attacks itself, so it’s important to engineer additional guardrails
Privacy concerns around inspecting content in the prompt and uploaded documents themselves. Depending on your jurisdiction and compliance requirements, you may not get 100% visibility on every prompt and response. (Unless you are American, lol privacy)
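The prefilter-then-judge shape, and the cost tradeoff in the first bullet above, can be sketched as follows. This is not Byford's implementation: the patterns, the `triage_prompts` function, and the `judge` callable (a stand-in for an LLM call) are hypothetical. The point is only that a cheap deterministic pass decides which prompts are worth paying the judge to read.

```python
import re

# Hypothetical prefilter patterns; a real deployment would tune these.
PREFILTER_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"\b(exfiltrat\w*|ransomware|keylogger)\b", re.IGNORECASE),
]


def prefilter(prompt):
    """Cheap deterministic pass: True if any pattern matches the prompt."""
    return any(p.search(prompt) for p in PREFILTER_PATTERNS)


def triage_prompts(prompts, judge):
    """Send only prefiltered prompts to the judge, and return the prompts
    the judge flags plus how many judge calls were spent.

    `judge` is any callable(prompt) -> bool; here it stands in for an LLM.
    """
    forwarded = []
    judge_calls = 0
    for prompt in prompts:
        if not prefilter(prompt):
            continue  # never reaches the judge or the SIEM
        judge_calls += 1
        if judge(prompt):
            forwarded.append(prompt)
    return forwarded, judge_calls
```

The tradeoff is visible in the `continue`: every prompt the prefilter drops saves a judge call, but it is also a prompt nobody ever looks at, so prefilter recall becomes the ceiling on what the whole pipeline can detect.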
Detecting and removing dangerous secrets on dev workstations before Shai-Hulud does by Guillaume Ross
This blog by Ross presents a practical pattern for finding and removing secrets on dev workstations before infostealers or open-source supply chain attacks from groups like TeamPCP can steal them. They vibe-coded a proof-of-concept architecture that combines a secrets scanning tool, bagel, with Fleet and osquery. You can deploy the scan to run periodically via a LaunchAgent on macOS that runs bagel, which creates a findings JSON file.
Fleet then launches an osquery rule to detect secrets in the user’s home directory, sends alerts to Slack, and even instruments a response action to isolate the user’s access to your IdP. There’s a good defense-in-depth story here: you enforce controls before a developer gets infected, and you hope your EDR catches the infection before it’s too late.
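As a rough illustration of the "findings file drives a response" step, here is a sketch that decides which findings warrant action. The JSON layout is an assumption made for this example and is not bagel's actual output schema, which the source does not show: the `findings` key, the `type` and `path` fields, and the type names are all invented.

```python
import json

# Hypothetical finding types that would justify a response action.
BLOCKING_TYPES = frozenset({"cloud_credential", "ssh_private_key"})


def parse_findings(raw_json):
    """Return the list under the (assumed) 'findings' key, or [] if absent."""
    return json.loads(raw_json).get("findings", [])


def findings_requiring_response(findings, blocking_types=BLOCKING_TYPES):
    """Keep only findings whose (assumed) 'type' field is in blocking_types."""
    return [f for f in findings if f.get("type") in blocking_types]
```

Keeping the "what counts as dangerous" set separate from the parsing makes it easy to start strict on a few secret types and widen the list as developers clean up.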
☣️ Threat Landscape
Ransomware Tool Matrix Project Updates: Three Groups To Track by Will ‘BushidoToken’ Thomas
Threat research G.O.A.T. BushidoToken released an update to his ransomware tool and vulnerability matrix dataset. He added three emerging groups: TheGentlemen, DragonForce, and WarLock. It’s always difficult for me to stay abreast of TTPs in these groups since they tend to move fast, disappear and re-emerge. The coolest part of this post, IMHO, is the callout on the leaked chats from TheGentlemen and how they helped researchers understand the group's inner workings.
Phishing for Lobsters: How We Tricked OpenClaw into Spilling Secrets by Itay Yashar
In this post, Varonis Security Researcher Itay Yashar set up a simulated enterprise environment on Google Workspace and gave an OpenClaw agent access to it. The clever idea here was to test the difference between an agent inadvertently executing malicious instructions via prompt injection versus what Yashar calls “agent phishing.” I think this is an important callout because agent phishing challenges the guardrails with a legitimate-looking request that carries no hidden artifacts, even though it comes from a malicious source. Prompt injection, by contrast, generally involves a malicious request delivered through a hidden artifact.
They ran four scenarios, and the results were interesting: the agent tended to favor resolving issues over verifying the sender's identity and security. They also implemented a “strict” verification profile for the agent that explicitly required it to verify identities. In multiple scenarios, a fake email was enough to get cloud credentials, secrets, and CRM exports. It just goes to show that guardrails are more than just prompts, and you should treat prompts and instructions as a security boundary in themselves.
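One way to read "guardrails are more than just prompts" is to put an identity check in code, outside the model, in front of sensitive tool calls. The sketch below is an assumption-laden illustration and not anything from the Varonis research: the tool names, the trusted domain, and the `authenticated` flag (standing in for whatever sender authentication your mail stack provides) are hypothetical.

```python
from email.utils import parseaddr

# Hypothetical policy values for illustration only.
TRUSTED_DOMAINS = {"example.com"}
SENSITIVE_TOOLS = {"export_crm", "read_secret"}


def allow_tool_call(tool, requester_header, authenticated):
    """Gate sensitive tool calls on the requester's identity.

    tool: name of the tool the agent wants to call.
    requester_header: raw From header of the email that triggered the request.
    authenticated: True only if the sender passed an out-of-band check
        (an assumed signal; the agent's own judgment does not count).
    """
    if tool not in SENSITIVE_TOOLS:
        return True
    _, address = parseaddr(requester_header)
    domain = address.rpartition("@")[2].lower()
    return authenticated and domain in TRUSTED_DOMAINS
```

Because the check runs outside the model, a persuasive email cannot talk its way past it; the agent can still be helpful for everything that is not on the sensitive list.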
Caught a ClickFix attack today. The domain name alone made me do a double take. (Reddit post) by MoneySaxena
I’ve been checking out Reddit posts more lately for raw analysis and commentary on detection and the threat landscape, and this one on ClickFix struck a good balance between technical depth and a conversational tone. Microsoft Defender fired an alert about a potential ClickFix compromise, and MoneySaxena wrote about their experience triaging the alert and containing the host.
They called the user and tried to understand how they managed to visit a site, copy and paste a malicious command, and then execute it. She said she was “just browsing normally”. This is something I see so many security teams deal with in their day-to-day, so it’s not too surprising that the user was browsing in a benign way and got infected.
This was my favorite quote from their write-up:
The thing that gets me about ClickFix attacks is how simple the social engineering is. There’s no phishing email to analyse, no malicious attachment to sandbox. The user is just browsing a normal website and something on the page tells them to paste a command. The command itself looks like gibberish. Most people have no reason to know what rundll32 is or why a website would need them to run it.
Kind of creepy OSINT-based IP camera crawler. It has 8k+ publicly facing IP cameras that display images and/or video feeds. You can filter across countries, ISP, keywords, and console mode, which looks like Omegle.
🔗 Open Source
BushidoUK/Ransomware-Tool-Matrix & BushidoUK/Ransomware-Vulnerability-Matrix
BushidoToken’s updated Ransomware Tool & Vulnerability matrices from the blog post featured in Threat Landscape above. These are fantastic resources for detection research and creation, especially if you feed your environmental context, ruleset and enrichment from these matrices into an LLM.
Noradrenaline is a set of small offensive shared‑library modules for macOS and Linux meant to be plugged into Poseidon and other post-exploitation agent frameworks. IMHO, this is a great set of capabilities for a detection engineer to test their EDR and detection rules on macOS. I hope someone does Atomic Red Team for macOS soon so this becomes easier and easier!
PaperMtn/claude-enterprise-detections
Andrew Byford’s detection ruleset from the Claude Compliance API research featured in State of the Art above. It contains the full architecture of detection rules, pipelines, judge integration, and pre-filtering.
Kipi is a self-hosted OSINT platform that uses Claude as its analysis backbone. It’s similar to SpiderFoot in many ways, but I don’t see as many one-off modules or scripts, since it lets Claude make tool calls like API or HTTP requests. The cool part here is how Claude builds relationships via a graph and presents it to you while performing its analysis.



