Doc! Where've You Been?
FAIR INTEL

Start Here...
It's been a while since my last post. As I puzzled through the work, I realized I was missing a few pieces of my usual kit. One of them is a Threat Intelligence Platform (TIP). I did have an old Malware Information Sharing Platform (MISP) that I had maintained for several years. It contained everything from OSINT to CVEs to data breach information. I decided last year to decommission the server. What a colossal mistake. There were years of valuable information in it; bittersweet, to say the least. Sure, I had created the blog to showcase a workflow that transformed OSINT into the language of finance and risk, but once the work was done, the only artifact remaining was the one on the blog. Not good.
The blog does not provide any natural link analysis between artifacts. It doesn't connect the dots between threat actors, industries, or indicators, all things I would typically use when generating a threat report. History matters. However, the current workflow structure makes recording a single artifact more complicated. More steps. That sent me down the rabbit hole of finding a solution that reduced and streamlined the larger workflow enough for one person to realistically manage it. So where did that need lead?
TIPs, AI, Vibe Coding, and an MCP
AI
I'm going to start by discussing AI, which is at the heart of the journey. As many folks can see, much of the work posted to the blog is performed by AI. It would take me several hours to do the same work on a single OSINT artifact, whereas AI can do it in minutes. The language the AI uses and the results it produces are on par with, or similar to, what I would usually produce, so letting the AI do the work makes sense. The initial thought was to train or enrich an AI with all the artifacts I'd collected over time, making it a sort of threat expert. Publicly available AI is trained on publicly available information on its own cadence; enriching a model with my own artifacts would give me more control over when the AI was going to be "trained".
Puzzling through to find the "right" way to implement this idea led me to Retrieval-Augmented Generation, or RAG. Great! Attach a document, and the AI uses the information in it as part of its formulation; the more documents attached, the more information it has to work with. Then paranoia set in. I didn't want to use a public AI; no, I figured I needed to build a private one in the lab. Fantastic! Working with a small team, we were able to install a small version of DeepSeek and put it through some paces. It was slow, which concerned me, but what I discovered worried me the most: the token limit. Using one AI to troubleshoot another AI, I realized that RAG won't suffice for the vision in my head. See, no matter how many documents you upload to a project or chat, the token count will eventually grow to a point where the results become less accurate. And I certainly wasn't going to spend all of my time training an LLM. Now what?!
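Before moving on, here is a rough, purely illustrative sketch of what RAG is doing under the hood. This is not the DeepSeek setup we ran, and the keyword-overlap scoring is a toy stand-in for real embeddings, but it shows the pattern: pick the most relevant chunks, stuff them into the prompt, and note that every chunk added is more tokens burned.

```python
# Illustrative RAG retrieval sketch. Keyword overlap stands in for embeddings;
# the document store is a toy example, not my actual artifact collection.

def score(query: str, chunk: str) -> int:
    """Count how many query words appear in a chunk (a crude stand-in for
    cosine similarity over embeddings)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def build_prompt(query: str, documents: list[str], top_k: int = 3) -> str:
    """Pick the top_k most relevant chunks and stuff them into the prompt.
    Every chunk added here consumes context-window tokens."""
    ranked = sorted(documents, key=lambda chunk: score(query, chunk), reverse=True)
    context = "\n---\n".join(ranked[:top_k])
    return (
        "Use only the context below to answer.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

documents = [
    "APT-X targeted the finance sector using phishing lures in Q3.",
    "CVE-2024-0001 affects an unpatched VPN appliance.",
    "A data breach exposed customer emails at a retail chain.",
]
print(build_prompt("Which actor targeted finance?", documents))
```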
Luckily, I'm a massive fan of Feedly. All my information is initially collected and triaged via Feedly. And lately, with all this time on my hands, I have the time to attend more webinars. Well, during one Feedly webinar, I learned about connecting AI to tools via the Model Context Protocol (MCP), a protocol that connects AI models to local data, tools, and API functions. Using GPT, I puzzled through the token dilemma only to come to the point where I had two choices: 1) create a private LLM and train it, or 2) connect an AI model to an MCP server, which then connects to a massive repository of cyber threat information. Knowing I didn't even want to scratch the surface of LLM training, I went with option 2.
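To make that concrete: an MCP server is just a small program that exposes "tools" the model is allowed to call. A rough sketch of what one looks like with the official MCP Python SDK (the `mcp` package and its FastMCP helper); the tool here is a throwaway stub, not my actual server:

```python
# Minimal MCP server sketch using the official Python SDK's FastMCP helper.
# The tool below is a placeholder to show the shape, not my production code.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("threat-intel-demo")

@mcp.tool()
def count_indicators(keyword: str) -> str:
    """Return a canned count of indicators matching a keyword (stub)."""
    # A real tool would query a TIP here; this stub just demonstrates the plumbing.
    return f"Found 0 indicators matching '{keyword}' (stub data)."

if __name__ == "__main__":
    # Claude Desktop (or another MCP client) launches this process and talks
    # to it over stdio by default.
    mcp.run()
```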
TIPs, MCP, and Vibe Coding
Serendipity! A quick internet search showed that folks have built MCP servers that interact with Claude and OpenCTI. Easy, peasy, cheesy. Build the OpenCTI server and begin ingesting data; download a paid version of Claude Desktop; and install the MCP server from GitHub. Everything was set up. I was so excited. I could now search a threat database in natural language and perform other analyses all in one spot. Holy cow! I ran the first test to check whether the MCP server was running. It was! Now to test the search. Error! Try a different search command. Error! No! This can't be!
I asked Claude to determine the problem, and we worked through a few troubleshooting steps. We found that the MCP server code was non-functional. What?!?! Claude then presented me with three solutions. The first was to contact the original developer. No, this was not an option; I'm past waiting on others to fix a problem. The second was to download all the code and fix each file individually. Also a "nope". There were at least two dozen files, and I wasn't going to spend the time troubleshooting that code; not to mention, and this is really important... I'm not a coder. The third option was to create an MCP server from scratch with Claude.
None of the options presented was perfect. All of them had their pros and cons, but the cons outweighed the pros. After a small chat with Claude about my concerns with the options presented, I decided to go with option 3 - build a server from scratch with Claude. Claude assured me that we could do this, so I started really small. I asked Claude to create small scripts so that I could test its coding ability, and you know what, the samples worked. For example, a small script that grabbed the first five indicators in the database. Once I was convinced, we were off to the races. Holy cow! I'm Vibe Coding! What is Vibe Coding? I didn't know either, but Vibe Coding can be described as a development approach in which AI generates functional code from natural-language prompts, allowing developers to focus on high-level goals. So now I'm an AI-enabled Script Kiddie. I got a good chuckle out of that.
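For reference, the "first five indicators" test looked roughly like this (sketched here with the pycti client; the URL and token are placeholders for whatever your OpenCTI instance uses):

```python
# Sketch of the "grab the first five indicators" test, using the pycti client.
# URL and token are placeholders; point them at your own OpenCTI instance.
from pycti import OpenCTIApiClient

api_url = "http://localhost:8080"   # placeholder OpenCTI URL
api_token = "CHANGE-ME"             # placeholder API token

client = OpenCTIApiClient(api_url, api_token)

# Pull the first five indicators and print their names and patterns.
indicators = client.indicator.list(first=5)
for indicator in indicators:
    print(indicator.get("name"), "->", indicator.get("pattern"))
```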
However, I've made some good strides. I can query reports, look up indicators associated with the artifacts I'm working on, perform statistical analysis on the results, and generate reports at various levels; it's pretty cool. All done with natural language. For example, I can prompt "get me the latest reports from the past 12 hours," and I get a summary. I then pivot to a different prompt and ask for a SOC briefing report. Work that would have taken a couple of hours is now done in minutes, all in natural language. Fact-checking is easy. I navigate the OpenCTI server directly.
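Under the hood, a prompt like "get me the latest reports from the past 12 hours" just ends up calling a tool along these lines. This is a hedged sketch combining FastMCP and pycti, not my exact tool; the connection details are placeholders, and the `published` field and ordering arguments reflect how OpenCTI reports are typically exposed through pycti.

```python
# Sketch of an MCP tool that answers "latest reports from the past N hours".
# Connection details are placeholders; this mirrors the idea, not my exact code.
from datetime import datetime, timedelta, timezone

from mcp.server.fastmcp import FastMCP
from pycti import OpenCTIApiClient

mcp = FastMCP("opencti-reports")
client = OpenCTIApiClient("http://localhost:8080", "CHANGE-ME")  # placeholders

@mcp.tool()
def latest_reports(hours: int = 12) -> str:
    """Summarize OpenCTI reports published within the last `hours` hours."""
    cutoff = datetime.now(timezone.utc) - timedelta(hours=hours)
    # Grab the most recently published reports, newest first, filter locally.
    reports = client.report.list(first=50, orderBy="published", orderMode="desc")
    recent = []
    for report in reports:
        published = report.get("published")
        if not published:
            continue
        when = datetime.fromisoformat(published.replace("Z", "+00:00"))
        if when >= cutoff:
            recent.append(f"- {report.get('name')} ({published})")
    return "\n".join(recent) or f"No reports published in the last {hours} hours."

if __name__ == "__main__":
    mcp.run()
```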
What Did I Learn?
Vibe Coding Is Great Until It's Not
I love Vibe Coding. I created an application in R during my dissertation. It took me months. If AI had been around and I could have Vibe Coded much of it, I probably would have, and saved myself a ton of time. However, this method still relies on tokens and interpretation. If I get the prompt wrong or Claude doesn't fully understand the goal, I end up with something that doesn't work. Moreover, without instructions that explicitly constrain Claude during code generation, there's a good chance something else gets changed in the code without my being notified, which is a significant security concern. So, there are things to watch out for when Vibe Coding, and I'd recommend saving several copies of previous versions so that you can roll back when needed. Standard guidance, but during my sessions I found that, because I could direct Claude to churn out tons of code quickly, I forgot about standard practices and just went with the "vibe" of creation. Will I continue down this path? Yes.
AI Absolutely Struggles With Repetition
The blog is created from a very large, multi-page prompt that took me several days and iterations to lock in. After building the feature that lets me search through reports via the MCP, I decided to automate the prompt via natural language.
Major fail!
So much stress and disappointment. For whatever reason, Claude could not lock in a repeatable process. Truthfully, even when I run the commands I already have in place, there's some deviation, but it's negligible, identifiable, and the results are still valuable. However, when attempting to "automate" the FAIR analysis, the results were all over the place. I consulted the AI for solutions. One solution was to use skills files and gold-standard references. Those only worked to a point. It then suggested I use working sets, which are Python-based workflows in and of themselves. Those did not work. It began generating code to run the code that was already generated. I then attempted to embed the prompt into a Python script that was called by an MCP server tool. It couldn't reliably fetch data from the supplied URL. It was just a mess. So, I confronted Claude and GPT about this problem. Here's a summary of what GPT had to say about the issue, based on the conversation I had with Claude:
Attempts to build repeatable, sustainable workflows using natural-language instructions expose a fundamental limitation of instruction-driven AI. Even when “skills files” or similar mechanisms explicitly declare mandatory behavior, the model can still deviate if it believes an alternative approach would be more helpful. As the exchange illustrates, this is not a matter of poor prompting or incomplete guidance: skills files work most of the time, but they fail when the model recognizes another viable method and nothing technically prevents it from choosing that path. This reveals why instruction-based control is inherently unreliable on its own and why stronger enforcement mechanisms—such as hard system gates, step verification, and explicit capability restrictions—are required for deterministic outcomes. At its core, the problem is structural: if an AI can override explicit instructions based on its own judgment, natural-language control alone cannot guarantee repeatable processes.
And that's the gist of it. I had created several explicit gates, and it still would not adhere to the structure. So, I passed on that. I now know there's a hard boundary, and that boundary is performing complex, repeatable instructions. My fallback has been to develop a prompt that runs on information copied and pasted into a chat, since fetching the information directly also proved unreliable (e.g., blockers that prevent scraping, token limits on large-volume websites, and misinterpretation of the information). If that's where I'm at, then that's still ahead of the curve.
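For what it's worth, the "hard system gate" idea GPT described is something you enforce in code rather than in the prompt. A minimal, purely illustrative sketch of the concept (not what I have deployed): the pipeline refuses to accept a model's output unless it passes a deterministic check, no matter how reasonable the deviation sounded.

```python
# Illustrative "hard gate": the pipeline, not the prompt, decides whether a
# model response is acceptable. Nothing here is my deployed workflow.
import json

REQUIRED_FIELDS = {"threat_actor", "industry", "loss_magnitude"}

def passes_gate(raw_response: str) -> bool:
    """Deterministic check: the response must be JSON containing every
    required field. The model's own judgment never overrides this."""
    try:
        parsed = json.loads(raw_response)
    except json.JSONDecodeError:
        return False
    return REQUIRED_FIELDS.issubset(parsed)

def run_with_gate(generate, max_attempts: int = 3) -> dict:
    """Call the model (the `generate` callable) until its output passes the
    gate, or fail loudly instead of accepting a 'helpful' deviation."""
    for attempt in range(1, max_attempts + 1):
        response = generate()
        if passes_gate(response):
            return json.loads(response)
        print(f"Attempt {attempt}: output rejected by gate, retrying.")
    raise RuntimeError("Model never produced output that satisfied the gate.")

# Toy stand-in for a model call so the sketch runs end to end.
fake_outputs = iter([
    "not json at all",
    '{"threat_actor": "APT-X", "industry": "finance", "loss_magnitude": "high"}',
])
print(run_with_gate(lambda: next(fake_outputs)))
```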
Where Am I Going?
So that was a lot to learn over the past few weeks. Where I'm currently heading is poring back over the prompt that reframes the OSINT. I found some inconsistencies in the math that need to be addressed, which led to the creation of the "Rubrics" section of the site. The rubrics are tremendous and can be helpful to those trying to quantify risk, providing a grounded set of information to base their own calculations on. Once that's completed, I am considering standing up a MISP instance. Why MISP when you have OpenCTI? While OpenCTI is a great tool, it is cumbersome for recording large bodies of information. For example, I can create a report, but then have to upload the indicators separately, and then go back to the report to attach the indicators. That's three steps for one item, whereas MISP lets me work through the entire dataset via a single event interface. From there, I'll connect the MCP server to MISP as well so that I can pull in both datasets and compare them. Eventually, I will get back to publishing more reports and RASE work, but for now, my time is better spent developing my work environment as opposed to producing more work.
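To show what I mean by "a single event interface", here is a hedged PyMISP sketch: the report-level context and its indicators go into one event object and one API call, instead of the create-report, upload-indicators, attach-indicators round trip. The URL, key, and sample values are placeholders, not real infrastructure or intel.

```python
# Sketch of the MISP single-event workflow with PyMISP. URL, key, and the
# sample indicators are placeholders.
from pymisp import MISPEvent, PyMISP

misp = PyMISP("https://misp.local", "CHANGE-ME", ssl=False)  # placeholders

event = MISPEvent()
event.info = "Phishing campaign targeting finance sector"  # report-level context
event.add_tag("tlp:amber")

# Indicators ride along in the same object instead of a separate upload step.
event.add_attribute("domain", "login-example-bank.test")
event.add_attribute("ip-dst", "198.51.100.7")
event.add_attribute(
    "sha256",
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
)

# One call records the whole dataset: context, tags, and indicators together.
misp.add_event(event, pythonify=True)
```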
Cheers~!
Doc J