In this article we’re going to talk about using A.I. with OSINT. We’re going to touch on some ways that we’ve used A.I. with OSINT to test things out or for live client engagements.
Before getting into the main topic, a few important things.
1. When using A.I. with OSINT, cybersecurity, or anything where sensitive information is involved, never input anything that identifies your work, your client, or your target into an A.I. application. Whatever you enter may become part of the datasets that the algorithms train on. This is both a privacy and security issue that we’ll touch on later.
2. If you’re exploring OSINT for the first time, whether you want to learn what information exists about you (known as an “ego search”) or you have a work project where you need to look things up, first get comfortable with the basics. The basics, to us, are using search engines, finding data sources, verifying information, and learning how to find tools that suit your needs. There are other fundamentals you’ll need to learn, but the ones we just mentioned are good first steps.
3. With regard to A.I., one thing you must do is verify the output if you’re using an LLM (Large Language Model). These applications are prone to hallucinations, where the model produces incorrect or misleading information. This is why you need to verify your output and why you need to build your verification skills in general.
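To make the first point a little more concrete, here is a minimal Python sketch of a pre-flight check you could run on a prompt before pasting it into an A.I. application. The `PATTERNS` table and the `redact` function are hypothetical names we made up for illustration, and the regular expressions only catch a few obvious formats (URLs, email addresses, IPv4 addresses, phone numbers). They will not catch names, case details, or anything contextual, so treat this as a safety net on top of manual review, not a replacement for it.

```python
import re

# Hypothetical patterns for illustration only. Order matters: URLs and IPv4
# addresses are replaced before the looser phone pattern can match them.
PATTERNS = {
    "URL": re.compile(r"https?://\S+"),
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each pattern match with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    draft = "Contact jane.doe@example.com or 203.555.0100 via https://example.com/p?id=1 from 10.0.0.5"
    print(redact(draft))
```

Anything the sketch flags is a sign the prompt was too specific to begin with. Rewriting the question in generic terms is usually the better fix.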
Now let’s get into how we’ve used A.I. with OSINT. Hopefully this gives you some ideas that will inspire you to explore.
Learning and Creating Challenges
If you’re just starting out, or are more advanced in the discipline, you can use an LLM and ask it to create a learning plan based on your experience level. You can even use the LLM to create OSINT challenges for you. This will help you put your newly learned skills to the test.
Generic topic of research
Whether you’re just learning or you’ve been hired to look into something, you may not be familiar with the topic or industry your target is associated with. In that case, you can ask your LLM to give you key points about the subject matter in general. From that output, you can drill down further to learn more. You’re not entering information that identifies the target. Instead, you’re learning adjacent material that can help you construct your search strategy and select your tools. Again, we caution you to verify that the output is accurate and to make sure you aren’t using any identifiable information about the target.
Translations
If you find something in a language you aren’t familiar with, you can ask your LLM to translate text into the language you know best. This is tricky, as you need to avoid using anything directly from your target. If you’re playing a CTF (Capture the Flag) game and you come across a language you aren’t familiar with, the risk of something being sensitive in the text is likely low, so using the LLM in this instance is less risky. There’s always Google Translate if you want to avoid the use of an LLM.
Strategy
We guarantee that in your search you’ll eventually run into dead ends, to the point where your investigation seems to be done. When we run out of leads, we’ll turn to an LLM to see if it comes up with a unique strategy for finding something that moves the investigation along. For example, we worked on a case and weren’t finding anything at one point, so we asked ChatGPT “what are some ways to find someone’s social media account when you have little information to go on?” which generated a list of things for us to look at. Then we drilled down even more by asking the bot “what social media platforms do [insert demographic] use the most?” and we followed up with “same question, but what lesser-known platforms are used?” This generated even more places to explore, some of which we never knew existed. It allowed us to continue digging, even if nothing turned up, because you don’t know what’s out there if you don’t look. The best part of doing this is that no directly identifying information was used to generate the output. This is a very strategic use of A.I. Not only does it help identify more sources, but it also helps with planning new keywords to search and finding potential tools to help you out.
Scripting
Sometimes we might want, or need, to create a script to help collect and parse through publicly available information. ChatGPT is pretty good at that, but you need to test the script to make sure it works. If you’re doing something like this, you need at least a basic understanding of coding, because it makes troubleshooting easier and makes your prompts more precise when it comes to tweaking your script. As in the “Generic topic of research” section, when we use an LLM to create a script, we keep the request very generic. Usually we know what we want, but there’s one portion of the code that we need help building or troubleshooting. That’s where we’ll ask the chatbot something like “What libraries allow the script to do X?” As we’ve repeated multiple times now, avoid entering sensitive info into an LLM, in this case proprietary code, as it may end up as part of the datasets the A.I. algorithm trains on. It may also violate your employer’s Information Security policies.
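As an illustration of how generic such a script can be, here is a minimal Python sketch of the kind of helper we mean: it pulls the links out of a page’s HTML and groups them by domain. It uses only the standard library, and the names `LinkCollector`, `extract_links`, and `group_by_domain` are hypothetical ones chosen for this example, not something ChatGPT or any particular tool produces. It also assumes you already have the HTML as a string, so nothing here touches the network or mentions a target.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html: str, base_url: str) -> list:
    """Return the unique links found in the HTML, in first-seen order."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return list(dict.fromkeys(parser.links))


def group_by_domain(links: list) -> dict:
    """Group absolute URLs by their domain (netloc)."""
    grouped = {}
    for link in links:
        grouped.setdefault(urlparse(link).netloc, []).append(link)
    return grouped


if __name__ == "__main__":
    sample = '<a href="/about">About</a> <a href="https://example.org/x">Other</a>'
    print(group_by_domain(extract_links(sample, "https://example.com")))
```

A script like this is small enough to read line by line, which is the point: if an LLM helps you write one piece of it, you can still test and verify every part yourself.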
Conclusion
To conclude, A.I. does have some benefits. If you’re just exploring OSINT for fun and you’re learning new skills, try the tools out.
Once you make the leap from exploring to doing client work, you need to understand the risks of using an A.I. application. Once you understand the risks, you can set boundaries on how you use A.I. and decide what you are prohibited from doing, such as anything that might jeopardize your client, or yourself, if the output is ever leaked, if the A.I. platform is compromised, or if the tool you’re using constantly hallucinates.
To learn more about the services Bsquared Intel provides for your business, yourself, your law firm, or your educational needs, fill out the contact form below.