Why Muay Thai Is Like a Security Investigation

Peter Naudus
April 5, 2024

I used to have a Muay Thai coach who had a lot of favorite sayings. One that particularly stuck with me was, "You won't rise to the occasion. Instead, you will fall to the level of your training." That parallel is definitely applicable to security. When it's "go" time, the pressure is on, the stakes are high, and you don't want to scramble to figure out your tools. Instead, you want to be able to grab what you have on hand and be effective during a security investigation when the clock is always running.

Prepping Before Your Security Investigation

In the spirit of training, let's take a simple example that you are likely to encounter, and assess your available tools. Imagine a user in your company reports that they might have been phished. You do what any responsible professional would do. You quickly lock down the user's account, change their passwords, revoke active sessions, and rotate credentials to prevent further damage.

But you're not satisfied with just that. You want answers to the question eating away at you: “Did an attacker use stolen credentials to access anything?” Let's think through how you might use the tools at hand to answer this deceptively simple question during a security investigation.

I can't cover every possible technology stack, but for this example, let's focus on a simple tech stack using Okta, Google Workspace, GitHub, and AWS. As we go through examples, I'll be referencing their APIs. Whether you are visiting the consoles of each technology, leveraging a SIEM (Security Information and Event Management), running an ELK stack (Elasticsearch, Logstash, and Kibana), or using a hand-rolled data lake, the data you have available is ultimately defined by the endpoint API.

With that introduction out of the way, let’s dive into an overview of the tools. There are many layers of detail, but this section outlines a high-level playbook to get you started.

Steps for Okta

  1. First, read up on how to create filters on Okta. For example, to view a particular user by the email address you’d use:  filter=actor.alternateId eq "".
  2. Use this filter in the System Log API.
  3. Look at the Event types to see the types of events coming back so you can decide which ones you are interested in and which ones are just noise. You should probably look through each one, because even mundane ones such as user.authentication.sso must be analyzed to see if they originate from a location other than the user's normal location.

For example, here’s me using curl to pull my own logs:

curl 'https://[YOUROKTAHOST]' \
   	--url-query 'filter=actor.displayName eq "Peter Naudus"' \
   	--header "Authorization: SSWS $YOUR_OKTA_TOKEN" > events-okta.json

Then you can open up events-okta.json in your JSON viewer of choice.
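If scrolling through a viewer gets tedious, step 3 above can be roughed out in a few lines of Python. This is only a sketch: `summarize_okta_events` and `load_events` are hypothetical helper names, and the code assumes the file holds the JSON array the System Log API returns, with each entry carrying an `eventType` and, when Okta could resolve one, a `client.geographicalContext`.

```python
import json
from collections import Counter, defaultdict


def summarize_okta_events(events):
    """Tally event types and collect the locations each type was seen from.

    Hypothetical helper; assumes Okta System Log event objects.
    """
    type_counts = Counter()
    locations_by_type = defaultdict(set)
    for event in events:
        event_type = event.get("eventType", "unknown")
        type_counts[event_type] += 1
        # Geo data can be missing or null, so fall back to an empty dict.
        geo = (event.get("client") or {}).get("geographicalContext") or {}
        locations_by_type[event_type].add((geo.get("country"), geo.get("state")))
    return type_counts, dict(locations_by_type)


def load_events(path="events-okta.json"):
    """Load the JSON array written by the curl command."""
    with open(path) as f:
        return json.load(f)
```

Calling `summarize_okta_events(load_events())` returns the counts plus a map from each event type to the set of (country, state) pairs it came from, which makes an unfamiliar location stand out quickly.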

Steps for Google Workspace

  1. Use the Admin Reports API to pull the reports you’re interested in. With the API, you will be able to specify both the user (put all if you want everyone) and also the specific application you want to fetch logs for.
  2. Each application has a different format for its logs. For example, use drive to view activity concerning documents; other applications cover login events, or even mail rules relevant to a BEC (business email compromise).
  3. For each of these, designate events that are just noise and others that you care about. Again, you'll have to scrub through ones you care about to see if they originate from an unknown location.

The easiest way to fetch these logs is to install the client library for the language of your choice and follow its quickstart. Here’s a short script that I used to fetch all my drive activity and dump it to a JSON file:

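import json
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

# Load the saved OAuth token; fill in the scope(s) it was granted.
creds = Credentials.from_authorized_user_file("token.json", [""])

service = build("admin", "reports_v1", credentials=creds)

# Fill in the user's email address for userKey, or "all" for everyone.
results = (
    service.activities()
    .list(userKey="", applicationName="drive")
    .execute()
)

# Dump the raw API response so it can be inspected later.
with open("events-google.json", "w") as f:
    json.dump(results, f)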

You can then open up events-google.json in the viewer of your choice, where I could see the event showing me editing this very blog post.
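To sift the noise from the events you care about, it can help to flatten the response first. The sketch below assumes the usual shape of an Admin Reports API activities response (an `items` list whose entries carry `id.time`, `actor.email`, `ipAddress`, and a nested `events` list). The function names are hypothetical, not part of Google's client library.

```python
from collections import Counter


def flatten_workspace_activities(response):
    """Yield one flat record per event in an activities response.

    Hypothetical helper; assumes the Admin Reports API response shape.
    """
    for item in response.get("items", []):
        for event in item.get("events", []):
            yield {
                "time": item.get("id", {}).get("time"),
                "actor": item.get("actor", {}).get("email"),
                "ip": item.get("ipAddress"),  # may be absent
                "event": event.get("name"),
            }


def count_by_ip(records):
    """Count how many events came from each IP address."""
    return Counter(record["ip"] for record in records)
```

Feeding the parsed events-google.json through `flatten_workspace_activities` and then `count_by_ip` gives a quick view of which addresses the activity came from. Any address you don't recognize is a candidate for a closer look.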

Steps for GitHub

  1. Find the user's GitHub username. It will be different from the user's email address (which has been used in Okta and Google Workspace).
  2. You can list the events for the user using the /users/USERNAME/events API. The easiest way to access this API is probably using the official CLI tool.
  3. See API documentation for all the different event types.
  4. As before, you'll have to sift out which ones are fine and which ones are unusual.

This is how I fetched all the events for my GitHub user (Farmer-Pete):

gh api \
      -H "Accept: application/vnd.github+json" \
      -H "X-GitHub-Api-Version: 2022-11-28" \
      /users/[USERNAME]/events > github-events.json

The resulting github-events.json is a JSON array of event objects.
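One way to start separating the ordinary from the unusual is to group the events by type and list the repositories each type touched. This is a rough sketch, assuming each event object carries a `type` (such as `PushEvent`) and a `repo.name`. The helper `group_github_events` is a made-up name for illustration.

```python
from collections import defaultdict


def group_github_events(events):
    """Map each event type to the sorted repositories it touched.

    Hypothetical helper; assumes GitHub event objects with `type` and `repo.name`.
    """
    repos_by_type = defaultdict(set)
    for event in events:
        # `repo` can be missing or null, so guard before reading its name.
        repo = (event.get("repo") or {}).get("name")
        repos_by_type[event.get("type", "unknown")].add(repo)
    # Drop empty names and sort for a stable, readable result.
    return {
        event_type: sorted(name for name in repos if name)
        for event_type, repos in repos_by_type.items()
    }
```

A repository the user never normally works in, or an event type they rarely generate, is a reasonable place to start digging.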

Steps for AWS

  1. Reference the CloudTrail event reference to see all the different events, for example the Management Console sign-in events.
  2. To view logins through the AWS console, navigate to CloudTrail and then “Event History”. Under “Lookup attributes”, select “User name” and then enter the user you’re interested in.
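The same lookup can be scripted instead of clicked through. The sketch below assumes a client that behaves like `boto3.client("cloudtrail")` and uses its `lookup_events` call with a `Username` lookup attribute. The function `console_logins` is a hypothetical name, and the fields read from each record (`eventTime`, `sourceIPAddress`, `responseElements.ConsoleLogin`) are what I would expect in a console sign-in event, so check them against your own data.

```python
import json


def console_logins(client, username):
    """Return console sign-in events for one user from CloudTrail event history.

    Hypothetical helper; `client` is assumed to act like boto3.client("cloudtrail").
    """
    logins = []
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "Username", "AttributeValue": username}
        ]
    }
    while True:
        page = client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            if event.get("EventName") != "ConsoleLogin":
                continue
            # The full record arrives as a JSON string inside each event.
            detail = json.loads(event["CloudTrailEvent"])
            outcome = (detail.get("responseElements") or {}).get("ConsoleLogin")
            logins.append({
                "time": detail.get("eventTime"),
                "ip": detail.get("sourceIPAddress"),
                "outcome": outcome,
            })
        # Keep paging until CloudTrail stops returning a token.
        token = page.get("NextToken")
        if not token:
            return logins
        kwargs["NextToken"] = token
```

With boto3 installed and credentials configured, `console_logins(boto3.client("cloudtrail"), "some-user")` would be the typical call. Passing the client in keeps the function easy to exercise with a stand-in during practice runs.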

Common Investigation Challenges

Now let's discuss some challenges you'll face when you’re on the clock for a security investigation. I've mentioned four technologies: Okta, Google Workspace, GitHub, and AWS. But your tech stack likely includes many more. Each application has its own API and returns different data. One way to handle this is by using a SIEM. This helps eliminate dealing with multiple APIs and console interfaces by having one standard interface.

While using a SIEM simplifies things to some extent, it also introduces another challenge: learning the SIEM language to perform queries. Even if you're already familiar with the SIEM language, there will still be differences in data structures from each application that you need to understand; organizations tend to move to a SIEM to facilitate detection and response, and not necessarily data analytics. Simply using a SIEM doesn't save you from having to dive into the API docs of each application to figure out what fields you need to query on.

One big challenge to know about: each of these technologies has different fields and capabilities. For instance, the username might be called "actor," "user," or something else. You need to know the exact field you're looking for…and how they all fit together. This isn't difficult to learn, but it's a skill worth honing, because it saves research time when an investigation is underway. You can't expect to throw everything into a standard place and figure it out in the heat of battle, just like you can't expect to spar at your best without training on the heavy bag and shadowboxing.
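To make the field mismatch concrete, here is a sketch of a lookup table that maps each source's field names onto one common record. The paths reflect where I would expect the user, timestamp, and IP address to live in each API's events (GitHub's user events carry no IP address at all). `FIELD_PATHS`, `dig`, and `normalize` are hypothetical names, and the paths are assumptions to verify against your own logs.

```python
# Where each source keeps the user, timestamp, and source IP (assumed paths).
FIELD_PATHS = {
    "okta": {"user": ("actor", "alternateId"), "time": ("published",),
             "ip": ("client", "ipAddress")},
    "google": {"user": ("actor", "email"), "time": ("id", "time"),
               "ip": ("ipAddress",)},
    "github": {"user": ("actor", "login"), "time": ("created_at",),
               "ip": None},  # no IP in GitHub user events
    "aws": {"user": ("userIdentity", "userName"), "time": ("eventTime",),
            "ip": ("sourceIPAddress",)},
}


def dig(record, path):
    """Follow a tuple of keys into nested dicts; return None if anything is missing."""
    if path is None:
        return None
    for key in path:
        if not isinstance(record, dict):
            return None
        record = record.get(key)
    return record


def normalize(source, event):
    """Reduce a raw event from `source` to a common user/time/ip record."""
    fields = {name: dig(event, path) for name, path in FIELD_PATHS[source].items()}
    return {"source": source, **fields}
```

Building and maintaining a table like this during quiet periods is exactly the kind of training that pays off later, because the mapping is already known when the clock starts.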

Training Regularly

Without speed to match your skill in the middle of a hot security investigation, a few things will happen. First, you'll be scattered in an already stressful situation. You'll have to figure out various documents, APIs, and formats, even if you're using a SIEM. Worse, it’s easy to query and correlate data through an API or SIEM in a way that leads to confirmation bias.

In this particular scenario, you're seeking specific answers to a specific question: for example, evidence of compromised logins pointing to credential theft. Omniscience is great in theory; you’d like to know what’s in all your data, but ultimately there's just too much to sift through. You need to have a specific question in mind to get the data down to a manageable amount and focus your queries. For example, you might want to see which logins are located outside of the state of residence for the person you’re researching. So you build a hypothesis and then go to the data to try to prove it. 

Beware of getting tunnel vision and focusing on proving your theory while ignoring other possibilities… it’s easier than you think to fall in love with your own conclusion before you’ve proved it, or to only look for evidence that points in that direction.
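The out-of-state hypothesis can be sketched against Okta-style events like this. It assumes each event may carry `client.geographicalContext.state`, and `bucket_by_state` is a hypothetical helper name. The one deliberate design choice is the third bucket: events with no location are kept visible rather than silently dropped, so missing data doesn't quietly count as support for the theory.

```python
def bucket_by_state(events, home_state):
    """Split events into home-state, elsewhere, and unknown-location buckets.

    Hypothetical helper; assumes Okta-style `client.geographicalContext.state`.
    """
    buckets = {"home": [], "elsewhere": [], "unknown": []}
    for event in events:
        geo = (event.get("client") or {}).get("geographicalContext") or {}
        state = geo.get("state")
        if state is None:
            # No location recorded: neither evidence for nor against.
            buckets["unknown"].append(event)
        elif state == home_state:
            buckets["home"].append(event)
        else:
            buckets["elsewhere"].append(event)
    return buckets
```

An empty "elsewhere" bucket is only reassuring if the "unknown" bucket is small too, and a full one still needs a second look for mundane explanations such as travel or a VPN.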

Next Steps

So, what are the real solutions? One option is to invest time and effort into creating custom dashboards and interfaces for your preferred SIEM. This way, you can get a clear overview and draw accurate conclusions. However, this can be expensive in both labor and opportunity costs, as you'll either need to build on top of your existing SIEM yourself or work with professional services.

Another option is to use a service like Turngate. Turngate brings all the information together and standardizes it, so you don't have to worry about the mechanics. This allows you to see the big picture and drill down into details using a standard interface.

Instead of wrangling with APIs, I can see my user’s data visualized across all our different devices. I can easily see an overview when it’s helpful and then drill down when I’m curious. Plus, I can see my data in context and in relation to other users. Obviously, I’m biased, but it’s so much easier and nicer.

Heed my Muay Thai coach’s advice and make sure you’ve trained properly to be effective when the need arises. And if you find your tools lacking, let us help you. With Turngate, you can jump to the right conclusions, quickly, when it matters most.