Modeling The Storm of Cybersecurity Data

Jul 18, 2018 3:30:00 AM Ryan Hohimer cybersecurity, Conceptual Models, Modeling Languages, Big Data

Cyber security threats are like a storm

Consider the job of the weather person. It used to be that a weather forecast was almost unusable because of low accuracy. But now, it is commonplace to put a reasonable amount of faith in the forecast. Really, not that long ago, we used an almanac to get a sense of the weather. My, how things have improved!

I actually rely on the forecast I access on my smartphone. Why is the prediction so much better today than it was in the past?

Individual sensors can tell us a great deal. There are sensors for temperature, humidity, rainfall, wind speed, wind direction, barometric pressure, and a host of others. Having a collection of measurements from these instruments does indeed give you a sense of the moment in terms of those specific metrics.

However, these sensors by themselves don’t give you a sense of whether or not a storm is imminent. For that, you need a model. Modeling is what leads to the successes of the meteorologists.

Developing and Tuning Models

Meteorologists developed models; they collected historical data, they collected real-time data, and they used that data to initialize their models. They constantly compared the results of their modeling with the real world. They tuned and tweaked and improved their models.

Without the model, the data is just a collection of unrelated, disparate facts. With the model, the data can be interpreted to infer the current and future states of the weather. For instance, the model may infer from the facts that a storm is forming.
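To make the idea concrete, here is a deliberately tiny sketch of a "model" that turns two raw readings into an inference. The rule and its thresholds (a 3 hPa pressure drop, 80% humidity) are illustrative assumptions, not real meteorology, and the names Reading and storm_forming are made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    pressure_hpa: float   # barometric pressure
    humidity_pct: float   # relative humidity


def storm_forming(earlier: Reading, later: Reading,
                  pressure_drop_hpa: float = 3.0,
                  humidity_floor_pct: float = 80.0) -> bool:
    """Toy rule: falling pressure combined with high humidity suggests a storm.

    The default thresholds are assumptions chosen only for illustration.
    """
    dropped = earlier.pressure_hpa - later.pressure_hpa >= pressure_drop_hpa
    humid = later.humidity_pct >= humidity_floor_pct
    return dropped and humid


# Two readings on their own are just facts; the rule relates them.
morning = Reading(pressure_hpa=1012.0, humidity_pct=70.0)
afternoon = Reading(pressure_hpa=1007.5, humidity_pct=88.0)
print(storm_forming(morning, afternoon))
```

Neither reading says "storm" by itself. The inference comes only from the rule that relates them.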

Developing and Tuning Cyber Security Threats as Conceptual Models

Modeling an enterprise is much like modeling a weather system. Just as meteorologists improved their models, cyber security threat analysts must improve theirs.

In the domain of cyber, we have a full toolbox of sensors, and these sensors give us a wealth of measurements. Any hardware or software that produces a log can be considered a sensor. There is no shortage of data. There is a shortage of good models to interpret the data.

It is the relationships between the data points that must be considered in order to understand what is happening in our networks. There are many great sources of data:

  • Intrusion Detection Systems
  • Data Loss Prevention Systems
  • File Integrity
  • Antivirus
  • Packet Capture
  • Firewall logs
  • Proxy Server logs
  • Web Server logs
  • Email Servers
  • SIEM systems
  • Data Collection and Indexing systems
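
As a rough sketch of what "considering the relationships" can mean in practice, the snippet below relates events from different sources by host and by time. The Event fields, the ten-minute window, and the host names are all assumptions made for illustration; a real model would encode far richer relationships.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    source: str       # which sensor produced it, e.g. "ids" or "proxy"
    host: str         # the machine the event concerns
    time: datetime
    detail: str


def related_events(events, window=timedelta(minutes=10)):
    """Return, per host, the first cluster of events in which at least two
    different sources reported within `window` of the cluster's first event."""
    by_host = {}
    for event in sorted(events, key=lambda e: e.time):
        by_host.setdefault(event.host, []).append(event)

    findings = {}
    for host, host_events in by_host.items():
        for i, first in enumerate(host_events):
            cluster = [e for e in host_events[i:] if e.time - first.time <= window]
            if len({e.source for e in cluster}) >= 2:
                findings[host] = cluster
                break
    return findings


t0 = datetime(2018, 7, 18, 3, 30)
events = [
    Event("ids", "ws-17", t0, "suspicious payload"),
    Event("proxy", "ws-17", t0 + timedelta(minutes=4), "request to rare domain"),
    Event("firewall", "ws-22", t0 + timedelta(minutes=1), "blocked port scan"),
    Event("antivirus", "ws-22", t0 + timedelta(hours=5), "signature update"),
]
findings = related_events(events)
for host, cluster in findings.items():
    print(host, [e.source for e in cluster])
```

Each event alone is an unremarkable log line. Only the relationship (same host, different sensors, close in time) makes the first pair worth an analyst's attention.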

Common Modeling Languages

There are modeling standards forming within cybersecurity. These standards will facilitate the improvement of cybersecurity models a great deal. By creating and using standard modeling languages and tools, a larger community can contribute to the improvements. 


I’m excited about the common direction we are heading in cyber. With the adoption of languages and protocols such as STIX, CybOX, and TAXII, we are starting to use common modeling constructs. Here are some of the concepts that are part of this common model:

  • Package
  • Report
  • Campaign
  • Course of Action
  • Exploit Target
  • Incident
  • Indicator
  • TTP (Tactics, Techniques, and Procedures)
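
To show how a few of these concepts relate to one another, here is a simplified sketch in Python. It is not the STIX schema: the class and field names are hypothetical, and only the general shape (an Incident relates to Indicators, an Indicator points to TTPs and suggests Courses of Action) is meant to carry over.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TTP:
    title: str


@dataclass
class CourseOfAction:
    title: str


@dataclass
class Indicator:
    title: str
    pattern: str  # free-text description of what to look for (illustrative)
    indicated_ttps: List[TTP] = field(default_factory=list)
    suggested_actions: List[CourseOfAction] = field(default_factory=list)


@dataclass
class Incident:
    title: str
    related_indicators: List[Indicator] = field(default_factory=list)


def actions_for(incident: Incident) -> List[str]:
    """Walk Incident -> Indicator -> Course of Action, dropping duplicates
    while keeping first-seen order."""
    titles: List[str] = []
    for indicator in incident.related_indicators:
        for action in indicator.suggested_actions:
            if action.title not in titles:
                titles.append(action.title)
    return titles


phishing = TTP("Spear phishing")
block_sender = CourseOfAction("Block sender domain")
bad_sender = Indicator(
    title="Malicious sender",
    pattern="mail from domain example.test",
    indicated_ttps=[phishing],
    suggested_actions=[block_sender],
)
incident = Incident("Credential theft attempt", related_indicators=[bad_sender])
print(actions_for(incident))
```

The point of the shared vocabulary is that a question like "what should we do about this incident?" becomes a walk over well-defined relationships rather than a search through unrelated logs.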

For more information about these concepts check out:

In a previous post, I commented that we need to learn the language of cybersecurity modeling. Learning the concepts of STIX and CybOX is a must. Another great resource on the same GitHub project gives you some tips on creating indicators. I consider this the grammar for using the languages:

Just as meteorologists developed models and experienced significant improvement in their abilities to understand weather systems, we cyber security threat analysts will experience a significant improvement in our abilities to understand our networks and the threats within them.

Does YOUR enterprise use Conceptual Models?





Written by Ryan Hohimer

Ryan was working with “Big Data” before “Big Data” was cool. Dealing with the challenges of managing massive data sparked his interest in metadata, Semantic Web Technologies (SWT), and Knowledge Representation and Reasoning (KR&R), which led to the development of the technology behind DarkLight's patented reasoning engine. Ryan is Co-Founder and CTO of Champion Technology Company.
