One approach to solving cybersecurity problems is to use machine learning to baseline normal behavior so that deviations from normal can be identified. At first glance, this makes perfect sense. However, there are flaws that must be considered and avoided.
Threat Intelligence: Non-Malicious Deviations
One of the flaws is that deviations from normal may not be malicious; they may simply reflect new, legitimate behaviors. For example, new software, new policies, a new scope of work, or any of a myriad of other changes can cause a deviation from normal. So it is not deviation from normal alone that we must find; it is deviation from normal that is malicious. This requires that we understand whether the intent behind a deviation is good or bad, and classifying the malicious intent of an event is not possible by examining its frequency alone.
In fact, one would hope that bad behavior is very infrequent; we should hope that bad things happen at a statistically insignificant rate. But if both benign changes and malicious actions show up as rare events, a frequency model cannot tell them apart, which is exactly why distinguishing good from bad using statistical modeling alone is not such a great idea.
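To make the point concrete, here is a minimal sketch of a frequency-based rarity score. The event names and counts are entirely hypothetical, not from any real product; the point is that a benign never-before-seen event and a malicious never-before-seen event receive identical scores.

```python
from collections import Counter

# Hypothetical baseline: event names and how often each occurred
# during the training period (all names are illustrative).
baseline = Counter({
    "open_crm_app": 5000,
    "copy_to_network_share": 1200,
    "run_backup_job": 300,
})

def rarity_score(event: str, counts: Counter) -> float:
    """Score an event by how rare it is relative to the baseline.
    1.0 = never seen before; values near 0 = very common."""
    total = sum(counts.values())
    return 1.0 - counts[event] / total

# A benign deviation (newly deployed HR software) and a malicious one
# (data theft) are both "never seen before" -- the frequency model
# assigns them the same maximal anomaly score.
benign = rarity_score("launch_new_hr_tool", baseline)
malicious = rarity_score("exfiltrate_customer_db", baseline)
assert benign == malicious == 1.0
```

The model flags both events equally; nothing in the frequencies encodes intent.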
Who, What, Where, When, Why & How
We have to be able to identify and classify malicious intent even though we have no training data for it. How is this done? How do you identify maliciously intended activity programmatically? How is it done manually? Identifying intent manually is very difficult: Who, What, Where, When, Why & How (W5H) are constantly being evaluated in the mental models of cybersecurity analysts.
In cybersecurity, we have much of the information that addresses W5H. Account names, device names, application names, directory structures, and timestamps, all add to our understanding of four of the five "Ws" and much of the "how." However, "why" is best understood by the enterprise’s most senior analysts.
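As an illustration, consider how typical log fields map onto the W5H questions. The field names below are hypothetical examples, not a specific log schema; the point is that "why" is the one question the raw telemetry never answers.

```python
# Illustrative mapping of common log fields to W5H.
# Field names and values are hypothetical, not a real schema.
event = {
    "account": "alice",                    # Who
    "application": "file_transfer_tool",   # What
    "device": "build-server-03",           # Where
    "timestamp": "2019-04-02T03:14:00Z",   # When
    "command_line": "copy db.dump E:\\",   # How (much of it)
}

# "Why" -- the intent -- is absent from the raw telemetry and must be
# supplied by human judgment or an encoded model of the business.
w5h_covered = {"Who", "What", "Where", "When", "How"}
assert "Why" not in w5h_covered
```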
Context is Critical to Threat Intelligence
To understand whether an action is of ill intent, the action must be interpreted in the context of business policies, the business model, business processes, personnel, laws, and business ethics. In DarkLight™ we call this Contextual Memory.
For instance, copying data to a thumb drive to transfer data to a client may be perfectly fine and commonly done, but the same action of copying data to a thumb drive to transfer data to a competitor may be against company policy. The former action is to the benefit of the company, the latter to the detriment. Statistically baselining normal behavior is not going to flag malicious intent in this case.
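The thumb-drive example above can be sketched as a context-dependent classification. This is a minimal illustration, assuming hypothetical field names and a simplified policy; it is not DarkLight's actual model. The raw action is identical in both cases; only the business context decides whether it violates policy.

```python
from dataclasses import dataclass

# Hypothetical event record; field names are illustrative.
@dataclass
class FileCopyEvent:
    user: str
    device: str          # e.g. "thumb_drive"
    recipient_org: str   # who the data is ultimately destined for

def violates_policy(event: FileCopyEvent,
                    competitors: set) -> bool:
    """The observed action is the same either way; intent is judged
    by where the data is going, which is business context."""
    return (event.device == "thumb_drive"
            and event.recipient_org in competitors)

competitors = {"RivalInc"}

to_client = FileCopyEvent("alice", "thumb_drive", "AcmeCorp")
to_rival = FileCopyEvent("alice", "thumb_drive", "RivalInc")
assert not violates_policy(to_client, competitors)
assert violates_policy(to_rival, competitors)
```

A statistical baseline would see two identical copy-to-thumb-drive events; only the encoded business knowledge separates them.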
Enterprises need to be able to tap into their institutional knowledge and experience and apply it to the situation at hand. Situational awareness is the cornerstone of excellent cybersecurity. Actions of malicious intent have a better chance of standing out as odd when the observer is situationally aware.
Good analysts with this situational awareness can and do spot "odd behaviors" occurring within the enterprise. Of course, a real problem is that they are overwhelmed by the volume of data they must observe.
They need computational help, not with statistically modeling normal behaviors, but with conceptually modeling the enterprise's policies, assets, business model and processes, applications, employees' roles and responsibilities, laws, and ethics.
To do a truly comprehensive job of protecting your enterprise, your toolkit has to include a conceptual modeling capability. In DarkLight, artificial intelligence drives cybersecurity conceptual models to create conceptual graphs (a knowledge-based system), in which cyber-analysts represent their expert opinion on what is good and what is bad within their enterprise.
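To give a flavor of what a conceptual graph enables, here is a toy knowledge-based sketch: facts as (subject, relation, object) triples and a query that walks them. Every name here is hypothetical, and a real system such as DarkLight encodes far richer policy, role, and asset knowledge than this.

```python
# A toy conceptual model as a set of (subject, relation, object) triples.
# All entities and relations are hypothetical examples.
graph = {
    ("alice", "has_role", "sales"),
    ("sales", "may_share_with", "clients"),
    ("AcmeCorp", "is_a", "clients"),
    ("RivalInc", "is_a", "competitors"),
}

def allowed_to_share(user: str, org: str, triples: set) -> bool:
    """Walk the graph: sharing is allowed when one of the user's roles
    permits sharing with a category the organization belongs to."""
    roles = {o for s, r, o in triples if s == user and r == "has_role"}
    categories = {o for s, r, o in triples if s == org and r == "is_a"}
    permitted = {o for s, r, o in triples
                 if s in roles and r == "may_share_with"}
    return bool(categories & permitted)

assert allowed_to_share("alice", "AcmeCorp", graph)
assert not allowed_to_share("alice", "RivalInc", graph)
```

The same event (Alice sharing data) is judged against encoded enterprise knowledge rather than against a frequency baseline.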
Does your toolkit include conceptual modeling?