Newswise — As the amount of data that people and machines create, collect, process, and store continues to increase in size and complexity, so does the task of keeping cyber systems safe and secure from ever more sophisticated cyber threats. A team of researchers at Pacific Northwest National Laboratory (PNNL) is developing a new approach to explore the higher-dimensional shape of cyber systems to identify signatures of adversarial attacks.
The researchers tested the approach on a publicly available dataset used to study advanced persistent threats, known as APTs. These are extended, multistage cyberattacks that aim, for example, to steal sensitive data: an intrusion might begin with a phishing email and then expand access to network data over time.
They found the approach uncovers patterns in the data that are “consistent with adversary activity,” noted Emilie Purvine, a senior data scientist at PNNL and senior author of a paper describing the approach in The Next Wave, the National Security Agency’s review of emerging technologies.
“Identifying and finding these more complex patterns can help us find behaviors in the network that are unusual and that in turn can help cyber analysts figure out why those are happening, why they are unusual,” noted Purvine. “Are they things to be worried about?”
Out of Flatland
Purvine and colleagues liken the approach to the novella Flatland by Edwin Abbott, which tells the story of a square in a two-dimensional world that is visited by a sphere from the three-dimensional world. In Flatland, the sphere appears as a circle that grows and shrinks in size. The sphere tries to explain to the square that there’s a third dimension, but the square is unable to grasp what that means.
“So, the sphere pulls the square out of Flatland and shows the square Flatland from this other perspective. Suddenly, the square understands there’s so much more out there,” explained Purvine. “The idea that we latched onto is that there are connections in cyber systems that we just can’t perceive—there’s so much data.”
To get a different perspective on the data flows in cyber systems, and thus hunt for behaviors that could be malicious, Purvine and her colleagues have devised a framework based on the mathematical theories of hypergraphs and topology. The framework allows them to explore data in cyber networks in higher dimensions through so-called topological signatures.
Topological signatures
Hypergraphs can model systems where groups of things interact or share common behaviors or properties. Purvine likens them to bunches of proteins in a cell that act together to form a reaction, or social networks where groups of people form clubs and some members of one club overlap with members of a different club. Hypergraphs are a mathematical abstraction of this notion of groupwise interaction and behavior, she said.
In a cyber system, the behavior may be one group of devices contacting the same device that another group of devices is contacting, she added. Devices are identified by their Internet Protocol (IP) address, a numerical label assigned to a device connected to a network.
“We don’t know that those communications were the same. We just know that these groups of IPs did something similar,” Purvine explained. “And when you start to piece those pieces together, you start getting a landscape of the behaviors of the system and how different IPs interact.”
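To make the idea of groupwise behavior concrete, here is a minimal Python sketch of one possible hypergraph construction. It is an illustration only, not the PNNL team's code: the flow records, the IP addresses, and the function names `build_hypergraph` and `overlapping_groups` are all invented for this example. Each destination IP becomes a hyperedge, and its members are the source IPs that contacted it.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (source IP, destination IP) records, invented for illustration.
flows = [
    ("10.0.0.1", "192.168.1.10"),
    ("10.0.0.2", "192.168.1.10"),
    ("10.0.0.3", "192.168.1.10"),
    ("10.0.0.2", "192.168.1.20"),
    ("10.0.0.3", "192.168.1.20"),
    ("10.0.0.4", "192.168.1.30"),
]


def build_hypergraph(records):
    """Map each destination IP (hyperedge) to the set of sources that contacted it."""
    hyperedges = defaultdict(set)
    for source, destination in records:
        hyperedges[destination].add(source)
    return dict(hyperedges)


def overlapping_groups(hyperedges):
    """Return pairs of hyperedges that share members, with the shared members."""
    overlaps = {}
    for first, second in combinations(sorted(hyperedges), 2):
        shared = hyperedges[first] & hyperedges[second]
        if shared:
            overlaps[(first, second)] = shared
    return overlaps


hypergraph = build_hypergraph(flows)
overlaps = overlapping_groups(hypergraph)
```

In this toy data, two of the groups overlap because the same two source IPs contacted both destinations. This is the kind of shared behavior a hypergraph records without claiming the communications themselves were identical.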
To understand this landscape, Purvine and her colleagues, who have expertise in abstract mathematics, data science, software engineering, and cybersecurity, have turned to topology.
Topology is the mathematical study of shape, with the flexible notion that two shapes can be the same if they can be continuously deformed into one another via stretching, bending, and twisting but not breaking, tearing, or puncturing. A classic example of this sameness is that a coffee cup can be deformed into a donut shape without breaking, meaning that a coffee cup and donut are topologically the same. In recent years, scientists have turned to topology to understand the shape of data.
Purvine’s team is exploring mathematical techniques to translate hypergraph data about cyber networks into the language of topology to look for topological signatures of malicious behavior such as a cyberattack.
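The article does not say which topological signatures the team computes. As one common illustration, and purely as an assumption, the sketch below treats every hyperedge as a filled-in simplex and counts connected components and loops (the Betti numbers β0 and β1) using arithmetic modulo 2. The function names and the toy vertex labels are hypothetical.

```python
from itertools import combinations


def simplicial_closure(hyperedges, max_dim):
    """Collect every subset of each hyperedge with at most max_dim + 1 vertices."""
    simplices = {d: set() for d in range(max_dim + 1)}
    for edge in hyperedges:
        vertices = sorted(edge)
        for d in range(max_dim + 1):
            simplices[d].update(combinations(vertices, d + 1))
    return {d: sorted(s) for d, s in simplices.items()}


def rank_gf2(rows):
    """Rank over GF(2) of a matrix whose rows are stored as integer bitmasks."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot == 0:
            continue
        rank += 1
        low_bit = pivot & -pivot  # pivot column = lowest set bit
        rows = [r ^ pivot if r & low_bit else r for r in rows]
    return rank


def boundary_rank(simplices, d):
    """Rank of the boundary map from d-simplices to (d-1)-simplices."""
    if d == 0 or not simplices.get(d):
        return 0
    index = {face: i for i, face in enumerate(simplices[d - 1])}
    rows = []
    for simplex in simplices[d]:
        mask = 0
        for face in combinations(simplex, d):
            mask |= 1 << index[face]
        rows.append(mask)
    return rank_gf2(rows)


def betti_numbers(hyperedges, max_dim=1):
    """Betti numbers 0..max_dim of the simplicial closure, computed over GF(2)."""
    simplices = simplicial_closure(hyperedges, max_dim + 1)
    return [
        len(simplices[d])
        - boundary_rank(simplices, d)
        - boundary_rank(simplices, d + 1)
        for d in range(max_dim + 1)
    ]


# Toy vertex labels, invented for illustration.
hollow = betti_numbers([{"a", "b"}, {"b", "c"}, {"a", "c"}])  # three pairwise groups
filled = betti_numbers([{"a", "b", "c"}])                     # one three-way group
```

Three pairwise interactions leave a loop that a single three-way interaction fills in, so the two cases produce different Betti numbers even though they involve the same three vertices. A change of this kind in a network's structure over time is the sort of signal a topological signature is meant to expose.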
“When something unusual happens, there should be some kind of change in the structure,” said Helen Jenne, a postdoctoral researcher at PNNL who is working with Purvine and is the lead author of the paper explaining the research in The Next Wave. “The challenge is we don’t know what we’re looking for, and part of the reason we don’t know what we’re looking for is because the attacks are always changing.”
Call to action
While the researchers found that translating hypergraph representations of cyber networks into topological structures uncovers patterns in the data that represent adversarial activity, they also found that the same patterns could represent benign activity. In addition, the method did not capture all of the adversarial activity, suggesting that there is more exploration to be done in this field of research.
Jenne noted that The Next Wave paper describing the approach is a call to action to focus resources on finding hypergraph constructions and topological methods that could be useful in the ongoing effort to thwart cyberattacks. Indeed, the paper describes just one of several methods the research group is exploring.
“There’s so much to study here,” she said, “and it seems like there’s something useful.”