“On the horizon: faster and more effective propagation methods that maximize the impact of ransomware campaigns and increase the probability that adversaries will generate significant revenue.”

– 2016 Cisco Cybersecurity Report

Today, internet-scale cybersecurity is a race to recognize patterns. The attackers? They’re looking for faster and more effective propagation methods. The same is true for responders, and in a way it’s a game of finding the weakest and strongest links. Appropriately enough, graph theory has a field focused on propagation algorithms, and these can be applied to label malicious domains within the graph of the internet.
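To make the idea of propagating labels across a graph concrete, here is a minimal sketch in Python. It is not the method used at OpenDNS or any specific published algorithm; it is a simple neighbor-averaging scheme, and the function name `propagate_labels`, the example domains, and the starting score of 0.5 for unknown nodes are all assumptions made for illustration.

```python
from collections import defaultdict


def propagate_labels(edges, seeds, iterations=20):
    """Spread scores from labeled seed nodes across an undirected graph.

    edges: iterable of (node_a, node_b) pairs, e.g. domain-to-IP links.
    seeds: dict of node -> score (1.0 = known malicious, 0.0 = known benign).
    Returns a dict of node -> score in [0, 1].
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Unlabeled nodes start at 0.5, meaning "unknown" (an arbitrary choice).
    scores = {node: seeds.get(node, 0.5) for node in neighbors}
    for _ in range(iterations):
        updated = {}
        for node, adjacent in neighbors.items():
            if node in seeds:
                # Known labels stay fixed on every pass.
                updated[node] = seeds[node]
            else:
                # An unknown node takes the average score of its neighbors.
                updated[node] = sum(scores[n] for n in adjacent) / len(adjacent)
        scores = updated
    return scores


# Hypothetical domains linked through the IP addresses they share.
edges = [
    ("bad.example", "ip1"),
    ("suspect.example", "ip1"),
    ("suspect.example", "ip2"),
    ("good.example", "ip2"),
]
seeds = {"bad.example": 1.0, "good.example": 0.0}
scores = propagate_labels(edges, seeds)
```

In this toy graph, the IP shared with the known-bad domain ends up with a higher score than the IP shared with the known-good one, and the unlabeled domain sits in between. Real traffic graphs change between time slices, which is exactly why a scheme that works on one snapshot may fail on the next.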


IMAGE: Post author David Rodriguez of OpenDNS speaks at SAI Conference 2016

While graph theory has a rich history of specific propagation algorithms, the question of how best to apply them in practice remains a challenge. Even if a given propagation algorithm works at one time, it’s not guaranteed to work at another. One reason is that malicious domains occur in a deluge of internet traffic (a large-scale network passing and propagating messages) that’s fast, redundant, and non-uniform. From one time slice to the next, the search for similar domains hosting malware is obscured by this apparent whitewash of noise.

A new wave of cybersecurity professionals now seeks to apply techniques from artificial intelligence, neural networks, and deep learning to classify malware samples and botnet campaigns. Less publicized is similar work in the related fields of bio-medical signal and image processing, which may help inform our pattern recognition and classification work. For instance, might techniques used to classify medical images be applied to phishing pages? Can techniques used to classify brain activity from multiple signals streaming simultaneously be applied to classifying various malware campaigns? Perhaps we can rephrase the cybersecurity problem as an image-processing problem? A signal-processing problem?

Last month, the Science and Information (SAI) Conference 2016 assembled a variety of researchers in artificial intelligence, machine vision, and security (to name a few fields) in London (UK) to explore the connection between the fields of bio-medical science, computing, mathematics, and information security. The conference was managed by Supriya Kapoor and emceed by the vibrant Lars Sorensen.

VIDEO: SAI Conference teaser; post author speaks at 0:41

I attended SAI to present A Neural Decision Forest Scheme Applied to EMG Gesture Classification, joint work with Drs. Zhang and Piryatinska from San Francisco State University. Neural decision forests were first introduced by S. Rota Bulo and M. Kontschieder in Neural Decision Forests for Semantic Image Labelling (2014). The highlight of a neural decision forest is that it is a classifier with embedded signal processing. The upshot: any problem that involves mapping a vector of numeric values to a label just got easier, because the somewhat difficult tasks of hand-crafting features and applying other signal pre-processing techniques are typically not required. They are embedded in the classifier.

For the visual learner, a neural decision forest is a collection of tree-like structures in which the internal nodes (or gates) are neural networks (for this research we picked feed-forward networks). Check out the FIGURE.


FIGURE: Schema of a neural decision forest with three neural networks acting as gating functions.
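For readers who prefer code to diagrams, the structure in the figure can be sketched roughly as follows. This is a structural illustration only, not the implementation from the paper: the weights are random and untrained, the hard left/right routing at each gate is an assumption, and every name here (`make_gate`, `gate_output`, `make_tree`, `route`, `forest_predict`) is hypothetical.

```python
import math
import random


def make_gate(n_inputs, n_hidden, rng):
    """Build a tiny feed-forward network with random (untrained) weights."""
    # Each row holds n_inputs weights followed by one bias term.
    hidden = [[rng.uniform(-1, 1) for _ in range(n_inputs + 1)]
              for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def gate_output(gate, x):
    """Run the gate network on x and return a value in (0, 1)."""
    hidden, output = gate
    activations = [
        math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
        for row in hidden
    ]
    z = sum(w * a for w, a in zip(output, activations)) + output[-1]
    return 1.0 / (1.0 + math.exp(-z))


def make_tree(n_inputs, n_classes, rng):
    """A depth-2 tree: three gates (root plus two children) and four leaves."""
    gates = [make_gate(n_inputs, 4, rng) for _ in range(3)]
    leaves = []
    for _ in range(4):
        raw = [rng.random() + 1e-9 for _ in range(n_classes)]
        total = sum(raw)
        leaves.append([r / total for r in raw])  # class distribution per leaf
    return gates, leaves


def route(tree, x):
    """Walk from the root to a leaf, letting each gate choose a branch."""
    gates, leaves = tree
    child = 1 if gate_output(gates[0], x) < 0.5 else 2
    side = 0 if gate_output(gates[child], x) < 0.5 else 1
    return leaves[2 * (child - 1) + side]


def forest_predict(forest, x):
    """Average the leaf distributions selected by every tree in the forest."""
    n_classes = len(forest[0][1][0])
    totals = [0.0] * n_classes
    for tree in forest:
        for k, p in enumerate(route(tree, x)):
            totals[k] += p
    return [t / len(forest) for t in totals]


rng = random.Random(0)
forest = [make_tree(n_inputs=8, n_classes=3, rng=rng) for _ in range(5)]
sample = [rng.uniform(-1, 1) for _ in range(8)]  # stand-in for a raw signal window
prediction = forest_predict(forest, sample)
```

The point of the sketch is that the raw numeric vector goes straight into the gates; there is no separate feature-extraction step in front of the classifier. Training the gate weights and leaf distributions is the substantial part and is left out here.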

At the end of the presentation I walked through one application of neural decision forests. Using the Myo armband from Thalmic Labs, I recorded a variety of gestures and trained a neural decision forest to distinguish them. You can check out the accuracy and more here.

In addition, the conference offered a variety of tracks: Artificial Intelligence, Machine Vision, Cloud Computing, Security and Privacy, and more. Security professionals may have found the following presentations of particular interest:

  • Statistical Approach towards Malware Classification and Detection
  • An Empirical Study of Security of VoIP System
  • A-RSA: Augmented RSA

or perhaps talks more seasoned with machine learning techniques:

  • Fuzzy Based Modeling for an Effective IT Security Policy Management
  • Fuzzy Random Decision Tree (FRDT) Framework for Privacy Preserving Data Mining
  • Machine Learning Based Job Status Prediction in Scientific Clusters

Of course, these are just a few of the talks that were offered.

My favorite was Multi-Scale Reflection Invariance. To understand the context, check out the Microsoft #howoldrobot. In related work, Henderson and Izquierdo show that current deep learning techniques may not exhibit reflection invariance in image classification tasks. For example, #howoldrobot will predict Alan Turing to be 53 when the portrait is facing one direction and 46 when it is flipped along the vertical axis (so he’s facing the opposite direction). Wait, what? Very subtle. How does flipping alter the classification results so much? I wondered whether this problem was in any way related to translation-invariance difficulties in classification tasks, but whether the two are related appears to be unknown.
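A quick way to see what reflection invariance means in practice is to compare a model's output on an image and on its mirror. The sketch below is not Henderson and Izquierdo's technique; it only measures the gap and shows one generic workaround (averaging the two predictions). The predictor `toy_predict` is a made-up stand-in for a real age estimator, and all function names are assumptions.

```python
def flip_horizontal(image):
    """Mirror a 2-D image (a list of pixel rows) about its vertical axis."""
    return [list(reversed(row)) for row in image]


def reflection_gap(predict, image):
    """Absolute difference between predictions on an image and its mirror."""
    return abs(predict(image) - predict(flip_horizontal(image)))


def symmetrized(predict):
    """Wrap a predictor so an image and its mirror always get the same output."""
    def wrapped(image):
        return 0.5 * (predict(image) + predict(flip_horizontal(image)))
    return wrapped


def toy_predict(image):
    """Hypothetical predictor that weights left-hand columns more heavily."""
    return sum(
        pixel * (len(row) - col)
        for row in image
        for col, pixel in enumerate(row)
    )


image = [
    [1, 0, 0],
    [0, 2, 0],
    [0, 0, 3],
]
gap_before = reflection_gap(toy_predict, image)
gap_after = reflection_gap(symmetrized(toy_predict), image)
```

Here the raw predictor gives different answers for the image and its mirror, while the wrapped version gives the same answer for both. Averaging is a blunt fix applied at prediction time; it says nothing about why the underlying model is sensitive to the flip in the first place.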

So, whether you’re looking to expand your breadth in infosec or machine learning, consider attending this unique gathering, which, starting with the 2017 event, will be known as the Computing Conference. Said differently, if you’re in London next July 18-20, take the underground east to Tower Gateway. Watch your step, or as the Londoner says over the loudspeaker, mind the gap. Transfer to the overground DLR line and head to Zone 3. You’ll pass idyllic flowers and aged brick homes until you reach the ExCeL London event center, a modern entertainment venue draped in banners and fronted by large glass windows, host to SAI Conference 2016 and Computing Conference 2017.
