At Risk Focus we are often faced with new problems or requirements that cause us to look at technology and tools outside of what we are familiar with. We frequently engage in short Proof-Of-Concept projects based on a simplified, but relevant, use case in order to familiarize ourselves with the technology and determine its applicability to larger problems.
For this POC, we wanted to analyze email interactions between users to identify nefarious activity such as insider trading, fraud, or corruption. We identified the Siren platform as a tool that could potentially aid in this endeavor by providing visualizations on top of Elasticsearch indexes to allow drilling down into the data based on relationships. We also wanted to explore Siren’s ability to define relationships between ES and existing relational databases.
Getting started with the Siren platform is easy given the dockerized instances provided by Siren and the Getting Started guide. Using the guide, I was able to get an instance of the platform running with pre-populated data quickly. I then followed their demo to interact with Siren and get acquainted with its different features.
Once I had a basic level of comfort with Siren, I wanted to see how it could be used to identify relationships in emails, such as who communicates with each other and if anyone circumvents Chinese firewall restrictions by communicating through an intermediary. I chose the Enron email corpus that is publicly available as my test data and indexed it in ES using a simple Java program that I adapted from code I found here. I created one index containing the emails and another index of all the people, identified by email address, who were either senders or recipients of the emails. The resulting data contained half a million emails and more than 80,000 people.
With the data in place, I next set up the indices in Siren and defined the relationships between them. The straightforward UI makes this process very simple. The indices are discoverable by name, and all of the fields are made available. After selecting the fields that should be exposed in Siren, and potentially filtering the data, a saved search is created.