O365 – Digital Lipstick?

Having had a few days to clear my head from a full week at my first AWS re:Invent in Las Vegas, I got to thinking about how to make sense of the announcements, customer testimonials, conversations and sheer magnitude of the event. Was there some common thread tying together the hundreds of sessions spread across multiple casinos, the 65,000 visitors from around the world and a fantastic exhibitor hall (including a really cool analytics demo from TIBCO via a stationary-bike time trial)? Everyone is trying to capture and define the elusive concept of Digital Transformation in order to RE-INVENT their business, their technology and/or themselves. This begs the questions: what does REAL DIGITAL TRANSFORMATION look like, who will be the ultimate winners, and who will go the way of the Dodo bird?

Despite the noise about Microsoft Azure nipping at its heels, AWS is still the undisputed King of the Cloud. Attendance at re:Invent was up around 10-15% YoY, a 10x increase since the first show seven years ago! Amazon announced 77 new products and services, 20 of them around Machine Learning alone (no surprise here, since this has been a steady drumbeat from AWS over the past year). We also heard compelling stories from their Enterprise clients, and I was glad to see Amazon moving to make AWS a more enterprise-friendly platform with new products like Control Tower.

A high point of the week was a dinner hosted by Tim Horan from Oppenheimer, where we discussed Cloud, Digital Transformation and the impacts of politics with industry experts. A key topic of conversation was what to make of Microsoft Azure’s gain in cloud market share over the past few years. AWS cloud market share has dropped from ~80% in 2016 to an estimated 63% in 2019, while Azure’s share has climbed from 16% to 28% over that same period. When looking at Enterprise workloads, the race is much tighter; a 2018 RightScale survey shows Azure adoption grew 35% YoY while AWS adoption in this group increased by 15%. But the Azure numbers are worth a closer look.

Microsoft buries its Azure revenues in a much larger pile of “Commercial Cloud” revenues that include Office 365. So, while Microsoft announced 73% growth in Azure cloud revenue, it’s impossible to put a hard dollar figure on that number. Industry experts agree that the lion’s share of Microsoft’s commercial cloud growth comes from Office 365. Therefore, it’s safe to assume that the majority of Enterprise workloads running on Azure are O365, which begs the question, “is this real digital transformation?”

In her July 25, 2019 article in the High Technology Observer, “Reality v. MSFT: Real versus Fake Digital Transformations”, Roxane Googin concludes that “a true digital transformation is about more than replatforming existing operations. In fact, it does not happen by making ‘personal productivity’ better. Rather, it is about rethinking operations from the ground up from the customer point of view, typically using real-time ‘AI infused’ algorithms to replace annoying, time-consuming and unpredictable manual efforts.”

I’d argue that the shift from PC-based Windows + Office to O365 is merely a replatforming exercise to improve productivity. While this move can certainly help businesses reduce expenses by 20 to 30% and drive new revenues, it does not fundamentally alter the way a business operates or interacts with clients. Therefore, perhaps this change should be viewed as Digital Transformation “lipstick”. We do, however, have great examples of Real Digital Transformations: AWS re:Invent was full of transformational testimonials and, at Risk Focus, we are fortunate to be partnering with a number of firms that are also embarking on Real Digital Transformations. I’d like to highlight a couple below.

The first story is about a NY-based genomics company looking to re-invent healthcare. They understand that current healthcare providers use just a tiny portion of the information from the human body, and little or no environmental data, to categorize a patient as either sick or well. They are building predictive patient-health solutions that leverage a much richer, deeper and broader set of information. To deliver on this mission they must unleash the power of the cloud; it is the only way they can meet the challenges presented by the scale, sensitivity and complexity of the data and the sophistication of their probabilistic testing algorithms. They are not leveraging the cloud to run traditional healthcare solutions, but re-inventing what healthcare looks like.

The second use case is an institutional, agency-model broker known for their technology-first approach. They were a FinTech company before the term existed. Sitting on years of institutional data consisting of hundreds of petabytes of tick trade data, they are looking to harness the power of this information as a vehicle for changing how they do business. By building a highly performant data lake and sophisticated AI algorithms, the firm wants to crunch billions of records in seconds to deliver recommendations on trade-completion strategies, both for internal consumers and ultimately as an “as a Service” offering. Once again, this is a mission that can only be tackled by leveraging the scale and flexibility of the cloud.

Who wins? Do large, multi-national organizations have enough size and staying power that they can afford to take a “lift and shift” approach to the Cloud, replatforming their existing enterprise workloads and then taking a slow methodical approach to transformation? Or is the pressure from upstarts across every industry – the new HealthTechs and FinTechs – going to be so disruptive that the incumbents need to rethink their transformation strategy and partners?

The race is just beginning as, by most estimates, only 10-20% of workloads have moved to the public cloud. Over the next two years we will reach a tipping point, with more than half of all workloads predicted to be running in the public cloud. Microsoft is well-positioned with Enterprises over this timeframe. However, if Amazon continues their pace of delivering innovative, disruptive services and couples that with their increased focus on Enterprise marketing and sales, expect them to retain the throne. One thing is certain: the rate of change will only continue to accelerate, and the winners won’t win by sitting still.

Consolidated Audit Trail (CAT) is live, what’s next?!

Compliance beyond Phase 2a/b – Top 5 things firms should be considering as they near initial go-live milestones.

It seems the industry has resigned itself to the fact that CAT going live is no longer a matter of ‘IF’ but ‘WHEN’. FINRA’s ability to work through thorny issues and keep up with its deliverables and promises to date has proven the naysayers wrong. The general view is that Phase 2a will most definitely happen on time!

While the industry at large is working very hard toward successful testing and the April 2020 go-live of Phase 2a, this article aims to give firms a moment to pause and consider other critical items. The list is not exhaustive, and there is no intention to cover the obvious challenges, e.g. Phases 2c/d, Legal Entities and FDID, linkages, representative orders, customer data, error corrections, etc. That is an entirely different topic and would require its own focus.

Regardless of your CAT solution (internal, vendor, etc.), the aim is to provide practical considerations that will bear fruit and make your CAT implementation more accurate, meaningful, and sustainable.

Readiness Assessment

The change in the CAT implementation from a ‘big bang’ to a phased go-live has been of tremendous benefit to the industry; according to some experts, CAT would not be anywhere near as far along if not for this change. It also gives the industry a tremendous opportunity to avoid the typical costly and draining ‘remediation process’.

With CAT there is a unique opportunity to take a pulse check very early on, as you progress through the phases, by conducting an independent ‘Health Check’. The output is valuable: it confirms the soundness of the current implementation, influences future controls, informs upcoming phases, and makes overall change management much more cohesive.

BAU Transition

Due to the multiple go-live dates, the transition to BAU is not a trivial or typical exercise when it comes to CAT. The resources working on the immediate implementation will likely have to continue rolling out future phases. The strategy will be unique to each firm, its size, its locations, etc. To get you started, some low-hanging fruit includes knowledge transfer, documentation, training materials, regional ownership vs. follow-the-sun coverage, and initial headcount requirements with ways to scale as the scope grows, among others.

Controls

Controls are the fabric that gives senior management, auditors and regulators some level of comfort around the accuracy, timeliness and completeness of regulatory reporting. Unfortunately, controls are typically built in hindsight, after a major flaw is uncovered or an audit points out a specific weakness. Although at times necessary, building controls on the back of an incident is far from ideal. Firms should build solid controls tailored to their implementation, ‘new business’ process and risk tolerance. Consider using independent tools for some controls; they can help your firm establish credibility, benefit from a ‘crowd-sourced’ approach to controls and avoid a siloed viewpoint.

Service Level Agreements (SLAs)

One of the hot-button topics for the industry is the ‘error correction cycle’ and its impact on ‘exception management’. Essentially, firms will have about a day and a half to correct errors (the T+3 correction requirement is measured from Trade Date, and FINRA will provide broker-dealers with errors by 12pm the next day). Drafting and finalizing SLAs with the key players in the process (e.g. Middle Office, Trade Capture, Technology, etc.), so that the changes needed to facilitate a reasonable exception management and error correction process can be made, is a very worthwhile exercise.

Traceability

With the passing of time, and the natural attrition of the SMEs working on the implementation, knowing the ‘why’, ‘how’, ‘who’ and ‘when’ of your program will be critical. It is inevitable that assumptions are made, rule interpretations specific to a business line are penned, and bespoke code is developed to deal with a unique problem. It may be obvious now why something was done or implemented a certain way; that will NOT be the case with the passing of time. Ensuring that you have clear traceability, evidence of sign-off and approval of critical decisions will not only allow your work to withstand the test of time, it will make the lives of the people who own the process after you that much easier. Although the benefit will not show up for a very long time, eventually this due diligence will pay off and earn your work a solid reputation.

All in all, as with any complicated topic, there are multiple other items that firms should be thinking about now, e.g. the impact on surveillance, data lineage and governance, change management, etc. The five covered above seem the most practical to tackle at this stage, but you should NOT stop there! Wishing you a smooth implementation and a successful go-live!

Deutsche Börse – Leveraging AWS For Rapid Infrastructure Evolution

Abstract

Client: The Deutsche Börse, Market Data and Services Business Groups

Project Dates:  January 2017 through January 2018

Technology Solutions Used:

  • AWS Cloud Hosting
  • Docker Swarm: orchestration framework
  • Redis: in-memory cache
  • Confluent Kafka: scalable replay log
  • TICK: monitoring framework
  • Graylog: log aggregation
  • Jenkins: CI/CD pipeline

Summary

Leveraging AWS, we empowered our client with the automation and tools needed to rapidly create and test a production-scale on-demand infrastructure within a very tight deadline to support MIFID2 regulations.

Given the extreme time pressure we were under to deliver a mission-critical platform, together with Risk Focus we decided to use AWS for Development, QA, and UAT, which proved to be the right move, allowing us to hit the ground running. The Risk Focus team created a strong partnership with my team to deliver the project.

Maja Schwob

CIO, Data Services, Deutsche Börse

Problem Statement

In 2017, the Deutsche Börse needed an APA (Approved Publication Arrangement) developed for their RRH (Regulatory Reporting Hub) business to support MIFID2 regulations, to be fully operational by January 3, 2018. The service provides real-time MIFID2 trade reporting to around 3,000 different financial services clients. After an open RFP process, the Deutsche Börse selected Risk Focus to build this system, expand its capacity so it could process twenty times the volume of messages without increasing latency, and deliver the system to their on-premises hardware within four months.

AWS-Based Solution

To satisfy these needs, we proposed a radical infrastructure overhaul that replaced their existing Qpid bus with Confluent Kafka. This required both architecture changes and configuration tuning.
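
To give a sense of the producer-side tuning such a migration entails, below is a minimal Kafka producer sketch in Java with throughput-oriented settings. The broker addresses, topic name and parameter values are illustrative assumptions, not the configuration actually arrived at for this engagement.

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ReportPublisher {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka-1:9092,kafka-2:9092"); // placeholder brokers
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        // The kind of knobs that get iterated on during performance testing (values are examples only)
        props.put(ProducerConfig.ACKS_CONFIG, "all");           // full acknowledgement for regulatory data
        props.put(ProducerConfig.LINGER_MS_CONFIG, "5");        // small batching delay to improve throughput
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, "262144");  // 256 KB batches
        props.put(ProducerConfig.COMPRESSION_TYPE_CONFIG, "lz4");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish one illustrative trade-report message
            producer.send(new ProducerRecord<>("trade-reports", "trade-123", "{\"payload\":\"...\"}"));
        }
    }
}
```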

The client’s hardware procurement timelines and costs precluded the option to develop and test on-premises. Instead, we developed, tested and certified the needed infrastructure in AWS and applied the resulting topology and tuning recommendations to the on-site infrastructure. Finding the optimal configuration required executing hundreds of performance tests, with hundreds of millions of messages flowing through a complex, mission-critical infrastructure; this would have been impossible in the few weeks available without the elasticity and repeatability provided by AWS. It was implemented using an automated CI/CD system that built both the environment and the application, allowing developers and testers to create production-scale, on-demand infrastructure efficiently and cost-effectively.
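
As a rough illustration of what launching on-demand test capacity can look like, here is a minimal sketch using the AWS SDK for Java v2. The region, AMI ID, instance type and instance count are placeholder assumptions; in the actual project, environments were created by the automated CI/CD pipeline rather than hand-launched like this.

```java
import software.amazon.awssdk.regions.Region;
import software.amazon.awssdk.services.ec2.Ec2Client;
import software.amazon.awssdk.services.ec2.model.InstanceType;
import software.amazon.awssdk.services.ec2.model.RunInstancesRequest;
import software.amazon.awssdk.services.ec2.model.RunInstancesResponse;

public class PerfEnvLauncher {
    public static void main(String[] args) {
        // Placeholder region; Frankfurt (eu-central-1) is only an assumption here
        try (Ec2Client ec2 = Ec2Client.builder().region(Region.EU_CENTRAL_1).build()) {
            RunInstancesRequest request = RunInstancesRequest.builder()
                    .imageId("ami-0123456789abcdef0")     // placeholder AMI pre-baked with the messaging stack
                    .instanceType(InstanceType.C5_XLARGE) // example instance type
                    .minCount(3)                          // example cluster size for one performance run
                    .maxCount(3)
                    .build();
            RunInstancesResponse response = ec2.runInstances(request);
            response.instances().forEach(i -> System.out.println("Launched " + i.instanceId()));
        }
    }
}
```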

Financial Services Domain Expertise

Although this was a technical implementation, we were ultimately selected by the business unit at the Deutsche Börse because of our strong Financial Services domain expertise and our ability to deliver on both the technical and the business requirements. The stakeholder group also included the internal IT team and BaFin (the German financial regulator), and all technology, infrastructure and cloud provider decisions had to be made by the three groups collectively.

Risk Focus’s deep domain knowledge in Regulatory Reporting and Financial Services was crucial to understanding the client’s needs and proposing a viable solution that satisfied all stakeholders. That domain expertise, in combination with Risk Focus’s technology acumen, allowed for the delivery of a service that met the requirements.

Setting the Stage for The Future

The system was delivered to the client’s data center on time, within a startlingly short timeframe. We worked closely with their internal IT departments to fully hand over the delivered solution, allowing us to disengage from the project and effectively transfer all knowledge to the internal team. All test environments and automation were handed over to the client, enabling them to further tune and evolve the system.

The ability to develop and experiment in AWS empowers the client to make precise hardware purchasing decisions as volumes change, and they now have a testbed for further development to adapt to new regulatory requirements. They are well-positioned for the future, as the AWS environment provides a pathway to public cloud migration once that path is green-lighted by regulators.

Using the Siren Platform for Superior Business Intelligence

Can we use a single platform to uncover and visualize interconnections within and across structured and unstructured data sets at scale?

Objective

At Risk Focus we are often faced with new problems or requirements that cause us to look at technology and tools outside of what we are familiar with. We frequently engage in short Proof-Of-Concept projects based on a simplified, but relevant, use case in order to familiarize ourselves with the technology and determine its applicability to larger problems.

For this POC, we wanted to analyze email interactions between users to identify nefarious activity such as insider trading, fraud, or corruption. We identified the Siren platform as a tool that could potentially aid in this endeavor by providing visualizations on top of Elasticsearch indexes to allow drilling down into the data based on relationships. We also wanted to explore Siren’s ability to define relationships between ES and existing relational databases.

The Setup

Getting started with the Siren platform is easy given the dockerized instances provided by Siren and the Getting Started guide. Using the guide, I was able to get an instance of the platform running with pre-populated data quickly. I then followed their demo to interact with Siren and get acquainted with its different features.

Once I had a basic level of comfort with Siren, I wanted to see how it could be used to identify relationships in emails, such as who communicates with whom and whether anyone circumvents Chinese wall restrictions by communicating through an intermediary. I chose the publicly available Enron email corpus as my test data and indexed it in ES using a simple Java program that I adapted from code I found online. I created one index containing the emails and another index of all the people, identified by email address, who were either senders or recipients of the emails. The resulting data contained half a million emails and more than 80,000 people.
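
The gist of that indexing logic looks roughly like the sketch below, which uses the Elasticsearch high-level REST client for Java. The field names and document shape are my own illustrative choices, not the exact schema from the program I adapted.

```java
import java.io.IOException;
import java.util.Map;

import org.apache.http.HttpHost;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

public class EnronIndexer {
    public static void main(String[] args) throws IOException {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            // One document per email; a second "people" index holds every sender/recipient address
            IndexRequest email = new IndexRequest("emails").source(Map.of(
                    "from", "jane.doe@enron.com",
                    "to", "john.smith@enron.com",
                    "subject", "Q3 forecast",
                    "body", "...",
                    "sent", "2001-05-14T09:30:00"));
            client.index(email, RequestOptions.DEFAULT);
        }
    }
}
```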

With the data in place, I next set up the indices in Siren and defined the relationships between them. The straightforward UI makes this process very simple. The indices are discoverable by name, and all of the fields are made available. After selecting the fields that should be exposed in Siren, and potentially filtering the data, a saved search is created.

Once all of the indices are loaded and defined, the next step is to define the relationships. There is a beta feature to do this automatically, but it is not difficult to set up manually. Starting with one of the index pattern searches, the Relations tab is used to define the relationships the selected index has to any of the others that were defined. The fields used in the relationships must be keyword types in ES, or primary keys for other data sources; for example, the people index can be created with an explicit keyword mapping on the email address, as shown in the sketch below.
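
This is a hedged sketch of creating such an index with the high-level REST client; the mapping shown is a minimal version, and the index I actually created may have included additional fields.

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.client.indices.CreateIndexRequest;
import org.elasticsearch.common.xcontent.XContentType;

public class CreatePeopleIndex {
    public static void main(String[] args) throws Exception {
        try (RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
            CreateIndexRequest request = new CreateIndexRequest("people");
            // "email" is mapped as a keyword so Siren can use it in a relation (and as a primary key)
            request.mapping(
                    "{ \"properties\": { \"email\": { \"type\": \"keyword\" } } }",
                    XContentType.JSON);
            client.indices().create(request, RequestOptions.DEFAULT);
        }
    }
}
```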

Now that the indices are loaded and connected by relationships, the next step is to create some dashboards. From the Discovery section, the initial dashboards can be automatically created with a few common visualizations that are configured based on the data in the index. Each dashboard is linked to an underlying saved search which can then be filtered. There is also a visualization component that allows for filtering one dashboard based on the selection in a related dashboard.

Dashboard

Each dashboard is typically associated with a saved search and contains different visualizations based on the search results. Some of the visualizations show an aggregated view of the results, while others provide an alternative way to filter the data further and to view the results. Once the user has identified a subset of data of interest in a particular dashboard, s/he can quickly apply that filter to another related dashboard using the relational navigator widget. For example, one can identify a person of interest on the people dashboard and then click a link on the relational navigator to be redirected to the emails dashboard, which will be filtered to show just the emails that person sent/received.

The above screenshot shows two versions of the same information. The people who sent the most emails are on the x-axis, and the people they emailed are on the y-axis, with the number of emails sent shown in the graph. By clicking on the graphs, the data can be filtered to drill down, for example, to emails sent from one specific person to another.

Graph Browser

One of the most interesting features of Siren is the graph browser, which allows one to view search results as a graph with the various relationships shown. It is then possible to add or remove specific nodes, expand a node to its neighboring relationships, and apply lenses to alter the appearance of the graph. The lenses that come with Siren allow for visualizations such as changing the size or color of nodes based on the value of one of their fields, adding a glyph icon to a node, or changing the labels. Custom lenses can also be developed via scripts.

In the screenshot above, I started with a single person and then used the aggregated expansion feature to show all the people that person had emailed. The edges represent the number of emails sent. I then took it one step further by expanding each of those new nodes in the same way. The result is a graph showing a subset of people and the communication between them.

Obstacles

As this was my first foray into both Elasticsearch and Siren, I faced some difficulty in loading the data into ES in such a way that it would be useful in Siren. In addition, the data was not entirely clean given that some of the email addresses were represented differently between emails even though they were for the same person. There were also many duplicate emails since they were stored in multiple inboxes, but there was no clear ID to link them and thus filter them out.

Apart from the data issues, I also had some difficulty using Siren. While the initial setup is not too difficult, there are some details that can easily be missed, leading to unexpected results. For example, when I loaded my ES indices into Siren, I did not specify a primary key field. This is required to leverage the aggregated expansion functionality in the graph browser, but I didn’t find anything in the documentation about it. I also experienced some odd behavior when using the graph browser. Eventually, I contacted someone at Siren for assistance. I had a productive call in which I learned that there was a newer patch release that fixed several bugs, especially in the graph browser. They also explained how the primary key of the index drives the aggregated expansion list. Finally, I asked how to write custom scripts to create more advanced lenses or expansion logic. Unfortunately, this is not documented yet and is not widely used; most people writing scripts currently just modify the packaged ones.

Final Thoughts

In the short amount of time that I spent with Siren, I could see how it can be quite useful for linking different data sets to find patterns and relationships. With the provided Siren platform Docker image, it was easy to get up and running. As with any new technology, there is a learning curve to fully utilize the platform, but for Kibana users it will be minimal. The documentation is still lacking in some areas, but the team is continuously updating it and is readily available to assist new users with any questions they may have.

For my test use case, I feel that my data was not optimal for maximizing the benefits of the relational dependencies and navigation, but for another use case or a more robust data set, it could be beneficial. I also did not delve into the monitoring/alerting functionality to see how it could be used with streaming data to detect anomalies in real time, so that could be another interesting use case to investigate in the future.
