Confluent Recognizes Risk Focus as a Preferred Professional Services Partner

Confluent Recognizes Risk Focus as a Preferred Professional Services Partner

For Immediate Release
Tara Ronel
June 7, 2019

[New York, NY, June 7, 2019] – Risk Focus, a leading consultancy for highly-regulated industries, is proud to announce that it has earned Preferred status in the Confluent Partner Network Program, the enterprise event streaming platform pioneer.


As a Confluent Preferred Partner, Risk Focus has met all the stringent business and technical program requirements, including significant customer development, training and certification of their staff. They have also demonstrated the required skill sets to help its financial services customers implement Apache™ Kafka™ and the Confluent platform in order to harness its flexible event streaming power in new or existing applications.

“We have invested heavily to develop our team through hands-on enterprise project experience, process implementation, Kafka courses and Confluent certifications,” shares Vassil Avramov, Founder and CEO of Risk Focus. “Confluent is, by far, the leader in the event streaming space, and we are thrilled to be nominated to a very small group of Confluent Preferred organizations. We recognize that this is only the beginning, and are committed to jointly helping our customers unlock the power of their data for years to come.”

“Financial services organizations are among the most impacted by the the effects of the digital revolution where technology and data are becoming key for remaining relevant to customers.  That is why so many are turning to event streaming platforms to power their applications to deliver the modern banking experiences customers expect. As the demand from this sector grows, it is important to have fully-trained and experienced Preferred Partners like Risk Focus assisting financial services organizations deploy solutions based on Confluent Platform to deliver contextual event driven applications.”“We have invested heavily to develop our team through hands-on enterprise project experience, process implementation, Kafka courses and Confluent certifications,” shares Vassil Avramov, Founder and CEO of Risk Focus. “Confluent is, by far, the leader in the event streaming space, and we are thrilled to be nominated to a very small group of Confluent Preferred organizations. We recognize that this is only the beginning, and are committed to jointly helping our customers unlock the power of their data for years to come.”

Vassil Avramov

Founder and CEO, Risk Focus

Confluent provides a complete event streaming platform based on Apache Kafka and extends Kafka’s capabilities with development, connectivity and stream processing capabilities as well as management and operations. The Confluent Partner ecosystem is an important element to drive success with customers to assist customers in developing robust data connectivity, building stream processing applications and ensuring the ongoing success of our joint customer’s strategic initiatives.

“Financial services organizations are among the most impacted by the the effects of the digital revolution where technology and data are becoming key for remaining relevant to customers” said Simon Hayes, vice president of corporate and business development, Confluent. “That is why so many are turning to event streaming platforms to power their applications to deliver the modern banking experiences customers expect. As the demand from this sector grows, it is important to have fully-trained and experienced Preferred Partners like Risk Focus assisting financial services organizations deploy solutions based on Confluent Platform to deliver contextual event driven applications.”

“Financial services organizations are among the most impacted by the the effects of the digital revolution where technology and data are becoming key for remaining relevant to customers.  That is why so many are turning to event streaming platforms to power their applications to deliver the modern banking experiences customers expect. As the demand from this sector grows, it is important to have fully-trained and experienced Preferred Partners like Risk Focus assisting financial services organizations deploy solutions based on Confluent Platform to deliver contextual event driven applications.”

Simon Hayes

Vice President, Corporate and Business Development, Confluent


About Risk Focus

Risk Focus is a NYC-based company that provides strategic IT consulting to global enterprises. The company’s Infrastructure practice provides solutions, methodologies, and strategic guidance for digital transformation, containerization, and automation. Its Financial Services team offers strong domain expertise and technology acumen to deliver feature-focused solutions in Capital Markets. To learn more about Risk Focus, please visit

Deutsche Börse – Leveraging AWS For Rapid Infrastructure Evolution

Deutsche Börse – Leveraging AWS For Rapid Infrastructure Evolution


Client: The Deutsche Börse, Market Data and Services Business Groups

Project Dates:  January, 2017 through January, 2018

Technology Solutions Used:


  • AWS Cloud Hosting
  • Docker Swarm: orchestration framework
  • Redis: in-memory cache
  • Confluent Kafka: scalable replay log
  • TICK: monitoring framework
  • Greylog: log aggregation
  • Jenkins: CI/CD pipeline

Summary: Leveraging AWS, we empowered our client with the automation and tools needed to rapidly create and test a production-scale on-demand infrastructure within a very tight deadline to support MIFID2 regulations.

Given the extreme time pressure that we were under to deliver a mission-critical platform, together with Risk Focus’ we decided to use AWS for Development, QA, and UAT which proved to be the right move, allowing us to hit the ground running. The Risk Focus team created a strong partnership with my team to deliver the project.

Maja Schwob

CIO, Data Services, Deutsche Börse

Problem Statement

In 2017, the Deutsche Börse needed an APA to be developed for their RRH business to support MIFID2 regulations, to be fully operation by January 3, 2018. The service provides real-time MIFID2 trade reporting services to around 3,000 different financial services clients. After an open RFP process, The Deutsche Börse selected Risk Focus to build this system and expand its capacity so it process twenty times the volume of messages without increasing latency and deliver the system to their on-premises hardware within four months.

AWS-Based Solution

To satisfy these needs, we proposed a radical infrastructure overhaul of their systems that involved replacing their existing Qpid bus with Confluent Kafka. This involved architecture changes and configuration tuning.

The client’s hardware procurement timelines and costs precluded the option to develop and test on-premises. Instead, we developed, tested and certified the needed infrastructure in AWS and applied the resulting topology and tuning recommendations for the on-site infrastructure. Finding optimal configuration required executing hundreds of performance tests with 100s of millions of messages flowing through a complex mission-critical infrastructure, and it would have been impossible in the few weeks available without the elasticity and repeatability provided by AWS. This was implemented using an automated CI/CD system that built both environment and application, allowing developers and testers to efficiently create production-scale on-demand infrastructure very cost effectively.

Financial Services Expertise

Although this was a technical implementation, we were ultimately selected by the business unit at the Deutsche Börse to provide an implementation of their service because of our strong Financial Services domain expertise and implemented both technical and business requirements. The stakeholder group also included the internal IT team and the Bafin (German Financial Regulator), and all technology, infrastructure and cloud provider decisions had to be made in tandem by all three groups collectively.

Risk Focus’s deep domain knowledge in Regulatory Reporting and Financial Services was crucial to understanding and proposing a viable solution to address the client’s needs and satisfy all stakeholders. That domain expertise, in combination with Risk Focus’ technology acumen, allowed for the delivery of the service that met requirements.

Setting the Stage for The Future

The system was delivered to the client’s data center on time within a startlingly short timeframe. We worked with their internal IT departments closely to transfer fully over the delivered solution, allowing us to disengage from the project and effectively transfer all knowledge to the internal team. All test environments and automation were handed over to the client, allowing them to further tune and evolve the system.

The ability to develop and experiment in AWS empowers the client to make precise hardware purchasing decisions as volumes change, and they now have a testbed for further development to adapt to new regulatory requirements.  They are well-positioned for the future, as it provides a pathway to public cloud migration once that path is greenlighted by regulators.

Building Large Scale Splunk Environments? Automation and DevOps are a must.

Building Large Scale Splunk Environments? Automation and DevOps are a must.


Splunk is a log aggregation, analysis, and automation platform used by small and large enterprises to provide visibility into computing operations and as a Security Incident and Event Monitoring platform. It is a very mature product, with deep penetration in financial services firms. The Elastic-Logstash-Kibana (ELK) stack is an open source suite with a similar feature set.   

For a large organization, log aggregation platforms can easily end up ingesting terabytes of data. Both Splunk and ELK require incoming data to be indexed and rely on distributed storage to provide scale. In practice, this means dozens, or even hundreds of individual nodes operating in a cluster. Once an organization moves past a 2-3 node log-aggregation cluster, a strong automation framework is essential for testing and deploying all components of the log aggregation platform. 

Risk Focus has built large analytics clusters for a number of clients in recent years, including those focused on financial and healthcare data analysis. Several engagements have involved building large Splunk clusters. The automation challenge for log aggregation platforms are similar in many respects to other analytics platforms. In almost all cases the three major challenges are: 

  • Managing data ingestion/archiving 
  • Configuring and scaling compute clusters required for analysis  
  • Maintaining consistent configuration across entire cluster and robust testing 

Infrastructure automation

Based on our experience, Splunk clusters containing more than a handful of nodes should not be built or configured manually. Automation delivers configuration consistency and prevents configuration drift. It also improves efficiency in resource deployment and utilization. This is true for both the base system configuration as well as cluster deployment and configuration tasks. 

Organizations using current generation technology can deploy standardized base operating system images to virtual machines, speeding up initial infrastructure deployment significantly. An automation and configuration management tool (such as Salt or Ansible) can then be utilized to deploy software and customized configuration onto each node within the network.  Cloud orchestration technology (eg. Terraform or AWS Cloudformation) may be utilized alongside configuration management tools in particularly large environments. As an aside, AWS has a managed Elasticsearch/ELK offering. For organizations considering Cloud deployments, there is no need to re-invent the wheel, the AWS offering does virtually everything an organization might want in terms of infrastructure automation, including multi-AZ deployments for High-Availability.

Utilizing such automation frameworks makes it extremely easy to scale the environment up (and in certain cases down). It also simplifies other common management tasks critical for operational stability, including: 

  • Adding search-heads or indexersThis is easier with a well-understood automated deployment process not subject to manual error. 
  • Disaster Recovery: is easier and less costly to accommodate when compute resources are stood-up quickly and confidently with automated procedures. 
  • Resource EfficiencyWhen search, ingestion and reporting follow a specific pattern during the day, or different business units require additional capacity, infrastructure automation enables re-scaling of components and re-directing resources towards other tasks/nodes. 
  • Testing/Upgrading/Patching/Re-Configuring software: A consistent and modern DevOps practice is necessary to make modifications to a large cluster reliably and with minimal downtime. 
  • Security and Auditability: Financial services firms, health-care providers, and utilities face high regulatory burdens. Auditors and regulators have an interest in the computational operations of these clients. Splunk/ELK is a good source of data for audits and an indicator of mature operational management and monitoring practices within the information technology organization. As an organization begins to rely more on Splunk for both security monitoring and operational visibility, it can expect auditors and regulators to treat Splunk as critical infrastructure and take a greater interest in the Splunk environment itself. Employing automation and consistent DevOps practices to build, deploy and manage the Splunk environment goes a long way towards allaying regulatory concerns. But more importantly, to ensure operational stability, large Splunk clusters (or any other large-scale analytical/compute environment) should use automation for initial deployments and to manage configuration drift across the fleet. 

Splunk Administration Tools

Splunk has a rich toolset of GUI and CLI tools to manage configurations for indexers, forwarders, license managers and other components in a Splunk cluster. Most firms are likely to have some form of automation/DevOps standard across the organization, in aspiration if not yet in full practice. A key part of planning for a large Splunk environment is defining where the boundary between standardized, cross-platform configuration management tools end and where the Splunk toolset exercises control. 

There is no perfect answer to where this boundary lies.  Risk Focus works with clients who have limited the use of Splunk management tools to managing licenses and performed virtually every other task with scripted automation frameworks. There are other clients, who chose to go in the opposite direction and rely largely on Splunk’s tools to manage and configure the cluster. 

Making the right decision on tooling is dependent on the practices and skill-set of the team expected to support Splunk. Organizations that rely on external service providers (including Splunk PS) to maintain their environment need to consider whether these providers are familiar with their preferred automatioor configuration management toolkits. Organizations that prioritize mobility among technology staff will want to place more emphasis on common tooling across all applications, rather than rely on Splunk-specific management tools. 

Application Life Cycle

Testing and Quality Assurance 

An important and often underappreciated part of Splunk environment management is how to deal with the testing and release of software updates and patches, reports, dashboards, and applications. 

Organizations using Splunk for critical business activity should treat Splunk like any other business critical system and follow best practice for software delivery and maintenance. That means establishing processes to test software releases. A release can impact user activity, reports, dashboards, ingestion configurations, and system performance. Building Splunk test environments and efficiently rolling changes through a development/test/deployment lifecycle requires automation. Absent such automation, test cycles become expensive and will not be executed consistently. Mature organizations using a Continuous Integration/Continuous Deployment (CI/CD) lifecycle will find that the effort expended to integrate Splunk into their CI/CD pipeline delivers enormous rewards over time. 

Release Management 

In managing releases and updates to Splunk configuration, it is useful to view Splunk in a broader context as a data analytics tool. Most organizations have some experience with data analytics tools such as SAP, SaaS, Informatica, or Business Intelligence. For these systems, organizations have often established fine-grained control over reports and dashboards used in critical business activities. This includes limiting users’ ability to change them in production. Splunk is no different. We advise customers to clearly establish permission and ownership boundaries in their production Splunk environment. These should be balanced so that they do not constrain the Splunk users’ ability to analyze data. 

One way to balance the tension between user freedom and organizational oversight is to create a dividing line between reports/dashboards used in daily operations and ad-hoc analyses. We find that the following best practices are quite useful: 

  • Critical reports and dashboards should be tightly controlled via Splunk’s permission tools 
  • Changes to critical dashboards, however minor, should be tested 
  • Software upgrades should involve user acceptance testing and automated testing of all such dashboards 
  • Data from test environments should be ingested continuously into a Splunk test environment. Application/infrastructure changes which might impact Splunk should be tested here.  
  • Critical monitoring dashboards should be validated as part of application testing to ensure any changes to software/log format do not impact these dashboards. 


Splunk scales well to meet the needs of large enterprises, but the topology can get complex with increased scale.  An example of a deployment framework for a multi-tenant Splunk cluster is below. Depending on your organization’s need, automated or semi-automated scaling can be built into the framework.


The current generation of automation tools and DevOps best practices can deliver significant benefits to an organization seeking to manage and maintain a large Splunk cluster. Every organization should carefully consider the benefits of using such tools to manage their environment. Organizations relying on Splunk for critical operational management should treat it as such and build a robust testing framework for their environment. 


    Confluent and Risk Focus Partner to Bring the Power of Event Streaming to Financial Services

    Confluent and Risk Focus Partner to Bring the Power of Event Streaming to Financial Services

    Confluent and Risk Focus Partner to Bring the Power of Event Streaming to Financial Services 

    Confluent the enterprise event streaming platform pioneer, and Risk Focus, a leading consultancy for highly regulated industries, have joined forces to help Financial Services clients realize the efficiency and cost-effective benefits of event streaming architectures for Regulatory Compliance. 

    SaltStack: Driving SecOps Compliancy Through Event-Based Automation

    SaltStack: Driving SecOps Compliancy Through Event-Based Automation

    Guest Post Written by: Sean BrownArea Director, EMEA @ SaltStack

    • Automation is a necessity to deliver compliance and configuration management and for all enterprises operating at scale
    • Many existing management platforms were built for a previous generation of IT and cannot scale nor flex to cloud native operations
    • An opportunity exists to reduce cost and risk by considering bi-directional, closed loop event-based automation for configuration management and SecOps compliance

    Download your own copy of the SaltStack SecOps white paper HERE.

    Whether you need central heating or air conditioning, you’ll be familiar with the process of defining and setting a desired state, in this instance setting the temperature.  The sensors within the system then provide constant feedback and the desired state is maintained until the system receives a new target state.  More comprehensive systems allow for more finely grained defined states to be set, perhaps by room or by time of day. 

    Platforms delivering change and configuration management in IT offer similar capabilities albeit with considerably more variables to be managed and most likely involving several people with responsibility for the service level management of the platforms in operation.  How desired state is achieved and maintained varies by platform and approach.

    Some platforms can define a target state and deploy this to a system under management, it is a simplex operation – “top down”.  Once a target state is set there is no ability to receive feedback as this platform has no monitoring capability.   The target state deployment can be scheduled, to re-instantiate the target state, perhaps at an hourly, daily or weekly interval but between intervals there is no visibility into the state of the system under management. 

    “75% of organizations still don’t automate server or application provisioning” [Gartner]

    This is analogous to setting your room temperature at the radiator and having no knowledge of the changing weather conditions, time of day, level of occupancy in the room or knowledge of the window that someone unexpectantly left open.

    Other platforms can offer more capability.  A target state can be defined, deployed and another, separate platform can be deployed to monitor state.  Many of the security compliance propositions available today work in this way, deploying agents to the systems and then reporting back when the system under management deviates from target state (or compliant state in this instance).

    Average cost of a data breach in 2018 was $3.16m and the likelihood of a recurring breach over the next two years: 27.9%  Without automation, estimated cost increases to $4.43 million, a $1.55 million net cost difference. (IBM/Ponemon)

    At this point two teams work together, one to report the compliance drift and the other tasked with remediating that drift back to complaint state.  It works but requires integration of both platforms and processes, resulting in additional cost, risk and operational inertia.  When the systems under management span multiple data-centres, technologies, geographies and operating domains (on premise, hosted, public cloud), more teams have to be involved resulting in more risk, inertia and ultimately cost.



    Of course, the analogy of domestic heating systems and complex enterprise IT is limited. A closer comparison may be found in aviation. Modern airliners are a collection of systems and sensors, operating in concert with each other utilising real-time feedback and automation. The systems are too complicated for any individual pilot to manage and whereas initially other aircrew were added to share the workload, the last 30 years have seen the movement of many responsibilities to automated control logic capable of delivering a safe, repeatable and predictable outcome (destination reached) irrespective of the changing flight conditions (weight, balance, trim, air speed, fuel usage, wind direction, noise regulation, schedule).

    IT will consume 20% of all generated electricity by 2025.  A 2016 Berkeley laboratory report for the US government estimated the country’s data centres, which held about 350m terabytes of data in 2015, could together need over 100TWh of electricity a year by 2020. This is the equivalent of about 10 large nuclear power stations.

    It may be argued that IT is as complex (micro-services, containers, VM’s, public, private, hosted, Linux, Windows, legacy), equally regulated (PCI, CIS, DISA-STIG etc.) and more environmentally conscientious than ever before. 

    For this reason, enterprises are looking to optimise how they manage IT and the systems they have deployed in the past for this purpose are now too brittle to successfully incorporate more modern design patterns such as cloud and containers, or too expensive to scale to the 100k plus individual management targets many mid-size and enterprise.

    A solution may be more integration.  For example, the integration of a dedicated monitoring system alongside an agent-based compliance reporting system, a configuration management platform and a service level management/ticketing system that together allow the enterprise to define, target, deploy, monitor, react and report.  However, interdependent integration has associated costs and risks that many budget conscious businesses would rather avoid. 

    This pattern is playing out in the public cloud today whereby enterprises are finding that to support AWS, Azure and GCP (for example) along with their associated API’s and licensing is a false economy and a more one-stop-shop approach is economic reality. 

    Enterprise want to set their own direction.  Indeed, this is commercial necessity in both competitive market and in public sector where organisations are seeking any combination of differentiation, cost optimization or risk reduction. 

    SaltStack addresses the customer needs stated above and more.  Designed as an extensible, massively scalable and high-performance remote execution engine, SaltStack provides event-based automation capabilities that address IT operations, DevOps and SecOps concerns.  SaltStack can help customers achieve target state using either declarative or imperative approaches and utilises an asynchronous message bus architecture to deliver change and configuration instructions at incredible speed and scale.

    The SaltStack platform also includes the capability to deploy an agent-based monitoring solution (SaltStack Beacons) that utilise the same event bus to transmit messages back to the core, be they simple notifications of a successful system change/package deployment or something more time or operations risk sensitive such as a service being enabled when it should be disabled -telnet being turned on when it should be turned off for example. Being extensible means that customers can extend or build their own beacon monitors to build a unique insight, command and control capability for their extended estate. 

    This capability adds orchestration to automation, resulting in a platform that can interpret environmental change (e.g. file, disk, network, service, login, CPU utilisation, system operating temperature) and can react as required, be it fully autonomously or as part of a service chain process.  When applied to the SecOps use case the ability of an enterprise to deploy a target complaint state to their estate and then have SaltStack keep that estate in target compliant state through self-healing, automated remediation.  No third-party monitoring tools.  No third-party compliance reporting tools.  No integration cost and undeniably less risk.

    In March 2019 SaltStack will be launching a dedicated SecOps compliance service.  Utilising the platform described above to deploy target compliance profiles for Windows and Linux systems at launch, this will incorporate the delivery of pre-built, tested compliance profiles for CIS, DISA-STIG and NIST targets as well as dedicated SaltStack beacons for compliance reporting and alerting. 

    If a system under management moves out of target compliance the CISO and ITOps team will know, in real-time, and will have the ability to remediate back to target compliant state. Assured compliance at scale.  Just another use-case for SaltStack.

    To receive the SaltStack SecOps white paper, please tell us a little about yourself:

    Using Salt and Vagrant for Rapid Development

    Using Salt and Vagrant for Rapid Development

    Written by: Peter Meulbroek, Global Head of DevOps Solutions

    One of our main jobs at Risk Focus is to work closely with our clients to integrate complex tools into their environments, and we use a plethora of technologies to achieve our clients’ goals. We are constantly learning, adopting, and mastering new applications and solutions, and we find ourselves constantly creating demos and proofs of concept (PoCs) to demonstrate new configurations, methods, and tools.

    We use a variety of applications in our deliverables but often rely on Salt for the foundation of our solutions. Salt is amazing: it combines a powerful declarative language that can be easily extended in Python, a set of components that supports a diverse array of use cases from configuration and orchestration to automation and healing, and a strong supportive community of practitioners and users.

    In my own work I’m often off the grid, traveling to clients, at a client site, or in situations where a lack of internet access precludes me from doing work in the public cloud. In these situations, I rely on technology that allows me to experiment or demonstrate some of the key concepts in DevOps from the laptop. Towards this end, I do a lot of this work using Vagrant by Hashicorp. I love Vagrant. It’s a fantastic platform to quickly create experimental environments to test distributed applications. Vagrant’s DSL is based on Ruby, which fits in well with my past developer experience. Finally, Vagrant is easily extensible: it delegates much of the work of provisioning to external components and custom plugins and comes with Salt integration out of the box.

    After working with Salt and Vagrant on a set of demos, I’ve decided to share some of the tools I put together to improve and extend this basic integration. The tools are aimed at two goals: faster development of new environments and environmental validation. In pursuit of the first goal, this post is about host provisioning and integrating Salt states. In pursuit of the second, I will post a follow-up post to describe how to generate ServerSpec tests to validate newly-created hosts. These tools offer a quick path to creating and validating virtual infrastructure using Salt.



    The ironic aspect of using Vagrant is that, although it is implemented in Ruby and the central configuration file (the Vagrantfile) is implemented in Ruby, the actual implementation of Vagrantfiles is un-Ruby-esque. The syntax can be ugly, validation of the file is difficult, and the naïve implementation is not DRY*. However, with a bit of coding, the ugliness can be overcome. I’ve put together a set of provisioning configuration helper classes written in Ruby that allow me to more succinctly define the configuration of a test cluster and share code between projects. The idea behind the classes is very simple: extract from the Vagrantfile all of the ugly and repetitive assignments so that creating a simple Salt-ified platform is trivial.

    The code for this post is found at Readers are encouraged to check it out.

    *Don’t Repeat Yourself 


    The post assumes you have Vagrant installed on your local machine, along with a suitable virtualization engine (such as Oracle’s VirtualBox).

    1.) In your vagrant project directory, check out the code

    git clone

    2.) Copy the configuration file (vagrant-salt/saltconfig.yml.example) to the vagrant project directory as saltconfig.yml

    3.) Copy the sample vagrant file (vagrant-salt/Vagrantfile.example) to the vagrant project directory as Vagrantfile (replacing the default), or update your Vagrantfile appropriately

    4.) Set up default locations for Pillars and Salt States. From the Vagrant project directory:


    5.) Initialize the test machine(s)

    vagrant up

    Congratulations. You now have an example Salt cluster with one minion and one master. The bootstrap script has also created a simple Salt layout, complete with pillar and state directories and top files for both.



    The examples directory contains a few sample configuration files that can be used to explore Salt. These are listed in vagrant-salt/examples/topology and include:

    • One master, two minions
    • One master, one syndic, two minions

    To use either of these topologies, copy the example Vagrantfile and saltconfig.yml to the Vagrant project directory and follow instructions 2-5, above.


    Deep Background: The Classes

    The development necessary to get Vagrant and Salt to work together seamlessly is key management and configuration – creating and installing a unique set of keys per cluster to avoid reuse of keys or potential vulnerability.

    The code consists of a factory class to create the configuration for Salt-controlled hosts and a set of classes that represent each type of Salt host (minion, syndic, master). Each class loosely follows the adapter pattern.

    When put into action, the factory class takes a hash that specifies the hosts to be created. Each host created is defined by a value in this hash. It is quite convenient to use a yaml file to initialize this hash, and all examples given below list the configuration in yaml. Example yaml configurations are provided in the code distribution.

    There are three classes that can be instantialized to create configuration objects for each of the Salt types. All configuration objects need two specific pieces of information: a hostname and IP. If a host is to be included in Salt topology, the configuration must include the name of its Salt master.

    • The base for “salt-ified” hosts is the minion class, which corresponds to a host running a Salt minion.
    • The master class corresponds to a host running a Salt minion and also the Salt master process.
    • The syndic class corresponds to a Salt master that also runs the syndic process.

    The Configuration Structure

    The configuration structure is used by the factory class to instantiate a hash of host objects. It has three sections: defaults, roles, and hosts.

    The defaults section specifies project-wide defaults, such as number of virtual CPUs per host, or memory per host, as follows:


    memory: 1024

    cpus: 1

    Entries in this section will be overwritten by the role- and host-specific values. This section is particularly useful for bootstrapping strings.

    The roles section specifies values per-role (minion, master, syndic). This gives a location to specify role-specific configuration, such as the location of configuration files.




    The hosts section specifies the hosts to create and host-specific configuration. Each host must, at minimum, contain keys for role, IP, and the name of its master (when it has one):




    role: minion

    master: master



    role: master

    master: master

    Note that per-host values (such as memory or cpu count) can be added here to overwrite defaults.:



    cpus: 2

    memory: 1536


    role: master

    master: master


    The Vagrantfile

    Incorporating the configuration classes into a Vagrantfile greatly simplifies its structure. The factory class creates a hash of configuration objects. Executing the Vagrantfile iterates through this hash, creating each VM in turn. The objects store all necessary cross-references to create the desired topology without having to write it all out. The configuration objects also hide all the messy assignments associated with the default Salt implementation in Vagrant and allow the Vagrantfile to remain clean and DRY.

    The file Vagrantfile.example in the distribution shows this looping structure.


    Integrating Salt States

    The above description shows how hosts can be bootstrapped to use Salt. Of much more interest is integrating Salt with the configuration and maintenance of that host. This integration is fairly trivial. Included in the vagrant-salt/bin directory is a bash script called “” that will create a skeleton directory for Salt states and pillars. This directory structure can be used by the Salt master(s) by including the appropriate Salt master configuration. For example, with default setup, the included Salt master configuration will incorporate those directories:



    role: master


    memory: 1536

    cpus: 2

    master: master




    – /vagrant/saltstack/salt



    – /vagrant/saltstack/pillar



    Salt is a very powerful control system for creating and managing the health of an IT ecosystem. This post shares a foundational effort that simplifies the integration of Salt within Vagrant, allowing the user to quickly test deployment and implementation strategies both locally and within the public cloud. For developers, it gives the ability to spin up a new Salt cluster, validating configuration, states, and the more advanced capabilities of Salt such as reactors, orchestrators, mines, and security audits. The classes also provide an easy way to explore Salt Enterprise edition and the visualization capabilities it delivers.

    The next post in this series will focus on validation. As a preview, at Risk Focus we strongly believe in automated infrastructure validation. In the cloud or within container management frameworks, addressable APIs for all aspects of the environment mean that unit, regression, integration, and performance testing of the infrastructure is all automatable. The framework described in this post also includes a test generation model, to quickly set up an automated test framework for the infrastructure. Such testing allows for rapid development and can be moved to external environments. In the next post, we’ll go through using automatically generated, automated tests for Salt and Vagrant.