Before we talk in depth about stackd.io, we want to give a shout out to the team, and all of the folks who are helping make this possible.
Next, we have the internal development team:
Abe Music is a Senior Software Developer who has worked at Digital Reasoning for over four and a half years, helping drive and build the operations and cloud infrastructure for the company’s platform.
Steve Brownlee is also a Senior Software Developer. He has worked at Digital Reasoning for over one and a half years and helps build the tooling, visualizations, and user experiences for people to use the platform.
We also thank the entire Digital Reasoning team for providing us with the invaluable information needed to ensure that we are eliminating as many pain points as possible when managing a cloud infrastructure.
How we picked the name
stackd.io actually grew out of an inside joke between Steve and Abe. When discussing real-time socket communication for our web applications, Abe kept mistakenly naming socket.io as stackd.io, and we started joking that stackd.io would be a great name for a fake product that could do anything and solve any problem.
Months later when we started discussing building an API and UI for helping automate and monitor our cloud deployments, we called the final solution a Stack. We both immediately realized that we already had a name for the project!
Learning how to set up and launch machine instances on a cloud provider is not easy. As painless as Amazon tries to make it, it still requires exhaustive research, trial and error, and a good amount of technical prowess to launch instances, install the required software on each one, get those services running, and make them all talk to each other.
Those with some technical skill start to write automation scripts that help alleviate the mind-numbing tediousness of managing this process over and over again.
Here at Digital Reasoning, we used a powerful command-line tool to help automate this process using Python Jinja templates, but it was only a small step in the right direction. It remained a task relegated to those who understood programming and the Linux operating system. We discovered that many people were launching and terminating their instances directly from the ElasticFox plugin for Firefox.
In addition, people are prone to making mistakes. Wrong clusters of machines were being shut down, or being kept on too long, wasting money.
Something had to change.
We decided to take our expertise with Salt, Django, and simple user interfaces and craft an easily consumable REST API, and a no-brainer user interface that would allow anyone, with minimal training, to create Stacks of Hosts that would be automatically launched, provisioned, and managed. We felt it should also include administrative features like user groups, permissions, reporting, automated rules, notifications, and lease periods. These features would help alleviate the problem of mistaken termination and leaving instances running while not in use.
This is not a problem unique to Digital Reasoning. We discovered that many organizations need the ability to let non-experts manage their own Stacks in a cloud infrastructure. This led to another foundational purpose for the product – the ability to use it with any cloud provider, not just Amazon.
We’re only barely scratching the surface so far. With about 2 months of development under our belts, we just recently launched an internal beta of stackd.io to a limited group of people who we felt would use the product heavily. They provided lots of feedback, and based on their top items that needed to be included, we just recently completed development of version 0.3.
Our hope is to have the code open sourced no later than version 0.5. By that version release, we should have all of the major, base functionality that we feel is necessary for people to start using it.
We have already started using this application internally at Digital Reasoning and have seen immediate benefits. Team leaders are defining their own Stacks, and are already saving precious time by having stackd.io launch and provision their disparate environments. Stack definitions are already being shared amongst teams, which keeps each environment consistent and makes the teams using it more productive.
We looked at several of the emerging tools in the open source marketplace that are meant to help teams manage highly complex cloud infrastructures. The largest shortcoming was that they are all meant to manage one specific, pre-defined environment; the tools we found simply made the launching and provisioning of that environment easier.
Our research, field, sales, and engineering teams all require the ability to launch environments that serve very different purposes. In addition, the configuration of those environments changes on a regular basis as our research team discovers new ways of doing things, or our engineering team discovers how to do them more efficiently.
stackd.io is built from the ground up to allow everyone on the team to work on as many environments as they want.
We found that 100-instance clusters were being launched for temporary research and engineering tasks, but, as is prone to happen, the owners became distracted. Even though the cluster’s job was complete, it was not terminated in a timely manner, sometimes costing us thousands of extra dollars in wasted up-time.
By making Stacks easy to create and terminate, and by providing reports on asset usage, stackd.io lets a team find its largest sources of waste and oversight and dramatically reduce costs.
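To put rough numbers on that kind of waste, here is a back-of-the-envelope sketch in Python. The hourly rate and the idle time are made-up figures chosen for illustration, not numbers from our actual bill.

```python
def wasted_cost(instances: int, hourly_rate: float, idle_hours: float) -> float:
    """Cost of instances that keep running after their job is done."""
    return instances * hourly_rate * idle_hours


# Hypothetical: a 100-instance cluster at $0.50 per instance-hour,
# forgotten over a weekend (48 hours).
print(wasted_cost(100, 0.50, 48))  # prints 2400.0
```

Even at a modest assumed rate, one forgotten weekend lands in the thousands of dollars.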
As we designed multiple environments for our critical Hadoop deployment, which processes hundreds of millions of documents through our NLP engine, it quickly became clear that those environments had myriad purposes and differing lifetimes. Building them by hand was a time-consuming endeavor that was prone to mistakes and to changes not being communicated effectively.
By defining our environments through an easy user interface, and then re-using those definitions over and over again, our team saves large chunks of time. We also gain peace of mind from having a solid, trusted process for launching and provisioning an environment the same way, every time.
Having a powerful REST API driving stackd.io allows our engineering team to have a consistent, common tool that they can integrate with their tools that require the ability to automatically launch large groups of cloud machines with specific configurations; configurations that may change dynamically based on the current needs of the system.
This reduces fragmentation on how each team uses the cloud, and allows for maximum flexibility when designing the environment for the needs of each system.
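As a rough idea of what that integration could look like from another tool, here is a minimal Python sketch that builds a request asking the API to create a Stack. The base URL, the /stacks/ route, the payload fields, and the token header are all assumptions made for illustration; the real stackd.io API may look different.

```python
import json
import urllib.request

# Hypothetical base URL and route layout; not the real stackd.io endpoints.
BASE_URL = "https://stackdio.example.com/api"


def build_launch_request(title, description, hosts, token):
    """Build (but do not send) a POST request that asks the API for a new Stack."""
    payload = {"title": title, "description": description, "hosts": hosts}
    return urllib.request.Request(
        url=f"{BASE_URL}/stacks/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Token {token}",  # assumed auth scheme
        },
        method="POST",
    )


request = build_launch_request(
    title="nightly-test",
    description="Throwaway cluster for the nightly integration run",
    hosts=[{"count": 10, "size": "m1.large"}],
    token="not-a-real-token",
)
# urllib.request.urlopen(request) would actually send it.
```

Because the configuration is just data in the request body, a calling tool can change the host count or instance size on every run.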
Custom provisioning formulas
Many of our internal use-cases here at Digital Reasoning require us to experiment with bleeding edge technology and distributed systems. Some require very complicated installation procedures and management tasks.
Initially, we were using a lot of complex Python and bash scripts to alleviate the frustration involved in testing the latest and greatest technology, but that quickly became more challenging because only a handful of our engineers had the unique set of skills required to handle those tasks, not to mention the time to do so.
With stackd.io and SaltStack, we can now enable our users to easily define declarative rules to focus on installing and managing software, and not worry so much about the environments, systems, or headaches that come with ops management.
We first discovered Salt back in November of 2012 when we were building an installation application for our Synthesys product. An evolutionary progression from tools like Chef and Puppet, it is written in Python, is open source, has a thriving community of developers, and is fast and flexible.
It was the perfect choice because it allowed us to deploy all the myriad technologies, pieces, and parts that make up Synthesys, keep them provisioned correctly and automatically updated when new features rolled out.
Because of our success during that project, we decided to keep going with Salt for stackd.io. We quickly discovered that many enhancements had been made to the project that fit perfectly with our plans for stackd.io: Overstate system, Syndic servers, a messaging system that allows cross instance communication, the Reactor system, and many others.
Django REST Framework
As huge proponents of RESTful APIs, our team has successfully implemented a powerful one in several of our other products with the Django REST Framework. This is another powerful, open-source project that gives you many time-saving features out of the box.
It supports hyperlinked, HATEOAS-style APIs, has renderers and serializers for your Django DB models, a browsable HTML interface for your API, and built-in authentication policies. If you’re a Python shop, and you’re considering implementing a REST API, you need to check it out.
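What hyperlinked responses buy a client can be sketched in plain Python, without Django. The URLs and field names below are invented for illustration and the responses are canned; the point is that the client follows links it finds in each response rather than building URLs itself.

```python
# Canned responses standing in for a hyperlinked API.
# Every URL and field name here is hypothetical.
FAKE_API = {
    "/api/": {"stacks": "/api/stacks/"},
    "/api/stacks/": [{"title": "Web stack", "url": "/api/stacks/1/"}],
    "/api/stacks/1/": {"title": "Web stack", "hosts": "/api/stacks/1/hosts/"},
    "/api/stacks/1/hosts/": [{"hostname": "web-1"}, {"hostname": "web-2"}],
}


def get(url):
    """Stand-in for an HTTP GET that returns parsed JSON."""
    return FAKE_API[url]


def hostnames_for(stack_title):
    """Find a stack by title and list its hostnames by following links only."""
    root = get("/api/")                    # the only URL the client knows up front
    for stack in get(root["stacks"]):      # follow the link to the stack list
        if stack["title"] == stack_title:
            detail = get(stack["url"])     # follow the link to the stack detail
            return [host["hostname"] for host in get(detail["hosts"])]
    return []


print(hostnames_for("Web stack"))  # prints ['web-1', 'web-2']
```

If the server later reorganizes its URLs, a client written this way keeps working as long as the link names stay the same.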
It’s also important for us because we have many internal projects that will need to integrate with stackd.io for their testing and production features. Several of our projects need to spin up stacks of EC2 instances in order to do their work, and having an easily navigable API is crucial.
For the user interface, we have chosen several open source libraries that provide just the right amount of functionality without being bloated behemoths.
Require.js allows us to build a modular application and load dependencies when needed.
Twitter Bootstrap let us quickly build an attractive interface that keeps stackd.io consistent with the look and feel of our other internal tools.
Knockout provides the structure and offers useful features like observable collections and properties, custom bindings, and form handling.
As of the time of this publication, you can define what we call Stacks of instances and have them automatically launched and provisioned by choosing from any of the defined SLS files. You can then initiate stop, terminate, start, and launch commands at any time once a Stack is defined.
Built for Amazon first
Since we depend on many technologies from Amazon Web Services (AWS) for our own infrastructure, we are naturally building out the capabilities to deploy on that platform first so that we can start saving money immediately, while still keeping in mind the architectural challenges of building out for other providers.
Accounts let you provide your AWS account credentials, and they are the basis for all other actions in stackd.io. So if you don’t have an AWS account, and want to try stackd.io when it’s released, go create one now.
For each account, you can create multiple profiles, which make it easier to launch instances with different operating systems, default packages, custom configuration, etc. Profiles, from the perspective of AWS, are simply AMIs that users choose when defining and launching their stacks. Setting up the standard instance profiles you use for your deployments provides time-saving defaults when creating stacks.
Snapshots & Volumes
EC2 allows you to create Elastic Block Store (EBS) volumes from predefined snapshots and then attach those volumes to an instance for storing critical data that you want to persist when hosts are terminated (as opposed to ephemeral storage, which is destroyed when the instance is terminated). EBS volumes are also great for backing up data, making instance copies, or using standard software across multiple instances. In stackd.io we make it simple to create and attach EBS volumes from snapshots you have on hand, and in the future we’ll make it simple to create preformatted snapshots.
Hosts are the actual instances that you want to be started and provisioned with software. Users will define the number of instances, the size, availability zone, security groups, DNS host pattern (for Route 53 usage), SLS files to use, and any EBS volumes to be created and attached.
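One way to picture a Host definition is as a small record holding exactly those fields. The Python sketch below is illustrative only: the class names, field names, and the "{index}" placeholder in the hostname pattern are assumptions, not stackd.io's actual data model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Volume:
    """An EBS volume to create from a snapshot and attach (hypothetical fields)."""
    snapshot_id: str
    device: str
    mount_point: str


@dataclass
class HostDefinition:
    """What a user specifies for one group of identical hosts (hypothetical fields)."""
    count: int
    size: str
    availability_zone: str
    security_groups: List[str]
    hostname_pattern: str  # assumed to contain an "{index}" placeholder
    sls_files: List[str]
    volumes: List[Volume] = field(default_factory=list)

    def hostnames(self) -> List[str]:
        """Expand the DNS host pattern into one hostname per instance."""
        return [
            self.hostname_pattern.format(index=i)
            for i in range(1, self.count + 1)
        ]


cache = HostDefinition(
    count=3,
    size="m1.large",
    availability_zone="us-east-1a",
    security_groups=["cache"],
    hostname_pattern="cache-{index}",
    sls_files=["memcached"],
    volumes=[Volume("snap-00000000", "/dev/sdf", "/data")],  # placeholder snapshot ID
)
print(cache.hostnames())  # prints ['cache-1', 'cache-2', 'cache-3']
```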
Once the user has defined all Hosts, they are grouped into a Stack, which is given a title and a description. After the user launches a Stack, the UI is updated with status messages that show the progress and any errors that may have cropped up during the process.
Stacks are usually a functional grouping of Hosts. For instance, you might create a Stack for your web presence: 4 Apache servers, 5 cache servers, a couple of messaging instances, a logging server, and 3 more instances for your actual application servers. You would create a Stack named, for example, “Web stack”, define those hosts, and launch them all together.
Entire Stacks can be stopped, started, and terminated. When a Stack is no longer useful, it can be deleted from the stackd.io database along with destroying all of its corresponding infrastructure, never to be seen again.
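The lifecycle of a Stack can be thought of as a small state machine. The sketch below is a simplified Python model; the state names and the set of allowed transitions are assumptions made for illustration, not a description of stackd.io's internals.

```python
class Stack:
    """A toy model of a Stack's lifecycle (states and transitions are assumed)."""

    # command -> (states the command may be issued from, resulting state)
    TRANSITIONS = {
        "launch": ({"defined"}, "running"),
        "stop": ({"running"}, "stopped"),
        "start": ({"stopped"}, "running"),
        "terminate": ({"running", "stopped"}, "terminated"),
    }

    def __init__(self, title, description=""):
        self.title = title
        self.description = description
        self.state = "defined"

    def run(self, command):
        """Apply a command, refusing any that make no sense in the current state."""
        allowed_from, new_state = self.TRANSITIONS[command]
        if self.state not in allowed_from:
            raise ValueError(f"cannot {command} a stack that is {self.state}")
        self.state = new_state
        return self.state


web = Stack("Web stack", "Apache, cache, messaging, logging, and app servers")
for command in ["launch", "stop", "start", "terminate"]:
    print(command, "->", web.run(command))
```

Rejecting out-of-order commands up front is one way to guard against the mistaken shutdowns described earlier.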
Other cloud providers
One of our primary goals from the outset was for stackd.io to be an API and UI that would allow users to work with any cloud provider. That’s still a primary goal, but since AWS is the 400-pound gorilla in the industry, and we use it here, it was the obvious place to start.
Our next platform will likely be Rackspace, but there are others in the running and we’ll be making that decision once AWS is rock solid.
One of our next major goals is to allow users to own their own software and environment definitions, and then be able to share them from private or public repositories where anyone can import them directly into their instance of stackd.io.
Another important goal for stackd.io is for IT shops to reduce costs. To achieve this, we are planning to let users define rules for who can launch instances, how long those instances should be available, and how a Stack should respond to triggers (e.g. terminate after prolonged low CPU utilization).
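Since these rules are still on the drawing board, the following Python sketch only illustrates the idea. The rule fields, thresholds, and function name are all hypothetical: a Stack is flagged for termination when its lease has run out or when its most recent CPU samples have all stayed below a threshold.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class LeaseRule:
    """Hypothetical cost-control rule for a Stack."""
    max_lifetime: timedelta   # lease period
    low_cpu_percent: float    # utilization below this counts as idle
    low_cpu_samples: int      # consecutive idle samples needed to trigger


def should_terminate(rule: LeaseRule, launched_at: datetime,
                     now: datetime, cpu_samples: List[float]) -> bool:
    """Return True if the lease has expired or the Stack has been idle too long."""
    if now - launched_at >= rule.max_lifetime:
        return True
    recent = cpu_samples[-rule.low_cpu_samples:]
    return (
        len(recent) == rule.low_cpu_samples
        and all(sample < rule.low_cpu_percent for sample in recent)
    )


rule = LeaseRule(max_lifetime=timedelta(days=7), low_cpu_percent=5.0, low_cpu_samples=3)
launched = datetime(2013, 1, 1)

# One day in, but the last three CPU samples are all under 5%.
print(should_terminate(rule, launched, datetime(2013, 1, 2), [40.0, 3.0, 2.0, 1.0]))  # prints True
```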
We will be implementing a robust and easily customizable reporting system so that a team can see, at a glance, how they are using their cloud assets and make intelligent decisions on how to optimize their usage.
Another idea that we’ve had along the way is to let users define custom actions that need to be performed on instances after they are up and running. These could include running a custom backup script, source control commands, PGP key management, or whatever else needs to be done at a systems level that isn’t the job of Salt provisioning.
Lastly, this project will be open source. We want the developer and operations communities to be a huge part in making this an application that truly makes managing a cloud infrastructure as easy as possible.
We believe that this effort will not only provide Digital Reasoning with additional capabilities. As an open source project, it will also help others within our industry become more efficient with their cloud infrastructures, and they will have the added benefit of being able to contribute back to the project itself.
If you have any questions or suggestions regarding stackd.io, please feel free to send us an email at firstname.lastname@example.org.