Puppet - nearly a decade-long friendship

By Saša Teković, Sysbee Blog
Tags: advice, howto, puppet architecture, puppet facts, puppet monitoring, tips, tools

Puppet is an open-core configuration management tool that significantly simplifies infrastructure management, especially when sysadmins have to deal with a large number of servers.

Essentially, with its declarative language, Puppet allows sysadmins to describe the desired system state in high-level terms – e.g. users, installed packages, enabled services, configuration files, etc. Puppet then compares the desired state with the actual state of the system and applies the necessary changes.

With this kind of abstraction layer, the sysadmin doesn’t have to specify the commands needed to install a package, enable a service or set correct file permissions. Puppet does that automatically, and it even takes care of using the correct OS-specific command for the required task (e.g. yum or apt).
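To make the idea of "describe the desired state and let the tool work out the commands" more concrete, here is a minimal sketch in Python. It is only a toy model and not how Puppet is implemented: the function name, the command table and the package names are all made up for illustration.

```python
# Toy model of declarative package management (not Puppet's real code).
# Hypothetical mapping from OS family to the install command to use.
INSTALL_COMMANDS = {
    "RedHat": ["yum", "install", "-y"],
    "Debian": ["apt-get", "install", "-y"],
}


def plan_package_changes(desired, installed, os_family):
    """Return the commands needed to install packages that are missing."""
    base = INSTALL_COMMANDS[os_family]
    # Only packages that are desired but not yet installed need a change.
    missing = sorted(set(desired) - set(installed))
    return [base + [name] for name in missing]


if __name__ == "__main__":
    desired = {"ntp", "openssh-server"}
    installed = {"openssh-server"}
    for command in plan_package_changes(desired, installed, "RedHat"):
        print(" ".join(command))
```

When the desired and the actual state already match, the function returns an empty list, which mirrors the way a repeated Puppet run on an unchanged system applies no changes.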

We’ve been using Puppet since 2011, and at the time of writing this article, Puppet helps us manage 402,000 resources (users, packages, services, configuration files, etc.) on over 600 virtual and physical servers.

Puppet architecture

While it’s possible to run Puppet in a standalone (masterless) mode, where Puppet manages only the system on which it’s running, the recommended deployment follows the traditional agent-master (client-server) architecture.

Agents and masters communicate over HTTPS, using SSL certificates for authentication. Puppet includes a built-in certificate authority for managing certificates.

In a standard agent-master deployment, the agent node sends facts (information such as hostname, OS, and IP address) to the master server and requests a catalog.

The catalog is a file that the Puppet master compiles and that describes the desired state of managed resources. When the agent node receives the catalog, it applies it and sends a report back to the Puppet master.

The report contains various details about how much time the Puppet run took to complete, what was changed, added or removed, and error details (if there were any).
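The exchange described above (facts go to the master, a catalog comes back, the agent applies it and reports) can be sketched as a small simulation. This is a toy model written for this article, not Puppet's actual protocol or data format: every function name, fact value and catalog field below is an assumption.

```python
import time


def collect_facts():
    # Hypothetical, hard-coded facts; a real agent gathers them with Facter.
    return {"hostname": "web01", "os": "CentOS", "ipaddress": "192.0.2.10"}


def compile_catalog(facts):
    # Master side: decide the desired state for this node from its facts.
    package = "httpd" if facts["os"] == "CentOS" else "apache2"
    return {"packages": [package]}


def apply_catalog(catalog, installed):
    # Agent side: change only what differs from the catalog, then report.
    started = time.monotonic()
    changed = [name for name in catalog["packages"] if name not in installed]
    installed.update(changed)
    return {
        "changed": changed,
        "errors": [],
        "duration": time.monotonic() - started,
    }


if __name__ == "__main__":
    installed_packages = set()
    catalog = compile_catalog(collect_facts())
    print(apply_catalog(catalog, installed_packages))  # first run installs
    print(apply_catalog(catalog, installed_packages))  # second run is a no-op
```

The second run reports no changes because the node already matches its catalog.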

If PuppetDB (an optional component) is installed, the master node will send that report to PuppetDB to be archived. Reports and facts stored in PuppetDB can later be displayed and queried with dashboards such as Puppetboard and Puppet Explorer.

Puppet modules

A module is a collection of classes, resource types, files, functions, and templates that together manage a specific resource – e.g. an NTP service and its configuration, or users and their SSH keys.

A good module should be written to do one thing and do it well. E.g. it’s not advisable to cram installation and configuration of the whole LAMP stack into a single module, with all parameters hardcoded within configuration files.

Before you begin writing a new module, it’s recommended to head over to the Puppet Forge module repository and look for an existing module that fits the bill. In some cases you will find more than one module for the same purpose, so you’ll want to take into consideration module badges and give priority to modules that have “supported”, “partner” and “approved” badges. (here you can read more about Puppet module development)

When we’re looking for existing modules, we tend to give priority to reputable module developers such as Puppet, Vox Populi, and Hercules Team.

The power of facts

Earlier, we mentioned that facts are pieces of information about the agent node, such as hostname, operating system, and IP address. Besides core (built-in) facts, Puppet allows sysadmins to add custom and external facts.

Custom facts need to be written in Ruby. Their most significant advantage is the possibility to interact directly with Facter’s API and utilize some of its functions.

On the other hand, external facts can be written in any programming language, as long as they print the information to stdout in a valid format, which can be either flat (key=value) or structured (JSON or YAML).

Here are examples of a flat fact and of structured facts in YAML and JSON format.

Flat external fact:

exim_installed=yes
exim_smarthost=enabled
exim_version=4.92.3

YAML formatted external structured fact:

---
exim:
  installed: "yes"
  smarthost: "enabled"
  version: 4.92.3

JSON formatted external structured fact:

{
  "exim": {
    "installed": "yes",
    "smarthost": "enabled",
    "version": "4.92.3"
  }
}
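The examples above are static, but an external fact can also be an executable that produces the same structure at run time. Below is a minimal sketch in Python, assuming a Facter version that accepts JSON output from executable facts and that the script is placed, with the executable bit set, in an external facts directory such as /etc/puppetlabs/facter/facts.d/. The function names are our own, the version is parsed from the output of exim -bV, and the smarthost value is left out because detecting it depends on the local Exim configuration.

```python
#!/usr/bin/env python3
import json
import re
import shutil
import subprocess


def parse_exim_version(output):
    """Extract the version from `exim -bV` style output, or return None."""
    match = re.search(r"Exim version (\S+)", output)
    return match.group(1) if match else None


def build_exim_fact(exim_path, version_output):
    """Build the structured fact as a plain dictionary."""
    if exim_path is None:
        return {"exim": {"installed": "no"}}
    fact = {"exim": {"installed": "yes"}}
    version = parse_exim_version(version_output)
    if version:
        fact["exim"]["version"] = version
    return fact


def main():
    exim_path = shutil.which("exim")
    output = ""
    if exim_path:
        try:
            result = subprocess.run(
                [exim_path, "-bV"], capture_output=True, text=True, timeout=10
            )
            output = result.stdout
        except (OSError, subprocess.SubprocessError):
            output = ""  # report the binary as installed, version unknown
    # Facter reads the fact from stdout.
    print(json.dumps(build_exim_fact(exim_path, output)))


if __name__ == "__main__":
    main()
```

Keeping the parsing in small functions makes the fact easy to check on a workstation without Exim installed.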

Facts can be referenced in Puppet classes (e.g. to deploy different configuration parameters depending on the server’s location or application version), but can also be queried from PuppetDB.

In fact (pun intended), we found this feature so compelling that we use PuppetDB as our inventory backend. Facts allow us to instantly find servers that match specific criteria – e.g. physical servers that are hosted in our Zagreb datacenter and are running CentOS 7 with PHP 7.4.
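A search like that can be expressed in PuppetDB's query language (PQL). The sketch below only builds the query string and the request URL, so it runs without a PuppetDB server. It assumes the inventory entity with dotted fact paths and the /pdb/query/v4 endpoint on the default local port; the datacenter and php_version facts are hypothetical custom facts, not our real fact names.

```python
import json
from urllib.parse import urlencode


def build_inventory_query(criteria):
    """Build a PQL inventory query from a {fact_path: value} mapping."""
    # json.dumps quotes strings and renders booleans as true/false.
    clauses = [
        f"facts.{path} = {json.dumps(value)}" for path, value in criteria.items()
    ]
    return "inventory[certname] { " + " and ".join(clauses) + " }"


def build_query_url(base_url, pql):
    """Return the URL for a GET request carrying the PQL query."""
    return f"{base_url}/pdb/query/v4?{urlencode({'query': pql})}"


if __name__ == "__main__":
    pql = build_inventory_query(
        {
            "is_virtual": False,         # physical servers only
            "os.name": "CentOS",
            "os.release.major": "7",
            "datacenter": "zagreb",      # hypothetical custom fact
            "php_version": "7.4",        # hypothetical custom fact
        }
    )
    print(pql)
    print(build_query_url("http://localhost:8080", pql))
```

Sending the resulting URL to PuppetDB with any HTTP client should return the certnames of the matching nodes, provided the facts exist under those names.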

So, why did we choose Puppet?

When we first started using Puppet almost ten years ago, many other configuration management systems didn’t exist yet or were in a very early development phase. But the lack of choice wasn’t the deciding factor in our decision to go with Puppet.

Puppet was initially released in 2005, and even back in 2011, it was already widely used thanks to its maturity, stability, and vibrant community. Over the last ten years, Puppet Forge and the whole ecosystem have grown considerably, and performance has improved greatly.

Add to that excellent documentation and an agent-master architecture that suits our use case best, and it’s easy to see why we chose Puppet and never looked back.

A few years ago, when Puppet 3.x reached EOL, we reached the point where we needed to refresh our configuration management system. It was an ideal time to compare other solutions such as Ansible, Chef, and SaltStack.

In the end, the combination of Puppet and Foreman fit our needs best. An added bonus was that we didn’t have to write all our modules from scratch or invest a lot of time in getting to grips with another configuration management tool.

What’s next?

Now that we’ve explained what Puppet is, how we use it, and why we recommend it, you should try it out yourself! It’s a fantastic tool that we wholeheartedly recommend to anyone looking to facilitate infrastructure management in their own organization.

However, if all of this sounds a bit too complex, or you’d like our experts to take care of your infrastructure, get in touch and we’ll be happy to help. With our Managed Infrastructure solution, you can focus on your business while we do our magic with advanced tools such as this.