Chef: Out of the Gate

Next in my series of posts about the configuration management tool chef, I’d like to talk a little bit about how I got started. First up:

The Server

Although there are several varieties of chef, I prefer to have a central server from which I can examine my nodes (client machines). Because I didn’t want to go through the trouble of setting up and maintaining my own, I decided to go with a hosted service from Opscode, which is free up to 5 nodes, and has very reasonable pricing beyond that. You can sign up for an account right on their homepage. Another nice thing about having a server running is that you can use it to examine all the properties of your cookbooks and nodes to see if values are being set the way you expect them to. For example, being able to browse the node attributes on the server helped me diagnose a problem of mismatched brackets that was putting all my custom attributes into one cookbook, even though the file was still syntactically valid on my end.

Getting Started

You can find this basic info on a variety of sites, so I’m not going to go into painstaking detail; I’ll just run through getting your initial chef directory structure in place. This will require git, which you can learn more about from Brian and Hector’s tech talk, or on the web. You can get the official installation instructions from the Opscode Knowledge Base. This is a condensed version.

First, we want to pull down an empty chef structure (note: all my commands will be for *nix environments; if necessary, you’ll have to swap in the Windows equivalents):

git clone git://github.com/opscode/chef-repo.git

This will create the folders that knife expects to find when it runs. We’ll enter the folder and add one more for site cookbooks (which is a clever way to extend cookbooks we download from 3rd parties):

cd chef-repo
mkdir site-cookbooks

Next, we need to set up knife to be able to talk with our server. In your console on the Opscode site you need to set up a new organization, then generate a new key for it (save this file, we’ll call it ORGANIZATION-validator.pem) and use the “Generate knife config” link to get the knife.rb file. Finally, on your user account page use the “Get a new private key” link to download your keyfile (I’ll call it USERNAME.pem). Now we’ll create the appropriate location in our chef repo and copy the files in (all commands will assume we’re sitting in the chef-repo folder from this point forward):

mkdir .chef
cp /YOURPATH/ORGANIZATION-validator.pem .chef
cp /YOURPATH/knife.rb .chef
cp /YOURPATH/USERNAME.pem .chef

And a quick test:

knife client list

Should output:

[
    "ORGANIZATION-validator"
]

Ok. Now we’re cooking! Heh, get it? Sorry. Let’s move on.

Roles

It’s hard to know whether to introduce roles or recipes first, because they’re pretty useless without one another, but we’ll start with roles. A role basically defines a list of recipes and attributes that you can apply as a single unit to a node. Although you can alter your roles on the server once you create them, I think that always creating them from a source file is the best way to go about it. I got started with this concept by thinking about the environment I needed to manage, and decided to keep it simple, breaking all the servers up into two pieces:

  • A base role containing all the software and configuration that’s common across all server types (network setup, security, mail)
  • A role for the specific server type (application, database, file, etc.)

So, we’ll get started with the base role. Create a file called base_server.rb in the roles folder of your repository and we’ll start with the bare minimum in there:

name "base_server"
description "Common Server Base Configuration"

To create the role and add it to the server, enter knife:

knife role from file base_server.rb

To see if it worked, log on to the chef server and look in the “Roles” tab. You should see your new role. If you click on it, you’ll get more details, but that’s pretty much empty right now. Before we can do anything clever there we need to learn a little bit about cookbooks. Additionally, we’re going to need a machine to experiment on, so let’s work on that next.
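If you’re wondering how a couple of bare lines like name and description can turn into a role, here’s a toy sketch in plain Ruby of the general DSL trick. To be clear, this is not chef’s actual implementation: RoleSketch and from_source are names I made up for illustration, and the real role loader does much more.

```ruby
# Toy model of a role-file DSL. RoleSketch is a hypothetical class for
# illustration only; it is not part of chef.
class RoleSketch
  def initialize
    @data = {}
  end

  # Each DSL word is just a method that records its argument.
  def name(value)
    @data["name"] = value
  end

  def description(value)
    @data["description"] = value
  end

  def to_hash
    @data.dup
  end

  # Evaluate role-file source text against a fresh object, so the bare
  # words in the file resolve to the methods defined above.
  def self.from_source(source)
    role = new
    role.instance_eval(source)
    role.to_hash
  end
end

role_source = <<~ROLE
  name "base_server"
  description "Common Server Base Configuration"
ROLE

role = RoleSketch.from_source(role_source)
puts role["name"]
puts role["description"]
```

The point is only that a role file is ordinary Ruby: each line is a method call, which is why the same file can later grow a run list and attributes without any new syntax.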

Nodes

Nodes are the computers that you’re controlling with chef. Although you could manually install chef on a machine, register it with your server, and use the server’s web interface to add roles to that node, the easiest way (assuming you have ssh access to your test machine) is to use knife’s bootstrapping command (which we covered in the first chef article). Let’s bootstrap our test system and assign it our new base_server role:

knife bootstrap TEST.SERVER.ADDRESS -x USERNAME -P PASSWORD -r 'role[base_server]' --sudo

When that completes you should see the machine in the node list on the server, or you can even quickly check with:

knife node list

From this point forward, to apply changes you’ve made to your cookbooks and recipes on the test box, you need to run the chef-client command there. This will contact the server, pull down any new or changed cookbooks, and apply the appropriate recipes given the roles the machine has been assigned. So, on a *nix box:

sudo chef-client

Now, let’s do something with our test machine.

Cookbooks and Recipes

Cookbooks are collections of… wait for it… recipes! (Pretty obvious, huh?) Recipes are what actually make chef do anything useful, like install a web server, or change your DNS settings. Even better, there’s already a bunch of recipes out there for common tasks. On the down side, not all the recipes you download are going to have good (or any) documentation. However, once you understand how they work, it’s easy to read a recipe and figure out what it’s doing (although this does not excuse developers out there from documenting!! Do it!!!)

Cookbooks: Straight up Defaults

To get our feet wet with cookbooks, we’ll get to work on our base server role created above, and have it start to do some stuff. The easiest recipes to use are the ones that require no additional input from us at all, so let’s start there. We use git here as version control software, so I’d like that installed as part of our base build. A search on the Opscode Cookbooks Site shows me that someone has already been nice enough to create a cookbook for git. Because it’s there, we can use knife to pull a copy:

knife cookbook site vendor git -d

Using the vendor command does some git magic behind the scenes, which, honestly, I do not entirely understand yet, but this is the preferred way to use other downloaded cookbooks. Once that command runs, you can take a look in your cookbooks folder and see what got downloaded. Cookbooks have a standard folder structure inside of them, and the first place I look is in the root of the cookbook for some sort of documentation. The next place I tend to look is in the “recipes” sub-folder. In here is the list of recipes that are available to you (with default.rb being what is run if you just use the cookbook name as your recipe). In our case for git the default.rb recipe is what installs the git client packages. Here’s the recipe itself for the curious:

case node[:platform]
when "debian", "ubuntu"
  package "git-core"
else 
  package "git"
end

As you can see, the DSL (domain-specific language) in Ruby is pretty easy to read. When the platform is debian or ubuntu, install the “git-core” package, otherwise install the “git” package. Chef is smart enough to have different methods for installing the packages depending on what platform you’re on, but is nice enough to have that abstracted away in the recipes. This will work for our needs, so let’s add a new section to our base_server.rb role file:

run_list(
  "recipe[git]"
)

Had we wanted to install a non-default recipe from a cookbook, we’d have used the double colon notation. For example, there’s a server recipe in the git cookbook, and to use that we would have added “recipe[git::server]” to our run list. Every time we change one of our roles, we need to tell knife to update the server. Additionally, before we can use a cookbook on a node, we need to upload it to the server as well. (Note: behind the scenes this is all happening via REST API calls to the chef server. Knife is actually just a RESTful client to the chef server API. Now you know.)

knife cookbook upload git
knife role from file base_server.rb

And on the client:

sudo chef-client

Ok, I’m done typing that for now. You’ll know what I mean now when I say to update the cookbook, update the role, and run the client moving forward, right? So, let’s continue. I’m also going to download cookbooks for vim and fail2ban and add them to my base role as well. In its entirety base_server.rb now looks like this:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]"
)

Upload the cookbooks, update the role, and run the client and we now have a server with git and vim installed, and fail2ban providing some security.
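For the curious, here’s roughly how I think about what happens to a run list on a client run: roles get unrolled into their recipes, and a bare cookbook name means that cookbook’s default recipe. This is a sketch in plain Ruby, not chef’s code. The ROLES table and the expand_run_list method are stand-ins I invented, and the real thing handles a lot more (attributes, fetching roles from the server, and so on).

```ruby
# Hypothetical in-memory stand-in for the roles stored on the chef server.
ROLES = {
  "base_server" => ["recipe[fail2ban]", "recipe[git]", "recipe[vim]"]
}.freeze

# Matches entries such as recipe[git], recipe[git::server] or role[base_server].
ENTRY_PATTERN = /\A(recipe|role)\[([^\]]+)\]\z/

# Flatten a run list into fully qualified "cookbook::recipe" names.
def expand_run_list(run_list, roles = ROLES)
  run_list.flat_map do |entry|
    type, name = entry.match(ENTRY_PATTERN)&.captures
    raise ArgumentError, "unrecognized run list entry: #{entry}" unless type

    if type == "role"
      # A role is replaced by its own (recursively expanded) run list.
      expand_run_list(roles.fetch(name), roles)
    else
      # A bare cookbook name is shorthand for its default recipe.
      cookbook, recipe = name.split("::", 2)
      ["#{cookbook}::#{recipe || 'default'}"]
    end
  end
end

puts expand_run_list(["role[base_server]", "recipe[git::server]"])
```

Here the node’s run list names the role plus the git::server recipe mentioned earlier, and the result is a flat, ordered list of recipes.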

Cookbooks: Tinkering with the Defaults

That’s all well and good, but we’ve gotten to the point where I need to install some software as well as change some configuration. As an example, I’d like to set up an NTP service on the box to keep the clock in sync with the Penn time servers. A quick search shows that there’s already an “ntp” cookbook, so let’s start with that.

knife cookbook site vendor ntp -d

Let’s take a look at the relevant parts of the default.rb recipe:

case node[:platform] 
when "ubuntu","debian"
  package "ntpdate" do
    action :install
  end
end

package "ntp" do
  action :install
end

service node[:ntp][:service] do
  action :start
end

template "/etc/ntp.conf" do
  source "ntp.conf.erb"
  owner "root"
  group "root"
  mode 0644
  notifies :restart, resources(:service => node[:ntp][:service])
end

Ok, the first part is easy enough to understand: install some packages, some conditional on the distribution we’re running. Now we come to some new stuff. First, let’s talk about the “node[:ntp][:service]” variable. This is the chef way of accessing node specific attributes. This could be different on every machine we run chef on, but has default values set by the cookbook and/or the role (they can also be overridden in either of these places, but we don’t need to go into that yet). To get an idea of what variables the cookbook contains, take a look at the default values which are all set by files in the attributes folder. The files in this folder should be named to correspond to the recipes. So, in the attributes/default.rb file we see:

case platform 
when "ubuntu","debian"
  default[:ntp][:service] = "ntp"
when "redhat","centos","fedora"
  default[:ntp][:service] = "ntpd"
end

default[:ntp][:is_server] = false
default[:ntp][:servers]   = ["0.us.pool.ntp.org", "1.us.pool.ntp.org"]

So, we set the name of the service depending on the platform we’re on, by default do not run NTP as a server for others to access, and have a default list of ntp servers. Slightly confusing side note: although it would be nice if this file was commented, it seems that the place to actually do so is in the cookbook root in the metadata.rb file. Here’s the related section from that file:

attribute "ntp",
  :display_name => "NTP",
  :description => "Hash of NTP attributes",
  :type => "hash"

attribute "ntp/service",
  :display_name => "NTP Service",
  :description => "Name of the NTP service",
  :default => "ntp"

attribute "ntp/is_server",
  :display_name => "NTP Is Server?",
  :description => "Set to true if this is an NTP server",
  :default => "false"

attribute "ntp/servers",
  :display_name => "NTP Servers",
  :description => "Array of servers we should talk to",
  :type => "array",
  :default => ["0.us.pool.ntp.org", "1.us.pool.ntp.org"]

Between those two places you should be able to determine what you have control over in your cookbook. Now, to put this into use and point our box at the local UPenn time servers, we need to add the recipe to the run list and add a new section to our base_server.rb role, keeping the attribute structure we’ve identified above:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]",
  "recipe[ntp]"
)
default_attributes(
  "ntp" => { 
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"] 
  }
)

Those couple of lines are actually all we need to accomplish the task. Upload the cookbook, update the role, and do a client run and we’re now running the NTP service tied to our local time servers. Although we’re done, let’s take a look at the last section of that recipe and see what it’s doing. Specifically, the template command. If you read it like English, you can see that it’s creating a file on the system (“/etc/ntp.conf”) from a source file (“ntp.conf.erb”), setting ownership and permissions on the file, and then restarting the NTP service whenever the file changes. You’ll see there’s no path on the source file, and that’s because the cookbook expects template files to live in the templates folder, and beneath that in a subfolder, which is default unless you need a variant for a specific platform or host. In our case, that’s templates/default/ntp.conf.erb. ERB is Ruby’s templating system, and provides a way to insert variables and some logic (if necessary) into arbitrary text files. Let’s take a look:

driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable

<% if node[:ntp][:is_server] -%>
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server 2.us.pool.ntp.org
server 3.us.pool.ntp.org
<% else -%>
<% node[:ntp][:servers].each do |ntpserver| -%>
server <%= ntpserver %>
<% end -%>
<% end -%>
restrict default kod notrap nomodify nopeer noquery

restrict 127.0.0.1 nomodify

The parts that are ERB specific are all contained in <% blocks %>. In this case, the text file generates the list of servers we use based on the attributes we’ve specified on the node. Although a pretty simple example, I think you can see how this is an easy way to manage settings files, and most recipes leverage this template system pretty heavily.
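You can try this kind of templating outside of chef with the ERB library that ships with Ruby. The sketch below leans on a few assumptions: node is just a nested Hash rather than chef’s node object, I’ve trimmed the template down to the server lines, render_ntp_servers is a name I made up, and chef’s own template renderer may behave differently in the details. It also uses the trim_mode keyword, so it needs Ruby 2.6 or newer.

```ruby
require "erb"

# The server section of the ntp.conf template, trimmed for illustration.
NTP_SERVERS_TEMPLATE = <<~'TEMPLATE'
  <% if node[:ntp][:is_server] -%>
  server 0.us.pool.ntp.org
  server 1.us.pool.ntp.org
  server 2.us.pool.ntp.org
  server 3.us.pool.ntp.org
  <% else -%>
  <% node[:ntp][:servers].each do |ntpserver| -%>
  server <%= ntpserver %>
  <% end -%>
  <% end -%>
TEMPLATE

# Render the template against a plain Hash standing in for the node.
# trim_mode "-" makes a closing -%> swallow the newline that follows it,
# so the control lines leave no blank lines behind.
def render_ntp_servers(node)
  ERB.new(NTP_SERVERS_TEMPLATE, trim_mode: "-").result(binding)
end

node = {
  ntp: {
    is_server: false,
    servers: ["timeserver1.upenn.edu", "timeserver2.upenn.edu"]
  }
}

puts render_ntp_servers(node)
```

Flipping is_server to true in the hash switches the output over to the four pool servers, which is the same branch the real template takes.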

Ok, getting a little long here, so let’s just add a few more cookbooks that will just require a few attribute changes and call it a day. We’ll add cookbooks to set our DNS settings (resolver), and to add a mail server and set it to relay through an internal relay host. At the end of all of that, here’s our final base_server.rb file:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]",
  "recipe[ntp]",
  "recipe[resolver]",
  "recipe[postfix]"
)
default_attributes(
  "ntp" => { 
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"] 
  },
  "resolver" => {
    "nameservers" => ["128.91.87.123", "128.91.91.87", "128.91.2.13"],
    "search" => "wharton.upenn.edu"
  },
  "postfix" => {
    "relayhost" => "SOME.RELAY.SERVER"
  }
)
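To make the relationship between a cookbook’s defaults and a role’s default_attributes a little more concrete, here’s a simplified model in plain Ruby using the ntp values from earlier. Treat it as an illustration only: cookbook_defaults and layer are names I invented, chef’s real precedence rules have more levels than this, and I’m assuming that an array set in the role simply replaces the cookbook’s array.

```ruby
# Defaults as the ntp cookbook's attributes file would set them for a
# given platform (mirrors the case statement shown earlier).
def cookbook_defaults(platform)
  service = case platform
            when "ubuntu", "debian" then "ntp"
            when "redhat", "centos", "fedora" then "ntpd"
            end
  {
    "ntp" => {
      "service" => service,
      "is_server" => false,
      "servers" => ["0.us.pool.ntp.org", "1.us.pool.ntp.org"]
    }
  }
end

# Recursively merge two attribute hashes; on a conflict the overlay wins.
# Assumption: arrays are replaced wholesale rather than combined.
def layer(base, overlay)
  base.merge(overlay) do |_key, base_value, overlay_value|
    if base_value.is_a?(Hash) && overlay_value.is_a?(Hash)
      layer(base_value, overlay_value)
    else
      overlay_value
    end
  end
end

# The ntp section of default_attributes from the role file.
role_defaults = {
  "ntp" => {
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"]
  }
}

effective = layer(cookbook_defaults("ubuntu"), role_defaults)
puts effective["ntp"]["service"]
puts effective["ntp"]["servers"]
```

The role only mentions the servers, so the service name and is_server keep the cookbook’s values while the server list comes from the role.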

You should definitely go through the postfix and resolver cookbooks at this point and make sure you can understand what’s going on. When you’re through with that, you’ll be ready for our next installment, which is to build on all this knowledge and extend vendor cookbooks, and ultimately author our own!

Rapid Configuration with Chef

Here’s the scenario: it’s 3:30 on Friday, and I’d like to spin up a new application server to sit in a linux cluster. Here’s what I have:

  • A provisioned, but completely empty VM

Here’s what I need:

  • A full install of Ubuntu Linux
  • Networking configured; because this will be an LVS client, this entails:
    • Setting up a primary static ip
    • Adding multiple IPs netmasked to 255.255.255.255 on the loopback interface
    • Adding routing rules for those IPs
    • Disabling ARPing from those interfaces
    • Firewalling the box and allowing various services different levels of access
    • DNS setup
  • OpenSSH Server Installed
  • Apache Fully Configured, including:
    • PHP5 installed and configured
    • SSL Certificates Installed
    • Multiple Virtual Hosts setup
  • NFS setup, and appropriate shares mounted
  • NTP setup for clock synchronization
  • VMWare tools installed
  • A mail server enabled and configured to only accept mail from localhost, and relay through another machine on the network
  • Monitoring software installed and configured
  • Security software installed and configured
  • Memcached installed and enabled
  • An assortment of utilities for diagnosing load and network issues installed

I’d like to do all of this with guaranteed consistency, so I do not want to do a single piece of it manually. So, let’s do it. I’ve got the Ubuntu install ISO mounted in the VM, I reboot, and add the following to the boot parameters:

ks=http://[MY_BOX]/[SERVER].cfg

Sit back and watch. 10 minutes later we have a full OS install with openssh server running, firewalled, and one user account added. Now, on my machine, I run one command:

knife bootstrap [SERVER] -x [USER] -P [PASSWORD] -r 'role[kw_server]' --sudo

It’s 3:50, and I’m done. Everything is installed, configured, and secured. Need to add a second or third server to the cluster? Rinse and repeat. The million dollar question: how on earth is this all possible? The answer: Chef. What is it? I’ll quote the creators (Opscode):

Chef is an open source systems integration framework built to bring the benefits of configuration management to your entire infrastructure. You write source code to describe how you want each part of your infrastructure to be built, then apply those descriptions to your servers. The result is a fully automated infrastructure: when a new server comes on line, the only thing you have to do is tell Chef what role it should play in your architecture.

I was able to accomplish automating system configuration through a combination of custom defined “roles”, and an assortment of “cookbooks”: some downloaded and used as is, some tailored to my specific use, and some written from scratch. It is my goal over a series of posts to take you through the process of doing this yourself, attempting to address some of the problem areas I encountered while doing so, and some of the items that take a little longer to grasp. Of note, Chef is entirely written in Ruby, and although I have very limited experience with the language thus far, the syntax is easy enough to understand that it shouldn’t frighten anyone off. There are no advanced programming concepts; mostly what you’re doing is putting together simple templates, and I did not see my lack of knowledge as a hindrance in any way.

The Nitty Gritty

Because I’d like these to be constructive, I’ll try to explain what every command does for the uninitiated:

ks=http://[MY_BOX]/[SERVER].cfg

Adding that line to the boot parameters on Ubuntu points the installer to what’s called a kickstart file. In the simplest terms, this is basically an answer file to all the questions the installer would ask you, as well as allowing you to do some minimal configuration and additional package installations. In my case, I’ve found the easiest way to configure a user account, partitioning, initial access, and the primary IP is to have a kickstart file define these things. Would it have been relatively easy to just set these things after installing the box? Yes. However, I think the most important/difficult part about going through this process is that you have to force yourself to do some extra work up front so that you never have to touch the actual server. The payoff down the line is well worth the time up front: if you ever find yourself having to provision and deploy boxes at a rapid clip, you’ll want everything to happen automatically, with nothing waiting on your manual input. Initially I started with a very generic kickstart that would bring the box up via DHCP, but then I would have had to put properties on the node itself once chef was installed to configure networking, which didn’t seem ideal (too manual, even if it still didn’t involve touching the server). However, as I embraced the chef way of doing things, I found it was just as easy to create a dedicated kickstart file per-server automatically, and that actually led to a better starting point for the box. I will go through all the details of creating this kickstart cookbook in part two when I introduce the concept of the cookbook.

knife bootstrap [SERVER] -x [USER] -P [PASSWORD] -r 'role[kw_server]' --sudo

Knife is the command line tool for chef, and you’ll find yourself using it quite regularly while working with chef’s cookbooks and recipes. In this case, however, we’re using the bootstrapping functionality of knife to completely configure our new server. What does the bootstrapping do? It logs onto the new box via SSH and installs Ruby, RubyGems, and chef itself. It configures chef on that box and registers it as a client to your server (in my case, I’m using the free server provided by Opscode). Finally, it can assign a role to the box. In our case, we’re assigning the “kw_server” role, which is a collection of all the recipes and configuration needed to turn our box from a base OS install into a fully configured application server. We’ll go through how the role system works in a later post as well.

Thus concludes my intro. If you’d like to see any of this in action (and you work with me) let me know and I’d be glad to run through it with you. As well, my goal is to get a series of posts written as sort of a road map for starting to use Chef at Wharton. Finally, I intend to open-source the cookbooks I’ve written once I make them a little more cross platform, and make sure they’re general purpose / configurable enough to benefit other users.