Day 5: Tuning and Benchmarking

Whoops. Sort of let the last day slip away there, huh? Well, no time like the present to wrap this up. As it turns out, just getting everything to a baseline state where fine-tuning can begin is more or less a whole day’s work. So we’ll adjust and just document that, with at least a baseline setup for tuning, and try to get a test in as well. Before we get to the fun part, the series for those just joining us:

Because we’re tuning specifically for the hi1.4xlarge instance, we’ll do our tuning in a role dedicated to just that purpose. Our role tune-web-hi1-4xlarge.rb starting point:

name "tune-web-hi1-4xlarge"
description "Tune Apache/PHP for the h1.4xlarge instance"
default_attributes()
override_attributes()
run_list()
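We’ll fill those in as benchmarks come back. As a sketch of the direction (the attribute names follow the apache2 cookbook’s prefork settings, but the numbers here are placeholders, not measured values), override_attributes will probably grow into something like:

override_attributes(
  "apache" => {
    "prefork" => {
      # Placeholder values -- with ~60 GiB of RAM we can run far more
      # children than the stock defaults; the benchmarks will decide.
      "startservers"        => 32,
      "minspareservers"     => 32,
      "maxspareservers"     => 64,
      "serverlimit"         => 512,
      "maxclients"          => 512,
      "maxrequestsperchild" => 10000
    }
  }
)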

Day 4: Install WordPress with Chef/Git

Alright, so after a few days of vacation (and a few days to catch back up) we’re at it again. Today’s goal: finish the WordPress install. Previous steps:

For our purposes, I’m going to replicate one of our production WordPress installs, our Knowledge@Wharton High School site. I’ve run the site both locally on VMware infrastructure and externally. I feel this is a good example because I have a pretty good idea of how to set up the site for decent performance, which we also seem to get from our host. If you don’t believe me, here’s our Pingdom report for the two weeks surrounding our cut-over from local to hosted:

[Figure: Pingdom response-time report for the two weeks surrounding the cutover]

Can you tell which day we cut over? I’m just going to take this as a sign that we’re both “doing it right”.

Day 3: VirtualHost Setup

As we near the proverbial hump day of our 5-day plan, a quick glance at where we’ve been thus far:

Because today is a pretty short day (well, it’s a full day, but my time to work on this was limited) I think all we’re gonna have time to tackle is setting up the VirtualHost. This definitely puts a little pressure on for next week (again, a few days off coming up), but we’ll do what we can do!

Off we go:

Setup the VirtualHost

As we stated in the goals, we want to run SSL-only, so I wanted to create a template we could use with the web_app provider we get courtesy of the apache2 cookbook. We’ll need a rewrite from anything non-SSL to SSL, and then I’m going to base the rest of the config on the .htaccess file provided by the HTML5 Boilerplate project. So far I’ve only removed a little bit, so there’s a good chance we’ll have to revisit this file later. Here’s our apache-ssl-only-boilerplate.erb:
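A minimal sketch of the shape of that template (the @params variables come from the apache2 cookbook’s web_app definition, the certificate paths are placeholders, and the HTML5 Boilerplate rules are trimmed for space):

<VirtualHost *:80>
  ServerName <%= @params[:server_name] %>
  # Rewrite anything non-SSL over to SSL (assumes mod_rewrite is enabled)
  RewriteEngine On
  RewriteRule ^/(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
</VirtualHost>

<VirtualHost *:443>
  ServerName <%= @params[:server_name] %>
  DocumentRoot <%= @params[:docroot] %>
  SSLEngine On
  # Placeholder certificate paths
  SSLCertificateFile /etc/ssl/certs/<%= @params[:server_name] %>.pem
  SSLCertificateKeyFile /etc/ssl/private/<%= @params[:server_name] %>.key
  <Directory <%= @params[:docroot] %>>
    Options FollowSymLinks
    AllowOverride All
  </Directory>
  # ...the HTML5 Boilerplate directives (expires, gzip, etc.) follow here...
</VirtualHost>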

Day 2: Setup and Tune MySQL

Day two of our five-day experiment arrives (and no, I’m not cheating, I had a few days off since the last post!). Quickly, let’s recap where we’ve been:

We left off with a working web server configuration, including the mounting of an ephemeral drive if available. Next up, the database. So, a short statement of our goal: install and configure MySQL, using an ephemeral drive, if available, as our data directory.

MySQL

Let’s start, like we usually do, with a new role, db-common.rb. This will just install MySQL, and will be our starting point for master and (possible) slave servers down the road:

name "db-common"
description "Install MySQL."
default_attributes()
override_attributes()
run_list(
  "recipe[mysql::server]"
)

Next, let’s create a DB Master role that calls that common role. Its job will be to set up the ephemeral mount, configure the server for our EC2 instance, and turn it into a master.
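As a preview, here’s a rough sketch of the shape that role might take (the recipe names are placeholders for things we’ll write or vendor later, and the data_dir attribute assumes the Opscode mysql cookbook):

name "db-master"
description "MySQL master on EC2 ephemeral storage."
default_attributes()
override_attributes(
  # Placeholder mount point for the ephemeral SSD
  "mysql" => {
    "data_dir" => "/mnt/mysql"
  }
)
run_list(
  "role[db-common]"
  # Placeholder recipes to come: ephemeral mount setup, master config
)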

Day 1: Starting the Server Build (Chef, Vagrant & EC2)

So, if you’re just tuning in, we’re at the start of building a large EC2 instance to run WordPress. You can read the details here. Today we’re going to tackle creating the actual server. To do this, we’re going to use Chef. I wrote a few intro-to-Chef pieces a few years ago, although they’re getting pretty dated at this point. Luckily, there are plenty of resources online now for learning (like this one from Opscode). I also hope this exercise becomes a real-world example of using Chef for a pretty run-of-the-mill setup. From this point forward, I’ll be writing assuming a base-level understanding of Chef; my apologies!

The Pre-reqs

First things first, we need some preliminary setup for the tools we’ll need to make this thing work.

Chef Organization

Because we’ve been using more Chef lately at Penn (and best practices have been emerging), I’m starting a new Chef organization for this project. It will also make documenting everything for blogging purposes a lot easier. We keep a template organization git repo internally here that is based on the Opscode base repository, with a few adjustments for working with team members here, and a Cheffile for Librarian that has a bunch of our internal cookbooks defined already. If you want to follow along at home, the Opscode base should work just fine. I created a new organization on Hosted Chef, configured knife appropriately, and ran a quick test:

% knife client list

This connects successfully and shows that my only configured client is the validator, which is correct.

Cloud in Five: WordPress on AWS

I’ve been meaning to get back to blogging a bit, so we’re going to try this format out. More or less, the goal is to build and test something in the cloud in a week or so (let’s say 5 business days, because I have a few personal days coming up). Can I guarantee it will be a week every time? Nope! I can’t even say that it will be a week this time. But it’s an exercise in technology and writing that I think is worth trying out. So, on to our first challenge:

Build an SSD-backed WordPress on AWS Setup

For a while I’ve been curious about the performance you could squeeze from one of the EC2 SSD-backed instances running WordPress. But why, you ask? There are already companies that specialize in WordPress hosting. True, and we use one of the biggest of them for one of our web properties. We also run a number of sites on a clustered setup that we manage internally. Initially, I had thought we would probably do one of those for a much larger project we have in the works, but I’ve got some reservations / lingering doubts:

Control: At the end of the day, this may just be an exercise in me wanting to control our own infrastructure again. Being unable to access all the things I need to effectively debug an issue can be frustrating. Having to convince someone that the problem you’re starting a ticket about does actually exist is frustrating. Feeling like your ticket is being ignored for large stretches of time is frustrating. I know, you get what you pay for, and there are trade-offs to not having to worry about servers in the middle of the night.

SSL: I want to use SSL exclusively. I know this isn’t the norm yet, but I think that if users have to go back and forth between secure and insecure pages on a site, in this day and age, you should just run completely secured. Even if you don’t have people authenticating, there may still be reasons to do this. Running this way, I’d like to see CDN assets served over SSL as well, and to enable things like Google’s SPDY protocol. This is where you start to lose some bonus features of the hosting providers, as they don’t all support SSL CDN or SPDY. Additionally, doing SSL in just some places starts the end-around of all the caching systems people tend to put in place for WordPress. And if SSL doesn’t…

Users: It seems to me that the most common approach to optimizing WordPress to scale quite naturally targets the most common use case: anonymous readers. Whether it’s in WordPress itself through the popular caching plugins, or through a layer of nginx or varnish caches, one of the first configuration pieces you tend to see is “skip me if you’re logged in”. But what if you want to encourage people to create accounts and engage? What if those are actually the most important users you have? I think we need to provide them an experience that’s on par with, if not better than, say, the Googlebot’s (although I do realize that we can’t ignore the Googlebot; it requires speed these days as well!)

Simplicity: Although I’m part of a large organization, our team is actually rather small, and what I’ll be proposing should, with the aid of configuration management, be a really simple setup compared to the distributed complexity (and latency!) you tend to end up with when you’re sharing space in, say, a VMware environment with a variety of other users. I’m also hopeful that AWS reduces complexity by turning parts I would normally have to manage on a device (like, say, load balancers) into a service I can configure via API. At the very least, it’s going to let me spin up and destroy machines loaded with SSDs without any major time or monetary investment!

So, now that we’re through the doubts and goals, here’s my proposal:

Let’s build an optimized WordPress setup based around EC2’s hi1.4xlarge (SSD-backed) instance type.

Is it cheap? Nope. This isn’t a bargain-basement setup; I want to take one of the best machines they offer and see what it can do. Here are the specs (from Amazon’s list):

High I/O Quadruple Extra Large Instance

  • 60.5 GiB of memory
  • 35 EC2 Compute Units (16 virtual cores*)
  • 2 SSD-based volumes each with 1024 GB of instance storage
  • 64-bit platform
  • I/O Performance: Very High (10 Gigabit Ethernet)
  • Storage I/O Performance: Very High*
  • EBS-Optimized Available: No**
  • API name: hi1.4xlarge

Sounds powerful. So, what does this enable? I’m hopeful that we’re going back to the good old days of one server handling all my stuff: the DB server on the same box as the assets and the web server. Old-fashioned, I know. I’ll be jettisoning what I could envision as a terribly complex setup of load balancers, web servers, NAS, DB cluster, and memcached servers. So, we’ve got a terabyte drive to hand to the database, and another to hand to the web server. Of course we’ll need redundancy, and that also needs to be part of this project, but the theory is that when everything is going well this box should be able to stand up to anything we could potentially throw at it (note: we’ve got years of our traffic data, so we won’t be guessing).

Approach

Now that we’ve outlined the goals, we need an approach. Here are the steps I’m envisioning (not necessarily in order), and a few bullets about each:

  • Using Chef, configure the server.
    • Share as much as possible about the setup
    • Have a Vagrant setup to use for testing the recipes
    • Install/configure MySQL to use one of the SSDs as its data store
    • Install/configure Apache & PHP-FPM to serve from the other SSD
    • We’re going to run with the Ubuntu instances, because that’s where I’m most comfortable.
  • Install/Configure WordPress
  • Build it into the AWS architecture.
    • Use an Elastic Load Balancer
    • Setup CloudFront CDN
  • Add redundancy / failover / backup
    • Leverage EBS/S3 for persistent storage?
    • Build a backup (and restore) process
    • Setup a hot standby (probably not SSD backed)
  • Benchmark / Tune
    • WordPress Caching
    • PHP Tuning
    • MySQL Tuning
    • Build a JMeter test suite

So, start your timers, we’re off! By tomorrow I hope to have at least tackled the complete server setup.

Managing Standalone Cookbooks in Chef

Managing cookbooks in Chef is a topic of heavy discussion in the community right now. The biggest argument seems to be between forking and maintaining an entire cookbooks repository (like the main Opscode one), or managing your custom cookbooks as individual git repositories. At the moment, I’m leaning towards the one-repo-per-cookbook approach because I feel it takes advantage of all the built-in GitHub goodness that I’ve become accustomed to, and makes it easier to share single cookbooks with the community. The biggest problem people seem to have with this approach is that when you’re developing and testing your cookbooks, you either make changes in your full Chef repository and then backport them into your standalone repo, or vice-versa. After struggling with this for a bit, I think I’ve found an approach that now seems pretty obvious. However, because it wasn’t at the time, I thought I’d share.

Step 1: Working with Single Repo Cookbooks

One of the biggest pieces facilitating this kind of setup is Jesse Newland’s (@jnewland) Knife plugin for working with GitHub cookbooks. It nicely follows the conventions that Knife already uses for vendored cookbooks (forks, downloads, merges), so your processes really don’t have to change to install or update the cookbooks you’d like to use, and the files still end up in your main Chef repo. I find this more useful than some of the other plugins that keep the cookbooks out of the repo, especially when using Chef Solo or developing with others. For those of you that just skim, typing commands as they arrive:

gem install knife-github-cookbooks
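Once it’s installed, usage mirrors the site vendor workflow. If memory serves, pulling a cookbook straight from GitHub looks like this (using one of our repos as the example):

knife cookbook github install wharton/chef-coldfusion9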

Step 2: Give Back

So now you’re using cookbooks from here and there as needed, and everything is swell until you realize that you just wrote a really useful cookbook. Or maybe a really niche cookbook that someone else out there might need. Either way, it’s time to share it with the world! Of course, the Opscode Community is kinda the official place for this sort of thing, but they want a tarball. Hey, you know where you can make a tarball from your source pretty easily? GitHub! So, go create a repo for your cookbook. Because you’re a convention-loving sort of person, you’ll prefix it with “chef-” or add “-cookbook” as a suffix, which the Knife plugin will kindly remove from the folder name when it installs (I know, they thought of everything!).

Step 3: No Backporting Local Development

You’re happy at this point. Your cookbook is out there for the world to use, and the sun is shining. Then, out of the blue: bugs, feature requests, pull requests! The eye of the community has now focused its gaze on you, and it’s time to get back to work. You’ve got your cookbook nicely vendored into your main repo, but that’s not where you want to make changes. So, why not create another location for your cloned repos? Chef is structured so that if you add additional cookbook paths, each one overrides the previous, so create a new folder in your main repo. I went with “forked-cookbooks” myself, and immediately added it to my .gitignore file. Now pop into that folder and clone your standalone repo into it, making sure to change the name of the folder so that it matches what the Knife plugin would do. An example from real life:

git clone https://github.com/wharton/chef-coldfusion9 ./forked-cookbooks/coldfusion9 

Now, add that path to your Knife config, Vagrantfile, or wherever else you specify it, and you can start making and testing your changes. Because we’re big on examples here, the line in my .chef/knife.rb looks like this:

cookbook_path ["#{current_dir}/../cookbooks", "#{current_dir}/../site-cookbooks", "#{current_dir}/../forked-cookbooks"] 

And the appropriate line in my Vagrantfile looks like this:

chef.cookbooks_path = ["~/Projects/knowledge-chef/cookbooks", "~/Projects/knowledge-chef/forked-cookbooks"]

When you’re happy with it, push your changes to your cookbook repo, and then use Knife to install a fresh copy of the cookbook into your main cookbooks path. The detailed changes go into your cookbook repo, where I think they belong, and your main Chef repo is up to date for use in solo deploys or sharing with your team. Done with changes for now, or maybe your pull request was accepted? Kill the folder in forked-cookbooks, and you can be sure you’re back to using what everyone else is, and not accidentally pushing old copies of the cookbook up to your Chef Server. Finally: go grab a cold one, you’ve earned it!
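Concretely, that wrap-up looks something like this (again assuming the plugin’s install syntax):

cd forked-cookbooks/coldfusion9
git push origin master
cd ../..
rm -rf forked-cookbooks/coldfusion9
knife cookbook github install wharton/chef-coldfusion9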

Update: It appears that the debate is largely over, as Opscode has announced they’ll be moving to the one-cookbook-per-repo approach. I would like to say it was because of this post, but that would be a flat-out lie. Props to Opscode for being willing to rework their development processes around the new setup.

Keeping your machine clean with Vagrant & Chef

Recently a coworker of mine (@hectcastro) turned me on to Vagrant. In their words: “Vagrant is a tool for building and distributing virtualized development environments.” What does that mean for me? No more installing all sorts of madness for the variety of servers I need to work with in our somewhat mixed-technology environment. In essence, disposable VMs for local development, once Chef is configured to provision the machines properly. Because I needed to update an older ColdFusion application last week and didn’t have it installed, I figured it was as good an excuse as any to kick the tires a little. Here’s what I did:

  1. Install Vagrant
    $ gem install vagrant
    
  2. Install their Lucid Box
    $ vagrant box add base http://files.vagrantup.com/lucid32.box
    
  3. Create a new folder, initialize vagrant and checkout the chef repo. Vagrant will end up sharing this folder with the VM, which makes it handy for moving extraneous files around.
    $ mkdir YOUR_FOLDER
    $ cd YOUR_FOLDER
    $ vagrant init
    $ git clone gitosis@MYREPO:cfenv-chef.git
    
  4. Modify the Vagrantfile to change the VM specs, base box, IP you’ll access it on, and set up chef provisioning. Here are the relevant lines from my setup:
    # Every Vagrant virtual environment requires a box to build off of.
     config.vm.box = "lucid32"
    
    # Assign this VM to a host only network IP, allowing you to access it
    # via the IP.
    config.vm.network "33.33.33.50"
    
    # Boost the RAM slightly and give it a reasonable name
    config.vm.customize do |vm|
      vm.memory_size = 512
      vm.name = "CFEnv"
    end
    
    # Enable provisioning with chef solo, specifying a cookbooks path (relative
    # to this Vagrantfile), and adding some recipes and/or roles.
    #
    config.vm.provision :chef_solo do |chef|
      chef.cookbooks_path = "cfenv-chef/cookbooks"
      chef.roles_path = "cfenv-chef/roles"
      chef.add_role "cfserver"
    end
    
  5. Bring it up! (It takes a few minutes the first time.) When it’s done you should have CF running on port 8500 of the IP you specified in the Vagrantfile. It’s installed in /opt/coldfusion if you need to find it, and the administrator password is set by the chef cookbook.
    $ vagrant up
    
  6. What I do after this is share folders into the CF root so I can edit locally but run in the VM. Here are two examples, which you can modify to your own taste (also in the Vagrantfile):
    config.vm.share_folder "kw-core", "/opt/coldfusion/wwwroot/core", "~/Sites/core"
    config.vm.share_folder "kw", "/opt/coldfusion/wwwroot/knowledge", "~/Sites/knowledge"
    
  7. To see your changes to the file shares, run a vagrant reload. If you modify the chef stuff, you can run a vagrant provision to kick off a chef run. Because of the way I set up the chef recipe, that will also have the side effect of restarting CF. Just for reference, you can log into the box with vagrant ssh. That’s it!
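    For reference, those commands are:
    $ vagrant reload
    $ vagrant provision
    $ vagrant ssh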

“But where is the link to this chef repo!” you ask? Well, because I developed it just for myself, I embedded the CF installer in it (which isn’t open source). However, because some co-workers have shown some interest, I’ll change a few things and try to get it up publicly. More importantly, I was able to provision a machine for local development without installing any of the footprint of an app server I really didn’t want on my machine, and I think this could be a very powerful solution for developing in a mixed-technology environment, or even in a single-tech workplace for consistency of dev setups.

UPDATE: I moved the installer out of the repo and pushed it to GitHub with slightly modified directions.

Chef: Rounding the Bend

Since you’ve already read my intro to Chef, as well as my article on getting started (right?), we’re going to do away with the long recap and instead give you a “Previously on 24”-style list:

  • You know what Chef is
  • You’ve installed and configured Chef
  • You’ve created a node
  • You’ve created a role
  • You’ve downloaded and used some existing cookbooks, changing default settings as necessary
  • Jack Bauer has muttered something about not having enough time

You can actually accomplish a fair amount just using the methods above, but eventually you’re going to need functionality that goes beyond what the one-size-fits-all cookbooks can provide. A common case I’ve encountered is that some recipes put in place a structure you can easily extend with your own templates, so we’ll look at that next.

Extending Recipes with Site Cookbooks

If you’ve been following along so far, you’ve got an empty site-cookbooks folder in the root of your chef repository (if you don’t, go ahead and create one). How does this work? Basically, you create a structure in there that mimics the cookbooks folder, and when knife uploads a cookbook, it uses the files it finds in site-cookbooks instead of those in cookbooks. To clarify: you only need to create the folders and files that you plan on overriding, or files that don’t exist in the original cookbook. So, why not just edit the cookbook itself? Actually, if it’s your own cookbook, that’s what you should do, but I’m talking about the case where you’re extending one you’ve found online. Granted, you could just as easily update it in the cookbooks folder, using git to manage your changes, but I personally think it’s cleaner to use the site-cookbooks method. This way, you can keep track of what you’ve written and/or changed. Additionally, if the author of the original enhances their cookbook while you’re off conquering the world with your new Chef setup, it’s easy to just replace the original in cookbooks, using version control to see what’s changed while preserving all your own customizations.
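To make that concrete, here’s what overriding a single template of a vendored cookbook looks like (paths for illustration only); at upload time, the site-cookbooks copy wins wherever both exist:

cookbooks/ntp/templates/default/ntp.conf.erb        # vendored original
site-cookbooks/ntp/templates/default/ntp.conf.erb   # your override, wins at upload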

So, on to the example. At the moment we have some basic functionality happening in our base_server.rb role file, but now we want to lock the machine down with an iptables firewall. Luckily, there’s a cookbook for that called, appropriately, iptables, so let’s vendor that cookbook with knife:

knife cookbook site vendor iptables -d

If you glance through the source of the cookbook, you’ll see that it creates an /etc/iptables.d directory in which it places rules; these rules are created from template files via a “definition” call. Finally, the machine is locked down to only accept connections defined in those rule files. Two things are worth noting here. First, this is our first look at a definition in Chef, so that warrants an explanation. To quote the chef wiki: “Definitions allow you to create new Resources by stringing together existing resources.” There are some good examples on there as well, but we’re going to proceed with ours. Here’s the relevant source from the definitions/iptables_rule.rb file:

define :iptables_rule, :enable => true, :source => nil, :variables => {} do
  template_source = params[:source] ? params[:source] : "#{params[:name]}.erb"
  
  template "/etc/iptables.d/#{params[:name]}" do
    source template_source
    mode 0644
    variables params[:variables]
    backup false
    notifies :run, resources(:execute => "rebuild-iptables")
    if params[:enable]
      action :create
    else
      action :delete
    end
  end
end

In short, it’s creating a new file in the iptables.d folder using a source based either on the name of the definition we’re creating or one we pass as a parameter, and enabling it. Why is this handy? Because instead of putting all that code in our recipe, we get a nice reusable snippet. Here’s how this is used in the default.rb recipe:

iptables_rule "all_established"
iptables_rule "all_icmp"

Sure beats writing all that code up there over and over again. Also, in Ruby fashion, it’s really easy to read. (I read: “Create an iptables rule for all established connections and for ping”). Just to fill in the last piece of the puzzle, here’s the template all_icmp.erb (which, if you’re following, is called by the iptables_rule definition):

# ICMP 
-A FWR -p icmp -j ACCEPT

Now, this is all well and good, except that most people are going to need rules for more than just established connections and ping. That brings us to the second thing worth noting about this cookbook: we need more templates! Enter: site cookbooks. For our example below, let’s extend this cookbook to include rules for a web server. To begin, create the appropriate structure (run from the site-cookbooks folder):

mkdir -p iptables/templates/default/
mkdir -p iptables/attributes
mkdir -p iptables/recipes

Let’s start with some simple templates for iptables including rules for http and https traffic from anywhere, as well as one for ssh (we do want to be able to administrate the box remotely, right?). Here’s the templates/default/all_http.erb file:

# HTTP
-A FWR -p tcp --dport 80 -j ACCEPT

Next, the templates/default/all_https.erb file:

# HTTPS
-A FWR -p tcp --dport 443 -j ACCEPT

And finally, templates/default/all_ssh.erb:

# SSH
-A FWR -p tcp --dport ssh -j ACCEPT

I suppose you could combine those into one template, but at this level I prefer to keep things as granular as possible, so we can mix and match down the road. Now, let’s get tricky and apply one of the other things we know about Chef: templates can be dynamic. So, let’s throw in some rules for locked-down versions of those same services. Here’s templates/default/network_http.erb:

# HTTP Locked Down
<% @node[:iptables][:web][:addresses].each do |address| %>
-A FWR -p tcp -s <%= address %> --dport 80 -j ACCEPT
<% end %>

And to match, templates/default/network_https.erb:

# HTTPS Locked Down
<% @node[:iptables][:web][:addresses].each do |address| %>
-A FWR -p tcp -s <%= address %> --dport 443 -j ACCEPT
<% end %>

Finally, templates/default/network_ssh.erb:

# SSH Locked Down
<% @node[:iptables][:ssh][:addresses].each do |address| %>
-A FWR -p tcp -s <%= address %> --dport 22 -j ACCEPT
<% end %>

Now we have a nice base to work with: some iptables templates we can mix and match as necessary. Of note: we’ve introduced a node variable to the network rules, so we have to remember to cover that in our recipe. Also worth noting: most programmers are going to start to see repetition above, and may feel tempted to create an all_tcp rule with an additional “port” variable. Don’t let me stop you; it might make sense. Two reasons why I didn’t: 1) down the road there could be more complicated services I’m defining in templates that could have multiple iptables rules, and I would prefer to have them in one template so that 2) each service remains a granular object, easy to read when defined in the recipe. I’m willing to sacrifice a little repetition if it makes my recipes easier to read and administrate. Again, personal choice, and you’re a rugged individualist, so do your own thing if it makes you happy!

Moving on, I don’t think the recipes have to be as granular (because the templates already are), so let’s create a recipe using these templates for “iptables::web”, encompassing http and https, with a decision on which rules to use based on a node variable. Here’s our recipes/web.rb:

# Have we decided to lock down the node?
if node[:iptables][:web][:addresses].empty?
  # Use the all_ rules
  iptables_rule "all_http"
  iptables_rule "all_https"
  # Disable the network rules
  iptables_rule "network_http", :enable => false
  iptables_rule "network_https", :enable => false
else
  # Use the network rule
  iptables_rule "network_http"
  iptables_rule "network_https"
  # Disable the all traffic rules
  iptables_rule "all_http", :enable => false
  iptables_rule "all_https", :enable => false
end

Note: if we don’t do that :enable => false bit, the file will remain on the server even if we remove the line later. Strange, I know. Moving on, one for ssh, “iptables::ssh” (recipes/ssh.rb):

# Have we decided to lock down the node?
if node[:iptables][:ssh][:addresses].empty?
  # Use the all_ssh rule
  iptables_rule "all_ssh"
  # Disable the network ssh rule
  iptables_rule "network_ssh", :enable => false
else
  # Use the network rule
  iptables_rule "network_ssh"
  # Disable the all traffic rule
  iptables_rule "all_ssh", :enable => false
end

Pretty simple, right? If we’ve defined addresses on the node, use the lockdown rules; otherwise, open the port up to the world. Finally, because we’ve introduced new node attributes, we need to create two attribute files corresponding to our new recipes. First, attributes/web.rb:

# Web Traffic Allowed Networks (IP/NETMASK)
default[:iptables][:web][:addresses] = Array.new

And one for SSH as well (attributes/ssh.rb):

# SSH Allowed Networks (IP/NETMASK)
default[:iptables][:ssh][:addresses] = Array.new

Awesome. Now the big finish: let’s add the ssh recipe to our base server, then create a new role for a web server that applies our “base server” configuration and then locks down the machine. In the real world, this would be part of a role that also configures your web servers. Coincidentally, the apache2 cookbook uses a very similar mechanism, so if you’re anxious you can move ahead using the versatile “web_app” definition in that cookbook. Now our roles/base_server.rb looks like this (locking ssh down to an arbitrary subnet):

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]",
  "recipe[ntp]",
  "recipe[resolver]",
  "recipe[postfix]",
  "recipe[iptables]",
  "recipe[iptables::ssh]"
)
default_attributes(
  "ntp" => {
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"]
  },
  "resolver" => {
    "nameservers" => ["128.91.87.123", "128.91.91.87", "128.91.2.13"],
    "search" => "wharton.upenn.edu"
  },
  "postfix" => {
    "relayhost" => "SOME.RELAY.SERVER"
  },
  "iptables" => {
    "ssh" => { "addresses" => ["128.91.0.0/255.255.0.0", "130.91.0.0/255.255.0.0"] }
  }
)

Great, now let’s create a more task-specific role for a web server, building off of that base role. Because I want to prove it works, why don’t you go download the apache2 cookbook. I’ll wait. Ok, let’s include it to do a base install so you can check easily. Here’s the new roles/web_server.rb role file:

name "web_server"
description "Generic Web Server"
run_list(
  "role[base_server]",
  "recipe[apache2]",
  "recipe[apache2::mod_ssl]",
  "recipe[iptables::web]"
)

There you have it. Note the “role[base_server]” line in run_list; that includes all the good stuff we have in our base server role (obvious, right?). Upload the cookbooks, update the roles, and try assigning the new web server role to a test node and you’re rolling!
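In command form, that last sentence is roughly (the node name is a placeholder):

knife cookbook upload apache2 iptables
knife role from file base_server.rb
knife role from file web_server.rb
knife node run_list add TEST_NODE 'role[web_server]'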

Housekeeping

To complete this example, I need to include a few more things I did. This falls into a grey area for me, because I actually think this belongs in the original recipe, which is something you’re likely to encounter as you extend cookbooks: when to override and when to patch? My approach is going to be: fix in site-cookbooks, then submit a patch back to the author; if it gets included, great, update the cookbook and remove the fix from the site cookbook. Because I haven’t gotten this patch back to Opscode yet, and I want you to be able to use my examples if you’d like, here is the last piece of the iptables functionality I had to change.

The problem: iptables rules weren’t persisting through a reboot. A pretty big issue! My solution was to add a script to the if-up.d folder that calls the rebuild-iptables script (installed by the iptables cookbook) whenever a network interface comes up. This required a short template (templates/default/iptables.erb):

#!/bin/sh
/usr/sbin/rebuild-iptables

And a recipe to place it (recipes/on_boot.rb):

if platform?("debian", "ubuntu")
  # Add a script to restore the rules on boot
  template "/etc/network/if-up.d/iptables" do
    source "iptables.erb"
    owner "root"
    group "root"
    mode 0755
  end
end

Now, include the “iptables::on_boot” recipe in your base_server.rb role and you’re good to go! In case you’re curious, I haven’t submitted the patch yet because I haven’t had time to test it on anything but the Ubuntu boxes I’m running (thus the “if platform?” conditional), and I’d prefer to submit a patch that also works in the RHEL space. Opscode, don’t let this stop you from taking this and running!

Seems to me we’ve gone quite long again, so we’ll wrap this one up. Still to come: the wild wild west, creating your own cookbook in its entirety from scratch! Until then, enjoy extending recipes. If you’ve been following along, you’re already wielding a pretty powerful configuration tool!

Chef: Out of the Gate

Next in my series of posts about the configuration management tool chef, I’d like to talk a little bit about how I got started. First up:

The Server

Although there are several varieties of chef, I prefer to have a central server from which I can examine my nodes (client machines). Because I didn’t want to go through the trouble of setting up and maintaining my own, I decided to go with the hosted service from Opscode, which is free up to 5 nodes and has very reasonable pricing beyond that. You can sign up for an account right on their homepage. Another nice thing about having a server running is that you can use it to examine all the properties of your cookbooks and nodes, to see if values are being set the way you expect them to be. For example, being able to browse the node attributes on the server helped me diagnose a problem of mismatched brackets that was putting all my custom attributes into one cookbook, even though the file wasn’t invalid on my end.

Getting Started

You can find this basic info on a variety of sites, so I’m not going to go into painstaking detail; we’ll just run through getting your initial chef directory structure in place. This will require git, which you can learn more about from Brian and Hector’s tech talk, or on the web. The official installation instructions are available from the Opscode Knowledge Base. This is a condensed version.

First, we want to pull down an empty chef structure (note, all my commands will be for *nix environments, if necessary, you’ll have to swap in the windows equivalents):

git clone git://github.com/opscode/chef-repo.git

This will create the folders that knife expects to find when it runs. We’ll enter the folder and add one more for site cookbooks (which is a clever way to extend cookbooks we download from 3rd parties):

cd chef-repo
mkdir site-cookbooks

Finally, we need to set up knife to be able to talk with our server. On the Opscode site in your console you need to set up a new organization, then generate a new key for it (save this file; we’ll call it ORGANIZATION-validator.pem) and use the “Generate knife config” link to get the knife.rb file. Then, on your user account page, use the “Get a new private key” link to download your keyfile (I’ll call it USERNAME.pem). Now we’ll create the appropriate location in our chef repo and copy the files in (all commands will assume we’re sitting in the chef-repo folder from this point forward):

mkdir .chef
cp /YOURPATH/ORGANIZATION-validator.pem .chef
cp /YOURPATH/knife.rb .chef
cp /YOURPATH/USERNAME.pem .chef

And a quick test:

knife client list

Should output:

[
    "ORGANIZATION-validator"
]

Ok. Now we’re cooking! Heh, get it? Sorry. Let’s move on.

Roles

It’s hard to know whether to introduce roles or recipes first, because they’re pretty useless without one another, but we’ll start with roles. A role basically defines a list of recipes and attributes that you can apply as a single unit to a node. Although you can alter your roles on the server once you create them, I think always creating them from a source file is the best way to go about it. I got started with this concept by thinking about the environment I needed to manage, and decided to keep it simple, breaking all the servers up into two pieces:

  • A base role containing all the software and configuration that’s common across all server types (network setup, security, mail)
  • A role for the specific server type (application, database, file, etc.)

So, we’ll get started with the base role. Create a file called base_server.rb in the roles folder of your repository and we’ll start with the bare minimum in there:

name "base_server"
description "Common Server Base Configuration"

To create the role and add it to the server, enter knife:

knife role from file base_server.rb

To see if it worked, log on to the chef server and look in the “Roles” tab. You should see your new role. If you click on it, you’ll get more details, but that’s pretty much empty right now. Before we can do anything clever there we need to learn a little bit about cookbooks. Additionally, we’re going to need a machine to experiment on, so let’s work on that next.

Nodes

Nodes are the computers that you’re controlling with chef. Although you could manually install chef on a machine, register it with your server, and use the server’s web interface to add roles to that node, the easiest way (assuming you have ssh access to your test machine) is to use knife’s bootstrap command (which we covered in the first chef article). Let’s bootstrap our test system and assign it our new base_server role:

knife bootstrap TEST.SERVER.ADDRESS -x USERNAME -P PASSWORD -r 'role[base_server]' --sudo

When that completes you should see the machine in the node list on the server, or you can even quickly check with:

knife node list

From this point forward, to realize changes you’ve made to your cookbooks and recipes on the test box, you need to run the chef-client command there. This will contact the server, pull down any new or changed cookbooks, and apply the appropriate recipes given the roles the machine has been assigned. So, on a *nix box:

sudo chef-client

Now, let’s do something with our test machine.

Cookbooks and Recipes

Cookbooks are collections of… wait for it… recipes! (Pretty obvious, huh?) Recipes are what actually make chef do anything useful, like install a web server or change your DNS settings. Even better, there are already a bunch of recipes out there for common tasks. On the down side, not all the recipes you download are going to have good (or any) documentation. However, once you understand how they work, it’s easy to read a recipe and figure out what it’s doing (although this does not excuse developers out there from documenting!! Do it!!!)

Cookbooks: Straight up Defaults

To get our feet wet with cookbooks, we’ll get to work on our base server role created above, and have it start to do some stuff. The easiest recipes to use are the ones that require no additional input from us at all, so let’s start there. We use git here as our version control software, so I’d like it installed as part of our base build. A search on the Opscode Cookbooks Site shows me that someone has already been nice enough to create a cookbook for git. Because it’s there, we can use knife to pull a copy:

knife cookbook site vendor git -d

Using the vendor command does some git magic behind the scenes which, honestly, I do not entirely understand yet, but this is the preferred way to use downloaded cookbooks. Once that command runs, you can take a look in your cookbooks folder and see what got downloaded. Cookbooks have a standard folder structure inside them, and the first place I look is in the root of the cookbook for some sort of documentation. The next place I tend to look is the “recipes” sub-folder. In here is the list of recipes available to you (with default.rb being what runs if you just use the cookbook name as your recipe). In our case, the git default.rb recipe is what installs the git client packages. Here’s the recipe itself, for the curious:

case node[:platform]
when "debian", "ubuntu"
  package "git-core"
else 
  package "git"
end

As you can see, the DSL (domain-specific language) in Ruby is pretty easy to read. When the platform is debian or ubuntu, install the “git-core” package; otherwise, install the “git” package. Chef is smart enough to have different methods for installing packages depending on what platform you’re on, but is nice enough to have that abstracted away in the recipes. This will work for our needs, so let’s add a new section to our base_server.rb role file:

run_list(
  "recipe[git]"
)

Had we wanted to install a non-default recipe from a cookbook, we’d have used the double-colon notation. For example, there’s a server recipe in the git cookbook, and to use that we would have added “recipe[git::server]” to our run list. Every time we change one of our roles, we need to tell knife to update the server. Additionally, before we can use a cookbook on a node, we need to upload it to the server as well. (Note: behind the scenes this is all happening via REST API calls to the chef server. Knife is actually just a RESTful client for the chef server API. Now you know.)

knife cookbook upload git
knife role from file base_server.rb

And on the client:

sudo chef-client

Ok, I’m done typing that for now. You’ll know what I mean when I say to update the cookbook, update the role, and run the client moving forward, right? So, let’s continue. I’m also going to download cookbooks for vim and fail2ban and add them to my base role as well. In its entirety, base_server.rb now looks like this:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]"
)

Upload the cookbooks, update the role, and run the client and we now have a server with git and vim installed, and fail2ban providing some security.

Cookbooks: Tinkering with the Defaults

That’s all well and good, but we’ve gotten to the point where I need to install some software and change some configuration. As an example, I’d like to set up an NTP service on the box to keep the clock in sync with the Penn time servers. A quick search shows that there’s already an “ntp” cookbook, so let’s start with that.

knife cookbook site vendor ntp -d

Let’s take a look at the relevant parts of the default.rb recipe:

case node[:platform] 
when "ubuntu","debian"
  package "ntpdate" do
    action :install
  end
end

package "ntp" do
  action :install
end

service node[:ntp][:service] do
  action :start
end

template "/etc/ntp.conf" do
  source "ntp.conf.erb"
  owner "root"
  group "root"
  mode 0644
  notifies :restart, resources(:service => node[:ntp][:service])
end

Ok, the first part is easy enough to understand: install some packages, some conditional on the distribution we’re running. Then we come to some new stuff. First, let’s talk about the “node[:ntp][:service]” variable. This is the chef way of accessing node-specific attributes. The value could be different on every machine we run chef on, but has defaults set by the cookbook and/or the role (they can also be overridden in either of these places, but we don’t need to go into that yet). To get an idea of what variables the cookbook contains, take a look at the default values, which are all set by files in the attributes folder. The files in this folder should be named to correspond to the recipes. So, in the attributes/default.rb file we see:

case platform 
when "ubuntu","debian"
  default[:ntp][:service] = "ntp"
when "redhat","centos","fedora"
  default[:ntp][:service] = "ntpd"
end

default[:ntp][:is_server] = false
default[:ntp][:servers]   = ["0.us.pool.ntp.org", "1.us.pool.ntp.org"]

So, we set the name of the service depending on the platform we’re on, by default do not run NTP as a server for others to access, and have a default list of ntp servers. Slightly confusing side note: although it would be nice if this file were commented, it seems the place to actually document attributes is the metadata.rb file in the cookbook root. Here’s the related section from that file:

attribute "ntp",
  :display_name => "NTP",
  :description => "Hash of NTP attributes",
  :type => "hash"

attribute "ntp/service",
  :display_name => "NTP Service",
  :description => "Name of the NTP service",
  :default => "ntp"

attribute "ntp/is_server",
  :display_name => "NTP Is Server?",
  :description => "Set to true if this is an NTP server",
  :default => "false"

attribute "ntp/servers",
  :display_name => "NTP Servers",
  :description => "Array of servers we should talk to",
  :type => "array",
  :default => ["0.us.pool.ntp.org", "1.us.pool.ntp.org"]

Between those two places you should be able to determine what you have control over in your cookbook. Now, to put this into use and point our boxes at the local UPenn time servers, we need to add the recipe, plus a new section to our base_server.rb role, keeping the structure we identified above:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]",
  "recipe[ntp]"
)
default_attributes(
  "ntp" => { 
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"] 
  }
)

Those couple of lines are actually all we need to accomplish the task. Upload the cookbook, update the role, and do a client run, and we’re now running the NTP service tied to our local time servers. Although we’re done, let’s take a look at the last section of that recipe and see what it’s doing. Specifically, the template command. If you read it like English, it’s creating a file on the system (“/etc/ntp.conf”) from a source file (“ntp.conf.erb”), setting permissions on the file, and then restarting the NTP service. You’ll see there’s no path on the source file; that’s because chef expects template files to live in the cookbook’s templates folder, and beyond that, in a folder named for the recipe you’re running. In our case, that’s templates/default/ntp.conf.erb. ERB is ruby’s templating system, and provides a way to insert variables and some logic (if necessary) into arbitrary text files. Let’s take a look:

driftfile /var/lib/ntp/ntp.drift
statsdir /var/log/ntpstats/

statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable

<% if node[:ntp][:is_server] -%>
server 0.us.pool.ntp.org
server 1.us.pool.ntp.org
server 2.us.pool.ntp.org
server 3.us.pool.ntp.org
<% else -%>
<% node[:ntp][:servers].each do |ntpserver| -%>
server <%= ntpserver %>
<% end -%>
<% end -%>
restrict default kod notrap nomodify nopeer noquery

restrict 127.0.0.1 nomodify

The parts that are ERB specific are all contained in <% blocks %>. In this case, the text file generates the list of servers we use based on the attributes we’ve specified on the node. Although a pretty simple example, I think you can see how this is an easy way to manage settings files, and most recipes leverage this template system pretty heavily.

Ok, getting a little long here, so let’s add a few more cookbooks that require only a few attribute changes and call it a day. We’ll add cookbooks to set our DNS settings (resolver), and to add a mail server set to relay through an internal relay host. At the end of all that, here’s our final base_server.rb file:

name "base_server"
description "Common Server Base Configuration"
run_list(
  "recipe[fail2ban]",
  "recipe[git]",
  "recipe[vim]",
  "recipe[ntp]",
  "recipe[resolver]",
  "recipe[postfix]"
)
default_attributes(
  "ntp" => { 
    "servers" => ["timeserver1.upenn.edu", "timeserver2.upenn.edu", "timeserver3.upenn.edu"] 
  },
  "resolver" => {
    "nameservers" => ["128.91.87.123", "128.91.91.87", "128.91.2.13"],
    "search" => "wharton.upenn.edu"
  },
  "postfix" => {
    "relayhost" => "SOME.RELAY.SERVER"
  }
)

You should definitely go through the postfix and resolver cookbooks at this point and make sure you understand what’s going on. When you’re through with that, you’ll be ready for our next installment, where we build on all this knowledge to extend vendored cookbooks, and ultimately author our own!