Journey in a software world…
21 Jul
As a Puppet Mongrel Nginx user, I’m really ashamed of the convoluted nginx configuration needed (two server blocks listening on different ports, and you need to direct your clients’ CA interactions to the second port with --ca_port), and of the lack of proper CRL verification.
If you are like me, then there is some hope in this blog post.
Last weekend, I did some intense Puppet hacking (certainly more news about this soon), and part of this work is two Nginx patches:
First, download both patches:
Then apply them to Nginx (tested on 0.7.59):
$ cd nginx-0.7.59
$ patch -p1 < ../0001-Support-ssl_client_verify-optional-and-ssl_client_v.patch
$ patch -p1 < ../0002-Add-SSL-CRL-verifications.patch
Then build Nginx as usual.
Here is a revised Puppet Nginx Mongrel configuration:
upstream puppet-production {
server 127.0.0.1:18140;
server 127.0.0.1:18141;
}
server {
listen 8140;
ssl on;
ssl_session_timeout 5m;
ssl_certificate /var/lib/puppet/ssl/certs/puppetmaster.pem;
ssl_certificate_key /var/lib/puppet/ssl/private_keys/puppetmaster.pem;
ssl_client_certificate /var/lib/puppet/ssl/ca/ca_crt.pem;
ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA;
# allow authenticated and client without certs
ssl_verify_client optional;
# obey to the Puppet CRL
ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem;
root /var/tmp;
location / {
proxy_pass http://puppet-production;
proxy_redirect off;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Client-Verify $ssl_client_verify;
proxy_set_header X-SSL-Subject $ssl_client_s_dn;
proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
proxy_read_timeout 65;
}
}
Reload nginx, and enjoy
18 Jul
It’s been a long time since my last post… which just means I was really busy, privately, on the Puppet side and at work (I’ll talk about the Puppet side soon; for the private life, you’re on the wrong blog).
For a project I’m working on at Days of Wonder, I had to use the nginx secure link module. This module lets a client access the requested resource only if the MD5 hash embedded in the URL matches the one nginx computes from the resource name and a shared secret.
To use it, it’s as simple as:
location /protected/ {
secure_link "this is my secret";
root /var/www/downloads;
if ($secure_link = "") {
return 403;
}
rewrite ^ /$secure_link break;
}
To generate a URL, use the following PHP snippet:
<?php
$prefix = "http://www.domain.com/protected";
$protected_resource = "my-super-secret-resource.jpg";
$secret = "this is my secret";
$hashmac = md5( $protected_resource . $secret );
$url = $prefix . "/" . $hashmac . "/" . $protected_resource;
?>
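For readers who are not using PHP, the same computation can be sketched in Python. This only illustrates the scheme shown in the PHP snippet (MD5 of the resource name followed by the secret); the function name secure_link_url is made up for this example.

```python
import hashlib


def secure_link_url(prefix: str, resource: str, secret: str) -> str:
    """Build a protected URL of the form prefix/md5(resource + secret)/resource."""
    # Same concatenation order as the PHP snippet: resource first, then secret.
    digest = hashlib.md5((resource + secret).encode("utf-8")).hexdigest()
    return prefix + "/" + digest + "/" + resource


if __name__ == "__main__":
    print(secure_link_url(
        "http://www.domain.com/protected",
        "my-super-secret-resource.jpg",
        "this is my secret",
    ))
```

The secret must of course be the same string as the one given to the secure_link directive.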
But that wasn’t enough for our usage. We needed the url to expire automatically after some time. So I crafted a small patch against Nginx 0.7.59.
It just extends the nginx secure link module with a TTL. The time at which the resource expires is embedded in the URL and in the HMAC. If the server finds that the current time is greater than the embedded time, then it denies access to the resource.
The timeout can’t be tampered with, as it is part of the HMAC input.
The usage is the same as the current nginx secure link module, except:
You need to use the following (sorry only PHP) code:
define('URL_TIMEOUT', 3600); # one hour timeout
$prefix = "http://www.domain.com/protected";
$protected_resource = "my-super-secret-resource.jpg";
$secret = "this is my secret";
$time = pack('N', time() + URL_TIMEOUT);
$timeout = bin2hex($time);
$hashmac = md5( $protected_resource . $time . $secret );
$url = $prefix . "/" . $hashmac . $timeout . "/" . $protected_resource;
location /protected/ {
secure_link "this is my secret";
secure_link_ttl on;
root /var/www/protected;
if ($secure_link = "") {
return 403;
}
rewrite ^ /$secure_link break;
}
The server generating the url and hashmac and the one delivering the protected resource must have synchronized clocks.
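To make the scheme easier to follow, here is a rough Python sketch of both sides. The URL generation mirrors the PHP code above (struct.pack(">I", ...) produces the same 4 bytes as PHP's pack('N', ...)). The is_valid function is only my guess at what the check amounts to, based on the description above; the real check happens inside the patched nginx module, and both function names are invented for this example.

```python
import hashlib
import struct
import time

URL_TIMEOUT = 3600  # one hour, as in the PHP example


def make_expiring_url(prefix, resource, secret, now=None):
    """Build prefix/<md5 hex><expiry hex>/resource, like the PHP code."""
    now = int(time.time()) if now is None else now
    # 4-byte big-endian expiry time, same bytes as PHP pack('N', ...).
    expiry = struct.pack(">I", now + URL_TIMEOUT)
    digest = hashlib.md5(resource.encode() + expiry + secret.encode()).hexdigest()
    return prefix + "/" + digest + expiry.hex() + "/" + resource


def is_valid(token, resource, secret, now=None):
    """Hypothetical server-side check: hash must match and time must not be past."""
    now = int(time.time()) if now is None else now
    if len(token) != 40:  # 32 hex chars of MD5 + 8 hex chars of expiry
        return False
    digest, expiry_hex = token[:32], token[32:]
    try:
        expiry = bytes.fromhex(expiry_hex)
    except ValueError:
        return False
    if len(expiry) != 4:
        return False
    expected = hashlib.md5(resource.encode() + expiry + secret.encode()).hexdigest()
    if digest != expected:
        return False  # the expiry or the resource was tampered with
    return now <= struct.unpack(">I", expiry)[0]
```

Changing the expiry part of the URL changes the bytes fed to MD5, so the hash no longer matches. That is why the timeout cannot be extended by the client.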
There is no support. If it eats your server, then I or Days of Wonder can’t be
It’s simple:
24 May
As announced in the last edit of yesterday’s post, Puppet and JRuby a love and hate story, I finally managed to run a webrick puppetmaster under JRuby, with an MRI client connecting and fetching its config.
Unfortunately Puppet creates its first certificate with a serial number of 0, which JRuby-OpenSSL finds invalid (in fact that’s the Bouncy Castle JCE Provider). So the first thing is to check whether you already have some certificates generated with a serial of 0. If you have none, then everything is great and you can skip this.
You can see a certificate content with openssl:
% openssl x509 -text -in /path/to/my/puppet/ssl/ca/ca_cert.pem
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN=ca
        Validity
            Not Before: May 23 18:38:19 2009 GMT
            Not After : May 22 18:38:19 2014 GMT
        Subject: CN=ca
...
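Checking many certificates by eye gets tedious, so here is a small Python sketch that automates the check. It assumes the openssl binary is on your PATH and relies on the "serial=01" style line printed by openssl x509 -noout -serial; the function names are made up for this example.

```python
import subprocess


def parse_serial(openssl_output: str) -> int:
    """Parse the 'serial=01' line printed by `openssl x509 -noout -serial`."""
    _, _, hex_value = openssl_output.strip().partition("=")
    return int(hex_value, 16)


def has_zero_serial(cert_path: str) -> bool:
    """Return True if the certificate at cert_path has a serial number of 0."""
    result = subprocess.run(
        ["openssl", "x509", "-noout", "-serial", "-in", cert_path],
        check=True, capture_output=True, text=True,
    )
    return parse_serial(result.stdout) == 0
```

Run has_zero_serial over every .pem file of your Puppet SSL directory; if it returns True for any of them, the PKI needs to be rebuilt.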
If no certificate has a serial of 0, then it’s OK, otherwise I’m afraid you’ll have to start the PKI from scratch (which means rm -rf $vardir/ssl and authenticate clients again), after applying the following Puppet patch:
JRuby fix: make sure certificate serial > 0
JRuby OpenSSL implementation is more strict than real ruby one and
requires certificate serial number to be strictly positive.
Signed-off-by: Brice Figureau <brice-puppet@daysofwonder.com>
diff --git a/lib/puppet/ssl/certificate_authority.rb b/lib/puppet/ssl/certificate_authority.rb
index 08feff0..4a7d461 100644
--- a/lib/puppet/ssl/certificate_authority.rb
+++ b/lib/puppet/ssl/certificate_authority.rb
@@ -184,7 +184,7 @@ class Puppet::SSL::CertificateAuthority
# it, but with a mode we can't actually read in some cases. So, use
# a default before the lock.
unless FileTest.exist?(Puppet[:serial])
- serial = 0x0
+ serial = 0x1
end
Puppet.settings.readwritelock(:serial) { |f|
I’ll post this patch to puppet-dev soon, and I hope it’ll eventually get merged into mainline.
You need the freshest JRuby available at this time. My tests were conducted with the latest JRuby as of commit “3aadd8a”. The best approach is to clone the GitHub jruby repository and build it (it requires a JDK and Ant, of course, but that’s pretty much all).
Then install jruby in your path (if you need assistance for this, I’m not sure this blog post is for you).
As I explained in my previous blog post on the same subject, Puppet exercises the Ruby OpenSSL subsystem a lot. During this experiment, I found a few shortcomings in the current JRuby-OpenSSL 0.5, including missing methods and missing behaviors that Puppet needs to run fine.
So to get a fully Puppet-enabled JRuby-OpenSSL, you need either to get the very latest JRuby-OpenSSL from its own GitHub repository (or check out the puppet-fixes branch of my fork of said repository on GitHub), or to manually apply the following patches on top of the 0.5 source tarball:
Then rebuild JRuby-OpenSSL which is a straightforward process (copy build.properties.SAMPLE to build.properties, adjust jruby.jar path, and then issue ant jar to build the jopenssl.jar).
Once done, install the 0.5 JRuby-OpenSSL gem in your jruby install, and copy over the built jar into lib/ruby/gems/1.8/gems/jruby-openssl-0.5/lib.
Then it’s time to run your puppetmaster, just start it with jruby instead of ruby. Of course you need the puppet dependencies installed (Facter).
My next try will be to run Puppet on JRuby and mongrel (or what replaces it in JRuby world), then try with storeconfig on…
Hope that helps, and for any question, please post in the puppet-dev list.
23 May
Since I heard about JRuby about a year ago, I wanted to try to run my favorite ruby program on it. I’m working with Java almost all day long, so I know for sure that the Sun JVM is a precious tool for running long-lived servers. It is pretty fast, and has a very good (and tunable) garbage collector.
In a word: the perfect system to run a long-lived puppetmaster!
The first time I tried, back in February 2009, I unfortunately encountered the bug JRUBY-3349, which prevented Puppet from running quite early on, because the Fcntl constants weren’t defined. Since my understanding of JRuby internals is near zero, I left it there.
But thanks to Luke Kanies (Puppet creator), one of the JRuby main developers Charles Oliver Nutter fixed the issue a couple of weeks ago (thanks to him, and they even fixed another issue at about the same time about fcntl which didn’t support SET_FD).
That was just in time for another test…
But what I forgot was that Puppet is not just any Ruby app on the block. It uses lots of cryptography behind the scenes. Remember that Puppet manages its own PKI, including:
That just means Puppet exercises the Ruby OpenSSL extension a lot.
The main issue is that MRI uses OpenSSL for all the cryptographic stuff, while JRuby uses a specific Java version of this extension. Of course the latter is still young (presently at v0.5) and doesn’t yet contain everything needed to run Puppet.
In another life I wrote a proprietary cryptographic Java library, so I’m not a complete cryptography newcomer (OK, I forgot almost everything, but I still have some good books to refer to). So I decided to implement what is missing in JRuby-openssl to allow a webrick Puppetmaster to run.
You can find my contributions in the various JRUBY-3689, JRUBY-3690, JRUBY-3691, JRUBY-3692, JRUBY-3693 bugs.
I still have another minor patch to submit (an OpenSSL::X509::Certificate#to_text implementation).
So the question is: with all those patches applied, did I get a puppetmaster running?
And the answer is unfortunately no.
I can get the puppetmaster to start on a fresh configuration (i.e. it creates everything SSL-related and such), but it fails as soon as a client connects (hey, that’s way better than before I started).
It all comes from SSL. The issue is that with the C OpenSSL implementation it is possible to get the peer certificate at any time, but the Java SSL implementation (which is provided by the Sun virtual machine) requires the client to be authenticated before anyone gets access to the peer certificate.
That’s unfortunate because to be able to authenticate a not-yet-registered client, we must have access to its certificate. I couldn’t find any easy code fix, so I stopped my investigations there.
There are still some possible workarounds, like running in mongrel mode (provided JRuby supports mongrel, which I didn’t check) and letting Nginx (or Apache) handle the SSL stuff, but it would still be great to be able to run a full-fledged puppetmaster on JRuby.
I tried with a known client and got the same issue, so maybe that’s a whole different issue. I guess I’ll have to dig deeper into the Java SSL code, which unfortunately is not available.
Stay tuned for more info about this. I hope to be able to have a full puppetmaster running on JRuby soon!
EDIT: I could run a full puppetmaster on webrick from scratch under JRuby with a normal ruby client. I’ll post the recipe in a subsequent article soon.
19 Apr
Note: when I started writing this post, I didn’t know it would be this long.
I decided then to split it in several posts, each one covering one or more interesting aspect of zsh.
You’re now reading part 1.
I first used a Unix computer in 1992 (it was running SunOS 4.1 if I remember correctly).
I’ve been using Linux since 1999 (after using VMS throughout the 90s in school, but I left the Unix world while I was working with RAYflect doing 3D stuff on Mac and Windows).
During the time I worked with those various unices (including Irix on a Crimson), I think I’ve used almost every possible shell with various levels of pleasure and expertise, including but not limited to:
When my own road crossed ZSH (about 6 years ago), I fell in love with this powerful shell, and it’s now my default shell on my servers, workstations and of course my MacBook laptop.
The point of this blog post is to give you an incentive to switch from insert random shell here to zsh and never turn back.
The whole issue with zsh is that the usual random Linux distribution ships with Bash by default (that’s not entirely true, as GRML ships with zsh and a good configuration). And Bash does its job well enough, and is widespread enough, that people usually have little incentive to switch to something different. I’ll try to show you why zsh is worth the little investment.
Right now, zsh exists in 2 versions: a stable one (4.2.7) and a development one (4.3.9).
I’m of course running the development version (I usually prefer seeing new bugs to old bugs).
I recommend using the development version.
Some people don’t want to switch to zsh because they think zsh doesn’t support UTF-8. That’s plain wrong: if you follow my previous advice and run a version greater than 4.3.6, UTF-8 support is there and works really well.
One of the best things in zsh is the TAB completion. It’s certainly the best TAB completion of every shell I have tried. It can complete almost anything: files (of course), users, hosts, command options, package names, git revisions/branches, etc.
Zsh ships with completions for almost every app on earth. And the beauty is that completion is so configurable that you can twist it to your own specific taste.
To activate completion on your setup:
% zmodload zsh/complist
% autoload -U compinit && compinit
The completion system is completely configurable. To configure it we use the zstyle command:
zstyle <context> <styles>
How does it work?
The context defines where the style will apply. The context is a string of ‘:’ separated strings:
‘:completion:function:completer:command:argument:tag’
Some parts can be replaced by *, so that ‘:completion:*’ is the least specific context.
More specific contexts win over less specific ones, of course.
The various styles select the options to activate (see below).
If you want to learn more about zsh completion, please read the completion section of the zsh manual.
Zsh completion is also:
When zsh needs to display completion matches or errors, it uses the format style for doing so.
zstyle ':completion:*' format 'Ouch: %d'
%d will be replaced by the actual text zsh would have printed if no format style were applied.
You can use the same escape sequences as in zsh prompts.
Since there are many different types of messages, it is possible to restrict the format to warnings or messages by changing the tag part of the context:
zstyle ':completion:*:warnings' format 'Too bad there is nothing'

And since it is possible to use all the prompt escapes, you can add style to the formats:
# format all messages not formatted in bold prefixed with ----
zstyle ':completion:*' format '%B---- %d%b'
# format descriptions (notice the vt100 escapes)
zstyle ':completion:*:descriptions' format $'%{\e[0;31m%}completing %B%d%b%{\e[0m%}'
# bold and underline normal messages
zstyle ':completion:*:messages' format '%B%U---- %d%u%b'
# format error messages in bold red (the $fg array requires: autoload -U colors && colors)
zstyle ':completion:*:warnings' format "%B%{$fg[red]%}---- no match for: %{$fg[white]%}%d%b"
And the result:
By default, matches come in no specific order (or in the order they’ve been found).
It is possible to separate the matches in distinct related groups:
# let's use the tag name as group name
zstyle ':completion:*' group-name ''
Menu completion is when you press TAB several times and the completion changes to cycle through the available matches. By default in zsh, menu completion activates the second time you press the TAB key (the first one triggered the first completion).
Menu selection is when zsh displays below your prompt the list of possible selections arranged by categories.
A picture is worth a thousand words, so here is an example:
In this example I typed gzip -<TAB> then navigated with the arrows to --stdout.
To activate menu selection:
# activate menu selection
zstyle ':completion:*' menu select
With this, zsh corrects what you already have typed.
Approximate completion is controlled by the _approximate completer.
Approximate completion first looks for matches that differ by one error (configurable) from what you typed.
An error can be either a transposed character, a missing character or an additional character.
If some corrected entries are found they are added as matches, if none are found, the system continues with 2 errors and so on.
Of course, you want it to stop at some level (use the max-errors completion style).
# activate approximate completion, but only after regular completion (_complete)
zstyle ':completion:::::' completer _complete _approximate
# limit to 2 errors
zstyle ':completion:*:approximate:*' max-errors 2
# or to have a better heuristic, by allowing one error per 3 character typed
# zstyle ':completion:*:approximate:*' max-errors 'reply=( $(( ($#PREFIX+$#SUFFIX)/3 )) numeric )'
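To get a feel for what "one error, then two errors, and so on" means, here is a toy Python model of the idea. It is not how zsh implements it (zsh works on the word being completed, with its own matching engine); it simply counts errors with the classic optimal string alignment distance, which also treats a substituted character as one error, and widens the search one error at a time. Both function names are made up for this sketch.

```python
def edit_errors(typed: str, candidate: str) -> int:
    """Count errors between two words: transposed, missing, extra or
    substituted characters each cost 1 (optimal string alignment distance)."""
    rows, cols = len(typed) + 1, len(candidate) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if typed[i - 1] == candidate[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # extra character typed
                          d[i][j - 1] + 1,         # missing character
                          d[i - 1][j - 1] + cost)  # same or substituted
            if (i > 1 and j > 1 and typed[i - 1] == candidate[j - 2]
                    and typed[i - 2] == candidate[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # transposition
    return d[-1][-1]


def approximate_matches(typed, candidates, max_errors=2):
    """Try 1 error first, then 2, ... and stop at the first level with matches."""
    for allowed in range(1, max_errors + 1):
        found = [c for c in candidates if edit_errors(typed, c) <= allowed]
        if found:
            return found
    return []
```

For instance, approximate_matches("gerp", ["grep", "gzip", "egrep"]) only returns grep: it is one transposition away, so the search never widens to two errors.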
From X windows to hosts and users, almost everything, including shell variables, can be completed or menu-selected.
Here I typed “echo $PA<TAB>” and navigated to PATH:

Now, one thing that is extremely useful is completion of hosts:
# let's complete known hosts and hosts from ssh's known_hosts file
basehost="host1.example.com host2.example.com"
hosts=($((
( [ -r .ssh/known_hosts ] && awk '{print $1}' .ssh/known_hosts | tr , '\n');\
echo $basehost; ) | sort -u) )
zstyle ':completion:*' hosts $hosts
Yeah, I see, you’re wondering, aliases, pffuuuh, every shell on earth has aliases.
Yes, but does your average shell have global or suffix aliases?
Suffix aliases are aliases that match the end of the command line.
Ex:
% alias -s php=nano
Now, I just have to write:
% index.php
And zsh executes nano index.php. Clever isn’t it?
Global aliases are aliases that match anywhere in the command line.
Typical uses are:
% alias -g G='| grep'
% alias -g WC='| wc -l'
% alias -g TF='| tail -f'
% alias -g DN='/dev/null'
Now, you just have to issue:
% ps auxgww G firefox
to find all firefox processes. Still not convinced?
Some might argue that global aliases are risky because zsh can change your command line behind your back if you need to have let’s say a capital G in there.
Because of this I’m using the GRML way: I use a special key combination (see in an upcoming post about key binding) that auto-completes my aliases directly on the command line, without defining a global alias.
One of the best features, albeit one of the more difficult to master, is zsh extended globbing.
Globbing is the process of matching several files or paths with an expression. The best-known forms are * and ?, as in *.c to match every file ending with .c.
Zsh pushes the envelope much further, supporting the following:
Let’s say our current directory contains:
test.c test.h test.1 test.2 test.3 a/a.txt b/1/b.txt b/2/d.txt team.txt term.txt
The * pattern is the well-known wildcard. It matches any number of characters.
As in:
% echo *.c
test.c
The ? pattern matches exactly one character.
As in:
% echo test.?
test.1 test.2 test.3 test.c test.h
[...] is a character class. It matches any single character listed between the brackets.
The content can be either single characters:
[abc0123] will match either a,b,c,0,1,2,3
or range of characters:
[a-e] will match from a to e inclusive
or POSIX character classes
[[:space:]] will match only spaces (refer to zshexpn(1) for more information)
The character classes can be negated by a leading ^:
[^abcd] matches only characters other than a, b, c and d
If you need to list - or ], it should be the first character of the class. If you need both, list ] first.
Example:
% echo test.[ch]
test.c test.h
<x-y> matches any number in the range x to y, inclusive. x and/or y can be omitted to have an open-ended range.
<-> matches all numbers.
% echo test.<0-10>
test.1 test.2 test.3
% echo test.<2->
test.2 test.3
You know find(1), but did you know you can do almost everything you need with only zsh? The **/ pattern matches recursively through subdirectories, including the current one:
% echo **/*.txt
a/a.txt b/1/b.txt b/2/d.txt team.txt term.txt
(a|b) matches either a or b. a and b can be any globbing expressions, of course.
% echo test.(1|2)
test.1 test.2
% echo test.(c|<1-2>)
test.1 test.2 test.c
Negation with ^ comes in two forms:
leading ^: as in ^*.o which selects every file except those ending with .o
pattern1^pattern2: pattern1 will be matched as a prefix, then anything not matching pattern2 will be selected
% ls te*
team.txt term.txt test.1 test.2 test.3 test.c test.h
% echo te^st.*
team.txt term.txt
If you use the negation in the middle of a path section, the negation only applies to this path part:
% ls /usr/^bin/B*
/usr/lib/BuildFilter
/usr/sbin/BootCacheControl
Pattern exceptions (pattern1~pattern2) are a way to express: “match this pattern, but not this one”.
# let's match all files except .svn dirs
% print -l **/*~*/.svn/* | grep ".svn"
# and nothing prints out, so that worked
It is to be noted that * after the ~ matches a path, not a single directory like the regular wildcard.
zsh also lets you restrict matches on file metadata, not only on file names, with glob qualifiers.
The glob qualifier is placed in () after the expression:
# match a regular file with (.)
% print -l *(.)
We can restrict by:
% ls -al
total 0
drwxr-xr-x  8 brice wheel 272 2009-04-14 18:59 .
drwxrwxrwt 11 root  wheel 374 2009-04-14 20:04 ..
-rw-r--r--  1 root  wheel   0 2009-04-14 18:59 test.c
-rw-r--r--  1 brice wheel  10 2009-04-14 18:59 test.h
-rw-r--r--  1 brice wheel  20 2009-04-12 16:30 old
# match only files we own
% print -l *(U)
old
test.h
# match only files smaller than 2 bytes
% print -l *(L-2)
test.c
# match only files modified more than one day ago
# (m-N means "within the last N days", m+N means "more than N days ago")
% print -l *(m+1)
old
It is possible to combine and/or negate several qualifiers in the same expressions
# print executables I can read but not write
% echo *(*r^w)
And there’s more, you can change the sort order, add a trailing distinctive character (ala ls -F).
Refer to zshexpn(1) for more information.
In the next post, I’ll talk about some other interesting things:
But that’s all for the moment.
Newcomers and new switchers, if you want to get bootstrapped quickly, I recommend using the GRML configuration:
# IMPORTANT: please note that you might override an existing
# configuration file in the current working directory!
wget -O ~/.zshrc http://git.grml.org/f/grml-etc-core/etc/zsh/zshrc
13 Apr
Thanks to Days of Wonder, the company I work for, I’m proud to release as Free Software (GPL):
At Days of Wonder, we’ve been using MySQL for almost everything since the beginning of the company. We were initially monitoring all our infrastructure with mon and Cricket, including our MySQL servers. Nine months ago I migrated the monitoring infrastructure to OpenNMS, and at the same time we lost the Cricket MySQL monitoring (which was done with direct SQL SHOW STATUS LIKE commands).
I had to find another way, and since OpenNMS excels at SNMP, it was natural to monitor MySQL through SNMP. My browsing crossed this blog post. At about the same time I noticed that Baron Schwartz had released some very good MySQL Cacti Templates, so I decided to combine both projects and started working on mysql-snmp in my free time.
Fortunately, Days of Wonder has an IANA SNMP enterprises sub-number (20267; we use it for monitoring our game servers), so the MIB I wrote for this project is hosted in a natural place in the MIB hierarchy.
It’s a Net-SNMP perl subagent that connects to your MySQL server, and reports various statistics (from show status or show innodb status, or even replication) through SNMP.
If you followed this blog from the very start, you know we’re using OpenNMS to monitor the Days of Wonder infrastructure. So I included the various OpenNMS configuration bits needed to display nice and usable graphs, inspired by the excellent MySQL Cacti Templates.
Here are some examples:
The code is hosted in my github repository, and everything you should know is in the mysql-snmp page on my site.
If you use this software, please do not hesitate to contribute, and/or fix bugs
12 Apr
There is something I used to hate to do. And I think all admins also hate to do that.
It’s when you need to reboot a server into a rescue environment to perform an administration task (i.e. fixing unbootable servers, fixing crashed root filesystems, and so on).
The commonly found problems with rescue environments are:
OK, so a long time ago, I had a crashed server refusing to start on a reboot, and I had to choose a rescue environment for Linux servers, other than booting on the Debian CD once again.
That’s how I discovered PLD Linux rescue CD:
and GRML:
My heart still goes to PLD rescue (because it’s really light), but I must admit that GRML has a really good zsh configuration (I even used some of their configuration ideas for my day to day zsh).
On that subject, if you don’t use zsh, or don’t even know it, and still want to qualify as a knowledgeable Unix admin, then please try it (preferably with GRML, so that you’ll have an idea of what’s possible; they even have good documentation). Another solution is of course to buy this really good book: “From Bash to Z Shell: Conquering the Command Line”.
That makes me think I should do a whole blog post on zsh.
OK, so let’s go back to our sheep (yes, that’s a literally translated French expression, so I don’t expect anyone to grasp the funny part except the occasional French guys reading me).
So what’s so good about PLD Rescue:
So my basic setup is a PXE netboot environment in our remote colocation, plus a console server (a really damn good Opengear CM4116).
With this setup I can netboot remotely any server to a PLD Rescue image with serial support, and then rescue my servers without going to the datacenter (it’s not that it is far from home or the office, but at 3AM, you don’t usually want to go out).
If you have a preferred rescue setup, please share it!
18 Mar
When I wrote my previous post titled all about storedconfigs, I was pretty confident I had explained everything I could about storedconfigs… I was wrong, of course.
A couple of days ago, I was helping some USG admins who were facing an interesting issue. Interesting for me, but I don’t think they’d share my views on this, as their servers were melting down under the database load.
But first let me explain the issue.
The thing is that when a client checks in to get its configuration, the puppetmaster compiles its configuration to a digestible format and returns it. This operation is the process of transforming the AST built by parsing the manifests into what is called the catalog in Puppet. It is this catalog (which is in fact a graph of resources) that is later played by the client.
When the compilation process is over, and if storedconfigs is enabled on the master, the master connects to the RDBMS and retrieves all the resources, parameters, tags and facts. Those, if any, are compared to what has just been compiled, and if some resources differ (by value/content, or if there are some missing or new ones), they get written to the database.
Pretty straightforward, isn’t it?
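As a very rough model of that comparison step, here is a Python sketch. The real code lives in Puppet and goes through Rails’ Active Record; here a host’s resources are simply a dictionary mapping a resource reference to its parameters, the function name is invented, and treating resources that disappeared from the catalog as deletions is my assumption about what "missing" implies.

```python
def diff_resources(stored: dict, compiled: dict):
    """Compare what the database holds with the freshly compiled catalog.

    Returns (to_write, to_delete): new or changed resources to write,
    and references of resources no longer present in the catalog.
    """
    to_write = {
        ref: params
        for ref, params in compiled.items()
        if stored.get(ref) != params   # new resource, or parameters differ
    }
    to_delete = [ref for ref in stored if ref not in compiled]
    return to_write, to_delete


if __name__ == "__main__":
    stored = {"File[/etc/motd]": {"content": "hello"}}
    compiled = {
        "File[/etc/motd]": {"content": "hello"},
        "Package[nginx]": {"ensure": "present"},
    }
    print(diff_resources(stored, compiled))
```

With an empty stored dictionary, to_write contains the whole catalog, which is exactly the expensive first-run case.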
As you can see, this process is synchronous and while the master processes the storedconfigs operations, it doesn’t serve anybody else.
Now, imagine you have a large site (ie hundreds of puppetd clients), and you decide to turn on storedconfigs. All the clients checking in will see their current configuration stored in the database.
Unfortunately, on the first storedconfigs run for a client, the database is empty, so the puppetmaster has to send all the information to the RDBMS, which in turn has to write it to disk. Of course, on subsequent runs only what has been modified needs to reach the RDBMS, which is much less than the first time (provided you are running 0.24.8 or have applied my patch).
But if your RDBMS is not correctly setup or not sized for so much concurrent write load, the storedconfigs process will take time. During this time this master is pinned to the database and can’t serve clients. So the immediate effect is that new clients checking in will see timeouts, load will rise, and so on.
If you are in the aforementioned scenario you must be sure your RDBMS hardware is properly sized for this peak load, and that your database is properly tuned.
I’ll soon give some generic MySQL tuning advice to let MySQL handle the load, but remember it is generic, so YMMV.
What people usually forget is that disks (i.e. those with rotating platters, not SSDs) have a maximum number of I/O operations per second. For professional high-end disks, this value is about 250 IOP/s.
Now, to simplify, let’s say your average puppet client has 500 resources with an average of 4 parameters each. That means the master will have to perform at least 500 * 4 + 500 = 2500 writes to the database (that’s naive since there are indices to modify, and transactions can be grouped, etc.. but you see the point).
Add to this the tags, hmm, let’s say an average of 4 tags per resource, and we have 500 * 4 + 500 + 500 * 4 = 4500 writes to perform to store the configuration of a given host.
Now remember our 250 IOP/s: how many seconds does the disk need to perform 4500 writes?
The answer is 18s!! Which is a high value. During this time you can’t do anything else. Now add concurrency to the mix, and imagine what that means.
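If you want to redo this back-of-the-envelope estimate with your own numbers, it fits in a few lines of Python. The defaults are the figures used above; the function name is made up, and the model is just as naive as the text (one write per resource, parameter and tag, no indices, no grouped transactions).

```python
def storedconfigs_write_estimate(resources=500, params_per_resource=4,
                                 tags_per_resource=4, iops=250):
    """Return (number of writes, seconds needed at the given IOP/s)."""
    writes = (resources                          # one write per resource
              + resources * params_per_resource  # plus its parameters
              + resources * tags_per_resource)   # plus its tags
    return writes, writes / iops


if __name__ == "__main__":
    writes, seconds = storedconfigs_write_estimate()
    print(writes, "writes,", seconds, "seconds")
```

With the default values this gives 4500 writes and 18 seconds for a single host.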
Of course this supposes we have to wait for the disk to have finished (i.e. synchronous writing), but in fact that’s pretty much how RDBMSs work if you really want to trust your data.
So the result is that if you want a fast RDBMS you must be ready to pay for an expensive I/O subsystem.
That’s certainly the most important part of your server.
You need:
If you don’t have this, do not even think about turning on storedconfigs for a large site.
Of course other things matter. If the database can fit in RAM (the best if you don’t want to be I/O bound), then you obviously need RAM, preferably ECC Registered RAM. Use 64-bit hardware with a 64-bit OS.
Then you need some CPU. Nowadays they’re cheap, but beware of InnoDB scaling issues on multi-core/multi-CPU systems (see below).
Here is a checklist on how to tune MySQL for a mostly write load:
For concurrency, stability and durability reasons InnoDB is mandatory. MyISAM is at best usable for READ workload but suffers concurrency issues so it is a no-no for our topic
The default InnoDB settings are tailored to very small 10 years old servers…
Things to look at:
The fine people at Percona or Ourdelta produce some patched builds of MySQL that remove some of the MySQL InnoDB scalability issues. This matters more for high-concurrency workloads on multi-core/multi-CPU systems.
It can also be good to run MySQL with Google’s perftools TCMalloc. TCMalloc is a memory allocator which scales way better than the Glibc one.
The immediate and most straightforward idea is to limit the number of clients that can check in at the same time. This can be done by disabling puppetd on each client (puppetd --disable), blocking network access, or any other creative means…
When all the active hosts have checked in, you can then enable the other ones. This can be done hundreds of hosts at a time, until all hosts have a configuration stored.
Another solution is to direct some hosts to a special puppetmaster with storedconfigs on (the regular one still has storedconfigs disabled), by playing with DNS or by configuration, whatever is simplest in your environment. Once those hosts have their config stored, move them back to their regular puppetmaster and move newer hosts there.
Since that’s completely manual, it might be impractical for you, but that’s the simplest method.
As long as your manifests are only slightly changing, subsequent runs will see only really limited database activity (if you run a puppetmaster >= 0.24.8). That means the tuning we did earlier can be undone (for instance, you can lower innodb_log_file_size and adjust innodb_buffer_pool_size to the size of the hot set).
But still storedconfigs can double your compilation time. If you are already at the limit compared to the number of hosts, you might see some client timeouts.
Today Luke announced on the puppet-dev list that they were working on a queuing system to defer storedconfigs and smooth out the load by spreading it on a longer time. But still, tuning the database is important.
The idea is to offload the storedconfigs to another daemon which is hooked behind a queuing system. After the compilation the puppetmaster queues the catalog, where it will be unqueued by the puppet queue daemon which will in turn execute the storedconfigs process.
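The shape of that design can be sketched with Python’s standard queue and a worker thread. This is only an illustration of the queuing idea as described on the list, not Puppet’s implementation: the catalog content, the stored dictionary standing in for the database, and all the function names are placeholders I made up.

```python
import queue
import threading

catalog_queue = queue.Queue()
stored = {}  # stands in for the storedconfigs database


def store_catalog(host, catalog):
    """Placeholder for the slow storedconfigs database work."""
    stored[host] = catalog


def queue_daemon():
    """Dequeue catalogs one by one and store them, away from the master."""
    while True:
        item = catalog_queue.get()
        if item is None:  # sentinel: stop the daemon
            break
        host, catalog = item
        store_catalog(host, catalog)


def compile_and_serve(host):
    """The master compiles, queues the catalog, and answers immediately."""
    catalog = {"File[/etc/motd]": {"ensure": "present"}}  # placeholder catalog
    catalog_queue.put((host, catalog))
    return catalog


if __name__ == "__main__":
    worker = threading.Thread(target=queue_daemon)
    worker.start()
    compile_and_serve("web1.example.com")
    catalog_queue.put(None)
    worker.join()
    print(sorted(stored))
```

The point of the design is visible in compile_and_serve: it returns as soon as the catalog is queued, so the client no longer waits for the database.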
I don’t know the ETA for this interesting feature, but meanwhile I hope the tips I provided here can be of some help.
Stay tuned for more puppet stories!
8 Mar
For a long time, people (including me) have complained that storeconfigs is a real resource hog. Unfortunately for us, this option is also really cool and useful.
Storeconfigs is a puppetmasterd option that stores each node’s actual configuration in a database. It does this by comparing the result of the last compilation against what is actually in the database, resource by resource, then parameter by parameter, and so on.
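As a rough illustration of that kind of comparison (and not the actual Puppet code), here is a Ruby sketch that diffs two catalogs represented as plain hashes. The diff_catalog name and the data layout are assumptions made for the example.

```ruby
# Hypothetical sketch: catalogs are { "Type[title]" => { param => value } }.
# Returns which resources to add, remove, or update in the database.
def diff_catalog(stored, compiled)
  changes = {
    :add    => compiled.keys - stored.keys,
    :remove => stored.keys - compiled.keys,
    :update => {},
  }
  # For resources present on both sides, compare parameter by parameter.
  (stored.keys & compiled.keys).each do |ref|
    old_params, new_params = stored[ref], compiled[ref]
    changed = (old_params.keys | new_params.keys).select do |name|
      old_params[name] != new_params[name]
    end
    changes[:update][ref] = changed unless changed.empty?
  end
  changes
end

stored = {
  "File[/etc/motd]" => { "mode" => "0644", "owner" => "root" },
  "Package[vim]"    => { "ensure" => "present" },
}
compiled = {
  "File[/etc/motd]" => { "mode" => "0600", "owner" => "root" },
  "Service[ntp]"    => { "ensure" => "running" },
}

changes = diff_catalog(stored, compiled)
puts changes.inspect
```

Even in this toy form you can see why the feature is expensive: every resource and every parameter of a node has to be looked at on each compilation.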
The actual implementation is based on Rails’ Active Record, which is a great way to abstract the gory details of the database, and prototype code easily and quickly (but has a few shortcomings).
The immediate use of storeconfigs is exported resources. Exported resources are resources which are prefixed by @@. Those resources are marked specially so that they can be collected on several other nodes.
A small, completely dumb example speaks for itself:
class exporter {
    @@file {
        "/var/lib/puppet/nodes/$fqdn": content => "$ipaddress\n", tag => "ip"
    }
}

node "export1.daysofwonder.com" {
    include exporter
}

node "export2.daysofwonder.com" {
    include exporter
}

node "collector.daysofwonder.com" {
    File <<| tag == "ip" |>>
}
What does this example do?
That’s simple: each exporter node exports a file in /var/lib/puppet/nodes whose name is the node name and whose content is its primary IP address.
What is interesting is that the node “collector.daysofwonder.com” collects all files tagged with “ip”, that is, all the exported files. In the end, after export1, export2 and collector have run a compilation, the collector host will have the /var/lib/puppet/nodes/export1.daysofwonder.com and /var/lib/puppet/nodes/export2.daysofwonder.com files with their respective content.
Got it?
That’s the perfect tool for instance to automatically:
There is still another use: since the whole configuration of your nodes is in an RDBMS, you can use it to perform some data-mining on your hosts’ configuration. That’s what puppetshow does.
The issue with storeconfigs in its current incarnation (i.e. 0.24.7) is that it is slow (it usually doubles the compilation time) and imposes a higher load on the puppetmaster and the database engine.
For large installations it might not be possible to run with this feature on. There were also some reports of high memory usage or leaks with this feature on (see my recommendation about this in my puppetmaster memory leak post).
Here are my usual puppet and storeconfigs recommendations:
I think the last point deserves a little bit more explanation:
I had the following schematized pattern in some of my manifests, which I took from David Schmitt’s excellent modules:
in one class:
if defined(File["/var/lib/puppet/modules/djbdns.d/"]) {
    warning("already defined")
} else {
    file {
        "/var/lib/puppet/modules/djbdns.d/": ...
    }
}
and in another class the exact same code:
if defined(File["/var/lib/puppet/modules/djbdns.d/"]) {
    warning("already defined")
} else {
    file {
        "/var/lib/puppet/modules/djbdns.d/": ...
    }
}
What happens is that the evaluation order could change from run to run: one time the defined resource would be the one from the first class, and another time the one from the second class. That meant the storeconfigs code had to remove the resources from the database and re-create them again. Clearly not the best way to reduce the database workload.
For 0.24.8, I contributed a partial rewrite of some parts of the storeconfigs feature to increase its performance.
My analysis is that what was slow in the feature is threefold:
I fixed the first two points by querying the database directly to fetch the parameters and tags, keeping them in hashes instead of objects. This saves a large number of database requests and at the same time prevents a large number of ruby objects from being created (it should even save some memory).
The last point was fixed by imposing a strict order (although not completely correct, but still better than how it was) in the way the tags are assigned to resources.
Both patches have been merged for 0.24.8, and some people reported some performance improvements.
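To give a feel for the "hashes instead of objects" approach, here is a small Ruby sketch. It is not the actual patch: the row layout, the sample data and the group_params helper are all made up for illustration, and the rows stand in for what one bulk query could return.

```ruby
# Made-up rows, shaped as a single bulk query might return them:
# [resource_id, parameter_name, value].
rows = [
  [1, "mode",   "0644"],
  [1, "owner",  "root"],
  [2, "ensure", "present"],
]

# One pass builds { resource_id => { name => value } } out of plain hashes,
# instead of instantiating one model object per row.
def group_params(rows)
  params = Hash.new { |hash, id| hash[id] = {} }
  rows.each { |id, name, value| params[id][name] = value }
  params
end

params = group_params(rows)
puts params[1]["mode"]   # → 0644
```

The lookup of any parameter of any resource is then a couple of hash accesses, with no further database round trip.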
On the Days of Wonder infrastructure I found that for a node with 562 resources, on a tuned MySQL database:
That’s a nice improvement, isn’t it?
Luke and I discussed this, and it was also discussed on the puppet-dev list a few times. I think that an RDBMS might not be the right storage choice for this feature, because clearly there is almost no random keyed access to the individual parameters of a resource (so having a table dedicated to parameters is of almost no use).
I know Luke’s plan is to abstract the storeconfigs feature from the current implementation (certainly through the indirector), so that we can use different storeconfigs engines.
I also know that someone is working on a promising CouchDB implementation. I myself can see a memcached implementation (which I’d really like to start working on). Maybe even the filesystem would be enough.
Of course, I’m open to any other improvements or storage engine ideas
21 Feb
This seems to have been a recurring topic over the last 3 or 4 days, with a few #puppet, redmine or puppet-user requests asking why puppetd is consuming so much CPU and/or memory.
While I don’t have a definitive answer about why it could happen (hey all software components have bugs), I think it is important to at least know how to see what happens. I even include some common issues I myself have observed.
I mean, know what puppetd is doing. That’s easy: disable puppetd on the host where you have an issue, and try to run it manually in debug mode. I’m really astonished that almost nobody tries a debug run before complaining that something doesn’t work.
% puppetd --disable
% puppetd --test --debug --trace
... full output on the console ...
At the same time, monitor the CPU usage and look at the debug entries when most of the CPU is consumed.
If nothing is printed at this same moment, and it still uses CPU, CTRL-C the process, maybe it will print a useful stack trace that will help you (or us) understand what happens.
With this you will certainly catch things you didn’t intend (see below computing checksums when it is not necessary).
I already mentioned this tip in my puppetmaster memory leak post a month ago. You can’t imagine how much useful information you can get with this tool.
As explained in the original article, install the ruby file into ~/.gdb/ruby, then copy the following into your ~/.gdbinit:
define session-ruby
source ~/.gdb/ruby
end
Here I’m going to show how to do this with a puppetmasterd, but it is exactly the same thing with puppetd.
Basically, the idea is to attach gdb to the puppet process, halt it and look at the current stack trace:
% ps auxgww | grep puppetmasterd
puppet 28602 2.0 8.9 275508 184492 pts/3 Sl+ Feb19 65:25 ruby /usr/bin/puppetmasterd --debug
% gdb /usr/bin/ruby
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
...
(gdb) session-ruby
(gdb) attach 28602
Attaching to program: /usr/bin/ruby, process 28602
...
Now our gdb is attached to our ruby interpreter.
Let’s see where we stopped:
(gdb) rb_backtrace
$3 = 34
Note: the output is displayed by default on the stdout/stderr of the attached process, so in our case my puppetmasterd. Going to the terminal where it runs (actually the screen):
...
from /usr/lib/ruby/1.8/webrick/server.rb:91:in `select'
from /usr/lib/ruby/1.8/webrick/server.rb:91:in `start'
from /usr/lib/ruby/1.8/webrick/server.rb:23:in `start'
from /usr/lib/ruby/1.8/webrick/server.rb:82:in `start'
from /usr/lib/ruby/1.8/puppet.rb:293:in `start'
from /usr/lib/ruby/1.8/puppet.rb:144:in `newthread'
from /usr/lib/ruby/1.8/puppet.rb:143:in `initialize'
from /usr/lib/ruby/1.8/puppet.rb:143:in `new'
from /usr/lib/ruby/1.8/puppet.rb:143:in `newthread'
from /usr/lib/ruby/1.8/puppet.rb:291:in `start'
from /usr/lib/ruby/1.8/puppet.rb:290:in `each'
from /usr/lib/ruby/1.8/puppet.rb:290:in `start'
from /usr/sbin/puppetmasterd:285
It works!
It is now easy to see what puppetd is doing:
Examining the stack traces should give you (or us) hints about what your puppetd is doing at this moment.
You might have encountered a bug. Please report it in Puppet redmine, and enclose all the useful information you gathered by following the two points above.
That’s the usual suspect, and one I encountered myself.
Let’s say you have something like this in your manifest:
File { checksum => md5 }
...
file {
    "/path/to/so/many/files":
        owner => myself, mode => 0644, recurse => true
}
What does that mean?
You’re telling puppet that every file resource should compute a checksum, and you have a recursive file operation managing owner and mode. What puppetd will do is traverse the whole ‘/path/to/so/many/files’ hierarchy and happily manage the files, changing owner and mode when needed.
What you might have forgotten, is that you requested checksum to be MD5, so puppetd instead of only doing a bunch of stat(3) on your files will also compute MD5 sums of their content.
If you have tons of files in this hierarchy this can take quite some time. Since checksums are cached, it can also take quite some memory.
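The difference between the two kinds of work can be sketched in Ruby. This is only an illustration of the cost, not how puppetd is implemented: metadata_of and md5_of are hypothetical helper names, and the temporary file is just there to make the example runnable.

```ruby
require 'digest/md5'
require 'tempfile'

# Metadata-only check: a single stat call, no file content is read.
def metadata_of(path)
  st = File.stat(path)
  { :uid => st.uid, :mode => st.mode & 07777 }
end

# Content check: the whole file must be read to compute its MD5 sum.
def md5_of(path)
  Digest::MD5.file(path).hexdigest
end

Tempfile.create("puppet-demo") do |f|
  f.write("hello\n")
  f.flush
  puts metadata_of(f.path).inspect
  puts md5_of(f.path)   # → b1946ac92492d2347c6235b4d2611184
end
```

Managing owner and mode only needs the first function. Asking for MD5 checksums adds the second one for every file in the tree, which means reading every byte of every file.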
How to solve this issue:
File { checksum => md5 }
...
file {
    "/path/to/so/many/files":
        owner => myself, mode => 0644, recurse => true, checksum => undef
}
Sometimes it isn’t possible to solve this issue: if your file {} resource retrieves files (i.e. there is a source parameter), you need checksums to manage the files. In this case, just bite the bullet, change the checksum to mtime, limit recursion or wait for my fix of Puppet bug #1469.
Actually it is in your interest that puppetd is taking 100% of CPU while applying the configuration the puppetmaster has given. That just means it’ll do its job faster than if it was consuming 10% of CPU
I mean, puppetd has a fixed amount of things to perform. Some are CPU bound, some are I/O bound (actually most are I/O bound), so it is perfectly normal that it takes wall clock time and consumes resources to play your manifests.
What is not normal is consuming CPU or memory between configuration runs. But you already know how to diagnose such issues if you read the start of this post.
Not all resource consumption is bad.
We’re all dreaming of a faster puppetd.
On this subject, I think it should be possible (provided ruby supports native threads, which may be a task for JRuby) to apply the catalog in a multi-threaded way. I never really thought about this (I mean technically), but I don’t see why it couldn’t be possible. That would allow puppetd to do several I/O bound operations in parallel (like installing packages and managing files at the same time).
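Just to illustrate the shape of the idea, here is a toy Ruby sketch that runs two independent, I/O-bound "resources" in separate threads. Everything in it is hypothetical: the sleep calls stand in for real work, and it completely ignores the dependency ordering between resources that a real catalog would have to honor.

```ruby
# Hypothetical tasks: each lambda simulates an I/O-bound operation.
tasks = {
  "Package[vim]"    => lambda { sleep 0.2; :installed },
  "File[/etc/motd]" => lambda { sleep 0.2; :written },
}

# Start one thread per task, then wait for each and collect its result.
threads = tasks.map { |ref, work| Thread.new { [ref, work.call] } }
results = Hash[threads.map(&:value)]

puts results.inspect
```

Because both tasks spend their time waiting, the two waits overlap instead of adding up, which is exactly the gain one would hope for with packages and files.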