Journey in a software world…
28 Jan
As every reader of this blog certainly knows, I’m a big fan of Puppet. I use it in production on the Days of Wonder servers, to the point that I used to regularly contribute bug fixes and new features (not that I stopped, it’s just that my spare time is a scarce resource nowadays).
Still, I think there are some issues in terms of scalability or resource consumption (CPU or memory), for which we can find some workarounds or even fixes. Those issues are not a symptom of bad programming or bad design. No, most of them come either from ruby itself or from some random library issues.
Let’s review the things I have been thinking about lately.
This is by far one of the most seen issues, both on the client side and on the server side. I’ve mainly seen this problem on the client side, to the point that most people recommend running puppetd from cron instead of as a long-lived process.
It all boils down to the ruby allocator (at least in the MRI 1.8.x version). This is the part of the ruby interpreter that deals with memory allocations. Like in many dynamic languages, the allocator manages a memory pool called a heap. And like in some other languages (among them Java), this heap can never shrink and always grows when more memory is needed. It is done this way because it is simpler and way faster. Usually an application ends up using its nominal amount of memory, after which no more memory has to be allocated by the kernel to the process, which makes for faster applications.
The problem is that if the application transiently needs a large amount of memory that will be trashed a couple of milliseconds later, the process will pay this penalty for its whole life, even though, say, 80% of the memory used by the process is free but never returned to the OS.
And it’s even worse. When the ruby interpreter grows the heap, instead of allocating byte by byte (which would be really slow) it allocates by chunks. The whole question is: what is the proper size of a chunk?
In the default implementation of MRI 1.8.x, a chunk is the size of the previous heap times 1.8. That means at worst a ruby process might end up allocating 1.8 times more than what it really needs at a given time. (This is a gross simplification, read the code if you want to know more).
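To get a feel for what this growth rule implies, here is a toy model in Ruby. It is only a sketch of the simplification described above, not the real MRI allocator: the initial heap size and the "slot" unit are assumptions made up for the illustration.

```ruby
# Toy model of a grow-only heap; NOT the real MRI 1.8.x allocator.
# Sizes are in arbitrary "slots"; the initial size is a made-up value.
def heap_size_after(peak_demand, initial_heap = 10_000)
  heap = initial_heap
  chunk = initial_heap
  while heap < peak_demand
    chunk = chunk * 18 / 10 # each new chunk is 1.8 times the previous one
    heap += chunk           # the heap only ever grows, it never shrinks
  end
  heap
end

peak = 1_000_000
heap = heap_size_after(peak)
puts "peak demand: #{peak}, heap reserved: #{heap}"
```

Even once the transient peak is gone, the process keeps the whole reserved heap, which is the penalty discussed above.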
So how does it apply to puppetd?
It’s easy: puppetd uses memory for two things (besides maintaining some core data to be able to run):
Hopefully, nobody distributes large files with Puppet
If you’re tempted to do so, see below…
But again there’s more, as Peter Meier (known as duritong in the community) discovered a couple of months ago: when puppetd gets its catalog (which by the way is transmitted in json nowadays), it also stores it in a local cache to be able to run if it can’t contact the master for a subsequent run. This is done by unserializing the catalog from json to live ruby objects, and then serializing the latter to YAML. Besides the obvious time lost doing that on a large catalog, YAML is a real memory hog. Peter’s experience showed that about 200MB of the live memory his puppetd process was using came from this final serialization!
So I had the following idea: why not store the serialized version of the catalog (the json one) since we already have it in a serialized form when we receive it from the master (it’s a little bit more complex than that of course). This way no need to serialize it again in YAML. This is what ticket #2892 is all about. Luke is committed to have this enhancement in Rowlf, so there’s good hope!
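The difference between the two approaches can be sketched in a few lines of Ruby. This is not Puppet's code: the method names, the cache file name and the tiny catalog are all hypothetical, and a real catalog is far larger.

```ruby
require 'json'
require 'yaml'
require 'tmpdir'

# Costly path described above: JSON text -> live Ruby objects -> YAML text.
def cache_as_yaml(raw_json, path)
  File.write(path, JSON.parse(raw_json).to_yaml)
end

# Cheaper path: keep the bytes exactly as they came from the master.
def cache_as_received(raw_json, path)
  File.write(path, raw_json)
end

# Hypothetical stand-in for a catalog received from the master.
raw = '{"name":"node1","resources":[{"type":"File","title":"/tmp/a"}]}'

Dir.mktmpdir do |dir|
  cache = File.join(dir, 'catalog.json')
  cache_as_received(raw, cache)
  # The cached copy is only parsed when it is actually needed.
  puts JSON.parse(File.read(cache))['name']
end
```

The second method never builds the intermediate object graph at cache time, which is where the extra memory was going.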
So what can we do to help puppet not consume that much memory?
In theory we could play on several factors:
Note that the same issues apply to the master too (especially for the file serving part). But it’s usually easier to run a different ruby interpreter (like REE) on the master than on all your clients.
Streaming HTTP requests is promising but unfortunately would require large changes to how Puppet deals with HTTP. Maybe it can be done only for file content requests… This is something I’ll definitely explore.
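The memory argument for streaming can be illustrated without any HTTP at all. The sketch below is not how Puppet serves files; it only shows, under the assumption of an arbitrary 8 KB buffer, how a file can be handed over piece by piece so that only one chunk is alive at a time.

```ruby
require 'tempfile'

CHUNK_SIZE = 8192 # arbitrary buffer size chosen for this sketch

# Yields the file piece by piece instead of reading it whole into memory.
def each_chunk(path, chunk_size = CHUNK_SIZE)
  File.open(path, 'rb') do |file|
    while (chunk = file.read(chunk_size))
      yield chunk
    end
  end
end

# Usage: count the bytes of a temporary file without loading it entirely.
Tempfile.create('sourced-file') do |src|
  src.write('x' * 20_000)
  src.flush
  total = 0
  each_chunk(src.path) { |chunk| total += chunk.bytesize }
  puts total
end
```

A streaming HTTP layer would write each chunk to the socket instead of counting it, so the peak memory stays near the buffer size whatever the file size.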
This file serving thing made me think about the following, which I have already discussed several times with Peter…
One of the missions of the puppetmaster is to serve sourced files to its clients. We saw in the previous section that to do so the master has to read the file into memory. That’s one reason it is recommended to use a dedicated puppetmaster server acting as a pure fileserver.
But there’s a better way, provided you run puppet behind nginx or apache. Those two proxies are also static file servers: why not leverage what they do best to serve the sourced files and thus offload our puppetmaster?
This has some advantages:
In fact it was impossible in 0.24.x, but now that file content serving is RESTful it becomes trivial.
Of course offloading gives its best if your clients require lots of sourced files that change often, or if you provision lots of new hosts at the same time, because we’re offloading only content, not file metadata. File content is served only if the client doesn’t have the file or if the file checksum on the client is different.
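The decision described above (fetch content only when the file is missing or its checksum differs) can be sketched as follows. This is an illustration, not Puppet's implementation: the method name is hypothetical and the `{md5}<hex>` notation is assumed for the checksum string.

```ruby
require 'digest/md5'
require 'tempfile'

# Returns true when the content has to be downloaded: either the local file
# is missing or its MD5 differs from the checksum found in the metadata.
# The "{md5}<hex>" format is an assumption of this sketch.
def content_needed?(local_path, metadata_checksum)
  return true unless File.exist?(local_path)
  local = "{md5}#{Digest::MD5.file(local_path).hexdigest}"
  local != metadata_checksum
end

Tempfile.create('motd') do |f|
  f.write('hello')
  f.flush
  same = "{md5}#{Digest::MD5.hexdigest('hello')}"
  puts content_needed?(f.path, same)            # unchanged file
  puts content_needed?(f.path, '{md5}deadbeef') # checksum differs
end
```

Only the calls that return true would result in a file content request, which is the traffic that gets offloaded to the static file server.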
Imagine we have a standard manifest layout with:
Here is what the nginx configuration would be for such a scheme:
server {
    listen 8140;
    ssl on;
    ssl_session_timeout 5m;
    ssl_certificate /var/lib/puppet/ssl/certs/master.pem;
    ssl_certificate_key /var/lib/puppet/ssl/private_keys/master.pem;
    ssl_client_certificate /var/lib/puppet/ssl/ca/ca_crt.pem;
    ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem;
    ssl_verify_client optional;
    root /etc/puppet;
    # those locations are for the "production" environment
    # update according to your configuration
    # serve static file for the [files] mountpoint
    location /production/file_content/files/ {
        # it is advisable to have some access rules here
        allow 172.16.0.0/16;
        deny all;
        # make sure we serve everything
        # as raw
        types { }
        default_type application/x-raw;
        alias /etc/puppet/files/;
    }
    # serve modules files sections
    location ~ /production/file_content/[^/]+/files/ {
        # it is advisable to have some access rules here
        allow 172.16.0.0/16;
        deny all;
        # make sure we serve everything
        # as raw
        types { }
        default_type application/x-raw;
        root /etc/puppet/modules;
        # rewrite /production/file_content/module/files/file.txt
        # to /module/file.txt
        rewrite ^/production/file_content/([^/]+)/files/(.+)$ $1/$2 break;
    }
    # ask the puppetmaster for everything else
    location / {
        proxy_pass http://puppet-production;
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Client-Verify $ssl_client_verify;
        proxy_set_header X-SSL-Subject $ssl_client_s_dn;
        proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
        proxy_buffer_size 16k;
        proxy_buffers 8 32k;
        proxy_busy_buffers_size 64k;
        proxy_temp_file_write_size 64k;
        proxy_read_timeout 65;
    }
}
EDIT: the above configuration was missing the only content-type that nginx can return for Puppet to be able to actually receive the file content (that is raw).
I leave the apache configuration as an exercise for the reader.
It would also be possible to write some ruby/sh/whatever to generate the nginx configuration from the puppet fileserver.conf file.
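As a starting point, such a generator could look like the Ruby sketch below. It is deliberately naive and rests on assumptions: it only understands `[mount]` sections and their `path` lines, it ignores the `allow`/`deny` rules of fileserver.conf (you would still want access rules in nginx), and the sample content is hypothetical.

```ruby
# Parses "[mount]" sections and their "path" lines from fileserver.conf text.
def parse_mounts(conf_text)
  mounts = {}
  current = nil
  conf_text.each_line do |line|
    line = line.sub(/#.*/, '').strip # drop comments and surrounding blanks
    next if line.empty?
    if (m = line.match(/\A\[(\w+)\]\z/))
      current = m[1]
    elsif current && (m = line.match(/\Apath\s+(\S+)\z/))
      mounts[current] = m[1]
    end
  end
  mounts
end

# Emits one static location per mount, modeled on the [files] location above.
def nginx_locations(mounts, environment = 'production')
  mounts.map do |name, path|
    <<~CONF
      location /#{environment}/file_content/#{name}/ {
          types { }
          default_type application/x-raw;
          alias #{path.chomp('/')}/;
      }
    CONF
  end.join("\n")
end

sample = <<~FS
  # hypothetical fileserver.conf
  [files]
    path /etc/puppet/files
    allow *.example.com
FS

puts nginx_locations(parse_mounts(sample))
```

The output is meant to be pasted (or included) inside the server block, before the catch-all location that proxies to the puppetmaster.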
And that’s all folks, stay tuned for more Puppet (or even different) content.
10 Jan
I’m really proud to announce the release of the version 1.0 of mysql-snmp.
mysql-snmp is a mix between the excellent MySQL Cacti Templates and a Net-SNMP agent. The idea is that combining the power of the MySQL Cacti Templates with any SNMP based monitoring would unleash a powerful mysql monitoring system. Of course this project’s favorite monitoring system is OpenNMS.
mysql-snmp is shipped with the necessary OpenNMS configuration files, but any other SNMP monitoring software can be used (provided you configure it).
To get there, you need to run an SNMP agent on each MySQL server, along with mysql-snmp. Then OpenNMS (or any SNMP monitoring software) will contact it and fetch the various values.
mysql-snmp exposes a lot of useful values including but not limited to:
Here are some graph examples produced with OpenNMS 1.6.5 and mysql-snmp 1.0 on one of the Days of Wonder MySQL servers (running a MySQL 5.0 Percona build):
mysql-snmp is available in my github repository. The repository contains a spec file to build an RPM and what is needed to build a Debian package. Refer to the README or the mysql-snmp page for more information.
Thanks to github, it is possible to download the tarball instead of using Git:
This lists all new features/options from the initial version v0.6:
Please use the Github issue system to report any issues.
There is a little issue here. mysql-snmp uses Net-Snmp. Not all versions of Net-Snmp are supported, as some older versions have bugs in dealing with Counter64. Version 5.4.2.1 with this patch is known to work fine.
Also note that this project uses some Counter64, so make sure you configure your SNMP monitoring software to use SNMP v2c or v3 (SNMP v1 doesn’t support 64-bit values).
I wish everybody a happy new year. Consider this new version as my Christmas present to the community.
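Why the counter width matters can be shown with plain arithmetic. The following Ruby sketch has nothing to do with the mysql-snmp code itself; the method names and the sample value are made up for the illustration.

```ruby
# Value a counter reports when it only has `bits` bits to store `true_value`.
def reported(true_value, bits)
  true_value % (2**bits)
end

# Delta between two samples, assuming at most one wrap between them.
def delta(previous, current, bits)
  (current - previous) % (2**bits)
end

bytes_sent = 5_000_000_000          # hypothetical counter value above 2**32
puts reported(bytes_sent, 32)       # a 32-bit counter has already wrapped
puts reported(bytes_sent, 64)       # a Counter64 still holds the real value
puts delta(4_294_967_290, 10, 32)   # a delta computed across one 32-bit wrap
```

On a busy MySQL server a 32-bit counter can wrap more than once between two polls, in which case the delta is silently wrong, hence the 64-bit counters.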
19 Dec
Yes, I know… I released v0.7 less than a month ago. But this release was crippled by a crash that could happen at start or reload.
Bonus in this new version, brought to you by Tizoc:
If you wonder what JSONP is (as I did when I got the merge request), you can check the original blog post that led to it.
To activate JSONP you need:
This version has been tested with 0.7.64 and 0.8.30.
Easy, download the tarball from the nginx upload progress module github repository download section.
If you want to report a bug, please use the Github issue section.
22 Nov
I’m proud to announce the release of Nginx Upload Progress module v0.7
This version sees a crash fix and various new features implemented by Valery Kholodkov (the author of the famous Nginx Upload Module).
This version has been tested with Nginx 0.7.64.
What is cool is that now, with only one directive (upload_progress_json_output), the responses are sent in pure JSON and not in a javascript mix as before.
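Pure JSON makes the responses easy to consume from something other than a browser. Here is a small Ruby sketch of a consumer; the field names ("state", "received", "size") and the sample body are assumptions for the illustration, so check the README for the exact output.

```ruby
require 'json'

# Turns a progress response into a percentage, or nil when it is unknown.
# The field names used here are assumptions; check the module README.
def progress_percent(body)
  info = JSON.parse(body)
  case info['state']
  when 'done' then 100.0
  when 'uploading'
    size = info['size'].to_f
    size.zero? ? nil : (info['received'].to_f / size * 100).round(1)
  end
end

puts progress_percent('{"state":"uploading","received":512,"size":2048}')
```

Any state other than "uploading" or "done" yields nil, which a caller can display as "not started" or "unknown".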
Another cool feature is the possibility to use templates to send progress information. That means with a simple configuration change nginx can now return XML:
upload_progress_content_type 'text/xml';
upload_progress_template starting '<upload><state>starting</state></upload>';
upload_progress_template uploading '<upload><state>uploading</state><size>$uploadprogress_length</size><uploaded>$uploadprogress_received</uploaded></upload>';
upload_progress_template done '<upload><state>done</state></upload>';
upload_progress_template error '<upload><state>error</state><code>$uploadprogress_status</code></upload>';
Refer to the README in the distribution for more information.
Easy, download the tarball from the nginx upload progress module github repository download section.
Normally you have to use your own client code to display the progress bar and contact nginx to get the progress information.
But some nice people have created various javascript libraries doing this for you:
Happy uploads!
5 Oct
I attended Puppet Camp 2009 in San Francisco last week. It was a wonderful event and I could meet a lot of really smart developers and sysadmins from a lot of different countries (US, Australia, Europe and even Singapore).
The format of the event (an unconference with some scheduled talks in the morning) was really great. Everybody got a chance to enter or propose a discussion topic they care about. I could attend some development sessions about the Ruby DSL vs Parser DSL, Code smells, Puppet Provider/Type developments, Augeas, and so on…
Morning talks were awesome. I was presenting a talk about storeconfigs, called “All About Storeconfigs”. Puppet Storeconfigs is a feature where you can store nodes configuration and export/collect resources between nodes with the help of a database. I already talked about this in a couple of posts:
You can enjoy the recording of the session (even though they cut the first part, which is not that good), and have a closer look at my slides here:
What’s great with those conferences in foreign countries is that you usually finish at the pub with some local people to continue sharing Puppet (or not) experiences. Those parties were great fun, so thank you everybody for this.
So thanks everybody and Reductive Labs team (especially Andrew who organized everything) for this event, and thanks to Days of Wonder for funding my trip!
21 Jul
As a Puppet Mongrel Nginx user, I’m really ashamed of the convoluted nginx configuration needed (two server blocks listening on different ports, and you need to direct your clients’ CA interactions to the second port with --ca_port), and of the lack of support for proper CRL verification.
If you are like me, then there is some hope in this blog post.
Last weekend, I did some intense Puppet hacking (certainly more news about this soon), and part of this work is two Nginx patches:
First, download both patches:
Then apply them to Nginx (tested on 0.7.59):
$ cd nginx-0.7.59
$ patch -p1 < ../0001-Support-ssl_client_verify-optional-and-ssl_client_v.patch
$ patch -p1 < ../0002-Add-SSL-CRL-verifications.patch
Then build Nginx as usual.
Here is a revised Puppet Nginx Mongrel configuration:
upstream puppet-production {
    server 127.0.0.1:18140;
    server 127.0.0.1:18141;
}
server {
    listen 8140;
    ssl on;
    ssl_session_timeout 5m;
    ssl_certificate /var/lib/puppet/ssl/certs/puppetmaster.pem;
    ssl_certificate_key /var/lib/puppet/ssl/private_keys/puppetmaster.pem;
    ssl_client_certificate /var/lib/puppet/ssl/ca/ca_crt.pem;
    ssl_ciphers SSLv2:-LOW:-EXPORT:RC4+RSA;
    # allow authenticated and client without certs
    ssl_verify_client optional;
    # obey to the Puppet CRL
    ssl_crl /var/lib/puppet/ssl/ca/ca_crl.pem;
    root /var/tmp;
    location / {
        proxy_pass http://puppet-production;
        proxy_redirect off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Client-Verify $ssl_client_verify;
        proxy_set_header X-SSL-Subject $ssl_client_s_dn;
        proxy_set_header X-SSL-Issuer $ssl_client_i_dn;
        proxy_read_timeout 65;
    }
}
Reload nginx, and enjoy
18 Jul
It’s been a long time since my last post… which just means I was really busy both privately, on the Puppet side and at work (I’ll talk about the Puppet side soon, for the private life you’re on the wrong blog :-)).
For a project I’m working on at Days of Wonder, I had to use the nginx secure link module. This module allows a client to access the requested resource only if the MD5 hash MAC given in the URL matches the arguments.
To use it, it’s as simple as:
location /protected/ {
    secure_link "this is my secret";
    root /var/www/downloads;
    if ($secure_link = "") {
        return 403;
    }
    rewrite ^ /$secure_link break;
}
To generate a URL, use the following PHP snippet:
<?php
$prefix = "http://www.domain.com/protected";
$protected_resource = "my-super-secret-resource.jpg";
$secret = "this is my secret";
$hashmac = md5( $protected_resource . $secret );
$url = $prefix . "/" . $hashmac . "/" . $protected_resource;
?>
But that wasn’t enough for our usage. We needed the url to expire automatically after some time. So I crafted a small patch against Nginx 0.7.59.
It just extends the nginx secure link module with a TTL. The time at which the resource expires is embedded in the url, and in the HMAC. If the server finds that the current time is greater than the embedded time, then it denies access to the resource.
The timeout can’t be tampered with, as it is used in the HMAC.
The usage is the same as the current nginx secure link module, except:
You need to use the following (sorry only PHP) code:
<?php
define('URL_TIMEOUT', 3600); // one hour timeout
$prefix = "http://www.domain.com/protected";
$protected_resource = "my-super-secret-resource.jpg";
$secret = "this is my secret";
// expiration time as a 32-bit big-endian integer
$time = pack('N', time() + URL_TIMEOUT);
$timeout = bin2hex($time);
// the raw expiration time is part of the hashed data
$hashmac = md5( $protected_resource . $time . $secret );
$url = $prefix . "/" . $hashmac . $timeout . "/" . $protected_resource;
location /protected/ {
    secure_link "this is my secret";
    secure_link_ttl on;
    root /var/www/protected;
    if ($secure_link = "") {
        return 403;
    }
    rewrite ^ /$secure_link break;
}
The server generating the url and hashmac and the one delivering the protected resource must have synchronized clocks.
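For those not using PHP, the same URL generation can be sketched in Ruby, together with a check that mirrors the behavior described in the text. This is an illustration only: the verification is not a port of the C code of the patch, the method names are hypothetical, and the string comparison is not hardened against timing attacks.

```ruby
require 'digest/md5'

SECRET = 'this is my secret' # same placeholder secret as in the examples above

# Builds "<md5 hex><expiry hex>/<resource>", mirroring the PHP snippet.
def signed_path(resource, expires_at, secret = SECRET)
  packed = [expires_at].pack('N') # 32-bit big-endian, like PHP's pack('N', ...)
  mac = Digest::MD5.hexdigest(resource.b + packed + secret.b)
  "#{mac}#{packed.unpack('H*').first}/#{resource}"
end

# Recomputes the hash from the embedded expiry, then compares the expiry
# with the current time, as the text above describes.
def valid_path?(path, now = Time.now.to_i, secret = SECRET)
  token, resource = path.split('/', 2)
  return false unless token && resource && token.length == 40
  expires_at = [token[32, 8]].pack('H*').unpack('N').first
  signed_path(resource, expires_at, secret) == path && now <= expires_at
end

path = signed_path('my-super-secret-resource.jpg', Time.now.to_i + 3600)
puts "http://www.domain.com/protected/#{path}"
puts valid_path?(path)
```

Because the expiry is part of the hashed data, changing the hexadecimal timestamp in the URL changes the expected hash and the check fails.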
There is no support. If it eats your server, then I or Days of Wonder can’t be
It’s simple:
24 May
As announced in the last edit of yesterday’s post, Puppet and JRuby a love and hate story, I finally managed to run a webrick puppetmaster under JRuby with an MRI client connecting and fetching its config.
Unfortunately Puppet creates its first certificate with a serial number of 0, which JRuby-OpenSSL finds invalid (in fact that’s the Bouncy Castle JCE Provider). So the first thing is to check whether you already have some certificates generated with a serial of 0. If you have none, then everything is great and you can skip this.
You can see a certificate content with openssl:
% openssl x509 -text -in /path/to/my/puppet/ssl/ca/ca_cert.pem
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 1 (0x1)
        Signature Algorithm: sha1WithRSAEncryption
        Issuer: CN=ca
        Validity
            Not Before: May 23 18:38:19 2009 GMT
            Not After : May 22 18:38:19 2014 GMT
        Subject: CN=ca
        ...
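If there are many certificates to check, the same test can be scripted with the Ruby OpenSSL extension. The following is only a sketch: the method name is made up, the file path in the comment is a placeholder, and the demonstration uses unsigned in-memory certificates rather than real ones.

```ruby
require 'openssl'

# True when the serial is strictly positive, which is what JRuby-OpenSSL
# requires according to the text.
def serial_acceptable?(cert)
  cert.serial.to_i > 0
end

# Usage with a real file (the path is a placeholder):
#   cert = OpenSSL::X509::Certificate.new(File.read('/path/to/cert.pem'))
#   puts serial_acceptable?(cert)

# Self-contained demonstration on in-memory, unsigned certificates.
[0, 1].each do |serial|
  cert = OpenSSL::X509::Certificate.new
  cert.serial = serial
  puts "serial #{serial}: #{serial_acceptable?(cert) ? 'ok' : 'rejected'}"
end
```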
If no certificate has a serial of 0, then it’s OK, otherwise I’m afraid you’ll have to start the PKI from scratch (which means rm -rf $vardir/ssl and authenticate clients again), after applying the following Puppet patch:
JRuby fix: make sure certificate serial > 0
JRuby OpenSSL implementation is more strict than real ruby one and
requires certificate serial number to be strictly positive.
Signed-off-by: Brice Figureau <brice-puppet@daysofwonder.com>
diff --git a/lib/puppet/ssl/certificate_authority.rb b/lib/puppet/ssl/certificate_authority.rb
index 08feff0..4a7d461 100644
--- a/lib/puppet/ssl/certificate_authority.rb
+++ b/lib/puppet/ssl/certificate_authority.rb
@@ -184,7 +184,7 @@ class Puppet::SSL::CertificateAuthority
# it, but with a mode we can't actually read in some cases. So, use
# a default before the lock.
unless FileTest.exist?(Puppet[:serial])
- serial = 0x0
+ serial = 0x1
end
Puppet.settings.readwritelock(:serial) { |f|
I’ll post this patch to puppet-dev soon, so I hope it’ll eventually get merged in mainline.
You need the freshest JRuby available at this time. My tests were conducted with the latest JRuby as of commit “3aadd8a”. The best is to clone the github jruby repository and build it (it of course requires a JDK and Ant, but that’s pretty much all).
Then install jruby in your path (if you need assistance for this, I’m not sure this blog post is for you :-))
As I explained in my previous blog post on the same subject, Puppet exercises the Ruby OpenSSL subsystem a lot. During this experiment, I found a few shortcomings in the current JRuby-OpenSSL 0.5, including missing methods or missing behaviors needed by Puppet to run fine.
So to get a fully Puppet-enabled JRuby-OpenSSL you need either to get the very latest JRuby-OpenSSL from its own github repository (or check out the puppet-fixes branch of my fork of said repository on github), or to manually apply the following patches on top of the 0.5 source tarball:
Then rebuild JRuby-OpenSSL which is a straightforward process (copy build.properties.SAMPLE to build.properties, adjust jruby.jar path, and then issue ant jar to build the jopenssl.jar).
Once done, install the 0.5 JRuby-OpenSSL gem in your jruby install, and copy over the built jar to lib/ruby/gems/1.8/gems/jruby-openssl-0.5/lib.
Then it’s time to run your puppetmaster, just start it with jruby instead of ruby. Of course you need the puppet dependencies installed (Facter).
My next try will be to run Puppet on JRuby and mongrel (or whatever replaces it in the JRuby world), then try with storeconfigs on…
Hope that helps, and for any question, please post in the puppet-dev list.
23 May
Ever since I heard about JRuby about a year ago, I have wanted to try running my favorite ruby program on it. I’m working with Java almost all day long, so I know for sure that the Sun JVM is a precious tool for running long-lived servers. It is pretty fast, and has a very good (and tunable) garbage collector.
In a word: the perfect system to run a long-lived puppetmaster!
The first time I tried, back in February 2009, I unfortunately encountered the bug JRUBY-3349, which prevented Puppet from running quite early, because the Fcntl constants weren’t defined. Since my understanding of JRuby internals is near zero, I left it there.
But thanks to Luke Kanies (Puppet creator), one of the JRuby main developers Charles Oliver Nutter fixed the issue a couple of weeks ago (thanks to him, and they even fixed another issue at about the same time about fcntl which didn’t support SET_FD).
That was just in time for another test…
But what I forgot was that Puppet is not just any ruby app on the block. It uses lots of cryptography behind the scenes. Remember that Puppet manages its own PKI, including:
That just means Puppet exercises the Ruby OpenSSL extension a lot.
The main issue is that MRI uses OpenSSL for all the cryptographic stuff, and JRuby uses a specific Java version of this extension. Of course the latter is still young (presently at v0.5) and doesn’t yet contain everything needed to run Puppet.
In another life I wrote a proprietary cryptographic Java library, so I’m not a complete cryptography newcomer (OK, I forgot almost everything, but I still have some good books to refer to). So I decided to implement what is missing in JRuby-openssl to allow a webrick Puppetmaster to run.
You can find my contributions in the various JRUBY-3689, JRUBY-3690, JRUBY-3691, JRUBY-3692, JRUBY-3693 bugs.
I still have another minor patch to submit (OpenSSL::X509::Certificate#to_text implementation).
So the question is: with all those patches applied, did I get a puppetmaster running?
And the answer is unfortunately no.
I can get the puppetmaster to start on a fresh configuration (ie it creates everything SSL related and such), but it fails as soon as a client connects (hey that’s way better than before I started :-)).
It all comes from SSL. The issue is that with the C OpenSSL implementation it is possible to get the peer certificate anytime, but the java SSL implementation (which is provided by the Sun virtual machine) requires the client to be authenticated before anyone gets access to the peer certificate.
That’s unfortunate because to be able to authenticate a not-yet-registered client, we must have access to its certificate. I couldn’t find any easy code fix, so I stopped my investigations there.
There are still some possible workarounds, like running in mongrel mode (provided JRuby supports mongrel, which I didn’t check) and letting Nginx (or Apache) handle the SSL stuff, but still it would be great to be able to run a full-fledged puppetmaster on JRuby.
I tried with a known client and got the same issue, so maybe that’s a whole different issue. I guess I’ll have to dig deeper in the Java SSL code, which unfortunately is not available.
Stay tuned for more info about this. I hope to be able to have a full puppetmaster running on JRuby soon!
EDIT: I could run a full puppetmaster on webrick from scratch under JRuby with a normal ruby client. I’ll post the recipe in a subsequent article soon.
8 Mar
For a long time people (including me) have complained that storeconfigs is a real resource hog. Unfortunately for us, this option is so cool and useful.
Storeconfigs is a puppetmasterd option that stores the nodes actual configuration to a database. It does this by comparing the result of the last compilation against what is actually in the database, resource per resource, then parameter per parameter, and so on.
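The comparison step can be pictured with plain hashes. The Ruby sketch below is not the Active Record based implementation: the data layout (a hash of resource references to parameter hashes) and the method name are assumptions made up for the illustration.

```ruby
# Resources are modeled as { "Type[title]" => { param => value } }.
# Returns what would have to be inserted, deleted or updated in the database.
def storeconfigs_diff(stored, compiled)
  common = compiled.keys & stored.keys
  {
    add:    compiled.keys - stored.keys,
    remove: stored.keys - compiled.keys,
    update: common.reject { |ref| stored[ref] == compiled[ref] }
  }
end

stored = {
  'File[/etc/motd]' => { 'mode' => '0644' },
  'File[/etc/old]'  => { 'ensure' => 'present' }
}
compiled = {
  'File[/etc/motd]' => { 'mode' => '0600' },
  'File[/etc/new]'  => { 'ensure' => 'present' }
}

p storeconfigs_diff(stored, compiled)
```

The cost is proportional to the number of resources and parameters, which is why a large catalog makes the feature expensive.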
The actual implementation is based on Rails’ Active Record, which is a great way to abstract the gory details of the database, and prototype code easily and quickly (but has a few shortcomings).
The immediate use of storeconfigs is exported resources. Exported resources are resources which are prefixed by @@. Those resources are marked specially so that they can be collected on several other nodes.
A small, completely dumb example speaks for itself:
class exporter {
    @@file {
        "/var/lib/puppet/nodes/$fqdn":
            content => "$ipaddress\n",
            tag => "ip"
    }
}
node "export1.daysofwonder.com" {
    include exporter
}
node "export2.daysofwonder.com" {
    include exporter
}
node "collector.daysofwonder.com" {
    File <<| tag == "ip" |>>
}
What does this example do?
That’s simple: all the exporter nodes create a file in /var/lib/puppet/nodes whose name is the node name and whose content is its primary IP address.
What is interesting is that the node “collector.daysofwonder.com” collects all files tagged with “ip”, that is all the exported files. In the end, after export1, export2 and collector have run a compilation, the collector host will have the /var/lib/puppet/nodes/export1.daysofwonder.com and /var/lib/puppet/nodes/export2.daysofwonder.com files with their respective content.
Got it?
That’s the perfect tool for instance to automatically:
Still there is another use: since the whole configuration of your nodes is in an RDBMS, you can use that to perform some data-mining on your hosts’ configuration. That’s what puppetshow does.
The storeconfigs issue in its current incarnation (ie 0.24.7) is that it is a slow feature (it usually doubles the compilation time), and it imposes a higher load on the puppetmaster and the database engine.
For large installations it might not be possible to run with this feature on. There were also some reports of high memory usage or leaks with this feature on (see my recommendation about this in my puppetmaster memory leak post).
Here are my usual puppet and storeconfigs recommendations:
I think the last point deserves a little bit more explanation:
I had the following schematized pattern in some of my manifests, which I took from David Schmitt’s excellent modules:
in one class:
if defined(File["/var/lib/puppet/modules/djbdns.d/"]) {
    warn("already defined")
} else {
    file {
        "/var/lib/puppet/modules/djbdns.d/": ...
    }
}
and in another class the exact same code:
if defined(File["/var/lib/puppet/modules/djbdns.d/"]) {
    warn("already defined")
} else {
    file {
        "/var/lib/puppet/modules/djbdns.d/": ...
    }
}
What happens is that from run to run the evaluation order could change: one time the defined resource would be the one in the first class, and another time the one in the second class, which meant the storeconfigs code had to remove the resources from the database and re-create them. Clearly not the best way to reduce the database workload.
I contributed for 0.24.8 a partial rewrite of some parts of the storeconfigs feature to increase its performance.
My analysis is that what was slow in the feature is threefold:
I fixed the first two points by querying the database directly to fetch the parameters and tags, keeping them in hashes instead of objects. This saves a large number of database requests and at the same time prevents a large number of ruby objects from being created (it should even save some memory).
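The idea of keeping plain hashes instead of one object per row can be sketched like this. It is not the actual patch: the row layout (resource id, parameter name, value) stands for what a single bulk query might return and is an assumption, not the real schema.

```ruby
# Groups raw rows into { resource_id => { name => value } } in a single pass.
# The column layout is hypothetical; no model object is built per row.
def params_by_resource(rows)
  rows.each_with_object(Hash.new { |h, k| h[k] = {} }) do |(resource_id, name, value), acc|
    acc[resource_id][name] = value
  end
end

rows = [
  [1, 'mode',   '0644'],
  [1, 'owner',  'root'],
  [2, 'ensure', 'present']
]

p params_by_resource(rows)
```

One query plus one pass over the rows replaces a query (and a set of objects) per resource, which is where the savings come from.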
The last point was fixed by imposing a strict order (not completely correct, but still better than how it was) in the way the tags are assigned to resources.
Both patches have been merged for 0.24.8, and some people reported some performance improvements.
On the Days of Wonder infrastructure I found that with a 562 resources node, on a tuned mysql database:
That’s a nice improvement, isn’t it
Luke and I discussed this, and it was also discussed on the puppet-dev list a few times. I think that an RDBMS might not be the right storage choice for this feature, because clearly there is almost no random keyed access to the individual parameters of a resource (so having a table dedicated to parameters is of almost no use).
I know Luke’s plan is to abstract the storeconfigs feature from the current implementation (certainly through the indirector), so that we can use different storeconfigs engines.
I also know that someone is working on a promising CouchDB implementation. I myself can see a memcached implementation (which I’d really like to start working on). Maybe even the filesystem would be enough.
Of course, I’m open to any other improvements or storage engine ideas