Masterzen's Blog

Journey in a software world…

Puppet Internals: The Compiler

| Comments

And I’m now proud to present the second installation of my series of post about Puppet Internals:

Today we’ll focus on the compiler.

The Compiler

The compiler is at the heart of Puppet, master/agent or masterless. Its responsibility is to transform the AST into a set of resources called the catalog that the agent can consume to perform the necessary changes on the node.

You can see the compiler as a function of the AST and Facts and returning the catalog.

The compiler lives in the lib/puppet/parser/compiler.rb file and more specifically in the Puppet::Parser::Compiler class. When a node connects to a master to ask for a catalog, the Indirector directs the request to the compiler.

In a classic master/agent system, the agent does a REST find catalog request to the master. The master catalog indirection is configured to delegate to the compiler. This happens in the lib/puppet/indirector/catalog/compiler.rb file. Check this previous article about the Indirector if you want to know more.

The indirector request contains two things:

  • what node we should compile
  • the node’s facts

Produced Catalog

When we’re talking about catalog, in the Puppet system it can mean two distinct things:

  • a containment catalog
  • a relationship resource catalog

The first one is the product of the compiler (which we’ll delve into in this article). The second one is formed by the transformation of the first one in the agent. This is the later one that we usually call the puppet catalog.

Here is a simple manifest and the containment catalog that I obtained after compiling:

1
2
3
4
5
6
7
class test {
  file {
    "/tmp/a": content => "test!"
  }
}

include test

And here is the produced catalog:

Out of compiler containment catalog

You’ll notice that as its name implies, the containment catalog is a graph of classes and resources that follows the structure of the manifest.

When Facts matter

In a master/agent system the facts are coming from the request in a serialized form. Those facts were created by calling Facter on the remote node.

Once unserialized, the facts are cached locally as YAML (as per the default terminus for facts on a master). You can find them in the $vardir/yaml/facts/$certname.yaml file.

At the same time the compiler catalog terminus compute some server facts that are injected into the current node instance.

Looking for the node

In Puppet there are several possibilities to store node definitions. They can be defined by node {} blocks in the site.pp, by an ENC, into an LDAP directory, etc…

Before the compiler can start, it needs to create an instance of the Puppet::Node class, and fill this with the node informations.

The node indirection terminus is controlled by the node_terminus puppet settings which by default is plain. This terminus just creates a new empty instance of a Puppet::Node.

In an ENC setup, the terminus for the node indirection will be exec. This terminus will create a Puppet::Node instance initialized with a set of classes and global parameters the compiler will be able to use.

The plain terminus for nodes calls Puppet::Node#fact_merge. This methods finds the current set of Facts of this node. In the plain case it involves reading the YAML facts we wrote to disk in the last chapter, and merging those to the current node instance parameters.

Back to the compiler catalog terminus, this one tries to find the node with the given request information and if not present by using the node certname. Remember that the request to get a catalog from REST matches /catalog/node.domain.com, in which case the request key is node.domain.com.

Let’s compile

After that, we really enter the compiler code, when the compiler catalog terminus calls Puppet::Parser::Compiler.compile, which creates a new Puppet::Parser::Compiler instance giving it our node instance.

When creating this compiler instance, the following is created:

  • an empty catalog (an instance of Puppet::Resource::Catalog). This one will hold the result of the compilation.
  • a companion top scope (an instance of Puppet::Parser::Scope)
  • some other internal data structures helping the compilation

If the given node was coming from an ENC, the catalog is bootstrapped with the known node classes.

Once done, the compile method is called on the compiler instance. The first thing done is to bootstrap top scope with the node parameters (which contains the global data coming from the ENC if one is used and the facts).

Remember the AST

When we left the Parser post, we obtained an AST. This AST is a tree of AST instances that implement the guts of the Puppet language.

In this previous article we left aside 3 types of AST:

  • Node AST
  • Hostclass AST
  • Definition AST

Those are different in the sense that we don’t strictly evaluate them during compilation (more later on this step). No, those are instantiated as part of the initial import of the known types. If you’re wondering why I spelled the Class AST as Hostclass, then it’s because that’s how it is spelled in the Puppet code; the reason being that class is a reserved word in Ruby :)

Using a lazy evaluation scheme, Puppet keeps (actually per environments), a list of all the parsed known types (classes, definitions and nodes that the parser encountered during parsing); this is called the known types.

When this list is first accessed, if it doesn’t exist, Puppet triggers the parser to populate it. This happens in Puppet::Node::Environment.known_resource_types which calls the import_ast method with the result of the parsing phase.

import_ast adds to the known types an instance of every definitions, hostclass, node returned by their respective instantiate method.

Let’s have a closer look of the hostclass instantiate:

Simplified Hostclass AST instantiate method
1
2
3
4
5
6
7
8
9
10
def instantiate(modname)
  new_class = Puppet::Resource::Type.new(:hostclass, @name)
  all_types = [new_class]
  code.each do |nested_ast_node|
    if nested_ast_node.respond_to? :instantiate
      all_types += nested_ast_node.instantiate(modname)
    end
  end
  return all_types
end

So instantiate returns an array of Puppet::Resource::Type of the given type. You’ll notice that the hostclass code above analyzes its current class AST children for other ‘instantiable’ AST elements that will also end in the known types.

Known Types

The known types I’m talking about since a while all live in the Puppet::Resource::TypeCollection object. There’s one per Puppet environment in fact.

This object main responsibility is storing all known classes, nodes and definitions to be easily accessed by the compiler. It also watches all loaded files by the parser, so that it can trigger a re-parse when one of those is updated. It also serves as the Puppet class/module autoloader (when asking it for an unknown class, it will first try to load it from disk and parse it).

Scopes

Let’s open a parenthesis to explain a little bit what the scope is. The scope is an instance of Puppet::Parser::Scope and is simply a symbol table (as explained in the Dragon Book). It just keeps the values of Puppet variables.

It forms a tree, with the top scope (the one we saw the creation earlier) being the root of all scopes. This tree contains one child per new namespace.

The scope supports two operations:

  1. Looking up a variable value
  2. Setting a variable value

Look up is done with the lookupvar method. If the variable is qualified it will directly ask the correct scope for its value. For instance ::$hostname will fetch directly the top scope fact hostname.

Otherwise it will either return its value in the local scope if it exists or delegate to the parent scope. This can happen up until the top scope. If the value can’t be found anywhere, the :undef ruby symbol will be returned.

Note that this dynamic scope behavior will be removed in the next Puppet version, where only the local scope and the top scope will be supported. More information is available in this Scope and Puppet article.

Setting a variable is done with the setvar method. This method is called for instance by the AST class responsible of variable assignment (the AST::VarDef).

Along with regular variables, each scope has the notion of ephemeral scope. An ephemeral scope is a special transient scope that stores only regex capture $0 to $xy variables.

Each scope level maintains a stack of ephemeral scopes, which is by default empty.

In Puppet there is no scopes for other language structures than classes (and nodes and definitions), so inside the following if, an ephemeral scope is created, and pushed on the stack, to store the result of the regex match:

Epehmeral scope in action
1
2
3
if $var =~ /test(.*)/ {
  # here $0, $1... are available
}

When Puppet execution reaches the closing ‘}’, the ephemeral scope is popped from the ephemeral scope stack, removing the $0 definition.

lookupvar will also ask the ephemeral scope stack if needed.

Orthogonally, the scope instance will also store resource defaults.

Talking about AST Evaluation

And here we need to take a break from compilation to talk about AST evaluation, which I elegantly eluded from the previous post on the Parser.

Every AST node (both branch and leaf ones) implements the evaluate method. This method takes a Puppet::Parser::Scope instance as parameter. This is the scope instance that is valid at the moment we evaluate this AST node (usually the scope associated with the class where the code we evaluate is).

There are several outcomes possible after evaluation:

  • Manipulation of the scope (like variable assignment, variable lookup, parser function call)
  • Evaluation of AST children of this node (for instance if, case, selectors need to evaluate code in one their children branch)
  • Creation of Puppet::Parser::Resource when encountering a puppet resource
  • Creation of Puppet::Resource::Type (more puppet classes)

When an AST node evaluates its children it does so by calling safeevaluate on them which in turn will call evaluate. Safeevaluate will shield the caller from exceptions, and transform them to parse errors that can specify the line and file of the puppet instruction that triggered the problem.

Shouldn’t we talk about compilation?

Let’s go back to the compiler now. We left the compiler after we populated the top scope with the node’s facts, and we still didn’t properly started the compilation phase itself.

Here is what happens after:

  1. Main class evaluation
  2. Node AST evaluation
  3. Evaluation of the node classes if any
  4. Recursive evaluation of definitions and collections (called generators)
  5. Evaluation of relationships
  6. Resource overrides evaluation
  7. Resource finish
  8. Ship the catalog

After that, what remains is the containment catalog. This one will be transformed to a resource containment catalog. We call resource catalog an instance of Puppet::Resource::Catalog where all Puppet::Parser::Resource have been transformed to Puppet::Resource instances.

Let’s now see in order the list of operations we outlined above and that form the compilation.

Main class evaluation

The main class is an hidden class where every code outside any definition, node or class ends. It’s a kind of top class from which any other class is inner. This class is special because it has an empty name.

Evaluating the main class means:

  1. Creating a companion resource (an instance of Puppet::Parser::Resource) whose scope is the top scope.
  2. Add this resource to the catalog
  3. Evaluating the class code of this resource

Let’s focus on this last step which happens in Puppet::Parser::Resource.evaluate. It involves mainly getting access to the Puppet::Resource::Type instance matching our class (its type in fact) from the known types, and then calling the Puppet::Resource::Type.evaluate_code method.

Evaluating code of a class

I’m putting aside the main class evaluation to talk a little bit about code evaluation of a given class because that’s something we’ll see for every class or node during compilation.

This happens during Puppet::Resource::Type.evaluate_code which essentially does:

  1. Create a scope for this class (unless we’re compiling the main class which already uses the top scope)
  2. Ask the class AST children to evaluate with this scope

We saw in the Puppet Parser post how the AST was produced. Eventually some of those AST nodes will end up in the code element of a given puppet class (you can refer to the Puppet grammar and Puppet::Parser::AST::Hostclass for the code), under the form of an ASTArray (which is an array of AST nodes).

Node Evaluation

As for the main class, the current node compilation phase:

  • ask the known types about the current node, and if none are found ask for a default node.
  • creates a resource for this node, add it to the catalog
  • evaluates this node resource

This last evaluation will execute the given node AST code.

Node class evaluation

If the node was provided by an ENC, the compiler will then evaluate those classes. This is the same process as for the main class, where for every classes we create a resource, add it to the catalog and then evaluate it.

Evaluation of Generators

In Puppet the generators are the entities that are able to spawn new resources:

  • collections, including storeconfig exported resources
  • definitions

This part of the compilation loops calling evaluate_definitions and evaluate_collections, until none of those produces new resources.

Definitions

During the AST code evaluation, if the compiler encounters a definition call, the Puppet::Parser::AST::Resource.evaluate will be called (like for every resource).

Since this resource comes from a definition, a type resource will be instantiated and added to the catalog. This resource will not be evaluated at this stage.

Later, when evaluate_definitions is called, it will pick up any resource that hasn’t been evaluated (which is the case of our definition resources) and evaluates them.

This operation might in turn create more unevaluated resources (ie new definition spawning more definition resources), which will be evaluated in a subsequent pass over evaluate_definitions.

Collections

When the parser parses a collection which are defined like this in the Puppet language:

1
File <<| tag == 'key' |>>

it creates an AST node of type Puppet::Parser::AST::Collection. The same happen if you use the realize function.

Later when the compiler evaluate code and encounters this collection instance, it will create a Puppet::Parser::Collector and register it to the compiler.

Even later, during evaluate_collections, the evaluate method of all the registered collectors will be called. This method will either fetch exported resources from storeconfigs or virtual resources, and create Puppet::Parser::Resource that are registered to the compiler.

If the collector has created all its resources, it is removed from the compiler.

Relationship evaluation

The current compiler holds the list of relationships defined with the -> class of relationship operators (but not the ones defined with the require or before meta-parameters).

During code evaluation, when the compiler encounters the relationship AST node (an instance of Puppet::Parser::AST::Relationship), it will register a Puppet::Parser::Relationship instance to the compiler.

During the evaluate_relationships method of the compiler, every registered relationship will be evaluated. This evaluation simply adds the destination resource reference to the source resource meta-parameter matching the operator.

Resource overrides

And the next compilation phase consists in adding all the overrides we discovered during the AST code evaluation. Normally overrides are applied as soon as they are discovered, but it can happen than an override (especially for collection overrides), can not be applied because the resources it should apply on are not yet created.

Applying an override consist in setting a given resource parameter to the overridden value.

Resource finishing

During this phase, the compiler will call the finish method on every created resources. This methods is responsible of:

  • adding resource defaults to the resource parameters
  • tagging the resource with the current scope tags
  • checking that resource parameter are valid

Resource meta-parameters

The next step in the compilation process is to set all meta-parameter of our created resources, starting from the main class and walking the catalog from there.

Finish

Once everything has been done, the compiler runs some checks to make sure all overrides and collections have been evaluated. Then the catalog is transformed to a Puppet::Resource catalog (which doesn’t change its layout, just the instance of its vertices).

Conclusion

I hope you now have a better view of the compilation process. As you’ve seen the compilation is a complex process, which is one of the reason it can take some time. But that’s the price to pay to produce a data only graph tailored to one host that can be applied on the host.

Stay tuned here for the next episode of my Puppet Internals series of post. The next installment will certainly cover the Puppet transaction system, whose role is to apply the catalog on the agent.

Benchmarking Puppet Stacks

| Comments

I decided this week-end to try the more popular puppet master stacks and benchmark them with puppet-load (which is a tool I wrote to simulate concurrent clients).

My idea was to check the common stacks and see which one would deliver the best concurrency. This article is a follow-up of my previous post about puppet-load and puppet master benchmarking

Methodology

I decided to try the following stacks:

  • Apache and Passenger, which is the blessed stack, with MRI 1.8.7 and 1.9.2
  • Nginx and Mongrel
  • JRuby with minzuno

The setup is the following:

  • one m1.large ec2 instance as the master
  • one m1.small ec2 instance as the client (in the same availability zone if that matters)

To recap, m1.large instances are:

  • 2 cpu with 2 virtual core each
  • 8 GiB of RAM

All the benchmarks were run on the same instance couples to prevent skew in the numbers.

The master uses my own production manifests, consisting of about 100 modules. The node for which we’ll compile a catalog contains 1902 resources exactly (which makes it a big catalog).

There is no storeconfigs involved at all (this was to reduce setup complexity).

The methodology is to setup the various stacks on the master instance and run puppet-load on the client instance. To ensure everything is hot on the master, a first run of the benchmark is run at full concurrency first. Then multiple run of puppet-load are performed simulating an increasing number of clients. This pre-heat phase also make sure the manifests are already parsed and no I/O is involved.

Tuning has been done as best as I could on all stacks. And care was taken for the master instance to never swap (all the benchmarks involved consumed about 4GiB of RAM or less).

Puppet Master workload

Essentially a puppet master compiling catalog is a CPU bound process (that’s not because a master speaks HTTP than its workload is a webserver workload). That means during the compilation phase of a client connection, you can be guaranteed that puppet will consume 100% of a CPU core.

Which essentially means that there is usually little benefit of using more puppet master processes than CPU cores on a server.

A little bit of scaling math

When we want to scale a puppet master server, there is a rough computation that allows us to see how it will work.

Here are the elements of our problem:

  • 2000 clients
  • 30 minutes sleep interval, clients evenly distributed in time
  • master with 8 CPU core and 8GiB of RAM
  • our average catalog compilation is 10s

30 minutes interval means that every 30 minutes we must compile 2000 catalogs for our 2000 nodes. That leaves us with 2000/30 = 66 catalogs per minute.

That’s about a new client checking-in about every seconds.

Since we have 8 CPU, that means we can accommodate 8 catalogs compilation in parallel, not more (because CPU time is a finite quantity).

Since 66/8 = 8.25, we can accommodate 8 clients per minute, which means each client must be serviced in less than 60/8.25 = 7.27s.

Since our catalogs take about 10s to compile (in my example), we’re clearly in trouble and we would need to either add more master servers or increase our client sleep time (or not compile catalogs, but that’s another story).

Results

Comparing our stacks

Let’s first compare our favorite stacks for an increasing concurrent clients number (increasing concurrency).

For setups that requires a fixed number of workers (Passenger, Mongrel) those were setup with 25 puppet master workers. This was fitting in the available RAM.

For JRuby, I had to use the at the time of writing jruby-head because of a bug in 1.6.5.1. I also had to comment out the Puppet execution system (in lib/puppet/util.rb).

Normally this sub-system is in use only on clients, but when the master loads the types it knows for validation, it also autoloads the providers. Those are checking if some support commands are available by trying to execute them (yes I’m talking to you rpm and yum providers).

I also had to comment out when puppet tries to become the puppet user, because that’s not supported under JRuby.

JRuby was run with Sun java 1.6.0_26, so it couldn’t benefit from the invokedynamic work that went into Java 1.7. I fully expect this feature to improve the performances dramatically.

The main metric I’m using to compare stacks is the TPS (transaction per seconds). This is in fact the number of catalogs a master stack can compile in one second. The higher the better. Since compiling a catalog on our server takes about 12s, we have TPS numbers less than 1.

Here are the main results:

Puppet Master Stack / Catalog compiled per Seconds

And, here is the failure rate:

Puppet Master Stack / Failure rate

First notice that some of the stack exhibited failures at high concurrency. The errors I could observe were clients timeouts., even tough I configured a large client side timeout (around 10 minutes). This is what happens when too many clients connect at the same time. Everything slows down until the client times out.

Fairness

In this graph, I plotted the min, average, median and max time of compilation for a concurrency of 16 clients.

Puppet Master Stack / fairness

Of course, the better is when min and max are almost the same.

Digging into the number of workers

For the stacks that supports a configurable number of workers (mongrel and passenger), I wanted to check what impact it could have. I strongly believe that there’s no reason to use a large number (compared to I/O bound workloads).

Puppet Master Stack / Worker # influence

Conclusions

Beside being fun this project shows why Passenger is still the best stack to run Puppet. JRuby shows some great hopes, but I had to massage the Puppet codebase to make it run (I might publish the patches later).

That’d would be really awesome if we could settle on a corpus of manifests to allow comparing benchmark results between Puppet users. Anyone want to try to fix this?

Puppet Internals: The Parser

| Comments

As more or less promised in my series of post about Puppet Extension Points, here is the first post about Puppet Internals.

The idea is to produce a series of blog post about each one about a Puppet sub-system.

Before starting, I first want to present what are the various sub-blocks that forms Puppet, or Puppet: the Big Picture:

Puppet the Big Picture

I hope to be able to cover each of those sub-blocks in various posts, but we’ll today focus on the Puppet Parser.

The Puppet Parser

The Puppet Parser responsibility is to transform the textual manifests into a computer usable data structure that could be fed to the compiler to produce the catalog. This data structure is called an AST (Abstract Syntax Tree).

The Puppet Parser is the combination of various different sub-systems:

  • the lexer
  • the racc-based parser
  • the AST model

The Lexer

The purpose of the lexer is to read manifests characters by characters and to produce a stream of tokens. A token is just a symbol (combined with data) that represents a valid part of the Puppet language.

For instance, the lexer is able to find things such (but not limited to):

  • reserved keywords (like case, class, define…)
  • quoted strings
  • identifiers
  • variables
  • various operators (like left parenthesis or right curly braces…)
  • regexes

Let’s take an example and follow what comes out of the lexer when scanning this manifest:

1
$variable = "this is a string"

And here is the stream of tokens that is the outcome of the lexer:

1
2
3
:VARIABLE(VARIABLE) {:line=>1, :value=>"variable"}
:EQUALS(EQUALS) {:line=>1, :value=>"="}
:STRING(STRING) {:line=>1, :value=>"this is a string"}

As you can see, a puppet token is the combination of a symbol and a hash.

Let’s see how we achieved this result. First you must know that the Puppet lexer is a regex-based system. Each token is defined as a regex (or a stock string). When reading a character, the lexer ‘just’ checks if one of the string or regex can match. If there is one match, the lexer emits the corresponding token.

Let’s take our example manifest (the variable assignment above), and see what happens in the lexer:

  1. read $ character
  2. no regex match, let’s read some more characters
  3. read ‘variable’, still no match, our current buffer contains $variable
  4. read ’ ‘, oh we have a match against the DOLLAR_VARIABLE token regex
  5. this token is special, it is defined with a ruby block. When one of those token is read and matched, the block is executed.
  6. the block just emits the VARIABLE("variable") token

The lexer’s scanner doesn’t try every regexes or strings, it does this in a particular order. In short it tries to maximize the length of the matched string, in a word the lexer is greedy. This helps removing ambiguity.

As seen in the token stream above, the lexer associates to each token an hash containing the line number where we found it. This allows error messages in case of parsing error to point to the correct line. It also helps puppetdoc to associate the right comment with the right language structure.

The lexer also supports lexing contexts. Some tokens are valid in some specific contexts only, this is true especially when parsing quoted strings for variables interpolation.

Not all lexed tokens emit tokens for the parser. For instance comments are scanned (and stored in a stack for puppetdoc use), but they don’t produce a token for the parser: they’re skipped.

Finally, the lexer also maintains a stack of the class names it crossed. This is to be able to find the correct fully qualified name of inner classes as seen in the following example:

Fully qualified class names by keeping a stack of names during lexing
1
2
3
4
5
6
7
class outer {
  class middle {
    class inner {
      # we're in outer::middle::inner
    }
  }
}

If you want more information about the lexer, check the Puppet::Parser::Lexer class.

The parser

The parser is based on racc. Racc is a ruby port of the good old Yacc. Racc, like Yacc, is what we call a LALR parser.

The ‘cc’ in Racc means ‘compiler of compiler’. It means in fact that the parser is generated from what we call a grammar (and for LALR parsers, even a context free grammar). The generated parser is table driven and consumes tokens one by one. Those kind of parsers are sometimes called Shift/Reduce parsers.

This grammar is written in a language that is a machine readable version of a Backus-Naur Form or “BNF”.

There are different subclasses of context free grammars. Racc works best with LR(1) grammars, which means it must be possible to parse any portion of an input string with just a single token lookahead. Parsers for LR(1) grammars are deterministic. This means that we only need a fixed number of lookahead tokens (in our case 1) and what we already parsed to find what next rule to apply.

Roughly it does the following:

  1. read a token
  2. shift (this mean put the token on the stack), goto 1. until we can reduce
  3. reduce the read tokens with a grammar rules (this involves looking ahead)

We’ll have a deeper look in the subsequent chapters. Meanwhile if you want to learn everything about LALR Parsers or parsers in general, I highly recommend the Dragon Book

The Puppet Grammar

The Puppet Grammar can be found in lib/puppet/parser/grammar.ra in the sources. It is a typical racc/yacc grammar that

  • defines the known tokens (those matches the lexed token names)
  • defines the precedence of operators
  • various recursive rules that form the definition of the Puppet languages

Let’s have a look to a bit of the Puppet Grammar to better understand how it works:

Excerpt of the Puppet Grammar
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
statement_or_declaration:    resource
  | collection
  | assignment
  | casestatement
  | ifstatement_begin
...
assignment:     VARIABLE EQUALS expression {
  variable = ast AST::Name, :value => val[0][:value], :line => val[0][:line]
  result = ast AST::VarDef, :name => variable, :value => val[2], :line => val[0][:line]
}
...
expression:   rvalue
  | hash
  ...

rvalue:       quotedtext
  | name
  ...

quotedtext: STRING  { result = ast AST::String, :value => val[0][:value], :line => val[0][:line] }

So the closer look above shows 4 rules:

  • a non-terminal rule called statement_or_declaration which is an alternation of sub-rules
  • a terminal rule called assignment, with a ruby code block that will be executed when this rule will be reduced.
  • a non terminal rule called expression
  • a terminal rule quotedtext with a ruby block

To understand what that means, we could translate those rules by:

  1. A statement or declaration can be either a resource or a collection, or an assignement
  2. An assignment is when the parser finds a VARIABLE token followed by an EQUALS token and an expression
  3. An expression can be a rvalue or an hash (all defined later on in the grammar file)
  4. A rvalue can be among other things a quotedtext
  5. And finally a quotedtext can be STRING (among other things)

You can generate yourself the puppet parser by using racc, it’s as simple as:

  1. Installing racc (available as a gem)
  2. running: make -C lib/puppet/parser

This rebuilds the lib/puppet/parser/parser.rb file.

You can generate a debug parser that prints everything it does if you use -g command-line switch to racc (check the lib/puppet/parser/makefile and define @@yydebug = true in the parser class.

The parser itself is controlled by the Puppet::Parser::Parser class which is in lib/puppet/parser/parser_support.rb. This class is requiring the generated parser (both share the same ruby class). That means that the ruby blocks in the grammar will be executed in the context of an instance of the Puppet::Parser::Parser class. In other words, you can call from the grammar, methods defined in the parser_support.rb file. That’s the reason we refer to the ast method in the above example. This method just creates an instance of the given class and associates some context to it.

Let’s go back a little bit to the reduce operation. When the parser is reducing, it pops from the stack the reduced tokens and pushes the result to the stack. The result can either be what ends in the result field of the grammar ruby block or the result of the reduction of the mentioned rule (when it’s a non-terminal one).

In the ruby block of a terminal rule, it is possible to access the tokens and rule results currently parsed in the val array. To get back to the assignment statement above, val[0] is the VARIABLE token, and val[2] the result of the reduction of the expression rule.

The AST

The AST is the computer model of the parsed manifests. It forms a tree of instances of the AST base class. There are AST classes (all inheriting the AST base class) for every elements of the language. For instance there’s one for puppet classes, for if, case and so on. You’ll find all those in lib/puppet/parser/ast/ directory.

There are two kinds of AST classes:

  • leaves: which represent some kind of values (like an identifier or a string)
  • branches: which encompass more than one other AST classes (like if, case or class). This is what forms the tree.

All AST classes implement the evaluate method which we’ll cover in the compiler article.

For instance when parsing an if/else statement like this:

An If/Else statement
1
2
3
4
5
if $var {
  notice("var is true")
} else {
  notice("var is false")
}

The whole if/else once parsed will be an instance of Puppet::Parser::AST::IfStatement (which can be found in lib/puppet/parser/ast/ifstatement.rb.

This class defines three instance variables:

  1. @test
  2. @statements
  3. @else

The grammar rule for ifstatement is (I simplified it for the purpose of the article):

Simplified Grammar rule for If/else
1
2
3
4
5
6
7
8
ifstatement:  IF expression LBRACE statements RBRACE else {
  args = {
    :test => val[0],
    :statements => val[2],
    :else = val[4]
  }
  result = ast AST::IfStatement, args
}

Notice how the AST::IfStatement is initialized with the args hash containing the test,statements and else result of the those rules. Those rules result will also be AST classes, and will end up in the IFStatement fields we talked about earlier.

Thus this forms a tree. If you look to the AST::IfStatement#evaluate implementation you’ll see that depending on the result of the evaluation of the @test it will either evaluate @statements or @else.

Calling the evaluate method of the root element of this tree will in chain trigger calling evaluate on children like for the IfStatement example. This process will be explained in details in the compiler article, but that’s essentially how Puppet compiler works.

An Example Step by Step

Let’s see an end-to-end example of parsing a simple manifest:

A Simple Example
1
2
3
4
5
class test {
  file {
    "/tmp/a": content => "test!"
  }
}

This will produce the following stream of tokens:

A Simple Example
1
2
3
4
5
6
7
8
9
10
11
12
:CLASS(CLASS) {:line=>1, :value=>"class"}
:NAME(NAME) {:line=>1, :value=>"test"}
:LBRACE(LBRACE) {:line=>1, :value=>"{"}
:NAME(NAME) {:line=>2, :value=>"file"}
:LBRACE(LBRACE) {:line=>2, :value=>"{"}
:STRING(STRING) {:line=>3, :value=>"/tmp/a"}
:COLON(COLON) {:line=>3, :value=>":"}
:NAME(NAME) {:line=>3, :value=>"content"}
:FARROW(FARROW) {:line=>3, :value=>"=>"}
:STRING(STRING) {:line=>3, :value=>"test!"}
:RBRACE(RBRACE) {:line=>4, :value=>"}"}
:RBRACE(RBRACE) {:line=>5, :value=>"}"}

And now let’s dive in the parser events (I simplified the outcome because the Puppet grammar is a little bit more complex than necessary for this article). The following example shows all actions of the Parser and how looks the parser stack after the operation took place. I elided some of the stacks when not strictly needed to understand what happened.

  1. receive: CLASS (our parser got the first token from the lexer)
  2. shift CLASS (there’s nothing else to do for the moment)

    the result of the shift is that we now have one token in the parser stack

    stack: [ CLASS ]

  3. receive: NAME("test") (we get one more token)

  4. shift NAME (still no rules can match so we shift it)

    stack: [ CLASS NAME("test") ]

  5. reduce NAME –> classname (oh and now we can reduce a rule)

    notice how the stacks now contains a classname and not a NAME

    stack: [ CLASS (classname "test") ]

  6. receive: LBRACE

  7. shift LBRACE

    stack: [ CLASS (classname "test") LBRACE ]

  8. receive: NAME("file")

  9. shift NAME

    stack: [ CLASS (classname "test")
LBRACE NAME("file") ]

  10. receive: LBRACE

  11. reduce NAME –> classname

    stack: [ CLASS (classname "test")
LBRACE (classname "file") ]

  12. shift: LBRACE

    stack: [ CLASS (classname "test")
LBRACE (classname "file") LBRACE ]

  13. receive STRING("/tmp/a")

  14. shift STRING

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE STRING("/tmp/a") ]

  15. reduce STRING –> quotedtext

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (quotedtext AST::String("/tmp/a")) ]

  16. receive COLON

  17. reduce quotedtext –> resourcename

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) ]

  18. shift COLON

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON ]

  19. receive: NAME("content")

  20. shift NAME

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON NAME("content") ]

  21. receive: FARROW

  22. shift FARROW

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON NAME("content") FARROW ]

  23. receive: STRING("test!")

  24. shift: STRING
  25. reduce STRING –> quotedtext
  26. receive: RBRACE
  27. reduce quotedtext –> rvalue

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON NAME("content") FARROW (rvalue AST::String("test!"))]

  28. reduce rvalue –> expression

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON NAME("content") FARROW (expression AST::String("test!"))]

  29. reduce NAME FARROW expression –> param (we’ve now a resource parameter)

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourcename AST::String("/tmp/a")) COLON (param AST::ResourceParam("content"=>"test!")))]

  30. reduce param –> params (multiple parameters can form a params)

  31. reduce resourcename COLON params –> resourceinst (name: parameters form a resouce)

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourceinst (AST::ResourceInstance(...)))]

  32. reduce resourceinst –> resourceinstances (more than one resourceinst can form resourceinstances)

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resourceinstances [(resourceinst (AST::ResourceInstance(...)))] )]

  33. shift RBRACE

  34. reduce classname LBRACE resourceinstances RBRACE –> resource (we’ve discovered a resource)

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resource AST::Resource(...))]

  35. receive: RBRACE

  36. reduce resource –> statement_or_declaration (a resource is one statement)
  37. reduce statement_or_declaration –> statement_and_declarations
  38. shift RBRACE

    stack: [ CLASS (classname "test") LBRACE (classname "file") LBRACE (resource AST::Resource(...)) RBRACE ]

  39. reduce CLASS classname LBRACE statements_and_declarations RBRACE –> hostclass (we’ve discovered a puppet class)

    stack: [ (hostclass AST::Hostclass(...)) ]

  40. reduce hostclass –> statement_or_declaration

  41. reduce statement_or_declaration –> statements_and_declarations
  42. receive: end of file
  43. reduce statements_and_declarations –> program
  44. shift end of file

    stack: [ (program (AST::ASTArray [AST::Hostclass(...))])) ]

And the parsing is now over. What is returned is this program, which is in fact an instance of an AST::ASTArray.

If we now analyze the produced AST, we find:

  • AST::ASTarray - array of AST instances, this is our program
    • AST::Hostclass - an instance of a class
      • AST::Resource - contains an array of resource instances
        • AST::ResourceInstance
          • AST::ResourceParam - contains the “content” parameter
            • AST::String("content")
            • AST::String("test!")

What’s important to understand is that the AST depends only from the manifests. Thus the Puppet master needs only to reparse manifests only if they change.

What’s next?

The next episode will follow-up after the Parser: the compilation. The Puppet compiler takes the AST, injects into it the facts and gets what we call a catalog; that’s exactly what we’ll learn in the next article (sorry, no ETA yet).

Do not hesitate to comment or ask questions on this article with the comment system below :)

And happy new year all!

Protobuf, Maven, M2E and Eclipse Are on a Boat

| Comments

At Days of Wonder we develop several Java projects (for instance our online game servers). Those are built with Maven, and most if not all are using Google Protocol Buffers for data interchange.

Development happens mostly in Eclipse, and until a couple of months ago with m2eclipse. With the release of m2e (m2eclipse successor), our builds don’t work as is in Eclipse.

The reason is that we run the maven-protoc-plugin (the David Trott fork which is more or less now the only one available still seeing development). This maven plugins allows the protoc Protocol Buffers compiler to be run at the generate-sources phase of the Maven Lifecycle. Under m2eclipse, this phase was happening outside Eclipse and the builds was running fine.

Unfortunately m2e is not able to solve this correctly. It requires using a connector. Those connectors are Eclipse plugins that ties a maven plugin to a m2e build lifecycle phase. This way when m2e needs to execute this phase of the build, it can do so with the connector.

Until now, there wasn’t any lifecycle connector for the maven-protoc-plugin. This wasn’t possible to continue without this in the long term for our development team, so I took a stab to build it.

In fact it was way simpler than what I first thought. I used the m2e Extension Development Guide as a bootstrap (and especially the EGit extension).

The result of this few hours of development is now open-source and available in the m2e-protoc-connector Github repository.

Installation

I didn’t release an Eclipse p2 update repository (mainly because I don’t really know how to do that), so you’ll have to build the project by yourself (but it’s easy).

  1. Clone the repository
1
git clone git://github.com/masterzen/m2e-protoc-connector.git
  1. Build with maven 3
1
mvn package

Once built, you’ll find the feature packaged in com.daysofwonder.tools.m2e-protoc-connector.feature/target/com.daysofwonder.tools.m2e-protoc-connector.feature-1.0.0.20111130-1035-site.zip.

To install in Eclipse Indigo:

  1. open the Install New Software window from the Help menu.
  2. Then click on the Add button
  3. select the Archive button and point it to the: com.daysofwonder.tools.m2e-protoc-connector.feature/target/com.daysofwonder.tools.m2e-protoc-connector.feature-1.0.0.20111130-1035-site.zip file.
  4. Accept the license terms and restart eclipse.

Usage

To use it there is no specific need, as long as your pom.xml conforms roughly to what we use:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
<plugin>
    <groupId>com.google.protobuf.tools</groupId>
    <artifactId>maven-protoc-plugin</artifactId>
    <executions>
        <execution>
            <id>generate proto sources</id>
            <goals>
                <goal>compile</goal>
            </goals>
            <phase>generate-sources</phase>
            <configuration>
                <protoSourceRoot>${basedir}/src/main/proto/</protoSourceRoot>
                <includes>
                    <param>**/*.proto</param>
                </includes>
            </configuration>
        </execution>
    </executions>
</plugin>
...
  <dependency>
<groupId>com.google.protobuf</groupId>
<artifactId>protobuf-java</artifactId>
<version>2.4.1</version>
  </dependency>
...
<pluginRepositories>
    <pluginRepository>
        <id>dtrott-public</id>
        <name>David Trott's Public Repository</name>
        <url>http://maven.davidtrott.com/repository</url>
    </pluginRepository>
</pluginRepositories>

If you find any problem, do not hesitate to open an issue on the github repository.

Redis-snmp: Redis Performance Monitoring Through SNMP

| Comments

The same way I created mysql-snmp a small Net-SNMP subagent that allows exporting performance data from MySQL through SNMP, I’m proud to announce the first release of redis-snmp to monitor Redis servers. It is also inspired by the Cacti MySQL Templates (which also covers Redis).

I originally created this Net-SNMP perl subagent to monitor some Redis performance metrics with OpenNMS.

The where

You’ll find the sources (which allows to produce a debian package) in the redis-snmp github repository

The what

Here are the kind of graphs and metrics you can export from a redis server:

Redis Connections

Redis Commands

Redis Memory

The how

Like mysql-snmp you need to run redis-snmp on a host that has a connectivity with the monitored redis server (the same host makes sense). You also need the following dependencies:

  • Net-SNMP >= 5.4.2.1 (older versions contains a 64 bits varbind issue)
  • perl (tested under perl 5.10 from debian squeeze)

Once running, you should be able to ask your snmpd about redis values:

1
2
3
4
5
6
7
$ snmpbulkwalk -m'REDIS-SERVER-MIB' -v 2c  -c public redis-server.domain.com .1.3.6.1.4.1.20267.400
REDIS-SERVER-MIB::redisConnectedClients.0 = Gauge32: 1
REDIS-SERVER-MIB::redisConnectedSlaves.0 = Gauge32: 0
REDIS-SERVER-MIB::redisUsedMemory.0 = Counter64: 154007648
REDIS-SERVER-MIB::redisChangesSinceLastSave.0 = Gauge32: 542
REDIS-SERVER-MIB::redisTotalConnections.0 = Counter64: 6794739
REDIS-SERVER-MIB::redisCommandsProcessed.0 = Counter64: 37574019

Of course you must adjust the hostname and community. SNMP v2c (or better) is mandatory since we’re reporting 64 bits values.

Note that you can get the OID translation to name only if the REDIS-SNMP-SERVER MIB is installed on the host where you run the above command.

OpeNMS integration

To integrate to OpenNMS, it’s as simple as adding the following group to your datacollection-config.xml file:

1
2
3
4
5
6
7
8
9
<!-- REDIS-SERVER MIB -->
<group name="redis" ifType="ignore">
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.1" instance="0" alias="redisConnectedClnts" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.2" instance="0" alias="redisConnectedSlavs" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.3" instance="0" alias="redisUsedMemory" type="Gauge64" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.4" instance="0" alias="redisChangsSncLstSv" type="Gauge32" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.5" instance="0" alias="redisTotalConnectns" type="Counter64" />
    <mibObj oid=".1.3.6.1.4.1.20267.400.1.6" instance="0" alias="redisCommandsPrcssd" type="Counter64" />
</group>

And the following graph definitions to your snmp-graph.properties file:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
report.redis.redisconnections.name=Redis Connections
report.redis.redisconnections.columns=redisConnectedClnts,redisConnectedSlavs,redisTotalConnectns
report.redis.redisconnections.type=nodeSnmp
report.redis.redisconnections.width=565
report.redis.redisconnections.height=200
report.redis.redisconnections.command=--title "Redis Connections" \
 --width 565 \
 --height 200 \
 DEF:redisConnectedClnts={rrd1}:redisConnectedClnts:AVERAGE \
 DEF:redisConnectedSlavs={rrd2}:redisConnectedSlavs:AVERAGE \
 DEF:redisTotalConnectns={rrd3}:redisTotalConnectns:AVERAGE \
 LINE1:redisConnectedClnts#9B2B1B:"REDIS Connected Clients         " \
 GPRINT:redisConnectedClnts:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisConnectedClnts:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisConnectedClnts:MAX:"Max \\: %8.2lf %s\\n" \
 LINE1:redisConnectedSlavs#4A170F:"REDIS Connected Slaves          " \
 GPRINT:redisConnectedSlavs:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisConnectedSlavs:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisConnectedSlavs:MAX:"Max \\: %8.2lf %s\\n" \
 LINE1:redisTotalConnectns#38524B:"REDIS Total Connections Received" \
 GPRINT:redisTotalConnectns:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisTotalConnectns:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisTotalConnectns:MAX:"Max \\: %8.2lf %s\\n"

report.redis.redismemory.name=Redis Memory
report.redis.redismemory.columns=redisUsedMemory
report.redis.redismemory.type=nodeSnmp
report.redis.redismemory.width=565
report.redis.redismemory.height=200
report.redis.redismemory.command=--title "Redis Memory" \
  --width 565 \
  --height 200 \
  DEF:redisUsedMemory={rrd1}:redisUsedMemory:AVERAGE \
  AREA:redisUsedMemory#3B7AD9:"REDIS Used Memory" \
  GPRINT:redisUsedMemory:AVERAGE:"Avg \\: %8.2lf %s" \
  GPRINT:redisUsedMemory:MIN:"Min \\: %8.2lf %s" \
  GPRINT:redisUsedMemory:MAX:"Max \\: %8.2lf %s\\n"

report.redis.rediscommands.name=Redis Commands
report.redis.rediscommands.columns=redisCommandsPrcssd
report.redis.rediscommands.type=nodeSnmp
report.redis.rediscommands.width=565
report.redis.rediscommands.height=200
report.redis.rediscommands.command=--title "Redis Commands" \
 --width 565 \
 --height 200 \
 DEF:redisCommandsPrcssd={rrd1}:redisCommandsPrcssd:AVERAGE \
 AREA:redisCommandsPrcssd#FF7200:"REDIS Total Commands Processed" \
 GPRINT:redisCommandsPrcssd:AVERAGE:"Avg \\: %8.2lf %s" \
 GPRINT:redisCommandsPrcssd:MIN:"Min \\: %8.2lf %s" \
 GPRINT:redisCommandsPrcssd:MAX:"Max \\: %8.2lf %s\\n"

report.redis.redisunsavedchanges.name=Redis Unsaved Changes
report.redis.redisunsavedchanges.columns=redisChangsSncLstSv
report.redis.redisunsavedchanges.type=nodeSnmp
report.redis.redisunsavedchanges.width=565
report.redis.redisunsavedchanges.height=200
report.redis.redisunsavedchanges.command=--title "Redis Unsaved Changes" \
  --width 565 \
  --height 200 \
  DEF:redisChangsSncLstSv={rrd1}:redisChangsSncLstSv:AVERAGE \
  AREA:redisChangsSncLstSv#A88558:"REDIS Changes Since Last Save" \
  GPRINT:redisChangsSncLstSv:AVERAGE:"Avg \\: %8.2lf %s" \
  GPRINT:redisChangsSncLstSv:MIN:"Min \\: %8.2lf %s" \
  GPRINT:redisChangsSncLstSv:MAX:"Max \\: %8.2lf %s\\n"

Do not forget to register the new graphs in the report list at the top of snmp-graph.properties file.

Restart OpenNMS, and it should start graphing your redis performance metrics. You’ll find those files in the opennms directory of the source distribution.

Enjoy :)

The Indirector - Puppet Extension Points 3

| Comments

This article is a follow-up of those previous two articles of this series on Puppet Internals:

Today we’ll cover the The Indirector. I believe that at the end of this post, you’ll know exactly what is the indirector and how it works.

The scene

The puppet source code needs to deal with lots of different abstractions to do its job. Among those abstraction you’ll find:

  • Certificates
  • Nodes
  • Facts
  • Catalogs

Each one those abstractions can be found in the Puppet source code under the form of a model class. For instance when Puppet needs to deal with the current node, it in fact deals with an instance of the node model class. This class is called Puppet::Node.

Each model can exist physically under different forms. For instance Facts can come from Facter or a YAML file, or Nodes can come from an ENC, LDAP, site.pp and so on. This is what we call a Terminus.

The Indirector allows the Puppet programmer to deal with model instances without having to manage herself the gory details of where this model instance is coming/going.

For instance, the code is the same for the client call site to find a node when it comes from an ENC or LDAP, because it’s irrelevant to the client code.

Actions

So you might be wondering what the Indirector allows to do with our models. Basically the Indirector implements a basic CRUD (Create, Retrieve, Update, Delete) system. In fact it implements 4 verbs (that maps to the CRUD and REST verb sets):

  • Find: allows to retrieve a specific instance, given through the key
  • Search: allows to retrieve some instances with a search term
  • Destroy: remove a given instance
  • Save: stores a given instance

You’ll see a little bit later how it is wired, but those verbs exist as class and/or instance methods in the models class.

So back to our Puppet::Node example, we can say this:

1
2
3
4
5
6
7
8
9
10
11
12
  # Finding a specific node
  node = Puppet::Node.find('test.daysofwonder.com')

  # here I can use node, being an instance of Puppet::Node
  puts "node: #{node.name}"

  # I can also save the given node (if the terminus allows it of course)
  # Note: save is implemented as an instance method
  node.save

  # we can also destroy a given node (if the terminus implements it):
  Puppet::Node.destroy('unwanted.daysowonder.com')

And this works for all the managed models, I could have done the exact same code with certificate instead of nodes.

Terminii

For the Latin illiterate out-there, terminii is the latin plural for terminus.

So a terminus is a concrete class that knows how to deal with a specific model type. A terminus exists only for a given model. For instance the catalog indirection can use the Compiler or the YAML terminus among half-dozen of available terminus.

The terminus is a class that should inherit somewhere in the class hierarchy from Puppet::Indirector::Terminus. This last sentence might be obscure but if your terminus for a given model directly inherits from Puppet::Indirector::Terminus, it is considered as an abstract terminus and won’t work.

1
2
3
4
5
6
7
8
9
10
11
12
13
  def find(request)
    # request.key contains the instance to find
  end

  def destroy(request)
  end

  def search(request)
  end

  def save(request)
    # request.instance contains the model instance to save
  end

The request parameter used above is an instance of Puppet::Indirector::Request. This request object contains a handful property that might be of interest when implementing a terminus. The first one is the key method which returns the name of the instance we want to manipulate. The other is instance which is available only when saving is a concrete instance of the model.

Implementing a terminus

To implement a new terminus of a given model, you need to add a ruby file of the terminus name in the puppet/indirector/<indirection>/<terminus>.rb.

For instance if we want to implement a new source of puppet nodes like storing node classes in DNS TXT resource records, we’d create a puppet/node/dns.rb file whose find method would ask for TXT RR using request.key.

Puppet already defines some common behavior like yaml based files, rest based, code based or executable based. A new terminus can inherit from one of those abstract terminus to inherit from its behavior.

I contributed (but hasn’t been merged yet) and OCSP system for Puppet. This one defines a new indirection: ocsp. This indirection contains two terminus:

The real concrete one that inherits from Puppet::Indirector::Code, it in fact delegates the OCSP request verification to the OCSP layer:

1
2
3
4
5
6
7
8
9
10
11
require 'puppet/indirector/ocsp'
require 'puppet/indirector/code'
require 'puppet/ssl/ocsp/responder'

class Puppet::Indirector::Ocsp::Ca < Puppet::Indirector::Code
  desc "OCSP request revocation verification through the local CA."

  def save(request)
    Puppet::SSL::Ocsp::Responder.respond(request.instance)
  end
end

It also has a REST terminus. This allows for a given implementation to talk to a remote puppet process (usually a puppetmaster) using the indirector without modifying client or server code:

1
2
3
4
5
6
7
8
9
require 'puppet/indirector/ocsp'
require 'puppet/indirector/rest'

class Puppet::Indirector::Ocsp::Rest < Puppet::Indirector::REST
  desc "Remote OCSP certificate REST remote revocation status."

  use_server_setting(:ca_server)
  use_port_setting(:ca_port)
end

As you can see we can do a REST client without implementing any network stuff!

Indirection creation

To tell Puppet that a given model class can be indirected it’s just a matter or adding a little bit of Ruby metaprogramming.

To keep my OCSP system example, the OCSP request model class is declared like this:

1
2
3
4
5
6
7
8
9
10
class Puppet::SSL::Ocsp::Request < Puppet::SSL::Base
  ...

  extend Puppet::Indirector
  # this will tell puppet that we have a new indirection
  # and our default terminus will be found in puppet/indirector/ocsp/ca.rb
  indirects :ocsp, :terminus_class => :ca

  ...
end

Basically we’re saying the our model Puppet::SSL::Ocsp::Request declares an indirection ocsp, whose default terminus class is ca. That means, if we straightly try to call Puppet::SSL::Ocsp::Request.find, the puppet/indirection/ocsp/ca.rb file will be used.

Terminus selection

There’s something I didn’t talk about. You might ask yourself how Puppet knows which terminus it should use when we call one of the indirector verb. As seen above, if nothing is done to configure it, it will default to the terminus given on the indirects call.

But it is configurable. The Puppet::Indirector module defines the terminus_class= method. This methods when called can be used to change the active terminus.

For instance in the puppet agent, the catalog indirection has a REST terminus, but in the master the same indirection uses the compiler:

1
2
3
4
5
  # puppet agent equivalent code
  Puppet::Resource::Catalog.terminus_class = :rest

  # puppet master equivalent code
  Puppet::Resource::Catalog.terminus_class = :compiler

In fact the code is a little bit more complicated than this for the catalog but in the end it’s equivalent.

There’s also the possibility for a puppet application to specify a routing table between indirection and terminus to simplify the wiring.

More than one type of terminii

There’s something I left aside earlier. There are in fact two types of terminii per indirection:

  • regular terminus as we saw earlier
  • cache terminus

For every model class we can define the regular indirection terminus and an optional cache terminus.

Then when finding for an instance the cache terminus will first be asked for. If not found in the cache (or asked to not get from the cache) the regular terminus will be used. Afterward the instance will be saved in the cache terminus.

This cache is exploited in lots of place in the Puppet code base.

Among those, the catalog cache terminus is set to :yaml on the agent. The effect is that when the agent retrieves the catalog from the master through the :rest regular terminus, it is locally saved by the yaml terminus. This way if the next agent run fails when retrieving the catalog through REST, it will used the previous one locally cached during the previous run.

Most of the certificate stuff is handled along the line of the catalog, with local caching with a file terminus.

REST Terminus in details

There is a direct translation between the REST verbs and the indirection verbs. Thus the :rest terminus:

  1. transforms the indirection and key to an URI: /<environment>/<indirection>/<key>
  2. does an HTTP GET|PUT|DELETE|POST depending on the indirection verb

On the server side, the Puppet network layer does the reverse, calling the right indirection methods based on the URI and the REST verb.

There’s also the possibility to sends parameters to the indirection and with REST, those are transformed into URL request parameters.

The indirection name used in the URI is pluralized by adding a trailing ‘s’ to the indirection name when doing a search, to be more REST. For example:

  • GET /production/certificate/test.daysofwonder.com is find
  • GET /production/certificates/unused is a search

When indirecting a model class, Puppet mixes-in the Puppet::Network::FormatHandler module. This module allows to render and convert an instance from and to a serialized format. The most used one in Puppet is called pson, which in fact is json in disguised name.

During a REST transaction, the instance can be serialized and deserialized using this format. Each model can define its preferred serialization format (for instance catalog use pson, but certificates prefer raw encoding).

On the HTTP level, we correctly add the various encoding headers reflecting the serialization used.

You will find a comprehensive list of all REST endpoint in puppet here

Puppet 2.7 indirection

The syntax I used in my samples are derived from the 2.6 puppet source. In Puppet 2.7, the dev team introduced (and are now contemplating removing) an indirection property in the model class which implements the indirector verbs (instead of being implemented directly in the model class).

This translates to:

1
2
3
4
5
  # 2.6 way, and possibly 2.8 onward
  Puppet::Node.find(...)

  # 2.7 way
  Puppet::Node.indirection.find(...)

Gory details anyone?

OK, so how it works?

Let’s focus on Puppet::Node.find call:

  1. Ruby loads the Puppet::Node class
  2. When mixing in Puppet::Indirector we created a bunch of find/destroy… methods in the current model class
  3. Ruby execute the indirects call from the Puppet::Indirector module
    1. This one creates a Puppet::Indirector::Indirection stored locally in the indirection class instance variable
    2. This also registers the given indirection in a global indirection list
    3. This also register the given default terminus class. The terminus are loaded with a Puppet::Util::Autoloader through a set of Puppet::Util::InstanceLoader
  4. When this terminus class is loaded, since it somewhat inherits from Puppet::Indirector::Terminus, the Puppet::Indirector:Terminus#inherited ruby callback is executed. This one after doing a bunch of safety checks register the terminus class as a valid terminus for the loaded indirection.
  5. We’re now ready to really call Puppet::Node.find. find is one of the method that we got when we mixed-in Puppet::Indirector
    1. find first create a Puppet::Indirector::Request, with the given key.
    2. It then checks the terminus cache if one has been defined. If the cache terminus finds an instance, this one is returned
    3. Otherwise find delegates to the registered terminus, by calling terminus.find(request)
    4. If there’s a result, this one is cached in the cache terminus
    5. and the result is returned

Pretty simple, isn’t it? And that’s about the same mechanism for the three other verbs.

It is to be noted that the terminus are loaded with the puppet autoloader. That means it should be possible to add more indirection and/or terminus as long as paths are respected and they are in the RUBYLIB. I don’t think though that those paths are pluginsync’ed.

Conclusion

I know that the indirector can be intimidating at first, but even without completely understanding the internals, it is quite easy to add a new terminus for a given indirection.

On the same subject, I highly recommends this presentation about Extending Puppet by Richard Crowley. This presentation also covers the indirector.

This article will certainly close the Puppet Extension Points series. The last remaining extension type (Faces) have already been covered thoroughly on the Puppetlabs Docs site.

The next article will I think cover the full picture of a full puppet agent/master run.

Puppet Extension Points - Part 2

| Comments

After the first part in this series of article on Puppet extensions points, I’m proud to deliver a new episode focusing on Types and Providers.

Note that there’s a really good chapter on the same topic in James Turnbull and Jeff McCune Pro Puppet (which I highly recommend if you’re a serious puppeteer). Also note that you can attend Puppetlabs Developper Training, which covers this topic.

Of Types and Providers

One of the great force of Puppet is how various heterogenous aspects of a given POSIX system (or not, like the Network Device system I contributed) are abstracted into simple elements: types.

Types are the foundation bricks of Puppet, you use them everyday to model how your systems are formed. Among the core types, you’ll find user, group, file, …

In Puppet, manifests define resources which are instances of their type. There can be only one resource of a given name (what we call the namevar, name or title) for a given catalog (which usually maps to a given host).

A type models what facets of a physical entity (like a host user) are managed by Puppet. These model facets are called “properties” in Puppet lingo.

Essentially a type is a name, some properties to be managed and some parameters. Paramaters are values that will help or direct Puppet to manage the resource (for instance the managehome parameter of the user type is not part of a given user on the host, but explains to Puppet that this user’s home directory is to be managed).

Let’s follow the life of a resource during a puppet run.

  1. During compilation, the puppet parser will instantiate Puppet::Parser::Resource instances which are Puppet::Resource objects. Those contains the various properties and parameters values defined in the manifest.

  2. Those resources are then inserted into the catalog (an instance of Puppet::Resource::Catalog)

  3. The catalog is then sent to the agent (usually in json format)

  4. The agent converts the catalog individual resources into RAL resources by virtue of Puppet::Resource#to_ral. We’re now dealing with instances of the real puppet type class. RAL means Resource Abstraction Layer.

    1. The agent then applies the catalog. This process creates the relationships graph so that we can manage resources in an order obeying require/before metaparameters. During catalog application, every RAL resource is evaluated. This process tells a given type to do what is necessary so that every managed property of the real underlying resource match what was specified in the manifest. The software system that does this is the provider.

So to summarize, a type defines to Puppet what properties it can manage and an accompanying provider is the process to manage them. Those two elements forms the Puppet RAL.

There can be more than one provider per type, depending on the host or platform. For instance every users have a login name on all kind of systems, but the way to create a new user can be completely different on Windows or Unix. In this case we can have a provider for Windows, one for OSX, one for Linux… Puppet knows how to select the best provider based on the facts (the same way you can confine facts to some operating systems, you can confine providers to some operating systems).

Looking Types into the eyes

I’ve written a combination of types/providers for this article. It allows to manage DNS zones and DNS Resource Records for DNS hosting providers (like AWS Route 53 or Zerigo). To simplify development I based the system on Fog DNS providers (you need to have the Fog gem installed to use those types on the agent). The full code of this system is available in my puppet-dns github repository.

This work defines two new Puppet types:

  • dnszone: manage a given DNS zone (ie a domain)
  • dnsrr: manage an individual DNS RR (like an A, AAAA, … record). It takes a name, a value and a type.

Here is how to use it in a manifest:





Let’s focus on the dnszone type, which is the simpler one of this module:





Note, that the dnszone type assumes there is a /etc/puppet/fog.yaml file that contains Fog DNS options and credentials as a hash encoded in yaml. Refer to the aforementioned github repository for more information and use case.

Exactly like parser functions, types are defined in ruby, and Puppet can autoload them. Thus types should obey to the Puppet type ruby namespace. That’s the reason we have to put types in puppet/type/. Once again this is ruby metaprogramming (in its all glory), to create a specific internal DSL that helps describe types to Puppet with simple directives (the alternative would have been to define a datastructure which would have been much less practical).

Let’s dive into the dnszone type.

  • Line 1, we’re calling the Puppet::Type#newtype method passing, first the type name as a ruby symbol (which should be unique among types), second a block (from line 1 to the end). The newtype method is imported in Puppet::Type but is in fact defined in Puppet::Metatype::ManagerNewtype job is to create a new singleton class whose parent is Puppet::Type (or a descendant if needed). Then the given block will be evaluated in class context (this means that the block is executed with self being the just created class). This singleton class is called Puppet::TypeDnszone in our case (but you see the pattern).

  • Line 2: we’re assigning a string to the Puppet::Type class variable @doc. This will be used to to extract type documentation.

  • Line 4: This straight word ensurable, is a class method in Puppet::Type. So when our type block is evaluated, this method will be called. This methods installs a new special property Ensure. This is a shortcut to automatically manage creation/deletion/existence of the managed resource. This automatically adds support for ensure =&gt; (present|absent) to your type. The provider still has to manage ensurability, though.

  • Line 6: Here we’re calling Puppet::Type#newparam. This tells our type that we’re going to have a parameter called “name”. Every resource in Puppet must have a unique key, this key is usually called the name or the title. We’re giving a block to this newparam method. The job of newparam is to create a new class descending of Puppet::Parameter, and to evaluate the given block in the context of this class (which means in this block self is a singleton class of Puppet::Parameter). Puppet::Parameter defines a bunch of utility class methods (that becomes apparent directives of our parameter DSL), among those we can find isnamevar which we’ve used for the name parameter. This tells Puppet type system that the name parameter is what will be the holder of the unique key of this type. The desc method allows to give some documentation about the parameter.

  • Line 12: we’re defining now the email parameter. And we’re using the newvalues class method of Puppet::Parameter. This method defines what possible values can be set to this parameter. We’re passing a regex that allows any string containing an ‘@’, which is certainly the worst regex to validate an e-mail address :) Puppet will raise an error if we don’t give a valid value to this parameter.

  • Line 17: and again a new parameter. This parameter is used to control Fog behavior (ie give to it your credential and fog provider used). Here we’re using defaultto, which means if we don’t pass a value then the defaultto value will be used.

  • Line 22: there is a possibility for a given resource to auto-require another resource. The same way a file resource can automatically add ‘require’ to its path ancestor. In our case, we’re autorequiring the yaml_fog_file, so that if it is managed by puppet, it will be evaluated before our dnszone resource (otherwise our fog provider might not have its credentials available).

Let’s now see another type which uses some other type DSL directives:





We’ll pass over the bits we already covered with the first type, and concentrate on new things:

  • Line 12: our dnszone type contained only parameters. Now it’s the first time we define a property. A property is exactly like a parameter but is fully managed by Puppet (see the chapter below). A property is an instance of a Puppet::Property class, which itself inherits from Puppet::Parameter, which means all the methods we’ve covered in our first example for parameters are available for properties. This type property is interesting because it defines discrete values. If you try to set something outside of this list of possible values, Puppet will raise an error. Values can be either ruby symbols or strings.

  • Line 17: a new property is defined here. With the isrequired method we tell Puppet that is is indeed necessary to have a value. And the validate methods will store the given validate block so that when Puppet will set the desired value to this property it will execute it. In our case we’ll report an error if the given value is empty.

  • Line 24: here we defined a global validation system. This will be called when all properties will have been assigned a value. This block executes in the instance context of the type, which means that we can access all instance variables and methods of Puppet::Type (in particualy the [] method that allows to access parameters/properties values). This allows to perform validation across the boundaries of a given parameter/property.

  • Line 25: finally, we declare a new parameter that references a dnszone. Note that we use a dynamic defaultto (with a block), so that we can look up the given resource name and derive our zone from the FQDN. This raises an important feature of the type system: the order of the declarations of the various blocks is important. Puppet will always respect the declaration order of the various properties when evaluating their values. That means a given property can access a value of another properties defined earlier.

I left managing RR TTL as an exercise to the astute reader :) Also note we didn’t cover all the directives the type DSL offers us. Notably, we didn’t see value munging (which allows to transform a string representation coming from the manifest to an internal (to the type) format). For instance that can be used to transform string IP address to the ruby IPAddr type for later use. I highly recommend you to browse the default types in the Puppet source distribution and check the various directives used there. You can also read Puppet::Parameter, Puppet::Property and Puppet::Type source code to see the ones we didn’t cover.

Life and death of Properties

So, we saw that a Puppet::Parameter is just a holder for the value coming from the manifest. A Puppet::Property is a parameter that along with the desired value (the one coming from the manifest) contains the current value (the one coming from the managed resource on the host). The first one is called the “should”, and the later one is called the “value”. Those innocently are methods of the Puppet::Property object and returns respectively those values. A property implements the following aspects:

  • it can retrieve a value from the managed resource. This is the operation of asking the real host resource to fetch its value. This is usually performed by delegation to the provider.

  • it can report its should which is the value given in the manifest

  • it can be insync?. This returns true if the retrieved value is equal to the “should” value.

  • and finally it might sync. Which means to the necessary so that “insync?” becomes true. If there is a provider for the given type, this one will be called to take care of the change.

When Puppet manages a resource, it does it with the help of a Puppet::Transaction. The given transaction orders the various properties that are not insync? to sync. Of course this is a bit more complex than that, because this is done while respecting resource ordering (the one given by the require/before metaparameter), but also propagating change events (so that service can be restarted and so on), and allowing resources to spawn child resources, etc… It’s perfectly possible to write a type without a provider, as long as all properties used implement their respective retrieve and sync methods. Some of the core types are doing this.

Providers

We’ve seen that properties usually delegate to the providers for managing the underlying real resource. In our example, we’ll have two providers, one for each defined type. There are two types of providers:

  • prefetch/flush
  • per properties

The per properties providers needs to implement a getter and a setter for every property of the accompanying type. When the transaction manipulates a given property its provider getter is called, and later on the setter will be called if the property is not insync?. It is the responsibility of those setters to flush those values to the physical managed resource. For some providers it is highly impractical or inefficient to flush on every property value change. To solve this issue, a given provider can be a prefetch/flush one. A prefetch/flush provider implements only two methods:

  • prefetch, which given a list of resources will in one call return a set of provider instances filled with the value fetched from the real resource.
  • flush will be called after all values will have been set, and that they can be persisted to the real resource.

The two providers I’ve written for this article are prefetch/flush ones, because it was impractical to call Fog for every property.

Anatomy of the dnszone provider

We’ll focus only on this provider, and I’ll leave as an exercise to the reader the analysis of the second one. Providers, being also ruby extensions, must live in the correct path respecting their ruby namespaces. For our dnszone fog provider, it should be in the puppet/provider/dnszone/fog.rb file. Unlike what I did for the types, I’ll split the provider code in parts so that I can explain them with the context. You can still browse the whole code.





This is how we tell Puppet that we have a new provider for a given type. If we decipher this, we’re fetching the dnszone type (which returns the singleton class of our dnszone type), and call the class method “provide”, passing it a name, some options and a big block. In our case, the provider is called “fog”, and our parent should be Puppet::Provider::Fog (which defines common methods for both of our fog providers, and is also a descendant of Puppet::Provider). Like for types, we have a desc class method in Puppet::Provider to store some documentation strings. We also have the confine method. This method will help Puppet choose the correct provider for a given type, ie its suitability. The confining system is managed by Puppet::Provider::Confiner. You can use:

  • a fact or puppet settings value, as in: confine :operatingsystem =&gt; :windows
  • a file existence: confine :exists =&gt; "/etc/passwd"
  • a Puppet “feature”, like we did for testing the fog library presence
  • an arbitrary boolean expression confine :true =&gt; 2 == 2

A provider can also be the default for a given fact value. This allows to make sure the correct provider is used for a given type, for instance the apt provider on debian/ubuntu platforms.

And to finish, a provider might need to call executables on the platform (and in fact most of them do). The Puppet::Provider class defines a shortcut to declare and use those executables easily:





Let’s continue our exploration of our dnszone provider





mk_resource_methods is an handy system that creates a bunch of setters/getters for every parameter/properties for us. Those fills values in the @property_hash hash.





The prefetch methods calls fog to fetch all the DNS zones, and then we match those with the ones managed by Puppet (from the resources hash).

For each match we instantiate a provider filled with the values coming from the underlying physical resource (in our case fog). For those that don’t match, we create a provider whose only existing properties is that ensure is absent.





Flush does the reverse of prefetch. Its role is to make sure the real underlying resource conforms to what Puppet wants it to be.

There are 3 possibilities:

  • the desired state is absent. We thus tell fog to destroy the given zone.
  • the desired state is present, but during prefetch we didn’t find the zone, we’re going to tell fog to create it.
  • the desired state is present, and we could find it during prefetch, in which case we’re just refreshing the fog zone.

To my knowledge this is used only for ralsh (puppet resource). The problem is that our provider can’t know how to access fog until it has a dnszone (which creates a chicken and egg problem :)

And finally we need to manage the Ensure property which requires our provider to implement: create, destroy and exists?.

In a prefetch/flush provider there’s no need to do more than controlling the ensure value.

Things to note:

  • a provider instance can access its resource with the resource accessor
  • a provider can access the current catalog through its resource.catalog accessor. This allows as I did in the dnsrr/fog.rb provider to retrieve a given resource (in this case the dnszone a given dnsrr depends to find how to access a given zone through fog).

Conclusion

We just surfaced the provider/type system (if you read everything you might disagree, though).

For instance we didn’t review the parsed file provider which is a beast in itself (the Pro Puppet book has a section about it if you want to learn how it works, the Puppet core host type is also a parsed file provider if you need a reference).

Anyway make sure to read the Puppet core code if you want to know more :) feel free to ask questions about Puppet on the puppet-dev mailing list or on the #puppet-dev irc channel on freenode, where you’ll find me under the masterzen nick.

And finally expect a little bit of time before the next episode, which will certainly cover the Indirector and how to add new terminus (but I first need to find an example, so suggestions are welcome).

Puppet Extension Points - Part 1

| Comments

It’s been a long time since my last blog post, almost a year. Not that I stopped hacking on Puppet or other things (even though I’m not as productive as I had been in the past), it’s just that so many things happened last year (Memoir’44 release, architecture work at Days of Wonder) that I lost the motivation of maintaining this blog.

But that’s over, I plan to start a series of Puppet internals articles. The first one (yes this one) is devoted to Puppet Extension Points.

Since a long time, Puppet contains a system to dynamically load ruby fragments to provide new functionalities both for the client and the master. Among the available extension points you’ll find:

  • manifests functions
  • custom facts
  • types and providers
  • faces

Moreover, Puppet contains a synchronization mechanism that allows you to ship your extensions into your manifests modules and those will be replicated automatically to the clients. This system is called pluginsync.

This first article will first dive into the ruby meta-programming used to create (some of) the extension DSL (not to be confused with the Puppet DSL which is the language used in the manifests). We’ll talk a lot about DSL and ruby meta programming. If you want to know more on those two topics, I’ll urge you to read those books:

Anatomy of a simple extension

Let’s start with the simplest form of extension: Parser Functions.

Functions are extensions of the Puppet Parser, the entity that reads and analyzes the puppet DSL (ie the manifests). This language contains a structure which is called “function”. You already use them a lot, for instance “include” or “template” are functions.

When the parser analyzes a given manifest, it detects the use of functions, and later on during the compilation phase the function code is executed and the result may be injected back into the compilation.

Here is a simple function:

The given function uses the puppet functions DSL to load the extension code into Puppet core code. This function is simple and does what its basename shell equivalent does: stripping leading paths in a given filename. For this function to work you need to drop it in the lib/puppet/parser/functions directory of your module. Why is that? It’s because after all, extensions are written in ruby and integrate into the Puppet ruby namespace. Functions in puppet live in the Puppet::Parser::Functions class, which itself belongs to the Puppet scope.

The Puppet::Parser::Functions class in Puppet core has the task of loading all functions defined in any puppet/parser/functions directories it will be able to find in the whole ruby load path. When Puppet uses a module, the modules’ lib directory is automatically added to the ruby load path. Later on, when parsing manifests and a function call is detected, the Puppet::Parser::Functions will try to load all the ruby files in all the puppet/parser/functions directory available in the ruby load path. This last task is done by the Puppet autoloader (available into Puppet::Util::Autoload). Let’s see how the above code is formed:

  • Line 1: this is ruby way to say that this file belongs to the puppet function namespace, so that Puppet::Parser::Functions will be able to load it. In real, we’re opening the ruby class Puppet::Parser::Functions, and all that will follow will apply to this specific puppet class.

  • Line 2: this is where ruby meta-programming is used. Translated to standard ruby, we’re just calling the “newfunction” method. Since we’re in the Puppet::Parser::Functions class, we in fact are just calling the class method Puppet::Parser::Functions#newfunction.

We pass to it 4 arguments:

  • the function name, encoded as a symbol. Functions name should be unique in a given environment
  • the function type: either your function is a rvalue (meaning a right-value, an entity that lies on the right side of an assignment operation, so in real English: a function that returns a value), or is not (in which case the function is just a side-effect function not returning any values).
  • a documentation string (here we used a ruby heredoc) which might be extracted later.
  • and finally we’re passing a ruby code block (from the do on line 5, to the inner end on line 10). This code block won’t be executed when puppet loads the functions.

  • Line 5 to 10. The body of the methods. When ruby loads the function file on behalf of Puppet, it will happily pass the code block to newfunction. This last one will store the code block for later use, and make it available in the Puppet scope class under the name function_basename (that’s one of the cool thing about ruby, you can arbitrarily create new methods on classes, objects or even instances).

So let’s see what happens when puppet parses and executes the following manifest:

The first thing that happens when compiling manifests is that the Puppet lexer triggers. It will read the manifest content and split it in tokens that the parser knows. So essentially the above content will be transformed in the following stream of tokens:

The parser, given this input, will reduce this to what we call an Abstract Syntax Tree. That’s a memory data structure (usually a tree) that represents the orders to be executed that was derived from the language grammar and the stream of tokens. In our case this will schematically be parsed as:

In turns, when puppet will compile the manifest (ie execute the above AST), this will be equivalent to this ruby operation:

Remember how Puppet::Parser::Functions#newfunction created the function_basename. At that time I didn’t really told you the exact truth. In fact newfunction creates a function in an environment specific object instance (so that functions can’t leak from one Puppet environment to another, which was one of the problem of 0.25.x). And any given Puppet scope which are instances of Puppet::Parser::Scope when constructed will mix in this environment object, and thus bring to life our shiny function as if it was defined in the scope ruby code itself.

Pluginsync

Let’s talk briefly about the way your modules extensions are propagated to the clients. So far we’ve seen that functions live in the master, but some other extensions types (like facts or types) essentially live in the client. Since it would be cumbersome for an admin to replicate all the given extensions to all the clients manually, Puppet offers pluginsync, a way to distribute this ruby code to the clients. It’s part of every puppet agent run, before asking for a catalog to the master. The interesting thing (and that happens in a lot of place into Puppet, which always amazes me), is that this pluginsync process is using Puppet itself to perform this synchronization. Puppet is good at synchronizing remotely and recursively a set of files living on the master. So pluginsync just create a small catalog containing a recursive File resource whose source is the plugins fileserver mount on the master, and the destination the current agent puppet lib directory (which is part of the ruby load path). Then this catalog is evaluated and the Puppet File resource mechanism does its magic and creates all the files locally, or synchronizes them if they differ. Finally, the agent loads all the ruby files it synchronized, registering the various extensions it contains, before asking for its host catalog.

Wants some facts?

The other extension point that you certainly already encountered is adding custom facts. A fact is simply a key, value tuple (both are strings). But we also usually call a fact the method that dynamically produces this tuple. Let’s see what it does internally. We’ll use the following example custom fact:





It’s no secret that Puppet uses Facter a lot. When a puppet agent wants a catalog, the first thing it does is asking Facter for a set of facts pertaining to the current machine. Then those facts are sent to the master when the agent asks for a catalog. The master injects those facts as variables in the root scope when compiling the manifests.

So, facts are executed in the agent. Those are pluginsync’ed as explained above, then loaded into the running process.

When that happens the add method of the Facter class is called. The block defined between line 2 and 6 is then executed in the Facter::Util::Resolution context. So the Facter::Util::Resolution#setcode method will be called and the block between line 3 and 5 will be stored for later use.

This Facter::Util::Resolution instance holding our fact code will be in turn stored in the facts collection under the name of the fact (see line 2).

Why is it done in this way? Because not all facts can run on every hosts. For instance our above facts does not work on Windows platform. So we should use facter way of confining our facts to architectures on which we know they’ll work. Thus Facter defines a set of methods like “confine” that can be called during the call of Facter#add (just add those outside of the setcode block).  Those methods will modify how the facts collection will be executed later on. It wouldn’t have been possible to confine our facts if we stored the whole Facter#add block and called it directly at fact resolution, hence the use of this two-steps system.

Conclusion

And, that’s all folks for the moment. Next episode will explain types and providers inner workings. I also plan an episode about other Puppet internals, like the parser, catalog evaluation, and/or the indirector system.

Tell me (though comments here or through my twitter handle @masterzen) if you’re interested in this kind of Puppet stuff, or if there are any specific topics you’d like me to cover :)

Puppet SSL Explained

| Comments

The puppet-users or #puppet freenode irc channel is full of questions or people struggling about the puppet SSL PKI. To my despair, there are also people wanting to completely get rid of any security.

While I don’t advocate the live happy, live without security motto of some puppet users (and I really think a corporate firewall is only one layer of defense among many, not the ultimate one), I hope this blog post will help them shoot themselves in their foot :)

I really think SSL or the X509 PKI is simple once you grasped its underlying concept. If you want to know more about SSL, I really think everybody should read Eric Rescola’s excellent ”SSL and TLS: Designing and Building Secure Systems”.

I myself had to deal with SSL internals and X509 PKI while I implemented a java secured network protocol in a previous life, including a cryptographic library.

Purpose of Puppet SSL PKI

The current puppet security layer has 3 aims:

  1. authenticate any node to the master (so that no rogue node can get a catalog from your master)
  2. authenticate the master on any node (so that your nodes are not tricked into getting a catalog from a rogue master).
  3. prevent communication eavesdropping between master and nodes (so that no rogue users can grab configuration secrets by listening to your traffic, which is useful in the cloud)

A notion of PKI

PKI means: Public Key Infrastructure. But whats this?

A PKI is a framework of computer security that allows authentication of individual components based on public key cryptography. The most known system is the x509 one which is used to protect our current web.

A public key cryptographic system works like this:

  • every components of the system has a secret key (known as the private key) and a public key (this one can be shared with other participant of the system). The public and private keys are usually bound by a cryptographic algorithm.
  • authentication of any component is done with a simple process: a component signs a message with its own private key. The receiver can authenticate the message (ie know the message comes from the original component) by validating the signature. To do this, only the public key is needed.

There are different public/private key pair cryptosystem, the most known ones are RSA, DSA or those based on Elliptic Curve cryptography.

Usually it is not good that all participants of the system must know each other to communicate. So most of the current PKI system use a hierarchical validation system, where all the participant in the system must only know one of the parent in the hierarchy to be able to validate each others.

X509 PKI

X509 is an ITU-T standard of a PKI. It is the base of the SSL protocol authentication that puppet use. This standard specifies certificates, certificate revocation list, authority and so on…

A given X509 certificate contains several information like those:

  • Serial number (which is unique for a given CA)
  • Issuer (who created this certificate, in puppet this is the CA)
  • Subject (who this certificate represents, in puppet this is the node certname or fqdn)
  • Validity (valid from, expiration date)
  • Public key (and what kind of public key algorithm has been used)
  • Various extensions (usually what this certificate can be used for,…)

You can check RFC1422 for more details.

The certificate is usually the DER encoding of the ASN.1 representation of those informations, and is usually stored as PEM for consumption.

A given X509 certificate is signed by what we call a Certificate Authority (CA for short). A CA is an infrastructure that can sign new certificates. Anyone sharing the public key of the CA can validate that a given certificate has been validated by the CA.

Usually X509 certificate embeds a RSA public key with an exponent of 0x100001 (see below).  Along with a certificate, you need a private key (usually also PEM-encoded).

So basically the X509 system works with the following principle: CA are using their own private keys to sign components certificates, it is the CA role to sign only trusted component certificates. The trust is usually established out-of-bound of the signing request.

Then every component in the system knows the CA certificate (ie public key). If one component gets a message from another component, it checks the attached message signature with the CA certificate. If that validates, then the component is authenticated. Of course the component should also check the certificate validity, if the certificate has been revoked (from OCSP or a given CRL), and finally that the certificate subject matches who the component pretends to be (usually this is an hostname validation against some part of the certificate Subject)

RSA system

Most of X509 certificate are based on the RSA cryptosystem, so let’s see what it is.

The RSA cryptosystem is a public key pair system that works like this:

Key Generation

To generate a RSA key, we chose two prime number p and q.

We compute n=pq. We call n the modulus.

We compute φ(pq) = (p − 1)(q − 1).

We chose e so that e>1 and e<φ(pq) (e and φ(pq) must be coprime). e is called the exponent. It usually is 0x10001 because it greatly simplifies the computations later (and you know what I mean if you already implemented this :)).

Finally we compute d=e-1 mod((p-1)(q-1)). This will be our secret key. Note that it is not possible to get d from only e (and since p and q are never kept after the computation this works).

In the end:

  • e and n form the public key
  • d is our private key

Encryption

So the usual actors when describing cryptosystems are Alice and Bob. Let’s use them.

Alice wants to send a message M to Bob. Alice knows Bob’s public key (e,n). She transform M in a number < n (this is called padding) that we’ll call m, then she computes: _c = me . mod(n) _

Decryption

When Bob wants to decrypt the message, he computes with his private key d: m = cd . mod(n)

Signing message

Now if Alice wants to sign a message to Bob. She first computes a hash of her message called H, then she computes: s = H(d mod n). So she used her own private key. She sends both the message and the signature.

Bob, then gets the message computes H and computes h’ = H(e mod n) with Alice’s public key. If h’ = h, then only Alice could have sent it.

Security

What makes this scheme work is the fundamental that finding p and q from n is a hard problem (understand for big values of n, it would take far longer than the validity of the message). This operation is called factorization. Current certificate are numbers containing  2048 bits, which roughly makes a 617 digits number to factor.

Want to know more?

Then there are a couple of books really worth reading:

How does this fit in SSL?

So SSL (which BTW means Secure Socket Layer) and now TLS (SSL successor) is a protocol that aims to provide security of communications between two peers. It is above the transport protocol (usually TCP/IP) in the OSI model. It does this by using symmetric encryption and message authentication code (MAC for short). The standard is (now) described in RFC5246.

It works by first performing an handshake between peers. Then all the remaining communications are encrypted and tamperproof.

This handshake contains several phases (some are optional):

  1. Client and server finds the best encryption scheme and MAC from the common list supported by both the server and the clients (in fact the server choses).
  2. The server then sends its certificate and any intermediate CA that the client might need
  3. The server may ask for the client certificate. The client may send its certificate.
  4. Both peers may validate those certificates (against a common CA, from the CRL, etc…)
  5. They then generate the session keys. The client generates a random number, encrypts it with the server public key. Only the server can decrypt it. From this random number, both peers generate the symmetric key that will be used for encryption and decryption.
  6. The client may send a signed message of the previous handshake message. This way the server can verify the client knows his private key (this is the client validation). This phase is optional.

After that, each message is encrypted with the generated session keys using a symmetric cipher, and validated with an agreed on MAC. Usual symmetric ciphers range from RC4 to AES. A symmetric cipher is used because those are usually way faster than any asymmetric systems.

Application to Puppet

Puppet defines it’s own Certificate Authority that is usually running on the master (it is possible to run a CA only server, for instance if you have more than one master).

This CA can be used to:

  • generate new certificate for a given client out-of-bound
  • sign a new node that just sent his Certificate Signing Request
  • revoke any signed certificate
  • display certificate fingerprints

What is important to understand is the following:

  • Every node knows the CA certificate. This allows to check the validity of the master from a node
  • The master doesn’t need the node certificate, since it’s sent by the client when connecting. It just need to make sure the client knows the private key and this certificate has been signed by the master CA.

It is also important to understand that when your master is running behind an Apache proxy (for Passenger setups) or Nginx proxy (ie some mongrel setups):

  • The proxy is the SSL endpoint. It does all the validation and authentication of the node.
  • Traffic between the proxy and the master happens in clear
  • The master knows the client has been authenticated because the proxy adds an HTTP header that says so (usually X-Client-Verify for Apache/Passenger).

When running with webrick, webrick runs inside the puppetmaster process and does all this internally. Webrick tells the master internally if the node is authenticated or not.

When the master starts for the 1st time, it generates its own CA certificate and private key, initializes the CRL and generates a special certificate which I will call the server certificate. This certificate will be the one used in the SSL/TLS communication as the server certificate that is later sent to the client. This certificate subject will be the current master FQDN. If your master is also a client of itself (ie it runs a puppet agent), I recommend using this certificate as the client certificate.

The more important thing is that this server certificate advertises the following extension:

1
2
X509v3 Subject Alternative Name:
                DNS:puppet, DNS:$fqdn, DNS:puppet.$domain

What this means is that this certificate will validate if the connection endpoint using it has any name matching puppet, the current fqdn or puppet in the current domain.

By default a client tries to connect to the ”puppet” host (this can be changed with –server which I don’t recommend and is usually the source of most SSL trouble).

If your DNS system is well behaving, the client will connect to puppet.$domain. If your DNS contains a CNAME for puppet to your real master fqdn, then when the client will validate the server certificate it will succeed because it will compare “puppet” to one of those DNS: entries in the aforementioned certificate extension. BTW, if you need to change this list, you can use the –certdnsname option (note: this can be done afterward, but requires to re-generate the server certificate).

The whole client process is the following:

  1. if the client runs for the 1st time, it generates a Certificate Signing Request and a private key. The former is an x509 certificate that is self-signed.
  2. the client connects to the master (at this time the client is not authenticated) and sends its CSR, it will also receives the CA certificate and the CRL in return.
  3. the master stores locally the CSR
  4. the administrator checks the CSR and can eventually sign it (this process can be automated with autosigning). I strongly suggest verifying certificate fingerprint at this stage.
  5. the client is then waiting for his signed certificate, which the master ultimately sends
  6. All next communications will use this client certificate. Both the master and client will authenticate each others by virtue of sharing the same CA.

Tips and Tricks

Troubleshooting SSL

Certificate content

First you can check any certificate content with this:

Simulate a SSL connection

You can know more information about a SSL error by simulating a client connection. Log in the trouble node and:

Check the last line of the report, it should say “Verify return code: 0 (ok)” if both the server and client authenticated each others. Check also the various information bits to see what certificate were sent. In case of error, you can learn about the failure by looking that the verification error message.

ssldump

Using ssldump or wireshark you can also learn more about ssl issues. For this to work, it is usually needed to force the cipher to use a simple cipher like RC4 (and also ssldump needs to know the private keys if you want it to decrypt the application data).

Some known issues

Also, in case of SSL troubles make sure your master isn’t using a different $ssldir than what you are thinking. If that happens, it’s possible your master is using a different dir and has regenerated its CA. If that happens no one node can connect to it anymore. This can happen if you upgrade a master from gem when it was installed first with a package (or the reverse).

If you regenerate a host, but forgot to remove its cert from the CA (with puppetca –clean), the master will refuse to sign it. If for any reason you need to fully re-install a given node without changing its fqdn, either use the previous certificate or clean this node certificate (which will automatically revoke the certificate for your own security).

Looking to the CRL content:

Notice how the certificate serial number 3 has been revoked.

Fingerprinting

Since puppet 2.6.0, it is possible to fingerprint certificates. If you manually sign your node, it is important to make sure you are signing the correct node and not a rogue system trying to pretend it is some genuine node. To do this you can get the certificate fingerprint of a node by running puppet agent –fingerprint, and when listing on the master the various CSR, you can make sure both fingerprint match.

Dirty Trick

Earlier I was saying that when running with a reverse proxy in front of Puppet, this one is the SSL endpoint and it propagates the authentication status to Puppet.

I strongly don’t recommend implementing the following. This will compromise your setup security.

This can be used to severely remove Puppet security for instance you can:

  • make so that every nodes are authenticated for the server by always returning the correct header
  • make so that nodes are authenticated based on their IP addresses or fqdn

You can even combine this with a mono-certificate deployment. The idea is that every node share the same certificate. This can be useful when you need to provision tons of short-lived nodes. Just generate on your master a certificate:

You can then use those generated certificate (which will end up in /var/lib/puppet/ssl/certs and /var/lib/puppet/private_keys) in a pre-canned $ssldir, provided you rename it to the local fqdn (or symlink it). Since this certificate is already signed by the CA, it is valid. The only remaining issue is that the master will serve the catalog of this certificate certname. I proposed a patch to fix this, this patch will be part of 2.6.3. In this case the master will serve the catalog of the given connecting node and not the connecting certname. Of course you need a relaxed auth.conf:

Caveat: I didn’t try, but it should work. YMMV :)

Of course if you follow this and shoot yourself in the foot, I can’t be held responsible for any reasons, you are warned. Think twice and maybe thrice before implementing this.

Multiple CA or reusing an existing CA

This goes beyond the object of this blog post, and I must admit I never tried this. Please refer to: Managing Multiple Certificate Authorities and  Puppet Scalability

Conclusion

If there is one: security is necessary when dealing with configuration management. We don’t want any node to trust rogue masters, we don’t want masters to distribute sensitive configuration data to rogue nodes. We even don’t want a rogue user sharing the same network to read the configuration traffic. Now that you fully understand SSL, and the X509 PKI, I’m sure you’ll be able to design some clever attacks against a Puppet setup :)

Benchmarking Puppetmaster Stacks

| Comments

It’s been a long time since my last puppet blog post about file content offloading. Two puppetcamps even passed (more on the last one in a next blog article). A new major puppet release (2.6) was even released, addressing lots of performance issues (including the file streaming patch I contributed).

In this new major version, I contributed a new 3rd party executable (available in the ext/ directory in the source tree) that allows to simulate concurrent nodes hammering a puppetmaster. This tool is called puppet-load.

Rationale

I created this tool for several reasons:

  • I wanted to be able to benchmark and compare several ruby interpreter (like comparing JRuby against MRI)
  • I wanted to be able to benchmark and compare several deployements solutions (like passenger against mongrel)

There was already a testing tool (called puppet-test) that could do that. Unfortunately puppet-test had the following issues:

  • No REST support besides some never merged patches I contributed, which render it moot to test 0.25 or 2.6 :(
  • based on a forking process models, so simulating many clients is not resource friendly
  • it consumes the master response and fully unserializes it creating puppet internals objects, which takes plenty of RAM and time, penalizing the concurrency.
  • no useful metrics, except the time the operation took (which was in my test mostly dominated by the unserialization of the response)

Based on those issues, I crafted from scratch a tool that:

  • is able to impose an high concurrency to a puppetmaster, because it is based on EventMachine (no threads or processes are harmed in this program)
  • is lightweight because it doesn’t consume puppet responses
  • is able to gather some (useful or not) metrics and aggregates them

Caveats

For the moment, puppet-load is still very new and only supports catalog compilations for a single node (even though it simulates many clients in parallel requesting this catalog). I just released a patch to support multiple node catalogs. I also plan to support file sourcing in the future.

So far, since puppet-load exercise a puppetmaster in such a hard way, achieving concurrencies nobody has seen on production puppetmasters, we were able to find and fix half a dozen threading race condition bugs in the puppet code (some have been fixed in 2.6.1 and 2.6.2, the others will soon be fixed).

Usage

The first thing to do is to generate a certificate and its accompanying private key:

Then modify your auth.conf (or create one if you don’t have one) to allow puppet-load to compile catalos. Unfortunately until #5020 is merged, the puppetmaster will use the client certname as the node to compile instead of the given URI. Let’s pretend your master has the patch #5020 applied (this is a one-liner).

Next, we need the facts of the client we’ll simulate. Puppet-load will overwrite the ‘fqdn’, ‘hostname’ and ‘domain’ facts with values inferred from the current node name.

Then launch puppet-load against a puppet master:

If we try with an higher concurrency (here my master is running under webrick with a 1 resource catalog, so compilations are extremely fast):

It returns a bunch of informations. First if you ran it in debug mode, it would have printed when it would start simulated clients (up to the given concurrency) and when it receives the response.

Then it displays some important information:

  • availability %: which is the percent of non-error response it received
  • min and max request time
  • average and median request time (this can be used to see if the master served clients in a fair way)
  • real concurrency: how many clients the master was able to serve in parallel
  • transaction rate: how many compilation per seconds the master was able to perform (I expect this number to vary in function of applied concurrency)
  • various transfer metrics like throughput and catalog size transferred: this can be useful to understand the amount of information transferred to every clients (hint: puppet 2.6 and puppet-load both support http compression)

At last puppetcamp, Jason Wright from Google, briefly talked about puppet-load (thanks Jason!). It was apparently already helpful to diagnose performance issues in his External Node Tool classifier.

If you also use puppet-load, and/or have ideas on how to improve it, please let me know! If you have interesting results to share like comparison of various puppet master stacks, let me know!