Posted on 14 March 2013
Today I have a case of conflicting advice from different tools.
I’m trying to generate GPG keys on virtual machines. The usual way of generating cryptographically secure keys is to use a good entropy source, such as /dev/random. However, this source gets most of its entropy from physical devices attached to the machine, such as keyboards, mice, and disks, none of which are available to a VM. Therefore, when generating GPG keys, it’s quite common to find yourself starved of entropy, with a message like:
We need to generate a lot of random bytes. It is a good idea to perform some other action (type on the keyboard, move the mouse, utilize the disks) during the prime generation; this gives the random number generator a better chance to gain enough entropy. Not enough random bytes available. Please do some other work to give the OS a chance to collect more entropy! (Need 280 more bytes)
However, things got more curious when I looked at the manpage for /dev/random (which you can find with man 4 random):
While some safety margin above that minimum is reasonable, as a guard against flaws in the CPRNG algorithm, no cryptographic primitive available today can hope to promise more than 256 bits of security, so if any program reads more than 256 bits (32 bytes) from the kernel random pool per invocation, or per reasonable reseed interval (not less than one minute), that should be taken as a sign that its cryptography is not skilfully implemented.
I did a bit of reading into the meaning behind this. The argument goes something like this:
Breaking a GPG key requires less than 256 bits’ worth of effort: I don’t remember the details, but a 2048-bit GPG key requires something in the region of 100-200 bits’ worth of effort to crack (NIST, for instance, rates 2048-bit RSA at roughly 112 bits of security), because you are performing a prime factorization rather than a brute-force search of the keyspace.
So, why is GPG greedily asking /dev/random for 280 bytes of entropy, when all it conceivably needs is 32? I’m not sure, and I’d be delighted to learn, but it seems that /dev/random and GPG fundamentally disagree on what the contract is between them. What this means for me as a user, however, is that GPG is massively gorging itself on entropy from my entropy-starved VM, which means it takes forever and a day to generate GPG keys on a VM.
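If you want to see just how starved your VM is, the Linux kernel reports the current size of its entropy pool through procfs; this is only a quick diagnostic, not part of any fix:

# how many bits of entropy the kernel currently has available
cat /proc/sys/kernel/random/entropy_avail

# watch the pool drain while gpg --gen-key runs in another terminal
watch -n1 cat /proc/sys/kernel/random/entropy_avail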
Interestingly, OS X implements its /dev/random device differently; it uses Schneier, Kelsey and Ferguson’s Yarrow algorithm, which operates on a similar basis to that given above: once you have achieved a certain minimal level of true entropy, you can use that as a seed to a PRNG to feed cryptographic key generators with no loss of security in the system. That means that once it has gathered its initial 256 bits (or whatever) of entropy, OS X’s /dev/random will continue generating random bits effectively forever, making it a much better choice of PRNG for a VM.
PS: Instead of brute-forcing the seed, there is a potential alternative attack against the PRNG: someone finds a way to predict the PRNG output with much less computational effort than brute-force guessing the seed. But this is much the same kind of attack as “someone finds a problem with AES” or “someone finds a problem with GPG” — ie we presume our cryptographic primitives are good because no attack against them has been discovered, not because we are able to prove that no attack is possible. Using true entropy instead of a PRNG guards against attacks against your PRNG, but you still need to worry about attacks against your crypto algorithm if you’re being that paranoid. IOW, I don’t think GPG’s strategy here is the right tradeoff.
Posted on 01 October 2012
I just finished watching this great interview of John Allspaw on Devops and Continuous Delivery. John Allspaw is SVP of Tech Operations at Etsy.
It’s worth watching the full talk, but here are some of the things I took from it:
“We may deploy 20 times a day, but we wouldn’t deploy 20 times a day if we went down 20 times a day. The only reason we got to 20 times a day, is that the first time we deployed 5 times a day, it worked out.”
I have nothing to add to this.
I particularly liked the question “What’s the role of operations in an organization that wants to practice devops?” There is an idea floating around that devops means that there should no longer be separate development and operations teams — and while there is a lot of merit in forming cross-functional teams, this doesn’t necessarily mean that we can (or should) do away with operations entirely.
Certainly, product-focussed teams should be taking on a lot of what was traditionally operational responsibility — but they don’t have to take it all on. For example, John describes this process at Etsy as freeing the operations team from “reactive work” — eg deployments — and allowing them to focus instead on “proactive work” — eg designing infrastructure.
I have previously been a developer, writing Java code and using dbdeploy to migrate the database schema in line with deploying a new version of code which requires a schema change. I have also been on a more operational team, deploying other people’s ruby code to production using capistrano, whose deploy:migrations target handles database migrations in sync with an application deployment. Database migrations have always made me nervous — they are in general irreversible, and the database is such a core part of the system that a failed migration can be disastrous to recover from.
John Allspaw has worked at places that deploy 50 times a day. He has fielded the question “If you deploy 50 times a day, how do you change the database 50 times a day?” The answer is quite simple: you don’t.
He instead describes his previous experience at Flickr, where frequent code deployments were enabled by separating code and database deployment. The database is migrated maybe once a week, and the schemas are in place before the code that needs to use those schemas is deployed.
This is one of those obvious-in-hindsight revelations. Working with Clojure has previously encouraged me to look for entangled concerns in code and to decompose code into simple pieces. John’s solution to the database migration problem is the same approach in a different sphere — we want to deploy frequently, but database deployment is risky. Ergo, we should decouple code deployment from database deployment.
That’s what I got from the talk, but he spoke about a whole bunch more topics beyond that. Give it a watch!
Link again: John Allspaw on Devops and Continuous Delivery
Posted on 26 September 2012
A short post about a problem we were having.
If you are load balancing https traffic with haproxy in tcp mode, and you are fronting this with nginx, and you get 502 errors accompanied by these SSL errors in nginx’s error log:
SSL_do_handshake() failed (SSL: error:1408C095:SSL routines:SSL3_GET_FINISHED:digest check failed)
then you need to turn off the proxy_ssl_session_reuse option:
proxy_ssl_session_reuse off;
By default, nginx tries to reuse ssl sessions for an https upstream; but when HAProxy is round-robining the tcp connections between different backends, the ssl session will not be valid from one tcp connection to the next.
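For context, the relevant part of the nginx configuration looks roughly like this (the address and port of the haproxy listener are made up for the sketch; the proxy_ssl_session_reuse line is the actual fix):

location / {
    # haproxy, in tcp mode, listening on a hypothetical local port
    proxy_pass https://127.0.0.1:8443;
    # don't try to reuse the SSL session across haproxy's round-robined backends
    proxy_ssl_session_reuse off;
}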
UPDATE: @zaargy points out that the development branch of haproxy has https support. Awesome!
Posted on 28 June 2012
This doesn’t seem to be specified anywhere in the rspec-puppet documentation, so I thought I’d leave it here for the moment. Suppose you have a puppet type which always depends on another:
define foo () {
  file { "/etc/${name}": }
  Bar[$name] -> Foo[$name]
}
If you want to write an rspec-puppet unit test for this, it will fail because it can’t find the resource Bar[$name], unless you define it as a precondition:
describe 'foo', :type => :define do
  let(:title) { 'my-foo' }
  let(:pre_condition) { 'bar { "my-foo" }' }

  it { should contain_file('/etc/my-foo') }
end
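By the usual rspec-puppet layout this spec would live at something like spec/defines/foo_spec.rb (that path is my assumption, not something from the project above), and you run it like any other spec:

rspec spec/defines/foo_spec.rb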
Posted on 06 June 2012
I was on a project recently where we wanted to deploy a Ruby Sinatra application to a CentOS 6.2 production environment. Our means of distributing software to all our environments was RPM – we took our sinatra app, packaged it into an RPM, and stuck it in production. Installing all our software via RPM has certain advantages:
Any nontrivial ruby application will want to depend on some gems, and ours was no different. We used bundler to manage our gem dependencies. This carries its own advantages:
We were also following the advice of vendor everything — we were running bundle package to download gem files to vendor/cache and checking them into source control. This practice means:
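For reference, the vendoring step itself is just the following (a sketch; the commit message is obviously made up):

# download .gem files for everything in the Gemfile into vendor/cache
bundle package

# check the cached gems into source control alongside the Gemfile
git add Gemfile Gemfile.lock vendor/cache
git commit -m "vendor gem dependencies"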
However, we quickly hit a number of issues with bundler which made it difficult to package up our RPM satisfactorily:
In order to get bundler to install the gems that we have previously stored in the vendor/cache directory, we need to run bundle install --deployment. The --deployment option combines all sorts of desirable options for a production environment:
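Roughly speaking (this is my paraphrase of the bundler documentation, not an exhaustive list), --deployment behaves something like this:

# --deployment approximately combines:
#   --frozen              error out if the Gemfile has changed since Gemfile.lock
#                         was generated, and require that Gemfile.lock exists
#   --path vendor/bundle  install gems into the application's own vendor/bundle
#                         directory rather than the system gem path
bundle install --frozen --path vendor/bundle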
A fundamental question we had was: at what point in the build/test/deploy process should we run bundle install --deployment?
The bundler docs are pretty clear about this: all of the deployment examples run bundle install on the target machine; the bundle install overview page says of --deployment: “Do not use this flag on a development machine.”, though it offers no reason why. (The man page says it will cause an error when the Gemfile is modified, but it doesn’t say why this will happen. For that, see the next section.)
Conversely, the philosophy of RPM is pretty clear too: bundle install --deployment should not be run on the target machine, because it creates a vendor/bundle directory which does not belong to any RPM. This means that when we uninstall or upgrade the RPM, the vendor/bundle directory will be left behind, potentially poisoning the bundle for future versions of the app. We could add a %preun script in our RPM specfile to remove the bundle and the .bundle/config file, but it’s a hack. What we really want is to deploy our gems into their final locations on the CI server, and package them up into an RPM.
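For what it’s worth, the hack would look something like this in the specfile (a sketch only, assuming the install layout used later in this post; we didn’t ship it):

%preun
# $1 is 0 on a full uninstall, 1 on an upgrade; only clean up on uninstall,
# otherwise we would delete the bundle the new package has just installed
if [ "$1" -eq 0 ] ; then
  rm -rf /usr/lib/%{name}/vendor/bundle
  rm -f /usr/lib/%{name}/.bundle/config
fi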
It seems that bundler and RPM have competing design principles, so they don’t want to play nicely together.
Bundler also has a confusing habit of implicitly creating and storing all sorts of state. There are two main culprits here: the .bundle/config file, and environment variables.
The .bundle/config file, which lives in the same place as the Gemfile, is the reason that you shouldn’t run bundle install --deployment on a development system. Bundler will save state to this file about the installation that it has done: location of installed gems, excluded groups, whether or not the gemfile is frozen, etc.
Bundler also sets up some environment variables which mean that bundler is not reentrant. We came up against problems during our build process, where within our Rakefile we had the line:
bundle install \
  --path %{buildroot}/usr/lib/%{name}/vendor/bundle/ \
  --deployment \
  --binstubs %{buildroot}/usr/lib/%{name}/vendor/bin/ \
  --without test
If we ran the rakefile using plain old rake package, it would create our package with no issues. However, we want to use bundler to manage all of our gems — build, test and production dependencies. We want to use a bundler-provided rake, not a system-installed one. But if we ran rake using bundle exec rake package, it would fail with the following errors:
$ bundle exec rake package
# ... lots of output ...
+ bundle install --path /home/ppotter/src/node-api/BUILDROOT/node-api-0.0.3-9001.x86_64/usr/lib/node-api/vendor/bundle/ --deployment --binstubs /home/ppotter/src/node-api/BUILDROOT/node-api-0.0.3-9001.x86_64/usr/lib/node-api/vendor/bin/ --without test
Could not find rake-0.9.2.2 in any of the sources
Run `bundle install` to install missing gems.
This is confusing — bundle install --deployment shouldn’t care about the rake gem, because in our Gemfile we’ve declared it in the test group, which we are excluding using --without test. Furthermore, the working directory for this command is /home/ppotter/src/node-api/BUILD/node-api, which is different from the directory where we are running bundle exec rake package, so any /home/ppotter/src/node-api/.bundle/config file which the outer bundler process has created should not conflict.
The error occurs because bundler achieves much of its magic by setting various environment variables. To prevent the outer bundler instance — the one that runs rake — from interfering with the inner bundler instance — the one that installs our gems in deployment mode to the BUILDROOT directory — we need to unset those environment variables:
env -u BUNDLE_GEMFILE -u BUNDLE_BIN_PATH -u RUBYOPT -u GEM_HOME -u GEM_PATH \
  bundle install \
  --path %{buildroot}/usr/lib/%{name}/vendor/bundle/ \
  --deployment \
  --binstubs %{buildroot}/usr/lib/%{name}/vendor/bin/ \
  --without test
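An alternative, if you are shelling out from Ruby rather than from the specfile, is Bundler.with_clean_env, which strips the same variables for the duration of a block. We stuck with env -u, but a minimal sketch of the Rakefile approach would be:

require 'bundler'

task :install_bundle do
  # run the inner bundle install without the outer bundler's environment
  # variables (BUNDLE_GEMFILE, RUBYOPT, GEM_HOME, etc.) leaking into it
  Bundler.with_clean_env do
    sh 'bundle install --deployment --path vendor/bundle --without test'
  end
end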
The .bundle/config file (which lives in the same place as the Gemfile) contains configuration which tells bundler where it has installed its gems. If we want to package our bundler-installed gems into an RPM, we need to also package .bundle/config so that bundler will know where the gems live on the target machine. Here is mine, after running the above command:
---
BUNDLE_WITHOUT: test
BUNDLE_FROZEN: "1"
BUNDLE_BIN: /home/ppotter/src/node-api/BUILDROOT/node-api-0.0.3-9001.x86_64/usr/lib/node-api/vendor/bin/
BUNDLE_PATH: /home/ppotter/src/node-api/BUILDROOT/node-api-0.0.3-9001.x86_64/usr/lib/node-api/vendor/bundle/
BUNDLE_DISABLE_SHARED_GEMS: "1"
This is clearly going to cause problems if we package this file as-is, because the gems are not going to live in these directories but instead in /usr/lib/node-api/vendor/bundle. We need to strip the leading BUILDROOT path from the directories in this file before we can package it. We do this with a sed script in the %install section of the RPM specfile:
sed -i -e 's,%{buildroot},,' %{buildroot}/usr/lib/%{name}/.bundle/config
I’m sure that this is not the “bundler way” of doing things, but as I have said before, bundler’s and RPM’s worldviews are seemingly irreconcilable, and something like this is necessary to get them to work together.
The process of installing gems can also install system-specific extensions which will not be as portable as pure ruby code, nor as portable as the source .gem files from the vendor/cache directory. This is another reason for recommending that you run bundle install --deployment on the target machine rather than in the build environment.
RPM, however, also has a way of coping with this portability problem, by marking packages as architecture-specific. If you don’t specify an architecture yourself, rpmbuild will even autodetect any system-specific binaries in your RPM and give it an appropriate tag. We relied on this behaviour and sure enough, our resultant RPM is considered x86_64 code rather than noarch. This is fine for our production environment, where all machines run the same hardware and OS.
We still need bundler to be present on the target machine. We used the fantastic fpm tool to create a rubygem-bundler RPM, and made our node-api RPM depend on rubygem-bundler.
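The fpm invocation for that is roughly the following (a sketch from memory; check fpm’s own documentation for the exact flags and versions you want to pin):

# fetch the bundler gem from rubygems and repackage it as a rubygem-bundler RPM
fpm -s gem -t rpm bundler

# the application RPM then declares, in its specfile:
#   Requires: rubygem-bundler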
We could have used fpm to package every single gem as a separate RPM. We didn’t go with this option, because it doesn’t enable separation of sets of gems used by different applications, and it doesn’t allow us to know exactly which gems will be used by a particular source code version.
Bundler 1.1 (which hasn’t yet been released) provides a --standalone option which allows you to install gems in such a way that they don’t depend on bundler. I’d be very interested to investigate this option for packaging ruby apps as RPMs, although since we were running our ruby through Phusion Passenger I wonder whether it would work for us, or if it only works for the bundler-created binstubs.
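For completeness, my (untested) understanding from the bundler docs is that standalone mode looks something like this:

# install the bundle and generate a loader that doesn't need bundler at runtime
bundle install --standalone

# then, instead of requiring bundler at boot (e.g. at the top of config.ru):
#   require './bundle/bundler/setup'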
After all of this work, we have a solution which combines the advantages of both RPM and bundler: