Lyte's Blog

Bad code, bad humour and bad hair.

Spying With PHPUnit

Trying to spy on invocations with PHPUnit seems to normally involve either writing your own spy class:

class IAmASpy {
  public $invocations = array();
  public function foo() {
      $this->invocations []= 'foo';
  }
}

or trying to use expectations on mock objects to determine that things were called with the right arguments:

$mock = $this->getMock('Foo');
$mock->expects($this->once())
    ->method('bar')
    ->with($this->identicalTo('baz'));

What A Pain!

What if you want to check the arguments going into the last call? Well, you can use at():

$mock->expects($this->at(7)) // ...

… better hope we never add any other calls!

What if we don’t know the exact parameter that it’s being called with and want to check it with something more complex? Well if you dig really hard in the manual you’ll find there’s a whole bunch of assertions that let you feed in crazier stuff like:

// ...
->with($this->matchesRegularExpression(
    '/Oh how I love (regex|Regular Expressions)/'
));

So that’s pretty cool, if you happen to like really obscure features that are impossible to remember.

Surely there’s a better way? Think of the children!

What if you could just ask for all the invocations and test that they were right in that language you’re already using for all your production logic? Wouldn’t that be just dandy!

Turns out you can, but it’s hiding – and I don’t mean it’s hiding in a “you will find this if you read the manual” kind of way, I mean it’s hiding in the source code, where everyone totally looks first for easy examples, right?

All you have to do is store the result of $this->any() and you can use it as a spy:

$exec->expects($spy = $this->any())
    ->method('foo');

(I’ve got to wonder if documenting those extra 7 characters might be the proverbial straw that breaks the PHPUnit manual’s back.)

Now that you have a spy, you can just do normal stuff that calls it, then use normal PHP logic (I had to laugh when I wrote “normal PHP logic”) to confirm it’s right:

// get the last invocation (end() needs a real variable, not a return value)
$invocations = $spy->getInvocations();
$invocation = end($invocations);
$this->assertEquals('foo', $invocation->parameters[0]);

An Example You Say?

As a concrete example, let’s ensure the NSA is spying on its citizens just the right amount.

<?php
// What we're testing today
class AverageCitizen {
    public function spyOn() {}
}

// Our tests (yes, normally these would be in some other file)
class TestAverageCitizens extends PHPUnit_Framework_TestCase {
    public function testSpyingLikeTheNSAShould() {
        $citizen = $this->getMock('AverageCitizen');
        $citizen->expects($spy = $this->any())
            ->method('spyOn');

        $citizen->spyOn("foo");

        $invocations = $spy->getInvocations();

        $this->assertEquals(1, count($invocations));

        // we can easily check specific arguments too
        $last = end($invocations);
        $this->assertEquals("foo", $last->parameters[0]);
    }

    public function testSpyingLikeTheNSADoes() {
        $citizen = $this->getMock('AverageCitizen');
        $citizen->expects($spy = $this->any())
            ->method('spyOn');

        $citizen->spyOn("foo");
        $citizen->spyOn("bar");

        $invocations = $spy->getInvocations();

        $this->assertEquals(1, count($invocations));
    }
}
?>

and when we run the tests we can see that even PHPUnit knows the NSA has crossed the line:

$ phpunit --debug test.php 
PHPUnit 3.6.10 by Sebastian Bergmann.


Starting test 'TestAverageCitizens::testSpyingLikeTheNSAShould'.
.
Starting test 'TestAverageCitizens::testSpyingLikeTheNSADoes'.
F

Time: 0 seconds, Memory: 3.25Mb

There was 1 failure:

1) TestAverageCitizens::testSpyingLikeTheNSADoes
Failed asserting that 2 matches expected 1.

/i/be/a/coder/test.php:35

FAILURES!
Tests: 2, Assertions: 4, Failures: 1

Keep Cron Simple Stupid

Someone in our Sys Admin chat at work brought up yesterday that crontab and % are insane; the summary was:

If you want the % character in a command, as part of a cronjob:
1. You escape the %, so it becomes \%
2. echo and pipe the command you want to run (with the escaped %) into sed
3. Have sed unescape the %
4. Pipe it into the original program
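
To make that concrete, here’s a hypothetical (and untested) rendering of those four steps – the log path and command are made up, and it’s exactly the sort of line the rest of this post argues you shouldn’t write:

# step 1: escape the %s (cron would otherwise eat them); steps 2-4: echo the
# command into sed to unescape it, then pipe the result into sh
* * * * * echo 'grep -c "100\%" /var/log/use.log' | sed -e 's/\\\%/\%/g' | sh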

Which I responded to with roughly “if you want a ‘%’ in your cron line you actually want a shell script instead”… this turned out to be a great debate, as a lot of very good Sys Admins (who were online at the time) completely disagreed with me until I’d spelled out my argument in more detail.

Problems with cron

  • The syntax can be quite insane if you’re expecting it to behave like shell (hint: cron != shell)
  • There’s no widely used crontab linter (I was going to leave it at “there’s no crontab linter”, but found chkcrontab while writing this, which looks like a good start but isn’t packaged for any distro I’ve checked yet)
  • Badly breaking syntax in a crontab file will cause all jobs in that file to stop running (usually with no error recorded anywhere)
  • Unless you’re double-entering your scheduling information, your monitoring solution won’t be able to pick up the absence of the job when it fails to run
  • I’m stupid (yes this is a problem with cron)

All of these have led me to break cron lots of times and even more times I’ve had to try to figure out why a scheduled job isn’t running after someone else has broken it for me. Happy days.

KISS

Whenever I’m breaking something fairly critical too often for comfort, it’s time to Keep It Simple Stupid and the way I’ve tried to do that with cron is to never ever put anything complicated on a cron line.

Let’s take a simple example:

* * * * * echo % some % percents % for % you %

Intuitively I’d expect that to do what it does in a shell (echo the string back, which from cron would normally reach someone in email form). Instead, the first % starts STDIN for the command, the remaining %s get changed to newlines, and you end up with an echo statement that just echoes a single newline back to cron, since echo isn’t interested in the STDIN fed to it.
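
In other words, cron ends up running something morally equivalent to this sketch (not literally what cron executes, but close enough to show the shape):

printf ' some \n percents \n for \n you \n' | /bin/sh -c 'echo '

The command is everything before the first %, the rest arrives on STDIN, and echo never reads STDIN, so all you get back is one empty line.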

This creates a testing problem: to test the behaviour of the line I have to wait for cron to run it (there’s no way to immediately confirm the validity of the line).

If we instead place the behaviour we want in a script:

#!/bin/bash -e
echo % some % percents % for % you %

and call that from cron:

* * * * * /path/to/script

Now you can be reasonably confident that it’ll do exactly the same thing when cron runs it as when you test it in a terminal.

But % is ok when it’s simple

Some people argued that a % is really OK when the line is actually really simple, e.g.:

* * * * * date +\%Y\%m\%d_\%H\%M\%S > /tmp/test

This happens to work the same if you copy it to a terminal, because the %s are escaped in the cron line and the escaping happens to drop off in the shell as well. But what if you want a quoted %? You’re stuffed.
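
As a hypothetical illustration, going by the behaviour described above:

# in a crontab the \% survives through to the shell...
* * * * * echo "disk at 100\%"
# ...but inside double quotes the shell keeps the backslash too,
# so the job mails you 'disk at 100\%' rather than 'disk at 100%'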

Back to KISS again.

Other reasons to keep cron simple

If you’re editing cron via crontab -e it’s far too easy to wipe out your crontab file.

While this is mostly an argument for backups, if you keep your cron files simple it may not matter as much when they get nuked accidentally as now you’ve only lost scheduling information and not critical syntax :)
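
(For what it’s worth, the cheap version of that backup:)

crontab -l > ~/crontab.bak   # snapshot the current crontab
crontab -e                   # edit as usual
# worst case: restore with `crontab ~/crontab.bak`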

Summary

If I’m not 100% certain I can copy a line out of cron and run it on the terminal I think it doesn’t belong in cron.

Better XML Support in PHP

XML support in PHP is actually pretty good these days, but as with anything in PHP (why is that?) it has a few little quirks and corner cases that provide for continual facepalm moments.

Rather than just sit around and complain, or try to get stuff into the core (where there’s no way I’d be able to use it in real-world projects until RHEL catches up, i.e. 3-4 years from now), I thought I’d see what I could do purely in PHP.

Turns out it’s quite a lot, so it’s up on Github: https://github.com/neerolyte/php-lyte-xml#readme

So if you have to deal with XML in PHP fairly often, consider taking it for a spin.

Git Stash That Won’t Make You Hate Yourself in the Morning

Git has a feature called stash that lets you drop whatever you’re working on and put the working directory back to a clean state, without having to commit or lose whatever deltas you had.
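
For anyone who hasn’t met it, the basic workflow looks like:

git stash        # shelve uncommitted changes, leaving a clean working directory
git stash list   # see what’s shelved
git stash pop    # re-apply the most recent stash and drop it from the list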

This is a great idea, but it’s sorely missing one core feature for anyone who works on more than one machine – the ability to synchronise stashes between machines. If you’re like me (I work on the same code on up to about 4 individual machines in a week) you probably want some way to move stashes around.

So I’ve started git-rstash. As usual it’s written in terrible bash, in the hope that someone will take enough offence at it to take the whole problem off my hands; in the meantime maybe you’ll find it useful too.

For the moment synchronising them is purely up to the user, but they are conveniently placed where the user can drop them in whatever cloud-syncy-like thing they’re already using (Unison, Ubuntu One, Dropbox, etc).
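
If you just want the rough idea without reading the bash, syncing a stash by hand amounts to something like this (git-rstash’s internals may differ, and the Dropbox path and patch name are purely illustrative):

# on machine A: export the newest stash as a patch into a synced directory
git stash show -p 'stash@{0}' > ~/Dropbox/stashes/my-stash.patch
# on machine B: apply it to the working directory
git apply ~/Dropbox/stashes/my-stash.patch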

Too Many Ways to Base64 in Bash

I find myself writing little functions to paste into terminals to provide stream handlers quite often, like:

base64_decode() {
  php -r 'echo base64_decode(stream_get_contents(STDIN));'
}
base64_encode() {
  php -r 'echo base64_encode(stream_get_contents(STDIN));'
}

Which can be used to encode or decode base64 strings in a stream, e.g.:

$ echo foo | base64_encode
Zm9vCg==
$ echo Zm9vCg== | base64_decode
foo

which is fun, but I wanted to make it a little more portable, so let’s try a few more languages…

# ruby
base64_encode() { ruby -e 'require "base64"; puts Base64.encode64(ARGF.read)'; }
base64_decode() { ruby -e 'require "base64"; puts Base64.decode64(ARGF.read)'; }
# python
base64_encode() { python -c 'import base64, sys; sys.stdout.write(base64.b64encode(sys.stdin.read()))'; }
base64_decode() { python -c 'import base64, sys; sys.stdout.write(base64.b64decode(sys.stdin.read()))'; }
# perl
base64_encode() { perl -e 'use MIME::Base64; print encode_base64(join("", <STDIN>));'; }
base64_decode() { perl -e 'use MIME::Base64; print decode_base64(join("", <STDIN>));'; }
# openssl
base64_encode() { openssl enc -base64; }
base64_decode() { openssl enc -d -base64; }

and now to wrap them all under something that picks whichever seems to be available:

base64_php() { php -r "echo base64_${1}(stream_get_contents(STDIN));"; }
base64_ruby() { ruby -e "require 'base64'; puts Base64.${1}64(ARGF.read)"; }
base64_perl() { perl -e "use MIME::Base64; print ${1}_base64(join('', <STDIN>));"; }
base64_python() { python -c "import base64, sys; sys.stdout.write(base64.b64${1}(sys.stdin.read()))"; }
base64_openssl() { openssl enc $([[ $1 == encode ]] || echo -d) -base64; }
base64_choose() {
  # double quotes throughout so $lang and $1 actually expand
  for lang in openssl perl python ruby php; do
    if [[ $(type -t "$lang") == 'file' ]]; then
      "base64_$lang" "$1"
      return
    fi
  done
  echo 'ERROR: No suitable language found' >&2
  return 1
}
base64_encode() { base64_choose encode; }
base64_decode() { base64_choose decode; }

Great, now I can quickly grab some base64 commands on any box I’m likely to be working on in the foreseeable future.

Applying Settings to All Vagrant VMs

After upgrading from Ubuntu 12.04 (“Precise Pangolin”) to 12.10 (“Quantal Quetzal”) I lost the DNS resolver that VirtualBox normally provides on 10.0.2.3, meaning I couldn’t boot my Vagrant VMs.

I found an answer that worked around it for individual VMs, but I wanted something that worked for all VMs on the laptop (at least until I can fix the actually broken DNS resolver) – and I’ve found one.

As per http://docs-v1.vagrantup.com/v1/docs/vagrantfile.html, it’s possible to specify additional parameters to Vagrant in ~/.vagrant.d/Vagrantfile – but it’s not exactly clear how; turns out you can just place them in a normal config block like so:

Vagrant::Config.run do |config|
    config.vm.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
end

Before applying the fix I was getting hangs at “Waiting for VM to boot” like so:

$ vagrant up
[default] VM already created. Booting if it's not already running...
[default] Clearing any previously set forwarded ports...
[default] Forwarding ports...
[default] -- 22 => 2222 (adapter 1)
[default] Creating shared folders metadata...
[default] Clearing any previously set network interfaces...
[default] Preparing network interfaces based on configuration...
[default] Booting VM...
[default] Waiting for VM to boot. This can take a few minutes.

after applying the fix, it continues on:

...
[default] Waiting for VM to boot. This can take a few minutes.
[default] VM booted and ready for use!
[default] Configuring and enabling network interfaces...
[default] Setting host name...
[default] Mounting shared folders...
[default] -- v-root: /vagrant

Edit: I just thought I’d add that the config has changed slightly with Vagrant 1.1+, so you now need a “v2 config block” (see: http://docs.vagrantup.com/v2/virtualbox/configuration.html):

Vagrant.configure("2") do |config|
  config.vm.provider "virtualbox" do |v|
    v.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
  end
end

Deleting an Apt-snapshot Btrfs Subvolume Is Hard

This took me a while to figure out, so I thought I’d write it up.

If a do-release-upgrade of Ubuntu fails part way through for whatever reason (say, because you’ve used 75% of metadata and btrfs doesn’t seem to fail gracefully) and you’ve got a working apt-snapshot, then after you get everything back in order it will have left behind a snapshot that you no longer want:

root@foo:/# btrfs subvol list /
ID 256 top level 5 path @
ID 257 top level 5 path @home
ID 259 top level 5 path @apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28

So you think to yourself, “ok I can just delete it now”:

root@foo:/# btrfs subvol delete @apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
ERROR: error accessing '@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28'
root@foo:/# btrfs subvol delete /@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
ERROR: error accessing '/@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28'

… hmm the obvious things don’t work.

Turns out that when a subvolume (in this case “@”) is mounted, the snapshot subvolumes aren’t mounted anywhere, and you actually have to give the delete command a path where the snapshot is visible as a subvolume (not a direct mount of it).

Because that sentence made no sense, here’s an example. This doesn’t work:

root@foo:~# mkdir /@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
root@foo:~# mount -t btrfs -o subvol=@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28 /dev/mapper/foo-root /@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
root@foo:~# btrfs subvol delete @apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
ERROR: error accessing '@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28'

because I’ve mounted the subvolume I want to delete, and I’m handing the root of a filesystem to the subvolume delete command.

Instead, here’s what does work (even with the FS already mounted on /). Create somewhere to mount it:

root@foo:/# mkdir /mnt/tmp

Mount it:

root@foo:/# mount /dev/mapper/foo-root /mnt/tmp

Show that the subvolumes are all available as directories under the mount point:

root@foo:/# ls -l /mnt/tmp/
total 0
drwxr-xr-x 1 root root 292 Mar  8 10:26 @
drwxr-xr-x 1 root root 240 Feb 21 08:31 @apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28
drwxr-xr-x 1 root root  14 Mar  8 10:24 @home

Delete it:

root@foo:/# btrfs subvol delete /mnt/tmp/@apt-snapshot-release-upgrade-quantal-2013-03-07_21\:34\:28
Delete subvolume '/mnt/tmp/@apt-snapshot-release-upgrade-quantal-2013-03-07_21:34:28'

Hooray! It didn’t error this time, and it actually worked:

root@foo:/# btrfs subvol list /
ID 256 top level 5 path @
ID 257 top level 5 path @home

Clean up:

root@foo:/# umount /mnt/tmp

New and Simplified License From Samsung Kies

I’m sure something’s wrong here, but can’t quite put my finger on what…

Blank license agreement from Kies

… must be something to do with having to fire up a Windows VM to talk to a Linux based phone.

VM Build Automation With Vagrant and VeeWee

So I’ve finally done my second talk at a LUG (the Melbourne Linux Users Group, to be precise). I mainly put my hand up to do a talk because we were short this month and I wanted some practice. I decided to go the route of trying to generate a lot of discussion and demoing some of the stuff, rather than preparing and running through a linear set of slides. All in all, it seemed to go fairly well.

I decided to talk about Vagrant and VeeWee. I don’t claim to be any kind of expert, but I have been tinkering and thought maybe I could stir a little more local interest in these types of automation tools.

Audio

Talk audio is available (50MB OGG).

For the most part the videos below were what was demoed during the talk, but there are some questions discussed during the talk that may be helpful (and some that simply will not make sense if you weren’t there :p).

Videos

First up I go through a standard vagrant init process. Vagrant is already installed, but otherwise it just demos the minimum number of steps to get a VM up and running, then running something dubious from the internets and returning the system to a clean state.

apt-cacher-ng demo – shows what should now be the standard steps for trialling something with Vagrant (git clone, vagrant up). In this case apt-cacher-ng, which can be used to speed up system building by caching repositories of various types when you are rebuilding similar machines quite often (https://github.com/neerolyte/mlug-vm-demos):

Spree Install – a nice complex Rails app that grabs things from all sorts of places on the internet, sped up slightly by using the apt-cacher-ng proxy above, but of course with Vagrant we still just “vagrant up” (https://github.com/neerolyte/mlug-vm-demos):

VeeWee build of a Vagrant basebox with a wrapper script (https://github.com/neerolyte/lyte-vagrant-boxes):

Resources

Dropping Repoproxy Development for Apt-cacher-ng

Initially I started writing repoproxy because I thought there were a bunch of features I needed that simply couldn’t be configured with apt-cacher-ng (ACNG); it turns out I was wrong.

The reasons I had (broadly) for starting repoproxy:

- shared caches - multiple caches on a LAN should be able to share somehow

- CentOS - CentOS uses yum, how could it possibly be supported by ACNG?

- roaming clients - e.g. laptops that can't always sit behind the same static instance of ACNG

- NodeJS - I wanted to learn NodeJS, I've learnt a bit now and don't feel a huge desire to complete this project given I've figured out most of the other features neatly

Below I cover how to achieve all of the above features with ACNG, as it really wasn’t obvious to me, so I suspect it might be useful to others too.

Shared Caches

I run a lot of VMs on my laptop, but also have other clients on the home and work networks that could benefit from a shared repository cache. For me a good balance is having static ACNG servers at both home and work, but also having ACNG deployed to my laptop so that the VMs can be updated, reimaged or given new packages without chewing through my mobile bandwidth.

This is actually natively supported with ACNG, it’s just a matter of putting a proxy line in /etc/apt-cacher-ng/acng.conf like so:

Proxy: http://mirror.lyte:3142/

Then it’s just a matter of telling apt to use a localhost proxy anywhere that ACNG is installed:

echo 'Acquire::http { Proxy "http://localhost:3142/"; };' > /etc/apt/apt.conf.d/01proxy

This gives VMs on my laptop a portable repository cache when I’m not on a normal network, but also allows me to benefit from the cache others generate, and vice versa.

I’d like to at some point have trusted roaming clients (i.e. only my laptop) publish captured ACNG cache back to static ACNG cache servers. I’m pretty sure I can achieve this using some if-up.d trickery combined with rsync, but I haven’t tried yet.
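
The core of it would presumably be little more than this (completely untested – /var/cache/apt-cacher-ng is the Debian/Ubuntu default cache directory, and mirror.lyte is my static server from above):

rsync -a /var/cache/apt-cacher-ng/ mirror.lyte:/var/cache/apt-cacher-ng/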

I had considered trying to add something more generic to repoproxy that did a cache discovery and then possibly ICP (Internet Cache Protocol) to share cache objects between nodes on the same LAN, but there are some generalised security issues I can’t come up with a good solution for, e.g. If my laptop is connected to a VPN and another node discovers it, how do I sensibly ensure they can’t get anything via the VPN without making the configuration overly obtuse?

It seems like trying to implement self discovery would either involve a lot of excess configuration on participating nodes, or leave gaping security holes so for the moment I’m keeping it simple.

CentOS

I use CentOS and Scientific Linux sometimes and I’d like their repos to be cached too.

I had originally falsely assumed this would simply be impossible, but I read somewhere that there was at least a little support.

In my testing some things didn’t work out of the box, but could be worked around.

Essentially it seems like ACNG treats most files as one of volatile, persistent or force-cached, and it’s just a matter of tweaking the URL-based regexes so it understands any repositories you want it to work with.
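
A quick way to poke at it is to request a repository file through the proxy and see whether it lands in the cache (the mirror URL here is just an example):

curl -s -x http://localhost:3142 -o /dev/null \
    http://mirror.centos.org/centos/6/os/x86_64/repodata/repomd.xml
ls -R /var/cache/apt-cacher-ng/ | grep repomd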

Roaming Clients

When I move my laptop between home and work or on to a public wifi point, I want apt to “just work” – I don’t want to have to remember to alter some config each time.

I found two methods described on help.ubuntu.com that were sort of on the right track, but running a cron job every minute that generates network traffic even when my network is mostly stable seems like a bad idea, as does having to reboot to pick up the new settings (especially as Network Manager won’t bring wireless up until after I log in, so it wouldn’t even work for my use case).

NB: I’ve already gone back and added my event driven method to the help.ubuntu.com article.

I suspect it would be possible to utilise ACNG’s hook functionality:

8.9 How to execute commands before and after going online? It is possible to configure custom commands which are executed before the internet connection attempt and after a certain period after closing the connection. The commands are bound to a remapping configuration and the config file is named after the name of that remapping config, like debrep.hooks for Remap-debrep. See section 4.3.2, conf/*.hooks and /usr/share/doc/apt-cacher-ng/examples/*.hooks files for details.

I couldn’t immediately bend this to my will, so I decided to go down a route I already understood.

I decided to use if-up.d to reset ACNG’s config every time there was a new interface brought online, this allows for an event-driven update of the upstream proxy rather than relying on polling intermittently or a reboot of the laptop.

Create a new file /etc/network/if-up.d/apt-cacher-ng-reset-proxy and put the following script in it:

#!/bin/bash

# list of hosts that the proxy might be running on
hosts=(
        acng.on.my.home.network
        acng.on.my.work.network
        acng.at.my.friends.place
)

set_host() {
        host=$1
        # double quotes so $host actually expands
        line="Proxy: http://$host:3142/"
        if [[ -z $host ]]; then
                line='# Proxy: disabled because none are contactable'
        fi

        # adjust ACNG configuration to use supplied proxy
        sed -i -r "s%^\s*(#|)\s*Proxy: .*$%$line%g" \
                /etc/apt-cacher-ng/acng.conf

        # if apt-cacher-ng is running
        if service apt-cacher-ng status > /dev/null 2>&1; then
                # restart it to take hold of new config
                service apt-cacher-ng restart
        fi
        exit 0
}

try_host() {
        host=$1
        # if we can get to the supplied host
        if ping -c 1 "$host" > /dev/null 2>&1; then
                # tell ACNG to use it
                set_host "$host"
        fi
}

# Run through all possible ACNG hosts trying them one at a time
for host in "${hosts[@]}"; do
        try_host "$host"
done

# no proxies found, unset upstream proxy (i.e. we connect straight to the internet)
set_host

Make sure to adjust the script for your environment and make it executable with:

chmod +x /etc/network/if-up.d/apt-cacher-ng-reset-proxy