Concatenating Data

Written: 2017-07-21
Author: WhatsARanjit
Link: WhatsARanjit/data_fragments

The problem

Let’s say we have several network devices that use JSON for their configuration. It looks something like this:

{
  "switch1": "192.168.1.2"
}

We also have a single upstream switch that needs to dynamically discover downstream devices. The configuration on the upstream switch looks like this:

{
  "switch1": "192.168.1.2",
  "switch2": "192.168.1.3"
}

An easy way to handle that discovery in Puppet and build the file might be exported resources plus the puppetlabs/concat module. When a new device comes online and gets its first catalog, it will send a fragment to the Puppet master. When the upstream switch next checks in, it will collect all of those fragments and piece them together into a file.
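
Here is a minimal sketch of that export-and-collect pattern using puppetlabs/concat (the target path, tag, and fact lookups are assumptions for illustration, not from the original setup):

# On each downstream device: export a fragment describing this switch
@@concat::fragment { "switch-${facts['networking']['hostname']}":
  target  => '/etc/upstream/devices.json',
  content => "{\n  \"${facts['networking']['hostname']}\": \"${facts['networking']['ip']}\"\n}\n",
  order   => '10',
  tag     => 'downstream_switches',
}

# On the upstream switch: declare the file and collect every exported fragment
concat { '/etc/upstream/devices.json':
  ensure => present,
}
Concat::Fragment <<| tag == 'downstream_switches' |>>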

But if we export these fragments:

# From switch1
{
  "switch1": "192.168.1.2"
}

# From switch2
{
  "switch2": "192.168.1.3"
}

…we get:

{
  "switch1": "192.168.1.2"
}
{
  "switch2": "192.168.1.3"
}

This isn’t valid JSON, and your switch will surely break. You can try starting with an initial fragment of { and an ending fragment of }. But because JSON hates trailing commas, you’ll need a clever way of comma-separating each key without leaving a trailing comma. This means you can’t just do this:

concat_fragment { 'header':
  content => '{',
  order   => 1,
}
concat_fragment { 'footer':
  content => '}',
  order   => 100,
}
# From switch1
@@concat_fragment { 'nope1':
  content => '"switch1": "192.168.1.2",',
  order   => 2,
}
# From switch2
@@concat_fragment { 'nope2':
  content => '"switch2": "192.168.1.3",',
  order   => 2,
}
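
Even with the ordering sorted out, the collected file would come out roughly like this, with a trailing comma before the closing brace, which is still invalid JSON:

{"switch1": "192.168.1.2","switch2": "192.168.1.3",}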

The fix

puppetlabs/concat is great for concatenating plain text, but it isn’t able to work with data structures. Looking into the concat module’s code, though, it’s very close to being able to do so. I have an alternate module at WhatsARanjit/data_fragments while a pull request sits in wait. It adds a format attribute that lets you specify that you’re working with json or yaml; I added json-pretty in there as well. So let’s start with two dummy data files:

first.json

{
  "one": {
    "oneA": "A",
    "oneB": {
      "oneB1": "1",
      "oneB2": "2"
    }
  },
  "two": [
    "twoA",
    "twoB"
  ]
}

second.json

{
  "one": {
    "oneA": "B",
    "oneB": {
      "oneB1": "2",
      "oneB3": "3"
    }
  },
  "two": [
    "twoA",
    "twoC"
  ]
}

We can write Puppet code like this:

$output = '/tmp/json.json'
$dp     = '/path/to/data'

data_file { $output:
  ensure => present,
  tag    => 'stuff',
  format => 'json',
  force  => true,
}
data_fragment { 'first':
  content => file("${dp}/examples/data/first.json"),
  order   => '01',
  target  => $output,
  tag     => 'stuff',
}
data_fragment { 'second':
  content => file("${dp}/examples/data/second.json"),
  order   => '10',
  target  => $output,
  tag     => 'stuff',
}

I’ve also added a force attribute. If you have overlapping keys, you’ll get a parse failure letting you know there is conflicting data. Setting force tells Puppet to merge the data instead, keeping the value from the fragment with the highest-ranking order (in the example above, order '01' beats '10', so first.json wins the conflicts). So order is no longer just how to piece the data together, but also which dataset takes priority over another. Running the above code produces output like this:

{
  "one": {
    "oneA": "A",
    "oneB": {
      "oneB1": "1",
      "oneB2": "2",
      "oneB3": "3"
    }
  },
  "two": [
    "twoA",
    "twoB",
    "twoC"
  ]
}
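
Tying this back to the original switch scenario, a sketch of the exported-resource version with data_file and data_fragment might look like this (the target path, tag, resource titles, and fact lookups are assumptions):

# On each downstream switch: export its own single-key JSON document
@@data_fragment { "switch-${facts['networking']['hostname']}":
  content => "{ \"${facts['networking']['hostname']}\": \"${facts['networking']['ip']}\" }",
  order   => '10',
  target  => '/etc/upstream/devices.json',
  tag     => 'switches',
}

# On the upstream switch: declare the target file and collect the fragments
data_file { '/etc/upstream/devices.json':
  ensure => present,
  format => 'json-pretty',
  tag    => 'switches',
}
Data_fragment <<| tag == 'switches' |>>

Since each device contributes a unique key, there are no conflicts to merge, so force isn’t needed here.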

For format, you can also choose plain, which is the default; it will function just like concat would. The module lives at WhatsARanjit/data_fragments.