YAML and Devicetree

Introduction

This document attempts to explain the rationale behind using a YAML based data model instead of the standard devicetree source (DTS). It assumes a working knowledge of devicetree, so readers are expected to have perused the devicetree specification located here.

Devicetree and its underlying concepts

While device tree deals with describing hardware devices, at its core it is a method of declaring a hierarchical structure as defined in the Devicetree Specification:

“A devicetree is a tree data structure with nodes that describe the devices in a system. Each node has property/value pairs that describe the characteristics of the device being represented. Each node has exactly one parent except for the root node, which has no parent.”

This structure is familiar to anyone with a passing knowledge of programming languages with rich data structures: nodes can be hashes keyed by their name, properties can be either scalars or sequences of scalars, and labels of nodes and phandles can be references/pointers.

Unfortunately, device tree is not orthogonal enough for this mapping to work. Namely, properties are irregular in the following ways:

  1. Boolean values cannot be part of a sequence, since a named property is defined as false if it doesn’t exist in a node and as true otherwise.

  2. Phandles are encoded as integer (cell) scalar values and are allowed in any property that contains cell values.

  3. While properties are defined either as a single value or a sequence of values, their type information is thrown away. The importance of this is that the property accessors must have an out-of-band way to be informed of the type(s) used in the property, i.e. property type information is not discoverable.

Lifecycle of a Devicetree.

The purpose of the device tree is (or at least was until recently) to be provided to an operating system at boot time. This was done by the following steps.

  1. Device Tree source files (DTS) are processed by a compiler to generate an in-memory tree structure. This structure is dynamically created at compile time by editing operations of the the compile tree sources which are:
    • Device tree sources are usually now pre-processed using the C proprocessor, but the built-in source include directive is still supported. Note that mixing them is permitted although it can lead to the unexpected behaviour of the base source file being preprocessed while the included one is not.
    • Declaration of a device node results in the creation of new in-memory device node if it doesn’t exist, or reusing it if it does.
    • Declaration of a property results in the creation of a new in-memory property containing the new property values if it doesn’t exist or replacing it if it does.
    • Node and property removal directives remove nodes and properties of the runtime tree structure as appropriate.
    • Node labels and references to them in properties are tracked. Note that references are the only scalar values that are tracked in the in-memory property data structure.
    • phandle references editing operations of the form ‘&label’ & ‘&{/path}’ are processed. These reference nodes with labels declared earlier in the main tree source. This form is typically used when compiling a device tree comprised of a main source file and a number of included files because it lends well to the a pattern of incremental change.
    • The special /memreserve/ directive is parsed and processed.
  2. The in-memory tree structure is ‘flattened’, i.e. it is serialized to create a device tree blob (DTB). It is in this stage that the symbolic references to node labels are resolved to integer/cell phandle values, with references to them being replaced by a cell value of the node’s phandle. Special ‘automatic’ properties (named phandle) containing the assigned phandle values are created for nodes that are referenced.

  3. This device tree blob file is placed in the applicable device and the bootloader is informed about how to retrieve it. This may be done by placing it in non-volatile storage at a specific byte offset, or being put in a boot-loader accesible filesystem with a specific name, etc.

  4. The bootloader starts, retrieves the device tree blob, and either passes it unchanged to the operating system or performs minor modification (i.e. altering the boot command line in the chosen node or enabling/disabling devices by modifying the status properties of some nodes). The bootloader typically does not create an in-memory tree structure at this step, it operates on the DTB blob level.

  5. The operating system starts and ‘unflattens’ the device tree blob which the bootloader has passed to it using the agreed upon architecture specific interface. The in-memory tree structure created is the same as the one created at the end of the compilation step, but with any changes that the boot-loader performed. The kernel at this point starts using the in-memory data structure, and it is referred to as the live-tree going forward. Note that while node phandles are discovered and tracked by the ‘phandle’ properties, their references cannot be deduced at this time.

  6. The operating system (including any device drivers) scans the live-tree and performs initialization and configuration of the hardware described there. Note that the operating system must have complete knowledge of the nodes and properties of an active node. This is evident by the use of access methods that include type information (e.g. of_property_read_u32() ), node references needing to be explicitly discovered by converting cell phandle value to a reference to a node, etc. Unfortunately, this is very error-prone since the type information has been discarded. There is no way to disallow access to a property using a different method than what was declared in the original source file.

The steps above are applicable to the simple case of a single platform, and up to a few years ago used to be the norm. In contemporary systems the situation is more complex for the following reasons:

  1. A common requirement is for a single image to be used for a number of different (but sufficiently similar) platforms. The number of stored DTBs would match the number of supported platforms, even if their changes are minimal.

  2. Hardware is no longer static. The proliferation of FPGAs and add-on expansion boards requires runtime device tree modification using device tree overlays. Those overlays are extremely similar to the way in-memory tree modification is performed at compile time but it is different in subtle ways.

  3. The device tree lifecycle expects perfect coordination across all the steps without the possibility of errors. This is troublesome in practice since every step in the sequence is part of a different project (compiler, bootloader, operating system). Errors can easily creep in and are usually not detected until the last step of the sequence, the operating system boot process. In case of an error, the result is usually a hung system without any indication what might have gone wrong.

YAML as a source format alternative

YAML is a human-readable data serialization language which is expressive enough to cover all DTS features. Simple YAML files are just key-value pairs that are very easy to parse, even without using a formal YAML parser. YAML streams are containing documents separated with a — marker. This model is a good fit for device tree since one may simply append a few lines of text to a given YAML stream to modify it.

YAML parsers are very mature, as YAML was first released in 2001. It is currently in wide-spread use and schema validation tools are available and common. Additionally, YAML support is available for many major programming languages.

Mapping of DTS constructs to YAML

The mapping of DTS constructs to YAML is relatively straightforward since they are both key-value declaration languages.

  • Comments in YAML are done using the # character instead of the C-like comments that DTS uses.
/* dts comment */
# YAML comment
  • DTS is a free form language using braces for denoting nest level while YAML is indentation sensitive in standard YAML encoding. Fortunately YAML is a superset of JSON which can be used as a valid free form.
node {
   property = "foo";
}
node:
  property: "foo"
{ "node": { "property": "foo" } }
  • There is no explict root in YAML encoding. Top level nodes & properties are taken to be located in the root.
/ {
    property;
    subnode {
       another-property;
    };
};
property: true
  subnode:
    another-property: true
  • Sequences in YAML may be denoted either by a single line starting with a hyphen ‘-‘, or bracketed JSON form. The following are equivalent.
property = "a", "b", "c";
property:
  - "a"
  - "b"
  - "c"
property: [ "a", "b", "c" ]
  • Values that may be evaluated as numeric scalars are used as cells.
property = <10>;
property: 10

Note that this includes integer expressions as well

property = <(5 + 5)>;
property: 5+5
  • String property values are enclosed in double quotes, although this is optional if the value cannot be expressed as a numeric scalar.
property = "string";
property: "string"
  • Boolean values are encoded as true and false. This is not implicit like in DTS.
property;
property: true

Note that it is possible to declare a property as false but you will get a warning about it being removed when generating the DTB.

property: false
  • It is possible to explictly declare the type of a scalar using the standard ‘!’ method of YAML. For instance this is how byte properties are supported.
property = [0124AB];
property: !int8 [ 0x01, 0x24, 0xab ]
  • Similarly the /bits/ directive is supported by explicit tagging.
property = /bits/ 64 <100>;
property: !int64 100
  • Labels are named anchors and are referenced by a ‘*’. Note that references are typed as such in YAML, they are transformed to phandle cells only on DTB generation.
label: node {
   property;
};

ref = <&label>;

&label {
    foo;
};
node: &label
  property: true

ref: *label

*label:
  foo: true
  • The delete node and properties directives are replaced with assignment to null/~. It works the same for both properties and nodes.

/ { node { property; }; }; / { /delete-node/ node; };
node:
  property: true

node: ~
  • There is no source /include/ directive in YAML. It is expected that thet C preprocessor will be used as is the norm with DTS.

  • Similarly there is no /include-bin/ directive, YAML can relatively easily include binary data as base64 string properties.

  • To easily support pre-processor macros from a DTS environment, scalars that are detected to be space separated integer expressions are transparently converted to scalar integer sequences.

#define MACRO(x, y) x y (x + y)
property = ;
#define MACRO(x, y) x y (x + y)
property: MACRO(10, 5)

Will result in

property: [ 10, 5, 15 ]

The YAML advantage

Radical changes are seldom worth it without bringing in significant benefits. Switching to YAML instead of DTS is indeed a radical change, but it does carry benefits, namely:

  1. YAML is a well known and mature technology which is supported by many programming languages and environments.

  2. YAML’s original purpose was data serialization. Therefore it is orthogonal and supports high-level language data structures well.

  3. It is suited for the description of graph structures, since it supports references and anchors.

  4. With its mature parsers and tools, it easily supports the human edit and compile cycle that is now common with device tree development. Since all property values are potentially typed, it is possible to track type information in order to perform thorough validation and checking against device tree bindings (once the bindings are converted to a machine readable format, preferably YAML). As well, this allows the reporting of accurate error messages and warnings at any stage of the compilation process.

  5. It is possible to generate YAML as an intermediate format with references not resolved, in a similar way that object files are used. Those intermediate files can them be compiled/linked again to generate the final DTB/YAML file. For example, instead of compiling into a single output file, one could generate intermediate YAML files, similar in every way to device tree overlays, and then perform the final ‘linking’ step at either compile time or the bootloader.

  6. It is relatively easy to parse, and a resource limited parser that can be included in bootloaders or the kernel is possible.

  7. Data in YAML can easily be converted to and from other formats making it convertable to formats which future tools may understand.

The yamldt compiler

yamldt is a YAML/DTS to DT blob generator/compiler and validator. The YAML schema is functionally equivalent to DTS and supports all DTS features, while as a DTS compiler it is bit-exact compatible with DTC. yamldt parses a device tree description (source) file in YAML/DTS format and outputs a device tree blob (which can be bit-exact to the one generated from the reference dtc compiler if the -C option is used).

Validation is performed against a YAML schema that defines properties and constraints. A checker uses this schema to generate small code fragments that are compiled to eBPF and executed for the specific validation of each DT node the rule selects in the output tree.

Validation

As mentioned above, yamldt is capable of performing validation of DT constructs using a C-based eBPF checker. eBPF code fragments are assembled that can perform type checking of properties and enforce arbitrary value constraints while fully supporting inheritance.

As an example, here’s how the validation of a given fragment works using on a jedec,spi-nor node:

m25p80@0:
  compatible: "s25fl256s1"
  spi-max-frequency: 76800000
  reg: 0
  spi-tx-bus-width: 1
  spi-rx-bus-width: 4
  "#address-cells": 1
  "#size-cells": 1

The binding for this is:

%YAML 1.1
---
jedec,spi-nor:
  version: 1

  title: >
    SPI NOR flash: ST M25Pxx (and similar) serial flash chips

  maintainer:
    name: Unknown

  inherits: *spi-slave

  properties:
    reg:
      category: required
      type: int
      description: chip select address of device

    compatible: &jedec-spi-nor-compatible
      category: required
      type: strseq
      description: >
        May include a device-specific string consisting of the
        manufacturer and name of the chip. A list of supported chip
        names follows.
        Must also include "jedec,spi-nor" for any SPI NOR flash that can
        be identified by the JEDEC READ ID opcode (0x9F).
      constraint: |
        anystreq(v, "at25df321a") ||
        anystreq(v,  "at25df641") ||
        anystreq(v, "at26df081a") ||
        anystreq(v,   "mr25h256") ||
        anystreq(v,    "mr25h10") ||
        anystreq(v,    "mr25h40") ||
        anystreq(v, "mx25l4005a") ||
        anystreq(v, "mx25l1606e") ||
        anystreq(v, "mx25l6405d") ||
        anystreq(v,"mx25l12805d") ||
        anystreq(v,"mx25l25635e") ||
        anystreq(v,    "n25q064") ||
        anystreq(v, "n25q128a11") ||
        anystreq(v, "n25q128a13") ||
        anystreq(v,   "n25q512a") ||
        anystreq(v, "s25fl256s1") ||
        anystreq(v,  "s25fl512s") ||
        anystreq(v, "s25sl12801") ||
        anystreq(v,  "s25fl008k") ||
        anystreq(v,  "s25fl064k") ||
        anystreq(v,"sst25vf040b") ||
        anystreq(v,     "m25p40") ||
        anystreq(v,     "m25p80") ||
        anystreq(v,     "m25p16") ||
        anystreq(v,     "m25p32") ||
        anystreq(v,     "m25p64") ||
        anystreq(v,    "m25p128") ||
        anystreq(v,     "w25x80") ||
        anystreq(v,     "w25x32") ||
        anystreq(v,     "w25q32") ||
        anystreq(v,     "w25q64") ||
        anystreq(v,   "w25q32dw") ||
        anystreq(v,   "w25q80bl") ||
        anystreq(v,    "w25q128") ||
        anystreq(v,    "w25q256")

    spi-max-frequency:
      category: required
      type: int
      description: Maximum frequency of the SPI bus the chip can operate at
      constraint: |
        v > 0 && v < 100000000

    m25p,fast-read:
      category: optional
      type: bool
      description: >
        Use the "fast read" opcode to read data from the chip instead
        of the usual "read" opcode. This opcode is not supported by
        all chips and support for it can not be detected at runtime.
        Refer to your chips' datasheet to check if this is supported
        by your chip.

  example:
    dts: |
      flash: m25p80@0 {
          #address-cells = <1>;
          #size-cells = <1>;
          compatible = "spansion,m25p80", "jedec,spi-nor";
          reg = <0>;
          spi-max-frequency = <40000000>;
          m25p,fast-read;
      };
    yaml: |
      m25p80@0: &flash
        "#address-cells": 1
        "#size-cells": 1
        compatible: [ "spansion,m25p80", "jedec,spi-nor" ]
        reg: 0;
        spi-max-frequency: 40000000
        m25p,fast-read: true

Note the constraint rule matches on any compatible string in the given list. This binding inherits from spi-slave as indicated by the line: inherits: *spi-slave

*spi-slave is standard YAML reference notation which points to the spi-slave binding, pasted here for convenience:

%YAML 1.1
---
spi-slave: &spi-slave
  version: 1

  title: SPI Slave Devices

  maintainer:
    name: Mark Brown <broonie@kernel.org>

  inherits: *device-compatible

  class: spi-slave
  virtual: true

  description: >
    SPI (Serial Peripheral Interface) slave bus devices are children of
    a SPI master bus device.

  # constraint: |+
  #  class_of(parent(n), "spi")

  properties:
    reg:
      category: required
      type: int
      description: chip select address of device

    compatible:
      category: required
      type: strseq
      description: compatible strings

    spi-max-frequency:
      category: required
      type: int
      description: Maximum SPI clocking speed of device in Hz

    spi-cpol:
      category: optional
      type: bool
      description: >
        Boolean property indicating device requires
        inverse clock polarity (CPOL) mode

    spi-cpha:
      category: optional
      type: bool
      description: >
        Boolean property indicating device requires
        shifted clock phase (CPHA) mode

    spi-cs-high:
      category: optional
      type: bool
      description: >
        Boolean property indicating device requires
        chip select active high

    spi-3wire:
      category: optional
      type: bool
      description: >
        Boolean property indicating device requires
        3-wire mode.

    spi-lsb-first:
      category: optional
      type: bool
      description: >
        Boolean property indicating device requires
        LSB first mode.

    spi-tx-bus-width:
      category: optional
      type: int
      constraint: v == 1 || v == 2 || v == 4
      description: >
        The bus width(number of data wires) that
        used for MOSI. Defaults to 1 if not present.

    spi-rx-bus-width:
      category: optional
      type: int
      constraint: v == 1 || v == 2 || v == 4
      description: >
        The bus width(number of data wires) that
        used for MISO. Defaults to 1 if not present.

  notes: >
    Some SPI controllers and devices support Dual and Quad SPI transfer mode.
    It allows data in the SPI system to be transferred in 2 wires(DUAL) or
    4 wires(QUAD).
    Now the value that spi-tx-bus-width and spi-rx-bus-width can receive is
    only 1(SINGLE), 2(DUAL) and 4(QUAD). Dual/Quad mode is not allowed when
    3-wire mode is used.
    If a gpio chipselect is used for the SPI slave the gpio number will be
    passed via the SPI master node cs-gpios property.

  example:
    dts: |
      spi@f00 {
          ethernet-switch@0 {
              compatible = "micrel,ks8995m";
              spi-max-frequency = <1000000>;
              reg = <0>;
          };

          codec@1 {
              compatible = "ti,tlv320aic26";
              spi-max-frequency = <100000>;
              reg = <1>;
          };
      };
    yaml: |
      spi@f00:
        ethernet-switch@0:
          compatible: "micrel,ks8995m"
          spi-max-frequency: 1000000
          reg: 0

        codec@1:
          compatible: "ti,tlv320aic26"
          spi-max-frequency: 100000
          reg: 1

Note the &amp;spi-slave anchor, this is what it is used to refer to other parts of the schema.

The SPI slave binding defines a number of properties that all inherited bindings include. This in turn inherits from device-compatible, which is this:

%YAML 1.1
---
device-compatible: &device-compatible
  title: Contraint for devices with compatible properties
  # select node for checking when the compatible constraint and
  # the device status enable constraint are met.
  selected: [ "compatible", *device-status-enabled ]

  class: constraint
  virtual: true

Note that device-compatible is a binding that all devices defined with the DT schema will inherit from.

The selected property will be used to generate a select test that will be used to to find out whether a node should be checked against a given rule.

The selected rule defines two constraints. The first one is the name of a variable in a derived binding that all its constraints must satisfy; in this case it’s the jedec,spi-nor compatible constraint in the binding above. The selected constraint is a reference to the device-status-enabled constraint defined at:

%YAML 1.1
---
device-enabled:
  title: Contraint for enabled devices

  class: constraint
  virtual: true

  properties:
    status: &device-status-enabled
      category: optional
      type: str
      description: Marks device state as enabled
      constraint: |
        !exists || streq(v, "okay") || streq(v, "ok")

The device-enabled constraint checks where the node is enabled in DT parlance.

Taking those two constraints together, yamldt generates an enable method filter which triggers on an enable device node that matches any of the compatible strings defined in the jedec,spi-nor binding.

The check method will be generated by collecting all the property constraints (category, type and explicit value constraints).

Note how in the above example a variable (v) is used as the current property value. The generated methods will provide it, initialized to the current value to the constraint.

Note that custom, manually written select and check methods are possible but their usage is not recommended for simple types.

Installation

Install libyaml-dev and the standard autoconf/automake generation tools, then compile with the standard ./autogen.sh, ./configure, and make cycle. Note that the bundled validator requires a working eBPF compiler and libelf. Known working clang versions with eBPF support are 4.0 and higher.

For a complete example of a port of a board DTS file to YAML take a look in the port/ directory

Usage

The yamldt options available are:

yamldt [options] <input-file>
 options are:
   -q, --quiet           Suppress; -q (warnings) -qq (errors) -qqq (everything)
   -I, --in-format=X     Input format type X=[auto|yaml|dts]
   -O, --out-format=X    Output format type X=[auto|yaml|dtb|dts|null]
   -o, --out=X           Output file
   -c                    Don't resolve references (object mode)
   -g, --codegen         Code generator configuration file
       --schema          Use schema (all yaml files in dir/)
       --save-temps      Save temporary files
       --schema-save     Save schema to given file
       --color           [auto|off|on]
       --debug           Debug messages
   -h, --help            Help
   -v, --version         Display version

   DTB specific options

   -V, --out-version=X   DTB blob version to produce (only 17 supported)
   -C, --compatible      Bit-exact DTC compatibility mode
   -@, --symbols         Generate symbols node
   -A, --auto-alias      Generate aliases for all labels
   -R, --reserve=X       Make space for X reserve map entries
   -S, --space=X         Make the DTB blob at least X bytes long
   -a, --align=X         Make the DTB blob align to X bytes
   -p, --pad=X           Pad the DTB blob with X bytes
   -H, --phandle=X       Set phandle format [legacy|epapr|both]
   -W, --warning=X       Enable/disable warning (NOP)
   -E, --error=X         Enable/disable error (NOP)
   -b, --boot-cpu=X      Force boot cpuid to X

-q/--quiet suppresses message output.

The -I/--in-format option selects the input format type. By default it is set to auto which is capable of selecting based on file extension and input format source patterns.

The -O/--out-format option selects the output format type. By default it is set to auto which uses the output file extension.

-o/--out sets the output file.

The -c option causes unresolved references to remain in the output file resuling in an object file. If the output format is set to DTB/DTS it will generate an overlay, if set to yaml it results in a YAML file which can be subsequently recompiled as an intermediate object file.

The -g/--codegen option will use the given YAML file(s) (or dir/ as in the schema option) as input for the code generator.

The --schema option will use the given file(s) as input for the checker. As an extension, if given a directory name with a terminating slash (i.e. dir/) it will recursively collect and use all YAML files within.

The --save-temps option will save all intermediate files/blobs.

--schema-save will save the processed schema and codegen file including all compiled validation filters. Using it speeds validation of multiple files since it can be used as an input via the –schema option.

--color controls color output in the terminal, while --debug enables the generation of a considerable amount of debugging messages.

The following DTB specific options are supported:

-V/--out-version selects the DTB blob version; currently only version 17 is supported.

The -C/--compatible option generates a bit-exact DTB file as the DTC compiler.

The -@/--symbols and -A/--auto-alias options generate a symbols and alias entries for all the defined labels in the source files.

The -R/--reserve, -S/--space, -a/--align and -p/--pad options work the same way as in DTC. -R add reserve memreserve entries, -S adds extra space, -a aligns and -p pads extra space end of the DTB blob.

The -H/--phandle option selects either legacy/epapr or both phandle styles.

The -W/--warning and -E/--error options are there for command line compatibility with dtc and are ignored.

Finally -d/--boot-cpu forces the boot cpuid.

Automatic suffix detection does what you expect, i.e. an output file ending in .dtb if selecting the DTB generation option, .yaml if selecting the yaml generation option, and so on.

Given a source file in YAML foo.yaml, you generate a DTB file with:

# foo.yaml
foo: &foo
  bar: true
  baz:
   - 12
   - 8
   - *foo
  frob: [ "hello", "there" ]

To process it with yamldt:

$ yamldt -o foo.dtb foo.yaml
$ ls -l foo.dtb
-rw-rw-r-- 1 panto panto 153 Jul 27 18:50 foo.dtb
$ fdtdump foo.dtb
/dts-v1/;
// magic:       0xd00dfeed
// totalsize:       0xe1 (225)
// off_dt_struct:   0x38
// off_dt_strings:  0xc8
// off_mem_rsvmap:  0x28
// version:     17
// last_comp_version:   16
// boot_cpuid_phys: 0x0
// size_dt_strings: 0x19
// size_dt_struct:  0x90

/ {
    foo {
        bar;
        baz = <0x0000000c 0x00000008 0x00000001>;
        frob = "hello", "there";
        phandle = <0x00000001>;
    };
    __symbols__ {
        foo = "/foo";
    };
};

dts2yaml

dts2yaml is an automatic DTS to YAML conversion tool, that works on standard DTS files which use the preprocessor. It is capable of detecting macro usage and advanced DTS concepts, like property/node deletes, etc. Conversion is accurate as long as the source file still looks like DTS source (i.e. it is not using extremely complex macros).

dts2yaml [options] [input-file]
 options are:
   -o, --output        Output file
   -t, --tabs       Set tab size (default 8)
   -s, --shift      Shift when outputing YAML (default 2)
   -l, --leading    Leading space for output
   -d, --debug      Enable debug messages
       --silent        Be really silent
       --color         [auto|off|on]
   -r, --recursive     Generate DTS/DTSI included files
   -h, --help       Help
       --color         [auto|off|on]

All the input files will be converted to yaml format. If no output option is given, the output will be named according to the input filename. So foo.dts will yield foo.yaml and foo.dtsi will yield foo.yamli.

The recursive option converts all included files that have a dts/dtsi extension as well.

Test suite

To run the test-suite you will need a relatively recent DTC compiler, YAML patches are no longer required.

The test-suite first converts all the DTS files in the Linux kernel for all architectures to YAML format using dts2yaml. Afterwards, it compiles the YAML files with yamldt and the DTS files with dtc. The resulting dtb files are bit-exact because the -C option is used.

Run make check to run the test suite.
Run make validate to run the test suite and perform schema validation checks. It is recommended to use the --keep-going flag to continue checking even in the presence of validation errors.

Currently out of 1379 DTS files, only 6 fail conversion:

exynos3250-monk exynos4412-trats2 exynos3250-rinato exynos5433-tm2
exynos5433-tm2e

All 6 use a complex pin mux macro declaration that it is not possible to automatically convert.

Workflow

It is expected that the first thing a user of yamldt would want to do is to convert an existing DTS configuration to YAML.

The following example uses the beaglebone black and the am335x-boneblack.dts source as located in the port/ directory.

Compile the original DTS source with DTC

$  cc -E  -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input 
    -nostdinc -undef -x assembler-with-cpp -D__DTS__ am335x-boneblack.dts 
    | dtc -@ -q -I dts -O dtb - -o am335x-boneblack.dtc.dtb

Use dts2yaml to convert to yaml

$ dts2yaml -r am335x-boneblack.dts
$ ls *.yaml*
am335x-boneblack-common.yamli  am335x-bone-common.yamli  am33xx-clocks.yamli
am33xx.yamli  tps65217.yamli

Note the recursive option automatically generates the dependent include files.

$ cc -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input 
    -nostdinc -undef -x assembler-with-cpp -D__DTS__ am335x-boneblack.yaml | 
    ../../yamldt -C -@ - -o am335x-boneblack.dtb 
$ ls -l *.dtb
-rw-rw-r-- 1 panto panto 50045 Jul 27 19:10 am335x-boneblack.dtb
-rw-rw-r-- 1 panto panto 50045 Jul 27 19:07 am335x-boneblack.dtc.dtb
$ md5sum *.dtb
3bcf838dc9c32c196f66870b7e6dfe81  am335x-boneblack.dtb
3bcf838dc9c32c196f66870b7e6dfe81  am335x-boneblack.dtc.dtb

Compiling without the -C option results in a file with the same functionality, but it is slightly smaller due to better string table optimization.

$ yamldt am335x-boneblack.dtc.yaml -o am335x-boneblack.dtb
$ ls -l *.dtb
-rw-rw-r-- 1 panto panto 50003 Jul 27 19:12 am335x-boneblack.dtb
-rw-rw-r-- 1 panto panto 50045 Jul 27 19:07 am335x-boneblack.dtc.dtb

Note that the CPP command line is the same, so no changes to header files are required. dts2yaml will detect macro usage and convert from the space delimited form that DTC uses to the comma delimted form used by YAML.

yamldt as a DTC compiler

yamldt supports all dtc options, so using it as a dtc replacement is straightforward.

Using it for compiling the Linux Kernel DTS files is as simple as:

$ make DTC=yamldt dtbs

Note that by default the compatibility option (-C) is not used, so if you need to be bit-compatible with DTC pass the -C flag as follows:

$ make DTC=yamldt DTC_FLAGS="-C"

Generally, yamldt is a little bit faster than dtc and generates somewhat smaller DTB files (if not using the -C option). However, due to internally tracking all parsed tokens and their locations in files, it is capable of generating accurate error messages that are parseable by text editors for automatic movement to the error.

For example, with this file containing an error:

/* duplicate label */
/dts-v1/;
/ {
    a: foo { foo; };
    a: bar { bar; };
};

yamldt will generate the following error:

$ yamldt -I dts -o dts -C duplabel.dts
duplabel.dts:8:2: error: duplicate label a at "/bar"
  a: bar {
  ^
duplabel.dts:4:2: error: duplicate label a is defined also at "/foo"
  a: foo {
  ^

while dtc will generate:

$ yamldt -I dts -o dts -C duplabel.dts
dts: ERROR (duplicate_label): Duplicate label 'a' on /bar and /foo
ERROR: Input tree has errors, aborting (use -f to force output)

Known features of DTC that are not available are:

  • Only version 17 DT blobs are supported. Passing a -V argument requesting a different one will result in error.
  • Assembly output is not supported.
  • Assembly and filesystem inputs are not supported.
  • The warning and error options are accepted, but they don’t do anything. yamldt uses a validation schema for application specific errors and warnings so those options are superfluous.

Notes on DTS to DTS conversion

The conversion from DTS is straight forward:

For example:

/* foo.dts */
/ {
    foo = "bar";
    #cells = <2>;
    phandle-ref = <&ref 1>;
    ref: refnode { baz; };
};
# foo.yaml
foo: "bar"
"#cells": 2
phandle-ref: [ *ref 1 ]
refnode: &ref
  baz: true

Major differences between DTS & YAML:

  • YAML is using # as a comment marker, therefore properties with a # prefix get converted to explicit string literals:
#cells = <0>;

to YAML

"#cells": 0
  • YAML is indentation sensitive, but it is a JSON superset. Therefore the following are equivalent:
foo: [ 1, 2 ]
foo:
 - 1
 - 2
  • The labels in DTS are defined and used as:
foo: node { baz; };
bar = <&foo>;

In YAML the equivalent methods are called anchors and are defined as follows:

node: &foo
  baz: true
bar: *foo
  • Explicit tags in YAML are using !, so the following:
mac = [ 0 1 2 3 4 5 ];

is used like this in YAML:

mac: !int8 [ 0, 1, 2, 3, 4, 5 ]
  • DTS uses spaces to seperate array elements, YAML uses either indentation or commas in JSON form. Note that yamldt is smart enough to detect the DTS form and automatically convert in most cases:
pinmux = <0x00 0x01>;

In YAML:

pinmux:
  - 0x00
  - 0x01

or:

pinmux: [ 0x00, 0x01 ]
  • Path references () automatically are converted to pseudo YAML anchors (of the form yaml_pseudo__n__):
/ {
    foo { bar; };
};
ref = <&/foo>;

In YAML:

foo: &yaml_pseudo__0__
ref: *foo
  • Integer expression evaluation, similar in manner to that which the CPP preprocessor performs, is available. This is required in order for macros to work. For example, given the following two files:
/* add.h */
#define ADD(x, y) ((x) + (y))
# macro-use.yaml

#include "add.h"

result: ADD(10, 12)

The output after the cpp preprocessor pass:

result: ((10) + (12))

Parsing with yamldt to DTB will generate a property:

result = <22>;

Validation example

For this example we’re going to use port/am335x-boneblack-dev/. An extra rule-check.yaml file has been added where validation tests can be performed.

That file contains a single jedec,spi-nor device:

*spi0:
  m25p80@0:
    compatible: "s25fl256s1"
    spi-max-frequency: 76800000
    reg: 0
    spi-tx-bus-width: 1
    spi-rx-bus-width: 4
    "#address-cells": 1
    "#size-cells": 1

This is a valid device node, so running validate produces the following:

$ make validate
cc -E -MT am33xx.cpp.yaml -MMD -MP -MF am33xx.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ am33xx.yaml >am33xx.cpp.yaml
cc -E -MT am33xx-clocks.cpp.yaml -MMD -MP -MF am33xx-clocks.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ am33xx-clocks.yaml >am33xx-clocks.cpp.yaml
cc -E -MT am335x-bone-common.cpp.yaml -MMD -MP -MF am335x-bone-common.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ am335x-bone-common.yaml >am335x-bone-common.cpp.yaml
cc -E -MT am335x-boneblack-common.cpp.yaml -MMD -MP -MF am335x-boneblack-common.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ am335x-boneblack-common.yaml >am335x-boneblack-common.cpp.yaml
cc -E -MT am335x-boneblack.cpp.yaml -MMD -MP -MF am335x-boneblack.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ am335x-boneblack.yaml >am335x-boneblack.cpp.yaml
cc -E -MT rule-check.cpp.yaml -MMD -MP -MF rule-check.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ rule-check.yaml >rule-check.cpp.yaml
../../yamldt  -g ../../validate/schema/codegen.yaml -S ../../validate/bindings/ -y am33xx.cpp.yaml am33xx-clocks.cpp.yaml am335x-bone-common.cpp.yaml am335x-boneblack-common.cpp.yaml am335x-boneblack.cpp.yaml rule-check.cpp.yaml -o am335x-boneblack-rules.pure.yaml
jedec,spi-nor: /ocp/spi@48030000/m25p80@0 OK

Note the last line. It means the node was checked and was found OK.

Editing the rule-check.yaml file, let’s introduce a couple of errors. The following output is generated by commenting out the reg property # reg: 0:

$ make validate
cc -E -MT rule-check.cpp.yaml -MMD -MP -MF rule-check.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ rule-check.yaml &gt;rule-check.cpp.yaml
../../yamldt  -g ../../validate/schema/codegen.yaml -S ../../validate/bindings/ -y am33xx.cpp.yaml am33xx-clocks.cpp.yaml am335x-bone-common.cpp.yaml am335x-boneblack-common.cpp.yaml am335x-boneblack.cpp.yaml rule-check.cpp.yaml -o am335x-boneblack-rules.pure.yaml
jedec,spi-nor: /ocp/spi@48030000/m25p80@0 FAIL (-2004)
../../validate/bindings/jedec,spi-nor.yaml:15:5: error: missing property: property was defined at /jedec,spi-nor/properties/reg
     reg:
     ^~~~

Note the descriptive error and the pointer to the missing property in the schema.

Making another error, assign a string to the reg property reg: "string":

$ make validate
$ make validate
cc -E -MT rule-check.cpp.yaml -MMD -MP -MF rule-check.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ rule-check.yaml &gt;rule-check.cpp.yaml
../../yamldt  -g ../../validate/schema/codegen.yaml -S ../../validate/bindings/ -y am33xx.cpp.yaml am33xx-clocks.cpp.yaml am335x-bone-common.cpp.yaml am335x-boneblack-common.cpp.yaml am335x-boneblack.cpp.yaml rule-check.cpp.yaml -o am335x-boneblack-rules.pure.yaml
jedec,spi-nor: /ocp/spi@48030000/m25p80@0 FAIL (-3004)
rule-check.yaml:8:10: error: bad property type
     reg: "string"
          ^~~~~~~~
../../validate/bindings/jedec,spi-nor.yaml:15:5: error: property was defined at /jedec,spi-nor/properties/reg
     reg:
     ^~~~

Note the message about the type error, and the pointer to the location where the reg property was defined.

Finally, let’s make an error that violates a constraint.

Change the spi-tx-bus-width value to 3:

$ make validate
cc -E -MT rule-check.cpp.yaml -MMD -MP -MF rule-check.o.Yd -I ./ -I ../../port -I ../../include -I ../../include/dt-bindings/input -nostdinc -undef -x assembler-with-cpp -D__DTS__ -D__YAML__ rule-check.yaml &gt;rule-check.cpp.yaml
../../yamldt  -g ../../validate/schema/codegen.yaml -S ../../validate/bindings/ -y am33xx.cpp.yaml am33xx-clocks.cpp.yaml am335x-bone-common.cpp.yaml am335x-boneblack-common.cpp.yaml am335x-boneblack.cpp.yaml rule-check.cpp.yaml -o am335x-boneblack-rules.pure.yaml
jedec,spi-nor: /ocp/spi@48030000/m25p80@0 FAIL (-1018)
rule-check.yaml:9:23: error: constraint rule failed
     spi-tx-bus-width: 3
                       ^
../../validate/bindings/spi/spi-slave.yaml:77:19: error: constraint that fails was defined here
       constraint: v == 1 || v == 2 || v == 4
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~
../../validate/bindings/spi/spi-slave.yaml:74:5: error: property was defined at /spi-slave/properties/spi-tx-bus-width
     spi-tx-bus-width:

Note how the offending value is highlighted. The offending constraint and property definition are aslo listed.

Home Servers, AArch64, and You (well, me)

Recently, my SoftIron OverDrive1000 arrived, and I’ve finally given myself some time to sit down and implement the project I purchased it for. First, one may ask, why AArch64? The answer lies in my history of installing Linux-based machines at home, the number of PowerPC machines still exceeds the number of x86/x86-64 machines that I have done. So when I got the chance to pick up some AArch64 hardware that came in a regular form factor, I decided to jump on it. The other reason is that, as I have joked with some people, I like to do things the hard way. What do I mean by that? Well, I’ll explain. But first, some background.

The hardware comes with openSUSE Leap 42.2 preinstalled. I decided to take the fact that this is the least familiar distribution to me as a challenge. At the same time, I also decided that it would be worth evaluating other distributions that I’m familiar with, such as CentOS and Debian, on AArch64. Most of what I do with my home server today is related to locally streaming media. So while I had been installing and running various applications directly (making use of crontab’s @reboot keyword to launch them), I decided I should move on to modern best practices and use Docker to isolate, control, and update these applications. Finally, while I’ve been using Rygel for a few months, I want to get back to using Plex Media Server to serve the content. This in turn introduces the constraint of needing to use a 32-bit ARM binary, as Plex currently does not have a 64-bit build available.

The first bit of fun I found was that while the Docker project has community provided support for AArch64, Docker, Inc does not actually provide AArch64 builds. So if you want to have the most up to date Docker installation and wish to have Docker be managed by your distribution packaging manager, there’s some work to be done. The Docker project does have some contrib scripts to create packages of Docker for many distributions, but they’re written assuming that you’re running on x86-64. The good news is that all 3 of the distributions mentioned above do ship a version of Docker. For my needs (a well firewalled internal server), the versions they provide are OK.

Testing on CentOS proved interesting. It was easy to get it installed and running from a spare drive on the real hardware. This was just as easy and boring as advertised. After checking basic functionality, I switched to running it in a virtual machine for the rest of my tests. Here’s where things got difficult. First, as of today, CentOS uses 64K pages rather than 4K pages. This in turn disallows the possibility of having 32-bit binary compatibility. One can recompile the CentOS kernel to change the page size and get 32-bit compatibility working. I did this and it proved straightforward yet time consuming. After confirming that this was enough to enable at least basic 32-bit compatibility, I decided to move on.

Debian was the next distribution that I tried out. While the Debian wiki says that you need a newer image to install on AMD Seattle hardware such as the Overdrive 1000, the latest Jessie images are actually new enough to boot and start the installer. At this point in my experiments I didn’t want do another install on the hardware, so the rest of my tests were done on a virtual machine install. This installation didn’t work out well for me and my needs, unfortunately. While there is a docker-engine package available which is used by 96boards.org, I didn’t want to introduce such a large deviation from upstream Debian. The docker.io package exists as a backport to Jessie, but not for arm64. Further, at the time that I was testing things, Sid was in an incomplete state for the docker.io package. While I was able to resolve the dependencies manually, Docker didn’t want to start. I should note that I’m doing my reviews here out of order slightly. At this point I had openSUSE with Docker running, and didn’t want to further dive into this problem, which I’m sure could be resolved. One last note is that I did test 32-bit compatibility, and it works fine on Debian out of the box.

Which brings me back to openSUSE. The main repository has Docker available, and the kernel is already built with 32-bit compatibility enabled. This was the easiest distribution for getting the first order of problems solved on, as everything just worked, and I was also able to easily set up VMs for testing the other distributions. The biggest hurdle I faced on openSUSE is that I didn’t find a lot of documentation about installing Docker, but instead about setting up various flavours of LXC. Fortunately, Docker is not very host distribution specific, so this was not a big problem.

All of the above distributions, and other efforts such as the Linaro distributions for the Dragonboard, suffer from one more issue with respect to running Docker containers consisting of 32-bit binaries. Namely, that we’re not using binaries that we’ve controlled the build process of, we’re just picking up whatever is in various public containers from Docker. Without getting too deep into the technical details of what instructions AArch64 will emulate, not all instructions required to run “armhf” binaries are required. For example, a number of Docker images are built optimized to the ARMv6 architecture as found in some models of the Raspberry Pi, and these instructions require additional Linux kernel support to be enabled. For more details see here.

So, where does that leave us? Well, first of all, I’ve completed the software side of the desired migration. The new server is doing everything the old server was doing before. As far as my end users go, it’s all working just as well as before too. In doing this migration, I’ve been reminded of the phrase “the more things change, the more they stay the same”. Back during the early Linux on PowerPC days, one would often have to deal with the problems induced by “Linux” really meaning “Linux on 32-bit Intel/x86 compatible CPUs” to some project or maintainer. Today one has to deal with the problem of “Linux” often meaning “Linux on 64-bit Intel/AMD x86-64 compatible systems” with an occasional “Linux on the Raspberry Pi” instead. Both of these assumptions make life interesting at times. With the former, for example, it is possible to have Docker images deal with multiple architectures, but that’s just not done today. I suspect that while it will be done for the core Docker images at some point, one is always going to need to take care when using arbitrary community images, setting aside any other concerns one might have in that area. With the latter assumption, we run into the kernel issue mentioned above, as without the Pi, “armhf” would likely mean “ARMv7” in practice. In the end, the cool thing here is that the distributions have achieved pretty close to parity between x86-64 and AArch64 in terms of the offered software. However, once you start to move out of that garden and into the world at large, you’re going to start to find some oddities. Now, if you’re like me and find the challenge fun, or are looking for a reason to dive deeper into how things work, it’s a great time and a great idea. But if you’re looking for a turn-key solution on an AArch64 platform, things aren’t quite there yet.

Konsulko Group recognized as a top AGL contributor

Konsulko Group recently participated in the Automotive Grade Linux (AGL) Winter All Members Meeting (AMM) in Tokyo, Japan. During the AMM, Konsulko Group was recognized as the #6 ranked contributor to the AGL project for 2016.

Matt Porter, CTO of Konsulko Group, said, “We ramped up our contributions to AGL in Q4 2016 as a part of supporting several impressive new demos for CES 2017. I was thrilled to see our AGL team so well represented alongside other automotive industry companies“.

AGL is critical to our customers in the automotive market. With our automotive business accelerating, we expect our AGL contributions to increase greatly in 2017“, commented Pete Popov, CEO of Konsulko Group.

Tool Time: Quilt

All of us have our favorite tools and workflows. For me, I am a fan of git. One of the things I really like about it is git stash, because it lets me keep drafts of my changes and develop them incrementally. But there are times where you need to adapt your workflow to fit within the requirements your customer provides. For example, I’m working with subversion currently with a customer. And while I’m familiar with git-svn, it’s not always the right choice. So I dug around in my toolbox and found quilt from my pre-git days. It’s not quite the same as my usual workflow, but with a minimal of mental gymnastics I’m productive and working rather than fighting with my tools.

Here’s my quick guide to getting started using quilt. As a first step it’s worth looking at the quilt sub-commands:

$ quilt help
Usage: quilt [--trace[=verbose]] [--quiltrc=XX] command [-h] ...
       quilt --version
Commands are:
        add       fold    new       remove    top
        annotate  fork    next      rename    unapplied
        applied   graph   patches   revert    upgrade
        delete    grep    pop       series
        diff      header  previous  setup
        edit      import  push      shell
        files     mail    refresh   snapshot

Global options:

--trace
        Runs the command in bash trace mode (-x). For internal debugging.

--quiltrc file
        Use the specified configuration file instead of ~/.quiltrc (or
        /etc/quilt.quiltrc if ~/.quiltrc does not exist).  See the pdf
        documentation for details about its possible contents.  The
        special value "-" causes quilt not to read any configuration
        file.

--version
        Print the version number and exit immediately.

Some of these are going to look familiar. This is because quilt was one of the inspirations for git. Some of the commands are a little different. For example, quilt fold is the equivalent of doing git rebase –interactive and using the squash and fixup keywords. Using quilt header is like setting your commit message when you git commit your changes. It even has quilt snapshot, which behaves like git stash, but I’m going to set it aside in favor or making new patches as I go. With all of this covered, it’s time to show a sample workflow:

$ quilt new first.patch
Patch first.patch is now on top
$ quilt add src/stuff.c
File src/stuff.c added to patch first.patch
$ ${EDITOR} src/stuff.c
$ quilt refresh
Refreshed patch first.patch
$ quilt new second.patch
Patch second.patch is now on top
$ quilt add src/stuff.c
File src/stuff.c added to patch second.patch
$ # ... and so on

What you’ll notice from the commands is what I feel is the hard part of this work flow, you must tell quilt what to track. Because it’s not your revision control system, quilt doesn’t know what the original state of a file is before you change it. However, with a little time you’ll be doing quilt new, quilt edit and quilt refresh without a second thought.

One noteworthy thing about quilt is the workflow for reviewing your changes. To see what your current patch looks like compared with the state of the tree prior to any changes, you do quilt diff. Note that this is akin to git diff HEAD in that it will disregard changes made since your last quilt refresh. To get the same type of output as git diff, you do quilt diff -z. When you wish to review your series of changes quilt pop will unapply the current patch, and quilt push will apply the next patch in your series. quilt series will list all patches in the series with quilt next, and quilt previous lists the next and previous patches in the series, respectively.

Developing on hardware without functional networking

One of the more painful steps in doing development on hardware is when you don’t have any networking that’s functional and reliable yet. So you end up having to shuffle a SD card, USB stick, or similar back and forth. This can be even less fun when you’re working on a prototype, and need to take care to avoid disconnecting a fragile collection of boards, wires, and cables.

While there are a few different ways to get around this problem, depending on what is and isn’t working, my current favorite is wireless enabled SD cards. Why? One reason is that they’re OS-independent. So long as the board can supply power the card will be able to bringup its network. Break booting into Linux with your latest change? Not a problem. Want to integrate with your existing build cycle? There are CLI tools, and at the end of the day you’re sending stuff via HTTP to the card so you can always write something yourself if you don’t like what you find for tooling. The biggest drawback to me is that it only supports exposing the first FAT partition, but on the other hand you can happily partition the card and there is no requirement on where the FAT partition physically resides.

To make the best use of these cards, there are a few tricks you will want to employ. And while everything is documented, I found it handy to make up some templates to work with when I was setting up my lab with a few cards. On each card I would leave the VERSION and CID fields alone as they are pre-populated, and then overwrite the rest of the contents based on my template, fill it in, and go. My template looks like:

[WLANSD]

DHCP_Enabled=NO
IP_Address=192.168.0.XXX
Subnet_Mask=255.255.255.0
Default_Gateway=192.168.0.1
Preferred_DNS_Server=192.168.0.1
Alternate_DNS_Server=8.8.8.8

[Vendor]

CIPATH=/DCIM/100__TSB/FA000001.JPG
PRODUCT=FlashAir
VENDOR=TOSHIBA
WEBDAV=1
TIMEZONE=-28
APPSSID=MY-LOCAL-SSID
APPNETWORKKEY=MY-LOCAL-SSID-PASSPHRASE
APPNAME=flashair-XXX
APPMODE=5
APPAUTOTIME=300000
DNSMODE=0
LOCK=1
UPLOAD=1
WLANAPMODE=0x82

A few of these choices are odd enough that they are worth explaining. First, while the cards can happily do DHCP, I don’t like relying on that in cases like this. I’m much happier to instead put the card in a case with a sticky note on top that notes the IP, for the next project I need it for. Next, I set APPMODE to 5 so that it will join my network rather than start its own. The TIMEZONE key is a little odd. It is UTC based but works in 15 minute increments, which can be useful, but is somewhat unexpected. Note that if you do not spell out UPLOAD=1 uploading of files is disabled. The default upload path is the root, and that is often what we will want. Finally, I have WLANAPMODE set to the value for 802.11n/g rather than the default of 802.11b/g. It is also worth noting that once the card has run with this config file, it will update and asterisk out your network passphrase.

At the end of the day, I like these, and have a number of them because they’re so versatile. They come in sizes up to 32GB, which is enough to have a pretty well featured distribution available; either for the board itself, or to chroot into during development to easily add perf or other tools to a stripped down system. They are a full-size SD card but that’s just an adapter away from fitting into a microSD slot which is how I use about half of mine. And if you don’t have a spare (or functional yet) SD slot a USB card reader works just as well to bring this into the system you’re working on.

Supporting Flame Graphs on production kernels

Background

Perf is an amazing tool for observing system performance in Linux. Using perf on production kernels can be filled with pitfalls, due to the rapid pace at which new features are being added. In my case, I support a production kernel team that expects every feature they read about on the web to work on their older production kernel. A good example of a downstream use case of perf is Brendan Gregg’s very nice Flame Graphs tool for visualizing frequently used code paths in a system.

Example mysql Flame Graph

Example mysql Flame Graph

Recording call frame information with perf

Generation of Flame Graphs depends on perf capturing call frames. As documented in the Flame Graph tools, one records perf data on a x86-64 system by enabling DWARF call graph support with a command line like:

$ perf record -F 99 -a --call-graph dwarf -- sleep 60

That, of course, produces the raw perf.data file. The call frames we need are there. However, we need to process this data with a reporting tool.

Problems generating Flame Graphs

Now we start running into the problem with our production kernel. In our case, we are on a 4.1 kernel. Users are happily running perf report, seeing the complete set of call frame information throughout the system components under observation. The interesting thing is that if we generate a Flame Graph using this same data, then the users no longer have visibility into the complete calling tree information. That is, the Flame Graph will simply show time spent in a given library. So what’s wrong? Let’s take a look at how Flame Graphs are generated:

$ perf script > out.perf
$ stackcollapse-perf.pl out.perf > out.folded
$ flamegraph.pl out.folded > out.svg

The key here is that we are no longer parsing the perf data using perf report, but rather using perf script to do the heavy lifting and feeding the result into the Flame Graph generation tools. Doing a bit of git detective work, we can see that perf report added callchain sampling all the way back in 3.18:

$ git describe --contains 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
v3.18-rc1~8^2~2^2~6
$ git log -1 -p 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
commit 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
Author: Namhyung Kim <namhyung@kernel.org>
Date:   Mon Oct 6 09:45:59 2014 +0900

    perf report: Set callchain_param.record_mode for future use

    Normally the callchain_param.record_mode is used only for record path.
    But as it might need to prepare something for dwarf unwinding, setup
    this info for perf report too.

    Signed-off-by: Namhyung Kim <namhyung@kernel.org>
    Acked-by: Jiri Olsa <jolsa@kernel.org>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Frederic Weisbecker <fweisbec@gmail.com>
    Cc: Ingo Molnar <mingo@kernel.org>
    Cc: Jean Pihet <jean.pihet@linaro.org>
    Cc: Jiri Olsa <jolsa@redhat.com>
    Cc: Namhyung Kim <namhyung.kim@lge.com>
    Cc: Paul Mackerras <paulus@samba.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/1412556363-26229-2-git-send-email-namhyung@kernel.org
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2cfc4b93..140a6cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -257,6 +257,13 @@ static int report__setup_sample_type(struct report *rep)
                }
        }

+       if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
+               if ((sample_type & PERF_SAMPLE_REGS_USER) &&
+                   (sample_type & PERF_SAMPLE_STACK_USER))
+                       callchain_param.record_mode = CALLCHAIN_DWARF;
+               else
+                       callchain_param.record_mode = CALLCHAIN_FP;
+       }
        return 0;
 }

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 96adb73..fc25e57 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -9,6 +9,7 @@
 #include "perf_regs.h"
 #include "map.h"
 #include "thread.h"
+#include "callchain.h"

 static int mmap_handler(struct perf_tool *tool __maybe_unused,
                        union perf_event *event,
@@ -120,6 +121,8 @@ int test__dwarf_unwind(void)
                return -1;
        }

+       callchain_param.record_mode = CALLCHAIN_DWARF;
+
        if (init_live_machine(machine)) {
                pr_err("Could not init machinen");
                goto out;

Making Flame Graphs work with our kernel

Knowing that this worked on newer versions of perf in at least the 4.6 kernel, we were then able to spot that it wasn’t until 4.3 that perf script gained callchain support. Notice the addition of the analogous code to what was already in perf report:

$ git describe --contains 7322d6c98dd214252bd697f8dde64a3576977fab
v4.3-rc1~138^2~5^2~10
$ git log -1 -p 7322d6c98dd214252bd697f8dde64a3576977fab
commit 7322d6c98dd214252bd697f8dde64a3576977fab
Author: Jiri Olsa <jolsa@redhat.com>
Date:   Thu Aug 13 09:17:24 2015 +0200

    perf script: Initialize callchain_param.record_mode

    Milian Wolff reported non functional DWARF unwind under perf script. The
    reason is that perf script does not properly configure
    callchain_param.record_mode, which is needed by unwind code.

    Stealing the code from report and leaving the place for more
    initialization code in a hope we could merge it with
    report__setup_sample_type one day.

    Reported-by: Milian Wolff <mail@milianw.de>
    Signed-off-by: Jiri Olsa <jolsa@kernel.org>
    Tested-by: Milian Wolff <milian.wolff@kdab.com>
    Cc: David Ahern <dsahern@gmail.com>
    Cc: Namhyung Kim <namhyung@kernel.org>
    Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
    Link: http://lkml.kernel.org/r/20150813071724.GA21322@krava.brq.redhat.com
    Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7b376d2..105332e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1561,6 +1561,22 @@ static int have_cmd(int argc, const char **argv)
        return 0;
 }

+static void script__setup_sample_type(struct perf_script *script)
+{
+       struct perf_session *session = script->session;
+       u64 sample_type = perf_evlist__combined_sample_type(session->evlist);
+
+       if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
+               if ((sample_type & PERF_SAMPLE_REGS_USER) &&
+                   (sample_type & PERF_SAMPLE_STACK_USER))
+                       callchain_param.record_mode = CALLCHAIN_DWARF;
+               else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+                       callchain_param.record_mode = CALLCHAIN_LBR;
+               else
+                       callchain_param.record_mode = CALLCHAIN_FP;
+       }
+}
+
 int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 {
        bool show_full_info = false;
@@ -1849,6 +1865,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
                goto out_delete;

        script.session = session;
+       script__setup_sample_type(&script);

        session->itrace_synth_opts = &itrace_synth_opts;

By backporting this support from the 4.3 version of perf, we were able to support generation of Flame Graphs with our 4.1 production kernel tooling.

Conclusion

The moral of the story is: don’t count on well publicized perf features working on your older kernel. It is just as important to backport updates to the userspace perf tools as it is to backport updates for the production kernel itself.

Git workflow for upstreaming patches from a vendor kernel

Background

Upstreaming patches from a vendor kernel is a constant process of trying to get on board a train, falling behind, and hopping back on again.

Typically, a vendor kernel is based on an older version of the kernel. As a result, one has to forward port a series of patches against the latest kernel mainline. This is seldom a pain free affair, since the patches may not apply without manual edits. For example, the kernel interfaces may have changed, patches merged into mainline could have conflicting changes, and so on. On top of all of this, you have the reality that this task will need to be performed a number of times while you track mainline.

Basics

I usually perform cherry-picking of each patch on top of the mainline branch from the vendor branch, which is usually based on some old stable point release. Let’s assume that there are tags pointing to the vendor and stable commits, this reports the number of patches on top of that stable release.

panto@dev:~/linux (master)$ git describe stable
v4.1.15

panto@dev:~/linux (master)$ git describe vendor
v4.1.15-926-gea8c225

In this case, the patches are against stable v4.1.15 and there are 926 patches
on top of it; the format of the describe label is <tag>-<#-of-patches>-g<commit>

Workflow

First we get our needed information on the master branch.

panto@dev:~/linux (master)$ git describe master
v4.8-rc8-13-g53061af

Our master is today’s mainline kernel (v4.8-rc8) with just 13 patches on top of it. The problem with cherry-picking and manual editing is that when you edit the patch the commit id changes since the contents of the patch changes. We need a way to have a list of patches to cherry-pick, iteratively apply them, and manually fix any problems.

panto@dev:~/linux (master)$ git checkout --track -b work master
Checking out files: 100% (33262/33262), done.
Branch work set up to track local branch master.
Switched to a new branch 'work'

panto@dev:~/linux (work)$ git log --reverse --oneline stable..vendor >patchlist.txt

We create file with a list of the commits we want to apply on the work branch.

panto@dev:~/linux (work)$ cat patchlist.txt | head -n 2
ff24250 ppc: Make number of GPIO pins configurable
e4d443c pci/pciehp: Allow polling/irq mode to be decided on a per-port basis

This is the standard oneline format of git log (in reverse since we want the list to be in chronological order). If we were to do this manually, we’d have to do it like this:

panto@dev:~/linux (work)$ git cherry-pick ff24250 

If the cherry-pick is successful, we can proceed with the next one and so on. Otherwise, we have to manually fix it and issue git cherry-pick –continue Why not automate this by picking out the commit from the list and work iteratively? We can’t simply use commit IDs because the commit ID changes after every edit. The following cherry-pick-list.sh script does the heavy lifting of picking out the commits for us. Given the patchlist file, it will git cherry-pick each commit in the list. However, it will skip the already applied commits which match the description. It does not consider commit IDs since those might have changed.

#!/bin/bash

# get top
top=`git log --oneline HEAD^..HEAD | head -n1`
ctop=`echo ${top} | cut -d' ' -f1`
dtop=`echo ${top} | cut -d' ' -f2-`
l="$ctop $dtop"
if [ "$l" != "$top" ] ; then
        echo "Reconstructed top failure"
        echo $top
        echo $l
        exit 5
fi

# get list of commits and descriptions
old_IFS=${IFS}
IFS=$'n'

j=0
for i in `grep -v '^#' $1`; do
        c[${j}]=`echo ${i} | cut -d' ' -f1`
        d[${j}]=`echo ${i} | cut -d' ' -f2-`
        l="${c[${j}]} ${d[${j}]}"
        if [ $l != $i ] ; then
                echo "Reconstructed changeset failure $i"
                exit 5
        fi
        ((j++))
done
last=$((j - 1))
IFS=${old_IFS}

# skip over patches that are applied (checking description only)
match=0
for i in `seq 0 $last`; do
        ct=${c[${i}]}
        dt=${d[${i}]}
        if [ "${dt}" == "${dtop}" ]; then
                echo "Match found at $i: $dt"
                match=$(($i + 1))
                break;
        fi
        # echo "$i: $ct $dt"
done

for i in `seq $match $last`; do
        ct=${c[${i}]}
        dt=${d[${i}]}
        echo "cherry-picking: $i: $ct $dt"
        git cherry-pick $ct
        if [ $? -ne 0 ] ; then
                exit 5;
        fi
done

It makes sense to work on a simplified example using the kernel’s README file.

panto@dev:~/linux (work)$ git checkout --track -b foo master
Switched to branch 'foo'
Your branch is up-to-date with 'master'.

Edit the README file resulting to the following patch:

panto@dev:~/linux (foo)$ git diff
diff --git a/README b/README
index a24ec89..947fe6c 100644
--- a/README
+++ b/README
@@ -6,6 +6,8 @@ kernel, and what to do if something goes wrong.

WHAT IS LINUX?

+  foo
+
Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across
the Net. It aims towards POSIX and Single UNIX Specification compliance.

Commit the change:

panto@dev:~/linux (foo)$ git commit -m 'foo description'
[foo 4b5f122b] foo description
1 file changed, 2 insertions(+)

List the patches on top of master in sequence:

panto@dev:~/linux (foo)$ git log --oneline --reverse master..foo
4b5f122b foo description

Let’s create a new bar branch:

panto@dev:~/linux (foo)$ git checkout --track -b bar master
Switched to branch 'bar'
Your branch is up-to-date with 'master'.

Edit the README file resulting to the following patch:

panto@dev:~/linux (bar)$ git diff
diff --git a/README b/README
index a24ec89..4e7043c 100644
--- a/README
+++ b/README
@@ -6,6 +6,8 @@ kernel, and what to do if something goes wrong.

WHAT IS LINUX?

+  bar
+
Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across
the Net. It aims towards POSIX and Single UNIX Specification compliance.

Note that this conflicts with the foo patch, we will need to manually fix it later.

Commit the change:

panto@dev:~/linux (bar)$ git commit -m 'bar description'
[bar aba1679] bar description
 1 file changed, 2 insertions(+)

Make another commit that is conflict free:

panto@dev:~/linux (bar)$ git diff

diff --git a/README b/README
index 2788bfc..fbdf488 100644 
--- a/README
+++ b/README 
@@ -412,3 +412,4 @@ IF SOMETHING GOES WRONG:
gdb'ing a non-running kernel currently fails because gdb (wrongly)
disregards the starting offset for which the kernel is compiled.

+   more bar

Switch back to the foo branch to apply the changes in the bar branch.

panto@dev:~/linux (bar)$ git checkout foo
Switched to branch 'foo'
Your branch is ahead of 'master' by 1 commit.
  (use "git push" to publish your local commits)

Generate the patchlist file:

panto@dev:~/linux (foo)$ git log --oneline --reverse master..bar | tee patchlist.txt
22d7ac6 bar description
2be1bbb more bar description

Apply them using the script:

panto@dev:~/linux (foo)$ cherry-pick-list.sh patchlist.txt  
cherry-picking: 0: 22d7ac6 bar description
error: could not apply 22d7ac6... bar description
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
Recorded preimage for 'README'

panto@dev:~/linux (foo)$ git diff
diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,11 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

++<<<<<<< HEAD
+  foo
++=======
+   bar
++>>>>>>> 22d7ac6... bar description

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

Edit and fix it to look like this:

panto@dev:~/linux (foo)$ git diff
diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,8 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

+  foo
+   bar

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

panto@dev:~/linux (foo)$ git add README

Edit the commit message (leaving the conflict or removing the Conflicts: tag)

panto@dev:~/linux (foo)$ git cherry-pick continue
Recorded resolution for 'README'.
[foo cc9dc34] bar description
1 file changed, 1 insertion(+)

Note the Recorded resolution line. Next time we will perform the same operation so we don’t have to repeat the manual fix. Run the script again to pick up the rest of the patchlist.

panto@dev:~/linux (foo)$ git cherry-pick continue
Match found at 0: bar description
cherry-picking: 1: 2be1bbb more bar description
[foo 63c1973] more bar description
 1 file changed, 1 insertion(+)

Note the message Match found at 0:. The script picked up that the first commit has been applied (albeit manually edited) and continued with the rest, which apply without problems. List the patches on top of master on the foo branch. Note that the commit ids of the patch sequence have changed.

panto@dev:~/linux (foo)$ git log --oneline --reverse master..
4b5f122b foo description 
cc9dc34 bar description
63c1973 more bar description

Now if we reset the foo branch back to the starting point:

panto@dev:~/linux (foo)$ git reset --hard HEAD^^
HEAD is now at 4b5f122b foo description

Try to apply the patchlist again to see what happens:

panto@dev:~/linux (foo)$ cherry-pick-list.sh patchlist.txt
cherry-picking: 0: 22d7ac6 bar description
error: could not apply 22d7ac6... bar description
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit' 
Resolved 'README' using previous resolution.

Note the Resolved ‘README’ using previous resolution. This means that git determined that we are trying to perform the same edit and already made the change for us. Of course, it didn’t commit the change so that we have a chance to verify that it is correct.

panto@dev:~/linux (foo)$ diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,8 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

+  foo
+   bar

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

Just add the changed file as earlier:

panto@dev:~/linux (foo)$ git add README 
panto@hp800z:~/juniper/linux-medatom.git (foo)$ git cherry-pick --continue
[foo 2586dba] bar description
 1 file changed, 1 insertion(+)

Use the script again and end up at the same result:

panto@dev:~/linux (foo)$ ./cherry-pick-list.sh patchlist.txt 
Match found at 0: bar description
cherry-picking: 1: 2be1bbb more bar description
[foo 6641dd4] more bar description
 1 file changed, 1 insertion(+)

I’m not a regular git guru but I found out that this small script ended up saving me a large amount of repetitive work. I hope someone else will find this useful.