Supporting Flame Graphs on production kernels

Background

Perf is an amazing tool for observing system performance in Linux. Using perf on production kernels can be filled with pitfalls, due to the rapid pace at which new features are being added. In my case, I support a production kernel team that expects every feature they read about on the web to work on their older production kernel. A good example of a downstream use case of perf is Brendan Gregg’s very nice Flame Graphs tool for visualizing frequently used code paths in a system.

Example mysql Flame Graph

Example mysql Flame Graph

Recording call frame information with perf

Generation of Flame Graphs depends on perf capturing call frames. As documented in the Flame Graph tools, one records perf data on a x86-64 system by enabling DWARF call graph support with a command line like:

$ perf record -F 99 -a --call-graph dwarf -- sleep 60

That, of course, produces the raw perf.data file. The call frames we need are there. However, we need to process this data with a reporting tool.

Problems generating Flame Graphs

Now we start running into the problem with our production kernel. In our case, we are on a 4.1 kernel. Users are happily running perf report, seeing the complete set of call frame information throughout the system components under observation. The interesting thing is that if we generate a Flame Graph using this same data, then the users no longer have visibility into the complete calling tree information. That is, the Flame Graph will simply show time spent in a given library. So what’s wrong? Let’s take a look at how Flame Graphs are generated:

$ perf script > out.perf
$ stackcollapse-perf.pl out.perf > out.folded
$ flamegraph.pl out.folded > out.svg

The key here is that we are no longer parsing the perf data using perf report, but rather using perf script to do the heavy lifting and feeding the result into the Flame Graph generation tools. Doing a bit of git detective work, we can see that perf report added callchain sampling all the way back in 3.18:

$ git describe --contains 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
v3.18-rc1~8^2~2^2~6
$ git log -1 -p 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
commit 0cdccac6fe4b1316f04f0dbfcc4efab51932014a
Author: Namhyung Kim <[email protected]>
Date:   Mon Oct 6 09:45:59 2014 +0900

    perf report: Set callchain_param.record_mode for future use

    Normally the callchain_param.record_mode is used only for record path.
    But as it might need to prepare something for dwarf unwinding, setup
    this info for perf report too.

    Signed-off-by: Namhyung Kim <[email protected]>
    Acked-by: Jiri Olsa <[email protected]>
    Cc: David Ahern <[email protected]>
    Cc: Frederic Weisbecker <[email protected]>
    Cc: Ingo Molnar <[email protected]>
    Cc: Jean Pihet <[email protected]>
    Cc: Jiri Olsa <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Paul Mackerras <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: http://lkml.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index 2cfc4b93..140a6cd 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -257,6 +257,13 @@ static int report__setup_sample_type(struct report *rep)
                }
        }

+       if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
+               if ((sample_type & PERF_SAMPLE_REGS_USER) &&
+                   (sample_type & PERF_SAMPLE_STACK_USER))
+                       callchain_param.record_mode = CALLCHAIN_DWARF;
+               else
+                       callchain_param.record_mode = CALLCHAIN_FP;
+       }
        return 0;
 }

diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index 96adb73..fc25e57 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -9,6 +9,7 @@
 #include "perf_regs.h"
 #include "map.h"
 #include "thread.h"
+#include "callchain.h"

 static int mmap_handler(struct perf_tool *tool __maybe_unused,
                        union perf_event *event,
@@ -120,6 +121,8 @@ int test__dwarf_unwind(void)
                return -1;
        }

+       callchain_param.record_mode = CALLCHAIN_DWARF;
+
        if (init_live_machine(machine)) {
                pr_err("Could not init machinen");
                goto out;

Making Flame Graphs work with our kernel

Knowing that this worked on newer versions of perf in at least the 4.6 kernel, we were then able to spot that it wasn’t until 4.3 that perf script gained callchain support. Notice the addition of the analogous code to what was already in perf report:

$ git describe --contains 7322d6c98dd214252bd697f8dde64a3576977fab
v4.3-rc1~138^2~5^2~10
$ git log -1 -p 7322d6c98dd214252bd697f8dde64a3576977fab
commit 7322d6c98dd214252bd697f8dde64a3576977fab
Author: Jiri Olsa <[email protected]>
Date:   Thu Aug 13 09:17:24 2015 +0200

    perf script: Initialize callchain_param.record_mode

    Milian Wolff reported non functional DWARF unwind under perf script. The
    reason is that perf script does not properly configure
    callchain_param.record_mode, which is needed by unwind code.

    Stealing the code from report and leaving the place for more
    initialization code in a hope we could merge it with
    report__setup_sample_type one day.

    Reported-by: Milian Wolff <[email protected]>
    Signed-off-by: Jiri Olsa <[email protected]>
    Tested-by: Milian Wolff <[email protected]>
    Cc: David Ahern <[email protected]>
    Cc: Namhyung Kim <[email protected]>
    Cc: Peter Zijlstra <[email protected]>
    Link: http://lkml.kernel.org/r/[email protected]
    Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>

diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 7b376d2..105332e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -1561,6 +1561,22 @@ static int have_cmd(int argc, const char **argv)
        return 0;
 }

+static void script__setup_sample_type(struct perf_script *script)
+{
+       struct perf_session *session = script->session;
+       u64 sample_type = perf_evlist__combined_sample_type(session->evlist);
+
+       if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
+               if ((sample_type & PERF_SAMPLE_REGS_USER) &&
+                   (sample_type & PERF_SAMPLE_STACK_USER))
+                       callchain_param.record_mode = CALLCHAIN_DWARF;
+               else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+                       callchain_param.record_mode = CALLCHAIN_LBR;
+               else
+                       callchain_param.record_mode = CALLCHAIN_FP;
+       }
+}
+
 int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
 {
        bool show_full_info = false;
@@ -1849,6 +1865,7 @@ int cmd_script(int argc, const char **argv, const char *prefix __maybe_unused)
                goto out_delete;

        script.session = session;
+       script__setup_sample_type(&script);

        session->itrace_synth_opts = &itrace_synth_opts;

By backporting this support from the 4.3 version of perf, we were able to support generation of Flame Graphs with our 4.1 production kernel tooling.

Conclusion

The moral of the story is: don’t count on well publicized perf features working on your older kernel. It is just as important to backport updates to the userspace perf tools as it is to backport updates for the production kernel itself.

Git workflow for upstreaming patches from a vendor kernel

Background

Upstreaming patches from a vendor kernel is a constant process of trying to get on board a train, falling behind, and hopping back on again.

Typically, a vendor kernel is based on an older version of the kernel. As a result, one has to forward port a series of patches against the latest kernel mainline. This is seldom a pain free affair, since the patches may not apply without manual edits. For example, the kernel interfaces may have changed, patches merged into mainline could have conflicting changes, and so on. On top of all of this, you have the reality that this task will need to be performed a number of times while you track mainline.

Basics

I usually perform cherry-picking of each patch on top of the mainline branch from the vendor branch, which is usually based on some old stable point release. Let’s assume that there are tags pointing to the vendor and stable commits, this reports the number of patches on top of that stable release.

[email protected]:~/linux (master)$ git describe stable
v4.1.15

pa[email protected]:~/linux (master)$ git describe vendor
v4.1.15-926-gea8c225

In this case, the patches are against stable v4.1.15 and there are 926 patches
on top of it; the format of the describe label is <tag>-<#-of-patches>-g<commit>

Workflow

First we get our needed information on the master branch.

[email protected]:~/linux (master)$ git describe master
v4.8-rc8-13-g53061af

Our master is today’s mainline kernel (v4.8-rc8) with just 13 patches on top of it. The problem with cherry-picking and manual editing is that when you edit the patch the commit id changes since the contents of the patch changes. We need a way to have a list of patches to cherry-pick, iteratively apply them, and manually fix any problems.

[email protected]:~/linux (master)$ git checkout --track -b work master
Checking out files: 100% (33262/33262), done.
Branch work set up to track local branch master.
Switched to a new branch 'work'

[email protected]:~/linux (work)$ git log --reverse --oneline stable..vendor >patchlist.txt

We create file with a list of the commits we want to apply on the work branch.

[email protected]:~/linux (work)$ cat patchlist.txt | head -n 2
ff24250 ppc: Make number of GPIO pins configurable
e4d443c pci/pciehp: Allow polling/irq mode to be decided on a per-port basis

This is the standard oneline format of git log (in reverse since we want the list to be in chronological order). If we were to do this manually, we’d have to do it like this:

[email protected]:~/linux (work)$ git cherry-pick ff24250 

If the cherry-pick is successful, we can proceed with the next one and so on. Otherwise, we have to manually fix it and issue git cherry-pick –continue Why not automate this by picking out the commit from the list and work iteratively? We can’t simply use commit IDs because the commit ID changes after every edit. The following cherry-pick-list.sh script does the heavy lifting of picking out the commits for us. Given the patchlist file, it will git cherry-pick each commit in the list. However, it will skip the already applied commits which match the description. It does not consider commit IDs since those might have changed.

#!/bin/bash

# get top
top=`git log --oneline HEAD^..HEAD | head -n1`
ctop=`echo ${top} | cut -d' ' -f1`
dtop=`echo ${top} | cut -d' ' -f2-`
l="$ctop $dtop"
if [ "$l" != "$top" ] ; then
        echo "Reconstructed top failure"
        echo $top
        echo $l
        exit 5
fi

# get list of commits and descriptions
old_IFS=${IFS}
IFS=$'n'

j=0
for i in `grep -v '^#' $1`; do
        c[${j}]=`echo ${i} | cut -d' ' -f1`
        d[${j}]=`echo ${i} | cut -d' ' -f2-`
        l="${c[${j}]} ${d[${j}]}"
        if [ $l != $i ] ; then
                echo "Reconstructed changeset failure $i"
                exit 5
        fi
        ((j++))
done
last=$((j - 1))
IFS=${old_IFS}

# skip over patches that are applied (checking description only)
match=0
for i in `seq 0 $last`; do
        ct=${c[${i}]}
        dt=${d[${i}]}
        if [ "${dt}" == "${dtop}" ]; then
                echo "Match found at $i: $dt"
                match=$(($i + 1))
                break;
        fi
        # echo "$i: $ct $dt"
done

for i in `seq $match $last`; do
        ct=${c[${i}]}
        dt=${d[${i}]}
        echo "cherry-picking: $i: $ct $dt"
        git cherry-pick $ct
        if [ $? -ne 0 ] ; then
                exit 5;
        fi
done

It makes sense to work on a simplified example using the kernel’s README file.

[email protected]:~/linux (work)$ git checkout --track -b foo master
Switched to branch 'foo'
Your branch is up-to-date with 'master'.

Edit the README file resulting to the following patch:

[email protected]:~/linux (foo)$ git diff
diff --git a/README b/README
index a24ec89..947fe6c 100644
--- a/README
+++ b/README
@@ -6,6 +6,8 @@ kernel, and what to do if something goes wrong.

WHAT IS LINUX?

+  foo
+
Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across
the Net. It aims towards POSIX and Single UNIX Specification compliance.

Commit the change:

[email protected]:~/linux (foo)$ git commit -m 'foo description'
[foo 4b5f122b] foo description
1 file changed, 2 insertions(+)

List the patches on top of master in sequence:

[email protected]:~/linux (foo)$ git log --oneline --reverse master..foo
4b5f122b foo description

Let’s create a new bar branch:

[email protected]:~/linux (foo)$ git checkout --track -b bar master
Switched to branch 'bar'
Your branch is up-to-date with 'master'.

Edit the README file resulting to the following patch:

[email protected]:~/linux (bar)$ git diff
diff --git a/README b/README
index a24ec89..4e7043c 100644
--- a/README
+++ b/README
@@ -6,6 +6,8 @@ kernel, and what to do if something goes wrong.

WHAT IS LINUX?

+  bar
+
Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across
the Net. It aims towards POSIX and Single UNIX Specification compliance.

Note that this conflicts with the foo patch, we will need to manually fix it later.

Commit the change:

[email protected]:~/linux (bar)$ git commit -m 'bar description'
[bar aba1679] bar description
 1 file changed, 2 insertions(+)

Make another commit that is conflict free:

[email protected]:~/linux (bar)$ git diff

diff --git a/README b/README
index 2788bfc..fbdf488 100644 
--- a/README
+++ b/README 
@@ -412,3 +412,4 @@ IF SOMETHING GOES WRONG:
gdb'ing a non-running kernel currently fails because gdb (wrongly)
disregards the starting offset for which the kernel is compiled.

+   more bar

Switch back to the foo branch to apply the changes in the bar branch.

[email protected]:~/linux (bar)$ git checkout foo
Switched to branch 'foo'
Your branch is ahead of 'master' by 1 commit.
  (use "git push" to publish your local commits)

Generate the patchlist file:

[email protected]:~/linux (foo)$ git log --oneline --reverse master..bar | tee patchlist.txt
22d7ac6 bar description
2be1bbb more bar description

Apply them using the script:

[email protected]:~/linux (foo)$ cherry-pick-list.sh patchlist.txt  
cherry-picking: 0: 22d7ac6 bar description
error: could not apply 22d7ac6... bar description
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit'
Recorded preimage for 'README'

[email protected]:~/linux (foo)$ git diff
diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,11 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

++<<<<<<< HEAD
+  foo
++=======
+   bar
++>>>>>>> 22d7ac6... bar description

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

Edit and fix it to look like this:

[email protected]:~/linux (foo)$ git diff
diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,8 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

+  foo
+   bar

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

[email protected]:~/linux (foo)$ git add README

Edit the commit message (leaving the conflict or removing the Conflicts: tag)

[email protected]:~/linux (foo)$ git cherry-pick continue
Recorded resolution for 'README'.
[foo cc9dc34] bar description
1 file changed, 1 insertion(+)

Note the Recorded resolution line. Next time we will perform the same operation so we don’t have to repeat the manual fix. Run the script again to pick up the rest of the patchlist.

[email protected]:~/linux (foo)$ git cherry-pick continue
Match found at 0: bar description
cherry-picking: 1: 2be1bbb more bar description
[foo 63c1973] more bar description
 1 file changed, 1 insertion(+)

Note the message Match found at 0:. The script picked up that the first commit has been applied (albeit manually edited) and continued with the rest, which apply without problems. List the patches on top of master on the foo branch. Note that the commit ids of the patch sequence have changed.

[email protected]:~/linux (foo)$ git log --oneline --reverse master..
4b5f122b foo description 
cc9dc34 bar description
63c1973 more bar description

Now if we reset the foo branch back to the starting point:

[email protected]:~/linux (foo)$ git reset --hard HEAD^^
HEAD is now at 4b5f122b foo description

Try to apply the patchlist again to see what happens:

[email protected]:~/linux (foo)$ cherry-pick-list.sh patchlist.txt
cherry-picking: 0: 22d7ac6 bar description
error: could not apply 22d7ac6... bar description
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add <paths>' or 'git rm <paths>'
hint: and commit the result with 'git commit' 
Resolved 'README' using previous resolution.

Note the Resolved ‘README’ using previous resolution. This means that git determined that we are trying to perform the same edit and already made the change for us. Of course, it didn’t commit the change so that we have a chance to verify that it is correct.

[email protected]:~/linux (foo)$ diff --cc README
index 947fe6c,4e7043c..0000000
--- a/README
+++ b/README
@@@ -6,7 -6,7 +6,8 @@@ kernel, and what to do if something goe

WHAT IS LINUX?

+  foo
+   bar

Linux is a clone of the operating system Unix, written from scratch by
Linus Torvalds with assistance from a loosely-knit team of hackers across

Just add the changed file as earlier:

[email protected]:~/linux (foo)$ git add README 
[email protected]:~/juniper/linux-medatom.git (foo)$ git cherry-pick --continue
[foo 2586dba] bar description
 1 file changed, 1 insertion(+)

Use the script again and end up at the same result:

[email protected]:~/linux (foo)$ ./cherry-pick-list.sh patchlist.txt 
Match found at 0: bar description
cherry-picking: 1: 2be1bbb more bar description
[foo 6641dd4] more bar description
 1 file changed, 1 insertion(+)

I’m not a regular git guru but I found out that this small script ended up saving me a large amount of repetitive work. I hope someone else will find this useful.