1. 20 Feb, 2019 2 commits
    • perf report: Fix wrong iteration count in --branch-history · 394fc1c6
      Jin Yao authored
      [ Upstream commit a3366db0 ]
      
      By calculating the removed loops, we can get the iteration count.
      
      But the iteration count could be reported incorrectly, reporting
      impossibly high counts.
      
      That's because the previous code used the number of removed LBR entries
      as the iteration count, which is wrong. Fix this by incrementing the
      iteration count each time a loop is detected.
      
      When chains are matched, their iteration counts are added up, so we
      need to compute the average value when printing.
      
      For example,
      
        $ perf report --branch-history --stdio --no-children
      
      Before:
      
        ---f2 +0
           |
           |--33.62%--f1 +9 (cycles:1)
           |          f1 +0
           |          main +22 (cycles:1)
           |          main +17
           |          main +38 (cycles:1)
           |          main +27
           |          f1 +26 (cycles:1)
           |          f1 +24
           |          f2 +27 (cycles:7)
           |          f2 +0
           |          f1 +19 (cycles:1)
           |          f1 +14
           |          f2 +27 (cycles:11)
           |          f2 +0
           |          f1 +9 (cycles:1 iter:2968 avg_cycles:3)
           |          f1 +0
           |          main +22 (cycles:1 iter:2968 avg_cycles:3)
           |          main +17
           |          main +38 (cycles:1 iter:2968 avg_cycles:3)
      
      2968 is an impossibly high iteration count and avg_cycles is too small.
      
      After:
      
        ---f2 +0
           |
           |--33.62%--f1 +9 (cycles:1)
           |          f1 +0
           |          main +22 (cycles:1)
           |          main +17
           |          main +38 (cycles:1)
           |          main +27
           |          f1 +26 (cycles:1)
           |          f1 +24
           |          f2 +27 (cycles:7)
           |          f2 +0
           |          f1 +19 (cycles:1)
           |          f1 +14
           |          f2 +27 (cycles:11)
           |          f2 +0
           |          f1 +9 (cycles:1 iter:1 avg_cycles:23)
           |          f1 +0
           |          main +22 (cycles:1 iter:1 avg_cycles:23)
           |          main +17
           |          main +38 (cycles:1 iter:1 avg_cycles:23)
      
      avg_cycles:23 is the average cycles of this iteration.
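
The counting scheme described above can be sketched roughly as follows. This is a minimal illustration, not the perf code: `struct loop_stats`, its fields, and the helper names are hypothetical, and the `samples` field is an assumed way to track how many chains were merged.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-entry statistics; not the actual perf data structures. */
struct loop_stats {
    uint64_t cycles_count; /* cycles accumulated over all detected loops */
    uint64_t iter_count;   /* number of loop iterations detected */
    uint64_t samples;      /* number of chains merged into this entry */
};

/* One loop was detected: bump the iteration count by exactly one,
 * instead of adding the number of LBR entries that were removed. */
void loop_detected(struct loop_stats *st, uint64_t loop_cycles)
{
    st->iter_count += 1;
    st->cycles_count += loop_cycles;
}

/* Matching chains are merged by adding their counters up. */
void merge_chain(struct loop_stats *dst, const struct loop_stats *src)
{
    dst->cycles_count += src->cycles_count;
    dst->iter_count += src->iter_count;
    dst->samples += src->samples;
}

/* Averages are only computed at print time. */
uint64_t avg_cycles(const struct loop_stats *st)
{
    return st->iter_count ? st->cycles_count / st->iter_count : 0;
}

uint64_t avg_iter(const struct loop_stats *st)
{
    return st->samples ? st->iter_count / st->samples : 0;
}
```

With one detected loop of 23 cycles, `avg_cycles()` returns 23, which matches the shape of the "After" output.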
      
      Fixes: c4ee0625 ("perf report: Calculate the average cycles of iterations")
      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1546582230-17507-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf stat: Fix endless wait for child process · 3902b972
      Jin Yao authored
      [ Upstream commit 8a99255a ]
      
      We hit a 'perf stat' issue when using the following script:
      
        #!/bin/bash
      
        sleep 1000 &
        exec perf stat -a -e cycles -I1000 -- sleep 5
      
      Since "perf stat" is launched by exec, the "sleep 1000" would be the
      child process of "perf stat". The wait4() call will not return because
      it's waiting for the child process "sleep 1000" to end. So 'perf stat'
      doesn't return even after 5s passes.
      
      This patch lets 'perf stat' return when the specified child process ends
      (in this case, the specified child process is "sleep 5").
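
The idea of waiting only for the specified child can be sketched as below. This is an illustration under assumptions, not the actual patch: `wait_for_workload()` is a hypothetical helper that uses the standard `waitpid()` call with an explicit PID, so that other children inherited across exec are ignored.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Wait only for the given child, ignoring any other children of this
 * process. Returns the child's exit status, or -1 on error or if the
 * child did not exit normally.
 */
int wait_for_workload(pid_t child_pid)
{
    int status;

    for (;;) {
        pid_t ret = waitpid(child_pid, &status, 0);

        if (ret == child_pid)
            break;
        if (ret < 0 && errno == EINTR)
            continue; /* interrupted by a signal, retry */
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Passing the workload's PID rather than waiting for "any child" is what keeps an unrelated background process from delaying the return.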
      
      Committer testing:
      
        # cat test.sh
        #!/bin/bash
      
        sleep 10 &
        exec perf stat -a -e cycles -I1000 -- sleep 5
        #
      
      Before:
      
        # time ./test.sh
        #           time             counts unit events
             1.001113090        108,453,351      cycles
             2.002062196        142,075,435      cycles
             3.002896194        164,801,068      cycles
             4.003731666        107,062,140      cycles
             5.002068867        112,241,832      cycles
      
        real	0m10.066s
        user	0m0.016s
        sys	0m0.101s
        #
      
      After:
      
        # time ./test.sh
        #           time             counts unit events
             1.001016096         91,412,027      cycles
             2.002014963        124,063,708      cycles
             3.002883964        125,993,929      cycles
             4.003706470        120,465,734      cycles
             5.002006778        163,560,355      cycles
      
        real	0m5.123s
        user	0m0.014s
        sys	0m0.105s
        #

      Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
      Reviewed-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Kan Liang <kan.liang@linux.intel.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/1546501245-4512-1-git-send-email-yao.jin@linux.intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 15 Feb, 2019 1 commit
  3. 12 Feb, 2019 19 commits
  4. 06 Feb, 2019 1 commit
  5. 31 Jan, 2019 1 commit
    • x86/selftests/pkeys: Fork() to check for state being preserved · 940343c7
      Dave Hansen authored
      commit e1812933 upstream.
      
      There was a bug where the per-mm pkey state was not being preserved across
      fork() in the child.  fork() is performed in the pkey selftests, but all of
      the pkey activity is performed in the parent.  The child does not perform
      any actions sensitive to pkey state.
      
      To make the test more sensitive to these kinds of bugs, add a fork() where
      the parent exits, and execution continues in the child.
      
      To achieve this let the key exhaustion test not terminate at the first
      allocation failure and fork after 2*NR_PKEYS loops and continue in the
      child.

      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: mpe@ellerman.id.au
      Cc: will.deacon@arm.com
      Cc: luto@kernel.org
      Cc: jroedel@suse.de
      Cc: stable@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Link: https://lkml.kernel.org/r/20190102215657.585704B7@viggo.jf.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  6. 26 Jan, 2019 14 commits
    • perf tools: Add missing open_memstream() prototype for systems lacking it · 22fd9239
      Arnaldo Carvalho de Melo authored
      [ Upstream commit d7a8c4a6 ]
      
      Some systems, such as the Android NDK API level 24, have the
      open_memstream() function but don't provide a prototype, adding noise
      to the build:
      
        builtin-timechart.c: In function 'cat_backtrace':
        builtin-timechart.c:486:2: warning: implicit declaration of function 'open_memstream' [-Wimplicit-function-declaration]
          FILE *f = open_memstream(&p, &p_len);
          ^
        builtin-timechart.c:486:2: warning: nested extern declaration of 'open_memstream' [-Wnested-externs]
        builtin-timechart.c:486:12: warning: initialization makes pointer from integer without a cast
          FILE *f = open_memstream(&p, &p_len);
                    ^
      
      Define a LACKS_OPEN_MEMSTREAM_PROTOTYPE define so that code needing that
      can get a prototype.
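
A guarded prototype of this kind might look like the sketch below. It assumes the build system passes `-DLACKS_OPEN_MEMSTREAM_PROTOTYPE` when it detects the missing declaration; `format_frame()` is a hypothetical helper added only to exercise the call.

```c
#define _POSIX_C_SOURCE 200809L /* ask libc headers for open_memstream() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Assumption: the build defines LACKS_OPEN_MEMSTREAM_PROTOTYPE for a libc
 * that ships open_memstream() without declaring it. In that case we
 * supply the declaration ourselves.
 */
#ifdef LACKS_OPEN_MEMSTREAM_PROTOTYPE
extern FILE *open_memstream(char **ptr, size_t *sizeloc);
#endif

/* Hypothetical helper: format "sym+0xoff" into a malloc()ed string. */
char *format_frame(const char *sym, unsigned long offset)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *f = open_memstream(&buf, &len);

    if (!f)
        return NULL;
    fprintf(f, "%s+0x%lx", sym, offset);
    fclose(f);  /* flushes and finalizes buf and len */
    return buf; /* caller frees */
}
```

On a libc that already declares the function, the macro is simply left undefined and the block compiles unchanged.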
      
      Checked in the bionic git repo to be available since level 23:
      
      https://android.googlesource.com/platform/bionic/+/master/libc/include/stdio.h#241
      
        FILE* open_memstream(char** __ptr, size_t* __size_ptr) __INTRODUCED_IN(23);
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-343ashae97e5bq6vizusyfno@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf tools: Add missing sigqueue() prototype for systems lacking it · f5033ca8
      Arnaldo Carvalho de Melo authored
      [ Upstream commit 748fe088 ]
      
      Some systems, such as the Android NDK API level 24, have the
      sigqueue() function but don't provide a prototype, adding noise to the
      build:
      
        util/evlist.c: In function 'perf_evlist__prepare_workload':
        util/evlist.c:1494:4: warning: implicit declaration of function 'sigqueue' [-Wimplicit-function-declaration]
            if (sigqueue(getppid(), SIGUSR1, val))
            ^
        util/evlist.c:1494:4: warning: nested extern declaration of 'sigqueue' [-Wnested-externs]
      
      Define a LACKS_SIGQUEUE_PROTOTYPE define so that code needing that can
      get a prototype.
      
      Checked in the bionic git repo to be available since level 23:
      
      https://android.googlesource.com/platform/bionic/+/master/libc/include/signal.h#123
      
        int sigqueue(pid_t __pid, int __signal, const union sigval __value) __INTRODUCED_IN(23);
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Link: https://lkml.kernel.org/n/tip-lmhpev1uni9kdrv7j29glyov@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf cs-etm: Correct packets swapping in cs_etm__flush() · d6404de9
      Leo Yan authored
      [ Upstream commit 43fd5666 ]
      
      The cs_etm_queue structure uses 'prev_packet' to point to the previous
      packet, which can be combined with the newly arriving packet to
      generate samples.
      
      The cs_etm__flush() function swaps packets only when the flag
      'etm->synth_opts.last_branch' is true, so it does not swap packets
      when the '--itrace=il' option (which generates last branch entries) is
      absent. In that case 'prev_packet' doesn't point to the correct
      previous packet, and the stale packet is still used to generate
      subsequent samples. Thus, when dumping the trace with the 'perf script'
      command, we see an incorrect flow with the stale packet's address info.
      
      This patch corrects packet swapping in cs_etm__flush(): besides the
      flag 'etm->synth_opts.last_branch' it also checks the flag
      'etm->sample_branches', and if either flag is true it swaps packets so
      that the correct content is saved to 'prev_packet'. This fixes the
      wrong program flow dump.
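
The corrected condition can be illustrated with the sketch below. The structures are simplified stand-ins that only borrow the field names mentioned in the message, and `sketch_flush()` is a hypothetical function, not cs_etm__flush() itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for the perf structures named in the commit. */
struct packet { uint64_t start_addr; };
struct synth_opts { bool last_branch; };
struct etm_auxtrace {
    struct synth_opts synth_opts;
    bool sample_branches;
};
struct etm_queue {
    struct etm_auxtrace *etm;
    struct packet *prev_packet;
    struct packet *packet;
};

void sketch_flush(struct etm_queue *etmq)
{
    struct etm_auxtrace *etm = etmq->etm;

    /* ... samples for the pending packet would be generated here ... */

    /* Swap when either kind of sample is requested, so prev_packet
     * never keeps pointing at a stale packet. */
    if (etm->synth_opts.last_branch || etm->sample_branches) {
        struct packet *tmp = etmq->packet;

        etmq->packet = etmq->prev_packet;
        etmq->prev_packet = tmp;
    }
}
```

With only `sample_branches` set, the swap still happens, which is the case the original condition missed.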
      
      The patch also does a minor refactoring to use
      'etm->synth_opts.last_branch' instead of
      'etmq->etm->synth_opts.last_branch' for condition checking, which is
      consistent with what is done in cs_etm__sample().

      Signed-off-by: Leo Yan <leo.yan@linaro.org>
      Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mike Leach <mike.leach@linaro.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Robert Walker <robert.walker@arm.com>
      Cc: coresight@lists.linaro.org
      Cc: linux-arm-kernel@lists.infradead.org
      Link: http://lkml.kernel.org/r/1544513908-16805-2-git-send-email-leo.yan@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools lib subcmd: Don't add the kernel sources to the include path · dfa71e42
      Arnaldo Carvalho de Melo authored
      [ Upstream commit ece98049 ]
      
      At some point we decided not to directly include kernel source files
      when building tools/perf/, but when tools/lib/subcmd/ was forked from
      tools/perf it somehow ended up adding it via these two lines in its
      Makefile:
      
        CFLAGS += -I$(srctree)/include/uapi
        CFLAGS += -I$(srctree)/include
      
      As $(srctree) points to the kernel sources.
      
      Removing those lines and keeping just:
      
        CFLAGS += -I$(srctree)/tools/include/
      
      Is enough to build tools/perf and tools/objtool.
      
      This fixes the build when building from the sources in environments such
      as the Android NDK crossbuilding from a fedora:26 system:
      
        subcmd-util.h:11:15: error: expected ',' or ';' before 'void'
         static inline void report(const char *prefix, const char *err, va_list params)
                       ^
        In file included from /git/perf/include/uapi/linux/stddef.h:2:0,
                         from /git/perf/include/uapi/linux/posix_types.h:5,
                         from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h:36,
                         from /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/unistd.h:33,
                         from run-command.c:2:
        subcmd-util.h:18:17: error: '__no_instrument_function__' attribute applies only to functions
      
      The /opt/android-ndk-r12b/platforms/android-24/arch-arm/usr/include/sys/types.h
      file that includes linux/posix_types.h ends up getting the one in the kernel
      sources causing the breakage. Fix it.
      
      Test built tools/objtool/ too.

      Reported-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Jiri Olsa <jolsa@kernel.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 4b6ab94e ("perf subcmd: Create subcmd library")
      Link: https://lkml.kernel.org/n/tip-5lhaoecrj12t0bqwvpiu14sm@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf stat: Avoid segfaults caused by negated options · 77d1d83b
      Michael Petlan authored
      [ Upstream commit 51433ead ]
      
      Some 'perf stat' options make no sense when negated (event, cgroup),
      and some have no negated path implemented (metrics). It is therefore
      better to disable the "no-" prefix for them, since otherwise the later
      option parsing segfaults.
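
Refusing the "no-" prefix for selected options can be sketched as follows. Everything here is hypothetical (the `SKETCH_OPT_NONEG` flag, `struct sketch_option`, and `sketch_parse_long()`); it only illustrates the idea of a per-option "cannot be negated" flag and is not the perf option parser.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_OPT_NONEG 1 /* hypothetical flag: "--no-<name>" is refused */

struct sketch_option {
    const char *long_name;
    int flags;
};

/*
 * Look up a long option (given without the leading "--").
 * Returns 1 on a match, -1 if a negated form was refused, 0 if unknown.
 */
int sketch_parse_long(const struct sketch_option *opts, size_t n,
                      const char *arg, int *negated)
{
    size_t i;

    /* An exact match always wins, even if the name starts with "no-". */
    for (i = 0; i < n; i++) {
        if (strcmp(opts[i].long_name, arg) == 0) {
            *negated = 0;
            return 1;
        }
    }
    if (strncmp(arg, "no-", 3) != 0)
        return 0;
    for (i = 0; i < n; i++) {
        if (strcmp(opts[i].long_name, arg + 3) != 0)
            continue;
        if (opts[i].flags & SKETCH_OPT_NONEG)
            return -1; /* report an error instead of crashing later */
        *negated = 1;
        return 1;
    }
    return 0;
}
```

Rejecting the negated form at lookup time turns the crash into an ordinary usage error.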
      
      Before:
      
        $ perf stat --no-metrics -- ls
        Segmentation fault (core dumped)
      
      After:
      
        $ perf stat --no-metrics -- ls
         Error: option `no-metrics' isn't available
         Usage: perf stat [<options>] [<command>]

      Signed-off-by: Michael Petlan <mpetlan@redhat.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      LPU-Reference: 1485912065.62416880.1544457604340.JavaMail.zimbra@redhat.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf vendor events intel: Fix Load_Miss_Real_Latency on SKL/SKX · f633da09
      Andi Kleen authored
      [ Upstream commit 91b2b970 ]
      
      Fix incorrect event names for the Load_Miss_Real_Latency metric for
      Skylake and Skylake Server.
      
      Fixes https://github.com/andikleen/pmu-tools/issues/158
      
      Before:
      
        % perf stat -M Load_Miss_Real_Latency true
        event syntax error: '..ss.pending,mem_load_retired.l1_miss_ps,mem_load_retired.fb_hit_ps}:W'
                                          \___ parser error
      
         Usage: perf stat [<options>] [<command>]
      
            -M, --metrics <metric/metric group list>
                                  monitor specified metrics or metric groups (separated by ,)
      
      After:
      
        % perf stat -M Load_Miss_Real_Latency true
      
         Performance counter stats for 'true':
      
                   279,204      l1d_pend_miss.pending     #     14.0 Load_Miss_Real_Latency
                     4,784      mem_load_uops_retired.l1_miss
                    15,188      mem_load_uops_retired.hit_lfb
      
               0.000899640 seconds time elapsed

      Signed-off-by: Andi Kleen <ak@linux.intel.com>
      Acked-by: Jiri Olsa <jolsa@kernel.org>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Link: http://lkml.kernel.org/r/20181120050635.4215-1-andi@firstfloor.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf parse-events: Fix unchecked usage of strncpy() · 10d02f78
      Arnaldo Carvalho de Melo authored
      [ Upstream commit bd8d57fb ]
      
      The strncpy() function may leave the destination string buffer
      unterminated, so it is better to use strlcpy(), for which we have a
      __weak fallback implementation on systems without it.
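
For readers unfamiliar with strlcpy() semantics, here is a minimal sketch of what such a fallback does. `sketch_strlcpy()` is a hypothetical name chosen to avoid clashing with a libc-provided strlcpy(); it is not the __weak fallback from the perf tree.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most size bytes and always terminating
 * dst when size is non-zero. Returns strlen(src), so a return value
 * greater than or equal to size signals truncation.
 */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);

    if (size) {
        size_t n = len >= size ? size - 1 : len;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

Unlike strncpy() with a bound equal to the buffer size, the destination is terminated even when the source is too long.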
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        util/parse-events.c: In function 'print_symbol_events':
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2508:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        In function 'print_symbol_events.constprop',
            inlined from 'print_events' at util/parse-events.c:2511:2:
        util/parse-events.c:2465:4: error: 'strncpy' specified bound 100 equals destination size [-Werror=stringop-truncation]
            strncpy(name, syms->symbol, MAX_NAME_LEN);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        cc1: all warnings being treated as errors
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Fixes: 947b4ad1 ("perf list: Fix max event string size")
      Link: https://lkml.kernel.org/n/tip-b663e33bm6x8hrkie4uxh7u2@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf svghelper: Fix unchecked usage of strncpy() · 5d9435e2
      Arnaldo Carvalho de Melo authored
      [ Upstream commit 2f530253 ]
      
      The strncpy() function may leave the destination string buffer
      unterminated, so it is better to use strlcpy(), for which we have a
      __weak fallback implementation on systems without it.
      
      In this specific case this would only happen if fgets() was buggy, as
      its man page states that it should read one less byte than the size of
      the destination buffer, so that it can put the nul byte at the end of
      it, so it would never copy 255 non-nul chars, as fgets reads into the
      orig buffer at most 254 non-nul chars and terminates it. But let's just
      switch to strlcpy to keep the original intent and silence the gcc 8.2
      warning.
      
      This fixes this warning on an Alpine Linux Edge system with gcc 8.2:
      
        In function 'cpu_model',
            inlined from 'svg_cpu_box' at util/svghelper.c:378:2:
        util/svghelper.c:337:5: error: 'strncpy' output may be truncated copying 255 bytes from a string of length 255 [-Werror=stringop-truncation]
             strncpy(cpu_m, &buf[13], 255);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Fixes: f48d55ce ("perf: Add a SVG helper library file")
      Link: https://lkml.kernel.org/n/tip-xzkoo0gyr56gej39ltivuh9g@git.kernel.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf tests ARM: Disable breakpoint tests 32-bit · 389cde57
      Florian Fainelli authored
      [ Upstream commit 24f96733 ]
      
      The breakpoint tests on the ARM 32-bit kernel are broken in several
      ways.
      
      The breakpoint length requested does not necessarily match whether the
      function address has the Thumb bit (bit 0) set or not, and this does
      matter to the ARM kernel hw_breakpoint infrastructure. See [1] for
      background.
      
      [1]: https://lkml.org/lkml/2018/11/15/205
      
      As Will indicated, the overflow handling would require single-stepping
      which is not supported at the moment. Just disable those tests for the
      ARM 32-bit platforms and update the comment above to explain these
      limitations.

      Co-developed-by: Will Deacon <will.deacon@arm.com>
      Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
      Signed-off-by: Will Deacon <will.deacon@arm.com>
      Acked-by: Jiri Olsa <jolsa@redhat.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Link: http://lkml.kernel.org/r/20181203191138.2419-1-f.fainelli@gmail.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf intel-pt: Fix error with config term "pt=0" · 7eb6443f
      Adrian Hunter authored
      [ Upstream commit 1c6f709b ]
      
      Users should never use 'pt=0', but if they do it may give a meaningless
      error:
      
      	$ perf record -e intel_pt/pt=0/u uname
      	Error:
      	The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
      	event (intel_pt/pt=0/u).
      
      Fix that by forcing 'pt=1'.
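
Forcing the term can be pictured with the sketch below. It assumes that the "pt" term maps to bit 0 of the event config, and `sketch_fixup_pt_config()` is a hypothetical helper, not the function changed by this commit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PT_BIT (1ULL << 0) /* assumption: "pt" maps to config bit 0 */

/* Warn about a cleared "pt" bit and force it back on. */
uint64_t sketch_fixup_pt_config(uint64_t config)
{
    if (!(config & SKETCH_PT_BIT)) {
        fprintf(stderr, "pt=0 doesn't make sense, forcing pt=1\n");
        config |= SKETCH_PT_BIT;
    }
    return config;
}
```

Other config bits are left untouched, so only the meaningless setting is overridden.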
      
      Committer testing:
      
        # perf record -e intel_pt/pt=0/u uname
        Error:
        The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (intel_pt/pt=0/u).
        /bin/dmesg | grep -i perf may provide additional information.
      
        # perf record -e intel_pt/pt=0/u uname
        pt=0 doesn't make sense, forcing pt=1
        Linux
        [ perf record: Woken up 1 times to write data ]
        [ perf record: Captured and wrote 0.020 MB perf.data ]
        #

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Link: http://lkml.kernel.org/r/b7c5b4e5-9497-10e5-fd43-5f3e4a0fe51d@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tools lib traceevent: Fix compile warnings in tools/lib/traceevent/event-parse.c · 7dd70b09
      Adrian Hunter authored
      [ Upstream commit 0631ca3a ]
      
      Fix the following warnings:
      
        event-parse.c: In function ‘tep_find_event_by_name’:
        event-parse.c:3521:21: warning: ‘event’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          pevent->last_event = event;
          ~~~~~~~~~~~~~~~~~~~^~~~~~~
          CC       ui/gtk/hists.o
          LINK     plugin_mac80211.so
          CC       nlattr.o
        event-parse.c: In function ‘tep_data_lat_fmt’:
        event-parse.c:5200:4: warning: ‘migrate_disable’ may be used uninitialized in this function [-Wmaybe-uninitialized]
            trace_seq_printf(s, "%d", migrate_disable);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        event-parse.c:5207:4: warning: ‘lock_depth’ may be used uninitialized in this function [-Wmaybe-uninitialized]
            trace_seq_printf(s, "%d", lock_depth);
            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
          LINK     plugin_sched_switch.so
          LINK     plugin_function.so
          LINK     plugin_xen.so
        event-parse.c: In function ‘tep_event_info’:
        event-parse.c:5047:7: warning: ‘len_arg’ may be used uninitialized in this function [-Wmaybe-uninitialized]
               trace_seq_printf(s, format, len_arg, (char)val);
               ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        event-parse.c:4884:6: note: ‘len_arg’ was declared here
          int len_arg;
              ^~~~~~~
        event-parse.c:4338:11: warning: ‘vsize’ may be used uninitialized in this function [-Wmaybe-uninitialized]
             val = tep_read_number(pevent, bptr, vsize);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        event-parse.c:4224:6: note: ‘vsize’ was declared here
          int vsize;
              ^~~~~
      
      $ gcc --version
        gcc (Clear Linux OS for Intel Architecture) 8.2.1 20180502

      Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Cc: Tzvetomir Stoyanov (VMware) <tz.stoyanov@gmail.com>
      Link: http://lkml.kernel.org/r/20181122112937.10582-1-adrian.hunter@intel.com
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: relax verifier restriction on BPF_MOV | BPF_ALU · 525cd39f
      Jiong Wang authored
      [ Upstream commit e434b8cd ]
      
      Currently, the destination register is marked as unknown for 32-bit
      sub-register move (BPF_MOV | BPF_ALU) whenever the source register type is
      SCALAR_VALUE.
      
      This is so conservative that some valid cases are rejected. In
      particular, it may turn a constant scalar value into an unknown value,
      which could break some assumptions of the verifier.
      
      For example, test_l4lb_noinline.c has the following C code:
      
          struct real_definition *dst
      
      1:  if (!get_packet_dst(&dst, &pckt, vip_info, is_ipv6))
      2:    return TC_ACT_SHOT;
      3:
      4:  if (dst->flags & F_IPV6) {
      
      get_packet_dst is responsible for initializing "dst" to a valid pointer
      and returning true (1); otherwise it returns false (0). The compiled
      instruction sequence using alu32 will be:
      
        412: (54) (u32) r7 &= (u32) 1
        413: (bc) (u32) r0 = (u32) r7
        414: (95) exit
      
      insn 413, a BPF_MOV | BPF_ALU, however, will turn r0 into an unknown
      value even though r7 contains SCALAR_VALUE 1.
      
      This causes trouble when the verifier is walking the code path that
      hasn't initialized "dst" inside get_packet_dst. In that case 0 is
      returned, and we would expect the verifier to conclude that line 1 in
      the above C code passes the "if" check and therefore to skip the
      fall-through path starting at line 4. But because the r0 returned from
      the callee has become an unknown value, the verifier does not skip
      analyzing the path starting at line 4, and "dst->flags" requires
      dereferencing the pointer "dst", which actually hasn't been initialized
      on this path.
      
      This patch relaxed the code marking sub-register move destination. For a
      SCALAR_VALUE, it is safe to just copy the value from source then truncate
      it into 32-bit.
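
The relaxed rule can be modeled with the toy sketch below. The register state is reduced to a type, a "known constant" flag, and a value; `struct sketch_reg` and `sketch_mov32()` are hypothetical and much simpler than the verifier's real state tracking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_reg_type { SKETCH_SCALAR_VALUE, SKETCH_POINTER };

/* Toy register state: far simpler than the verifier's bookkeeping. */
struct sketch_reg {
    enum sketch_reg_type type;
    bool known;     /* true if value is a known constant */
    uint64_t value;
};

/* Model of a 32-bit sub-register move (BPF_MOV | BPF_ALU). */
void sketch_mov32(struct sketch_reg *dst, const struct sketch_reg *src)
{
    if (src->type == SKETCH_SCALAR_VALUE) {
        /* Copy the source state, then truncate to 32 bits. */
        *dst = *src;
        dst->value = (uint32_t)dst->value;
    } else {
        /* Other source types are still treated as unknown scalars. */
        dst->type = SKETCH_SCALAR_VALUE;
        dst->known = false;
        dst->value = 0;
    }
}
```

In this model a move from a register holding the constant 1 leaves the destination holding the known constant 1, which is what lets the caller's "if" check be resolved.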
      
      A unit test is also included to demonstrate this issue. This test will
      fail before this patch.
      
      This relaxation could let the verifier skip more paths for conditional
      comparisons against an immediate. It also lets the verifier record a
      more accurate/strict value for a register in a given state; if this
      state ends up going through exit without rejection and is used for state
      comparison later, then it is possible that an inaccurate/permissive
      value would have been better. So the real impact on the number of
      instructions processed by the verifier is complex. But overall, without
      this fix, valid programs could be rejected.
      
      From real benchmarking on kernel selftests and Cilium bpf tests, there is
      no impact on the number of processed instructions when tests are compiled
      with default compilation options. There are slight improvements when they
      are compiled with -mattr=+alu32 after this patch.
      
      Also, test_xdp_noinline/-mattr=+alu32 now passes verification. It was
      rejected before this fix.
      
      Insn processed before/after this patch:
      
                              default       -mattr=+alu32
      
      Kernel selftest
      ===
      test_xdp.o              371/371       369/369
      test_l4lb.o             6345/6345     5623/5623
      test_xdp_noinline.o     2971/2971     rejected/2727
      test_tcp_estates.o      429/429       430/430
      
      Cilium bpf
      ===
      bpf_lb-DLB_L3.o:        2085/2085     1685/1687
      bpf_lb-DLB_L4.o:        2287/2287     1986/1982
      bpf_lb-DUNKNOWN.o:      690/690       622/622
      bpf_lxc.o:              95033/95033   N/A
      bpf_netdev.o:           7245/7245     N/A
      bpf_overlay.o:          2898/2898     3085/2947
      
      NOTE:
        - bpf_lxc.o and bpf_netdev.o compiled by -mattr=+alu32 are rejected by
          verifier due to another issue inside verifier on supporting alu32
          binary.
        - Each Cilium bpf program could generate several processed insn
          numbers; the numbers above are their sums.
      
      v1->v2:
       - Restrict the change on SCALAR_VALUE.
       - Update benchmark numbers on Cilium bpf tests.

      Signed-off-by: Jiong Wang <jiong.wang@netronome.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • selftests: do not macro-expand failed assertion expressions · 3fef5905
      Dmitry V. Levin authored
      [ Upstream commit b708a3cc ]
      
      I've stumbled over the current macro-expand behaviour of the test
      harness:
      
      $ gcc -Wall -xc - <<'__EOF__'
      TEST(macro) {
      	int status = 0;
      	ASSERT_TRUE(WIFSIGNALED(status));
      }
      TEST_HARNESS_MAIN
      __EOF__
      $ ./a.out
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != (((signed char) (((status) & 0x7f) + 1) >> 1) > 0) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      With this change the output of the same test looks much more
      comprehensible:
      
      [==========] Running 1 tests from 1 test cases.
      [ RUN      ] global.macro
      <stdin>:4:global.macro:Expected 0 (0) != WIFSIGNALED(status) (0)
      global.macro: Test terminated by assertion
      [     FAIL ] global.macro
      [==========] 0 / 1 tests passed.
      [  FAILED  ]
      
      The issue is very similar to the bug fixed in glibc assert(3)
      three years ago:
      https://sourceware.org/bugzilla/show_bug.cgi?id=18604
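The difference comes down to when the preprocessor stringifies the expression. The following is a minimal sketch, not the harness code itself: `IS_SIGNALED`, `STR_DIRECT`, `STR_INNER`, and `STR_EXPANDED` are hypothetical names chosen for illustration. Applying `#` directly to a macro parameter keeps the text as written, while passing the argument through another macro first lets it expand.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a status macro such as WIFSIGNALED(). */
#define IS_SIGNALED(s) (((s) & 0x7f) != 0)

/* '#' applied directly to the parameter: the argument is NOT expanded. */
#define STR_DIRECT(expr) #expr

/* The argument passes through one macro level before '#' sees it,
 * so it is fully macro-expanded first. */
#define STR_INNER(expr) #expr
#define STR_EXPANDED(expr) STR_INNER(expr)

/* Text of the expression exactly as the test author wrote it. */
static const char *direct_text(void)
{
	return STR_DIRECT(IS_SIGNALED(status));
}

/* Text of the expression after macro expansion. */
static const char *expanded_text(void)
{
	return STR_EXPANDED(IS_SIGNALED(status));
}

/* Print both forms side by side for comparison. */
static void print_both(void)
{
	printf("direct:   %s\n", direct_text());
	printf("expanded: %s\n", expanded_text());
}
```

Calling `print_both()` from a `main()` shows the readable form next to the expanded one. A test harness that stringifies in the outermost macro, before forwarding the arguments, gets the first form.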
      
      Cc: Shuah Khan <shuah@kernel.org>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Andy Lutomirski <luto@amacapital.net>
      Cc: Will Drewry <wad@chromium.org>
      Cc: linux-kselftest@vger.kernel.org
      Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
      Acked-by: Kees Cook <keescook@chromium.org>
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      3fef5905
    • Quentin Monnet's avatar
      selftests/bpf: enable (uncomment) all tests in test_libbpf.sh · 8662b900
      Quentin Monnet authored
      [ Upstream commit f96afa76 ]
      
      libbpf is now able to load successfully test_l4lb_noinline.o and
      samples/bpf/tracex3_kern.o.
      
      For the test_l4lb_noinline, uncomment related tests from test_libbpf.c
      and remove the associated "TODO".
      
      For tracex3_kern.o, instead of loading a program from samples/bpf/ that
      might not have been compiled at this stage, try loading a program from
      BPF selftests. Since this test case is about loading a program compiled
      without the "-target bpf" flag, change the Makefile to compile one
      program accordingly (instead of passing the flag for compiling all
      programs).
      
      Regarding test_xdp_noinline.o: in its current shape the program fails to
      load because it provides no version section, but the loader needs one.
      The test was added to make sure that libbpf could load XDP programs even
      if they do not provide a version number in a dedicated section. But
      libbpf is already capable of doing that: in our case loading fails
      because the loader does not know that this is an XDP program (it does
      not need to, since it does not attach the program). So trying to load
      test_xdp_noinline.o does not bring much here: just delete this subtest.
      
      For the record, the error message obtained with tracex3_kern.o was
      fixed by commit e3d91b0c ("tools/libbpf: handle issues with bpf ELF
      objects containing .eh_frames")
      
      I have not been able to reproduce the "libbpf: incorrect bpf_call
      opcode" error for test_l4lb_noinline.o, even with the version of libbpf
      present at the time when test_libbpf.sh and test_libbpf_open.c were
      created.
      
      RFC -> v1:
      - Compile test_xdp without the "-target bpf" flag, and try to load it
        instead of ../../samples/bpf/tracex3_kern.o.
      - Delete test_xdp_noinline.o subtest.
      
      Cc: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Quentin Monnet <quentin.monnet@netronome.com>
      Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
      Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      8662b900
  7. 13 Jan, 2019 2 commits
    • Shuah Khan's avatar
      selftests: Fix test errors related to lib.mk khdr target · a05257f9
      Shuah Khan authored
      commit 211929fd upstream.
      
      Commit b2d35fa5 ("selftests: add headers_install to lib.mk") added
      khdr target to run headers_install target from the main Makefile. The
      logic uses KSFT_KHDR_INSTALL and top_srcdir as controls to initialize
      variables and include files to run headers_install from the top level
      Makefile. There are a few problems with this logic.
      
      1. Exposes top_srcdir to all tests
      2. Common logic impacts all tests
      3. Uses KSFT_KHDR_INSTALL, top_srcdir, and khdr in an adhoc way. Tests
         add "khdr" dependency in their Makefiles to TEST_PROGS_EXTENDED in
         some cases, and STATIC_LIBS in other cases. This makes this framework
         confusing to use.
      
      The common logic runs for all tests, even when KSFT_KHDR_INSTALL
      isn't defined by the test. top_srcdir is initialized to a default value
      when a test doesn't initialize it. This works for tests without a sub-dir
      structure, but tests with a sub-dir structure fail to build.
      
      e.g: make -C sparc64/drivers/ or make -C drivers/dma-buf
      
      ../../lib.mk:20: ../../../../scripts/subarch.include: No such file or directory
      make: *** No rule to make target '../../../../scripts/subarch.include'.  Stop.
      
      There is no reason to require all tests to define top_srcdir and there is
      no need to require tests to add khdr dependency using adhoc changes to
      TEST_* and other variables.
      
      Fix it with a consistent use of KSFT_KHDR_INSTALL and top_srcdir from tests
      that have the dependency on headers_install.
      
      Change the common logic to define the khdr target, and to make the "all"
      target depend on khdr, only when KSFT_KHDR_INSTALL is defined.
      
      Only tests that depend on headers_install have to define the
      KSFT_KHDR_INSTALL and top_srcdir variables, and there is no need to
      specify the khdr dependency in the test Makefiles.
      
      Fixes: b2d35fa5 ("selftests: add headers_install to lib.mk")
      Cc: stable@vger.kernel.org
      Signed-off-by: Shuah Khan <shuah@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      a05257f9
    • Dan Williams's avatar
      mm, devm_memremap_pages: fix shutdown handling · 6e6a8b24
      Dan Williams authored
      commit a95c90f1 upstream.
      
      The last step before devm_memremap_pages() returns success is to allocate
      a release action, devm_memremap_pages_release(), to tear the entire setup
      down.  However, the result from devm_add_action() is not checked.
      
      Checking the error from devm_add_action() is not enough.  The api
      currently relies on the fact that the percpu_ref it is using is killed by
      the time the devm_memremap_pages_release() is run.  Rather than continue
      this awkward situation, offload the responsibility of killing the
      percpu_ref to devm_memremap_pages_release() directly.  This allows
      devm_memremap_pages() to do the right thing relative to init failures and
      shutdown.
      
      Without this change we could fail to register the teardown of
      devm_memremap_pages().  The likelihood of hitting this failure is tiny as
      small memory allocations almost always succeed.  However, the impact of
      the failure is large given any future reconfiguration, or disable/enable,
      of an nvdimm namespace will fail forever as subsequent calls to
      devm_memremap_pages() will fail to set up the pgmap_radix since there will
      be stale entries for the physical address range.
      
      An argument could be made to require that the ->kill() operation be set in
      the @pgmap arg rather than passed in separately.  However, it helps code
      readability, tracking the lifetime of a given instance, to be able to grep
      the kill routine directly at the devm_memremap_pages() call site.
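The pattern described above can be modelled in userspace as a rough sketch. Every name here (`ref_model`, `mapping_model`, `register_release`, `mapping_setup`, and so on) is hypothetical and none of it is kernel API; it only illustrates the two ideas in the message: the release path kills the reference itself, and a failure to register the release action triggers an immediate teardown instead of leaving stale state behind.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct ref_model {
	bool killed;
};

struct mapping_model {
	struct ref_model *ref;
	void (*kill)(struct ref_model *ref);	/* passed in by the caller */
	bool mapped;
};

/* Example kill routine a caller would pass at the setup call site. */
static void kill_ref(struct ref_model *ref)
{
	ref->killed = true;
}

/* Release path: kill the reference here, then tear the mapping down. */
static void mapping_release(struct mapping_model *m)
{
	m->kill(m->ref);
	m->mapped = false;
}

/* Stand-in for registering a teardown action; failure can be simulated. */
static int register_release(struct mapping_model *m, bool simulate_failure)
{
	(void)m;
	return simulate_failure ? -1 : 0;
}

/* Setup: require a kill routine and check the registration result. */
static int mapping_setup(struct mapping_model *m, struct ref_model *ref,
			 void (*kill)(struct ref_model *ref),
			 bool simulate_failure)
{
	int err;

	if (!kill)
		return -1;	/* a kill routine is mandatory */

	m->ref = ref;
	m->kill = kill;
	m->mapped = true;

	err = register_release(m, simulate_failure);
	if (err) {
		/* Registration failed: undo now so no stale entry remains. */
		mapping_release(m);
		return err;
	}
	return 0;
}
```

Passing `kill_ref` explicitly at the `mapping_setup()` call site mirrors the readability argument in the message: the routine that ends the reference's lifetime is visible where the mapping is created.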
      
      Link: http://lkml.kernel.org/r/154275558526.76910.7535251937849268605.stgit@dwillia2-desk3.amr.corp.intel.com
      Signed-off-by: Dan Williams <dan.j.williams@intel.com>
      Fixes: e8d51348 ("memremap: change devm_memremap_pages interface...")
      Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
      Reported-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      6e6a8b24