1. 05 Apr, 2019 15 commits
    • netfilter: physdev: relax br_netfilter dependency · ebd0f306
      Florian Westphal authored
      [ Upstream commit 8e2f311a ]
      
      Following command:
        iptables -D FORWARD -m physdev ...
      causes connectivity loss in some setups.
      
      Reason is that iptables userspace will probe kernel for the module revision
      of the physdev patch, and physdev has an artificial dependency on
      br_netfilter (xt_physdev use makes no sense unless a br_netfilter module
      is loaded).
      
      This causes the "br_netfilter" module to be loaded, which in turn
      enables the "call-iptables" infrastructure.
      
      bridged packets might then get dropped by the iptables ruleset.
      
      The better fix would be to change the "call-iptables" defaults to 0 and
      enforce explicit setting to 1, but that breaks backwards compatibility.
      
      This does the next best thing: add a request_module call to checkentry.
      This way a stray '-D ... -m physdev' won't activate br_netfilter
      anymore.
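
      A minimal sketch of the checkentry change (the real physdev_mt_check
      also validates the match options):

      	static int physdev_mt_check(const struct xt_mtchk_param *par)
      	{
      		static bool brnf_probed __read_mostly;

      		/* Pull in br_netfilter only when a rule is really added,
      		 * not on a mere revision probe from iptables userspace. */
      		if (!brnf_probed) {
      			brnf_probed = true;
      			request_module("br_netfilter");
      		}
      		return 0;
      	}
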
      Signed-off-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • cgroup/pids: turn cgroup_subsys->free() into cgroup_subsys->release() to fix the accounting · b8498a26
      Oleg Nesterov authored
      [ Upstream commit 51bee5ab ]
      
      The only user of cgroup_subsys->free() callback is pids_cgrp_subsys which
      needs pids_free() to uncharge the pid.
      
      However, ->free() is called from __put_task_struct()->cgroup_free() and this
      is too late. Even the trivial program which does
      
      	for (;;) {
      		int pid = fork();
      		assert(pid >= 0);
      		if (pid)
      			wait(NULL);
      		else
      			exit(0);
      	}
      
      can run out of limits because release_task()->call_rcu(delayed_put_task_struct)
      implies an RCU grace period after the task/pid goes away and before the final put().
      
      Test-case:
      
      	mkdir -p /tmp/CG
      	mount -t cgroup2 none /tmp/CG
      	echo '+pids' > /tmp/CG/cgroup.subtree_control
      
      	mkdir /tmp/CG/PID
      	echo 2 > /tmp/CG/PID/pids.max
      
      	perl -e 'while ($p = fork) { wait; } $p // die "fork failed: $!\n"' &
      	echo $! > /tmp/CG/PID/cgroup.procs
      
      Without this patch the forking process fails soon after migration.
      
      Rename cgroup_subsys->free() to cgroup_subsys->release() and move the callsite
      into the new helper, cgroup_release(), called by release_task() which actually
      frees the pid(s).
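
      A sketch of the new hook wiring (assuming a have_release_callback
      mask in the style of the existing have_*_callback masks):

      	void cgroup_release(struct task_struct *task)
      	{
      		struct cgroup_subsys *ss;
      		int ssid;

      		do_each_subsys_mask(ss, ssid, have_release_callback) {
      			ss->release(task);
      		} while_each_subsys_mask();
      	}
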
      Reported-by: Herton R. Krzesinski <hkrzesin@redhat.com>
      Reported-by: Jan Stancek <jstancek@redhat.com>
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: fix missing prototype warnings · 08619450
      Valdis Kletnieks authored
      [ Upstream commit 116bfa96 ]
      
      Compiling with W=1 generates warnings:
      
        CC      kernel/bpf/core.o
      kernel/bpf/core.c:721:12: warning: no previous prototype for 'bpf_jit_alloc_exec_limit' [-Wmissing-prototypes]
        721 | u64 __weak bpf_jit_alloc_exec_limit(void)
            |            ^~~~~~~~~~~~~~~~~~~~~~~~
      kernel/bpf/core.c:757:14: warning: no previous prototype for 'bpf_jit_alloc_exec' [-Wmissing-prototypes]
        757 | void *__weak bpf_jit_alloc_exec(unsigned long size)
            |              ^~~~~~~~~~~~~~~~~~
      kernel/bpf/core.c:762:13: warning: no previous prototype for 'bpf_jit_free_exec' [-Wmissing-prototypes]
        762 | void __weak bpf_jit_free_exec(void *addr)
            |             ^~~~~~~~~~~~~~~~~
      
      All three are weak functions that archs can override; provide
      proper prototypes for them so that an arch supplying its own
      versions has matching declarations.
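
      The added declarations amount to (matching the definitions in
      kernel/bpf/core.c quoted above):

      	u64 bpf_jit_alloc_exec_limit(void);
      	void *bpf_jit_alloc_exec(unsigned long size);
      	void bpf_jit_free_exec(void *addr);
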
      Signed-off-by: Valdis Kletnieks <valdis.kletnieks@vt.edu>
      Acked-by: Song Liu <songliubraving@fb.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock() · 5ca05ecd
      Andrea Parri authored
      [ Upstream commit c546951d ]
      
      move_queued_task() synchronizes with task_rq_lock() as follows:
      
      	move_queued_task()		task_rq_lock()
      
      	[S] ->on_rq = MIGRATING		[L] rq = task_rq()
      	WMB (__set_task_cpu())		ACQUIRE (rq->lock);
      	[S] ->cpu = new_cpu		[L] ->on_rq
      
      where "[L] rq = task_rq()" is ordered before "ACQUIRE (rq->lock)" by an
      address dependency and, in turn, "ACQUIRE (rq->lock)" is ordered before
      "[L] ->on_rq" by the ACQUIRE itself.
      
      Use READ_ONCE() to load ->cpu in task_rq() (c.f., task_cpu()) to honor
      this address dependency.  Also, mark the accesses to ->cpu and ->on_rq
      with READ_ONCE()/WRITE_ONCE() to comply with the LKMM.
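
      For instance, the ->cpu load in task_cpu() becomes (sketch, assuming
      CONFIG_THREAD_INFO_IN_TASK so that ->cpu lives in task_struct):

      	static inline unsigned int task_cpu(const struct task_struct *p)
      	{
      		return READ_ONCE(p->cpu);
      	}
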
      Signed-off-by: Andrea Parri <andrea.parri@amarulasolutions.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul E. McKenney <paulmck@linux.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Link: https://lkml.kernel.org/r/20190121155240.27173-1-andrea.parri@amarulasolutions.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • perf/aux: Make perf_event accessible to setup_aux() · 4e4fba6d
      Mathieu Poirier authored
      [ Upstream commit 84001866 ]
      
      When pmu::setup_aux() is called the coresight PMU needs to know which
      sink to use for the session by looking up the information in the
      event's attr::config2 field.
      
      As such, simply replace the cpu information with the complete
      perf_event structure and update all affected callers.
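
      The pmu::setup_aux() signature changes roughly as follows (a sketch;
      the remaining arguments mirror the existing callback):

      	-	void *(*setup_aux)(int cpu, void **pages,
      	-			   int nr_pages, bool overwrite);
      	+	void *(*setup_aux)(struct perf_event *event, void **pages,
      	+			   int nr_pages, bool overwrite);
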
      Signed-off-by: Mathieu Poirier <mathieu.poirier@linaro.org>
      Reviewed-by: Suzuki Poulouse <suzuki.poulose@arm.com>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Cc: Adrian Hunter <adrian.hunter@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Alexei Starovoitov <ast@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jiri Olsa <jolsa@redhat.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Namhyung Kim <namhyung@kernel.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: linux-arm-kernel@lists.infradead.org
      Cc: linux-s390@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190131184714.20388-2-mathieu.poirier@linaro.org
      Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • genirq: Avoid summation loops for /proc/stat · c0ed0486
      Thomas Gleixner authored
      [ Upstream commit 1136b072 ]
      
      Waiman reported that on large systems with a large amount of interrupts the
      readout of /proc/stat takes a long time to sum up the interrupt
      statistics. In principle this is not a problem, but for unknown reasons
      some enterprise quality software reads /proc/stat with a high frequency.
      
      The reason for this is that interrupt statistics are accounted per cpu. So
      the /proc/stat logic has to sum up the interrupt stats for each interrupt.
      
      This can be largely avoided for interrupts which are not marked as
      'PER_CPU' interrupts by simply adding a per interrupt summation counter
      which is incremented along with the per interrupt per cpu counter.
      
      The PER_CPU interrupts need to avoid that and use only per cpu accounting
      because they share the interrupt number and the interrupt descriptor and
      concurrent updates would conflict or require unwanted synchronization.
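
      A sketch of the split accounting (assuming a new tot_count field in
      struct irq_desc):

      	static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
      	{
      		__this_cpu_inc(*desc->kstat_irqs);
      		__this_cpu_inc(kstat.irqs_sum);
      	}

      	static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
      	{
      		__kstat_incr_irqs_this_cpu(desc);
      		desc->tot_count++;	/* not used for PER_CPU interrupts */
      	}
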
      Reported-by: Waiman Long <longman@redhat.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      Reviewed-by: Davidlohr Bueso <dbueso@suse.de>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: linux-fsdevel@vger.kernel.org
      Cc: Davidlohr Bueso <dave@stgolabs.net>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Daniel Colascione <dancol@google.com>
      Cc: Dave Chinner <david@fromorbit.com>
      Cc: Randy Dunlap <rdunlap@infradead.org>
      Link: https://lkml.kernel.org/r/20190208135020.925487496@linutronix.de
      
      8<-------------
      
      v2: Undo the unintentional layout change of struct irq_desc.
      
       include/linux/irqdesc.h |    1 +
       kernel/irq/chip.c       |   12 ++++++++++--
       kernel/irq/internals.h  |    8 +++++++-
       kernel/irq/irqdesc.c    |    7 ++++++-
       4 files changed, 24 insertions(+), 4 deletions(-)
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • sched/topology: Fix percpu data types in struct sd_data & struct s_data · 83a6f919
      Luc Van Oostenryck authored
      [ Upstream commit 99687cdb ]
      
      The percpu members of struct sd_data and s_data are declared as:
      
      	struct ... ** __percpu member;
      
      So their type is:
      
      	__percpu pointer to pointer to struct ...
      
      But looking at how they're used, their type should be:
      
      	pointer to __percpu pointer to struct ...
      
      and they should thus be declared as:
      
      	struct ... * __percpu *member;
      
      So fix the placement of '__percpu' in the definition of these
      structures.
      
      This addresses a bunch of Sparse warnings like:
      
      	warning: incorrect type in initializer (different address spaces)
      	  expected void const [noderef] <asn:3> *__vpp_verify
      	  got struct sched_domain **
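
      With the fix, struct sd_data reads (a sketch of the corrected
      declarations):

      	struct sd_data {
      		struct sched_domain * __percpu *sd;
      		struct sched_domain_shared * __percpu *sds;
      		struct sched_group * __percpu *sg;
      		struct sched_group_capacity * __percpu *sgc;
      	};
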
      Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: https://lkml.kernel.org/r/20190118144936.79158-1-luc.vanoostenryck@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • efi: Fix build error due to enum collision between efi.h and ima.h · aa6c9fca
      Anders Roxell authored
      [ Upstream commit 5c418dc7 ]
      
      The following commit:
      
        a893ea15d764 ("tpm: move tpm_chip definition to include/linux/tpm.h")
      
      introduced a build error when both IMA and EFI are enabled:
      
          In file included from ../security/integrity/ima/ima_fs.c:30:
          ../security/integrity/ima/ima.h:176:7: error: redeclaration of enumerator "NONE"
      
      What happens is that both headers (ima.h and efi.h) define the same
      'NONE' constant, and the build broke when they started getting
      included from the same file.

      Rework to prefix the EFI enum values with 'EFI_*'.
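
      A sketch of the rename on the EFI side (assuming the colliding
      constants live in enum efi_rts_ids):

      	enum efi_rts_ids {
      		EFI_NONE,
      		EFI_GET_TIME,
      		EFI_SET_TIME,
      		/* ... the remaining runtime services, all EFI_-prefixed */
      	};
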
      Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-efi@vger.kernel.org
      Link: http://lkml.kernel.org/r/20190215165551.12220-2-ard.biesheuvel@linaro.org
      [ Cleaned up the changelog a bit. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • scsi: fcoe: make use of fip_mode enum complete · 456736ab
      Sedat Dilek authored
      [ Upstream commit 8beb90aa ]
      
      commit 1917d42d ("fcoe: use enum for fip_mode") introduced a separate
      enum for the fip_mode that shall be used during initialisation handling
      until it is passed to fcoe_ctlr_link_up to set the initial fip_state.  That
      change was incomplete, and gcc quietly converted in various places between
      the fip_mode and the fip_state enum values with implicit enum conversions,
      which fortunately cannot cause any issues in the actual code's execution.
      
      clang however warns about these implicit enum conversions in the scsi
      drivers. This commit consolidates the use of the two enums, guided by
      clang's enum-conversion warnings.
      
      This commit now completes the use of the fip_mode: It expects and uses
      fip_mode in {bnx2fc,fcoe}_interface_create and fcoe_ctlr_init, and it calls
      fcoe_ctlr_set_set() with the correct values in fcoe_ctlr_link_up().  It
      also breaks the association between FIP_MODE_AUTO and FIP_ST_AUTO to
      indicate these two enums are distinct.
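
      A sketch of the now fully distinct enum (member names as in
      libfcoe.h):

      	enum fip_mode {
      		FIP_MODE_AUTO,		/* no longer tied to FIP_ST_AUTO */
      		FIP_MODE_NON_FIP,
      		FIP_MODE_FABRIC,
      		FIP_MODE_VN2VN,
      	};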
      
      Link: https://github.com/ClangBuiltLinux/linux/issues/151
      Fixes: 1917d42d ("fcoe: use enum for fip_mode")
      Reported-by: Dmitry Golovin <dima@golovin.in>
      Original-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      CC: Lukas Bulwahn <lukas.bulwahn@gmail.com>
      CC: Nick Desaulniers <ndesaulniers@google.com>
      CC: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
      Tested-by: Nathan Chancellor <natechancellor@gmail.com>
      Suggested-by: Johannes Thumshirn <jthumshirn@suse.de>
      Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
      Signed-off-by: Hannes Reinecke <hare@suse.com>
      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • clk: fractional-divider: check parent rate only if flag is set · 1a4faefc
      Katsuhiro Suzuki authored
      [ Upstream commit d13501a2 ]
      
      A custom approximation for a fractional divider may not need parent
      clock rate checking. For example, Rockchip SoCs work fine using the
      grandparent clock rate even if the target rate is greater than the
      parent's.
      
      This patch checks the parent clock rate only if the CLK_SET_RATE_PARENT
      flag is set.
      
      As a detailed example, consider the clock tree of the Rockchip I2S
      audio hardware:
        - The clock rate of CPLL is 1.2GHz, that of GPLL is 491.52MHz.
        - i2s1_div is an integer divider that can divide by N (N = 1..128).
          Its input clock is CPLL or GPLL. The initial divider value is N = 1.
          Ex) PLL = CPLL, N = 10: the i2s1_div output rate is
            CPLL / 10 = 1.2GHz / 10 = 120MHz
        - i2s1_frac is a fractional divider that can divide its input by
          x/y, where x and y are 16-bit integers.
      
      CPLL --> | selector | ---> i2s1_div -+--> | selector | --> I2S1 MCLK
      GPLL --> |          | ,--------------'    |          |
                            `--> i2s1_frac ---> |          |
      
      The clock mux system tries to choose a suitable source from i2s1_div
      and i2s1_frac for the master clock (MCLK) of I2S1.
      
      A bad scenario goes as follows:
        - Try to set MCLK to 8.192MHz (32kHz audio replay)
          The candidate setting is
          - i2s1_div: GPLL / 60 = 8.192MHz
          The i2s1_div candidate exactly matches the target clock rate, so
          the mux chooses this clock source. The i2s1_div output rate
          changes from 491.52MHz to 8.192MHz.

        - After that, try to set MCLK to 11.2896MHz (44.1kHz audio replay)
          The candidate settings are
          - i2s1_div : CPLL / 107 = 11.214945MHz
          - i2s1_frac: i2s1_div   = 8.192MHz
            This is because clk_fd_round_rate() sees that the target rate
            (11.2896MHz) is higher than the parent rate (i2s1_div =
            8.192MHz) and returns the parent clock rate.

      The above is the current upstream behavior: the clock mux system
      chooses i2s1_div, but this clock rate is not acceptable to the I2S
      driver, so users cannot play audio.
      
      The expected behavior is:
        - Try to set MCLK to 11.2896MHz (44.1kHz audio replay)
          The candidate settings are
          - i2s1_div : CPLL / 107          = 11.214945MHz
          - i2s1_frac: i2s1_div * 147/6400 = 11.2896MHz,
                       changing i2s1_div to GPLL / 1 = 491.52MHz at the
                       same time.
      
      With this commit applied, clk_fd_round_rate() calls Rockchip's custom
      approximation function even if the target rate is higher than the
      parent's. The custom function changes both the grandparent (i2s1_div)
      and the parent (i2s1_frac) settings at the same time, so the clock mux
      system can choose i2s1_frac and audio works fine.
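
      A sketch of the new early-exit guard in clk_fd_round_rate():

      	static long clk_fd_round_rate(struct clk_hw *hw, unsigned long rate,
      				      unsigned long *parent_rate)
      	{
      		/* Only bail out early when we are not allowed to ask
      		 * the parent to change its rate. */
      		if (!rate || (!(clk_hw_get_flags(hw) & CLK_SET_RATE_PARENT) &&
      			      rate >= *parent_rate))
      			return *parent_rate;
      		/* ... fall through to the (custom) approximation ... */
      	}
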
      Signed-off-by: Katsuhiro Suzuki <katsuhiro@katsuster.net>
      Reviewed-by: Heiko Stuebner <heiko@sntech.de>
      [sboyd@kernel.org: Make function into a macro instead]
      Signed-off-by: Stephen Boyd <sboyd@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • i2c: Allow recovery of the initial IRQ by an I2C client device. · e2427570
      Jim Broadus authored
      [ Upstream commit 93b6604c ]
      
      A previous change allowed I2C client devices to discover new IRQs upon
      reprobe by clearing the IRQ in i2c_device_remove. However, if an IRQ was
      assigned in i2c_new_device, that information is lost.
      
      For example, the touchscreen and trackpad devices on a Dell Inspiron laptop
      are I2C devices whose IRQs are defined by ACPI extended IRQ types. The
      client device structures are initialized during an ACPI walk. After
      removing the i2c_hid device, modprobe fails.
      
      This change caches the initial IRQ value in i2c_new_device and then resets
      the client device IRQ to the initial value in i2c_device_remove.
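
      A sketch of the idea (assuming a new init_irq member in struct
      i2c_client):

      	/* i2c_new_device(): remember the originally assigned IRQ */
      	client->init_irq = info->irq;
      	client->irq = client->init_irq;

      	/* i2c_device_remove(): restore instead of clearing */
      	client->irq = client->init_irq;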
      
      Fixes: 6f108dd7 ("i2c: Clear client->irq in i2c_device_remove")
      Signed-off-by: Jim Broadus <jbroadus@gmail.com>
      Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
      Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
      [wsa: this is an easy to backport fix for the regression. We will
      refactor the code to handle irq assignments better in general.]
      Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • f2fs: fix to check inline_xattr_size boundary correctly · 88596e78
      Chao Yu authored
      [ Upstream commit 500e0b28 ]
      
      We use below condition to check inline_xattr_size boundary:
      
      	if (!F2FS_OPTION(sbi).inline_xattr_size ||
      		F2FS_OPTION(sbi).inline_xattr_size >=
      				DEF_ADDRS_PER_INODE -
      				F2FS_TOTAL_EXTRA_ATTR_SIZE -
      				DEF_INLINE_RESERVED_SIZE -
      				DEF_MIN_INLINE_SIZE)
      
      There are three problems in that check:
      - we should allow inline_xattr_size to equal the minimum size of the
        inline {data,dentry} area.
      - F2FS_TOTAL_EXTRA_ATTR_SIZE and inline_xattr_size are based on
        different size units: the former counts 4-byte words, the latter
        1-byte units.
      - DEF_MIN_INLINE_SIZE only indicates the minimum size of the inline
        data area; however, we need to consider the minimum size of the
        inline dentry area as well. A minimal inline dentry should contain
        at least two entries, '.' and '..', so the minimum inline_dentry
        size is 40 bytes:
      
      .bitmap		1 * 1 = 1
      .reserved	1 * 1 = 1
      .dentry		11 * 2 = 22
      .filename	8 * 2 = 16
      total		40
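
      A sketch of the corrected boundary check (the MIN/MAX bounds are
      illustrative names for the limits derived above, normalized to the
      same unit):

      	if (F2FS_OPTION(sbi).inline_xattr_size < MIN_INLINE_XATTR_SIZE ||
      	    F2FS_OPTION(sbi).inline_xattr_size > MAX_INLINE_XATTR_SIZE)
      		/* reject the mount option */
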
      Signed-off-by: Chao Yu <yuchao0@huawei.com>
      Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • kasan: fix kasan_check_read/write definitions · 64f33625
      Arnd Bergmann authored
      [ Upstream commit bcf6f55a ]
      
      Building little-endian allmodconfig kernels on arm64 started failing
      with the generated atomic.h implementation, since we now try to call
      kasan helpers from the EFI stub:
      
        aarch64-linux-gnu-ld: drivers/firmware/efi/libstub/arm-stub.stub.o: in function `atomic_set':
        include/generated/atomic-instrumented.h:44: undefined reference to `__efistub_kasan_check_write'
      
      I suspect that we get similar problems in other files that explicitly
      disable KASAN for some reason but call atomic_t based helper functions.
      
      We can fix this by checking the predefined __SANITIZE_ADDRESS__ macro
      that the compiler sets instead of checking CONFIG_KASAN, but this in
      turn requires a small hack in mm/kasan/common.c so we do see the extern
      declaration there instead of the inline function.
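
      The gist of the fix in include/linux/kasan-checks.h (a sketch):

      	#ifdef __SANITIZE_ADDRESS__
      	void kasan_check_read(const volatile void *p, unsigned int size);
      	void kasan_check_write(const volatile void *p, unsigned int size);
      	#else
      	static inline void kasan_check_read(const volatile void *p,
      					    unsigned int size) { }
      	static inline void kasan_check_write(const volatile void *p,
      					     unsigned int size) { }
      	#endif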
      
      Link: http://lkml.kernel.org/r/20181211133453.2835077-1-arnd@arndb.de
      Fixes: b1864b828644 ("locking/atomics: build atomic headers as required")
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Reported-by: Anders Roxell <anders.roxell@linaro.org>
      Acked-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
      Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Alexander Potapenko <glider@google.com>
      Cc: Dmitry Vyukov <dvyukov@google.com>
      Cc: Andrey Konovalov <andreyknvl@google.com>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • include/linux/relay.h: fix percpu annotation in struct rchan · fd9317a3
      Luc Van Oostenryck authored
      [ Upstream commit 62461ac2 ]
      
      The percpu member of this structure is declared as:
      	struct ... ** __percpu member;
      So its type is:
      	__percpu pointer to pointer to struct ...
      
      But looking at how it's used, its type should be:
      	pointer to __percpu pointer to struct ...
      and it should thus be declared as:
      	struct ... * __percpu *member;
      
      So fix the placement of '__percpu' in the definition of this
      structure.
      
      This silences a few Sparse warnings like:
      	warning: incorrect type in initializer (different address spaces)
      	  expected void const [noderef] <asn:3> *__vpp_verify
      	  got struct sched_domain **
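
      After the fix the member reads (a sketch):

      	struct rchan {
      		...
      		struct rchan_buf * __percpu *buf; /* per-cpu channel buffers */
      		...
      	};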
      
      Link: http://lkml.kernel.org/r/20190118144902.79065-1-luc.vanoostenryck@gmail.com
      Fixes: 017c59c0 ("relay: Use per CPU constructs for the relay channel buffer pointers")
      Signed-off-by: Luc Van Oostenryck <luc.vanoostenryck@gmail.com>
      Cc: Jens Axboe <axboe@suse.de>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • tracing: kdb: Fix ftdump to not sleep · 043a4400
      Douglas Anderson authored
      [ Upstream commit 31b265b3 ]
      
      As reported back in 2016-11 [1], the "ftdump" kdb command triggers a
      BUG for "sleeping function called from invalid context".
      
      kdb's "ftdump" command wants to call ring_buffer_read_prepare() in
      atomic context.  A very simple solution for this is to add allocation
      flags to ring_buffer_read_prepare() so kdb can call it without
      triggering the allocation error.  This patch does that.
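
      The new prototype and the kdb call site then look roughly like:

      	struct ring_buffer_iter *
      	ring_buffer_read_prepare(struct ring_buffer *buffer, int cpu,
      				 gfp_t flags);

      	/* kdb's ftdump runs in atomic context: */
      	iter = ring_buffer_read_prepare(buffer, cpu, GFP_ATOMIC);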
      
      Note that in the original email thread about this, it was suggested
      that perhaps the solution for kdb was to either preallocate the buffer
      ahead of time or create our own iterator.  I'm hoping that this
      alternative of adding allocation flags to ring_buffer_read_prepare()
      can be considered since it means I don't need to duplicate more of the
      core trace code into "trace_kdb.c" (for either creating my own
      iterator or re-preparing a ring allocator whose memory was already
      allocated).
      
      NOTE: another option for kdb is to actually figure out how to make it
      reuse the existing ftrace_dump() function and totally eliminate the
      duplication.  This sounds very appealing and actually works (the "sr
      z" command can be seen to properly dump the ftrace buffer).  The
      downside here is that ftrace_dump() fully consumes the trace buffer.
      Unless that is changed I'd rather not use it because it means "ftdump
      | grep xyz" won't be very useful to search the ftrace buffer since it
      will throw away the whole trace on the first grep.  A future patch to
      dump only the last few lines of the buffer will also be hard to
      implement.
      
      [1] https://lkml.kernel.org/r/20161117191605.GA21459@google.com
      
      Link: http://lkml.kernel.org/r/20190308193205.213659-1-dianders@chromium.org
      Reported-by: Brian Norris <briannorris@chromium.org>
      Signed-off-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 03 Apr, 2019 6 commits
    • mm: add support for kmem caches in DMA32 zone · ed3886c7
      Nicolas Boichat authored
      commit 6d6ea1e9 upstream.
      
      Patch series "iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables",
      v6.
      
      This is a followup to the discussion in [1], [2].
      
      IOMMUs using ARMv7 short-descriptor format require page tables (level 1
      and 2) to be allocated within the first 4GB of RAM, even on 64-bit
      systems.
      
      For L1 tables that are bigger than a page, we can just use
      __get_free_pages with GFP_DMA32 (on arm64 systems only, arm would still
      use GFP_DMA).
      
      For L2 tables that only take 1KB, it would be a waste to allocate a full
      page, so we considered 3 approaches:
       1. This series, adding support for GFP_DMA32 slab caches.
       2. genalloc, which requires pre-allocating the maximum number of L2 page
          tables (4096, so 4MB of memory).
       3. page_frag, which is not very memory-efficient as it is unable to reuse
          freed fragments until the whole page is freed. [3]
      
      This series is the most memory-efficient approach.
      
      stable@ note:
        We confirmed that this is a regression, and IOMMU errors happen on 4.19
        and linux-next/master on MT8173 (elm, Acer Chromebook R13). The issue
        most likely starts from commit ad67f5a6 ("arm64: replace ZONE_DMA
        with ZONE_DMA32"), i.e. 4.15, and presumably breaks a number of Mediatek
        platforms (and maybe others?).
      
      [1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
      [2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
      [3] https://patchwork.codeaurora.org/patch/671639/
      
      This patch (of 3):
      
      IOMMUs using ARMv7 short-descriptor format require page tables to be
      allocated within the first 4GB of RAM, even on 64-bit systems.  On arm64,
      this is done by passing GFP_DMA32 flag to memory allocation functions.
      
      For IOMMU L2 tables that only take 1KB, it would be a waste to allocate
      a full page using get_free_pages, so we considered 3 approaches:
       1. This patch, adding support for GFP_DMA32 slab caches.
       2. genalloc, which requires pre-allocating the maximum number of L2
          page tables (4096, so 4MB of memory).
       3. page_frag, which is not very memory-efficient as it is unable
          to reuse freed fragments until the whole page is freed.
      
      This change makes it possible to create a custom cache in DMA32 zone using
      kmem_cache_create, then allocate memory using kmem_cache_alloc.
      
      We do not create a DMA32 kmalloc cache array, as there are currently no
      users of kmalloc(..., GFP_DMA32).  These calls will continue to trigger a
      warning, as we keep GFP_DMA32 in GFP_SLAB_BUG_MASK.
      
      This implies that calls to kmem_cache_*alloc on a SLAB_CACHE_DMA32
      kmem_cache must _not_ use GFP_DMA32 (it is anyway redundant and
      unnecessary).
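
      A minimal usage sketch (cache name and sizes hypothetical):

      	struct kmem_cache *l2_cache;

      	l2_cache = kmem_cache_create("io-pgtable-l2", SZ_1K, SZ_1K,
      				     SLAB_CACHE_DMA32, NULL);
      	/* the allocation inherits the DMA32 zone from the cache;
      	 * do not pass GFP_DMA32 here */
      	ptr = kmem_cache_alloc(l2_cache, GFP_KERNEL);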
      
      Link: http://lkml.kernel.org/r/20181210011504.122604-2-drinkcat@chromium.org
      Signed-off-by: Nicolas Boichat <drinkcat@chromium.org>
      Acked-by: Vlastimil Babka <vbabka@suse.cz>
      Acked-by: Will Deacon <will.deacon@arm.com>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: Joerg Roedel <joro@8bytes.org>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Mel Gorman <mgorman@techsingularity.net>
      Cc: Sasha Levin <Alexander.Levin@microsoft.com>
      Cc: Huaisheng Ye <yehs1@lenovo.com>
      Cc: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Cc: Yong Wu <yong.wu@mediatek.com>
      Cc: Matthias Brugger <matthias.bgg@gmail.com>
      Cc: Tomasz Figa <tfiga@google.com>
      Cc: Yingjoe Chen <yingjoe.chen@mediatek.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Hsin-Yi Wang <hsinyi@chromium.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mm/hotplug: fix offline undo_isolate_page_range() · eef9dbba
      Qian Cai authored
      commit 9b7ea46a upstream.
      
      Commit f1dd2cd1 ("mm, memory_hotplug: do not associate hotadded
      memory to zones until online") introduced move_pfn_range_to_zone() which
      calls memmap_init_zone() during onlining a memory block.
      memmap_init_zone() will reset pagetype flags and makes migrate type to
      be MOVABLE.
      
      However, in __offline_pages(), it also calls undo_isolate_page_range()
      after offline_isolated_pages() to do the same thing.  Because commit
      2ce13640 ("mm: __first_valid_page skip over offline pages") changed
      __first_valid_page() to skip offline pages, undo_isolate_page_range()
      here just wastes CPU cycles looping over the offlined PFN range while
      doing nothing, because __first_valid_page() will return NULL as
      offline_isolated_pages() has already marked all memory sections within
      the pfn range as offline via offline_mem_sections().
      
      Also, after calling the "useless" undo_isolate_page_range() here, it
      reaches the point of no return by notifying MEM_OFFLINE.  Those pages
      will be marked as MIGRATE_MOVABLE again once onlined.  The only thing
      left to do is to decrease the zone counter of isolated pageblocks,
      which otherwise keeps the page allocator on the slower paths that the
      above-mentioned isolation accounting introduced.
      
      Even if alloc_contig_range() can be used to isolate 16GB-hugetlb pages
      on ppc64, an "int" should still be enough to represent the number of
      pageblocks there.  Fix an incorrect comment along the way.
      
      [cai@lca.pw: v4]
        Link: http://lkml.kernel.org/r/20190314150641.59358-1-cai@lca.pw
      Link: http://lkml.kernel.org/r/20190313143133.46200-1-cai@lca.pw
      Fixes: 2ce13640 ("mm: __first_valid_page skip over offline pages")
      Signed-off-by: Qian Cai <cai@lca.pw>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: <stable@vger.kernel.org>	[4.13+]
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: mii: Fix PAUSE cap advertisement from linkmode_adv_to_lcl_adv_t() helper · aa3f1b02
      Claudiu Manoil authored
      [ Upstream commit 7f07e5f1 ]
      
      With a recent link mode advertisement code update, this helper,
      which provides the local pause capability translation used for
      flow control link mode negotiation, got broken.
      For ethernet drivers using this helper, the issue is apparent only
      if either PAUSE or ASYM_PAUSE is being advertised.
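
      A sketch of the fixed helper (assuming the regression was testing the
      wrong ETHTOOL_LINK_MODE_* bit for one of the two flags):

      	static inline u16 linkmode_adv_to_lcl_adv_t(unsigned long *advertising)
      	{
      		u16 lcl_adv = 0;

      		if (linkmode_test_bit(ETHTOOL_LINK_MODE_Pause_BIT,
      				      advertising))
      			lcl_adv |= ADVERTISE_PAUSE_CAP;
      		if (linkmode_test_bit(ETHTOOL_LINK_MODE_Asym_Pause_BIT,
      				      advertising))
      			lcl_adv |= ADVERTISE_PAUSE_ASYM;

      		return lcl_adv;
      	}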
      
      Fixes: 3c1bcc86 ("net: ethernet: Convert phydev advertize and supported from u32 to link mode")
      Signed-off-by: Claudiu Manoil <claudiu.manoil@nxp.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • sctp: get sctphdr by offset in sctp_compute_cksum · d2af0ce5
      Xin Long authored
      [ Upstream commit 273160ff ]
      
      sctp_hdr(skb) only works when skb->transport_header is set properly.
      
      But in Netfilter, skb->transport_header for ipv6 is not guaranteed
      to be the right value for sctphdr. It would cause the checksum check
      for sctp packets to fail.

      So fix it by using the offset, which is always right in all places.
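
      A sketch of the change in sctp_compute_cksum() (net/sctp/checksum.h):

      	static inline __le32 sctp_compute_cksum(const struct sk_buff *skb,
      						unsigned int offset)
      	{
      		/* was: struct sctphdr *sh = sctp_hdr(skb); */
      		struct sctphdr *sh = (struct sctphdr *)(skb->data + offset);
      		...
      	}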
      
      v1->v2:
        - Fix the changelog.
      
      Fixes: e6d8b64b ("net: sctp: fix and consolidate SCTP checksumming code")
      Reported-by: Li Shuang <shuali@redhat.com>
      Signed-off-by: Xin Long <lucien.xin@gmail.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • packets: Always register packet sk in the same order · 278c7d7e
      Maxime Chevallier authored
      [ Upstream commit a4dc6a49 ]
      
      When using fanouts with AF_PACKET, the demux functions such as
      fanout_demux_cpu will return an index in the fanout socket array, which
      corresponds to the selected socket.
      
      The ordering of this array depends on the order the sockets were added
      to a given fanout group, so for FANOUT_CPU this means sockets are bound
      to cpus in the order they are configured, which is OK.
      
      However, when stopping then restarting the interface these sockets are
      bound to, the sockets are reassigned to the fanout group in the reverse
      order, due to the fact that they were inserted at the head of the
      interface's AF_PACKET socket list.
      
      This means that traffic that was directed to the first socket in the
      fanout group is now directed to the last one after an interface restart.
      
      In the case of FANOUT_CPU, traffic from CPU0 will be directed to the
      socket that used to receive traffic from the last CPU after an interface
      restart.
      
      This commit introduces a helper to add a socket at the tail of a list,
      then uses it to register AF_PACKET sockets.
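
      A sketch of the helper (mirroring the existing sk_add_node_rcu(), but
      appending to the list):

      	static inline void sk_add_node_tail_rcu(struct sock *sk,
      						struct hlist_head *list)
      	{
      		sock_hold(sk);
      		hlist_add_tail_rcu(&sk->sk_node, list);
      	}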
      
      Note that this changes the order in which sockets are listed in /proc and
      with sock_diag.
      
      Fixes: dc99f600 ("packet: Add fanout support")
      Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
      Acked-by: Willem de Bruijn <willemb@google.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • netfilter: nf_tables: fix set double-free in abort path · 400dded5
      Pablo Neira Ayuso authored
      [ Upstream commit 40ba1d9b ]
      
      The abort path can cause a double-free of an anonymous set.
      The added-and-to-be-aborted rule looks like this:
      
      udp dport { 137, 138 } drop
      
      The to-be-aborted transaction list looks like this:
      
      newset
      newsetelem
      newsetelem
      rule
      
      This gets walked in reverse order, so first pass disables the rule, the
      set elements, then the set.
      
      After synchronize_rcu(), we then destroy those in same order: rule, set
      element, set element, newset.
      
      The problem is that the anonymous set has already been bound to the
      rule, so the rule (the lookup expression destructor) already frees the
      set, which then causes a use-after-free when trying to delete the
      elements from this set, and we then try to free the set again when
      handling the newset transaction.

      The rule releases the bound set in the first place from the abort
      path; this causes the use-after-free on set element removal when
      undoing the new element transactions. To handle this, skip the new
      element transactions if the set is bound, from the abort path.

      This still causes the use-after-free on set element removal.  To
      handle this, remove the transaction from the list when the set is
      already bound.
      
      Joint work with Florian Westphal.
      
      Fixes: f6ac8585 ("netfilter: nf_tables: unbind set in rule from commit path")
      Bugzilla: https://bugzilla.netfilter.org/show_bug.cgi?id=1325
      Acked-by: Florian Westphal <fw@strlen.de>
      Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  3. 27 Mar, 2019 2 commits
    • aio: simplify - and fix - fget/fput for io_submit() · a179695e
      Linus Torvalds authored
      commit 84c4e1f8 upstream.
      
      Al Viro root-caused a race where the IOCB_CMD_POLL handling of
      fget/fput() could cause us to access the file pointer after it had
      already been freed:
      
       "In more details - normally IOCB_CMD_POLL handling looks so:
      
         1) io_submit(2) allocates aio_kiocb instance and passes it to
            aio_poll()
      
         2) aio_poll() resolves the descriptor to struct file by req->file =
            fget(iocb->aio_fildes)
      
         3) aio_poll() sets ->woken to false and raises ->ki_refcnt of that
            aio_kiocb to 2 (bumps by 1, that is).
      
         4) aio_poll() calls vfs_poll(). After sanity checks (basically,
            "poll_wait() had been called and only once") it locks the queue.
            That's what the extra reference to iocb had been for - we know we
            can safely access it.
      
         5) With queue locked, we check if ->woken has already been set to
            true (by aio_poll_wake()) and, if it had been, we unlock the
            queue, drop a reference to aio_kiocb and bugger off - at that
            point it's a responsibility to aio_poll_wake() and the stuff
            called/scheduled by it. That code will drop the reference to file
            in req->file, along with the other reference to our aio_kiocb.
      
         6) otherwise, we see whether we need to wait. If we do, we unlock the
            queue, drop one reference to aio_kiocb and go away - eventual
            wakeup (or cancel) will deal with the reference to file and with
            the other reference to aio_kiocb
      
         7) otherwise we remove ourselves from waitqueue (still under the
            queue lock), so that wakeup won't get us. No async activity will
            be happening, so we can safely drop req->file and iocb ourselves.
      
        If wakeup happens while we are in vfs_poll(), we are fine - aio_kiocb
        won't get freed under us, so we can do all the checks and locking
        safely. And we don't touch ->file if we detect that case.
      
        However, vfs_poll() most certainly *does* touch the file it had been
        given. So wakeup coming while we are still in ->poll() might end up
        doing fput() on that file. That case is not too rare, and usually we
        are saved by the still present reference from descriptor table - that
        fput() is not the final one.
      
        But if another thread closes that descriptor right after our fget()
        and wakeup does happen before ->poll() returns, we are in trouble -
        final fput() done while we are in the middle of a method:
      
      Al also wrote a patch to take an extra reference to the file descriptor
      to fix this, but I instead suggested we just streamline the whole file
      pointer handling by submit_io() so that the generic aio submission code
      simply keeps the file pointer around until the aio has completed.
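
      A sketch of the resulting scheme (ki_filp is the file pointer field
      in the aio request):

      	/* io_submit() path: resolve the descriptor exactly once */
      	req->ki_filp = fget(iocb->aio_fildes);
      	if (unlikely(!req->ki_filp))
      		return -EBADF;

      	/* ...and fput() it only when the request is finally destroyed,
      	 * after completion */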
      
      Fixes: bfe4037e ("aio: implement IOCB_CMD_POLL")
      Acked-by: Al Viro <viro@zeniv.linux.org.uk>
      Reported-by: syzbot+503d4cc169fcec1cb18c@syzkaller.appspotmail.com
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • libceph: wait for latest osdmap in ceph_monc_blacklist_add() · 48cce130
      Ilya Dryomov authored
      commit bb229bbb upstream.
      
      Because map updates are distributed lazily, an OSD may not know about
      the new blacklist for quite some time after "osd blacklist add" command
      is completed.  This makes it possible for a blacklisted but still alive
      client to overwrite a post-blacklist update, resulting in data
      corruption.
      
      Waiting for latest osdmap in ceph_monc_blacklist_add() and thus using
      the post-blacklist epoch for all post-blacklist requests ensures that
      all such requests "wait" for the blacklist to come into force on their
      respective OSDs.
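
      A sketch of the idea (reusing the existing
      ceph_wait_for_latest_osdmap() helper; the timeout plumbing is elided):

      	ret = ceph_monc_blacklist_add(monc, client_addr);
      	if (!ret)
      		/* use the post-blacklist epoch for all later requests */
      		ret = ceph_wait_for_latest_osdmap(monc->client, timeout);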
      
      Cc: stable@vger.kernel.org
      Fixes: 6305a3b4 ("libceph: support for blacklisting clients")
      Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
      Reviewed-by: Jason Dillaman <dillaman@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 23 Mar, 2019 9 commits
  5. 13 Mar, 2019 1 commit
    • drm: disable uncached DMA optimization for ARM and arm64 · e3f5c3cb
      Ard Biesheuvel authored
      [ Upstream commit e02f5c1b ]
      
      The DRM driver stack is designed to work with cache coherent devices
      only, but permits an optimization to be enabled in some cases, where
      for some buffers, both the CPU and the GPU use uncached mappings,
      removing the need for DMA snooping and allocation in the CPU caches.
      
      The use of uncached GPU mappings relies on the correct implementation
      of the PCIe NoSnoop TLP attribute by the platform, otherwise the GPU
      will use cached mappings nonetheless. On x86 platforms, this does not
      seem to matter, as uncached CPU mappings will snoop the caches in any
      case. However, on ARM and arm64, enabling this optimization on a
      platform where NoSnoop is ignored results in loss of coherency, which
      breaks correct operation of the device. Since we have no way of
      detecting whether NoSnoop works or not, just disable this
      optimization entirely for ARM and arm64.
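
      The change boils down to the following in drm_arch_can_wc_memory()
      (include/drm/drm_cache.h; other per-arch cases elided):

      	static inline bool drm_arch_can_wc_memory(void)
      	{
      	#if defined(CONFIG_ARM) || defined(CONFIG_ARM64)
      		return false;	/* NoSnoop may be silently ignored */
      	#else
      		/* ... existing per-arch checks ... */
      		return true;
      	#endif
      	}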
      
      Cc: Christian Koenig <christian.koenig@amd.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: David Zhou <David1.Zhou@amd.com>
      Cc: Huang Rui <ray.huang@amd.com>
      Cc: Junwei Zhang <Jerry.Zhang@amd.com>
      Cc: Michel Daenzer <michel.daenzer@amd.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Daniel Vetter <daniel@ffwll.ch>
      Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
      Cc: Maxime Ripard <maxime.ripard@bootlin.com>
      Cc: Sean Paul <sean@poorly.run>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: Robin Murphy <robin.murphy@arm.com>
      Cc: amd-gfx list <amd-gfx@lists.freedesktop.org>
      Cc: dri-devel <dri-devel@lists.freedesktop.org>
      Reported-by: Carsten Haitzler <Carsten.Haitzler@arm.com>
      Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Reviewed-by: Christian König <christian.koenig@amd.com>
      Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
      Link: https://patchwork.kernel.org/patch/10778815/
      Signed-off-by: Christian König <christian.koenig@amd.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  6. 10 Mar, 2019 4 commits
    • Bluetooth: Fix locking in bt_accept_enqueue() for BH context · bc609314
      Matthias Kaehlcke authored
      commit c4f5627f upstream.
      
      With commit e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket
      atomically") lock_sock[_nested]() is used to acquire the socket lock
      before manipulating the socket. lock_sock[_nested]() may block, which
      is problematic since bt_accept_enqueue() can be called in bottom half
      context (e.g. from rfcomm_connect_ind()):
      
      [<ffffff80080d81ec>] __might_sleep+0x4c/0x80
      [<ffffff800876c7b0>] lock_sock_nested+0x24/0x58
      [<ffffff8000d7c27c>] bt_accept_enqueue+0x48/0xd4 [bluetooth]
      [<ffffff8000e67d8c>] rfcomm_connect_ind+0x190/0x218 [rfcomm]
      
      Add a parameter to bt_accept_enqueue() to indicate whether the
      function is called from BH context, and acquire the socket lock
      with bh_lock_sock_nested() if that's the case.
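
      A sketch of the resulting locking (the actual function also handles
      the parent's accept queue):

      	void bt_accept_enqueue(struct sock *parent, struct sock *sk, bool bh)
      	{
      		...
      		if (bh)
      			bh_lock_sock_nested(sk);
      		else
      			lock_sock_nested(sk, SINGLE_DEPTH_NESTING);
      		...
      		if (bh)
      			bh_unlock_sock(sk);
      		else
      			release_sock(sk);
      	}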
      
      Also adapt all callers of bt_accept_enqueue() to pass the new
      parameter:
      
      - l2cap_sock_new_connection_cb()
        - uses lock_sock() to lock the parent socket => process context
      
      - rfcomm_connect_ind()
        - acquires the parent socket lock with bh_lock_sock() => BH
          context
      
      - __sco_chan_add()
        - called from sco_chan_add(), which is called from sco_connect().
          parent is NULL, hence bt_accept_enqueue() isn't called in this
          code path and we can ignore it
        - also called from sco_conn_ready(). uses bh_lock_sock() to acquire
          the parent lock => BH context
      
      Fixes: e1633762 ("Bluetooth: Handle bt_accept_enqueue() socket atomically")
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Reviewed-by: Douglas Anderson <dianders@chromium.org>
      Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • net: sched: put back q.qlen into a single location · cd267ea6
      Eric Dumazet authored
      [ Upstream commit 46b1c18f ]
      
      In the series fc8b81a5 ("Merge branch 'lockless-qdisc-series'")
      John made the assumption that the data path had no need to read
      the qdisc qlen (number of packets in the qdisc).
      
      It is true when pfifo_fast is used as the root qdisc, or as direct MQ/MQPRIO
      children.
      
      But pfifo_fast can be used as a leaf in classful qdiscs, and existing
      logic needs to access the child qlen in an efficient way.
      
      HTB breaks badly, since it uses cl->leaf.q->q.qlen in :
        htb_activate() -> WARN_ON()
        htb_dequeue_tree() to decide if a class can be htb_deactivated
        when it has no more packets.
      
      HFSC, DRR, CBQ, QFQ have similar issues, and some calls to
      qdisc_tree_reduce_backlog() also read q.qlen directly.
      
      Using qdisc_qlen_sum() (which iterates over all possible cpus)
      in the data path is a non-starter.
      
      It seems we have to put back qlen in a central location,
      at least for stable kernels.
      
      For all qdisc but pfifo_fast, qlen is guarded by the qdisc lock,
      so the existing q.qlen{++|--} are correct.
      
      For 'lockless' qdiscs (pfifo_fast so far), we need to use atomic_{inc|dec}()
      because the spinlock might not be held (for example from
      pfifo_fast_enqueue() and pfifo_fast_dequeue()).
      
      This patch adds atomic_qlen (in the same location as qlen)
      and renames the following helpers, since we want to express that
      they can be used without the qdisc lock, and that qlen is no longer per-cpu:
      
      - qdisc_qstats_cpu_qlen_dec -> qdisc_qstats_atomic_qlen_dec()
      - qdisc_qstats_cpu_qlen_inc -> qdisc_qstats_atomic_qlen_inc()
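
      A sketch of the renamed helpers (struct layout details aside):

      	static inline void qdisc_qstats_atomic_qlen_inc(struct Qdisc *sch)
      	{
      		atomic_inc(&sch->q.atomic_qlen);
      	}

      	static inline void qdisc_qstats_atomic_qlen_dec(struct Qdisc *sch)
      	{
      		atomic_dec(&sch->q.atomic_qlen);
      	}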
      
      Later (net-next) we might revert this patch by tracking all these
      qlen uses and replace them by a more efficient method (not having
      to access a precise qlen, but an empty/non_empty status that might
      be less expensive to maintain/track).
      
      Another possibility is to have a legacy pfifo_fast version that would
      be used when used as a child qdisc, since the parent qdisc needs
      a spinlock anyway. But then, future lockless qdiscs would also
      have the same problem.
      
      Fixes: 7e66016f ("net: sched: helpers to sum qlen and qlen for per cpu logic")
      Signed-off-by: Eric Dumazet <edumazet@google.com>
      Cc: John Fastabend <john.fastabend@gmail.com>
      Cc: Jamal Hadi Salim <jhs@mojatatu.com>
      Cc: Cong Wang <xiyou.wangcong@gmail.com>
      Cc: Jiri Pirko <jiri@resnulli.us>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • binder: create node flag to request sender's security context · ed1776bb
      Todd Kjos authored
      commit ec74136d upstream.
      
      To allow servers to verify the client identity, allow a node
      flag to be set that causes the sender's security context
      to be delivered with the transaction. The BR_TRANSACTION
      command is extended into BR_TRANSACTION_SEC_CTX, which
      contains a pointer to the security context string.
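
      A sketch of the UAPI additions (shapes as described above; treat the
      exact flag value as illustrative):

      	/* flat_binder_object flag: sender supplies its security context */
      	FLAT_BINDER_FLAG_TXN_SECURITY_CTX = 0x1000,

      	struct binder_transaction_data_secctx {
      		struct binder_transaction_data transaction_data;
      		binder_uintptr_t secctx;	/* security context string */
      	};
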
      Signed-off-by: Todd Kjos <tkjos@google.com>
      Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • cpufreq: Use struct kobj_attribute instead of struct global_attr · e36e066f
      Viresh Kumar authored
      commit 625c85a6 upstream.
      
      The cpufreq_global_kobject is created using kobject_create_and_add()
      helper, which assigns the kobj_type as dynamic_kobj_ktype and show/store
      routines are set to kobj_attr_show() and kobj_attr_store().
      
      These routines pass struct kobj_attribute as an argument to the
      show/store callbacks. But all the cpufreq files created using the
      cpufreq_global_kobject expect the argument to be of type struct
      attribute. Things work fine currently as no one accesses the "attr"
      argument. We may not see issues even if the argument is used, as struct
      kobj_attribute has struct attribute as its first element and so both
      resolve to the same address.
      
      But this is logically incorrect, and we should rather use struct
      kobj_attribute instead of struct global_attr in the cpufreq core and
      drivers, and the show/store callbacks should take struct
      kobj_attribute as their argument instead.
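
      A sketch of the resulting callback signatures (show_boost/store_boost
      are among the affected attributes):

      	static ssize_t show_boost(struct kobject *kobj,
      				  struct kobj_attribute *attr, char *buf);
      	static ssize_t store_boost(struct kobject *kobj,
      				   struct kobj_attribute *attr,
      				   const char *buf, size_t count);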
      
      This bug is caught by CFI (clang Control Flow Integrity) builds of the
      Android kernel, which catch mismatched function prototypes for such
      callbacks.
      Reported-by: Donghee Han <dh.han@samsung.com>
      Reported-by: Sangkyu Kim <skwith.kim@samsung.com>
      Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 02 Mar, 2019 1 commit
  8. 27 Feb, 2019 1 commit
  9. 25 Feb, 2019 1 commit