1. 31 Jan, 2019 40 commits
    • Input: input_event - fix the CONFIG_SPARC64 mixup · 4c8e5815
      Deepa Dinamani authored
      commit 141e5dca upstream.
      
      Arnd Bergmann pointed out that CONFIG_* cannot be used in a uapi header.
      Override with an equivalent conditional.
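
A uapi header is also compiled by userspace programs, where the kernel's CONFIG_* symbols are never defined, so the condition has to be built from macros the compiler itself provides. The following is a minimal sketch of that idea, assuming the compiler-predefined `__sparc__` and `__arch64__` macros identify a 64-bit SPARC target; the `EXAMPLE_IS_SPARC64` name and the helper function are hypothetical and only serve the illustration.

```c
/*
 * Sketch only: select a SPARC64-specific layout without CONFIG_SPARC64.
 * __sparc__ and __arch64__ are predefined by the compiler on 64-bit SPARC,
 * so the test gives the same answer in kernel and userspace builds.
 * EXAMPLE_IS_SPARC64 is a hypothetical name used for illustration.
 */
#if defined(__sparc__) && defined(__arch64__)
#define EXAMPLE_IS_SPARC64 1
#else
#define EXAMPLE_IS_SPARC64 0
#endif

/* Hypothetical helper that exposes the compile-time decision. */
static int example_is_sparc64(void)
{
    return EXAMPLE_IS_SPARC64;
}
```

The design point is that the decision depends only on the toolchain, not on a kernel configuration file that userspace never sees.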
      
      Fixes: 2e746942 ("Input: input_event - provide override for sparc64")
      Fixes: 152194fe ("Input: extend usable life of event timestamps to 2106 on 32 bit systems")
      Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
      Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • ide: fix a typo in the settings proc file name · 3ff523df
      Christoph Hellwig authored
      commit f8ff6c73 upstream.
      
      Fixes: ec7d9c9c ("ide: replace ->proc_fops with ->proc_show")
      Reported-by: kernel test robot <lkp@intel.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x0: phy: unify calibration between mt76x0u and mt76x0e · 6b90547d
      Lorenzo Bianconi authored
      commit 1163bdb6 upstream.
      
      Align the PHY calibration logic between the mt76x0u and mt76x0e drivers.
      This patch improves connection stability at low SNR.

      Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x0: antenna select corrections · 9e39a91e
      Stanislaw Gruszka authored
      commit ef442b73 upstream.
      
      Update mt76x0_phy_ant_select() to conform to the vendor driver: most
      notably, add dual antenna mode support, read the configuration from
      EEPROM, and move antenna selection out of channel configuration into
      the init phase. Also add a small MT7630E quirk for the MT_CMB_CTRL
      register, which the vendor driver dedicated to this chip applies.

      This makes MT7630E work with the mt76x0e driver and does not cause any
      problems on MT7610U for me.

      Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x0: do not perform MCU calibration for MT7630 · 14275ffa
      Stanislaw Gruszka authored
      commit a83150ea upstream.
      
      The driver works better for MT7630 without MCU calibration, which
      looks like it can hang the firmware. The vendor driver does not
      perform it for MT7630 either.

      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x02: assure we update gain after scan · e15c6c8a
      Stanislaw Gruszka authored
      commit 4784a3cc upstream.
      
      Ensure that initializing dev->cal.low_gain to -1 always triggers a
      gain calibration update. Otherwise the update might or might not
      happen, depending on the value of the second bit of low_gain and on
      the values read from registers in mt76x02_phy_adjust_vga_gain().
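
The dependence on the second bit can be illustrated with a small sketch. It assumes, purely for illustration, that the freshly measured gain level is 0, 1 or 2 and that an update is triggered when bit 1 of the cached and measured levels differ; both function names are hypothetical and this is not the driver's code. Because -1 has bit 1 set, the unguarded comparison misses the update whenever the measured level also has bit 1 set.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not the driver's code.
 * cached:   stored gain level, -1 meaning "not initialized yet"
 * measured: freshly computed gain level (assumed to be 0, 1 or 2)
 */
static bool gain_update_needed(int cached, int measured)
{
    /* Explicit sentinel check: an uninitialized value always updates. */
    if (cached < 0)
        return true;
    /* Otherwise update only when bit 1 of the level flips. */
    return ((cached & 2) ^ (measured & 2)) != 0;
}

/* The same comparison without the sentinel check, for contrast. */
static bool gain_update_needed_unguarded(int cached, int measured)
{
    return ((cached & 2) ^ (measured & 2)) != 0;
}
```

With `cached == -1` and `measured == 2`, the unguarded variant reports that no update is needed, which is the "might or might not happen" behaviour described above.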

      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x02: run calibration after scanning · 1101ce4a
      Stanislaw Gruszka authored
      commit f1b8ee35 upstream.
      
      If we are associated and a scan is performed, the sw_scan_complete
      callback runs after we have returned to the operating channel, so the
      cal work is not queued. Fix this by queuing the cal work from
      sw_scan_complete().

      On mt76x0 we have to restore the gain in the MT_BBP(AGC, 8) register
      after scanning, as it was modified multiple times by the channel switch
      code. So queue the cal work without any delay to set the AGC gain value.

      As in mt76x2, initialize the AGC gain only when setting the operating
      channel, and before queuing the cal work in sw_scan_complete() just
      check whether initialization was already done.
      
      Fixes: bbd10586 ("mt76x0: phy: do not run calibration during channel switch")
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x0: use band parameter for LC calibration · ee21042f
      Stanislaw Gruszka authored
      commit ad3f993a upstream.
      
      We always use 1 as the band parameter for MCU_CAL_LC, which breaks
      2GHz; we should use 0 for this band instead.

      This patch fixes problems that sometimes occur when trying to associate
      with a 2GHz AP and that manifest as errors like the following:
      
      [14680.920823] wlan0: authenticate with 18:31:bf:c0:51:b0
      [14681.109506] wlan0: send auth to 18:31:bf:c0:51:b0 (try 1/3)
      [14681.310454] wlan0: send auth to 18:31:bf:c0:51:b0 (try 2/3)
      [14681.518469] wlan0: send auth to 18:31:bf:c0:51:b0 (try 3/3)
      [14681.726499] wlan0: authentication with 18:31:bf:c0:51:b0 timed out
      
      Fixes: 9aec146d ("mt76x0: pci: introduce mt76x0_phy_calirate routine")
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • mt76x0: do not overwrite other MT_BBP(AGC, 8) fields · 3650aa10
      Stanislaw Gruszka authored
      commit b983a5b9 upstream.
      
      The MT_BBP(AGC, 8) register holds band-dependent values from
      mt76x0_bbp_switch_tab, so we should not overwrite fields other
      than MT_BBP_AGC_GAIN when setting the gain.

      This can fix performance issues when connecting to a 2.4GHz AP.
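
Updating one field while preserving the rest of a register is a read-modify-write with a mask. The sketch below shows the pattern under an assumed layout (a 7-bit gain field at bits 14:8); the macro names, the field position and the `set_agc_gain()` helper are illustrative assumptions, not the driver's definitions.

```c
#include <stdint.h>

/* Assumed layout, for illustration only: 7-bit gain field at bits 14:8. */
#define AGC_GAIN_SHIFT 8
#define AGC_GAIN_MASK  (UINT32_C(0x7f) << AGC_GAIN_SHIFT)

/*
 * Return the register value with only the gain field replaced.
 * All bits outside AGC_GAIN_MASK keep the value they had in `reg`.
 */
static uint32_t set_agc_gain(uint32_t reg, uint32_t gain)
{
    reg &= ~AGC_GAIN_MASK;                            /* clear old gain   */
    reg |= (gain << AGC_GAIN_SHIFT) & AGC_GAIN_MASK;  /* insert new gain  */
    return reg;
}
```

Writing the whole register with a value built only from the gain would instead zero the band-dependent bits that the switch table programmed earlier.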
      
      Fixes: 4636a254 ("mt76x0: phy: align channel gain logic to mt76x2 one")
      Acked-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
      Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
      Signed-off-by: Felix Fietkau <nbd@nbd.name>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • usb: dwc3: gadget: Clear req->needs_extra_trb flag on cleanup · 5eaf9833
      Jack Pham authored
      commit bd674224 upstream.
      
      OUT endpoint requests may sometimes have this flag set when
      preparing to be submitted to HW indicating that there is an
      additional TRB chained to the request for alignment purposes.
      If that request is removed before the controller can execute the
      transfer (e.g. ep_dequeue/ep_disable), the request will not go
      through the dwc3_gadget_ep_cleanup_completed_request() handler
      and will not have its needs_extra_trb flag cleared when
      dwc3_gadget_giveback() is called.  This same request could be
      later requeued for a new transfer that does not require an
      extra TRB and if it is successfully completed, the cleanup
      and TRB reclamation will incorrectly process the additional TRB
      which belongs to the next request, and incorrectly advances the
      TRB dequeue pointer, thereby messing up calculation of the next
      request's actual/remaining count when it completes.
      
      The right thing to do here is to ensure that the flag is cleared
      before it is given back to the function driver.  A good place
      to do that is in dwc3_gadget_del_and_unmap_request().
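
The stale-flag problem can be reduced to a few lines. This is a simplified model, not the dwc3 code: `struct request`, its fields and both helpers are hypothetical stand-ins. The point it shows is that resetting per-transfer state in the one teardown path every request passes through keeps a requeued request from inheriting a flag from its previous life.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a gadget request. */
struct request {
    bool needs_extra_trb;  /* an extra TRB was chained for alignment */
    bool queued;           /* request is currently on an endpoint list */
};

/* Common teardown path: runs for completed and for dequeued requests. */
static void del_and_unmap_request(struct request *req)
{
    req->queued = false;
    req->needs_extra_trb = false;  /* reset per-transfer state here */
}

/* Number of TRBs the completion handler would reclaim for this request. */
static unsigned int trbs_to_reclaim(const struct request *req)
{
    return req->needs_extra_trb ? 2u : 1u;
}
```

If the flag survived a dequeue, a later aligned transfer using the same request would reclaim two TRBs instead of one and consume a TRB that belongs to the next request.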
      
      Fixes: c6267a51 ("usb: dwc3: gadget: align transfers to wMaxPacketSize")
      Cc: stable@vger.kernel.org
      Signed-off-by: Jack Pham <jackp@codeaurora.org>
      Signed-off-by: Felipe Balbi <felipe.balbi@linux.intel.com>
      [jackp: backport to <= 4.20: replaced 'needs_extra_trb' with 'unaligned'
              and 'zero' members in patch and reworded commit text]
      Signed-off-by: Jack Pham <jackp@codeaurora.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • Revert "mm, memory_hotplug: initialize struct pages for the full memory section" · 9b2e022c
      Michal Hocko authored
      commit 4aa9fc2a upstream.
      
      This reverts commit 2830bf6f.
      
      The underlying assumption that one sparse section belongs to a single
      NUMA node does not really hold. Robert Shteynfeld has reported a boot
      failure. The boot log was not captured, but his memory layout is as
      follows:
      
        Early memory node ranges
          node   1: [mem 0x0000000000001000-0x0000000000090fff]
          node   1: [mem 0x0000000000100000-0x00000000dbdf8fff]
          node   1: [mem 0x0000000100000000-0x0000001423ffffff]
          node   0: [mem 0x0000001424000000-0x0000002023ffffff]
      
      This means that node0 starts in the middle of a memory section which is
      also in node1.  memmap_init_zone tries to initialize padding of a
      section even when it is outside of the given pfn range because there are
      code paths (e.g.  memory hotplug) which assume that the full worth of
      memory section is always initialized.
      
      In this particular case, though, such a range is already initialized and
      most likely already managed by the page allocator.  Scribbling over
      those pages corrupts the internal state and likely blows up when any of
      those pages gets used.
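
The overlap can be checked with a little address arithmetic. The sketch assumes 128 MiB memory sections (SECTION_SIZE_BITS of 27, as on x86-64); that value and the two helper names are assumptions made for this illustration, since the commit text does not state the section size.

```c
#include <stdint.h>

/* Assumption: 128 MiB sparse memory sections, as on x86-64. */
#define SECTION_SIZE_BITS 27
#define SECTION_SIZE      (UINT64_C(1) << SECTION_SIZE_BITS)

/* First physical address of the section containing addr. */
static uint64_t section_start(uint64_t addr)
{
    return addr & ~(SECTION_SIZE - 1);
}

/* Byte offset of addr inside its section. */
static uint64_t section_offset(uint64_t addr)
{
    return addr & (SECTION_SIZE - 1);
}
```

Under that assumption, node 0's first address 0x1424000000 lies 0x4000000 bytes (64 MiB) into the section starting at 0x1420000000, and node 1's last address 0x1423ffffff falls in the same section, so initializing the whole section on behalf of node 0 touches pages that node 1 already owns.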

      Reported-by: Robert Shteynfeld <robert.shteynfeld@gmail.com>
      Fixes: 2830bf6f ("mm, memory_hotplug: initialize struct pages for the full memory section")
      Cc: stable@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • vmbus: fix subchannel removal · 601cdaed
      Dexuan Cui authored
      [ Upstream commit b5679ceb ]
      
      The changes to split ring allocation from open/close broke
      the cleanup of subchannels. This resulted in problems using
      uio on network devices because the subchannel was left behind
      when the network device was unbound.
      
      The cause was in the disconnect logic which used list splice
      to move the subchannel list into a local variable. This won't
      work because the subchannel list is needed later during the
      process of the rescind messages (relid2channel).
      
      The fix is to just leave the subchannel list in place
      which is what the original code did. The list is cleaned
      up later when the host rescind is processed.
      
      Without the fix, we have a lot of "hang" issues in netvsc when we
      try to change the NIC's MTU, set the number of channels, etc.
      
      Fixes: ae6935ed ("vmbus: split ring buffer allocation from open")
      Cc: stable@vger.kernel.org
      Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • Drivers: hv: vmbus: Remove the useless API vmbus_get_outgoing_channel() · e4266dff
      Dexuan Cui authored
      [ Upstream commit 4d3c5c69 ]
      
      Commit d86adf48 ("scsi: storvsc: Enable multi-queue support") removed
      the usage of the API in Jan 2017, and the API is not used since then.
      
      netvsc and storvsc have their own algorithms to determine the outgoing
      channel, so this API is useless.
      
      And the API is potentially unsafe, because it reads primary->num_sc without
      any lock held. This can be risky considering the RESCIND-OFFER message.
      
      Let's remove the API.
      
      Cc: Long Li <longli@microsoft.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: K. Y. Srinivasan <kys@microsoft.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Dexuan Cui <decui@microsoft.com>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: fix inner map masking to prevent oob under speculation · a79c57d1
      Daniel Borkmann authored
      [ commit 9d5564dd upstream ]
      
      During review I noticed that inner meta map setup for map in
      map is buggy in that it does not propagate all needed data
      from the reference map which the verifier is later accessing.
      
      In particular one such case is index masking to prevent out of
      bounds access under speculative execution due to missing the
      map's unpriv_array/index_mask field propagation. Fix this such
      that the verifier is generating the correct code for inlined
      lookups in case of unprivileged use.
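
Index masking means that the array index is ANDed with a power-of-two-minus-one mask after the bounds check, so that even a mispredicted check cannot steer a speculative load far outside the array. The following is a minimal userspace sketch of that idea; `index_mask_for()` and `array_lookup()` are hypothetical names, and a real implementation must also make sure the backing storage covers the whole masked window.

```c
#include <stdint.h>

/* Smallest mask of the form 2^k - 1 that covers 0 .. max_entries - 1. */
static uint32_t index_mask_for(uint32_t max_entries)
{
    uint32_t mask = 0;

    while (mask < max_entries - 1)
        mask = (mask << 1) | 1;
    return mask;
}

/*
 * Bounds-checked lookup with index masking.
 * Returns 0 and stores the element in *out, or -1 if index is out of range.
 */
static int array_lookup(const uint64_t *values, uint32_t max_entries,
                        uint32_t index_mask, uint32_t index, uint64_t *out)
{
    if (index >= max_entries)
        return -1;
    /* The AND is applied even on the architecturally in-bounds path. */
    *out = values[index & index_mask];
    return 0;
}
```

For a single-entry map the mask is 0, which corresponds to the `r0 &= 0` instruction that follows the `r0 >= 0x1` check in the dumps of this commit.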
      
      Before patch (test_verifier's 'map in map access' dump):
      
        # bpftool prog dump xla id 3
           0: (62) *(u32 *)(r10 -4) = 0
           1: (bf) r2 = r10
           2: (07) r2 += -4
           3: (18) r1 = map[id:4]
           5: (07) r1 += 272                |
           6: (61) r0 = *(u32 *)(r2 +0)     |
           7: (35) if r0 >= 0x1 goto pc+6   | Inlined map in map lookup
           8: (54) (u32) r0 &= (u32) 0      | with index masking for
           9: (67) r0 <<= 3                 | map->unpriv_array.
          10: (0f) r0 += r1                 |
          11: (79) r0 = *(u64 *)(r0 +0)     |
          12: (15) if r0 == 0x0 goto pc+1   |
          13: (05) goto pc+1                |
          14: (b7) r0 = 0                   |
          15: (15) if r0 == 0x0 goto pc+11
          16: (62) *(u32 *)(r10 -4) = 0
          17: (bf) r2 = r10
          18: (07) r2 += -4
          19: (bf) r1 = r0
          20: (07) r1 += 272                |
          21: (61) r0 = *(u32 *)(r2 +0)     | Index masking missing (!)
          22: (35) if r0 >= 0x1 goto pc+3   | for inner map despite
          23: (67) r0 <<= 3                 | map->unpriv_array set.
          24: (0f) r0 += r1                 |
          25: (05) goto pc+1                |
          26: (b7) r0 = 0                   |
          27: (b7) r0 = 0
          28: (95) exit
      
      After patch:
      
        # bpftool prog dump xla id 1
           0: (62) *(u32 *)(r10 -4) = 0
           1: (bf) r2 = r10
           2: (07) r2 += -4
           3: (18) r1 = map[id:2]
           5: (07) r1 += 272                |
           6: (61) r0 = *(u32 *)(r2 +0)     |
           7: (35) if r0 >= 0x1 goto pc+6   | Same inlined map in map lookup
           8: (54) (u32) r0 &= (u32) 0      | with index masking due to
           9: (67) r0 <<= 3                 | map->unpriv_array.
          10: (0f) r0 += r1                 |
          11: (79) r0 = *(u64 *)(r0 +0)     |
          12: (15) if r0 == 0x0 goto pc+1   |
          13: (05) goto pc+1                |
          14: (b7) r0 = 0                   |
          15: (15) if r0 == 0x0 goto pc+12
          16: (62) *(u32 *)(r10 -4) = 0
          17: (bf) r2 = r10
          18: (07) r2 += -4
          19: (bf) r1 = r0
          20: (07) r1 += 272                |
          21: (61) r0 = *(u32 *)(r2 +0)     |
          22: (35) if r0 >= 0x1 goto pc+4   | Now fixed inlined inner map
          23: (54) (u32) r0 &= (u32) 0      | lookup with proper index masking
          24: (67) r0 <<= 3                 | for map->unpriv_array.
          25: (0f) r0 += r1                 |
          26: (05) goto pc+1                |
          27: (b7) r0 = 0                   |
          28: (b7) r0 = 0
          29: (95) exit
      
      Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Martin KaFai Lau <kafai@fb.com>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: fix sanitation of alu op with pointer / scalar type from different paths · 4bce22c3
      Daniel Borkmann authored
      [ commit d3bd7413 upstream ]
      
      While 979d63d5 ("bpf: prevent out of bounds speculation on pointer
      arithmetic") took care of rejecting alu op on pointer when e.g. pointer
      came from two different map values with different map properties such as
      value size, Jann reported that a case was not covered yet when a given
      alu op is used in both "ptr_reg += reg" and "numeric_reg += reg" from
      different branches where we would incorrectly try to sanitize based
      on the pointer's limit. Catch this corner case and reject the program
      instead.
      
      Fixes: 979d63d5 ("bpf: prevent out of bounds speculation on pointer arithmetic")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • bpf: prevent out of bounds speculation on pointer arithmetic · 078da99d
      Daniel Borkmann authored
      [ commit 979d63d5 upstream ]
      
      Jann reported that the original commit back in b2157399
      ("bpf: prevent out-of-bounds speculation") was not sufficient
      to stop CPU from speculating out of bounds memory access:
      While b2157399 only focussed on masking array map access
      for unprivileged users for tail calls and data access such
      that the user provided index gets sanitized from BPF program
      and syscall side, there is still a more generic form affected
      from BPF programs that applies to most maps that hold user
      data in relation to dynamic map access when dealing with
      unknown scalars or "slow" known scalars as access offset, for
      example:
      
        - Load a map value pointer into R6
        - Load an index into R7
        - Do a slow computation (e.g. with a memory dependency) that
          loads a limit into R8 (e.g. load the limit from a map for
          high latency, then mask it to make the verifier happy)
        - Exit if R7 >= R8 (mispredicted branch)
        - Load R0 = R6[R7]
        - Load R0 = R6[R0]
      
      For unknown scalars there are two options in the BPF verifier
      where we could derive knowledge from in order to guarantee
      safe access to the memory: i) While </>/<=/>= variants won't
      allow to derive any lower or upper bounds from the unknown
      scalar where it would be safe to add it to the map value
      pointer, it is possible through ==/!= test however. ii) another
      option is to transform the unknown scalar into a known scalar,
      for example, through ALU ops combination such as R &= <imm>
      followed by R |= <imm> or any similar combination where the
      original information from the unknown scalar would be destroyed
      entirely leaving R with a constant. The initial slow load still
      precedes the latter ALU ops on that register, so the CPU
      executes speculatively from that point. Once we have the known
      scalar, any compare operation would work then. A third option
      only involving registers with known scalars could be crafted
      as described in [0] where a CPU port (e.g. Slow Int unit)
      would be filled with many dependent computations such that
      the subsequent condition depending on its outcome has to wait
      for evaluation on its execution port and thereby executing
      speculatively if the speculated code can be scheduled on a
      different execution port, or any other form of mistraining
      as described in [1], for example. Given this is not limited
      to only unknown scalars, not only map but also stack access
      is affected since both are accessible to unprivileged users
      and could potentially be used for out of bounds access under
      speculation.
      
      In order to prevent any of these cases, the verifier is now
      sanitizing pointer arithmetic on the offset such that any
      out of bounds speculation would be masked in a way where the
      pointer arithmetic result in the destination register will
      stay unchanged, meaning offset masked into zero similar as
      in array_index_nospec() case. With regards to implementation,
      there are three options that were considered: i) new insn
      for sanitation, ii) push/pop insn and sanitation as inlined
      BPF, iii) reuse of ax register and sanitation as inlined BPF.
      
      Option i) has the downside that we end up using from reserved
      bits in the opcode space, but also that we would require
      each JIT to emit masking as native arch opcodes meaning
      mitigation would have slow adoption till everyone implements
      it eventually which is counter-productive. Option ii) and iii)
      have both in common that a temporary register is needed in
      order to implement the sanitation as inlined BPF since we
      are not allowed to modify the source register. While a push /
      pop insn in ii) would be useful to have in any case, it
      requires once again that every JIT needs to implement it
      first. While possible, amount of changes needed would also
      be unsuitable for a -stable patch. Therefore, the path which
      has fewer changes, fewer BPF instructions for the mitigation
      and does not require anything to be changed in the JITs is
      option iii) which this work is pursuing. The ax register is
      already mapped to a register in all JITs (modulo arm32 where
      it's mapped to stack as various other BPF registers there)
      and used in constant blinding for JITs-only so far. It can
      be reused for verifier rewrites under certain constraints.
      The interpreter's tmp "register" has therefore been remapped
      into extending the register set with hidden ax register and
      reusing that for a number of instructions that needed the
      prior temporary variable internally (e.g. div, mod). This
      allows for zero increase in stack space usage in the interpreter,
      and enables (restricted) generic use in rewrites otherwise as
      long as such a patchlet does not make use of these instructions.
      The sanitation mask is dynamic and relative to the offset the
      map value or stack pointer currently holds.
      
      There are various cases that need to be taken under consideration
      for the masking, e.g. such operation could look as follows:
      ptr += val or val += ptr or ptr -= val. Thus, the value to be
      sanitized could reside either in source or in destination
      register, and the limit is different depending on whether
      the ALU op is addition or subtraction and depending on the
      current known and bounded offset. The limit is derived as
      follows: limit := max_value_size - (smin_value + off). For
      subtraction: limit := umax_value + off. This holds because
      we do not allow any pointer arithmetic that would
      temporarily go out of bounds or would have an unknown
      value with mixed signed bounds where it is unclear at
      verification time whether the actual runtime value would
      be either negative or positive. For example, we have a
      derived map pointer value with constant offset and bounded
      one, so limit based on smin_value works because the verifier
      requires that statically analyzed arithmetic on the pointer
      must be in bounds, and thus it checks if resulting
      smin_value + off and umax_value + off is still within map
      value bounds at time of arithmetic in addition to time of
      access. Similarly, for the case of stack access we derive
      the limit as follows: MAX_BPF_STACK + off for subtraction
      and -off for the case of addition where off := ptr_reg->off +
      ptr_reg->var_off.value. Subtraction is a special case for
      the masking which can be in form of ptr += -val, ptr -= -val,
      or ptr -= val. In the first two cases where we know that
      the value is negative, we need to temporarily negate the
      value in order to do the sanitation on a positive value
      where we later swap the ALU op, and restore original source
      register if the value was in source.
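
The limit formulas in the paragraph above can be written down directly. The sketch below is a simplified model: `struct ptr_state` and both helper functions are hypothetical, and only the four formulas themselves and the MAX_BPF_STACK value of 512 are taken as given.

```c
#include <stdint.h>

#define MAX_BPF_STACK 512

/* Hypothetical, reduced view of what is known about a pointer register. */
struct ptr_state {
    int64_t  off;         /* constant offset already applied      */
    int64_t  smin_value;  /* signed minimum of the variable part  */
    uint64_t umax_value;  /* unsigned maximum of the variable part */
};

/* Map value pointer: limit for ptr += val (is_sub == 0) or ptr -= val. */
static int64_t map_value_limit(const struct ptr_state *p,
                               uint32_t max_value_size, int is_sub)
{
    if (is_sub)
        return (int64_t)p->umax_value + p->off;
    return (int64_t)max_value_size - (p->smin_value + p->off);
}

/* Stack pointer: off is ptr_reg->off + ptr_reg->var_off.value (<= 0). */
static int64_t stack_limit(int64_t off, int is_sub)
{
    return is_sub ? MAX_BPF_STACK + off : -off;
}
```

As an illustrative case with made-up numbers, a pointer into a 64-byte map value with off = 8 and smin_value = 4 may still advance by at most 52 bytes, and a stack pointer at offset -16 may advance by 16 bytes or move back by 496.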
      
      The sanitation of pointer arithmetic alone is still not fully
      sufficient as is, since a scenario like the following could
      happen ...
      
        PTR += 0x1000 (e.g. K-based imm)
        PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON
        PTR += 0x1000
        PTR -= BIG_NUMBER_WITH_SLOW_COMPARISON
        [...]
      
      ... which under speculation could end up as ...
      
        PTR += 0x1000
        PTR -= 0 [ truncated by mitigation ]
        PTR += 0x1000
        PTR -= 0 [ truncated by mitigation ]
        [...]
      
      ... and therefore still access out of bounds. To prevent such
      case, the verifier is also analyzing safety for potential out
      of bounds access under speculative execution. Meaning, it is
      also simulating pointer access under truncation. We therefore
      "branch off" and push the current verification state after the
      ALU operation with known 0 to the verification stack for later
      analysis. Given the current path analysis succeeded it is
      likely that the one under speculation can be pruned. In any
      case, it is also subject to existing complexity limits and
      therefore anything beyond this point will be rejected. In
      terms of pruning, it needs to be ensured that the verification
      state from speculative execution simulation must never prune
      a non-speculative execution path, therefore, we mark verifier
      state accordingly at the time of push_stack(). If verifier
      detects out of bounds access under speculative execution from
      one of the possible paths that includes a truncation, it will
      reject such program.
      
      Given we mask every reg-based pointer arithmetic for
      unprivileged programs, we've been looking into how it could
      affect real-world programs in terms of size increase. As the
      majority of programs are targeted for privileged-only use
      case, we've unconditionally enabled masking (with its alu
      restrictions on top of it) for privileged programs for the
      sake of testing in order to check i) whether they get rejected
      in its current form, and ii) by how much the number of
      instructions and size will increase. We've tested this by
      using Katran, Cilium and test_l4lb from the kernel selftests.
      For Katran we've evaluated balancer_kern.o, Cilium bpf_lxc.o
      and an older test object bpf_lxc_opt_-DUNKNOWN.o and l4lb
      we've used test_l4lb.o as well as test_l4lb_noinline.o. We
      found that none of the programs got rejected by the verifier
      with this change, and that impact is rather minimal to none.
      balancer_kern.o had 13,904 bytes (1,738 insns) xlated and
      7,797 bytes JITed before and after the change. Most complex
      program in bpf_lxc.o had 30,544 bytes (3,817 insns) xlated
      and 18,538 bytes JITed before and after and none of the other
      tail call programs in bpf_lxc.o had any changes either. For
      the older bpf_lxc_opt_-DUNKNOWN.o object we found a small
      increase from 20,616 bytes (2,576 insns) and 12,536 bytes JITed
      before to 20,664 bytes (2,582 insns) and 12,558 bytes JITed
      after the change. Other programs from that object file had
      similar small increase. Both test_l4lb.o had no change and
      remained at 6,544 bytes (817 insns) xlated and 3,401 bytes
      JITed and for test_l4lb_noinline.o constant at 5,080 bytes
      (634 insns) xlated and 3,313 bytes JITed. This can be explained
      in that LLVM typically optimizes stack based pointer arithmetic
      by using K-based operations and that use of dynamic map access
      is not overly frequent. However, in future we may decide to
      optimize the algorithm further under known guarantees from
      branch and value speculation. Latter seems also unclear in
      terms of prediction heuristics that today's CPUs apply as well
      as whether there could be collisions in e.g. the predictor's
      Value History/Pattern Table for triggering out of bounds access,
      thus masking is performed unconditionally at this point but could
      be subject to relaxation later on. We were generally also
      brainstorming various other approaches for mitigation, but the
      blocker was always lack of available registers at runtime and/or
      overhead for runtime tracking of limits belonging to a specific
      pointer. Thus, we found this to be minimally intrusive under
      given constraints.
      
      With that in place, a simple example with sanitized access on
      unprivileged load at post-verification time looks as follows:
      
        # bpftool prog dump xlated id 282
        [...]
        28: (79) r1 = *(u64 *)(r7 +0)
        29: (79) r2 = *(u64 *)(r7 +8)
        30: (57) r1 &= 15
        31: (79) r3 = *(u64 *)(r0 +4608)
        32: (57) r3 &= 1
        33: (47) r3 |= 1
        34: (2d) if r2 > r3 goto pc+19
        35: (b4) (u32) r11 = (u32) 20479  |
        36: (1f) r11 -= r2                | Dynamic sanitation for pointer
        37: (4f) r11 |= r2                | arithmetic with registers
        38: (87) r11 = -r11               | containing bounded or known
        39: (c7) r11 s>>= 63              | scalars in order to prevent
        40: (5f) r11 &= r2                | out of bounds speculation.
        41: (0f) r4 += r11                |
        42: (71) r4 = *(u8 *)(r4 +0)
        43: (6f) r4 <<= r1
        [...]
      
      For the case where the scalar sits in the destination register
      as opposed to the source register, the following code is emitted
      for the above example:
      
        [...]
        16: (b4) (u32) r11 = (u32) 20479
        17: (1f) r11 -= r2
        18: (4f) r11 |= r2
        19: (87) r11 = -r11
        20: (c7) r11 s>>= 63
        21: (5f) r2 &= r11
        22: (0f) r2 += r0
        23: (61) r0 = *(u32 *)(r2 +0)
        [...]
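
The masking sequence in the two dumps above (load the limit into r11, subtract, OR, negate, arithmetic shift by 63, AND) can be emulated in plain C to see what it computes. This is a sketch for illustration, with `sanitize_offset()` as a hypothetical name; the arithmetic shift is expressed through unsigned operations so the result does not depend on implementation-defined behaviour.

```c
#include <stdint.h>

/*
 * Branch-free emulation of the emitted sequence.
 * Returns off when 0 <= off <= limit, and 0 otherwise.
 */
static uint64_t sanitize_offset(uint64_t off, uint32_t limit)
{
    uint64_t ax = limit;   /* r11 = limit (32-bit move, zero-extended)  */

    ax -= off;             /* r11 -= off: sign bit set if off > limit   */
    ax |= off;             /* r11 |= off: sign bit set if off negative  */
    ax = 0 - ax;           /* r11 = -r11: flips the sign of a nonzero   */
    ax = 0 - (ax >> 63);   /* r11 s>>= 63: all ones or all zeroes       */
    return ax & off;       /* r11 &= off: keep off or force it to zero  */
}
```

With the limit 20479 from the dumps, an offset of 100 passes through unchanged, while 20480 or any negative offset is replaced by 0, which leaves the pointer where it was.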
      
      JIT blinding example with non-conflicting use of r10:
      
        [...]
         d5:	je     0x0000000000000106    _
         d7:	mov    0x0(%rax),%edi       |
         da:	mov    $0xf153246,%r10d     | Index load from map value and
         e0:	xor    $0xf153259,%r10      | (const blinded) mask with 0x1f.
         e7:	and    %r10,%rdi            |_
         ea:	mov    $0x2f,%r10d          |
         f0:	sub    %rdi,%r10            | Sanitized addition. Both use r10
         f3:	or     %rdi,%r10            | but do not interfere with each
         f6:	neg    %r10                 | other. (Neither do these instructions
         f9:	sar    $0x3f,%r10           | interfere with the use of ax as temp
         fd:	and    %r10,%rdi            | in interpreter.)
        100:	add    %rax,%rdi            |_
        103:	mov    0x0(%rdi),%eax
       [...]
      
      Tested that it fixes Jann's reproducer, and also checked that test_verifier
      and test_progs suite with interpreter, JIT and JIT with hardening enabled
      on x86-64 and arm64 runs successfully.
      
        [0] Speculose: Analyzing the Security Implications of Speculative
            Execution in CPUs, Giorgi Maisuradze and Christian Rossow,
            https://arxiv.org/pdf/1801.04084.pdf
      
        [1] A Systematic Evaluation of Transient Execution Attacks and
            Defenses, Claudio Canella, Jo Van Bulck, Michael Schwarz,
            Moritz Lipp, Benjamin von Berg, Philipp Ortner, Frank Piessens,
            Dmitry Evtyushkin, Daniel Gruss,
            https://arxiv.org/pdf/1811.05441.pdf
      
      Fixes: b2157399 ("bpf: prevent out-of-bounds speculation")
      Reported-by: Jann Horn <jannh@google.com>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      078da99d
    • Daniel Borkmann's avatar
      bpf: fix check_map_access smin_value test when pointer contains offset · ae474b62
      Daniel Borkmann authored
      [ commit b7137c4e upstream ]
      
      In check_map_access() we probe actual bounds through __check_map_access()
      with offset of reg->smin_value + off for lower bound and offset of
      reg->umax_value + off for the upper bound. However, even though the
      reg->smin_value could have a negative value, the final result of the
      sum with off could be positive when pointer arithmetic with known and
      unknown scalars is combined. In this case we reject the program with
      an error such as "R<x> min value is negative, either use unsigned index
      or do a if (index >=0) check." even though the access itself would be
      fine. Therefore extend the check to probe whether the actual resulting
      reg->smin_value + off is less than zero.
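      
      A simplified sketch of the lower-bound decision before and after this
      change, for illustration only. The function names are hypothetical and
      the overflow handling is an assumption of the sketch; the real check
      lives in check_map_access():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Behaviour described above as too strict: any negative smin_value rejects. */
static bool rejected_before(int64_t smin_value, int32_t off)
{
    (void)off;
    return smin_value < 0;
}

/* Extended check: reject only if smin_value + off is still negative. */
static bool rejected_after(int64_t smin_value, int32_t off)
{
    if (smin_value >= 0)
        return false;
    if (smin_value == INT64_MIN || off < 0)
        return true;    /* the sum is negative; this also avoids overflow */
    return smin_value + off < 0;
}
```

      With smin_value = -4 and off = 8 the old check rejects the program
      although the lowest reachable offset is 4; the extended check accepts it.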
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ae474b62
    • Daniel Borkmann's avatar
      bpf: restrict unknown scalars of mixed signed bounds for unprivileged · ab474ba0
      Daniel Borkmann authored
      [ commit 9d7eceed upstream ]
      
      For unknown scalars of mixed signed bounds, meaning their smin_value is
      negative and their smax_value is positive, we need to reject arithmetic
      with pointer to map value. For unprivileged the goal is to mask every
      map pointer arithmetic and this cannot reliably be done when it is
      unknown at verification time whether the scalar value is negative or
      positive. Given this is a corner case, the likelihood of breaking should
      be very small.
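      
      The condition can be stated as a small predicate. This is a sketch with
      a hypothetical function name; treating a zero smax_value as
      "non-negative" is an assumption of the sketch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the scalar's signed range straddles zero. */
static bool has_mixed_signed_bounds(int64_t smin_value, int64_t smax_value)
{
    assert(smin_value <= smax_value);   /* bounds must be ordered */
    return smin_value < 0 && smax_value >= 0;
}
```

      A scalar bounded by [-1, 1] is rejected for map value pointer
      arithmetic, while [0, 5] and [-5, -1] have a known sign and can be
      masked.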
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      ab474ba0
    • Daniel Borkmann's avatar
      bpf: restrict stack pointer arithmetic for unprivileged · 2508a491
      Daniel Borkmann authored
      [ commit e4298d25 upstream ]
      
      Restrict stack pointer arithmetic for unprivileged users in that
      arithmetic itself must not go out of bounds as opposed to the actual
      access later on. Therefore after each adjust_ptr_min_max_vals() with
      a stack pointer as a destination we simulate a check_stack_access()
      of 1 byte on the destination and once that fails the program is
      rejected for unprivileged program loads. This is analogous to map
      value pointer arithmetic and needed for masking later on.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      2508a491
    • Daniel Borkmann's avatar
      bpf: restrict map value pointer arithmetic for unprivileged · 4ec2af91
      Daniel Borkmann authored
      [ commit 0d6303db upstream ]
      
      Restrict map value pointer arithmetic for unprivileged users in that
      arithmetic itself must not go out of bounds as opposed to the actual
      access later on. Therefore after each adjust_ptr_min_max_vals() with a
      map value pointer as a destination it will simulate a check_map_access()
      of 1 byte on the destination and once that fails the program is rejected
      for unprivileged program loads. We use this later on for masking any
      pointer arithmetic with the remainder of the map value space. The
      likelihood of breaking any existing real-world unprivileged eBPF
      program is very small for this corner case.
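      
      As a rough model of the simulated 1 byte access: the pointer's possible
      offsets into the map value are reduced here to a [min_off, max_off]
      range, and all names are hypothetical rather than verifier code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A 1 byte access at off is valid if it lies inside the map value. */
static bool one_byte_access_ok(int64_t off, uint32_t value_size)
{
    return off >= 0 && off < (int64_t)value_size;
}

/* The arithmetic is allowed only if both ends of the range stay in bounds. */
static bool ptr_arith_allowed(int64_t min_off, int64_t max_off,
                              uint32_t value_size)
{
    assert(min_off <= max_off);
    return one_byte_access_ok(min_off, value_size) &&
           one_byte_access_ok(max_off, value_size);
}
```

      Under this model, even a pointer one past the end of a 48 byte value is
      rejected at arithmetic time, before any load or store is attempted.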
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      4ec2af91
    • Daniel Borkmann's avatar
      bpf: enable access to ax register also from verifier rewrite · 74d3c044
      Daniel Borkmann authored
      [ commit 9b73bfdd upstream ]
      
      Right now we are using the BPF ax register in JITs for constant blinding
      as well as in the interpreter as a temporary variable. The verifier is
      not able to use it simply because its use would get overridden by the
      former in bpf_jit_blind_insn(). However, it can be made to work in that
      blinding is skipped if there is prior use of ax in either the source or
      destination register of the instruction. Taking these constraints of ax
      into account, the verifier is then free to use it in rewrites. Note, the
      ax register already has mappings in every eBPF JIT.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      74d3c044
    • Raju Rangoju's avatar
      nvmet-rdma: fix null dereference under heavy load · bab8056a
      Raju Rangoju authored
      commit 5cbab630 upstream.
      
      Under heavy load, if we don't have any pre-allocated rsps left, we
      dynamically allocate a rsp, but we are not actually allocating memory
      for nvme_completion (rsp->req.rsp). In such a case, accessing pointer
      fields (req->rsp->status) in nvmet_req_init() will result in a crash.
      
      To fix this, allocate the memory for nvme_completion by calling
      nvmet_rdma_alloc_rsp().
      
      Fixes: 8407879c ("nvmet-rdma: fix possible bogus dereference under heavy load")
      
      Cc: <stable@vger.kernel.org>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Reviewed-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Raju Rangoju <rajur@chelsio.com>
      Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      bab8056a
    • Israel's avatar
      nvmet-rdma: Add unlikely for response allocated check · 8134ef8f
      Israel authored
      commit ad1f8249 upstream.
      Signed-off-by: Israel Rukshin <israelr@mellanox.com>
      Reviewed-by: Sagi Grimberg <sagi@grimberg.me>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      Cc: Raju Rangoju <rajur@chelsio.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8134ef8f
    • Daniel Borkmann's avatar
      bpf: move tmp variable into ax register in interpreter · 433303ac
      Daniel Borkmann authored
      [ commit 144cd91c upstream ]
      
      This change moves the on-stack 64 bit tmp variable in ___bpf_prog_run()
      into the hidden ax register. The latter is currently only used in JITs
      for constant blinding as a temporary scratch register, meaning the BPF
      interpreter will never see the use of ax. Therefore it is safe to use
      it for the cases where tmp has been used earlier. This is needed to later
      on allow restricted hidden use of ax in both interpreter and JITs.
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      433303ac
    • Daniel Borkmann's avatar
      bpf: move {prev_,}insn_idx into verifier env · 629b8af1
      Daniel Borkmann authored
      [ commit c08435ec upstream ]
      
      Move prev_insn_idx and insn_idx from the do_check() function into
      the verifier environment, so they can be read inside the various
      helper functions for handling the instructions. It's easier to put
      this into the environment rather than changing all call-sites only
      to pass it along. insn_idx is useful in particular since this later
      on allows holding state in env->insn_aux_data[env->insn_idx].
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Acked-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Alexei Starovoitov <ast@kernel.org>
      Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
      629b8af1
    • Neil Armstrong's avatar
      drm/meson: Fix atomic mode switching regression · ecb4e06b
      Neil Armstrong authored
      commit ce0210c1 upstream.
      
      Since commit 2bcd3eca when switching mode from X11 (ubuntu mate for
      example) the display gets blurry, looking like an invalid framebuffer width.
      
      That commit fixed atomic crtc modesetting in a totally wrong way and
      introduced a local unnecessary ->enabled crtc state.
      
      This commit reverts the crtc _begin() and _enable() changes and simply
      adds drm_atomic_helper_commit_tail_rpm as helper.
      Reported-by: Tony McKahan <tonymckahan@gmail.com>
      Suggested-by: Daniel Vetter <daniel@ffwll.ch>
      Fixes: 2bcd3eca ("drm/meson: Fixes for drm_crtc_vblank_on/off support")
      Signed-off-by: Neil Armstrong <narmstrong@baylibre.com>
      Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
      [narmstrong: fixed blank line issue from checkpatch]
      Link: https://patchwork.freedesktop.org/patch/msgid/20190114153118.8024-1-narmstrong@baylibre.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      ecb4e06b
    • Nicolas Pitre's avatar
      vt: invoke notifier on screen size change · f85be7e0
      Nicolas Pitre authored
      commit 0c9b1965 upstream.
      
      User space programs using poll() on /dev/vcs devices are not woken up
      when a screen size change occurs. Let's fix that.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      f85be7e0
    • Nicolas Pitre's avatar
      vt: always call notifier with the console lock held · 641b7da5
      Nicolas Pitre authored
      commit 7e1d2263 upstream.
      
      Every invocation of notify_write() and notify_update() is performed
      under the console lock, except for one case. Let's fix that.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      641b7da5
    • Nicolas Pitre's avatar
      vt: make vt_console_print() compatible with the unicode screen buffer · 082ea9f6
      Nicolas Pitre authored
      commit 6609cff6 upstream.
      
      When kernel messages are printed to the console, they appear blank on
      the unicode screen. This is because vt_console_print() is lacking a call
      to vc_uniscr_putc(). However, the latter function assumes vc->vc_x is
      always up to date when called, which is not the case here as
      vt_console_print() uses it to mark the beginning of the display update.
      
      This patch reworks (and simplifies) vt_console_print() so that vc->vc_x
      is always valid and keeps the start of display update in a local variable
      instead, which finally allows for adding the missing vc_uniscr_putc()
      call.
      Signed-off-by: Nicolas Pitre <nico@linaro.org>
      Cc: stable@vger.kernel.org # v4.19+
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      082ea9f6
    • Uwe Kleine-König's avatar
      can: flexcan: fix NULL pointer exception during bringup · 44486b29
      Uwe Kleine-König authored
      commit a55234da upstream.
      
      Commit cbffaf7a ("can: flexcan: Always use last mailbox for TX")
      introduced a loop letting i run up to (and including) ARRAY_SIZE(regs->mb)
      and in the body accessed regs->mb[i], which is an out-of-bounds array
      access that then resulted in an access to a reserved register area.
      
      Later this was changed by commit 0517961c ("can: flexcan: Add
      provision for variable payload size") to iterate a bit differently, but
      it still runs one iteration too many, resulting in a call to
      
      	flexcan_get_mb(priv, priv->mb_count)
      
      which results in a WARN_ON and then a NULL pointer exception. This
      only affects devices compatible with "fsl,p1010-flexcan",
      "fsl,imx53-flexcan", "fsl,imx35-flexcan", "fsl,imx25-flexcan",
      "fsl,imx28-flexcan", so newer i.MX SoCs are not affected.
      
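      The off-by-one pattern described above can be reproduced in isolation.
      This is a stand-alone sketch: the array mb[], its size, and the function
      name are hypothetical and unrelated to the actual flexcan register
      layout:

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-in for the mailbox array; the size is arbitrary. */
static int mb[8];

/* Counts how many loop iterations would index past the end of mb[]. */
static size_t out_of_bounds_visits(int inclusive)
{
    size_t end = inclusive ? ARRAY_SIZE(mb) + 1 : ARRAY_SIZE(mb);
    size_t i, bad = 0;

    for (i = 0; i < end; i++) {
        if (i >= ARRAY_SIZE(mb))
            bad++;        /* mb[i] would be out of bounds here */
        else
            mb[i] = 0;    /* in-bounds access is fine */
    }
    return bad;
}
```

      An inclusive upper bound yields exactly one bad iteration, matching the
      single extra flexcan_get_mb() call described above.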
      Fixes: cbffaf7a ("can: flexcan: Always use last mailbox for TX")
      Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
      Cc: linux-stable <stable@vger.kernel.org> # >= 4.20
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      
      44486b29
    • Oliver Hartkopp's avatar
      can: bcm: check timer values before ktime conversion · 5305c33f
      Oliver Hartkopp authored
      commit 93171ba6 upstream.
      
      Kyungtae Kim detected a potential integer overflow in bcm_[rx|tx]_setup()
      when the conversion into ktime multiplies the given value with NSEC_PER_USEC
      (1000).
      
      Reference: https://marc.info/?l=linux-can&m=154732118819828&w=2
      
      Add a check for the given tv_usec, so that the value stays below one second.
      Additionally limit the tv_sec value to a reasonable value for CAN related
      use-cases of 400 days and ensure all values to be positive.
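      
      A minimal sketch of such a range check, assuming plain long arguments in
      place of the timeval fields passed in from user space. The function and
      macro names are hypothetical; only the limits come from the description
      above:

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_USEC_PER_SEC  1000000L
#define DEMO_TIMER_SEC_MAX (400L * 24 * 60 * 60)   /* 400 days in seconds */

/* True if the pair must be rejected before any conversion to ktime. */
static bool tv_is_invalid(long tv_sec, long tv_usec)
{
    return tv_sec < 0 || tv_sec > DEMO_TIMER_SEC_MAX ||
           tv_usec < 0 || tv_usec >= DEMO_USEC_PER_SEC;
}
```

      Rejecting these values up front keeps the later multiplication by
      NSEC_PER_USEC within range.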
      Reported-by: Kyungtae Kim <kt0755@gmail.com>
      Tested-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
      Cc: linux-stable <stable@vger.kernel.org> # >= 2.6.26
      Tested-by: Kyungtae Kim <kt0755@gmail.com>
      Acked-by: Andre Naujoks <nautsch2@gmail.com>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      5305c33f
    • Manfred Schlaegl's avatar
      can: dev: __can_get_echo_skb(): fix bogous check for non-existing skb by removing it · e0bb3046
      Manfred Schlaegl authored
      commit 7b12c818 upstream.
      
      This patch reverts commit 7da11ba5
      ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
      
      After the introduction of this change we encountered the following new
      error message on various i.MX platforms (flexcan):
      
      | flexcan 53fc8000.can can0: __can_get_echo_skb: BUG! Trying to echo non
      | existing skb: can_priv::echo_skb[0]
      
      The introduction of the message was a mistake because
      priv->echo_skb[idx] = NULL is perfectly valid in the following case: If
      CAN_RAW_LOOPBACK is disabled (setsockopt) in applications, the pkt_type
      of the tx skb's given to can_put_echo_skb is set to PACKET_LOOPBACK. In
      this case can_put_echo_skb will not set priv->echo_skb[idx]. It is
      therefore kept NULL.
      
      As an additional argument for the revert: The order of check and usage
      of idx was changed. idx is used to access an array element before
      checking its boundaries.
      Signed-off-by: Manfred Schlaegl <manfred.schlaegl@ginzinger.com>
      Fixes: 7da11ba5 ("can: dev: __can_get_echo_skb(): print error message, if trying to echo non existing skb")
      Cc: linux-stable <stable@vger.kernel.org>
      Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e0bb3046
    • Marc Zyngier's avatar
      irqchip/gic-v3-its: Align PCI Multi-MSI allocation on their size · 8e44876c
      Marc Zyngier authored
      commit 8208d170 upstream.
      
      The way we allocate events works fine in most cases, except
      when multiple PCI devices share an ITS-visible DevID and
      one of them is trying to use Multi-MSI allocation.
      
      In that case, our allocation is not guaranteed to be zero-based
      anymore, and we have to make sure we allocate it on a boundary
      that is compatible with the PCI Multi-MSI constraints.
      
      Fix this by allocating the full region upfront instead of iterating
      over the number of MSIs. MSI-X are always allocated one by one,
      so this shouldn't change anything on that front.
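      
      The idea of allocating the full region on a size-aligned boundary can be
      sketched with a toy 64-entry bitmap. The function name and the single
      word bitmap are assumptions made for illustration and do not reflect the
      driver's data structures:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Finds a free, naturally aligned region of (1 << order) entries in a
 * hypothetical 64-entry event map, marks it used, and returns its first
 * index, or -1 if no such region is free.
 */
static int alloc_aligned_region(uint64_t *map, unsigned int order)
{
    unsigned int n, pos;
    uint64_t mask;

    assert(order < 6);          /* at most 32 entries per region here */
    n = 1u << order;
    mask = (1ULL << n) - 1;     /* n <= 32, so the shift is well defined */

    for (pos = 0; pos + n <= 64; pos += n) {   /* aligned starts only */
        if ((*map & (mask << pos)) == 0) {
            *map |= mask << pos;
            return (int)pos;
        }
    }
    return -1;
}
```

      If entry 0 is already taken by another device sharing the DevID, a
      request for four events lands on index 4 rather than on the unaligned
      index 1.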
      
      Fixes: b48ac83d ("irqchip: GICv3: ITS: MSI support")
      Cc: stable@vger.kernel.org
      Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Tested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
      Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      8e44876c
    • Thomas Gleixner's avatar
      net: sun: cassini: Cleanup license conflict · 3d90c485
      Thomas Gleixner authored
      commit 56cb4e50 upstream.
      
      The recent addition of SPDX license identifiers to the files in
      drivers/net/ethernet/sun created a licensing conflict.
      
      The cassini driver files contain a proper license notice:
      
        * This program is free software; you can redistribute it and/or
        * modify it under the terms of the GNU General Public License as
        * published by the Free Software Foundation; either version 2 of the
        * License, or (at your option) any later version.
      
      but the SPDX change added:
      
         SPDX-License-Identifier: GPL-2.0
      
      So the file got tagged GPL v2 only while in fact it is licensed under GPL
      v2 or later.
      
      It's nice that people care about the SPDX tags, but they need to be more
      careful about it. Not everything under (the) sun belongs to ...
      
      Fix up the SPDX identifier and remove the boilerplate text as it is
      redundant.
      
      Fixes: c861ef83 ("sun: Add SPDX license tags to Sun network drivers")
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Shannon Nelson <shannon.nelson@oracle.com>
      Cc: Zhu Yanjun <yanjun.zhu@oracle.com>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: netdev@vger.kernel.org
      Cc: stable@vger.kernel.org
      Acked-by: Shannon Nelson <shannon.lee.nelson@gmail.com>
      Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      3d90c485
    • Thomas Gleixner's avatar
      posix-cpu-timers: Unbreak timer rearming · e2630035
      Thomas Gleixner authored
      commit 93ad0fc0 upstream.
      
      The recent commit which prevented a division by 0 issue in the alarm timer
      code broke posix CPU timers as an unwanted side effect.
      
      The reason is that the common rearm code checks for timer->it_interval
      being 0 now. What went unnoticed is that the posix cpu timer setup does not
      initialize timer->it_interval as it stores the interval in CPU timer
      specific storage. The reason for the separate storage is historical as the
      posix CPU timers always had a 64bit nanoseconds representation internally
      while timer->it_interval is type ktime_t which used to be a modified
      timespec representation on 32bit machines.
      
      Instead of reverting the offending commit and fixing the alarmtimer issue
      in the alarmtimer code, store the interval in timer->it_interval at CPU
      timer setup time so the common code check works. This also repairs the
      existing inconsistency of the posix CPU timer code which kept a single shot
      timer armed despite the interval being 0.
      
      The separate storage can be removed in mainline, but that needs to be a
      separate commit as the current one has to be backported to stable kernels.
      
      Fixes: 0e334db6 ("posix-timers: Fix division by zero bug")
      Reported-by: H.J. Lu <hjl.tools@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: John Stultz <john.stultz@linaro.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190111133500.840117406@linutronix.de
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      e2630035
    • Jan Beulich's avatar
      x86/entry/64/compat: Fix stack switching for XEN PV · 75775e21
      Jan Beulich authored
      commit fc24d75a upstream.
      
      While in the native case entry into the kernel happens on the trampoline
      stack, PV Xen kernels get entered with the current thread stack right
      away. Hence source and destination stacks are identical in that case,
      and special care is needed.
      
      Other than in sync_regs() the copying done on the INT80 path isn't
      NMI / #MC safe, as either of these events occurring in the middle of the
      stack copying would clobber data on the (source) stack.
      
      There is similar code in interrupt_entry() and nmi(), but there is no fixup
      required because those code paths are unreachable in XEN PV guests.
      
      [ tglx: Sanitized subject, changelog, Fixes tag and stable mail address. Sigh ]
      
      Fixes: 7f2590a1 ("x86/entry/64: Use a per-CPU trampoline stack for IDT entries")
      Signed-off-by: Jan Beulich <jbeulich@suse.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Acked-by: Andy Lutomirski <luto@kernel.org>
      Cc: Peter Anvin <hpa@zytor.com>
      Cc: xen-devel@lists.xenproject.org
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/5C3E1128020000780020DFAD@prv1-mh.provo.novell.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      75775e21
    • Daniel Drake's avatar
      x86/kaslr: Fix incorrect i8254 outb() parameters · eec170d8
      Daniel Drake authored
      commit 7e6fc2f5 upstream.
      
      The outb() function takes parameters value and port, in that order.  Fix
      the parameters used in the kaslr i8254 fallback code.
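      
      The argument order can be illustrated with a mock that records what it
      was given. mock_outb(), send_command(), and the DEMO_* constants are
      hypothetical and stand in for the real port I/O helper and i8254
      definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Recorded arguments of the most recent mock write. */
static uint8_t last_value;
static uint16_t last_port;

/* Mock port write with the same parameter order: value first, then port. */
static void mock_outb(uint8_t value, uint16_t port)
{
    last_value = value;
    last_port = port;
}

/* Hypothetical constants, for illustration only. */
#define DEMO_PORT_CONTROL 0x43
#define DEMO_COMMAND      0xC2

static void send_command(void)
{
    /* Value, then port. Swapping them writes the port number as data. */
    mock_outb(DEMO_COMMAND, DEMO_PORT_CONTROL);
}
```

      Because both arguments are plain integers, the compiler cannot catch a
      swapped call, which is how such a bug can go unnoticed.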
      
      Fixes: 5bfce5ef ("x86, kaslr: Provide randomness functions")
      Signed-off-by: Daniel Drake <drake@endlessm.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: linux@endlessm.com
      Cc: stable@vger.kernel.org
      Link: https://lkml.kernel.org/r/20190107034024.15005-1-drake@endlessm.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      eec170d8
    • Dave Hansen's avatar
      x86/selftests/pkeys: Fork() to check for state being preserved · 940343c7
      Dave Hansen authored
      commit e1812933 upstream.
      
      There was a bug where the per-mm pkey state was not being preserved across
      fork() in the child.  fork() is performed in the pkey selftests, but all of
      the pkey activity is performed in the parent.  The child does not perform
      any actions sensitive to pkey state.
      
      To make the test more sensitive to these kinds of bugs, add a fork() where
      the parent exits, and execution continues in the child.
      
      To achieve this let the key exhaustion test not terminate at the first
      allocation failure and fork after 2*NR_PKEYS loops and continue in the
      child.
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: mpe@ellerman.id.au
      Cc: will.deacon@arm.com
      Cc: luto@kernel.org
      Cc: jroedel@suse.de
      Cc: stable@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Link: https://lkml.kernel.org/r/20190102215657.585704B7@viggo.jf.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      940343c7
    • Dave Hansen's avatar
      x86/pkeys: Properly copy pkey state at fork() · 992c4f69
      Dave Hansen authored
      commit a31e184e upstream.
      
      Memory protection key behavior should be the same in a child as it was
      in the parent before a fork.  But, there is a bug that resets the
      state in the child at fork instead of preserving it.
      
      The creation of new mm's is a bit convoluted.  At fork(), the code
      does:
      
        1. memcpy() the parent mm to initialize child
        2. mm_init() to initialize some select stuff
        3. dup_mmap() to create true copies that memcpy() did not do right
      
      For pkeys two bits of state need to be preserved across a fork:
      'execute_only_pkey' and 'pkey_allocation_map'.
      
      Those are preserved by the memcpy(), but mm_init() invokes
      init_new_context() which overwrites 'execute_only_pkey' and
      'pkey_allocation_map' with "new" values.
      
      The author of the code erroneously believed that init_new_context is *only*
      called at execve()-time.  But, alas, init_new_context() is used at execve()
      and fork().
      
      The result is that, after a fork(), the child's pkey state ends up looking
      like it does after an execve(), which is totally wrong.  pkeys that are
      already allocated can be allocated again, for instance.
      
      To fix this, add code called by dup_mmap() to copy the pkey state from
      parent to child explicitly.  Also add a comment above init_new_context() to
      make it more clear to the next poor sod what this code is used for.
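      
      The sequence can be modelled in user space as follows. This is a heavily
      reduced sketch: struct demo_mm, the demo_* functions, and the reset
      values used by demo_init_new_context() are assumptions for illustration,
      not the kernel's definitions:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the per-mm pkey state. */
struct demo_mm {
    unsigned short pkey_allocation_map;
    short execute_only_pkey;
};

/* Resets the state to fresh values, as is appropriate after execve(). */
static void demo_init_new_context(struct demo_mm *mm)
{
    mm->pkey_allocation_map = 0x1;
    mm->execute_only_pkey = -1;
}

/* The fix: explicitly copy the parent's pkey state into the child. */
static void demo_dup_pkeys(const struct demo_mm *parent, struct demo_mm *child)
{
    child->pkey_allocation_map = parent->pkey_allocation_map;
    child->execute_only_pkey = parent->execute_only_pkey;
}

static struct demo_mm demo_fork(const struct demo_mm *parent, int with_fix)
{
    struct demo_mm child;

    memcpy(&child, parent, sizeof(child));   /* step 1: raw copy */
    demo_init_new_context(&child);           /* step 2: clobbers pkey state */
    if (with_fix)
        demo_dup_pkeys(parent, &child);      /* step 3: restores it */
    return child;
}
```

      Without the explicit copy the child ends up with the reset allocation
      map, so keys the parent had allocated look free again.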
      
      Fixes: e8c24d3a ("x86/pkeys: Allocation/free syscalls")
      Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: bp@alien8.de
      Cc: hpa@zytor.com
      Cc: peterz@infradead.org
      Cc: mpe@ellerman.id.au
      Cc: will.deacon@arm.com
      Cc: luto@kernel.org
      Cc: jroedel@suse.de
      Cc: stable@vger.kernel.org
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Joerg Roedel <jroedel@suse.de>
      Link: https://lkml.kernel.org/r/20190102215655.7A69518C@viggo.jf.intel.com
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      992c4f69
    • Tom Roeder's avatar
      kvm: x86/vmx: Use kzalloc for cached_vmcs12 · dcc1097e
      Tom Roeder authored
      commit 3a33d030 upstream.
      
      This changes the allocation of cached_vmcs12 to use kzalloc instead of
      kmalloc. This removes the information leak found by Syzkaller (see
      Reported-by) in this case and prevents similar leaks from happening
      based on cached_vmcs12.
      
      It also changes vmx_get_nested_state to copy out the full 4k VMCS12_SIZE
      in copy_to_user rather than only the size of the struct.
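      
      A user-space analogue of why zeroed allocation matters once the full
      buffer is copied out. calloc() stands in for kzalloc here, and
      struct demo_state, DEMO_BUF_SIZE, and the demo_* functions are
      hypothetical:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_BUF_SIZE 4096   /* stands in for the full 4k buffer */

/* Hypothetical payload that is much smaller than the buffer. */
struct demo_state {
    int a;
    int b;
};

/* Returns a zeroed buffer with the struct at its start, or NULL. */
static unsigned char *demo_alloc_cached(const struct demo_state *s)
{
    unsigned char *buf = calloc(1, DEMO_BUF_SIZE);   /* zero-filled */

    if (buf)
        memcpy(buf, s, sizeof(*s));
    return buf;
}

/* Counts non-zero bytes behind the struct, i.e. what a full copy exposes. */
static size_t demo_tail_nonzero(const unsigned char *buf)
{
    size_t i, n = 0;

    for (i = sizeof(struct demo_state); i < DEMO_BUF_SIZE; i++)
        if (buf[i] != 0)
            n++;
    return n;
}
```

      With a zeroed allocation the bytes behind the struct are guaranteed to
      be 0, so copying out the whole buffer cannot expose stale heap contents.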
      
      Tested: rebuilt against head, booted, and ran the syzkaller repro
        https://syzkaller.appspot.com/text?tag=ReproC&x=174efca3400000 without
        observing any problems.
      
      Reported-by: syzbot+ded1696f6b50b615b630@syzkaller.appspotmail.com
      Fixes: 8fcc4b59
      Cc: stable@vger.kernel.org
      Signed-off-by: Tom Roeder <tmroeder@google.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      dcc1097e