1. 06 Jun, 2018 1 commit
      treewide: Use struct_size() for kmalloc()-family · acafe7e3
      Kees Cook authored
      One of the more common cases of allocation size calculations is finding
      the size of a structure that has a zero-sized array at the end, along
      with memory for some number of elements for that array. For example:
      
      struct foo {
          int stuff;
          void *entry[];
      };
      
      instance = kmalloc(sizeof(struct foo) + sizeof(void *) * count, GFP_KERNEL);
      
      Instead of leaving these open-coded and prone to type mistakes, we can
      now use the new struct_size() helper:
      
      instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);
      
      This patch makes the changes for kmalloc()-family (and kvmalloc()-family)
      uses. It was done via automatic conversion with manual review for the
      "CHECKME" non-standard cases noted below, using the following Coccinelle
      script:
      
      // pkey_cache = kmalloc(sizeof *pkey_cache + tprops->pkey_tbl_len *
      //                      sizeof *pkey_cache->table, GFP_KERNEL);
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      identifier VAR, ELEMENT;
      expression COUNT;
      @@
      
      - alloc(sizeof(*VAR) + COUNT * sizeof(*VAR->ELEMENT), GFP)
      + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
      
      // mr = kzalloc(sizeof(*mr) + m * sizeof(mr->map[0]), GFP_KERNEL);
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      identifier VAR, ELEMENT;
      expression COUNT;
      @@
      
      - alloc(sizeof(*VAR) + COUNT * sizeof(VAR->ELEMENT[0]), GFP)
      + alloc(struct_size(VAR, ELEMENT, COUNT), GFP)
      
      // Same pattern, but can't trivially locate the trailing element name,
      // or variable name.
      @@
      identifier alloc =~ "kmalloc|kzalloc|kvmalloc|kvzalloc";
      expression GFP;
      expression SOMETHING, COUNT, ELEMENT;
      @@
      
      - alloc(sizeof(SOMETHING) + COUNT * sizeof(ELEMENT), GFP)
      + alloc(CHECKME_struct_size(&SOMETHING, ELEMENT, COUNT), GFP)
      Signed-off-by: Kees Cook <keescook@chromium.org>
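The size calculation described in this commit can be sketched in userspace C. This is only an illustration under stated assumptions: `flex_size()` and `foo_alloc()` are hypothetical names, not kernel APIs, and `calloc()` stands in for the kmalloc() family. Like struct_size(), the helper saturates to SIZE_MAX rather than wrapping when the arithmetic would overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct foo {
    int stuff;
    void *entry[];  /* flexible array member */
};

/*
 * Hypothetical userspace stand-in for struct_size(): returns
 * base + count * elem, or SIZE_MAX if that sum would overflow size_t.
 */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
    if (elem != 0 && count > (SIZE_MAX - base) / elem)
        return SIZE_MAX;
    return base + count * elem;
}

/* Allocate a zeroed struct foo with room for count entries. */
static struct foo *foo_alloc(size_t count)
{
    size_t bytes = flex_size(sizeof(struct foo), sizeof(void *), count);

    if (bytes == SIZE_MAX)
        return NULL;  /* refuse a request whose size saturated */
    return calloc(1, bytes);
}
```

An open-coded `sizeof(struct foo) + sizeof(void *) * count` would silently wrap for a huge `count` and return a buffer that is too small; the saturating form turns that case into an allocation failure instead.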
  2. 15 Mar, 2018 1 commit
      RDMAVT: Fix synchronization around percpu_ref · 95da6e96
      Tejun Heo authored
      rvt_mregion uses percpu_ref for reference counting and RCU to protect
      accesses from lkey_table.  When a rvt_mregion needs to be freed, it
      first gets unregistered from lkey_table and then rvt_check_refs() is
      called to wait for in-flight usages before the rvt_mregion is freed.
      
      rvt_check_refs() seems to have a couple of issues.
      
      * It has a fast exit path which tests percpu_ref_is_zero().  However,
        a percpu_ref reading zero doesn't mean that the object can be
        released.  In fact, the ->release() callback might not even have
        started executing yet.  Proceeding with freeing can lead to
        use-after-free.
      
      * lkey_table is RCU protected but there is no RCU grace period in the
        free path.  percpu_ref uses RCU internally but it's sched-RCU whose
        grace periods are different from regular RCU.  Also, it generally
        isn't a good idea to depend on internal behaviors like this.
      
      To address the above issues, this patch removes the fast exit and adds
      an explicit synchronize_rcu().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: linux-rdma@vger.kernel.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
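The first issue in this commit can be illustrated with a single-threaded toy model in plain C. It is not kernel code: `struct toy_mregion` and its helpers are hypothetical, `refcount` stands in for the percpu_ref, and `release_done` stands in for whatever the ->release() callback signals (assumed here to be something the free path can wait on). The model only shows that "count is zero" and "release has run" are two distinct states; it does not model the RCU grace period from the second issue.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an rvt_mregion and its reference state. */
struct toy_mregion {
    long refcount;      /* models the percpu_ref count */
    bool release_done;  /* models the signal raised by ->release() */
};

/* First half of the final put: the count drops to zero. */
static void toy_put_drop_count(struct toy_mregion *mr)
{
    mr->refcount--;
}

/* Second half of the final put: the release callback runs some time later. */
static void toy_put_run_release(struct toy_mregion *mr)
{
    if (mr->refcount == 0)
        mr->release_done = true;
}

/* Fast-exit style check: unsafe, because it looks only at the count. */
static bool toy_can_free_fast_exit(const struct toy_mregion *mr)
{
    return mr->refcount == 0;
}

/* Check in the spirit of the fix: wait until release has actually run. */
static bool toy_can_free_after_release(const struct toy_mregion *mr)
{
    return mr->release_done;
}
```

Between the two halves of the final put, the fast-exit check already reports the object as freeable while the release-based check does not, which is the window the commit message describes.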
  3. 14 Mar, 2018 1 commit
      RDMAVT: Fix synchronization around percpu_ref · 74b44bbe
      Tejun Heo authored
      rvt_mregion uses percpu_ref for reference counting and RCU to protect
      accesses from lkey_table.  When a rvt_mregion needs to be freed, it
      first gets unregistered from lkey_table and then rvt_check_refs() is
      called to wait for in-flight usages before the rvt_mregion is freed.
      
      rvt_check_refs() seems to have a couple of issues.
      
      * It has a fast exit path which tests percpu_ref_is_zero().  However,
        a percpu_ref reading zero doesn't mean that the object can be
        released.  In fact, the ->release() callback might not even have
        started executing yet.  Proceeding with freeing can lead to
        use-after-free.
      
      * lkey_table is RCU protected but there is no RCU grace period in the
        free path.  percpu_ref uses RCU internally but it's sched-RCU whose
        grace periods are different from regular RCU.  Also, it generally
        isn't a good idea to depend on internal behaviors like this.
      
      To address the above issues, this patch removes the fast exit and adds
      an explicit synchronize_rcu().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: linux-rdma@vger.kernel.org
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
  4. 11 Jan, 2018 1 commit
      infiniband: fix sw/rdmavt/* kernel-doc notation · 4f9a3018
      Randy Dunlap authored
      Use correct parameter names and formatting in function kernel-doc notation
      to eliminate warnings from scripts/kernel-doc.
      
      ../drivers/infiniband/sw/rdmavt/mr.c:784: warning: Excess function parameter 'ibmfr' description in 'rvt_map_phys_fmr'
      ../drivers/infiniband/sw/rdmavt/vt.c:234: warning: Excess function parameter 'intex' description in 'rvt_query_pkey'
      ../drivers/infiniband/sw/rdmavt/vt.c:266: warning: Excess function parameter 'index' description in 'rvt_query_gid'
      ../drivers/infiniband/sw/rdmavt/vt.c:306: warning: Excess function parameter 'data' description in 'rvt_alloc_ucontext'
      ../drivers/infiniband/sw/rdmavt/cq.c:65: warning: Excess function parameter 'sig' description in 'rvt_cq_enter'
      ../drivers/infiniband/sw/rdmavt/qp.c:279: warning: Excess function parameter 'qpt' description in 'rvt_free_all_qps'
      ../drivers/infiniband/sw/rdmavt/mcast.c:282: warning: Excess function parameter 'igd' description in 'rvt_attach_mcast'
      ../drivers/infiniband/sw/rdmavt/mcast.c:345: warning: Excess function parameter 'igd' description in 'rvt_detach_mcast'
      Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@mellanox.com>
      Cc: linux-doc@vger.kernel.org
      Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
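The "Excess function parameter" warnings in this commit come from kernel-doc comments whose `@name` entries do not match any parameter in the function prototype. A minimal sketch of a comment that does match, using a hypothetical `demo_query_pkey()` that is not part of rdmavt:

```c
#include <stddef.h>

/**
 * demo_query_pkey - look up a partition key (hypothetical example)
 * @table: array of partition keys
 * @len: number of entries in @table
 * @index: position to read; describing it as @intex instead would be
 *         reported as an excess parameter, since no parameter has that name
 * @pkey: where the key is stored on success
 *
 * Return: 0 on success, -1 if @index is out of range.
 */
static int demo_query_pkey(const unsigned short *table, size_t len,
                           size_t index, unsigned short *pkey)
{
    if (index >= len)
        return -1;
    *pkey = table[index];
    return 0;
}
```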
  5. 28 Aug, 2017 1 commit
  6. 31 Jul, 2017 1 commit
      IB/{rdmavt, hfi1, qib}: Fix panic with post receive and SGE compression · 3ffea7d8
      Mike Marciniszyn authored
      The server side of qperf panics as follows:
      
      [242446.336860] IP: report_bug+0x64/0x10
      [242446.341031] PGD 1c0c067
      [242446.341032] P4D 1c0c067
      [242446.343951] PUD 1c0d063
      [242446.346870] PMD 8587ea067
      [242446.349788] PTE 800000083e14016
      [242446.352901]
      [242446.358352] Oops: 0003 [#1] SM
      [242446.437919] CPU: 1 PID: 7442 Comm: irq/92-hfi1_0 k Not tainted 4.12.0-mam-asm #1
      [242446.446365] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0018.C4.072020161249 07/20/201
      [242446.458397] task: ffff8808392d2b80 task.stack: ffffc9000664000
      [242446.465097] RIP: 0010:report_bug+0x64/0x10
      [242446.469859] RSP: 0018:ffffc900066439c0 EFLAGS: 0001000
      [242446.475784] RAX: ffffffffa06647e4 RBX: ffffffffa06461e1 RCX: 000000000000000
      [242446.483840] RDX: 0000000000000907 RSI: ffffffffa0675040 RDI: ffffffffffff740
      [242446.491897] RBP: ffffc900066439e0 R08: 0000000000000001 R09: 000000000000025
      [242446.499953] R10: ffffffff81a253df R11: 0000000000000133 R12: ffffc90006643b3
      [242446.508010] R13: ffffffffa065bbf0 R14: 00000000000001e5 R15: 000000000000000
      [242446.516067] FS:  0000000000000000(0000) GS:ffff88085f640000(0000) knlGS:000000000000000
      [242446.525191] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003
      [242446.531698] CR2: ffffffffa06647ee CR3: 0000000001c09000 CR4: 00000000001406e
      [242446.539756] Call Trace
      [242446.542582]  fixup_bug+0x2c/0x5
      [242446.546277]  do_trap+0x12b/0x18
      [242446.549972]  do_error_trap+0x89/0x11
      [242446.554171]  ? hfi1_copy_sge+0x271/0x2b0 [hfi1
      [242446.559324]  ? ttwu_do_wakeup+0x1e/0x14
      [242446.563795]  ? ttwu_do_activate+0x77/0x8
      [242446.568363]  do_invalid_op+0x20/0x3
      [242446.572448]  invalid_op+0x1e/0x3
      [242446.576247] RIP: 0010:hfi1_copy_sge+0x271/0x2b0 [hfi1
      [242446.582075] RSP: 0018:ffffc90006643be8 EFLAGS: 0001004
      [242446.587999] RAX: 0000000000000000 RBX: ffff88083e0fa240 RCX: 000000000000000
      [242446.596058] RDX: 0000000000000000 RSI: ffff880842508000 RDI: ffff88083e0fa24
      [242446.604116] RBP: ffffc90006643c28 R08: 0000000000000000 R09: 000000000000000
      [242446.612172] R10: ffffc90009473640 R11: 0000000000000133 R12: 000000000000000
      [242446.620228] R13: 0000000000000000 R14: 0000000000002000 R15: ffff88084250800
      [242446.628293]  ? hfi1_copy_sge+0x1a1/0x2b0 [hfi1
      [242446.633449]  hfi1_rc_rcv+0x3da/0x1270 [hfi1
      [242446.638312]  ? sc_buffer_alloc+0x113/0x150 [hfi1
      [242446.643662]  hfi1_ib_rcv+0x1c9/0x2e0 [hfi1
      [242446.648428]  process_receive_ib+0x19a/0x270 [hfi1
      [242446.653866]  ? process_rcv_qp_work+0xd2/0x160 [hfi1
      [242446.659505]  handle_receive_interrupt_nodma_rtail+0x184/0x2e0 [hfi1
      [242446.666693]  ? irq_finalize_oneshot+0x100/0x10
      [242446.671846]  receive_context_thread+0x1b/0x140 [hfi1
      [242446.677576]  irq_thread_fn+0x1e/0x4
      [242446.681659]  irq_thread+0x13c/0x1b
      [242446.685646]  ? irq_forced_thread_fn+0x60/0x6
      [242446.690604]  kthread+0x112/0x15
      [242446.694298]  ? irq_thread_check_affinity+0xe0/0xe
      [242446.699738]  ? kthread_park+0x60/0x6
      [242446.703919]  ? do_syscall_64+0x67/0x15
      [242446.708292]  ret_from_fork+0x25/0x3
      [242446.712374] Code: 63 78 04 44 0f b7 70 08 41 89 d0 4c 8d 2c 38 41 83 e0 01 f6 c2 02 74 17 66 45 85 c0 74 11 f6 c2 04 b9 01 00 00 00 75 bb 83 ca 04 <66> 89 50 0a 66 45 85 c0 74 52 0f b6 48 0b 41 0f b7 f6 4d 89 e0
      [242446.733527] RIP: report_bug+0x64/0x100 RSP: ffffc900066439c
      [242446.739935] CR2: ffffffffa06647e
      [242446.743763] ---[ end trace 0e90a20d0aa494f7 ]--
      
      The root cause is that the qib/hfi1 post receive path misinterprets
      the new return value of rvt_lkey_ok(), leading to an MR reference
      count underrun.
      
      Additionally, remove an unused argument in rvt_sge_adjacent()
      as well as an unneeded incr local in rvt_post_one_wr().
      
      Fixes: Commit 14fe13fc ("IB/rdmavt: Compress adjacent SGEs in rvt_lkey_ok()")
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
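How a misread return value can underrun a reference count is sketched below as a simplified userspace model. Everything here is hypothetical: `toy_lkey_ok()` is not the kernel function, and its return convention (1 when a new SGE is added and a reference is taken, 0 when the range is merged into the previous SGE and no reference is taken, negative on error) is an assumption made for illustration.

```c
#include <stddef.h>

struct toy_sge {
    unsigned long addr;
    unsigned long len;
};

/*
 * Assumed convention: returns 1 if *slot was filled (one reference taken),
 * 0 if the range was merged into *last (no reference taken), -1 on error.
 */
static int toy_lkey_ok(struct toy_sge *last, struct toy_sge *slot, int *refs,
                       unsigned long addr, unsigned long len)
{
    if (len == 0)
        return -1;
    if (last && last->addr + last->len == addr) {
        last->len += len;  /* adjacent: compress into the previous SGE */
        return 0;
    }
    slot->addr = addr;
    slot->len = len;
    (*refs)++;
    return 1;
}

/* Build the SGE list; returns the number of entries used, or -1 on error. */
static int toy_post(struct toy_sge *out, int *refs,
                    const struct toy_sge *in, int n)
{
    int j = 0;

    for (int i = 0; i < n; i++) {
        int ret = toy_lkey_ok(j ? &out[j - 1] : NULL, &out[j], refs,
                              in[i].addr, in[i].len);
        if (ret < 0)
            return -1;
        j += ret;  /* an unconditional j++ would also count merged SGEs */
    }
    return j;
}

/* Drop one reference per entry that was actually added. */
static void toy_release(int *refs, int num_sge)
{
    *refs -= num_sge;
}
```

If the caller counted a merged SGE as a new entry, `toy_release()` would later drop more references than were taken and `refs` would go negative, which is the kind of underrun the commit message describes.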
  7. 27 Jun, 2017 1 commit
  8. 25 Apr, 2017 1 commit
  9. 05 Apr, 2017 1 commit
      IB/hfi1: Eliminate synchronize_rcu() in mr delete · b58fc804
      Mike Marciniszyn authored
      The synchronize_rcu() call can be eliminated to improve memory deregistration
      performance.
      
      There are two key fields involved:
      - the RCU pointer itself
      - the lkey_published field
      
      To close the window between the RCU read of the mregion pointer and
      taking the reference, the code should do the following:
      
      1. In lkey/rkey validation (reader)
      
      Read the RCU pointer.  If the pointer is non-NULL, get a reference.
      
      In addition to the current validation tests, use a READ_ONCE() on
      lkey_published.
      
      Upon any failure, release the reference.
      
      2. In the remove logic (delete)
      
      Ensure lkey_published is zeroed prior to setting the pointer to NULL.
      This requires using rcu_assign_pointer() to ensure lkey_published
      is written prior to the NULL.
      
      3. In the insert logic (add)
      
      Ensure lkey_published is set, then use rcu_assign_pointer() to ensure
      the pointer is written after all MR fields.
      Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
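The three steps in this commit can be sketched in userspace with C11 atomics. This is a model under stated assumptions, not the driver code: `struct toy_mregion`, `slot`, and the `table_*()` helpers are hypothetical, a release store stands in for rcu_assign_pointer(), a relaxed atomic load stands in for READ_ONCE(), and the RCU read-side critical section and memory reclamation are left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_mregion {
    atomic_int refcount;
    atomic_bool lkey_published;
    unsigned int lkey;
};

/* One entry of a hypothetical lkey table. */
static _Atomic(struct toy_mregion *) slot;

static void mregion_init(struct toy_mregion *mr, unsigned int lkey)
{
    atomic_init(&mr->refcount, 0);
    atomic_init(&mr->lkey_published, false);
    mr->lkey = lkey;
}

/* Insert: set the flag, then publish the pointer after all MR fields. */
static void table_insert(struct toy_mregion *mr)
{
    atomic_store_explicit(&mr->lkey_published, true, memory_order_relaxed);
    atomic_store_explicit(&slot, mr, memory_order_release);
}

/* Remove: clear the flag first, then unpublish the pointer. */
static void table_remove(struct toy_mregion *mr)
{
    atomic_store_explicit(&mr->lkey_published, false, memory_order_relaxed);
    atomic_store_explicit(&slot, (struct toy_mregion *)NULL,
                          memory_order_release);
}

/* Reader: load the pointer, take a reference, then validate. */
static struct toy_mregion *table_lookup(unsigned int lkey)
{
    struct toy_mregion *mr = atomic_load_explicit(&slot, memory_order_acquire);

    if (!mr)
        return NULL;
    atomic_fetch_add(&mr->refcount, 1);
    if (!atomic_load_explicit(&mr->lkey_published, memory_order_relaxed) ||
        mr->lkey != lkey) {
        atomic_fetch_sub(&mr->refcount, 1);  /* any failure drops the ref */
        return NULL;
    }
    return mr;
}
```

A successful lookup leaves the caller holding one reference; a lookup that finds the region unpublished or with a mismatched key returns the reference it briefly took.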
  10. 19 Feb, 2017 1 commit
  11. 24 Jan, 2017 1 commit
  12. 11 Dec, 2016 2 commits
  13. 15 Nov, 2016 1 commit
  14. 16 Sep, 2016 1 commit
  15. 02 Aug, 2016 2 commits
  16. 26 May, 2016 1 commit
  17. 11 Mar, 2016 7 commits