1. 18 Oct, 2018 1 commit
    • net/mlx5: Refactor fragmented buffer struct fields and init flow · 4972e6fa
      Tariq Toukan authored
      Take struct mlx5_frag_buf out of mlx5_frag_buf_ctrl, as it is not
      needed to manage and control the datapath of the fragmented buffers API.
      
      struct mlx5_frag_buf contains control info to manage the allocation
      and de-allocation of the fragmented buffer.
      Its fields are not relevant for datapath, so here I take them out of the
      struct mlx5_frag_buf_ctrl, except for the fragments array itself.
      
      In addition, modified mlx5_fill_fbc to initialise the frags pointers
      as well. This implies that the buffer must be allocated before the
      function is called.
      
      The set of type-specific *_get_byte_size() functions is replaced by
      a generic one.
      Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
      Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
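The ordering constraint in the commit above (allocate the buffer first, then fill the control struct) can be pictured with a small user-space sketch. Every name here (`frag_sketch`, `fbc_sketch`, `fill_fbc_sketch`, `fbc_byte_size_sketch`) is hypothetical and not the driver's definition, and the byte-size formula (entry count times stride, both as powers of two) is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one fragment of an already allocated buffer. */
struct frag_sketch {
    void *buf;
};

/* Hypothetical datapath control struct: it keeps only the fragments array
 * plus geometry, not the allocation bookkeeping. */
struct fbc_sketch {
    struct frag_sketch *frags;
    uint8_t log_stride; /* log2 of the entry size in bytes */
    uint8_t log_sz;     /* log2 of the number of entries */
};

/* The frags pointer is set here, so the caller must have allocated the
 * fragments before calling this function. */
static void fill_fbc_sketch(uint8_t log_stride, uint8_t log_sz,
                            struct frag_sketch *frags,
                            struct fbc_sketch *fbc)
{
    assert(frags != NULL);
    fbc->frags = frags;
    fbc->log_stride = log_stride;
    fbc->log_sz = log_sz;
}

/* One generic size helper in place of several type-specific ones
 * (assumed formula: entries * stride). */
static uint32_t fbc_byte_size_sketch(const struct fbc_sketch *fbc)
{
    return (uint32_t)1 << (fbc->log_sz + fbc->log_stride);
}
```

With 64-byte entries (`log_stride = 6`) and 1024 entries (`log_sz = 10`), the helper reports 65536 bytes under these assumptions.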
  2. 17 Oct, 2018 2 commits
  3. 27 Sep, 2018 1 commit
  4. 26 Sep, 2018 1 commit
  5. 25 Sep, 2018 3 commits
  6. 22 Sep, 2018 3 commits
  7. 21 Sep, 2018 1 commit
  8. 20 Sep, 2018 1 commit
  9. 11 Sep, 2018 4 commits
  10. 05 Sep, 2018 3 commits
  11. 04 Sep, 2018 1 commit
    • IB/mlx5: Change TX affinity assignment in RoCE LAG mode · c6a21c38
      Majd Dibbiny authored
      In the current code, the TX affinity is per RoCE device, which can cause
      unfairness between different contexts. For example, if we open two
      contexts and each opens 10 QPs concurrently, all of the QPs of the first
      context might end up on the first port instead of being distributed
      across the two ports as expected.
      
      To overcome this unfairness between processes, we maintain per device TX
      affinity, and per process TX affinity.
      
      The allocation algorithm is as follows:
      
      1. Hold two tx_port_affinity atomic variables, one per RoCE device and one
         per ucontext. Both initialized to 0.
      
      2. In mlx5_ib_alloc_ucontext do:
       2.1. ucontext.tx_port_affinity = device.tx_port_affinity
       2.2. device.tx_port_affinity += 1
      
      3. In modify QP INIT2RST:
       3.1. qp.tx_port_affinity = ucontext.tx_port_affinity % MLX5_PORT_NUM
       3.2. ucontext.tx_port_affinity += 1
      Signed-off-by: Majd Dibbiny <majd@mellanox.com>
      Reviewed-by: Moni Shoua <monis@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
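The three allocation steps in the commit above can be sketched in user-space C with C11 atomics. This is a minimal illustration, not the driver code: `device_ctx`, `user_ctx`, `alloc_ucontext`, `assign_qp_port`, and `PORT_NUM` (a stand-in for MLX5_PORT_NUM, assumed to be 2) are hypothetical names, and the returned port is a 0-based index.

```c
#include <assert.h>
#include <stdatomic.h>

#define PORT_NUM 2 /* assumption: stand-in for MLX5_PORT_NUM on a two-port LAG */

/* Step 1: one counter per device and one per user context. */
struct device_ctx {
    atomic_uint tx_port_affinity;
};

struct user_ctx {
    atomic_uint tx_port_affinity;
};

/* Step 2: seed the context counter from the device counter, then advance
 * the device counter so the next context starts on the other port. */
static void alloc_ucontext(struct device_ctx *dev, struct user_ctx *uctx)
{
    unsigned int seed = atomic_fetch_add(&dev->tx_port_affinity, 1);

    atomic_init(&uctx->tx_port_affinity, seed);
}

/* Step 3: pick the port for a QP from the context counter, then advance
 * it so consecutive QPs of one context alternate between ports. */
static unsigned int assign_qp_port(struct user_ctx *uctx)
{
    return atomic_fetch_add(&uctx->tx_port_affinity, 1) % PORT_NUM;
}
```

With two contexts opened back to back, the first context's QPs land on ports 0, 1, 0, ... and the second context's on 1, 0, 1, ..., so neither context is pinned to a single port regardless of how their QP creations interleave.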
  12. 10 Aug, 2018 1 commit
  13. 31 Jul, 2018 2 commits
  14. 24 Jul, 2018 4 commits
  15. 13 Jul, 2018 2 commits
    • RDMA/mlx5: Check that supplied blue flame index doesn't overflow · 05f58ceb
      Leon Romanovsky authored
      The user-supplied index is checked against the total number of system
      pages, but this number already includes num_static_sys_pages, so adding
      that value to the supplied index causes the error below while trying to
      access sys_pages[].
      
      BUG: KASAN: slab-out-of-bounds in bfregn_to_uar_index+0x34f/0x400
      Read of size 4 at addr ffff880065561904 by task syz-executor446/314
      
      CPU: 0 PID: 314 Comm: syz-executor446 Not tainted 4.18.0-rc1+ #256
      Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.0-0-g63451fca13-prebuilt.qemu-project.org 04/01/2014
      Call Trace:
       dump_stack+0xef/0x17e
       print_address_description+0x83/0x3b0
       kasan_report+0x18d/0x4d0
       bfregn_to_uar_index+0x34f/0x400
       create_user_qp+0x272/0x227d
       create_qp_common+0x32eb/0x43e0
       mlx5_ib_create_qp+0x379/0x1ca0
       create_qp.isra.5+0xc94/0x22d0
       ib_uverbs_create_qp+0x21b/0x2a0
       ib_uverbs_write+0xc2c/0x1010
       vfs_write+0x1b0/0x550
       ksys_write+0xc6/0x1a0
       do_syscall_64+0xa7/0x590
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      RIP: 0033:0x433679
      Code: fd ff 48 81 c4 80 00 00 00 e9 f1 fe ff ff 0f 1f 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 3b 91 fd ff c3 66 2e 0f 1f 84 00 00 00 00
      RSP: 002b:00007fff2b3d8e48 EFLAGS: 00000217 ORIG_RAX: 0000000000000001
      RAX: ffffffffffffffda RBX: 00000000004002f8 RCX: 0000000000433679
      RDX: 0000000000000040 RSI: 0000000020000240 RDI: 0000000000000003
      RBP: 00000000006d4018 R08: 00000000004002f8 R09: 00000000004002f8
      R10: 00000000004002f8 R11: 0000000000000217 R12: 0000000000000000
      R13: 000000000040cb00 R14: 000000000040cb90 R15: 0000000000000006
      
      Allocated by task 314:
       kasan_kmalloc+0xa0/0xd0
       __kmalloc+0x1a9/0x510
       mlx5_ib_alloc_ucontext+0x966/0x2620
       ib_uverbs_get_context+0x23f/0xa60
       ib_uverbs_write+0xc2c/0x1010
       __vfs_write+0x10d/0x720
       vfs_write+0x1b0/0x550
       ksys_write+0xc6/0x1a0
       do_syscall_64+0xa7/0x590
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      Freed by task 1:
       __kasan_slab_free+0x12e/0x180
       kfree+0x159/0x630
       kvfree+0x37/0x50
       single_release+0x8e/0xf0
       __fput+0x2d8/0x900
       task_work_run+0x102/0x1f0
       exit_to_usermode_loop+0x159/0x1c0
       do_syscall_64+0x408/0x590
       entry_SYSCALL_64_after_hwframe+0x49/0xbe
      
      The buggy address belongs to the object at ffff880065561100
       which belongs to the cache kmalloc-4096 of size 4096
      The buggy address is located 2052 bytes inside of
       4096-byte region [ffff880065561100, ffff880065562100)
      The buggy address belongs to the page:
      page:ffffea0001955800 count:1 mapcount:0 mapping:ffff88006c402480 index:0x0 compound_mapcount: 0
      flags: 0x4000000000008100(slab|head)
      raw: 4000000000008100 ffffea0001a7c000 0000000200000002 ffff88006c402480
      raw: 0000000000000000 0000000080070007 00000001ffffffff 0000000000000000
      page dumped because: kasan: bad access detected
      
      Memory state around the buggy address:
       ffff880065561800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
       ffff880065561880: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
      >ffff880065561900: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
                         ^
       ffff880065561980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
       ffff880065561a00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
      
      Cc: <stable@vger.kernel.org> # 4.15
      Fixes: 1ee47ab3 ("IB/mlx5: Enable QP creation with a given blue flame index")
      Reported-by: Noa Osherovich <noaos@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
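The class of bug fixed in the commit above can be illustrated with a small stand-alone sketch. It is not the driver's code: `bfreg_info_sketch` and `dyn_index_to_sys_page` are hypothetical names, and the sketch assumes the total page count is never smaller than the static page count. The point is that a user index which is later offset by the static page count must be bounded by the number of dynamic pages, not by the total.

```c
#include <assert.h>

/* Hypothetical subset of the bookkeeping described in the commit message. */
struct bfreg_info_sketch {
    unsigned int num_static_sys_pages;
    unsigned int num_sys_pages; /* total: static pages plus dynamic pages */
};

/* Translate a user-supplied dynamic index into an index into sys_pages[].
 * Returns -1 when the offset index would fall outside the array.
 *
 * Comparing user_index against num_sys_pages would be too permissive,
 * because num_static_sys_pages is added afterwards. Comparing against
 * the dynamic count also avoids unsigned wrap-around in the addition. */
static int dyn_index_to_sys_page(const struct bfreg_info_sketch *bi,
                                 unsigned int user_index)
{
    unsigned int num_dyn = bi->num_sys_pages - bi->num_static_sys_pages;

    if (user_index >= num_dyn)
        return -1;

    return (int)(bi->num_static_sys_pages + user_index);
}
```

For example, with 4 static pages and 6 pages in total, a user index of 5 passes a naive `index < total` test but would address slot 9 of a 6-entry array; the sketch rejects it.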
    • RDMA/mlx5: Melt consecutive calls to alloc_bfreg() in one call · ffaf58de
      Leon Romanovsky authored
      There is no need for three consecutive calls to alloc_bfreg(). It can be
      implemented with one function.
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  16. 25 Jun, 2018 1 commit
    • IB/mlx5: Add support for drain SQ & RQ · d0e84c0a
      Yishai Hadas authored
      This patch follows the logic from ib_core but considers the internal
      device state upon executing the involved commands.
      
      Specifically, when the device is in an internal error state, modifying
      the QP to an error state can be assumed to succeed, because every
      in-progress WR is going to be flushed in error anyway, as that modify
      command expects.

      In addition, as the drain should never fail, the driver makes sure that
      post_send/recv will succeed even if the device is already in an internal
      error state. As such, once the driver supplies the simulated/SW CQEs,
      the CQE for the drain WR will be handled as well.
      
      In case of an internal error state the CQE for the drain WR may be
      completed as part of the main task that handled the error state or by
      the task that issued the drain WR.
      
      As the above depends on scheduling, the code takes the relevant locks and
      actions to make sure that the completion handler for that WR is always
      called after the post_send/recv was issued, and never in parallel with
      the other task that handles the error flow.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  17. 22 Jun, 2018 1 commit
  18. 19 Jun, 2018 3 commits
    • IB/mlx5: Expose DEVX tree · c59450c4
      Yishai Hadas authored
      Expose DEVX tree to be used by upper layers.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
    • IB/mlx5: Add support for DEVX query UAR · 7c043e90
      Yishai Hadas authored
      Return a device UAR index for a given user index via the DEVX interface.
      
      Security note:
      The hardware protection mechanism works like this: Each device object that
      is subject to UAR doorbells (QP/SQ/CQ) gets a UAR ID (called uar_page in
      the device specification manual) upon its creation. Then upon doorbell,
      hardware fetches the object context for which the doorbell was rung, and
      validates that the UAR through which the DB was rung matches the UAR ID
      of the object.

      If there is no match, the doorbell is silently ignored by the hardware.
      Of course, the user cannot ring a doorbell on a UAR that was not mapped
      to it.
      
      Now in devx, as the devx kernel does not manipulate the QP/SQ/CQ command
      mailboxes (except tagging them with UID), we expose to the user its UAR
      ID, so it can embed it in these objects in the expected specification
      format. So the only thing the user can do is hurt itself by creating a
      QP/SQ/CQ with a UAR ID other than its own, in which case other users
      may ring a doorbell on its objects.

      The consequence is that another user can schedule a QP/SQ of the buggy
      user for execution (just insert it into the hardware schedule queue or
      arm its CQ for event generation); no further harm is expected.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
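The hardware check described in the security note above can be modelled in a few lines of C. This is only a toy model of the stated rule, not firmware or driver code: `hw_object`, `ring_doorbell`, and the `doorbells_accepted` counter are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a device object (QP/SQ/CQ) that records the UAR ID it was
 * created with. */
struct hw_object {
    uint32_t uar_id;
    unsigned int doorbells_accepted;
};

/* Model of the rule in the note: a doorbell takes effect only when the UAR
 * it was rung through matches the UAR ID stored in the object context.
 * A mismatch is silently ignored, which is modelled here by returning
 * false without touching the object. */
static bool ring_doorbell(struct hw_object *obj, uint32_t via_uar_id)
{
    if (obj->uar_id != via_uar_id)
        return false;

    obj->doorbells_accepted++;
    return true;
}
```

In this model, an object created with its owner's UAR ID accepts doorbells only through that UAR. If a user mistakenly creates an object with some other UAR ID, whoever has that UAR mapped can ring the object's doorbell, which matches the limited consequence the note describes.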
    • IB/mlx5: Introduce DEVX · a8b92ca1
      Yishai Hadas authored
      Introduce DEVX to enable direct device commands in downstream patches
      from this series.
      
      In that mode of work the firmware manages the isolation between
      processes' resources and as such a DEVX user id is created and assigned
      to the given user context upon allocation request.
      
      A capability check is done to make sure that this feature is really
      supported by the firmware prior to creating the DEVX user id.
      Signed-off-by: Yishai Hadas <yishaih@mellanox.com>
      Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  19. 18 Jun, 2018 1 commit
  20. 02 Jun, 2018 3 commits
  21. 05 Apr, 2018 1 commit