1. 12 Jan, 2017 2 commits
  2. 14 Sep, 2016 1 commit
    • Matthew R. Ochs's avatar
      scsi: cxlflash: Fix context reference tracking on detach · c4a11827
      Matthew R. Ochs authored
      Commit 888baf06 ("scsi: cxlflash: Add kref to context") introduced a
      kref to the context. In particular, the detach routine was updated to
      use the kref services for managing the removal and destruction of a
      context.
      
      As part of this change, the tracking mechanism internal to the detach
      handler was refactored. This introduced a bug that can cause the
      tracking state to be lost. This can lead to a situation where exclusive
      access to a context is prematurely [and unknowingly] relinquished for
      the executing thread.
      
      To remedy, only update the tracking state when the kref operation
      indicates the context was removed.
      
      Fixes: 888baf06 ("scsi: cxlflash: Add kref to context")
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Acked-by: default avatarUma Krishnan <[email protected]>
      Signed-off-by: Martin K. Petersen's avatarMartin K. Petersen <[email protected]>
      c4a11827
  3. 24 Aug, 2016 2 commits
  4. 19 Aug, 2016 3 commits
  5. 06 May, 2016 1 commit
    • Manoj N. Kumar's avatar
      cxlflash: Fix to resolve dead-lock during EEH recovery · 635f6b08
      Manoj N. Kumar authored
      When a cxlflash adapter goes into EEH recovery and multiple processes
      (each having established its own context) are active, the EEH recovery
      can hang if the processes attempt to recover in parallel. The symptom
      logged after a couple of minutes is:
      
      INFO: task eehd:48 blocked for more than 120 seconds.
      Not tainted 4.5.0-491-26f710d+ #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      eehd            0    48      2
      Call Trace:
      __switch_to+0x2f0/0x410
      __schedule+0x300/0x980
      schedule+0x48/0xc0
      rwsem_down_write_failed+0x294/0x410
      down_write+0x88/0xb0
      cxlflash_pci_error_detected+0x100/0x1c0 [cxlflash]
      cxl_vphb_error_detected+0x88/0x110 [cxl]
      cxl_pci_error_detected+0xb0/0x1d0 [cxl]
      eeh_report_error+0xbc/0x130
      eeh_pe_dev_traverse+0x94/0x160
      eeh_handle_normal_event+0x17c/0x450
      eeh_handle_event+0x184/0x370
      eeh_event_handler+0x1c8/0x1d0
      kthread+0x110/0x130
      ret_from_kernel_thread+0x5c/0xa4
      INFO: task blockio:33215 blocked for more than 120 seconds.
      
      Not tainted 4.5.0-491-26f710d+ #1
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      blockio         0 33215  33213
      Call Trace:
      0x1 (unreliable)
      __switch_to+0x2f0/0x410
      __schedule+0x300/0x980
      schedule+0x48/0xc0
      rwsem_down_read_failed+0x124/0x1d0
      down_read+0x68/0x80
      cxlflash_ioctl+0x70/0x6f0 [cxlflash]
      scsi_ioctl+0x3b0/0x4c0
      sg_ioctl+0x960/0x1010
      do_vfs_ioctl+0xd8/0x8c0
      SyS_ioctl+0xd4/0xf0
      system_call+0x38/0xb4
      INFO: task eehd:48 blocked for more than 120 seconds.
      
      The hang is because of a 3 way dead-lock:
      
      Process A holds the recovery mutex, and waits for eehd to complete.
      Process B holds the semaphore and waits for the recovery mutex.
      eehd waits for semaphore.
      
      The fix is to have Process B above release the semaphore before
      attempting to acquire the recovery mutex. This will allow
      eehd to proceed to completion.
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: Martin K. Petersen's avatarMartin K. Petersen <[email protected]>
      635f6b08
  6. 09 Mar, 2016 3 commits
    • Uma Krishnan's avatar
      cxlflash: Reorder user context initialization · 5d1952ac
      Uma Krishnan authored
      In order to support cxlflash in the PowerVM environment, underlying
      hypervisor APIs have imposed a kernel API ordering change.
      
      For the superpipe access to LUN, user applications need a context.
      The cxlflash module creates this context by making a sequence of
      cxl calls. In the current code, a context is initialized via
      cxl_dev_context_init() followed by cxl_process_element(), a function
      that obtains the process element id. Finally, cxl_start_work()
      is called to attach the process element.
      
      In the PowerVM environment, a process element id cannot be obtained
      from the hypervisor until the process element is attached. The
      cxlflash module is unable to create contexts without a valid
      process element id.
      
      To fix this problem, cxl_start_work() is called before obtaining
      the process element id.
      Signed-off-by: default avatarUma Krishnan <[email protected]>
      Acked-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: Martin K. Petersen's avatarMartin K. Petersen <[email protected]>
      5d1952ac
    • Matthew R. Ochs's avatar
      cxlflash: Simplify attach path error cleanup · 8a96b52a
      Matthew R. Ochs authored
      The cxlflash_disk_attach() routine currently uses a cascading error
      gate strategy for its error cleanup path. While this strategy is
      commonly used to handle cleanup scenarios, it is too restrictive when
      function callouts need to be restructured. Problems range from
      inserting error path bugs in previously 'good' code to the cleanup
      path imposing design changes to how the normal path is structured.
      A less restrictive approach is needed to support ordering changes
      that come about when operating in different environments.
      
      To overcome this restriction, the error cleanup path is modified to
      have a single entrypoint and use conditional logic to cleanup where
      necessary. Entities that require multiple cleanup steps must be
      carefully vetted to ensure their APIs support state. In cases where
      they do not (none as of this commit) additional local variables can
      be used to maintain state on their behalf.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Reviewed-by: default avatarUma Krishnan <[email protected]>
      Signed-off-by: Martin K. Petersen's avatarMartin K. Petersen <[email protected]>
      8a96b52a
    • Matthew R. Ochs's avatar
      cxlflash: Split out context initialization · 5e6632d1
      Matthew R. Ochs authored
      Presently, context information structures are allocated and
      initialized in the same routine, create_context(). This imposes
      an ordering restriction such that all pieces of information needed
      to initialize a context must be known before the context is even
      allocated.
      
      This design point is not flexible when the order of context
      creation needs to be modified. Specifically, this can lead to
      problems when members of the context information structure are
      a part of an ordering dependency (i.e. - the 'work' structure
      embedded within the context).
      
      To remedy, the allocation is left as-is, inside of the existing
      create_context() routine and the initialization is transitioned
      to a new void routine, init_context(). At the same time, in
      anticipation of these routines not being called in sequence, a
      state boolean is added to the context information structure to
      track when the context has been initilized. The context teardown
      routine, destroy_context(), is modified to support being called
      with a non-initialized context.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Reviewed-by: default avatarUma Krishnan <[email protected]>
      Signed-off-by: Martin K. Petersen's avatarMartin K. Petersen <[email protected]>
      5e6632d1
  7. 10 Dec, 2015 2 commits
  8. 30 Oct, 2015 12 commits
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid bypassing context cleanup · a82544c7
      Matthew R. Ochs authored
      Contexts may be skipped over for cleanup in situations where contention
      for the adapter's table-list mutex is experienced in the presence of a
      signal during the execution of the release handler.
      
      This can lead to two known issues:
      
       - A hang condition on remove as that path tries to wait for users to
         cleanup - something that will never complete should this scenario play
         out as the user has already cleaned up from their perspective.
      
       - An Oops in the unmap_mapping_range() call that is made as part of
         the user waiting mechanism that is invoked on remove when contexts
         are found to still exist.
      
      The root cause of this issue can be found in get_context() and how the
      table-list mutex is acquired. As this code path is shared by several
      different access points within the driver, a decision was made during
      the development cycle to acquire this mutex in this location using the
      interruptible version of the mutex locking service. In almost all of
      the use-cases and environmental scenarios this holds up, even when the
      mutex is contended. However, for critical system threads (such as the
      release handler), failing to acquire the mutex and bailing with the
      intention of the user being able to try again later is unacceptable.
      
      In such a scenario, the context _must_ be derived as it is on an
      irreversible path to being freed. Without being able to derive the
      context, the code mistakenly assumes that it has already been freed
      and proceeds to free up the underlying CXL context resources. From
      this point on, any usage of [the now stale] CXL context resources
      will result in undefined behavior. This is root cause of the Oops
      mentioned as the second known issue as the mapping passed to the
      unmap_mapping_range() service is owned by the CXL context.
      
      To fix this problem, acquisition of the table-list mutex within
      get_context() is simply changed to use the uninterruptible version
      of the mutex locking service. This is safe as the timing windows for
      holding this mutex are short and also protected against blocking.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Acked-by: default avatarManoj Kumar <[email protected]>
      Reviewed-by: default avatarAndrew Donnellan <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      a82544c7
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid potential deadlock on EEH · aacb4ff6
      Matthew R. Ochs authored
      Ioctl threads that use scsi_execute() can run for an excessive amount
      of time due to the fact that they have lengthy timeouts and retry logic
      built in. Under normal operation this is not an issue. However, once EEH
      enters the picture, a long execution time coupled with the possibility
      that a timeout can trigger entry to the driver via registered reset
      callbacks becomes a liability.
      
      In particular, a deadlock can occur when an EEH event is encountered
      while in running in scsi_execute(). As part of the recovery, the EEH
      handler drains all currently running ioctls, waiting until they have
      completed before proceeding with a reset. As the scsi_execute()'s are
      situated on the ioctl path, the EEH handler will wait until they (and
      the remainder of the ioctl handler they're associated with) have
      completed. Normally this would not be much of an issue aside from the
      longer recovery period. Unfortunately, the scsi_execute() triggers a
      reset when it times out. The reset handler will see that the device is
      already being reset and wait until that reset completed. This creates
      a condition where the EEH handler becomes stuck, infinitely waiting for
      the ioctl thread to complete.
      
      To avoid this behavior, temporarily unmark the scsi_execute() threads
      as an ioctl thread by releasing the ioctl read semaphore. This allows
      the EEH handler to proceed with a recovery while the thread is still
      running. Once the scsi_execute() returns, the ioctl read semaphore is
      reacquired and the adapter state is rechecked in case it changed while
      inside of scsi_execute(). The state check will wait if the adapter is
      still being recovered or returns a failure if the recovery failed. In
      the event that the adapter reset failed, the failure is simply returned
      as the ioctl would be unable to continue.
      Reported-by: default avatarBrian King <[email protected]om>
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      aacb4ff6
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid corrupting adapter fops · 17ead26f
      Matthew R. Ochs authored
      The fops owned by the adapter can be corrupted in certain scenarios,
      opening a window where certain fops are temporarily NULLed before being
      reset to their proper value. This can potentially lead software to make
      incorrect decisions, leaving the user with the inability to function as
      intended.
      
      An example of this behavior can be observed when there are a number of
      users with a high rate of turn around (attach to LUN, perform an I/O,
      detach from LUN, repeat). Every so often a user is given a valid
      context and adapter file descriptor, but the file associated with the
      descriptor lacks the correct read permission bit (FMODE_CAN_READ) and
      thus the read system call bails before calling the valid read fop.
      
      Background:
      
      The fops is stored in the adapter structure to provide the ability to
      lookup the adapter structure from within the fop handler. CXL services
      use the file's private_data and at present, the CXL context does not
      have a private section. In an effort to limit areas of the cxlflash
      driver with code specific the superpipe function, a design choice was
      made to keep the details of the fops situated away from the legacy
      portions of the driver. This drove the behavior that the adapter fops
      is set at the beginning of the disk attach ioctl handler when there
      are no users present.
      
      The corruption that this fix remedies is due to the fact that the fops
      is initially defaulted to values found within a static structure. When
      the fops is handed down to the CXL services later in the attach path,
      certain services are patched. The fops structure remains correct until
      the user count drops to 0 and the fops is reset, triggering the process
      to repeat again. The user counts are tightly coupled with the creation
      and deletion of the user context. If multiple users perform a disk
      attach at the same time, when the user count is currently 0, some users
      can be in the middle of obtaining a file descriptor and have not yet
      reached the context creation code that [in addition to creating the
      context] increments the user count. Subsequent users coming in to
      perform the attach see that the user count is still 0, and reinitialize
      the fops, temporarily removing the patched fops. The users that are in
      the middle obtaining their file descriptor may then receive an invalid
      descriptor.
      
      The fix simply removes the user count altogether and moves the fops
      initialization to probe time such that it is only performed one time
      for the life of the adapter. In the future, if the CXL services adopt
      a private member for their context, that could be used to store the
      adapter structure reference and cxlflash could revert to a model that
      does not require an embedded fops.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarAndrew Donnellan <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      17ead26f
    • Matthew R. Ochs's avatar
      cxlflash: Correct spelling, grammar, and alignment mistakes · f15fbf8d
      Matthew R. Ochs authored
      There are several spelling and grammar mistakes throughout the
      driver. Additionally there are a handful of places where there
      are extra lines and unnecessary variables/statements. These are
      a nuisance and pollute the driver.
      
      Fix spelling and grammar issues. Update some comments for clarity and
      consistency. Remove extra lines and a few unneeded variables/statements.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarAndrew Donnellan <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      f15fbf8d
    • Matthew R. Ochs's avatar
      cxlflash: Fix to prevent EEH recovery failure · 8e782623
      Matthew R. Ochs authored
      The process_sense() routine can perform a read capacity which
      can take some time to complete. If an EEH occurs while waiting
      on the read capacity, the EEH handler will wait to obtain the
      context's mutex in order to put the context in an error state.
      The EEH handler will sit and wait until the context is free,
      but this wait can potentially last forever (deadlock) if the
      scsi_execute() that performs the read capacity experiences a
      timeout and calls into the reset callback. When that occurs,
      the reset callback sees that the device is already being reset
      and waits for the reset to complete. This leaves two threads
      waiting on the other.
      
      To address this issue, make the context unavailable to new,
      non-system owned threads and release the context while calling
      into process_sense(). After returning from process_sense() the
      context mutex is reacquired and the context is made available
      again. The context can be safely moved to the error state if
      needed during the unavailable window as no other threads will
      hold its reference.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      8e782623
    • Matthew R. Ochs's avatar
      cxlflash: Fix MMIO and endianness errors · 1786f4a0
      Matthew R. Ochs authored
      Sparse uncovered several errors with MMIO operations (accessing
      directly) and handling endianness. These can cause issues when
      running in different environments.
      
      Introduce __iomem and proper endianness tags/swaps where
      appropriate to make driver sparse clean.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarAndrew Donnellan <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      1786f4a0
    • Matthew R. Ochs's avatar
      cxlflash: Correct naming of limbo state and waitq · 439e85c1
      Matthew R. Ochs authored
      Limbo is not an accurate representation of this state and is
      also not consistent with the terminology that other drivers
      use to represent this concept. Rename the state and and its
      associated waitq to 'reset'.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      439e85c1
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid CXL services during EEH · 0a27ae51
      Matthew R. Ochs authored
      During an EEH freeze event, certain CXL services should not be
      called until after the hardware reset has taken place. Doing so
      can result in unnecessary failures and possibly cause other ill
      effects by triggering hardware accesses. This translates to a
      requirement to quiesce all threads that may potentially use CXL
      runtime service during this window. In particular, multiple ioctls
      make use of the CXL services when acting on contexts on behalf of
      the user. Thus, it is essential to 'drain' running ioctls _before_
      proceeding with handling the EEH freeze event.
      
      Create the ability to drain ioctls by wrapping the ioctl handler
      call in a read semaphore and then implementing a small routine that
      obtains the write semaphore, effectively creating a wait point for
      all currently executing ioctls.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      0a27ae51
    • Matthew R. Ochs's avatar
      cxlflash: Fix to avoid sizeof(bool) · e568e23f
      Matthew R. Ochs authored
      Using sizeof(bool) is considered poor form for various reasons and
      sparse warns us of that. Correct by changing type from bool to u8.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarDaniel Axtens <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      e568e23f
    • Matthew R. Ochs's avatar
      cxlflash: Fix potential oops following LUN removal · 22fe1ae8
      Matthew R. Ochs authored
      When a LUN is removed, the sdev that is associated with the LUN
      remains intact until its reference count drops to 0. In order
      to prevent an sdev from being removed while a context is still
      associated with it, obtain an additional reference per-context
      for each LUN attached to the context.
      
      This resolves a potential Oops in the release handler when a
      dealing with a LUN that has already been removed.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      22fe1ae8
    • Manoj Kumar's avatar
      cxlflash: Fix read capacity timeout · 471a5a60
      Manoj Kumar authored
      The timeout value for read capacity is too small. Certain devices
      may take longer to respond and thus the command may prematurely
      timeout. Additionally the literal used for the timeout is stale.
      
      Update the timeout to 30 seconds (matches the value used in sd.c)
      and rework the timeout literal to a more appropriate description.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      471a5a60
    • Manoj Kumar's avatar
      cxlflash: Replace magic numbers with literals · 3ebf2030
      Manoj Kumar authored
      Magic numbers are not meaningful and can create confusion. As a
      remedy, replace them with descriptive literals.
      
      Replace 512 with literal MAX_SECTOR_UNIT.
      Replace 5 with literal CMD_RETRIES.
      Signed-off-by: default avatarMatthew R. Ochs <[email protected]>
      Signed-off-by: default avatarManoj N. Kumar <[email protected]>
      Reviewed-by: default avatarBrian King <[email protected]>
      Reviewed-by: default avatarAndrew Donnellan <[email protected]>
      Reviewed-by: default avatarTomas Henzl <[email protected]>
      Signed-off-by: default avatarJames Bottomley <[email protected]>
      3ebf2030
  9. 27 Aug, 2015 2 commits