1. 12 Feb, 2019 5 commits
    • dmaengine: imx-dma: fix wrong callback invoke · dfaa3232
      Leonid Iziumtsev authored
      commit 341198ed upstream.
      
      When the "ld_queue" list is not empty, the next descriptor migrates
      into the "ld_active" list, and the "desc" variable is overwritten
      during that transition. dmaengine_desc_get_callback_invoke() later
      uses it as an argument, so the wrong callback is invoked.
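
The overwrite described above can be modelled in plain user-space C. This is only a sketch under stated assumptions: `struct desc`, `struct chan`, `complete_buggy()`, `complete_fixed()` and `mark()` are hypothetical stand-ins, not the driver's types or functions, and the "fixed" variant shows one possible way to avoid the problem rather than the upstream change itself.

```c
#include <stddef.h>

/* Hypothetical stand-ins for a DMA descriptor and channel state. */
struct desc {
    void (*callback)(void *arg);
    void *arg;
};

struct chan {
    struct desc *active;  /* models the head of "ld_active" */
    struct desc *queued;  /* models the head of "ld_queue" */
};

/* Buggy order: the local pointer is reused for the next descriptor. */
static void complete_buggy(struct chan *c)
{
    struct desc *d = c->active;

    if (c->queued) {
        d = c->queued;        /* overwrites the finished descriptor */
        c->active = d;
        c->queued = NULL;
    }
    if (d && d->callback)
        d->callback(d->arg);  /* runs the next descriptor's callback */
}

/* One possible fix: keep the finished descriptor in its own variable. */
static void complete_fixed(struct chan *c)
{
    struct desc *done = c->active;
    struct desc *next = c->queued;

    if (next) {
        c->active = next;
        c->queued = NULL;
    }
    if (done && done->callback)
        done->callback(done->arg);
}

/* Test callback: counts how often it was invoked. */
static void mark(void *arg)
{
    *(int *)arg += 1;
}
```

With one active and one queued descriptor, the buggy variant invokes the queued descriptor's callback, while the fixed variant invokes the callback of the descriptor that actually completed.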
      
      This behaviour has been present since
      commit fcaaba6c ("dmaengine: imx-dma: fix callback path in tasklet"),
      but commit 4cd13c21 ("softirq: Let ksoftirqd do its job") made
      things worse by increasing the possible delay between tasklet_schedule()
      in the DMA irq handler and the actual execution of the tasklet function.
      That leaves more time for a new DMA request to be submitted and
      put into the "ld_queue" list.
      
      This DMA issue has been seen to cause problems for the "mxc-mmc"
      driver. While stressing the system with heavy network traffic and
      simultaneously reading from and writing to an SD card, a timeout may occur:
      
      10013000.sdhci: mxcmci_watchdog: read time out (status = 0x30004900)
      
      That often leads to file system corruption.

      Signed-off-by: Leonid Iziumtsev <leonid.iziumtsev@gmail.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dmaengine: bcm2835: Fix abort of transactions · 4209b907
      Lukas Wunner authored
      commit 9e528c79 upstream.
      
      There are multiple issues with bcm2835_dma_abort() (which is called on
      termination of a transaction):
      
      * The algorithm to abort the transaction first pauses the channel by
        clearing the ACTIVE flag in the CS register, then waits for the PAUSED
        flag to clear.  Page 49 of the spec documents the latter as follows:
      
        "Indicates if the DMA is currently paused and not transferring data.
         This will occur if the active bit has been cleared [...]"
         https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf
      
        So the function is entering an infinite loop because it is waiting for
        PAUSED to clear which is always set due to the function having cleared
        the ACTIVE flag.  The only thing that's saving it from itself is the
        upper bound of 10000 loop iterations.
      
        The code comment says that the intention is to "wait for any current
        AXI transfer to complete", so the author probably wanted to check the
        WAITING_FOR_OUTSTANDING_WRITES flag instead.  Amend the function
        accordingly.
      
      * The CS register is only read at the beginning of the function.  It
        needs to be read again after pausing the channel and before checking
        for outstanding writes, otherwise writes which were issued between
        the register read at the beginning of the function and pausing the
        channel may not be waited for.
      
      * The function seeks to abort the transfer by writing 0 to the NEXTCONBK
        register and setting the ABORT and ACTIVE flags.  Thereby, the 0 in
        NEXTCONBK is sought to be loaded into the CONBLK_AD register.  However
        experimentation has shown this approach to not work:  The CONBLK_AD
        register remains the same as before and the CS register contains
        0x00000030 (PAUSED | DREQ_STOPS_DMA).  In other words, the control
        block is not aborted but merely paused and it will be resumed once the
        next DMA transaction is started.  That is absolutely not the desired
        behavior.
      
        A simpler approach is to set the channel's RESET flag instead.  This
        reliably zeroes the NEXTCONBK as well as the CS register.  It requires
        less code and only a single MMIO write.  This is also what popular
        user space DMA drivers do, e.g.:
        https://github.com/metachris/RPIO/blob/master/source/c_pwm/pwm.c
      
        Note that the spec is contradictory whether the NEXTCONBK register
        is writeable at all.  On the one hand, page 41 claims:
      
        "The value loaded into the NEXTCONBK register can be overwritten so
        that the linked list of Control Block data structures can be
        dynamically altered. However it is only safe to do this when the DMA
        is paused."
      
        On the other hand, page 40 specifies:
      
        "Only three registers in each channel's register set are directly
        writeable (CS, CONBLK_AD and DEBUG). The other registers (TI,
        SOURCE_AD, DEST_AD, TXFR_LEN, STRIDE & NEXTCONBK), are automatically
        loaded from a Control Block data structure held in external memory."
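
The three points above can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the driver code: `struct fake_chan`, `read_cs()`, `write_cs()`, `polls_until_paused_clears()` and `abort_sketch()` are hypothetical, the bit positions are my reading of the CS register layout in the datasheet, and the fake hardware simply reports PAUSED whenever ACTIVE is clear and lets outstanding writes drain after a few polls.

```c
#include <stdint.h>

/* Assumed CS register bits (taken from the datasheet, not the driver). */
#define CS_ACTIVE             (1u << 0)
#define CS_PAUSED             (1u << 4)
#define CS_WAITING_FOR_WRITES (1u << 6)
#define CS_RESET              (1u << 31)

/* Hypothetical model of one DMA channel. */
struct fake_chan {
    uint32_t cs;
    uint32_t nextconbk;
    int writes_in_flight;  /* polls left until outstanding writes drain */
};

/* Every read refreshes the status bits, as a real MMIO read would. */
static uint32_t read_cs(struct fake_chan *ch)
{
    if (ch->cs & CS_ACTIVE)
        ch->cs &= ~CS_PAUSED;
    else
        ch->cs |= CS_PAUSED;   /* PAUSED is set while ACTIVE is clear */

    if (ch->writes_in_flight > 0) {
        ch->writes_in_flight--;
        ch->cs |= CS_WAITING_FOR_WRITES;
    } else {
        ch->cs &= ~CS_WAITING_FOR_WRITES;
    }
    return ch->cs;
}

static void write_cs(struct fake_chan *ch, uint32_t val)
{
    if (val & CS_RESET) {      /* RESET zeroes CS and NEXTCONBK */
        ch->cs = 0;
        ch->nextconbk = 0;
        return;
    }
    ch->cs = val;
}

/* Old behaviour: waiting for PAUSED to clear only ends at the bound. */
static long polls_until_paused_clears(struct fake_chan *ch, long bound)
{
    long n = 0;

    write_cs(ch, ch->cs & ~CS_ACTIVE);
    while ((read_cs(ch) & CS_PAUSED) && n < bound)
        n++;
    return n;
}

/* Sketch of the amended flow: pause, re-read CS while waiting, reset. */
static int abort_sketch(struct fake_chan *ch)
{
    long timeout = 10000;

    write_cs(ch, ch->cs & ~CS_ACTIVE);
    while ((read_cs(ch) & CS_WAITING_FOR_WRITES) && --timeout)
        ;
    write_cs(ch, CS_RESET);    /* a single write ends the transfer */
    return timeout ? 0 : -1;
}
```

In this model the PAUSED wait always exhausts its bound, whereas the amended flow finishes as soon as the outstanding writes have drained and leaves both CS and NEXTCONBK at zero.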
      
      Fixes: 96286b57 ("dmaengine: Add support for BCM2835")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v3.14+
      Cc: Frank Pavlic <f.pavlic@kunbus.de>
      Cc: Martin Sperl <kernel@martin.sperl.org>
      Cc: Florian Meier <florian.meier@koalo.de>
      Cc: Clive Messer <clive.m.messer@gmail.com>
      Cc: Matthias Reichl <hias@horus.com>
      Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
      Acked-by: Florian Kauer <florian.kauer@koalo.de>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dmaengine: bcm2835: Fix interrupt race on RT · 06c383c9
      Lukas Wunner authored
      commit f7da7782 upstream.
      
      If IRQ handlers are threaded (either because CONFIG_PREEMPT_RT_BASE is
      enabled or "threadirqs" was passed on the command line) and if system
      load is sufficiently high that wakeup latency of IRQ threads degrades,
      SPI DMA transactions on the BCM2835 occasionally break like this:
      
      ks8851 spi0.0: SPI transfer timed out
      bcm2835-dma 3f007000.dma: DMA transfer could not be terminated
      ks8851 spi0.0 eth2: ks8851_rdfifo: spi_sync() failed
      
      The root cause is an assumption made by the DMA driver which is
      documented in a code comment in bcm2835_dma_terminate_all():
      
      /*
       * Stop DMA activity: we assume the callback will not be called
       * after bcm_dma_abort() returns (even if it does, it will see
       * c->desc is NULL and exit.)
       */
      
      That assumption falls apart if the IRQ handler bcm2835_dma_callback() is
      threaded: A client may terminate a descriptor and issue a new one
      before the IRQ handler had a chance to run. In fact the IRQ handler may
      miss an *arbitrary* number of descriptors. The result is the following
      race condition:
      
      1. A descriptor finishes, its interrupt is deferred to the IRQ thread.
      2. A client calls dma_terminate_async() which sets channel->desc = NULL.
      3. The client issues a new descriptor. Because channel->desc is NULL,
         bcm2835_dma_issue_pending() immediately starts the descriptor.
      4. Finally the IRQ thread runs and writes BCM2835_DMA_INT to the CS
         register to acknowledge the interrupt. This clears the ACTIVE flag,
         so the newly issued descriptor is paused in the middle of the
         transaction. Because channel->desc is not NULL, the IRQ thread
         finalizes the descriptor and tries to start the next one.
      
      I see two possible solutions: The first is to call synchronize_irq()
      in bcm2835_dma_issue_pending() to wait until the IRQ thread has
      finished before issuing a new descriptor. The downside of this approach
      is unnecessary latency if clients want to terminate and re-issue
      descriptors rapidly and have no use for an IRQ callback.
      (The SPI TX DMA channel is a case in point.)
      
      A better alternative is to make the IRQ thread recognize that it has
      missed descriptors and avoid finalizing the newly issued descriptor.
      So first of all, set the ACTIVE flag when acknowledging the interrupt.
      This keeps a newly issued descriptor running.
      
      If the descriptor was finished, the channel remains idle despite the
      ACTIVE flag being set. However the ACTIVE flag can then no longer be
      used to check whether the channel is idle, so instead check whether
      the register containing the current control block address is zero
      and finalize the current descriptor only if so.
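
The idea of keeping ACTIVE set on acknowledge and using the control block address as the idleness check can be sketched in user space as follows. Everything here is an assumption made for illustration: `struct fake_chan` and `irq_sketch()` are hypothetical, the bit positions are my reading of the datasheet, and the acknowledge is modelled as simply clearing INT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed CS register bits (taken from the datasheet, not the driver). */
#define CS_ACTIVE (1u << 0)
#define CS_INT    (1u << 2)

/* Hypothetical model of a channel as seen by the IRQ thread. */
struct fake_chan {
    uint32_t cs;
    uint32_t conblk_ad;  /* current control block address, 0 when idle */
    bool has_desc;       /* models channel->desc != NULL */
    int completed;       /* number of descriptors finalized */
};

static void irq_sketch(struct fake_chan *ch)
{
    /* Acknowledge the interrupt but keep ACTIVE set, so a descriptor
     * that was issued in the meantime keeps running. */
    ch->cs = (ch->cs & ~CS_INT) | CS_ACTIVE;

    if (!ch->has_desc)
        return;

    /* ACTIVE no longer says whether the channel is idle, so look at
     * the control block address instead. */
    if (ch->conblk_ad == 0) {
        ch->completed++;
        ch->has_desc = false;
    }
}
```

A late interrupt that arrives while a newly issued descriptor is still running (non-zero control block address) leaves that descriptor alone; only an idle channel causes the current descriptor to be finalized.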
      
      That way, there is no impact on latency and throughput if the client
      doesn't care for the interrupt: Only minimal additional overhead is
      introduced for non-cyclic descriptors as one further MMIO read is
      necessary per interrupt to check for idleness of the channel. Cyclic
      descriptors are sped up slightly by removing one MMIO write per
      interrupt.
      
      Fixes: 96286b57 ("dmaengine: Add support for BCM2835")
      Signed-off-by: Lukas Wunner <lukas@wunner.de>
      Cc: stable@vger.kernel.org # v3.14+
      Cc: Frank Pavlic <f.pavlic@kunbus.de>
      Cc: Martin Sperl <kernel@martin.sperl.org>
      Cc: Florian Meier <florian.meier@koalo.de>
      Cc: Clive Messer <clive.m.messer@gmail.com>
      Cc: Matthias Reichl <hias@horus.com>
      Tested-by: Stefan Wahren <stefan.wahren@i2se.com>
      Acked-by: Florian Kauer <florian.kauer@koalo.de>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    • dmaengine: sprd: Support DMA link-list cyclic callback · 9535baf8
      Eric Long authored
      [ Upstream commit 97dbd6ea ]
      
      The Spreadtrum DMA link-list mode is always one cyclic transfer,
      so we should clear the SPRD_DMA_LLIST_END flag for the link-list
      configuration. Also add cyclic callback support for the cyclic
      transfer.
      Signed-off-by: Eric Long <eric.long@spreadtrum.com>
      Signed-off-by: Baolin Wang <baolin.wang@linaro.org>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
    • dmaengine: xilinx_dma: Remove __aligned attribute on zynqmp_dma_desc_ll · 4baa4c53
      Nathan Chancellor authored
      [ Upstream commit aeaebcc1 ]
      
      Clang warns:
      
      drivers/dma/xilinx/zynqmp_dma.c:166:4: warning: attribute 'aligned' is
      ignored, place it after "struct" to apply attribute to type declaration
      [-Wignored-attributes]
      }; __aligned(64)
         ^
      ./include/linux/compiler_types.h:200:38: note: expanded from macro
      '__aligned'
                                                     ^
      1 warning generated.
      
      As Nick pointed out in the previous version of this patch, the author
      likely intended for this struct to be 8-byte (64-bit) aligned, which is
      already the default, rather than 64-byte aligned. Remove the hanging
      __aligned attribute.
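
The placement rule behind the warning can be checked with a small example using the GCC/Clang attribute syntax. This is a sketch: the struct names, field names and helper functions are hypothetical and do not reproduce the driver's descriptor layout.

```c
#include <stddef.h>
#include <stdint.h>

/* Attribute between the closing brace and the semicolon: it applies
 * to the type, which becomes 64-byte aligned. */
struct ll_desc_aligned {
    uint64_t addr;
    uint32_t size;
    uint32_t ctrl;
} __attribute__((aligned(64)));

/* No attribute: the struct takes the alignment of its widest member. */
struct ll_desc_default {
    uint64_t addr;
    uint32_t size;
    uint32_t ctrl;
};

static size_t aligned_desc_alignment(void)
{
    return _Alignof(struct ll_desc_aligned);
}

static size_t default_desc_alignment(void)
{
    return _Alignof(struct ll_desc_default);
}
```

An attribute written after the semicolon is no longer part of the struct declaration, which is why Clang reports it as ignored; dropping it leaves the type with its natural alignment, as in `struct ll_desc_default`.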
      
      Fixes: b0cc417c ("dmaengine: Add Xilinx zynqmp dma engine driver support")
      Reported-by: Nick Desaulniers <ndesaulniers@google.com>
      Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
      Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
      Signed-off-by: Sasha Levin <sashal@kernel.org>
  2. 06 Dec, 2018 1 commit
  3. 05 Dec, 2018 5 commits
  4. 29 Nov, 2018 2 commits
  5. 16 Oct, 2018 1 commit
  6. 15 Oct, 2018 1 commit
  7. 09 Oct, 2018 1 commit
  8. 07 Oct, 2018 13 commits
  9. 05 Oct, 2018 2 commits
  10. 02 Oct, 2018 3 commits
  11. 18 Sep, 2018 2 commits
    • driver/dma/ioat: Call del_timer_sync() without holding prep_lock · cfb03be6
      Waiman Long authored
      The following lockdep splat was observed:
      
      [ 1222.241750] ======================================================
      [ 1222.271301] WARNING: possible circular locking dependency detected
      [ 1222.301060] 4.16.0-10.el8+5.x86_64+debug #1 Not tainted
      [ 1222.326659] ------------------------------------------------------
      [ 1222.356565] systemd-shutdow/1 is trying to acquire lock:
      [ 1222.382660]  ((&ioat_chan->timer)){+.-.}, at: [<00000000f71e1a28>] del_timer_sync+0x5/0xf0
      [ 1222.422928]
      [ 1222.422928] but task is already holding lock:
      [ 1222.451743]  (&(&ioat_chan->prep_lock)->rlock){+.-.}, at: [<000000008ea98b12>] ioat_shutdown+0x86/0x100 [ioatdma]
         :
      [ 1223.524987] Chain exists of:
      [ 1223.524987]   (&ioat_chan->timer) --> &(&ioat_chan->cleanup_lock)->rlock --> &(&ioat_chan->prep_lock)->rlock
      [ 1223.524987]
      [ 1223.594082]  Possible unsafe locking scenario:
      [ 1223.594082]
      [ 1223.622630]        CPU0                    CPU1
      [ 1223.645080]        ----                    ----
      [ 1223.667404]   lock(&(&ioat_chan->prep_lock)->rlock);
      [ 1223.691535]                                lock(&(&ioat_chan->cleanup_lock)->rlock);
      [ 1223.728657]                                lock(&(&ioat_chan->prep_lock)->rlock);
      [ 1223.765122]   lock((&ioat_chan->timer));
      [ 1223.784095]
      [ 1223.784095]  *** DEADLOCK ***
      [ 1223.784095]
      [ 1223.813492] 4 locks held by systemd-shutdow/1:
      [ 1223.834677]  #0:  (reboot_mutex){+.+.}, at: [<0000000056d33456>] SYSC_reboot+0x10f/0x300
      [ 1223.873310]  #1:  (&dev->mutex){....}, at: [<00000000258dfdd7>] device_shutdown+0x1c8/0x660
      [ 1223.913604]  #2:  (&dev->mutex){....}, at: [<0000000068331147>] device_shutdown+0x1d6/0x660
      [ 1223.954000]  #3:  (&(&ioat_chan->prep_lock)->rlock){+.-.}, at: [<000000008ea98b12>] ioat_shutdown+0x86/0x100 [ioatdma]
      
      In the ioat_shutdown() function:
      
      	spin_lock_bh(&ioat_chan->prep_lock);
      	set_bit(IOAT_CHAN_DOWN, &ioat_chan->state);
      	del_timer_sync(&ioat_chan->timer);
      	spin_unlock_bh(&ioat_chan->prep_lock);
      
      According to the synchronization rule for the del_timer_sync() function,
      the caller must not hold locks which would prevent completion of the
      timer's handler.
      
      The timer structure has its own lock that manages its synchronization.
      Setting the IOAT_CHAN_DOWN bit should prevent other CPUs from
      trying to use that device anyway, so there is probably no need to call
      del_timer_sync() while holding the prep_lock. The del_timer_sync()
      call is therefore moved outside of the prep_lock critical section to
      prevent the circular lock dependency.

      Signed-off-by: Waiman Long <longman@redhat.com>
      Reviewed-by: Dave Jiang <dave.jiang@intel.com>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
    • dmaengine: mcf-edma: avoid warning for wrong pointer cast · 5b7d0c94
      Angelo Dureghello authored
      This patch fixes the following compilation warning
      reported during x86_64 allmodconfig build:
      
        drivers/dma/mcf-edma.c: In function 'mcf_edma_filter_fn':
        drivers/dma/mcf-edma.c:296:33: warning: cast from pointer to
      integer of different size [-Wpointer-to-int-cast]
              return (mcf_chan->slave_id == (u32)param);
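
A common way to silence this class of warning is to convert the pointer through an integer type that is guaranteed to be wide enough. The sketch below is an assumption-laden illustration, not the patch itself: `struct fake_chan` and `filter_sketch()` are hypothetical, and `uintptr_t` from `<stdint.h>` stands in for whatever type the driver uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical channel with a numeric slave id. */
struct fake_chan {
    uint32_t slave_id;
};

/* The filter parameter carries a small integer inside a void pointer. */
static bool filter_sketch(const struct fake_chan *ch, void *param)
{
    /* uintptr_t can hold any object pointer value, so the conversion is
     * well-sized on both 32-bit and 64-bit builds. */
    return ch->slave_id == (uintptr_t)param;
}
```

Casting directly to a 32-bit type on a 64-bit build is what triggers -Wpointer-to-int-cast, because the pointer is wider than the target integer.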

      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Angelo Dureghello <angelo@sysam.it>
      Signed-off-by: Vinod Koul <vkoul@kernel.org>
  12. 11 Sep, 2018 4 commits