==============================================
Ordering I/O writes to memory-mapped addresses
==============================================

On some platforms, so-called memory-mapped I/O is weakly ordered.  On such
platforms, driver writers are responsible for ensuring that I/O writes to
memory-mapped addresses on their device arrive in the order intended.  This is
typically done by reading a 'safe' device or bridge register, causing the I/O
chipset to flush pending writes to the device before any reads are posted.  A
driver would usually do this immediately before leaving a critical section
protected by a spinlock.  This ensures that subsequent writes to I/O space
arrive only after all prior writes (much like a memory barrier op, mb(), only
with respect to I/O).

A more concrete example from a hypothetical device driver::

		...
	CPU A:  spin_lock_irqsave(&dev_lock, flags);
	CPU A:  val = readl(my_status);
	CPU A:  ...
	CPU A:  writel(newval, ring_ptr);
	CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
		...
	CPU B:  spin_lock_irqsave(&dev_lock, flags);
	CPU B:  val = readl(my_status);
	CPU B:  ...
	CPU B:  writel(newval2, ring_ptr);
	CPU B:  spin_unlock_irqrestore(&dev_lock, flags);
		...

In the case above, the device may receive newval2 before it receives newval,
which could cause problems.  Fixing it is easy enough though::

		...
	CPU A:  spin_lock_irqsave(&dev_lock, flags);
	CPU A:  val = readl(my_status);
	CPU A:  ...
	CPU A:  writel(newval, ring_ptr);
	CPU A:  (void)readl(safe_register); /* maybe a config register? */
	CPU A:  spin_unlock_irqrestore(&dev_lock, flags);
		...
	CPU B:  spin_lock_irqsave(&dev_lock, flags);
	CPU B:  val = readl(my_status);
	CPU B:  ...
	CPU B:  writel(newval2, ring_ptr);
	CPU B:  (void)readl(safe_register); /* maybe a config register? */
	CPU B:  spin_unlock_irqrestore(&dev_lock, flags);

Here, the reads from safe_register will cause the I/O chipset to flush any
pending writes before actually posting the read to the chipset, preventing
possible data corruption.
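
The effect can be illustrated with a small user-space model. This is only a
sketch and not kernel code: the per-CPU posting queues, the drain order, and
every name in it (posted_bridge, model_writel, model_readl_safe, model_drain,
run_scenario) are assumptions made up for illustration. It assumes the chipset
holds posted writes per CPU and may deliver them in either order, and that a
read from a safe register flushes the reading CPU's pending writes first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy user-space model, not kernel code: every name here is hypothetical. */

#define NCPUS      2
#define QUEUE_LEN  8
#define LOG_LEN    16

struct posted_bridge {
	uint32_t pending[NCPUS][QUEUE_LEN]; /* writes posted but not yet delivered */
	size_t   npending[NCPUS];
	uint32_t device_log[LOG_LEN];       /* values in the order the device saw them */
	size_t   nlog;
};

/* Model of writel(): the write is posted and may sit in the bridge. */
static void model_writel(struct posted_bridge *b, int cpu, uint32_t val)
{
	assert(b->npending[cpu] < QUEUE_LEN);
	b->pending[cpu][b->npending[cpu]++] = val;
}

/* Deliver every write posted by one CPU to the device, oldest first. */
static void model_drain(struct posted_bridge *b, int cpu)
{
	for (size_t i = 0; i < b->npending[cpu]; i++) {
		assert(b->nlog < LOG_LEN);
		b->device_log[b->nlog++] = b->pending[cpu][i];
	}
	b->npending[cpu] = 0;
}

/* Model of readl() on a safe register: pending writes are flushed first. */
static uint32_t model_readl_safe(struct posted_bridge *b, int cpu)
{
	model_drain(b, cpu);
	return 0; /* the value read is irrelevant, only the flush matters */
}

/*
 * CPU A (index 0) writes newval and CPU B (index 1) then writes newval2,
 * each inside its own critical section.  Afterwards the modeled bridge
 * drains CPU B's queue first, which is the unlucky case described above.
 */
static void run_scenario(struct posted_bridge *b, int flush_before_unlock)
{
	const uint32_t newval = 1, newval2 = 2;

	memset(b, 0, sizeof(*b));

	model_writel(b, 0, newval);            /* CPU A, lock held */
	if (flush_before_unlock)
		(void)model_readl_safe(b, 0);
	/* CPU A unlocks here, then CPU B takes the lock */
	model_writel(b, 1, newval2);           /* CPU B, lock held */
	if (flush_before_unlock)
		(void)model_readl_safe(b, 1);

	model_drain(b, 1);                     /* bridge drains CPU B first */
	model_drain(b, 0);
}
```

Calling run_scenario() with flush_before_unlock set to 0 leaves newval2 ahead
of newval in device_log, while setting it to 1 records newval first, because
each write has already reached the modeled device before the lock is dropped.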