This project is mirrored from https://git.kernel.org/pub/scm/linux/kernel/git/wagi/linux-cip-rt.git. The repository failed to update .
Repository mirroring has been paused due to too many failed attempts, and can be resumed by a project maintainer.
Last successful update .
  1. 21 Jul, 2012 1 commit
  2. 19 Jul, 2012 1 commit
    • Linus Torvalds's avatar
      Make wait_for_device_probe() also do scsi_complete_async_scans() · eea03c20
      Linus Torvalds authored
      Commit a7a20d10 ("sd: limit the scope of the async probe domain")
      make the SCSI device probing run device discovery in it's own async
      domain.
      
      However, as a result, the partition detection was no longer synchronized
      by async_synchronize_full() (which, despite the name, only synchronizes
      the global async space, not all of them).  Which in turn meant that
      "wait_for_device_probe()" would not wait for the SCSI partitions to be
      parsed.
      
      And "wait_for_device_probe()" was what the boot time init code relied on
      for mounting the root filesystem.
      
      Now, most people never noticed this, because not only is it
      timing-dependent, but modern distributions all use initrd.  So the root
      filesystem isn't actually on a disk at all.  And then before they
      actually mount the final disk filesystem, they will have loaded the
      scsi-wait-scan module, which not only does the expected
      wait_for_device_probe(), but also does scsi_complete_async_scans().
      
      [ Side note: scsi_complete_async_scans() had also been partially broken,
        but that was fixed in commit 43a8d39d ("fix async probe
        regression"), so that same commit a7a20d10 had actually broken
        setups even if you used scsi-wait-scan explicitly ]
      
      Solve this problem by just moving the scsi_complete_async_scans() call
      into wait_for_device_probe().  Everybody who wants to wait for device
      probing to finish really wants the SCSI probing to complete, so there's
      no reason not to do this.
      
      So now "wait_for_device_probe()" really does what the name implies, and
      properly waits for device probing to finish.  This also removes the now
      unnecessary extra calls to scsi_complete_async_scans().
      Reported-and-tested-by: 's avatarArtem S. Tashkinov <t.artem@mailcity.com>
      Cc: Dan Williams <dan.j.williams@gmail.com>
      Cc: Alan Stern <stern@rowland.harvard.edu>
      Cc: James Bottomley <jbottomley@parallels.com>
      Cc: Borislav Petkov <bp@amd64.org>
      Cc: linux-scsi <linux-scsi@vger.kernel.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      eea03c20
  3. 18 Jul, 2012 1 commit
    • Sage Weil's avatar
      libceph: fix messenger retry · 5bdca4e0
      Sage Weil authored
      In ancient times, the messenger could both initiate and accept connections.
      An artifact if that was data structures to store/process an incoming
      ceph_msg_connect request and send an outgoing ceph_msg_connect_reply.
      Sadly, the negotiation code was referencing those structures and ignoring
      important information (like the peer's connect_seq) from the correct ones.
      
      Among other things, this fixes tight reconnect loops where the server sends
      RETRY_SESSION and we (the client) retries with the same connect_seq as last
      time.  This bug pretty easily triggered by injecting socket failures on the
      MDS and running some fs workload like workunits/direct_io/test_sync_io.
      Signed-off-by: 's avatarSage Weil <sage@inktank.com>
      5bdca4e0
  4. 17 Jul, 2012 2 commits
    • Michael Kerrisk's avatar
      PM: Rename CAP_EPOLLWAKEUP to CAP_BLOCK_SUSPEND · d9914cf6
      Michael Kerrisk authored
      As discussed in
      http://thread.gmane.org/gmane.linux.kernel/1249726/focus=1288990,
      the capability introduced in 4d7e30d9
      to govern EPOLLWAKEUP seems misnamed: this capability is about governing
      the ability to suspend the system, not using a particular API flag
      (EPOLLWAKEUP). We should make the name of the capability more general
      to encourage reuse in related cases. (Whether or not this capability
      should also be used to govern the use of /sys/power/wake_lock is a
      question that needs to be separately resolved.)
      
      This patch renames the capability to CAP_BLOCK_SUSPEND. In order to ensure
      that the old capability name doesn't make it out into the wild, could you
      please apply and push up the tree to ensure that it is incorporated
      for the 3.5 release.
      Signed-off-by: 's avatarMichael Kerrisk <mtk.manpages@gmail.com>
      Acked-by: 's avatarSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: 's avatarRafael J. Wysocki <rjw@sisk.pl>
      d9914cf6
    • Lin Ming's avatar
      ipvs: fix oops on NAT reply in br_nf context · 9e33ce45
      Lin Ming authored
      IPVS should not reset skb->nf_bridge in FORWARD hook
      by calling nf_reset for NAT replies. It triggers oops in
      br_nf_forward_finish.
      
      [  579.781508] BUG: unable to handle kernel NULL pointer dereference at 0000000000000004
      [  579.781669] IP: [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
      [  579.781792] PGD 218f9067 PUD 0
      [  579.781865] Oops: 0000 [#1] SMP
      [  579.781945] CPU 0
      [  579.781983] Modules linked in:
      [  579.782047]
      [  579.782080]
      [  579.782114] Pid: 4644, comm: qemu Tainted: G        W    3.5.0-rc5-00006-g95e69f9 #282 Hewlett-Packard  /30E8
      [  579.782300] RIP: 0010:[<ffffffff817b1ca5>]  [<ffffffff817b1ca5>] br_nf_forward_finish+0x58/0x112
      [  579.782455] RSP: 0018:ffff88007b003a98  EFLAGS: 00010287
      [  579.782541] RAX: 0000000000000008 RBX: ffff8800762ead00 RCX: 000000000001670a
      [  579.782653] RDX: 0000000000000000 RSI: 000000000000000a RDI: ffff8800762ead00
      [  579.782845] RBP: ffff88007b003ac8 R08: 0000000000016630 R09: ffff88007b003a90
      [  579.782957] R10: ffff88007b0038e8 R11: ffff88002da37540 R12: ffff88002da01a02
      [  579.783066] R13: ffff88002da01a80 R14: ffff88002d83c000 R15: ffff88002d82a000
      [  579.783177] FS:  0000000000000000(0000) GS:ffff88007b000000(0063) knlGS:00000000f62d1b70
      [  579.783306] CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
      [  579.783395] CR2: 0000000000000004 CR3: 00000000218fe000 CR4: 00000000000027f0
      [  579.783505] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  579.783684] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      [  579.783795] Process qemu (pid: 4644, threadinfo ffff880021b20000, task ffff880021aba760)
      [  579.783919] Stack:
      [  579.783959]  ffff88007693cedc ffff8800762ead00 ffff88002da01a02 ffff8800762ead00
      [  579.784110]  ffff88002da01a02 ffff88002da01a80 ffff88007b003b18 ffffffff817b26c7
      [  579.784260]  ffff880080000000 ffffffff81ef59f0 ffff8800762ead00 ffffffff81ef58b0
      [  579.784477] Call Trace:
      [  579.784523]  <IRQ>
      [  579.784562]
      [  579.784603]  [<ffffffff817b26c7>] br_nf_forward_ip+0x275/0x2c8
      [  579.784707]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
      [  579.784797]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
      [  579.784906]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
      [  579.784995]  [<ffffffff817ac32e>] ? br_dev_queue_push_xmit+0xae/0xae
      [  579.785175]  [<ffffffff8187fa95>] ? _raw_write_unlock_bh+0x19/0x1b
      [  579.785179]  [<ffffffff817ac417>] __br_forward+0x97/0xa2
      [  579.785179]  [<ffffffff817ad366>] br_handle_frame_finish+0x1a6/0x257
      [  579.785179]  [<ffffffff817b2386>] br_nf_pre_routing_finish+0x26d/0x2cb
      [  579.785179]  [<ffffffff817b2cf0>] br_nf_pre_routing+0x55d/0x5c1
      [  579.785179]  [<ffffffff81704b58>] nf_iterate+0x47/0x7d
      [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
      [  579.785179]  [<ffffffff81704bfb>] nf_hook_slow+0x6d/0x102
      [  579.785179]  [<ffffffff817ad1c0>] ? br_handle_local_finish+0x44/0x44
      [  579.785179]  [<ffffffff81551525>] ? sky2_poll+0xb35/0xb54
      [  579.785179]  [<ffffffff817ad62a>] br_handle_frame+0x213/0x229
      [  579.785179]  [<ffffffff817ad417>] ? br_handle_frame_finish+0x257/0x257
      [  579.785179]  [<ffffffff816e3b47>] __netif_receive_skb+0x2b4/0x3f1
      [  579.785179]  [<ffffffff816e69fc>] process_backlog+0x99/0x1e2
      [  579.785179]  [<ffffffff816e6800>] net_rx_action+0xdf/0x242
      [  579.785179]  [<ffffffff8107e8a8>] __do_softirq+0xc1/0x1e0
      [  579.785179]  [<ffffffff8135a5ba>] ? trace_hardirqs_off_thunk+0x3a/0x6c
      [  579.785179]  [<ffffffff8188812c>] call_softirq+0x1c/0x30
      
      The steps to reproduce as follow,
      
      1. On Host1, setup brige br0(192.168.1.106)
      2. Boot a kvm guest(192.168.1.105) on Host1 and start httpd
      3. Start IPVS service on Host1
         ipvsadm -A -t 192.168.1.106:80 -s rr
         ipvsadm -a -t 192.168.1.106:80 -r 192.168.1.105:80 -m
      4. Run apache benchmark on Host2(192.168.1.101)
         ab -n 1000 http://192.168.1.106/
      
      ip_vs_reply4
        ip_vs_out
          handle_response
            ip_vs_notrack
              nf_reset()
              {
                skb->nf_bridge = NULL;
              }
      
      Actually, IPVS wants in this case just to replace nfct
      with untracked version. So replace the nf_reset(skb) call
      in ip_vs_notrack() with a nf_conntrack_put(skb->nfct) call.
      Signed-off-by: 's avatarLin Ming <mlin@ss.pku.edu.cn>
      Signed-off-by: 's avatarJulian Anastasov <ja@ssi.bg>
      Signed-off-by: 's avatarSimon Horman <horms@verge.net.au>
      Signed-off-by: 's avatarPablo Neira Ayuso <pablo@netfilter.org>
      9e33ce45
  5. 11 Jul, 2012 5 commits
    • Yinghai Lu's avatar
      memblock: free allocated memblock_reserved_regions later · 29f67386
      Yinghai Lu authored
      memblock_free_reserved_regions() calls memblock_free(), but
      memblock_free() would double reserved.regions too, so we could free the
      old range for reserved.regions.
      
      Also tj said there is another bug which could be related to this.
      
      | I don't think we're saving any noticeable
      | amount by doing this "free - give it to page allocator - reserve
      | again" dancing.  We should just allocate regions aligned to page
      | boundaries and free them later when memblock is no longer in use.
      
      in that case, when DEBUG_PAGEALLOC, will get panic:
      
           memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39
        BUG: unable to handle kernel paging request at ffff88102febd948
        IP: [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
        PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160
        Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
        CPU 0
        Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447
        RIP: 0010:[<ffffffff836a5774>]  [<ffffffff836a5774>] __next_free_mem_range+0x9b/0x155
      
      See the discussion at https://lkml.org/lkml/2012/6/13/469
      
      So try to allocate with PAGE_SIZE alignment and free it later.
      Reported-by: 's avatarSasha Levin <levinsasha928@gmail.com>
      Acked-by: 's avatarTejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: 's avatarYinghai Lu <yinghai@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      29f67386
    • Yinghai Lu's avatar
      mm: sparse: fix usemap allocation above node descriptor section · 99ab7b19
      Yinghai Lu authored
      After commit f5bf18fa ("bootmem/sparsemem: remove limit constraint
      in alloc_bootmem_section"), usemap allocations may easily be placed
      outside the optimal section that holds the node descriptor, even if
      there is space available in that section.  This results in unnecessary
      hotplug dependencies that need to have the node unplugged before the
      section holding the usemap.
      
      The reason is that the bootmem allocator doesn't guarantee a linear
      search starting from the passed allocation goal but may start out at a
      much higher address absent an upper limit.
      
      Fix this by trying the allocation with the limit at the section end,
      then retry without if that fails.  This keeps the fix from f5bf18fa
      of not panicking if the allocation does not fit in the section, but
      still makes sure to try to stay within the section at first.
      Signed-off-by: 's avatarYinghai Lu <yinghai@kernel.org>
      Signed-off-by: 's avatarJohannes Weiner <hannes@cmpxchg.org>
      Cc: <stable@vger.kernel.org>	[3.3.x, 3.4.x]
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      99ab7b19
    • Jiang Liu's avatar
      memory hotplug: fix invalid memory access caused by stale kswapd pointer · d8adde17
      Jiang Liu authored
      kswapd_stop() is called to destroy the kswapd work thread when all memory
      of a NUMA node has been offlined.  But kswapd_stop() only terminates the
      work thread without resetting NODE_DATA(nid)->kswapd to NULL.  The stale
      pointer will prevent kswapd_run() from creating a new work thread when
      adding memory to the memory-less NUMA node again.  Eventually the stale
      pointer may cause invalid memory access.
      
      An example stack dump as below. It's reproduced with 2.6.32, but latest
      kernel has the same issue.
      
        BUG: unable to handle kernel NULL pointer dereference at (null)
        IP: [<ffffffff81051a94>] exit_creds+0x12/0x78
        PGD 0
        Oops: 0000 [#1] SMP
        last sysfs file: /sys/devices/system/memory/memory391/state
        CPU 11
        Modules linked in: cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq microcode fuse loop dm_mod tpm_tis rtc_cmos i2c_i801 rtc_core tpm serio_raw pcspkr sg tpm_bios igb i2c_core iTCO_wdt rtc_lib mptctl iTCO_vendor_support button dca bnx2 usbhid hid uhci_hcd ehci_hcd usbcore sd_mod crc_t10dif edd ext3 mbcache jbd fan ide_pci_generic ide_core ata_generic ata_piix libata thermal processor thermal_sys hwmon mptsas mptscsih mptbase scsi_transport_sas scsi_mod
        Pid: 7949, comm: sh Not tainted 2.6.32.12-qiuxishi-5-default #92 Tecal RH2285
        RIP: 0010:exit_creds+0x12/0x78
        RSP: 0018:ffff8806044f1d78  EFLAGS: 00010202
        RAX: 0000000000000000 RBX: ffff880604f22140 RCX: 0000000000019502
        RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000
        RBP: ffff880604f22150 R08: 0000000000000000 R09: ffffffff81a4dc10
        R10: 00000000000032a0 R11: ffff880006202500 R12: 0000000000000000
        R13: 0000000000c40000 R14: 0000000000008000 R15: 0000000000000001
        FS:  00007fbc03d066f0(0000) GS:ffff8800282e0000(0000) knlGS:0000000000000000
        CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
        CR2: 0000000000000000 CR3: 000000060f029000 CR4: 00000000000006e0
        DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
        Process sh (pid: 7949, threadinfo ffff8806044f0000, task ffff880603d7c600)
        Stack:
         ffff880604f22140 ffffffff8103aac5 ffff880604f22140 ffffffff8104d21e
         ffff880006202500 0000000000008000 0000000000c38000 ffffffff810bd5b1
         0000000000000000 ffff880603d7c600 00000000ffffdd29 0000000000000003
        Call Trace:
          __put_task_struct+0x5d/0x97
          kthread_stop+0x50/0x58
          offline_pages+0x324/0x3da
          memory_block_change_state+0x179/0x1db
          store_mem_state+0x9e/0xbb
          sysfs_write_file+0xd0/0x107
          vfs_write+0xad/0x169
          sys_write+0x45/0x6e
          system_call_fastpath+0x16/0x1b
        Code: ff 4d 00 0f 94 c0 84 c0 74 08 48 89 ef e8 1f fd ff ff 5b 5d 31 c0 41 5c c3 53 48 8b 87 20 06 00 00 48 89 fb 48 8b bf 18 06 00 00 <8b> 00 48 c7 83 18 06 00 00 00 00 00 00 f0 ff 0f 0f 94 c0 84 c0
        RIP  exit_creds+0x12/0x78
         RSP <ffff8806044f1d78>
        CR2: 0000000000000000
      
      [akpm@linux-foundation.org: add pglist_data.kswapd locking comments]
      Signed-off-by: 's avatarXishi Qiu <qiuxishi@huawei.com>
      Signed-off-by: 's avatarJiang Liu <jiang.liu@huawei.com>
      Acked-by: 's avatarKAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: 's avatarKOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Acked-by: 's avatarMel Gorman <mgorman@suse.de>
      Acked-by: 's avatarDavid Rientjes <rientjes@google.com>
      Reviewed-by: 's avatarMinchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      d8adde17
    • Thomas Gleixner's avatar
      timekeeping: Provide hrtimer update function · f6c06abf
      Thomas Gleixner authored
      To finally fix the infamous leap second issue and other race windows
      caused by functions which change the offsets between the various time
      bases (CLOCK_MONOTONIC, CLOCK_REALTIME and CLOCK_BOOTTIME) we need a
      function which atomically gets the current monotonic time and updates
      the offsets of CLOCK_REALTIME and CLOCK_BOOTTIME with minimalistic
      overhead. The previous patch which provides ktime_t offsets allows us
      to make this function almost as cheap as ktime_get() which is going to
      be replaced in hrtimer_interrupt().
      Signed-off-by: 's avatarThomas Gleixner <tglx@linutronix.de>
      Reviewed-by: 's avatarIngo Molnar <mingo@kernel.org>
      Acked-by: 's avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: 's avatarPrarit Bhargava <prarit@redhat.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: 's avatarJohn Stultz <johnstul@us.ibm.com>
      Link: http://lkml.kernel.org/r/1341960205-56738-7-git-send-email-johnstul@us.ibm.comSigned-off-by: 's avatarThomas Gleixner <tglx@linutronix.de>
      f6c06abf
    • John Stultz's avatar
      hrtimer: Provide clock_was_set_delayed() · f55a6faa
      John Stultz authored
      clock_was_set() cannot be called from hard interrupt context because
      it calls on_each_cpu().
      
      For fixing the widely reported leap seconds issue it is necessary to
      call it from hard interrupt context, i.e. the timer tick code, which
      does the timekeeping updates.
      
      Provide a new function which denotes it in the hrtimer cpu base
      structure of the cpu on which it is called and raise the hrtimer
      softirq. We then execute the clock_was_set() notificiation from
      softirq context in run_hrtimer_softirq(). The hrtimer softirq is
      rarely used, so polling the flag there is not a performance issue.
      
      [ tglx: Made it depend on CONFIG_HIGH_RES_TIMERS. We really should get
        rid of all this ifdeffery ASAP ]
      Signed-off-by: 's avatarJohn Stultz <johnstul@us.ibm.com>
      Reported-by: 's avatarJan Engelhardt <jengelh@inai.de>
      Reviewed-by: 's avatarIngo Molnar <mingo@kernel.org>
      Acked-by: 's avatarPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: 's avatarPrarit Bhargava <prarit@redhat.com>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1341960205-56738-2-git-send-email-johnstul@us.ibm.comSigned-off-by: 's avatarThomas Gleixner <tglx@linutronix.de>
      f55a6faa
  6. 10 Jul, 2012 1 commit
    • Alan Stern's avatar
      PCI: EHCI: fix crash during suspend on ASUS computers · dbf0e4c7
      Alan Stern authored
      Quite a few ASUS computers experience a nasty problem, related to the
      EHCI controllers, when going into system suspend.  It was observed
      that the problem didn't occur if the controllers were not put into the
      D3 power state before starting the suspend, and commit
      151b6128 (USB: EHCI: fix crash during
      suspend on ASUS computers) was created to do this.
      
      It turned out this approach messed up other computers that didn't have
      the problem -- it prevented USB wakeup from working.  Consequently
      commit c2fb8a3f (USB: add
      NO_D3_DURING_SLEEP flag and revert 151b6128) was merged; it
      reverted the earlier commit and added a whitelist of known good board
      names.
      
      Now we know the actual cause of the problem.  Thanks to AceLan Kao for
      tracking it down.
      
      According to him, an engineer at ASUS explained that some of their
      BIOSes contain a bug that was added in an attempt to work around a
      problem in early versions of Windows.  When the computer goes into S3
      suspend, the BIOS tries to verify that the EHCI controllers were first
      quiesced by the OS.  Nothing's wrong with this, but the BIOS does it
      by checking that the PCI COMMAND registers contain 0 without checking
      the controllers' power state.  If the register isn't 0, the BIOS
      assumes the controller needs to be quiesced and tries to do so.  This
      involves making various MMIO accesses to the controller, which don't
      work very well if the controller is already in D3.  The end result is
      a system hang or memory corruption.
      
      Since the value in the PCI COMMAND register doesn't matter once the
      controller has been suspended, and since the value will be restored
      anyway when the controller is resumed, we can work around the BIOS bug
      simply by setting the register to 0 during system suspend.  This patch
      (as1590) does so and also reverts the second commit mentioned above,
      which is now unnecessary.
      
      In theory we could do this for every PCI device.  However to avoid
      introducing new problems, the patch restricts itself to EHCI host
      controllers.
      
      Finally the affected systems can suspend with USB wakeup working
      properly.
      
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=37632
      Reference: https://bugzilla.kernel.org/show_bug.cgi?id=42728Based-on-patch-by: 's avatarAceLan Kao <acelan.kao@canonical.com>
      Signed-off-by: 's avatarAlan Stern <stern@rowland.harvard.edu>
      Tested-by: 's avatarDâniel Fraga <fragabr@gmail.com>
      Tested-by: 's avatarJavier Marcet <jmarcet@gmail.com>
      Tested-by: 's avatarAndrey Rahmatullin <wrar@wrar.name>
      Tested-by: 's avatarOleksij Rempel <bug-track@fisher-privat.net>
      Tested-by: 's avatarPavel Pisa <pisa@cmp.felk.cvut.cz>
      Cc: stable <stable@vger.kernel.org>
      Acked-by: 's avatarBjorn Helgaas <bhelgaas@google.com>
      Acked-by: 's avatarRafael J. Wysocki <rjw@sisk.pl>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      dbf0e4c7
  7. 09 Jul, 2012 1 commit
  8. 08 Jul, 2012 2 commits
    • Dan Williams's avatar
      [SCSI] libsas: fix taskfile corruption in sas_ata_qc_fill_rtf · 6ef1b512
      Dan Williams authored
      fill_result_tf() grabs the taskfile flags from the originating qc which
      sas_ata_qc_fill_rtf() promptly overwrites.  The presence of an
      ata_taskfile in the sata_device makes it tempting to just copy the full
      contents in sas_ata_qc_fill_rtf().  However, libata really only wants
      the fis contents and expects the other portions of the taskfile to not
      be touched by ->qc_fill_rtf.  To that end store a fis buffer in the
      sata_device and use ata_tf_from_fis() like every other ->qc_fill_rtf()
      implementation.
      
      Cc: <stable@vger.kernel.org>
      Reported-by: 's avatarPraveen Murali <pmurali@logicube.com>
      Tested-by: 's avatarPraveen Murali <pmurali@logicube.com>
      Signed-off-by: 's avatarDan Williams <dan.j.williams@intel.com>
      Signed-off-by: 's avatarJames Bottomley <JBottomley@Parallels.com>
      6ef1b512
    • Mark Rustad's avatar
      [SCSI] Fix NULL dereferences in scsi_cmd_to_driver · 222a806a
      Mark Rustad authored
      Avoid crashing if the private_data pointer happens to be NULL. This has
      been seen sometimes when a host reset happens, notably when there are
      many LUNs:
      
      host3: Assigned Port ID 0c1601
      scsi host3: libfc: Host reset succeeded on port (0c1601)
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000350
      IP: [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0
      <snip>
      Process scsi_eh_3 (pid: 4144, threadinfo ffff88030920c000, task ffff880326b160c0)
      Stack:
       000000010372e6ba 0000000000000282 000027100920dca0 ffffffffa0038ee0
       0000000000000000 0000000000030003 ffff88030920dc80 ffff88030920dc80
       00000002000e0000 0000000a00004000 ffff8803242f7760 ffff88031326ed80
      Call Trace:
       [<ffffffff8105b590>] ? lock_timer_base+0x70/0x70
       [<ffffffff81352fbe>] scsi_eh_tur+0x3e/0xc0
       [<ffffffff81353a36>] scsi_eh_test_devices+0x76/0x170
       [<ffffffff81354125>] scsi_eh_host_reset+0x85/0x160
       [<ffffffff81354291>] scsi_eh_ready_devs+0x91/0x110
       [<ffffffff813543fd>] scsi_unjam_host+0xed/0x1f0
       [<ffffffff813546a8>] scsi_error_handler+0x1a8/0x200
       [<ffffffff81354500>] ? scsi_unjam_host+0x1f0/0x1f0
       [<ffffffff8106ec3e>] kthread+0x9e/0xb0
       [<ffffffff81509264>] kernel_thread_helper+0x4/0x10
       [<ffffffff8106eba0>] ? kthread_freezable_should_stop+0x70/0x70
       [<ffffffff81509260>] ? gs_change+0x13/0x13
      Code: 25 28 00 00 00 48 89 45 c8 31 c0 48 8b 87 80 00 00 00 48 8d b5 60 ff ff ff 89 d1 48 89 fb 41 89 d6 4c 89 fa 48 8b 80 b8 00 00 00
       <48> 8b 80 50 03 00 00 48 8b 00 48 89 85 38 ff ff ff 48 8b 07 4c
      RIP  [<ffffffff81352bb8>] scsi_send_eh_cmnd+0x58/0x3a0
       RSP <ffff88030920dc50>
      CR2: 0000000000000350
      Signed-off-by: 's avatarMark Rustad <mark.d.rustad@intel.com>
      Tested-by: 's avatarMarcus Dennis <marcusx.e.dennis@intel.com>
      Cc: <stable@kernel.org>
      Signed-off-by: 's avatarJames Bottomley <JBottomley@Parallels.com>
      222a806a
  9. 07 Jul, 2012 1 commit
  10. 06 Jul, 2012 1 commit
  11. 05 Jul, 2012 3 commits
  12. 04 Jul, 2012 2 commits
    • Ohad Ben-Cohen's avatar
      rpmsg: make sure inflight messages don't invoke just-removed callbacks · 15fd943a
      Ohad Ben-Cohen authored
      When inbound messages arrive, rpmsg core looks up their associated
      endpoint (by destination address) and then invokes their callback.
      
      We've made sure that endpoints will never be de-allocated after they
      were found by rpmsg core, but we also need to protect against the
      (rare) scenario where the rpmsg driver was just removed, and its
      callback function isn't available anymore.
      
      This is achieved by introducing a callback mutex, which must be taken
      before the callback is invoked, and, obviously, before it is removed.
      
      Cc: stable <stable@vger.kernel.org>
      Reported-by: 's avatarFernando Guzman Lugo <fernando.lugo@ti.com>
      Signed-off-by: 's avatarOhad Ben-Cohen <ohad@wizery.com>
      15fd943a
    • Ohad Ben-Cohen's avatar
      rpmsg: avoid premature deallocation of endpoints · 5a081caa
      Ohad Ben-Cohen authored
      When an inbound message arrives, the rpmsg core looks up its
      associated endpoint and invokes the registered callback.
      
      If a message arrives while its endpoint is being removed (because
      the rpmsg driver was removed, or a recovery of a remote processor
      has kicked in) we must ensure atomicity, i.e.:
      
      - Either the ept is removed before it is found
      
      or
      
      - The ept is found but will not be freed until the callback returns
      
      This is achieved by maintaining a per-ept reference count, which,
      when drops to zero, will trigger deallocation of the ept.
      
      With this in hand, it is now forbidden to directly deallocate
      epts once they have been added to the endpoints idr.
      
      Cc: stable <stable@vger.kernel.org>
      Reported-by: 's avatarFernando Guzman Lugo <fernando.lugo@ti.com>
      Signed-off-by: 's avatarOhad Ben-Cohen <ohad@wizery.com>
      5a081caa
  13. 03 Jul, 2012 1 commit
  14. 02 Jul, 2012 1 commit
  15. 01 Jul, 2012 1 commit
    • Neil Horman's avatar
      sctp: be more restrictive in transport selection on bundled sacks · 4244854d
      Neil Horman authored
      It was noticed recently that when we send data on a transport, its possible that
      we might bundle a sack that arrived on a different transport.  While this isn't
      a major problem, it does go against the SHOULD requirement in section 6.4 of RFC
      2960:
      
       An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
         etc.) to the same destination transport address from which it
         received the DATA or control chunk to which it is replying.  This
         rule should also be followed if the endpoint is bundling DATA chunks
         together with the reply chunk.
      
      This patch seeks to correct that.  It restricts the bundling of sack operations
      to only those transports which have moved the ctsn of the association forward
      since the last sack.  By doing this we guarantee that we only bundle outbound
      saks on a transport that has received a chunk since the last sack.  This brings
      us into stricter compliance with the RFC.
      
      Vlad had initially suggested that we strictly allow only sack bundling on the
      transport that last moved the ctsn forward.  While this makes sense, I was
      concerned that doing so prevented us from bundling in the case where we had
      received chunks that moved the ctsn on multiple transports.  In those cases, the
      RFC allows us to select any of the transports having received chunks to bundle
      the sack on.  so I've modified the approach to allow for that, by adding a state
      variable to each transport that tracks weather it has moved the ctsn since the
      last sack.  This I think keeps our behavior (and performance), close enough to
      our current profile that I think we can do this without a sysctl knob to
      enable/disable it.
      Signed-off-by: 's avatarNeil Horman <nhorman@tuxdriver.com>
      CC: Vlad Yaseivch <vyasevich@gmail.com>
      CC: David S. Miller <davem@davemloft.net>
      CC: linux-sctp@vger.kernel.org
      Reported-by: 's avatarMichele Baldessari <michele@redhat.com>
      Reported-by: 's avatarsorin serban <sserban@redhat.com>
      Acked-by: 's avatarVlad Yasevich <vyasevich@gmail.com>
      Signed-off-by: 's avatarDavid S. Miller <davem@davemloft.net>
      4244854d
  16. 30 Jun, 2012 1 commit
  17. 25 Jun, 2012 1 commit
    • Paul Mundt's avatar
      bug.h: Fix up CONFIG_BUG=n implicit function declarations. · 09682c1d
      Paul Mundt authored
      Commit 2603efa3 ("bug.h: Fix up powerpc build regression") corrected
      the powerpc build case and extended the __ASSEMBLY__ guards, but it also
      got caught in pre-processor hell accidentally matching the else case of
      CONFIG_BUG resulting in the BUG disabled case tripping up on
      -Werror=implicit-function-declaration.
      
      It's not possible to __ASSEMBLY__ guard the entire file as architecture
      code needs to get at the BUGFLAG_WARNING definition in the GENERIC_BUG
      case, but the rest of the CONFIG_BUG=y/n case needs to be guarded.
      
      Rather than littering endless __ASSEMBLY__ checks in each of the if/else
      cases we just move the BUGFLAG definitions up under their own
      GENERIC_BUG test and then shove everything else under one big
      __ASSEMBLY__ guard.
      
      Build tested on all of x86 CONFIG_BUG=y, CONFIG_BUG=n, powerpc (due to
      it's dependence on BUGFLAG definitions in assembly code), and sh (due to
      not bringing in linux/kernel.h to satisfy the taint flag definitions used
      by the generic bug code).
      
      Hopefully that's the end of the corner cases and I can abstain from ever
      having to touch this infernal header ever again.
      Reported-by: 's avatarFengguang Wu <fengguang.wu@intel.com>
      Tested-by: 's avatarFengguang Wu <wfg@linux.intel.com>
      Acked-by: 's avatarRandy Dunlap <rdunlap@xenotime.net>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Signed-off-by: 's avatarPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      09682c1d
  18. 23 Jun, 2012 1 commit
    • Alan Stern's avatar
      SCSI & usb-storage: add try_rc_10_first flag · 6a0bdffa
      Alan Stern authored
      Several bug reports have been received recently for USB mass-storage
      devices that don't handle READ CAPACITY(16) commands properly.  They
      report bogus sizes, in some cases becoming unusable as a result.
      
      The bugs were triggered by commit
      09b6b51b (SCSI & usb-storage: add
      flags for VPD pages and REPORT LUNS), which caused usb-storage to stop
      overriding the SCSI level reported by devices.  By default, the sd
      driver will try READ CAPACITY(16) first for any device whose level is
      above SCSI_SPC_2.
      
      It seems likely that any device large enough to require the use of
      READ CAPACITY(16) (i.e., 2 TB or more) would be able to handle READ
      CAPACITY(10) commands properly.  Indeed, I don't know of any devices
      that don't handle READ CAPACITY(10) properly.
      
      Therefore this patch (as1559) adds a new flag telling the sd driver
      to try READ CAPACITY(10) before READ CAPACITY(16), and sets this flag
      for every USB mass-storage device.  If a device really is larger than
      2 TB, sd will fall back to READ CAPACITY(16) just as it used to.
      
      This fixes Bugzilla #43391.
      Signed-off-by: 's avatarAlan Stern <stern@rowland.harvard.edu>
      Acked-by: 's avatarHans de Goede <hdegoede@redhat.com>
      CC: "James E.J. Bottomley" <JBottomley@parallels.com>
      CC: Matthew Dharm <mdharm-usb@one-eyed-alien.net>
      Cc: stable <stable@vger.kernel.org>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      6a0bdffa
  19. 22 Jun, 2012 1 commit
  20. 21 Jun, 2012 1 commit
  21. 20 Jun, 2012 3 commits
    • Viresh Kumar's avatar
      Viresh has moved · 10d8935f
      Viresh Kumar authored
      viresh.kumar@st.com email-id doesn't exist anymore as I have left the
      company.  Replace ST's id with viresh.linux@gmail.com.
      
      It also updates .mailmap file to fix address for 'git shortlog'
      Signed-off-by: 's avatarViresh Kumar <viresh.linux@gmail.com>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      10d8935f
    • Andrea Arcangeli's avatar
      thp: avoid atomic64_read in pmd_read_atomic for 32bit PAE · e4eed03f
      Andrea Arcangeli authored
      In the x86 32bit PAE CONFIG_TRANSPARENT_HUGEPAGE=y case while holding the
      mmap_sem for reading, cmpxchg8b cannot be used to read pmd contents under
      Xen.
      
      So instead of dealing only with "consistent" pmdvals in
      pmd_none_or_trans_huge_or_clear_bad() (which would be conceptually
      simpler) we let pmd_none_or_trans_huge_or_clear_bad() deal with pmdvals
      where the low 32bit and high 32bit could be inconsistent (to avoid having
      to use cmpxchg8b).
      
      The only guarantee we get from pmd_read_atomic is that if the low part of
      the pmd was found null, the high part will be null too (so the pmd will be
      considered unstable).  And if the low part of the pmd is found "stable"
      later, then it means the whole pmd was read atomically (because after a
      pmd is stable, neither MADV_DONTNEED nor page faults can alter it anymore,
      and we read the high part after the low part).
      
      In the 32bit PAE x86 case, it is enough to read the low part of the pmdval
      atomically to declare the pmd as "stable" and that's true for THP and no
      THP, furthermore in the THP case we also have a barrier() that will
      prevent any inconsistent pmdvals to be cached by a later re-read of the
      *pmd.
      Signed-off-by: 's avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Jonathan Nieder <jrnieder@gmail.com>
      Cc: Ulrich Obergfell <uobergfe@redhat.com>
      Cc: Mel Gorman <mgorman@suse.de>
      Cc: Hugh Dickins <hughd@google.com>
      Cc: Larry Woodman <lwoodman@redhat.com>
      Cc: Petr Matousek <pmatouse@redhat.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Jan Beulich <jbeulich@suse.com>
      Cc: KOSAKI Motohiro <kosaki.motohiro@gmail.com>
      Tested-by: 's avatarAndrew Jones <drjones@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      e4eed03f
    • Pravin B Shelar's avatar
      mm: fix slab->page _count corruption when using slub · abca7c49
      Pravin B Shelar authored
      On arches that do not support this_cpu_cmpxchg_double() slab_lock is used
      to do atomic cmpxchg() on double word which contains page->_count.  The
      page count can be changed from get_page() or put_page() without taking
      slab_lock.  That corrupts page counter.
      
      Fix it by moving page->_count out of cmpxchg_double data.  So that slub
      does no change it while updating slub meta-data in struct page.
      
      [akpm@linux-foundation.org: use standard comment layout, tweak comment text]
      Reported-by: 's avatarAmey Bhide <abhide@nicira.com>
      Signed-off-by: 's avatarPravin B Shelar <pshelar@nicira.com>
      Acked-by: 's avatarChristoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: 's avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      abca7c49
  22. 18 Jun, 2012 4 commits
    • Kay Sievers's avatar
      kmsg - kmsg_dump() fix CONFIG_PRINTK=n compilation · 246f6f2f
      Kay Sievers authored
      Signed-off-by: 's avatarKay Sievers <kay@vrfy.org>
      Cc: Stephen Rothwell <sfr@canb.auug.org.au>
      Reported-by: 's avatarRandy Dunlap <rdunlap@xenotime.net>
      Reported-by: 's avatarFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: 's avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      246f6f2f
    • Paul Mundt's avatar
      bug.h: Fix up powerpc build regression. · 2603efa3
      Paul Mundt authored
      The asm-generic/bug.h __ASSEMBLY__ guarding is completely bogus, which
      tripped up the powerpc build when the kernel.h include was added:
      
      	In file included from include/asm-generic/bug.h:5:0,
      			 from arch/powerpc/include/asm/bug.h:127,
      			 from arch/powerpc/kernel/head_64.S:31:
      	include/linux/kernel.h:44:0: warning: "ALIGN" redefined [enabled by default]
      	include/linux/linkage.h:57:0: note: this is the location of the previous definition
      	include/linux/sysinfo.h: Assembler messages:
      	include/linux/sysinfo.h:7: Error: Unrecognized opcode: `struct'
      	include/linux/sysinfo.h:8: Error: Unrecognized opcode: `__kernel_long_t'
      
      Moving the __ASSEMBLY__ guard up and stashing the kernel.h include under
      it fixes this up, as well as covering the case the original fix was
      attempting to handle.
      Tested-by: 's avatarStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: 's avatarArnd Bergmann <arnd@arndb.de>
      Signed-off-by: 's avatarPaul Mundt <lethal@linux-sh.org>
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      2603efa3
    • Steven Rostedt's avatar
      ftrace: Make all inline tags also include notrace · 93b3cca1
      Steven Rostedt authored
      Commit 5963e317 ("ftrace/x86: Do not change stacks in DEBUG when
      calling lockdep") prevented lockdep calls from the int3 breakpoint handler
      from reseting the stack if a function that was called was in the process
      of being converted for tracing and had a breakpoint on it. The idea is,
      before calling the lockdep code, do a load_idt() to the special IDT that
      kept the breakpoint stack from reseting. This worked well as a quick fix
      for this kernel release, until a certain config caused a lockup in the
      function tracer start up tests.
      
      Investigating it, I found that the load_idt that was used to prevent
      the int3 from changing stacks was itself being traced!
      
      Even though the config had CONFIG_OPTIMIZE_INLINING disabled, and
      all 'inline' tags were set to always inline, there were still cases that
      it did not inline! This was caused by CONFIG_PARAVIRT_GUEST, where it
      would add a pointer to the native_load_idt() which made that function
      to be traced.
      
      Commit 45959ee7 ("ftrace: Do not function trace inlined functions")
      only touched the 'inline' tags when CONFIG_OPMITIZE_INLINING was enabled.
      PARAVIRT_GUEST shows that this was not enough and we need to also
      mark always_inline with notrace as well.
      Reported-by: 's avatarFengguang Wu <wfg@linux.intel.com>
      Tested-by: 's avatarFengguang Wu <wfg@linux.intel.com>
      Signed-off-by: 's avatarSteven Rostedt <rostedt@goodmis.org>
      93b3cca1
    • Trond Myklebust's avatar
      NFSv4.1: Fix umount when filelayout DS is also the MDS · 2a4c8994
      Trond Myklebust authored
      Currently there is a 'chicken and egg' issue when the DS is also the mounted
      MDS. The nfs_match_client() reference from nfs4_set_ds_client bumps the
      cl_count, the nfs_client is not freed at umount, and nfs4_deviceid_purge_client
      is not called to dereference the MDS usage of a deviceid which holds a
      reference to the DS nfs_client.  The result is the umount program returns,
      but the nfs_client is not freed, and the cl_session hearbeat continues.
      
      The MDS (and all other nfs mounts) lose their last nfs_client reference in
      nfs_free_server when the last nfs_server (fsid) is umounted.
      The file layout DS lose their last nfs_client reference in destroy_ds
      when the last deviceid referencing the data server is put and destroy_ds is
      called. This is triggered by a call to nfs4_deviceid_purge_client which
      removes references to a pNFS deviceid used by an MDS mount.
      
      The fix is to track how many pnfs enabled filesystems are mounted from
      this server, and then to purge the device id cache once that count reaches
      zero.
      Reported-by: 's avatarJorge Mora <Jorge.Mora@netapp.com>
      Reported-by: 's avatarAndy Adamson <andros@netapp.com>
      Signed-off-by: 's avatarTrond Myklebust <Trond.Myklebust@netapp.com>
      2a4c8994
  23. 17 Jun, 2012 1 commit
  24. 16 Jun, 2012 3 commits
    • Randy Dunlap's avatar
      vga_switcheroo.h: fix pci_dev warning · f8fee8f5
      Randy Dunlap authored
      Fix warnings on some architectures/configs (not on x86):
      
      include/linux/vga_switcheroo.h:28:30: warning: 'struct pci_dev' declared inside parameter list [enabled by default]
      include/linux/vga_switcheroo.h:28:30: warning: its scope is only this definition or declaration, which is probably not what you want [enabled by default]
      Signed-off-by: 's avatarRandy Dunlap <rdunlap@xenotime.net>
      Cc:	Takashi Iwai <tiwai@suse.de>
      Reported-by: 's avatarGeert Uytterhoeven <geert@linux-m68k.org>
      Signed-off-by: 's avatarDave Airlie <airlied@redhat.com>
      f8fee8f5
    • David S. Miller's avatar
      Revert "ipv6: Prevent access to uninitialized fib_table_hash via /proc/net/ipv6_route" · e8803b6c
      David S. Miller authored
      This reverts commit 2a0c451a.
      
      It causes crashes, because now ip6_null_entry is used before
      it is initialized.
      Signed-off-by: 's avatarDavid S. Miller <davem@davemloft.net>
      e8803b6c
    • Hugh Dickins's avatar
      swap: fix shmem swapping when more than 8 areas · 9b15b817
      Hugh Dickins authored
      Minchan Kim reports that when a system has many swap areas, and tmpfs
      swaps out to the ninth or more, shmem_getpage_gfp()'s attempts to read
      back the page cannot locate it, and the read fails with -ENOMEM.
      
      Whoops.  Yes, I blindly followed read_swap_header()'s pte_to_swp_entry(
      swp_entry_to_pte()) technique for determining maximum usable swap
      offset, without stopping to realize that that actually depends upon the
      pte swap encoding shifting swap offset to the higher bits and truncating
      it there.  Whereas our radix_tree swap encoding leaves offset in the
      lower bits: it's swap "type" (that is, index of swap area) that was
      truncated.
      
      Fix it by reducing the SWP_TYPE_SHIFT() in swapops.h, and removing the
      broken radix_to_swp_entry(swp_to_radix_entry()) from read_swap_header().
      
      This does not reduce the usable size of a swap area any further, it
      leaves it as claimed when making the original commit: no change from 3.0
      on x86_64, nor on i386 without PAE; but 3.0's 512GB is reduced to 128GB
      per swapfile on i386 with PAE.  It's not a change I would have risked
      five years ago, but with x86_64 supported for ten years, I believe it's
      appropriate now.
      
      Hmm, and what if some architecture implements its swap pte with offset
      encoded below type? That would equally break the maximum usable swap
      offset check.  Happily, they all follow the same tradition of encoding
      offset above type, but I'll prepare a check on that for next.
      Reported-and-Reviewed-and-Tested-by: 's avatarMinchan Kim <minchan@kernel.org>
      Signed-off-by: 's avatarHugh Dickins <hughd@google.com>
      Cc: stable@vger.kernel.org [3.1, 3.2, 3.3, 3.4]
      Signed-off-by: 's avatarLinus Torvalds <torvalds@linux-foundation.org>
      9b15b817