  1. 27 Feb, 2019 7 commits
  2. 22 Jan, 2019 1 commit
  3. 29 Nov, 2018 3 commits
  4. 09 Nov, 2018 1 commit
    • xen: remove size limit of privcmd-buf mapping interface · 3941552a
      Juergen Gross authored
      Currently the size of hypercall buffers allocated via
      /dev/xen/hypercall is limited to a default of 64 memory pages. For live
      migration of guests this might be too small as the page dirty bitmask
      needs to be sized according to the size of the guest. This means
      migrating an 8GB guest already exhausts the default buffer size for
      the dirty bitmap.
      
      There is no sensible way to set a sane limit, so just remove it
      completely. The device node's usage is limited to root anyway, so there
      is no additional DoS scenario added by allowing unlimited buffers.
      
      While at it make the error path for the -ENOMEM case a little bit
      cleaner by setting n_pages to the number of successfully allocated
      pages instead of the target size.
      
      Fixes: c51b3c63 ("xen: add new hypercall buffer mapping device")
      Cc: <stable@vger.kernel.org> #4.18
      Signed-off-by: Juergen Gross <jgross@suse.com>
      Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Signed-off-by: Juergen Gross <jgross@suse.com>
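The arithmetic behind the 8GB remark in the privcmd-buf commit can be checked with a small sketch. It assumes 4 KiB pages and one dirty bit per guest page; `dirty_bitmap_pages()` and `SKETCH_PAGE_SIZE` are hypothetical names used only for illustration, not kernel or toolstack interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size; 4 KiB is typical on x86 but is an assumption here. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Number of pages needed to hold a dirty bitmap with one bit per guest page.
 * Hypothetical helper for illustration only.
 */
static uint64_t dirty_bitmap_pages(uint64_t guest_bytes)
{
	uint64_t guest_pages = (guest_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
	uint64_t bitmap_bytes = (guest_pages + 7) / 8;

	assert(guest_bytes > 0);
	return (bitmap_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions an 8 GiB guest needs exactly 64 bitmap pages, which matches the old default limit, so any other buffer allocated alongside the bitmap would not fit.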
  5. 06 Nov, 2018 1 commit
  6. 31 Oct, 2018 4 commits
    • mm/memory_hotplug: make add_memory() take the device_hotplug_lock · 8df1d0e4
      David Hildenbrand authored
      add_memory() currently does not take the device_hotplug_lock; however,
      it is already called under the lock from
      	arch/powerpc/platforms/pseries/hotplug-memory.c
      	drivers/acpi/acpi_memhotplug.c
      to synchronize against CPU hot-remove and similar.
      
      In general, we should hold the device_hotplug_lock when adding memory to
      synchronize against online/offline requests (e.g. from user space), which
      have already resulted in lock inversions due to device_lock() and
      mem_hotplug_lock; see 30467e0b ("mm, hotplug: fix concurrent memory
      hot-add deadlock").  add_memory()/add_memory_resource() will create memory
      block devices, so this really feels like the right thing to do.
      
      Holding the device_hotplug_lock makes sure that a memory block device
      can really only be accessed (e.g. via .online/.state) from user space,
      once the memory has been fully added to the system.
      
      The lock is not held yet in
      	drivers/xen/balloon.c
      	arch/powerpc/platforms/powernv/memtrace.c
      	drivers/s390/char/sclp_cmd.c
      	drivers/hv/hv_balloon.c
      So, let's either use the locked variants or take the lock.
      
      Don't export add_memory_resource(), as it once was exported to be used by
      XEN, which is never built as a module.  If somebody requires it, we also
      have to export a locked variant (as device_hotplug_lock is never
      exported).
      
      Link: http://lkml.kernel.org/r/20180925091457.28651-3-david@redhat.com
      Signed-off-by: David Hildenbrand <david@redhat.com>
      Reviewed-by: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: Rashmica Gupta <rashmica.g@gmail.com>
      Reviewed-by: Oscar Salvador <osalvador@suse.de>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
      Cc: John Allen <jallen@linux.vnet.ibm.com>
      Cc: Michal Hocko <mhocko@suse.com>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Cc: Vlastimil Babka <vbabka@suse.cz>
      Cc: Mathieu Malaterre <malat@debian.org>
      Cc: Pavel Tatashin <pavel.tatashin@microsoft.com>
      Cc: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Haiyang Zhang <haiyangz@microsoft.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Kate Stewart <kstewart@linuxfoundation.org>
      Cc: "K. Y. Srinivasan" <kys@microsoft.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Michael Neuling <mikey@neuling.org>
      Cc: Philippe Ombredanne <pombredanne@nexb.com>
      Cc: Stephen Hemminger <sthemmin@microsoft.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
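The "use the locked variants or take the lock" convention from the add_memory() commit can be sketched in userspace as follows. A boolean flag stands in for device_hotplug_lock, and every `*_sketch` name is hypothetical; this shows only the calling convention, not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for device_hotplug_lock; a flag is enough to show the convention. */
static bool hotplug_lock_held;
static int blocks_added;

static void lock_hotplug_sketch(void)
{
	assert(!hotplug_lock_held);
	hotplug_lock_held = true;
}

static void unlock_hotplug_sketch(void)
{
	assert(hotplug_lock_held);
	hotplug_lock_held = false;
}

/* "Locked" variant: the caller must already hold the lock. */
static int add_memory_locked_sketch(void)
{
	assert(hotplug_lock_held);
	blocks_added++;
	return 0;
}

/* Convenience variant: takes and releases the lock itself. */
static int add_memory_sketch(void)
{
	int ret;

	lock_hotplug_sketch();
	ret = add_memory_locked_sketch();
	unlock_hotplug_sketch();
	return ret;
}
```

A caller that already holds the lock for other reasons would use the locked variant directly; all other callers go through the convenience wrapper.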
    • mm: remove include/linux/bootmem.h · 57c8a661
      Mike Rapoport authored
      Move remaining definitions and declarations from include/linux/bootmem.h
      into include/linux/memblock.h and remove the redundant header.
      
      The includes were replaced with the semantic patch below, followed by
      semi-automated removal of duplicated '#include <linux/memblock.h>' lines:
      
      @@
      @@
      - #include <linux/bootmem.h>
      + #include <linux/memblock.h>
      
      [sfr@canb.auug.org.au: dma-direct: fix up for the removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181002185342.133d1680@canb.auug.org.au
      [sfr@canb.auug.org.au: powerpc: fix up for removal of linux/bootmem.h]
        Link: http://lkml.kernel.org/r/20181005161406.73ef8727@canb.auug.org.au
      [sfr@canb.auug.org.au: x86/kaslr, ACPI/NUMA: fix for linux/bootmem.h removal]
        Link: http://lkml.kernel.org/r/20181008190341.5e396491@canb.auug.org.au
      Link: http://lkml.kernel.org/r/1536927045-23536-30-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: replace free_bootmem{_node} with memblock_free · 2013288f
      Mike Rapoport authored
      free_bootmem() and free_bootmem_node() are merely wrappers for
      memblock_free(). Replace their usage with a call to memblock_free() using
      the following semantic patch:
      
      @@
      expression e1, e2, e3;
      @@
      (
      - free_bootmem(e1, e2)
      + memblock_free(e1, e2)
      |
      - free_bootmem_node(e1, e2, e3)
      + memblock_free(e2, e3)
      )
      
      Link: http://lkml.kernel.org/r/1536927045-23536-24-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • memblock: replace alloc_bootmem_pages with memblock_alloc · 15c3c114
      Mike Rapoport authored
      The alloc_bootmem_pages() function allocates PAGE_SIZE aligned memory.
      memblock_alloc() with alignment set to PAGE_SIZE does exactly the same
      thing.
      
      The conversion is done using the following semantic patch:
      
      @@
      expression e;
      @@
      - alloc_bootmem_pages(e)
      + memblock_alloc(e, PAGE_SIZE)
      
      Link: http://lkml.kernel.org/r/1536927045-23536-20-git-send-email-rppt@linux.vnet.ibm.com
      Signed-off-by: Mike Rapoport <rppt@linux.vnet.ibm.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Chris Zankel <chris@zankel.net>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Greentime Hu <green.hu@gmail.com>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: "James E.J. Bottomley" <jejb@parisc-linux.org>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: Jonathan Corbet <corbet@lwn.net>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Mark Salter <msalter@redhat.com>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Paul Burton <paul.burton@mips.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Rich Felker <dalias@libc.org>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Serge Semin <fancer.lancer@gmail.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  7. 26 Oct, 2018 1 commit
  8. 24 Oct, 2018 4 commits
  9. 23 Oct, 2018 1 commit
    • iov_iter: Separate type from direction and use accessor functions · aa563d7b
      David Howells authored
      In the iov_iter struct, separate the iterator type from the iterator
      direction and use accessor functions to access them in most places.
      
      Convert a bunch of places to use switch statements to access them rather
      than chains of bitwise-AND tests.  This makes it easier to add further
      iterator types.  It can also be more efficient: to implement a switch
      over small contiguous integers, the compiler can use ~50% fewer compare
      instructions than it would need bitwise-AND instructions.
      
      Further, cease passing the iterator type into the iterator setup function.
      The iterator function can set that itself.  Only the direction is required.
      Signed-off-by: David Howells <dhowells@redhat.com>
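The idea in the iov_iter commit (type kept apart from direction, accessors, a switch instead of bit tests, setup functions that take only the direction) can be illustrated with a toy iterator. All names and the field layout below are hypothetical and deliberately do not reproduce the kernel's struct iov_iter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical iterator kinds as small contiguous integers. */
enum sketch_iter_type { SK_ITER_IOVEC, SK_ITER_KVEC, SK_ITER_BVEC, SK_ITER_PIPE };
enum sketch_iter_dir { SK_READ, SK_WRITE };

struct sketch_iter {
	enum sketch_iter_type type;  /* what kind of segments are iterated */
	enum sketch_iter_dir dir;    /* data direction, stored separately */
	size_t count;
};

/* Accessors so that callers do not depend on the field layout. */
static enum sketch_iter_type sketch_iter_kind(const struct sketch_iter *i)
{
	return i->type;
}

static enum sketch_iter_dir sketch_iter_rw(const struct sketch_iter *i)
{
	return i->dir;
}

/* Setup takes only the direction; the type is implied by the function. */
static void sketch_kvec_init(struct sketch_iter *i, enum sketch_iter_dir dir,
			     size_t count)
{
	i->type = SK_ITER_KVEC;
	i->dir = dir;
	i->count = count;
}

/* One switch on the type instead of a chain of "if (type & FLAG)" tests. */
static const char *sketch_iter_name(const struct sketch_iter *i)
{
	switch (sketch_iter_kind(i)) {
	case SK_ITER_IOVEC: return "iovec";
	case SK_ITER_KVEC:  return "kvec";
	case SK_ITER_BVEC:  return "bvec";
	case SK_ITER_PIPE:  return "pipe";
	}
	return "unknown";
}
```

Adding a further iterator kind in this sketch means adding one enumerator and one case label, which is the extensibility argument the commit message makes.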
  10. 18 Oct, 2018 1 commit
    • xen-swiotlb: use actually allocated size on check physical continuous · 7250f422
      Joe Jin authored
      xen_swiotlb_{alloc,free}_coherent() allocate/free memory based on the
      order of the pages and not size argument (bytes). This is inconsistent with
      range_straddles_page_boundary and memset which use the 'size' value,
      which may lead to not exchanging memory with Xen (range_straddles_page_boundary()
      returned true). The subsequent call to xen_swiotlb_free_coherent() would
      then actually try to exchange the memory with Xen, leading to the kernel
      hitting a BUG (as the hypercall returned an error).
      
      This patch fixes it by making the 'size' variable be of the same size
      as the amount of memory allocated.
      
      CC: stable@vger.kernel.org
      Signed-off-by: Joe Jin <joe.jin@oracle.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Dongli Zhang <dongli.zhang@oracle.com>
      Cc: John Sobecki <john.sobecki@oracle.com>
      Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
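The mismatch the xen-swiotlb commit describes, a byte count on one side and a power-of-two number of pages on the other, can be made concrete with a sketch. It assumes 4 KiB pages; `sketch_get_order()` and `sketch_allocated_size()` are hypothetical userspace helpers, not the kernel functions.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages */

/* Smallest order such that (1 << order) pages cover `size` bytes. */
static unsigned int sketch_get_order(size_t size)
{
	unsigned int order = 0;
	size_t pages;

	assert(size > 0);
	pages = (size + ((size_t)1 << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
	while (((size_t)1 << order) < pages)
		order++;
	return order;
}

/* Size that is really allocated for a request of `size` bytes. */
static size_t sketch_allocated_size(size_t size)
{
	return (size_t)1 << (sketch_get_order(size) + SKETCH_PAGE_SHIFT);
}
```

A request for 5000 bytes is backed by two pages (8192 bytes) in this model, so a boundary check performed with 5000 can reach a different answer than one performed with 8192. Using the rounded-up value in both the allocation and the free path keeps the two checks consistent.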
  11. 26 Sep, 2018 3 commits
  12. 20 Sep, 2018 2 commits
    • dma-mapping: support non-coherent devices in dma_common_get_sgtable · 9406a49f
      Christoph Hellwig authored
      We can use the arch_dma_coherent_to_pfn hook to provide a ->get_sgtable
      implementation.  Note that this isn't an endorsement of this interface
      (which is a horribly bad idea), but it is required to move arm64 over
      to the generic code without a loss of functionality.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
    • dma-mapping: consolidate the dma mmap implementations · 58b04406
      Christoph Hellwig authored
      The only functional difference (modulo a few missing fixes in the arch
      code) is that architectures without coherent caches need a hook to
      convert a virtual or dma address into a pfn, given that we don't have
      the kernel linear mapping available for the otherwise easy virt_to_page
      call.  As a side effect we can support mmap of the per-device coherent
      area even on architectures not providing the callback, and we make the
      previously dangerous default method dma_common_mmap actually safe for
      non-coherent architectures by rejecting it without the right helper.
      
      In addition to that we need a hook so that some architectures can
      override the protection bits when mmapping a dma coherent allocation.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Acked-by: Paul Burton <paul.burton@mips.com> # MIPS parts
  13. 19 Sep, 2018 1 commit
  14. 14 Sep, 2018 5 commits
    • xen/gntdev: fix up blockable calls to mn_invl_range_start · 58a57569
      Michal Hocko authored
      Patch series "mmu_notifiers follow ups".
      
      Tetsuo has noticed some fallout from 93065ac753e4 ("mm, oom: distinguish
      blockable mode for mmu notifiers").  One issue has been fixed and picked
      up by the AMD/DRM maintainer [1].  The XEN issue is fixed by patch 1.  I
      have also clarified expectations about the blockable semantics of
      invalidate_range_end.  Finally, the last patch removes
      MMU_INVALIDATE_DOES_NOT_BLOCK, which is no longer used or needed.
      
      [1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz
      
      This patch (of 3):
      
      93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
      introduced a blockable parameter to all mmu_notifiers; a notifier has to
      back off when it is called in !blockable mode and could block down the
      road.
      
      The above commit implemented that for mn_invl_range_start, but both
      in_range checks are done unconditionally regardless of the blockable mode
      and as such they would fail all the time for regular calls.  Fix this by
      checking the blockable parameter as well.
      
      Once we are there we can remove the stale TODO.  The lock has to be
      sleepable because we wait for completion down in gnttab_unmap_refs_sync.
      
      Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org
      Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Jerome Glisse <jglisse@redhat.com>
      Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
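The gntdev fix amounts to backing off only when the range overlaps and the caller cannot block. A minimal model is shown below; `struct sketch_map` and the `sketch_*` functions are hypothetical, and clearing a flag stands in for the real unmap, which may sleep.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

struct sketch_map {
	unsigned long start, end;  /* half-open range [start, end) */
	bool mapped;
};

static bool sketch_in_range(const struct sketch_map *m, unsigned long start,
			    unsigned long end)
{
	return m->mapped && m->start < end && start < m->end;
}

static int sketch_invl_range_start(struct sketch_map *m, unsigned long start,
				   unsigned long end, bool blockable)
{
	if (!sketch_in_range(m, start, end))
		return 0;        /* unrelated range: nothing to do */
	if (!blockable)
		return -EAGAIN;  /* would have to sleep: back off */
	m->mapped = false;       /* stands in for the (sleeping) unmap */
	return 0;
}
```

Returning -EAGAIN for every overlapping range, regardless of `blockable`, is the behaviour the commit message describes as failing all the time for regular calls.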
    • xen: fix GCC warning and remove duplicate EVTCHN_ROW/EVTCHN_COL usage · 4dca864b
      Josh Abraham authored
      This patch removes duplicate macro usage in events_base.c.
      
      It also fixes gcc warning:
      variable ‘col’ set but not used [-Wunused-but-set-variable]
      Signed-off-by: Joshua Abraham <j.abraham1776@gmail.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
    • xen: avoid crash in disable_hotplug_cpu · 3366cdb6
      Olaf Hering authored
      The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:
      
      BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
      PGD 0 P4D 0
      Oops: 0000 [#1] PREEMPT SMP NOPTI
      CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
      Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
      RIP: e030:device_offline+0x9/0xb0
      Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 <f6> 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
      RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
      RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
      RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
      RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
      R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
      R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
      FS:  00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
      CS:  e033 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
      Call Trace:
       handle_vcpu_hotplug_event+0xb5/0xc0
       xenwatch_thread+0x80/0x140
       ? wait_woken+0x80/0x80
       kthread+0x112/0x130
       ? kthread_create_worker_on_cpu+0x40/0x40
       ret_from_fork+0x3a/0x50
      
      This happens because handle_vcpu_hotplug_event is called twice. In the
      first iteration cpu_present is still true, in the second iteration
      cpu_present is false which causes get_cpu_device to return NULL.
      In case of cpu#0, cpu_online is apparently always true.
      
      Fix this crash by checking if the cpu can be hotplugged, which is false
      for a cpu that was just removed.
      
      Also check if the cpu was actually offlined by device_remove, otherwise
      leave the cpu_present state as it is.
      
      Rearrange the code to do all work with device_hotplug_lock held.
      Signed-off-by: Olaf Hering <olaf@aepfle.de>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
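The two guards described in the disable_hotplug_cpu commit (skip CPUs that cannot be hotplugged, and clear the present state only if the CPU really went offline) can be modelled in userspace. The struct, its fields and the `sketch_*` functions are hypothetical; `can_offline` is an invented flag that models a CPU which stays online, as the message reports for cpu#0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_cpu {
	bool present;
	bool online;
	bool hotpluggable;
	bool can_offline;  /* false models a CPU that refuses to go offline */
};

/* Returns NULL for a CPU that is not present, as in the reported crash. */
static struct sketch_cpu *sketch_get_cpu_device(struct sketch_cpu *cpu)
{
	return cpu->present ? cpu : NULL;
}

static void sketch_device_offline(struct sketch_cpu *dev)
{
	if (dev->can_offline)  /* dereferences dev: must not be NULL */
		dev->online = false;
}

static void sketch_disable_hotplug_cpu(struct sketch_cpu *cpu)
{
	if (!cpu->hotpluggable)
		return;  /* already removed: nothing to do */
	if (cpu->online)
		sketch_device_offline(sketch_get_cpu_device(cpu));
	if (!cpu->online && cpu->present) {
		/* Only a CPU that really went offline loses its present state. */
		cpu->present = false;
		cpu->hotpluggable = false;
	}
}
```

In this model a second hotplug event for the same CPU returns early, and a CPU that stays online keeps its present state, so the lookup never returns NULL for an online CPU.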
    • xen/balloon: add runtime control for scrubbing ballooned out pages · 197ecb38
      Marek Marczykowski-Górecki authored
      Scrubbing pages on initial balloon down can take some time, especially
      in the nested virtualization case (nested EPT is slow). When an HVM/PVH
      guest is started with memory= significantly lower than maxmem=, all the
      extra pages will be scrubbed before returning them to Xen. But since most
      of them weren't used at all at that point, Xen needs to populate them
      first (from the populate-on-demand pool). In the nested virt case (Xen
      inside KVM) this slows down the guest boot by 15-30s with just 1.5GB
      needing to be returned to Xen.
      
      Add a runtime parameter to enable/disable it, to allow initially
      disabling scrubbing and then enabling it again during boot (for example
      in initramfs). Such usage relies on the assumption that a) most pages
      ballooned out during initial boot weren't used at all, and b) even if
      they were, very few secrets are in the guest at that time (before any
      serious userspace kicks in).
      Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
      enabled by default), controlling the default value for the new runtime
      switch.
      Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com>
      Reviewed-by: Juergen Gross <jgross@suse.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
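The shape of the balloon change, a build-time default feeding a switch that can be flipped at runtime, might look like the following userspace model. `SKETCH_SCRUB_PAGES_DEFAULT`, `sketch_scrub_pages` and `sketch_scrub_page()` are hypothetical stand-ins; how the kernel actually exposes the switch is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096  /* assumes 4 KiB pages */

/* Build-time default, playing the role of CONFIG_XEN_SCRUB_PAGES_DEFAULT. */
#ifndef SKETCH_SCRUB_PAGES_DEFAULT
#define SKETCH_SCRUB_PAGES_DEFAULT 1
#endif

/* Runtime switch; starts at the build-time default and may be flipped later. */
static bool sketch_scrub_pages = SKETCH_SCRUB_PAGES_DEFAULT;

/* Zero a page before handing it back, but only if scrubbing is enabled. */
static void sketch_scrub_page(unsigned char *page)
{
	assert(page != NULL);
	if (sketch_scrub_pages)
		memset(page, 0, SKETCH_PAGE_SIZE);
}
```

Early boot code in this model would clear `sketch_scrub_pages` before the initial balloon down and set it again once real workloads start, which is the usage the commit message outlines.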
    • xen/manage: don't complain about an empty value in control/sysrq node · 87dffe86
      Vitaly Kuznetsov authored
      When a guest receives a sysrq request from the host it acknowledges it by
      writing '\0' to the control/sysrq xenstore node. This, however, makes the
      xenstore watch fire again, but xenbus_scanf() fails to parse the empty
      value with the "%c" format string:
      
       sysrq: SysRq : Emergency Sync
       Emergency Sync complete
       xen:manage: Error -34 reading sysrq code in control/sysrq
      
      Ignore -ERANGE the same way we already ignore -ENOENT; an empty value in
      control/sysrq is totally legal.
      Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
      Reviewed-by: Wei Liu <wei.liu2@citrix.com>
      Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
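The error handling described in the sysrq commit can be sketched with standard sscanf() in place of xenbus_scanf(). Both functions below are hypothetical; a NULL pointer models a missing node (-ENOENT) and an empty string models the acknowledged, empty node (-ERANGE).

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Read one character; an empty value yields -ERANGE, a missing node -ENOENT. */
static int sketch_read_sysrq(const char *value, char *key)
{
	if (value == NULL)
		return -ENOENT;
	if (sscanf(value, "%c", key) != 1)
		return -ERANGE;
	return 1;
}

/* Returns 1 if a key was handled, 0 if there was nothing to do, <0 on error. */
static int sketch_handle_sysrq(const char *value, char *handled)
{
	char key;
	int err = sketch_read_sysrq(value, &key);

	if (err == -ENOENT || err == -ERANGE)
		return 0;  /* missing or empty node: legal, stay quiet */
	if (err < 0)
		return err;
	*handled = key;
	return 1;
}
```

With both error codes treated as "nothing to do", the watch that fires after the acknowledgement no longer produces an error message in this model.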
  15. 03 Sep, 2018 1 commit
  16. 28 Aug, 2018 1 commit
  17. 22 Aug, 2018 1 commit
    • mm, oom: distinguish blockable mode for mmu notifiers · 93065ac7
      Michal Hocko authored
      There are several blockable mmu notifiers which might sleep in
      mmu_notifier_invalidate_range_start and that is a problem for the
      oom_reaper because it needs to guarantee forward progress, so it cannot
      depend on any sleepable locks.
      
      Currently we simply back off and mark an oom victim with blockable mmu
      notifiers as done after a short sleep.  That can result in selecting a new
      oom victim prematurely because the previous one still hasn't torn its
      memory down yet.
      
      We can do much better though.  Even if mmu notifiers use sleepable locks
      there is no reason to automatically assume those locks are held.  Moreover,
      the majority of notifiers only care about a portion of the address space
      and there is absolutely zero reason to fail when we are unmapping an
      unrelated range.  Many notifiers do really block and wait for HW, though,
      which is harder to handle, and there we have to bail out.
      
      This patch handles the low hanging fruit.
      __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
      are not allowed to sleep if the flag is set to false.  This is achieved by
      using trylock instead of the sleepable lock for most callbacks and
      continuing as long as we do not block down the call chain.
      
      I think we can improve that even further because there is a common pattern
      to do a range lookup first and then do something about that.  The first
      part can be done without a sleeping lock in most cases AFAICS.
      
      The oom_reaper side then simply retries if there is at least one notifier
      which couldn't make any progress in !blockable mode.  A retry loop is
      already implemented to wait for the mmap_sem and this is basically the
      same thing.
      
      The simplest way for driver developers to test this code path is to wrap
      userspace code which uses these notifiers into a memcg and set the hard
      limit to hit the oom.  This can be done e.g. after the test faults in all
      the mmu notifier managed memory, by setting the hard limit to something
      really small.  Then we are looking for a proper process tear down.
      
      [akpm@linux-foundation.org: coding style fixes]
      [akpm@linux-foundation.org: minor code simplification]
      Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
      Signed-off-by: Michal Hocko <mhocko@suse.com>
      Acked-by: Christian König <christian.koenig@amd.com> # AMD notifiers
      Acked-by: Leon Romanovsky <leonro@mellanox.com> # mlx and umem_odp
      Reported-by: David Rientjes <rientjes@google.com>
      Cc: "David (ChunMing) Zhou" <David1.Zhou@amd.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Alex Deucher <alexander.deucher@amd.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Jani Nikula <jani.nikula@linux.intel.com>
      Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
      Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
      Cc: Doug Ledford <dledford@redhat.com>
      Cc: Jason Gunthorpe <jgg@ziepe.ca>
      Cc: Mike Marciniszyn <mike.marciniszyn@intel.com>
      Cc: Dennis Dalessandro <dennis.dalessandro@intel.com>
      Cc: Sudeep Dutt <sudeep.dutt@intel.com>
      Cc: Ashutosh Dixit <ashutosh.dixit@intel.com>
      Cc: Dimitri Sivanich <sivanich@sgi.com>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Juergen Gross <jgross@suse.com>
      Cc: "Jérôme Glisse" <jglisse@redhat.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Felix Kuehling <felix.kuehling@amd.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
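The trylock-and-retry scheme from the mmu notifier commit can be modelled with a toy lock. Everything below is a hypothetical userspace sketch: `held` models another context owning the lock, the retry bound exists only to keep the example finite, and releasing the lock inside the loop stands in for the other owner making progress.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Toy lock: `held` models another context currently owning the lock. */
struct sketch_lock { bool held; };

static bool sketch_trylock(struct sketch_lock *l)
{
	if (l->held)
		return false;
	l->held = true;
	return true;
}

static void sketch_unlock(struct sketch_lock *l)
{
	l->held = false;
}

/* Notifier callback: in !blockable mode it may only try the lock. */
static int sketch_invalidate(struct sketch_lock *l, bool blockable)
{
	if (!sketch_trylock(l)) {
		if (!blockable)
			return -EAGAIN;
		/* A real callback would sleep here until the lock is free. */
		l->held = true;
	}
	/* ... invalidate the range ... */
	sketch_unlock(l);
	return 0;
}

/* Caller in the style of the oom_reaper: retry a bounded number of times. */
static int sketch_reap(struct sketch_lock *l, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (sketch_invalidate(l, false) == 0)
			return i + 1;  /* number of attempts used */
		l->held = false;       /* model: the other owner drops the lock */
	}
	return -EAGAIN;
}
```

The non-blocking caller never sleeps in this model: it either finishes the invalidation or learns through -EAGAIN that it should try again later.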
  18. 20 Aug, 2018 2 commits