1. 27 Aug, 2015 1 commit
    • mm: ZONE_DEVICE for "device memory" · 033fbae9
      Dan Williams authored
      While pmem is usable as a block device or via DAX mappings to userspace
      there are several usage scenarios that can not target pmem due to its
      lack of struct page coverage. In preparation for "hot plugging" pmem
      into the vmemmap add ZONE_DEVICE as a new zone to tag these pages
      separately from the ones that are subject to standard page allocations.
      Importantly "device memory" can be removed at will by userspace
      unbinding the driver of the device.
      Having a separate zone prevents allocation and otherwise marks these
      pages that are distinct from typical uniform memory.  Device memory has
      different lifetime and performance characteristics than RAM.  However,
      since we have run out of ZONES_SHIFT bits this functionality currently
      depends on sacrificing ZONE_DMA.
      Cc: H. Peter Anvin <[email protected]>
      Cc: Ingo Molnar <[email protected]>
      Cc: Dave Hansen <[email protected]>
      Cc: Rik van Riel <[email protected]>
      Cc: Mel Gorman <[email protected]>
      Cc: Jerome Glisse <[email protected]>
      [hch: various simplifications in the arch interface]
      Signed-off-by: Christoph Hellwig <[email protected]>
      Signed-off-by: Dan Williams <[email protected]>
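The ZONES_SHIFT remark can be illustrated with a small arithmetic sketch. It assumes the zone index lives in a fixed-width bit field, so n bits can distinguish at most 2^n zones. The helper name `zone_bits_needed()` is hypothetical and is not kernel code.

```c
#include <assert.h>

/* Hypothetical helper: smallest number of bits able to encode
 * nr_zones distinct zone indices (0 bits when there is one zone). */
static unsigned int zone_bits_needed(unsigned int nr_zones)
{
    unsigned int bits = 0;

    assert(nr_zones >= 1);
    while ((1u << bits) < nr_zones)
        bits++;
    return bits;
}
```

Under this assumption, four zones fit in two bits while a fifth zone would need a third bit, which is consistent with the commit giving up ZONE_DMA instead of widening the field.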
  2. 14 Apr, 2015 1 commit
    • mm, hotplug: fix concurrent memory hot-add deadlock · 30467e0b
      David Rientjes authored
      There's a deadlock when concurrently hot-adding memory through the probe
      interface and switching a memory block from offline to online.
      When hot-adding memory via the probe interface, add_memory() first takes
      mem_hotplug_begin() and then device_lock() is later taken when registering
      the newly initialized memory block.  This creates a lock dependency of (1)
      mem_hotplug.lock (2) dev->mutex.
      When switching a memory block from offline to online, dev->mutex is first
      grabbed in device_online() when the write(2) transitions an existing
      memory block from offline to online, and then online_pages() will take
      mem_hotplug_begin().
      This creates a lock inversion between mem_hotplug.lock and dev->mutex.
      Vitaly reports that this deadlock can happen when kworker handling a probe
      event races with systemd-udevd switching a memory block's state.
      This patch requires the state transition to take mem_hotplug_begin()
      before dev->mutex.  Hot-adding memory via the probe interface creates a
      memory block while holding mem_hotplug_begin(), so there is no way to take
      dev->mutex first in this case.
      online_pages() and offline_pages() are only called when transitioning
      memory block state.  We now require that mem_hotplug_begin() is taken
      before calling them -- this requires exporting the mem_hotplug_begin() and
      mem_hotplug_done() to generic code.  In all hot-add and hot-remove cases,
      mem_hotplug_begin() is done prior to device_online().  This is all that is
      needed to avoid the deadlock.
      Signed-off-by: David Rientjes <[email protected]>
      Reported-by: Vitaly Kuznetsov <[email protected]>
      Tested-by: Vitaly Kuznetsov <[email protected]>
      Cc: Greg Kroah-Hartman <[email protected]>
      Cc: "Rafael J. Wysocki" <[email protected]>
      Cc: "K. Y. Srinivasan" <[email protected]>
      Cc: Yasuaki Ishimatsu <[email protected]>
      Cc: Tang Chen <[email protected]>
      Cc: Vlastimil Babka <[email protected]>
      Cc: Zhang Zhen <[email protected]>
      Cc: Vladimir Davydov <[email protected]>
      Cc: Wang Nan <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
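The lock inversion in this commit can be modelled without threads by recording the order in which two locks are nested. This is a minimal userspace sketch, not the kernel's lockdep. The tracker type, the enum names and `record_nested_lock()` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for mem_hotplug.lock and dev->mutex. */
enum { LOCK_MEM_HOTPLUG, LOCK_DEV_MUTEX, NR_LOCKS };

/* order_seen[a][b] is true once b was acquired while a was held. */
struct lock_order_tracker {
    bool order_seen[NR_LOCKS][NR_LOCKS];
};

/* Record "outer held, then inner taken". Returns false if the opposite
 * nesting was recorded earlier, i.e. a potential ABBA deadlock. */
static bool record_nested_lock(struct lock_order_tracker *t,
                               int outer, int inner)
{
    assert(outer != inner);
    if (t->order_seen[inner][outer])
        return false;
    t->order_seen[outer][inner] = true;
    return true;
}
```

Recording the probe path (mem_hotplug first, then dev->mutex) followed by the old online path (dev->mutex first) makes the second call return false. If both paths take mem_hotplug first, as the patch requires, every call succeeds.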
  3. 10 Oct, 2014 1 commit
  4. 07 Aug, 2014 2 commits
    • memory-hotplug: add zone_for_memory() for selecting zone for new memory · 63264400
      Wang Nan authored
      This series of patches fixes a problem that occurs when memory is added
      in an unfortunate order.  For example, on an x86_64 machine booted with
      "mem=400M" and with 2GiB of memory installed, the following commands
      cause a problem:
        # echo 0x40000000 > /sys/devices/system/memory/probe
       [   28.613895] init_memory_mapping: [mem 0x40000000-0x47ffffff]
        # echo 0x48000000 > /sys/devices/system/memory/probe
       [   28.693675] init_memory_mapping: [mem 0x48000000-0x4fffffff]
        # echo online_movable > /sys/devices/system/memory/memory9/state
        # echo 0x50000000 > /sys/devices/system/memory/probe
       [   29.084090] init_memory_mapping: [mem 0x50000000-0x57ffffff]
        # echo 0x58000000 > /sys/devices/system/memory/probe
       [   29.151880] init_memory_mapping: [mem 0x58000000-0x5fffffff]
        # echo online_movable > /sys/devices/system/memory/memory11/state
        # echo online> /sys/devices/system/memory/memory8/state
        # echo online> /sys/devices/system/memory/memory10/state
        # echo offline> /sys/devices/system/memory/memory9/state
       [   30.558819] Offlined Pages 32768
        # free
                    total       used       free     shared    buffers     cached
       Mem:        780588 18014398509432020     830552          0          0      51180
       -/+ buffers/cache: 18014398509380840     881732
       Swap:            0          0          0
      This is because the above commands probe higher memory after onlining a
      section with online_movable, which causes ZONE_HIGHMEM (or ZONE_NORMAL
      for systems without ZONE_HIGHMEM) to overlap ZONE_MOVABLE.
      After the second online_movable, the problem can be observed from
        # cat /proc/zoneinfo
        Node 0, zone  Movable
          pages free     65491
                min      250
                low      312
                high     375
                scanned  0
                spanned  18446744073709518848
                present  65536
                managed  65536
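One plausible reading of the huge "used" and "spanned" values above is unsigned 64-bit wrap-around. This is an interpretation that happens to reproduce the printed numbers, not something the commit message states. The helper names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* "used" as total minus free, computed in bytes with 64-bit unsigned
 * arithmetic and reported in KiB. Wraps around when free > total. */
static uint64_t used_kib(uint64_t total_kib, uint64_t free_kib)
{
    uint64_t used_bytes = total_kib * 1024 - free_kib * 1024;

    return used_bytes / 1024;
}

/* Zone span as end minus start in page frames. Wraps when end < start. */
static uint64_t spanned_pages(uint64_t start_pfn, uint64_t end_pfn)
{
    return end_pfn - start_pfn;
}
```

With total = 780588 and free = 830552 the first helper yields 18014398509432020, matching the `free` output. A span whose end lies 32768 pages (one 128MiB block with 4KiB pages) below its start yields 18446744073709518848, matching the zoneinfo output.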
      This series of patches solves the problem by checking ZONE_MOVABLE when
      choosing a zone for new memory.  If the new memory is inside or higher
      than ZONE_MOVABLE, it goes there instead.
      After applying this series of patches, the free and zoneinfo results
      (after offlining memory9) are as follows:
        bash-4.2# free
                      total       used       free     shared    buffers     cached
         Mem:        780956      80112     700844          0          0      51180
         -/+ buffers/cache:      28932     752024
         Swap:            0          0          0
        bash-4.2# cat /proc/zoneinfo
        Node 0, zone      DMA
          pages free     3389
                min      14
                low      17
                high     21
                scanned  0
                spanned  4095
                present  3998
                managed  3977
            nr_free_pages 3389
          start_pfn:         1
          inactive_ratio:    1
        Node 0, zone    DMA32
          pages free     73724
                min      341
                low      426
                high     511
                scanned  0
                spanned  98304
                present  98304
                managed  92958
            nr_free_pages 73724
          start_pfn:         4096
          inactive_ratio:    1
        Node 0, zone   Normal
          pages free     32630
                min      120
                low      150
                high     180
                scanned  0
                spanned  32768
                present  32768
                managed  32768
            nr_free_pages 32630
          start_pfn:         262144
          inactive_ratio:    1
        Node 0, zone  Movable
          pages free     65476
                min      241
                low      301
                high     361
                scanned  0
                spanned  98304
                present  65536
                managed  65536
            nr_free_pages 65476
          start_pfn:         294912
          inactive_ratio:    1
      This patch (of 7):
      Introduce zone_for_memory() in arch independent code for
      arch_add_memory() use.
      Many arch_add_memory() functions simply select ZONE_HIGHMEM or
      ZONE_NORMAL and add new memory into it.  However, with the existence of
      ZONE_MOVABLE, the selection method should be carefully considered: if
      new, higher memory is added after ZONE_MOVABLE is set up, the default
      zone and ZONE_MOVABLE may overlap each other.
      should_add_memory_movable() checks the status of ZONE_MOVABLE.  If it
      already contains memory, the address of the new memory is compared with
      that of the movable memory.  If the new memory is higher than the movable
      memory, it should be added into ZONE_MOVABLE instead of the default zone.
      Signed-off-by: Wang Nan <[email protected]>
      Cc: Zhang Yanfei <[email protected]>
      Cc: Dave Hansen <[email protected]>
      Cc: Ingo Molnar <[email protected]>
      Cc: Yinghai Lu <[email protected]>
      Cc: "Mel Gorman" <[email protected]>
      Cc: Thomas Gleixner <[email protected]>
      Cc: "H. Peter Anvin" <[email protected]>
      Cc: "Luck, Tony" <[email protected]>
      Cc: Benjamin Herrenschmidt <[email protected]>
      Cc: Paul Mackerras <[email protected]>
      Cc: Chris Metcalf <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
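The zone selection rule described in this commit can be sketched in userspace as follows. The struct, the enum and both function names are simplified stand-ins invented for illustration. Only the rule itself is taken from the commit message: an empty movable zone never captures new memory, otherwise memory at or above its start goes to the movable zone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum zone_kind { ZONE_KIND_NORMAL, ZONE_KIND_MOVABLE };

/* Simplified stand-in for the movable zone; empty if spanned_pages == 0. */
struct movable_zone_model {
    uint64_t start_pfn;
    uint64_t spanned_pages;
};

/* New memory goes to the movable zone only if that zone already holds
 * memory and the new range starts at or above the zone's start. */
static bool should_go_movable(const struct movable_zone_model *movable,
                              uint64_t new_start_pfn)
{
    assert(movable != 0);
    if (movable->spanned_pages == 0)
        return false;
    return new_start_pfn >= movable->start_pfn;
}

/* Pick the movable zone when the rule applies, else the arch default. */
static enum zone_kind pick_zone(const struct movable_zone_model *movable,
                                uint64_t new_start_pfn,
                                enum zone_kind zone_default)
{
    return should_go_movable(movable, new_start_pfn) ? ZONE_KIND_MOVABLE
                                                     : zone_default;
}
```

Using the zoneinfo values above (Movable start_pfn 294912), memory probed at pfn 327680 would land in the movable zone, while memory at pfn 262144 would stay in the default zone.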
    • mem-hotplug: introduce MMOP_OFFLINE to replace the hard coding -1 · 4f7c6b49
      Tang Chen authored
      In store_mem_state(), we have:
        else if (!strncmp(buf, "offline", min_t(int, count, 7)))
                online_type = -1;
        ...
        case -1:
                ret = device_offline(&mem->dev);
                break;
      Here, "offline" is hard coded as -1.
      This patch does the following renaming:
      and introduces MMOP_OFFLINE = -1 to avoid hard coding.
      Signed-off-by: Tang Chen <[email protected]>
      Cc: Hu Tao <[email protected]>
      Cc: Greg Kroah-Hartman <[email protected]>
      Cc: Lai Jiangshan <[email protected]>
      Cc: Yasuaki Ishimatsu <[email protected]>
      Cc: Gu Zheng <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
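The effect of replacing the bare -1 with a named constant can be sketched as below. Only MMOP_OFFLINE = -1 comes from the commit message. The other constant names and values, the MMOP_SKETCH_INVALID marker and the `parse_mem_state()` helper are assumptions made for this userspace illustration.

```c
#include <assert.h>
#include <string.h>

enum {
    MMOP_OFFLINE        = -1,   /* from the commit message */
    MMOP_ONLINE_KEEP    = 0,    /* assumed for this sketch */
    MMOP_ONLINE_KERNEL  = 1,    /* assumed for this sketch */
    MMOP_ONLINE_MOVABLE = 2,    /* assumed for this sketch */
    MMOP_SKETCH_INVALID = -100  /* hypothetical error marker */
};

/* Map a state string to a named operation instead of a magic number. */
static int parse_mem_state(const char *buf)
{
    assert(buf != 0);
    if (strcmp(buf, "online_kernel") == 0)
        return MMOP_ONLINE_KERNEL;
    if (strcmp(buf, "online_movable") == 0)
        return MMOP_ONLINE_MOVABLE;
    if (strcmp(buf, "online") == 0)
        return MMOP_ONLINE_KEEP;
    if (strcmp(buf, "offline") == 0)
        return MMOP_OFFLINE;
    return MMOP_SKETCH_INVALID;
}
```

A later `switch` can then use `case MMOP_OFFLINE:` rather than `case -1:`, which is the readability gain the patch is after.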
  5. 04 Jun, 2014 1 commit
    • mem-hotplug: implement get/put_online_mems · bfc8c901
      Vladimir Davydov authored
      kmem_cache_{create,destroy,shrink} need to get a stable value of
      cpu/node online mask, because they init/destroy/access per-cpu/node
      kmem_cache parts, which can be allocated or destroyed on cpu/mem
      hotplug.  To protect against cpu hotplug, these functions use
      {get,put}_online_cpus.  However, they do nothing to synchronize with
      memory hotplug - taking the slab_mutex does not eliminate the
      possibility of race as described in patch 2.
      What we need there is something like get_online_cpus, but for memory.
      We already have lock_memory_hotplug, which serves for the purpose, but
      it's a bit of a hammer right now, because it's backed by a mutex.  As a
      result, it imposes some limitations to locking order, which are not
      desirable, and can't be used just like get_online_cpus.  That's why in
      patch 1 I substitute it with get/put_online_mems, which work exactly
      like get/put_online_cpus except they block not cpu, but memory hotplug.
      [ v1 can be found at https://lkml.org/lkml/2014/4/6/68.  I NAK'ed it by
        myself, because it used an rw semaphore for get/put_online_mems,
        making them deadlock prone.  ]
      This patch (of 2):
      {un}lock_memory_hotplug, which is used to synchronize against memory
      hotplug, is currently backed by a mutex, which makes it a bit of a
      hammer - threads that only want to get a stable value of online nodes
      mask won't be able to proceed concurrently.  Also, it imposes some
      strong locking ordering rules on it, which narrows down the set of its
      usage scenarios.
      This patch introduces get/put_online_mems, which are the same as
      get/put_online_cpus, but for memory hotplug, i.e. executing code
      inside a get/put_online_mems section will guarantee a stable value of
      online nodes, present pages, etc.
      lock_memory_hotplug()/unlock_memory_hotplug() are removed altogether.
      Signed-off-by: Vladimir Davydov <[email protected]>
      Cc: Christoph Lameter <[email protected]>
      Cc: Pekka Enberg <[email protected]>
      Cc: Tang Chen <[email protected]>
      Cc: Zhang Yanfei <[email protected]>
      Cc: Toshi Kani <[email protected]>
      Cc: Xishi Qiu <[email protected]>
      Cc: Jiang Liu <[email protected]>
      Cc: Rafael J. Wysocki <[email protected]>
      Cc: David Rientjes <[email protected]>
      Cc: Wen Congyang <[email protected]>
      Cc: Yasuaki Ishimatsu <[email protected]>
      Cc: Lai Jiangshan <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
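The difference between a plain mutex and the get/put scheme can be shown with a single-threaded, non-blocking model: many readers may be inside a section at once, and a hotplug operation may start only when none are. This is a sketch of the idea, not the kernel implementation. All names are hypothetical, and where the real code would sleep the model simply returns false.

```c
#include <assert.h>
#include <stdbool.h>

struct mem_hotplug_model {
    int readers;         /* sections currently between get and put */
    bool hotplug_active; /* a hotplug operation is in progress */
};

/* Enter a read-side section; fails (would block) during hotplug. */
static bool model_try_get_online_mems(struct mem_hotplug_model *m)
{
    if (m->hotplug_active)
        return false;
    m->readers++;
    return true;
}

/* Leave a read-side section. */
static void model_put_online_mems(struct mem_hotplug_model *m)
{
    assert(m->readers > 0);
    m->readers--;
}

/* Start hotplug; fails (would block) while any reader is inside. */
static bool model_try_hotplug_begin(struct mem_hotplug_model *m)
{
    if (m->readers > 0 || m->hotplug_active)
        return false;
    m->hotplug_active = true;
    return true;
}

/* Finish the hotplug operation started by model_try_hotplug_begin(). */
static void model_hotplug_done(struct mem_hotplug_model *m)
{
    assert(m->hotplug_active);
    m->hotplug_active = false;
}
```

Two readers can hold the section simultaneously, which a mutex-backed lock_memory_hotplug() would not allow.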
  6. 13 Nov, 2013 2 commits
  7. 01 Jun, 2013 3 commits
  8. 12 May, 2013 1 commit
    • ACPI / memhotplug: Bind removable memory blocks to ACPI device nodes · e2ff3940
      Rafael J. Wysocki authored
      During ACPI memory hotplug configuration bind memory blocks residing
      in modules removable through the standard ACPI mechanism to struct
      acpi_device objects associated with ACPI namespace objects
      representing those modules.  Accordingly, unbind those memory blocks
      from the struct acpi_device objects when the memory modules in
      question are being removed.
      When "offline" operation for devices representing memory blocks is
      introduced, this will allow the ACPI core's device hot-remove code to
      use it to carry out remove_memory() for those memory blocks and check
      the results of that before it actually removes the modules holding
      them from the system.
      Since walk_memory_range() is used for accessing all memory blocks
      corresponding to a given ACPI namespace object, it is exported from
      memory_hotplug.c so that the code in acpi_memhotplug.c can use it.
      Signed-off-by: Rafael J. Wysocki <[email protected]>
      Tested-by: Vasilis Liaskovitis <[email protected]>
      Reviewed-by: Toshi Kani <[email protected]>
  9. 29 Apr, 2013 1 commit
  10. 24 Feb, 2013 5 commits
  11. 12 Dec, 2012 1 commit
    • mm, memory-hotplug: dynamic configure movable memory and portion memory · 511c2aba
      Lai Jiangshan authored
      Add online_movable and online_kernel for logical memory hotplug.  This is
      the dynamic version of "movablecore" & "kernelcore".
      The motivation is the same as for "movablecore" & "kernelcore", except
      that the configuration happens dynamically at run time:
      o We can configure memory as kernelcore or movablecore after boot.
        When the userspace workload increases and we need more huge pages, we
        can use "online_movable" to add memory and allow the system to use more
        THP (transparent huge pages), and vice versa when the kernel workload
        increases.
        This also helps virtualization to configure host/guest memory
        dynamically and so to reduce wasted memory.
        Memory capacity on Demand
      o When a new node is physically onlined after boot, we need to use
        "online_movable" or "online_kernel" to configure/partition it as we
        expect when we logically online it.
        This configuration also helps with physical memory migration.
      o All the same benefits as the existing "movablecore" & "kernelcore".
      o Preparing for movable-node, which is very important for power-saving,
        hardware partitioning and high-available-system(hardware fault
      (Note, we don't introduce movable-node here.)
      Action behavior:
      When a memoryblock/memorysection is onlined by "online_movable", the kernel
      will not hold direct references to the pages of the memoryblock,
      thus we can remove that memory any time when needed.
      When it is onlined by "online_kernel", the kernel can use it.
      When it is onlined by "online", the zone type is not changed.
      Current constraints:
      Only a memoryblock which is adjacent to ZONE_MOVABLE
      can be onlined from ZONE_NORMAL to ZONE_MOVABLE.
      [[email protected]: use min_t, cleanups]
      Signed-off-by: Lai Jiangshan <[email protected]>
      Signed-off-by: Wen Congyang <[email protected]>
      Cc: Yasuaki Ishimatsu <[email protected]>
      Cc: Lai Jiangshan <[email protected]>
      Cc: Jiang Liu <[email protected]>
      Cc: KOSAKI Motohiro <[email protected]>
      Cc: Minchan Kim <[email protected]>
      Cc: Mel Gorman <[email protected]>
      Cc: David Rientjes <[email protected]>
      Cc: Yinghai Lu <[email protected]>
      Cc: Rusty Russell <[email protected]>
      Cc: Greg KH <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
  12. 09 Oct, 2012 2 commits
  13. 04 Mar, 2012 1 commit
    • BUG: headers with BUG/BUG_ON etc. need linux/bug.h · 187f1882
      Paul Gortmaker authored
      If a header file is making use of BUG, BUG_ON, BUILD_BUG_ON, or any
      other BUG variant in a static inline (i.e. not in a #define) then
      that header really should be including <linux/bug.h> and not just
      expecting it to be implicitly present.
      We can make this change risk-free, since if the files using these
      headers didn't have exposure to linux/bug.h already, they would have
      been causing compile failures/warnings.
      Signed-off-by: Paul Gortmaker <[email protected]>
  14. 26 Jul, 2011 1 commit
    • mm: extend memory hotplug API to allow memory hotplug in virtual machines · 9d0ad8ca
      Daniel Kiper authored
      This patch contains online_page_callback and appropriate functions for
      registering/unregistering online page callbacks.  It allows machine
      specific tasks to be done during the online page stage, which is required
      to implement memory hotplug in virtual machines.  Currently this patch is
      required by the latest memory hotplug support for Xen balloon driver patch
      which will be posted soon.
      Additionally, the original online_page() function was split into the
      following functions doing "atomic" operations:
        - __online_page_set_limits() - set new limits for memory management code,
        - __online_page_increment_counters() - increment totalram_pages and totalhigh_pages,
        - __online_page_free() - free page to allocator.
      It was done to:
        - not duplicate existing code,
        - ease hotplug code development by usage of well defined interface,
        - avoid stupid bugs which are unavoidable when the same code
          (by design) is developed in many places.
      [[email protected]: use explicit indirect-call syntax]
      Signed-off-by: Daniel Kiper <[email protected]>
      Reviewed-by: Konrad Rzeszutek Wilk <[email protected]>
      Cc: Ian Campbell <[email protected]>
      Cc: Jeremy Fitzhardinge <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
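The callback mechanism described in this commit can be sketched as a single replaceable function pointer with a generic default. This is a userspace model under stated assumptions: the page type, the function names and the return codes are hypothetical and are not the kernel's API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy page: tracks whether it went to the allocator or to a driver. */
struct page_model {
    int freed_to_allocator;
    int held_by_driver;
};

typedef void (*online_page_cb)(struct page_model *page);

/* Default behaviour: hand the page to the allocator. */
static void generic_online_page_model(struct page_model *page)
{
    page->freed_to_allocator = 1;
}

/* Example machine-specific callback: a balloon-like driver keeps the page. */
static void driver_online_page_model(struct page_model *page)
{
    page->held_by_driver = 1;
}

static online_page_cb current_cb = generic_online_page_model;

/* Install cb; fails if a non-default callback is already installed. */
static int register_online_page_cb(online_page_cb cb)
{
    if (cb == NULL || current_cb != generic_online_page_model)
        return -1;
    current_cb = cb;
    return 0;
}

/* Restore the default; fails if cb is not the installed callback. */
static int unregister_online_page_cb(online_page_cb cb)
{
    if (current_cb != cb)
        return -1;
    current_cb = generic_online_page_model;
    return 0;
}

/* Online one page through whichever callback is installed. */
static void online_one_page(struct page_model *page)
{
    assert(page != NULL);
    (*current_cb)(page); /* explicit indirect-call syntax */
}
```

Allowing only one non-default callback at a time is a design choice of this sketch. It keeps the question of which driver owns the page unambiguous.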
  15. 14 Jan, 2011 1 commit
    • thp: remove PG_buddy · 5f24ce5f
      Andrea Arcangeli authored
      PG_buddy can be converted to _mapcount == -2.  So the PG_compound_lock can
      be added to page->flags without overflowing (because of the sparse section
      bits increasing) with CONFIG_X86_PAE=y and CONFIG_X86_PAT=y.  This also
      has to move the memory hotplug code from _mapcount to lru.next to avoid
      any risk of clashes.  We can't use lru.next for PG_buddy removal, but
      memory hotplug can use lru.next even more easily than the mapcount
      Signed-off-by: Andrea Arcangeli <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
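The idea of encoding "this page is a free buddy page" in the map count instead of a page flag can be sketched as follows. The value -2 is from the commit message. The page struct, the helper names and the use of -1 for "not mapped" are assumptions of this userspace model.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAPCOUNT_UNMAPPED (-1) /* assumed idle value */
#define SKETCH_MAPCOUNT_BUDDY    (-2) /* value named in the commit */

struct page_model {
    int mapcount;
};

/* A page is a buddy page when its map count holds the special value. */
static bool page_is_buddy(const struct page_model *page)
{
    return page->mapcount == SKETCH_MAPCOUNT_BUDDY;
}

/* Mark an unmapped page as a free buddy page. */
static void set_page_buddy(struct page_model *page)
{
    assert(page->mapcount == SKETCH_MAPCOUNT_UNMAPPED);
    page->mapcount = SKETCH_MAPCOUNT_BUDDY;
}

/* Return a buddy page to the plain unmapped state. */
static void clear_page_buddy(struct page_model *page)
{
    assert(page_is_buddy(page));
    page->mapcount = SKETCH_MAPCOUNT_UNMAPPED;
}
```

Because a free page is never mapped, reusing the counter costs no extra state and frees one bit in page->flags, which is the motivation given above.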
  16. 11 Jan, 2011 1 commit
  17. 02 Dec, 2010 1 commit
  18. 26 Oct, 2010 1 commit
  19. 25 May, 2010 1 commit
  20. 15 Dec, 2009 1 commit
  21. 23 Sep, 2009 1 commit
    • walk system ram range · 908eedc6
      KAMEZAWA Hiroyuki authored
      Originally, walk_memory_resource() was introduced to traverse all memory
      of "System RAM" for detecting memory hotplug/unplug range.  For doing so,
      the flags IORESOURCE_MEM|IORESOURCE_BUSY were used and this was enough for
      memory hotplug.
      But when it is used for another purpose, /proc/kcore, this may include
      some firmware area marked as IORESOURCE_BUSY | IORESOURCE_MEM.  This patch
      makes the check strict to find out busy "System RAM".
      Note: PPC64 keeps its own walk_memory_resource(), which walks through
      ppc64's lmb information.  Because old kclist_add() is called per lmb, this
      patch makes no difference in behavior, finally.
      And this patch removes CONFIG_MEMORY_HOTPLUG check from this function.
      Because pfn_valid() just shows "there is memmap or not" and cannot be used
      for "there is physical memory or not", this function is generally useful
      for scanning physical memory ranges.
      Signed-off-by: KAMEZAWA Hiroyuki <[email protected]>
      Cc: Ralf Baechle <[email protected]>
      Cc: Benjamin Herrenschmidt <[email protected]>
      Cc: WANG Cong <[email protected]>
      Cc: Américo Wang <[email protected]>
      Cc: David Rientjes <[email protected]>
      Cc: Roland Dreier <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
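The stricter check described above, busy memory that is also named "System RAM", can be sketched with a small resource table and a callback walker. The flag values, the struct layout and the function names are hypothetical stand-ins, not the kernel definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RES_MEM  0x1u /* hypothetical flag values */
#define RES_BUSY 0x2u

struct resource_model {
    const char *name;
    uint64_t start;
    uint64_t end; /* inclusive */
    unsigned int flags;
};

typedef int (*range_fn)(uint64_t start, uint64_t end, void *arg);

/* Call fn for each busy memory resource named "System RAM".
 * Stops early and returns fn's result if it is non-zero. */
static int walk_busy_system_ram(const struct resource_model *res, size_t n,
                                void *arg, range_fn fn)
{
    const unsigned int want = RES_MEM | RES_BUSY;
    size_t i;

    assert(fn != NULL);
    for (i = 0; i < n; i++) {
        int ret;

        if ((res[i].flags & want) != want)
            continue; /* the old check stopped here */
        if (strcmp(res[i].name, "System RAM") != 0)
            continue; /* stricter: skip busy firmware areas */
        ret = (*fn)(res[i].start, res[i].end, arg);
        if (ret)
            return ret;
    }
    return 0;
}

/* Example callback: accumulate the number of bytes visited. */
static int sum_bytes(uint64_t start, uint64_t end, void *arg)
{
    *(uint64_t *)arg += end - start + 1;
    return 0;
}
```

With the name check in place, a busy firmware region in the table is skipped, which is the behaviour /proc/kcore needs according to the message above.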
  22. 06 Jan, 2009 1 commit
    • mm: show node to memory section relationship with symlinks in sysfs · c04fc586
      Gary Hade authored
      Show node to memory section relationship with symlinks in sysfs
      Add /sys/devices/system/node/nodeX/memoryY symlinks for all
      the memory sections located on nodeX.  For example:
      /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
      indicates that memory section 135 resides on node1.
      Also revises documentation to cover this change as well as updating
      Documentation/ABI/testing/sysfs-devices-memory to include descriptions
      of memory hotremove files 'phys_device', 'phys_index', and 'state'
      that were previously not described there.
      In addition to it always being a good policy to provide users with
      the maximum possible amount of physical location information for
      resources that can be hot-added and/or hot-removed, the following
      are some (but likely not all) of the user benefits provided by
      this change.
        - Provides information needed to determine the specific node
          on which a defective DIMM is located.  This will reduce system
          downtime when the node or defective DIMM is swapped out.
        - Prevents unintended onlining of a memory section that was
          previously offlined due to a defective DIMM.  This could happen
          during node hot-add when the user or node hot-add assist script
          onlines _all_ offlined sections due to user or script inability
          to identify the specific memory sections located on the hot-added
          node.  The consequences of reintroducing the defective memory
          could be ugly.
        - Provides information needed to vary the amount and distribution
          of memory on specific nodes for testing or debugging purposes.
        - Will provide information needed to identify the memory
          sections that need to be offlined prior to physical removal
          of a specific node.
      Symlink creation during boot was tested on 2-node x86_64, 2-node
      ppc64, and 2-node ia64 systems.  Symlink creation during physical
      memory hot-add tested on a 2-node x86_64 system.
      Signed-off-by: Gary Hade <[email protected]>
      Signed-off-by: Badari Pulavarty <[email protected]>
      Acked-by: Ingo Molnar <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
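A tool that consumes these symlinks needs to build the path for a given node and memory section. The sketch below only formats and compares the path pattern quoted in the commit message and does not touch sysfs. Both helper names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Write the nodeX/memoryY symlink path into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int node_memory_link_path(char *buf, size_t len,
                                 unsigned int node, unsigned int section)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "/sys/devices/system/node/node%u/memory%u",
                 node, section);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* True if path is exactly the symlink path for (node, section). */
static bool is_node_memory_link(const char *path,
                                unsigned int node, unsigned int section)
{
    char expected[128];

    if (node_memory_link_path(expected, sizeof(expected), node, section))
        return false;
    return strcmp(path, expected) == 0;
}
```

On a real system the resulting path could then be passed to readlink(2) to confirm that it resolves to the matching entry under ../../memory/.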
  23. 24 Jul, 2008 2 commits
    • memory-hotplug: add sysfs removable attribute for hotplug memory remove · 5c755e9f
      Badari Pulavarty authored
      Memory may be hot-removed on a per-memory-block basis, particularly on
      POWER where the SPARSEMEM section size often matches the memory-block
      size.  A user-level agent must be able to identify which sections of
      memory are likely to be removable before attempting the potentially
      expensive operation.  This patch adds a file called "removable" to the
      memory directory in sysfs to help such an agent.  In this patch, a memory
      block is considered removable if:
      o It contains only MOVABLE pageblocks
      o It contains only pageblocks with free pages regardless of pageblock type
      On the other hand, a memory block starting with a PageReserved() page will
      never be considered removable.  Without this patch, the user-agent is
      forced to choose a memory block to remove randomly.
      Sample output of the sysfs files:
      ./memory/memory0/removable: 0
      ./memory/memory1/removable: 0
      ./memory/memory2/removable: 0
      ./memory/memory3/removable: 0
      ./memory/memory4/removable: 0
      ./memory/memory5/removable: 0
      ./memory/memory6/removable: 0
      ./memory/memory7/removable: 1
      ./memory/memory8/removable: 0
      ./memory/memory9/removable: 0
      ./memory/memory10/removable: 0
      ./memory/memory11/removable: 0
      ./memory/memory12/removable: 0
      ./memory/memory13/removable: 0
      ./memory/memory14/removable: 0
      ./memory/memory15/removable: 0
      ./memory/memory16/removable: 0
      ./memory/memory17/removable: 1
      ./memory/memory18/removable: 1
      ./memory/memory19/removable: 1
      ./memory/memory20/removable: 1
      ./memory/memory21/removable: 1
      ./memory/memory22/removable: 1
      Signed-off-by: Badari Pulavarty <[email protected]>
      Signed-off-by: Mel Gorman <[email protected]>
      Acked-by: KAMEZAWA Hiroyuki <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
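The removability heuristic can be modelled as a predicate over the pageblocks of a memory block. The sketch applies the two listed conditions per pageblock, so that each pageblock must be either MOVABLE or entirely free. That per-pageblock reading is an assumption, as are all type and function names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum pageblock_type { PB_UNMOVABLE, PB_MOVABLE };

struct pageblock_model {
    enum pageblock_type type;
    bool all_free; /* every page in the pageblock is free */
};

struct memory_block_model {
    bool starts_with_reserved_page;
    const struct pageblock_model *blocks;
    size_t nr_blocks;
};

/* Removable unless the block starts with a reserved page or contains
 * a pageblock that is neither MOVABLE nor entirely free. */
static bool block_is_removable(const struct memory_block_model *mb)
{
    size_t i;

    assert(mb != NULL);
    if (mb->starts_with_reserved_page)
        return false;
    for (i = 0; i < mb->nr_blocks; i++) {
        const struct pageblock_model *pb = &mb->blocks[i];

        if (pb->type != PB_MOVABLE && !pb->all_free)
            return false;
    }
    return true;
}
```

As the commit message notes, the attribute is only a hint that a block is likely removable, so a result of true here would not guarantee that an actual offline attempt succeeds.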
    • memory hotplug: small fixes to bootmem freeing for memory hotremove · af370fb8
      Yasunori Goto authored
      - Change some naming
        * Magic -> types
        * Change definition of bootmem type from direct hex value
      - __free_pages_bootmem() becomes __meminit.
      Signed-off-by: Yasunori Goto <[email protected]>
      Cc: Andy Whitcroft <[email protected]>
      Cc: Badari Pulavarty <[email protected]>
      Cc: Yinghai Lu <[email protected]>
      Cc: Johannes Weiner <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
  24. 09 Jun, 2008 1 commit
  25. 28 Apr, 2008 2 commits
    • memory hotplug: register section/node id to free · 04753278
      Yasunori Goto authored
      This patch set frees pages which are allocated by bootmem, for
      memory-hotremove.  Some memory management structures are allocated by
      bootmem, e.g. the memmap.
      To remove memory physically, some of them must be freed according to
      circumstance.  This patch set makes basis to free those pages, and free
      My basic idea is to use the remaining members of struct page to remember
      information about the users of bootmem (section number or node id).  When a
      section is being removed, the kernel can check it.  With this information,
      some issues can be solved:
        1) When the memmap of the section being removed is allocated on another
           section by bootmem, it should/can be freed.
        2) When the memmap of the section being removed is allocated on the
           same section, it shouldn't be freed, because the section has to be
           logically offlined already and all pages must be isolated from the
           page allocator. If it is freed, the page allocator may use memory
           which will be removed physically soon.
        3) When removing section has other section's memmap,
           kernel will be able to show easily which section should be removed
           before it for user. (Not implemented yet)
        4) When the above case 2), the page isolation will be able to check and skip
           memmap's page when logical memory offline (offline_pages()).
           Current page isolation code fails in this case because this page is
           just reserved page and it can't distinguish this pages can be
           removed or not. But, it will be able to do by this patch.
           (Not implemented yet.)
        5) The node information like pgdat has similar issues. But, this
           will be able to be solved too by this.
           (Not implemented yet, but, remembering node id in the pages.)
      Fortunately, the current bootmem allocator just keeps PageReserved flags,
      and doesn't use any other members of struct page. The users of
      bootmem don't use them either.
      This patch:
      This patch registers the node or section id.  The kernel can distinguish
      which node/section uses the pages allocated by bootmem.  This is the
      basis for hot-removing sections or nodes.
      Signed-off-by: Yasunori Goto <[email protected]>
      Cc: Badari Pulavarty <[email protected]>
      Cc: Yinghai Lu <[email protected]>
      Cc: Yasunori Goto <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
    • hotplug memory remove: generic __remove_pages() support · ea01ea93
      Badari Pulavarty authored
      Generic helper function to remove section mappings and sysfs entries for the
      section of the memory we are removing.  offline_pages() correctly adjusted
      zone and marked the pages reserved.
      TODO: Yasunori Goto is working on patches to free up allocations from bootmem.
      Signed-off-by: Badari Pulavarty <[email protected]>
      Acked-by: Yasunori Goto <[email protected]>
      Cc: Benjamin Herrenschmidt <[email protected]>
      Cc: Paul Mackerras <[email protected]>
      Signed-off-by: Andrew Morton <[email protected]>
      Signed-off-by: Linus Torvalds <[email protected]>
  26. 16 Oct, 2007 4 commits