    fs: fsnotify: account fsnotify metadata to kmemcg · d46eb14b
    Shakeel Butt authored
    Patch series "Directed kmem charging", v8.
    The Linux kernel's memory cgroup allows limiting the memory usage of the
    jobs running on the system to provide isolation between the jobs.  All
    the kernel memory allocated in the context of the job and marked with
    __GFP_ACCOUNT will also be included in the memory usage and be limited
    by the job's limit.
    Kernel memory can only be charged to the memcg of the process in
    whose context it was allocated.  However, there are cases where the
    allocated kernel memory should be charged to a memcg different from
    the current process's memcg.  This patch series contains two such
    concrete use-cases, i.e. fsnotify and buffer_head.
    The fsnotify event objects can consume a lot of system memory for large
    or unlimited queues if there is no listener or only a slow one.  The
    events are allocated in the context of the event producer.  However,
    they should be charged to the event consumer.  Similarly, the
    buffer_head objects can be allocated in a memcg different from the
    memcg of the page for which they are being allocated.
    To solve this issue, this patch series introduces a mechanism to charge
    kernel memory to a given memcg.  In the case of fsnotify events, the
    memcg of the consumer can be used for charging and, for buffer_head,
    the memcg of the page can be charged.  For directed charging, the
    caller can use the scope API memalloc_[un]use_memcg() to specify the
    memcg to charge for all the __GFP_ACCOUNT allocations within the scope.
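    The scoped behaviour described above can be illustrated with a small
    userspace model.  This is only a sketch, not kernel code: struct memcg,
    current_memcg, active_memcg and the model_*() helpers are hypothetical
    stand-ins invented for this example.  It assumes that an accounted
    allocation is charged to the override memcg while a scope is open and
    to the allocating task's own memcg otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a memory cgroup: it only counts bytes. */
struct memcg {
	const char *name;
	size_t charged;
};

/* Stand-in for the memcg of the task that is currently running. */
static struct memcg *current_memcg;

/* Stand-in for the override installed by the scope API; NULL means no scope. */
static struct memcg *active_memcg;

/* Model of opening a scope: accounted allocations now go to @target. */
static void model_use_memcg(struct memcg *target)
{
	active_memcg = target;
}

/* Model of closing the scope: charging falls back to the current task. */
static void model_unuse_memcg(void)
{
	active_memcg = NULL;
}

/* Model of an accounted allocation: the override wins when one is set. */
static struct memcg *model_charge(size_t bytes)
{
	struct memcg *target = active_memcg ? active_memcg : current_memcg;

	if (target)
		target->charged += bytes;
	return target;
}
```

    In this model, a producer that brackets an allocation with
    model_use_memcg(&listener) and model_unuse_memcg() charges the listener
    for that allocation only; allocations before and after the scope are
    still charged to the producer.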
    This patch (of 2):
    A lot of memory can be consumed by the events generated for huge or
    unlimited queues if there is no listener or only a slow one.  This can
    cause system-level memory pressure or OOMs.  So, it's better to account
    the fsnotify kmem caches to the memcg of the listener.
    However, the listener can be in a different memcg than the producer,
    and these allocations happen in the context of the event producer.
    This patch introduces a remote memcg charging API which the producer
    can use to charge the allocations to the memcg of the listener.
    There are seven fsnotify kmem caches.  Among them, allocations from
    dnotify_struct_cache, dnotify_mark_cache, fanotify_mark_cache and
    inotify_inode_mark_cachep happen in the context of a syscall from the
    listener.  So, SLAB_ACCOUNT is enough for these caches.
    The objects from fsnotify_mark_connector_cachep are not accounted, as
    they are small compared to the notification marks or events and it is
    unclear whom to account the connector to, since it is shared by all
    events attached to the inode.
    The allocations from the event caches happen in the context of the event
    producer.  For such caches we will need to remote charge the allocations
    to the listener's memcg.  Thus we save the memcg reference in the
    fsnotify_group structure of the listener.
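    The idea of saving the listener's memcg in its group can be sketched in
    userspace as follows.  This is an illustrative model under stated
    assumptions, not the patch itself: struct memcg_model, struct
    group_model, group_init() and group_queue_event() are hypothetical
    names, and the real fsnotify_group has many more members.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory cgroup: it only counts bytes. */
struct memcg_model {
	size_t charged;
};

/* Hypothetical stand-in for the listener's notification group. */
struct group_model {
	struct memcg_model *memcg;	/* saved when the listener creates the group */
	size_t queued_events;
};

/* Runs in the listener's context, so its memcg is known and can be saved. */
static void group_init(struct group_model *g, struct memcg_model *listener_memcg)
{
	g->memcg = listener_memcg;
	g->queued_events = 0;
}

/*
 * Runs in the producer's context. The event is charged to the memcg saved
 * in the group, not to whichever task happened to generate the event.
 */
static void group_queue_event(struct group_model *g, size_t event_size)
{
	g->memcg->charged += event_size;
	g->queued_events++;
}
```

    Because the reference lives in the group, the producer needs no
    knowledge of who the listener is; it only needs the group that the
    event is being queued to.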
    This patch also rearranges the members of fsnotify_group to fill the
    existing holes, so that the structure size stays the same, at least
    for 64-bit builds, even with the additional member.
    [shakeelb@google.com: use GFP_KERNEL_ACCOUNT rather than open-coding it]
      Link: http://lkml.kernel.org/r/20180702215439.211597-1-shakeelb@google.com
    Link: http://lkml.kernel.org/r/20180627191250.209150-2-shakeelb@google.com
    Signed-off-by: Shakeel Butt <shakeelb@google.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Michal Hocko <mhocko@kernel.org>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Amir Goldstein <amir73il@gmail.com>
    Cc: Greg Thelen <gthelen@google.com>
    Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
    Cc: Roman Gushchin <guro@fb.com>
    Cc: Alexander Viro <viro@zeniv.linux.org.uk>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>