All of lore.kernel.org
 help / color / mirror / Atom feed
From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Tejun Heo <tj@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/5] mm: uncharge kmem pages from generic free_page path
Date: Wed, 30 Sep 2015 19:46:51 +0300	[thread overview]
Message-ID: <20150930164651.GA19988@esperanza> (raw)
In-Reply-To: <20150929154347.c22bc340458d534d5cdb096c@linux-foundation.org>

On Tue, Sep 29, 2015 at 03:43:47PM -0700, Andrew Morton wrote:
> On Sat, 26 Sep 2015 13:45:53 +0300 Vladimir Davydov <vdavydov@parallels.com> wrote:
> 
> > Currently, to charge a page to kmemcg one should use alloc_kmem_pages
> > helper. When the page is not needed anymore it must be freed with
> > free_kmem_pages helper, which will uncharge the page before freeing it.
> > Such a design is acceptable for thread info pages and kmalloc large
> > allocations, which are currently the only users of alloc_kmem_pages, but
> > it gets extremely inconvenient if one wants to make use of batched free
> > (e.g. to charge page tables - see release_pages) or page reference
> > counter (pipe buffers - see anon_pipe_buf_release).
> > 
> > To overcome this limitation, this patch moves kmemcg uncharge code to
> > the generic free path and zaps free_kmem_pages helper. To distinguish
> > kmem pages from other page types, it makes alloc_kmem_pages initialize
> > page->_mapcount to a special value and introduces a new PageKmem helper,
> > which returns true if it sees this value.
> 
> As far as I can tell, this new use of page._mapcount is OK, but...
> 
> - The documentation for _mapcount needs to be updated (mm_types.h)
> 
> - Don't believe the documentation!  Because someone else may have
>   done what you tried to do.  Please manually audit mm/ for _mapcount
>   uses.

OK, I rechecked mm/. Here is the list of (ab)users of the
page->_mapcount field:

 - free pages in buddy (PAGE_BUDDY_MAPCOUNT_VALUE)
 - balloon pages (PAGE_BALLOON_MAPCOUNT_VALUE)
 - compound tail pages (use _mapcount for reference counting)

None of them needs PageKmem set by design. The _mapcount is also
overloaded by slab, but the latter doesn't need alloc_kmem_pages for
kmemcg accounting.

However, there is a (ab)user of _mapcount outside mm/. It's arch/x390,
which stores its private info in page table pages' _mapcount. AFAICS
this shouldn't result in any conflicts with the PageKmem helper
introduced by this patch set, because s390 doesn't use generic
tlb_remove_page, but it looks nasty anyway and at least needs a comment.
I'll look what we can do with that.

> 
> - One such use is "For recording whether a page is in the buddy
>   system, we set ->_mapcount PAGE_BUDDY_MAPCOUNT_VALUE".  Please update
>   the comment for this while you're in there.  (Including description
>   of the state's lifetime).
> 
> - And please update _mapcount docs for PageBalloon()
> 
> - Why is the code accessing ->_mapcount directly?  afaict page_mapcount()
>   and friends will work OK?

page_mapcount() lives in mm.h, which isn't included by page-flags.h.
Anyway, I don't think it's a good idea to use page_mapcount() helper
here, because the latter returns not the value of _mapcount, but the
actual count of page mappings (i.e. _mapcount+1), which is IMO confusing
if the page is never mapped and _mapcount is (ab)used for storing a
flag.

> 
> - The patch adds overhead to all kernels, even non-kmemcg and
>   non-memcg kernels.  Bad.  Fixable?

If kmemcg is not used, memcg_kmem_uncharge_pages is a no-op (thanks to
jump labels), so the overhead added is that of load + comparison + store
(i.e. if PageKmem(page) __ClearPageKmem(page)) at worst. For most cases
(!PageKmem) it's just a load + comparison. Taking into account the fact
that page->_mapcount is accessed in free_pages_check anyway, the
overhead should therefore be negligible IMO.

I can, of course, move all PageKmem stuff under CONFIG_MEMCG_KMEM, but
is that vague performance benefit worth obscuring the code?

> 
> - PAGE_BUDDY_MAPCOUNT_VALUE, PAGE_BALLOON_MAPCOUNT_VALUE and
>   PAGE_KMEM_MAPCOUNT_VALUE should all be put next to each other so
>   readers can see all the possible values and so we don't get
>   duplicates, etc.

Right, will do.

Thanks,
Vladimir

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)
From: Vladimir Davydov <vdavydov@virtuozzo.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>, Tejun Heo <tj@kernel.org>,
	<linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/5] mm: uncharge kmem pages from generic free_page path
Date: Wed, 30 Sep 2015 19:46:51 +0300	[thread overview]
Message-ID: <20150930164651.GA19988@esperanza> (raw)
In-Reply-To: <20150929154347.c22bc340458d534d5cdb096c@linux-foundation.org>

On Tue, Sep 29, 2015 at 03:43:47PM -0700, Andrew Morton wrote:
> On Sat, 26 Sep 2015 13:45:53 +0300 Vladimir Davydov <vdavydov@parallels.com> wrote:
> 
> > Currently, to charge a page to kmemcg one should use alloc_kmem_pages
> > helper. When the page is not needed anymore it must be freed with
> > free_kmem_pages helper, which will uncharge the page before freeing it.
> > Such a design is acceptable for thread info pages and kmalloc large
> > allocations, which are currently the only users of alloc_kmem_pages, but
> > it gets extremely inconvenient if one wants to make use of batched free
> > (e.g. to charge page tables - see release_pages) or page reference
> > counter (pipe buffers - see anon_pipe_buf_release).
> > 
> > To overcome this limitation, this patch moves kmemcg uncharge code to
> > the generic free path and zaps free_kmem_pages helper. To distinguish
> > kmem pages from other page types, it makes alloc_kmem_pages initialize
> > page->_mapcount to a special value and introduces a new PageKmem helper,
> > which returns true if it sees this value.
> 
> As far as I can tell, this new use of page._mapcount is OK, but...
> 
> - The documentation for _mapcount needs to be updated (mm_types.h)
> 
> - Don't believe the documentation!  Because someone else may have
>   done what you tried to do.  Please manually audit mm/ for _mapcount
>   uses.

OK, I rechecked mm/. Here is the list of (ab)users of the
page->_mapcount field:

 - free pages in buddy (PAGE_BUDDY_MAPCOUNT_VALUE)
 - balloon pages (PAGE_BALLOON_MAPCOUNT_VALUE)
 - compound tail pages (use _mapcount for reference counting)

None of them needs PageKmem set by design. The _mapcount is also
overloaded by slab, but the latter doesn't need alloc_kmem_pages for
kmemcg accounting.

However, there is a (ab)user of _mapcount outside mm/. It's arch/x390,
which stores its private info in page table pages' _mapcount. AFAICS
this shouldn't result in any conflicts with the PageKmem helper
introduced by this patch set, because s390 doesn't use generic
tlb_remove_page, but it looks nasty anyway and at least needs a comment.
I'll look what we can do with that.

> 
> - One such use is "For recording whether a page is in the buddy
>   system, we set ->_mapcount PAGE_BUDDY_MAPCOUNT_VALUE".  Please update
>   the comment for this while you're in there.  (Including description
>   of the state's lifetime).
> 
> - And please update _mapcount docs for PageBalloon()
> 
> - Why is the code accessing ->_mapcount directly?  afaict page_mapcount()
>   and friends will work OK?

page_mapcount() lives in mm.h, which isn't included by page-flags.h.
Anyway, I don't think it's a good idea to use page_mapcount() helper
here, because the latter returns not the value of _mapcount, but the
actual count of page mappings (i.e. _mapcount+1), which is IMO confusing
if the page is never mapped and _mapcount is (ab)used for storing a
flag.

> 
> - The patch adds overhead to all kernels, even non-kmemcg and
>   non-memcg kernels.  Bad.  Fixable?

If kmemcg is not used, memcg_kmem_uncharge_pages is a no-op (thanks to
jump labels), so the overhead added is that of load + comparison + store
(i.e. if PageKmem(page) __ClearPageKmem(page)) at worst. For most cases
(!PageKmem) it's just a load + comparison. Taking into account the fact
that page->_mapcount is accessed in free_pages_check anyway, the
overhead should therefore be negligible IMO.

I can, of course, move all PageKmem stuff under CONFIG_MEMCG_KMEM, but
is that vague performance benefit worth obscuring the code?

> 
> - PAGE_BUDDY_MAPCOUNT_VALUE, PAGE_BALLOON_MAPCOUNT_VALUE and
>   PAGE_KMEM_MAPCOUNT_VALUE should all be put next to each other so
>   readers can see all the possible values and so we don't get
>   duplicates, etc.

Right, will do.

Thanks,
Vladimir

  reply	other threads:[~2015-09-30 16:47 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-26 10:45 [PATCH 0/5] memcg: charge page tables (x86) and pipe buffers Vladimir Davydov
2015-09-26 10:45 ` Vladimir Davydov
2015-09-26 10:45 ` [PATCH 1/5] mm: uncharge kmem pages from generic free_page path Vladimir Davydov
2015-09-26 10:45   ` Vladimir Davydov
2015-09-29 22:43   ` Andrew Morton
2015-09-29 22:43     ` Andrew Morton
2015-09-30 16:46     ` Vladimir Davydov [this message]
2015-09-30 16:46       ` Vladimir Davydov
2015-09-30 19:51   ` Greg Thelen
2015-09-30 19:51     ` Greg Thelen
2015-10-01 18:52     ` Vladimir Davydov
2015-10-01 18:52       ` Vladimir Davydov
2015-09-26 10:45 ` [PATCH 2/5] fs: charge pipe buffers to memcg Vladimir Davydov
2015-09-26 10:45   ` Vladimir Davydov
2015-09-29 22:57   ` Andrew Morton
2015-09-29 22:57     ` Andrew Morton
2015-09-30 16:49     ` Vladimir Davydov
2015-09-30 16:49       ` Vladimir Davydov
2015-09-26 10:45 ` [PATCH 3/5] memcg: teach uncharge_list to uncharge kmem pages Vladimir Davydov
2015-09-26 10:45   ` Vladimir Davydov
2015-09-26 10:45 ` [PATCH 4/5] mm: add __get_free_kmem_pages helper Vladimir Davydov
2015-09-26 10:45   ` Vladimir Davydov
2015-09-26 10:45 ` [PATCH 5/5] x86: charge page table pages to memcg Vladimir Davydov
2015-09-26 10:45   ` Vladimir Davydov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150930164651.GA19988@esperanza \
    --to=vdavydov@virtuozzo.com \
    --cc=akpm@linux-foundation.org \
    --cc=hannes@cmpxchg.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@kernel.org \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.