linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrea Arcangeli <aarcange@redhat.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: akpm@linux-foundation.org, hughd@google.com, jweiner@redhat.com,
	mgorman@suse.de, pholasek@redhat.com, riel@redhat.com,
	David Rientjes <rientjes@google.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-mm <linux-mm@kvack.org>
Subject: Re: [patch 1/1] thp: avoid VM_BUG_ON page_count(page) false positives in __collapse_huge_page_copy
Date: Fri, 28 Sep 2012 14:07:18 +0200	[thread overview]
Message-ID: <20120928120718.GY19474@redhat.com> (raw)
In-Reply-To: <CA+55aFw069v_jCcF7rC-1bkOPYsg3f9wPiWEhNOXEq5D71Lx=g@mail.gmail.com>

Hi Linus,

On Thu, Sep 27, 2012 at 04:40:13PM -0700, Linus Torvalds wrote:
> On Thu, Sep 27, 2012 at 2:51 PM,  <akpm@linux-foundation.org> wrote:
> > From: Andrea Arcangeli <aarcange@redhat.com>
> > Subject: thp: avoid VM_BUG_ON page_count(page) false positives in __collapse_huge_page_copy
> >
> > Some time ago Petr once reproduced a false positive VM_BUG_ON in
> > khugepaged while running the autonuma-benchmark on a large 8 node system.
> > All production kernels out there have DEBUG_VM=n so it was only noticeable
> > on self built kernels.  It's not easily reproducible even on the 8 nodes
> > system.
> >
> > Use page_freeze_refs to prevent speculative pagecache lookups to
> > trigger the false positives, so we're still able to check the
> > page_count to be exact.
> 
> This is too ugly to live. It also fundamentally changes semantics and
> actually makes CONFIG_DEBUG_VM result in totally different behavior.
> 
> I really don't think it's a good feature to make CONFIG_DEBUG_VM
> actually seriously change serialization.
> 
> Either do the page_freeze_refs thing *unconditionally*, presumably
> replacing the current code that does
> 
>                 ...
>                 /* cannot use mapcount: can't collapse if there's a gup pin */
>                 if (page_count(page) != 1) {
> 
> instead, or then just relax the potentially racy VM_BUG_ON() to just
> check >= 2. Because debug stuff that changes semantics really is
> horribly horribly bad.

Agreed. I think we can simply drop that VM_BUG_ON: there's a
putback_lru_page that does a put_page, and then another that does
page_cache_release just after the VM_BUG_ON, so if the count is < 2
we'll notice when the count underflows in free_page_and_swap_cache
(we've a VM_BUG_ON in put_page_testzero for that).

The reason for too ugly to live patch, is that I wanted to take the
opportunity of a workload on a 128 CPU 8 node system that could
reproduce it to verify that it really was the speculative lookup and
not something else (and something else wouldn't have been good :). But
I fully agree with you that after successfully verifying that, it's
good to go with a non-ugly version. Thanks for checking this!

I'll send an updated patch.

> Btw, there are two other thp patches (relating to mlock) floating
> around. They look much more reasonable than this one, but I was hoping
> to see more ack's for them.

Ok! I'll review them too. I read them and they looked all right but I
didn't spend enough time on it yet to be sure and send an ack sorry.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2012-09-28 12:07 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20120927215147.A3F935C0050@hpza9.eem.corp.google.com>
2012-09-27 23:40 ` [patch 1/1] thp: avoid VM_BUG_ON page_count(page) false positives in __collapse_huge_page_copy Linus Torvalds
2012-09-28 12:07   ` Andrea Arcangeli [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120928120718.GY19474@redhat.com \
    --to=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=hughd@google.com \
    --cc=jweiner@redhat.com \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=pholasek@redhat.com \
    --cc=riel@redhat.com \
    --cc=rientjes@google.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).