All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wu Fengguang <fengguang.wu@intel.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH] HWPOISON: remove the unsafe __set_page_locked()
Date: Mon, 28 Sep 2009 16:44:01 +0800	[thread overview]
Message-ID: <20090928084401.GA22131@localhost> (raw)
In-Reply-To: <20090927192025.GA6327@wotan.suse.de>

On Mon, Sep 28, 2009 at 03:20:25AM +0800, Nick Piggin wrote:
> On Sun, Sep 27, 2009 at 06:47:39PM +0800, Wu Fengguang wrote:
> > > 
> > > And standard deviation is 0.04%, much larger than the difference 0.008% ..
> > 
> > Sorry that's not correct. I improved the accounting by treating
> > function0+function1 from two CPUs as an integral entity:
> > 
> >                  total time      add_to_page_cache_lru   percent  stddev
> >          before  3880166848.722  9683329.610             0.250%   0.014%
> >          after   3828516894.376  9778088.870             0.256%   0.012%
> >          delta                                           0.006%
> 
> I don't understand why you're doing this NFS workload to measure?

Because it is the first convenient workload hit my mind, and avoids
real disk IO :)

> I see significant nfs, networking protocol and device overheads in
> your profiles, also you're hitting some locks or something which
> is causing massive context switching. So I don't think this is a
> good test.

Yes there are overheads. However it is a real and common workload.

> But anyway as Hugh points out, you need to compare with a
> *completely* fixed kernel, which includes auditing all users of page
> flags non-atomically (slab, notably, but possibly also other
> places).

That's good point. We can do more benchmarks when more fixes are
available. However I suspect their design goal will be "fix them
without introducing noticeable overheads" :)

> One other thing to keep in mind that I will mention is that I am
> going to push in a patch to the page allocator to allow callers
> to avoid the refcounting (atomic_dec_and_test) in page lifetime,
> which is especially important for SLUB and takes more cycles off
> the page allocator...
>
> I don't know exactly what you're going to do after that to get a
> stable reference to slab pages. I guess you can read the page
> flags and speculatively take some slab locks and recheck etc...

For reliably we could skip page lock on zero refcounted pages.

We may lose the PG_hwpoison bit on races with __SetPageSlub*, however
it should be an acceptable imperfection.

Thanks,
Fengguang

WARNING: multiple messages have this Message-ID (diff)
From: Wu Fengguang <fengguang.wu@intel.com>
To: Nick Piggin <npiggin@suse.de>
Cc: Andi Kleen <andi@firstfloor.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC][PATCH] HWPOISON: remove the unsafe __set_page_locked()
Date: Mon, 28 Sep 2009 16:44:01 +0800	[thread overview]
Message-ID: <20090928084401.GA22131@localhost> (raw)
In-Reply-To: <20090927192025.GA6327@wotan.suse.de>

On Mon, Sep 28, 2009 at 03:20:25AM +0800, Nick Piggin wrote:
> On Sun, Sep 27, 2009 at 06:47:39PM +0800, Wu Fengguang wrote:
> > > 
> > > And standard deviation is 0.04%, much larger than the difference 0.008% ..
> > 
> > Sorry that's not correct. I improved the accounting by treating
> > function0+function1 from two CPUs as an integral entity:
> > 
> >                  total time      add_to_page_cache_lru   percent  stddev
> >          before  3880166848.722  9683329.610             0.250%   0.014%
> >          after   3828516894.376  9778088.870             0.256%   0.012%
> >          delta                                           0.006%
> 
> I don't understand why you're doing this NFS workload to measure?

Because it is the first convenient workload hit my mind, and avoids
real disk IO :)

> I see significant nfs, networking protocol and device overheads in
> your profiles, also you're hitting some locks or something which
> is causing massive context switching. So I don't think this is a
> good test.

Yes there are overheads. However it is a real and common workload.

> But anyway as Hugh points out, you need to compare with a
> *completely* fixed kernel, which includes auditing all users of page
> flags non-atomically (slab, notably, but possibly also other
> places).

That's good point. We can do more benchmarks when more fixes are
available. However I suspect their design goal will be "fix them
without introducing noticeable overheads" :)

> One other thing to keep in mind that I will mention is that I am
> going to push in a patch to the page allocator to allow callers
> to avoid the refcounting (atomic_dec_and_test) in page lifetime,
> which is especially important for SLUB and takes more cycles off
> the page allocator...
>
> I don't know exactly what you're going to do after that to get a
> stable reference to slab pages. I guess you can read the page
> flags and speculatively take some slab locks and recheck etc...

For reliably we could skip page lock on zero refcounted pages.

We may lose the PG_hwpoison bit on races with __SetPageSlub*, however
it should be an acceptable imperfection.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2009-09-28  8:44 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-09-26  3:15 [RFC][PATCH] HWPOISON: remove the unsafe __set_page_locked() Wu Fengguang
2009-09-26  3:15 ` Wu Fengguang
2009-09-26  3:49 ` Andi Kleen
2009-09-26  3:49   ` Andi Kleen
2009-09-26 10:52   ` Wu Fengguang
2009-09-26 10:52     ` Wu Fengguang
2009-09-26 11:31     ` Wu Fengguang
2009-09-26 11:31       ` Wu Fengguang
2009-09-27 10:47       ` Wu Fengguang
2009-09-27 10:47         ` Wu Fengguang
2009-09-27 19:20         ` Nick Piggin
2009-09-27 19:20           ` Nick Piggin
2009-09-28  8:44           ` Wu Fengguang [this message]
2009-09-28  8:44             ` Wu Fengguang
2009-09-29  5:16             ` Wu Fengguang
2009-09-29  5:16               ` Wu Fengguang
2009-10-01  2:02             ` Nick Piggin
2009-10-01  2:02               ` Nick Piggin
2009-10-02 10:54               ` Wu Fengguang
2009-10-02 10:54                 ` Wu Fengguang
2009-09-26 11:09 ` Hugh Dickins
2009-09-26 11:09   ` Hugh Dickins
2009-09-26 11:48   ` Wu Fengguang
2009-09-26 11:48     ` Wu Fengguang
2009-09-26 11:58     ` Hugh Dickins
2009-09-26 11:58       ` Hugh Dickins
2009-09-26 15:05     ` Andi Kleen
2009-09-26 15:05       ` Andi Kleen
2009-09-26 19:12       ` Nick Piggin
2009-09-26 19:12         ` Nick Piggin
2009-09-26 19:14     ` Nick Piggin
2009-09-26 19:14       ` Nick Piggin
2009-09-26 19:06   ` Nick Piggin
2009-09-26 19:06     ` Nick Piggin
2009-09-26 21:32     ` Andi Kleen
2009-09-26 21:32       ` Andi Kleen
2009-09-27 16:26       ` Hugh Dickins
2009-09-27 16:26         ` Hugh Dickins
2009-09-27 19:22         ` Nick Piggin
2009-09-27 19:22           ` Nick Piggin
2009-09-27 21:57           ` Hugh Dickins
2009-09-27 21:57             ` Hugh Dickins
2009-09-27 23:01             ` Nick Piggin
2009-09-27 23:01               ` Nick Piggin
2009-09-28  1:19               ` Andi Kleen
2009-09-28  1:19                 ` Andi Kleen
2009-09-28  1:52                 ` Wu Fengguang
2009-09-28  1:52                   ` Wu Fengguang
2009-09-28  2:57                 ` Nick Piggin
2009-09-28  2:57                   ` Nick Piggin
2009-09-28  4:11                   ` Andi Kleen
2009-09-28  4:11                     ` Andi Kleen
2009-09-28  4:29                     ` Nick Piggin
2009-09-28  4:29                       ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20090928084401.GA22131@localhost \
    --to=fengguang.wu@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=hugh.dickins@tiscali.co.uk \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.