From: Wu Fengguang <wfg@linux.intel.com>
To: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: "Илья Тумайкин" <librarian_rus@yahoo.com>,
LKML <linux-kernel@vger.kernel.org>,
linux-fsdevel@vger.kernel.org
Subject: Re: A regression in recent 3.2 kernel: bdi_dirty_limit() divide error
Date: Mon, 9 Jan 2012 12:55:15 +0800
Message-ID: <20120109045515.GB3976@localhost>
In-Reply-To: <1326017954.2442.35.camel@twins>

On Sun, Jan 08, 2012 at 11:19:14AM +0100, Peter Zijlstra wrote:
> On Sun, 2012-01-08 at 10:33 +0800, Wu Fengguang wrote:
> > On Sat, Jan 07, 2012 at 05:35:25PM +0100, Peter Zijlstra wrote:
> > > On Sat, 2012-01-07 at 22:56 +0800, Wu Fengguang wrote:
> > > > Subject:
> > > > Date: Sat Jan 07 22:50:45 CST 2012
> > > >
> > > > The uninitialized shift may lead to denominator=0 in
> > > > prop_fraction_percpu() and a divide error in bdi_dirty_limit().
> > >
> > > I'm not seeing how, only prop_change_shift() can change ->index, and it
> > > does that after it writes ->pg[index]->shift.
> >
> > Then I have no clue why bdi_dirty_limit() would hit a divide error at all.
>
> You and me both, the weird thing is, this code hasn't been changed in
> forever and I can't recall any such weirdness.
>
> In fact, prop_fraction_percpu() sets the denominator to period_2 +
> (global_count & counter_mask).
>
> The only way to make that 0 is to overflow the unsigned long.. did the
> crash happen on 32bit -- I never saw the initial report?

No, it's a 64-bit kernel. Sorry, I should have forwarded the complete
initial report.

> But even then, we limit PROP_MAX_SHIFT to 3*BITS_PER_LONG/4, I don't
> think that could ever overflow.

It seems PROP_MAX_SHIFT should be set to <= 32 on a 64-bit box, because

1) bdi_dirty_limit() only uses the lower 32 bits of the denominator
   when it calls do_div()

2) (bdi_dirty * numerator) could easily overflow if the numerator uses
   up to 48 bits, leaving only 16 bits for bdi_dirty

And I guess (2) may be the root cause of a related old bug, the
sudden drops of bdi_thresh:

http://lkml.indiana.edu/hypermail/linux/kernel/1109.0/00183.html
http://lkml.indiana.edu/hypermail/linux/kernel/1109.0/00183/10-3.1.0-rc1%2Bbalance_dirty_pages-pages.png
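
To make points (1) and (2) concrete, here is a minimal userspace
sketch, not the kernel code itself. It assumes period_2 is
1UL << (shift - 1) and counter_mask is period_2 - 1, and it models
do_div() as a division by the low 32 bits of the denominator. The
helper names prop_denominator(), do_div_base() and mul_overflows_u64()
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the denominator computed in prop_fraction_percpu():
 *   period_2 + (global_count & counter_mask)
 * ASSUMPTION: period_2 = 1 << (shift - 1), counter_mask = period_2 - 1.
 */
static uint64_t prop_denominator(unsigned int shift, uint64_t global_count)
{
	uint64_t period_2, counter_mask;

	assert(shift >= 1 && shift <= 64);
	period_2 = UINT64_C(1) << (shift - 1);
	counter_mask = period_2 - 1;
	return period_2 + (global_count & counter_mask);
}

/*
 * Model of what do_div() actually divides by on a 64-bit kernel:
 * only the low 32 bits of the divisor survive.
 */
static uint32_t do_div_base(uint64_t denominator)
{
	return (uint32_t)denominator;
}

/* Returns 1 if a * b does not fit in 64 bits, 0 otherwise. */
static int mul_overflows_u64(uint64_t a, uint64_t b)
{
	return a != 0 && b > UINT64_MAX / a;
}
```

With shift = 33 and a global count whose low 32 bits are zero, the
denominator is exactly 1 << 32, so do_div_base() returns 0, which is
the divide error. With shift = 32 the denominator stays between
1 << 31 and 2^32 - 1 and still fits in 32 bits. Likewise, a 48-bit
numerator times a bdi_dirty of 1 << 17 pages no longer fits in 64 bits.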
> > prop_change_shift() does
> >
> > change ->pg[index]->shift
> > smp_wmb()
> > change ->index
> >
> > Will the read side prop_fraction_percpu() need some read memory barrier?
>
> It actually has one, see prop_get_global()...

Ah yes! Sorry for overlooking this.
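
For reference, the pairing discussed above can be sketched in
userspace with C11 fences. This is only a model under stated
assumptions: struct prop_model and both function names are
hypothetical, the release and acquire fences merely play roles
comparable to smp_wmb() and smp_rmb(), and the kernel's locking and
RCU protection are left out.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the two-slot descriptor. */
struct prop_model {
	int shift[2];		/* models ->pg[0..1].shift */
	atomic_int index;	/* models ->index */
};

/* Write side: store the new shift first, then flip the index. */
static void model_change_shift(struct prop_model *p, int new_shift)
{
	int next = atomic_load_explicit(&p->index, memory_order_relaxed) ^ 1;

	p->shift[next] = new_shift;
	/* Comparable role to smp_wmb(): order the shift store before the index store. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&p->index, next, memory_order_relaxed);
}

/* Read side: load the index first, then the shift it selects. */
static int model_read_shift(struct prop_model *p)
{
	int idx = atomic_load_explicit(&p->index, memory_order_relaxed);

	/* Comparable role to smp_rmb(): order the index load before the shift load. */
	atomic_thread_fence(memory_order_acquire);
	return p->shift[idx];
}
```

A reader that observes the new index is thereby guaranteed to see the
shift written before it, so it cannot pick up an uninitialized shift
through this path.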
Regards,
Fengguang