From: "Srivatsa S. Bhat" <srivatsa.bhat@linux.vnet.ibm.com>
To: Borislav Petkov <bp@amd64.org>
Cc: Conny Seidel <conny.seidel@amd.com>, Jan Kara <jack@suse.cz>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Fengguang Wu <fengguang.wu@intel.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <jweiner@redhat.com>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: divide error: bdi_dirty_limit+0x5a/0x9e
Date: Tue, 25 Sep 2012 00:18:46 +0530
Message-ID: <5060AB0E.3070809@linux.vnet.ibm.com>
In-Reply-To: <20120924181927.GA25762@aftab.osrc.amd.com>

On 09/24/2012 11:49 PM, Borislav Petkov wrote:
> On Mon, Sep 24, 2012 at 08:16:50PM +0200, Conny Seidel wrote:
>> Hi,
>>
>> On Mon, 24 Sep 2012 16:36:09 +0200
>> Borislav Petkov <bp@amd64.org> wrote:
>>> [ … ]
>>>
>>> Conny, would you test pls?
>>
>> Sure thing.
>> Out of ~25 runs I only triggered it once, without the patch the
>> trigger-rate is higher.
>>
>> [   55.098249] Broke affinity for irq 81
>> [   55.105108] smpboot: CPU 1 is now offline
>> [   55.311216] smpboot: Booting Node 0 Processor 1 APIC 0x11
>> [   55.333022] LVT offset 0 assigned for vector 0x400
>> [   55.545877] smpboot: CPU 2 is now offline
>> [   55.753050] smpboot: Booting Node 0 Processor 2 APIC 0x12
>> [   55.775582] LVT offset 0 assigned for vector 0x400
>> [   55.986747] smpboot: CPU 3 is now offline
>> [   56.193839] smpboot: Booting Node 0 Processor 3 APIC 0x13
>> [   56.212643] LVT offset 0 assigned for vector 0x400
>> [   56.423201] Got negative events: -25
> 
> I see it:
> 
> __percpu_counter_sum does for_each_online_cpu without doing
> get/put_online_cpus().
> 

Maybe I'm missing something, but that doesn't immediately tell me
what the exact source of the bug is. Note that there is a hotplug
callback, percpu_counter_hotcpu_callback(), that takes the same
fbc->lock before updating/resetting the percpu counters of the
offline CPU. So, though the synchronization is a bit weird, I don't
immediately see a problematic race condition there.
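
The locking described above can be sketched in userspace C. This is
only an illustration under assumptions, not the kernel code: the
struct, the pcpu_* helper names, NR_CPUS = 4 and the atomic_flag
spinlock are all hypothetical stand-ins. It shows a summation that
walks the online CPUs under a lock, and a hotplug-style callback that
folds an offline CPU's delta into the global count under the same lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define NR_CPUS 4 /* hypothetical CPU count for this sketch */

struct pcpu_counter {
	atomic_flag lock;        /* stands in for fbc->lock */
	int64_t count;           /* global part of the counter */
	int32_t percpu[NR_CPUS]; /* per-CPU deltas */
	bool online[NR_CPUS];    /* stands in for the online CPU mask */
};

static void pcpu_lock(struct pcpu_counter *c)
{
	while (atomic_flag_test_and_set(&c->lock))
		; /* spin until the flag was previously clear */
}

static void pcpu_unlock(struct pcpu_counter *c)
{
	atomic_flag_clear(&c->lock);
}

static void pcpu_counter_init(struct pcpu_counter *c)
{
	atomic_flag_clear(&c->lock);
	c->count = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		c->percpu[cpu] = 0;
		c->online[cpu] = true;
	}
}

/* Fast path: touch only the given CPU's slot, without the lock. */
static void pcpu_counter_add(struct pcpu_counter *c, int cpu, int32_t delta)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->percpu[cpu] += delta;
}

/* The online flag is changed outside the counter lock in this sketch. */
static void pcpu_set_online(struct pcpu_counter *c, int cpu, bool on)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	c->online[cpu] = on;
}

/* Sum the global count plus the deltas of online CPUs, under the lock. */
static int64_t pcpu_counter_sum(struct pcpu_counter *c)
{
	int64_t ret;

	pcpu_lock(c);
	ret = c->count;
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (c->online[cpu])
			ret += c->percpu[cpu];
	pcpu_unlock(c);
	return ret;
}

/* Hotplug-style callback: fold the dead CPU's delta, under the same lock. */
static void pcpu_counter_cpu_dead(struct pcpu_counter *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	pcpu_lock(c);
	c->count += c->percpu[cpu];
	c->percpu[cpu] = 0;
	pcpu_unlock(c);
}
```

Once the callback has run, the sum no longer depends on the offline
CPU's slot. The sketch is single-threaded as written and does not
model the timing between the online flag changing and the callback
running, so it says nothing about whether a race exists in practice.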

And, speaking of hotplug callbacks, on a slightly different note,
I see one defined as ratelimit_handler(), which calls
writeback_set_ratelimit() for *every single* state change in the
hotplug sequence! Is that really intentional? num_online_cpus()
changes its value only -once- for every hotplug :-)
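
The point about redundant calls can be illustrated with a small
userspace C sketch. Everything here is an assumption for illustration:
the hp_action values are hypothetical stand-ins loosely modelled on
hotplug notifier events, set_ratelimit() stands in for
writeback_set_ratelimit(), and reacting only to the "online" and
"dead" events is one possible filter, not necessarily the right one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the state changes in a hotplug sequence. */
enum hp_action {
	HP_UP_PREPARE,
	HP_STARTING,
	HP_ONLINE,
	HP_DOWN_PREPARE,
	HP_DYING,
	HP_DEAD,
	HP_POST_DEAD,
};

static int ratelimit_updates; /* how often the ratelimit was recomputed */

/* Stands in for writeback_set_ratelimit(); only counts invocations. */
static void set_ratelimit(void)
{
	ratelimit_updates++;
}

/* Recompute on every state change, as the handler described above does. */
static void handler_every_event(enum hp_action action)
{
	(void)action;
	set_ratelimit();
}

/* Recompute only on the two events chosen for this sketch. */
static void handler_filtered(enum hp_action action)
{
	switch (action) {
	case HP_ONLINE:
	case HP_DEAD:
		set_ratelimit();
		break;
	default:
		break;
	}
}

/* Replay one offline followed by one online of a CPU; return the call count. */
static int count_updates(void (*handler)(enum hp_action))
{
	static const enum hp_action cycle[] = {
		HP_DOWN_PREPARE, HP_DYING, HP_DEAD, HP_POST_DEAD,
		HP_UP_PREPARE, HP_STARTING, HP_ONLINE,
	};

	assert(handler != NULL);
	ratelimit_updates = 0;
	for (size_t i = 0; i < sizeof(cycle) / sizeof(cycle[0]); i++)
		handler(cycle[i]);
	return ratelimit_updates;
}
```

In this model, count_updates(handler_every_event) recomputes the
ratelimit seven times for one offline/online cycle, while
count_updates(handler_filtered) does so twice, once per actual change
in the number of online CPUs.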

Regards,
Srivatsa S. Bhat
