All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mingming Cao <cmm@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	tytso@mit.edu, sandeen@redhat.com, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH -V3 01/11] percpu_counters: make fbc->count read atomic on 32 bit architecture
Date: Thu, 28 Aug 2008 15:59:46 -0700	[thread overview]
Message-ID: <1219964386.6384.63.camel@mingming-laptop> (raw)
In-Reply-To: <20080827210925.b4846037.akpm@linux-foundation.org>


在 2008-08-27三的 21:09 -0700,Andrew Morton写道:
> On Thu, 28 Aug 2008 09:22:00 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Aug 27, 2008 at 02:22:50PM -0700, Andrew Morton wrote:
> > > On Wed, 27 Aug 2008 23:01:52 +0200
> > > Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > 
> > > > > 
> > > > > > +static inline s64 percpu_counter_read(struct percpu_counter *fbc)
> > > > > > +{
> > > > > > +	return fbc_count(fbc);
> > > > > > +}
> > > > > 
> > > > > This change means that a percpu_counter_read() from interrupt context
> > > > > on a 32-bit machine is now deadlockable, whereas it previously was not
> > > > > deadlockable on either 32-bit or 64-bit.
> > > > > 
> > > > > This flows on to the lib/proportions.c, which uses
> > > > > percpu_counter_read() and also does spin_lock_irqsave() internally,
> > > > > indicating that it is (or was) designed to be used in IRQ contexts.
> > > > 
> > > > percpu_counter() never was irq safe, which is why the proportion stuff
> > > > does all the irq disabling bits by hand.
> > > 
> > > percpu_counter_read() was irq-safe.  That changes here.  Needs careful
> > > review, changelogging and, preferably, runtime checks.  But perhaps
> > > they should be inside some CONFIG_thing which won't normally be done in
> > > production.
> > > 
> > > otoh, percpu_counter_read() is in fact a rare operation, so a bit of
> > > overhead probably won't matter.
> > > 
> > > (write-often, read-rarely is the whole point.  This patch's changelog's
> > > assertion that "Since fbc->count is read more frequently and updated
> > > rarely" is probably wrong.  Most percpu_counters will have their
> > > fbc->count modified far more frequently than having it read from).
> > 
> > we may actually be doing percpu_counter_add. But that doesn't update
> > fbc->count. Only if the local percpu values cross FBC_BATCH we update
> > fbc->count. If we are modifying fbc->count more frequently than
> > reading fbc->count then i guess we would be contenting of fbc->lock more.
> > 
> > 
> 
> Yep.  The frequency of modification of fbc->count is of the order of a
> tenth or a hundredth of the frequency of
> precpu_counter_<modification>() calls.
> 
> But in many cases the frequency of percpu_counter_read() calls is far
> far less than this.  For example, the percpu_counter_read() may only
> happen when userspace polls a /proc file.
> 
> 

The global counter is is much more frequently accessed with delalloc.:(

With delayed allocation, we have to do read the free blocks counter  at
each write_begin(),  to make sure there is enough free blocks to do
block reservation to prevent lately writepages returns ENOSPC.

Mingming

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

WARNING: multiple messages have this Message-ID (diff)
From: Mingming Cao <cmm@us.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	tytso@mit.edu, sandeen@redhat.com, linux-ext4@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH -V3 01/11] percpu_counters: make fbc->count read atomic on 32 bit architecture
Date: Thu, 28 Aug 2008 15:59:46 -0700	[thread overview]
Message-ID: <1219964386.6384.63.camel@mingming-laptop> (raw)
In-Reply-To: <20080827210925.b4846037.akpm@linux-foundation.org>


在 2008-08-27三的 21:09 -0700,Andrew Morton写道:
> On Thu, 28 Aug 2008 09:22:00 +0530 "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:
> 
> > On Wed, Aug 27, 2008 at 02:22:50PM -0700, Andrew Morton wrote:
> > > On Wed, 27 Aug 2008 23:01:52 +0200
> > > Peter Zijlstra <a.p.zijlstra@chello.nl> wrote:
> > > 
> > > > > 
> > > > > > +static inline s64 percpu_counter_read(struct percpu_counter *fbc)
> > > > > > +{
> > > > > > +	return fbc_count(fbc);
> > > > > > +}
> > > > > 
> > > > > This change means that a percpu_counter_read() from interrupt context
> > > > > on a 32-bit machine is now deadlockable, whereas it previously was not
> > > > > deadlockable on either 32-bit or 64-bit.
> > > > > 
> > > > > This flows on to the lib/proportions.c, which uses
> > > > > percpu_counter_read() and also does spin_lock_irqsave() internally,
> > > > > indicating that it is (or was) designed to be used in IRQ contexts.
> > > > 
> > > > percpu_counter() never was irq safe, which is why the proportion stuff
> > > > does all the irq disabling bits by hand.
> > > 
> > > percpu_counter_read() was irq-safe.  That changes here.  Needs careful
> > > review, changelogging and, preferably, runtime checks.  But perhaps
> > > they should be inside some CONFIG_thing which won't normally be done in
> > > production.
> > > 
> > > otoh, percpu_counter_read() is in fact a rare operation, so a bit of
> > > overhead probably won't matter.
> > > 
> > > (write-often, read-rarely is the whole point.  This patch's changelog's
> > > assertion that "Since fbc->count is read more frequently and updated
> > > rarely" is probably wrong.  Most percpu_counters will have their
> > > fbc->count modified far more frequently than having it read from).
> > 
> > we may actually be doing percpu_counter_add. But that doesn't update
> > fbc->count. Only if the local percpu values cross FBC_BATCH we update
> > fbc->count. If we are modifying fbc->count more frequently than
> > reading fbc->count then i guess we would be contenting of fbc->lock more.
> > 
> > 
> 
> Yep.  The frequency of modification of fbc->count is of the order of a
> tenth or a hundredth of the frequency of
> precpu_counter_<modification>() calls.
> 
> But in many cases the frequency of percpu_counter_read() calls is far
> far less than this.  For example, the percpu_counter_read() may only
> happen when userspace polls a /proc file.
> 
> 

The global counter is is much more frequently accessed with delalloc.:(

With delayed allocation, we have to do read the free blocks counter  at
each write_begin(),  to make sure there is enough free blocks to do
block reservation to prevent lately writepages returns ENOSPC.

Mingming


  reply	other threads:[~2008-08-28 22:59 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-27 15:28 [PATCH -V3 01/11] percpu_counters: make fbc->count read atomic on 32 bit architecture Aneesh Kumar K.V
2008-08-27 15:28 ` [PATCH -V3 02/11] ext4: Make sure all the block allocation paths reserve blocks Aneesh Kumar K.V
2008-08-27 15:28   ` [PATCH -V3 03/11] ext4: Retry block reservation Aneesh Kumar K.V
2008-08-27 15:28     ` [PATCH -V3 04/11] ext4: Add percpu dirty block accounting Aneesh Kumar K.V
2008-08-27 15:28       ` [PATCH -V3 05/11] ext4: Switch to non delalloc mode when we are low on free blocks count Aneesh Kumar K.V
2008-08-27 15:28         ` [PATCH -V3 06/11] ext4: Update meta-data reservation with delalloc Aneesh Kumar K.V
2008-08-27 15:28           ` [PATCH -V3 07/11] ext4: request for blocks with ar.excepted_group = -1 Aneesh Kumar K.V
2008-08-27 15:28             ` [PATCH -V3 08/11] ext4: Signed arithematic fix Aneesh Kumar K.V
2008-08-27 15:28               ` [PATCH -V3 09/11] ext4: Fix ext4 nomballoc allocator for ENOSPC Aneesh Kumar K.V
2008-08-27 15:28                 ` [PATCH -V3 10/11] ext4: Add inode to journal handle after block allocation for ordered mode Aneesh Kumar K.V
2008-08-27 15:28                   ` [PATCH -V3 11/11] ext4: Retry block allocation if we have free blocks left Aneesh Kumar K.V
2008-08-28 21:57                 ` [PATCH -V3 09/11] ext4: Fix ext4 nomballoc allocator for ENOSPC Mingming Cao
2008-08-29  3:44                   ` Aneesh Kumar K.V
2008-08-29  4:14                     ` Aneesh Kumar K.V
2008-08-29  5:02                       ` Mingming Cao
2008-08-29  5:06                     ` Mingming Cao
2008-08-29  8:25                       ` Aneesh Kumar K.V
2008-08-28 21:04               ` [PATCH -V3 08/11] ext4: Signed arithematic fix Mingming Cao
2008-08-28 21:03             ` [PATCH -V3 07/11] ext4: request for blocks with ar.excepted_group = -1 Mingming Cao
2008-08-28 21:03           ` [PATCH -V3 06/11] ext4: Update meta-data reservation with delalloc Mingming Cao
2008-08-28 20:57         ` [PATCH -V3 05/11] ext4: Switch to non delalloc mode when we are low on free blocks count Mingming Cao
2008-08-28 20:56       ` [PATCH -V3 04/11] ext4: Add percpu dirty block accounting Mingming Cao
2008-10-09 20:44       ` Eric Sandeen
2008-10-10  4:52         ` Aneesh Kumar K.V
2008-10-10  4:58           ` Eric Sandeen
2008-10-11 21:10         ` Andreas Dilger
2008-08-28 20:42     ` [PATCH -V3 03/11] ext4: Retry block reservation Mingming Cao
2008-08-28 20:41   ` [PATCH -V3 02/11] ext4: Make sure all the block allocation paths reserve blocks Mingming Cao
2008-08-27 19:05 ` [PATCH -V3 01/11] percpu_counters: make fbc->count read atomic on 32 bit architecture Andrew Morton
2008-08-27 21:01   ` Peter Zijlstra
2008-08-27 21:22     ` Andrew Morton
2008-08-28  3:52       ` Aneesh Kumar K.V
2008-08-28  4:09         ` Andrew Morton
2008-08-28 22:59           ` Mingming Cao [this message]
2008-08-28 22:59             ` Mingming Cao
2008-08-28  7:57       ` Peter Zijlstra
2008-08-28  3:48   ` Aneesh Kumar K.V
2008-08-28  4:06     ` Andrew Morton
2008-08-28 14:19       ` Nick Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1219964386.6384.63.camel@mingming-laptop \
    --to=cmm@us.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.vnet.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=sandeen@redhat.com \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.