From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: Theodore Tso <tytso@mit.edu>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: ENOSPC returned during writepages
Date: Thu, 21 Aug 2008 20:48:15 +0530
Message-ID: <20080821151815.GD6509@skywalker>
In-Reply-To: <1219269325.7895.45.camel@mingming-laptop>

On Wed, Aug 20, 2008 at 02:55:25PM -0700, Mingming Cao wrote:
> 
> On Wed, 2008-08-20 at 07:53 -0400, Theodore Tso wrote:
> > On Wed, Aug 20, 2008 at 04:16:44PM +0530, Aneesh Kumar K.V wrote:
> > > > mpage_da_map_blocks block allocation failed for inode 323784 at logical
> > > > offset 313 with max blocks 11 with error -28
> > > > This should not happen.!! Data will be lost
> > 
> > We don't actually lose the data if free blocks are subsequently made
> > available, correct?
> > 
> 
> Well, I thought that with Aneesh's new ext4_da_invalidate patch in the patch
> queue, the dirty pages get invalidated if ext4_da_writepages() cannot
> successfully map/allocate blocks. That means we lose data :(
> 
> I have a feeling that we did not try very hard before invalidating the
> dirty pages which failed to map to disk. Perhaps we should try a few more
> times before giving up. Also, in that case, perhaps we should turn off
> delalloc fs-wide, so that new writers won't take the subsequently made
> available free blocks away from this unlucky delalloc writepages call.

How do we try harder? mballoc already tries hard to allocate blocks, so I
am not sure what we would achieve by requesting block allocation again.


> 
> > > I tried this patch. There are still multiple ways we can get a wrong free
> > > block count. The patch reduced the number of errors, so we are doing
> > > better with the patch. But I guess we can't use the percpu_counter based
> > > free block accounting with delalloc. Without delalloc it is OK even if
> > > we see a wrong free block count: the actual block allocation will fail in
> > > that case and we handle it perfectly fine. With delalloc we cannot
> > > afford to fail the block allocation. Should we look at rewriting the free
> > > block accounting using a simple ext4_fsblk_t and a spin lock?
> > 
> > It would be a shame if we did given that the whole point of the percpu
> > counter was to avoid a scalability bottleneck.  Perhaps we could take
> > a filesystem-level spinlock only when the number of free blocks as
> > reported by the percpu_counter falls below some critical level?
> 
> Perhaps the threshold should be higher, but other than that, the
> current ext4_has_free_blocks() code 1) gets the free blocks counter and
> 2) if the counter < FBC_BATCH, calls
> percpu_counter_sum_and_set(), which takes the per-cpu counter lock
> and does accurate accounting.
> 
> So after thinking again, I could not see how what is suggested above differs
> from what the current ext4_has_free_blocks() does.
> 
> 
> Right now ext4_has_free_blocks() uses
> 
> #define FBC_BATCH       (NR_CPUS*4)
> 
> as the threshold. I thought that was good enough, as
> ext4_da_reserve_space() only requests 1 block at a time (called at
> write_begin time), but maybe I am wrong...
> 

Right now I have the threshold check as below.

+       /* Each CPU can accumulate FBC_BATCH blocks in their local
+        * counters. So we need to make sure we have free blocks more
+        * than FBC_BATCH  * nr_cpu_ids. Also add a window of 4 times.
+        */
+       if (free_blocks - (nblocks + root_blocks) <
+                                       (4 * (FBC_BATCH * nr_cpu_ids))) {
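
To make the reasoning behind the window concrete, here is a minimal
userspace sketch. It is not kernel code and not the actual patch. It
assumes a counter where each CPU keeps a local delta and only folds it
into the shared value once the delta reaches the batch size, so a cheap
read can be off by almost batch * ncpus. All model_* names and
MODEL_MAX_CPUS are hypothetical, and the final comparison in
model_has_free_blocks() is an assumption about what "enough free
blocks" means.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define MODEL_MAX_CPUS 64

struct model_counter {
	int64_t global;                /* value seen by a cheap, approximate read */
	int32_t local[MODEL_MAX_CPUS]; /* per-CPU deltas not yet folded into global */
	int32_t batch;                 /* fold a local delta once |delta| >= batch */
	unsigned int ncpus;
};

static void model_init(struct model_counter *c, int64_t value,
		       int32_t batch, unsigned int ncpus)
{
	assert(ncpus > 0 && ncpus <= MODEL_MAX_CPUS && batch > 0);
	c->global = value;
	c->batch = batch;
	c->ncpus = ncpus;
	for (unsigned int i = 0; i < ncpus; i++)
		c->local[i] = 0;
}

/* Add delta on behalf of one CPU; touch the shared value only on overflow. */
static void model_add(struct model_counter *c, unsigned int cpu, int32_t delta)
{
	assert(cpu < c->ncpus);
	int32_t v = c->local[cpu] + delta;

	if (v >= c->batch || v <= -c->batch) {
		c->global += v;
		c->local[cpu] = 0;
	} else {
		c->local[cpu] = v;
	}
}

/* Cheap read: ignores whatever is still sitting in the per-CPU deltas. */
static int64_t model_read(const struct model_counter *c)
{
	return c->global;
}

/* Exact read: walks every CPU (this is the expensive, serialized path). */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->global;

	for (unsigned int i = 0; i < c->ncpus; i++)
		sum += c->local[i];
	return sum;
}

/* Use the cheap read unless we are within the safety window of running out. */
static bool model_has_free_blocks(const struct model_counter *c,
				  int64_t nblocks, int64_t root_blocks)
{
	int64_t free_blocks = model_read(c);
	int64_t window = 4 * (int64_t)c->batch * c->ncpus;

	if (free_blocks - (nblocks + root_blocks) < window)
		free_blocks = model_sum(c);

	/* Assumption: the request fits if it does not exceed the free count. */
	return free_blocks >= nblocks + root_blocks;
}
```

For example, with a batch of 8 on 4 CPUs and 100 blocks free, if each
CPU consumes 7 blocks the cheap read still reports 100 while the exact
sum is 72. A request for 80 blocks would pass a check based on the cheap
read alone, but it falls inside the window, so the model takes the exact
sum and correctly refuses it.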

-aneesh