Re: ENOSPC returned during writepages

Linux EXT4 FS development
 help / color / mirror / Atom feed

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: Theodore Tso <tytso@mit.edu>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: ENOSPC returned during writepages
Date: Thu, 21 Aug 2008 20:48:15 +0530	[thread overview]
Message-ID: <20080821151815.GD6509@skywalker> (raw)
In-Reply-To: <1219269325.7895.45.camel@mingming-laptop>

On Wed, Aug 20, 2008 at 02:55:25PM -0700, Mingming Cao wrote:
> 
> 在 2008-08-20三的 07:53 -0400，Theodore Tso写道：
> > On Wed, Aug 20, 2008 at 04:16:44PM +0530, Aneesh Kumar K.V wrote:
> > > > mpage_da_map_blocks block allocation failed for inode 323784 at logical
> > > > offset 313 with max blocks 11 with error -28
> > > > This should not happen.!! Data will be lost
> > 
> > We don't actually lose the data if free blocks are subsequently made
> > available, correct?
> > 
> 
> Well, I thought with Aneesh's new ext4_da_invalidate patch  in the patch
> queue, the dirty page get invalidate if ext4_da_writepages() could not
> successfully map/allocate blocks. That means we lost data:( 
> 
> I have a feeling that we did not try very hard before invalidate the
> dirty page which fail to map to disks. Perhaps we should try a few more
> times before give up. Also in that case, perhaps we should turn off
> delalloc fs wide, so the new writers won't take the subsequently made
> avaible free blocks away from this unlucky delalloc da writepages.

How do we try hard ? The mballoc already try had to allocate blocks. So I
am not sure what do we achieve by requesting for block allocation again.


> 
> > > I tried this patch. There are still multiple ways we can get wrong free
> > > block count. The patch reduced the number of errors. So we are doing
> > > better with patch. But I guess we can't use the percpu_counter based
> > > free block accounting with delalloc. Without delalloc it is ok even if
> > > we find some wrong free blocks count . The actual block allocation will fail in
> > > that case and we handle it perfectly fine. With delalloc we cannot
> > > afford to fail the block allocation. Should we look at a free block
> > > accounting rewrite using simple ext4_fsblk_t and and a spin lock ?
> > 
> > It would be a shame if we did given that the whole point of the percpu
> > counter was to avoid a scalability bottleneck.  Perhaps we could take
> > a filesystem-level spinlock only when the number of free blocks as
> > reported by the percpu_counter falls below some critical level?
> 
> Perhaps the  thresh hold should b higher, but other than that, the
> current ext4_has_free_blocks() code, does 1) get the freeblocks counter
> 2) if the counter < FBC_BATCH , it will call
> percpu_counter_sum_and_set(), which will take the per-cpu-counter lock,
> and do accurate accounting.
> 
> So after think again, I could not see what suggested above diffrent from
> what current ext4_has_free_blocks() does?
> 
> 
> Right now the ext4_has_free_blocks() uses the 
> 
> #define FBC_BATCH       (NR_CPUS*4)
> 
> as the thresh hold.  I thought that was good enough as
> ext4_da_reserve_space() only request 1 block at a time (called at
> write_begin time), but maybe I am wrong...
> 

I have right now threshold check as below.

+       /* Each CPU can accumulate FBC_BATCH blocks in their local
+        * counters. So we need to make sure we have free blocks more
+        * than FBC_BATCH  * nr_cpu_ids. Also add a window of 4 times.
+        */
+       if (free_blocks - (nblocks + root_blocks) <
+                                       (4 * (FBC_BATCH * nr_cpu_ids)))
{

-aneesh
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

next prev parent reply	other threads:[~2008-08-21 15:18 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-20  5:43 ENOSPC returned during writepages Aneesh Kumar K.V
2008-08-20 10:46 ` Aneesh Kumar K.V
2008-08-20 11:53   ` Theodore Tso
2008-08-20 18:27     ` Aneesh Kumar K.V
2008-08-20 21:35       ` Mingming Cao
2008-08-21 15:15         ` Aneesh Kumar K.V
2008-08-20 19:25     ` Andreas Dilger
2008-08-20 19:34       ` Theodore Tso
2008-08-20 20:56     ` Mingming Cao
2008-08-20 21:55       ` Theodore Tso
2008-08-20 22:02         ` Mingming Cao
2008-08-20 23:22       ` Mingming Cao
2008-08-20 23:42         ` Andreas Dilger
2008-08-20 23:58           ` Mingming Cao
2008-08-21  1:44             ` Andreas Dilger
2008-08-20 21:55     ` Mingming Cao
2008-08-21 15:18       ` Aneesh Kumar K.V [this message]
2008-08-21 15:35         ` Theodore Tso
2008-08-21 17:17           ` Mingming Cao
2008-08-23 11:12         ` Andreas Dilger
2008-08-21 15:12     ` Aneesh Kumar K.V
2008-08-21 16:56       ` Mingming Cao
2008-08-20 21:58 ` Mingming Cao
2008-08-21 15:09   ` Aneesh Kumar K.V
2008-08-21  5:06 ` Eric Sandeen
2008-08-21 16:45 ` Aneesh Kumar K.V
2008-08-21 17:07   ` Mingming Cao
2008-08-21 17:31     ` Aneesh Kumar K.V
2008-08-21 18:06       ` Mingming Cao

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080821151815.GD6509@skywalker \
    --to=aneesh.kumar@linux.vnet.ibm.com \
    --cc=cmm@us.ibm.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=tytso@mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox