Re: ENOSPC returned during writepages

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: "Theodore Ts'o" <tytso@mit.edu>, Andreas Dilger <adilger@sun.com>,
	ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: ENOSPC returned during writepages
Date: Thu, 21 Aug 2008 23:01:21 +0530
Message-ID: <20080821173121.GF6509@skywalker>
In-Reply-To: <1219338457.6342.15.camel@mingming-laptop>

On Thu, Aug 21, 2008 at 10:07:37AM -0700, Mingming Cao wrote:
> 
> On Thu, 2008-08-21 at 22:15 +0530, Aneesh Kumar K.V wrote:
> > On Wed, Aug 20, 2008 at 11:13:39AM +0530, Aneesh Kumar K.V wrote:
> > > Hi,
> > > 
> > > I am getting this even with the latest patch queue. The test program is
> > > a modified fsstress with fallocate support.
> > > 
> > > mpage_da_map_blocks block allocation failed for inode 377954 at logical
> > > offset 313 with max blocks 4 with error -28
> > > mpage_da_map_blocks block allocation failed for inode 336367 at logical
> > > offset 74 with max blocks 9 with error -28
> > > mpage_da_map_blocks block allocation failed for inode 345560 at logical
> > > offset 542 with max blocks 7 with error -28
> > > This should not happen.!! Data will be lost
> > > mpage_da_map_blocks block allocation failed for inode 355317 at logical
> > > offset 152 with max blocks 10 with error -28
> > > This should not happen.!! Data will be lost
> > > mpage_da_map_blocks block allocation failed for inode 395261 at logical
> > > offset 462 with max blocks 1 with error -28
> > > This should not happen.!! Data will be lost
> > > mpage_da_map_blocks block allocation failed for inode 323784 at logical
> > > offset 313 with max blocks 11 with error -28
> > > This should not happen.!! Data will be lost
> > > 
> > 
> > With this patch I am not seeing the error. It does the following:
> > 
> > a) use ext4_claim_free_blocks, which also updates the free blocks count
> > b) later, after block allocation, update the free blocks count if we
> > allocated less in non-delayed mode
> > c) switch to non-delayed mode if we are low on free blocks.
> > 
> I sent a patch to do c) yesterday. I noticed that we can't switch to
> non-delayed mode if the inode already has some delalloc dirty pages.
> 
> 
> > @@ -2462,11 +2464,21 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
> >  	unsigned from, to;
> >  	struct inode *inode = mapping->host;
> >  	handle_t *handle;
> > +	s64 free_blocks;
> > +	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> > 
> >  	index = pos >> PAGE_CACHE_SHIFT;
> >  	from = pos & (PAGE_CACHE_SIZE - 1);
> >  	to = from + len;
> > 
> > +	free_blocks = percpu_counter_read_positive(&sbi->s_freeblocks_counter);
> > +	if (free_blocks < (4 * (FBC_BATCH * nr_cpu_ids))) {
> > +		/* switch to non delalloc mode */
> > +		*fsdata = (void *)1;
> > +		return ext4_write_begin(file, mapping, pos,
> > +					len, flags, pagep, fsdata);
> > +	}
> > +	*fsdata = (void *)0;
> 
> No, calling ext4_write_begin() directly won't work, as it starts a
> handle, does the block allocation, and leaves the handle open. It
> expects the write_end aop to later file the inode on the transaction
> list and close that handle.
> 
> With your change, the write_end aop still points to
> ext4_da_write_end(), which doesn't match ext4_write_begin(). We need
> to switch the write_begin/write_end aop function pointers together.
> 

My patch does it in a simple way. I am attaching only the
switch-to-non-delalloc patch below. It is still being tested.
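
To make the begin/end pairing concern concrete, here is a minimal
user-space sketch of the idea: write_begin records which path it took in
the opaque fsdata cookie, and write_end dispatches on that cookie so the
two halves always match. Everything named demo_* or PATH_* is
hypothetical and only models the control flow; it is not the kernel code
and does not touch journal handles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical labels for the two write paths (not kernel names). */
enum demo_write_path { PATH_DELALLOC = 0, PATH_NONDELALLOC = 1 };

/*
 * Model of write_begin: pick a path from the free-block estimate and
 * remember the choice in *fsdata for the matching write_end call.
 * 'threshold' is an assumed parameter standing in for the low-space limit.
 */
int demo_write_begin(int64_t free_blocks, int64_t threshold, void **fsdata)
{
	assert(fsdata != NULL);
	if (free_blocks < threshold) {
		*fsdata = (void *)(uintptr_t)PATH_NONDELALLOC;
		return PATH_NONDELALLOC;
	}
	*fsdata = (void *)(uintptr_t)PATH_DELALLOC;
	return PATH_DELALLOC;
}

/*
 * Model of write_end: recover the choice from the cookie. The cast goes
 * through uintptr_t because casting a pointer straight to int can
 * truncate (and warn) on 64-bit builds.
 */
int demo_write_end(void *fsdata)
{
	if ((uintptr_t)fsdata == PATH_NONDELALLOC)
		return PATH_NONDELALLOC;
	return PATH_DELALLOC;
}
```

Calling demo_write_begin() with a free-block estimate below the
threshold and then passing the resulting cookie to demo_write_end()
selects the non-delalloc path in both halves; above the threshold both
halves stay on the delalloc path.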

commit 490b69bd47b9ea27b1bb86bbdfb85a2911047149
Author: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Date:   Thu Aug 21 22:10:50 2008 +0530

    switch to non-delalloc

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d965a05..087abca 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1030,11 +1030,17 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
 	mdb_free = EXT4_I(inode)->i_reserved_meta_blocks - mdb;
 
-	/* Account for allocated meta_blocks */
-	mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
+	if (mdb_free) {
+		/* Account for allocated meta_blocks */
+		mdb_free -= EXT4_I(inode)->i_allocated_meta_blocks;
 
-	/* update fs free blocks counter for truncate case */
-	percpu_counter_add(&sbi->s_freeblocks_counter, mdb_free);
+		/*
+		 * We have reserved more blocks.
+		 * Now free the extra blocks reserved
+		 */
+		percpu_counter_add(&sbi->s_freeblocks_counter, mdb_free);
+		EXT4_I(inode)->i_allocated_meta_blocks = 0;
+	}
 
 	/* update per-inode reservations */
 	BUG_ON(used  > EXT4_I(inode)->i_reserved_data_blocks);
@@ -1042,7 +1048,6 @@ static void ext4_da_update_reserve_space(struct inode *inode, int used)
 
 	BUG_ON(mdb > EXT4_I(inode)->i_reserved_meta_blocks);
 	EXT4_I(inode)->i_reserved_meta_blocks = mdb;
-	EXT4_I(inode)->i_allocated_meta_blocks = 0;
 	spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
 }
 
@@ -2459,11 +2464,21 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
 	unsigned from, to;
 	struct inode *inode = mapping->host;
 	handle_t *handle;
+	s64 free_blocks;
+	struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
 
 	index = pos >> PAGE_CACHE_SHIFT;
 	from = pos & (PAGE_CACHE_SIZE - 1);
 	to = from + len;
 
+	free_blocks = percpu_counter_read_positive(&sbi->s_freeblocks_counter);
+	if (free_blocks < (4 * (FBC_BATCH * nr_cpu_ids))) {
+		/* switch to non delalloc mode */
+		*fsdata = (void *)1;
+		return ext4_write_begin(file, mapping, pos,
+					len, flags, pagep, fsdata);
+	}
+	*fsdata = (void *)0;
 retry:
 	/*
 	 * With delayed allocation, we don't log the i_disksize update
@@ -2532,6 +2547,19 @@ static int ext4_da_write_end(struct file *file,
 	handle_t *handle = ext4_journal_current_handle();
 	loff_t new_i_size;
 	unsigned long start, end;
+	int low_free_blocks = (int)fsdata;
+
+	if (low_free_blocks) {
+		if (ext4_should_order_data(inode)) {
+			return ext4_ordered_write_end(file, mapping, pos,
+					len, copied, page, fsdata);
+		} else if (ext4_should_writeback_data(inode)) {
+			return ext4_writeback_write_end(file, mapping, pos,
+					len, copied, page, fsdata);
+		} else {
+			BUG();
+		}
+	}
 
 	start = pos & (PAGE_CACHE_SIZE - 1);
 	end = start + copied -1;
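
For readers following the threshold in ext4_da_write_begin() above: an
approximate per-CPU counter read can lag the true value, since each CPU
may hold up to one batch of updates that has not yet been folded into
the global count. The sketch below models the check
free_blocks < 4 * (FBC_BATCH * nr_cpu_ids) in user space. The demo_*
names are hypothetical, batch and ncpus are plain parameters rather than
the kernel constants, reading the factor of 4 as a safety margin is an
assumption, and clamping negative reads to zero is a simplification of
what percpu_counter_read_positive() does.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case lag of an approximate read: one unflushed batch per CPU. */
int64_t demo_max_drift(int64_t batch, int64_t ncpus)
{
	assert(batch > 0 && ncpus > 0);
	return batch * ncpus;
}

/* Low-space limit used by the patch: four times the worst-case lag. */
int64_t demo_low_space_threshold(int64_t batch, int64_t ncpus)
{
	return 4 * demo_max_drift(batch, ncpus);
}

/*
 * Returns 1 when the approximate free-block count is below the limit,
 * i.e. when the write path would fall back to non-delayed allocation.
 */
int demo_is_low_on_space(int64_t approx_free, int64_t batch, int64_t ncpus)
{
	if (approx_free < 0)	/* simplification: treat negative reads as 0 */
		approx_free = 0;
	return approx_free < demo_low_space_threshold(batch, ncpus);
}
```

With an assumed batch of 32 and 4 CPUs the limit is 512 blocks, so an
estimate of 511 takes the fallback path and 512 does not.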