From: Mingming Cao <cmm@us.ibm.com>
To: Andreas Dilger <adilger@sun.com>
Cc: Theodore Tso <tytso@mit.edu>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
ext4 development <linux-ext4@vger.kernel.org>
Subject: Re: ENOSPC returned during writepages
Date: Wed, 20 Aug 2008 16:58:47 -0700
Message-ID: <1219276727.7895.69.camel@mingming-laptop>
In-Reply-To: <20080820234208.GO3392@webber.adilger.int>
On Wed, 2008-08-20 at 17:42 -0600, Andreas Dilger wrote:
> On Aug 20, 2008 16:22 -0700, Mingming Cao wrote:
> > ext4: fall back to non delalloc mode if filesystem is almost full
> > From: Mingming Cao <cmm@us.ibm.com>
> >
> > In the case of filesystem is close to full (free blocks is below
> > the watermark NRCPUS *4) and there is not enough to reserve blocks for
> > delayed allocation, instead of return user back with ENOSPC error, with
> > this patch, it tries to fall back to non delayed allocation mode.
>
> I don't think that making a low watermark of only 4 blocks is enough,
> because each of the per-CPU counters could be off by as much as FBC_BATCH.
> I think dropping delalloc support earlier is safer, something like
> (FBC_BATCH * NR_CPUS).
>
Okay, makes sense.
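To illustrate why the watermark should scale with FBC_BATCH * NR_CPUS, here is a minimal userspace sketch of a batched per-CPU counter. It is a model only, not the kernel's percpu_counter code: the struct and function names are hypothetical, and the values NR_CPUS = 4 and FBC_BATCH = 32 are assumptions chosen for the example.

```c
#include <assert.h>

#define NR_CPUS   4   /* assumed CPU count for this model */
#define FBC_BATCH 32  /* assumed batch size; the real kernel value may differ */

/* Hypothetical model of a batched counter: a cheap global value plus
 * per-CPU deltas that have not been folded into it yet. */
struct approx_counter {
    long global;
    long local[NR_CPUS];
};

/* Apply a delta on one CPU; fold into the global value only once the
 * local delta reaches the batch size in either direction. */
static void counter_add(struct approx_counter *c, int cpu, long delta)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    long v = c->local[cpu] + delta;
    if (v >= FBC_BATCH || v <= -FBC_BATCH) {
        c->global += v;
        c->local[cpu] = 0;
    } else {
        c->local[cpu] = v;
    }
}

/* Exact value: global plus every unfolded per-CPU delta. */
static long counter_exact(const struct approx_counter *c)
{
    long sum = c->global;
    for (int i = 0; i < NR_CPUS; i++)
        sum += c->local[i];
    return sum;
}

/* Watermark suggested in the thread: large enough to cover the worst-case
 * drift between the cheap global read and the exact value. */
static long delalloc_watermark(void)
{
    return (long)FBC_BATCH * NR_CPUS;
}

/* Decide from the cheap global read alone whether delalloc looks safe. */
static int delalloc_is_safe(const struct approx_counter *c)
{
    return c->global > delalloc_watermark();
}
```

In this model each CPU can hide up to FBC_BATCH - 1 blocks from the global value, so a watermark of only a few blocks per CPU can be smaller than the drift it is supposed to absorb.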
> > +static int ext4_write_begin_nondelalloc(struct file *file,
> > + struct address_space *mapping,
> > + loff_t pos, unsigned len, unsigned flags,
> > + struct page **pagep, void **fsdata)
> > +{
> > + struct inode *inode = mapping->host;
> > +
> > + /* turn off delalloc for this inode*/
> > + ext4_set_aops(inode, 0);
> > +
> > + return mapping->a_ops->write_begin(file, mapping, pos, len,
> > + flags, pagep, fsdata);
> > +}
>
> Hmm, I don't understand this - isn't delalloc already off here, because
> this is "ext4_write_begin_nondelalloc()"?
>
This function should probably be called
ext4_wb_fall_back_to_nondelalloc(). It is called when we detect ENOSPC
and try to fall back to non-delalloc mode. It eventually calls the
non-delalloc write_begin function, ext4_write_begin().
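The control flow described here (swap the ops table, then re-dispatch through it) can be sketched in userspace as follows. This is a model under stated assumptions, not the patch itself: every model_* type, the reservation_fails flag, and the handled_by field are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum handler { HANDLED_NONE, HANDLED_DELALLOC, HANDLED_NONDELALLOC };

struct model_mapping;

/* Hypothetical stand-in for an address-space operations table. */
struct model_aops {
    int (*write_begin)(struct model_mapping *m);
};

struct model_mapping {
    const struct model_aops *a_ops;
    int reservation_fails;    /* simulates ENOSPC on delalloc reservation */
    enum handler handled_by;  /* records which path served the write */
};

static int model_da_write_begin(struct model_mapping *m);

/* Non-delalloc path: allocate at write time (modelled as always succeeding). */
static int model_write_begin(struct model_mapping *m)
{
    m->handled_by = HANDLED_NONDELALLOC;
    return 0;
}

static const struct model_aops model_nondelalloc_aops = { model_write_begin };
static const struct model_aops model_da_aops = { model_da_write_begin };

/* Fallback wrapper: switch the mapping to the non-delalloc table, then
 * re-dispatch through it so the regular write_begin runs. */
static int model_wb_fall_back_to_nondelalloc(struct model_mapping *m)
{
    assert(m != NULL);
    m->a_ops = &model_nondelalloc_aops;
    return m->a_ops->write_begin(m);
}

/* Delalloc path: on a failed reservation, fall back instead of
 * returning an error to the caller. */
static int model_da_write_begin(struct model_mapping *m)
{
    if (m->reservation_fails)
        return model_wb_fall_back_to_nondelalloc(m);
    m->handled_by = HANDLED_DELALLOC;
    return 0;
}
```

Note that once the table is swapped, later writes on the same mapping go straight to the non-delalloc path in this model.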
> > +void ext4_set_aops(struct inode *inode, int delalloc)
> > {
> > + if (test_opt(inode->i_sb, DELALLOC)) {
> > + if (ext4_has_free_blocks(EXT4_SB(inode->i_sb),
> > + EXT4_MIN_FREE_BLOCKS) > EXT4_MIN_FREE_BLOCKS)
> > + delalloc = 0;
> > +
> > + if (delalloc) {
> > + inode->i_mapping->a_ops = &ext4_da_aops;
> > + return;
> > + } else
> > + printk(KERN_INFO "filesystem is close to full, "
> > + "delayed allocation is turned off for "
> > + " inode %lu\n", inode->i_ino);
> > + }
>
> Also, if you are doing this by changing the aops on the inode, isn't
> it possible that a large write starts outside the EXT4_MIN_FREE_BLOCKS
> boundary and then still runs out of space without changing the aops?
>
> Instead it is maybe better to do the check at the start of
> ext4_da_write_begin() and if it fails then call the non-delalloc
> write_begin from there?
>
Yeah, that's better.
But I realize there is a problem. I now think we can't fall back to
non-delalloc mode if the inode has any dirty pages in the page cache,
because those pages need the delalloc aops (->ext4_da_writepages()) to
handle delayed-allocation writeout.
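A rough userspace sketch of the per-write check, including the dirty-page constraint, might look like the following. All names are hypothetical, and the behaviour when the inode still has delalloc dirty pages (returning -ENOSPC) is an assumption made for the example; the thread does not settle what should happen in that case.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum write_path { PATH_DELALLOC, PATH_NONDELALLOC };

/* Hypothetical per-inode state for the model. */
struct model_inode {
    bool has_delalloc_dirty_pages;  /* pages still awaiting delayed allocation */
};

/* Hypothetical per-filesystem state for the model. */
struct model_sb {
    long free_blocks;
    long watermark;  /* e.g. FBC_BATCH * NR_CPUS, as suggested above */
};

/* Check free space at the start of every write rather than once when the
 * aops are set, so a write that crosses the watermark is still caught.
 * Returns 0 and stores the chosen path, or a negative errno. */
static int model_write_begin(const struct model_sb *sb,
                             const struct model_inode *inode,
                             long blocks_needed,
                             enum write_path *path)
{
    assert(sb != NULL && inode != NULL && path != NULL);

    /* Enough headroom: keep using delayed allocation. */
    if (sb->free_blocks - blocks_needed > sb->watermark) {
        *path = PATH_DELALLOC;
        return 0;
    }

    /* Assumption: with delalloc dirty pages outstanding we cannot switch
     * modes, so this model simply reports ENOSPC. */
    if (inode->has_delalloc_dirty_pages)
        return -ENOSPC;

    /* Not even an immediate allocation can succeed. */
    if (blocks_needed > sb->free_blocks)
        return -ENOSPC;

    *path = PATH_NONDELALLOC;
    return 0;
}
```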
> Cheers, Andreas
> --
> Andreas Dilger
> Sr. Staff Engineer, Lustre Group
> Sun Microsystems of Canada, Inc.
>