public inbox for linux-ext4@vger.kernel.org
From: Andreas Dilger <adilger@sun.com>
To: Mingming Cao <cmm@us.ibm.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	linux-ext4@vger.kernel.org
Subject: Re: [RFC][PATCH] ext4: Convert uninitialized extent to initialized extent in case of file system full
Date: Fri, 29 Feb 2008 10:05:27 -0800
Message-ID: <20080229180527.GD2997@webber.adilger.int>
In-Reply-To: <1204240440.3609.26.camel@localhost.localdomain>

On Feb 28, 2008  15:14 -0800, Mingming Cao wrote:
> On Thu, 2008-02-28 at 23:35 +0530, Aneesh Kumar K.V wrote:
> A write to the prealloc area causes an uninitialized extent to be split
> into an initialized and an uninitialized extent. If we don't have space
> to add the new extent information, then instead of returning an error,
> convert the existing uninitialized extent to an initialized one. We need
> to zero out the blocks corresponding to the extent to prevent wrong data
> from reaching userspace.

> > +/* FIXME!! we need to try to merge to left or right after zeroout  */
> > +static int ext4_ext_zeroout(handle_t *handle, struct inode *inode,
> > +				ext4_lblk_t iblock, struct ext4_extent *ex)
> > +{
> > +}
> > +
> 
> The complexity added to the code to handle this corner case does not
> seem worth the effort.
> 
> One simple solution is to submit a bio directly to zero out the blocks
> on disk, and wait for that to finish before clearing the uninitialized
> bit. With a 4K block size, the maximum size of an uninitialized extent
> is 128MB, and since the blocks are all contiguous on disk, a single IO
> could do the job, so the latency should not be too big an issue. After
> all, when a filesystem is full, it already performs slowly.

Further to Mingming's comments:
- you can map the ZERO_PAGE to every entry in the bio, which will avoid
  the very significant problem of needing 128MB of pages to zero out the
  extent
- make sure you limit the extent size to BIO_MAX_PAGES
- submitting large bios to the block layer is MUCH more efficient than
  adding pages to the page cache because the block device can do a very
  good job of writing this out
- make sure you wait for bio completion before you allow the block IO
  to begin.  In Lustre we did this by passing a waitq and our own
  completion function to the bio and having the caller sleep until
  the bio completion function is called.  Note that the completion
  function may be called multiple times if there are block errors.
- zeroing out pages in the page cache is very dangerous because they
  may already have dirty data in them.
- please make a helper function like "ext4_zero_blocks()" because at
  some point in the future I'd like to add the ability to have the kernel
  zero out inode table blocks for filesystems formatted with
  "-O uninit_groups,lazy_bg"
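
To make the BIO_MAX_PAGES point concrete, here is a rough userspace
sketch of the chunking arithmetic only.  It is not kernel code and not
part of the patch.  The values 256 pages per bio and 32767 blocks per
uninitialized extent are assumptions for illustration, as is a block
size equal to the page size, and every sketch_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed values for illustration only; none of them come from the patch. */
#define SKETCH_BLOCK_SIZE     4096u   /* 4K filesystem block == one page */
#define SKETCH_BIO_MAX_PAGES  256u    /* stand-in for the kernel's BIO_MAX_PAGES */
#define SKETCH_UNINIT_MAX_LEN 32767u  /* assumed max uninitialized extent length */

/* One bio's worth of work: a contiguous run of physical blocks to zero. */
struct sketch_zero_chunk {
	uint64_t start_block;  /* first physical block covered by this bio */
	uint32_t nr_blocks;    /* number of blocks (== pages) in this bio */
};

/* Bytes covered by an extent of ext_len blocks. */
static uint64_t sketch_extent_bytes(uint32_t ext_len)
{
	return (uint64_t)ext_len * SKETCH_BLOCK_SIZE;
}

/* Number of bios needed when each carries at most SKETCH_BIO_MAX_PAGES. */
static uint32_t sketch_nr_bios(uint32_t ext_len)
{
	/* Round up; 64-bit arithmetic avoids overflow in the addition. */
	return (uint32_t)(((uint64_t)ext_len + SKETCH_BIO_MAX_PAGES - 1) /
			  SKETCH_BIO_MAX_PAGES);
}

/*
 * Describe the idx-th bio for an extent starting at physical block pblk.
 * Returns 0 and fills *out, or -1 if idx is past the end of the extent.
 */
static int sketch_chunk(uint64_t pblk, uint32_t ext_len, uint32_t idx,
			struct sketch_zero_chunk *out)
{
	uint64_t first = (uint64_t)idx * SKETCH_BIO_MAX_PAGES;
	uint64_t left;

	assert(out != NULL);
	if (first >= ext_len)
		return -1;

	left = ext_len - first;
	out->start_block = pblk + first;
	out->nr_blocks = left < SKETCH_BIO_MAX_PAGES ? (uint32_t)left
						     : SKETCH_BIO_MAX_PAGES;
	return 0;
}
```

Under these assumptions a maximum-length extent covers just under 128MB
and splits into 128 bios, the last one carrying 255 blocks.  In the
kernel each page slot of each bio would point at the same zero page, so
no extra memory is needed beyond the bio structures themselves.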

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

