From: Andreas Dilger <adilger@sun.com>
To: Theodore Tso <tytso@mit.edu>
Cc: Alex Tomas <bzzz@sun.com>, linux-ext4@vger.kernel.org
Subject: Re: Potential bug in mballoc --- reusing data blocks before txn commit
Date: Wed, 01 Oct 2008 00:17:03 -0700 [thread overview]
Message-ID: <20081001071703.GH3160@webber.adilger.int> (raw)
In-Reply-To: <20080930141559.GO10831@mit.edu>
Theodore Tso wrote:
> Yeah, I know Andrian Bunk strikes again.... but the right answer is
> to ressurect that code and add it back.
I submitted our patch to re-add this support yesterday.
On Sep 30, 2008 10:15 -0400, Theodore Ts'o wrote:
> On Tue, Sep 30, 2008 at 05:12:12PM +0400, Alex Tomas wrote:
> >> For ext4, the only reason to use a tree would be to allow us to merge
> >> deleted extents. This might not be worth the complexity, though, I
> >> admit it.
> >
> > strictly speaking, extents code should have merged them at allocation time.
>
> Sorry, I wasn't being clear enough. I was thinking of the scenario
> where the user runs "rm -r" and deletes a directory hierarchy with
> lots of small files. So the merging I was talking about was between
> blocks belonging to different files, so we can send a single large
> "trim" command to the disk.
I agree that this is probably most efficient. There would be an rbtree
per transaction, and would mostly be sparse.
> > btw, I've just remembered why I decided don't protect data from reallocation:
> > in data=writeback one can get block with stale data easily. and many people
> > (to my knowledge) were using data=writeback as performing better.
>
> Well, data=ordered is the default, so there would be many more people
> using data=ordered. If we think there is a significant advantage in
> not protecting data from reallocation besides the memory utilization,
> I suppose we could make protecting data being conditional on
> data=writeback. Perhaps having the additional data blocks available
> to the block allocator could allow it to make better decisions. Not
> sure it's worth it, though. Any thoughts?
I think for minimum complexity we should keep the rbtree all the time.
We would need it for the TRIM support in any case.
Having it enabled for ordered mode is fairly important, and I didn't
know that this little surprise was in the mballoc code at all. It may
explain some rare problems we've seen in the past.
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
prev parent reply other threads:[~2008-10-01 7:17 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-09-28 1:35 Potential bug in mballoc --- reusing data blocks before txn commit Theodore Ts'o
2008-09-29 20:21 ` Alex Tomas
2008-09-29 20:57 ` Theodore Tso
2008-09-29 21:04 ` Ric Wheeler
2008-09-29 23:00 ` Theodore Tso
2008-09-29 23:05 ` Ric Wheeler
2008-09-30 4:35 ` Alex Tomas
2008-09-30 13:02 ` Theodore Tso
2008-09-30 13:12 ` Alex Tomas
2008-09-30 14:15 ` Theodore Tso
2008-10-01 7:17 ` Andreas Dilger [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20081001071703.GH3160@webber.adilger.int \
--to=adilger@sun.com \
--cc=bzzz@sun.com \
--cc=linux-ext4@vger.kernel.org \
--cc=tytso@mit.edu \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox