linux-ext4.vger.kernel.org archive mirror
From: Rogier Wolff <R.E.Wolff@BitWizard.nl>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Florian Weimer <fweimer@bfk.de>,
	Eric Sandeen <sandeen@redhat.com>,
	Phillip Susi <psusi@cfl.rr.com>,
	"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: Large directories and poor order correlation
Date: Tue, 15 Mar 2011 14:33:27 +0100
Message-ID: <20110315133327.GG22577@bitwizard.nl>
In-Reply-To: <4C11D2E5-75CD-4A9F-A534-EEC16CDD836B@mit.edu>

On Tue, Mar 15, 2011 at 07:06:34AM -0400, Theodore Tso wrote:
> So in an absolute cold cache situations, what I'd recommend is
> readdir, sort by inode, FIEMAP, sort by block, and then read in the
> dpkg files.  Of course an RPM partisan might say, "it would help if
> you guys had used a real database instead of ab(using) the file
> system.  And then the dpkg guys could complain about what happens
> when RPM has to deal with corrupted rpm database, and how this
> allows dpkg to use shell scripts to access their package
> information.  Life is full of tradeoffs.

IMHO, the most important part is "up to and including the stat". It
should be possible to get the directory and inode info all inside the
same "16MB" part of the disk. After a few seeks, the rest of the
accesses would then come from the disk's cache.

This would mean that you should allocate directory blocks from the end
of the PREVIOUS block group....
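
To make the "up to and including the stat" part concrete, here is a
minimal sketch of the first two steps Ted lists (readdir, sort by
inode) followed by the stat calls. It assumes Linux, where readdir
already returns the inode number, so nothing has to be stat'ed before
sorting. The function name stat_in_inode_order is made up for
illustration, and the FIEMAP / sort-by-block steps are left out.

```python
import os

def stat_in_inode_order(path):
    """Stat every entry in `path`, visiting entries in inode-number order.

    Returns a list of (inode, name, stat_result) tuples, sorted by the
    inode number reported by readdir.
    """
    # readdir step: os.scandir exposes the d_ino from each directory
    # entry, so collecting (inode, name) needs no stat call on Linux.
    with os.scandir(path) as it:
        entries = [(entry.inode(), entry.name) for entry in it]

    # Sort by inode number, so the stat calls below walk the inode
    # tables roughly sequentially instead of in hash order.
    entries.sort()

    results = []
    for ino, name in entries:
        # lstat, so symlinks are not followed to some other part of the disk.
        st = os.lstat(os.path.join(path, name))
        results.append((ino, name, st))
    return results
```

Whether this saves seeks depends on where the directory blocks sit
relative to the inode tables, which is the allocation question above.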

	Roger. 

-- 
** R.E.Wolff@BitWizard.nl ** http://www.BitWizard.nl/ ** +31-15-2600998 **
**    Delftechpark 26 2628 XH  Delft, The Netherlands. KVK: 27239233    **
*-- BitWizard writes Linux device drivers for any device you may have! --*
Q: It doesn't work. A: Look buddy, doesn't work is an ambiguous statement. 
Does it sit on the couch all day? Is it unemployed? Please be specific! 
Define 'it' and what it isn't doing. --------- Adapted from lxrbot FAQ
