From: Linda Walsh <xfs@tlinx.org>
To: David Chinner <dgc@sgi.com>
Cc: Jeff Breidenbach <jeff@jab.org>, xfs@oss.sgi.com
Subject: Re: tuning, many small files, small blocksize
Date: Mon, 18 Feb 2008 17:03:57 -0800
Message-ID: <47BA2AFD.2060409@tlinx.org>
In-Reply-To: <20080218235103.GW155407@sgi.com>
David Chinner wrote:
> That makes no sense. Inodes are *unique* - they are not shared with
> any other inode at all. Could you explain why you think that 256
> byte inodes are any different to larger inodes in this respect?
---
Sorry for being unclear. It seems to me that if the
minimum physical block size on disk is 512 bytes, then either a
256-byte inode shares that block with another inode, or you are
wasting 256 bytes on each inode. The latter interpretation doesn't
make logical sense.
If the minimum physical I/O size is larger than 512 bytes,
then I would assume even more *unique* inodes could be packed
in per block.
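The packing arithmetic above can be sketched in a few lines of Python.
This is only an illustration, assuming fixed-size inodes packed back to
back within a block; the function name is made up for the example and
is not an XFS interface.

```python
def inodes_per_block(block_size: int, inode_size: int) -> int:
    """Number of whole fixed-size inodes that fit in one block."""
    if block_size <= 0 or inode_size <= 0:
        raise ValueError("sizes must be positive")
    if inode_size > block_size:
        # Matches the constraint quoted later in this thread:
        # inode size can't be larger than the block size.
        raise ValueError("inode size cannot exceed block size")
    return block_size // inode_size


# 256-byte inodes in a few block sizes mentioned in this thread.
for block_size in (512, 2048, 4096):
    print(block_size, inodes_per_block(block_size, 256))
```

With a 512-byte block and 256-byte inodes this gives two inodes per
block, which is the sense in which a block is "shared" above.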
>
>> Remember, in xfs, if the last bit of left-over data in an inode will fit
>> into the inode, it can save a block-allocation, though I don't know
>> how this will affect speed.
>
> No, that's wrong. We never put data in inodes.
---
You mean file data, no? Don't directory and link data
get packed in? It has always gnawed at me why packing small
bits of data into the inode is disallowed for file data but not
for other types of data. How about extended attribute data? Is
it always allocated in separate data blocks as well, or can it
be stored in the inode if it fits? Why not include file data as
a type of data that could be packed into an inode? I'm sure there's
a good reason, but it seems other types of file system data can
be packed into inodes -- just not file data...or am I really
misinformed? :-)
>
>> Space-wise, a 2k block size and 1k-inode size might be good, but don't
>> know how that would affect performance.
>
> Inode size vs block size is pretty much irrelevant w.r.t performance,
> except for the fact inode size can't be larger than the block size.
----
If you have a small directory, can't it be stored in the inode?
Wouldn't that save some bit (or block) of I/O?
>
>> I'm sure you are familiar with mount options noatime,nodiratime -- same
>> concepts, but dir's are split out.
>
> noatime implies nodiratime.
----
Well dang...thanks! Ever since the nodiratime option came out,
I thought I had to specify it in addition. Now my fstabs can be
shorter!
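As a small illustration of the fstab cleanup this allows, here is a
sketch that drops a redundant nodiratime from a comma-separated option
string when noatime is already present. The helper name is hypothetical
and the function only edits strings; it does not read or write
/etc/fstab.

```python
def simplify_atime_options(options: str) -> str:
    """Drop 'nodiratime' when 'noatime' is present, since noatime implies it."""
    parts = [p for p in options.split(",") if p]
    if "noatime" in parts:
        parts = [p for p in parts if p != "nodiratime"]
    return ",".join(parts)


print(simplify_atime_options("defaults,noatime,nodiratime"))
```

An option string that has nodiratime without noatime is left alone,
since in that case the option still does something.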
>
>> Also, it depends on the situation, but sometimes flattening out the
>> directory structure can speed up lookup time.
>
> Like using large directory block sizes to make large directory
> btrees wider and flatter and therefore use less seeks for any given
> random directory lookup? ;)
---
Are you saying that directory entries are stored in sorted
order in a B-tree? Hmmm...
Well, I did say it depended on the situation. You are right
that time lost to seeks might overshadow time lost to the number of
blocks read in, and I'd think it depends on how the directories are
laid out on disk. In benchmarks, though, I've noticed larger
slowdowns when using more files per directory than when distributing
the same number of files among more directories. It could have been
something about my test setup, and I did not test with varying
directory block sizes.
Either I overlooked the naming-option size param or I was
limited to version=1 for some reason (don't remember when version=2
was added...)
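To make the "wider and flatter" point concrete, here is a rough sketch
of how the number of index levels shrinks as the block size grows. It
is a generic B-tree estimate, not the XFS on-disk format: the 8-byte
per-entry cost is an assumption for illustration, block headers are
ignored, and the function name is made up.

```python
def btree_levels(entries: int, fanout: int) -> int:
    """Smallest number of levels d such that fanout**d >= entries."""
    if entries < 1 or fanout < 2:
        raise ValueError("need entries >= 1 and fanout >= 2")
    levels = 1
    capacity = fanout
    while capacity < entries:
        capacity *= fanout
        levels += 1
    return levels


ENTRY_BYTES = 8  # assumed per-entry cost in an index block (hypothetical)

# Compare a small and a large directory block size for 10 million entries.
for dir_block_size in (4096, 65536):
    fanout = dir_block_size // ENTRY_BYTES
    print(dir_block_size, fanout, btree_levels(10_000_000, fanout))
```

Under these assumptions the larger block size needs one level fewer,
which would mean one seek fewer per random lookup.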
>
>> Sometime back someone did some benchmarks involving log size and it seemed
>> that 32768b(4k) or ~128Meg seemed optimal if memory serves me correctly.
>
> 128MB is the maximum size currently.
---
Maybe that's why it's optimal? :-)
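For what it's worth, the two figures in my earlier remark agree with
each other if "32768b(4k)" is read as 32768 blocks of 4 KiB each. That
reading is an assumption, and the helper below is just arithmetic, not
an mkfs.xfs interface.

```python
def log_size_mib(blocks: int, block_size: int) -> float:
    """Log size in MiB for a given block count and block size in bytes."""
    return blocks * block_size / (1024 * 1024)


print(log_size_mib(32768, 4096))  # 32768 * 4096 bytes = 128 MiB
```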
Thanks for the corrections...I appreciate it!
-l