From: Avi Kivity <avi@redhat.com>
To: linux-btrfs@vger.kernel.org
Subject: packing structures and numbers
Date: Fri, 03 Oct 2008 14:42:50 +0300
Message-ID: <48E6053A.8010100@redhat.com>

I've been reading btrfs's on-disk format, and two things caught my eye:
- __attribute__((packed)) structures everywhere, often with misaligned
fields. This conserves space, but can be harmful to in-memory
performance on some archs.
- le64's everywhere. This scales nicely, but wastes space. My home
directory is unlikely to have more than 4G objects or 4GB extents (let
alone >2 devices).
I think the two issues can be improved by separating the on-disk format
and the in-memory structure, and by using uleb128 as the on-disk format
for numbers. uleb128 is a variable-length format that encodes 7 bits of
a number in each byte, using the eighth bit to flag whether another
byte follows.
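
A minimal sketch of what such an encoder and decoder could look like, assuming the usual ULEB128 convention (as in DWARF) where the high bit is set on every byte except the last. The names uleb128_encode and uleb128_decode are illustrative only and are not taken from btrfs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Encode v into buf, least significant 7 bits first.
 * buf must have room for 10 bytes (the maximum for a 64-bit value).
 * Returns the number of bytes written.
 */
static size_t uleb128_encode(uint64_t v, uint8_t *buf)
{
	size_t n = 0;

	do {
		uint8_t byte = v & 0x7f;	/* low 7 bits of what is left */

		v >>= 7;
		if (v)
			byte |= 0x80;		/* high bit set: more bytes follow */
		buf[n++] = byte;
	} while (v);

	return n;
}

/*
 * Decode one value from buf, reading at most len bytes.
 * Returns the number of bytes consumed, or 0 if the input is
 * truncated or longer than 10 bytes. Excess high bits in a tenth
 * byte are silently dropped in this sketch.
 */
static size_t uleb128_decode(const uint8_t *buf, size_t len, uint64_t *out)
{
	uint64_t v = 0;
	unsigned int shift = 0;
	size_t n = 0;

	while (n < len && shift < 64) {
		uint8_t byte = buf[n++];

		v |= (uint64_t)(byte & 0x7f) << shift;
		if (!(byte & 0x80)) {		/* high bit clear: last byte */
			*out = v;
			return n;
		}
		shift += 7;
	}

	return 0;
}
```

With this convention, values below 128 take one byte, and a full 64-bit value takes ten.
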
So, for example:

struct btrfs_disk_key {
	__le64 objectid;
	u8 type;
	__le64 offset;
} __attribute__ ((__packed__));

With 1M objectids, and 1T offsets, this reduces in size from 17 bytes to
10 bytes. Most other structures show similar gains. We can also have
more than 256 types if the need arises.
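
The 17-to-10 figure can be checked with a short sketch, assuming standard ULEB128 (7 payload bits per byte) and a type field that is also encoded as ULEB128. The helpers uleb128_len and disk_key_len are hypothetical names for illustration, not btrfs functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes ULEB128 needs for v: one byte per started group of 7 bits. */
static size_t uleb128_len(uint64_t v)
{
	size_t n = 1;

	while (v >= 0x80) {
		v >>= 7;
		n++;
	}
	return n;
}

/* Hypothetical: encoded size of a key whose three fields are all ULEB128. */
static size_t disk_key_len(uint64_t objectid, uint64_t type, uint64_t offset)
{
	return uleb128_len(objectid) + uleb128_len(type) + uleb128_len(offset);
}
```

For an objectid just under 1M (20 bits, 3 bytes), a type below 128 (1 byte), and an offset just under 1T (40 bits, 6 bytes), disk_key_len returns 10, against 8 + 1 + 8 = 17 for the packed fixed-width layout. The flip side is that a key with full 64-bit objectid and offset grows to 10 + 1 + 10 = 21 bytes.
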
There are, of course, disadvantages to switching to uleb128:
- need to write encode and decode functions, which is tedious. This can
be automated a la xdr.
- increased cpu utilization for decoding and encoding
- can no longer know the size of the in-memory structures in advance
- it's just wonderful to rewrite the entire disk format so close to
freezing it
The advantages, IMO, outweigh the disadvantages:
- better packing reduces tree depth and therefore seekage, the most
important cost on rotating media
- the disk format is infinitely growable
- in-memory format is more efficient for archs which prefer aligned accesses
I'm not volunteering to do this, but please consider this proposal.
--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.