* packing structures and numbers
@ 2008-10-03 11:42 Avi Kivity
  2008-10-03 12:22 ` Chris Mason
  2008-10-03 18:37 ` Zach Brown
  0 siblings, 2 replies; 12+ messages in thread
From: Avi Kivity @ 2008-10-03 11:42 UTC
  To: linux-btrfs

I've been reading btrfs's on-disk format, and two things caught my eye:

- __attribute__((packed)) structures everywhere, often with misaligned
fields.  This conserves space, but can be harmful to in-memory
performance on some archs.
- le64's everywhere.  This scales nicely, but wastes space.  My home
directory is unlikely to have more than 4G objects or 4GB extents (let
alone >2 devices).

I think the two issues can be improved by separating the on-disk format 
and the in-memory structure, and by using uleb128 as the on-disk format 
for numbers.  uleb128 is a variable-length format that encodes 7 bits of
a number in each byte, using the eighth bit to mark whether another byte
follows.
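
To make the encoding concrete, here is a minimal sketch of an encoder and
decoder.  It assumes the usual DWARF convention (low-order 7 bits first, high
bit set on every byte except the last); the function names and
ULEB128_MAX_LEN are made up for illustration and are not btrfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest encoding of a 64-bit value: ceil(64 / 7) = 10 bytes. */
#define ULEB128_MAX_LEN 10

/* Encode v into out (at least ULEB128_MAX_LEN bytes); returns bytes written. */
static size_t uleb128_encode(uint64_t v, uint8_t *out)
{
    size_t n = 0;

    assert(out != NULL);
    do {
        uint8_t byte = v & 0x7f;   /* low 7 bits of what remains */
        v >>= 7;
        if (v != 0)
            byte |= 0x80;          /* high bit set: more bytes follow */
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/*
 * Decode one number from in[0..len).  Returns bytes consumed, or 0 if the
 * input is truncated or runs past ULEB128_MAX_LEN bytes.  Payload bits
 * beyond bit 63 in the tenth byte are silently dropped.
 */
static size_t uleb128_decode(const uint8_t *in, size_t len, uint64_t *v)
{
    uint64_t result = 0;
    size_t n = 0;

    assert(in != NULL && v != NULL);
    while (n < len && n < ULEB128_MAX_LEN) {
        uint8_t byte = in[n];

        result |= (uint64_t)(byte & 0x7f) << (7 * n);
        n++;
        if (!(byte & 0x80)) {      /* high bit clear: last byte */
            *v = result;
            return n;
        }
    }
    return 0;
}
```

Under this convention, values below 128 take a single byte, and a full 64-bit
value takes ten.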

So, for example, consider the current disk key:

struct btrfs_disk_key {
    __le64 objectid;
    u8 type;
    __le64 offset;
} __attribute__ ((__packed__));

With objectids up to 1M and offsets up to 1T, encoding each field as
uleb128 shrinks this from 17 bytes to 10 bytes (3 + 1 + 6).  Most other
structures show similar gains.  We can also have more than 256 types if
the need arises.
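
The arithmetic behind the 10-byte figure can be checked with a small size
calculation.  This is only a sketch: it assumes all three fields, including
type, are stored as uleb128, and packed_key_size() is a hypothetical helper,
not an existing btrfs function.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold v when each byte carries 7 payload bits. */
static size_t uleb128_size(uint64_t v)
{
    size_t n = 1;

    while (v >>= 7)
        n++;
    return n;
}

/* Hypothetical helper: encoded size of a key (objectid, type, offset). */
static size_t packed_key_size(uint64_t objectid, uint64_t type, uint64_t offset)
{
    return uleb128_size(objectid) + uleb128_size(type) + uleb128_size(offset);
}
```

For objectid = 2^20, a type below 128, and offset = 2^40, this gives
3 + 1 + 6 = 10 bytes.  The flip side is the worst case: with both 64-bit
fields at their maximum and type = 255, the same key would need
10 + 2 + 10 = 22 bytes, more than the fixed 17.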

There are, of course, disadvantages to switching to uleb128:

- need to write encode and decode functions, which is tedious.  This can
be automated à la XDR.
- increased cpu utilization for decoding and encoding
- can no longer know the size of the in-memory structures in advance
- it's just wonderful to rewrite the entire disk format so close to 
freezing it
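
As a rough illustration of what a hand-written decode function might look
like, here is a sketch that unpacks a variable-length key into a naturally
aligned in-memory structure.  struct mem_key, get_uleb128() and
mem_key_decode() are invented for this example, and the byte layout assumes
the DWARF-style convention (low 7 bits first, high bit set when more bytes
follow).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form: naturally aligned, never written to disk. */
struct mem_key {
    uint64_t objectid;
    uint64_t offset;
    uint32_t type;      /* wider than u8, so more than 256 types would fit */
};

/* Read one uleb128 number from *p (bounded by end); returns 0 on success. */
static int get_uleb128(const uint8_t **p, const uint8_t *end, uint64_t *v)
{
    uint64_t result = 0;
    unsigned shift = 0;

    while (*p < end && shift < 64) {
        uint8_t byte = *(*p)++;

        result |= (uint64_t)(byte & 0x7f) << shift;
        if (!(byte & 0x80)) {
            *v = result;
            return 0;
        }
        shift += 7;
    }
    return -1;  /* truncated input, or more than 10 bytes */
}

/*
 * Decode objectid, type, offset in on-disk order.  Returns the number of
 * bytes consumed, or 0 on malformed input.  The on-disk size is only known
 * once decoding has finished.
 */
static size_t mem_key_decode(struct mem_key *key, const uint8_t *buf, size_t len)
{
    const uint8_t *p = buf;
    uint64_t type;

    assert(key != NULL && buf != NULL);
    if (get_uleb128(&p, buf + len, &key->objectid) ||
        get_uleb128(&p, buf + len, &type) || type > UINT32_MAX ||
        get_uleb128(&p, buf + len, &key->offset))
        return 0;
    key->type = (uint32_t)type;
    return (size_t)(p - buf);
}
```

Returning the consumed length is one way to cope with not knowing the size in
advance: the caller advances by that amount to reach the next item.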

The advantages, IMO, outweigh the disadvantages:

- better packing reduces tree depth and therefore seekage, the most 
important cost on rotating media
- the disk format is infinitely growable
- in-memory format is more efficient for archs which prefer aligned accesses

I'm not volunteering to do this, but please consider this proposal.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.




Thread overview: 12+ messages
2008-10-03 11:42 packing structures and numbers Avi Kivity
2008-10-03 12:22 ` Chris Mason
2008-10-03 13:40   ` Avi Kivity
2008-10-04  0:22   ` Daniel Phillips
2008-10-04  0:28     ` Chris Mason
2008-10-04  0:35       ` Daniel Phillips
2008-10-04  0:43       ` Daniel Phillips
2008-10-04  8:27         ` Avi Kivity
2008-10-03 18:37 ` Zach Brown
2008-10-04  8:31   ` Avi Kivity
2008-10-04 16:34     ` Andi Kleen
2008-10-05  5:31       ` Avi Kivity
