From: "David Dabbs" <david@dabbs.net>
To: 'David Masover' <ninja@slaphack.com>
Cc: reiserfs-list@namesys.com
Subject: RE: Fibration questions
Date: Tue, 20 Jul 2004 03:31:30 -0500
Message-ID: <20040720083108.73C4C15DCA@mail03.powweb.com>
In-Reply-To: <40FCC497.40308@slaphack.com>

> 
> Hans Reiser wrote:
> | David Masover wrote:
> |
> |> Why beyond?  Ask each fs object (without knowing its name), "What is
> |> your primary type?"  Put like-typed objects together.  Simple.
> |
> | Except that at look up time all you know is the name, and if the type is
> | not in the name then you cannot fibrate by it.
> 
> I must not understand fibration.  Do you have to know the fibration of
> an object to find it?
> 

Fibration is simply a way to physically group together the filesystem objects
you want grouped or, perhaps more likely, to keep them apart from some other
group. I muddied the waters by proposing that the fibration bits might also
be gainfully employed as a quick-and-dirty way to test whether an object has
a particular extension.

Here is the schematic for directory item keys, with which you are probably
familiar. The file /etc/foo will have the dirid of /etc in key.el[0], the
first 7 characters of the file name in key.el[1], and so on. Fibration, if
enabled, stores a 'fibre' in the 7 high bits of the second key element, so
it acts like a 'GROUP BY' for the file names that generate the same fibre
bits under the configured fibration plugin. The fibre is computed from the
name, so a lookup by name can rebuild the same key. See below.
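
As a rough illustration of the layout just described, the sketch below
composes the second key element from a fibre and the first 7 name bytes,
then recovers the fibre with a shift. It is a userspace approximation, not
reiser4 code: sketch_el1(), sketch_fibre_of() and sketch_maybe_ext1() are
hypothetical names, and the H bit is left at 0 for brevity. The last helper
shows the quick-and-dirty extension test, assuming the directory uses a
one-character-extension fibration.

```c
#include <stdint.h>
#include <stddef.h>

#define SKETCH_FIBRE_SHIFT 57	/* fibre lives in the 7 high bits of el[1] */

/*
 * Hypothetical helper: fibre (7 bits) | H (left 0 here) | prefix-1.
 * prefix-1 is the first 7 name bytes, zero padded, most significant first,
 * so that numeric order of the element follows name order within a fibre.
 */
static uint64_t sketch_el1(unsigned fibre, const char *name, size_t len)
{
	uint64_t prefix = 0;
	size_t i;

	for (i = 0; i < 7; i++) {
		unsigned char c = i < len ? (unsigned char)name[i] : 0;

		prefix = (prefix << 8) | c;
	}
	return ((uint64_t)(fibre & 0x7f) << SKETCH_FIBRE_SHIFT) | prefix;
}

/* Recover the 7 fibre bits from the second key element. */
static unsigned sketch_fibre_of(uint64_t el1)
{
	return (unsigned)(el1 >> SKETCH_FIBRE_SHIFT);
}

/*
 * Quick-and-dirty test: could this entry end in ".<ext>"?  Only meaningful
 * under an ext-1 style fibration, and only a hint: the default fibre 0 and
 * any character outside 7-bit ASCII can alias.
 */
static int sketch_maybe_ext1(uint64_t el1, char ext)
{
	return sketch_fibre_of(el1) == ((unsigned char)ext & 0x7fu);
}
```

Because the fibre sits above prefix-1, every key in fibre 1 compares greater
than every key in fibre 0 of the same directory, whatever the names are.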

David


/*
*   KEY ASSIGNMENT: PLAN A, LONG KEYS.
*
* DIRECTORY ITEMS
*
|       60     | 4 | 7 |1|   56        |        64        |        64      |
+--------------+---+---+-+-------------+------------------+----------------+
|    dirid     | 0 | F |H|  prefix-1   |    prefix-2      |  prefix-3/hash |
+--------------+---+---+-+-------------+------------------+----------------+
|                  |                   |                  |                |
|    8 bytes       |      8 bytes      |     8 bytes      |     8 bytes    |


 dirid         objectid of directory this item is for

 F             fibration, see fs/reiser4/plugin/fibration.[ch] (BELOW)

 H             1 if last 8 bytes of the key contain hash,
               0 if last 8 bytes of the key contain prefix-3

 prefix-1      first 7 characters of file name.
               Padded by zeroes if name is not long enough.

 prefix-2      next 8 characters of the file name.

 prefix-3      next 8 characters of the file name.

 hash          hash of the rest of the file name (i.e., the portion of the
               file name not included in prefix-1 and prefix-2).

 File names shorter than 23 (== 7 + 8 + 8) characters are completely encoded
 in the key. Such file names are called "short". They are distinguished by
 the H bit being 0 in the key.

 Other file names are "long". For a long name, the H bit is 1, and the first
 15 (== 7 + 8) characters are encoded in the prefix-1 and prefix-2 portions
 of the key. The last 8 bytes of the key are occupied by a hash of the
 remaining characters of the name.

 This key assignment reaches the following important goals:

     (1) directory entries are sorted in approximately lexicographical
     order.

     (2) collisions (when multiple directory items have the same key), while
     unavoidable in principle in a tree with fixed-length keys, are rare.
*/


/***************************** fibration.c ****************************/
/* Copyright 2004 by Hans Reiser, licensing governed by
 * reiser4/README */

/*
 * Suppose we have a directory tree with the sources of some project. During
 * compilation, .o files are created within this tree. This makes access
 * to the original source files less efficient, because the source files are
 * now "diluted" by object files: the default directory plugin uses a prefix
 * of the file name as part of the key for the directory entry (and this
 * part is also inherited by the key of the file body). This means that
 * foo.o will be located close to foo.c and foo.h in the tree.
 *
 * To avoid this effect, the directory plugin fills the highest 7 (originally
 * unused) bits of the second component of the directory entry key
 * with a bit pattern depending on the file name (see
 * fs/reiser4/kassign.c:build_entry_key_common()). These bits are called
 * the "fibre". The fibre of the file name key is inherited by the key of the
 * stat data and the keys of the file body (in the case of REISER4_LARGE_KEY).
 *
 * The fibre for a given file is chosen by the per-directory fibration
 * plugin. Names within a given fibre are ordered lexicographically.
 */

static const int fibre_shift = 57;

#define FIBRE_NO(n) (((__u64)(n)) << fibre_shift)

/*
 * Trivial fibration: all files of directory are just ordered
 * lexicographically.
 */
static __u64 fibre_trivial(const struct inode *dir, const char *name, int len)
{
	return FIBRE_NO(0);
}

/*
 * dot-o fibration: place .o files after all others.
 */
static __u64 fibre_dot_o(const struct inode *dir, const char *name, int len)
{
	/* special treatment for .+\.o (a bare ".o" stays in fibre 0) */
	if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
		return FIBRE_NO(1);
	else
		return FIBRE_NO(0);
}

/*
 * ext.1 fibration: subdivide the directory into 128 fibres, one for each
 * 7-bit extension character (file "foo.h" goes into fibre "h"), with
 * fibre 0 as the default for the rest.
 */
static __u64 fibre_ext_1(const struct inode *dir, const char *name, int len)
{
	if (len > 2 && name[len - 2] == '.')
		return FIBRE_NO(name[len - 1]);
	else
		return FIBRE_NO(0);
}

/*
 * ext.3 fibration: try to separate files with different 3-character
 * extensions from each other.
 */
static __u64 fibre_ext_3(const struct inode *dir, const char *name, int len)
{
	if (len > 4 && name[len - 4] == '.')
		return FIBRE_NO(name[len - 3] + name[len - 2] + name[len - 1]);
	else
		return FIBRE_NO(0);
}
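
To see the grouping effect of fibre_dot_o() above, here is a small userspace
sketch that computes the second key element for a few names and sorts on it.
It is only an approximation: sketch_entry, sketch_fibre_dot_o(),
sketch_prefix1() and sketch_sort_dir() are hypothetical names, kernel types
are replaced with stdint ones, and only the first 7 name bytes take part in
the comparison (real keys continue in the following elements).

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct sketch_entry {
	const char *name;
	uint64_t el1;		/* second key element: fibre | prefix-1 */
};

/* Same rule as fibre_dot_o(), minus the kernel types. */
static uint64_t sketch_fibre_dot_o(const char *name, size_t len)
{
	if (len > 2 && name[len - 1] == 'o' && name[len - 2] == '.')
		return (uint64_t)1 << 57;
	return 0;
}

/* First 7 name bytes, zero padded, most significant byte first. */
static uint64_t sketch_prefix1(const char *name, size_t len)
{
	uint64_t v = 0;
	size_t i;

	for (i = 0; i < 7; i++)
		v = (v << 8) | (i < len ? (unsigned char)name[i] : 0);
	return v;
}

/* Order entries by the numeric value of el1. */
static int sketch_cmp(const void *a, const void *b)
{
	uint64_t x = ((const struct sketch_entry *)a)->el1;
	uint64_t y = ((const struct sketch_entry *)b)->el1;

	return (x > y) - (x < y);
}

/* Fill in el1 for every entry, then sort the way the tree would order them. */
static void sketch_sort_dir(struct sketch_entry *e, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		size_t len = strlen(e[i].name);

		e[i].el1 = sketch_fibre_dot_o(e[i].name, len) |
			   sketch_prefix1(e[i].name, len);
	}
	qsort(e, n, sizeof(e[0]), sketch_cmp);
}
```

Given foo.o, foo.c, bar.o, foo.h and bar.c, the sorted order is bar.c, foo.c,
foo.h, bar.o, foo.o: the sources stay adjacent and the object files trail
them, still in name order within their own fibre.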



