From: Dave Chinner <david@fromorbit.com>
To: Alex Elder <aelder@sgi.com>
Cc: xfs@oss.sgi.com
Subject: Re: [PATCH v4, 14/16] xfsprogs: metadump: fix duplicate handling once and for all
Date: Tue, 8 Mar 2011 11:50:17 +1100
Message-ID: <20110308005017.GC1956@dastard>
In-Reply-To: <1299519558.2578.322.camel@doink>

On Mon, Mar 07, 2011 at 11:39:18AM -0600, Alex Elder wrote:
> This is a case where I think I've solved a problem to death.
>
> The metadump code now stops rather than spinning forever in the face
> of finding no obfuscated name that hasn't already been seen.
> Instead, it simply gives up and passes the original name back to use
> without obfuscation.
>
> Unfortunately, as a result it actually creates entries with
> duplicate names in a directory (or inode attribute fork). And at
> least in the case of directories, xfs_mdrestore(8) will populate the
> directory it restores with duplicate entries. That even seems to
> work, but xfs_repair(8) does identify this as a problem and fixes it
> (by moving duplicates to "lost+found").
>
> This might have been OK, given that it was a rare occurrence. But
> it's possible, with short (5-character) names, for the obfuscation
> algorithm to come up with only a single possible alternate name,
> and I felt that was just not acceptable.
>
> This patch fixes all that by creating a way to generate alternate
> names directly from existing names by carefully flipping pairs of
> bits in the characters making up the name.
>
>
> The first change is that a name is only ever obfuscated once.
> If the obfuscated name can't be used, an alternate is computed
> based on that name rather than re-starting the obfuscation
> process. (Names shorter than 5 characters are still not
> obfuscated.)
>
> Second, once a name is selected for use (obfuscated or not), it is
> checked for duplicates. The name table is consulted to see if it
> has already been seen, and if it has, an alternate for that name is
> created (a different name of the same length that has the same hash
> value). That name is checked in the name table, and if it too is
> already there the process repeats until an unused one is found.
>
> Third, alternates are generated methodically rather than by
> repeatedly trying to come up with new random names. A sequence
> number uniquely defines a particular alternate name, given an
> existing name. (Note that some of those alternates aren't valid
> because they contain at least one unallowed character.)
>
> Finally, because all names are now maintained in the name table,
> and because of the way alternates are generated, it's actually
> possible for short names to get modified in order to avoid
> duplicates.
>
> The algorithm for doing all of this is pretty well explained in
> the comments in the code itself, so I'll avoid duplicating any
> more of that here.
>
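For anyone reading along, the pair-flipping idea can be sketched as follows. This is only an illustration, assuming the name hash behaves like "rotate the running hash left by 7, then XOR in the next byte". Under that model each bit of each character lands on one specific hash bit, so flipping two name bits that land on the same hash bit cancels out. The names sketch_hash(), hash_bit() and flip_pair() are hypothetical, and this is not the flip_bit() table from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed hash model: rotate the running hash left by 7, XOR in the next byte. */
static uint32_t sketch_hash(const unsigned char *name, size_t len)
{
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = ((h << 7) | (h >> 25)) ^ name[i];
	return h;
}

/* Hash bit that bit `bit` of name[idx] is XORed into under that model. */
static unsigned hash_bit(size_t len, size_t idx, unsigned bit)
{
	assert(idx < len && bit < 8);
	return (unsigned)((7 * (len - 1 - idx) + bit) % 32);
}

/*
 * Flip bit `bi` of name[i] and bit `bj` of name[j] in place.
 * Returns 1 if the flip was applied (hash unchanged), 0 if the two bits
 * do not land on the same hash bit or a result would be NUL or '/'.
 */
static int flip_pair(unsigned char *name, size_t len,
		     size_t i, unsigned bi, size_t j, unsigned bj)
{
	unsigned char ci, cj;

	if (i >= len || j >= len || i == j || bi > 7 || bj > 7)
		return 0;
	if (hash_bit(len, i, bi) != hash_bit(len, j, bj))
		return 0;

	ci = (unsigned char)(name[i] ^ (1u << bi));
	cj = (unsigned char)(name[j] ^ (1u << bj));
	if (ci == '\0' || ci == '/' || cj == '\0' || cj == '/')
		return 0;

	name[i] = ci;
	name[j] = cj;
	return 1;
}
```

With the 5-character name "hello", flip_pair(name, 5, 0, 4, 4, 0) flips bit 4 of 'h' and bit 0 of 'o', giving "xelln" with the same sketch_hash() value. Under this model 5 characters is the shortest length at which the first character's bits wrap around onto the last character's low bits (4 * 7 + 4 = 32), which is consistent with the focus on names around that length.
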
> Updates since last posting:
>     - Definition of ARRAY_SIZE() macro moved to "include/libxfs.h"
>     - Added some more background commentary:
>         - About the details of operation in flip_bit().
>           Specifically, that the table can be expanded as needed,
>           but that it is already way bigger than practically
>           necessary (and why it is that way).
>         - About the number of alternates available as the length
>           of a name increases.
>         - That the key cases we're interested in are names that are
>           around 5 characters in length. Less than that it's not
>           very important because we don't obfuscate the name, and
>           greater than that the odds of the result conflicting
>           with an existing name are small.
>     - Basically, the density of meaning in this code is kind of
>       high, so it warrants a lot more comments to help make what
>       it's doing more apparent. So I fleshed this out, as requested
>       by Dave.
>
> Signed-off-by: Alex Elder <aelder@sgi.com>

The additional comments help a lot in explaining this code. Very
well written, Alex.
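
The duplicate-avoidance loop described in the second and third changes quoted above can be sketched roughly like this. It is a toy, not the code under review: the fixed-size name_table, the helper names and the 3-bit sequence number are all assumptions made for illustration, and the hash is modelled as "rotate left by 7, XOR in the next byte". Each bit of the sequence number selects one hash-neutral pair of bit flips, so a given (name, sequence number) always yields the same alternate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NAMES 64	/* hypothetical table capacity */
#define MAX_LEN   32	/* hypothetical maximum name length */

struct name_table {
	unsigned char names[MAX_NAMES][MAX_LEN];
	size_t lens[MAX_NAMES];
	size_t count;
};

/* Assumed hash model: rotate left by 7, XOR in the next byte. */
static uint32_t sketch_hash(const unsigned char *name, size_t len)
{
	uint32_t h = 0;

	for (size_t i = 0; i < len; i++)
		h = ((h << 7) | (h >> 25)) ^ name[i];
	return h;
}

static int table_has(const struct name_table *t,
		     const unsigned char *name, size_t len)
{
	for (size_t i = 0; i < t->count; i++)
		if (t->lens[i] == len && memcmp(t->names[i], name, len) == 0)
			return 1;
	return 0;
}

static void table_add(struct name_table *t,
		      const unsigned char *name, size_t len)
{
	assert(t->count < MAX_NAMES && len <= MAX_LEN);
	memcpy(t->names[t->count], name, len);
	t->lens[t->count++] = len;
}

/*
 * Build alternate number `seq` (1..7) of `name` into `out`.  Bit k of
 * seq flips bit 4+k of the fifth-from-last character together with
 * bit k of the last one; under the assumed hash both land on hash bit k.
 * Returns 0 if no such alternate exists or it contains NUL or '/'.
 */
static int make_alternate(const unsigned char *name, size_t len,
			  unsigned seq, unsigned char *out)
{
	if (len < 5 || len > MAX_LEN || seq == 0 || seq > 7)
		return 0;
	memcpy(out, name, len);
	for (unsigned k = 0; k < 3; k++) {
		if (seq & (1u << k)) {
			out[len - 5] ^= (unsigned char)(1u << (4 + k));
			out[len - 1] ^= (unsigned char)(1u << k);
		}
	}
	return out[len - 5] != '\0' && out[len - 5] != '/' &&
	       out[len - 1] != '\0' && out[len - 1] != '/';
}

/*
 * Choose a name not yet in the table: the name itself if unseen,
 * otherwise the first unused alternate in sequence order.  The chosen
 * name is recorded in the table.  Returns 0 if nothing is available.
 */
static int pick_unique(struct name_table *t, const unsigned char *name,
		       size_t len, unsigned char *out)
{
	if (!table_has(t, name, len)) {
		memcpy(out, name, len);
		table_add(t, out, len);
		return 1;
	}
	for (unsigned seq = 1; seq <= 7; seq++) {
		if (make_alternate(name, len, seq, out) &&
		    !table_has(t, out, len)) {
			table_add(t, out, len);
			return 1;
		}
	}
	return 0;
}
```

Asking this sketch for "hello" twice returns "hello" the first time and a different 5-character name with the same sketch_hash() value the second time. A real implementation would presumably need far more than seven alternates and a faster lookup than a linear scan.
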
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com