* [PATCH v3, 09/16] xfsprogs: metadump: don't loop on too many dups
@ 2011-02-18 21:21 Alex Elder
2011-02-24 1:55 ` Dave Chinner
0 siblings, 1 reply; 4+ messages in thread
From: Alex Elder @ 2011-02-18 21:21 UTC (permalink / raw)
To: xfs
Don't just loop indefinitely when an obfuscated name comes up as a
duplicate. Count the number of times we've found a duplicate and,
if it gets excessive despite choosing names at random, just give up
and use the original name without obfuscation.
Technically, a typical 5-character name has 255 other names that can
have the same hash value. But the algorithm doesn't hit all
possible names (far from it) so duplicates are still possible.
Signed-off-by: Alex Elder <aelder@sgi.com>
The only change worth mentioning from the last version posted is
that the duplicate count is now updated inside the loop that
searches the name table.
---
db/metadump.c | 32 ++++++++++++++++++++++++--------
1 file changed, 24 insertions(+), 8 deletions(-)
Index: b/db/metadump.c
===================================================================
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -29,6 +29,14 @@
#define DEFAULT_MAX_EXT_SIZE 1000
+/*
+ * It's possible that multiple files in a directory (or attributes
+ * in a file) produce the same obfuscated name. If that happens, we
+ * try to create another one. After several rounds of this though,
+ * we just give up and leave the original name as-is.
+ */
+#define DUP_MAX 5 /* Max duplicates before we give up */
+
/* copy all metadata structures to/from a file */
static int metadump_f(int argc, char **argv);
@@ -437,8 +445,9 @@ generate_obfuscated_name(
{
xfs_dahash_t hash;
name_ent_t *p;
- int dup;
+ int dup = 0;
uchar_t newname[NAME_MAX];
+ uchar_t *newp;
/*
* Our obfuscation algorithm requires at least 5-character
@@ -471,19 +480,17 @@ generate_obfuscated_name(
do {
int i;
xfs_dahash_t newhash = 0;
- uchar_t *newp = &newname[0];
uchar_t *first;
uchar_t high_bit;
int shift;
- dup = 0;
-
/*
* The beginning of the obfuscated name can be
* pretty much anything, so fill it in with random
* characters. Accumulate its new hash value as we
* go.
*/
+ newp = &newname[0];
for (i = 0; i < namelen - 5; i++) {
*newp = random_filename_char();
newhash = *newp ^ rol32(newhash, 7);
@@ -531,14 +538,22 @@ generate_obfuscated_name(
ASSERT(libxfs_da_hashname(newname, namelen) == hash);
+ /*
+ * Search the name table to be sure we don't produce
+ * a name that's already been used.
+ */
for (p = nametable[hash % NAME_TABLE_SIZE]; p; p = p->next) {
if (p->hash == hash && p->namelen == namelen &&
!memcmp(p->name, newname, namelen)) {
- dup = 1;
+ dup++;
break;
}
}
- } while (dup);
+ } while (dup && dup < DUP_MAX);
+
+ /* Use the original name if we got too many dups. */
+
+ newp = dup < DUP_MAX ? newname : name;
/* Create an entry for the name in the name table */
@@ -547,7 +562,7 @@ generate_obfuscated_name(
return;
p->namelen = namelen;
- memcpy(p->name, newname, namelen);
+ memcpy(p->name, newp, namelen);
p->hash = hash;
p->next = nametable[hash % NAME_TABLE_SIZE];
@@ -555,7 +570,8 @@ generate_obfuscated_name(
/* Update the caller's copy with the obfuscated name */
- memcpy(name, newname, namelen);
+ if (newp != name)
+ memcpy(name, newp, namelen);
}
static void
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs
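The bounded retry described in the patch above can be sketched on its own,
outside of metadump. This is only an illustration under stated assumptions:
TABLE_SIZE, SKETCH_NAME_MAX, struct name_ent, name_seen(), name_record(),
pick_unique() and gen_counter() are hypothetical names, not xfsprogs code,
and the generator is a deterministic stand-in for the random one. Only
DUP_MAX and the chained-bucket lookup mirror the diff.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DUP_MAX         5    /* same limit as in the patch */
#define TABLE_SIZE      64   /* hypothetical stand-in for NAME_TABLE_SIZE */
#define SKETCH_NAME_MAX 255  /* hypothetical upper bound on name length */

struct name_ent {
	struct name_ent *next;
	size_t           namelen;
	unsigned char    name[SKETCH_NAME_MAX];
};

static struct name_ent *table[TABLE_SIZE];

/* Return 1 if (name, namelen) is already recorded in this bucket. */
static int name_seen(unsigned bucket, const unsigned char *name, size_t namelen)
{
	const struct name_ent *p;

	for (p = table[bucket % TABLE_SIZE]; p; p = p->next)
		if (p->namelen == namelen && !memcmp(p->name, name, namelen))
			return 1;
	return 0;
}

/* Record a name at the head of its bucket; 0 on success, -1 on ENOMEM. */
static int name_record(unsigned bucket, const unsigned char *name, size_t namelen)
{
	struct name_ent *p;

	assert(namelen <= SKETCH_NAME_MAX);
	p = malloc(sizeof(*p));
	if (!p)
		return -1;
	p->namelen = namelen;
	memcpy(p->name, name, namelen);
	p->next = table[bucket % TABLE_SIZE];
	table[bucket % TABLE_SIZE] = p;
	return 0;
}

typedef void (*gen_fn)(unsigned char *out, size_t namelen, void *ctx);

/* Deterministic stand-in generator: "aaaaa", then "bbbbb", and so on. */
static void gen_counter(unsigned char *out, size_t namelen, void *ctx)
{
	unsigned *counter = ctx;

	memset(out, 'a' + (int)(*counter % 26), namelen);
	(*counter)++;
}

/*
 * Generate candidates until one is unused or DUP_MAX of them collided.
 * Returns 1 with the unused candidate in `out`, or 0 if we gave up.
 */
static int pick_unique(unsigned bucket, unsigned char *out, size_t namelen,
		       gen_fn gen, void *ctx)
{
	int dups = 0;   /* running count of collisions */
	int found;      /* did *this* candidate collide? */

	do {
		gen(out, namelen, ctx);
		found = name_seen(bucket, out, namelen);
		if (found)
			dups++;
	} while (found && dups < DUP_MAX);

	return !found;
}
```

A caller would fall back to the original name when pick_unique() returns 0.
The sketch keeps the per-attempt result (found) separate from the running
count (dups), so a unique candidate on a later attempt ends the loop at
once. With the cumulative counter alone as the loop condition, the loop
would keep running after the first duplicate until the count reached the
limit.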
* Re: [PATCH v3, 09/16] xfsprogs: metadump: don't loop on too many dups
2011-02-18 21:21 [PATCH v3, 09/16] xfsprogs: metadump: don't loop on too many dups Alex Elder
@ 2011-02-24 1:55 ` Dave Chinner
2011-02-25 18:13 ` [PATCH v4, " Alex Elder
0 siblings, 1 reply; 4+ messages in thread
From: Dave Chinner @ 2011-02-24 1:55 UTC (permalink / raw)
To: Alex Elder; +Cc: xfs
On Fri, Feb 18, 2011 at 03:21:01PM -0600, Alex Elder wrote:
> Don't just loop indefinitely when an obfuscated name comes up as a
> duplicate. Count the number of times we've found a duplicate and,
> if it gets excessive despite choosing names at random, just give up
> and use the original name without obfuscation.
>
> Technically, a typical 5-character name has 255 other names that can
> have the same hash value. But the algorithm doesn't hit all
> possible names (far from it) so duplicates are still possible.
>
> Signed-off-by: Alex Elder <aelder@sgi.com>
>
> The only change worth mentioning from the last version posted is
> that the duplicate count is now updated inside the loop that
> searches the name table.
The only thing that I'd suggest here is that we emit a warning to
indicate that we haven't obfuscated a name due to excessive
duplicates being created. If the user has asked for obfuscation, we
should at least inform them of failures to do so for filenames that
should be obfuscated....
Otherwise,
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com
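The "255 other names" figure in the quoted commit message can be checked
with a little arithmetic. A 5-byte name carries 40 bits and the hash is 32
bits wide. The per-byte step seen in the diff, newhash = *newp ^
rol32(newhash, 7), uses only rotates and XORs, and for five bytes it can
reach every 32-bit value, so each hash value has 2^(40-32) = 256 five-byte
preimages: the name itself plus 255 others. Many of those contain bytes
that are unusable or awkward in file names, which is one reason the
obfuscator cannot use them all. The sketch below builds one such collision
by hand. sketch_hash() and make_collision() are hypothetical names; the
hash is written from the recurrence in the diff and is not a copy of
libxfs_da_hashname().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left; shift must be in the range 1..31. */
static uint32_t rol32(uint32_t word, unsigned shift)
{
	return (word << shift) | (word >> (32 - shift));
}

/* Per-byte form of the hash as it is accumulated in the patch. */
static uint32_t sketch_hash(const unsigned char *name, size_t len)
{
	uint32_t hash = 0;
	size_t i;

	for (i = 0; i < len; i++)
		hash = name[i] ^ rol32(hash, 7);
	return hash;
}

/*
 * Turn a name into a different name with the same hash.  Bit 0 of the
 * second-to-last byte is rotated left by 7 before the last byte is
 * XORed in, so it lands on hash bit 7, the same bit as bit 7 of the
 * last byte.  Flipping both leaves the hash unchanged.
 */
static void make_collision(unsigned char *name, size_t len)
{
	assert(len >= 2);
	name[len - 2] ^= 0x01;
	name[len - 1] ^= 0x80;
}
```

For example, applying make_collision() to "abcde" gives the bytes 'a',
'b', 'c', 'e', 0xE5, which hash to the same value as the original. The
last byte is no longer printable ASCII, which illustrates why not every
colliding name is a good replacement.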
* [PATCH v4, 09/16] xfsprogs: metadump: don't loop on too many dups
2011-02-24 1:55 ` Dave Chinner
@ 2011-02-25 18:13 ` Alex Elder
2011-03-03 5:07 ` Dave Chinner
0 siblings, 1 reply; 4+ messages in thread
From: Alex Elder @ 2011-02-25 18:13 UTC (permalink / raw)
To: Dave Chinner; +Cc: xfs
Don't just loop indefinitely when an obfuscated name comes up as a
duplicate. Count the number of times we've found a duplicate and,
if it gets excessive despite choosing names at random, just give up
and use the original name without obfuscation.
Technically, a typical 5-character name has 255 other names that can
have the same hash value. But the algorithm doesn't hit all
possible names (far from it) so duplicates are still possible.
Signed-off-by: Alex Elder <aelder@sgi.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Updates (v4):
- Rearranged things a bit so that if too many duplicates are
encountered, a warning gets emitted.
Dave already signed off on it but the update was different enough I
thought I should post it once more.
---
db/metadump.c | 42 +++++++++++++++++++++++++++++++-----------
1 file changed, 31 insertions(+), 11 deletions(-)
Index: b/db/metadump.c
===================================================================
--- a/db/metadump.c
+++ b/db/metadump.c
@@ -29,6 +29,14 @@
#define DEFAULT_MAX_EXT_SIZE 1000
+/*
+ * It's possible that multiple files in a directory (or attributes
+ * in a file) produce the same obfuscated name. If that happens, we
+ * try to create another one. After several rounds of this though,
+ * we just give up and leave the original name as-is.
+ */
+#define DUP_MAX 5 /* Max duplicates before we give up */
+
/* copy all metadata structures to/from a file */
static int metadump_f(int argc, char **argv);
@@ -444,8 +452,9 @@ generate_obfuscated_name(
{
xfs_dahash_t hash;
name_ent_t *p;
- int dup;
+ int dup = 0;
uchar_t newname[NAME_MAX];
+ uchar_t *newp;
/*
* Our obfuscation algorithm requires at least 5-character
@@ -481,19 +490,17 @@ generate_obfuscated_name(
do {
int i;
xfs_dahash_t newhash = 0;
- uchar_t *newp = &newname[0];
uchar_t *first;
uchar_t high_bit;
int shift;
- dup = 0;
-
/*
* The beginning of the obfuscated name can be
* pretty much anything, so fill it in with random
* characters. Accumulate its new hash value as we
* go.
*/
+ newp = &newname[0];
for (i = 0; i < namelen - 5; i++) {
*newp = random_filename_char();
newhash = *newp ^ rol32(newhash, 7);
@@ -541,14 +548,31 @@ generate_obfuscated_name(
ASSERT(libxfs_da_hashname(newname, namelen) == hash);
+ /*
+ * Search the name table to be sure we don't produce
+ * a name that's already been used.
+ */
for (p = nametable[hash % NAME_TABLE_SIZE]; p; p = p->next) {
if (p->hash == hash && p->namelen == namelen &&
!memcmp(p->name, newname, namelen)) {
- dup = 1;
+ dup++;
break;
}
}
- } while (dup);
+ } while (dup && dup < DUP_MAX);
+
+ /*
+ * Update the caller's copy with the obfuscated name. Use
+ * the original name if we got too many duplicates--and if
+ * so, issue a warning.
+ */
+ if (dup < DUP_MAX)
+ memcpy(name, newname, namelen);
+ else
+ print_warning("duplicate name for inode %llu "
+ "in dir inode %llu\n",
+ (unsigned long long) ino,
+ (unsigned long long) cur_ino);
/* Create an entry for the name in the name table */
@@ -557,15 +581,11 @@ generate_obfuscated_name(
return;
p->namelen = namelen;
- memcpy(p->name, newname, namelen);
+ memcpy(p->name, name, namelen);
p->hash = hash;
p->next = nametable[hash % NAME_TABLE_SIZE];
nametable[hash % NAME_TABLE_SIZE] = p;
-
- /* Update the caller's copy with the obfuscated name */
-
- memcpy(name, newname, namelen);
}
static void
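The reordering in v4 can be sketched in isolation: the caller's buffer is
overwritten only when an unused name was found, a warning is issued
otherwise, and whatever ends up in the caller's buffer is what later gets
recorded in the name table. This is a minimal sketch under stated
assumptions: finish_name(), warn_unobfuscated() and warnings_issued are
hypothetical names, and fprintf() stands in for metadump's
print_warning(). Only DUP_MAX and the message text come from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DUP_MAX 5   /* same limit as in the patch */

static int warnings_issued;   /* hypothetical counter, for illustration */

/* Stand-in for print_warning(): report a name left unobfuscated. */
static void warn_unobfuscated(unsigned long long ino,
			      unsigned long long dir_ino)
{
	fprintf(stderr, "duplicate name for inode %llu in dir inode %llu\n",
		ino, dir_ino);
	warnings_issued++;
}

/*
 * Decide what the caller's buffer should hold once the retry loop ends.
 * dup is the number of duplicates seen.  Returns 1 if the obfuscated
 * candidate was copied into name, 0 if the original was kept.
 */
static int finish_name(unsigned char *name, const unsigned char *candidate,
		       size_t namelen, int dup,
		       unsigned long long ino, unsigned long long dir_ino)
{
	assert(dup >= 0 && dup <= DUP_MAX);

	if (dup < DUP_MAX) {
		memcpy(name, candidate, namelen);
		return 1;
	}
	warn_unobfuscated(ino, dir_ino);
	return 0;
}
```

Doing the copy before the table insert is what lets the insert use name
unconditionally, so the original name is also recorded when obfuscation
is abandoned and later candidates cannot collide with it.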
* Re: [PATCH v4, 09/16] xfsprogs: metadump: don't loop on too many dups
2011-02-25 18:13 ` [PATCH v4, " Alex Elder
@ 2011-03-03 5:07 ` Dave Chinner
0 siblings, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2011-03-03 5:07 UTC (permalink / raw)
To: Alex Elder; +Cc: xfs
On Fri, Feb 25, 2011 at 12:13:44PM -0600, Alex Elder wrote:
> Don't just loop indefinitely when an obfuscated name comes up as a
> duplicate. Count the number of times we've found a duplicate and,
> if it gets excessive despite choosing names at random, just give up
> and use the original name without obfuscation.
>
> Technically, a typical 5-character name has 255 other names that can
> have the same hash value. But the algorithm doesn't hit all
> possible names (far from it) so duplicates are still possible.
>
> Signed-off-by: Alex Elder <aelder@sgi.com>
> Reviewed-by: Dave Chinner <dchinner@redhat.com>
>
> Updates (v4):
> - Rearranged things a bit so that if too many duplicates are
> encountered, a warning gets emitted.
>
> Dave already signed off on it but the update was different enough I
> thought I should post it once more.
Reviewed-by: Dave Chinner <dchinner@redhat.com>
--
Dave Chinner
david@fromorbit.com