linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction)
@ 2006-12-07 22:12 Jeff Layton
  2006-12-15 13:05 ` Jeff Layton
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2006-12-07 22:12 UTC (permalink / raw)
  To: linux-kernel, linux-fsdevel

Apologies for the long email, but I couldn't come up with a way to explain
this in fewer words. Many filesystems that are part of the linux kernel have
problems with how they have assign out i_ino values:

1) on filesystems w/o permanent inode numbers, i_ino values can be
larger than 32 bits, which can cause problems for some 32 bit userspace
programs on a 64 bit kernel. We can't do anything for filesystems that have
actual 64-bit inode numbers, but on filesystems that generate i_ino
values on the fly, we should try to have them fit in 32 bits. We could
trivially fix this by making the static counters in new_inode and iunique
32 bits, but...

2) many filesystems call new_inode and assume that the i_ino values they
are given are unique. They are not guaranteed to be so, since the static
counter can wrap. This problem is exacerbated by the fix for #1.

3) after allocating a new inode, some filesystems call iunique to try to
get a unique i_ino value, but they don't actually add their inodes to
the hashtable, and so they're still not guaranteed to be unique if that
counter wraps. We could hash the inodes to fix this, but...

4) many of these filesystems pin their inodes in memory, and adding them to
the inode hashtable might slow down lookups for "real" filesystems.

The following series of patches aims to correct these problems. It adds
two new functions iunique_register and iunique_unregister, that use IDR
under the hood. Filesystems can call iunique_register at inode creation,
and then at deletion, we'll automatically unregister them. It uses
per-superblock hashes for this. One side effect is that with this patch,
i_ino values are reused rather quickly (i.e. IDR prefers to reuse a number
that has been deallocated rather than assign an unused one).

Because i_ino's can be reused so quickly, we don't want NFS getting
confused when it happens. The patch also adds a new s_generation counter
to the superblock. When iunique_register is called, we'll assign
the s_generation value to the i_generation, and then increment it to
help ensure that we get different filehandles.

Al Viro had expressed some concern with an earlier patch that this method
might slow down pipe creation. I've done some testing and I think the
impact will be minimal. Timing a small program that creates and closes 100
million pipes in a loop:

patched:
-------------
real    8m8.623s
user    0m37.418s
sys     7m31.196s

unpatched:
--------------
real    8m7.150s
user    0m40.943s
sys     7m26.204s

As the number of pipes grows on the system this time may grow somewhat,
but it doesn't seem like it will be terrible.

iunique_unregister is called unconditionally in several places, but filesystems
that don't use this should have empty IDR hashes and return quickly.

3 patches follow:

- a patch to add the new superblock fields and functions and to change the
iunique counter to 32 bits

- a patch to make sure that the inodes allocated by get_sb_pseudo and
simple_fill_super are unique

- a patch to convert pipefs to hash its inode numbers this way

Other patches will follow to fix up other filesystems as I get to them. Once
all of the callers of new_inode have been audited to make sure that they
assign i_ino to a sane value, we can remove the static counter from new_inode.

Many thanks to Eric Sandeen, Joern Engel, Christoph Hellwig, and Al Viro for
guidance on this.

Signed-off-by: Jeff Layton <jlayton@redhat.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction)
  2006-12-07 22:12 [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction) Jeff Layton
@ 2006-12-15 13:05 ` Jeff Layton
  2006-12-15 14:00   ` Jörn Engel
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2006-12-15 13:05 UTC (permalink / raw)
  To: akpm; +Cc: linux-kernel, linux-fsdevel

Jeff Layton wrote:
 > Apologies for the long email, but I couldn't come up with a way to explain
 > this in fewer words. Many filesystems that are part of the linux kernel
 > have problems with how they have assign out i_ino values:
 >

If there are no further comments/suggestions on this patchset, I'd like to
ask Andrew to add it to -mm soon and target getting it rolled into 2.6.21.
Once it's in -mm, I'll start posting more patches to convert the other
filesystems.

Andrew, are the patches I've already posted sufficient, or should I resend
the set?

Thanks,
Jeff



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction)
  2006-12-15 13:05 ` Jeff Layton
@ 2006-12-15 14:00   ` Jörn Engel
  2006-12-15 15:01     ` Jeff Layton
  0 siblings, 1 reply; 5+ messages in thread
From: Jörn Engel @ 2006-12-15 14:00 UTC (permalink / raw)
  To: Jeff Layton; +Cc: akpm, linux-kernel, linux-fsdevel, Christoph Hellwig

On Fri, 15 December 2006 08:05:24 -0500, Jeff Layton wrote:
> Jeff Layton wrote:
> > Apologies for the long email, but I couldn't come up with a way to explain
> > this in fewer words. Many filesystems that are part of the linux kernel
> > have problems with how they have assign out i_ino values:
> >
> 
> If there are no further comments/suggestions on this patchset, I'd like to
> ask Andrew to add it to -mm soon and target getting it rolled into 2.6.21.

I'm still unsure whether idr has a sufficient advantage over simply
hashing the inodes.  Hch has suggested that keeping the hashtable
smaller is good for performance.  But idr adds new complexity, which
should be avoided on its own right.  So is the performance benefit big
enough to add more complexity?  Is it even measurable?

Jörn

-- 
Fancy algorithms are buggier than simple ones, and they're much harder
to implement. Use simple algorithms as well as simple data structures.
-- Rob Pike
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction)
  2006-12-15 14:00   ` Jörn Engel
@ 2006-12-15 15:01     ` Jeff Layton
  2006-12-26 19:16       ` Jeff Layton
  0 siblings, 1 reply; 5+ messages in thread
From: Jeff Layton @ 2006-12-15 15:01 UTC (permalink / raw)
  To: Jörn Engel; +Cc: akpm, linux-kernel, linux-fsdevel, Christoph Hellwig

Jörn Engel wrote:
 > On Fri, 15 December 2006 08:05:24 -0500, Jeff Layton wrote:
 >> Jeff Layton wrote:
 >>> Apologies for the long email, but I couldn't come up with a way to explain
 >>> this in fewer words. Many filesystems that are part of the linux kernel
 >>> have problems with how they have assign out i_ino values:
 >>>
 >> If there are no further comments/suggestions on this patchset, I'd like to
 >> ask Andrew to add it to -mm soon and target getting it rolled into 2.6.21.
 >
 > I'm still unsure whether idr has a sufficient advantage over simply
 > hashing the inodes.  Hch has suggested that keeping the hashtable
 > smaller is good for performance.  But idr adds new complexity, which
 > should be avoided on its own right.  So is the performance benefit big
 > enough to add more complexity?  Is it even measurable?
 >
 > Jörn
 >


A very good question. Certainly, just hashing them would be a heck of a
lot simpler. That was my first inclination when I looked at this, but as
you said, HCH NAK'ed that idea stating that it would bloat out the
hashtable. I tend to think that it's probably not that significant, but
that might very much depend on workload.

I'm OK with either approach, though I'd like to have some sort of buyin
from Christoph on hashing the inodes before I start working on patches to
do that.

Christoph, care to comment?

-- Jeff


-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction)
  2006-12-15 15:01     ` Jeff Layton
@ 2006-12-26 19:16       ` Jeff Layton
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff Layton @ 2006-12-26 19:16 UTC (permalink / raw)
  To: Jörn Engel, Christoph Hellwig; +Cc: linux-kernel, linux-fsdevel

Jeff Layton wrote:
 >  >
 >  > I'm still unsure whether idr has a sufficient advantage over simply
 >  > hashing the inodes.  Hch has suggested that keeping the hashtable
 >  > smaller is good for performance.  But idr adds new complexity, which
 >  > should be avoided on its own right.  So is the performance benefit big
 >  > enough to add more complexity?  Is it even measurable?
 >  >
 >  > Jörn
 >  >
 >
 >
 > A very good question. Certainly, just hashing them would be a heck of a
 > lot simpler. That was my first inclination when I looked at this, but as
 > you said, HCH NAK'ed that idea stating that it would bloat out the
 > hashtable. I tend to think that it's probably not that significant, but
 > that might very much depend on workload.
 >
 > I'm OK with either approach, though I'd like to have some sort of buyin
 > from Christoph on hashing the inodes before I start working on patches to
 > do that.
 >
 > Christoph, care to comment?
 >
 > -- Jeff
 >
 >

I'm still in limbo on this patchset and could use some guidance. It's not
clear to me that IDR is really a big enough win over just hashing the inodes.
It seems like the inode hash is pretty well optimized for fast lookups such
that even if we get some extra hash collisions it shouldn't be too awful.

Perhaps the best thing to do is to start with a patch that just hashes the
inodes and then fall back to using IDR (or some other scheme) if it turns out
that it impacts performance too much?

Anyone have an opinion on what approach they think would be best here?

-- Jeff

-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-12-26 19:16 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-07 22:12 [PATCH 0/3] ensure unique i_ino in filesystems without permanent inode numbers (introduction) Jeff Layton
2006-12-15 13:05 ` Jeff Layton
2006-12-15 14:00   ` Jörn Engel
2006-12-15 15:01     ` Jeff Layton
2006-12-26 19:16       ` Jeff Layton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).