* regressions due to 64-bit ext4 directory cookies @ 2013-02-12 20:28 J. Bruce Fields 2013-02-12 20:56 ` Bernd Schubert ` (2 more replies) 0 siblings, 3 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-12 20:28 UTC (permalink / raw) To: linux-ext4, sandeen, Theodore Ts'o, Bernd Schubert, gluster-devel 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" and previous patches solved problems with hash collisions in large directories by using 64- instead of 32- bit directory hashes in some cases. But it caused problems for users who assume directory offsets are "small". Two cases we've run across: - older NFS clients: 64-bit cookies cause applications on many older clients to fail. - gluster: gluster assumed that it could take the top bits of the offset for its own use. In both cases we could argue we're in the right: the nfs protocol defines cookies to be 64 bits, so clients should be prepared to handle them (remapping to smaller integers if necessary to placate applications using older system interfaces). And gluster was incorrect to assume that the "offset" was really an "offset" as opposed to just an opaque value. But in practice things that worked fine for a long time break on a kernel upgrade. So at a minimum I think we owe people a workaround, and turning off dir_index may not be practical for everyone. A "no_64bit_cookies" export option would provide a workaround for NFS servers with older NFS clients, but not for applications like gluster. For that reason I'd rather have a way to turn this off on a given ext4 filesystem. Is that practical? --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-12 20:28 regressions due to 64-bit ext4 directory cookies J. Bruce Fields @ 2013-02-12 20:56 ` Bernd Schubert 2013-02-12 21:00 ` J. Bruce Fields 2013-02-13 4:00 ` Theodore Ts'o 2013-02-13 6:56 ` Andreas Dilger 2 siblings, 1 reply; 44+ messages in thread From: Bernd Schubert @ 2013-02-12 20:56 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-ext4, sandeen, Theodore Ts'o, gluster-devel, Andreas Dilger On 02/12/2013 09:28 PM, J. Bruce Fields wrote: > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > and previous patches solved problems with hash collisions in large > directories by using 64- instead of 32- bit directory hashes in some > cases. But it caused problems for users who assume directory offsets > are "small". Two cases we've run across: > > - older NFS clients: 64-bit cookies cause applications on many > older clients to fail. > - gluster: gluster assumed that it could take the top bits of > the offset for its own use. > > In both cases we could argue we're in the right: the nfs protocol > defines cookies to be 64 bits, so clients should be prepared to handle > them (remapping to smaller integers if necessary to placate applications > using older system interfaces). And gluster was incorrect to assume > that the "offset" was really an "offset" as opposed to just an opaque > value. > > But in practice things that worked fine for a long time break on a > kernel upgrade. > > So at a minimum I think we owe people a workaround, and turning off > dir_index may not be practical for everyone. > > A "no_64bit_cookies" export option would provide a workaround for NFS > servers with older NFS clients, but not for applications like gluster. > > For that reason I'd rather have a way to turn this off on a given ext4 > filesystem. Is that practical? I think Ted needs to answer if he would accept another mount option. But before we are going this way, what is gluster doing if there are hash collisions? Thanks, Bernd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-12 20:56 ` Bernd Schubert @ 2013-02-12 21:00 ` J. Bruce Fields 2013-02-13 8:17 ` Bernd Schubert 2013-02-13 13:31 ` [Gluster-devel] " Niels de Vos 0 siblings, 2 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-12 21:00 UTC (permalink / raw) To: Bernd Schubert Cc: linux-ext4, sandeen, Theodore Ts'o, gluster-devel, Andreas Dilger On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote: > On 02/12/2013 09:28 PM, J. Bruce Fields wrote: > > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > > and previous patches solved problems with hash collisions in large > > directories by using 64- instead of 32- bit directory hashes in some > > cases. But it caused problems for users who assume directory offsets > > are "small". Two cases we've run across: > > > > - older NFS clients: 64-bit cookies cause applications on many > > older clients to fail. > > - gluster: gluster assumed that it could take the top bits of > > the offset for its own use. > > > > In both cases we could argue we're in the right: the nfs protocol > > defines cookies to be 64 bits, so clients should be prepared to handle > > them (remapping to smaller integers if necessary to placate applications > > using older system interfaces). And gluster was incorrect to assume > > that the "offset" was really an "offset" as opposed to just an opaque > > value. > > > > But in practice things that worked fine for a long time break on a > > kernel upgrade. > > > > So at a minimum I think we owe people a workaround, and turning off > > dir_index may not be practical for everyone. > > > > A "no_64bit_cookies" export option would provide a workaround for NFS > > servers with older NFS clients, but not for applications like gluster. > > > > For that reason I'd rather have a way to turn this off on a given ext4 > > filesystem. Is that practical? > > I think Ted needs to answer if he would accept another mount option. But > before we are going this way, what is gluster doing if there are hash > collions? They probably just haven't tested NFS with large enough directories. The birthday paradox says you'd need about 2^16 entries to have a 50-50 chance of hitting the problem. I don't know enough about ext4 directory performance. But unfortunately I suspect there's a range of directory sizes that are too small to have a significant chance of having directory collisions, but still large enough to need dir_index? --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-12 21:00 ` J. Bruce Fields @ 2013-02-13 8:17 ` Bernd Schubert 2013-02-13 22:18 ` J. Bruce Fields 2013-02-13 13:31 ` [Gluster-devel] " Niels de Vos 1 sibling, 1 reply; 44+ messages in thread From: Bernd Schubert @ 2013-02-13 8:17 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-ext4, sandeen, Theodore Ts'o, gluster-devel, Andreas Dilger On 02/12/2013 10:00 PM, J. Bruce Fields wrote: > On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote: >> On 02/12/2013 09:28 PM, J. Bruce Fields wrote: >>> 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" >>> and previous patches solved problems with hash collisions in large >>> directories by using 64- instead of 32- bit directory hashes in some >>> cases. But it caused problems for users who assume directory offsets >>> are "small". Two cases we've run across: >>> >>> - older NFS clients: 64-bit cookies cause applications on many >>> older clients to fail. >>> - gluster: gluster assumed that it could take the top bits of >>> the offset for its own use. >>> >>> In both cases we could argue we're in the right: the nfs protocol >>> defines cookies to be 64 bits, so clients should be prepared to handle >>> them (remapping to smaller integers if necessary to placate applications >>> using older system interfaces). And gluster was incorrect to assume >>> that the "offset" was really an "offset" as opposed to just an opaque >>> value. >>> >>> But in practice things that worked fine for a long time break on a >>> kernel upgrade. >>> >>> So at a minimum I think we owe people a workaround, and turning off >>> dir_index may not be practical for everyone. >>> >>> A "no_64bit_cookies" export option would provide a workaround for NFS >>> servers with older NFS clients, but not for applications like gluster. >>> >>> For that reason I'd rather have a way to turn this off on a given ext4 >>> filesystem. Is that practical? >> >> I think Ted needs to answer if he would accept another mount option. But >> before we are going this way, what is gluster doing if there are hash >> collions? > > They probably just haven't tested NFS with large enough directories. Is it only related to NFS or generic readdir over gluster? > The birthday paradox says you'd need about 2^16 entries to have a 50-50 > chance of hitting the problem. We are frequently running into it with 50000 files per directory. > > I don't know enough about ext4 directory performance. But unfortunately > I suspect there's a range of directory sizes that are too small to have > a significant chance of having directory collisions, but still large > enough to need dir_index? Here is a link to the initial benchmark: http://search.luky.org/linux-kernel.2001/msg00117.html Cheers, Bernd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 8:17 ` Bernd Schubert @ 2013-02-13 22:18 ` J. Bruce Fields 0 siblings, 0 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 22:18 UTC (permalink / raw) To: Bernd Schubert Cc: linux-ext4, sandeen, Theodore Ts'o, gluster-devel, Andreas Dilger On Wed, Feb 13, 2013 at 09:17:28AM +0100, Bernd Schubert wrote: > On 02/12/2013 10:00 PM, J. Bruce Fields wrote: > >On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote: > >>On 02/12/2013 09:28 PM, J. Bruce Fields wrote: > >>>06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > >>>and previous patches solved problems with hash collisions in large > >>>directories by using 64- instead of 32- bit directory hashes in some > >>>cases. But it caused problems for users who assume directory offsets > >>>are "small". Two cases we've run across: > >>> > >>> - older NFS clients: 64-bit cookies cause applications on many > >>> older clients to fail. > >>> - gluster: gluster assumed that it could take the top bits of > >>> the offset for its own use. > >>> > >>>In both cases we could argue we're in the right: the nfs protocol > >>>defines cookies to be 64 bits, so clients should be prepared to handle > >>>them (remapping to smaller integers if necessary to placate applications > >>>using older system interfaces). And gluster was incorrect to assume > >>>that the "offset" was really an "offset" as opposed to just an opaque > >>>value. > >>> > >>>But in practice things that worked fine for a long time break on a > >>>kernel upgrade. > >>> > >>>So at a minimum I think we owe people a workaround, and turning off > >>>dir_index may not be practical for everyone. > >>> > >>>A "no_64bit_cookies" export option would provide a workaround for NFS > >>>servers with older NFS clients, but not for applications like gluster. > >>> > >>>For that reason I'd rather have a way to turn this off on a given ext4 > >>>filesystem. Is that practical? > >> > >>I think Ted needs to answer if he would accept another mount option. But > >>before we are going this way, what is gluster doing if there are hash > >>collions? > > > >They probably just haven't tested NFS with large enough directories. > > Is it only related to NFS or generic readdir over gluster? > > >The birthday paradox says you'd need about 2^16 entries to have a 50-50 > >chance of hitting the problem. > > We are frequently running into it with 50000 files per directory. > > > > >I don't know enough about ext4 directory performance. But unfortunately > >I suspect there's a range of directory sizes that are too small to have > >a significant chance of having directory collisions, but still large > >enough to need dir_index? > > Here is a link to the initial benchmark: > http://search.luky.org/linux-kernel.2001/msg00117.html Hm, so I still don't have a good feeling for when dir_index is likely to start winning. For comparison, assuming the probability of seeing a failure due to hash collisions in an n-entry directory is the probability of a collision among n numbers chosen uniformly at random from 2^31, that's about:

	0.0002%  for n =   100
	0.006 %  for n =   500
	0.02  %  for n =  1000
	0.6   %  for n =  5000
	2     %  for n = 10000

So if we could tell anyone with directories smaller than 10,000 entries: "hey, you don't need dir_index anyway, just turn it off"--good, the only people still forced to deal with 64-bit cookies will be the ones that have probably already found that ext4 isn't reliable for their purposes.
If there are people with only a few hundred entries who still need dir_index--well, we may be making them unhappy as we're making them suffer to fix a bug that they've never actually seen. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
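Bruce's percentages above are easy to reproduce. The following standalone sketch (not part of the thread; compile with -lm) applies the standard birthday-paradox approximation 1 - e^(-n(n-1)/2d) to a d = 2^31 hash space:

    #include <math.h>
    #include <stdio.h>

    /* Probability of at least one collision among n values drawn
     * uniformly at random from a space of d hashes, via the usual
     * birthday-paradox approximation: 1 - exp(-n*(n-1)/(2*d)). */
    static double collision_prob(double n, double d)
    {
        return 1.0 - exp(-n * (n - 1.0) / (2.0 * d));
    }

    int main(void)
    {
        const double d = 2147483648.0;  /* 2^31 hash values */
        const int sizes[] = { 100, 500, 1000, 5000, 10000, 50000, 65536 };

        for (unsigned i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
            printf("n = %6d  ->  P(collision) = %8.4f%%\n",
                   sizes[i], 100.0 * collision_prob(sizes[i], d));
        return 0;
    }

The output matches Bruce's table, the 50% point lands near n = 54,000 (about 2^16, matching his earlier estimate), and n = 50000 comes out around 44% per full directory scan, which is consistent with Bernd's report of hitting the problem frequently at that size.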
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-12 21:00 ` J. Bruce Fields 2013-02-13 8:17 ` Bernd Schubert @ 2013-02-13 13:31 ` Niels de Vos 2013-02-13 15:40 ` Bernd Schubert 1 sibling, 1 reply; 44+ messages in thread From: Niels de Vos @ 2013-02-13 13:31 UTC (permalink / raw) To: J. Bruce Fields Cc: Bernd Schubert, sandeen, Andreas Dilger, linux-ext4, Theodore Ts'o, gluster-devel On Tue, Feb 12, 2013 at 04:00:54PM -0500, J. Bruce Fields wrote: > On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote: > > On 02/12/2013 09:28 PM, J. Bruce Fields wrote: > > > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > > > and previous patches solved problems with hash collisions in large > > > directories by using 64- instead of 32- bit directory hashes in some > > > cases. But it caused problems for users who assume directory offsets > > > are "small". Two cases we've run across: > > > > > > - older NFS clients: 64-bit cookies cause applications on many > > > older clients to fail. > > > - gluster: gluster assumed that it could take the top bits of > > > the offset for its own use. > > > > > > In both cases we could argue we're in the right: the nfs protocol > > > defines cookies to be 64 bits, so clients should be prepared to handle > > > them (remapping to smaller integers if necessary to placate applications > > > using older system interfaces). And gluster was incorrect to assume > > > that the "offset" was really an "offset" as opposed to just an opaque > > > value. > > > > > > But in practice things that worked fine for a long time break on a > > > kernel upgrade. > > > > > > So at a minimum I think we owe people a workaround, and turning off > > > dir_index may not be practical for everyone. > > > > > > A "no_64bit_cookies" export option would provide a workaround for NFS > > > servers with older NFS clients, but not for applications like gluster. > > > > > > For that reason I'd rather have a way to turn this off on a given ext4 > > > filesystem. Is that practical? > > > > I think Ted needs to answer if he would accept another mount option. But > > before we are going this way, what is gluster doing if there are hash > > collions? > > They probably just haven't tested NFS with large enough directories. > The birthday paradox says you'd need about 2^16 entries to have a 50-50 > chance of hitting the problem. The Gluster NFS-server gets into an infinite loop: - https://bugzilla.redhat.com/show_bug.cgi?id=838784 The general advice (even before this Bug) is that XFS should be used, which is not affected by this problem (yet?). Cheers, Niels ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-13 13:31 ` [Gluster-devel] " Niels de Vos @ 2013-02-13 15:40 ` Bernd Schubert 2013-02-14 5:32 ` Dave Chinner 0 siblings, 1 reply; 44+ messages in thread From: Bernd Schubert @ 2013-02-13 15:40 UTC (permalink / raw) To: Niels de Vos Cc: J. Bruce Fields, sandeen, Andreas Dilger, linux-ext4, Theodore Ts'o, gluster-devel On 02/13/2013 02:31 PM, Niels de Vos wrote: > On Tue, Feb 12, 2013 at 04:00:54PM -0500, J. Bruce Fields wrote: >> On Tue, Feb 12, 2013 at 09:56:41PM +0100, Bernd Schubert wrote: >>> On 02/12/2013 09:28 PM, J. Bruce Fields wrote: >>>> 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" >>>> and previous patches solved problems with hash collisions in large >>>> directories by using 64- instead of 32- bit directory hashes in some >>>> cases. But it caused problems for users who assume directory offsets >>>> are "small". Two cases we've run across: >>>> >>>> - older NFS clients: 64-bit cookies cause applications on many >>>> older clients to fail. >>>> - gluster: gluster assumed that it could take the top bits of >>>> the offset for its own use. >>>> >>>> In both cases we could argue we're in the right: the nfs protocol >>>> defines cookies to be 64 bits, so clients should be prepared to handle >>>> them (remapping to smaller integers if necessary to placate applications >>>> using older system interfaces). And gluster was incorrect to assume >>>> that the "offset" was really an "offset" as opposed to just an opaque >>>> value. >>>> >>>> But in practice things that worked fine for a long time break on a >>>> kernel upgrade. >>>> >>>> So at a minimum I think we owe people a workaround, and turning off >>>> dir_index may not be practical for everyone. >>>> >>>> A "no_64bit_cookies" export option would provide a workaround for NFS >>>> servers with older NFS clients, but not for applications like gluster. >>>> >>>> For that reason I'd rather have a way to turn this off on a given ext4 >>>> filesystem. Is that practical? >>> >>> I think Ted needs to answer if he would accept another mount option. But >>> before we are going this way, what is gluster doing if there are hash >>> collions? >> >> They probably just haven't tested NFS with large enough directories. >> The birthday paradox says you'd need about 2^16 entries to have a 50-50 >> chance of hitting the problem. > > The Gluster NFS-server gets into an infinite loop: > - https://bugzilla.redhat.com/show_bug.cgi?id=838784 Hmm, this bugzilla is not entirely what I meant, as it refers to 64-bit hashes. My question actually was, what is gluster going to do if there is a 32-bit hash collision and ext4 seeks back to a random entry? That might end in an endless loop, but it also simply might list entries multiple times on readdir(). Of course, something that only happens rarely is better than something that happens all the time, but it still would be better to properly fix it, wouldn't it? > The general advise (even before this Bug) is that XFS should be used, > which is not affected with this problem (yet?). Hmm, well, always depends on the workload. Cheers, Bernd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-13 15:40 ` Bernd Schubert @ 2013-02-14 5:32 ` Dave Chinner 0 siblings, 0 replies; 44+ messages in thread From: Dave Chinner @ 2013-02-14 5:32 UTC (permalink / raw) To: Bernd Schubert Cc: Niels de Vos, J. Bruce Fields, sandeen, Andreas Dilger, linux-ext4, Theodore Ts'o, gluster-devel On Wed, Feb 13, 2013 at 04:40:35PM +0100, Bernd Schubert wrote: > >The general advise (even before this Bug) is that XFS should be used, > >which is not affected with this problem (yet?). > > Hmm, well, always depends on the workload. XFS won't suffer from this collision bug, for 2 reasons. The first is that XFS uses a virtual mapping for directory data and uses an encoded index into that virtual mapping as the cookie data. You can't have 2 entries at the same index, so you cannot get cookie collisions. The second is that the virtual mapping is for a 32GB data segment, (2^35 bytes) and, like so much of XFS, the cookie is made up of bitfields that encode a specific location. The high bits are the virtual block offset into the directory data segment, the low bits the offset into the directory block. Given that directory entries are aligned to 8 bytes, the offset into the directory block can have 3 bits compressed out and hence we end up with only 32 bits being needed to address the entire 32GB directory data segment. So, there are no collisions or 32/64 bit issues with XFS directory cookies regardless of the workload. Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 44+ messages in thread
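The encoding Dave describes can be sketched in a few lines. This is a reconstruction for illustration, not the actual XFS source; the constants follow his description of a 2^35-byte (32GB) virtual data segment and 8-byte-aligned entries:

    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    /* A cookie is the byte offset of a dirent within the 2^35-byte
     * virtual directory data segment, with the 3 always-zero
     * alignment bits shifted out, so the whole segment is
     * addressable in exactly 32 bits. */
    #define DIR_SEG_BITS  35  /* 32GB virtual directory data segment */
    #define ALIGN_LOG     3   /* dirents are 8-byte aligned */

    static uint32_t offset_to_cookie(uint64_t byte_off)
    {
        assert(byte_off < (1ULL << DIR_SEG_BITS));
        assert((byte_off & ((1ULL << ALIGN_LOG) - 1)) == 0);
        return (uint32_t)(byte_off >> ALIGN_LOG);
    }

    static uint64_t cookie_to_offset(uint32_t cookie)
    {
        return (uint64_t)cookie << ALIGN_LOG;
    }

    int main(void)
    {
        uint64_t last = (1ULL << DIR_SEG_BITS) - 8; /* final aligned slot */
        uint32_t c = offset_to_cookie(last);

        /* Every aligned offset maps to a distinct cookie and back. */
        printf("offset %#llx -> cookie %#x -> offset %#llx\n",
               (unsigned long long)last, c,
               (unsigned long long)cookie_to_offset(c));
        return 0;
    }

Because each entry occupies a unique aligned offset, two entries can never share a cookie, which is why collisions simply cannot occur in this scheme.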
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-12 20:28 regressions due to 64-bit ext4 directory cookies J. Bruce Fields 2013-02-12 20:56 ` Bernd Schubert @ 2013-02-13 4:00 ` Theodore Ts'o 2013-02-13 13:31 ` J. Bruce Fields 2013-02-13 6:56 ` Andreas Dilger 2 siblings, 1 reply; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 4:00 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-ext4, sandeen, Bernd Schubert, gluster-devel On Tue, Feb 12, 2013 at 03:28:41PM -0500, J. Bruce Fields wrote: > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > and previous patches solved problems with hash collisions in large > directories by using 64- instead of 32- bit directory hashes in some > cases. But it caused problems for users who assume directory offsets > are "small". Two cases we've run across: > > - older NFS clients: 64-bit cookies cause applications on many > older clients to fail. Is there a list of clients (and version numbers) which are having problems? > A "no_64bit_cookies" export option would provide a workaround for NFS > servers with older NFS clients, but not for applications like gluster. Why isn't it sufficient for gluster? Are they doing something horrible such as assuming that telldir() cookies accessed from userspace are identical to NFS cookies? Or is it some other horrible abstraction violation? - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 4:00 ` Theodore Ts'o @ 2013-02-13 13:31 ` J. Bruce Fields 2013-02-13 15:14 ` Theodore Ts'o 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 13:31 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-ext4, sandeen, Bernd Schubert, gluster-devel On Tue, Feb 12, 2013 at 11:00:03PM -0500, Theodore Ts'o wrote: > On Tue, Feb 12, 2013 at 03:28:41PM -0500, J. Bruce Fields wrote: > > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > > and previous patches solved problems with hash collisions in large > > directories by using 64- instead of 32- bit directory hashes in some > > cases. But it caused problems for users who assume directory offsets > > are "small". Two cases we've run across: > > > > - older NFS clients: 64-bit cookies cause applications on many > > older clients to fail. > > Is there a list of clients (and version numbers) which are having > problems? I've seen complaints about Solaris, AIX, and HP-UX clients. I don't have version numbers. It's possible that this is a problem with their latest versions, so I probably shouldn't have said "older" above. > > A "no_64bit_cookies" export option would provide a workaround for NFS > > servers with older NFS clients, but not for applications like gluster. > > Why isn't it sufficient for gluster? Are they doing something > horrible such as assuming that telldir() cookies accessed from > userspace are identical to NFS cookies? Or is it some other horrible > abstraction violation? They're assuming they can take the high bits of the cookie for their own use. (In more detail: they're spreading a single directory across multiple nodes, and encoding a node ID into the cookie they return, so they can tell which node the cookie came from when they get it back.) That works if you assume the cookie is an "offset" bounded above by some measure of the directory size, hence unlikely to ever use the high bits.... --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 13:31 ` J. Bruce Fields @ 2013-02-13 15:14 ` Theodore Ts'o 2013-02-13 15:19 ` J. Bruce Fields 0 siblings, 1 reply; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 15:14 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-ext4, sandeen, Bernd Schubert, gluster-devel On Wed, Feb 13, 2013 at 08:31:31AM -0500, J. Bruce Fields wrote: > They're assuming they can take the high bits of the cookie for their own > use. > > (In more detail: they're spreading a single directory across multiple > nodes, and encoding a node ID into the cookie they return, so they can > tell which node the cookie came from when they get it back.) > > That works if you assume the cookie is an "offset" bounded above by some > measure of the directory size, hence unlikely to ever use the high > bits.... Right, but why wouldn't an NFS export option solve the problem for gluster? Basically, it would be nice if we did not have to degrade locally running userspace applications by globally turning off 64-bit telldir cookies just because there are some broken cluster file systems and NFSv3 clients out there. And if we are only turning off 64-bit cookies for NFS, wouldn't it make sense to make this an NFS export option, as opposed to a mount option? Regards, - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 15:14 ` Theodore Ts'o @ 2013-02-13 15:19 ` J. Bruce Fields 2013-02-13 15:36 ` Theodore Ts'o 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 15:19 UTC (permalink / raw) To: Theodore Ts'o; +Cc: linux-ext4, sandeen, Bernd Schubert, gluster-devel On Wed, Feb 13, 2013 at 10:14:55AM -0500, Theodore Ts'o wrote: > On Wed, Feb 13, 2013 at 08:31:31AM -0500, J. Bruce Fields wrote: > > They're assuming they can take the high bits of the cookie for their own > > use. > > > > (In more detail: they're spreading a single directory across multiple > > nodes, and encoding a node ID into the cookie they return, so they can > > tell which node the cookie came from when they get it back.) > > > > That works if you assume the cookie is an "offset" bounded above by some > > measure of the directory size, hence unlikely to ever use the high > > bits.... > > Right, but why wouldn't a nfs export option solave the problem for > gluster? No, gluster is running on ext4 directly. > Basically, it would be nice if we did not have to degrade locally > running userspace applications by globally turning off 64-bit telldir > cookies just because there are some broken cluster file systems and > nfsv3 clients out there. And if we are only turning off 64-bit > cookies for NFS, wouldn't it make sense to make this be a NFS export > option, as opposed to a mount option? Right, the problem is that from ext4's point of view gluster is just another userspace application. (And my worry of course is that there may be others. Samba would be another one to check.) --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 15:19 ` J. Bruce Fields @ 2013-02-13 15:36 ` Theodore Ts'o [not found] ` <20130213153654.GC17431-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 15:36 UTC (permalink / raw) To: J. Bruce Fields; +Cc: linux-ext4, sandeen, Bernd Schubert, gluster-devel On Wed, Feb 13, 2013 at 10:19:53AM -0500, J. Bruce Fields wrote: > > > (In more detail: they're spreading a single directory across multiple > > > nodes, and encoding a node ID into the cookie they return, so they can > > > tell which node the cookie came from when they get it back.) > > > > > > That works if you assume the cookie is an "offset" bounded above by some > > > measure of the directory size, hence unlikely to ever use the high > > > bits.... > > > > Right, but why wouldn't a nfs export option solave the problem for > > gluster? > > No, gluster is running on ext4 directly. OK, so let me see if I can get this straight. Each local gluster node is running a userspace NFS server, right? Because if it were running a kernel-side NFS server, it would be sufficient to use an NFS export option. A client which mounts a "gluster file system" is also doing this via NFSv3, right? Or are they using their own protocol? If they are using their own protocol, why can't they encode the node ID somewhere else? So is this a correct picture of what is going on:

                                     /------ GFS Storage
                                    /        Server #1
 GFS Cluster   NFS V3   GFS Cluster --  NFS v3
  Client    <---------> Frontend Server ---------- GFS Storage
                                    --             Server #2
                                    \
                                     \------ GFS Storage
                                             Server #3

And the reason why it needs to use the high bits is because when it needs to coalesce the results from each GFS Storage Server to the GFS Cluster client? The other thing that I'd note is that the readdir cookie has been 64-bit since NFSv3, which was released in June ***1995***. And the explicit, stated purpose of making it be a 64-bit value (as stated in RFC 1813) was to reduce interoperability problems. If that were the case, are you telling me that Sun (who has traditionally been pretty good worrying about interoperability concerns, and in fact employed the editors of RFC 1813) didn't get this right? This seems quite.... surprising to me. I thought this was the whole point of the various NFS interoperability testing done at Connectathon, for which Sun was a major sponsor?!? No one noticed?!? - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130213153654.GC17431-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> @ 2013-02-13 16:20 ` J. Bruce Fields [not found] ` <20130213162059.GL14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 16:20 UTC (permalink / raw) To: Theodore Ts'o Cc: Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A Oops, probably should have cc'd linux-nfs. On Wed, Feb 13, 2013 at 10:36:54AM -0500, Theodore Ts'o wrote: > On Wed, Feb 13, 2013 at 10:19:53AM -0500, J. Bruce Fields wrote: > > > > (In more detail: they're spreading a single directory across multiple > > > > nodes, and encoding a node ID into the cookie they return, so they can > > > > tell which node the cookie came from when they get it back.) > > > > > > > > That works if you assume the cookie is an "offset" bounded above by some > > > > measure of the directory size, hence unlikely to ever use the high > > > > bits.... > > > > > > Right, but why wouldn't a nfs export option solave the problem for > > > gluster? > > > > No, gluster is running on ext4 directly. > > OK, so let me see if I can get this straight. Each local gluster node > is running a userspace NFS server, right? My understanding is that only one frontend server is running the server. So in your picture below, "NFS v3" should be some internal gluster protocol:

                                     /------ GFS Storage
                                    /        Server #1
 GFS Cluster   NFS V3   GFS Cluster --  gluster protocol
  Client    <---------> Frontend Server ---------- GFS Storage
                                    --             Server #2
                                    \
                                     \------ GFS Storage
                                             Server #3

That frontend server gets a readdir request for a directory which is stored across several of the storage servers. It has to return a cookie. It will get that cookie back from the client at some unknown later time (possibly after the server has rebooted). So their solution is to return a cookie from one of the storage servers, plus some kind of node id in the top bits so they can remember which server it came from. (I don't know much about gluster, but I think that's the basic idea.) I've assumed that users of directory cookies should treat them as opaque, so I don't think what gluster is doing is correct. But on the other hand they are defined as integers and described as offsets here and there. And I can't actually think of anything else that would work, short of gluster generating and storing its own cookies. > Because if it were running > a kernel-side NFS server, it would be sufficient to use an nfs export > option. > > A client which mounts a "gluster file system" is also doing this via > NFSv3, right? Or are they using their own protocol? If they are > using their own protocol, why can't they encode the node ID somewhere > else? > > So this a correct picture of what is going on: > >
>                                      /------ GFS Storage
>                                     /        Server #1
>  GFS Cluster   NFS V3   GFS Cluster --  NFS v3
>   Client    <---------> Frontend Server ---------- GFS Storage
>                                     --             Server #2
>                                     \
>                                      \------ GFS Storage
>                                              Server #3
> > And the reason why it needs to use the high bits is because when it > needs to coalesce the results from each GFS Storage Server to the GFS > Cluster client? > > The other thing that I'd note is that the readdir cookie has been > 64-bit since NFSv3, which was released in June ***1995***. And the > explicit, stated purpose of making it be a 64-bit value (as stated in > RFC 1813) was to reduce interoperability problems.
If that were the > case, are you telling me that Sun (who has traditionally been pretty > good worrying about interoperability concerns, and in fact employed > the editors of RFC 1813) didn't get this right? This seems > quite.... surprising to me. > > I thought this was the whole point of the various NFS interoperability > testing done at Connectathon, for which Sun was a major sponsor?!? No > one noticed?!? Beats me. But it's not necessarily easy to replace clients running legacy applications, so we're stuck working with the clients we have.... The linux client does remap the server-provided cookies to small integers, I believe exactly because older applications had trouble with servers returning "large" cookies. So presumably ext4-exporting-Linux servers aren't the first to do this. I don't know which client versions are affected--Connectathon's next week and I'll talk to people and make sure there's an ext4 export with this turned on to test against. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
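The client-side remapping Bruce refers to can be pictured as a per-directory translation table: userspace sees small, dense offsets, while the raw 64-bit server cookies stay internal to the client. A simplified illustration of the idea (not the actual Linux NFS client code):

    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>

    /* Per-directory table: index i holds the raw 64-bit cookie the
     * server returned for entry i. Applications only ever see i. */
    struct cookie_map {
        uint64_t *cookies;
        size_t count, cap;
    };

    /* Record a raw server cookie; return the small offset that
     * userspace sees as the telldir()/d_off value. */
    static long cookie_map_add(struct cookie_map *m, uint64_t raw)
    {
        if (m->count == m->cap) {
            m->cap = m->cap ? 2 * m->cap : 64;
            m->cookies = realloc(m->cookies, m->cap * sizeof(*m->cookies));
            if (!m->cookies)
                abort();
        }
        m->cookies[m->count] = raw;
        return (long)m->count++;
    }

    int main(void)
    {
        struct cookie_map m = { 0 };
        /* Two "large" server cookies, e.g. 63-bit ext4 hashes: */
        long a = cookie_map_add(&m, 0x7abc123456789defULL);
        long b = cookie_map_add(&m, 0x7fedcba987654321ULL);

        /* telldir() can hand out 0 and 1; a later seekdir(n) resumes
         * the READDIR from m.cookies[n], the raw server cookie. */
        printf("offsets %ld, %ld -> raw cookies %#llx, %#llx\n",
               a, b, (unsigned long long)m.cookies[a],
               (unsigned long long)m.cookies[b]);
        free(m.cookies);
        return 0;
    }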
[parent not found: <20130213162059.GL14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130213162059.GL14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2013-02-13 16:43 ` Myklebust, Trond 2013-02-13 21:33 ` J. Bruce Fields 2013-02-13 21:21 ` Anand Avati 1 sibling, 1 reply; 44+ messages in thread From: Myklebust, Trond @ 2013-02-13 16:43 UTC (permalink / raw) To: J. Bruce Fields Cc: Theodore Ts'o, linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, sandeen-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Bernd Schubert, gluster-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On Wed, 2013-02-13 at 11:20 -0500, J. Bruce Fields wrote: > Oops, probably should have cc'd linux-nfs. > > On Wed, Feb 13, 2013 at 10:36:54AM -0500, Theodore Ts'o wrote: > > On Wed, Feb 13, 2013 at 10:19:53AM -0500, J. Bruce Fields wrote: > > > > > (In more detail: they're spreading a single directory across multiple > > > > > nodes, and encoding a node ID into the cookie they return, so they can > > > > > tell which node the cookie came from when they get it back.) > > > > > > > > > > That works if you assume the cookie is an "offset" bounded above by some > > > > > measure of the directory size, hence unlikely to ever use the high > > > > > bits.... > > > > > > > > Right, but why wouldn't a nfs export option solave the problem for > > > > gluster? > > > > > > No, gluster is running on ext4 directly. > > > > OK, so let me see if I can get this straight. Each local gluster node > > is running a userspace NFS server, right? > > My understanding is that only one frontend server is running the server. > So in your picture below, "NFS v3" should be some internal gluster > protocol: > > > /------ GFS Storage > / Server #1 > GFS Cluster NFS V3 GFS Cluster -- gluster protocol > Client <---------> Frontend Server ---------- GFS Storage > -- Server #2 > \ > \------ GFS Storage > Server #3 > > > That frontend server gets a readdir request for a directory which is > stored across several of the storage servers. It has to return a > cookie. It will get that cookie back from the client at some unknown > later time (possibly after the server has rebooted). So their solution > is to return a cookie from one of the storage servers, plus some kind of > node id in the top bits so they can remember which server it came from. > > (I don't know much about gluster, but I think that's the basic idea.) > > I've assumed that users of directory cookies should treat them as > opaque, so I don't think what gluster is doing is correct. But on the > other hand they are defined as integers and described as offsets here > and there. And I can't actually think of anything else that would work, > short of gluster generating and storing its own cookies. > > > Because if it were running > > a kernel-side NFS server, it would be sufficient to use an nfs export > > option. > > > > A client which mounts a "gluster file system" is also doing this via > > NFSv3, right? Or are they using their own protocol? If they are > > using their own protocol, why can't they encode the node ID somewhere > > else? > > > > So this a correct picture of what is going on: > > > > /------ GFS Storage > > / Server #1 > > GFS Cluster NFS V3 GFS Cluster -- NFS v3 > > Client <---------> Frontend Server ---------- GFS Storage > > -- Server #2 > > \ > > \------ GFS Storage > > Server #3 > > > > > > And the reason why it needs to use the high bits is because when it > > needs to coalesce the results from each GFS Storage Server to the GFS > > Cluster client? 
> > > > The other thing that I'd note is that the readdir cookie has been > > 64-bit since NFSv3, which was released in June ***1995***. And the > > explicit, stated purpose of making it be a 64-bit value (as stated in > > RFC 1813) was to reduce interoperability problems. If that were the > > case, are you telling me that Sun (who has traditionally been pretty > > good worrying about interoperability concerns, and in fact employed > > the editors of RFC 1813) didn't get this right? This seems > > quite.... surprising to me. > > > > I thought this was the whole point of the various NFS interoperability > > testing done at Connectathon, for which Sun was a major sponsor?!? No > > one noticed?!? > > Beats me. But it's not necessarily easy to replace clients running > legacy applications, so we're stuck working with the clients we have.... > > The linux client does remap the server-provided cookies to small > integers, I believe exactly because older applications had trouble with > servers returning "large" cookies. So presumably ext4-exporting-Linux > servers aren't the first to do this. > > I don't know which client versions are affected--Connectathon's next > week and I'll talk to people and make sure there's an ext4 export with > this turned on to test against. Actually, one of the main reasons for the Linux client not exporting raw readdir cookies is because the glibc-2 folks in their infinite wisdom declared that telldir()/seekdir() use an off_t. They then went yet one further and decided to declare negative offsets to be illegal so that they could use the negative values internally in their syscall wrappers. The POSIX definition has none of the above rubbish (http://pubs.opengroup.org/onlinepubs/009695399/functions/telldir.html) and so glibc brilliantly saddled Linux with a crippled readdir implementation that is _not_ POSIX compatible. No, I'm not at all bitter... Trond -- To unsubscribe from this list: send the line "unsubscribe linux-nfs" in the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 44+ messages in thread
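For reference, the POSIX contract Trond is contrasting with glibc's off_t choice: telldir() returns a long whose only valid use is a later seekdir() on the same open directory stream, so portable code must treat it as an opaque token, never as arithmetic on an offset. A minimal round-trip that stays within the contract:

    #include <dirent.h>
    #include <stdio.h>

    int main(void)
    {
        DIR *d = opendir(".");
        if (!d)
            return 1;

        readdir(d);                 /* consume one entry */
        long tok = telldir(d);      /* opaque token, NOT an offset */

        struct dirent *e = readdir(d);
        printf("next after token: %s\n", e ? e->d_name : "(end)");

        seekdir(d, tok);            /* only valid use: same DIR, same token */
        e = readdir(d);
        printf("after seekdir:    %s\n", e ? e->d_name : "(end)");

        closedir(d);
        return 0;
    }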
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 16:43 ` Myklebust, Trond @ 2013-02-13 21:33 ` J. Bruce Fields 2013-02-14 3:59 ` Myklebust, Trond 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 21:33 UTC (permalink / raw) To: Myklebust, Trond Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, sandeen@redhat.com, Bernd Schubert, gluster-devel@nongnu.org, linux-nfs@vger.kernel.org On Wed, Feb 13, 2013 at 04:43:05PM +0000, Myklebust, Trond wrote: > On Wed, 2013-02-13 at 11:20 -0500, J. Bruce Fields wrote: > > Oops, probably should have cc'd linux-nfs. > > > > On Wed, Feb 13, 2013 at 10:36:54AM -0500, Theodore Ts'o wrote: > > > The other thing that I'd note is that the readdir cookie has been > > > 64-bit since NFSv3, which was released in June ***1995***. And the > > > explicit, stated purpose of making it be a 64-bit value (as stated in > > > RFC 1813) was to reduce interoperability problems. If that were the > > > case, are you telling me that Sun (who has traditionally been pretty > > > good worrying about interoperability concerns, and in fact employed > > > the editors of RFC 1813) didn't get this right? This seems > > > quite.... surprising to me. > > > > > > I thought this was the whole point of the various NFS interoperability > > > testing done at Connectathon, for which Sun was a major sponsor?!? No > > > one noticed?!? > > > > Beats me. But it's not necessarily easy to replace clients running > > legacy applications, so we're stuck working with the clients we have.... > > > > The linux client does remap the server-provided cookies to small > > integers, I believe exactly because older applications had trouble with > > servers returning "large" cookies. So presumably ext4-exporting-Linux > > servers aren't the first to do this. > > > > I don't know which client versions are affected--Connectathon's next > > week and I'll talk to people and make sure there's an ext4 export with > > this turned on to test against. > > Actually, one of the main reasons for the Linux client not exporting raw > readdir cookies is because the glibc-2 folks in their infinite wisdom > declared that telldir()/seekdir() use an off_t. They then went yet one > further and decided to declare negative offsets to be illegal so that > they could use the negative values internally in their syscall wrappers. > > The POSIX definition has none of the above rubbish > (http://pubs.opengroup.org/onlinepubs/009695399/functions/telldir.html) > and so glibc brilliantly saddled Linux with a crippled readdir > implementation that is _not_ POSIX compatible. > > No, I'm not at all bitter... Oh, right, I knew I'd forgotten part of the story.... But then you must have actually been testing against servers that were using that 32nd bit? I think ext4 actually only uses 31 bits even in the 32-bit case. And for a server that was literally using an offset inside a directory file, that would be a colossal directory. So I'm wondering how you ran across it. Partly just pure curiosity. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* RE: regressions due to 64-bit ext4 directory cookies 2013-02-13 21:33 ` J. Bruce Fields @ 2013-02-14 3:59 ` Myklebust, Trond [not found] ` <4FA345DA4F4AE44899BD2B03EEEC2FA91F3D6BAB-UCI0kNdgLrHLJmV3vhxcH3OR4cbS7gtM96Bgd4bDwmQ@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Myklebust, Trond @ 2013-02-14 3:59 UTC (permalink / raw) To: J. Bruce Fields Cc: Theodore Ts'o, linux-ext4@vger.kernel.org, sandeen@redhat.com, Bernd Schubert, gluster-devel@nongnu.org, linux-nfs@vger.kernel.org > -----Original Message----- > From: J. Bruce Fields [mailto:bfields@fieldses.org] > Sent: Wednesday, February 13, 2013 4:34 PM > To: Myklebust, Trond > Cc: Theodore Ts'o; linux-ext4@vger.kernel.org; sandeen@redhat.com; > Bernd Schubert; gluster-devel@nongnu.org; linux-nfs@vger.kernel.org > Subject: Re: regressions due to 64-bit ext4 directory cookies > > On Wed, Feb 13, 2013 at 04:43:05PM +0000, Myklebust, Trond wrote: > > On Wed, 2013-02-13 at 11:20 -0500, J. Bruce Fields wrote: > > > Oops, probably should have cc'd linux-nfs. > > > > > > On Wed, Feb 13, 2013 at 10:36:54AM -0500, Theodore Ts'o wrote: > > > > The other thing that I'd note is that the readdir cookie has been > > > > 64-bit since NFSv3, which was released in June ***1995***. And > > > > the explicit, stated purpose of making it be a 64-bit value (as > > > > stated in RFC 1813) was to reduce interoperability problems. If > > > > that were the case, are you telling me that Sun (who has > > > > traditionally been pretty good worrying about interoperability > > > > concerns, and in fact employed the editors of RFC 1813) didn't get > > > > this right? This seems quite.... surprising to me. > > > > > > > > I thought this was the whole point of the various NFS > > > > interoperability testing done at Connectathon, for which Sun was a > > > > major sponsor?!? No one noticed?!? > > > > > > Beats me. But it's not necessarily easy to replace clients running > > > legacy applications, so we're stuck working with the clients we have.... > > > > > > The linux client does remap the server-provided cookies to small > > > integers, I believe exactly because older applications had trouble > > > with servers returning "large" cookies. So presumably > > > ext4-exporting-Linux servers aren't the first to do this. > > > > > > I don't know which client versions are affected--Connectathon's next > > > week and I'll talk to people and make sure there's an ext4 export > > > with this turned on to test against. > > > > Actually, one of the main reasons for the Linux client not exporting > > raw readdir cookies is because the glibc-2 folks in their infinite > > wisdom declared that telldir()/seekdir() use an off_t. They then went > > yet one further and decided to declare negative offsets to be illegal > > so that they could use the negative values internally in their syscall > wrappers. > > > > The POSIX definition has none of the above rubbish > > (http://pubs.opengroup.org/onlinepubs/009695399/functions/telldir.html > > ) and so glibc brilliantly saddled Linux with a crippled readdir > > implementation that is _not_ POSIX compatible. > > > > No, I'm not at all bitter... > > Oh, right, I knew I'd forgotten part of the story.... > > But then you must have actually been testing against servers that were using > that 32nd bit? > > I think ext4 actually only uses 31 bits even in the 32-bit case. And for a server > that was literally using an offset inside a directory file, that would be a > colossal directory. > > So I'm wondering how you ran across it. 
> > Partly just pure curiosity. IIRC, XFS on IRIX used 0xFFFFF as the readdir eof marker, which caused us to generate an EIO... Cheers Trond ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <4FA345DA4F4AE44899BD2B03EEEC2FA91F3D6BAB-UCI0kNdgLrHLJmV3vhxcH3OR4cbS7gtM96Bgd4bDwmQ@public.gmane.org> @ 2013-02-14 5:45 ` Dave Chinner 0 siblings, 0 replies; 44+ messages in thread From: Dave Chinner @ 2013-02-14 5:45 UTC (permalink / raw) To: Myklebust, Trond Cc: J. Bruce Fields, Theodore Ts'o, linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, sandeen-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org, Bernd Schubert, gluster-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org, linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org On Thu, Feb 14, 2013 at 03:59:17AM +0000, Myklebust, Trond wrote: > > -----Original Message----- > > From: J. Bruce Fields [mailto:bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org] > > Sent: Wednesday, February 13, 2013 4:34 PM > > To: Myklebust, Trond > > Cc: Theodore Ts'o; linux-ext4-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; sandeen-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org; > > Bernd Schubert; gluster-devel-qX2TKyscuCcdnm+yROfE0A@public.gmane.org; linux-nfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org > > Subject: Re: regressions due to 64-bit ext4 directory cookies > > > > On Wed, Feb 13, 2013 at 04:43:05PM +0000, Myklebust, Trond wrote: > > > On Wed, 2013-02-13 at 11:20 -0500, J. Bruce Fields wrote: > > > > Oops, probably should have cc'd linux-nfs. > > > > > > > > On Wed, Feb 13, 2013 at 10:36:54AM -0500, Theodore Ts'o wrote: > > > > > The other thing that I'd note is that the readdir cookie has been > > > > > 64-bit since NFSv3, which was released in June ***1995***. And > > > > > the explicit, stated purpose of making it be a 64-bit value (as > > > > > stated in RFC 1813) was to reduce interoperability problems. If > > > > > that were the case, are you telling me that Sun (who has > > > > > traditionally been pretty good worrying about interoperability > > > > > concerns, and in fact employed the editors of RFC 1813) didn't get > > > > > this right? This seems quite.... surprising to me. > > > > > > > > > > I thought this was the whole point of the various NFS > > > > > interoperability testing done at Connectathon, for which Sun was a > > > > > major sponsor?!? No one noticed?!? > > > > > > > > Beats me. But it's not necessarily easy to replace clients running > > > > legacy applications, so we're stuck working with the clients we have.... > > > > > > > > The linux client does remap the server-provided cookies to small > > > > integers, I believe exactly because older applications had trouble > > > > with servers returning "large" cookies. So presumably > > > > ext4-exporting-Linux servers aren't the first to do this. > > > > > > > > I don't know which client versions are affected--Connectathon's next > > > > week and I'll talk to people and make sure there's an ext4 export > > > > with this turned on to test against. > > > > > > Actually, one of the main reasons for the Linux client not exporting > > > raw readdir cookies is because the glibc-2 folks in their infinite > > > wisdom declared that telldir()/seekdir() use an off_t. They then went > > > yet one further and decided to declare negative offsets to be illegal > > > so that they could use the negative values internally in their syscall > > wrappers. > > > > > > The POSIX definition has none of the above rubbish > > > (http://pubs.opengroup.org/onlinepubs/009695399/functions/telldir.html > > > ) and so glibc brilliantly saddled Linux with a crippled readdir > > > implementation that is _not_ POSIX compatible. > > > > > > No, I'm not at all bitter... 
> > Oh, right, I knew I'd forgotten part of the story.... > > > > But then you must have actually been testing against servers that were using > > that 32nd bit? > > > > I think ext4 actually only uses 31 bits even in the 32-bit case. And for a server > > that was literally using an offset inside a directory file, that would be a > > colossal directory. That's exactly what XFS directory cookies are - a direct encoding of the dirent offset into the directory file. Which means an overflow would occur at 16GB of directory data for XFS. That is in the realm of several hundreds of millions of files in a single directory, which I have seen done before.... > > So I'm wondering how you ran across it. > > > > Partly just pure curiosity. > > IIRC, XFS on IRIX used 0xFFFFF as the readdir eof marker, which caused us to generate an EIO... And this discussion explains the magic 0x7fffffff offset mask in the linux XFS readdir code. I've been trying to find out for years exactly why that was necessary, and now I know. I probably should write a patch that makes it a "non-magic" number and remove it completely for 64 bit platforms before I forget again... Cheers, Dave. -- Dave Chinner david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org ^ permalink raw reply [flat|nested] 44+ messages in thread
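Dave's 16GB figure and the 0x7fffffff mask fit together arithmetically: 31 usable cookie bits (the sign bit being off-limits, per the glibc discussion above) times 8 bytes of directory data per cookie step gives 2^34 bytes. A quick check:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        /* 31 usable cookie bits: the sign bit of a 32-bit off_t is
         * off-limits, hence the 0x7fffffff mask Dave mentions. */
        uint64_t max_cookie = 0x7fffffff;

        /* Each cookie addresses an 8-byte-aligned slot (3 bits
         * shifted out), so the addressable directory data is: */
        uint64_t max_bytes = (max_cookie + 1) << 3;

        printf("addressable directory data: %llu GB\n",
               (unsigned long long)(max_bytes >> 30));  /* -> 16 GB */
        return 0;
    }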
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130213162059.GL14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 2013-02-13 16:43 ` Myklebust, Trond @ 2013-02-13 21:21 ` Anand Avati [not found] ` <CAFboF2wXvP+vttiff8iRE9rAgvV8UWGbFprgVp8p7kE43TU=PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-02-13 21:21 UTC (permalink / raw) To: J. Bruce Fields Cc: sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A > My understanding is that only one frontend server is running the server. > So in your picture below, "NFS v3" should be some internal gluster > protocol: > >
>                                      /------ GFS Storage
>                                     /        Server #1
>  GFS Cluster   NFS V3   GFS Cluster --  gluster protocol
>   Client    <---------> Frontend Server ---------- GFS Storage
>                                     --             Server #2
>                                     \
>                                      \------ GFS Storage
>                                              Server #3
> > That frontend server gets a readdir request for a directory which is > stored across several of the storage servers. It has to return a > cookie. It will get that cookie back from the client at some unknown > later time (possibly after the server has rebooted). So their solution > is to return a cookie from one of the storage servers, plus some kind of > node id in the top bits so they can remember which server it came from. > > (I don't know much about gluster, but I think that's the basic idea.) > > I've assumed that users of directory cookies should treat them as > opaque, so I don't think what gluster is doing is correct. NFS uses the term cookies, while man pages of readdir/seekdir/telldir call them "offsets". RFC 1813 only talks about communication between an NFS server and an NFS client. While knfsd performs a trivial 1:1 mapping between d_off "offsets" and these "opaque cookies", the "gluster" issue at hand is that it made assumptions about the nature of these "offsets" (that they are representing some kind of true distance/offset and therefore fall within some kind of bounded magnitude -- somewhat like the inode numbering), and performs a transformation (instead of a 1:1 trivial mapping) like this:

	final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx

thereby utilizing a few more top bits, and gaining the ability to perform a reverse transformation to "continue" from a previous location. As you can see, final_d_off now overflows for very large values of ext4_d_off. This final_d_off is used both as the cookie in the gluster-NFS (userspace) server, and also as the d_off entry parameter in the FUSE readdir reply. The gluster / ext4 d_off issue is not limited to gluster-NFS, but also exists in the FUSE client where NFS is completely out of the picture. You are probably right in that gluster has made different assumptions about the "nature" of values filled in d_off fields. But the language used in all man pages makes you believe they were supposed to be numbers representing some kind of distance/offset (with bounded magnitude), and not a "random" number. This had worked (accidentally, you may call it) on all filesystems including ext4, as expected. But on kernel upgrade, only ext4-backed deployments started giving problems and we have been advising our users to either downgrade their kernel or use a different filesystem (we really do not want to force them into making a choice of one backend filesystem vs another.)
You can always say "this is your fault" for interpreting the man pages differently and punish us by leaving things as they are (and unfortunately a big chunk of users who want both ext4 and gluster jeopardized). Or you can be kind, generous and considerate to the legacy apps and users (of which gluster is only a subset) and only provide a mount option to control the large d_off behavior. Thanks! Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
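The failure mode Avati describes can be demonstrated in a few lines. In this sketch MAX_SERVERS is an illustrative constant, not gluster's real value; the point is only that the multiply shifts the backend d_off toward the high bits, so a 63-bit ext4 hash cookie no longer fits in 64 bits:

    #include <stdint.h>
    #include <stdio.h>

    #define MAX_SERVERS 64  /* illustrative; not gluster's actual constant */

    /* The transformation from the mail above: scale the backend d_off
     * and pack a server index into the low bits. */
    static uint64_t encode(uint64_t ext4_d_off, uint64_t server_idx)
    {
        return ext4_d_off * MAX_SERVERS + server_idx;
    }

    static void decode(uint64_t final, uint64_t *ext4_d_off, uint64_t *idx)
    {
        *ext4_d_off = final / MAX_SERVERS;
        *idx = final % MAX_SERVERS;
    }

    int main(void)
    {
        uint64_t d_off, idx;

        /* A 31-bit hash cookie survives the round trip... */
        decode(encode(0x7fffffffULL, 3), &d_off, &idx);
        printf("31-bit: d_off=%#llx idx=%llu\n",
               (unsigned long long)d_off, (unsigned long long)idx);

        /* ...but a 63-bit hash cookie wraps 2^64 when scaled, so the
         * recovered d_off is garbage and the readdir position is lost. */
        decode(encode(0x7fffffffffffffffULL, 3), &d_off, &idx);
        printf("63-bit: d_off=%#llx (wrapped, != original) idx=%llu\n",
               (unsigned long long)d_off, (unsigned long long)idx);
        return 0;
    }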
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2wXvP+vttiff8iRE9rAgvV8UWGbFprgVp8p7kE43TU=PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-13 22:20 ` Theodore Ts'o 2013-02-13 22:41 ` J. Bruce Fields [not found] ` <20130213222052.GD5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 0 siblings, 2 replies; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 22:20 UTC (permalink / raw) To: Anand Avati Cc: J. Bruce Fields, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 01:21:06PM -0800, Anand Avati wrote: > > NFS uses the term cookies, while man pages of readdir/seekdir/telldir calls > them "offsets". Unfortunately, telldir and seekdir are part of the "unspeakable Unix design horrors" which has been with us for 25+ years. To quote from the rationale section from the Single Unix Specification v3 (there is similar language in the Posix spec). The original standard developers perceived that there were restrictions on the use of the seekdir() and telldir() functions related to implementation details, and for that reason these functions need not be supported on all POSIX-conforming systems. They are required on implementations supporting the XSI extension. One of the perceived problems of implementation is that returning to a given point in a directory is quite difficult to describe formally, in spite of its intuitive appeal, when systems that use B-trees, hashing functions, or other similar mechanisms to order their directories are considered. The definition of seekdir() and telldir() does not specify whether, when using these interfaces, a given directory entry will be seen at all, or more than once. On systems not supporting these functions, their capability can sometimes be accomplished by saving a filename found by readdir() and later using rewinddir() and a loop on readdir() to relocate the position from which the filename was saved. Telldir() and seekdir() are basically implementation horrors for any file system that is using anything other than a simple array of directory entries ala the V7 Unix file system or the BSD FFS. For any file system which is using a more advanced data structure, like b-trees hash trees, etc, there **can't** possibly be a "offset" into a readdir stream. This is why ext3/ext4 uses a telldir cookie, and it's why the NFS specifications refer to it as a cookie. If you are using a modern file system, it can't possibly be an offset. > You can always say "this is your fault" for interpreting the man pages > differently and punish us by leaving things as they are (and unfortunately > a big chunk of users who want both ext4 and gluster jeapordized). Or you > can be kind, generous and be considerate to the legacy apps and users (of > which gluster is only a subset) and only provide a mount option to control > the large d_off behavior. The problem is that we made this change to fix real problems that take place when you have hash collisions. And if you are using a 31-bit cookie, the birthday paradox means that by the time you have a directory with 2**16 entries, the chances of hash collisions are very real. This could result in NFS readdir getting stuck in loops where it constantly gets the file "foo.c", and then when it passes the 31-bit cookie for "bar.c", since there is a hash collision, it gets "foo.c" again, and the readdir never terminates. 
So the problem is that you are effectively asking me to penalize well-behaved programs that don't try to steal bits from the top of the telldir cookie, just for the benefit of gluster. What if we have an ioctl or a process personality flag where a broken application can tell the file system "I'm broken, please give me a degraded telldir/seekdir cookie"? That way we don't penalize programs that are doing the right thing, while providing some accommodation for programs that are abusing the telldir cookie. - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
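As a back-of-the-envelope check on those collision odds (an illustrative aside, not part of the original exchange): the usual birthday approximation 1 - exp(-n^2/2d), with n = 2**16 entries hashed into d = 2**31 possible values, gives roughly a 63% chance of at least one collision in such a directory.

    #include <math.h>
    #include <stdio.h>

    /* Birthday-bound estimate: probability of at least one collision
     * among n directory entries hashed into d possible 31-bit values. */
    int main(void)
    {
            double n = 65536.0;        /* 2**16 entries */
            double d = 2147483648.0;   /* 2**31 hash values */

            printf("P(collision) ~= %.2f\n", 1.0 - exp(-n * n / (2.0 * d)));
            return 0;                  /* prints P(collision) ~= 0.63 */
    }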
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-13 22:20 ` [Gluster-devel] " Theodore Ts'o @ 2013-02-13 22:41 ` J. Bruce Fields [not found] ` <20130213224141.GU14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> [not found] ` <20130213222052.GD5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 22:41 UTC (permalink / raw) To: Theodore Ts'o Cc: Anand Avati, Bernd Schubert, sandeen, linux-nfs, linux-ext4, gluster-devel On Wed, Feb 13, 2013 at 05:20:52PM -0500, Theodore Ts'o wrote: > On Wed, Feb 13, 2013 at 01:21:06PM -0800, Anand Avati wrote: > > > > NFS uses the term cookies, while man pages of readdir/seekdir/telldir calls > > them "offsets". > > Unfortunately, telldir and seekdir are part of the "unspeakable Unix > design horrors" which has been with us for 25+ years. To quote from > the rationale section from the Single Unix Specification v3 (there is > similar language in the Posix spec). > > The original standard developers perceived that there were > restrictions on the use of the seekdir() and telldir() functions > related to implementation details, and for that reason these > functions need not be supported on all POSIX-conforming > systems. They are required on implementations supporting the XSI > extension. > > One of the perceived problems of implementation is that returning > to a given point in a directory is quite difficult to describe > formally, in spite of its intuitive appeal, when systems that use > B-trees, hashing functions, or other similar mechanisms to order > their directories are considered. The definition of seekdir() and > telldir() does not specify whether, when using these interfaces, a > given directory entry will be seen at all, or more than once. > > On systems not supporting these functions, their capability can > sometimes be accomplished by saving a filename found by readdir() > and later using rewinddir() and a loop on readdir() to relocate > the position from which the filename was saved. > > > Telldir() and seekdir() are basically implementation horrors for any > file system that is using anything other than a simple array of > directory entries ala the V7 Unix file system or the BSD FFS. For any > file system which is using a more advanced data structure, like > b-trees hash trees, etc, there **can't** possibly be a "offset" into a > readdir stream. This is why ext3/ext4 uses a telldir cookie, and it's > why the NFS specifications refer to it as a cookie. If you are using > a modern file system, it can't possibly be an offset. > > > You can always say "this is your fault" for interpreting the man pages > > differently and punish us by leaving things as they are (and unfortunately > > a big chunk of users who want both ext4 and gluster jeapordized). Or you > > can be kind, generous and be considerate to the legacy apps and users (of > > which gluster is only a subset) and only provide a mount option to control > > the large d_off behavior. > > The problem is that we made this change to fix real problems that take > place when you have hash collisions. And if you are using a 31-bit > cookie, the birthday paradox means that by the time you have a > directory with 2**16 entries, the chances of hash collisions are very > real. 
This could result in NFS readdir getting stuck in loops where > it constantly gets the file "foo.c", and then when it passes the > 31-bit cookie for "bar.c", since there is a hash collision, it gets > "foo.c" again, and the readdir never terminates. > > So the problem is that you are effectively asking me to penalize > well-behaved programs that don't try to steel bits from the top of the > telldir cookie, just for the benefit of gluster. > > What if we have an ioctl or a process personality flag where a broken > application can tell the file system "I'm broken, please give me a > degraded telldir/seekdir cookie"? That way we don't penalize programs > that are doing the right thing, while providing some accomodation for > programs who are abusing the telldir cookie. Yeah, if there's a simple way to do that, maybe it would be worth it. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20130213224141.GU14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <20130213224141.GU14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2013-02-13 22:47 ` Theodore Ts'o [not found] ` <20130213224720.GE5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 22:47 UTC (permalink / raw) To: J. Bruce Fields Cc: Anand Avati, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 05:41:41PM -0500, J. Bruce Fields wrote: > > What if we have an ioctl or a process personality flag where a broken > > application can tell the file system "I'm broken, please give me a > > degraded telldir/seekdir cookie"? That way we don't penalize programs > > that are doing the right thing, while providing some accomodation for > > programs who are abusing the telldir cookie. > > Yeah, if there's a simple way to do that, maybe it would be worth it. Doing this as an ioctl which gets called right after opendir, i.e. (ignoring error checking): DIR *dir = opendir("/foo/bar/baz"); ioctl(dirfd(dir), EXT4_IOC_DEGRADED_READDIR, 1); ... should be quite easy. It would be a very ext3/4 specific thing, though. It would be more work to get something in as a process personality flag, mostly due to the politics of assigning a bit out of the bitfield. - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
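Fleshed out with the error handling elided above, the call sequence might look like the sketch below. EXT4_IOC_DEGRADED_READDIR is only the interface being proposed in this thread, so the ioctl number used here is a made-up stand-in, not a real kernel constant.

    #include <errno.h>
    #include <dirent.h>
    #include <sys/ioctl.h>

    /* Stand-in value: the real number would come from the proposed
     * ext4 patch, not from this sketch. */
    #define EXT4_IOC_DEGRADED_READDIR  _IO('f', 99)

    DIR *opendir_degraded(const char *path)
    {
            DIR *dir = opendir(path);

            if (!dir)
                    return NULL;
            /* Ask for 32-bit-safe cookies; tolerating ENOTTY lets the
             * same binary run on filesystems without the ioctl. */
            if (ioctl(dirfd(dir), EXT4_IOC_DEGRADED_READDIR, 1) < 0 &&
                errno != ENOTTY) {
                    int saved = errno;

                    closedir(dir);
                    errno = saved;
                    return NULL;
            }
            return dir;
    }

Treating ENOTTY as non-fatal is one possible policy; a caller that must have small cookies would instead fail hard at that point.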
[parent not found: <20130213224720.GE5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130213224720.GE5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> @ 2013-02-13 22:57 ` Anand Avati [not found] ` <CAFboF2z1akN_edrY_fT915xfehfHGioA2M=PSHv0Fp3rD-5v5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-02-13 22:57 UTC (permalink / raw) To: Theodore Ts'o Cc: Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 2:47 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: > On Wed, Feb 13, 2013 at 05:41:41PM -0500, J. Bruce Fields wrote: > > > What if we have an ioctl or a process personality flag where a broken > > > application can tell the file system "I'm broken, please give me a > > > degraded telldir/seekdir cookie"? That way we don't penalize programs > > > that are doing the right thing, while providing some accomodation for > > > programs who are abusing the telldir cookie. > > > > Yeah, if there's a simple way to do that, maybe it would be worth it. > > Doing this as an ioctl which gets called right after opendir, i.e > (ignoring error checking): > > DIR *dir = opendir("/foo/bar/baz"); > ioctl(dirfd(dir), EXT4_IOC_DEGRADED_READDIR, 1); > ... > > should be quite easy. It would be a very ext3/4 specific thing, > though. That would work, even though it would be ext3/4 specific. What is the recommended programmatic way to detect if the file is on ext3/4? We would not want to attempt that blindly on a non-ext3/4 FS, as the numerical value of EXT4_IOC_DEGRADED_READDIR might get interpreted in dangerous ways. Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
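One well-known way to answer the detection question is fstatfs(2) plus a check of f_type (a sketch; note that ext2, ext3 and ext4 all report the same 0xEF53 superblock magic, so this identifies the ext family rather than ext4 specifically):

    #include <dirent.h>
    #include <sys/vfs.h>
    #include <linux/magic.h>   /* EXT4_SUPER_MAGIC == 0xEF53 */

    /* Returns 1 if the open directory is on ext2/3/4, 0 if not,
     * -1 on error. */
    int dir_on_extfs(DIR *dir)
    {
            struct statfs sfs;

            if (fstatfs(dirfd(dir), &sfs) < 0)
                    return -1;
            return sfs.f_type == EXT4_SUPER_MAGIC;
    }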
[parent not found: <CAFboF2z1akN_edrY_fT915xfehfHGioA2M=PSHv0Fp3rD-5v5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2z1akN_edrY_fT915xfehfHGioA2M=PSHv0Fp3rD-5v5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-13 23:05 ` J. Bruce Fields [not found] ` <20130213230511.GW14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 23:05 UTC (permalink / raw) To: Anand Avati Cc: Theodore Ts'o, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 02:57:13PM -0800, Anand Avati wrote: > On Wed, Feb 13, 2013 at 2:47 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: > > > On Wed, Feb 13, 2013 at 05:41:41PM -0500, J. Bruce Fields wrote: > > > > What if we have an ioctl or a process personality flag where a broken > > > > application can tell the file system "I'm broken, please give me a > > > > degraded telldir/seekdir cookie"? That way we don't penalize programs > > > > that are doing the right thing, while providing some accomodation for > > > > programs who are abusing the telldir cookie. > > > > > > Yeah, if there's a simple way to do that, maybe it would be worth it. > > > > Doing this as an ioctl which gets called right after opendir, i.e > > (ignoring error checking): > > > > DIR *dir = opendir("/foo/bar/baz"); > > ioctl(dirfd(dir), EXT4_IOC_DEGRADED_READDIR, 1); > > ... > > > > should be quite easy. It would be a very ext3/4 specific thing, > > though. > > > That would work, even though it would be ext3/4 specific. What is the > recommended programmatic way to detect if the file is on ext3/4 -- we would > not want to attempt that blindly on a non-ext3/4 FS as the numerical value > of EXT4_IOC_DEGRADED_READDIR might get interpreted in dangerous ways? We must have been through this before, but: is the only way to generate a collision-free readdir cookie really to use a larger hash? Would it be possible to make something work like, for example, a 31-bit hash plus an offset into a hash bucket? I have trouble thinking about this, partly because I can't remember where to find the requirements for readdir on concurrently modified directories.... --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20130213230511.GW14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <20130213230511.GW14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2013-02-13 23:44 ` Theodore Ts'o [not found] ` <20130213234430.GF5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Theodore Ts'o @ 2013-02-13 23:44 UTC (permalink / raw) To: J. Bruce Fields Cc: Anand Avati, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 06:05:11PM -0500, J. Bruce Fields wrote: > > Would it be possible to make something work like, for example, a 31-bit > hash plus an offset into a hash bucket? > > I have trouble thinking about this, partly because I can't remember > where to find the requirements for readdir on concurrently modified > directories.... The requirements are that for a directory entry which has not been modified since the last opendir() or rewinddir(), readdir() must return that directory entry exactly once. For a directory entry which has been added or removed since the last opendir() or rewinddir() call, it is undefined whether the directory entry is returned once or not at all. And a rename is defined as an add/remove, so it's OK for the old filename and the new file name to appear in the readdir() stream; it would also be OK if neither appeared in the readdir() stream. The SUSv3 definition of readdir() can be found here: http://pubs.opengroup.org/onlinepubs/009695399/functions/readdir.html Note also that if you look at the SuSv3 definition of seekdir(), it explicitly states that the value returned by telldir() is not guaranteed to be valid after a rewinddir() or across another opendir(): If the value of loc was not obtained from an earlier call to telldir(), or if a call to rewinddir() occurred between the call to telldir() and the call to seekdir(), the results of subsequent calls to readdir() are unspecified. Hence, it would be legal, and arguably more correct, if we created an internal array of pointers into the directory structure, where the first call to telldir() returned 1, and the second call to telldir() returned 2, and the third call to telldir() returned 3, regardless of the position in the directory, and this number was used by seekdir() to index into the array of pointers to return the exact location in the b-tree. This would completely eliminate the possibility of hash collisions, and guarantee that readdir() would never drop or return a directory entry multiple times after seekdir(). This implementation approach would have denial-of-service potential, since each call to telldir() would potentially be allocating kernel memory, but as long as we make sure the OOM killer kills the nasty process which is calling telldir() a lot, this would probably be OK. It would also be legal to throw away this array after a call to rewinddir() or closedir(), since telldir() cookies are not guaranteed to be valid indefinitely. See: http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html I suspect this would seriously screw over Gluster, though, and this wouldn't be a solution for NFSv3, since NFS needs long-lived directory cookies, and not the short-lived cookies which is all POSIX/SuSv3 guarantees. 
Regards, - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
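A toy userspace model of that array-of-pointers scheme (nothing here is kernel code; the void pointers stand in for whatever opaque b-tree position the filesystem would really save):

    #include <stdlib.h>

    /* Toy model: each telldir() records the current (opaque) position
     * and hands back a small integer ticket; seekdir() redeems the
     * ticket to restore exactly that position, so cookies can never
     * collide.  The cost is the ever-growing saved-position array. */
    struct dir_stream {
            void  **saved;     /* saved positions, indexed by ticket - 1 */
            size_t  nsaved;
            void   *cur;       /* current opaque b-tree position */
    };

    static long toy_telldir(struct dir_stream *d)
    {
            void **tmp = realloc(d->saved, (d->nsaved + 1) * sizeof(*tmp));

            if (!tmp)
                    return -1;
            d->saved = tmp;
            d->saved[d->nsaved] = d->cur;
            return (long)++d->nsaved;          /* returns 1, 2, 3, ... */
    }

    static int toy_seekdir(struct dir_stream *d, long loc)
    {
            if (loc < 1 || (size_t)loc > d->nsaved)
                    return -1;         /* not from an earlier telldir() */
            d->cur = d->saved[loc - 1];
            return 0;
    }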
[parent not found: <20130213234430.GF5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130213234430.GF5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> @ 2013-02-14 0:05 ` Anand Avati [not found] ` <CAFboF2zS+YAa0uUxMFUAbqgPh3Kb4xZu40WUjLyGn8qPoP+Oyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2013-02-14 21:46 ` [Gluster-devel] " J. Bruce Fields 1 sibling, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-02-14 0:05 UTC (permalink / raw) To: Theodore Ts'o Cc: Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: > > I suspect this would seriously screw over Gluster, though, and this > wouldn't be a solution for NFSv3, since NFS needs long-lived directory > cookies, and not the short-lived cookies which is all POSIX/SuSv3 > guarantees. > Actually this would work just fine with Gluster. Except in the case of gluster-NFS, the native client is only acting like a router/proxy of syscalls to the backend system. A directory opened by an application will have a matching directory fd opened on ext4, and readdir from an app will be translated into readdir on the matching fd on ext4. So the app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem". As long as the offs^H^H^H^H cookies do not overflow in the transformation, Gluster would not have a problem. However Gluster-NFS (and NFS in general, too) will break, as we opendir/closedir potentially on every request. Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <CAFboF2zS+YAa0uUxMFUAbqgPh3Kb4xZu40WUjLyGn8qPoP+Oyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2zS+YAa0uUxMFUAbqgPh3Kb4xZu40WUjLyGn8qPoP+Oyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-02-14 21:47 ` J. Bruce Fields 2013-03-26 15:23 ` Bernd Schubert 1 sibling, 0 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-14 21:47 UTC (permalink / raw) To: Anand Avati Cc: Theodore Ts'o, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 04:05:01PM -0800, Anand Avati wrote: > On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: > > > > I suspect this would seriously screw over Gluster, though, and this > > wouldn't be a solution for NFSv3, since NFS needs long-lived directory > > cookies, and not the short-lived cookies which is all POSIX/SuSv3 > > guarantees. > > > > Actually this would work just fine with Gluster. Except in the case of > gluster-NFS, the native client is only acting like a router/proxy of > syscalls to the backend system. A directory opened by an application will > have a matching directory fd opened on ext4, and readdir from an app will > be translated into readdir on the matching fd on ext4. So the > app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem". > As long as the offs^H^H^H^H cookies do not overflow in the transformation, > Gluster would not have a problem. > > However Gluster-NFS (and NFS in general, too) will break, as we > opendir/closedir potentially on every request. Yes. And, of course, NFS cookies live forever--we have no idea when a client will hand one back to us and expect us to do something with it. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2zS+YAa0uUxMFUAbqgPh3Kb4xZu40WUjLyGn8qPoP+Oyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 2013-02-14 21:47 ` [Gluster-devel] " J. Bruce Fields @ 2013-03-26 15:23 ` Bernd Schubert [not found] ` <5151BD5F.30607-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: Bernd Schubert @ 2013-03-26 15:23 UTC (permalink / raw) To: Anand Avati Cc: sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A Sorry for my late reply, I had been rather busy. On 02/14/2013 01:05 AM, Anand Avati wrote: > On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: >> >> I suspect this would seriously screw over Gluster, though, and this >> wouldn't be a solution for NFSv3, since NFS needs long-lived directory >> cookies, and not the short-lived cookies which is all POSIX/SuSv3 >> guarantees. >> > > Actually this would work just fine with Gluster. Except in the case of Would it really work perfectly? What about a server reboot in the middle of a readdir of a client? > gluster-NFS, the native client is only acting like a router/proxy of > syscalls to the backend system. A directory opened by an application will > have a matching directory fd opened on ext4, and readdir from an app will > be translated into readdir on the matching fd on ext4. So the > app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem". > As long as the offs^H^H^H^H cookies do not overflow in the transformation, > Gluster would not have a problem. > > However Gluster-NFS (and NFS in general, too) will break, as we > opendir/closedir potentially on every request. We haven't reached a conclusion so far, have we? What about the ioctl approach, but a bit differently? Would it work to specify the allowed upper bits for ext4 (for example, 16 additional bits) and the remaining part for gluster? One of the mails had the calculation formula: final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx But what is the value of MAX_SERVERS? Cheers, Bernd ^ permalink raw reply [flat|nested] 44+ messages in thread
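For concreteness, a sketch of the quoted transform and its inverse; max_servers is assumed to be the volume's brick count (Bernd's point is exactly that it has no fixed value), and nothing here is Gluster's actual code:

    #include <stdint.h>

    /* final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx, plus the
     * hazard the thread runs into later: a 63-bit ext4 d_off near the
     * top of its range cannot be multiplied without losing high bits. */
    static int encode_d_off(int64_t ext4_d_off, int64_t max_servers,
                            int64_t server_idx, int64_t *final_d_off)
    {
            if (ext4_d_off >= INT64_MAX / max_servers)
                    return -1;     /* conservative: would overflow */
            *final_d_off = ext4_d_off * max_servers + server_idx;
            return 0;
    }

    static void decode_d_off(int64_t final_d_off, int64_t max_servers,
                             int64_t *ext4_d_off, int64_t *server_idx)
    {
            *server_idx = final_d_off % max_servers;
            *ext4_d_off = final_d_off / max_servers;
    }

Losing the high bits in such a transform produces the kind of wraparound Jeff reports further down the thread.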
[parent not found: <5151BD5F.30607-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <5151BD5F.30607-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org> @ 2013-03-26 15:48 ` Eric Sandeen 2013-03-28 14:07 ` Theodore Ts'o 0 siblings, 1 reply; 44+ messages in thread From: Eric Sandeen @ 2013-03-26 15:48 UTC (permalink / raw) To: Bernd Schubert Cc: Anand Avati, Theodore Ts'o, J. Bruce Fields, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On 3/26/13 10:23 AM, Bernd Schubert wrote: > Sorry for my late reply, I had been rather busy. > > On 02/14/2013 01:05 AM, Anand Avati wrote: >> On Wed, Feb 13, 2013 at 3:44 PM, Theodore Ts'o <tytso-3s7WtUTddSA@public.gmane.org> wrote: >>> >>> I suspect this would seriously screw over Gluster, though, and this >>> wouldn't be a solution for NFSv3, since NFS needs long-lived directory >>> cookies, and not the short-lived cookies which is all POSIX/SuSv3 >>> guarantees. >>> >> >> Actually this would work just fine with Gluster. Except in the case of > > Would it really work perfectly? What about a server reboot in the middle of a readdir of a client? > >> gluster-NFS, the native client is only acting like a router/proxy of >> syscalls to the backend system. A directory opened by an application will >> have a matching directory fd opened on ext4, and readdir from an app will >> be translated into readdir on the matching fd on ext4. So the >> app-on-glusterfs and glusterfsd-on-ext4 are essentially "moving in tandem". >> As long as the offs^H^H^H^H cookies do not overflow in the transformation, >> Gluster would not have a problem. >> >> However Gluster-NFS (and NFS in general, too) will break, as we >> opendir/closedir potentially on every request. > > We don't have reached a conclusion so far, do we? What about the > ioctl approach, but a bit differently? Would it work to specify the > allowed upper bits for ext4 (for example 16 additional bit) and the > remaining part for gluster? One of the mails had the calculation > formula: I did throw together an ioctl patch last week, but I think Anand has a new approach he's trying out which won't require ext4 code changes. I'll let him reply when he has a moment. :) -Eric > final_d_off = (ext4_d_off * MAX_SERVERS) + server_idx > > But what is the value of MAX_SERVERS? > > > Cheers, > Bernd ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-03-26 15:48 ` [Gluster-devel] " Eric Sandeen @ 2013-03-28 14:07 ` Theodore Ts'o 2013-03-28 16:26 ` Eric Sandeen 2013-03-28 17:52 ` Zach Brown 0 siblings, 2 replies; 44+ messages in thread From: Theodore Ts'o @ 2013-03-28 14:07 UTC (permalink / raw) To: Eric Sandeen Cc: Bernd Schubert, Anand Avati, J. Bruce Fields, linux-nfs, linux-ext4, gluster-devel On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote: > > We don't have reached a conclusion so far, do we? What about the > > ioctl approach, but a bit differently? Would it work to specify the > > allowed upper bits for ext4 (for example 16 additional bit) and the > > remaining part for gluster? One of the mails had the calculation > > formula: > > I did throw together an ioctl patch last week, but I think Anand has a new > approach he's trying out which won't require ext4 code changes. I'll let > him reply when he has a moment. :) Any update about whether Gluster can address this without needing the ioctl patch? Or should we push the ioctl patch into ext4 for the next merge window? Thanks, - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-03-28 14:07 ` Theodore Ts'o @ 2013-03-28 16:26 ` Eric Sandeen 0 siblings, 0 replies; 44+ messages in thread From: Eric Sandeen @ 2013-03-28 16:26 UTC (permalink / raw) To: Theodore Ts'o Cc: Bernd Schubert, Anand Avati, J. Bruce Fields, linux-nfs, linux-ext4, gluster-devel On 3/28/13 9:07 AM, Theodore Ts'o wrote: > On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote: >>> We don't have reached a conclusion so far, do we? What about the >>> ioctl approach, but a bit differently? Would it work to specify the >>> allowed upper bits for ext4 (for example 16 additional bit) and the >>> remaining part for gluster? One of the mails had the calculation >>> formula: >> >> I did throw together an ioctl patch last week, but I think Anand has a new >> approach he's trying out which won't require ext4 code changes. I'll let >> him reply when he has a moment. :) > > Any update about whether Gluster can address this without needing the > ioctl patch? Or should we push the ioctl patch into ext4 for the next > merge window? I went ahead & sent the ioctl patches to the ext4 list; they are lightly tested, and not tested at all w/ gluster AFAIK. Wanted to get them out just in case we decide we want them. Thanks, -Eric > Thanks, > > - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-03-28 14:07 ` Theodore Ts'o 2013-03-28 16:26 ` Eric Sandeen @ 2013-03-28 17:52 ` Zach Brown [not found] ` <20130328175205.GD16651-fypN+1c5dIyjpB87vu3CluTW4wlIGRCZ@public.gmane.org> 1 sibling, 1 reply; 44+ messages in thread From: Zach Brown @ 2013-03-28 17:52 UTC (permalink / raw) To: Theodore Ts'o Cc: Eric Sandeen, Bernd Schubert, Anand Avati, J. Bruce Fields, linux-nfs, linux-ext4, gluster-devel On Thu, Mar 28, 2013 at 10:07:44AM -0400, Theodore Ts'o wrote: > On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote: > > > We don't have reached a conclusion so far, do we? What about the > > > ioctl approach, but a bit differently? Would it work to specify the > > > allowed upper bits for ext4 (for example 16 additional bit) and the > > > remaining part for gluster? One of the mails had the calculation > > > formula: > > > > I did throw together an ioctl patch last week, but I think Anand has a new > > approach he's trying out which won't require ext4 code changes. I'll let > > him reply when he has a moment. :) > > Any update about whether Gluster can address this without needing the > ioctl patch? Or should we push the ioctl patch into ext4 for the next > merge window? They're testing a work-around: http://review.gluster.org/#change,4711 I'm not sure if they've decided that they're going to go with it, or not. - z ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20130328175205.GD16651-fypN+1c5dIyjpB87vu3CluTW4wlIGRCZ@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130328175205.GD16651-fypN+1c5dIyjpB87vu3CluTW4wlIGRCZ@public.gmane.org> @ 2013-03-28 18:05 ` Anand Avati [not found] ` <CAFboF2ztc06G00z8ga35NrxgnT2YgBiDECgU_9kvVA_Go1_Bww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-03-28 18:05 UTC (permalink / raw) To: Zach Brown Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Thu, Mar 28, 2013 at 10:52 AM, Zach Brown <zab-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On Thu, Mar 28, 2013 at 10:07:44AM -0400, Theodore Ts'o wrote: > > On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote: > > > > We don't have reached a conclusion so far, do we? What about the > > > > ioctl approach, but a bit differently? Would it work to specify the > > > > allowed upper bits for ext4 (for example 16 additional bit) and the > > > > remaining part for gluster? One of the mails had the calculation > > > > formula: > > > > > > I did throw together an ioctl patch last week, but I think Anand has a > new > > > approach he's trying out which won't require ext4 code changes. I'll > let > > > him reply when he has a moment. :) > > > > Any update about whether Gluster can address this without needing the > > ioctl patch? Or should we push the ioctl patch into ext4 for the next > > merge window? > > They're testing a work-around: > > http://review.gluster.org/#change,4711 > > I'm not sure if they've decided that they're going to go with it, or > not. > Jeff reported that the approach did not work in his testing. I haven't had a chance to look into the failure yet. Independent of the fix, it would certainly be good to have the ioctl() support - Samba could use it too, if it wanted. Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <CAFboF2ztc06G00z8ga35NrxgnT2YgBiDECgU_9kvVA_Go1_Bww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2ztc06G00z8ga35NrxgnT2YgBiDECgU_9kvVA_Go1_Bww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-03-28 18:31 ` J. Bruce Fields [not found] ` <20130328183153.GG7080-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-03-28 18:31 UTC (permalink / raw) To: Anand Avati Cc: Zach Brown, Theodore Ts'o, Eric Sandeen, Bernd Schubert, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Thu, Mar 28, 2013 at 11:05:41AM -0700, Anand Avati wrote: > On Thu, Mar 28, 2013 at 10:52 AM, Zach Brown <zab-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > On Thu, Mar 28, 2013 at 10:07:44AM -0400, Theodore Ts'o wrote: > > > On Tue, Mar 26, 2013 at 10:48:14AM -0500, Eric Sandeen wrote: > > > > > We don't have reached a conclusion so far, do we? What about the > > > > > ioctl approach, but a bit differently? Would it work to specify the > > > > > allowed upper bits for ext4 (for example 16 additional bit) and the > > > > > remaining part for gluster? One of the mails had the calculation > > > > > formula: > > > > > > > > I did throw together an ioctl patch last week, but I think Anand has a > > new > > > > approach he's trying out which won't require ext4 code changes. I'll > > let > > > > him reply when he has a moment. :) > > > > > > Any update about whether Gluster can address this without needing the > > > ioctl patch? Or should we push the ioctl patch into ext4 for the next > > > merge window? > > > > They're testing a work-around: > > > > http://review.gluster.org/#change,4711 > > > > I'm not sure if they've decided that they're going to go with it, or > > not. > > > > Jeff reported that the approach did not work in his testing. I haven't had > a chance to look into the failure yet. Independent of the fix, it would > certainly be good have the ioctl() support The one advantage of your scheme is that it keeps more of the hash bits; the chance of 31-bit cookie collisions is much higher. > Samba could use it too, if it wanted. It'd be useful to understand their situation. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20130328183153.GG7080-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <20130328183153.GG7080-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> @ 2013-03-28 18:49 ` Anand Avati [not found] ` <CAFboF2w49Lc0vM0SerbJfL9_RuSHgEU+y_Yk7F4pLxeiqu+KRg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-03-28 18:49 UTC (permalink / raw) To: J. Bruce Fields Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Zach Brown, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Thu, Mar 28, 2013 at 11:31 AM, J. Bruce Fields <bfields-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org> wrote: > > > Jeff reported that the approach did not work in his testing. I haven't > had > > a chance to look into the failure yet. Independent of the fix, it would > > certainly be good have the ioctl() support > > The one advantage of your scheme is that it keeps more of the hash bits; > the chance of 31-bit cookie collisions is much higher. Yes, it should, based on the theory of how ext4 was generating the 63 bits. But Jeff's test finds that the experiment is not matching the theory. I intend to debug this, but am currently drowned in a different issue. It would be good if the ext developers can have a look at http://review.gluster.org/4711 and see if there are obvious holes in the approach or code. Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <CAFboF2w49Lc0vM0SerbJfL9_RuSHgEU+y_Yk7F4pLxeiqu+KRg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2w49Lc0vM0SerbJfL9_RuSHgEU+y_Yk7F4pLxeiqu+KRg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-03-28 19:43 ` Jeff Darcy [not found] ` <51549D74.1060703-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Jeff Darcy @ 2013-03-28 19:43 UTC (permalink / raw) To: Anand Avati Cc: J. Bruce Fields, Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Zach Brown, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On 03/28/2013 02:49 PM, Anand Avati wrote: > Yes, it should, based on the theory of how ext4 was generating the > 63bits. But Jeff's test finds that the experiment is not matching the > theory. FWIW, I was able to re-run my test in between stuff related to That Other Problem. What seems to be happening is that we read correctly until just after d_off 0x4000000000000000, then we suddenly wrap around - not to the very first d_off we saw, but to a pretty early one (e.g. 0x0041b6340689a32e). This is all on a single brick, BTW, so it's pretty easy to line up the back-end and front-end d_off values which match perfectly up to this point. I haven't had a chance to ponder what this all means and debug it further. Hopefully I'll be able to do so soon, but I figured I'd mention it in case something about those numbers rang a bell. ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <51549D74.1060703-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <51549D74.1060703-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2013-03-28 22:14 ` Anand Avati [not found] ` <CAFboF2xkvXx9YFYxBXupwg=s=3MaeQYm2KK2m8MFtEBPsxwQ7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> 0 siblings, 1 reply; 44+ messages in thread From: Anand Avati @ 2013-03-28 22:14 UTC (permalink / raw) To: Jeff Darcy Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Zach Brown, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Thu, Mar 28, 2013 at 12:43 PM, Jeff Darcy <jdarcy-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On 03/28/2013 02:49 PM, Anand Avati wrote: > > Yes, it should, based on the theory of how ext4 was generating the > > 63bits. But Jeff's test finds that the experiment is not matching the > > theory. > > FWIW, I was able to re-run my test in between stuff related to That > Other Problem. What seems to be happening is that we read correctly > until just after d_off 0x4000000000000000, then we suddenly wrap around > - not to the very first d_off we saw, but to a pretty early one (e.g. > 0x0041b6340689a32e). This is all on a single brick, BTW, so it's pretty > easy to line up the back-end and front-end d_off values which match > perfectly up to this point. > > I haven't had a chance to ponder what this all means and debug it > further. Hopefully I'll be able to do so soon, but I figured I'd > mention it in case something about those numbers rang a bell. > Of course, the unit tests (with artificial offsets) were done with brick count >= 2. You have tested with DHT subvol count=1, a case the unit tests did not cover, and sure enough, the code isn't handling it well. Just verified with the unit tests that the brick count = 1 condition fails to return the same d_off. Posting a fixed version. Thanks for the catch! Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <CAFboF2xkvXx9YFYxBXupwg=s=3MaeQYm2KK2m8MFtEBPsxwQ7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>]
* Re: regressions due to 64-bit ext4 directory cookies [not found] ` <CAFboF2xkvXx9YFYxBXupwg=s=3MaeQYm2KK2m8MFtEBPsxwQ7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org> @ 2013-03-28 22:20 ` Anand Avati 0 siblings, 0 replies; 44+ messages in thread From: Anand Avati @ 2013-03-28 22:20 UTC (permalink / raw) To: Jeff Darcy Cc: Eric Sandeen, linux-nfs-u79uwXL29TY76Z2rM5mHXA, Theodore Ts'o, Zach Brown, Bernd Schubert, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Thu, Mar 28, 2013 at 3:14 PM, Anand Avati <anand.avati-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> wrote: > On Thu, Mar 28, 2013 at 12:43 PM, Jeff Darcy <jdarcy-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > >> On 03/28/2013 02:49 PM, Anand Avati wrote: >> > Yes, it should, based on the theory of how ext4 was generating the >> > 63bits. But Jeff's test finds that the experiment is not matching the >> > theory. >> >> FWIW, I was able to re-run my test in between stuff related to That >> Other Problem. What seems to be happening is that we read correctly >> until just after d_off 0x4000000000000000, then we suddenly wrap around >> - not to the very first d_off we saw, but to a pretty early one (e.g. >> 0x0041b6340689a32e). This is all on a single brick, BTW, so it's pretty >> easy to line up the back-end and front-end d_off values which match >> perfectly up to this point. >> >> I haven't had a chance to ponder what this all means and debug it >> further. Hopefully I'll be able to do so soon, but I figured I'd >> mention it in case something about those numbers rang a bell. >> > > Of course, the unit tests (with artificial offsets) were done with brick > count >= 2. You have tested with DHT subvol count=1, which was not tested, > and sure enough, the code isn't handling it well. Just verified with the > unit tests that brick count = 1 condition fails to return the same d_off. > > Posting a fixed version. Thanks for the catch! > Posted an updated version http://review.gluster.org/4711. This passes unit tests for all brick counts (>= 1). Can you confirm if the "loop"ing is now gone in your test env? Thanks, Avati ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <20130213234430.GF5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> 2013-02-14 0:05 ` Anand Avati @ 2013-02-14 21:46 ` J. Bruce Fields 1 sibling, 0 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-14 21:46 UTC (permalink / raw) To: Theodore Ts'o Cc: Anand Avati, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 06:44:30PM -0500, Theodore Ts'o wrote: > On Wed, Feb 13, 2013 at 06:05:11PM -0500, J. Bruce Fields wrote: > > > > Would it be possible to make something work like, for example, a 31-bit > > hash plus an offset into a hash bucket? > > > > I have trouble thinking about this, partly because I can't remember > > where to find the requirements for readdir on concurrently modified > > directories.... > > The requires are that for a directory entry which has not been > modified since the last opendir() or rewindir(), readdir() must return > that directory entry exactly once. > > For a directory entry which has been added or removed since the last > opendir() or rewinddir() call, it is undefined whether the directory > entry is returned once or not at all. And a rename is defined as a > add/remove, so it's OK for the old filename and the new file name to > appear in the readdir() stream; it would also be OK if neither > appeared in the readdir() stream. That's what I couldn't remember, thanks! --b. > > The SUSv3 definition of readdir() can be found here: > > http://pubs.opengroup.org/onlinepubs/009695399/functions/readdir.html > > Note also that if you look at the SuSv3 definition of seekdir(), it > explicitly states that the value returned by telldir() is not > guaranteed to be valid after a rewinddir() or across another opendir(): > > If the value of loc was not obtained from an earlier call to > telldir(), or if a call to rewinddir() occurred between the call to > telldir() and the call to seekdir(), the results of subsequent > calls to readdir() are unspecified. > > Hence, it would be legal, and arguably more correct, if we created an > internal array of pointers into the directory structure, where the > first call to telldir() return 1, and the second call to telldir() > returned 2, and the third call to telldir() returned 3, regardless of > the position in the directory, and this number was used by seekdir() > to index into the array of pointers to return the exact location in > the b-tree. This would completely eliminate the possibility of hash > collisions, and guarantee that readdir() would never drop or return a > directory entry multiple times after seekdir(). > > This implementation approach would have a potential denial of service > potential since each call to telldir() would potentially be allocating > kernel memory, but as long as we make sure the OOM killler kills the > nasty process which is calling telldir() a lot, this would probably be > OK. > > It would also be legal to throw away this array after a call to > rewinddir() and closedir(), since telldir() cookies and not guaranteed > to valid indefinitely. See: > > http://pubs.opengroup.org/onlinepubs/009695399/functions/seekdir.html > > I suspect this would seriously screw over Gluster, though, and this > wouldn't be a solution for NFSv3, since NFS needs long-lived directory > cookies, and not the short-lived cookies which is all POSIX/SuSv3 guarantees. 
> > Regards, > > - Ted ^ permalink raw reply [flat|nested] 44+ messages in thread
[parent not found: <20130213222052.GD5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>]
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies [not found] ` <20130213222052.GD5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org> @ 2013-02-14 6:10 ` Dave Chinner 2013-02-14 22:01 ` J. Bruce Fields 0 siblings, 1 reply; 44+ messages in thread From: Dave Chinner @ 2013-02-14 6:10 UTC (permalink / raw) To: Theodore Ts'o Cc: Anand Avati, J. Bruce Fields, Bernd Schubert, sandeen-H+wXaHxf7aLQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA, linux-ext4-u79uwXL29TY76Z2rM5mHXA, gluster-devel-qX2TKyscuCcdnm+yROfE0A On Wed, Feb 13, 2013 at 05:20:52PM -0500, Theodore Ts'o wrote: > Telldir() and seekdir() are basically implementation horrors for any > file system that is using anything other than a simple array of > directory entries ala the V7 Unix file system or the BSD FFS. For any > file system which is using a more advanced data structure, like > b-trees hash trees, etc, there **can't** possibly be a "offset" into a > readdir stream. I'll just point you to this: http://marc.info/?l=linux-ext4&m=136081996316453&w=2 so you can see that XFS implements what you say can't possibly be done. ;) FWIW, that post only talked about the data segment. I didn't mention that XFS has 2 other segments in the directory file (both beyond EOF) for the directory data indexes. One contains the name-hash btree index used for name based lookups and the other contains a freespace index for tracking free space in the data segment. IOWs persistent, deterministic, low cost telldir/seekdir behaviour was a problem solved in the 1990s. :) Cheers, Dave. -- Dave Chinner david-FqsqvQoI3Ljby3iVrkZq2A@public.gmane.org ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-14 6:10 ` Dave Chinner @ 2013-02-14 22:01 ` J. Bruce Fields 2013-02-15 2:27 ` Dave Chinner 0 siblings, 1 reply; 44+ messages in thread From: J. Bruce Fields @ 2013-02-14 22:01 UTC (permalink / raw) To: Dave Chinner Cc: Theodore Ts'o, Anand Avati, Bernd Schubert, sandeen, linux-nfs, linux-ext4, gluster-devel On Thu, Feb 14, 2013 at 05:10:02PM +1100, Dave Chinner wrote: > On Wed, Feb 13, 2013 at 05:20:52PM -0500, Theodore Ts'o wrote: > > Telldir() and seekdir() are basically implementation horrors for any > > file system that is using anything other than a simple array of > > directory entries ala the V7 Unix file system or the BSD FFS. For any > > file system which is using a more advanced data structure, like > > b-trees hash trees, etc, there **can't** possibly be a "offset" into a > > readdir stream. > > I'll just point you to this: > > http://marc.info/?l=linux-ext4&m=136081996316453&w=2 > > so you can see that XFS implements what you say can't possibly be > done. ;) > > FWIW, that post only talked about the data segment. I didn't mention > that XFS has 2 other segments in the directory file (both beyond > EOF) for the directory data indexes. One contains the name-hash btree > index used for name based lookups and the other contains a freespace > index for tracking free space in the data segment. OK, so in some sense that reduces the problem to that of implementing readdir cookies for directories that are stored in a simple linear array. Which I should know how to do but I don't: I guess all you need is a provision for making holes on remove (so that you aren't required to move existing entries, messing up offsets for concurrent readers)? Purely out of curiosity: is there a more detailed writeup of XFS's directory format? (Or a pointer to a piece of the code a person could understand without losing a month to it?) --b. > > IOWs persistent, deterministic, low cost telldir/seekdir behaviour > was a problem solved in the 1990s. :) ^ permalink raw reply [flat|nested] 44+ messages in thread
* Re: [Gluster-devel] regressions due to 64-bit ext4 directory cookies 2013-02-14 22:01 ` J. Bruce Fields @ 2013-02-15 2:27 ` Dave Chinner 0 siblings, 0 replies; 44+ messages in thread From: Dave Chinner @ 2013-02-15 2:27 UTC (permalink / raw) To: J. Bruce Fields Cc: Theodore Ts'o, Anand Avati, Bernd Schubert, sandeen, linux-nfs, linux-ext4, gluster-devel On Thu, Feb 14, 2013 at 05:01:10PM -0500, J. Bruce Fields wrote: > On Thu, Feb 14, 2013 at 05:10:02PM +1100, Dave Chinner wrote: > > On Wed, Feb 13, 2013 at 05:20:52PM -0500, Theodore Ts'o wrote: > > > Telldir() and seekdir() are basically implementation horrors for any > > > file system that is using anything other than a simple array of > > > directory entries ala the V7 Unix file system or the BSD FFS. For any > > > file system which is using a more advanced data structure, like > > > b-trees hash trees, etc, there **can't** possibly be a "offset" into a > > > readdir stream. > > > > I'll just point you to this: > > > > http://marc.info/?l=linux-ext4&m=136081996316453&w=2 > > > > so you can see that XFS implements what you say can't possibly be > > done. ;) > > > > FWIW, that post only talked about the data segment. I didn't mention > > that XFS has 2 other segments in the directory file (both beyond > > EOF) for the directory data indexes. One contains the name-hash btree > > index used for name based lookups and the other contains a freespace > > index for tracking free space in the data segment. > > OK, so in some sense that reduces the problem to that of implementing > readdir cookies for directories that are stored in a simple linear > array. *nod* > Which I should know how to do but I don't: I guess all you need is a > provision for making holes on remove (so that you aren't required move > existing entries, messing up offsets for concurrent readers)? Exactly. The data segment is a virtual mapping that is maintained by the extent tree, so we can simply punch holes in it for directory blocks that are empty and no longer referenced. i.e. the data segment really is just a sparse file. The result of doing block mapping this way is that the freespace tracking segment actually only needs to track space in partially used blocks. Hence we only need to allocate new blocks when the freespace map empties, and we work out where to allocate the new block in the virtual map by doing an extent tree lookup to find the first hole.... > Purely out of curiosity: is there a more detailed writeup of XFS's > directory format? (Or a pointer to a piece of the code a person could > understand without losing a month to it?) Not really. There's documentation of the on-disk structures, but it's a massive leap from there to understanding the structure and how it all ties together. I've been spending the past couple of months deep in the depths of the XFS directory code so how it all works is front-and-center in my brain right now... That said, the thought had crossed my mind that there's a couple of LWN articles/conference talks I could put together as a brain dump. ;) Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 44+ messages in thread
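A toy model of the invariant under discussion, stable cookies over a linear array where removal punches a hole instead of shifting entries (illustrative only, nothing XFS-specific):

    #include <stdint.h>

    /* Toy directory: a sparse array of fixed-size slots.  The readdir
     * cookie is just the slot index, which stays valid for the life of
     * an entry because removal leaves a hole rather than compacting.
     * Real filesystems use variable-size records and punch out whole
     * blocks, but the stable-offset invariant is the same. */
    struct dirslot {
            uint8_t in_use;
            char    name[256];
    };

    static void slot_remove(struct dirslot *dir, uint64_t cookie)
    {
            dir[cookie].in_use = 0;    /* punch a hole; nothing moves */
    }

    /* Resume a listing from a cookie handed out earlier: skip holes
     * and return the index of the next live entry (nslots == EOF). */
    static uint64_t slot_next(const struct dirslot *dir, uint64_t nslots,
                              uint64_t cookie)
    {
            while (cookie < nslots && !dir[cookie].in_use)
                    cookie++;
            return cookie;
    }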
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-12 20:28 regressions due to 64-bit ext4 directory cookies J. Bruce Fields 2013-02-12 20:56 ` Bernd Schubert 2013-02-13 4:00 ` Theodore Ts'o @ 2013-02-13 6:56 ` Andreas Dilger 2013-02-13 13:40 ` J. Bruce Fields 2 siblings, 1 reply; 44+ messages in thread From: Andreas Dilger @ 2013-02-13 6:56 UTC (permalink / raw) To: J. Bruce Fields Cc: linux-ext4, sandeen, Theodore Ts'o, Bernd Schubert, gluster-devel On 2013-02-12, at 12:28 PM, J. Bruce Fields wrote: > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > and previous patches solved problems with hash collisions in large > directories by using 64- instead of 32- bit directory hashes in some > cases. But it caused problems for users who assume directory offsets > are "small". Two cases we've run across: > > - older NFS clients: 64-bit cookies cause applications on > many older clients to fail. > - gluster: gluster assumed that it could take the top bits of > the offset for its own use. > > In both cases we could argue we're in the right: the nfs protocol > defines cookies to be 64 bits, so clients should be prepared to handle them (remapping to smaller integers if necessary to placate > applications using older system interfaces). There appears to already be support for handling this for NFSv2 clients, so it should be possible to have an NFS server mount option to set this for all clients: /* NFSv2 only supports 32 bit cookies */ if (rqstp->rq_vers > 2) may_flags |= NFSD_MAY_64BIT_COOKIE; Alternately, this might be detected on a per-client basis by whitelist or blacklist if there is some way for the server to identify the client? > And gluster was incorrect to assume that the "offset" was really > an "offset" as opposed to just an opaque value. Hmm, userspace already can't use the top bit of the cookie, since the offset is a signed value, so gluster could continue to use that bit for itself. It could, in theory, also downshift the cookie by one bit for 64-bit cookies and shift it back before use, but I'm not sure that is kosher for all filesystems. > But in practice things that worked fine for a long time break on a > kernel upgrade. > > So at a minimum I think we owe people a workaround, and turning off > dir_index may not be practical for everyone. > > A "no_64bit_cookies" export option would provide a workaround for NFS > servers with older NFS clients, but not for applications like gluster. We added a "32bitapi" mount option to Lustre to handle the case where it is re-exporting via NFS to 32-bit clients, which is like your proposed "no_64bit_cookies" and "nfs.enable_ino64=0" together. > For that reason I'd rather have a way to turn this off on a given ext4 filesystem. Is that practical? It wouldn't be impossible - pos2maj_hash() and pos2min_hash() could get a per-superblock and/or kernel option to force 32-bit hash values. Cheers, Andreas ^ permalink raw reply [flat|nested] 44+ messages in thread
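Andreas's "downshift by one bit" idea in code form (a sketch of one possible reading; the lost low bit is exactly his "not sure that is kosher" caveat, since the replayed cookie may no longer equal anything the filesystem handed out):

    #include <stdint.h>

    /* Halve the filesystem's cookie to free bit 62 for a private flag,
     * then shift back before replaying it to the filesystem.  Assumes
     * fs_cookie <= INT64_MAX, i.e. bit 63 is already unused because
     * the offset is a signed value. */
    static uint64_t pack_cookie(uint64_t fs_cookie, unsigned flag)
    {
            return (fs_cookie >> 1) | ((uint64_t)(flag & 1) << 62);
    }

    static uint64_t unpack_cookie(uint64_t packed, unsigned *flag)
    {
            *flag = (unsigned)((packed >> 62) & 1);
            return (packed & ~(1ULL << 62)) << 1;   /* bit 0 is lost */
    }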
* Re: regressions due to 64-bit ext4 directory cookies 2013-02-13 6:56 ` Andreas Dilger @ 2013-02-13 13:40 ` J. Bruce Fields 0 siblings, 0 replies; 44+ messages in thread From: J. Bruce Fields @ 2013-02-13 13:40 UTC (permalink / raw) To: Andreas Dilger Cc: linux-ext4, sandeen, Theodore Ts'o, Bernd Schubert, gluster-devel On Tue, Feb 12, 2013 at 10:56:36PM -0800, Andreas Dilger wrote: > On 2013-02-12, at 12:28 PM, J. Bruce Fields wrote: > > 06effdbb49af5f6c "nfsd: vfs_llseek() with 32 or 64 bit offsets (hashes)" > > and previous patches solved problems with hash collisions in large > > directories by using 64- instead of 32- bit directory hashes in some > > cases. But it caused problems for users who assume directory offsets > > are "small". Two cases we've run across: > > > > - older NFS clients: 64-bit cookies cause applications on > > many older clients to fail. > > - gluster: gluster assumed that it could take the top bits of > > the offset for its own use. > > > > In both cases we could argue we're in the right: the nfs protocol > > defines cookies to be 64 bits, so clients should be prepared to handle them (remapping to smaller integers if necessary to placate > > applications using older system interfaces). > > There appears to already be support for handling this for NFSv2 > clients, so it should be possible to have an NFS server mount > option to set this for all clients: > > /* NFSv2 only supports 32 bit cookies */ > if (rqstp->rq_vers > 2) > may_flags |= NFSD_MAY_64BIT_COOKIE; > > Alternately, this might be detected on a per-client basis by > whitelist or blacklist if there is some way for the server to > identify the client? No, there isn't. --b. ^ permalink raw reply [flat|nested] 44+ messages in thread
end of thread

Thread overview: 44+ messages

2013-02-12 20:28 regressions due to 64-bit ext4 directory cookies J. Bruce Fields
2013-02-12 20:56 ` Bernd Schubert
2013-02-12 21:00 ` J. Bruce Fields
2013-02-13  8:17 ` Bernd Schubert
2013-02-13 22:18 ` J. Bruce Fields
2013-02-13 13:31 ` [Gluster-devel] " Niels de Vos
2013-02-13 15:40 ` Bernd Schubert
2013-02-14  5:32 ` Dave Chinner
2013-02-13  4:00 ` Theodore Ts'o
2013-02-13 13:31 ` J. Bruce Fields
2013-02-13 15:14 ` Theodore Ts'o
2013-02-13 15:19 ` J. Bruce Fields
2013-02-13 15:36 ` Theodore Ts'o
[not found] ` <20130213153654.GC17431-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2013-02-13 16:20 ` J. Bruce Fields
[not found] ` <20130213162059.GL14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2013-02-13 16:43 ` Myklebust, Trond
2013-02-13 21:33 ` J. Bruce Fields
2013-02-14  3:59 ` Myklebust, Trond
[not found] ` <4FA345DA4F4AE44899BD2B03EEEC2FA91F3D6BAB-UCI0kNdgLrHLJmV3vhxcH3OR4cbS7gtM96Bgd4bDwmQ@public.gmane.org>
2013-02-14  5:45 ` Dave Chinner
2013-02-13 21:21 ` Anand Avati
[not found] ` <CAFboF2wXvP+vttiff8iRE9rAgvV8UWGbFprgVp8p7kE43TU=PA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-13 22:20 ` [Gluster-devel] " Theodore Ts'o
2013-02-13 22:41 ` J. Bruce Fields
[not found] ` <20130213224141.GU14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2013-02-13 22:47 ` Theodore Ts'o
[not found] ` <20130213224720.GE5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2013-02-13 22:57 ` Anand Avati
[not found] ` <CAFboF2z1akN_edrY_fT915xfehfHGioA2M=PSHv0Fp3rD-5v5A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-13 23:05 ` [Gluster-devel] " J. Bruce Fields
[not found] ` <20130213230511.GW14195-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2013-02-13 23:44 ` Theodore Ts'o
[not found] ` <20130213234430.GF5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2013-02-14  0:05 ` Anand Avati
[not found] ` <CAFboF2zS+YAa0uUxMFUAbqgPh3Kb4xZu40WUjLyGn8qPoP+Oyw-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-02-14 21:47 ` [Gluster-devel] " J. Bruce Fields
2013-03-26 15:23 ` Bernd Schubert
[not found] ` <5151BD5F.30607-mPn0NPGs4xGatNDF+KUbs4QuADTiUCJX@public.gmane.org>
2013-03-26 15:48 ` [Gluster-devel] " Eric Sandeen
2013-03-28 14:07 ` Theodore Ts'o
2013-03-28 16:26 ` Eric Sandeen
2013-03-28 17:52 ` Zach Brown
[not found] ` <20130328175205.GD16651-fypN+1c5dIyjpB87vu3CluTW4wlIGRCZ@public.gmane.org>
2013-03-28 18:05 ` Anand Avati
[not found] ` <CAFboF2ztc06G00z8ga35NrxgnT2YgBiDECgU_9kvVA_Go1_Bww-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-03-28 18:31 ` [Gluster-devel] " J. Bruce Fields
[not found] ` <20130328183153.GG7080-uC3wQj2KruNg9hUCZPvPmw@public.gmane.org>
2013-03-28 18:49 ` Anand Avati
[not found] ` <CAFboF2w49Lc0vM0SerbJfL9_RuSHgEU+y_Yk7F4pLxeiqu+KRg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-03-28 19:43 ` [Gluster-devel] " Jeff Darcy
[not found] ` <51549D74.1060703-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
2013-03-28 22:14 ` Anand Avati
[not found] ` <CAFboF2xkvXx9YFYxBXupwg=s=3MaeQYm2KK2m8MFtEBPsxwQ7Q-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2013-03-28 22:20 ` Anand Avati
2013-02-14 21:46 ` [Gluster-devel] " J. Bruce Fields
[not found] ` <20130213222052.GD5938-AKGzg7BKzIDYtjvyW6yDsg@public.gmane.org>
2013-02-14  6:10 ` Dave Chinner
2013-02-14 22:01 ` J. Bruce Fields
2013-02-15  2:27 ` Dave Chinner
2013-02-13  6:56 ` Andreas Dilger
2013-02-13 13:40 ` J. Bruce Fields