* buggy readdir with inline dirs
@ 2013-03-22 18:26 Zach Brown
2013-03-23 1:02 ` Dave Chinner
2013-03-23 13:24 ` Tao Ma
0 siblings, 2 replies; 4+ messages in thread
From: Zach Brown @ 2013-03-22 18:26 UTC (permalink / raw)
To: linux-ext4; +Cc: Tao Ma, Eric Sandeen
I don't remember quite how, but I found myself flipping through the
inline dir code that's in mainline now. It looked pretty fishy so Eric
and I played around with it. It's very buggy in its current form.
ext4_read_inline_dir() doesn't seem to undertand the filldir arguments.
It suggests that offset 0 is the next offset after both the "." and ".."
entries. It needs to have specific offsets for "." and ".." and return them
accordingly. It looks like fixing this will trickle down into the
revalidation loop.
It doesn't understand that it's possible to only return a single "."
entry in getdents and have a subsequent call have f_pos pointing at the
fake ".." entry. With the current code if your getdents buffer only has
room for "." it just spins returning that entry leaving f_pos at 0.
Those are all relatively simple bugs that just need to be fixed.
But the big bug is that it changes the d_off values for entries as it
converts from byte offsets in the inline dir xattr to hashed offsets in
indexed dir leaves. A concurrent readdir could be unlucky enough to get
a bunch of duplicate entries as it reads past the nice low inline byte
offsets into the huge hashed offsets.
I'm not sure how to easily fix that. It feels like it'd want to
maintain the dir entries in the xattr blob with the offsets that they'll
have once converted to full dir blocks. So instead of being a magical
readdir path maybe it wants to be in the path of looking up dir blocks
so existing unindexed and indexed code would operate on the data in the
xattr blob as though it were a block?
Dunno, just wanted to share what we found. Are these all known problems
in prototype code that isn't intended to be used?
- z
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: buggy readdir with inline dirs
2013-03-22 18:26 buggy readdir with inline dirs Zach Brown
@ 2013-03-23 1:02 ` Dave Chinner
2013-03-23 13:24 ` Tao Ma
1 sibling, 0 replies; 4+ messages in thread
From: Dave Chinner @ 2013-03-23 1:02 UTC (permalink / raw)
To: Zach Brown; +Cc: linux-ext4, Tao Ma, Eric Sandeen
On Fri, Mar 22, 2013 at 11:26:08AM -0700, Zach Brown wrote:
> I don't remember quite how, but I found myself flipping through the
> inline dir code that's in mainline now. It looked pretty fishy so Eric
> and I played around with it. It's very buggy in its current form.
>
> ext4_read_inline_dir() doesn't seem to undertand the filldir arguments.
> It suggests that offset 0 is the next offset after both the "." and ".."
> entries. It needs to have specific offsets for "." and ".." and return them
> accordingly. It looks like fixing this will trickle down into the
> revalidation loop.
>
> It doesn't understand that it's possible to only return a single "."
> entry in getdents and have a subsequent call have f_pos pointing at the
> fake ".." entry. With the current code if your getdents buffer only has
> room for "." it just spins returning that entry leaving f_pos at 0.
>
> Those are all relatively simple bugs that just need to be fixed.
>
> But the big bug is that it changes the d_off values for entries as it
> converts from byte offsets in the inline dir xattr to hashed offsets in
> indexed dir leaves. A concurrent readdir could be unlucky enough to get
> a bunch of duplicate entries as it reads past the nice low inline byte
> offsets into the huge hashed offsets.
>
> I'm not sure how to easily fix that. It feels like it'd want to
> maintain the dir entries in the xattr blob with the offsets that they'll
> have once converted to full dir blocks. So instead of being a magical
> readdir path maybe it wants to be in the path of looking up dir blocks
> so existing unindexed and indexed code would operate on the data in the
> xattr blob as though it were a block?
XFs solves this problem by keeping an "offset" field in the
short-form directory entry. It calculates what the offset would be
if the entry was in block form and adds that to the directory entry
so that when the directory grows to block form and the entries are
moved,they retain the same userspace visible offset. The offsets
returned to getdents/readdir are what is in the offset field, not
the physical offset of then entry in the shortform structure...
A similar (but opposite) process takes place when going from block
form back to short form....
Cheers,
Dave.
--
Dave Chinner
david@fromorbit.com
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: buggy readdir with inline dirs
2013-03-22 18:26 buggy readdir with inline dirs Zach Brown
2013-03-23 1:02 ` Dave Chinner
@ 2013-03-23 13:24 ` Tao Ma
2013-03-23 17:28 ` Andreas Dilger
1 sibling, 1 reply; 4+ messages in thread
From: Tao Ma @ 2013-03-23 13:24 UTC (permalink / raw)
To: Zach Brown; +Cc: linux-ext4, Eric Sandeen
Hi Zach,
On 03/23/2013 02:26 AM, Zach Brown wrote:
> I don't remember quite how, but I found myself flipping through the
> inline dir code that's in mainline now. It looked pretty fishy so Eric
> and I played around with it. It's very buggy in its current form.
Sorry about any inconvenience brought to you.
>
> ext4_read_inline_dir() doesn't seem to undertand the filldir arguments.
> It suggests that offset 0 is the next offset after both the "." and ".."
> entries. It needs to have specific offsets for "." and ".." and return them
> accordingly. It looks like fixing this will trickle down into the
> revalidation loop.
yes, it is my fault, I guess at the very first beginning, I just can't
figured out how to return a proper 'offset' to the user to indicate
'..'. Now we don't save anything about '.', so offset 0 is OK for it,
but maybe we should return some offset like '2' to the user about it.
Anyway it should be fixed.
>
> It doesn't understand that it's possible to only return a single "."
> entry in getdents and have a subsequent call have f_pos pointing at the
> fake ".." entry. With the current code if your getdents buffer only has
> room for "." it just spins returning that entry leaving f_pos at 0.
Sorry.
>
> Those are all relatively simple bugs that just need to be fixed.
>
> But the big bug is that it changes the d_off values for entries as it
> converts from byte offsets in the inline dir xattr to hashed offsets in
> indexed dir leaves. A concurrent readdir could be unlucky enough to get
> a bunch of duplicate entries as it reads past the nice low inline byte
> offsets into the huge hashed offsets.
>
> I'm not sure how to easily fix that. It feels like it'd want to
> maintain the dir entries in the xattr blob with the offsets that they'll
> have once converted to full dir blocks. So instead of being a magical
> readdir path maybe it wants to be in the path of looking up dir blocks
> so existing unindexed and indexed code would operate on the data in the
> xattr blob as though it were a block?
>
> Dunno, just wanted to share what we found. Are these all known problems
> in prototype code that isn't intended to be used?
I will check what xfs does in this case as Dave mentioned in another
reply and come back with a fix about it.
Thanks,
Tao
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: buggy readdir with inline dirs
2013-03-23 13:24 ` Tao Ma
@ 2013-03-23 17:28 ` Andreas Dilger
0 siblings, 0 replies; 4+ messages in thread
From: Andreas Dilger @ 2013-03-23 17:28 UTC (permalink / raw)
To: Tao Ma; +Cc: Zach Brown, linux-ext4@vger.kernel.org, Eric Sandeen
On 2013-03-23, at 6:24, Tao Ma <tm@tao.ma> wrote:
> On 03/23/2013 02:26 AM, Zach Brown wrote:
>> I don't remember quite how, but I found myself flipping through the
>> inline dir code that's in mainline now. It looked pretty fishy so Eric
>> and I played around with it. It's very buggy in its current form.
> Sorry about any inconvenience brought to you.
>>
>> ext4_read_inline_dir() doesn't seem to undertand the filldir arguments.
>> It suggests that offset 0 is the next offset after both the "." and ".."
>> entries. It needs to have specific offsets for "." and ".." and return them
>> accordingly. It looks like fixing this will trickle down into the
>> revalidation loop.
> yes, it is my fault, I guess at the very first beginning, I just can't
> figured out how to return a proper 'offset' to the user to indicate
> '..'. Now we don't save anything about '.', so offset 0 is OK for it,
> but maybe we should return some offset like '2' to the user about it.
> Anyway it should be fixed.
FYI, ZFS (which generates "." and ".." entries on-the-fly also) uses "0" for start of readdir, "1" for ".", and "2" for "..".
Cheers, Andreas
>> It doesn't understand that it's possible to only return a single "."
>> entry in getdents and have a subsequent call have f_pos pointing at the
>> fake ".." entry. With the current code if your getdents buffer only has
>> room for "." it just spins returning that entry leaving f_pos at 0.
> Sorry.
>>
>> Those are all relatively simple bugs that just need to be fixed.
>>
>> But the big bug is that it changes the d_off values for entries as it
>> converts from byte offsets in the inline dir xattr to hashed offsets in
>> indexed dir leaves. A concurrent readdir could be unlucky enough to get
>> a bunch of duplicate entries as it reads past the nice low inline byte
>> offsets into the huge hashed offsets.
>>
>> I'm not sure how to easily fix that. It feels like it'd want to
>> maintain the dir entries in the xattr blob with the offsets that they'll
>> have once converted to full dir blocks. So instead of being a magical
>> readdir path maybe it wants to be in the path of looking up dir blocks
>> so existing unindexed and indexed code would operate on the data in the
>> xattr blob as though it were a block?
>>
>> Dunno, just wanted to share what we found. Are these all known problems
>> in prototype code that isn't intended to be used?
> I will check what xfs does in this case as Dave mentioned in another
> reply and come back with a fix about it.
>
> Thanks,
> Tao
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2013-03-23 17:35 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2013-03-22 18:26 buggy readdir with inline dirs Zach Brown
2013-03-23 1:02 ` Dave Chinner
2013-03-23 13:24 ` Tao Ma
2013-03-23 17:28 ` Andreas Dilger
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox