linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Brad Boyer <flar@allandria.com>
To: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC] readdir mess
Date: Wed, 13 Aug 2008 01:36:35 -0700	[thread overview]
Message-ID: <20080813083635.GA5933@cynthia.pants.nu> (raw)
In-Reply-To: <20080813000433.GZ28946@ZenIV.linux.org.uk>

On Wed, Aug 13, 2008 at 01:04:33AM +0100, Al Viro wrote:
> hfs:	at least missing checks for hfs_bnode_read() failure.  And I'm not
> 	at all sure that hfs_mac2asc() use is safe there.  BTW, open_dir_list
> 	handling appears to be odd - how the hell does decrementing ->f_pos
> 	help anything?  And hfs_dir_release() removes from list without any
> 	locks, so that's almost certainly racy as well.
> hfsplus: ditto

I don't work on this code anymore, but I wrote the original version which
makes me a bit curious about some of the critcism here. The function
hfs_bnode_read is declared void, so it doesn't return any errors. The
only thing inside it that I could even think of failing is kmap, but
I was under the impression that kmap would sleep until it could
complete. The actual disk read happens earlier and is saved in the
page cache as long as the bnode object exists.

Is there any reason that hfs_mac2asc would not be safe? I can't think
of any good way to avoid that call even if it is unsafe, since the
readdir will expect UTF8 strings rather than the mac (or UTF16 for
hfsplus) encoding found on disk.

The open_dir_list stuff is a little odd I admit, and I think you are
right about the locking issue in release. However, I feel like I should
explain the way hfs and hfsplus use f_pos on directories. The on-disk
format requires that the directory entries be sorted in order by
filename and does not allow any holes in the list. Because of this,
adding and removing entries will move all the items that are later
in the order to make room or eliminate the hole. The manipulation
of f_pos is intended to make it act more like a standard unix
filesystem where removing an item doesn't usually make a pending
readdir skip an unrelated item. The value of f_pos for a directory
is only incremented by one for each entry to make seeking work
in a more sane fashion. Because of this, an increment moves to the
next item and decrement moves to the previous one.

As a side note about the complexity of making hfs and hfsplus fit
into a unix model, there is just one file that contains the equivalent
of every directory and the entire inode table. Because of this, the
directory code is very synthetic. I tried to make it look as much as
possible like a normal unix filesystem, including making i_nlink on
the directory inodes reflect the number of child directories. It
makes for some code that is admittedly a little hard to follow.

	Brad Boyer
	flar@allandria.com



  parent reply	other threads:[~2008-08-13  8:36 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-08-12  6:22 [RFC] readdir mess Al Viro
2008-08-12 17:02 ` OGAWA Hirofumi
2008-08-12 17:18   ` Linus Torvalds
2008-08-12 18:10     ` Al Viro
2008-08-12 18:22       ` Al Viro
2008-08-12 18:37         ` Al Viro
2008-08-12 19:24           ` Al Viro
2008-08-12 20:02       ` Linus Torvalds
2008-08-12 20:21       ` Linus Torvalds
2008-08-12 20:38         ` Al Viro
2008-08-12 21:04           ` Linus Torvalds
2008-08-13  0:04             ` Al Viro
2008-08-13  0:28               ` Linus Torvalds
2008-08-13  1:19                 ` Al Viro
2008-08-13  1:51                   ` Linus Torvalds
2008-08-13  8:36               ` Brad Boyer [this message]
2008-08-13 16:19                 ` Al Viro
2008-08-15  5:06               ` Jan Harkes
2008-08-15  5:34                 ` Al Viro
2008-08-15 16:58                 ` Linus Torvalds
2008-08-24 10:10                   ` Al Viro
2008-08-24 11:03                     ` Al Viro
2008-08-25 16:16                       ` J. Bruce Fields
2008-08-24 17:20                     ` Linus Torvalds
2008-08-24 19:59                       ` Al Viro
2008-08-24 23:51                         ` Linus Torvalds
2008-08-25  1:33                           ` Al Viro
2008-08-25  1:44                             ` Al Viro
2008-08-12 19:45     ` OGAWA Hirofumi
2008-08-12 20:05       ` Linus Torvalds
2008-08-12 20:59         ` Al Viro
2008-08-12 21:24           ` Linus Torvalds
2008-08-12 21:54             ` Al Viro
2008-08-12 22:04               ` Linus Torvalds
2008-08-13 16:20                 ` J. Bruce Fields
2008-08-12 21:47         ` Alan Cox
2008-08-12 22:20           ` Linus Torvalds
2008-08-12 22:10             ` Alan Cox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080813083635.GA5933@cynthia.pants.nu \
    --to=flar@allandria.com \
    --cc=hirofumi@mail.parknet.co.jp \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@ZenIV.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).