From mboxrd@z Thu Jan 1 00:00:00 1970 From: Nikita Danilov Subject: Re: Expected getdents behaviour Date: Fri, 16 Sep 2005 11:29:29 +0400 Message-ID: <17194.29785.236460.832682@gargle.gargle.HOWL> References: <1126793268.1676.9.camel@imp.csi.cam.ac.uk> <1126793558.1676.15.camel@imp.csi.cam.ac.uk> <1126797460.1676.23.camel@imp.csi.cam.ac.uk> <20050915164110.GA25573@hasse.suse.de> <20050915174658.GA9974@wohnheim.fh-wedel.de> <20050915181946.GH22503@thunk.org> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Transfer-Encoding: 7bit Cc: Jan Blunck , Anton Altaparmakov , miklos@szeredi.hu, aaranya@cs.sunysb.edu, linux-fsdevel@vger.kernel.org Return-path: Received: from H190.C26.B96.tor.eicat.ca ([66.96.26.190]:57295 "EHLO moraine.clusterfs.com") by vger.kernel.org with ESMTP id S932598AbVIPHii (ORCPT ); Fri, 16 Sep 2005 03:38:38 -0400 To: Theodore Ts'o In-Reply-To: <20050915181946.GH22503@thunk.org> Sender: linux-fsdevel-owner@vger.kernel.org List-Id: linux-fsdevel.vger.kernel.org Theodore Ts'o writes: > On Thu, Sep 15, 2005 at 07:46:58PM +0200, J?rn Engel wrote: > > Correct. But you're missing half of the problem. > > > > The other half is getting O(1) or O(ln(n)) performance for lookup on > > a large directory. Solution to this half alone would be to use a > > balanced tree for directory entries. Now, using a self-balancing tree > > would not solve your half of the problem, so it is not a full > > solution. > > > > Ext2, as far as I understand it, solved both halves by having both a > > hashtree and the sequential list. Getdents is walking the list and > > ignoring entries with the "deleted" flag. Lookup is using the > > hashtree, which points to the list for actual data. > > Actually, no. What we do is return the directory entries in hash sort > order, by walking the btree. And we use the hash as the offset > returned in d_off and via telldir(). The reason for this? So that if > we add files which causes a node to split, that we still only return > files in the directory once and only once. If we traversed the tree Except for hash collided directory entries, that are returned multiple times, because after seekdir() ext2_readdir() restarts from the start of hash-bucket, right? Nikita.