From mboxrd@z Thu Jan 1 00:00:00 1970
From: David Howells
Subject: Re: afs_fsync
Date: Mon, 19 Apr 2010 12:54:09 +0100
Message-ID: <10333.1271678049@redhat.com>
References: <20100418194653.GA20069@lst.de>
Cc: dhowells@redhat.com, linux-fsdevel@vger.kernel.org, linux-afs@lists.infradead.org
To: Christoph Hellwig
Return-path:
Received: from mx1.redhat.com ([209.132.183.28]:60627 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753580Ab0DSLyR (ORCPT ); Mon, 19 Apr 2010 07:54:17 -0400
In-Reply-To: <20100418194653.GA20069@lst.de>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Christoph Hellwig wrote:

> I've been looking at afs_fsync a bit lately and don't quite
> understanding what's going on there. As of 2.6.32 we always
> write out all data before calling into ->fsync. From my very
> unscientific exploration into afs_fsync it's doing exactly that
> data writeout again, just in a rather complicated way, and
> then marks the inode as having dirty pages again, which is not
> very helpful inside ->fsync. Any chance you could explain
> what's really going on there?

kAFS maintains a queue of outstanding writebacks for each inode/vnode.
When the range being written has multiple elements in that queue,
afs_writepages() attempts to write them back in order.

Each element of the queue is an afs_writeback struct, which maps a written
region of the file to a key struct. The key defines the security details to
use for the AFS StoreData op. These afs_writeback structs also make it
easier to write back multiple pages with one network op.

Now, afs_fsync() sticks a null record at the tail of the queue, so that it
will get woken up when everything that was ahead of it in the queue at that
point is gone. It then invokes afs_writepages() to write out the contents of
the queue and waits for the null record to be processed.

I don't recall why I put __mark_inode_dirty() in there with I_DIRTY_PAGES.
I would suspect I copied it from somewhere, but I don't remember where.

I should probably optimise afs_fsync() to not do anything if
vnode->writebacks is empty once it has taken vnode->writeback_lock.

David
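
For illustration, the null-record scheme described above can be sketched in
user-space C roughly as follows. This is not the kAFS code: struct wb_record,
struct wb_queue, wb_enqueue(), wb_flush_until() and sketch_fsync() are all
hypothetical names, there is no locking or sleeping, and "storing" a record
is reduced to counting it. The early return on an empty queue corresponds to
the optimisation suggested above, not to what afs_fsync() currently does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an afs_writeback element: one dirty region. */
struct wb_record {
    struct wb_record *next;
    long first, last;   /* page range covered (unused for a barrier) */
    int key_id;         /* stand-in for the key used by the store op */
    bool is_barrier;    /* true for the null record appended by fsync */
    bool done;          /* set once the record has left the queue */
};

/* Hypothetical per-vnode queue of outstanding writebacks. */
struct wb_queue {
    struct wb_record *head, *tail;
};

/* Append a record at the tail of the queue. */
static void wb_enqueue(struct wb_queue *q, struct wb_record *r)
{
    r->next = NULL;
    r->done = false;
    if (q->tail)
        q->tail->next = r;
    else
        q->head = r;
    q->tail = r;
}

/*
 * Process records in queue order, stopping once the barrier has been
 * dequeued. Returns the number of data records "stored". Records queued
 * behind the barrier are left alone.
 */
static int wb_flush_until(struct wb_queue *q, struct wb_record *barrier)
{
    int stored = 0;

    while (q->head) {
        struct wb_record *r = q->head;

        q->head = r->next;
        if (!q->head)
            q->tail = NULL;
        r->done = true;
        if (r == barrier)
            break;
        if (!r->is_barrier)
            stored++;   /* a real implementation would issue the store op */
    }
    return stored;
}

/* Sketch of the fsync path: append a null record, flush, check it is gone. */
static int sketch_fsync(struct wb_queue *q)
{
    struct wb_record barrier = { .is_barrier = true };
    int stored;

    if (!q->head)       /* assumed optimisation: nothing outstanding */
        return 0;

    wb_enqueue(q, &barrier);
    stored = wb_flush_until(q, &barrier);
    assert(barrier.done);   /* everything queued ahead of it is gone too */
    return stored;
}
```

In a real filesystem the flush would run asynchronously and the fsync caller
would sleep until the null record is marked done; here both happen in one
thread, so the wait collapses into a simple check.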