From: Chris Mason
Subject: [PATCH 0 of 8] O_DIRECT locking rework v5
Date: Thu, 21 Dec 2006 20:45:52 -0500
Message-ID: <20061222014552.GA26388@think.oraclecorp.com>
To: linux-fsdevel@vger.kernel.org, akpm@osdl.org, zach.brown@oracle.com

[ resend, sorry for the messed up headers ]

I took a small detour on the O_DIRECT locking rework to look at
alternatives for range locking in the pagecache. After benchmarking a
few different types of trees, I didn't find anything that would match
the radix tree for random lookup performance.

This patchset does ranges in the radix tree by inserting a placeholder
at the last slot in the range and forcing all lookups to search forward.
It means radix_tree_gang_lookup must be used instead of
radix_tree_lookup, but this is still faster for random searches than
anything else I tried.

A bit is set on the radix root node to indicate whether range searching
is required. So, when O_DIRECT isn't used, or O_DIRECT is used only for
tiny I/Os, no range lookups are done.

With O_DIRECT in use, only a single placeholder is inserted to lock down
the entire range for a given I/O. This should stand up pretty well for
those monster XFS workloads. If the mapping has pages on it, I do one
placeholder for every 64k to limit the number of pages pinned down
during the I/O. There's a lot of hand waving here; it may be best to get
rid of this special case.

Patch against Linus' git from today. Testing has been light; I'm mostly
looking for comments on the range locking tricks.

-chris
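
A rough userspace sketch of the placeholder scheme described above, for readers who want to see the lookup rule in isolation. This is not code from the patchset: a sorted array stands in for the radix tree, and struct toy_tree, toy_lock_range, toy_find_forward and toy_index_locked are hypothetical names. It assumes locked ranges do not overlap and contain no other entries, and it treats a one-page lock as "tiny" so that the range flag stays clear, which is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ENTRIES 64

/* Toy stand-in for a pagecache radix tree: entries kept sorted by index. */
struct entry {
    unsigned long index;  /* slot this entry occupies */
    bool placeholder;     /* true: this entry locks [start, index] */
    unsigned long start;  /* first index covered (placeholders only) */
};

struct toy_tree {
    struct entry slots[MAX_ENTRIES];
    size_t count;
    bool has_ranges;      /* mirrors the "range search needed" root bit */
};

/* Insert an entry, keeping the array sorted by index. */
static void toy_insert(struct toy_tree *t, struct entry e)
{
    size_t i = t->count;

    assert(t->count < MAX_ENTRIES);
    while (i > 0 && t->slots[i - 1].index > e.index) {
        t->slots[i] = t->slots[i - 1];
        i--;
    }
    t->slots[i] = e;
    t->count++;
}

/* Lock [start, end] with a single placeholder stored at the LAST slot. */
static void toy_lock_range(struct toy_tree *t, unsigned long start,
                           unsigned long end)
{
    struct entry e = { .index = end, .placeholder = true, .start = start };

    toy_insert(t, e);
    if (start != end)     /* assumption: one-page locks need no range search */
        t->has_ranges = true;
}

/* Exact-match lookup, the cheap path when no ranges exist. */
static const struct entry *toy_find_exact(const struct toy_tree *t,
                                          unsigned long index)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->slots[i].index == index)
            return &t->slots[i];
    return NULL;
}

/* Forward search: first entry at or after index (a gang lookup of one). */
static const struct entry *toy_find_forward(const struct toy_tree *t,
                                            unsigned long index)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->slots[i].index >= index)
            return &t->slots[i];
    return NULL;
}

/* Is this index covered by a placeholder? */
static bool toy_index_locked(const struct toy_tree *t, unsigned long index)
{
    const struct entry *e;

    if (!t->has_ranges) {
        e = toy_find_exact(t, index);
        return e && e->placeholder;
    }
    e = toy_find_forward(t, index);
    return e && e->placeholder && e->start <= index;
}
```

The point of storing the placeholder at the end of the range is that one forward search from any index inside the range lands on it, so a lock over many pages costs a single insertion. The has_ranges check keeps the exact-match path for mappings that never see a multi-page lock.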