From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Layton Subject: Re: [linux-cifs-client] Re: fsx-linux failing with latest cifs-2.6 git tree Date: Mon, 1 Dec 2008 06:55:19 -0500 Message-ID: <20081201065519.774b15ef@tleilax.poochiereds.net> References: <20081126130943.GF23649@wotan.suse.de> <20081126100857.63e4b45d@tleilax.poochiereds.net> <20081126152332.GA23539@wotan.suse.de> <20081126113758.66d27c67@tleilax.poochiereds.net> <20081127083330.GB28285@wotan.suse.de> <20081128071845.020907f1@tupile.poochiereds.net> <524f69650811301344n4316bbcam86530e11ef6ca620@mail.gmail.com> <20081130171734.3c37e1b9@tupile.poochiereds.net> <20081201084435.GE2529@wotan.suse.de> <20081201062849.764062b2@tleilax.poochiereds.net> <20081201113226.GD13903@wotan.suse.de> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Steve French , linux-fsdevel , pbadari@us.ibm.com, "linux-cifs-client@lists.samba.org" To: Nick Piggin Return-path: Received: from mx2.redhat.com ([66.187.237.31]:46607 "EHLO mx2.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753378AbYLALz2 (ORCPT ); Mon, 1 Dec 2008 06:55:28 -0500 In-Reply-To: <20081201113226.GD13903@wotan.suse.de> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Mon, 1 Dec 2008 12:32:26 +0100 Nick Piggin wrote: > On Mon, Dec 01, 2008 at 06:28:49AM -0500, Jeff Layton wrote: > > On Mon, 1 Dec 2008 09:44:35 +0100 > > Nick Piggin wrote: > > > > I think it actually is a problem. Suppose PageChecked is never cleared > > > > like you say, we flush the page and then do a partial page write again. > > > > We do a readpage this time and it fails, but the copy of data to the > > > > page works. Now we hit cifs_write_end and PageChecked is set, but > > > > the unwritten parts of the page actually aren't up to date. Data > > > > corruption ensues... > > > > > > > > I agree that we should drop that patch. We might be able to make > > > > cifs_write_end more efficient, but we'll need to be more careful > > > > with PageChecked. > > > > > > Oh? I admittedly haven't looked at the source code after applying > > > your latest patch, but I thought it should not be possible to have > > > a leaking PageChecked. The page is under the page lock the whole > > > time, so a concurrent write should not be an issue...? > > > > > > > But a concurrent write and read is, right? > > > > Suppose we do a successful cifs_write_begin and set PageChecked. Another > > thread then incurs a page fault and does a readpage before we copy the > > data to the page. Won't we then call write_end with both PageChecked and > > PageUptodate set? > > > > That write will be fine, of course. PageChecked is still true though, > > and I think that sets up the problem I was describing... > > Unless cifs is doing something different from the usual case, it should > lock the page over the readpage operation (the end IO handler would > typically unlock the page after doing a SetPageUptodate). > > So concurrent reads should be protected with page lock as well. > Ahh good point...the page would be locked there. If it's impossible for PageUptodate to be flipped on while the page lock is held then this is probably safe enough. I'd still prefer that we handle the situation where both bits are set in cifs_write_end. Some defensive coding is warranted here I think. That can wait until 2.6.29 though. For now, the patch in Steve's tree should be fine, IMO. -- Jeff Layton