From mboxrd@z Thu Jan 1 00:00:00 1970 From: "Steve French" Subject: Re: [linux-cifs-client] Re: fsx-linux failing with latest cifs-2.6 git tree Date: Mon, 1 Dec 2008 11:43:52 -0600 Message-ID: <524f69650812010943l45b5f117o4386ce29162189e2@mail.gmail.com> References: <20081126130943.GF23649@wotan.suse.de> <20081126113758.66d27c67@tleilax.poochiereds.net> <20081127083330.GB28285@wotan.suse.de> <20081128071845.020907f1@tupile.poochiereds.net> <524f69650811301344n4316bbcam86530e11ef6ca620@mail.gmail.com> <20081130171734.3c37e1b9@tupile.poochiereds.net> <20081201084435.GE2529@wotan.suse.de> <20081201062849.764062b2@tleilax.poochiereds.net> <20081201113226.GD13903@wotan.suse.de> <20081201065519.774b15ef@tleilax.poochiereds.net> Mime-Version: 1.0 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Cc: "Nick Piggin" , linux-fsdevel , pbadari@us.ibm.com, "linux-cifs-client@lists.samba.org" To: "Jeff Layton" Return-path: Received: from nf-out-0910.google.com ([64.233.182.186]:5642 "EHLO nf-out-0910.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752210AbYLARny (ORCPT ); Mon, 1 Dec 2008 12:43:54 -0500 Received: by nf-out-0910.google.com with SMTP id d3so1351861nfc.21 for ; Mon, 01 Dec 2008 09:43:53 -0800 (PST) In-Reply-To: <20081201065519.774b15ef@tleilax.poochiereds.net> Content-Disposition: inline Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Mon, Dec 1, 2008 at 5:55 AM, Jeff Layton wrote: > On Mon, 1 Dec 2008 12:32:26 +0100 > Nick Piggin wrote: > >> On Mon, Dec 01, 2008 at 06:28:49AM -0500, Jeff Layton wrote: >> > On Mon, 1 Dec 2008 09:44:35 +0100 >> > Nick Piggin wrote: >> > > > I think it actually is a problem. Suppose PageChecked is never cleared >> > > > like you say, we flush the page and then do a partial page write again. >> > > > We do a readpage this time and it fails, but the copy of data to the >> > > > page works. Now we hit cifs_write_end and PageChecked is set, but >> > > > the unwritten parts of the page actually aren't up to date. Data >> > > > corruption ensues... >> > > > >> > > > I agree that we should drop that patch. We might be able to make >> > > > cifs_write_end more efficient, but we'll need to be more careful >> > > > with PageChecked. >> > > >> > > Oh? I admittedly haven't looked at the source code after applying >> > > your latest patch, but I thought it should not be possible to have >> > > a leaking PageChecked. The page is under the page lock the whole >> > > time, so a concurrent write should not be an issue...? >> > > >> > >> > But a concurrent write and read is, right? >> > >> > Suppose we do a successful cifs_write_begin and set PageChecked. Another >> > thread then incurs a page fault and does a readpage before we copy the >> > data to the page. Won't we then call write_end with both PageChecked and >> > PageUptodate set? >> > >> > That write will be fine, of course. PageChecked is still true though, >> > and I think that sets up the problem I was describing... >> >> Unless cifs is doing something different from the usual case, it should >> lock the page over the readpage operation (the end IO handler would >> typically unlock the page after doing a SetPageUptodate). >> >> So concurrent reads should be protected with page lock as well. >> > > Ahh good point...the page would be locked there. If it's impossible for > PageUptodate to be flipped on while the page lock is held then this is > probably safe enough. > > I'd still prefer that we handle the situation where both bits are set > in cifs_write_end. Some defensive coding is warranted here I think. > That can wait until 2.6.29 though. For now, the patch in Steve's tree > should be fine, IMO. This (why this defensive code is fine) is similar to what we discussed. I agree that we can leave it as is. -- Thanks, Steve