From mboxrd@z Thu Jan 1 00:00:00 1970 From: Jeff Layton Subject: Re: flush and EIO errors when writepages fails Date: Sat, 21 Jun 2008 08:27:42 -0400 Message-ID: <20080621082742.4b311132@tleilax.poochiereds.net> References: <524f69650806201534g67704841u9f942a71b6d9caa3@mail.gmail.com> <20080621070556.GA1155@2ka.mipt.ru> Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Cc: Evgeniy Polyakov , linux-fsdevel , Shirish Pargaonkar , Dave Kleikamp , Jody French To: Steve French Return-path: Received: from mx1.redhat.com ([66.187.233.31]:59837 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750800AbYFUMbR (ORCPT ); Sat, 21 Jun 2008 08:31:17 -0400 In-Reply-To: <20080621070556.GA1155@2ka.mipt.ru> Sender: linux-fsdevel-owner@vger.kernel.org List-ID: On Sat, 21 Jun 2008 11:05:56 +0400 Evgeniy Polyakov wrote: > Hi. > > On Fri, Jun 20, 2008 at 05:34:21PM -0500, Steve French (smfrench@gmail.com) wrote: > > > Right, but with the current implementation, once filemap_fdatawrite > > > returns, any pages that that run touched are either written out or > > > discarded. > > Depending on writepages() implementation, it is not always the case. > Right. I meant with the current cifs_writepages() implementation. It looks like when the write fails we walk the pagevec and do: if (rc) SetPageError(page); kunmap(page); unlock_page(page); end_page_writeback(page); page_cache_release(page); So I'm not certain that the data is actually discarded. I'm no expert in page accounting, but I'm pretty sure that we won't attempt to write it out again from CIFS. > > That could explain some problems if true. When writepages fails, we > > make the pages as in error (PG_error flag?) and presumably they are > > still dirty. Why in the world would anyone free the pages just > > because we failed the first time and need to write them again later? > > Do you where (presumably in /mm) pages could be freed that are still > > dirty (it is hard to find where the PG_error flag is checked etc)? > > You can clear writeback bit but leave/set dirty bit in completion > callback for given request. You can manually insert page into radix tree > with dirty tag. You can also lock page and do not allow to unlock it > until you resent your data. > > So, there is plenty of possibilities to break page accounting in own > writepages() method :) > I still think that if you get a hard error back from the server then you're not likely to have much success on a second attempt to write out the pages. Returning an error to userspace as soon as you realize that it didn't work seems reasonable to me. A non-responding server, on the other hand, may be a place that's recoverable. Either way, if we really want to do a second attempt to write out the pagevec, then adding some code to cifs_writepages that sleeps for a bit and implements this seems like the thing to do. I'm not convinced that it will actually make much difference, but it seems unlikely to hurt anything. -- Jeff Layton