From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jody French
Subject: Re: flush and EIO errors when writepages fails
Date: Sat, 21 Jun 2008 09:21:42 -0500
Message-ID: <485D0E76.3020205@austin.rr.com>
References: <524f69650806201534g67704841u9f942a71b6d9caa3@mail.gmail.com> <20080621070556.GA1155@2ka.mipt.ru> <20080621082742.4b311132@tleilax.poochiereds.net> <20080621131919.GA22509@2ka.mipt.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jeff Layton, Steve French, linux-fsdevel, Shirish Pargaonkar, Dave Kleikamp
To: Evgeniy Polyakov
Return-path:
Received: from hrndva-omtalb.mail.rr.com ([71.74.56.122]:34126 "EHLO hrndva-omtalb.mail.rr.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751041AbYFUOhh (ORCPT ); Sat, 21 Jun 2008 10:37:37 -0400
In-Reply-To: <20080621131919.GA22509@2ka.mipt.ru>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Evgeniy Polyakov wrote:
>> Either way, if we really want to do a second attempt to write out the
>> pagevec, then adding some code to cifs_writepages that sleeps for a bit
>> and implements this seems like the thing to do. I'm not convinced that
>> it will actually make much difference, but it seems unlikely to hurt
>> anything.
>>
>
> If server returns serious error then there is no other way except to
> discard data with error, but if server does not respond or respond EBUSY
> or that kind of error, then subsequent write can succeed and at least
> should not harm. As a simple case, it is possible to sleep a bit and
> resend in writepages(), but it is also possible just to return from the
> callback and allow VFS to call it again (frequently it will happen very
> soon though) with the same pages (or even increased chunk).
>

In the particular case we are looking at, the network stack is
returning EAGAIN for more than 15 seconds on the TCP send of the Write
request (perhaps due to a temporary glitch in the network adapter or
routing infrastructure, or to temporary memory pressure). The server
itself has not crashed: subsequent parts of the file, written via later
writepages requests, are eventually written out.

Eventually we give up in writepages and return EIO on the next fsync or
flush/close. If we could make one more attempt in flush to write all
dirty pages, including the ones that we timed out on, that would help.

In addition, if readpage is about to do a partial page read into a
dirty page that we were unable to write out, we would like to try once
more before corrupting the data.
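
To make the idea concrete, here is a rough user-space sketch of the two pieces: retrying a send only while the error looks transient, and giving previously failed pages one more chance at flush time. This is not the cifs code. The names send_fn, write_with_retry, flush_retry_failed, struct dirty_page and flaky_send are all made up for illustration, and the attempt counts are arbitrary assumptions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical transport callback: returns 0 on success, negative errno on failure. */
typedef int (*send_fn)(void *ctx, const void *buf, size_t len);

/* Transient errors may clear up on their own; anything else is treated as fatal. */
static int is_transient(int err)
{
    return err == -EAGAIN || err == -EBUSY;
}

/*
 * Call send() up to max_attempts times. Stop early on success or on a
 * non-transient error. If every attempt fails transiently, give up with -EIO.
 */
static int write_with_retry(send_fn send, void *ctx,
                            const void *buf, size_t len, int max_attempts)
{
    assert(max_attempts > 0);

    for (int i = 0; i < max_attempts; i++) {
        int rc = send(ctx, buf, len);

        if (rc == 0 || !is_transient(rc))
            return rc;
        /* Real code would sleep or back off here before trying again. */
    }
    return -EIO;
}

/* Hypothetical bookkeeping for a page whose earlier writeback timed out. */
struct dirty_page {
    const void *data;
    size_t len;
    int failed; /* nonzero if the earlier write gave up on this page */
};

/*
 * Flush-time second chance: make one more attempt for each page marked
 * failed. Returns 0 if all of them went out, -EIO if any is still stuck.
 */
static int flush_retry_failed(struct dirty_page *pages, size_t n,
                              send_fn send, void *ctx)
{
    int result = 0;

    for (size_t i = 0; i < n; i++) {
        if (!pages[i].failed)
            continue;
        if (write_with_retry(send, ctx, pages[i].data, pages[i].len, 1) == 0)
            pages[i].failed = 0;
        else
            result = -EIO;
    }
    return result;
}

/* Fake transport for experiments: returns -EAGAIN until the counter in ctx hits zero. */
static int flaky_send(void *ctx, const void *buf, size_t len)
{
    int *remaining_failures = ctx;

    (void)buf;
    (void)len;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        return -EAGAIN;
    }
    return 0;
}
```

With flaky_send set to fail twice, write_with_retry with three attempts gets through on the third. If the glitch outlasts the attempts, the caller sees -EIO and the page stays marked failed, so a later flush can try it again. The same second chance could be applied before a partial read into such a page.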