From mboxrd@z Thu Jan  1 00:00:00 1970
From: Jody French <jfrench@austin.rr.com>
Subject: Re: flush and EIO errors when writepages fails
Date: Sat, 21 Jun 2008 09:21:42 -0500
Message-ID: <485D0E76.3020205@austin.rr.com>
References: <524f69650806201534g67704841u9f942a71b6d9caa3@mail.gmail.com> <20080621070556.GA1155@2ka.mipt.ru> <20080621082742.4b311132@tleilax.poochiereds.net> <20080621131919.GA22509@2ka.mipt.ru>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Jeff Layton <jlayton@redhat.com>,
	Steve French <smfrench@gmail.com>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Shirish Pargaonkar <shirishpargaonkar@gmail.com>,
	Dave Kleikamp <shaggy@linux.vnet.ibm.com>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from hrndva-omtalb.mail.rr.com ([71.74.56.122]:34126 "EHLO
	hrndva-omtalb.mail.rr.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751041AbYFUOhh (ORCPT
	<rfc822;linux-fsdevel@vger.kernel.org>);
	Sat, 21 Jun 2008 10:37:37 -0400
In-Reply-To: <20080621131919.GA22509@2ka.mipt.ru>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

Evgeniy Polyakov wrote:
>> Either way, if we really want to do a second attempt to write out the
>> pagevec, then adding some code to cifs_writepages that sleeps for a bit
>> and implements this seems like the thing to do. I'm not convinced that
>> it will actually make much difference, but it seems unlikely to hurt
>> anything.
>>     
>
> If server returns serious error then there is no other way except to
> discard data with error, but if server does not respond or respond EBUSY
> or that kind of error, then subsequent write can succeed and at least
> should not harm. As a simple case, it is possible to sleep a bit and
> resend in writepages(), but it is also possible just to return from the
> callback and allow VFS to call it again (frequently it will happen very
> soon though) with the same pages (or even increased chunk).
>   
In the particular case we are looking at, the network stack (TCP,
perhaps due to a temporary glitch in the network adapter or routing
infrastructure, or temporary memory pressure) is returning EAGAIN for
more than 15 seconds on the tcp send of the Write request. The server
itself has not crashed: subsequent parts of the file, written via later
writepages requests, are eventually written out. Eventually we give up
in writepages and return EIO on the next fsync or flush/close. If we
could make one more attempt to go through in flush, writing all dirty
pages including the ones that we timed out on, that would help. In
addition, if readpage is about to do a partial page read into a dirty
page that we were unable to write out, we would like to try once more
before corrupting the data.
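
To make the idea concrete, here is a rough user-space sketch of the
behaviour I have in mind. It is not cifs code: sim_page, sim_send,
sim_writepages and sim_flush are made-up names, the "link" is simulated
with a counter rather than a timeout, and the real change would have to
live in cifs_writepages and the flush path. It only shows the shape of
the logic: writepages gives up after a bounded number of EAGAIN
retries but leaves the page dirty, and flush makes one more pass before
reporting EIO.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 4

/* Hypothetical stand-in for a page-cache page; not a kernel structure. */
struct sim_page {
	bool dirty;       /* still needs to reach the server */
	bool write_error; /* an earlier writepages attempt gave up */
};

/* Hypothetical transport: the next link_down_calls sends fail with -EAGAIN. */
static int link_down_calls;

static int sim_send(struct sim_page *pg)
{
	(void)pg;
	if (link_down_calls > 0) {
		link_down_calls--;
		return -EAGAIN;
	}
	return 0;
}

/* Try a page up to max_tries times; retry only while we see -EAGAIN. */
static int sim_write_page(struct sim_page *pg, int max_tries)
{
	int rc = -EAGAIN;

	for (int i = 0; i < max_tries && rc == -EAGAIN; i++)
		rc = sim_send(pg);
	if (rc == 0) {
		pg->dirty = false;
		pg->write_error = false;
	} else {
		pg->write_error = true; /* keep it dirty, remember the failure */
	}
	return rc;
}

/* writepages: bounded retries per page; a failed page is kept, not dropped. */
static void sim_writepages(struct sim_page *pages, size_t n, int max_tries)
{
	for (size_t i = 0; i < n; i++)
		if (pages[i].dirty)
			sim_write_page(&pages[i], max_tries);
}

/* flush: one more attempt on every dirty page before reporting -EIO. */
static int sim_flush(struct sim_page *pages, size_t n)
{
	int rc = 0;

	for (size_t i = 0; i < n; i++)
		if (pages[i].dirty && sim_write_page(&pages[i], 1) != 0)
			rc = -EIO;
	return rc;
}
```

If the glitch has cleared by the time of the flush, the page that timed
out earlier gets written and the application never sees an error; if
the link is still down, flush returns EIO as it does today. The same
"try once more" step could presumably be called from readpage before a
partial read into a dirty page, though I have not sketched that here.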