From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Steve French (smfltc)" <smfltc@us.ibm.com>
Subject: Re: flush and EIO errors when writepages fails
Date: Fri, 20 Jun 2008 11:41:26 -0500
Message-ID: <485BDDB6.1040400@us.ibm.com>
References: <20080620073150.2bc9988e@tupile.poochiereds.net> <OFE8C66E61.981E25D1-ON8725746E.0045A92A-8625746E.004718C0@us.ibm.com> <20080620091542.09edb43f@tupile.poochiereds.net> <485BD887.8090608@us.ibm.com> <20080620123451.2a038eea@tleilax.poochiereds.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Shirish S Pargaonkar <shirishp@us.ibm.com>,
	shaggy@linux.vnet.ibm.com, linux-fsdevel@vger.kernel.org
To: Jeff Layton <jlayton@redhat.com>
Return-path: <linux-fsdevel-owner@vger.kernel.org>
Received: from e4.ny.us.ibm.com ([32.97.182.144]:44708 "EHLO e4.ny.us.ibm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1758264AbYFTQka (ORCPT <rfc822;linux-fsdevel@vger.kernel.org>);
	Fri, 20 Jun 2008 12:40:30 -0400
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236])
	by e4.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m5KGeTve019919
	for <linux-fsdevel@vger.kernel.org>; Fri, 20 Jun 2008 12:40:29 -0400
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])
	by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v9.0) with ESMTP id m5KGeTfb220388
	for <linux-fsdevel@vger.kernel.org>; Fri, 20 Jun 2008 12:40:29 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1])
	by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m5KGeS6G014905
	for <linux-fsdevel@vger.kernel.org>; Fri, 20 Jun 2008 12:40:28 -0400
In-Reply-To: <20080620123451.2a038eea@tleilax.poochiereds.net>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID: <linux-fsdevel.vger.kernel.org>

Jeff Layton wrote:
> On Fri, 20 Jun 2008 11:19:19 -0500
> "Steve French (smfltc)" <smfltc@us.ibm.com> wrote:
>
>> If flush fails to write all dirty pages (due to an I/O error on the 
>> server, server disk or networking stack) today the error (EIO) is marked 
>> in the inode, and returned on close.   I think cifs_flush (which is 
>> called before close by the vfs) should also (perhaps after sleep a 
>> second or so then) retry at least once on the filemap_fdatawrite before 
>> giving up.  (perhaps retry more if mounted hard) Thoughts?
>>
>
> A couple of thoughts...
>
> Retrying is only likely to be helpful if the server isn't responding. We
> could consider doing a better job there somehow.
>
>   
The particular case I am thinking of at the moment, and which I hope a
retry would help, is the one in which memory pressure keeps the TCP/IP
stack or the underlying (perhaps badly written) network adapter driver
from even getting the SMB write packet onto the wire.

> If you want to be more aggressive about handling errors when writing
> out pages, then most of the changes will need to be made at the
> cifs_writepages level, not so much with cifs_flush.
>   
flush is our "last chance" to write the file data: once flush and close
have been called the file handle is gone, so we can no longer write the
file data after that point.
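
To make the retry idea concrete, here is a rough userspace sketch, not
actual cifs code. The names flush_with_retry, writeback_fn and
flaky_writeback are made up for illustration; the callback stands in for
the filemap_fdatawrite call, and the retry count and delay are assumed
parameters rather than anything the kernel defines.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical stand-in for the writeback step (filemap_fdatawrite in the
 * real code path): returns 0 on success or a negative errno. */
typedef int (*writeback_fn)(void *ctx);

/*
 * Try the writeback once, then retry up to max_retries more times if it
 * fails with -EIO, sleeping delay_secs between attempts. Any other
 * result (success or a different error) is returned immediately.
 */
static int flush_with_retry(writeback_fn write_dirty_pages, void *ctx,
			    int max_retries, unsigned int delay_secs)
{
	int rc;
	int attempt;

	assert(write_dirty_pages != NULL && max_retries >= 0);

	for (attempt = 0; ; attempt++) {
		rc = write_dirty_pages(ctx);
		if (rc != -EIO || attempt >= max_retries)
			break;
		sleep(delay_secs);	/* give a transient failure time to clear */
	}
	return rc;
}

/* Example writeback that fails with -EIO a fixed number of times. */
struct flaky_state {
	int failures_left;
	int calls;
};

static int flaky_writeback(void *ctx)
{
	struct flaky_state *st = ctx;

	st->calls++;
	if (st->failures_left > 0) {
		st->failures_left--;
		return -EIO;
	}
	return 0;
}
```

With max_retries = 1 and delay_secs = 1 this corresponds to "sleep a
second or so, then retry once"; a hard mount could pass a larger
max_retries. The error is still returned to the caller if the last
attempt fails, so close would report EIO as it does today.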