From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Steve French (smfltc)"
Subject: Re: flush and EIO errors when writepages fails
Date: Fri, 20 Jun 2008 11:41:26 -0500
Message-ID: <485BDDB6.1040400@us.ibm.com>
References: <20080620073150.2bc9988e@tupile.poochiereds.net>
 <20080620091542.09edb43f@tupile.poochiereds.net>
 <485BD887.8090608@us.ibm.com>
 <20080620123451.2a038eea@tleilax.poochiereds.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Shirish S Pargaonkar , shaggy@linux.vnet.ibm.com, linux-fsdevel@vger.kernel.org
To: Jeff Layton
Return-path:
Received: from e4.ny.us.ibm.com ([32.97.182.144]:44708 "EHLO e4.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1758264AbYFTQka (ORCPT );
 Fri, 20 Jun 2008 12:40:30 -0400
Received: from d01relay04.pok.ibm.com (d01relay04.pok.ibm.com [9.56.227.236])
 by e4.ny.us.ibm.com (8.13.8/8.13.8) with ESMTP id m5KGeTve019919
 for ; Fri, 20 Jun 2008 12:40:29 -0400
Received: from d01av03.pok.ibm.com (d01av03.pok.ibm.com [9.56.224.217])
 by d01relay04.pok.ibm.com (8.13.8/8.13.8/NCO v9.0) with ESMTP id m5KGeTfb220388
 for ; Fri, 20 Jun 2008 12:40:29 -0400
Received: from d01av03.pok.ibm.com (loopback [127.0.0.1])
 by d01av03.pok.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id m5KGeS6G014905
 for ; Fri, 20 Jun 2008 12:40:28 -0400
In-Reply-To: <20080620123451.2a038eea@tleilax.poochiereds.net>
Sender: linux-fsdevel-owner@vger.kernel.org
List-ID:

Jeff Layton wrote:
> On Fri, 20 Jun 2008 11:19:19 -0500
> "Steve French (smfltc)" wrote:
>
>
>> If flush fails to write all dirty pages (due to an I/O error on the
>> server, server disk or networking stack) today the error (EIO) is marked
>> in the inode, and returned on close. I think cifs_flush (which is
>> called before close by the vfs) should also (perhaps after sleep a
>> second or so then) retry at least once on the filemap_fdatawrite before
>> giving up. (perhaps retry more if mounted hard) Thoughts?
>>
>>
>
> A couple of thoughts...
>
> Retrying is only likely to be helpful if the server isn't responding. We
> could consider doing a better job there somehow.
>
>

The particular problem case that I am thinking of at the moment, and which I hope a retry would help, is the case in which memory pressure prevents the TCP/IP stack or the underlying (perhaps badly written) network adapter driver from allowing the SMB write packet to even get to the wire.

> If you want to be more aggressive about handling errors when writing
> out pages, then most of the changes will need to be made at the
> cifs_writepages level, not so much with cifs_flush.
>

flush is our "last chance" effort to write the file data - once flush and close have been called, the file handle is gone, so we can no longer write the file data after that point.
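
To make the retry idea concrete, here is a rough userspace sketch of the control flow only; it is not CIFS code. The names flush_with_retry, writeback_fn, retry_delay_ms, flaky_writer and flaky_writeback are all made up for illustration. The callback stands in for a writeback call such as filemap_fdatawrite, and the sketch assumes it returns 0 on success or a negative errno value on failure.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a writeback call such as filemap_fdatawrite():
 * assumed to return 0 on success or a negative errno value on failure. */
typedef int (*writeback_fn)(void *ctx);

/* Sleep for the given number of milliseconds between attempts. */
static void retry_delay_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/*
 * Attempt writeback once. If it fails with -EIO, wait delay_ms and try
 * again, up to max_retries additional attempts. Any other result
 * (success or a different error) is returned immediately.
 */
static int flush_with_retry(writeback_fn do_writeback, void *ctx,
                            int max_retries, long delay_ms)
{
    int rc;
    int attempt;

    assert(do_writeback != NULL);

    rc = do_writeback(ctx);
    for (attempt = 0; rc == -EIO && attempt < max_retries; attempt++) {
        retry_delay_ms(delay_ms);
        rc = do_writeback(ctx);
    }
    return rc;
}

/* Simulated writer that fails with -EIO a fixed number of times,
 * e.g. a send that cannot reach the wire under memory pressure. */
struct flaky_writer {
    int failures_left;
    int calls;
};

static int flaky_writeback(void *ctx)
{
    struct flaky_writer *w = ctx;

    w->calls++;
    if (w->failures_left > 0) {
        w->failures_left--;
        return -EIO;
    }
    return 0;
}
```

With max_retries set to 1 and delay_ms around 1000 this matches the "sleep a second or so, then retry at least once" suggestion; a hard mount could pass a larger max_retries. Whether the retry belongs in cifs_flush or down in cifs_writepages is a separate question that this sketch does not settle.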