Date: Wed, 16 May 2007 17:39:19 +0200
From: Johann Lombardi
To: Andrew Morton
Cc: linux-kernel@vger.kernel.org
Subject: Re: Clear PG_error before reading a page
Message-ID: <20070516153919.GC2630@chiva>
References: <20070515143726.GC2160@chiva> <20070515101144.f7072476.akpm@linux-foundation.org> <20070515210124.GA23698@chiva> <20070515142339.4d9098f3.akpm@linux-foundation.org>
In-Reply-To: <20070515142339.4d9098f3.akpm@linux-foundation.org>

On Tue, May 15, 2007 at 02:23:39PM -0700, Andrew Morton wrote:
> > Yes, indeed. However, as soon as a call to get_block() fails,
> > do_mpage_readpage() will call block_read_full_page() which will attach
> > buffers to this page.
> > Consequently, all subsequent reads will go through block_read_full_page().
>
> hm, confused. Why is get_block() failing? That has to go and read
> metadata.

In fact, I am referring to the first part of my test case (i.e. mount
the ext3 fs, enable medium errors in scsi_debug and try to read a file
from the fs).
So, when I try to read a file, ext3_get_block() needs to read metadata
from the disk.
However, given that the SCSI disk simulated by scsi_debug reports medium
errors, ext3_get_block() returns EIO to the caller (i.e.
do_mpage_readpage()). That's why get_block() is failing.

Then, do_mpage_readpage() calls block_read_full_page() (via "goto
confused"). block_read_full_page() attaches buffers to this page and
calls ext3_get_block(), which fails for the same reason as before.
Consequently, block_read_full_page() sets the PG_error flag.
Moreover, all subsequent readpage calls will go through
block_read_full_page() because the page now has buffers attached.

Basically, my problem is that afterwards, when the device no longer
returns any errors, the PG_error flag is never cleared and, as a result,
I keep getting -EIO. That's the problem I'd like to address.

> If get_block() failed then we don't know what blocks to read to
> bring this page uptodate, so the pagecache page should remain in state
> !PageUptodate(), !PageError(). But then, we shouldn't have populated
> pagecache at that offset at all.

Yes, indeed. do_generic_mapping_read() doesn't populate the pagecache if
an error occurred. Still,
__do_page_cache_readahead()->read_pages()->ext3_readpages()->mpage_readpages()
does populate the pagecache even if the page reads failed.

> I think I'm missing something here. I suspect you're referring to a mix
> of reading the blockdev via /dev/hda1 and then using the already-populated
> pagecache as filesystem metadata, or something?

I don't think so, unless I'm missing something.

> Is the PageError page part of an S_ISREG file, or is it part of an S_ISBLK
> file?

The PageError pages are part of a regular file and have been read
through readahead.

Johann
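
To make the sequence above concrete, here is a small userspace sketch in
C. It is not kernel code: struct toy_page, struct toy_dev and every
toy_*() function are hypothetical stand-ins invented for illustration.
It also assumes that the read path refuses to mark a page up to date
while its error flag is still set, which is how I understand the stale
PG_error to turn into a persistent -EIO. The clear_error_first switch
models the idea from the subject line, i.e. clearing the flag before the
read is issued.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the state discussed here. */
struct toy_page {
    bool uptodate;     /* models PG_uptodate */
    bool error;        /* models PG_error */
    bool has_buffers;  /* models "buffers are attached to the page" */
};

/* Hypothetical stand-in for the scsi_debug disk. */
struct toy_dev {
    bool failing;      /* true while medium errors are enabled */
};

/* Models ext3_get_block(): it must read metadata, so it fails with the disk. */
static int toy_get_block(const struct toy_dev *dev)
{
    return dev->failing ? -EIO : 0;
}

/* Models the block_read_full_page() path taken once get_block() has failed. */
static void toy_readpage(struct toy_page *page, const struct toy_dev *dev,
                         bool clear_error_first)
{
    if (clear_error_first)
        page->error = false;   /* the proposed change: drop the stale flag */

    page->has_buffers = true;  /* later reads keep coming through here */

    if (toy_get_block(dev) < 0) {
        page->error = true;    /* the failure is recorded on the page */
        return;
    }

    /* Assumption: a page with the error flag set is never marked uptodate. */
    if (!page->error)
        page->uptodate = true;
}

/* Models the caller: a page that is not uptodate after the read is -EIO. */
static int toy_read(struct toy_page *page, const struct toy_dev *dev,
                    bool clear_error_first)
{
    toy_readpage(page, dev, clear_error_first);
    return page->uptodate ? 0 : -EIO;
}

/* Test case from this mail: read with errors on, then again with errors off. */
static int toy_scenario(bool clear_error_first)
{
    struct toy_page page = { false, false, false };
    struct toy_dev dev = { true };
    int first = toy_read(&page, &dev, clear_error_first);

    assert(first == -EIO);     /* the disk is still reporting medium errors */
    dev.failing = false;       /* medium errors switched off */
    return toy_read(&page, &dev, clear_error_first);
}
```

In this model, toy_scenario(false) still returns -EIO after the device
has recovered, while toy_scenario(true) returns 0. The real code paths
are of course more involved (per-buffer state, async completion), so
this only shows the flag lifecycle, not the actual patch.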