From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S263397AbTJQLpS (ORCPT );
        Fri, 17 Oct 2003 07:45:18 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org
        id S263403AbTJQLpS (ORCPT );
        Fri, 17 Oct 2003 07:45:18 -0400
Received: from thebsh.namesys.com ([212.16.7.65]:65497 "HELO thebsh.namesys.com")
        by vger.kernel.org with SMTP id S263397AbTJQLpM (ORCPT );
        Fri, 17 Oct 2003 07:45:12 -0400
Message-ID: <3F8FD646.9070902@namesys.com>
Date: Fri, 17 Oct 2003 15:45:10 +0400
From: Hans Reiser
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.5) Gecko/20031007
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: Norman Diamond
CC: Wes Janzen, Rogier Wolff, John Bradford,
        linux-kernel@vger.kernel.org, nikita@namesys.com,
        Pavel Machek, Vitaly Fertman
Subject: Re: Blockbusting news, this is important (Re: Why are bad disk
        sectors numbered strangely, and what happens to them?)
References: <32a101c3916c$e282e330$5cee4ca5@DIAMONDLX60>
        <200310131014.h9DAEwY3000241@81-2-122-30.bradfords.org.uk>
        <33a201c39174$2b936660$5cee4ca5@DIAMONDLX60>
        <20031014064925.GA12342@bitwizard.nl>
        <3F8BA037.9000705@sbcglobal.net>
        <3F8BBC08.6030901@namesys.com>
        <11bf01c39492$bc5307c0$3eee4ca5@DIAMONDLX60>
        <3F8FBADE.7020107@namesys.com>
        <126d01c3949f$91bdecc0$3eee4ca5@DIAMONDLX60>
In-Reply-To: <126d01c3949f$91bdecc0$3eee4ca5@DIAMONDLX60>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

Norman Diamond wrote:

>Replying first to Hans Reiser; below to Russell King and Pavel Machek.
>
>>Instead of recording the bad blocks, just write to them.
>
>If writes are guaranteed to force reallocations then this is potentially
>part of a solution.
>
>I still remain suspicious because the first failed read was milliseconds or
>minutes after the preceding write.
>I think the odds are very high that the
>sector was already bad at the time of the write but reallocation did not
>occur.  It is possible but I think very unlikely that the sector was
>reallocated to a different physical sector which went bad milliseconds after
>being written after reallocation, and equally unlikely that the sector
>wasn't reallocated because it really hadn't been bad but went bad
>milliseconds later.  In other words, I think it is overwhelmingly likely
>that the write failed but was not detected as such and did not result in
>reallocation.

perform the write after the failed read, that way the drive knows it is
a bad block at the time you write.

>Now, maybe there is a technique to force it anyway.  When a partition is
>newly created and is being formatted with the intention of writing data a
>few minutes later, do writes that "should" have a better chance of being
>detected.  The way to start this is to simply write every block, but this is
>obviously insufficient because my block did get written shortly after the
>partition was formatted and that write didn't cause the block to be
>reallocated.  So in addition to simply writing every block, also read every
>block.  For each read that fails, proceed to do another write which "should"
>force reallocation.
>
>Mr. Reiser, when I created a partition of your design, that technique was
>not offered.  Why?  And will it soon start being offered?

I think I discussed with Vitaly offering users the option of writing,
reading, and then writing again, every block before mkreiserfs.  I
forget what happened to that idea, Vitaly?

>Also, I remain highly suspicious that for each read that fails, when the
>formatting program proceeds to do another write which "should" force
>reallocation, the drive might not do it.

I am not going to worry about such suspicions without evidence or drive
manufacturer comment, as it has not been our experience so far.

>Why does it matter?  The drive already reported a read failure.
>Maybe Linux
>programs aren't all smart enough to inform the user when a read operation
>results in an I/O error, but drivers could be smarter.

There is a general problem with reporting urgent kernel messages to
users thanks to GUIs covering over the console.

-- 
Hans
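
For readers following the thread, the write, read, write-again pass discussed above could be sketched roughly as follows. This is only an illustration, not code from mkreiserfs or any existing tool: the function name `write_read_rewrite`, its parameters, and the zero fill pattern are all assumptions made for the example. Whether the final write really makes a drive reallocate a bad sector is exactly the point disputed in this thread, and the sketch does not settle it.

```python
import os

def write_read_rewrite(path, block_size=4096, pattern=b"\0"):
    """Hypothetical helper: write every block, read every block back,
    then rewrite each block whose read failed.

    WARNING: overwrites everything in `path`. Returns the list of block
    numbers whose read raised an I/O error.
    """
    fill = pattern * block_size          # assumes a one-byte pattern
    fd = os.open(path, os.O_RDWR)
    try:
        size = os.lseek(fd, 0, os.SEEK_END)
        nblocks = size // block_size     # a trailing partial block is skipped

        # Pass 1: write every block.
        for n in range(nblocks):
            os.pwrite(fd, fill, n * block_size)
        os.fsync(fd)

        # Pass 2: read every block and remember the ones that fail.
        # Caveat: without O_DIRECT or a cache flush these reads may be
        # served from the page cache rather than from the medium.
        failed = []
        for n in range(nblocks):
            try:
                os.pread(fd, block_size, n * block_size)
            except OSError:
                failed.append(n)

        # Pass 3: write the failed blocks again, so that the drive has
        # already seen a read failure on them at the time of the write.
        for n in failed:
            os.pwrite(fd, fill, n * block_size)
        os.fsync(fd)
        return failed
    finally:
        os.close(fd)
```

Called on a scratch partition before running mkfs, the returned list would give the blocks that reported read errors; an empty list means no read failed.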