From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Max T. Woodbury"
Subject: ide-io.c, ide_do_request -- race condition?
Date: Mon, 05 Jul 2004 23:51:45 +0100
Sender: linux-ide-owner@vger.kernel.org
Message-ID: <40E99531.BDB8D339@verizon.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Return-path:
Received: from out009pub.verizon.net ([206.46.170.131]:23779 "EHLO
	out009.verizon.net") by vger.kernel.org with ESMTP id S261763AbUGEWvt
	(ORCPT ); Mon, 5 Jul 2004 18:51:49 -0400
Received: from M2Dual.localdomain ([4.15.37.40]) by out009.verizon.net
	(InterMail vM.5.01.06.06 201-253-122-130-106-20030910) with ESMTP
	id <20040705225148.RAGW29216.out009.verizon.net@M2Dual.localdomain>
	for ; Mon, 5 Jul 2004 17:51:48 -0500
Received: from verizon.net ([10.0.0.38]) by M2Dual.localdomain
	(8.12.8/8.12.8) with ESMTP id i65MplnA010587
	for ; Mon, 5 Jul 2004 18:51:48 -0400
List-Id: linux-ide@vger.kernel.org
To: linux-ide@vger.kernel.org

I have a question about a specific statement in the ide-io.c code.
However, I know that I need to establish a context for that question,
so please be patient with the fairly long discussion that follows.

I recently decided to install Linux (Fedora Core 1, to be specific) on
a venerable old laptop (a ThinkPad 760ED). The process proved
troublesome. Partway through the installation, the file system became
corrupted, throwing read errors on a block in the installed packages
(i.e. RPM) database. I did a bad block scan, zeroed the partition and
tried again. Twice. The list of bad blocks was NOT consistent from one
scan to the next. That eliminated the disk drive as the source of the
problem.

A fourth attempt produced a clue. I was watching the installation
using "top" and it took a bit longer for the corruption to occur.
Three attempts later and I had a clean installation. The trick was to
put a heavy computational load on the machine
("while [[ 0 == 0 ]]; do echo -n; done &" x 5) during the installation.
However, the installation took about 6 hours as a result, two and a
half to three times the normal time.

The problem did not end there. Updating from the installation base to
the current patch level also corrupted the file system. It took three
more tries, from scratch, to get a usable system.

Surprisingly, a kernel build did NOT produce any corruption. This
pretty much eliminated memory as a source of the problem. The build
process is much more memory-intensive than the installation process.
It would have blown up faster than the installations if there were a
memory problem. (The fact that the machine runs other OSs without
noticeable problems is also an indication that the underlying hardware
is in working order. Only the system software and disk drive changed
between the two setups, and I have explained why I do not think it is
the disk drive.)

I concluded that there was a race condition someplace in the disk
driver sequence and went a-hunting through the driver code, starting
from the IDE end.

In ide-io.c there is a block comment just before the place where the
i/o request setup routine is called. It notes that some older chipsets
do not like to be interrupted during the setup process. It also notes
that "massive fs corruption" will result if the setup process is
interrupted. (Hmmmm.) A search of the kernel mailing list archives
found one note on this piece of code, where commands were getting lost
(rarely) in an SMP environment. I thought that this would be a good
piece of code to go over with a fine-tooth comb.

The code was more or less what I expected. A pair of calls locked the
register set by turning the relevant interrupt off before the setup
and back on when it was done. They were the obvious places to do the
SMP synchronization, and inspection proved that that was in fact the
case, so premature interrupts should not be a problem if everything
else was kosher.

What I found next was very surprising given the comments. Local
interrupts were then turned back on!
I expected to see ALL interrupts turned off here because the setup had
to be atomic, but the code did the opposite. What is going on here?

This is 2.4.22 code, but it has not been changed for 2.4.26. There are
some significant changes in 2.6.7, but if anything it is worse.

A little more explanation is probably in order.

At the top of the routine all interrupts are turned off and the
hardware group busy bit is set. This bit is protected by a spinlock,
so there should be no need to lock out interrupts while manipulating
it. A few other conditions are checked next, none requiring interrupt
lockout. So a whole bunch of code has been executing under interrupt
lockout when there was no need for the lockout. Not a huge problem,
just strange. Also, in 2.6, the lockout has to begin before the
routine is called, which is why I said 2.6 was worse.

Then comes the block comment, the all-CPU lockout of completion
interrupts, and just when the comment suggests that all interrupts
should be turned off, they are turned back on.

I can understand the problem the comment might be addressing. The
interface could well be sensitive to the timing of the overall load
sequence, or a read from the status register checking for 'ready'
could bollix the whole setup process. This final setup phase really
should be an atomic operation.

I've done a little 'playing' with this code. First I tried just
removing the enable call. It seems to have had absolutely no effect.
The system did NOT hang even though the interrupt lockout never ended.
The file system corruption showed up again on applying a subsequent
update.

I've also made a slightly more venturesome change. I pulled the
disable from the head of the routine and put it just before the setup
call, and moved the enable to after the setup call. I've seen no
problems with this variant, but I might not see any with the hardware
I have.
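
The two orderings discussed above can be compared in a small userspace
model. This is only a sketch under stated assumptions: it is not the
ide-io.c code, every name in it (cpu_model, setup_view, model_setup,
original_order, variant_order) is hypothetical, and it assumes that in
the variant the local disable comes after the interface's own
interrupt has been locked out. It records which interrupt sources were
still open at the moment the setup step ran.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model of one CPU; nothing here is a kernel API. */
struct cpu_model {
    bool local_irqs_on;   /* are interrupts enabled on this CPU?     */
    bool ide_irq_masked;  /* is the interface's own IRQ locked out?  */
};

/* What the setup step observed while it ran. */
struct setup_view {
    bool local_irqs_on;
    bool ide_irq_masked;
};

/* Stand-in for the request setup: it only records the interrupt state. */
struct setup_view model_setup(const struct cpu_model *cpu)
{
    struct setup_view seen = { cpu->local_irqs_on, cpu->ide_irq_masked };
    return seen;
}

/* Ordering as the post describes it for 2.4.22. */
struct setup_view original_order(void)
{
    struct cpu_model cpu = { .local_irqs_on = true, .ide_irq_masked = false };
    struct setup_view seen;

    cpu.local_irqs_on = false;   /* top of routine: local interrupts off  */
    /* ... busy bit set and other checks run here ... */
    cpu.ide_irq_masked = true;   /* completion interrupt locked out       */
    cpu.local_irqs_on = true;    /* local interrupts turned back on       */
    seen = model_setup(&cpu);    /* setup runs with other IRQs allowed    */
    cpu.ide_irq_masked = false;  /* completion interrupt allowed again    */

    assert(!cpu.ide_irq_masked);
    return seen;
}

/* Ordering of the second experiment: disable just before the setup,
 * enable just after it (placement relative to the IRQ lockout is an
 * assumption of this sketch). */
struct setup_view variant_order(void)
{
    struct cpu_model cpu = { .local_irqs_on = true, .ide_irq_masked = false };
    struct setup_view seen;

    /* ... busy bit set and other checks run with interrupts still on ... */
    cpu.ide_irq_masked = true;   /* completion interrupt locked out       */
    cpu.local_irqs_on = false;   /* disable moved to just before setup    */
    seen = model_setup(&cpu);    /* setup runs with nothing able to cut in */
    cpu.local_irqs_on = true;    /* enable moved to just after setup      */
    cpu.ide_irq_masked = false;  /* completion interrupt allowed again    */

    assert(cpu.local_irqs_on && !cpu.ide_irq_masked);
    return seen;
}
```

Calling original_order() yields a view in which local interrupts are on
during the setup, while variant_order() yields one in which they are
off; in both, the interface's own interrupt is locked out. The model
says nothing about timing or about what real hardware tolerates.
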
A machine with more than one IDE interface on the same IRQ line might
show a problem, particularly if it was a multi-processor running SMP.
(I have an SMP machine, but not one with three IDE controllers.)

I haven't dug around the Linux kernel as much as I probably should
have. (That does not mean I am not familiar with kernels. I did a lot
of digging through PDP-11 and VAX operating system code, including
RSX-11 A, B, D and M, RSTS-11, RT-11, DOS-11 and VMS, plus some bare
metal [paper tape loader!] applications.) I can see that the setup
process for an IDE controller could be fairly lengthy and that
interrupt latency could become a problem, and that enabling local
interrupts would relieve the problem, but surely not at the cost of
corrupting system integrity?

Could someone else go over this with a fine-tooth comb? Just to make
sure I'm not missing something major (or minor)...

1) Modify the FC1 installation environment so that the correct lockout
   sequence is used. This will require that I build a new boot floppy.

2) Do another installation without using the system load trick. This
   should be fairly definitive, since I have not been able to do a
   complete system load without problems in half a dozen attempts.

3) Do the updates without the system load trick.

Finally, if this proves to be an acceptable cure to my problem:

A) Does there need to be a way to turn this fix off when it is not
   needed? (And how do you tell if it is needed? Boot command line
   option? Blacklist?)

B) Should it be included in the standard distribution?

C) What is the procedure for getting it included?

I've been going through the linux-ide archives and noticed that there
have been a number of mystery fs corruption issues that just
disappeared. This might be related. There was also a DMA problem that
might have been relevant, but I know it does not apply in this case
since "hdparm" shows DMA turned off by default on this machine.

Max T.E. Woodbury
max@mtew.isa-geek.net