From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <55D0A5B3.202@unibo.it>
Date: Sun, 16 Aug 2015 17:01:07 +0200
From: Sergio Callegari
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.8.0
MIME-Version: 1.0
Subject: Re: Regression: commit 045065d breaks kernel on machine with atapi floppy: high IOWAIT, hung processes (bisected)
References: <55D09BD8.4090601@unibo.it>
In-Reply-To: <55D09BD8.4090601@unibo.it>
Content-Type: text/plain; charset="utf-8"; format=flowed
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

It seems that the issue also affects other systems with different
configurations:

https://bbs.archlinux.org/viewtopic.php?id=189324

It is possibly the same bug as the one reported in

https://bugzilla.kernel.org/show_bug.cgi?id=87581

A tentative patch was submitted on LKML:

https://lkml.org/lkml/2014/11/20/581

I have not tested it yet.

Another possible solution being reported is increasing the delay time in

blk_delay_queue(q, SCSI_QUEUE_DELAY)

I have not tested that either.

The thread at 189324 suggests that the bug is triggered by mixing a
slower device with a faster one on the same IDE/SATA channel.

Can someone indicate:

- Whether either of the two patches has already been accepted in recent
  kernels or is pending acceptance?
- Which of the two approaches (extending the delay time or modifying the
  spin locks in scsi_lib.c) is more appropriate for me to test?

Best,

Sergio

On 16/08/2015 16:19, Sergio Callegari wrote:
>
> Hi,
>
> please keep me in CC in answers.
>
> I'd like to report that after commit
>
> [045065d8a300a37218c548e9aa7becd581c6a0e8] [SCSI] fix qemu boot hang
> problem
>
> the kernel is not usable on a machine with an IOMEGA Zip 100 ATAPI drive
> as in:
>
> Model=IOMEGA ZIP 100 ATAPI Floppy, FwRev=12.A, SerialNo=
> Config={ SpinMotCtl Removeable nonMagnetic }
> RawCHS=0/0/0, TrkSize=0, SectSize=0, ECCbytes=0
> BuffType=unknown, BuffSize=unknown, MaxMultSect=0
> (maybe): CurCHS=0/0/0, CurSects=0, LBA=yes, LBAsects=0
> IORDY=on/off, tPIO={min:500,w/IORDY:180}
> PIO modes: pio0 pio1 pio2 pio3
> AdvancedPM=no
>
> Symptoms include:
>
> - Extremely high IOWAIT in the absence of load
> - Kernel reporting hung processes
> - Commands like blkid hanging
> - Inability of the machine to shut down
>
> Symptoms do not appear immediately, but after some time (anywhere
> between a few minutes and /many hours/ after boot). The first symptom
> is IOWAIT suddenly jumping high.
>
> Because the symptoms take so long to manifest, bisecting has been quite
> painful, but I am now rather sure that the first bad commit is the one
> above.
>
> Other pieces of hardware configuration include:
>
> - ASRock N68S motherboard with AMD Phenom(tm) II X4 920 Processor and
>   NVIDIA MCP61 SATA/IDE Chipset
> - IDE drive connected as slave on the IDE interface whose master is an
>   HL-DT-ST DVD-RAM GH22NP20 CDROM/DVD writer
>
> The issue is weird because the commit seems merely to fix a trivial
> error in a logic condition:
>
> - if (atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
> + if (!atomic_read(&sdev->device_busy) && !scsi_device_blocked(sdev))
>   blk_delay_queue(q, SCSI_QUEUE_DELAY);
>
> Hence, the commit may just end up making visible some other issue.
>
> Best,
>
> Sergio