From mboxrd@z Thu Jan 1 00:00:00 1970
From: Albert Lee
Subject: Ejecting removable disk causes a flood of ALLOW_MEDIUM_REMOVAL
Date: Thu, 13 Jul 2006 12:07:51 +0800
Message-ID: <44B5C717.40300@tw.ibm.com>
Reply-To: albertl@mail.com
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
Received: from e35.co.us.ibm.com ([32.97.110.153]:43934 "EHLO e35.co.us.ibm.com") by vger.kernel.org with ESMTP id S932192AbWGMEIE (ORCPT ); Thu, 13 Jul 2006 00:08:04 -0400
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Douglas Gilbert
Cc: Jeff Tranter, linux-scsi@vger.kernel.org, Linux IDE, Jeff Garzik, Tejun Heo, Unicorn Chang, Doug Maxey, Gary Hade

Hi Doug,

Here is a problem related to SG_IO. When testing the 2.6.18-rc1 libata,
a flood of ALLOW_MEDIUM_REMOVAL is seen after ejecting the disk from a
removable disk drive:
http://bugzilla.kernel.org/show_bug.cgi?id=6799

The interaction that triggers the problem is as follows:

1. The "eject" utility, using SG_IO, issues ALLOW_MEDIUM_REMOVAL twice
   to unlock the disk of the device.
2. The disk of the physical device is now unlocked. However,
   sdev->locked is still 1, which is inconsistent with the physical
   device status.
3. The "eject" utility ejects the disk with START_STOP_UNIT.
4. Someone issues TEST_UNIT_READY. The device is not ready since there
   is no disk inside, so an ATAPI check condition is returned.
5. 2.6.18-rc1 libata EH is triggered in SCSI EH context to request
   sense.
6. scsi_restart_operations(), seeing sdev->locked == 1, tries to lock
   the door.
7. The lock fails since there is no disk inside, so an ATAPI check
   condition is returned.
8. 2.6.18-rc1 libata EH is triggered in SCSI EH context again. Go to
   step 5 and loop forever until a disk is inserted again.

It looks more like an eject/scsi problem. (Libata 2.6.17 EH is not
affected by the problem because the request sense is done when the qc
is completed, not in the SCSI EH context.)

Maybe adding a filter to the SCSI SG_IO layer could fix the problem?
The filter would:

- Check incoming SCSI commands.
- Whenever ALLOW_MEDIUM_REMOVAL is seen, update sdev->locked to keep
  the kernel data structure consistent with the physical device status.

--
albert
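
To make the proposed filter more concrete, here is a rough user-space
sketch of the idea, not actual kernel code. The struct fake_sdev type
and the sg_io_track_door_lock() helper are hypothetical names made up
for illustration; only the "locked" field mirrors sdev->locked. The
sketch assumes the filter runs after the command has completed, and
that looking only at bit 0 of CDB byte 4 (the Prevent bit) is enough to
decide the new lock state.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* SCSI opcode for PREVENT ALLOW MEDIUM REMOVAL. */
#define ALLOW_MEDIUM_REMOVAL 0x1e

/* Hypothetical stand-in for the single scsi_device field used here. */
struct fake_sdev {
    unsigned int locked;    /* 1 = kernel believes the door is locked */
};

/*
 * Hypothetical filter for commands passed through SG_IO.
 * If the command is a successful ALLOW_MEDIUM_REMOVAL, copy the
 * requested lock state into sdev->locked so the bookkeeping matches
 * the device. Returns true if sdev->locked was updated.
 */
bool sg_io_track_door_lock(struct fake_sdev *sdev, const uint8_t *cdb,
                           size_t cdb_len, bool cmd_succeeded)
{
    /* ALLOW_MEDIUM_REMOVAL is a 6-byte CDB; ignore anything shorter. */
    if (sdev == NULL || cdb == NULL || cdb_len < 6)
        return false;

    /* Only this one opcode changes the door lock state. */
    if (cdb[0] != ALLOW_MEDIUM_REMOVAL)
        return false;

    /* A failed command leaves the physical lock state unchanged. */
    if (!cmd_succeeded)
        return false;

    /* Assumption: bit 0 of byte 4 is 1 for "prevent", 0 for "allow". */
    sdev->locked = cdb[4] & 0x01;
    return true;
}
```

With something like this in place, the unlock issued by "eject" in
step 1 would clear sdev->locked, so scsi_restart_operations() would no
longer try to lock the door in step 6.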