From mboxrd@z Thu Jan 1 00:00:00 1970
From: Trey Ramsay
Subject: Re: [PATCH 1/1] mmc: Bad device can cause mmc driver to hang
Date: Fri, 16 Nov 2012 23:16:21 -0600
Message-ID: <50A71DA5.2070905@linux.vnet.ibm.com>
References: <87k3tpkz53.fsf@octavius.laptop.org> <1353079901-8773-1-git-send-email-tramsay@linux.vnet.ibm.com> <876255bluf.fsf@octavius.laptop.org> <50A6D1C7.302@linux.vnet.ibm.com> <87mwyh9ia0.fsf@octavius.laptop.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Return-path:
Received: from e33.co.us.ibm.com ([32.97.110.151]:35133 "EHLO e33.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751071Ab2KQFQ1 (ORCPT ); Sat, 17 Nov 2012 00:16:27 -0500
Received: from /spool/local by e33.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Fri, 16 Nov 2012 22:16:26 -0700
In-Reply-To: <87mwyh9ia0.fsf@octavius.laptop.org>
Sender: linux-mmc-owner@vger.kernel.org
List-Id: linux-mmc@vger.kernel.org
To: Chris Ball
Cc: linux-kernel@vger.kernel.org, linux-mmc@vger.kernel.org, Rich Rattanni, Radovan Lekanovic

On 11/16/2012 06:37 PM, Chris Ball wrote:
> Hi Trey, thanks for the analysis,
>
> On Fri, Nov 16 2012, Trey Ramsay wrote:
>> Good question. In regards to the original problem where it was hung in
>> mmc_blk_err_check, the new code path will time out after 10 minutes, log
>> an error, issue a hardware reset and abort the request. Is the hardware
>> reset enough or will that even work when the device isn't coming out of
>> program state? Should we try to refuse all new I/O?
>
> mmc_hw_reset() only works for eMMC devices with a hooked up reset GPIO
> -- not SD cards -- and at the moment there's only one system (Intel
> Medfield) that supplies a GPIO, so that's not a general solution.
>
> Maybe we should just merge your patch for now; we'll definitely get at
> least a pr_err() explaining what's going on, which is an improvement.
> Next time someone hits this (if anyone has an SD card that exhibits
> this problem, it'd be very valuable for testing) we can look at going
> farther, such as immediately setting host->flags |= SDHCI_DEVICE_DEAD.
> What do you think?
>
> - Chris.
>

Hi Chris,
Sounds good. Thanks for the explanation. Setting host->flags |=
SDHCI_DEVICE_DEAD is a great idea. I'll check with my team to see if we
have any hardware that exhibits this problem. If we do, I can do some
testing on the code you suggested.

Thanks,
Trey