public inbox for linux-mmc@vger.kernel.org
 help / color / mirror / Atom feed
From: Gregory CLEMENT <gregory.clement@free-electrons.com>
To: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ulf Hansson <ulf.hansson@linaro.org>,
	Russell King - ARM Linux <linux@arm.linux.org.uk>,
	linux-mmc <linux-mmc@vger.kernel.org>,
	Marcin Wojtas <mw@semihalf.com>, Shawn Guo <shawnguo@kernel.org>,
	Sascha Hauer <kernel@pengutronix.de>
Subject: Re: [PATCH v4 00/25] MMC/SDHCI fixes
Date: Fri, 12 Feb 2016 15:23:05 +0100	[thread overview]
Message-ID: <87ziv6vwhi.fsf@free-electrons.com> (raw)
In-Reply-To: <56BDAAB0.6010209@intel.com> (Adrian Hunter's message of "Fri, 12 Feb 2016 11:49:36 +0200")

Hi Adrian,
 
 On ven., févr. 12 2016, Adrian Hunter <adrian.hunter@intel.com> wrote:

> On 04/02/16 12:55, Ulf Hansson wrote:
>> + Adrian
>> 
>> On 29 January 2016 at 10:43, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Mon, Dec 21, 2015 at 11:39:56AM +0000, Russell King - ARM Linux wrote:
>>>> This all started with a report from an iMX6 Hummingboard user that
>>>> they were seeing regular "mmc0: Got data interrupt 0x00000002 even
>>>> though no data operation was in progress." messages with 4.4-rc and
>>>> additional patches to enable UHS support in DT for the platform.
>>>>
>>>> When I tested 4.4-rc on an iMX6 Hummingboard with UHS support enabled,
>>>> I found several other issues.  This patch series addresses some of
>>>> the issues.
>>>>
>>>> However, while testing on Armada 388, an issue was found there, and
>>>> a fix is also included in this patch set, however it does not depend
>>>> on the rest of the patch series.
>>>>
>>>> We all know that the SDHCI driver is in poor state, and Ulf's desire
>>>> to see it turned into a library (which actually comes from myself.)
>>>> However, with the driver in such a poor state, it makes sense to fix
>>>> a few of the issues first, especially if they result in the code
>>>> shrinking - which they do.
>>>>
>>>> However, given what I've found below, I have to wonder how many of the
>>>> quirks that we have in this driver are down to the poor code rather than
>>>> real hardware problems: how many of them are due to poorly investigated
>>>> bugs.
>>>>
>>>> This series starts off by shutting up some kernel messages that should
>>>> not be verbose:
>>>> - the voltage-ranges DT property is optional, so we should not print
>>>>   a message saying that it's missing from DT.  That suggests it isn't
>>>>   optional.  As no ill effect comes from not specifying it, I've
>>>>   dropped this message to debug level.
>>>>
>>>> - reworking the voltage-ranges code so that callers know whether the
>>>>   optional voltage-ranges property was specified.
>>>>
>>>> - "retrying because a re-tune was needed" - a re-tune is something that
>>>>   is part of the higher speed SD specs, and it's required to happen
>>>>   periodically.  So it's expected and normal behaviour.  There's no need
>>>>   to print this to the kernel log.
>>>>
>>>> - When tuning fails, rather than a bland "mmc0: tuning execution failed"
>>>>   message, print the error code so that we can see what the reason for
>>>>   the failure is.  MMC fails on this point in many places.
>>>>
>>>> - Move the initialisation of the command ->error member.  It looks to
>>>>   me as if this is not required; mmc_start_request() does this already,
>>>>   as it seems the rest of the MMC core does too.  Maybe this should be
>>>>   removed altogether?
>>>>
>>>> - Testing error bits, then setting the error, then testing the error
>>>>   value, and completing the command is a very lengthy way to deal with
>>>>   errors here.  Test the error bits, and then decide what error occurred
>>>>   before completing.  (This code is actually buggy, see the next change).
>>>>
>>>> - If a command _response_ CRC error occurs, we can't be certain whether
>>>>   the card received the command successfully, and has started processing
>>>>   it.  If it has, and it results in data being read from the card, the
>>>>   card will enter data transfer state, and start sending data to the
>>>>   host.  Therefore, it is completely wrong to blindly terminate the
>>>>   command, especially when such termination fails to clean _anything_
>>>>   up - eg, it leaves DMA in place so the host can continue DMAing to the
>>>>   memory, and it leaves buffers DMA mapped - which eventually annoys the
>>>>   DMA API debugging.  Fix this by assuming that a CRC error is a receive
>>>>   side error.  If the card did not correctly receive the command, it
>>>>   will not initiate a data transfer, and we should time out after the
>>>>   card specified timeout.
>>>>
>>>>   With the unmodified mainline code, even if we reset the state machines
>>>>   when finishing a request in error, the _card_ itself may well still be
>>>>   in data transfer mode when this happens.  How can we be sure what state
>>>>   the data side is actually in?  Could this be why (eg) sdhci-esdhc-imx.c
>>>>   has this:
>>>>
>>>>         /* FIXME: delay a bit for card to be ready for next tuning due to errors */
>>>>         mdelay(1);
>>>>
>>>>   And how many more such _hacks_ are working around similar problems?
>>>>
>>>> - The unaligned buffer handling is a total waste of resources in its
>>>>   current form.  We DMA map, sync, and unmap it on every transfer,
>>>>   whether the buffer is used or not.  For MMC/SD transfers, these will
>>>>   be coming from the VFS/MM layers, which operate through the page
>>>>   cache - hence the vast majority of buffers will be page aligned.
>>>>   Performance measurements on iMX6 show that this additional DMA
>>>>   maintanence is responsible for a 10% reduction in read transfer
>>>>   performance.  Switch this to use a DMA coherent mapping instead, which
>>>>   needs no DMA maintanence.
>>>>
>>>> - sdhci_adma_table_post() is badly coded: we walk the scatterlist
>>>>   looking for any buffers which might not be aligned (which can be a
>>>>   rare event, see above) and then we only do something if we find an
>>>>   unaligned buffer _and_ it's a read transfer.  So for a write transfer,
>>>>   this scatterlist walking is a total waste of CPU cycles.  Testing the
>>>>   transfer direction is much cheaper than walking the scatterlist.
>>>>   Re-organise the code to do the cheap test first.
>>>>
>>>> - There are two paths to identical DMA unmapping code (the code around
>>>>   dma_unmap_sg()) in sdhci_finish_data() and its child function
>>>>   sdhci_adma_table_post().  Trivially simplify this duplication.
>>>>
>>>> - Move sdhci_pre_dma_transfer() so we avoid needing to declare it.
>>>>
>>>> - Do the same thing with sdhci_prepare_data() and sdhci_pre_dma_transfer()
>>>>   as with sdhci_finish_data() and sdhci_adma_table_post().
>>>>
>>>> - Fix the mess that is the data segment host_cookie handling.  The
>>>>   existing code seems like a candidate for the obfuscated coding prize!
>>>>   Four patches address this, turning it into something much easier to
>>>>   understand, which then allows (finally)...
>>>>
>>>> - The DMA API leak which occurs when we have mapped the DMA buffers in
>>>>   sdhci_prepare_data() and an error occurs early on in the processing
>>>>   of a request.
>>>>
>>>> - A change (mentioned above) which fixes a prior commit for Armada 38x
>>>>   using sdhci-pxav3.c, found while trying to get the higher speed modes
>>>>   working on these SoCs.  Clearly the existing code is broken, this at
>>>>   least resolves some of the breakage by allowing the correct high-order
>>>>   SDHCI capabilities through.
>>>>
>>>> - Fixing another DMA API leak with the pre/post request handling, where
>>>>   a current request can change the state of the SDHCI_REQ_USE_DMA, and
>>>>   thus stop a mapped request being unmapped.
>>>>
>>>> - Fixing the SDHCI timeout calculation code to correctly round timeouts
>>>>   up rather than down, and correctly calculate the time (in microseconds)
>>>>   for a certain number of card clocks.
>>>>
>>>> - Consolidating the SDHCI address/size alignment handling so we only walk
>>>>   the scatterlist once instead of twice.
>>>>
>>>> Further patches are necessary as there are still a number of observable
>>>> problems when using this driver on iMX6 with UHS cards.  Patches up to
>>>> and including "mmc: sdhci: further fix for DMA unmapping in
>>>> sdhci_post_req()" should be considered for inclusion in the stable
>>>> kernel trees.
>>>>
>>>> I've tested these changes on iMX6 at UHS speeds, and Armada 388 SoCs
>>>> at HS speeds so far.  Others should test them.
>>>
>>> One comment from Adrian addressed.
>>>
>>>  drivers/mmc/card/block.c       |   4 +-
>>>  drivers/mmc/core/core.c        |  17 +-
>>>  drivers/mmc/host/sdhci-pxav3.c |   6 +-
>>>  drivers/mmc/host/sdhci.c       | 444 ++++++++++++++++++-----------------------
>>>  drivers/mmc/host/sdhci.h       |   4 +-
>>>  5 files changed, 213 insertions(+), 262 deletions(-)
>>>
>>> --
>>> RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
>>> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
>>> according to speedtest.net.
>> 
>> I have applied patch 1 to 4, for next!
>> 
>> The rest are sdhci patches, which from now on needs acks from Adrian
>> for me to pick them up.
>
> I have taken the liberty of re-organizing Russell's patches to put the
> bug fixes first with stable cc's.  I also did 2 tweaks which result in
> final code having the diff below to the original:
>
>
> diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
> index f857cb2a3d05..03fbb36580f7 100644
> --- a/drivers/mmc/host/sdhci.c
> +++ b/drivers/mmc/host/sdhci.c
> @@ -2270,7 +2270,7 @@ static void sdhci_cmd_irq(struct sdhci_host *host, u32 intmask, u32 *mask)
>  		 * will time out.
>  		 */
>  		if (host->cmd->data &&
> -		    intmask & (SDHCI_INT_CRC | SDHCI_INT_TIMEOUT) ==
> +		    (intmask & (SDHCI_INT_CRC | SDHCI_INT_TIMEOUT)) ==
>  		     SDHCI_INT_CRC) {
>  			host->cmd = NULL;
>  			return;
> @@ -2920,7 +2920,8 @@ int sdhci_add_host(struct sdhci_host *host)
>  			pr_warn("%s: Unable to allocate ADMA buffers - falling back to standard DMA\n",
>  				mmc_hostname(mmc));
>  			host->flags &= ~SDHCI_USE_ADMA;
> -		} else if (dma & SDHCI_ADMA2_MASK) {
> +		} else if ((dma + host->align_buffer_sz) &
> +			   (SDHCI_ADMA2_DESC_ALIGN - 1)) {
>  			pr_warn("%s: unable to allocate aligned ADMA descriptor\n",
>  				mmc_hostname(mmc));
>  			host->flags &= ~SDHCI_USE_ADMA;
>
>
> Where I have made changes, they are indicated in the patch commit.
>
> I did some basic testing of eMMC, SD card and SDIO, for the fixes alone
> and also the entire patch set.
>
> Russell, please let me know if this is OK, and if so, Ulf please consider
> pulling:
>
>
> The following changes since commit 7d9e20d2be0d836c4eb7afca901eced34274c8dd:
>
>   Merge branch 'fixes' into next (2016-02-08 16:22:56 +0100)
>
> are available in the git repository at:
>
>
>   git://git.infradead.org/users/ahunter/linux-sdhci.git master

Thanks to your branch I was able to test the series on the Armada 38x
boards.

At the beginning my main motivation was to ensure that now thanks to the
patch "mmc: sdhci-pxav3: fix higher speed mode capabilities", we really
manage to support SDR50 and DDR50 modes. However I realized that I don't
have any 1.8V support on the board I have.

So I can't confirm this, but at least I didn't see any regression while
using the SD cards.

Gregory



>
> for you to fetch changes up to eb6bf3c3513d2e2ab77d86bc6c2f2755073f0745:
>
>   mmc: sdhci: further code simplication (2016-02-11 16:24:46 +0200)
>
> ----------------------------------------------------------------
> Russell King (22):
>       mmc: sdhci: move initialisation of command error member
>       mmc: sdhci: clean up command error handling
>       mmc: sdhci: fix command response CRC error handling
>       mmc: sdhci: avoid unnecessary mapping/unmapping of align buffer
>       mmc: sdhci: plug DMA mapping leak on error
>       mmc: sdhci-pxav3: fix higher speed mode capabilities
>       mmc: sdhci: further fix for DMA unmapping in sdhci_post_req()
>       mmc: sdhci: fix data timeout (part 1)
>       mmc: sdhci: fix data timeout (part 2)
>       mmc: sdhci: allocate alignment and DMA descriptor buffer together
>       mmc: sdhci: clean up coding style in sdhci_adma_table_pre()
>       mmc: sdhci: avoid walking SG list for writes
>       mmc: sdhci: factor out common DMA cleanup in sdhci_finish_data()
>       mmc: sdhci: move sdhci_pre_dma_transfer()
>       mmc: sdhci: factor out sdhci_pre_dma_transfer() from sdhci_adma_table_pre()
>       mmc: sdhci: pass the cookie into sdhci_pre_dma_transfer()
>       mmc: sdhci: always unmap a mapped data transfer in sdhci_post_req()
>       mmc: sdhci: clean up host cookie handling
>       mmc: sdhci: cleanup DMA un-mapping
>       mmc: sdhci: prepare DMA address/size quirk handling consolidation
>       mmc: sdhci: consolidate the DMA/ADMA size/address quicks
>       mmc: sdhci: further code simplication
>
>  drivers/mmc/host/sdhci-pxav3.c |   6 ++-
>  drivers/mmc/host/sdhci.c       | 444 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-------------------------------------------------------------------------------------------
>  drivers/mmc/host/sdhci.h       |   4 +-
>  3 files changed, 200 insertions(+), 254 deletions(-)
>
>
> Regards
> Adrian

-- 
Gregory Clement, Free Electrons
Kernel, drivers, real-time and embedded Linux
development, consulting, training and support.
http://free-electrons.com

  reply	other threads:[~2016-02-12 14:23 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-29  9:43 [PATCH v4 00/25] MMC/SDHCI fixes Russell King - ARM Linux
2016-01-29  9:43 ` [PATCH v4 01/25] mmc: core: shut up "voltage-ranges unspecified" pr_info() Russell King
2016-01-29  9:43 ` [PATCH v4 02/25] mmc: core: improve mmc_of_parse_voltage() to return better status Russell King
2016-01-29  9:44 ` [PATCH v4 03/25] mmc: block: shut up "retrying because a re-tune was needed" message Russell King
2016-01-29  9:44 ` [PATCH v4 04/25] mmc: core: report tuning command execution failure reason Russell King
2016-01-29  9:44 ` [PATCH v4 05/25] mmc: sdhci: move initialisation of command error member Russell King
2016-01-29  9:44 ` [PATCH v4 06/25] mmc: sdhci: clean up command error handling Russell King
2016-01-29  9:44 ` [PATCH v4 07/25] mmc: sdhci: command response CRC " Russell King
2016-01-29  9:44 ` [PATCH v4 08/25] mmc: sdhci: avoid unnecessary mapping/unmapping of align buffer Russell King
2016-01-29  9:44 ` [PATCH v4 09/25] mmc: sdhci: allocate alignment and DMA descriptor buffer together Russell King
2016-01-29  9:44 ` [PATCH v4 10/25] mmc: sdhci: clean up coding style in sdhci_adma_table_pre() Russell King
2016-01-29  9:44 ` [PATCH v4 11/25] mmc: sdhci: avoid walking SG list for writes Russell King
2016-01-29  9:44 ` [PATCH v4 12/25] mmc: sdhci: factor out common DMA cleanup in sdhci_finish_data() Russell King
2016-01-29  9:44 ` [PATCH v4 13/25] mmc: sdhci: move sdhci_pre_dma_transfer() Russell King
2016-01-29  9:44 ` [PATCH v4 14/25] mmc: sdhci: factor out sdhci_pre_dma_transfer() from sdhci_adma_table_pre() Russell King
2016-01-29  9:45 ` [PATCH v4 15/25] mmc: sdhci: pass the cookie into sdhci_pre_dma_transfer() Russell King
2016-01-29  9:45 ` [PATCH v4 16/25] mmc: sdhci: always unmap a mapped data transfer in sdhci_post_req() Russell King
2016-01-29  9:45 ` [PATCH v4 17/25] mmc: sdhci: clean up host cookie handling Russell King
2016-01-29  9:45 ` [PATCH v4 18/25] mmc: sdhci: plug DMA mapping leak on error Russell King
2016-01-29  9:45 ` [PATCH v4 19/25] mmc: sdhci-pxav3: fix higher speed mode capabilities Russell King
2016-01-29  9:45 ` [PATCH v4 20/25] mmc: sdhci: further fix for DMA unmapping in sdhci_post_req() Russell King
2016-01-29  9:45 ` [PATCH v4 21/25] mmc: sdhci: fix data timeout (part 1) Russell King
2016-01-29  9:45 ` [PATCH v4 22/25] mmc: sdhci: fix data timeout (part 2) Russell King
2016-01-29  9:45 ` [PATCH v4 23/25] mmc: sdhci: prepare DMA address/size quirk handling consolidation Russell King
2016-01-29  9:45 ` [PATCH v4 24/25] mmc: sdhci: consolidate the DMA/ADMA size/address quicks Russell King
2016-01-29  9:45 ` [PATCH v4 25/25] mmc: sdhci: further code simplication Russell King
2016-02-04 10:55 ` [PATCH v4 00/25] MMC/SDHCI fixes Ulf Hansson
2016-02-12  9:49   ` Adrian Hunter
2016-02-12 14:23     ` Gregory CLEMENT [this message]
2016-02-16 13:08     ` Ulf Hansson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87ziv6vwhi.fsf@free-electrons.com \
    --to=gregory.clement@free-electrons.com \
    --cc=adrian.hunter@intel.com \
    --cc=kernel@pengutronix.de \
    --cc=linux-mmc@vger.kernel.org \
    --cc=linux@arm.linux.org.uk \
    --cc=mw@semihalf.com \
    --cc=shawnguo@kernel.org \
    --cc=ulf.hansson@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox