From: Russell King - ARM Linux <linux@arm.linux.org.uk>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>,
linux-mmc@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
Shawn Guo <shawnguo@kernel.org>,
Sascha Hauer <kernel@pengutronix.de>
Subject: [PATCH v3 00/25] MMC/SDHCI fixes
Date: Tue, 26 Jan 2016 13:38:41 +0000 [thread overview]
Message-ID: <20160126133840.GA32588@n2100.arm.linux.org.uk> (raw)
On Mon, Dec 21, 2015 at 11:39:56AM +0000, Russell King - ARM Linux wrote:
> This all started with a report from an iMX6 Hummingboard user that
> they were seeing regular "mmc0: Got data interrupt 0x00000002 even
> though no data operation was in progress." messages with 4.4-rc and
> additional patches to enable UHS support in DT for the platform.
>
> When I tested 4.4-rc on an iMX6 Hummingboard with UHS support enabled,
> I found several other issues. This patch series addresses some of
> the issues.
>
> However, while testing on Armada 388, an issue was found there, and
> a fix is also included in this patch set, however it does not depend
> on the rest of the patch series.
>
> We all know that the SDHCI driver is in poor state, and Ulf's desire
> to see it turned into a library (which actually comes from myself.)
> However, with the driver in such a poor state, it makes sense to fix
> a few of the issues first, especially if they result in the code
> shrinking - which they do.
>
> However, given what I've found below, I have to wonder how many of the
> quirks that we have in this driver are down to the poor code rather than
> real hardware problems: how many of them are due to poorly investigated
> bugs.
>
> This series starts off by shutting up some kernel messages that should
> not be verbose:
> - the voltage-ranges DT property is optional, so we should not print
> a message saying that it's missing from DT. That suggests it isn't
> optional. As no ill effect comes from not specifying it, I've
> dropped this message to debug level.
>
> - reworking the voltage-ranges code so that callers know whether the
> optional voltage-ranges property was specified.
>
> - "retrying because a re-tune was needed" - a re-tune is something that
> is part of the higher speed SD specs, and it's required to happen
> periodically. So it's expected and normal behaviour. There's no need
> to print this to the kernel log.
>
> - When tuning fails, rather than a bland "mmc0: tuning execution failed"
> message, print the error code so that we can see what the reason for
> the failure is. MMC fails on this point in many places.
>
> - Move the initialisation of the command ->error member. It looks to
> me as if this is not required; mmc_start_request() does this already,
> as it seems the rest of the MMC core does too. Maybe this should be
> removed altogether?
>
> - Testing error bits, then setting the error, then testing the error
> value, and completing the command is a very lengthy way to deal with
> errors here. Test the error bits, and then decide what error occurred
> before completing. (This code is actually buggy, see the next change).
>
> - If a command _response_ CRC error occurs, we can't be certain whether
> the card received the command successfully, and has started processing
> it. If it has, and it results in data being read from the card, the
> card will enter data transfer state, and start sending data to the
> host. Therefore, it is completely wrong to blindly terminate the
> command, especially when such termination fails to clean _anything_
> up - eg, it leaves DMA in place so the host can continue DMAing to the
> memory, and it leaves buffers DMA mapped - which eventually annoys the
> DMA API debugging. Fix this by assuming that a CRC error is a receive
> side error. If the card did not correctly receive the command, it
> will not initiate a data transfer, and we should time out after the
> card specified timeout.
>
> With the unmodified mainline code, even if we reset the state machines
> when finishing a request in error, the _card_ itself may well still be
> in data transfer mode when this happens. How can we be sure what state
> the data side is actually in? Could this be why (eg) sdhci-esdhc-imx.c
> has this:
>
> /* FIXME: delay a bit for card to be ready for next tuning due to errors */
> mdelay(1);
>
> And how many more such _hacks_ are working around similar problems?
>
> - The unaligned buffer handling is a total waste of resources in its
> current form. We DMA map, sync, and unmap it on every transfer,
> whether the buffer is used or not. For MMC/SD transfers, these will
> be coming from the VFS/MM layers, which operate through the page
> cache - hence the vast majority of buffers will be page aligned.
> Performance measurements on iMX6 show that this additional DMA
> maintanence is responsible for a 10% reduction in read transfer
> performance. Switch this to use a DMA coherent mapping instead, which
> needs no DMA maintanence.
>
> - sdhci_adma_table_post() is badly coded: we walk the scatterlist
> looking for any buffers which might not be aligned (which can be a
> rare event, see above) and then we only do something if we find an
> unaligned buffer _and_ it's a read transfer. So for a write transfer,
> this scatterlist walking is a total waste of CPU cycles. Testing the
> transfer direction is much cheaper than walking the scatterlist.
> Re-organise the code to do the cheap test first.
>
> - There are two paths to identical DMA unmapping code (the code around
> dma_unmap_sg()) in sdhci_finish_data() and its child function
> sdhci_adma_table_post(). Trivially simplify this duplication.
>
> - Move sdhci_pre_dma_transfer() so we avoid needing to declare it.
>
> - Do the same thing with sdhci_prepare_data() and sdhci_pre_dma_transfer()
> as with sdhci_finish_data() and sdhci_adma_table_post().
>
> - Fix the mess that is the data segment host_cookie handling. The
> existing code seems like a candidate for the obfuscated coding prize!
> Four patches address this, turning it into something much easier to
> understand, which then allows (finally)...
>
> - The DMA API leak which occurs when we have mapped the DMA buffers in
> sdhci_prepare_data() and an error occurs early on in the processing
> of a request.
>
> - A change (mentioned above) which fixes a prior commit for Armada 38x
> using sdhci-pxav3.c, found while trying to get the higher speed modes
> working on these SoCs. Clearly the existing code is broken, this at
> least resolves some of the breakage by allowing the correct high-order
> SDHCI capabilities through.
>
> - Fixing another DMA API leak with the pre/post request handling, where
> a current request can change the state of the SDHCI_REQ_USE_DMA, and
> thus stop a mapped request being unmapped.
>
> - Fixing the SDHCI timeout calculation code to correctly round timeouts
> up rather than down, and correctly calculate the time (in microseconds)
> for a certain number of card clocks.
>
> - Consolidating the SDHCI address/size alignment handling so we only walk
> the scatterlist once instead of twice.
>
> Further patches are necessary as there are still a number of observable
> problems when using this driver on iMX6 with UHS cards. Patches up to
> and including "mmc: sdhci: further fix for DMA unmapping in
> sdhci_post_req()" should be considered for inclusion in the stable
> kernel trees.
>
> I've tested these changes on iMX6 at UHS speeds, and Armada 388 SoCs
> at HS speeds so far. Others should test them.
Repost against 4.5-rc1, with one additional patch.
drivers/mmc/card/block.c | 4 +-
drivers/mmc/core/core.c | 17 +-
drivers/mmc/host/sdhci-pxav3.c | 6 +-
drivers/mmc/host/sdhci.c | 443 ++++++++++++++++++-----------------------
drivers/mmc/host/sdhci.h | 4 +-
5 files changed, 212 insertions(+), 262 deletions(-)
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
next reply other threads:[~2016-01-26 13:38 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-01-26 13:38 Russell King - ARM Linux [this message]
2016-01-26 13:39 ` [PATCH v3 01/25] mmc: core: shut up "voltage-ranges unspecified" pr_info() Russell King
2016-01-26 13:39 ` [PATCH v3 02/25] mmc: core: improve mmc_of_parse_voltage() to return better status Russell King
2016-01-26 13:39 ` [PATCH v3 03/25] mmc: block: shut up "retrying because a re-tune was needed" message Russell King
2016-01-26 13:39 ` [PATCH v3 04/25] mmc: core: report tuning command execution failure reason Russell King
2016-01-26 13:39 ` [PATCH v3 05/25] mmc: sdhci: move initialisation of command error member Russell King
2016-01-26 13:39 ` [PATCH v3 06/25] mmc: sdhci: clean up command error handling Russell King
2016-01-26 13:39 ` [PATCH v3 07/25] mmc: sdhci: command response CRC " Russell King
2016-01-26 14:10 ` kbuild test robot
2016-01-26 14:11 ` kbuild test robot
2016-01-26 13:39 ` [PATCH v3 08/25] mmc: sdhci: avoid unnecessary mapping/unmapping of align buffer Russell King
2016-01-26 13:39 ` [PATCH v3 09/25] mmc: sdhci: allocate alignment and DMA descriptor buffer together Russell King
2016-01-27 9:14 ` Adrian Hunter
2016-01-27 19:50 ` Russell King - ARM Linux
2016-01-26 13:40 ` [PATCH v3 10/25] mmc: sdhci: clean up coding style in sdhci_adma_table_pre() Russell King
2016-01-26 13:40 ` [PATCH v3 11/25] mmc: sdhci: avoid walking SG list for writes Russell King
2016-01-26 13:40 ` [PATCH v3 12/25] mmc: sdhci: factor out common DMA cleanup in sdhci_finish_data() Russell King
2016-01-26 13:40 ` [PATCH v3 13/25] mmc: sdhci: move sdhci_pre_dma_transfer() Russell King
2016-01-26 13:40 ` [PATCH v3 14/25] mmc: sdhci: factor out sdhci_pre_dma_transfer() from sdhci_adma_table_pre() Russell King
2016-01-26 13:40 ` [PATCH v3 15/25] mmc: sdhci: pass the cookie into sdhci_pre_dma_transfer() Russell King
2016-01-26 13:40 ` [PATCH v3 16/25] mmc: sdhci: always unmap a mapped data transfer in sdhci_post_req() Russell King
2016-01-26 13:40 ` [PATCH v3 17/25] mmc: sdhci: clean up host cookie handling Russell King
2016-01-26 13:40 ` [PATCH v3 18/25] mmc: sdhci: plug DMA mapping leak on error Russell King
2016-01-26 13:40 ` [PATCH v3 19/25] mmc: sdhci-pxav3: fix higher speed mode capabilities Russell King
2016-01-26 13:40 ` [PATCH v3 20/25] mmc: sdhci: further fix for DMA unmapping in sdhci_post_req() Russell King
2016-01-26 13:40 ` [PATCH v3 21/25] mmc: sdhci: fix data timeout (part 1) Russell King
2016-01-26 13:41 ` [PATCH v3 22/25] mmc: sdhci: fix data timeout (part 2) Russell King
2016-01-26 13:41 ` [PATCH v3 23/25] mmc: sdhci: prepare DMA address/size quirk handling consolidation Russell King
2016-01-26 13:41 ` [PATCH v3 24/25] mmc: sdhci: consolidate the DMA/ADMA size/address quicks Russell King
2016-01-26 13:41 ` [PATCH v3 25/25] mmc: sdhci: further code simplication Russell King
2016-01-26 15:33 ` [PATCH v3 00/25] MMC/SDHCI fixes Gregory CLEMENT
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160126133840.GA32588@n2100.arm.linux.org.uk \
--to=linux@arm.linux.org.uk \
--cc=gregory.clement@free-electrons.com \
--cc=kernel@pengutronix.de \
--cc=linux-mmc@vger.kernel.org \
--cc=mw@semihalf.com \
--cc=shawnguo@kernel.org \
--cc=ulf.hansson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).