From: Russell King - ARM Linux <linux@arm.linux.org.uk>
To: Ulf Hansson <ulf.hansson@linaro.org>
Cc: Gregory CLEMENT <gregory.clement@free-electrons.com>,
linux-mmc@vger.kernel.org, Marcin Wojtas <mw@semihalf.com>,
Shawn Guo <shawnguo@kernel.org>,
Sascha Hauer <kernel@pengutronix.de>
Subject: [PATCH 00/17] MMC/SDHCI fixes
Date: Sat, 19 Dec 2015 20:28:51 +0000 [thread overview]
Message-ID: <20151219202851.GS8644@n2100.arm.linux.org.uk> (raw)
This all started with a report from an iMX6 Hummingboard user that
they were seeing regular "mmc0: Got data interrupt 0x00000002 even
though no data operation was in progress." messages with 4.4-rc and
additional patches to enable UHS support in DT for the platform.
When I tested 4.4-rc on an iMX6 Hummingboard with UHS support enabled,
I found several other issues. This patch series addresses some of
the issues.
However, while testing on Armada 388, an issue was found there, and
a fix is also included in this patch set, however it does not depend
on the rest of the patch series.
We all know that the SDHCI driver is in poor state, and Ulf's desire
to see it turned into a library (which actually comes from myself.)
However, with the driver in such a poor state, it makes sense to fix
a few of the issues first, especially if they result in the code
shrinking - which they do.
However, given what I've found below, I have to wonder how many of the
quirks that we have in this driver are down to the poor code rather than
real hardware problems: how many of them are due to poorly investigated
bugs.
This series starts off by shutting up some kernel messages that should
not be verbose:
- the voltage-ranges DT property is optional, so we should not print
a message saying that it's missing from DT. That suggests it isn't
optional. As no ill effect comes from not specifying it, I've
dropped this message to debug level.
- "retrying because a re-tune was needed" - a re-tune is something that
is part of the higher speed SD specs, and it's required to happen
periodically. So it's expected and normal behaviour. There's no need
to print this to the kernel log.
- When tuning fails, rather than a bland "mmc0: tuning execution failed"
message, print the error code so that we can see what the reason for
the failure is. MMC fails on this point in many places.
- Move the initialisation of the command ->error member. It looks to
me as if this is not required; mmc_start_request() does this already,
as it seems the rest of the MMC core does too. Maybe this should be
removed altogether?
- Testing error bits, then setting the error, then testing the error
value, and completing the command is a very lengthy way to deal with
errors here. Test the error bits, and then decide what error occurred
before completing. (This code is actually buggy, see the next change).
- If a command _response_ CRC error occurs, we can't be certain whether
the card received the command successfully, and has started processing
it. If it has, and it results in data being read from the card, the
card will enter data transfer state, and start sending data to the
host. Therefore, it is completely wrong to blindly terminate the
command, especially when such termination fails to clean _anything_
up - eg, it leaves DMA in place so the host can continue DMAing to the
memory, and it leaves buffers DMA mapped - which eventually annoys the
DMA API debugging. Fix this by assuming that a CRC error is a receive
side error. If the card did not correctly receive the command, it
will not initiate a data transfer, and we should time out after the
card specified timeout.
With the unmodified mainline code, even if we reset the state machines
when finishing a request in error, the _card_ itself may well still be
in data transfer mode when this happens. How can we be sure what state
the data side is actually in? Could this be why (eg) sdhci-esdhc-imx.c
has this:
/* FIXME: delay a bit for card to be ready for next tuning due to errors */
mdelay(1);
And how many more such _hacks_ are working around similar problems?
- The unaligned buffer handling is a total waste of resources in its
current form. We DMA map, sync, and unmap it on every transfer,
whether the buffer is used or not. For MMC/SD transfers, these will
be coming from the VFS/MM layers, which operate through the page
cache - hence the vast majority of buffers will be page aligned.
Performance measurements on iMX6 show that this additional DMA
maintanence is responsible for a 10% reduction in read transfer
performance. Switch this to use a DMA coherent mapping instead, which
needs no DMA maintanence.
- sdhci_adma_table_post() is badly coded: we walk the scatterlist
looking for any buffers which might not be aligned (which can be a
rare event, see above) and then we only do something if we find an
unaligned buffer _and_ it's a read transfer. So for a write transfer,
this scatterlist walking is a total waste of CPU cycles. Testing the
transfer direction is much cheaper than walking the scatterlist.
Re-organise the code to do the cheap test first.
- There are two paths to identical DMA unmapping code (the code around
dma_unmap_sg()) in sdhci_finish_data() and its child function
sdhci_adma_table_post(). Trivially simplify this duplication.
- Move sdhci_pre_dma_transfer() so we avoid needing to declare it.
- Do the same thing with sdhci_prepare_data() and sdhci_pre_dma_transfer()
as with sdhci_finish_data() and sdhci_adma_table_post().
- Fix the mess that is the data segment host_cookie handling. The
existing code seems like a candidate for the obfuscated coding prize!
Four patches address this, turning it into something much easier to
understand, which then allows (finally)...
- The DMA API leak which occurs when we have mapped the DMA buffers in
sdhci_prepare_data() and an error occurs early on in the processing
of a request.
- A final change (mentioned above) which fixes a prior commit for
Armada 38x using sdhci-pxav3.c, found while trying to get the higher
speed modes working on these SoCs. Clearly the existing code is
broken, this at least resolves some of the breakage by allowing the
correct high-order SDHCI capabilities through.
All these patches are resulting from _only_ today; I'm sure more will
be coming in the forthcoming weeks, as all I seem to have to do is
give code a little shake and _lots_ of bugs appear. Despite this being
a series of 17 patches, it does not yet solve _all_ the problems I've
found so far on iMX6 with SDHCI.
I've tested these changes on iMX6 at UHS speeds, and Armada 388 SoCs
at HS speeds so far. Others should test them.
drivers/mmc/card/block.c | 4 +-
drivers/mmc/core/core.c | 10 +-
drivers/mmc/host/sdhci-pxav3.c | 6 +-
drivers/mmc/host/sdhci.c | 318 ++++++++++++++++++-----------------------
drivers/mmc/host/sdhci.h | 4 +-
5 files changed, 157 insertions(+), 185 deletions(-)
--
RMK's Patch system: http://www.arm.linux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
next reply other threads:[~2015-12-19 20:29 UTC|newest]
Thread overview: 28+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-12-19 20:28 Russell King - ARM Linux [this message]
2015-12-19 20:29 ` [PATCH 01/17] mmc: core: shut up "voltage-ranges unspecified" pr_info() Russell King
2015-12-21 10:18 ` Ulf Hansson
2015-12-21 10:24 ` Russell King - ARM Linux
2015-12-19 20:30 ` Russell King
2015-12-19 20:30 ` [PATCH 02/17] mmc: block: shut up "retrying because a re-tune was needed" message Russell King
2015-12-19 20:30 ` [PATCH 03/17] mmc: core: report tuning command execution failure reason Russell King
2015-12-19 20:30 ` [PATCH 04/17] mmc: sdhci: move initialisation of command error member Russell King
2015-12-19 20:30 ` [PATCH 05/17] mmc: sdhci: clean up command error handling Russell King
2015-12-19 20:30 ` [PATCH 06/17] mmc: sdhci: command response CRC " Russell King
2015-12-19 20:30 ` [PATCH 07/17] mmc: sdhci: avoid unnecessary mapping/unmapping of align buffer Russell King
2015-12-21 10:28 ` Ulf Hansson
2015-12-21 10:53 ` Russell King - ARM Linux
2015-12-19 20:30 ` [PATCH 08/17] mmc: sdhci: clean up coding style in sdhci_adma_table_pre() Russell King
2015-12-19 20:30 ` [PATCH 09/17] mmc: sdhci: avoid walking SG list for writes Russell King
2015-12-19 20:30 ` [PATCH 10/17] mmc: sdhci: factor out common DMA cleanup in sdhci_finish_data() Russell King
2015-12-19 20:30 ` [PATCH 11/17] mmc: sdhci: move sdhci_pre_dma_transfer() Russell King
2015-12-19 20:31 ` [PATCH 12/17] mmc: sdhci: factor out sdhci_pre_dma_transfer() from sdhci_adma_table_pre() Russell King
2015-12-19 20:31 ` [PATCH 13/17] mmc: sdhci: pass the cookie into sdhci_pre_dma_transfer() Russell King
2015-12-19 20:31 ` [PATCH 14/17] mmc: sdhci: always unmap a mapped data transfer in sdhci_post_req() Russell King
2015-12-19 20:31 ` [PATCH 15/17] mmc: sdhci: clean up host cookie handling Russell King
2015-12-19 20:31 ` [PATCH 16/17] mmc: sdhci: plug DMA mapping leak on error Russell King
2015-12-19 20:31 ` [PATCH 17/17] mmc: sdhci-pxav3: fix higher speed mode capabilities Russell King
2015-12-19 21:42 ` [PATCH 18/17] mmc: sdhci: further fix for DMA unmapping in sdhci_post_req() Russell King
2015-12-20 8:27 ` [PATCH 19/17] mmc: sdhci: fix data timeout (part 1) Russell King
2015-12-20 8:27 ` [PATCH 20/17] mmc: sdhci: fix data timeout (part 2) Russell King
2015-12-21 10:46 ` [PATCH 00/17] MMC/SDHCI fixes Ulf Hansson
2015-12-21 11:04 ` Russell King - ARM Linux
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151219202851.GS8644@n2100.arm.linux.org.uk \
--to=linux@arm.linux.org.uk \
--cc=gregory.clement@free-electrons.com \
--cc=kernel@pengutronix.de \
--cc=linux-mmc@vger.kernel.org \
--cc=mw@semihalf.com \
--cc=shawnguo@kernel.org \
--cc=ulf.hansson@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).