SDHCI long sleep with interrupts off

linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed

From: david@protonic.nl (David Jander)
To: linux-arm-kernel@lists.infradead.org
Subject: SDHCI long sleep with interrupts off
Date: Thu, 17 Dec 2015 12:20:53 +0100	[thread overview]
Message-ID: <20151217122053.036daf37@archvile> (raw)
In-Reply-To: <1450350190.3163.93.camel@pengutronix.de>


Hi Lucas,

Thanks for reacting.

On Thu, 17 Dec 2015 12:03:10 +0100
Lucas Stach <l.stach@pengutronix.de> wrote:

> Am Donnerstag, den 17.12.2015, 11:28 +0100 schrieb David Jander:
> > Hi all, 
> > 
> > I was investigating the source of abnormal irq-latency spikes on an i.MX6
> > (ARM) board, and discovered this:
> > 
> > # tracer: preemptirqsoff
> > #
> > # preemptirqsoff latency trace v1.1.5 on 4.4.0-rc4+
> > # --------------------------------------------------------------------
> > # latency: 2068 us, #4/4, CPU#0 | (M:preempt VP:0, KP:0, SP:0 HP:0 #P:1)
> > #    -----------------
> > #    | task: mmcqd/0-92 (uid:0 nice:0 policy:0 rt_prio:0)
> > #    -----------------
> > #  => started at: _raw_spin_lock_irqsave
> > #  => ended at:   _raw_spin_unlock_irqrestore
> > #
> > #
> > #                  _------=> CPU#            
> > #                 / _-----=> irqs-off        
> > #                | / _----=> need-resched    
> > #                || / _---=> hardirq/softirq 
> > #                ||| / _--=> preempt-depth   
> > #                |||| /     delay            
> > #  cmd     pid   ||||| time  |   caller      
> > #     \   /      |||||  \    |   /         
> >  mmcqd/0-92      0d...    1us#: _raw_spin_lock_irqsave
> >  mmcqd/0-92      0.n.1 2066us : _raw_spin_unlock_irqrestore
> >  mmcqd/0-92      0.n.1 2070us+: trace_preempt_on
> > <-_raw_spin_unlock_irqrestore mmcqd/0-92      0.n.1 2107us : <stack trace>
> >  => sdhci_runtime_resume_host
> >  => __rpm_callback
> >  => rpm_callback
> >  => rpm_resume
> >  => __pm_runtime_resume
> >  => __mmc_claim_host
> >  => mmc_blk_issue_rq
> >  => mmc_queue_thread
> >  => kthread
> >  => ret_from_fork
> > 
> > 2 ms with interrupts disabled!!! To much dismay, I later discovered that
> > this isn't even the worst case scenario. I also discovered that this has
> > been in the kernel for a long time without a fix (I have tested from 3.17
> > to 4.4-rc4). There has been an attempt by someone to address this back in
> > 2010, but apparently it never hit mainline.
> > Going through the code in sdhci.c, I found this troublesome code-path:
> > 
> > sdhci_do_set_ios() {
> > 	spin_lock_irqsave(&host->lock, flags);
> > 	...
> > 	sdhci_reinit() --> sdhci_init() --> sdhci_do_reset() -->
> > 		host->ops->reset() --> sdhci_reset()	
> > 	...
> > 	spin_unlock_irqrestore(&host->lock, flags);
> > }
> > 
> > And in sdhci_reset(), which may be called with held spinlock:
> > 
> > 	...
> > 	/* Wait max 100 ms */
> > 	timeout = 100;
> > 
> > 	/* hw clears the bit when it's done */
> > 	while (sdhci_readb(host, SDHCI_SOFTWARE_RESET) & mask) {
> > 		if (timeout == 0) {
> > 			pr_err("%s: Reset 0x%x never completed.\n",
> > 				mmc_hostname(host->mmc), (int)mask);
> > 			sdhci_dumpregs(host);
> > 			return;
> > 		}
> > 		timeout--;
> > 		mdelay(1);
> > 	}
> > 
> > I am wondering: There either must be a reason this hasn't been fixed in
> > such a long time, or I am not understanding this correctly, so please
> > enlighten me. Before trying a cowboy attempt at "fixing" this, I'd really
> > like to know why am I seeing this?
> > I mean... how can such a problem get unnoticed and unfixed for so long?
> > Will an attempt at fixing this issue even be accepted?
> > 
> I would like to see the sdhci spinlock killed and replaced by a mutex
> for exactly this reason.
> 
> That said, your problem is card polling, when no card is present in the
> slot. This is most probably caused by CD gpios having the wrong
> polarity.

... or not having a CD pin at all.
I am using an embedded eMMC chip and a uSD card inserted into a slot. The card
is present and also detected as such. If I never access the card, I see no
spikes (filesystem is mounted but not accessed). If I try to read a file or
directory I get the above trace.
OTOH, if I disable PM functionality in the kernel, the spike is gone, and
worst-case latency is in the 300us range, so I don't think this is related to
card polling.

Best regards,

-- 
David Jander
Protonic Holland.

next prev parent reply	other threads:[~2015-12-17 11:20 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-12-17 10:28 SDHCI long sleep with interrupts off David Jander
2015-12-17 11:03 ` Lucas Stach
2015-12-17 11:20   ` David Jander [this message]
2015-12-17 11:27     ` Lucas Stach
2015-12-17 11:39       ` Ulf Hansson
2015-12-17 12:22         ` David Jander
2015-12-17 14:54           ` Ulf Hansson
2015-12-17 15:09             ` David Jander
2015-12-18 20:09               ` Russell King - ARM Linux
2015-12-18 18:38           ` Russell King - ARM Linux

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151217122053.036daf37@archvile \
    --to=david@protonic.nl \
    --cc=linux-arm-kernel@lists.infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).