linux-kernel.vger.kernel.org archive mirror
From: Tejun Heo <tj@kernel.org>
To: Jeff Garzik <jeff@garzik.org>
Cc: mingo@elte.hu, tglx@linutronix.de, bphilips@suse.de,
	yinghai@kernel.org, akpm@linux-foundation.org,
	torvalds@linux-foundation.org, linux-kernel@vger.kernel.org,
	linux-ide@vger.kernel.org, stern@rowland.harvard.edu,
	gregkh@suse.de, khali@linux-fr.org
Subject: Re: [PATCH 11/12] libata: use IRQ expecting
Date: Sat, 26 Jun 2010 11:44:56 +0200
Message-ID: <4C25CC18.2070507@kernel.org>
In-Reply-To: <4C25C551.8000404@garzik.org>

Hello, Jeff.

On 06/26/2010 11:16 AM, Jeff Garzik wrote:
> On 06/26/2010 04:31 AM, Tejun Heo wrote:
>> Well, it can indicate the start of cluster of completions, which is the
>> necessary information anyway.  From the second call on, it's a simple
>> flag test and return.  I doubt it will affect anything even w/ high
>> performance SSDs but please read on.
> 
> Yes, and your patch calls unexpect_irq() at the _start_ of a cluster of
> completions.  That is nonsensical, because it reflects the /opposite/ of
> the present ATA bus state, when multiple commands are in flight.

That's actually what we wanna know.  I'll talk about it below.

>> ata_qc_complete_multiple() call [un]expect_irq() only once by
>> introducing an internal completion function w/o irq expect handling,
>> say ata_qc_complete_raw() and making both ata_qc_complete() and
>> ata_qc_complete_multiple() simple wrapper around it w/ irq expect
>> handling.
> 
> Yes, this fixes the problem, but it is better to create a wrapper path for
> the legacy PATA/SATA1 that uses irq-expecting, and a fast path for
> modern controllers that do not use it.
> 
>> On 06/26/2010 05:45 AM, Jeff Garzik wrote:
>>> We don't want to burden modern SATA drivers with the overhead of
>>> dealing with silly PATA/SATA1 legacy irq nastiness, particularly the
>>> ugliness of calling
>>
>> I think we're much better off applying it to all the drivers.  IRQ
>> expecting is very cheap and scalable and there definitely are plenty
>> of IRQ delivery problems with modern controllers although their
>> patterns tend to be different from legacy ones.  Plus, it will also be
>> useful for power state predictions.
> 
> Modern SATA/SAS controllers, and their drivers, already have well
> defined methods of acknowledging interrupts, even unexpected ones, in
> ways that do not need this core manipulation.  This is over-engineering,
> punishing all modern chipsets moving forward regardless of their design,
> by unconditionally requiring this behavior of all libata drivers.

Unacked irqs are primarily handled by spurious IRQ handling.  IRQ
expecting is more about lost interrupts and we have enough lost
interrupt cases even on new controllers w/ native interface, both
transient and non-transient.

One of the goals of this whole IRQ exception handling was to make it
dumb easy for drivers to use, which also included making things cheap
enough that they can be called from hot paths.  Both expect_irq() and
unexpect_irq() are very cheap once the IRQ delivery is verified.  If
the processor is taking an interrupt in the first place, this amount
of logic shouldn't matter at all.  There really isn't any punishment
to avoid and IMHO not doing it for native controllers is an
over-optimization.  It gains almost nothing while losing meaningful
protection.
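
To illustrate the "cheap once verified" claim, here is a minimal
userspace sketch.  It is not the code from the patchset: the struct,
the sim_* names, the "delivered" argument and the rule that a single
good delivery marks the IRQ as verified are all assumptions made for
the example.  It only shows the shape of the argument, i.e. that after
verification each call reduces to a flag test and return.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state for one expected IRQ source (illustration only). */
struct irq_expect_sim {
	bool verified;		/* IRQ delivery has been seen to work */
	bool expecting;		/* an expect segment is currently open */
	unsigned int slow_path;	/* calls that did more than a flag test */
};

/* Open a segment.  Repeated calls inside a segment just return. */
void sim_expect_irq(struct irq_expect_sim *exp)
{
	if (exp->expecting)
		return;
	exp->expecting = true;
	if (!exp->verified)
		exp->slow_path++;	/* stand-in for arming a poll timer */
}

/* Close a segment.  @delivered: the event arrived via a real IRQ. */
void sim_unexpect_irq(struct irq_expect_sim *exp, bool delivered)
{
	if (!exp->expecting)
		return;			/* no open segment: call is ignored */
	exp->expecting = false;
	if (!exp->verified) {
		exp->slow_path++;	/* stand-in for verification work */
		exp->verified = delivered;
	}
}

/* Run @rounds issue/complete cycles and count the slow-path calls. */
unsigned int sim_slow_paths_after(unsigned int rounds)
{
	struct irq_expect_sim exp = { 0 };
	unsigned int i;

	for (i = 0; i < rounds; i++) {
		sim_expect_irq(&exp);
		assert(exp.expecting);
		sim_unexpect_irq(&exp, true);
	}
	return exp.slow_path;
}
```

In this model, when every IRQ is delivered, sim_slow_paths_after(1000)
counts two slow-path calls, both in the first round; the remaining 999
rounds only test and flip flags.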

> Just like the rest of libata's layered driver architecture, it should be
> straightforward to apply this only to SFF/BMDMA chipsets, then tackle
> odd cases as needs arise.
>
> Modern controllers acknowledge interrupts sanely, and always "expect" an
> interrupt when you include interrupt events like hotplug, even if the
> ATA bus itself is idle.  There is no need to burden the millions of ahci
> users with irq-expecting, for example.

I'm not saying applying it to only SFF/BMDMA is difficult, just that
it's better to apply it to all drivers in this case.  IRQ expecting is
to protect against misdelivered / lost IRQs and we do have them for
ahci, sil24 or whatever too.  It would of course be silly to pay
significant performance overhead for such protection but as I stated
above, it's _really_ cheap.  If the driver is taking an interrupt and
accessing hardware, then even compared only against the general
complexity of generic IRQ and libata code, the cost of IRQ [un]expect
is negligible.  It was designed precisely that way to allow use cases
like this.

> With regards to power state predictions, it is only useful if you are
> accurately reflecting the ATA bus state (idle or not) at all times.  As
> mentioned above, this patch clearly creates a condition where
> unexpect_irq() is called when commands remain in flight, and libata is
> expecting further command completions.
>
> IOW, patch #11 says "we are not expecting irq" when we are.
> 
> At least a halfway sane approach would be to track bus-idle status, and
> trigger useful code when that status changes (idle->active or
> active->idle).  Perhaps LED, power state, and irq-expecting could all
> use such a triggering mechanism.

Continuing from the response to the first paragraph.  The IRQ
expecting code isn't interested in the bus state, it's interested only
in the IRQ events and that's what it's expecting.  The same applies to
power state prediction too, so please consider the following NCQ
command execution sequence.

1. issue tags 0, 1, 2, 3
2. IRQ triggers, tags 0, 2 complete
3. IRQ triggers, tags 1, 3 complete

For IRQ expecting, both 1-2 and 2-3 are segments to expect for, and
the same holds for power state transitions, as it's the IRQ itself
which forces the cpu to come out of sleep state.  The reason why I
said unexpect in ata_qc_complete() is okay is that it can still
delimit each segment as long as we have a proper irq_expect() call at
the beginning of each segment (all other unexpect calls are ignored).
But that's kind of a moot point as we can easily do a single pair.
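
The sequence above can be modelled with a small userspace sketch.
Everything in it is hypothetical (struct port_sim, the port_* and qc_*
helpers, and in particular the assumption that the next segment is
opened right after an IRQ whenever tags remain in flight); only the
idea of a raw completion function wrapped by per-command and
multi-command completion paths comes from this thread.  It counts how
many unexpect calls actually close a segment and how many are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct port_sim {
	uint32_t active;	/* bitmask of in-flight NCQ tags */
	bool expecting;		/* an expect segment is open */
	unsigned int segments;	/* unexpect calls that closed a segment */
	unsigned int ignored;	/* unexpect calls with nothing to close */
};

void port_expect(struct port_sim *p)
{
	p->expecting = true;
}

void port_unexpect(struct port_sim *p)
{
	if (!p->expecting) {
		p->ignored++;	/* flag test and return */
		return;
	}
	p->expecting = false;
	p->segments++;
}

/* Completion without any irq-expect handling. */
void qc_complete_raw(struct port_sim *p, unsigned int tag)
{
	assert(tag < 32);
	p->active &= ~(UINT32_C(1) << tag);
}

/* Per-command wrapper: one unexpect per completed command. */
void qc_complete(struct port_sim *p, unsigned int tag)
{
	port_unexpect(p);
	qc_complete_raw(p, tag);
}

/* Multi-command wrapper: one unexpect for the whole cluster. */
void qc_complete_multiple(struct port_sim *p, uint32_t done_mask)
{
	unsigned int tag;

	port_unexpect(p);
	for (tag = 0; tag < 32; tag++)
		if (done_mask & (UINT32_C(1) << tag))
			qc_complete_raw(p, tag);
}

/* One IRQ completing the tags in @done_mask. */
void port_irq(struct port_sim *p, uint32_t done_mask, bool per_command)
{
	unsigned int tag;

	if (per_command) {
		for (tag = 0; tag < 32; tag++)
			if (done_mask & (UINT32_C(1) << tag))
				qc_complete(p, tag);
	} else {
		qc_complete_multiple(p, done_mask);
	}
	if (p->active)
		port_expect(p);	/* assumed: reopen while tags remain */
}

struct port_sim run_ncq_sequence(bool per_command)
{
	struct port_sim p = { 0 };

	p.active = 0xf;			/* 1. issue tags 0, 1, 2, 3 */
	port_expect(&p);
	port_irq(&p, 0x5, per_command);	/* 2. tags 0, 2 complete */
	port_irq(&p, 0xa, per_command);	/* 3. tags 1, 3 complete */
	return p;
}
```

Under these assumptions both variants delimit the same two segments
(1-2 and 2-3).  The per-command variant additionally makes two
unexpect calls that are ignored, while the single-pair variant makes
none.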

Thanks.

-- 
tejun

