From: Grant Likely <grant.likely@secretlab.ca>
To: Jean-Francois Dagenais <jeff.dagenais@gmail.com>,
linus.walleij@stericsson.com
Cc: torvalds@linux-foundation.org,
open list <linux-kernel@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
eric.miao@marvell.com, maz@misterjones.org
Subject: Re: gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!)
Date: Fri, 11 May 2012 12:53:37 -0600 [thread overview]
Message-ID: <20120511185337.E31BC3E0791@localhost> (raw)
In-Reply-To: <8495990E-0B78-4B47-A00D-36D45DC3450F@gmail.com>
On Thu, 10 May 2012 14:19:17 -0400, Jean-Francois Dagenais <jeff.dagenais@gmail.com> wrote:
> Hi all,
>
> Here's a fun, yet potentially dangerous problem.
Hi Jean-Francois,
It sounds like your analysis is accurate and your conclusions look
correct. There really isn't much that I can do for you on this though
as I don't have the hardware. As Linus mentioned, David Jander has
worked on the driver in the past, and git log shows some other folks.
I would ask them.
g.
>
> I've sent this twice already, without any feedback. I'm just trying to get more
> visibility on this because I believe it to be a serious issue depending where
> this chip may be used. Interrupts could be missed in what could be very tough
> conditions to replicate. In my setup I could reproduce the problem easily and
> took great care in analysing and documenting it.
> ===============Re-Re-Send (with edits)==================
> Hi all,
> (Using 3.1.0-rc4, didn't change since)
>
> I have detected a flaw in the interrupt support of the gpio-pca953x driver.
> This is how I discovered the issue:
>
> I have a interrupt client device which has it's interrupt signal connected to
> one of the GPIOs of the pca9555. The client is an ad714x device. It's interrupt
> routine is falling edge triggered (so says the driver, but according to the
> AD714x specs, it should be level,low, however this is another topic). The
> pca9555 is interrupting the CPU through INTA of an ethernet PCI card for which
> I have blacklisted the driver module and only done "echo 1 >
> /bus/pci/devices/00xyz/enable". It is "(level, low)" which matches the pca953x
> request_threaded_isr call (IRQF_TRIGGER_LOW | IRQF_ONESHOT). I used two GPOs on
> some other device to get the !ISR signals shown below. These show the nested
> nature of the cli ISR called from the pca953x's ISR.
> ASCII art of my probing (use a monospace font, "!ISR"==low, mean ISR running)
>
> Events(lined-up):1 2 3 4 5 6 7 8 9
> ____ _______
> client !INT |________| |________(lost)___
> ___________ _________
> cli !ISR |____|____________|
> ____ ___ _________________
> pca953x !INT |____| |_______|
> ______ ____
> pca953x !ISR |___________________________|
> (note: the little spike in the cli !ISR marks the end of reading the client's
> status registers, the rest of the ISR is updating driver state machines, etc.)
>
> Summary: The client enters interrupt(1), which immediately puts the pca953x in
> interrupt as well(1). When the pca953x enters it's isr thread(2), it reads the
> input status register, this immediately clears the pca953x interrupt(3). The
> pca953x ISR then figure's out it should invoke the nested ISR for the client.
> The client ISR starts (4) by reading the client chip's status register, which
> clears the client interrupt(5). This immediately puts the pca953x back into
> interrupt (5) since it's a different state than what was last read in the
> pca953x ISR(3). Since pca953x registered it's interrupt with IRQF_ONESHOT, this
> new IRQ state is masked and goes un-noticed. The pca953x chips' FLAW IS THEN
> REVEILED if/when the client chip goes back in interrupt before (7) the client
> ISR (and the pca053x's ISR with the current code) even has time to finish(8 and
> 9). At that moment, the pca953x's input are back to the way they where when the
> pca953x ISR last read the input status register. So when all of the ISR
> routines are done (9...), the client has it's INT line asserted and no more
> interrupt will come from the pca953x because of this.
>
> pca953x specs quote: "Resetting the interrupt circuit is achieved when data on
> the port is CHANGED TO THE ORIGINAL SETTING, data is read from the port that
> generated the interrupt or in a Stop event. [...] Interrupts that occur during
> the ACK or NACK clock pulse CAN BE LOST (or be very short) due to the resetting
> of the interrupt during this pulse."
>
> So in essence, although the pca953x.c's design and doc says only edges are
> detected and supported, the chip design makes it IMPOSSIBLE to guarantee that
> all edges will be caught.
>
> The IRQ function SHOULD BE REMOVED from the driver since the chip cannot honor
> the driver's promise. At the very least, for system integrators that want to
> use level base interrupts implemented by the pca953x, changes to the driver
> should be made to correctly handle the lack of LATCHING INTERRUPT STATUS
> register...
>
> I would propose a patch, but I am missing knowledge on the internals of the
> kernel's irq code, so here's a simple suggestion description, hoping to get
> some feedback.
>
> First of all, I propose pca953x_irq_set_type should only support
> IRQ_TYPE_LEVEL_HIGH and IRQ_TYPE_LEVEL_LOW. pca953x_irq_handler should be
> adjusted accordingly.
>
> To make sure we catch as many transitions of pca953x chips INT as possible
> (yikes), we make pca953x_irq_setup() request it's own interrupt as both rising
> and falling (IRQF_TRIGGER_RISING|IRQF_TRIGGER_FALLING, is that even possible?)
> without IRQF_ONESHOT, so that pca953x_irq_handler is rescheduled for running
> even when it's already running.
>
> Any thoughts? Maybe there are more/less fancy, or better yet, "correct" ways of
> doing this? Will this even work?
>
> /jfd
>
--
Grant Likely, B.Sc, P.Eng.
Secret Lab Technologies, Ltd.
prev parent reply other threads:[~2012-05-11 18:53 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-05-10 18:19 gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!) Jean-Francois Dagenais
2012-05-11 7:48 ` Linus Walleij
[not found] ` <20120604091127.0f95a85c@archvile>
[not found] ` <73B37CBF-9E2A-49F5-9CA9-8CFBEF3666D6@gmail.com>
[not found] ` <20120620170126.311b8eef@archvile>
2012-06-20 19:28 ` gpio: pca953x: interrupt feature unreliable Jean-François Dagenais
2012-06-21 7:10 ` David Jander
2012-06-21 19:34 ` Jean-François Dagenais
2012-06-22 7:53 ` David Jander
2012-05-11 18:53 ` Grant Likely [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120511185337.E31BC3E0791@localhost \
--to=grant.likely@secretlab.ca \
--cc=eric.miao@marvell.com \
--cc=jeff.dagenais@gmail.com \
--cc=linus.walleij@stericsson.com \
--cc=linux-kernel@vger.kernel.org \
--cc=maz@misterjones.org \
--cc=tglx@linutronix.de \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox