All of lore.kernel.org
 help / color / mirror / Atom feed
From: Grant Likely <grant.likely@secretlab.ca>
To: Jean-Francois Dagenais <jeff.dagenais@gmail.com>,
	linus.walleij@stericsson.com
Cc: torvalds@linux-foundation.org,
	open list <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	eric.miao@marvell.com, maz@misterjones.org
Subject: Re: gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!)
Date: Fri, 11 May 2012 12:53:37 -0600	[thread overview]
Message-ID: <20120511185337.E31BC3E0791@localhost> (raw)
In-Reply-To: <8495990E-0B78-4B47-A00D-36D45DC3450F@gmail.com>

On Thu, 10 May 2012 14:19:17 -0400, Jean-Francois Dagenais <jeff.dagenais@gmail.com> wrote:
> Hi all,
> 
> Here's a fun, yet potentially dangerous problem.

Hi Jean-Francois,

It sounds like your analysis is accurate and your conclusions look
correct.  There really isn't much that I can do for you on this though
as I don't have the hardware.  As Linus mentioned, David Jander has
worked on the driver in the past, and git log shows some other folks.
I would ask them.

g.

> 
> I've sent this twice already, without any feedback. I'm just trying to get more 
> visibility on this because I believe it to be a serious issue depending where 
> this chip may be used. Interrupts could be missed in what could be very tough 
> conditions to replicate. In my setup I could reproduce the problem easily and 
> took great care in analysing and documenting it.
> ===============Re-Re-Send (with edits)==================
> Hi all,
> (Using 3.1.0-rc4, didn't change since)
> 
> I have detected a flaw in the interrupt support of the gpio-pca953x driver. 
> This is how I discovered the issue:
> 
> I have a interrupt client device which has it's interrupt signal connected to 
> one of the GPIOs of the pca9555. The client is an ad714x device. It's interrupt 
> routine is falling edge triggered (so says the driver, but according to the 
> AD714x specs, it should be level,low, however this is another topic). The 
> pca9555 is interrupting the CPU through INTA of an ethernet PCI card for which 
> I have blacklisted the driver module and only done "echo 1 > 
> /bus/pci/devices/00xyz/enable". It is "(level, low)" which matches the pca953x 
> request_threaded_isr call (IRQF_TRIGGER_LOW | IRQF_ONESHOT). I used two GPOs on 
> some other device to get the !ISR signals shown below. These show the nested 
> nature of the cli ISR called from the pca953x's ISR.
> ASCII art of my probing (use a monospace font, "!ISR"==low, mean ISR running)
> 
> Events(lined-up):1  2 3 4 5  6    7       8    9
>              ____          _______
> client !INT      |________|       |________(lost)___
>              ___________                   _________
> cli !ISR                |____|____________|
>              ____      ___         _________________
> pca953x !INT     |____|   |_______|
>              ______                             ____
> pca953x !ISR       |___________________________|
> (note: the little spike in the cli !ISR marks the end of reading the client's 
> status registers, the rest of the ISR is updating driver state machines, etc.)
> 
> Summary: The client enters interrupt(1), which immediately puts the pca953x in 
> interrupt as well(1). When the pca953x enters it's isr thread(2), it reads the 
> input status register, this immediately clears the pca953x interrupt(3). The 
> pca953x ISR then figure's out it should invoke the nested ISR for the client. 
> The client ISR starts (4) by reading the client chip's status register, which 
> clears the client interrupt(5). This immediately puts the pca953x back into 
> interrupt (5) since it's a different state than what was last read in the 
> pca953x ISR(3). Since pca953x registered it's interrupt with IRQF_ONESHOT, this 
> new IRQ state is masked and goes un-noticed. The pca953x chips' FLAW IS THEN 
> REVEILED if/when the client chip goes back in interrupt before (7) the client 
> ISR (and the pca053x's ISR with the current code) even has time to finish(8 and 
> 9). At that moment, the pca953x's input are back to the way they where when the 
> pca953x ISR last read the input status register. So when all of the ISR 
> routines are done (9...), the client has it's INT line asserted and no more 
> interrupt will come from the pca953x because of this.
> 
> pca953x specs quote: "Resetting the interrupt circuit is achieved when data on 
> the port is CHANGED TO THE ORIGINAL SETTING, data is read from the port that 
> generated the interrupt or in a Stop event. [...] Interrupts that occur during 
> the ACK or NACK clock pulse CAN BE LOST (or be very short) due to the resetting 
> of the interrupt during this pulse."
> 
> So in essence, although the pca953x.c's design and doc says only edges are 
> detected and supported, the chip design makes it IMPOSSIBLE to guarantee that 
> all edges will be caught.
> 
> The IRQ function SHOULD BE REMOVED from the driver since the chip cannot honor 
> the driver's promise. At the very least, for system integrators that want to 
> use level base interrupts implemented by the pca953x, changes to the driver 
> should be made to correctly handle the lack of LATCHING INTERRUPT STATUS 
> register...
> 
> I would propose a patch, but I am missing knowledge on the internals of the 
> kernel's irq code, so here's a simple suggestion description, hoping to get 
> some feedback.
> 
> First of all, I propose pca953x_irq_set_type should only support 
> IRQ_TYPE_LEVEL_HIGH and IRQ_TYPE_LEVEL_LOW. pca953x_irq_handler should be 
> adjusted accordingly.
> 
> To make sure we catch as many transitions of pca953x chips INT as possible 
> (yikes), we make pca953x_irq_setup() request it's own interrupt as both rising 
> and falling (IRQF_TRIGGER_RISING|IRQF_TRIGGER_FALLING, is that even possible?) 
> without IRQF_ONESHOT, so that pca953x_irq_handler is rescheduled for running 
> even when it's already running.
> 
> Any thoughts? Maybe there are more/less fancy, or better yet, "correct" ways of 
> doing this? Will this even work?
> 
> /jfd
> 

-- 
Grant Likely, B.Sc, P.Eng.
Secret Lab Technologies, Ltd.

      parent reply	other threads:[~2012-05-11 18:53 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-10 18:19 gpio: pca953x: interrupt feature unreliable (re-re-send, Grant please acknowledge!!) Jean-Francois Dagenais
2012-05-11  7:48 ` Linus Walleij
     [not found]   ` <20120604091127.0f95a85c@archvile>
     [not found]     ` <73B37CBF-9E2A-49F5-9CA9-8CFBEF3666D6@gmail.com>
     [not found]       ` <20120620170126.311b8eef@archvile>
2012-06-20 19:28         ` gpio: pca953x: interrupt feature unreliable Jean-François Dagenais
2012-06-21  7:10           ` David Jander
2012-06-21 19:34             ` Jean-François Dagenais
2012-06-22  7:53               ` David Jander
2012-05-11 18:53 ` Grant Likely [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120511185337.E31BC3E0791@localhost \
    --to=grant.likely@secretlab.ca \
    --cc=eric.miao@marvell.com \
    --cc=jeff.dagenais@gmail.com \
    --cc=linus.walleij@stericsson.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maz@misterjones.org \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.