From: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
To: Jonathan Cameron <jic23@kernel.org>
Cc: Lorenzo Bianconi <lorenzo@kernel.org>,
Jonathan Cameron <Jonathan.Cameron@huawei.com>,
linux-iio@vger.kernel.org, mario.tesi@st.com,
denis.ciocca@st.com, armando.visconti@st.com
Subject: Re: [PATCH] iio: imu: st_lsm6dsx: fix edge-trigger interrupts
Date: Sat, 14 Nov 2020 18:58:14 +0100
Message-ID: <20201114175814.GC3993@lore-desk>
In-Reply-To: <20201114173100.0d6ce33e@archlinux>
> On Sat, 14 Nov 2020 17:48:40 +0100
> Lorenzo Bianconi <lorenzo@kernel.org> wrote:
>
> > > On Sun, 8 Nov 2020 19:27:28 +0100
> > > Lorenzo Bianconi <lorenzo@kernel.org> wrote:
> > >
> > [...]
> > >
> > > So the thing I've been trying to say badly here is that I'm fairly sure the
> > > issue isn't what you think it is at all. (Note I've spent a lot of
> > > time with scopes on interrupt lines looking for similar issues - it's
> > > not fun).
> > >
> > > I think the actual condition here is that you have an interrupt that is not
> > > guaranteed to go low for long enough between being cleared and set. Thus if you are
> > > reading the fifo at almost exactly the moment new data is written you may in theory
> > > have the interrupt drop, but in practice analog electronics kicks in and you won't
> > > get an interrupt detected at all. This is why the sensor needs to put guarantees
> > > on that drop time (some do - but I'm not seeing it in the datasheet for this one).
> > > On a more mundane note, I'm not sure in this case that there is a guarantee
> > > it will ever drop even in theory - this buffer could for this short period be
> > > filling faster than we drain it.
> >
> > ack, very nice explanation :)
> >
> > >
> > > The reason your change makes this much less likely to happen is that, by checking
> > > again you are generally much closer to the time of the change of the level in
> > > the fifo. Thus, unless you are preempted you should clear it long before it
> > > would be set again, and thus get a nice clean drop on the interrupt.
> > >
> > > So for some ASCII art
> >
> > very nice :)
> >
> > >
> > > Previously we have
> > >
> > > data samples | | |
> > > _
> > > Read of fifo ___________|_____
> > > _______ _____________
> > > interrupt line ____| | Interrupt stuck high as edge missed.
> > > ^
> > > 1
> > >
> > > With your fix
> > >
> > > data samples | | |
> > > _
> > > Read of fifo ___________|__|__
> > > _______ __
> > > interrupt line ____| | |____|
> > > ^ ^
> > > 1 2
> > >
> > > So we would have missed 1, but because we check the fifo level again immediately
> > > after we would have made it drop, if we hit this unfortunate timing we will
> > > very quickly pull new data from the sensor and result in a drop well before the
> > > next interrupt comes in.
> >
> > in the last case, even if we introduce a little bit of burstiness, I guess it
> > works because we read both 1 and 2, right?
>
> We should always be fine, because the extra check must take a bit of time. Either
> the event happens after that time (in which case the interrupt will have been low
> long enough) or it doesn't and we will catch it.
>
> >
> > >
> > >
> > > >
[...]
> > I do not know about it, I just received a report about the issue from stm folks.
> > I am fine to drop support for edge interrupts but do we have a similar issue for
> > st sensors (acc, magn, gyro) as well? Please consider:
> > https://elixir.bootlin.com/linux/latest/source/drivers/iio/common/st_sensors/st_sensors_trigger.c#L113
>
> It was a part now supported by that driver that I hit this issue on
> years ago. As a side note, there is a bug in there though, albeit one we
> probably can't hit? stat_drdy has to be defined, if not the while loop will get
> a negative value back (which evaluates as true) and loop forever.
>
> https://elixir.bootlin.com/linux/latest/source/drivers/iio/common/st_sensors/st_sensors_trigger.c#L36
> Probably wants to return 0 but print an error message. Whilst there even better
> if that function just returns a boolean so we can't accidentally put such a bug
> back in again in future.
ack, I agree. I can post a fix but I have no device for testing.
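A rough, untested sketch of the idea, written as a standalone userspace model rather than the real driver code. The struct, its fields and both helpers below are made up for illustration and do not match st_sensors_trigger.c; the only point is that a bool-returning helper cannot leak a negative errno into the loop condition.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the sensor state; not the driver's struct. */
struct fake_sensor {
	bool has_stat_drdy;	/* false models a part with no status register */
	uint8_t status;		/* fake status register content */
	uint8_t drdy_mask;	/* bits that flag new samples */
	int pending;		/* samples still queued in the fake device */
};

/*
 * Returns true only when the status register exists and reports new data.
 * A missing register yields false (plus a message) instead of a negative
 * errno, which a "while (helper())" loop would treat as true forever.
 */
static bool fake_new_samples_available(const struct fake_sensor *s)
{
	if (!s->has_stat_drdy) {
		fprintf(stderr, "no status register, assuming no new samples\n");
		return false;
	}
	return (s->status & s->drdy_mask) != 0;
}

/* Drain loop modelled on the edge-trigger re-check; returns samples read. */
static int fake_drain(struct fake_sensor *s)
{
	int n = 0;

	while (fake_new_samples_available(s)) {
		n++;
		/* Clear the data-ready bits once the last sample is consumed. */
		if (--s->pending <= 0)
			s->status &= (uint8_t)~s->drdy_mask;
	}
	return n;
}
```

With has_stat_drdy set to false the loop body never runs, whereas an int helper returning a negative error code would keep the loop spinning.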
>
> Let's go with your fix, but perhaps we should add a note to the dt binding to
> say level interrupts preferred? Saving a check or two in the common case is
> definitely beneficial if the host supports level interrupts.
>
> If you can do a v3 with updated explanation and comments that would be great.
sure, I will add some comments to v2 and post v3.
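A minimal sketch of the re-check loop under discussion, modelled as plain userspace C so it can be compiled without the driver. The fake_fifo struct, fake_read_fifo() and the MAX_DRAIN_PASSES cap are assumptions for illustration only: the patch as posted loops until read_fifo() returns 0 and has no cap, while the cap here mirrors the timeout suggested earlier in the thread for a FIFO that fills faster than it is drained.

```c
#include <stdbool.h>

#define MAX_DRAIN_PASSES 8	/* hypothetical cap, not part of the patch */

/* Hypothetical FIFO model, not the driver's data structures. */
struct fake_fifo {
	int level;		/* samples currently queued */
	int refill;		/* samples that land while one read is in flight */
	int refills_left;	/* how many more reads will see a refill */
};

/* Reads everything currently queued, then lets new samples arrive. */
static int fake_read_fifo(struct fake_fifo *f)
{
	int len = f->level;

	f->level = 0;
	if (f->refills_left > 0) {
		f->level = f->refill;
		f->refills_left--;
	}
	return len;
}

/*
 * Re-read the FIFO until a read returns no samples, so that an edge
 * interrupt line has had a chance to drop before the handler returns.
 * Gives up after MAX_DRAIN_PASSES. *drained is set to true only when
 * the last read found the FIFO empty. Returns the total samples read.
 */
static int fake_handler_thread(struct fake_fifo *f, bool *drained)
{
	int fifo_len = 0, len, pass;

	*drained = false;
	for (pass = 0; pass < MAX_DRAIN_PASSES; pass++) {
		len = fake_read_fifo(f);
		if (len <= 0) {
			*drained = true;
			break;
		}
		fifo_len += len;
	}
	return fifo_len;
}
```

In the common case the second pass reads nothing and the loop exits at once; only when a sample lands during the first read does the extra pass pick it up.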
Regards,
Lorenzo
>
> Thanks,
>
> Jonathan
>
> >
> > Regards,
> > Lorenzo
> >
> > >
> > > Jonathan
> > > >
> > > > Regards,
> > > > Lorenzo
> > > >
> > > > >
> > > > > Jonathan
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > >
> > > > > > Regards,
> > > > > > Lorenzo
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hmm. Having had a look at one of the datasheets, I'm far from convinced these
> > > > > > > > > > parts truly support edge interrupts. I can't see anything about minimum
> > > > > > > > > off periods etc that you need for true edge interrupts. Otherwise they are
> > > > > > > > > going to be prone to races.
> > > > > > > >
> > > > > > > > @mario, denis, armando: any pointer for this?
> > > > > > > >
> > > > > > > > >
> > > > > > > > > So I think the following can happen.
> > > > > > > > >
> > > > > > > > > A) We drain the fifo and it stays under the limit. Hence once that
> > > > > > > > > is crossed in future we will interrupt as normal.
> > > > > > > > >
> > > > > > > > > B) We drain the fifo but it either has a very low watermark, or is
> > > > > > > > > filling very fast. We manage to drain enough to get the interrupt
> > > > > > > > > > to fire again, so all is fine if less than ideal. With your loop we
> > > > > > > > > > may end up entering the interrupt handler when we don't actually need to.
> > > > > > > > > If you want to avoid that you would need to disable the interrupt,
> > > > > > > > > then drain the fifo and finally do a dance to successfully reenable
> > > > > > > > > the interrupt, whilst ensuring no chance of missing by checking it
> > > > > > > > > should not have fired (still below the threshold)
> > > > > > > > >
> > > > > > > > > C) We try to drain the fifo, but it is actually filling fast enough that
> > > > > > > > > we never get it under the limit, so no interrupt ever fires.
> > > > > > > > > With new code, we'll keep spinning to 0 so might eventually drain it.
> > > > > > > > > That needs a timeout so we just give up eventually.
> > > > > > > > >
> > > > > > > > > > D) watermark is one sample, we drain low enough to successfully get down
> > > > > > > > > to zero at the moment of the read, but very very soon after that we get
> > > > > > > > > one sample again. There is a window in which the interrupt line dropped
> > > > > > > > > but analogue electronics etc being what they are, it may not have been
> > > > > > > > > detectable. Hence we miss an interrupt... What you are doing is reducing
> > > > > > > > > the chance of hitting this. It is nasty, but you might be able to ensure
> > > > > > > > > a reasonable period by widening this window. Limit the watermark to 2
> > > > > > > > > samples?
> > > > > > > > >
> > > > > > > > > Also needs a fixes tag :)
> > > > > > > >
> > > > > > > > ack, I will add them in v2
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Lorenzo
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> > > > > > > > > > ---
> > > > > > > > > > drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 33 +++++++++++++++-----
> > > > > > > > > > 1 file changed, 25 insertions(+), 8 deletions(-)
> > > > > > > > > >
> > > > > > > > > > diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > > > > > > > index 5e584c6026f1..d43b08ceec01 100644
> > > > > > > > > > --- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > > > > > > > +++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
> > > > > > > > > > @@ -2457,22 +2457,36 @@ st_lsm6dsx_report_motion_event(struct st_lsm6dsx_hw *hw)
> > > > > > > > > > return data & event_settings->wakeup_src_status_mask;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > +static irqreturn_t st_lsm6dsx_handler_irq(int irq, void *private)
> > > > > > > > > > +{
> > > > > > > > > > + return IRQ_WAKE_THREAD;
> > > > > > > > > > +}
> > > > > > > > > > +
> > > > > > > > > > static irqreturn_t st_lsm6dsx_handler_thread(int irq, void *private)
> > > > > > > > > > {
> > > > > > > > > > struct st_lsm6dsx_hw *hw = private;
> > > > > > > > > > + int fifo_len = 0, len = 0;
> > > > > > > > > > bool event;
> > > > > > > > > > - int count;
> > > > > > > > > >
> > > > > > > > > > event = st_lsm6dsx_report_motion_event(hw);
> > > > > > > > > >
> > > > > > > > > > if (!hw->settings->fifo_ops.read_fifo)
> > > > > > > > > > return event ? IRQ_HANDLED : IRQ_NONE;
> > > > > > > > > >
> > > > > > > > > > - mutex_lock(&hw->fifo_lock);
> > > > > > > > > > - count = hw->settings->fifo_ops.read_fifo(hw);
> > > > > > > > > > - mutex_unlock(&hw->fifo_lock);
> > > > > > > > > > + /*
> > > > > > > > > > + * If we are using edge IRQs, new samples can arrive while
> > > > > > > > > > + * processing current IRQ and those may be missed unless we
> > > > > > > > > > + * pick them here, so let's try read FIFO status again
> > > > > > > > > > + */
> > > > > > > > > > + do {
> > > > > > > > > > + mutex_lock(&hw->fifo_lock);
> > > > > > > > > > + len = hw->settings->fifo_ops.read_fifo(hw);
> > > > > > > > > > + mutex_unlock(&hw->fifo_lock);
> > > > > > > > > > +
> > > > > > > > > > + fifo_len += len;
> > > > > > > > > > + } while (len > 0);
> > > > > > > > > >
> > > > > > > > > > - return count || event ? IRQ_HANDLED : IRQ_NONE;
> > > > > > > > > > + return fifo_len || event ? IRQ_HANDLED : IRQ_NONE;
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > > > > > > > > @@ -2488,10 +2502,14 @@ static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > > > > > > > >
> > > > > > > > > > switch (irq_type) {
> > > > > > > > > > case IRQF_TRIGGER_HIGH:
> > > > > > > > > > + irq_type |= IRQF_ONESHOT;
> > > > > > > > > > + fallthrough;
> > > > > > > > > > case IRQF_TRIGGER_RISING:
> > > > > > > > > > irq_active_low = false;
> > > > > > > > > > break;
> > > > > > > > > > case IRQF_TRIGGER_LOW:
> > > > > > > > > > + irq_type |= IRQF_ONESHOT;
> > > > > > > > > > + fallthrough;
> > > > > > > > > > case IRQF_TRIGGER_FALLING:
> > > > > > > > > > irq_active_low = true;
> > > > > > > > > > break;
> > > > > > > > > > @@ -2520,10 +2538,9 @@ static int st_lsm6dsx_irq_setup(struct st_lsm6dsx_hw *hw)
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > err = devm_request_threaded_irq(hw->dev, hw->irq,
> > > > > > > > > > - NULL,
> > > > > > > > > > + st_lsm6dsx_handler_irq,
> > > > > > > > > > st_lsm6dsx_handler_thread,
> > > > > > > > > > - irq_type | IRQF_ONESHOT,
> > > > > > > > > > - "lsm6dsx", hw);
> > > > > > > > > > + irq_type, "lsm6dsx", hw);
> > > > > > > > > > if (err) {
> > > > > > > > > > dev_err(hw->dev, "failed to request trigger irq %d\n",
> > > > > > > > > > hw->irq);
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
>