linux-spi.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Vincent Whitchurch <vincent.whitchurch@axis.com>
To: David Jander <david@protonic.nl>
Cc: Mark Brown <broonie@kernel.org>,
	Casper Andersson <casper.casan@gmail.com>,
	"linux-spi@vger.kernel.org" <linux-spi@vger.kernel.org>,
	Marc Kleine-Budde <mkl@pengutronix.de>,
	Andrew Lunn <andrew@lunn.ch>,
	Lars Povlsen <lars.povlsen@microchip.com>,
	Steen Hegelund <steen.hegelund@microchip.com>,
	Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Subject: Re: [PROBLEM] spi driver internal error during boot on sparx5
Date: Thu, 1 Sep 2022 13:42:19 +0200	[thread overview]
Message-ID: <YxCam3BMVyTo+ZON@axis.com> (raw)
In-Reply-To: <20220901130222.587f4932@erd992>

On Thu, Sep 01, 2022 at 01:02:22PM +0200, David Jander wrote:
> On Thu, 1 Sep 2022 08:57:37 +0200
> Vincent Whitchurch <vincent.whitchurch@axis.com> wrote:
> > You can reproduce the problem with:
> > 
> >  git remote add vwax https://github.com/vwax/linux.git
> >  git fetch vwax
> >  git archive vwax/roadtest/devel tools/testing/roadtest | tar xf -
> >  make -C tools/testing/roadtest/ -j24 OPTS="-k adc --rt-timeout 10 -v"
> > 
> > The test passes on v5.19 but on current mainline it hangs and times out.
> 
> This is very nice. Thanks.

Great to hear that it was useful!

> I followed your instructions, and if I apply the following, all tests are
> passed.
> 
> diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> index 83da8862b8f2..32c01e684af3 100644
> --- a/drivers/spi/spi.c
> +++ b/drivers/spi/spi.c
> @@ -1727,8 +1727,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
>         spin_unlock_irqrestore(&ctlr->queue_lock, flags);
>  
>         ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
> -       if (!ret)
> -               kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
> +       kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
>  
>         ctlr->cur_msg = NULL;
>         ctlr->fallback = false;
> 
> The problem is that if __spi_pump_transfer_message() fails, the ctlr->busy
> flag is left true, so __spi_async() is not going to queue new work. The busy
> transition is handled right above that piece of code, in
> __spi_pump_transfer_message(), and the mechanism is to queue more work to do
> it. Apparently work was only queued when the transfer was successful, and I am
> not sure why it was like that. Queuing work unconditionally solves the issue
> and should not be a problem.
> 
> Curious thing is, that this change alone is sufficient to make all the
> roadtest tests pass. But I still think that does not fix the bug reported by
> Casper. For that, Mark's patch is also necessary.

Yes, I noticed that too.  If you comment out the two usages of
"@flaky_bus" in tools/testing/roadtest/roadtest/tests/iio/adc/test_adc084s021.py then
the bus error injection will be disabled and only the success paths will
be tested.  If you do that and test ae7d2346dc89ae89a6e0 ("spi: Don't
use the message queue if possible in spi_sync") with the crash fix and
iio sysfs fixes cherry-picked in, you'll see the following failure
without Mark's patch:

 FAILED roadtest/tests/iio/adc/test_adc084s021.py::test_illuminance - BlockingIOError: [Errno 115] Operation now in progress

But if you move forward to 69fa95905d40846756d22 ("spi: Ensure the
io_mutex is held until spi_finalize_current_message()"), then the tests
start passing again, if they're run without the error injection.

So 69fa95905d40846756d22 seems to be masking the original problem from
ae7d2346dc89ae89a6e0 (while at the same time introducing the other
problem in the error handling).

  reply	other threads:[~2022-09-01 11:42 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-26  9:41 [PROBLEM] spi driver internal error during boot on sparx5 Casper Andersson
2022-08-29  8:56 ` David Jander
2022-08-31 16:07   ` Mark Brown
2022-09-01  6:57     ` Vincent Whitchurch
2022-09-01 11:02       ` David Jander
2022-09-01 11:42         ` Vincent Whitchurch [this message]
2022-09-01 12:08           ` David Jander
2022-09-01 11:51         ` Mark Brown
2022-09-01 15:16         ` Casper Andersson
2022-09-02  6:38           ` David Jander
2022-09-01 11:11       ` Mark Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YxCam3BMVyTo+ZON@axis.com \
    --to=vincent.whitchurch@axis.com \
    --cc=UNGLinuxDriver@microchip.com \
    --cc=andrew@lunn.ch \
    --cc=broonie@kernel.org \
    --cc=casper.casan@gmail.com \
    --cc=david@protonic.nl \
    --cc=lars.povlsen@microchip.com \
    --cc=linux-spi@vger.kernel.org \
    --cc=mkl@pengutronix.de \
    --cc=steen.hegelund@microchip.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).