From: David Jander <david@protonic.nl>
To: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Mark Brown <broonie@kernel.org>,
Casper Andersson <casper.casan@gmail.com>,
<linux-spi@vger.kernel.org>,
Marc Kleine-Budde <mkl@pengutronix.de>,
Andrew Lunn <andrew@lunn.ch>,
Lars Povlsen <lars.povlsen@microchip.com>,
Steen Hegelund <steen.hegelund@microchip.com>,
Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
Subject: Re: [PROBLEM] spi driver internal error during boot on sparx5
Date: Thu, 1 Sep 2022 13:02:22 +0200 [thread overview]
Message-ID: <20220901130222.587f4932@erd992> (raw)
In-Reply-To: <YxBX4bXG02E4lSUW@axis.com>
On Thu, 1 Sep 2022 08:57:37 +0200
Vincent Whitchurch <vincent.whitchurch@axis.com> wrote:
> On Wed, Aug 31, 2022 at 05:07:42PM +0100, Mark Brown wrote:
> > On Mon, Aug 29, 2022 at 10:56:13AM +0200, David Jander wrote:
> > > Not sure if this is a correct fix, but I'd like to know if your situation
> > > changes this way, if you could try it.
> > > I don't have access to any hardware with a mux unfortunately, so I can't test
> > > it myself.
> >
> > I guess claiming to have a noop mux might work for testing, though I'd
> > be dubious that it was actually doing the mux operations properly? I
>
> I'm seeing these problems with the roadtest test framework which uses
> UML and doesn't need any special hardware. Roadtest puts its emulated
> devices under a spi-mux with gpio-mockup and there are no tests for the
> actual muxing, but other driver tests break since transfers using
> spi-mux don't work properly.
>
> I pushed the latest version with SPI support to the tree below. The
> test_adc084s021 tests a SPI device. Roadtest is placed inside a linux
> tree, but it doesn't need any patches to the kernel:
>
> https://github.com/vwax/linux/commits/roadtest/devel
>
> You can reproduce the problem with:
>
> git remote add vwax https://github.com/vwax/linux.git
> git fetch vwax
> git archive vwax/roadtest/devel tools/testing/roadtest | tar xf -
> make -C tools/testing/roadtest/ -j24 OPTS="-k adc --rt-timeout 10 -v"
>
> The test passes on v5.19 but on current mainline it hangs and times out.
This is very nice. Thanks.
I followed your instructions, and if I apply the following, all tests are
passed.
diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
index 83da8862b8f2..32c01e684af3 100644
--- a/drivers/spi/spi.c
+++ b/drivers/spi/spi.c
@@ -1727,8 +1727,7 @@ static void __spi_pump_messages(struct spi_controller *ctlr, bool in_kthread)
spin_unlock_irqrestore(&ctlr->queue_lock, flags);
ret = __spi_pump_transfer_message(ctlr, msg, was_busy);
- if (!ret)
- kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
+ kthread_queue_work(ctlr->kworker, &ctlr->pump_messages);
ctlr->cur_msg = NULL;
ctlr->fallback = false;
The problem is that if __spi_pump_transfer_message() fails, the ctlr->busy
flag is left true, so __spi_async() is not going to queue new work. The busy
transition is handled right above that piece of code, in
__spi_pump_transfer_message(), and the mechanism is to queue more work to do
it. Apparently work was only queued when the transfer was successful, and I am
not sure why it was like that. Queuing work unconditionally solves the issue
and should not be a problem.
Curious thing is, that this change alone is sufficient to make all the
roadtest tests pass. But I still think that does not fix the bug reported by
Casper. For that, Mark's patch is also necessary.
@Casper: can you test Mark's patch with above addition?
> > think we need to either change spi_mux to duplicate the incoming message
> > (that's probably "cleaner") or teach the core that spi-mux exists and
> > should always use the pump/queue. The below might do the trick but in
> > spite of my suggestion above I've not tested either yet:
> >
> > diff --git a/drivers/spi/spi-mux.c b/drivers/spi/spi-mux.c
> > index f5d32ec4634e..0709e987bd5a 100644
> > --- a/drivers/spi/spi-mux.c
> > +++ b/drivers/spi/spi-mux.c
> > @@ -161,6 +161,7 @@ static int spi_mux_probe(struct spi_device *spi)
> > ctlr->num_chipselect = mux_control_states(priv->mux);
> > ctlr->bus_num = -1;
> > ctlr->dev.of_node = spi->dev.of_node;
> > + ctlr->must_async = true;
> >
> > ret = devm_spi_register_controller(&spi->dev, ctlr);
> > if (ret)
> > diff --git a/drivers/spi/spi.c b/drivers/spi/spi.c
> > index 1cfed874f7ae..88d48a105d3c 100644
> > --- a/drivers/spi/spi.c
> > +++ b/drivers/spi/spi.c
> > @@ -4033,7 +4033,7 @@ static int __spi_sync(struct spi_device *spi, struct spi_message *message)
> > * guard against reentrancy from a different context. The io_mutex
> > * will catch those cases.
> > */
> > - if (READ_ONCE(ctlr->queue_empty)) {
> > + if (READ_ONCE(ctlr->queue_empty) && !ctlr->must_async) {
> > message->actual_length = 0;
> > message->status = -EINPROGRESS;
> >
> > diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
> > index e6c73d5ff1a8..f089ee1ead58 100644
> > --- a/include/linux/spi/spi.h
> > +++ b/include/linux/spi/spi.h
> > @@ -469,6 +469,7 @@ extern struct spi_device *spi_new_ancillary_device(struct spi_device *spi, u8 ch
> > * SPI_TRANS_FAIL_NO_START.
> > * @queue_empty: signal green light for opportunistically skipping the queue
> > * for spi_sync transfers.
> > + * @must_async: disable all fast paths in the core
> > *
> > * Each SPI controller can communicate with one or more @spi_device
> > * children. These make a small bus, sharing MOSI, MISO and SCK signals
> > @@ -690,6 +691,7 @@ struct spi_controller {
> >
> > /* Flag for enabling opportunistic skipping of the queue in spi_sync */
> > bool queue_empty;
> > + bool must_async;
> > };
> >
> > static inline void *spi_controller_get_devdata(struct spi_controller *ctlr)
> >
> > Assuming that works out there'll be an extra test in the fast path but
> > no sync operations, and a performance hit for spi-mux users. Hopefully
> > that works as well as it did before.
>[...]
Best regards,
--
David Jander
Protonic Holland.
next prev parent reply other threads:[~2022-09-01 11:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-08-26 9:41 [PROBLEM] spi driver internal error during boot on sparx5 Casper Andersson
2022-08-29 8:56 ` David Jander
2022-08-31 16:07 ` Mark Brown
2022-09-01 6:57 ` Vincent Whitchurch
2022-09-01 11:02 ` David Jander [this message]
2022-09-01 11:42 ` Vincent Whitchurch
2022-09-01 12:08 ` David Jander
2022-09-01 11:51 ` Mark Brown
2022-09-01 15:16 ` Casper Andersson
2022-09-02 6:38 ` David Jander
2022-09-01 11:11 ` Mark Brown
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220901130222.587f4932@erd992 \
--to=david@protonic.nl \
--cc=UNGLinuxDriver@microchip.com \
--cc=andrew@lunn.ch \
--cc=broonie@kernel.org \
--cc=casper.casan@gmail.com \
--cc=lars.povlsen@microchip.com \
--cc=linux-spi@vger.kernel.org \
--cc=mkl@pengutronix.de \
--cc=steen.hegelund@microchip.com \
--cc=vincent.whitchurch@axis.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).