From: Jens Axboe <axboe@kernel.dk>
To: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Andreas Larsson <andreas@gaisler.com>,
Anthony Yznaga <anthony.yznaga@oracle.com>,
Sam James <sam@gentoo.org>,
"David S . Miller" <davem@davemloft.net>,
Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>,
sparclinux@vger.kernel.org
Subject: Re: [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN"
Date: Mon, 6 Oct 2025 07:46:21 -0600
Message-ID: <cef07e8f-a919-4aa1-9904-84b16dfa8fe6@kernel.dk>
In-Reply-To: <e6a7e68ff9e23aee448003ee1a279a4ab13192c0.camel@physik.fu-berlin.de>

On 10/6/25 7:34 AM, John Paul Adrian Glaubitz wrote:
> Hi Jens,
>
> On Mon, 2025-10-06 at 07:19 -0600, Jens Axboe wrote:
>>> The problem is that giving up can lead to filesystem corruption, which
>>> is a problem that was never observed before the change, as far as I know.
>>
>> Yes, I get that.
>>
>>> We have deployed a kernel with the change reverted on several LDOMs that
>>> are seeing heavy use, such as cfarm202.cfarm.net, and we have not seen
>>> any system lock-ups or similar.
>>
>> I believe you. I'm just wondering how long you generally need to spin,
>> as per the question above: how many times does it generally spin where
>> it would've failed before?
>
> I don't have any data on that, unfortunately. All I can say is that we
> have seen filesystem corruption when installing Linux inside an LDOM,
> and this particular issue was eventually tracked down to this commit.

Right, I would imagine that would need to be collected separately. But
if we can get away from the busy looping, then I don't think it's too
interesting.
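
If someone does want to collect that data, a counter in the retry loop
itself would do. Something like the below, untested and with the
surrounding code paraphrased/trimmed from sunvdc.c (the pr_debug() is
purely illustrative):

static int __vdc_tx_trigger(struct vdc_port *port)
{
	/* vio_dring_data header (hdr) setup elided, see sunvdc.c */
	int err, delay = 1;
	unsigned int spins = 0;

	do {
		err = vio_ldc_send(&port->vio, &hdr, sizeof(hdr));
		if (err > 0)
			break;
		spins++;
		/* exponential backoff, capped at 128usec per spin */
		udelay(delay);
		if ((delay <<= 1) > 128)
			delay = 128;
	} while (err == -EAGAIN);

	if (spins)
		pr_debug("sunvdc: %u -EAGAIN spin(s), final err %d\n",
			 spins, err);
	return err;
}
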
>>>> Not that it's _really_ that important as this is a pretty niche driver,
>>>> but still pretty ugly... Does it work reliably with a limit of 100
>>>> spins? If things get truly stuck, spinning forever in that loop is not
>>>> going to help. In any case, your patch should have
>>>
>>> Isn't it possible that the call to vio_ldc_send() will eventually
>>> succeed, which is why there is no need to abort in __vdc_tx_trigger()?
>>
>> Of course. Is it also possible that some conditions will lead it to
>> never succeed?
>
> I would assume that this would require the virtual disk server to not
> respond, which should never happen since it's a virtualized
> environment.

Most likely, yes.

> If hardware issues were to cause vio_ldc_send() to never succeed, those
> would have to be handled by the virtualization environment, I guess.

Yep, it'd be a bug somewhere else.

>> The nicer approach would be to have sunvdc punt retries back up to the
>> block stack, which would at least avoid a busy spin for the condition,
>> rather than returning BLK_STS_IOERR, which terminates the request and
>> bubbles the -EIO back up as per your log. IOW, if we've already spent
>> 10.5ms in that busy loop as per the above rough calculation, perhaps
>> we'd be better off restarting the queue, and hence this operation,
>> after a small sleeping delay instead. That would seem a lot saner than
>> hammering on it.
>
> I generally agree with this remark. I just wonder whether this
> behavior should apply to a logical domain as well. I guess if a
> request doesn't succeed immediately, it's only an urgent problem if
> the logical domain locks up, isn't it?

It's just bad behavior. Honestly, most of this just looks like either a
bad implementation of the protocol, as it's all based on busy looping,
or a badly designed protocol. And then the sunvdc usage of it just
perpetuates that problem, rather than utilizing the tools that exist in
the block stack to take a breather instead of repeatedly hammering on
the hardware for conditions like this.

>>> And unlike the change in adddc32d6fde ("sunvnet: Do not spin in an infinite
>>> loop when vio_ldc_send() returns EAGAIN"), we can't just drop data, as this
>>> driver concerns a block device while the other driver concerns a network
>>> device. Dropping network packets is expected; dropping bytes from a block
>>> device is not.
>>
>> Right, but we can sanely retry it rather than sit in a tight loop.
>> Something like the entirely untested below diff.
>>
>> diff --git a/drivers/block/sunvdc.c b/drivers/block/sunvdc.c
>> index db1fe9772a4d..aa49dffb1b53 100644
>> --- a/drivers/block/sunvdc.c
>> +++ b/drivers/block/sunvdc.c
>> @@ -539,6 +539,7 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
>> struct vdc_port *port = hctx->queue->queuedata;
>> struct vio_dring_state *dr;
>> unsigned long flags;
>> + int ret;
>>
>> dr = &port->vio.drings[VIO_DRIVER_TX_RING];
>>
>> @@ -560,7 +561,13 @@ static blk_status_t vdc_queue_rq(struct blk_mq_hw_ctx *hctx,
>> return BLK_STS_DEV_RESOURCE;
>> }
>>
>> - if (__send_request(bd->rq) < 0) {
>> + ret = __send_request(bd->rq);
>> + if (ret == -EAGAIN) {
>> + spin_unlock_irqrestore(&port->vio.lock, flags);
>> + /* already spun for 10msec, defer 10msec and retry */
>> + blk_mq_delay_kick_requeue_list(hctx->queue, 10);
>> + return BLK_STS_DEV_RESOURCE;
>> + } else if (ret < 0) {
>> spin_unlock_irqrestore(&port->vio.lock, flags);
>> return BLK_STS_IOERR;
>> }
>
> We could add this particular change on top of mine after we have
> extensively tested it.

It doesn't really make sense on top of yours, as that removes the
limited looping that sunvdc would do...
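
For reference, the way that diff is meant to work: returning
BLK_STS_DEV_RESOURCE tells blk-mq to hold on to the request and
re-dispatch it later rather than completing it with an error, and
blk_mq_delay_kick_requeue_list() re-runs the queue after the given
delay in milliseconds. So the retry happens from normal process
context in the block layer, instead of a tight loop in the driver.
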
> For now, I would propose to pick up my patch to revert the previous
> change. I can then pick up your proposed change and deploy it for
> extensive testing and see if it has any side effects.

Why not just test this one and see if it works? As far as I can tell,
it's been 6.5 years since this change went in, so I can't imagine there's
a sense of urgency so huge that it can't wait for testing a proper patch
rather than a work-around?
--
Jens Axboe

Thread overview: 13+ messages
2025-10-06 10:02 [PATCH v2] Revert "sunvdc: Do not spin in an infinite loop when vio_ldc_send() returns EAGAIN" John Paul Adrian Glaubitz
2025-10-06 12:48 ` Jens Axboe
2025-10-06 13:00 ` John Paul Adrian Glaubitz
2025-10-06 13:03 ` John Paul Adrian Glaubitz
2025-10-06 13:19 ` Jens Axboe
2025-10-06 13:34 ` John Paul Adrian Glaubitz
2025-10-06 13:46 ` Jens Axboe [this message]
2025-10-06 13:56 ` John Paul Adrian Glaubitz
2025-10-06 14:03 ` Jens Axboe
2025-10-06 14:12 ` John Paul Adrian Glaubitz
2025-10-06 14:27 ` Jens Axboe
2025-10-06 14:48 ` John Paul Adrian Glaubitz
2025-10-06 14:52 ` Jens Axboe