From: "William A. Kennington III" <william@wkennington.com>
To: Jeremy Kerr <jk@codeconstruct.com.au>,
Matt Johnston <matt@codeconstruct.com.au>,
Andrew Lunn <andrew+netdev@lunn.ch>,
"David S. Miller" <davem@davemloft.net>,
Eric Dumazet <edumazet@google.com>,
Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>,
Wolfram Sang <wsa@kernel.org>
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mctp i2c: check packet length before marking flow active
Date: Thu, 7 May 2026 00:50:34 -0700 [thread overview]
Message-ID: <83fddb43-72ff-4176-aeac-3c65faac811e@wkennington.com> (raw)
In-Reply-To: <88cb1f245485d991b845886827876f27f11792ae.camel@codeconstruct.com.au>
On 5/6/26 01:01, Jeremy Kerr wrote:
> Hi William,
>
>>> Just to clarify my understanding of the state: "being held by two
>>> owners" would indicate a violation of the lock itself. Or is it that
>>> there are two threads blocked waiting to acquire the mutex?
>> I think it’s actually this, 2 threads are waiting on acquiring the lock.
> OK, that's good news!
>
>> There was a theory that it was a lock underflow that allowed 2 threads
>> to acquire the lock that lead to this patch.
>>
>>> For NVMe-MI, you're likely using manual tag allocation, where the tag
>>> allocation (and hence flow state) is entirely controlled by userspace.
>>> It may be that the NVMe protocol-level errors are causing that tags to
>>> be held for long durations, perhaps?
>> Yeah, this is very plausible given the device(s) stop responding
>> correctly. I imagine we are getting stuck with manual allocations and
>> not releasing locks. Can we reset the state machine back to NEW instead
>> of holding the lock?
> Not sure what you're referring to here; if the userspace application is
> not releasing the tag, we have to keep the i2c bus locked, otherwise we
> may not receive a response from the device.
Isn't this inherently an approach asking for trouble, where a
potentially buggy userspace can starve out other applications which need
to access the bus. For us we have FRU devices on the that are
periodically rescanned or accessed for various reasons alongside the
MCTP endpoint on the NVME device.
> The one case I can think of (in upstream infrastructure, at least) is
> that this might be triggered by the device reporting a long MPRT value,
> and then a response gets lost. libnvme is respecting the MPRT, and not
> releasing the tag for that (excessive) duration.
Yeah, I'll have to look at the specific firmware bug more but I don't
think it's been oot caused it fully yet.
> However, the tag -> i2c lock associations are only useful if you have
> muxes in the i2c topology. Is that the case on your platform? If not,
> perhaps we could elide all the bus locking when we can detect that...
We have at least 1 layer of mux before each NVME and FRU device.
> Cheers,
>
>
> Jeremy
prev parent reply other threads:[~2026-05-07 7:50 UTC|newest]
Thread overview: 2+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20260423001517.79219-1-william@wkennington.com>
[not found] ` <61bb1b3838609996600f46ccb2c4ff89d085ee6f.camel@codeconstruct.com.au>
[not found] ` <c94fc7c9-279b-4b4b-92f3-7f1b88bc0c64@wkennington.com>
[not found] ` <1651e98dcc86f38a0b39679b1a6f9ef604e0812a.camel@codeconstruct.com.au>
[not found] ` <0c396f24-366b-49ed-ae84-9f1982866a99@wkennington.com>
2026-05-06 8:01 ` [PATCH] mctp i2c: check packet length before marking flow active Jeremy Kerr
2026-05-07 7:50 ` William A. Kennington III [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=83fddb43-72ff-4176-aeac-3c65faac811e@wkennington.com \
--to=william@wkennington.com \
--cc=andrew+netdev@lunn.ch \
--cc=davem@davemloft.net \
--cc=edumazet@google.com \
--cc=jk@codeconstruct.com.au \
--cc=kuba@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=matt@codeconstruct.com.au \
--cc=netdev@vger.kernel.org \
--cc=pabeni@redhat.com \
--cc=wsa@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox