From: Dan Williams <dan.j.williams@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>,
<linux-cxl@vger.kernel.org>, <dan.j.williams@intel.com>
Subject: RE: CXL/region : commit reset of out of order region appears to succeed.
Date: Fri, 16 Jun 2023 17:26:34 -0700
Message-ID: <648cfdbad83fe_362143294ce@dwillia2-xfh.jf.intel.com.notmuch>
In-Reply-To: <20230316171441.0000205b@Huawei.com>

Jonathan Cameron wrote:
> Ran into this whilst testing fix for QEMU uncommit handling.
>
> To replicate:
> 1) Set up two regions on a direct-connected Type 3 and commit them both.
> 2) Uncommit the first region once. (It fails with an out-of-order message.)
>    Note that from here on the sysfs commit attribute reads as 0.
> 3) Uncommit that first region again. It appears to succeed.
>
> Reason is easy to track down:
> https://elixir.bootlin.com/linux/v6.3-rc2/source/drivers/cxl/core/region.c#L257
>
> commit_store() of 0 unconditionally sets the state to CXL_CONFIG_RESET_PENDING
>
> When the decoder reset fails, that is left set.
> Hence next call drops straight through.
>
> Whilst it's easy to 'fix' the superficial issue by resetting the state to the previous
> value on error, I'm not sure that's sufficient or race free.

I think it is sufficient because the state transition is happening under
the lock and RESET_PENDING > ACTIVE. So any paths that depend on the
region not being active will be protected.
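
A minimal sketch of why the ordering matters, written as a userspace model
rather than driver code. The enum is a simplified stand-in and
region_may_reconfigure() is a hypothetical helper; the only property taken
from the discussion is that RESET_PENDING sorts above ACTIVE, so an "at
least active" comparison keeps rejecting reconfiguration while a reset is
pending:

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the region config states; only the order matters. */
enum config_state {
	CONFIG_IDLE,
	CONFIG_ACTIVE,
	CONFIG_RESET_PENDING,
	CONFIG_COMMIT,
};

/* The argument above relies on this ordering, so check it at compile time. */
static_assert(CONFIG_RESET_PENDING > CONFIG_ACTIVE,
	      "RESET_PENDING must compare greater than ACTIVE");

/*
 * Hypothetical guard for any path that requires the region to be inactive.
 * It compares against ACTIVE, so RESET_PENDING and COMMIT are both rejected.
 */
static bool region_may_reconfigure(enum config_state state)
{
	return state < CONFIG_ACTIVE;
}
```

With this ordering the guard behaves the same whether a failed reset leaves
the state at RESET_PENDING or restores it to COMMIT.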

On the other side, if someone races to commit the region via another
thread while the lock is dropped, they will either successfully
transition the region back to the COMMIT state, or will re-attempt
the reset. When both of those threads re-acquire the lock, one of them
will see that the reset state can advance back to ACTIVE, or will see
that someone snuck in and committed the region again while the lock was
dropped.
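
The flow described above can be sketched as a single-threaded userspace
model. This is an illustration under stated assumptions, not the driver
code: the struct, decode_reset(), lock_dropped_window() and the two bool
fields are hypothetical stand-ins, and the "other thread" is simulated by
flipping the state inside the window where the lock would be dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the region config states. */
enum config_state {
	CONFIG_IDLE,
	CONFIG_ACTIVE,
	CONFIG_RESET_PENDING,
	CONFIG_COMMIT,
};

struct region {
	enum config_state state;
	bool reset_fails;     /* hypothetical: models the out-of-order error */
	bool commit_races_in; /* hypothetical: a commit lands while unlocked */
};

/* Hypothetical stand-in for the decoder reset. */
static int decode_reset(const struct region *r)
{
	return r->reset_fails ? -EBUSY : 0;
}

/* Models the window where the lock is dropped and another thread may run. */
static void lock_dropped_window(struct region *r)
{
	if (r->commit_races_in)
		r->state = CONFIG_COMMIT;
}

/* Models writing 0 to the commit attribute. */
static int uncommit(struct region *r)
{
	int rc;

	assert(r != NULL);
	if (r->state < CONFIG_COMMIT)
		return 0; /* already in the requested state */

	r->state = CONFIG_RESET_PENDING;
	lock_dropped_window(r);

	/* Lock re-acquired: revalidate that the reset is still pending. */
	if (r->state != CONFIG_RESET_PENDING)
		return 0; /* someone committed the region again */

	rc = decode_reset(r);
	/* On failure restore COMMIT so a retry attempts the reset again. */
	r->state = rc ? CONFIG_COMMIT : CONFIG_ACTIVE;
	return rc;
}
```

In this model a second uncommit after a failed reset reports the same
error instead of dropping straight through, because the state no longer
stays at RESET_PENDING.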