From: Ira Weiny <ira.weiny@intel.com>
To: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Cc: Dan Williams <dan.j.williams@intel.com>,
Bjorn Helgaas <bhelgaas@google.com>,
"Li, Ming" <ming4.li@intel.com>,
Bjorn Helgaas <helgaas@kernel.org>,
Lukas Wunner <lukas@wunner.de>,
Alison Schofield <alison.schofield@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
"Dave Jiang" <dave.jiang@intel.com>,
Ben Widawsky <bwidawsk@kernel.org>,
<linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org>,
<linux-pci@vger.kernel.org>
Subject: Re: [PATCH V12 3/9] PCI: Create PCIe library functions in support of DOE mailboxes.
Date: Fri, 1 Jul 2022 15:22:38 -0700 [thread overview]
Message-ID: <Yr9zroBeSLfWyp5U@iweiny-desk3> (raw)
In-Reply-To: <20220630162540.00002910@Huawei.com>
On Thu, Jun 30, 2022 at 04:25:40PM +0100, Jonathan Cameron wrote:
> On Wed, 29 Jun 2022 21:34:18 -0700
> Ira Weiny <ira.weiny@intel.com> wrote:
>
[snip]
I've dropped the IRQ support and was polishing things up. Without the IRQ I
don't think any 'arming' makes sense.
However, in working through the sequence again I think I found another problem.
I _think_... :-/
> >
> > But we are only going to see this if some other entity is using the mailbox
> > right? And I don't think that is going to be common, is it?
>
> BUSY on entry to doe_statemachine_work() is indeed only relevant if
> some other entity is trampling on us. It's best effort only.
>
> BUSY during normal flow is the one I care about.
> In most cases it will go like (assuming we clear the int status in the handler as now)
>
> Send Object
> BUSY ________|-----___________________
> PROC ________|------------------______
> OBJ RDY ___________________________-------
> Int Status______________-____________-_____
So I did not realize that BUSY could clear like this. I thought the point of
BUSY was to indicate someone else had an exchange in flight.
What happens if another entity jumps in during the PROC time? How does one
know that OBJ RDY is _our_ object ready and not someone else's?
For example 'entity' issues a send, we see busy clear and also start a
send. But the device is still processing the send from 'entity':
Send Object(entity) Send Object (Linux)
BUSY ___|----_______________|---______________________________
PROC ___|-----------------------------___|-----------------___
OBJ RDY _________________________________-------______________---
Int Status________-__________________-_____-____________________-__
^^^
This...
... is _not_ Linux's object!?!?!?!
Can that happen?
If so this is entirely broken. Even Polling OBJ RDY will break. And worse yet
we will not even see BUSY being set in any 'abnormal' way.
>
> where I've added PROC to mean the device is processing the data.
> Once it clears the input buffer on the device and hence the device can accept
> another protocol request BUSY will drop. If device has some pipelining
> or runs multiple protocols in different threads, you can think of that busy
> period just being the time it needs to copy out the request to some protocol
> thread specific storage.
BUSY was not at all doing what I thought it did. I'm now concerned that it is
completely broken WRT to other entities even without IRQs. Frankly I'm
confused why pci_doe_send_req() even checks for busy because it is unlikely
that we will ever see it set. For sure we won't from our side because the
workqueue is going to process one task at a time.
If Linux wanted to have multiple objects in flight I think we would need a much
more complex state machine than we had. Maybe your original state machine
handled this. If so, I apologize for missing this subtle point.
At this point I'm debating removing the check for BUSY as well because I don't
see the point. (Other than maybe flagging some error to say that 'entity' may
be messing things up for us and bailing.)
Thoughts?
Ira
>
> You won't see this in QEMU without extra hacks because we shorten the
> flow so that whole thing is instantaneous.
>
> If those two interrupts per transfer occur well spread out they can result in
> your INT flag being set too often and some of the waits dropping through early.
>
> It will 'work' I think though because you ultimately spin on Data object
> ready which won't be set until after the second interrupt.
>
next prev parent reply other threads:[~2022-07-01 22:23 UTC|newest]
Thread overview: 37+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-06-28 4:15 [PATCH V12 0/9] CXL: Read CDAT and DSMAS data ira.weiny
2022-06-28 4:15 ` [PATCH V12 1/9] PCI: Add vendor ID for the PCI SIG ira.weiny
2022-06-28 4:15 ` [PATCH V12 2/9] PCI: Replace magic constant for PCI Sig Vendor ID ira.weiny
2022-06-28 4:15 ` [PATCH V12 3/9] PCI: Create PCIe library functions in support of DOE mailboxes ira.weiny
2022-06-28 14:16 ` Jonathan Cameron
2022-06-28 14:56 ` Bjorn Helgaas
2022-06-28 18:20 ` Ira Weiny
2022-06-29 14:09 ` Jonathan Cameron
2022-06-30 4:34 ` Ira Weiny
2022-06-30 15:25 ` Jonathan Cameron
2022-07-01 22:22 ` Ira Weiny [this message]
2022-07-04 6:45 ` Jonathan Cameron
2022-06-29 16:53 ` Ira Weiny
2022-06-28 14:38 ` Jonathan Cameron
2022-06-28 16:58 ` Ira Weiny
2022-06-28 4:15 ` [PATCH V12 4/9] cxl/pci: Create PCI DOE mailbox's for memory devices ira.weiny
2022-06-28 14:33 ` Jonathan Cameron
2022-06-30 5:32 ` Ira Weiny
2022-06-30 15:32 ` Jonathan Cameron
2022-06-30 16:14 ` Davidlohr Bueso
2022-07-01 19:52 ` Ira Weiny
2022-06-28 4:15 ` [PATCH V12 5/9] driver-core: Introduce BIN_ATTR_ADMIN_{RO,RW} ira.weiny
2022-06-28 6:06 ` Greg Kroah-Hartman
2022-06-28 16:42 ` Ira Weiny
2022-06-29 1:10 ` Bjorn Helgaas
2022-06-28 4:15 ` [PATCH V12 6/9] cxl/port: Read CDAT table ira.weiny
2022-06-28 14:47 ` Jonathan Cameron
2022-06-30 3:35 ` Ira Weiny
2022-06-30 15:45 ` Jonathan Cameron
2022-06-28 4:15 ` [PATCH V12 7/9] cxl/port: Introduce cxl_cdat_valid() ira.weiny
2022-06-28 14:49 ` Jonathan Cameron
2022-06-30 4:42 ` Ira Weiny
2022-06-28 4:15 ` [PATCH V12 8/9] cxl/port: Retry reading CDAT on failure ira.weiny
2022-06-28 14:57 ` Jonathan Cameron
2022-06-30 3:40 ` Ira Weiny
2022-06-28 4:15 ` [PATCH V12 9/9] cxl/port: Parse out DSMAS data from CDAT table ira.weiny
2022-06-28 15:00 ` Jonathan Cameron
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Yr9zroBeSLfWyp5U@iweiny-desk3 \
--to=ira.weiny@intel.com \
--cc=Jonathan.Cameron@huawei.com \
--cc=alison.schofield@intel.com \
--cc=bhelgaas@google.com \
--cc=bwidawsk@kernel.org \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=helgaas@kernel.org \
--cc=linux-cxl@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=lukas@wunner.de \
--cc=ming4.li@intel.com \
--cc=vishal.l.verma@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox