All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@Huawei.com>
To: Ravis OpenSrc <Ravis.OpenSrc@micron.com>
Cc: "linux-cxl@vger.kernel.org" <linux-cxl@vger.kernel.org>,
	"dan.j.williams@intel.com" <dan.j.williams@intel.com>,
	"dave.jiang@intel.com" <dave.jiang@intel.com>,
	Srinivasulu Opensrc <sthanneeru.opensrc@micron.com>,
	"john@jagalactic.com" <john@jagalactic.com>,
	Ajay Joshi <ajayjoshi@micron.com>
Subject: Re: [RFC PATCH v2 3/4] cxl: Abort background operation in case of timeout
Date: Fri, 18 Oct 2024 17:14:15 +0100	[thread overview]
Message-ID: <20241018171415.00003e47@Huawei.com> (raw)
In-Reply-To: <1e78d777e19344699d6d991b2114a3e1@micron.com>

On Fri, 18 Oct 2024 06:39:34 +0000
Ravis OpenSrc <Ravis.OpenSrc@micron.com> wrote:

> >On Thu, 17 Oct 2024 09:06:00 +0530
> >Jonathan Cameron <Jonathan.Cameron@Huawei.com> wrote:  
> >> 
> >> On Wed, 16 Oct 2024 05:00:00 +0000
> >> Ravis OpenSrc <Ravis.OpenSrc@micron.com> wrote:
> >>   
> >>> Adding support for aborting timed out background operations
> >>>
> >>> CXL r3.1 8.2.9.1.5 Request Abort Background Operation.
> >>>
> >>> If the status of a mailbox command is identified as timedout, an abort
> >>> background operation request is sent to the device.
> >>>
> >>> Link:
> >>>  
> >>https://nam10.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore  
> >>> .kernel.org%2Flinux-cxl%2F66035c2e8ba17_770232948b%40dwillia2-  
> >>xfh.jf.i  
> >>>  
> >>ntel.com.notmuch%2F&data=05%7C02%7Cajayjoshi%40micron.com%7C0d
> >>d742041c  
> >>>  
> >>ba47ec0dbd08dceec16a02%7Cf38a5ecd28134862b11bac1d563c806f%7C0
> >>%7C0%7C63  
> >>>  
> >>8647761722049771%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAw
> >>MDAiLCJQIjoiV  
> >>>  
> >>2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=1vVK
> >>PaqWSp6qT  
> >>> FmwsYF1z%2F7%2FdfEeVFD9cA0g1VVo00c%3D&reserved=0
> >>>
> >>> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> >>> Signed-off-by: Ajay Joshi <ajay.opensrc@micron.com>
> >>> Signed-off-by: Ravi Shankar <ravis.opensrc@micron.com>  
> >>  
> >I was kind of expecting that we'd do this more on an ondemand basis.   
> 
> By saying on demand, do you envision a different implementation for
> abort like a sysfs interface or module param?

No. Abort by the kernel when it wants to do something rather than
on a timeout. It might also make sense ultimately to have a userspace
abort path (if it issued the command) but that is a different isseu.

> 
> > So if nothing is going on, the command can have ages.  
> 
> I did not quite get the comment about the ages.

Long time.  If no one wants to access the device interfaces, we don't mind if
the command takes much longer than 5 seconds.

> 
> >If the kernel wants to access the device and it has had a reasonable amount of
> >time we abort.
> >I'm not against this simpler approach though if it turns out to be good enough
> >for likely use cases.
> > 
> >My worry is error paths crossing with a slow command where we want to find
> >out what went wrong as quick as possible Is 5 seconds ok for that?  Maybe
> > not.  
> Do you feel we need to check for "aborted" status along with background percentage,
> to avoid waiting for 5 seconds for cases when the command has already aborted.

At the moment we don't have anything issuing the abort early than the 5 
seconds so I'm not sure how we'd hit this.

May need it in future though.
> 
> Would something like below make sense, if not let us know your thoughts.
> +static bool cxl_mbox_background_aborted(struct cxl_dev_state *cxlds)
> +{
> +       u64 reg;
> +       u16 ret_code;
> +
> +       reg = readq(cxlds->regs.mbox + CXLDEV_MBOX_BG_CMD_STATUS_OFFSET);
> +       ret_code = FIELD_GET(CXLDEV_MBOX_BG_CMD_COMMAND_RC_MASK, reg);
> +       if (ret_code == CXL_MBOX_CMD_RC_ABORT)
> +               return true;
> +
> +       return false;

	return ret_code == CXL_MBOX_CMD_RC_ABORT;

> +}
> +
>  static bool cxl_mbox_background_complete(struct cxl_dev_state *cxlds)
>  {
>         u64 reg;
> @@ -316,11 +329,17 @@ static int __cxl_pci_mbox_send_cmd(struct cxl_memdev_state *mds,
>                 timeout = mbox_cmd->poll_interval_ms;
>                 for (i = 0; i < mbox_cmd->poll_count; i++) {
>                         if (rcuwait_wait_event_timeout(&mds->mbox_wait,
> -                                      cxl_mbox_background_complete(cxlds),
> +                                      (cxl_mbox_background_complete(cxlds) |
> +                                          cxl_mbox_background_aborted(cxlds)),
>                                        TASK_UNINTERRUPTIBLE,
>                                        msecs_to_jiffies(timeout)) > 0)
>                                 break;
>                 }
> +               if (!cxl_mbox_background_aborted(cxlds)) {
> +                       dev_err(dev, "aborted waiting for background (%d ms) by device\n",
> +                               timeout * mbox_cmd->poll_count);
> +                       return -ECANCELED;
> +               }
> > 
> >Jonathan  
> >>   
> >>> ---
> >>>  drivers/cxl/cxlmem.h |  1 +
> >>>  drivers/cxl/pci.c    | 11 +++++++++++
> >>>  2 files changed, 12 insertions(+)
> >>>
> >>> diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h index
> >>> d8c0894797ac..808fb8712145 100644
> >>> --- a/drivers/cxl/cxlmem.h
> >>> +++ b/drivers/cxl/cxlmem.h
> >>> @@ -516,6 +516,7 @@ to_cxl_memdev_state(struct cxl_dev_state *cxlds)
> >>> enum cxl_opcode {
> >>>          CXL_MBOX_OP_INVALID             = 0x0000,
> >>>          CXL_MBOX_OP_RAW                 = CXL_MBOX_OP_INVALID,
> >>> +       CXL_MBOX_OP_REQ_ABRT_BACKGROUND_OPERATION    = 0x0005,
> >>>          CXL_MBOX_OP_GET_EVENT_RECORD    = 0x0100,
> >>>          CXL_MBOX_OP_CLEAR_EVENT_RECORD  = 0x0101,
> >>>          CXL_MBOX_OP_GET_EVT_INT_POLICY  = 0x0102, diff --git
> >>> a/drivers/cxl/pci.c b/drivers/cxl/pci.c index
> >>> d5d6142f6aa3..95c1f329bca2 100644
> >>> --- a/drivers/cxl/pci.c
> >>> +++ b/drivers/cxl/pci.c
> >>> @@ -394,6 +394,17 @@ static int cxl_pci_mbox_send(struct cxl_mailbox
> >>> *cxl_mbox,
> >>>
> >>>          mutex_lock_io(&cxl_mbox->mbox_mutex);
> >>>          rc = __cxl_pci_mbox_send_cmd(cxl_mbox, cmd);
> >>> +       if (rc == -ETIMEDOUT &&
> >>> +               cmd->return_code == CXL_MBOX_CMD_RC_BACKGROUND) {  
> >>
> >>align cmd just after ( - that is c is under the r of rc After fixing the tabs >ting.
> >>  
> >>> +               struct cxl_mbox_cmd abort_cmd = {
> >>> +                       .opcode =  
> >>CXL_MBOX_OP_REQ_ABRT_BACKGROUND_OPERATION  
> >>> +               };
> >>> +
> >>> +               rc = __cxl_pci_mbox_send_cmd(cxl_mbox, &abort_cmd);
> >>> +               if (!rc)
> >>> +                       rc = -ECANCELED;
> >>> +       }
> >>> +
> >>>          mutex_unlock(&cxl_mbox->mbox_mutex);
> >>>
> >>>          return rc;  


      reply	other threads:[~2024-10-18 16:14 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20241015205633.127333-1-ravis.opensrc@micron.com>
2024-10-16  4:31 ` [RFC 0/4] cxl: Support for mailbox background abort operation Ravis OpenSrc
     [not found] ` <20241015205633.127333-3-ravis.opensrc@micron.com>
2024-10-16  4:32   ` [RFC PATCH 2/4] cxl: Add default timeout for bg mailbox commands Ravis OpenSrc
2024-10-16  4:59   ` [RFC PATCH v2 " Ravis OpenSrc
2024-10-17 15:32     ` Jonathan Cameron
2024-10-17 17:25       ` [EXT] " Srinivasulu Opensrc
     [not found] ` <20241015205633.127333-5-ravis.opensrc@micron.com>
2024-10-16  4:32   ` [RFC PATCH 4/4] cxl/mbox: Add Populate Log support Ravis OpenSrc
2024-10-16  5:00     ` [RFC PATCH v2 " Ravis OpenSrc
2024-10-17 15:37       ` Jonathan Cameron
2024-10-16  4:59 ` [RFC v2 0/4] cxl: Support for mailbox background abort operation Ravis OpenSrc
     [not found] ` <20241015205633.127333-2-ravis.opensrc@micron.com>
2024-10-16  4:31   ` [RFC PATCH 1/4] cxl: Enable mailbox ops with background only if request abort operation is supported Ravis OpenSrc
2024-10-16  4:59   ` [RFC PATCH v2 " Ravis OpenSrc
2024-10-17 15:27     ` Jonathan Cameron
     [not found] ` <20241015205633.127333-4-ravis.opensrc@micron.com>
2024-10-16  4:32   ` [RFC PATCH 3/4] cxl: Abort background operation in case of timeout Ravis OpenSrc
2024-10-16  5:00   ` [RFC PATCH v2 " Ravis OpenSrc
2024-10-17 15:36     ` Jonathan Cameron
2024-10-18  6:39       ` Ravis OpenSrc
2024-10-18 16:14         ` Jonathan Cameron [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241018171415.00003e47@Huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=Ravis.OpenSrc@micron.com \
    --cc=ajayjoshi@micron.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.jiang@intel.com \
    --cc=john@jagalactic.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=sthanneeru.opensrc@micron.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.