Linux CXL
 help / color / mirror / Atom feed
From: Jonathan Cameron <Jonathan.Cameron@huawei.com>
To: Davidlohr Bueso <dave@stgolabs.net>
Cc: <vishal.l.verma@intel.com>, <fan.ni@samsung.com>,
	<a.manzanares@samsung.com>, <linux-cxl@vger.kernel.org>,
	<dan.j.williams@intel.com>
Subject: Re: [PATCH v2 -qemu] hw/cxl: Support firmware updates
Date: Fri, 21 Jun 2024 18:07:29 +0100	[thread overview]
Message-ID: <20240621180729.0000585c@huawei.com> (raw)
In-Reply-To: <o54dzo3ofnyov7hhh2kcktjfsvvu7oxfqyilu5vl4wrdnzl3ma@iy447qho7kit>

On Mon, 17 Jun 2024 12:37:00 -0700
Davidlohr Bueso <dave@stgolabs.net> wrote:

> Hi Jonathan,
> 
> Just now had some cycles to return to this.
> 
> And I was not able to reproduce the overlapping behavior I
> mentioned in the kernel support - I guess this might be an
> incorrect test I had in place. So sorry for the false alarm,
> and for the record, below is the pasted actual byte ranges
> sent by the driver for a 52k image.
> 
> prev range: 0-0 ... this range: 0-1920
> prev range: 0-1920 ... this range: 1920-3840
> prev range: 1920-3840 ... this range: 3840-5760
> prev range: 3840-5760 ... this range: 5760-7680
> prev range: 5760-7680 ... this range: 7680-9600
> prev range: 7680-9600 ... this range: 9600-11520
> prev range: 9600-11520 ... this range: 11520-13440
> prev range: 11520-13440 ... this range: 13440-15360
> prev range: 13440-15360 ... this range: 15360-17280
> prev range: 15360-17280 ... this range: 17280-19200
> prev range: 17280-19200 ... this range: 19200-21120
> prev range: 19200-21120 ... this range: 21120-23040
> prev range: 21120-23040 ... this range: 23040-24960
> prev range: 23040-24960 ... this range: 24960-26880
> prev range: 24960-26880 ... this range: 26880-28800
> prev range: 26880-28800 ... this range: 28800-30720
> prev range: 28800-30720 ... this range: 30720-32640
> prev range: 30720-32640 ... this range: 32640-34560
> prev range: 32640-34560 ... this range: 34560-36480
> prev range: 34560-36480 ... this range: 36480-38400
> prev range: 36480-38400 ... this range: 38400-40320
> prev range: 38400-40320 ... this range: 40320-42240
> prev range: 40320-42240 ... this range: 42240-44160
> prev range: 42240-44160 ... this range: 44160-46080
> prev range: 44160-46080 ... this range: 46080-48000
> prev range: 46080-48000 ... this range: 48000-49920
> prev range: 48000-49920 ... this range: 49920-51200

Excellent. So I guess we can drop the comment.


> >> ---
> >> Changes from v1:
> >>  - robustify part transfer checking (Jonathan)
> >>  - implement abort
> >>  - increase runtime for full transfer
> >>  - no longer prematurely mark the slot
> >>  - fold both cmds into a single patch
> >>
> >>  hw/cxl/cxl-mailbox-utils.c  | 217 +++++++++++++++++++++++++++++++++++-
> >>  include/hw/cxl/cxl_device.h |  16 +++
> >>  2 files changed, 228 insertions(+), 5 deletions(-)
> >>
> >> diff --git a/hw/cxl/cxl-mailbox-utils.c b/hw/cxl/cxl-mailbox-utils.c
> >> index 80a80f1ec29b..74054855b1fa 100644
> >> --- a/hw/cxl/cxl-mailbox-utils.c
> >> +++ b/hw/cxl/cxl-mailbox-utils.c
> >> @@ -60,6 +60,8 @@ enum {
> >>          #define SET_INTERRUPT_POLICY   0x3
> >>      FIRMWARE_UPDATE = 0x02,
> >>          #define GET_INFO      0x0
> >> +        #define TRANSFER      0x1
> >> +        #define ACTIVATE      0x2
> >>      TIMESTAMP   = 0x03,
> >>          #define GET           0x0
> >>          #define SET           0x1
> >> @@ -815,6 +817,9 @@ static CXLRetCode cmd_infostat_bg_op_sts(const struct cxl_cmd *cmd,
> >>      return CXL_MBOX_SUCCESS;
> >>  }
> >>
> >> +#define CXL_FW_SLOTS 2
> >> +#define CXL_FW_SIZE  0x02000000 /* 32 mb */
> >> +
> >>  /* CXL r3.1 Section 8.2.9.3.1: Get FW Info (Opcode 0200h) */
> >>  static CXLRetCode cmd_firmware_update_get_info(const struct cxl_cmd *cmd,
> >>                                                 uint8_t *payload_in,
> >> @@ -846,15 +851,204 @@ static CXLRetCode cmd_firmware_update_get_info(const struct cxl_cmd *cmd,
> >>      fw_info = (void *)payload_out;
> >>      memset(fw_info, 0, sizeof(*fw_info));
> >>
> >> -    fw_info->slots_supported = 2;
> >> -    fw_info->slot_info = BIT(0) | BIT(3);
> >> -    fw_info->caps = 0;
> >> -    pstrcpy(fw_info->fw_rev1, sizeof(fw_info->fw_rev1), "BWFW VERSION 0");
> >> +    fw_info->slots_supported = CXL_FW_SLOTS;
> >> +    fw_info->slot_info = (cci->fw.active_slot & 0x7) |
> >> +            ((cci->fw.staged_slot & 0x7) << 3);
> >> +    fw_info->caps = BIT(0);  
> >
> >I'd add a comment on this one for what it is. "Online update supported"
> >Given this is trivial I amend the patch on my tree.  
> 
> Sure.
I started doing this but then realized still some nastier corners below
so probably better you do a v3 as you are setup to test this.

So ignore my previous email, you can fix up Fan's stuff as well ;)


...

> >> +    /* allow back-to-back retransmission */
> >> +    if ((offset != cci->fw.prev_offset || length != cci->fw.prev_len) &&
> >> +        (fw_transfer->action == CXL_FW_XFER_ACTION_CONTINUE ||
> >> +         fw_transfer->action == CXL_FW_XFER_ACTION_END)) {
> >> +        /*
> >> +         * XXX: Linux is happy to send overlapping chunks,
> >> +         * so just verify no gaps.
> >> +         */  
> >
> >Does the CXL spec allow overlapping?  I see text about parts being
> >in order (with an exception for back to band transfer). So I think
> >we need to reject any overlap and make sure Linux doesn't do it!
> >
> >
> >The 3rd example in the imp note implies that overlap definitely isn't
> >allowed.  
> 
> Yep, hence the above comment, which also happens to be wrong. And, per
> the examples in the imp notes, it looks like gaps are in fact allowed
> (0-100h, 160h-260h is considered valid, for example).
> 
> >  
> >> +        if (offset > cci->fw.prev_offset + cci->fw.prev_len) {  
> 
> So this really turns into 'offset < ...'
> 
> >> +            return CXL_MBOX_FW_XFER_OUT_OF_ORDER;
> >> +        }
> >> +    }
> >> +
> >> +    switch (fw_transfer->action) {
> >> +    case CXL_FW_XFER_ACTION_FULL: /* ignores offset */
> >> +    case CXL_FW_XFER_ACTION_END:
> >> +        if (fw_transfer->slot == 0 ||
> >> +            fw_transfer->slot == cci->fw.active_slot ||
> >> +            fw_transfer->slot > CXL_FW_SLOTS) {
> >> +            return CXL_MBOX_FW_INVALID_SLOT;
> >> +        }
> >> +
> >> +        /* mark the slot used upon bg completion */
> >> +        break;
> >> +    case CXL_FW_XFER_ACTION_INIT:
> >> +        if (offset != 0) {
> >> +            return CXL_MBOX_INVALID_INPUT;
> >> +        }
> >> +
> >> +        cci->fw.transferring = true;
> >> +        cci->fw.prev_slot = fw_transfer->slot;  
> >
> >Why?  This is only valid for Full and End.  
> 
> oh it occurred to me that the spec was implying that partial
> transfers do want to be the same (Slot=X) regardless of only
> caring about the actual value at the End transfer. I wasn't
> sure, so took the cautious side.
Ok.  If it's vague in the spec and reserved otherwise in these cases
then perhaps just a comment.

> 
> But if this is not the case, it might be useful to update
> the spec and be more explicit.

Go for it. :)


> 
> >  
> >>              return CXL_MBOX_MEDIA_DISABLED;
> >>          }
> >>      }
> >> @@ -2319,6 +2519,9 @@ static void bg_timercb(void *opaque)
> >>          cci->bg.complete_pct = 100;
> >>          cci->bg.ret_code = ret;
> >>          switch (cci->bg.opcode) {
> >> +        case 0x0201: /* fw transfer */
> >> +            __do_firmware_xfer(cci);
> >> +            break;
> >>          case 0x4400: /* sanitize */
> >>          {
> >>              CXLType3Dev *ct3d = CXL_TYPE3(cci->d);
> >> @@ -2390,6 +2593,10 @@ void cxl_init_cci(CXLCCI *cci, size_t payload_max)
> >>      cci->bg.runtime = 0;
> >>      cci->bg.timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
> >>                                   bg_timercb, cci);
> >> +
> >> +    memset(&cci->fw, 0, sizeof(cci->fw));
> >> +    cci->fw.active_slot = cci->fw.staged_slot = 1;  
> >
> >Why not set staged_slot to 0 on init?
> >
> >"If 0, no FW is currently staged for activation."  
> 
> I prefer following the spec convention directly here.

I'm confused.  My assumption was convention was nothing staged
Perhaps a spec reference?

I'll push out a new tree early next week.  This looks nearly
ready to go - I'll try and remember to tag a 'stable' point
in the tree as I keep promising to do and forgetting.
That will be the appropriate place to base new features rather
than on top of the bits that are less mature.

Jonathan


      reply	other threads:[~2024-06-21 17:07 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-05 17:29 [PATCH v2 -qemu] hw/cxl: Support firmware updates Davidlohr Bueso
2024-03-19 16:28 ` Davidlohr Bueso
2024-03-19 17:56 ` fan
2024-03-19 20:48   ` Davidlohr Bueso
2024-06-21 16:58     ` Jonathan Cameron
2024-06-21 17:08       ` Jonathan Cameron
2024-04-22 15:51 ` Jonathan Cameron
2024-06-17 19:37   ` Davidlohr Bueso
2024-06-21 17:07     ` Jonathan Cameron [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240621180729.0000585c@huawei.com \
    --to=jonathan.cameron@huawei.com \
    --cc=a.manzanares@samsung.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave@stgolabs.net \
    --cc=fan.ni@samsung.com \
    --cc=linux-cxl@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox