qemu-devel.nongnu.org archive mirror
From: Klaus Jensen <its@irrelevant.dk>
To: Dmitry Fomichev <Dmitry.Fomichev@wdc.com>
Cc: "kwolf@redhat.com" <kwolf@redhat.com>,
	Niklas Cassel <Niklas.Cassel@wdc.com>,
	"qemu-block@nongnu.org" <qemu-block@nongnu.org>,
	"k.jensen@samsung.com" <k.jensen@samsung.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"mreitz@redhat.com" <mreitz@redhat.com>,
	"kbusch@kernel.org" <kbusch@kernel.org>
Subject: Re: [PATCH 1/2] hw/block/nvme: fix zone boundary check for append
Date: Tue, 26 Jan 2021 07:58:13 +0100
Message-ID: <YA+9hUM4648b2oQW@apples.localdomain>
In-Reply-To: <16895b0f976dbe50ddde73bf8211a1bf74ba5870.camel@wdc.com>


On Jan 26 04:55, Dmitry Fomichev wrote:
> On Tue, 2021-01-19 at 14:54 +0100, Klaus Jensen wrote:
> > From: Klaus Jensen <k.jensen@samsung.com>
> > 
> > When a zone append is processed the controller checks the validity of
> > the write before assigning the LBA to the append command. This causes
> > the boundary check to be wrong.
> > 
> > Fix this by checking the write *after* assigning the LBA. Remove the
> > append special case from the nvme_check_zone_write and open code it in
> > nvme_do_write, assigning the slba when basic sanity checks have been
> > performed. Then check the validity of the resulting write like any other
> > write command.
> > 
> 
> Klaus,
> 
> I tested this series and it works fine. However, I think that the problem
> Niklas found can be fixed in a much simpler way by applying the
> following to the existing code:
> 
> --- a/hw/block/nvme.c
> +++ b/hw/block/nvme.c
> @@ -1152,6 +1152,9 @@ static uint16_t nvme_check_zone_write(NvmeCtrl *n, NvmeNamespace *ns,
>                  trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
>                  status = NVME_INVALID_FIELD;
>              }
> +            if (zone->w_ptr + nlb > nvme_zone_wr_boundary(zone)) {
> +                status = NVME_ZONE_BOUNDARY_ERROR;
> +            }
>              if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) {
>                  trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl);
>                  status = NVME_INVALID_FIELD;
> 
> I am going to send a few patches that take this approach, please take a look. In my
> testing, it works just as well :)
> 
> > 
> > In the process, also fix a missing endianness conversion for the zone
> > append ALBA.
> 
> Great catch! This could be placed in a separate patch, though.
> A few more comments below.
> 
> 
> > Reported-by: Niklas Cassel <Niklas.Cassel@wdc.com>
> > Cc: Dmitry Fomichev <dmitry.fomichev@wdc.com>
> > Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
> > ---
> >  hw/block/nvme.c | 46 ++++++++++++++++++++++++----------------------
> >  1 file changed, 24 insertions(+), 22 deletions(-)
> > 
> > diff --git a/hw/block/nvme.c b/hw/block/nvme.c
> > index 309c26db8ff7..f05dea657b01 100644
> > --- a/hw/block/nvme.c
> > +++ b/hw/block/nvme.c
> > @@ -1133,7 +1133,7 @@ static uint16_t nvme_check_zone_state_for_write(NvmeZone *zone)
> >  
> >  static uint16_t nvme_check_zone_write(NvmeCtrl *n, NvmeNamespace *ns,
> >                                        NvmeZone *zone, uint64_t slba,
> > -                                      uint32_t nlb, bool append)
> > +                                      uint32_t nlb)
> >  {
> >      uint16_t status;
> >  
> > @@ -1147,16 +1147,8 @@ static uint16_t nvme_check_zone_write(NvmeCtrl *n, NvmeNamespace *ns,
> >          trace_pci_nvme_err_zone_write_not_ok(slba, nlb, status);
> >      } else {
> >          assert(nvme_wp_is_valid(zone));
> > -        if (append) {
> > -            if (unlikely(slba != zone->d.zslba)) {
> > -                trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
> > -                status = NVME_INVALID_FIELD;
> > -            }
> > -            if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) {
> > -                trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl);
> > -                status = NVME_INVALID_FIELD;
> > -            }
> > -        } else if (unlikely(slba != zone->w_ptr)) {
> > +
> > +        if (unlikely(slba != zone->w_ptr)) {
> >              trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba,
> >                                                 zone->w_ptr);
> >              status = NVME_ZONE_INVALID_WRITE;
> > @@ -1294,10 +1286,9 @@ static void nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req,
> >      }
> >  }
> >  
> > -static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
> > -                                     uint32_t nlb)
> > +static void nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
> > +                                 uint32_t nlb)
> >  {
> > -    uint64_t result = zone->w_ptr;
> >      uint8_t zs;
> >  
> >      zone->w_ptr += nlb;
> > @@ -1313,8 +1304,6 @@ static uint64_t nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
> >              nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN);
> >          }
> >      }
> > -
> > -    return result;
> >  }
> > 
> >  static inline bool nvme_is_write(NvmeRequest *req)
> > @@ -1692,7 +1681,24 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
> >      if (ns->params.zoned) {
> >          zone = nvme_get_zone_by_slba(ns, slba);
> > 
> > -        status = nvme_check_zone_write(n, ns, zone, slba, nlb, append);
> > +        if (append) {
> 
> This is what I see as a drawback of this approach.
> You have to move the ZA checks that were previously nicely tucked in
> nvme_check_zone_write() to the spot below and now this validation is done
> in two different places for appends and regular writes. This can be avoided.
> 

OK. However, this means that other commands that write to zones (Write
Zeroes, Copy) have to go through that special-case branch, even though
append is just a special case of a regular write. This is completely
irrelevant for performance, so that's not my point; I just found it
cleaner to tuck the append handling away as a special case in the write
path.
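
To illustrate the ordering argued for here, the following is a minimal standalone sketch, not the QEMU code. The `struct zone`, the status constants, and the `check_zone_write`/`do_write` names are simplified, hypothetical stand-ins, and the write boundary is assumed to be `zslba + zcap`. The append is first resolved into a regular write at the write pointer, and only then validated by a check that knows nothing about appends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical status codes; the real ones live in QEMU's NVMe headers. */
enum { OK = 0, INVALID_FIELD, ZONE_BOUNDARY_ERROR, ZONE_INVALID_WRITE };

/* Hypothetical, heavily simplified zone descriptor. */
struct zone {
    uint64_t zslba;   /* first LBA of the zone */
    uint64_t w_ptr;   /* current write pointer */
    uint64_t zcap;    /* writable capacity in LBAs (assumed boundary) */
};

/* Common check, shared by every command that writes to a zone. */
static int check_zone_write(const struct zone *z, uint64_t slba, uint32_t nlb)
{
    assert(z->w_ptr >= z->zslba);

    if (slba != z->w_ptr) {
        return ZONE_INVALID_WRITE;
    }
    if (slba + nlb > z->zslba + z->zcap) {
        return ZONE_BOUNDARY_ERROR;
    }
    return OK;
}

/* Write path: turn an append into a regular write, then validate it. */
static int do_write(struct zone *z, uint64_t slba, uint32_t nlb, bool append,
                    uint64_t *written_at)
{
    int status;

    if (append) {
        if (slba != z->zslba) {
            return INVALID_FIELD;   /* an append must name the zone start */
        }
        slba = z->w_ptr;            /* assign the LBA *before* checking */
    }

    status = check_zone_write(z, slba, nlb);
    if (status != OK) {
        return status;
    }

    *written_at = slba;
    z->w_ptr += nlb;
    return OK;
}
```

With a zone of capacity 8 whose write pointer sits at LBA 6, a 4-block append passes a boundary check computed from the zone start but fails one computed from the assigned LBA, which is the case the patch is about.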

> > +            if (unlikely(slba != zone->d.zslba)) {
> > +                trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
> > +                status = NVME_INVALID_FIELD;
> > +                goto invalid;
> > +            }
> > +
> > +            if (nvme_l2b(ns, nlb) > (n->page_size << n->zasl)) {
> > +                trace_pci_nvme_err_append_too_large(slba, nlb, n->zasl);
> > +                status = NVME_INVALID_FIELD;
> > +                goto invalid;
> > +            }
> > +
> > +            slba = zone->w_ptr;
> > +            res->slba = cpu_to_le64(slba);
> 
> It is a bit premature to set the result here since the nvme_check_zone_write() below
> can still fail. As I recall, ALBA is only returned by a successful command. Again,
> good find about endianness!
> 

There has always been a branch in finalize that clears the result to
zero on error.

My patch also has the effect of not setting ALBA for regular writes. To
change that, you are going to have to add an 'if (append)' special case
in nvme_do_write anyway.
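
As a rough illustration of the result handling described above (again a standalone sketch, not the QEMU code): `store_le64`, `struct append_result`, `assign_alba` and `finalize` are hypothetical names. The helper plays the role that `cpu_to_le64()` plays in the patch, and `finalize` models the branch that clears the result on error, so setting the ALBA early does not leak it to a failed command.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: store a 64-bit value in little-endian byte order,
 * independent of host endianness.
 */
static void store_le64(uint8_t out[8], uint64_t v)
{
    for (int i = 0; i < 8; i++) {
        out[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Hypothetical completion result carrying the assigned LBA (ALBA). */
struct append_result {
    uint8_t alba[8];   /* little-endian, as seen by the host driver */
};

/* Called as soon as the append has been assigned its LBA. */
static void assign_alba(struct append_result *res, uint64_t slba)
{
    assert(res != NULL);
    store_le64(res->alba, slba);
}

/* Called when the command completes; a failed command reports no ALBA. */
static void finalize(struct append_result *res, int status)
{
    assert(res != NULL);
    if (status != 0) {
        memset(res->alba, 0, sizeof(res->alba));
    }
}
```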


