public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed
From: Christoph Hellwig <hch@lst.de>
To: Kanchan Joshi <joshi.k@samsung.com>
Cc: hch@lst.de, kbusch@kernel.org, axboe@kernel.dk, sagi@grimberg.me,
	linux-nvme@lists.infradead.org, vincentfu@gmail.com,
	ankit.kumar@samsung.com, joshiiitr@gmail.com, cpgs@samsung.com,
	stable@vger.kernel.org, Vincent Fu <vincent.fu@samsung.com>
Subject: Re: [PATCH v3] nvme: fix memory corruption for passthrough metadata
Date: Tue, 10 Oct 2023 09:46:34 +0200	[thread overview]
Message-ID: <20231010074634.GA6514@lst.de> (raw)
In-Reply-To: <1891546521.01696823881551.JavaMail.epsvc@epcpadp4>

On Fri, Oct 06, 2023 at 07:17:06PM +0530, Kanchan Joshi wrote:
> Same issue is possible for extended-lba case also. When user specifies a
> short unaligned buffer, the kernel makes a copy and uses that for DMA.

I fail to understand the extent LBA case, and also from looking at the
code mixing it up with validation of the metadata_len seems very
confusion.  Can you try to clearly explain it and maybe split it into a
separate patch?

> Fixes: 456cba386e94 ("nvme: wire-up uring-cmd support for io-passthru on char-device")

Is this really io_uring specific?  I think we also had the same issue
before and this should go back to adding metadata support to the
general passthrough ioctl?

> +static inline bool nvme_nlb_in_cdw12(u8 opcode)
> +{
> +	switch (opcode) {
> +	case nvme_cmd_read:
> +	case nvme_cmd_write:
> +	case nvme_cmd_compare:
> +	case nvme_cmd_zone_append:
> +		return true;
> +	}
> +	return false;
> +}

Nitpick: I find it nicer to read to have a switch that catches
everything with a default statement instead of falling out of it
for checks like this.  It's not making any different in practice
but just reads a little nicer.

> +	/* Exclude commands that do not have nlb in cdw12 */
> +	if (!nvme_nlb_in_cdw12(c->common.opcode))
> +		return true;

So we can still get exactly the same corruption for all commands that
are not known?  That's not a very safe way to deal with the issue..

> +	control = upper_16_bits(le32_to_cpu(c->common.cdw12));
> +	/* Exclude when meta transfer from/to host is not done */
> +	if (control & NVME_RW_PRINFO_PRACT && ns->ms == ns->pi_size)
> +		return true;
> +
> +	nlb = lower_16_bits(le32_to_cpu(c->common.cdw12));

I'd use the rw field of the union and the typed control and length
fields to clean this up a bit.

>  	if (bdev && meta_buffer && meta_len) {
> +		if (!nvme_validate_passthru_meta(ns, nvme_req(req)->cmd,
> +					meta_len, bufflen)) {
> +			ret = -EINVAL;
> +			goto out_unmap;
> +		}
> +
>  		meta = nvme_add_user_metadata(req, meta_buffer, meta_len,

I'd move the check into nvme_add_user_metadata to keep it out of the
hot path.

FYI: here is what I'd do for the external metadata only case:

diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c
index d8ff796fd5f21d..bf22c7953856f5 100644
--- a/drivers/nvme/host/ioctl.c
+++ b/drivers/nvme/host/ioctl.c
@@ -96,6 +96,71 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval)
 	return (void __user *)ptrval;
 }
 
+static inline bool nvme_nlb_in_cdw12(u8 opcode)
+{
+	switch (opcode) {
+	case nvme_cmd_read:
+	case nvme_cmd_write:
+	case nvme_cmd_compare:
+	case nvme_cmd_zone_append:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/*
+ * NVMe has no separate field to encode the metadata length expected
+ * (except when using SGLs).
+ *
+ * Because of that we can't allow to transfer arbitrary metadata, as
+ * a metadata buffer that is shorted than what the device expects for
+ * the command will lead to arbitrary kernel (if bounce buffering) or
+ * userspace (if not) memory corruption.
+ *
+ * Check that external metadata is only specified for the few commands
+ * where we know the length based of other fields, and that it fits
+ * the actual data transfer from/to the device.
+ */
+static bool nvme_validate_metadata_len(struct request *req, unsigned meta_len)
+{
+	struct nvme_ns *ns = req->q->queuedata;
+	struct nvme_command *c = nvme_req(req)->cmd;
+	u32 len_by_nlb;
+	
+	/* Exclude commands that do not have nlb in cdw12 */
+	if (!nvme_nlb_in_cdw12(c->common.opcode)) {
+		dev_err(ns->ctrl->device,
+			"unknown metadata command %c (%s).\n",
+			c->common.opcode, current->comm);
+		return false;
+	}
+
+	/*
+	 * Skip the check when PI is inserted or stripped and not transferred.
+	 */
+	if (ns->ms == ns->pi_size &&
+	    (c->rw.control & cpu_to_le16(NVME_RW_PRINFO_PRACT)))
+		return true;
+
+	if (ns->features & NVME_NS_EXT_LBAS) {
+		dev_err(ns->ctrl->device,
+			"requires extended LBAs for metadata (%s).\n",
+			current->comm);
+		return false;
+	}
+
+	len_by_nlb = (le16_to_cpu(c->rw.length) + 1) * ns->ms;
+	if (meta_len < len_by_nlb) {
+		dev_err(ns->ctrl->device,
+			"metadata length (%u instad of %u) is too small (%s).\n",
+			meta_len, len_by_nlb, current->comm);
+		return false;
+	}
+
+	return true;
+}
+
 static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
 		unsigned len, u32 seed)
 {
@@ -104,6 +169,9 @@ static void *nvme_add_user_metadata(struct request *req, void __user *ubuf,
 	void *buf;
 	struct bio *bio = req->bio;
 
+	if (!nvme_validate_metadata_len(req, len))
+		return ERR_PTR(-EINVAL);
+
 	buf = kmalloc(len, GFP_KERNEL);
 	if (!buf)
 		goto out;

  reply	other threads:[~2023-10-10  7:46 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20231006135322epcas5p1c9acf38b04f35017181c715c706281dc@epcas5p1.samsung.com>
2023-10-06 13:47 ` [PATCH v3] nvme: fix memory corruption for passthrough metadata Kanchan Joshi
2023-10-10  7:46   ` Christoph Hellwig [this message]
2023-10-10 13:39     ` Kanchan Joshi
2023-10-10 15:31       ` Clay Mayers
2023-10-11  5:03         ` Christoph Hellwig
2023-10-11  5:02       ` Christoph Hellwig
2023-10-11  5:26         ` Kanchan Joshi
2023-10-11  6:36           ` Christoph Hellwig
2023-10-11 17:04         ` Keith Busch
2023-10-12  4:36           ` Christoph Hellwig
2023-10-12 15:31             ` Keith Busch
2023-10-12 15:46               ` Christoph Hellwig
2023-10-13  2:19               ` Kanchan Joshi
2023-10-13  4:38                 ` Christoph Hellwig
2023-10-13  5:50                   ` Kanchan Joshi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231010074634.GA6514@lst.de \
    --to=hch@lst.de \
    --cc=ankit.kumar@samsung.com \
    --cc=axboe@kernel.dk \
    --cc=cpgs@samsung.com \
    --cc=joshi.k@samsung.com \
    --cc=joshiiitr@gmail.com \
    --cc=kbusch@kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=sagi@grimberg.me \
    --cc=stable@vger.kernel.org \
    --cc=vincent.fu@samsung.com \
    --cc=vincentfu@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox