From: Christoph Hellwig <hch@lst.de>
To: Uday Shankar <ushankar@purestorage.com>
Cc: Jens Axboe <axboe@kernel.dk>, Kanchan Joshi <joshi.k@samsung.com>,
	Anuj Gupta <anuj20.g@samsung.com>, Christoph Hellwig <hch@lst.de>,
	linux-block@vger.kernel.org,
	Xinyu Zhang <xizhang@purestorage.com>
Subject: Re: [PATCH] block: fix sanity checks in blk_rq_map_user_bvec
Date: Thu, 24 Oct 2024 08:05:43 +0200
Message-ID: <20241024060543.GA32211@lst.de>
In-Reply-To: <20241024045622.GA30309@lst.de>

On Thu, Oct 24, 2024 at 06:56:22AM +0200, Christoph Hellwig wrote:
> On Wed, Oct 23, 2024 at 03:15:19PM -0600, Uday Shankar wrote:
> > @@ -600,9 +600,7 @@ static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter)
> >  		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
> >  			goto put_bio;
> >  		if (bytes + bv->bv_len > nr_iter)
> > -			goto put_bio;
> > -		if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
> > -			goto put_bio;
> > +			break;
> 
> So while this fixes NVMe, it actually breaks just about every SCSI
> driver as the code will easily exceed max_segment_size now, which the
> old code obeyed (although more by accident).
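
Just to spell out the accident: the PAGE_SIZE check only kept each
bvec within max_segment_size because, as far as I can tell, drivers
never end up with a max_segment_size below PAGE_SIZE.  An illustrative
check that states the real limit (not part of either patch) would be:

	/*
	 * Illustration only: the per-segment cap the removed check
	 * enforced by accident.  A bvec from e.g. a hugepage-backed
	 * buffer can be far larger than what the HBA advertises in
	 * lim->max_segment_size once the PAGE_SIZE check is gone.
	 */
	if (bv->bv_len > lim->max_segment_size)
		goto put_bio;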

Looking at the existing code a bit more, it seems really confused,
e.g. it iterates over all segments in the iov_iter instead of using
the proper iterators that limit the walk to the actual size of the
I/O, which I think is the root cause of your problem.
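
To illustrate what I mean by the proper iterators, here is a minimal
sketch against <linux/bvec.h> and <linux/uio.h> (throwaway helper,
made-up name, not the actual fix): a for_each_bvec() walk started from
the iov_iter state is bounded by iov_iter_count(), while indexing
iter->bvec[0..nr_segs-1] walks the whole registration:

/* minimal sketch, not the fix below */
static size_t bvec_bytes_in_io(const struct iov_iter *iter)
{
	struct bvec_iter start = {
		.bi_size	= iov_iter_count(iter),	/* bytes in this I/O only */
		.bi_bvec_done	= iter->iov_offset,	/* offset into the first bvec */
	};
	struct bvec_iter bi;
	struct bio_vec bv;
	size_t bytes = 0;

	/*
	 * for_each_bvec() stops once bi_size bytes are consumed and
	 * clamps the last bv.bv_len, so this never walks past the end
	 * of the I/O even when the registration behind iter->bvec is
	 * much larger.
	 */
	for_each_bvec(bv, iter->bvec, bi, start)
		bytes += bv.bv_len;

	return bytes;	/* == iov_iter_count(iter) by construction */
}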

Can you try the (untested) patch below?  That uses the proper block
layer helper to check the I/O layout using the bio iterator.  It will
handle all block layer queue limits, and it does so on the actual
iterator instead of the potentially larger registration.  One change
in behavior is that it now returns -EREMOTEIO for all limit mismatches
instead of a random mix of -EINVAL and -EREMOTEIO.


diff --git a/block/blk-map.c b/block/blk-map.c
index 0e1167b23934..ca2f2ff853da 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -561,57 +561,27 @@ EXPORT_SYMBOL(blk_rq_append_bio);
 /* Prepare bio for passthrough IO given ITER_BVEC iter */
 static int blk_rq_map_user_bvec(struct request *rq, const struct iov_iter *iter)
 {
-	struct request_queue *q = rq->q;
-	size_t nr_iter = iov_iter_count(iter);
-	size_t nr_segs = iter->nr_segs;
-	struct bio_vec *bvecs, *bvprvp = NULL;
-	const struct queue_limits *lim = &q->limits;
-	unsigned int nsegs = 0, bytes = 0;
+	const struct queue_limits *lim = &rq->q->limits;
+	unsigned int nsegs;
 	struct bio *bio;
-	size_t i;
 
-	if (!nr_iter || (nr_iter >> SECTOR_SHIFT) > queue_max_hw_sectors(q))
-		return -EINVAL;
-	if (nr_segs > queue_max_segments(q))
+	if (!iov_iter_count(iter))
 		return -EINVAL;
 
-	/* no iovecs to alloc, as we already have a BVEC iterator */
+	/* reuse the bvecs from the iterator instead of allocating new ones */
 	bio = blk_rq_map_bio_alloc(rq, 0, GFP_KERNEL);
-	if (bio == NULL)
+	if (!bio)
 		return -ENOMEM;
-
 	bio_iov_bvec_set(bio, (struct iov_iter *)iter);
-	blk_rq_bio_prep(rq, bio, nr_segs);
-
-	/* loop to perform a bunch of sanity checks */
-	bvecs = (struct bio_vec *)iter->bvec;
-	for (i = 0; i < nr_segs; i++) {
-		struct bio_vec *bv = &bvecs[i];
 
-		/*
-		 * If the queue doesn't support SG gaps and adding this
-		 * offset would create a gap, fallback to copy.
-		 */
-		if (bvprvp && bvec_gap_to_prev(lim, bvprvp, bv->bv_offset)) {
-			blk_mq_map_bio_put(bio);
-			return -EREMOTEIO;
-		}
-		/* check full condition */
-		if (nsegs >= nr_segs || bytes > UINT_MAX - bv->bv_len)
-			goto put_bio;
-		if (bytes + bv->bv_len > nr_iter)
-			goto put_bio;
-		if (bv->bv_offset + bv->bv_len > PAGE_SIZE)
-			goto put_bio;
-
-		nsegs++;
-		bytes += bv->bv_len;
-		bvprvp = bv;
+	/* check that the data layout matches the hardware restrictions */
+	if (bio_split_rw_at(bio, lim, &nsegs, lim->max_hw_sectors << SECTOR_SHIFT)) {
+		blk_mq_map_bio_put(bio);
+		return -EREMOTEIO;
 	}
+
+	blk_rq_bio_prep(rq, bio, nsegs);
 	return 0;
-put_bio:
-	blk_mq_map_bio_put(bio);
-	return -EINVAL;
 }
 
 /**
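
For reference, this is how I read the helper's contract, which is what
the hunk above relies on (annotated copy of the key check, double check
before applying):

	/*
	 * bio_split_rw_at() fills in nsegs and returns 0 when the whole
	 * bio fits the queue limits; any other return means the bio would
	 * have to be split, which a passthrough request must not do, so
	 * reject the mapping.  Note the units: max_hw_sectors is in
	 * 512-byte sectors while the helper takes a byte count, hence the
	 * SECTOR_SHIFT conversion.
	 */
	if (bio_split_rw_at(bio, lim, &nsegs,
			lim->max_hw_sectors << SECTOR_SHIFT)) {
		blk_mq_map_bio_put(bio);
		return -EREMOTEIO;
	}
	blk_rq_bio_prep(rq, bio, nsegs);	/* segment count comes from the helper */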


Thread overview: 9+ messages
2024-10-23 21:15 [PATCH] block: fix sanity checks in blk_rq_map_user_bvec Uday Shankar
2024-10-23 22:31 ` Jens Axboe
2024-10-23 22:46   ` Uday Shankar
2024-10-23 22:50 ` Uday Shankar
2024-10-23 22:54   ` Bart Van Assche
2024-10-24  0:42     ` Chaitanya Kulkarni
2024-10-23 23:03 ` Jens Axboe
2024-10-24  4:56 ` Christoph Hellwig
2024-10-24  6:05   ` Christoph Hellwig [this message]
