From: Dave Chinner <david@fromorbit.com>
To: Ming Lei <ming.lei@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>,
	linux-xfs@vger.kernel.org, linux-block@vger.kernel.org
Subject: Re: [PATCH] block: fix 32 bit overflow in __blkdev_issue_discard()
Date: Fri, 16 Nov 2018 09:13:37 +1100
Message-ID: <20181115221337.GY19305@dastard>
In-Reply-To: <20181115031035.GE32603@ming.t460p>

On Thu, Nov 15, 2018 at 11:10:36AM +0800, Ming Lei wrote:
> On Thu, Nov 15, 2018 at 12:22:01PM +1100, Dave Chinner wrote:
> > On Thu, Nov 15, 2018 at 09:06:52AM +0800, Ming Lei wrote:
> > > On Wed, Nov 14, 2018 at 08:18:24AM -0700, Jens Axboe wrote:
> > > > On 11/13/18 2:43 PM, Dave Chinner wrote:
> > > > > From: Dave Chinner <dchinner@redhat.com>
> > > > > 
> > > > > A discard cleanup merged into 4.20-rc2 causes fstests xfs/259 to
> > > > > fall into an endless loop in the discard code. The test is creating
> > > > > a device that is exactly 2^32 sectors in size to test mkfs boundary
> > > > > conditions around the 32 bit sector overflow region.
> > > > > 
> > > > > mkfs issues a discard for the entire device size by default, and
> > > > > hence this throws a sector count of 2^32 into
> > > > > blkdev_issue_discard(). It takes the number of sectors to discard as
> > > > > a sector_t - a 64 bit value.
> > > > > 
> > > > > The commit ba5d73851e71 ("block: cleanup __blkdev_issue_discard")
> > > > > takes this sector count and casts it to a 32 bit value before
> > > > > comparing it against the maximum allowed discard size the device
> > > > > has. This truncates away the upper 32 bits, and so if the lower 32
> > > > > bits of the sector count are zero, it starts issuing discards of
> > > > > length 0. This causes the code to fall into an endless loop, issuing
> > > > > zero length discards over and over again on the same sector.
> > > > 
> > > > Applied, thanks. Ming, can you please add a blktests test for
> > > > this case? This is the 2nd time it's been broken.
> > > 
> > > OK, I will add zram discard test in blktests, which should cover the
> > > 1st report. For the xfs/259, I need to investigate if it is easy to
> > > do in blktests.
> > 
> > Just write a test that creates block devices of 2^32 + (-1,0,1)
> > sectors and runs a discard across the entire device. That's all that
> > xfs/259 is doing - exercising mkfs on 2TB, 4TB and 16TB boundaries.
> > i.e. the boundaries where sectors and page cache indexes (on 4k page
> > size systems) overflow 32 bit int and unsigned int sizes. mkfs
> > issues a discard for the entire device, so it's testing that as
> > well...
> 
> Indeed, I can reproduce this issue via the following commands:
> 
> modprobe scsi_debug virtual_gb=2049 sector_size=512 lbpws10=1 dev_size_mb=512
> blkdiscard /dev/sde
> 
> > 
> > You need to write tests that exercise write_same, write_zeros and
> > discard operations around these boundaries, because they all take
> > a 64 bit sector count and stuff them into 32 bit size fields in
> > the bio that is being submitted.
> 
> write_same/write_zeros are usually used by drivers directly, so we
> may need to make the test case for some specific device.
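
To make the truncation described in the quoted patch description concrete,
here is a small user-space model of a discard splitting loop. It is only a
sketch and not the kernel code: the helper names (next_chunk_truncated,
next_chunk_clamped, count_requests) are hypothetical, and the per-request
limit is whatever the caller passes in rather than a real device limit.

```c
#include <assert.h>
#include <stdint.h>

/* Buggy variant: the 64 bit sector count is narrowed before the comparison. */
static uint32_t next_chunk_truncated(uint64_t nr_sects, uint32_t max_sects)
{
    uint32_t n = (uint32_t)nr_sects;    /* upper 32 bits are lost here */

    return n < max_sects ? n : max_sects;
}

/* Safe variant: compare in 64 bits, narrow only after clamping. */
static uint32_t next_chunk_clamped(uint64_t nr_sects, uint32_t max_sects)
{
    return nr_sects < max_sects ? (uint32_t)nr_sects : max_sects;
}

/*
 * Count the requests needed to cover nr_sects using the given splitter.
 * Returns -1 as soon as a zero length chunk is produced, because a real
 * loop would make no forward progress from that point and never finish.
 */
static int64_t count_requests(uint64_t nr_sects, uint32_t max_sects,
                              uint32_t (*next)(uint64_t, uint32_t))
{
    int64_t reqs = 0;

    assert(max_sects > 0);
    while (nr_sects) {
        uint32_t chunk = next(nr_sects, max_sects);

        if (chunk == 0)
            return -1;      /* zero length request: would spin forever */
        nr_sects -= chunk;
        reqs++;
    }
    return reqs;
}
```

With an assumed limit of 1u << 22 sectors per request, the clamped variant
covers 2^32 - 1, 2^32 and 2^32 + 1 sectors in 1024, 1024 and 1025 requests
respectively. The truncated variant still finishes for 2^32 - 1, but returns
-1 for both 2^32 and 2^32 + 1, since the remaining count eventually has all
of its lower 32 bits clear.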

My local Linux iSCSI server and client advertise support for them.
The client definitely does not ship zeros across the wire(*) when I use
things like FALLOC_FL_ZERO_RANGE, but fstests does not have block
device fallocate() tests for zeroing or punching...

Cheers,

Dave.

(*) but the back end storage is a sparse file on an XFS filesystem,
and the iscsi server fails to translate write_zeroes or
WRITE_SAME(0) to FALLOC_FL_ZERO_RANGE on the storage side and hence
is really slow because it physically writes zeros to the XFS file.
i.e. the client offloads the operation to the server to minimise
wire traffic, but then the server doesn't offload the operation to
the storage....
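
For illustration only, the storage-side offload described in the footnote
could look roughly like the sketch below: try FALLOC_FL_ZERO_RANGE on the
backing file first, and physically write zeros only when the filesystem
reports that the operation is unsupported. The helper name zero_range() and
the 64 KiB fallback buffer are assumptions made for this example; nothing
here is taken from an actual iSCSI server implementation.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* for fallocate() and FALLOC_FL_* */
#endif
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Zero the byte range [off, off + len) of fd.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int zero_range(int fd, off_t off, off_t len)
{
    static const char zeros[65536];     /* zero-initialised fallback buffer */

    /* Fast path: let the filesystem zero the range without data writes. */
    if (fallocate(fd, FALLOC_FL_ZERO_RANGE, off, len) == 0)
        return 0;
    if (errno != EOPNOTSUPP)
        return -1;

    /* Slow path: the offload is unsupported, so write zeros by hand. */
    while (len > 0) {
        size_t want = len < (off_t)sizeof(zeros) ? (size_t)len : sizeof(zeros);
        ssize_t done = pwrite(fd, zeros, want, off);

        if (done < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (done == 0) {                /* no progress: report an I/O error */
            errno = EIO;
            return -1;
        }
        off += done;
        len -= done;
    }
    return 0;
}
```

A caller would open the backing file for writing and invoke
zero_range(fd, offset, length) for each incoming zeroing request. Whether
the fast path is taken depends on the filesystem holding the backing file.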

-- 
Dave Chinner
david@fromorbit.com

Thread overview: 17+ messages
2018-11-13 21:43 [PATCH] block: fix 32 bit overflow in __blkdev_issue_discard() Dave Chinner
2018-11-14  2:36 ` Darrick J. Wong
2018-11-14  2:53 ` Ming Lei
2018-11-14  8:08   ` Dave Chinner
2018-11-14  8:15     ` Ming Lei
2018-11-14 15:18 ` Jens Axboe
2018-11-15  1:06   ` Ming Lei
2018-11-15  1:22     ` Dave Chinner
2018-11-15  3:10       ` Ming Lei
2018-11-15 22:13         ` Dave Chinner [this message]
2018-11-15 22:24           ` Darrick J. Wong
2018-11-16  4:04             ` Dave Chinner
2018-11-16  8:32               ` Christoph Hellwig
2018-11-16  8:46                 ` Omar Sandoval
2018-11-16  8:53                   ` Christoph Hellwig
2018-11-16 12:06               ` Ming Lei
2018-11-15  1:51     ` Jens Axboe
