All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ming Lei <ming.lei@redhat.com>
To: "hch@lst.de" <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>,
	"Verma, Vishal L" <vishal.l.verma@intel.com>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"darrick.wong@oracle.com" <darrick.wong@oracle.com>,
	linux-block@vger.kernel.org
Subject: Re: 5.3-rc1 regression with XFS log recovery
Date: Tue, 20 Aug 2019 16:13:26 +0800	[thread overview]
Message-ID: <20190820081325.GA21032@ming.t460p> (raw)
In-Reply-To: <20190820055320.GB27501@lst.de>

On Tue, Aug 20, 2019 at 07:53:20AM +0200, hch@lst.de wrote:
> On Tue, Aug 20, 2019 at 02:41:35PM +1000, Dave Chinner wrote:
> > > With the following debug patch.  Based on that I think I'll just
> > > formally submit the vmalloc switch as we're at -rc5, and then we
> > > can restart the unaligned slub allocation drama..
> > 
> > This still doesn't make sense to me, because the pmem and brd code
> > have no aligment limitations in their make_request code - they can
> > handle byte adressing and should not have any problem at all with
> > 8 byte aligned memory in bios.
> > 
> > Digging a little furhter, I note that both brd and pmem use
> > identical mechanisms to marshall data in and out of bios, so they
> > are likely to have the same issue.
> > 
> > So, brd_make_request() does:
> > 
> >         bio_for_each_segment(bvec, bio, iter) {
> >                 unsigned int len = bvec.bv_len;
> >                 int err;
> > 
> >                 err = brd_do_bvec(brd, bvec.bv_page, len, bvec.bv_offset,
> >                                   bio_op(bio), sector);
> >                 if (err)
> >                         goto io_error;
> >                 sector += len >> SECTOR_SHIFT;
> >         }
> > 
> > So, the code behind bio_for_each_segment() splits multi-page bvecs
> > into individual pages, which are passed to brd_do_bvec(). An
> > unaligned 4kB io traces out as:
> > 
> >  [  121.295550] p,o,l,s 00000000a77f0146,768,3328,0x7d0048
> >  [  121.297635] p,o,l,s 000000006ceca91e,0,768,0x7d004e
> > 
> > i.e. page		offset	len	sector
> > 00000000a77f0146	768	3328	0x7d0048
> > 000000006ceca91e	0	768	0x7d004e
> > 
> > You should be able to guess what the problems are from this.

The problem should be that offset of '768' is passed to bio_add_page().

It should be one slub buffer used for block IO, looks an old unsolved
problem.

> > 
> > Both pmem and brd are _sector_ based. We've done a partial sector
> > copy on the first bvec, then the second bvec has started the copy
> > from the wrong offset into the sector we've done a partial copy
> > from.
> > 
> > IOWs, no error is reported when the bvec buffer isn't sector
> > aligned, no error is reported when the length of data to copy was
> > not a multiple of sector size, and no error was reported when we
> > copied the same partial sector twice.
> 
> Yes.  I think bio_for_each_segment is buggy here, as it should not
> blindly split by pages.

bio_for_each_segment() just keeps the original interface as before
introducing multi-page bvec.


Thanks,
Ming

  parent reply	other threads:[~2019-08-20  8:13 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-16 20:59 5.3-rc1 regression with XFS log recovery Verma, Vishal L
2019-08-18  7:11 ` hch
2019-08-18  7:41   ` hch
2019-08-18 17:34     ` hch
2019-08-19  0:08       ` Dave Chinner
2019-08-19  3:49         ` hch
2019-08-19  4:11           ` hch
2019-08-19  4:22             ` Dave Chinner
2019-08-19  4:29               ` hch
2019-08-19  4:40                 ` hch
2019-08-19  5:31                   ` Dave Chinner
2019-08-20  6:14                     ` hch
2019-08-20  4:41                   ` Dave Chinner
2019-08-20  5:53                     ` hch
2019-08-20  7:44                       ` Dave Chinner
2019-08-20  8:13                       ` Ming Lei [this message]
2019-08-20  9:24                         ` Ming Lei
2019-08-20 16:30                           ` Verma, Vishal L
2019-08-20 21:44                           ` Dave Chinner
2019-08-20 22:08                             ` Verma, Vishal L
2019-08-20 23:53                               ` Dave Chinner
2019-08-21  2:19                               ` Ming Lei
2019-08-21  1:56                             ` Ming Lei
2019-08-19  4:15           ` Dave Chinner
2019-08-19 17:19       ` Verma, Vishal L
2019-08-21  0:26       ` Dave Chinner
2019-08-21  0:44         ` hch
2019-08-21  1:08           ` Dave Chinner
2019-08-21  1:56             ` Verma, Vishal L
2019-08-21  6:15               ` Dave Chinner
2019-08-26 17:32       ` Verma, Vishal L

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190820081325.GA21032@ming.t460p \
    --to=ming.lei@redhat.com \
    --cc=dan.j.williams@intel.com \
    --cc=darrick.wong@oracle.com \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=vishal.l.verma@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.