From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: axboe@fb.com, "Ian Campbell" <Ian.Campbell@citrix.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
"Julien Grall" <julien.grall@citrix.com>,
"David Vrabel" <david.vrabel@citrix.com>,
"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: PAGE_SIZE (64KB), while block driver 'struct request' deals with < PAGE_SIZE (up to 44Kb). Was:Re: [RFC] Support of non-indirect grant backend on 64KB guest
Date: Fri, 4 Sep 2015 11:41:53 -0400 [thread overview]
Message-ID: <20150904154151.GA28256@localhost.localdomain> (raw)
In-Reply-To: <alpine.DEB.2.02.1509041503360.2672@kaball.uk.xensource.com>
On Fri, Sep 04, 2015 at 03:04:18PM +0100, Stefano Stabellini wrote:
> On Thu, 27 Aug 2015, Julien Grall wrote:
> > On 21/08/15 18:10, Konrad Rzeszutek Wilk wrote:
> > > On Fri, Aug 21, 2015 at 05:08:35PM +0100, David Vrabel wrote:
> > >> On 21/08/15 17:05, Konrad Rzeszutek Wilk wrote:
> > >>>
> > >>> I have to concur with that. We can't mandate that ARM 64k page MUST use
> > >>> indirect descriptors.
> > >>
> > >> Then it has to be fixed in the block layer to allow < PAGE_SIZE segments
> > >> and to get the block layer to split requests for blkfront.
> > >
> > > Hey Jens,
> > >
> > > I am hoping you can help us figure this problem out.
> > >
> > > The Linux ARM is capable of using 4KB pages and 64KB pages. Our block
> > > driver (xen-blkfront) was built with 4KB pages in mind and without using
> > > any fancy flags (which some backends lack) the maximum amount of I/O it can
> > > fit on a ring is 44KB.
> > >
> > > This has the unfortunate effect that when the xen-blkfront
> > > gets an 'struct request' it can have on page (64KB) and it can't actually
> > > fit it on the ring! And the lowest segment size it advertises is PAGE_SIZE
> > > (64KB). I believe Julien (who found this) tried initially advertising
> > > smaller segment size than PAGE_SIZE (32KB). However looking at
> > > __blk_segment_map_sg it looks to assume smallest size is PAGE_SIZE so
> > > that would explain why it did not work.
> >
> > To be honest, I haven't tried to see how the block layer will act if I
> > dropped those checks in blk-settings.c until today.
> >
> > I don't see any assumption about PAGE_SIZE in __blk_segment_map_sg.
> > Although dropping the checks in blk-settings (see quick patch [1]),
> > I got the following error in the frontend:
> >
> > bio too big device xvda (128 > 88)
> > Buffer I/O error on dev xvda, logical block 0, async page read
> > bio too big device xvda (128 > 88)
> > Buffer I/O error on dev xvda, logical block 0, async page read
> >
> > The "bio too big device ..." comes from generic_make_request_checks
> > (linux/block/blk-core.c) and the stack trace is:
> >
> > [<fffffe0000096c7c>] dump_backtrace+0x0/0x124
> > [<fffffe0000096db0>] show_stack+0x10/0x1c
> > [<fffffe00005885e8>] dump_stack+0x78/0xbc
> > [<fffffe00000bf7f8>] warn_slowpath_common+0x98/0xd0
> > [<fffffe00000bf8f0>] warn_slowpath_null+0x14/0x20
> > [<fffffe00002df304>] generic_make_request_checks+0x114/0x230
> > [<fffffe00002e0580>] generic_make_request+0x10/0x108
> > [<fffffe00002e070c>] submit_bio+0x94/0x1e0
> > [<fffffe00001d573c>] submit_bh_wbc.isra.36+0x100/0x1a8
> > [<fffffe00001d5bf8>] block_read_full_page+0x320/0x3e8
> > [<fffffe00001d877c>] blkdev_readpage+0x14/0x20
> > [<fffffe000014582c>] do_read_cache_page+0x16c/0x1a0
> > [<fffffe0000145870>] read_cache_page+0x10/0x1c
> > [<fffffe00002f2908>] read_dev_sector+0x30/0x9c
> > [<fffffe00002f3d84>] msdos_partition+0x84/0x554
> > [<fffffe00002f38e4>] check_partition+0xf8/0x21c
> > [<fffffe00002f2f28>] rescan_partitions+0xb0/0x2a4
> > [<fffffe00001d98b0>] __blkdev_get+0x228/0x34c
> > [<fffffe00001d9a14>] blkdev_get+0x40/0x364
> > [<fffffe00002f0b6c>] add_disk+0x398/0x424
> > [<fffffe00003d8500>] blkback_changed+0x1200/0x152c
> > [<fffffe000036a954>] xenbus_otherend_changed+0x9c/0xa4
> > [<fffffe000036c984>] backend_changed+0xc/0x18
> > [<fffffe000036a088>] xenwatch_thread+0xa0/0x13c
> > [<fffffe00000d98d0>] kthread+0xd8/0xf0
> >
> > The fs buffer code seems to assume that the block driver will always support
> > at least a bio of PAGE_SIZE.
> >
> > > One wya to make this work is for the driver (xen-blkfront) to split
> > > the 'struct request' I/O in two internal requests.
> > >
> > > But this has to be a normal problem. Surely there are other drivers
> > > (MMC?) that can't handle PAGE_SIZE and have to break their I/Os.
> > > Would it make sense for the common block code to be able to deal
> > > with this?
> >
> > It will take me a bit of time to fully understand the block layer
> > as the changes doesn't seem trivial from POV (I don't have any
> > knowledge in it). So I will wait a feedback from Jens before
> > going further on this approach.
>
> Maybe we could fall back to the previous plan of modifying xen-blkfront
> for the moment?
Which afaic need to be reposted?
next prev parent reply other threads:[~2015-09-04 15:41 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-18 6:29 [RFC] Support of non-indirect grant backend on 64KB guest Julien Grall
2015-08-18 7:09 ` Roger Pau Monné
2015-08-18 7:26 ` Jan Beulich
2015-08-18 18:45 ` Julien Grall
2015-08-19 8:50 ` Roger Pau Monné
2015-08-19 14:54 ` Julien Grall
2015-08-19 15:17 ` Roger Pau Monné
2015-08-19 15:52 ` Julien Grall
2015-08-19 23:44 ` Stefano Stabellini
2015-08-20 8:31 ` Roger Pau Monné
2015-08-20 9:43 ` David Vrabel
2015-08-20 16:16 ` Julien Grall
2015-08-20 17:23 ` Stefano Stabellini
2015-08-21 16:05 ` Konrad Rzeszutek Wilk
2015-08-21 16:08 ` David Vrabel
2015-08-21 16:49 ` Stefano Stabellini
2015-08-21 17:10 ` PAGE_SIZE (64KB), while block driver 'struct request' deals with < PAGE_SIZE (up to 44Kb). Was:Re: " Konrad Rzeszutek Wilk
2015-08-27 17:51 ` Julien Grall
2015-09-04 14:04 ` Stefano Stabellini
2015-09-04 15:41 ` Konrad Rzeszutek Wilk [this message]
2015-09-04 16:15 ` Julien Grall
2015-09-04 17:32 ` Konrad Rzeszutek Wilk
2015-09-04 22:05 ` Julien Grall
2015-08-20 9:37 ` Jan Beulich
2015-08-19 8:58 ` Jan Beulich
2015-08-19 15:25 ` Julien Grall
2015-08-20 17:42 ` David Vrabel
2015-08-21 1:30 ` Julien Grall
2015-08-21 16:07 ` Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150904154151.GA28256@localhost.localdomain \
--to=konrad.wilk@oracle.com \
--cc=Ian.Campbell@citrix.com \
--cc=axboe@fb.com \
--cc=david.vrabel@citrix.com \
--cc=julien.grall@citrix.com \
--cc=roger.pau@citrix.com \
--cc=stefano.stabellini@eu.citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.