* Re: RE: RE: poor domU VBD performance.
[not found] <A95E2296287EAD4EB592B5DEEFCE0E9D1E3905@liverpoolst.ad.cl.cam.ac.uk>
@ 2005-03-29 22:45 ` Kurt Garloff
2005-03-29 22:59 ` Andrew Theurer
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Kurt Garloff @ 2005-03-29 22:45 UTC (permalink / raw)
To: Ian Pratt
Cc: Xen development list, Vincent Hanquez, Jens Axboe, Christian Limpach

[-- Attachment #1.1.1: Type: text/plain, Size: 1604 bytes --]

Hi Ian,

On Tue, Mar 29, 2005 at 07:09:50PM +0100, Ian Pratt wrote:
> We'd really appreciate your help on this, or from someone else at SuSE
> who actually understands the Linux block layer?

I'm Cc'ing Jens ...

> In the 2.6 blkfront driver, what scheduler should we be registering
> with? What should we be setting as max_sectors? Are there other
> parameters we should be setting that we aren't? (block size?)

I think noop is a good choice for secondary domains, as you don't
want to be too clever there, otherwise you stack a clever scheduler
on top of a clever scheduler. noop basically only does front- and
backmerging to make the request sizes larger.

But you probably should initialize the readahead sectors.

Please test attached patch.

It fixed the problem for me, but my testing was very limited,
I only had a small loopback mounted root fs to test with quickly.

Note that initializing to 256 (128k) would be OK as well (and might
be the better default); it seems to be set to 256 (128k) by default,
but it's not ... If you explicitly set it to 256, the performance
still increases tremendously.

> In the blkback driver that actually issues the IO's in dom0, is there
> something we should be doing to cause IOs to get batched? In 2.4 we used
> a task_queue to push the IO through to the disk having queued it with
> generic_make_request(). In 2.6 we're currently using submit_bio() and
> just hoping that batching happens.

I don't think the blkback driver does anything wrong here.

Regards,
--
Kurt Garloff, Director SUSE Labs, Novell Inc.

[-- Attachment #1.1.2: xen-blkfront-ra.diff --]
[-- Type: text/plain, Size: 840 bytes --]

From: Kurt Garloff <garloff@suse.de>
Subject: Initialize readahead in vbd Q init code

The domU read performance is poor without readahead, so
better make sure we initialize this value.

Signed-off-by: Kurt Garloff <garloff@suse.de>

Index: linux-2.6.11/drivers/xen/blkfront/vbd.c
===================================================================
--- linux-2.6.11.orig/drivers/xen/blkfront/vbd.c
+++ linux-2.6.11/drivers/xen/blkfront/vbd.c
@@ -268,8 +268,11 @@ static struct gendisk *xlvbd_get_gendisk
                 xlbd_blk_queue, BLKIF_MAX_SEGMENTS_PER_REQUEST);
 
         /* Make sure buffer addresses are sector-aligned. */
         blk_queue_dma_alignment(xlbd_blk_queue, 511);
+
+        /* Set readahead */
+        blk_queue_max_sectors(xlbd_blk_queue, 512);
     }
     gd->queue = xlbd_blk_queue;
 
     add_disk(gd);

[-- Attachment #1.2: Type: application/pgp-signature, Size: 189 bytes --]

[-- Attachment #2: Type: text/plain, Size: 138 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

^ permalink raw reply [flat|nested] 9+ messages in thread
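The sector counts quoted in the message above (256 sectors as 128k, and 512 in the patch) can be checked with a little arithmetic. This is a minimal sketch, assuming the conventional 512-byte sector; `sectors_to_kib` and `SECTOR_SIZE_BYTES` are hypothetical names for illustration and are not part of the kernel or the Xen drivers.

```c
#include <assert.h>

/* Assumption: the limits discussed are counted in 512-byte sectors. */
#define SECTOR_SIZE_BYTES 512u

/* Hypothetical helper: convert a sector count to KiB. */
static unsigned int sectors_to_kib(unsigned int sectors)
{
    /* Guard against overflow in the multiplication below. */
    assert(sectors <= 0xFFFFFFFFu / SECTOR_SIZE_BYTES);
    return (sectors * SECTOR_SIZE_BYTES) / 1024u;
}
```

Under that assumption, 256 sectors come to 128 KiB, and the value 512 used in the patch comes to 256 KiB.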
* Re: poor domU VBD performance.
2005-03-29 22:45 ` RE: RE: poor domU VBD performance Kurt Garloff
@ 2005-03-29 22:59 ` Andrew Theurer
2005-03-29 23:19 ` Kurt Garloff
2005-03-30 8:53 ` RE: " Jens Axboe
2005-03-30 10:00 ` Kurt Garloff
2 siblings, 1 reply; 9+ messages in thread
From: Andrew Theurer @ 2005-03-29 22:59 UTC (permalink / raw)
To: Kurt Garloff, Ian Pratt
Cc: Vincent Hanquez, Xen development list, Jens Axboe, Christian Limpach

On Tuesday 29 March 2005 16:45, Kurt Garloff wrote:
> Hi Ian,
>
> On Tue, Mar 29, 2005 at 07:09:50PM +0100, Ian Pratt wrote:
> > We'd really appreciate your help on this, or from someone else at SuSE
> > who actually understands the Linux block layer?
>
> I'm Cc'ing Jens ...
>
> > In the 2.6 blkfront driver, what scheduler should we be registering
> > with? What should we be setting as max_sectors? Are there other
> > parameters we should be setting that we aren't? (block size?)
>
> I think noop is a good choice for secondary domains, as you don't
> want to be too clever there, otherwise you stack a clever scheduler
> on top of a clever scheduler. noop basically only does front- and
> backmerging to make the request sizes larger.
>
> But you probably should initialize the readahead sectors.
>
> Please test attached patch.

This should help the case where one is doing buffered IO (so readahead gets
used) but for o_direct, I still think we will have a problem. On Dom0, I can
drive 58MB/sec with sequential read with o_direct with just a 32k request
size, but on domU with the same request size I can only get ~6MB/sec.

I am still wondering if something is up with the backend driver. It appears
that the backend driver only submits requests to the actual device every
10ms. With a much larger request size (for o_direct) or a large readahead,
10ms is often enough to keep the disk streaming data. With smaller request
sizes or small readahead, the disk just doesn't read efficiently.

-Andrew

^ permalink raw reply [flat|nested] 9+ messages in thread
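To see why a fixed submission interval would cap throughput for small requests, as suspected in the message above, here is a rough back-of-the-envelope model. It is only a sketch under stated assumptions: a fixed number of requests is dispatched once per interval and the disk sits idle until the next one. The function name and the model itself are hypothetical and not taken from the blkback driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: bytes per second when `reqs_per_tick` requests of
 * `req_bytes` each are dispatched once every `tick_ms` milliseconds and
 * nothing else reaches the disk in between.
 */
static uint64_t tick_limited_bytes_per_sec(uint64_t req_bytes,
                                           uint64_t reqs_per_tick,
                                           uint64_t tick_ms)
{
    assert(tick_ms > 0);
    return req_bytes * reqs_per_tick * 1000u / tick_ms;
}
```

With 32 KiB requests and a 10 ms interval, one request per interval works out to 3,276,800 bytes/s and two per interval to 6,553,600 bytes/s. That is the same order of magnitude as the ~6MB/sec reported for domU, but it is arithmetic under the assumptions above, not a measurement.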
* Re: poor domU VBD performance.
2005-03-29 22:59 ` Andrew Theurer
@ 2005-03-29 23:19 ` Kurt Garloff
2005-03-29 23:26 ` Andrew Theurer
0 siblings, 1 reply; 9+ messages in thread
From: Kurt Garloff @ 2005-03-29 23:19 UTC (permalink / raw)
To: Andrew Theurer
Cc: Ian Pratt, Christian Limpach, Xen development list, Jens Axboe, Vincent Hanquez

[-- Attachment #1.1: Type: text/plain, Size: 1117 bytes --]

Hi Andrew,

On Tue, Mar 29, 2005 at 04:59:18PM -0600, Andrew Theurer wrote:
> On Tuesday 29 March 2005 16:45, Kurt Garloff wrote:
> > Please test attached patch.
>
> This should help the case where one is doing buffered IO (so readahead gets
> used) but for o_direct, I still think we will have a problem. On Dom0, I can
> drive 58MB/sec with sequential read with o_direct with just a 32k request
> size, but on domU with the same request size I can only get ~6MB/sec.

I can't reproduce this.
Does this depend on whether your domU root is a loopback mounted file
or a real partition/LVM device?

> I am still wondering if something is up with the backend driver. It
> appears that the backend driver only submits requests to the actual
> device every 10ms. With a much larger request size (for o_direct) or
> a large readahead, 10ms is often enough to keep the disk streaming
> data. With smaller request sizes or small readahead, the disk just
> doesn't read efficiently.

We might have a problem with unplugging then.

Regards,
--
Kurt Garloff, Director SUSE Labs, Novell Inc.

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: poor domU VBD performance.
2005-03-29 23:19 ` Kurt Garloff
@ 2005-03-29 23:26 ` Andrew Theurer
0 siblings, 0 replies; 9+ messages in thread
From: Andrew Theurer @ 2005-03-29 23:26 UTC (permalink / raw)
To: Kurt Garloff
Cc: Ian Pratt, Christian Limpach, Xen development list, Jens Axboe, Vincent Hanquez

On Tuesday 29 March 2005 17:19, Kurt Garloff wrote:
> Hi Andrew,
>
> On Tue, Mar 29, 2005 at 04:59:18PM -0600, Andrew Theurer wrote:
> > On Tuesday 29 March 2005 16:45, Kurt Garloff wrote:
> > > Please test attached patch.
> >
> > This should help the case where one is doing buffered IO (so readahead
> > gets used) but for o_direct, I still think we will have a problem. On
> > Dom0, I can drive 58MB/sec with sequential read with o_direct with just a
> > 32k request size, but on domU with the same request size I can only get
> > ~6MB/sec.
>
> I can't reproduce this.
> Does this depend on whether your domU root is a loopback mounted file
> or a real partition/LVM device?

I am not sure. What program are you using for o_direct reads? I use a real
LVM device for domU root and then another whole disk for the read tests.

> > I am still wondering if something is up with the backend driver. It
> > appears that the backend driver only submits requests to the actual
> > device every 10ms. With a much larger request size (for o_direct) or
> > a large readahead, 10ms is often enough to keep the disk streaming
> > data. With smaller request sizes or small readahead, the disk just
> > doesn't read efficiently.
>
> We might have a problem with unplugging then.

That's what I suspect, but I do not know the driver code well enough to say
for sure.

-Andrew

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: RE: RE: poor domU VBD performance.
2005-03-29 22:45 ` RE: RE: poor domU VBD performance Kurt Garloff
2005-03-29 22:59 ` Andrew Theurer
@ 2005-03-30 8:53 ` Jens Axboe
2005-03-30 10:00 ` Kurt Garloff
2 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2005-03-30 8:53 UTC (permalink / raw)
To: Kurt Garloff
Cc: Ian Pratt, Xen development list, Vincent Hanquez, Christian Limpach

On Wed, Mar 30 2005, Kurt Garloff wrote:
> Hi Ian,
>
> On Tue, Mar 29, 2005 at 07:09:50PM +0100, Ian Pratt wrote:
> > We'd really appreciate your help on this, or from someone else at SuSE
> > who actually understands the Linux block layer?
>
> I'm Cc'ing Jens ...
>
> > In the 2.6 blkfront driver, what scheduler should we be registering
> > with? What should we be setting as max_sectors? Are there other
> > parameters we should be setting that we aren't? (block size?)
>
> I think noop is a good choice for secondary domains, as you don't
> want to be too clever there, otherwise you stack a clever scheduler
> on top of a clever scheduler. noop basically only does front- and
> backmerging to make the request sizes larger.
>
> But you probably should initialize the readahead sectors.
>
> Please test attached patch.
>
> It fixed the problem for me, but my testing was very limited,
> I only had a small loopback mounted root fs to test with quickly.
>
> Note that initializing to 256 (128k) would be OK as well (and might
> be the better default); it seems to be set to 256 (128k) by default,
> but it's not ... If you explicitly set it to 256, the performance
> still increases tremendously.
>
> > In the blkback driver that actually issues the IO's in dom0, is there
> > something we should be doing to cause IOs to get batched? In 2.4 we used
> > a task_queue to push the IO through to the disk having queued it with
> > generic_make_request(). In 2.6 we're currently using submit_bio() and
> > just hoping that batching happens.
>
> I don't think the blkback driver does anything wrong here.
>
> Regards,
> --
> Kurt Garloff, Director SUSE Labs, Novell Inc.

> From: Kurt Garloff <garloff@suse.de>
> Subject: Initialize readahead in vbd Q init code
>
> The domU read performance is poor without readahead, so
> better make sure we initialize this value.
>
> Signed-off-by: Kurt Garloff <garloff@suse.de>
>
> Index: linux-2.6.11/drivers/xen/blkfront/vbd.c
> ===================================================================
> --- linux-2.6.11.orig/drivers/xen/blkfront/vbd.c
> +++ linux-2.6.11/drivers/xen/blkfront/vbd.c
> @@ -268,8 +268,11 @@ static struct gendisk *xlvbd_get_gendisk
>                  xlbd_blk_queue, BLKIF_MAX_SEGMENTS_PER_REQUEST);
>
>          /* Make sure buffer addresses are sector-aligned. */
>          blk_queue_dma_alignment(xlbd_blk_queue, 511);
> +
> +        /* Set readahead */
> +        blk_queue_max_sectors(xlbd_blk_queue, 512);

This isn't read-ahead, it's the max request size setting. The actual
read-ahead setting is in q->backing_dev_info.ra_pages.

There is a helper function for this type of stacking,
blk_queue_stack_limits(). You call it after setting up your own queue:

        blk_queue_stack_limits(my_queue, bottom_queue);

I'll check the xen block driver to see if there's anything else that
sticks out.

--
Jens Axboe

^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: RE: RE: poor domU VBD performance.
2005-03-29 22:45 ` RE: RE: poor domU VBD performance Kurt Garloff
2005-03-29 22:59 ` Andrew Theurer
2005-03-30 8:53 ` RE: " Jens Axboe
@ 2005-03-30 10:00 ` Kurt Garloff
2 siblings, 0 replies; 9+ messages in thread
From: Kurt Garloff @ 2005-03-30 10:00 UTC (permalink / raw)
To: Xen development list
Cc: Ian Pratt, Christian Limpach, Jens Axboe, Vincent Hanquez

[-- Attachment #1.1: Type: text/plain, Size: 271 bytes --]

On Wed, Mar 30, 2005 at 12:45:03AM +0200, Kurt Garloff wrote:
> Please test attached patch.

Delete it, blk_queue_max_sectors() is called a bit above.
Adding printk()s now to see what's going on there.

Regards,
--
Kurt Garloff, Director SUSE Labs, Novell Inc.

^ permalink raw reply [flat|nested] 9+ messages in thread
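The distinction drawn earlier in the thread, between the maximum request size and the read-ahead window, can be illustrated with a small user-space model. This is only a sketch: `struct queue_limits_model`, `stack_limits_model` and `readahead_kib_to_pages` are hypothetical stand-ins, not the kernel's data structures and not the implementation of blk_queue_stack_limits(). The model assumes that "stacking" means the upper queue must not advertise larger requests than the lower queue accepts, and that read-ahead is tracked separately in pages.

```c
#include <assert.h>

/* Hypothetical user-space stand-in for the two settings under discussion. */
struct queue_limits_model {
    unsigned int max_sectors; /* largest single request, in 512-byte sectors */
    unsigned int ra_pages;    /* read-ahead window, in pages */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
    return a < b ? a : b;
}

/*
 * Model of stacking: clamp the top queue's request size to what the
 * bottom queue accepts. Read-ahead is deliberately left untouched here,
 * because in this model it is an independent setting.
 */
static void stack_limits_model(struct queue_limits_model *top,
                               const struct queue_limits_model *bottom)
{
    top->max_sectors = min_uint(top->max_sectors, bottom->max_sectors);
}

/* Convert a read-ahead size in KiB to pages for a given page size. */
static unsigned int readahead_kib_to_pages(unsigned int kib,
                                           unsigned int page_size_bytes)
{
    assert(page_size_bytes >= 1024u && page_size_bytes % 1024u == 0);
    return kib / (page_size_bytes / 1024u);
}
```

In this model, raising max_sectors changes how large one request may be but does nothing to the read-ahead window; a 128 KiB window with 4 KiB pages corresponds to 32 pages.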
* RE: RE: RE: poor domU VBD performance.
@ 2005-03-30 11:16 Ian Pratt
2005-03-31 7:05 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Ian Pratt @ 2005-03-30 11:16 UTC (permalink / raw)
To: Jens Axboe, Kurt Garloff
Cc: Vincent Hanquez, Xen development list, Christian Limpach
> I'll check the xen block driver to see if there's anything
> else that sticks out.
>
> Jens Axboe
Jens, I'd really appreciate this.
The blkfront/blkback drivers have rather evolved over time, and I don't
think any of the core team fully understand the block-layer differences
between 2.4 and 2.6.
There's also some junk left in there from when the backend was in Xen
itself back in the days of 1.2, though Vincent has prepared a patch to
clean this up and also make 'refreshing' of vbd's work (for size
changes), and also allow the blkfront driver to import whole disks
rather than partitions. We had this functionality on 2.4, but lost it in
the move to 2.6.
My bet is that it's the 2.6 backend that is where the true performance
bug lies. Using a 2.6 domU blkfront talking to a 2.4 dom0 blkback seems
to give good performance under a wide variety of circumstances. Using a
2.6 dom0 is far more pernickety. I agree with Andrew: I suspect it's
the work queue changes that are biting us when we don't have many
outstanding requests.
Thanks,
Ian
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: RE: RE: poor domU VBD performance.
2005-03-30 11:16 Ian Pratt
@ 2005-03-31 7:05 ` Jens Axboe
2005-03-31 7:10 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2005-03-31 7:05 UTC (permalink / raw)
To: Ian Pratt
Cc: Xen development list, Kurt Garloff, Vincent Hanquez, Christian Limpach

On Wed, Mar 30 2005, Ian Pratt wrote:
> > I'll check the xen block driver to see if there's anything
> > else that sticks out.
> >
> > Jens Axboe
>
> Jens, I'd really appreciate this.
>
> The blkfront/blkback drivers have rather evolved over time, and I don't
> think any of the core team fully understand the block-layer differences
> between 2.4 and 2.6.
>
> There's also some junk left in there from when the backend was in Xen
> itself back in the days of 1.2, though Vincent has prepared a patch to
> clean this up and also make 'refreshing' of vbd's work (for size
> changes), and also allow the blkfront driver to import whole disks
> rather than partitions. We had this functionality on 2.4, but lost it in
> the move to 2.6.
>
> My bet is that it's the 2.6 backend that is where the true performance
> bug lies. Using a 2.6 domU blkfront talking to a 2.4 dom0 blkback seems
> to give good performance under a wide variety of circumstances. Using a
> 2.6 dom0 is far more pernickety. I agree with Andrew: I suspect it's
> the work queue changes that are biting us when we don't have many
> outstanding requests.

You never schedule the queues you submit the io against for the 2.6
kernel, you only have a tq_disk run for 2.4 kernels. This basically puts
you at the mercy of the timeout unplugging, which is really suboptimal
unless you can keep the io queue of the target busy at all times.

You need to either mark the last bio going to that device as BIO_SYNC,
or do a blk_run_queue() on the target queue after having submitted all
io in this batch for it.

--
Jens Axboe

^ permalink raw reply [flat|nested] 9+ messages in thread
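The advice in the message above (submit the whole batch, then kick the target queue once) can be pictured with a toy user-space model of a plugged queue. This is a sketch only: `struct toy_queue` and the `toy_*` functions are hypothetical and do not correspond to kernel structures or to the blkback code. The model assumes that submitted requests simply accumulate until something runs the queue.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: requests pile up while plugged and reach the device only
 * when the queue is run. */
struct toy_queue {
    unsigned int pending;    /* queued but not yet dispatched */
    unsigned int dispatched; /* handed to the imaginary device */
};

/* Queue one request without dispatching it. */
static void toy_submit(struct toy_queue *q)
{
    q->pending++;
}

/* Dispatch everything that has accumulated so far. */
static void toy_run_queue(struct toy_queue *q)
{
    q->dispatched += q->pending;
    q->pending = 0;
}

/* Submit a batch of n requests; kick the queue once at the end if asked. */
static void toy_submit_batch(struct toy_queue *q, unsigned int n, bool kick)
{
    for (unsigned int i = 0; i < n; i++)
        toy_submit(q);
    if (kick)
        toy_run_queue(q);
}
```

Without the final kick, the batch stays pending until something else (in the real kernel, the unplug timeout mentioned above) runs the queue; with it, the whole batch is dispatched immediately.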
* Re: RE: RE: poor domU VBD performance.
2005-03-31 7:05 ` Jens Axboe
@ 2005-03-31 7:10 ` Jens Axboe
0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2005-03-31 7:10 UTC (permalink / raw)
To: Ian Pratt
Cc: Xen development list, Kurt Garloff, Vincent Hanquez, Christian Limpach

On Thu, Mar 31 2005, Jens Axboe wrote:
> On Wed, Mar 30 2005, Ian Pratt wrote:
> > > I'll check the xen block driver to see if there's anything
> > > else that sticks out.
> > >
> > > Jens Axboe
> >
> > Jens, I'd really appreciate this.
> >
> > The blkfront/blkback drivers have rather evolved over time, and I don't
> > think any of the core team fully understand the block-layer differences
> > between 2.4 and 2.6.
> >
> > There's also some junk left in there from when the backend was in Xen
> > itself back in the days of 1.2, though Vincent has prepared a patch to
> > clean this up and also make 'refreshing' of vbd's work (for size
> > changes), and also allow the blkfront driver to import whole disks
> > rather than partitions. We had this functionality on 2.4, but lost it in
> > the move to 2.6.
> >
> > My bet is that it's the 2.6 backend that is where the true performance
> > bug lies. Using a 2.6 domU blkfront talking to a 2.4 dom0 blkback seems
> > to give good performance under a wide variety of circumstances. Using a
> > 2.6 dom0 is far more pernickety. I agree with Andrew: I suspect it's
> > the work queue changes that are biting us when we don't have many
> > outstanding requests.
>
> You never schedule the queues you submit the io against for the 2.6
> kernel, you only have a tq_disk run for 2.4 kernels. This basically puts
> you at the mercy of the timeout unplugging, which is really suboptimal
> unless you can keep the io queue of the target busy at all times.
>
> You need to either mark the last bio going to that device as BIO_SYNC,
> or do a blk_run_queue() on the target queue after having submitted all
> io in this batch for it.

Here is a temporary work-around, this should bring you close to 100%
performance at the cost of some extra unplugs. Uncompiled.

--- blkback.c~	2005-03-31 09:06:16.000000000 +0200
+++ blkback.c	2005-03-31 09:09:27.000000000 +0200
@@ -481,7 +481,6 @@
     for ( i = 0; i < nr_psegs; i++ )
     {
         struct bio *bio;
-        struct bio_vec *bv;
 
         bio = bio_alloc(GFP_ATOMIC, 1);
         if ( unlikely(bio == NULL) )
@@ -494,17 +493,12 @@
         bio->bi_private = pending_req;
         bio->bi_end_io = end_block_io_op;
         bio->bi_sector = phys_seg[i].sector_number;
-        bio->bi_rw = operation;
 
-        bv = bio_iovec_idx(bio, 0);
-        bv->bv_page = virt_to_page(MMAP_VADDR(pending_idx, i));
-        bv->bv_len = phys_seg[i].nr_sects << 9;
-        bv->bv_offset = phys_seg[i].buffer & ~PAGE_MASK;
+        bio_add_page(bio, virt_to_page(MMAP_VADDR(pending_idx, i)),
+                     phys_seg[i].nr_sects << 9,
+                     phys_seg[i].buffer & ~PAGE_MASK);
 
-        bio->bi_size = bv->bv_len;
-        bio->bi_vcnt++;
-
-        submit_bio(operation, bio);
+        submit_bio(operation | (1 << BIO_RW_SYNC), bio);
     }
 #endif
 

--
Jens Axboe

^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2005-03-31 7:10 UTC | newest]
Thread overview: 9+ messages
[not found] <A95E2296287EAD4EB592B5DEEFCE0E9D1E3905@liverpoolst.ad.cl.cam.ac.uk>
2005-03-29 22:45 ` RE: RE: poor domU VBD performance Kurt Garloff
2005-03-29 22:59 ` Andrew Theurer
2005-03-29 23:19 ` Kurt Garloff
2005-03-29 23:26 ` Andrew Theurer
2005-03-30 8:53 ` RE: " Jens Axboe
2005-03-30 10:00 ` Kurt Garloff
2005-03-30 11:16 Ian Pratt
2005-03-31 7:05 ` Jens Axboe
2005-03-31 7:10 ` Jens Axboe