xen-devel.lists.xenproject.org archive mirror
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Duan, Ronghui" <ronghui.duan@intel.com>
Cc: "Stefano.Stabellini@eu.citrix.com"
	<Stefano.Stabellini@eu.citrix.com>,
	"Ian.Jackson@eu.citrix.com" <Ian.Jackson@eu.citrix.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [RFC v1 0/5] VBD: enlarge max segment per request in blkfront
Date: Thu, 16 Aug 2012 09:55:49 -0400
Message-ID: <20120816135549.GA17613@phenom.dumpdata.com>
In-Reply-To: <20120816133457.GA5898@phenom.dumpdata.com>

On Thu, Aug 16, 2012 at 09:34:57AM -0400, Konrad Rzeszutek Wilk wrote:
> On Thu, Aug 16, 2012 at 10:22:56AM +0000, Duan, Ronghui wrote:
> > Hi, list.
> > The maximum number of segments per request in the VBD queue is 11, while in Linux and other VMMs the parameter is set to 128 by default.
> 
> Like the FreeBSD one?
> 
> > This may be caused by the limited size of the ring between the frontend and backend. So I wonder whether we can put segment data into another ring and use it dynamically according to each request's needs. Here is a prototype which has not been tested much, but it works on a 64-bit Linux 3.4.6 kernel. I can see CPU% reduced to 1/3 of the original in the sequential test. But it brings some overhead, which makes the CPU utilization of random I/O increase a little.
> > 
> 
> Did you think also about expanding the ring size to something bigger?
> 
> > Here is a short version of the data, using only 1K random reads and 64K sequential reads in direct mode, with a physical SSD disk behind blkback in the backend. CPU% is taken from xentop.
> 
> > Read 1K random   IOPS      Dom0 CPU%   DomU CPU%
> >   W              52005.9   86.6        71
> >   W/O            52123.1   85.8        66.9
> >
> > Read 64K seq     BW MB/s   Dom0 CPU%   DomU CPU%
> >   W              250       27.1        10.6
> >   W/O            250       62.6        31.1
> > 
> > 
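As a quick check of the "1/3" figure against the table above, here is a small sketch that expresses CPU% with the patch as a percentage of CPU% without it. The function name is made up for illustration; the inputs are just the numbers from the table.

```c
#include <assert.h>

/* CPU% with the patch, expressed as a percentage of CPU% without it.
 * Hypothetical helper, not part of the posted patches. */
static double cpu_percent_of_baseline(double with_patch, double without_patch)
{
    assert(without_patch > 0.0);
    return 100.0 * with_patch / without_patch;
}
```

With the 64K sequential numbers this gives roughly 43% for Dom0 (27.1 vs 62.6) and roughly 34% for DomU (10.6 vs 31.1). With the 1K random numbers it gives roughly 101% for Dom0 and 106% for DomU, which matches the small overhead mentioned for random I/O.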
> > The patch would be simple if we only used the new method. But we need to consider that a user may run a new kernel as the backend and an older one as the frontend. We also need to consider the live migration case. So the change becomes huge...
> 
> OK? I think you are implementing the extension documented in
> 
> changeset:   24875:a59c1dcfe968
> user:        Justin T. Gibbs <justing@spectralogic.com>
> date:        Thu Feb 23 10:03:07 2012 +0000
> summary:     blkif.h: Define and document the request number/size/segments extension
> 
> changeset:   24874:f9789db96c39
> user:        Justin T. Gibbs <justing@spectralogic.com>
> date:        Thu Feb 23 10:02:30 2012 +0000
> summary:     blkif.h: Document the Red Hat and Citrix blkif multi-page ring extensions
> 
> so that would be the max-requests-segments one?
> 
> 
> 
> > [RFC v1 1/5]
> > 	In order to add the new segment ring, refactor the original code and split out some methods related to ring operations.
> > [RFC v1 2/5]
> > 	Add segment ring support in blkfront. Most of the code is about suspend/recover.
> > [RFC v1 3/5]
> > 	Likewise, refactor the original code in blkback.
> > [RFC v1 4/5]
> > 	In order to support different ring types in blkback, make the pending_req list per disk.
> 
> Not sure why you structured the patches this way, but it might
> make sense to order them 1, 3, 4, 2, 5. The per-disk 'pending_req' is an overall
> improvement that fixes a lot of concurrency issues. I tried to implement this and ran
> into an issue with grants still being active. Did you have issues with that, or did it
> work just fine for you?
> > [RFC v1 5/5]
> > 	Add the segment ring support in blkback.
> 
> So .. where are the patches? Did I miss them?

Ah, they just arrived.

I took a brief look at them, and I think they are a step in the right direction. One thing
that is missing is the kfree in 4/5 for when the disk goes away. Also, there is some code
that is commented out, and it's not clear to me why that is.
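
To illustrate the missing-kfree point, here is a minimal user-space sketch of the alloc/teardown pairing for a per-disk pending_req pool. The struct and function names are hypothetical and are not taken from the patches; calloc()/free() stand in for the kernel's kcalloc()/kfree().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the blkback structures; illustrative only. */
struct pending_req {
    int in_use;
};

struct disk_ctx {
    struct pending_req *pending_reqs;  /* per-disk pool */
    unsigned int nr_pending;
};

/* Allocate the per-disk pool when the disk is connected; 0 on success, -1 on failure. */
static int disk_alloc_pending(struct disk_ctx *d, unsigned int nr)
{
    assert(d != NULL && nr > 0);
    d->pending_reqs = calloc(nr, sizeof(*d->pending_reqs)); /* kcalloc() in the kernel */
    if (!d->pending_reqs)
        return -1;
    d->nr_pending = nr;
    return 0;
}

/* Teardown path: has to run when the disk goes away, otherwise the pool leaks. */
static void disk_free_pending(struct disk_ctx *d)
{
    free(d->pending_reqs);   /* kfree() in the kernel; both accept NULL */
    d->pending_reqs = NULL;  /* a second teardown call becomes harmless */
    d->nr_pending = 0;
}
```

The point is only that once the pool is per disk, every path that tears a disk down needs the matching free.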

Next, this protocol should be negotiated using the 'max-request-.. ' or whichever is
the proper type, not the blkfront-ring-type. It would also be good to CC Justin, as he
might have some guidance on this and could also test the frontend on his backend
(or vice versa). Not sure what is involved in setting up the FreeBSD backend that Spectra Logic
is using. Though this might also involve expanding the ring to be a multi-page one,
I think?
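
A rough sketch of what negotiating on a maximum could look like, assuming each side publishes the largest segment count it supports and the old limit of 11 applies when the other end publishes nothing. The function and macro names are hypothetical, and the XenStore reads are left out on purpose since the exact key is still to be settled.

```c
#include <assert.h>

/* Legacy per-request segment limit, as stated at the top of this thread. */
#define LEGACY_MAX_SEGMENTS 11u

/*
 * Hypothetical helper: pick the segment count both ends can handle.
 * A value of 0 means "that side did not advertise anything", in which
 * case we fall back to the legacy limit so old kernels keep working.
 */
static unsigned int negotiate_max_segments(unsigned int front_max,
                                           unsigned int back_max)
{
    if (front_max == 0 || back_max == 0)
        return LEGACY_MAX_SEGMENTS;
    return front_max < back_max ? front_max : back_max;
}
```

With this shape, a new frontend talking to an old backend (or the reverse) ends up at 11 segments without needing a separate ring-type switch.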

And I wonder if you need to have such a huge list of ops? Can some of them be trimmed down?
The v1 and v2 ones look quite similar. Oh, and instead of v1 and v2 I would just call them
'large_segment' and 'default_segment'. Or 'lgr_segment' and 'def_segment' perhaps?

Maybe 'huge_segment' and 'generic_segment'; that sounds better.

Lastly, it's not clear to me why you are removing the padding on some of the older blkif structures?

Thanks for posting this!
> > -ronghui
> > 
> > 
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > http://lists.xen.org/xen-devel

Thread overview: 17+ messages
2012-08-16 10:22 [RFC v1 0/5] VBD: enlarge max segment per request in blkfront Duan, Ronghui
2012-08-16 11:14 ` Jan Beulich
2012-08-17  1:12   ` Duan, Ronghui
2012-08-16 13:34 ` Konrad Rzeszutek Wilk
2012-08-16 13:55   ` Konrad Rzeszutek Wilk [this message]
2012-08-17  1:26     ` Duan, Ronghui
2012-08-16 14:18   ` Jan Beulich
2012-09-07 17:49 ` Konrad Rzeszutek Wilk
2012-09-13  2:28   ` Duan, Ronghui
2012-09-13  7:32     ` Jan Beulich
2012-09-13 11:05       ` Stefano Stabellini
2012-09-13 13:23         ` Konrad Rzeszutek Wilk
2012-09-13 14:05           ` Duan, Ronghui
2012-09-17  6:33             ` Duan, Ronghui
2012-09-17 14:37               ` Konrad Rzeszutek Wilk
2012-09-19 21:11               ` Konrad Rzeszutek Wilk
2012-09-13 13:21     ` Konrad Rzeszutek Wilk