Re: [PATCH v5 2/2] xen/block: add multi-page ring support

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Bob Liu <bob.liu@oracle.com>
To: Paul Durrant <Paul.Durrant@citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	David Vrabel <david.vrabel@citrix.com>,
	"justing@spectralogic.com" <justing@spectralogic.com>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	Roger Pau Monne <roger.pau@citrix.com>,
	Julien Grall <julien.grall@citrix.com>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 2/2] xen/block: add multi-page ring support
Date: Fri, 22 May 2015 17:03:11 +0800	[thread overview]
Message-ID: <555EF0CF.8030009@oracle.com> (raw)
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD0259141D9@AMSPEX01CL01.citrite.net>


On 05/22/2015 04:31 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Bob Liu [mailto:bob.liu@oracle.com]
>> Sent: 22 May 2015 01:00
>> To: xen-devel@lists.xen.org
>> Cc: David Vrabel; justing@spectralogic.com; konrad.wilk@oracle.com; Roger
>> Pau Monne; Paul Durrant; Julien Grall; boris.ostrovsky@oracle.com; linux-
>> kernel@vger.kernel.org; Bob Liu
>> Subject: [PATCH v5 2/2] xen/block: add multi-page ring support
>>
>> Extend xen/block to support multi-page ring, so that more requests can be
>> issued by using more than one pages as the request ring between blkfront
>> and backend.
>> As a result, the performance can get improved significantly.
>>
>> We got some impressive improvements on our highend iscsi storage cluster
>> backend. If using 64 pages as the ring, the IOPS increased about 15 times
>> for the throughput testing and above doubled for the latency testing.
>>
>> The reason was the limit on outstanding requests is 32 if use only one-page
>> ring, but in our case the iscsi lun was spread across about 100 physical
>> drives, 32 was really not enough to keep them busy.
>>
>> Changes in v2:
>>  - Rebased to 4.0-rc6.
>>  - Document on how multi-page ring feature working to linux io/blkif.h.
>>
>> Changes in v3:
>>  - Remove changes to linux io/blkif.h and follow the protocol defined
>>    in io/blkif.h of XEN tree.
>>  - Rebased to 4.1-rc3
>>
>> Changes in v4:
>>  - Turn to use 'ring-page-order' and 'max-ring-page-order'.
>>  - A few comments from Roger.
>>
>> Changes in v5:
>>  - Clarify 4k granularity to comment.
>>  - Address more comments from Roger.
>>
>> Signed-off-by: Bob Liu <bob.liu@oracle.com>
>> ---
>>  drivers/block/xen-blkback/blkback.c |  13 ++++
>>  drivers/block/xen-blkback/common.h  |   3 +-
>>  drivers/block/xen-blkback/xenbus.c  |  88 +++++++++++++++++------
>>  drivers/block/xen-blkfront.c        | 135 +++++++++++++++++++++++++-------
>> ----
>>  4 files changed, 179 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-
>> blkback/blkback.c
>> index 713fc9f..2126842 100644
>> --- a/drivers/block/xen-blkback/blkback.c
>> +++ b/drivers/block/xen-blkback/blkback.c
>> @@ -84,6 +84,13 @@ MODULE_PARM_DESC(max_persistent_grants,
>>                   "Maximum number of grants to map persistently");
>>
>>  /*
>> + * Maximum order of pages to be used for the shared ring between front
>> and
>> + * backend, 4KB page granularity is used.
>> + */
>> +unsigned int xen_blkif_max_ring_order =
>> XENBUS_MAX_RING_PAGE_ORDER;
>> +module_param_named(max_ring_page_order, xen_blkif_max_ring_order,
>> int, S_IRUGO);
>> +MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages
>> to be used for the shared ring");
>> +/*
>>   * The LRU mechanism to clean the lists of persistent grants needs to
>>   * be executed periodically. The time interval between consecutive
>> executions
>>   * of the purge mechanism is set in ms.
>> @@ -1438,6 +1445,12 @@ static int __init xen_blkif_init(void)
>>  	if (!xen_domain())
>>  		return -ENODEV;
>>
>> +	if (xen_blkif_max_ring_order > XENBUS_MAX_RING_PAGE_ORDER)
>> {
>> +		pr_info("Invalid max_ring_order (%d), will use default max:
>> %d.\n",
>> +			xen_blkif_max_ring_order,
>> XENBUS_MAX_RING_PAGE_ORDER);
>> +		xen_blkif_max_ring_order =
>> XENBUS_MAX_RING_PAGE_ORDER;
>> +	}
>> +
>>  	rc = xen_blkif_interface_init();
>>  	if (rc)
>>  		goto failed_init;
>> diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-
>> blkback/common.h
>> index f620b5d..919a1ab 100644
>> --- a/drivers/block/xen-blkback/common.h
>> +++ b/drivers/block/xen-blkback/common.h
>> @@ -44,6 +44,7 @@
>>  #include <xen/interface/io/blkif.h>
>>  #include <xen/interface/io/protocols.h>
>>
>> +extern unsigned int xen_blkif_max_ring_order;
>>  /*
>>   * This is the maximum number of segments that would be allowed in
>> indirect
>>   * requests. This value will also be passed to the frontend.
>> @@ -248,7 +249,7 @@ struct backend_info;
>>  #define PERSISTENT_GNT_WAS_ACTIVE	1
>>
>>  /* Number of requests that we can fit in a ring */
>> -#define XEN_BLKIF_REQS			32
>> +#define XEN_MAX_BLKIF_REQS		(32 *
>> XENBUS_MAX_RING_PAGES)
>>
>>  struct persistent_gnt {
>>  	struct page *page;
>> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-
>> blkback/xenbus.c
>> index 6ab69ad..bc33888 100644
>> --- a/drivers/block/xen-blkback/xenbus.c
>> +++ b/drivers/block/xen-blkback/xenbus.c
>> @@ -25,6 +25,7 @@
>>
>>  /* Enlarge the array size in order to fully show blkback name. */
>>  #define BLKBACK_NAME_LEN (20)
>> +#define RINGREF_NAME_LEN (20)
>>
>>  struct backend_info {
>>  	struct xenbus_device	*dev;
>> @@ -152,7 +153,7 @@ static struct xen_blkif *xen_blkif_alloc(domid_t
>> domid)
>>  	INIT_LIST_HEAD(&blkif->pending_free);
>>  	INIT_WORK(&blkif->free_work, xen_blkif_deferred_free);
>>
>> -	for (i = 0; i < XEN_BLKIF_REQS; i++) {
>> +	for (i = 0; i < XEN_MAX_BLKIF_REQS; i++) {
> 
> How big is XEN_MAX_BLKIF_REQS? These allocations are per-instance so I'd be concerned that the increase in the number of allocations would hit system scalability.
> 

Right, Roger and I have agreed to delay request allocation(including indirect page related memory) until we know the exactly value.
But that would be in an new patch soon.
"
Ack. As said, we have been doing this for a long time. When I added
indirect descriptors I've also allocated everything before knowing if
indirect descriptors will be used or not.

Maybe it's time to change that and provide a way to allocate how many
requests we need, and which fields should be allocated based on the
supported features.
"

Thanks,
-Bob

next prev parent reply	other threads:[~2015-05-22  9:03 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-05-21 23:59 [PATCH v2 1/2] driver: xen-blkfront: move talk_to_blkback to a more suitable place Bob Liu
2015-05-21 23:59 ` [PATCH v5 2/2] xen/block: add multi-page ring support Bob Liu
2015-05-22  8:31   ` Paul Durrant
2015-05-22  8:31     ` Paul Durrant
2015-05-22  9:03     ` Bob Liu
2015-05-22  9:03     ` Bob Liu [this message]
2015-05-21 23:59 ` Bob Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=555EF0CF.8030009@oracle.com \
    --to=bob.liu@oracle.com \
    --cc=Paul.Durrant@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=david.vrabel@citrix.com \
    --cc=julien.grall@citrix.com \
    --cc=justing@spectralogic.com \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=roger.pau@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.