public inbox for linux-kernel@vger.kernel.org
From: Bob Liu <bob.liu@oracle.com>
To: Paul Durrant <Paul.Durrant@citrix.com>
Cc: "xen-devel@lists.xen.org" <xen-devel@lists.xen.org>,
	David Vrabel <david.vrabel@citrix.com>,
	"justing@spectralogic.com" <justing@spectralogic.com>,
	"konrad.wilk@oracle.com" <konrad.wilk@oracle.com>,
	Roger Pau Monne <roger.pau@citrix.com>,
	Julien Grall <julien.grall@citrix.com>,
	"boris.ostrovsky@oracle.com" <boris.ostrovsky@oracle.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v5 2/2] xen/block: add multi-page ring support
Date: Fri, 22 May 2015 17:03:11 +0800	[thread overview]
Message-ID: <555EF0CF.8030009@oracle.com> (raw)
In-Reply-To: <9AAE0902D5BC7E449B7C8E4E778ABCD0259141D9@AMSPEX01CL01.citrite.net>


On 05/22/2015 04:31 PM, Paul Durrant wrote:
>> -----Original Message-----
>> From: Bob Liu [mailto:bob.liu@oracle.com]
>> Sent: 22 May 2015 01:00
>> To: xen-devel@lists.xen.org
>> Cc: David Vrabel; justing@spectralogic.com; konrad.wilk@oracle.com;
>> Roger Pau Monne; Paul Durrant; Julien Grall; boris.ostrovsky@oracle.com;
>> linux-kernel@vger.kernel.org; Bob Liu
>> Subject: [PATCH v5 2/2] xen/block: add multi-page ring support
>>
>> Extend xen/block to support multi-page rings, so that more requests can be
>> issued by using more than one page as the request ring between blkfront
>> and the backend.
>> As a result, performance can improve significantly.
>>
>> We saw some impressive improvements on our high-end iSCSI storage cluster
>> backend. When using 64 pages as the ring, IOPS increased about 15 times
>> in throughput testing and more than doubled in latency testing.
>>
>> The reason is that the limit on outstanding requests is 32 with a one-page
>> ring; in our case the iSCSI LUN was spread across about 100 physical
>> drives, and 32 requests were not enough to keep them busy.
>>
>> Changes in v2:
>>  - Rebased to 4.0-rc6.
>>  - Documented how the multi-page ring feature works in linux io/blkif.h.
>>
>> Changes in v3:
>>  - Remove changes to linux io/blkif.h and follow the protocol defined
>>    in io/blkif.h of XEN tree.
>>  - Rebased to 4.1-rc3
>>
>> Changes in v4:
>>  - Turn to use 'ring-page-order' and 'max-ring-page-order'.
>>  - A few comments from Roger.
>>
>> Changes in v5:
>>  - Clarified the 4K granularity in the comment.
>>  - Address more comments from Roger.
>>
>> Signed-off-by: Bob Liu <bob.liu@oracle.com>
>> ---
>>  drivers/block/xen-blkback/blkback.c |  13 ++++
>>  drivers/block/xen-blkback/common.h  |   3 +-
>>  drivers/block/xen-blkback/xenbus.c  |  88 +++++++++++++++++------
>>  drivers/block/xen-blkfront.c        | 135 +++++++++++++++++++++++++---------
>>  4 files changed, 179 insertions(+), 60 deletions(-)
>>
>> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
>> index 713fc9f..2126842 100644
>> --- a/drivers/block/xen-blkback/blkback.c
>> +++ b/drivers/block/xen-blkback/blkback.c
>> @@ -84,6 +84,13 @@ MODULE_PARM_DESC(max_persistent_grants,
>>                   "Maximum number of grants to map persistently");
>>
>>  /*
>> + * Maximum order of pages to be used for the shared ring between front and
>> + * backend, 4KB page granularity is used.
>> + */
>> +unsigned int xen_blkif_max_ring_order = XENBUS_MAX_RING_PAGE_ORDER;
>> +module_param_named(max_ring_page_order, xen_blkif_max_ring_order, int, S_IRUGO);
>> +MODULE_PARM_DESC(max_ring_page_order, "Maximum order of pages to be used for the shared ring");
>> +/*
>>   * The LRU mechanism to clean the lists of persistent grants needs to
>>   * be executed periodically. The time interval between consecutive executions
>>   * of the purge mechanism is set in ms.
>> @@ -1438,6 +1445,12 @@ static int __init xen_blkif_init(void)
>>  	if (!xen_domain())
>>  		return -ENODEV;
>>
>> +	if (xen_blkif_max_ring_order > XENBUS_MAX_RING_PAGE_ORDER) {
>> +		pr_info("Invalid max_ring_order (%d), will use default max: %d.\n",
>> +			xen_blkif_max_ring_order, XENBUS_MAX_RING_PAGE_ORDER);
>> +		xen_blkif_max_ring_order = XENBUS_MAX_RING_PAGE_ORDER;
>> +	}
>> +
>>  	rc = xen_blkif_interface_init();
>>  	if (rc)
>>  		goto failed_init;
>> diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h
>> index f620b5d..919a1ab 100644
>> --- a/drivers/block/xen-blkback/common.h
>> +++ b/drivers/block/xen-blkback/common.h
>> @@ -44,6 +44,7 @@
>>  #include <xen/interface/io/blkif.h>
>>  #include <xen/interface/io/protocols.h>
>>
>> +extern unsigned int xen_blkif_max_ring_order;
>>  /*
>>   * This is the maximum number of segments that would be allowed in indirect
>>   * requests. This value will also be passed to the frontend.
>> @@ -248,7 +249,7 @@ struct backend_info;
>>  #define PERSISTENT_GNT_WAS_ACTIVE	1
>>
>>  /* Number of requests that we can fit in a ring */
>> -#define XEN_BLKIF_REQS			32
>> +#define XEN_MAX_BLKIF_REQS		(32 * XENBUS_MAX_RING_PAGES)
>>
>>  struct persistent_gnt {
>>  	struct page *page;
>> diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c
>> index 6ab69ad..bc33888 100644
>> --- a/drivers/block/xen-blkback/xenbus.c
>> +++ b/drivers/block/xen-blkback/xenbus.c
>> @@ -25,6 +25,7 @@
>>
>>  /* Enlarge the array size in order to fully show blkback name. */
>>  #define BLKBACK_NAME_LEN (20)
>> +#define RINGREF_NAME_LEN (20)
>>
>>  struct backend_info {
>>  	struct xenbus_device	*dev;
>> @@ -152,7 +153,7 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid)
>>  	INIT_LIST_HEAD(&blkif->pending_free);
>>  	INIT_WORK(&blkif->free_work, xen_blkif_deferred_free);
>>
>> -	for (i = 0; i < XEN_BLKIF_REQS; i++) {
>> +	for (i = 0; i < XEN_MAX_BLKIF_REQS; i++) {
> 
> How big is XEN_MAX_BLKIF_REQS? These allocations are per-instance so I'd be concerned that the increase in the number of allocations would hit system scalability.
> 

Right, Roger and I have agreed to delay request allocation (including the indirect-page-related memory) until we know the exact value.
That will be in a new patch soon.
"
Ack. As said, we have been doing this for a long time. When I added
indirect descriptors I've also allocated everything before knowing if
indirect descriptors will be used or not.

Maybe it's time to change that and provide a way to allocate how many
requests we need, and which fields should be allocated based on the
supported features.
"

Thanks,
-Bob


Thread overview: 4+ messages
2015-05-21 23:59 [PATCH v2 1/2] driver: xen-blkfront: move talk_to_blkback to a more suitable place Bob Liu
2015-05-21 23:59 ` [PATCH v5 2/2] xen/block: add multi-page ring support Bob Liu
2015-05-22  8:31   ` Paul Durrant
2015-05-22  9:03     ` Bob Liu [this message]
