All of lore.kernel.org
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: "Roger Pau Monné" <roger.pau@citrix.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Subject: Re: [PATCH RFC 06/12] xen-blkback: implement LRU mechanism for persistent grants
Date: Tue, 5 Mar 2013 16:49:50 -0500	[thread overview]
Message-ID: <20130305214950.GE8235@phenom.dumpdata.com> (raw)
In-Reply-To: <513634FC.7000401@citrix.com>

On Tue, Mar 05, 2013 at 07:10:04PM +0100, Roger Pau Monné wrote:
> On 04/03/13 21:10, Konrad Rzeszutek Wilk wrote:
> > On Thu, Feb 28, 2013 at 11:28:49AM +0100, Roger Pau Monne wrote:
> >> This mechanism allows blkback to change the number of grants
> >> persistently mapped at run time.
> >>
> >> The algorithm uses a simple LRU mechanism that removes (if needed) the
> >> persistent grants that have not been used since the last LRU run, or
> >> if all grants have been used it removes the first grants in the list
> >> (that are not in use).
> >>
> >> The algorithm has several parameters that can be tuned by the user
> >> from sysfs:
> >>
> >>  * max_persistent_grants: maximum number of grants that will be
> >>    persistently mapped.
> >>  * lru_interval: minimum interval (in ms) at which the LRU should be
> >>    run
> >>  * lru_num_clean: number of persistent grants to remove when executing
> >>    the LRU.
> >>
> >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> >> Cc: xen-devel@lists.xen.org
> >> ---
> >>  drivers/block/xen-blkback/blkback.c |  207 +++++++++++++++++++++++++++--------
> >>  drivers/block/xen-blkback/common.h  |    4 +
> >>  drivers/block/xen-blkback/xenbus.c  |    1 +
> >>  3 files changed, 166 insertions(+), 46 deletions(-)
> > 
> > You also should add a Documentation/sysfs/ABI/stable/sysfs-bus-xen-backend
> 
> OK
> 
> >>
> >> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
> >> index 415a0c7..c14b736 100644
> >> --- a/drivers/block/xen-blkback/blkback.c
> >> +++ b/drivers/block/xen-blkback/blkback.c
> >> @@ -63,6 +63,44 @@ static int xen_blkif_reqs = 64;
> >>  module_param_named(reqs, xen_blkif_reqs, int, 0);
> >>  MODULE_PARM_DESC(reqs, "Number of blkback requests to allocate");
> >>
> >> +/*
> >> + * Maximum number of grants to map persistently in blkback. For maximum
> >> + * performance this should be the total numbers of grants that can be used
> >> + * to fill the ring, but since this might become too high, specially with
> >> + * the use of indirect descriptors, we set it to a value that provides good
> >> + * performance without using too much memory.
> >> + *
> >> + * When the list of persistent grants is full we clean it using a LRU
> >> + * algorithm.
> >> + */
> >> +
> >> +static int xen_blkif_max_pgrants = 352;
> > 
> > And a little blurb saying why 352.
> 
> Yes, this is (as you probably already realized) RING_SIZE *
> BLKIF_MAX_SEGMENTS_PER_REQUEST
> 
> > 
> >> +module_param_named(max_persistent_grants, xen_blkif_max_pgrants, int, 0644);
> >> +MODULE_PARM_DESC(max_persistent_grants,
> >> +                 "Maximum number of grants to map persistently");
> >> +
> >> +/*
> >> + * The LRU mechanism to clean the lists of persistent grants needs to
> >> + * be executed periodically. The time interval between consecutive executions
> >> + * of the purge mechanism is set in ms.
> >> + */
> >> +
> >> +static int xen_blkif_lru_interval = 100;
> > 
> > So every second? What is the benefit of having the user modify this? Would
> > it better if there was an watermark system in xen-blkfront to automatically
> > figure this out? (This could be a TODO of course)
> 
> Every 100ms, so every 0.1 seconds. This can have an impact on
> performance as implemented right now (if we move the purge to a separate
> thread, it's not going to have such an impact), but anyway I feel we can
> let the user tune it.

Why not automatically tune it in the backend? So it can do this by itself?
> 
> >> +module_param_named(lru_interval, xen_blkif_lru_interval, int, 0644);
> >> +MODULE_PARM_DESC(lru_interval,
> >> +"Execution interval (in ms) of the LRU mechanism to clean the list of persistent grants");
> >> +
> >> +/*
> >> + * When the persistent grants list is full we will remove unused grants
> >> + * from the list. The number of grants to be removed at each LRU execution
> >> + * can be set dynamically.
> >> + */
> >> +
> >> +static int xen_blkif_lru_num_clean = BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> +module_param_named(lru_num_clean, xen_blkif_lru_num_clean, int, 0644);
> >> +MODULE_PARM_DESC(lru_num_clean,
> >> +"Number of persistent grants to unmap when the list is full");
> > 
> > Again, what does that mean to the system admin? Why would they need to modify
> > the contents of that value?
> > 
> 
> Well if you set the maximum number of grants to 1024 you might want to
> increase this to 64 maybe, or on the other hand, if you set the maximum
> number of grants to 10, you may wish to set this to 1, so I think it is
> indeed relevant for system admins.

So why not make this automatic? A value blkback can automatically
adjust as there are less or more grants. This of course does not have
to be part of this patch.
> 
> > Now if this is a debug related one for developing, then this could all be
> > done in DebugFS.
> > 
> >> +
> >>  /* Run-time switchable: /sys/module/blkback/parameters/ */
> >>  static unsigned int log_stats;
> >>  module_param(log_stats, int, 0644);
> >> @@ -81,7 +119,7 @@ struct pending_req {
> >>       unsigned short          operation;
> >>       int                     status;
> >>       struct list_head        free_list;
> >> -     DECLARE_BITMAP(unmap_seg, BLKIF_MAX_SEGMENTS_PER_REQUEST);
> >> +     struct persistent_gnt   *persistent_gnts[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >>  };
> >>
> >>  #define BLKBACK_INVALID_HANDLE (~0)
> >> @@ -102,36 +140,6 @@ struct xen_blkbk {
> >>  static struct xen_blkbk *blkbk;
> >>
> >>  /*
> >> - * Maximum number of grant pages that can be mapped in blkback.
> >> - * BLKIF_MAX_SEGMENTS_PER_REQUEST * RING_SIZE is the maximum number of
> >> - * pages that blkback will persistently map.
> >> - * Currently, this is:
> >> - * RING_SIZE = 32 (for all known ring types)
> >> - * BLKIF_MAX_SEGMENTS_PER_REQUEST = 11
> >> - * sizeof(struct persistent_gnt) = 48
> >> - * So the maximum memory used to store the grants is:
> >> - * 32 * 11 * 48 = 16896 bytes
> >> - */
> >> -static inline unsigned int max_mapped_grant_pages(enum blkif_protocol protocol)
> >> -{
> >> -     switch (protocol) {
> >> -     case BLKIF_PROTOCOL_NATIVE:
> >> -             return __CONST_RING_SIZE(blkif, PAGE_SIZE) *
> >> -                        BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -     case BLKIF_PROTOCOL_X86_32:
> >> -             return __CONST_RING_SIZE(blkif_x86_32, PAGE_SIZE) *
> >> -                        BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -     case BLKIF_PROTOCOL_X86_64:
> >> -             return __CONST_RING_SIZE(blkif_x86_64, PAGE_SIZE) *
> >> -                        BLKIF_MAX_SEGMENTS_PER_REQUEST;
> >> -     default:
> >> -             BUG();
> >> -     }
> >> -     return 0;
> >> -}
> >> -
> >> -
> >> -/*
> >>   * Little helpful macro to figure out the index and virtual address of the
> >>   * pending_pages[..]. For each 'pending_req' we have have up to
> >>   * BLKIF_MAX_SEGMENTS_PER_REQUEST (11) pages. The seg would be from 0 through
> >> @@ -251,6 +259,76 @@ static void free_persistent_gnts(struct rb_root *root, unsigned int num)
> >>       BUG_ON(num != 0);
> >>  }
> >>
> >> +static int purge_persistent_gnt(struct rb_root *root, int num)
> >> +{
> >> +     struct gnttab_unmap_grant_ref unmap[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >> +     struct page *pages[BLKIF_MAX_SEGMENTS_PER_REQUEST];
> >> +     struct persistent_gnt *persistent_gnt;
> >> +     struct rb_node *n;
> >> +     int ret, segs_to_unmap = 0;
> >> +     int requested_num = num;
> >> +     int preserve_used = 1;
> > 
> > Boolean? And perhaps 'scan_dirty' ?
> 
> Sure
> 
> > 
> > 
> >> +
> >> +     pr_debug("Requested the purge of %d persistent grants\n", num);
> >> +
> >> +purge_list:
> > 
> > This could be written a bit differently to also run outside the xen_blkif_schedule
> > (so a new thread). This would require using the lock mechanism and converting
> > this big loop to two smaller loops:
> >  1) - one quick that holds the lock - to take the items of the list,
> >  2) second one to do the grant_set_unmap_op operations and all the heavy
> >     free_xenballooned_pages call.
> 
> Yes, I could add a list_head to persistent_gnt, so we can take them out
> of the red-black tree and queue them in a list to be processed (unmap +
> free) after we have looped thought the list, without holding the lock.
> 
> > 
> > .. As this function ends up (presumarily?) causing xen_blkif_schedule to be doing
> > this for some time every second. Irregardless of how utilized the ring is - so
> > if we are 100% busy - we should not need to call this function. But if we do,
> > then we end up walking the persistent_gnt twice - once with preserve_used set
> > to true, and the other with it set to false.
> > 
> > We don't really want that - so is there a way for xen_blkif_schedule to
> > do a quick determintion of this caliber:
> > 
> > 
> >         if (RING_HAS_UNCONSUMED_REQUESTS(x) >= some_value)
> >                 wait_up(blkif->purgarator)
> 
> It's not possible to tell if all grants will be in use just by looking
> at the number of active requests, since all requests might just be using
> one segment, and thus the list of persistent grants could be purged
> without problems. We could keep a count of the number of active grants
> at each time and use that to check if we can kick the purge or not.
> 
> if (grants_in_use > (persistent_gnt_c - num_purge))
> 	wait(...)

Sure.
> 
> > And the thread would just sit there until kicked in action?
> 
> And when a request frees some grants it could be kicked back to action.

OK.

  parent reply	other threads:[~2013-03-05 21:49 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-02-28 10:28 [PATCH RFC 00/12] xen-block: indirect descriptors Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 01/12] xen-blkback: don't store dev_bus_addr Roger Pau Monne
2013-02-28 10:28 ` Roger Pau Monne
2013-02-28 10:58   ` [Xen-devel] " Jan Beulich
2013-03-04 17:19     ` Roger Pau Monné
2013-03-04 17:19     ` [Xen-devel] " Roger Pau Monné
2013-03-05  8:06       ` Jan Beulich
2013-03-05 17:02         ` Roger Pau Monné
2013-03-05 17:02         ` [Xen-devel] " Roger Pau Monné
2013-03-05  8:06       ` Jan Beulich
2013-02-28 10:58   ` Jan Beulich
2013-02-28 10:28 ` [PATCH RFC 02/12] xen-blkback: fix foreach_grant_safe to handle empty lists Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 03/12] xen-blkfront: switch from llist to list Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 04/12] xen-blkfront: pre-allocate pages for requests Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-03-04 19:39   ` Konrad Rzeszutek Wilk
2013-03-04 19:39   ` Konrad Rzeszutek Wilk
2013-03-05 11:04     ` Roger Pau Monné
2013-03-05 14:18       ` Konrad Rzeszutek Wilk
2013-03-05 14:18       ` Konrad Rzeszutek Wilk
2013-03-05 16:30         ` Roger Pau Monné
2013-03-05 16:30         ` Roger Pau Monné
2013-03-05 21:53           ` Konrad Rzeszutek Wilk
2013-03-06  9:17             ` Roger Pau Monné
2013-03-06  9:17             ` Roger Pau Monné
2013-03-05 21:53           ` Konrad Rzeszutek Wilk
2013-03-05 11:04     ` Roger Pau Monné
2013-02-28 10:28 ` [PATCH RFC 05/12] xen-blkfront: remove frame list from blk_shadow Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 06/12] xen-blkback: implement LRU mechanism for persistent grants Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-03-04 20:10   ` Konrad Rzeszutek Wilk
2013-03-05 18:10     ` Roger Pau Monné
2013-03-05 21:49       ` Konrad Rzeszutek Wilk
2013-03-05 21:49       ` Konrad Rzeszutek Wilk [this message]
2013-03-18 17:00         ` Roger Pau Monné
2013-03-18 17:00         ` Roger Pau Monné
2013-03-05 18:10     ` Roger Pau Monné
2013-03-04 20:10   ` Konrad Rzeszutek Wilk
2013-02-28 10:28 ` [PATCH RFC 07/12] xen-blkback: print stats about " Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 08/12] xen-blkback: use balloon pages for all mappings Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-03-04 20:22   ` Konrad Rzeszutek Wilk
2013-03-26 17:30     ` Roger Pau Monné
2013-03-26 17:30     ` Roger Pau Monné
2013-03-26 17:48     ` Roger Pau Monné
2013-03-26 17:48     ` Roger Pau Monné
2013-03-04 20:22   ` Konrad Rzeszutek Wilk
2013-02-28 10:28 ` [PATCH RFC 09/12] xen-blkback: move pending handles list from blkbk to pending_req Roger Pau Monne
2013-02-28 11:07   ` Jan Beulich
2013-02-28 11:07   ` [Xen-devel] " Jan Beulich
2013-02-28 10:28 ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 10/12] xen-blkback: make the queue of free requests per backend Roger Pau Monne
2013-02-28 10:28   ` Roger Pau Monne
2013-02-28 11:08   ` [Xen-devel] " Jan Beulich
2013-02-28 11:08   ` Jan Beulich
2013-02-28 10:28 ` [PATCH RFC 11/12] xen-blkback: expand map/unmap functions Roger Pau Monne
2013-02-28 10:28 ` Roger Pau Monne
2013-02-28 10:28 ` [PATCH RFC 12/12] xen-block: implement indirect descriptors Roger Pau Monne
2013-02-28 10:28 ` Roger Pau Monne
2013-02-28 11:19   ` [Xen-devel] " Jan Beulich
2013-02-28 12:00     ` Roger Pau Monné
2013-02-28 13:28       ` Jan Beulich
2013-02-28 13:28       ` [Xen-devel] " Jan Beulich
2013-03-04 20:44         ` Konrad Rzeszutek Wilk
2013-03-05  8:11           ` Jan Beulich
2013-03-05 14:16             ` Konrad Rzeszutek Wilk
2013-03-05 17:00               ` Roger Pau Monné
2013-03-05 21:45                 ` Konrad Rzeszutek Wilk
2013-03-05 21:45                 ` Konrad Rzeszutek Wilk
2013-03-05 17:00               ` Roger Pau Monné
2013-03-05 14:16             ` Konrad Rzeszutek Wilk
2013-03-05  8:11           ` Jan Beulich
2013-03-04 20:44         ` Konrad Rzeszutek Wilk
2013-02-28 12:00     ` Roger Pau Monné
2013-02-28 11:19   ` Jan Beulich
2013-03-04 20:41   ` Konrad Rzeszutek Wilk
2013-03-05 17:07     ` Roger Pau Monné
2013-03-05 21:46       ` Konrad Rzeszutek Wilk
2013-03-05 21:46       ` Konrad Rzeszutek Wilk
2013-03-08 17:07         ` Roger Pau Monné
2013-03-22  1:10           ` Konrad Rzeszutek Wilk
2013-03-22  1:10           ` Konrad Rzeszutek Wilk
2013-03-08 17:07         ` Roger Pau Monné
2013-03-05 17:07     ` Roger Pau Monné
2013-03-04 20:41   ` Konrad Rzeszutek Wilk
2013-03-18 17:06   ` Roger Pau Monné
2013-03-18 17:06   ` Roger Pau Monné
2013-03-19 14:38     ` Konrad Rzeszutek Wilk
2013-03-19 14:38     ` Konrad Rzeszutek Wilk
2013-02-28 10:49 ` [PATCH RFC 00/12] xen-block: " Jan Beulich
2013-02-28 10:49 ` [Xen-devel] " Jan Beulich
2013-02-28 11:25   ` Roger Pau Monné
2013-02-28 11:25   ` [Xen-devel] " Roger Pau Monné
2013-02-28 11:35     ` Jan Beulich
2013-02-28 11:35     ` [Xen-devel] " Jan Beulich
2013-02-28 11:44       ` Roger Pau Monné
2013-02-28 11:44       ` [Xen-devel] " Roger Pau Monné

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20130305214950.GE8235@phenom.dumpdata.com \
    --to=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=roger.pau@citrix.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.