From: Mel Gorman <mgorman@suse.de>
To: Glauber Costa <glommer@parallels.com>
Cc: Glauber Costa <glommer@openvz.org>,
linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
cgroups@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
Johannes Weiner <hannes@cmpxchg.org>,
Michal Hocko <mhocko@suse.cz>,
hughd@google.com, Greg Thelen <gthelen@google.com>,
linux-fsdevel@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH v5 08/31] list: add a new LRU list type
Date: Fri, 10 May 2013 10:21:05 +0100
Message-ID: <20130510092105.GK11497@suse.de>
In-Reply-To: <518C0ECF.8010302@parallels.com>

On Fri, May 10, 2013 at 01:02:07AM +0400, Glauber Costa wrote:
> On 05/09/2013 05:37 PM, Mel Gorman wrote:
> > On Thu, May 09, 2013 at 10:06:25AM +0400, Glauber Costa wrote:
> >> From: Dave Chinner <dchinner@redhat.com>
> >>
> >> Several subsystems use the same construct for LRU lists - a list
> >> head, a spin lock and an item count. They also use exactly the same
> >> code for adding and removing items from the LRU. Create a generic
> >> type for these LRU lists.
> >>
> >> This is the beginning of generic, node aware LRUs for shrinkers to
> >> work with.
> >>
> >> [ glommer: enum defined constants for lru. Suggested by gthelen,
> >> don't relock over retry ]
> >> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> >> Signed-off-by: Glauber Costa <glommer@openvz.org>
> >> Reviewed-by: Greg Thelen <gthelen@google.com>
> >>>
> >>> <SNIP>
> >>>
> >> +
> >> +unsigned long
> >> +list_lru_walk(
> >> +	struct list_lru *lru,
> >> +	list_lru_walk_cb isolate,
> >> +	void *cb_arg,
> >> +	long nr_to_walk)
> >> +{
> >> +	struct list_head *item, *n;
> >> +	unsigned long removed = 0;
> >> +
> >> +	spin_lock(&lru->lock);
> >> +restart:
> >> +	list_for_each_safe(item, n, &lru->list) {
> >> +		enum lru_status ret;
> >> +
> >> +		if (nr_to_walk-- < 0)
> >> +			break;
> >> +
> >> +		ret = isolate(item, &lru->lock, cb_arg);
> >> +		switch (ret) {
> >> +		case LRU_REMOVED:
> >> +			lru->nr_items--;
> >> +			removed++;
> >> +			break;
> >> +		case LRU_ROTATE:
> >> +			list_move_tail(item, &lru->list);
> >> +			break;
> >> +		case LRU_SKIP:
> >> +			break;
> >> +		case LRU_RETRY:
> >> +			goto restart;
> >> +		default:
> >> +			BUG();
> >> +		}
> >> +	}
> >
> > What happened your suggestion to only retry once for each object to
> > avoid any possibility of infinite looping or stalling for prolonged
> > periods of time waiting on XFS to do something?
> >
>
> Sorry. It wasn't clear for me if you were just trying to make sure we
> had a way out in case it proves to be a problem, or actually wanted a
> change.
>
Either. If you are sure there is a way out for XFS using LRU_RETRY without
prolonged stalls then it's fine. If it is not certain then I would be much
more comfortable with a retry-once and then moving on to the next LRU node.

> In any case, I cannot claim to be as knowledgeable as Dave in the
> subtleties of such things in the final behavior of the shrinker. Dave,
> can you give us your input here?
>
> I also have another recent observation on this:
>
> The main difference between LRU_SKIP and LRU_RETRY is that LRU_RETRY
> will go back to the beginning of the list, and start scanning it again.
>
Only sort of true. Let's say we had a list of 8 LRU nodes. Nodes 1-3 get
isolated. Node 4 returns LRU_RETRY so we goto restart. The first item on
the list is now potentially the one that returned LRU_RETRY, which must be
handled again before reaching nodes 5-8.

LRU_SKIP is different. If node 4 returned LRU_SKIP then nodes 5-8 are
ignored entirely. Actually..... why is that? LRU_SKIP is documented as
"item cannot be locked, skip" but what it actually does is "item cannot
be locked, abort the walk". Its documented behaviour implies continue,
not break.

	case LRU_SKIP:
		continue;

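That said, the premise is worth double-checking against the code as quoted:
in C, a break inside a switch leaves only the switch, not the enclosing
loop, so the walk may well carry on to the next item after LRU_SKIP. A
minimal userspace sketch of the difference between break and continue once
there is a statement after the switch. Only the enum names come from the
patch; walk() and struct walk_result are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum lru_status { LRU_REMOVED, LRU_ROTATE, LRU_SKIP, LRU_RETRY };

/* Hypothetical result type: items seen vs. items that reached the
 * statement placed after the switch. */
struct walk_result {
	size_t visited;
	size_t accounted;
};

/*
 * Hypothetical stand-in for the list walk. status[i] is what the
 * isolate callback would have returned for item i.
 */
static struct walk_result walk(const enum lru_status *status, size_t n,
			       int skip_with_continue)
{
	struct walk_result r = { 0, 0 };

	for (size_t i = 0; i < n; i++) {
		r.visited++;

		switch (status[i]) {
		case LRU_SKIP:
			if (skip_with_continue)
				continue; /* next iteration, bypasses the accounting below */
			break;            /* leaves the switch only, the loop goes on */
		case LRU_REMOVED:
		case LRU_ROTATE:
		case LRU_RETRY:
			break;
		default:
			assert(0 && "unknown lru_status");
		}

		r.accounted++;            /* stands in for code placed after the switch */
	}
	return r;
}
```

With break, every item is visited and accounted. With continue, the skipped
item bypasses whatever follows the switch, which would matter if the
nr_to_walk accounting moves to the end of the loop body.
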
> This is *not* the same behavior we had before, where we used to read:
>
> 	for (nr_scanned = nr_to_scan; nr_scanned >= 0; nr_scanned--) {
> 		struct inode *inode;
> 		[ ... ]
>
> 		if (inode_has_buffers(inode) || inode->i_data.nrpages) {
> 			__iget(inode);
> 			[ ... ]
> 			iput(inode);
> 			spin_lock(&sb->s_inode_lru_lock);
>
> 			if (inode != list_entry(sb->s_inode_lru.next,
> 						struct inode, i_lru))
> 				continue; <=====
> 			/* avoid lock inversions with trylock */
> 			if (!spin_trylock(&inode->i_lock))
> 				continue; <=====
> 			if (!can_unuse(inode)) {
> 				spin_unlock(&inode->i_lock);
> 				continue; <=====
> 			}
> 		}
>
> It is my interpretation that in here, we won't really reset the
> search, but just skip this inode.
>
> Another problem is that by restarting the search the way we are doing
> now, we actually decrement nr_to_walk twice in case of a retry. By doing
> a retry-once test, we can actually move nr_to_walk to the end of the
> switch statement, which has the good side effect of getting rid of the
> reason we had to allow it to go negative.
>
> How about we fold the following attached patch to this one? (I would
> still have to give it a round of testing)
>
> diff --git a/lib/list_lru.c b/lib/list_lru.c
> index da9b837..4aa069b 100644
> --- a/lib/list_lru.c
> +++ b/lib/list_lru.c
> @@ -195,12 +195,10 @@ list_lru_walk_node(
>  	unsigned long isolated = 0;
>
>  	spin_lock(&nlru->lock);
> -restart:
>  	list_for_each_safe(item, n, &nlru->list) {
> +		bool first_pass = true;
>  		enum lru_status ret;
> -
> -		if ((*nr_to_walk)-- < 0)
> -			break;
> +restart:
>
>  		ret = isolate(item, &nlru->lock, cb_arg);
>  		switch (ret) {
> @@ -217,10 +215,17 @@ restart:
>  		case LRU_SKIP:
>  			break;
>  		case LRU_RETRY:
> +			if (!first_pass)
> +				break;
> +			first_pass = true;
>  			goto restart;

I think this is generally much safer and less likely to lead to bug reports
about occasional long stalls during slab shrink.

Similar to the LRU_SKIP comment above, should this be continue, though, to
actually skip the LRU node instead of aborting the LRU walk?

>  		default:
>  			BUG();
>  		}
> +
> +		if ((*nr_to_walk)-- == 0)
> +			break;
> +
>  	}
>  	spin_unlock(&nlru->lock);
>  	return isolated;
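
For reference, a minimal userspace sketch of the retry-once walk being
discussed, under my own assumptions: an array stands in for the list, there
is no locking, walk_retry_once() and isolate_demo() are hypothetical names,
and the budget check is written so that exactly nr_to_walk items are
visited. The flag has to be cleared before the retry for the guard to fire,
so the sketch does that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum lru_status { LRU_REMOVED, LRU_ROTATE, LRU_SKIP, LRU_RETRY };

typedef enum lru_status (*isolate_fn)(int item, void *cb_arg);

/*
 * Hypothetical callback: item 4 is always busy, everything else can be
 * freed. cb_arg points at a call counter.
 */
static enum lru_status isolate_demo(int item, void *cb_arg)
{
	unsigned long *calls = cb_arg;

	(*calls)++;
	return item == 4 ? LRU_RETRY : LRU_REMOVED;
}

/*
 * Walk the items, retrying an item that returns LRU_RETRY at most once
 * before moving on. Returns the number of items reported LRU_REMOVED.
 */
static unsigned long walk_retry_once(const int *items, size_t n,
				     isolate_fn isolate, void *cb_arg,
				     long *nr_to_walk)
{
	unsigned long isolated = 0;

	for (size_t i = 0; i < n && *nr_to_walk > 0; i++) {
		bool first_pass = true;
		enum lru_status ret;
restart:
		ret = isolate(items[i], cb_arg);
		switch (ret) {
		case LRU_REMOVED:
			isolated++;
			break;
		case LRU_ROTATE:
		case LRU_SKIP:
			break;
		case LRU_RETRY:
			if (!first_pass)
				break;		/* already retried once, give up on this item */
			first_pass = false;	/* cleared so the second LRU_RETRY falls through */
			goto restart;
		default:
			assert(0 && "unknown lru_status");
		}

		(*nr_to_walk)--;		/* one unit of budget per item, retried or not */
	}
	return isolated;
}
```

Because the budget is charged after the switch, a retried item costs one
unit rather than two, and nr_to_walk never needs to go negative.
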
--
Mel Gorman
SUSE Labs