All of lore.kernel.org
 help / color / mirror / Atom feed
From: Glauber Costa <glommer@parallels.com>
To: Mel Gorman <mgorman@suse.de>
Cc: Glauber Costa <glommer@openvz.org>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Greg Thelen <gthelen@google.com>,
	kamezawa.hiroyu@jp.fujitsu.com, Michal Hocko <mhocko@suse.cz>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH v4 08/31] list: add a new LRU list type
Date: Tue, 30 Apr 2013 20:01:41 +0400	[thread overview]
Message-ID: <517FEAE5.2010809@parallels.com> (raw)
In-Reply-To: <20130430151854.GH6415@suse.de>

>> +
>> +struct list_lru {
>> +	spinlock_t		lock;
>> +	struct list_head	list;
>> +	long			nr_items;
>> +};
> 
> Is there ever a circumstance where nr_items is negative? If so, what
> does that mean?
> 

No, but we would like to be able to detect it and BUG (which we actually do)


> 
>> +int list_lru_add(struct list_lru *lru, struct list_head *item);
>> +int list_lru_del(struct list_lru *lru, struct list_head *item);
>> +
> 
> However, these are bool and the return value determines if the item was
> really added to the list or not. It fails if the item is already part of
> a list and it would be very nice to have a comment explaining why it's
> not a bug if this happens because it feels like it would be a lookup and
> insertion race. Maybe it's clear later in the series why this is ok but
> it's not very obvious at this point.
> 

I actually don't know.
I would appreciate Dave's comments on this one. Dave?

>> +static inline long list_lru_count(struct list_lru *lru)
>> +{
>> +	return lru->nr_items;
>> +}
>> +
>> +typedef enum lru_status
>> +(*list_lru_walk_cb)(struct list_head *item, spinlock_t *lock, void *cb_arg);
>> +
>> +typedef void (*list_lru_dispose_cb)(struct list_head *dispose_list);
>> +
>> +long list_lru_walk(struct list_lru *lru, list_lru_walk_cb isolate,
>> +		   void *cb_arg, long nr_to_walk);
>> +
> 
> Is nr_to_walk ever negative?
> 

Shouldn't be, and this one, we don't BUG.

>> +long list_lru_dispose_all(struct list_lru *lru, list_lru_dispose_cb dispose);
>> +
>> +#endif /* _LRU_LIST_H */
>> diff --git a/lib/Makefile b/lib/Makefile
>> index af79e8c..40a6d4a 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -13,7 +13,7 @@ lib-y := ctype.o string.o vsprintf.o cmdline.o \
>>  	 sha1.o md5.o irq_regs.o reciprocal_div.o argv_split.o \
>>  	 proportions.o flex_proportions.o prio_heap.o ratelimit.o show_mem.o \
>>  	 is_single_threaded.o plist.o decompress.o kobject_uevent.o \
>> -	 earlycpio.o percpu-refcount.o
>> +	 earlycpio.o percpu-refcount.o list_lru.o
>>  
>>  obj-$(CONFIG_ARCH_HAS_DEBUG_STRICT_USER_COPY_CHECKS) += usercopy.o
>>  lib-$(CONFIG_MMU) += ioremap.o
>> diff --git a/lib/list_lru.c b/lib/list_lru.c
>> new file mode 100644
>> index 0000000..937ee87
>> --- /dev/null
>> +++ b/lib/list_lru.c
>> @@ -0,0 +1,118 @@
>> +/*
>> + * Copyright (c) 2010-2012 Red Hat, Inc. All rights reserved.
>> + * Author: David Chinner
>> + *
>> + * Generic LRU infrastructure
>> + */
>> +#include <linux/kernel.h>
>> +#include <linux/module.h>
>> +#include <linux/list_lru.h>
>> +
>> +int
>> +list_lru_add(
>> +	struct list_lru	*lru,
>> +	struct list_head *item)
>> +{
>> +	spin_lock(&lru->lock);
>> +	if (list_empty(item)) {
>> +		list_add_tail(item, &lru->list);
>> +		lru->nr_items++;
>> +		spin_unlock(&lru->lock);
>> +		return 1;
>> +	}
>> +	spin_unlock(&lru->lock);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(list_lru_add);
>> +
>> +int
>> +list_lru_del(
>> +	struct list_lru	*lru,
>> +	struct list_head *item)
>> +{
>> +	spin_lock(&lru->lock);
>> +	if (!list_empty(item)) {
>> +		list_del_init(item);
>> +		lru->nr_items--;
>> +		spin_unlock(&lru->lock);
>> +		return 1;
>> +	}
>> +	spin_unlock(&lru->lock);
>> +	return 0;
>> +}
>> +EXPORT_SYMBOL_GPL(list_lru_del);
>> +
>> +long
>> +list_lru_walk(
> 
> It returns long but never actually returns a negative number.
> 
>> +	struct list_lru *lru,
>> +	list_lru_walk_cb isolate,
>> +	void		*cb_arg,
>> +	long		nr_to_walk)
>> +{
>> +	struct list_head *item, *n;
>> +	long removed = 0;
>> +
>> +	spin_lock(&lru->lock);
>> +restart:
>> +	list_for_each_safe(item, n, &lru->list) {
>> +		enum lru_status ret;
>> +
>> +		if (nr_to_walk-- < 0)
>> +			break;
>> +
>> +		ret = isolate(item, &lru->lock, cb_arg);
>> +		switch (ret) {
>> +		case LRU_REMOVED:
>> +			lru->nr_items--;
>> +			removed++;
>> +			break;
>> +		case LRU_ROTATE:
>> +			list_move_tail(item, &lru->list);
>> +			break;
>> +		case LRU_SKIP:
>> +			break;
>> +		case LRU_RETRY:
>> +			goto restart;
> 
> Are the two users of LRU_RETRY guaranteed to eventually make progress
> or can this infinite loop? It feels like the behaviour of LRU_RETRY is
> not very desirable. Once an object that returns LRU_RETRY is isolated on
> the list then it looks like XFS can stall for 100 ticks (bit arbitrary)
> each time it tries and maybe do this forever.
> 
> The inode.c user looks like it could race where some other process is
> reallocating the buffers between each time this isolate callback tries to
> isolate it. Granted, it may accidentally break out because the spinlock
> is contended and it returns LRU_SKIP so maybe no one will hit the problem
> but if it's ok with LRU_SKIP then maybe a LRU_RETRY_ONCE would also be
> suitable and use that in fs/inode.c?
> 
> Either hypothetical situation would require that the list you are trying
> to walk is very small but maybe memcg will hit that problem. If there is
> a guarantee of forward progress then a comment would be nice.
> 

I have to look into this issue further, which I plan to do soon, but
can't today. Meanwhile, This is really a lot more under Dave's umbrella,
so Dave, if you would give us the honor =)

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-04-30 16:01 UTC|newest]

Thread overview: 105+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-04-26 23:18 [PATCH v4 00/31] kmemcg shrinkers Glauber Costa
2013-04-26 23:18 ` [PATCH v4 01/31] super: fix calculation of shrinkable objects for small numbers Glauber Costa
2013-04-30 13:03   ` Mel Gorman
2013-04-26 23:18 ` [PATCH v4 02/31] vmscan: take at least one pass with shrinkers Glauber Costa
     [not found]   ` <1367018367-11278-3-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 13:22     ` Mel Gorman
2013-04-30 13:22       ` Mel Gorman
2013-04-30 13:31       ` Glauber Costa
     [not found]         ` <517FC7B4.5030101-bzQdu9zFT3WakBO8gow8eQ@public.gmane.org>
2013-04-30 15:37           ` Mel Gorman
2013-04-30 15:37             ` Mel Gorman
2013-05-07 13:35             ` Glauber Costa
     [not found] ` <1367018367-11278-1-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-26 23:18   ` [PATCH v4 03/31] dcache: convert dentry_stat.nr_unused to per-cpu counters Glauber Costa
2013-04-26 23:18     ` Glauber Costa
2013-04-30 13:37     ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 04/31] dentry: move to per-sb LRU locks Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-5-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 14:01       ` Mel Gorman
2013-04-30 14:01         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 05/31] dcache: remove dentries from LRU before putting on dispose list Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-6-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 14:14       ` Mel Gorman
2013-04-30 14:14         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 06/31] mm: new shrinker API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-7-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 14:40       ` Mel Gorman
2013-04-30 14:40         ` Mel Gorman
2013-04-30 15:03         ` Glauber Costa
2013-04-30 15:32           ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 07/31] shrinker: convert superblock shrinkers to new API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-30 14:49     ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 08/31] list: add a new LRU list type Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-30 15:18     ` Mel Gorman
2013-04-30 16:01       ` Glauber Costa [this message]
2013-04-26 23:19   ` [PATCH v4 09/31] inode: convert inode lru list to generic lru list code Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-10-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 15:46       ` Mel Gorman
2013-04-30 15:46         ` Mel Gorman
2013-05-07 13:47         ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 10/31] dcache: convert to use new lru list infrastructure Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-11-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 16:04       ` Mel Gorman
2013-04-30 16:04         ` Mel Gorman
2013-04-30 16:13         ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 11/31] list_lru: per-node " Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-12-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 16:33       ` Mel Gorman
2013-04-30 16:33         ` Mel Gorman
2013-04-30 21:44         ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 12/31] shrinker: add node awareness Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-13-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 16:35       ` Mel Gorman
2013-04-30 16:35         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 13/31] fs: convert inode and dentry shrinking to be node aware Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-14-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 17:39       ` Mel Gorman
2013-04-30 17:39         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 14/31] xfs: convert buftarg LRU to generic code Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 15/31] xfs: convert dquot cache lru to list_lru Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 16/31] fs: convert fs shrinkers to new scan/count API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 17/31] drivers: convert shrinkers to new count/scan API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-30 21:53     ` Mel Gorman
     [not found]       ` <20130430215355.GN6415-l3A5Bk7waGM@public.gmane.org>
2013-04-30 22:00         ` Kent Overstreet
2013-04-30 22:00           ` Kent Overstreet
     [not found]           ` <20130430220050.GK9931-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org>
2013-05-02  9:37             ` Mel Gorman
2013-05-02  9:37               ` Mel Gorman
2013-05-02 13:37               ` Glauber Costa
2013-05-01 15:26       ` Daniel Vetter
2013-05-02  9:31         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 18/31] shrinker: convert remaining shrinkers to " Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 19/31] hugepage: convert huge zero page shrinker to new shrinker API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 20/31] shrinker: Kill old ->shrink API Glauber Costa
2013-04-26 23:19     ` Glauber Costa
     [not found]     ` <1367018367-11278-21-git-send-email-glommer-GEFAQzZX7r8dnm+yROfE0A@public.gmane.org>
2013-04-30 21:57       ` Mel Gorman
2013-04-30 21:57         ` Mel Gorman
2013-04-26 23:19   ` [PATCH v4 21/31] vmscan: also shrink slab in memcg pressure Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 22/31] memcg,list_lru: duplicate LRUs upon kmemcg creation Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 23/31] lru: add an element to a memcg list Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 24/31] list_lru: per-memcg walks Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 25/31] memcg: per-memcg kmem shrinking Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 26/31] memcg: scan cache objects hierarchically Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 27/31] super: targeted memcg reclaim Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 28/31] memcg: move initialization to memcg creation Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 29/31] vmpressure: in-kernel notifications Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 30/31] memcg: reap dead memcgs upon global memory pressure Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-26 23:19   ` [PATCH v4 31/31] memcg: debugging facility to access dangling memcgs Glauber Costa
2013-04-26 23:19     ` Glauber Costa
2013-04-30 22:47 ` [PATCH v4 00/31] kmemcg shrinkers Mel Gorman
2013-05-01  9:05   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=517FEAE5.2010809@parallels.com \
    --to=glommer@parallels.com \
    --cc=akpm@linux-foundation.org \
    --cc=cgroups@vger.kernel.org \
    --cc=dchinner@redhat.com \
    --cc=glommer@openvz.org \
    --cc=gthelen@google.com \
    --cc=hannes@cmpxchg.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.