From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Glauber Costa <glommer@openvz.org>
Cc: akpm@linux-foundation.org, linux-fsdevel@vger.kernel.org,
	mgorman@suse.de, david@fromorbit.com, linux-mm@kvack.org,
	cgroups@vger.kernel.org, kamezawa.hiroyu@jp.fujitsu.com,
	mhocko@suze.cz, hannes@cmpxchg.org, hughd@google.com,
	gthelen@google.com, "Dave Chinner" <dchinner@redhat.com>,
	"Daniel Vetter" <daniel.vetter@ffwll.ch>,
	"Kent Overstreet" <koverstreet@google.com>,
	"Arve Hjønnevåg" <arve@android.com>,
	"John Stultz" <john.stultz@linaro.org>,
	"David Rientjes" <rientjes@google.com>,
	"Jerome Glisse" <jglisse@redhat.com>,
	"Thomas Hellstrom" <thellstrom@vmware.com>
Subject: Re: [PATCH v11 20/25] drivers: convert shrinkers to new count/scan API
Date: Fri, 7 Jun 2013 10:10:27 -0400
Message-ID: <20130607141027.GH25649@phenom.dumpdata.com>
In-Reply-To: <1370550898-26711-21-git-send-email-glommer@openvz.org>

On Fri, Jun 07, 2013 at 12:34:53AM +0400, Glauber Costa wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> Convert the driver shrinkers to the new API. Most changes are
> compile tested only because I either don't have the hardware or it's
> staging stuff.
> 
> FWIW, the md and android code is pretty good, but the rest of it
> makes me want to claw my eyes out.  The amount of broken code I just
> encountered is mind boggling.  I've added comments explaining what
> is broken, but I fear that some of the code would be best dealt with
> by being dragged behind the bike shed, burying in mud up to it's
> neck and then run over repeatedly with a blunt lawn mower.

The rest being i915, ttm, bcache, etc.?

> 
> Special mention goes to the zcache/zcache2 drivers. They can't
> co-exist in the build at the same time, they are under different
> menu options in menuconfig, they only show up when you've got the
> right set of mm subsystem options configured and so even compile
> testing is an exercise in pulling teeth.  And that doesn't even take
> into account the horrible, broken code...

Now that you have rebased it, do you still see issues here?

> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> index bd2a3b4..1746f30 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
> @@ -377,28 +377,26 @@ out:
>  	return nr_free;
>  }
>  
> -/* Get good estimation how many pages are free in pools */
> -static int ttm_pool_get_num_unused_pages(void)
> -{
> -	unsigned i;
> -	int total = 0;
> -	for (i = 0; i < NUM_POOLS; ++i)
> -		total += _manager->pools[i].npages;
> -
> -	return total;
> -}
> -

I am unclear as to why you moved this.
>  /**
>   * Callback for mm to request pool to reduce number of page held.
> + *
> + * XXX: (dchinner) Deadlock warning!
> + *
> + * ttm_page_pool_free() does memory allocation using GFP_KERNEL.  that means

That
> + * this can deadlock when called a sc->gfp_mask that is not equal to
> + * GFP_KERNEL.
> + *
> + * This code is crying out for a shrinker per pool....

It iterates over different pools.

I think ttm_page_pool_free() could use GFP_ATOMIC to guard against the
deadlock?
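
To make the suggestion concrete, here is a rough userspace sketch (not
kernel code) of a free routine that takes its allocation flags from the
caller instead of hard-coding a blocking allocation. Every name in it
(mock_gfp, mock_pool, mock_pool_free, mock_shrink_scan) is made up for
illustration, and the flag values are placeholders, not the real GFP_*
bits. The only thing borrowed from the patch is the convention that the
free routine returns how many pages are still left to free.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's gfp flags; not the real values. */
enum mock_gfp { MOCK_GFP_ATOMIC = 0, MOCK_GFP_KERNEL = 1 };

/* Hypothetical pool: only what the sketch needs. */
struct mock_pool {
	unsigned npages;        /* pages currently cached in the pool */
	enum mock_gfp last_gfp; /* flags the last free pass was asked to use */
};

/*
 * Free up to nr_free pages from the pool. The caller supplies the flags
 * that any scratch allocation would use, so a shrinker can pass a
 * non-blocking mask. Returns the number of pages still left to free,
 * matching the return convention used in the quoted patch.
 */
unsigned mock_pool_free(struct mock_pool *pool, unsigned nr_free,
			enum mock_gfp gfp)
{
	unsigned n;

	assert(pool != NULL);
	pool->last_gfp = gfp; /* record which flags were requested */
	n = nr_free < pool->npages ? nr_free : pool->npages;
	pool->npages -= n;
	return nr_free - n;
}

/* Scan callback model: always asks for the non-blocking flags. */
long mock_shrink_scan(struct mock_pool *pool, unsigned nr_to_scan)
{
	unsigned left = mock_pool_free(pool, nr_to_scan, MOCK_GFP_ATOMIC);

	return (long)(nr_to_scan - left);
}
```

Whether the real function should take a flags argument or simply switch
to GFP_ATOMIC internally is a separate question; the sketch only shows
the first option.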

>   */
> -static int ttm_pool_mm_shrink(struct shrinker *shrink,
> -			      struct shrink_control *sc)
> +static long
> +ttm_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	static atomic_t start_pool = ATOMIC_INIT(0);
>  	unsigned i;
>  	unsigned pool_offset = atomic_add_return(1, &start_pool);
>  	struct ttm_page_pool *pool;
>  	int shrink_pages = sc->nr_to_scan;
> +	long freed = 0;
>  
>  	pool_offset = pool_offset % NUM_POOLS;
>  	/* select start pool in round robin fashion */
> @@ -408,14 +406,28 @@ static int ttm_pool_mm_shrink(struct shrinker *shrink,
>  			break;
>  		pool = &_manager->pools[(i + pool_offset)%NUM_POOLS];
>  		shrink_pages = ttm_page_pool_free(pool, nr_free);
> +		freed += nr_free - shrink_pages;
>  	}
> -	/* return estimated number of unused pages in pool */
> -	return ttm_pool_get_num_unused_pages();
> +	return freed;
> +}
> +
> +
> +static long
> +ttm_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	unsigned i;
> +	long count = 0;
> +
> +	for (i = 0; i < NUM_POOLS; ++i)
> +		count += _manager->pools[i].npages;
> +
> +	return count;
>  }
>  
>  static void ttm_pool_mm_shrink_init(struct ttm_pool_manager *manager)
>  {
> -	manager->mm_shrink.shrink = &ttm_pool_mm_shrink;
> +	manager->mm_shrink.count_objects = &ttm_pool_shrink_count;
> +	manager->mm_shrink.scan_objects = &ttm_pool_shrink_scan;
>  	manager->mm_shrink.seeks = 1;
>  	register_shrinker(&manager->mm_shrink);
>  }
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index b8b3943..dc009f1 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -918,19 +918,6 @@ int ttm_dma_populate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  }
>  EXPORT_SYMBOL_GPL(ttm_dma_populate);
>  
> -/* Get good estimation how many pages are free in pools */
> -static int ttm_dma_pool_get_num_unused_pages(void)
> -{
> -	struct device_pools *p;
> -	unsigned total = 0;
> -
> -	mutex_lock(&_manager->lock);
> -	list_for_each_entry(p, &_manager->pools, pools)
> -		total += p->pool->npages_free;
> -	mutex_unlock(&_manager->lock);
> -	return total;
> -}
> -
>  /* Put all pages in pages list to correct pool to wait for reuse */
>  void ttm_dma_unpopulate(struct ttm_dma_tt *ttm_dma, struct device *dev)
>  {
> @@ -1002,18 +989,29 @@ EXPORT_SYMBOL_GPL(ttm_dma_unpopulate);
>  
>  /**
>   * Callback for mm to request pool to reduce number of page held.
> + *
> + * XXX: (dchinner) Deadlock warning!
> + *
> + * ttm_dma_page_pool_free() does GFP_KERNEL memory allocation, and so attention
> + * needs to be paid to sc->gfp_mask to determine if this can be done or not.
> + * GFP_KERNEL memory allocation in a GFP_ATOMIC reclaim context woul dbe really
> + * bad.

would be.
> + *
> + * I'm getting sadder as I hear more pathetical whimpers about needing per-pool
> + * shrinkers

Where are these whimpers coming from?

>   */
> -static int ttm_dma_pool_mm_shrink(struct shrinker *shrink,
> -				  struct shrink_control *sc)
> +static long
> +ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
>  	static atomic_t start_pool = ATOMIC_INIT(0);
>  	unsigned idx = 0;
>  	unsigned pool_offset = atomic_add_return(1, &start_pool);
>  	unsigned shrink_pages = sc->nr_to_scan;
>  	struct device_pools *p;
> +	long freed = 0;
>  
>  	if (list_empty(&_manager->pools))
> -		return 0;
> +		return SHRINK_STOP;
>  
>  	mutex_lock(&_manager->lock);
>  	pool_offset = pool_offset % _manager->npools;
> @@ -1029,18 +1027,33 @@ static int ttm_dma_pool_mm_shrink(struct shrinker *shrink,
>  			continue;
>  		nr_free = shrink_pages;
>  		shrink_pages = ttm_dma_page_pool_free(p->pool, nr_free);
> +		freed += nr_free - shrink_pages;
> +
>  		pr_debug("%s: (%s:%d) Asked to shrink %d, have %d more to go\n",
>  			 p->pool->dev_name, p->pool->name, current->pid,
>  			 nr_free, shrink_pages);
>  	}
>  	mutex_unlock(&_manager->lock);
> -	/* return estimated number of unused pages in pool */
> -	return ttm_dma_pool_get_num_unused_pages();
> +	return freed;
> +}

That code looks good.
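
For anyone following along, the accounting can be modelled in userspace
roughly as below. This is only a sketch of the count/scan split under
simplifying assumptions: no locking, a fixed number of pools, and a
start index passed in by the caller instead of an atomic counter. The
demo_* names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NUM_POOLS 3 /* hypothetical fixed pool count */

struct demo_pool {
	unsigned npages; /* pages currently cached in this pool */
};

/* Free up to nr_free pages; return how many were NOT freed. */
unsigned demo_pool_free(struct demo_pool *pool, unsigned nr_free)
{
	unsigned n;

	assert(pool != NULL);
	n = nr_free < pool->npages ? nr_free : pool->npages;
	pool->npages -= n;
	return nr_free - n;
}

/* count: report how many pages could be freed, without freeing any. */
long demo_shrink_count(const struct demo_pool *pools)
{
	long count = 0;
	unsigned i;

	for (i = 0; i < DEMO_NUM_POOLS; ++i)
		count += pools[i].npages;
	return count;
}

/*
 * scan: walk the pools round-robin from 'start', stop once the request
 * is satisfied, and return the number of pages actually freed.
 */
long demo_shrink_scan(struct demo_pool *pools, unsigned start,
		      unsigned nr_to_scan)
{
	unsigned shrink_pages = nr_to_scan;
	long freed = 0;
	unsigned i;

	for (i = 0; i < DEMO_NUM_POOLS; ++i) {
		unsigned nr_free = shrink_pages;

		if (shrink_pages == 0)
			break;
		shrink_pages = demo_pool_free(
			&pools[(i + start) % DEMO_NUM_POOLS], nr_free);
		/* requested minus still-outstanding = freed in this pool */
		freed += nr_free - shrink_pages;
	}
	return freed;
}
```

The point of the split is that the count side stays cheap and free of
side effects, while the scan side reports real progress rather than an
estimate of what remains.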
> +
> +static long
> +ttm_dma_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> +{
> +	struct device_pools *p;
> +	long count = 0;
> +
> +	mutex_lock(&_manager->lock);
> +	list_for_each_entry(p, &_manager->pools, pools)
> +		count += p->pool->npages_free;
> +	mutex_unlock(&_manager->lock);
> +	return count;
>  }

But this needn't be moved? Or is it because you would like the code to
be in "one section"?

If so, please use the same style for functions as the rest of the file
has.

>  
>  static void ttm_dma_pool_mm_shrink_init(struct ttm_pool_manager *manager)
>  {
> -	manager->mm_shrink.shrink = &ttm_dma_pool_mm_shrink;
> +	manager->mm_shrink.count_objects = &ttm_dma_pool_shrink_count;
> +	manager->mm_shrink.scan_objects = &ttm_dma_pool_shrink_scan;
>  	manager->mm_shrink.seeks = 1;
>  	register_shrinker(&manager->mm_shrink);
>  }

.. snip..
> diff --git a/drivers/staging/zcache/zcache-main.c b/drivers/staging/zcache/zcache-main.c
> index dcceed2..4ade8e3 100644
> --- a/drivers/staging/zcache/zcache-main.c
> +++ b/drivers/staging/zcache/zcache-main.c
> @@ -1140,23 +1140,19 @@ static bool zcache_freeze;
>   * pageframes in use.  FIXME POLICY: Probably the writeback should only occur
>   * if the eviction doesn't free enough pages.
>   */
> -static int shrink_zcache_memory(struct shrinker *shrink,
> -				struct shrink_control *sc)
> +static long scan_zcache_memory(struct shrinker *shrink,
> +			       struct shrink_control *sc)
>  {
>  	static bool in_progress;
> -	int ret = -1;
> -	int nr = sc->nr_to_scan;
>  	int nr_evict = 0;
>  	int nr_writeback = 0;
>  	struct page *page;
>  	int  file_pageframes_inuse, anon_pageframes_inuse;
> -
> -	if (nr <= 0)
> -		goto skip_evict;
> +	long freed = 0;
>  
>  	/* don't allow more than one eviction thread at a time */
>  	if (in_progress)
> -		goto skip_evict;
> +		return 0;
>  
>  	in_progress = true;
>  
> @@ -1176,6 +1172,7 @@ static int shrink_zcache_memory(struct shrinker *shrink,
>  		if (page == NULL)
>  			break;
>  		zcache_free_page(page);
> +		freed++;
>  	}
>  
>  	zcache_last_active_anon_pageframes =
> @@ -1192,13 +1189,22 @@ static int shrink_zcache_memory(struct shrinker *shrink,
>  #ifdef CONFIG_ZCACHE_WRITEBACK
>  		int writeback_ret;
>  		writeback_ret = zcache_frontswap_writeback();
> -		if (writeback_ret == -ENOMEM)
> +		if (writeback_ret != -ENOMEM)
> +			freed++;
> +		else
>  #endif
>  			break;
>  	}
>  	in_progress = false;
>  
> -skip_evict:
> +	return freed;
> +}
> +
> +static long count_zcache_memory(struct shrinker *shrink,
> +				struct shrink_control *sc)
> +{
> +	int ret = -1;
> +
>  	/* resample: has changed, but maybe not all the way yet */
>  	zcache_last_active_file_pageframes =
>  		global_page_state(NR_LRU_BASE + LRU_ACTIVE_FILE);
> @@ -1212,7 +1218,8 @@ skip_evict:
>  }
>  
>  static struct shrinker zcache_shrinker = {
> -	.shrink = shrink_zcache_memory,
> +	.scan_objects = scan_zcache_memory,
> +	.count_objects = count_zcache_memory,
>  	.seeks = DEFAULT_SEEKS,
>  };
>  

That looks OK, but I think it needs an Ack from Greg KH as well?
