linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: dchinner@redhat.com, airlied@linux.ie, glommer@openvz.org,
	mgorman@suse.de, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
Date: Fri, 30 May 2014 12:08:24 -0400	[thread overview]
Message-ID: <20140530160824.GD3621@localhost.localdomain> (raw)
In-Reply-To: <201405292334.EAG00503.FLOOJFStHVQMFO@I-love.SAKURA.ne.jp>

On Thu, May 29, 2014 at 11:34:59PM +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Konrad Rzeszutek Wilk wrote:
> > > On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote:
> > > > Hello.
> > > > 
> > > > I tried to test whether it is OK (from point of view of reentrant) to use
> > > > mutex_lock() or mutex_lock_killable() inside shrinker functions when shrinker
> > > > functions do memory allocation, for drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is
> > > > doing memory allocation with mutex lock held inside ttm_dma_pool_shrink_scan().
> > > > 
> > > > If I compile a test module shown below which mimics extreme case of what
> > > > ttm_dma_pool_shrink_scan() will do
> > > 
> > > And ttm_pool_shrink_scan.
> > 
> > I don't know why but ttm_pool_shrink_scan() does not take mutex.
> > 
> Well, it seems to me that ttm_pool_shrink_scan() not taking mutex is a bug
> which could lead to stack overflow if kmalloc() in ttm_page_pool_free()
> triggered recursion.
> 
>   shrink_slab()
>   => ttm_pool_shrink_scan()
>      => ttm_page_pool_free()
>         => kmalloc(GFP_KERNEL)
>            => shrink_slab()
>               => ttm_pool_shrink_scan()
>                  => ttm_page_pool_free()
>                     => kmalloc(GFP_KERNEL)
> 
> Maybe shrink_slab() should be updated not to call same shrinker in parallel?
> 
> Also, it seems to me that ttm_dma_pool_shrink_scan() has potential division
> by 0 bug as described below. Is this patch correct?

Looks OK. I would need to test it first. Could you send both patches
to me please so I can just test them and queue them up together?

Thank you!
> ----------
> >From 4a65744a300e14e5e202c5f13ba2759e1e797d29 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Thu, 29 May 2014 18:25:42 +0900
> Subject: [PATCH] gpu/drm/ttm: Use mutex_trylock() for shrinker functions.
> 
> I can observe that RHEL7 environment stalls with 100% CPU usage when a
> certain type of memory pressure is given. While the shrinker functions
> are called by shrink_slab() before the OOM killer is triggered, the stall
> lasts for many minutes.
> 
> One of reasons of this stall is that
> ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
> are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
> _manager->lock held causes someone (including kswapd) to deadlock when
> these functions are called due to memory pressure. This patch changes
> "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
> avoid deadlock.
> 
> At the same time, this patch fixes potential division by 0 due to
> unconditionally doing "% _manager->npools". This is because
> list_empty(&_manager->pools) being false does not guarantee that
> _manager->npools != 0 after taking the _manager->lock because
> _manager->npools is updated under the _manager->lock.
> 
> At the same time, this patch moves updating of start_pool variable
> in order to avoid skipping when choosing a pool to shrink in
> round-robin style. The start_pool is changed from "atomic_t" to
> "unsigned int" because it is now updated under the _manager->lock.
> 
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable <stable@kernel.org> [3.3+]
> ---
>  drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index fb8259f..5e332b4 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -1004,9 +1004,9 @@ EXPORT_SYMBOL_GPL(ttm_dma_unpopulate);
>  static unsigned long
>  ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  {
> -	static atomic_t start_pool = ATOMIC_INIT(0);
> +	static unsigned int start_pool;
>  	unsigned idx = 0;
> -	unsigned pool_offset = atomic_add_return(1, &start_pool);
> +	unsigned pool_offset;
>  	unsigned shrink_pages = sc->nr_to_scan;
>  	struct device_pools *p;
>  	unsigned long freed = 0;
> @@ -1014,8 +1014,11 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  	if (list_empty(&_manager->pools))
>  		return SHRINK_STOP;
>  
> -	mutex_lock(&_manager->lock);
> -	pool_offset = pool_offset % _manager->npools;
> +	if (!mutex_trylock(&_manager->lock))
> +		return SHRINK_STOP;
> +	if (!_manager->npools)
> +		goto out;
> +	pool_offset = ++start_pool % _manager->npools;
>  	list_for_each_entry(p, &_manager->pools, pools) {
>  		unsigned nr_free;
>  
> @@ -1034,6 +1037,7 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
>  			 p->pool->dev_name, p->pool->name, current->pid,
>  			 nr_free, shrink_pages);
>  	}
> +out:
>  	mutex_unlock(&_manager->lock);
>  	return freed;
>  }
> @@ -1044,7 +1048,8 @@ ttm_dma_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
>  	struct device_pools *p;
>  	unsigned long count = 0;
>  
> -	mutex_lock(&_manager->lock);
> +	if (!mutex_trylock(&_manager->lock))
> +		return 0;
>  	list_for_each_entry(p, &_manager->pools, pools)
>  		count += p->pool->npages_free;
>  	mutex_unlock(&_manager->lock);
> -- 
> 1.8.3.1

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2014-05-30 16:08 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <201405192339.JIJ04144.FHQFVFOtOSLJOM@I-love.SAKURA.ne.jp>
     [not found] ` <alpine.DEB.2.00.1405200140010.20503@skynet.skynet.ie>
     [not found]   ` <201405210030.HBD65663.FFLVHOFMSJOtOQ@I-love.SAKURA.ne.jp>
2014-05-24 14:22     ` [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions Tetsuo Handa
2014-05-28 18:54       ` Konrad Rzeszutek Wilk
2014-05-28 21:47         ` Tetsuo Handa
2014-05-29 14:34           ` Tetsuo Handa
2014-05-30 16:08             ` Konrad Rzeszutek Wilk [this message]
2014-05-31  2:58               ` [PATCH 1/5] gpu/drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan() Tetsuo Handa
2014-05-31  2:59                 ` [PATCH 2/5] gpu/drm/ttm: Choose a pool to shrink correctly " Tetsuo Handa
2014-05-31  3:00                   ` [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions Tetsuo Handa
2014-05-31  3:01                     ` [PATCH 4/5] gpu/drm/ttm: Fix possible stack overflow by recursive shrinker calls Tetsuo Handa
2014-05-31  3:02                       ` [PATCH 5/5] gpu/drm/ttm: Pass GFP flags in order to avoid deadlock Tetsuo Handa
2014-06-10 19:17                     ` [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions Konrad Rzeszutek Wilk
2014-06-10 20:16                       ` Tetsuo Handa
2014-05-30 16:06           ` [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for " Konrad Rzeszutek Wilk

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140530160824.GD3621@localhost.localdomain \
    --to=konrad.wilk@oracle.com \
    --cc=airlied@linux.ie \
    --cc=dchinner@redhat.com \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=glommer@openvz.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).