From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: dchinner@redhat.com, airlied@linux.ie, glommer@openvz.org,
mgorman@suse.de, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
Date: Fri, 30 May 2014 12:08:24 -0400 [thread overview]
Message-ID: <20140530160824.GD3621@localhost.localdomain> (raw)
In-Reply-To: <201405292334.EAG00503.FLOOJFStHVQMFO@I-love.SAKURA.ne.jp>
On Thu, May 29, 2014 at 11:34:59PM +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Konrad Rzeszutek Wilk wrote:
> > > On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote:
> > > > Hello.
> > > >
> > > > I tried to test whether it is OK (from point of view of reentrant) to use
> > > > mutex_lock() or mutex_lock_killable() inside shrinker functions when shrinker
> > > > functions do memory allocation, for drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is
> > > > doing memory allocation with mutex lock held inside ttm_dma_pool_shrink_scan().
> > > >
> > > > If I compile a test module shown below which mimics extreme case of what
> > > > ttm_dma_pool_shrink_scan() will do
> > >
> > > And ttm_pool_shrink_scan.
> >
> > I don't know why but ttm_pool_shrink_scan() does not take mutex.
> >
> Well, it seems to me that ttm_pool_shrink_scan() not taking mutex is a bug
> which could lead to stack overflow if kmalloc() in ttm_page_pool_free()
> triggered recursion.
>
> shrink_slab()
> => ttm_pool_shrink_scan()
> => ttm_page_pool_free()
> => kmalloc(GFP_KERNEL)
> => shrink_slab()
> => ttm_pool_shrink_scan()
> => ttm_page_pool_free()
> => kmalloc(GFP_KERNEL)
>
> Maybe shrink_slab() should be updated not to call same shrinker in parallel?
>
> Also, it seems to me that ttm_dma_pool_shrink_scan() has potential division
> by 0 bug as described below. Is this patch correct?
Looks OK. I would need to test it first. Could you send both patches
to me please so I can just test them and queue them up together?
Thank you!
> ----------
> >From 4a65744a300e14e5e202c5f13ba2759e1e797d29 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Thu, 29 May 2014 18:25:42 +0900
> Subject: [PATCH] gpu/drm/ttm: Use mutex_trylock() for shrinker functions.
>
> I can observe that RHEL7 environment stalls with 100% CPU usage when a
> certain type of memory pressure is given. While the shrinker functions
> are called by shrink_slab() before the OOM killer is triggered, the stall
> lasts for many minutes.
>
> One of reasons of this stall is that
> ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
> are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
> _manager->lock held causes someone (including kswapd) to deadlock when
> these functions are called due to memory pressure. This patch changes
> "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
> avoid deadlock.
>
> At the same time, this patch fixes potential division by 0 due to
> unconditionally doing "% _manager->npools". This is because
> list_empty(&_manager->pools) being false does not guarantee that
> _manager->npools != 0 after taking the _manager->lock because
> _manager->npools is updated under the _manager->lock.
>
> At the same time, this patch moves updating of start_pool variable
> in order to avoid skipping when choosing a pool to shrink in
> round-robin style. The start_pool is changed from "atomic_t" to
> "unsigned int" because it is now updated under the _manager->lock.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable <stable@kernel.org> [3.3+]
> ---
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index fb8259f..5e332b4 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -1004,9 +1004,9 @@ EXPORT_SYMBOL_GPL(ttm_dma_unpopulate);
> static unsigned long
> ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> {
> - static atomic_t start_pool = ATOMIC_INIT(0);
> + static unsigned int start_pool;
> unsigned idx = 0;
> - unsigned pool_offset = atomic_add_return(1, &start_pool);
> + unsigned pool_offset;
> unsigned shrink_pages = sc->nr_to_scan;
> struct device_pools *p;
> unsigned long freed = 0;
> @@ -1014,8 +1014,11 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> if (list_empty(&_manager->pools))
> return SHRINK_STOP;
>
> - mutex_lock(&_manager->lock);
> - pool_offset = pool_offset % _manager->npools;
> + if (!mutex_trylock(&_manager->lock))
> + return SHRINK_STOP;
> + if (!_manager->npools)
> + goto out;
> + pool_offset = ++start_pool % _manager->npools;
> list_for_each_entry(p, &_manager->pools, pools) {
> unsigned nr_free;
>
> @@ -1034,6 +1037,7 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> p->pool->dev_name, p->pool->name, current->pid,
> nr_free, shrink_pages);
> }
> +out:
> mutex_unlock(&_manager->lock);
> return freed;
> }
> @@ -1044,7 +1048,8 @@ ttm_dma_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> struct device_pools *p;
> unsigned long count = 0;
>
> - mutex_lock(&_manager->lock);
> + if (!mutex_trylock(&_manager->lock))
> + return 0;
> list_for_each_entry(p, &_manager->pools, pools)
> count += p->pool->npages_free;
> mutex_unlock(&_manager->lock);
> --
> 1.8.3.1
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: dchinner@redhat.com, airlied@linux.ie, glommer@openvz.org,
mgorman@suse.de, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions.
Date: Fri, 30 May 2014 12:08:24 -0400 [thread overview]
Message-ID: <20140530160824.GD3621@localhost.localdomain> (raw)
In-Reply-To: <201405292334.EAG00503.FLOOJFStHVQMFO@I-love.SAKURA.ne.jp>
On Thu, May 29, 2014 at 11:34:59PM +0900, Tetsuo Handa wrote:
> Tetsuo Handa wrote:
> > Konrad Rzeszutek Wilk wrote:
> > > On Sat, May 24, 2014 at 11:22:09PM +0900, Tetsuo Handa wrote:
> > > > Hello.
> > > >
> > > > I tried to test whether it is OK (from point of view of reentrant) to use
> > > > mutex_lock() or mutex_lock_killable() inside shrinker functions when shrinker
> > > > functions do memory allocation, for drivers/gpu/drm/ttm/ttm_page_alloc_dma.c is
> > > > doing memory allocation with mutex lock held inside ttm_dma_pool_shrink_scan().
> > > >
> > > > If I compile a test module shown below which mimics extreme case of what
> > > > ttm_dma_pool_shrink_scan() will do
> > >
> > > And ttm_pool_shrink_scan.
> >
> > I don't know why but ttm_pool_shrink_scan() does not take mutex.
> >
> Well, it seems to me that ttm_pool_shrink_scan() not taking mutex is a bug
> which could lead to stack overflow if kmalloc() in ttm_page_pool_free()
> triggered recursion.
>
> shrink_slab()
> => ttm_pool_shrink_scan()
> => ttm_page_pool_free()
> => kmalloc(GFP_KERNEL)
> => shrink_slab()
> => ttm_pool_shrink_scan()
> => ttm_page_pool_free()
> => kmalloc(GFP_KERNEL)
>
> Maybe shrink_slab() should be updated not to call same shrinker in parallel?
>
> Also, it seems to me that ttm_dma_pool_shrink_scan() has potential division
> by 0 bug as described below. Is this patch correct?
Looks OK. I would need to test it first. Could you send both patches
to me please so I can just test them and queue them up together?
Thank you!
> ----------
> >From 4a65744a300e14e5e202c5f13ba2759e1e797d29 Mon Sep 17 00:00:00 2001
> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Date: Thu, 29 May 2014 18:25:42 +0900
> Subject: [PATCH] gpu/drm/ttm: Use mutex_trylock() for shrinker functions.
>
> I can observe that RHEL7 environment stalls with 100% CPU usage when a
> certain type of memory pressure is given. While the shrinker functions
> are called by shrink_slab() before the OOM killer is triggered, the stall
> lasts for many minutes.
>
> One of reasons of this stall is that
> ttm_dma_pool_shrink_count()/ttm_dma_pool_shrink_scan() are called and
> are blocked at mutex_lock(&_manager->lock). GFP_KERNEL allocation with
> _manager->lock held causes someone (including kswapd) to deadlock when
> these functions are called due to memory pressure. This patch changes
> "mutex_lock();" to "if (!mutex_trylock()) return ...;" in order to
> avoid deadlock.
>
> At the same time, this patch fixes potential division by 0 due to
> unconditionally doing "% _manager->npools". This is because
> list_empty(&_manager->pools) being false does not guarantee that
> _manager->npools != 0 after taking the _manager->lock because
> _manager->npools is updated under the _manager->lock.
>
> At the same time, this patch moves updating of start_pool variable
> in order to avoid skipping when choosing a pool to shrink in
> round-robin style. The start_pool is changed from "atomic_t" to
> "unsigned int" because it is now updated under the _manager->lock.
>
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> Cc: stable <stable@kernel.org> [3.3+]
> ---
> drivers/gpu/drm/ttm/ttm_page_alloc_dma.c | 15 ++++++++++-----
> 1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> index fb8259f..5e332b4 100644
> --- a/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> +++ b/drivers/gpu/drm/ttm/ttm_page_alloc_dma.c
> @@ -1004,9 +1004,9 @@ EXPORT_SYMBOL_GPL(ttm_dma_unpopulate);
> static unsigned long
> ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> {
> - static atomic_t start_pool = ATOMIC_INIT(0);
> + static unsigned int start_pool;
> unsigned idx = 0;
> - unsigned pool_offset = atomic_add_return(1, &start_pool);
> + unsigned pool_offset;
> unsigned shrink_pages = sc->nr_to_scan;
> struct device_pools *p;
> unsigned long freed = 0;
> @@ -1014,8 +1014,11 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> if (list_empty(&_manager->pools))
> return SHRINK_STOP;
>
> - mutex_lock(&_manager->lock);
> - pool_offset = pool_offset % _manager->npools;
> + if (!mutex_trylock(&_manager->lock))
> + return SHRINK_STOP;
> + if (!_manager->npools)
> + goto out;
> + pool_offset = ++start_pool % _manager->npools;
> list_for_each_entry(p, &_manager->pools, pools) {
> unsigned nr_free;
>
> @@ -1034,6 +1037,7 @@ ttm_dma_pool_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
> p->pool->dev_name, p->pool->name, current->pid,
> nr_free, shrink_pages);
> }
> +out:
> mutex_unlock(&_manager->lock);
> return freed;
> }
> @@ -1044,7 +1048,8 @@ ttm_dma_pool_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
> struct device_pools *p;
> unsigned long count = 0;
>
> - mutex_lock(&_manager->lock);
> + if (!mutex_trylock(&_manager->lock))
> + return 0;
> list_for_each_entry(p, &_manager->pools, pools)
> count += p->pool->npages_free;
> mutex_unlock(&_manager->lock);
> --
> 1.8.3.1
next prev parent reply other threads:[~2014-05-30 16:08 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-05-19 14:39 [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions Tetsuo Handa
2014-05-20 0:40 ` Dave Airlie
2014-05-20 0:40 ` Dave Airlie
2014-05-20 8:38 ` Fwd: " Thomas Hellstrom
2014-05-20 15:30 ` Tetsuo Handa
2014-05-24 14:22 ` Tetsuo Handa
2014-05-28 18:54 ` Konrad Rzeszutek Wilk
2014-05-28 18:54 ` Konrad Rzeszutek Wilk
2014-05-28 21:47 ` Tetsuo Handa
2014-05-28 21:47 ` Tetsuo Handa
2014-05-29 14:34 ` Tetsuo Handa
2014-05-30 16:08 ` Konrad Rzeszutek Wilk [this message]
2014-05-30 16:08 ` Konrad Rzeszutek Wilk
2014-05-31 2:58 ` [PATCH 1/5] gpu/drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan() Tetsuo Handa
2014-05-31 2:59 ` [PATCH 2/5] gpu/drm/ttm: Choose a pool to shrink correctly " Tetsuo Handa
2014-05-31 3:00 ` [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions Tetsuo Handa
2014-05-31 3:01 ` [PATCH 4/5] gpu/drm/ttm: Fix possible stack overflow by recursive shrinker calls Tetsuo Handa
2014-05-31 3:02 ` [PATCH 5/5] gpu/drm/ttm: Pass GFP flags in order to avoid deadlock Tetsuo Handa
2014-06-10 19:17 ` [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions Konrad Rzeszutek Wilk
2014-06-10 19:17 ` Konrad Rzeszutek Wilk
2014-06-10 19:17 ` Konrad Rzeszutek Wilk
2014-06-10 20:16 ` Tetsuo Handa
2014-06-10 20:16 ` Tetsuo Handa
2014-08-03 11:14 ` [PATCH 1/5] gpu/drm/ttm: Fix possible division by 0 in ttm_dma_pool_shrink_scan() Tetsuo Handa
2014-08-03 11:14 ` [PATCH 2/5] gpu/drm/ttm: Choose a pool to shrink correctly " Tetsuo Handa
2014-08-03 11:15 ` [PATCH 3/5] gpu/drm/ttm: Use mutex_trylock() to avoid deadlock inside shrinker functions Tetsuo Handa
2014-08-03 11:16 ` [PATCH 4/5] gpu/drm/ttm: Fix possible stack overflow by recursive shrinker calls Tetsuo Handa
2014-08-03 11:16 ` [PATCH 5/5] gpu/drm/ttm: Pass GFP flags in order to avoid deadlock Tetsuo Handa
2014-05-30 16:06 ` [PATCH] gpu/drm/ttm: Use mutex_lock_killable() for shrinker functions Konrad Rzeszutek Wilk
2014-05-30 16:06 ` Konrad Rzeszutek Wilk
2014-05-30 16:06 ` Konrad Rzeszutek Wilk
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140530160824.GD3621@localhost.localdomain \
--to=konrad.wilk@oracle.com \
--cc=airlied@linux.ie \
--cc=dchinner@redhat.com \
--cc=dri-devel@lists.freedesktop.org \
--cc=glommer@openvz.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.