* [PATCH] shrinker: fix a bug when callback returns -1 @ 2011-08-29 19:36 Mikulas Patocka 2011-08-30 2:13 ` Dave Chinner 0 siblings, 1 reply; 5+ messages in thread From: Mikulas Patocka @ 2011-08-29 19:36 UTC (permalink / raw) To: Dave Chinner, Al Viro; +Cc: Christoph Hellwig, dm-devel, linux-kernel Hi This patch fixes a lockup when shrinker callback returns -1. BTW. shouldn't the value returned by callback be long instead of int? On 64-bit architectures, there may be more than 2^32 entries allocated. Mikulas --- shrinker: fix a bug when callback returns -1 Shrinker callback can return -1 if it is at a risk of deadlock. However, this is not tested at some places. If do_shrinker_shrink returns -1 here "max_pass = do_shrinker_shrink(shrinker, shrink, 0)", it is converted to an unsigned long integer. This may result in excessive total_scan value and a lockup due to code looping too much in "while (total_scan >= batch_size)" cycle. Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> --- mm/vmscan.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) Index: linux-3.1-rc3-fast/mm/vmscan.c =================================================================== --- linux-3.1-rc3-fast.orig/mm/vmscan.c 2011-08-29 20:34:27.000000000 +0200 +++ linux-3.1-rc3-fast/mm/vmscan.c 2011-08-29 20:37:38.000000000 +0200 @@ -250,6 +250,7 @@ unsigned long shrink_slab(struct shrink_ unsigned long long delta; unsigned long total_scan; unsigned long max_pass; + int sr; int shrink_ret = 0; long nr; long new_nr; @@ -266,7 +267,10 @@ unsigned long shrink_slab(struct shrink_ } while (cmpxchg(&shrinker->nr, nr, 0) != nr); total_scan = nr; - max_pass = do_shrinker_shrink(shrinker, shrink, 0); + sr = do_shrinker_shrink(shrinker, shrink, 0); + if (sr == -1) + continue; + max_pass = sr; delta = (4 * nr_pages_scanned) / shrinker->seeks; delta *= max_pass; do_div(delta, lru_pages + 1); @@ -309,6 +313,8 @@ unsigned long shrink_slab(struct shrink_ int nr_before; nr_before = do_shrinker_shrink(shrinker, shrink, 0); + if (nr_before == -1) + break; shrink_ret = do_shrinker_shrink(shrinker, shrink, batch_size); if (shrink_ret == -1) ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] shrinker: fix a bug when callback returns -1 2011-08-29 19:36 [PATCH] shrinker: fix a bug when callback returns -1 Mikulas Patocka @ 2011-08-30 2:13 ` Dave Chinner 2011-08-30 19:52 ` Mikulas Patocka 0 siblings, 1 reply; 5+ messages in thread From: Dave Chinner @ 2011-08-30 2:13 UTC (permalink / raw) To: Mikulas Patocka Cc: Dave Chinner, Al Viro, Christoph Hellwig, dm-devel, linux-kernel On Mon, Aug 29, 2011 at 03:36:48PM -0400, Mikulas Patocka wrote: > Hi > > This patch fixes a lockup when shrinker callback returns -1. What lockup is that? I haven't seen any bug reports, and this code has been like this for several years, so I'm kind of wondering why this is suddenly an issue.... > BTW. shouldn't the value returned by callback be long instead of int? On > 64-bit architectures, there may be more than 2^32 entries allocated. The API hasn't changed since the early 2.5 series, so that wasn't a consideration when it was originally written. As it is, I make this exact change in the shrinker API update patchset I proposed recently for exactly the reasons you suggest: http://thread.gmane.org/gmane.linux.kernel.mm/67326 > Mikulas > > --- > > shrinker: fix a bug when callback returns -1 > > Shrinker callback can return -1 if it is at a risk of deadlock. > However, this is not tested at some places. > > If do_shrinker_shrink returns -1 here > "max_pass = do_shrinker_shrink(shrinker, shrink, 0)", > it is converted to an unsigned long integer. This may result in excessive > total_scan value and a lockup due to code looping too much in > "while (total_scan >= batch_size)" cycle. > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > --- > mm/vmscan.c | 8 +++++++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > Index: linux-3.1-rc3-fast/mm/vmscan.c > =================================================================== > --- linux-3.1-rc3-fast.orig/mm/vmscan.c 2011-08-29 20:34:27.000000000 +0200 > +++ linux-3.1-rc3-fast/mm/vmscan.c 2011-08-29 20:37:38.000000000 +0200 > @@ -250,6 +250,7 @@ unsigned long shrink_slab(struct shrink_ > unsigned long long delta; > unsigned long total_scan; > unsigned long max_pass; > + int sr; > int shrink_ret = 0; > long nr; > long new_nr; > @@ -266,7 +267,10 @@ unsigned long shrink_slab(struct shrink_ > } while (cmpxchg(&shrinker->nr, nr, 0) != nr); > > total_scan = nr; > - max_pass = do_shrinker_shrink(shrinker, shrink, 0); > + sr = do_shrinker_shrink(shrinker, shrink, 0); > + if (sr == -1) > + continue; IIRC from my recent shrinker audit, none of the existing shrinkers return return -1 when nr_to_scan == 0, so this check has never been necessary. > + max_pass = sr; > delta = (4 * nr_pages_scanned) / shrinker->seeks; > delta *= max_pass; > do_div(delta, lru_pages + 1); > @@ -309,6 +313,8 @@ unsigned long shrink_slab(struct shrink_ > int nr_before; > > nr_before = do_shrinker_shrink(shrinker, shrink, 0); > + if (nr_before == -1) > + break; Same here. > shrink_ret = do_shrinker_shrink(shrinker, shrink, > batch_size); > if (shrink_ret == -1) Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] shrinker: fix a bug when callback returns -1 2011-08-30 2:13 ` Dave Chinner @ 2011-08-30 19:52 ` Mikulas Patocka 2011-08-30 22:37 ` Dave Chinner 0 siblings, 1 reply; 5+ messages in thread From: Mikulas Patocka @ 2011-08-30 19:52 UTC (permalink / raw) To: Dave Chinner Cc: Dave Chinner, Al Viro, Christoph Hellwig, dm-devel, linux-kernel On Tue, 30 Aug 2011, Dave Chinner wrote: > On Mon, Aug 29, 2011 at 03:36:48PM -0400, Mikulas Patocka wrote: > > Hi > > > > This patch fixes a lockup when shrinker callback returns -1. > > What lockup is that? I haven't seen any bug reports, and this code > has been like this for several years, so I'm kind of wondering why > this is suddenly an issue.... I got the lockups when modifying my own dm-bufio code to use the shrinker. The reason for lockups was that the variable total_scan contained extremely high values. The only possible way how such extreme values could be stored in total_scan was this: max_pass = do_shrinker_shrink(shrinker, shrink, 0); delta = (4 * nr_pages_scanned) / shrinker->seeks; delta *= max_pass; do_div(delta, lru_pages + 1); total_scan += delta; --- you don't test if do_shinker_shrink retuned -1 here. The variables are unsigned long, so you end up adding extreme value (approximately 2^64/(lru_pages+1) to total_scan. Note that some existing shrinkers contain workaround for this (something like "return nr_to_scan ? -1 : 0", while some can still return -1 when nr_to_scan is 0 and trigger this bug (prune_super). > > BTW. shouldn't the value returned by callback be long instead of int? On > > 64-bit architectures, there may be more than 2^32 entries allocated. > > The API hasn't changed since the early 2.5 series, so that wasn't a > consideration when it was originally written. As it is, I make this > exact change in the shrinker API update patchset I proposed > recently for exactly the reasons you suggest: > > http://thread.gmane.org/gmane.linux.kernel.mm/67326 > > > > Mikulas > > > > --- > > > > shrinker: fix a bug when callback returns -1 > > > > Shrinker callback can return -1 if it is at a risk of deadlock. > > However, this is not tested at some places. > > > > If do_shrinker_shrink returns -1 here > > "max_pass = do_shrinker_shrink(shrinker, shrink, 0)", > > it is converted to an unsigned long integer. This may result in excessive > > total_scan value and a lockup due to code looping too much in > > "while (total_scan >= batch_size)" cycle. > > > > > Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> > > > > --- > > mm/vmscan.c | 8 +++++++- > > 1 file changed, 7 insertions(+), 1 deletion(-) > > > > Index: linux-3.1-rc3-fast/mm/vmscan.c > > =================================================================== > > --- linux-3.1-rc3-fast.orig/mm/vmscan.c 2011-08-29 20:34:27.000000000 +0200 > > +++ linux-3.1-rc3-fast/mm/vmscan.c 2011-08-29 20:37:38.000000000 +0200 > > @@ -250,6 +250,7 @@ unsigned long shrink_slab(struct shrink_ > > unsigned long long delta; > > unsigned long total_scan; > > unsigned long max_pass; > > + int sr; > > int shrink_ret = 0; > > long nr; > > long new_nr; > > @@ -266,7 +267,10 @@ unsigned long shrink_slab(struct shrink_ > > } while (cmpxchg(&shrinker->nr, nr, 0) != nr); > > > > total_scan = nr; > > - max_pass = do_shrinker_shrink(shrinker, shrink, 0); > > + sr = do_shrinker_shrink(shrinker, shrink, 0); > > + if (sr == -1) > > + continue; > > IIRC from my recent shrinker audit, none of the existing shrinkers > return return -1 when nr_to_scan == 0, so this check has never been > necessary. > > > + max_pass = sr; > > delta = (4 * nr_pages_scanned) / shrinker->seeks; > > delta *= max_pass; > > do_div(delta, lru_pages + 1); > > @@ -309,6 +313,8 @@ unsigned long shrink_slab(struct shrink_ > > int nr_before; > > > > nr_before = do_shrinker_shrink(shrinker, shrink, 0); > > + if (nr_before == -1) > > + break; > > Same here. > > > shrink_ret = do_shrinker_shrink(shrinker, shrink, > > batch_size); > > if (shrink_ret == -1) > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] shrinker: fix a bug when callback returns -1 2011-08-30 19:52 ` Mikulas Patocka @ 2011-08-30 22:37 ` Dave Chinner 2011-08-31 13:31 ` Mikulas Patocka 0 siblings, 1 reply; 5+ messages in thread From: Dave Chinner @ 2011-08-30 22:37 UTC (permalink / raw) To: Mikulas Patocka Cc: Dave Chinner, Al Viro, Christoph Hellwig, dm-devel, linux-kernel On Tue, Aug 30, 2011 at 03:52:02PM -0400, Mikulas Patocka wrote: > On Tue, 30 Aug 2011, Dave Chinner wrote: > > > On Mon, Aug 29, 2011 at 03:36:48PM -0400, Mikulas Patocka wrote: > > > Hi > > > > > > This patch fixes a lockup when shrinker callback returns -1. > > > > What lockup is that? I haven't seen any bug reports, and this code > > has been like this for several years, so I'm kind of wondering why > > this is suddenly an issue.... > > I got the lockups when modifying my own dm-bufio code to use the shrinker. > The reason for lockups was that the variable total_scan contained > extremely high values. Your new shrinker was returning -1 when nr_to_scan == 0? That's not correct - you should be returning the count of objects (regardless of the specified gfp_mask) or 0 if you can't get one for whatever reason.... > The only possible way how such extreme values could be stored in > total_scan was this: > > max_pass = do_shrinker_shrink(shrinker, shrink, 0); > delta = (4 * nr_pages_scanned) / shrinker->seeks; > delta *= max_pass; > do_div(delta, lru_pages + 1); > total_scan += delta; > > --- you don't test if do_shinker_shrink retuned -1 here. The variables are > unsigned long, so you end up adding extreme value (approximately > 2^64/(lru_pages+1) to total_scan. That's not the only way to get large values in total scan. If you do any amount of GFP_NOFS/GFP_NOIO allocation and the shrinker aborts when it sees this, the shrinker->nr total will aggregate until it becomes large. total_scan contains that aggregation because it starts from the current value of shrinker->nr. > Note that some existing shrinkers contain workaround for this (something > like "return nr_to_scan ? -1 : 0", That's not a workaround - that is exactly how the current API expects them to operate. That is, when counting objects, you return the count of objects. If you can't get the count, you return 0. Did I mention I was rewriting the API to make it more sane, obvious and simple to implement correctly? > while some can still return -1 when > nr_to_scan is 0 and trigger this bug (prune_super). prune_super() will only return -1 if grab_super_passive() fails, which indicates that something serious is happening on the superblock (like unmount, remount or freeze) in which case the caches are about to or already undergoing significant change anyway. It could be seen as a bug, but it's really a "don't care" case - it doesn't matter what the calculated value is because it doesn't matter what the shrinker does after such a failure - the next call is going to fail to grab the superblock, too. And FWIW, that wart also goes away with the shrinker API rework. Cheers, Dave. -- Dave Chinner david@fromorbit.com ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] shrinker: fix a bug when callback returns -1 2011-08-30 22:37 ` Dave Chinner @ 2011-08-31 13:31 ` Mikulas Patocka 0 siblings, 0 replies; 5+ messages in thread From: Mikulas Patocka @ 2011-08-31 13:31 UTC (permalink / raw) To: Dave Chinner Cc: Dave Chinner, Al Viro, Christoph Hellwig, dm-devel, linux-kernel On Wed, 31 Aug 2011, Dave Chinner wrote: > On Tue, Aug 30, 2011 at 03:52:02PM -0400, Mikulas Patocka wrote: > > On Tue, 30 Aug 2011, Dave Chinner wrote: > > > > > On Mon, Aug 29, 2011 at 03:36:48PM -0400, Mikulas Patocka wrote: > > > > Hi > > > > > > > > This patch fixes a lockup when shrinker callback returns -1. > > > > > > What lockup is that? I haven't seen any bug reports, and this code > > > has been like this for several years, so I'm kind of wondering why > > > this is suddenly an issue.... > > > > I got the lockups when modifying my own dm-bufio code to use the shrinker. > > The reason for lockups was that the variable total_scan contained > > extremely high values. > > Your new shrinker was returning -1 when nr_to_scan == 0? That's not > correct - you should be returning the count of objects (regardless > of the specified gfp_mask) or 0 if you can't get one for whatever > reason.... > > > The only possible way how such extreme values could be stored in > > total_scan was this: > > > > max_pass = do_shrinker_shrink(shrinker, shrink, 0); > > delta = (4 * nr_pages_scanned) / shrinker->seeks; > > delta *= max_pass; > > do_div(delta, lru_pages + 1); > > total_scan += delta; > > > > --- you don't test if do_shinker_shrink retuned -1 here. The variables are > > unsigned long, so you end up adding extreme value (approximately > > 2^64/(lru_pages+1) to total_scan. > > That's not the only way to get large values in total scan. If > you do any amount of GFP_NOFS/GFP_NOIO allocation and the shrinker > aborts when it sees this, the shrinker->nr total will aggregate > until it becomes large. total_scan contains that aggregation because > it starts from the current value of shrinker->nr. > > > Note that some existing shrinkers contain workaround for this (something > > like "return nr_to_scan ? -1 : 0", > > That's not a workaround - that is exactly how the current API > expects them to operate. That is, when counting objects, you return > the count of objects. If you can't get the count, you return 0. So apply this patch, that mentions this requirement in the specification and fixes the bug in fs/super.c. > Did I mention I was rewriting the API to make it more sane, obvious > and simple to implement correctly? > > > while some can still return -1 when > > nr_to_scan is 0 and trigger this bug (prune_super). > > prune_super() will only return -1 if grab_super_passive() fails, > which indicates that something serious is happening on the > superblock (like unmount, remount or freeze) in which case the > caches are about to or already undergoing significant change anyway. > > It could be seen as a bug, but it's really a "don't care" case - it > doesn't matter what the calculated value is because it doesn't > matter what the shrinker does after such a failure - the next call > is going to fail to grab the superblock, too. It is a bug, because the shrinker will loop for 2^64/(lru_pages+1)/batch_size times if that "-1" is returned at certain place --- in "max_pass = do_shrinker_shrink(shrinker, shrink, 0)". Mikulas > And FWIW, that wart also goes away with the shrinker API rework. > > Cheers, > > Dave. --- Fix shrinker callback bug in fs/super.c The callback must not return -1 when nr_to_scan is zero. Fix the bug in fs/super.c and add this requirement to the callback specification. CC: stable@kernel.org Signed-off-by: Mikulas Patocka <mpatocka@redhat.com> --- fs/super.c | 2 +- include/linux/shrinker.h | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) Index: linux-3.1-rc3-fast/fs/super.c =================================================================== --- linux-3.1-rc3-fast.orig/fs/super.c 2011-08-31 15:21:01.000000000 +0200 +++ linux-3.1-rc3-fast/fs/super.c 2011-08-31 15:21:21.000000000 +0200 @@ -61,7 +61,7 @@ static int prune_super(struct shrinker * return -1; if (!grab_super_passive(sb)) - return -1; + return !sc->nr_to_scan ? 0 : -1; if (sb->s_op && sb->s_op->nr_cached_objects) fs_objects = sb->s_op->nr_cached_objects(sb); Index: linux-3.1-rc3-fast/include/linux/shrinker.h =================================================================== --- linux-3.1-rc3-fast.orig/include/linux/shrinker.h 2011-08-31 15:21:28.000000000 +0200 +++ linux-3.1-rc3-fast/include/linux/shrinker.h 2011-08-31 15:21:58.000000000 +0200 @@ -20,6 +20,7 @@ struct shrink_control { * 'nr_to_scan' entries and attempt to free them up. It should return * the number of objects which remain in the cache. If it returns -1, it means * it cannot do any scanning at this time (eg. there is a risk of deadlock). + * The callback must not return -1 if nr_to_scan is zero. * * The 'gfpmask' refers to the allocation we are currently trying to * fulfil. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2011-08-31 13:31 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2011-08-29 19:36 [PATCH] shrinker: fix a bug when callback returns -1 Mikulas Patocka 2011-08-30 2:13 ` Dave Chinner 2011-08-30 19:52 ` Mikulas Patocka 2011-08-30 22:37 ` Dave Chinner 2011-08-31 13:31 ` Mikulas Patocka
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox