From: Mel Gorman <mgorman@techsingularity.net>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Linux MM <linux-mm@kvack.org>,
	linux-m68k <linux-m68k@lists.linux-m68k.org>
Subject: Re: BUG: scheduling while atomic: cron/668/0x10c9a0c0
Date: Thu, 2 Jun 2016 13:19:36 +0100
Message-ID: <20160602121936.GV2527@techsingularity.net>
In-Reply-To: <0eb1f112-65d4-f2e5-911e-697b21324b9f@suse.cz>

On Thu, Jun 02, 2016 at 02:04:42PM +0200, Vlastimil Babka wrote:
> On 06/02/2016 12:39 PM, Mel Gorman wrote:
> >On Wed, Jun 01, 2016 at 12:01:24PM +0200, Vlastimil Babka wrote:
> >>>Why?
> >>>
> >>>The comment is fine but I do not see why the recalculation would occur.
> >>>
> >>>In the original code, the preferred_zoneref for statistics is calculated
> >>>based on either the supplied nodemask or cpuset_current_mems_allowed during
> >>>the initial attempt. It then relies on the cpuset checks in the slowpath
> >>>to enforce mems_allowed but the preferred zone doesn't change.
> >>>
> >>>With your proposed change, it's possible that the
> >>>preferred_zoneref recalculation points to a zoneref disallowed by
> >>>cpuset_current_mems_allowed. While it'll be skipped during allocation,
> >>>the statistics will still be against a zone that is potentially outside
> >>>what is allowed.
> >>
> >>Hmm that's true and I was ready to agree. But then I noticed that
> >>gfp_to_alloc_flags() can mask out ALLOC_CPUSET for GFP_ATOMIC. So it's
> >>like a lighter version of the ALLOC_NO_WATERMARKS situation. In that
> >>case it's wrong if we leave ac->preferred_zoneref at a position that has
> >>skipped some zones due to mempolicies?
> >>
> >
> >So both options are wrong then. How about this?
> 
> I wonder if the original patch we're fixing was worth all this trouble (and
> more for my compaction priority series :), but yeah this should work.
> 

I considered that option when the bug report first came in. It was a 2%
hit to the page allocator microbenchmark to revert it. More than I expected
but enough to care. If this causes another problem, I'll revert it as
there will be other options later.

> >---8<---
> >mm, page_alloc: Recalculate the preferred zoneref if the context can ignore memory policies
> >
> >The optimistic fast path may use cpuset_current_mems_allowed instead of
> >a NULL nodemask supplied by the caller for cpuset allocations. The
> >preferred zone is calculated on this basis for statistical purposes and
> >as a starting point in the zonelist iterator.
> >
> >However, if the context can ignore memory policies due to being atomic or
> >being able to ignore watermarks then the starting point in the zonelist
> >iterator is no longer correct. This patch resets the zonelist iterator in
> >the allocator slowpath if the context can ignore memory policies. This will
> >alter the zone used for statistics but only after it is known that it makes
> >sense for that context. Resetting it before entering the slowpath would
> >potentially allow an ALLOC_CPUSET allocation to be accounted for against
> >the wrong zone. Note that while nodemask is not explicitly set to the
> >original nodemask, it would only have been overwritten if cpusets_enabled()
> >is true, and in that case it was reset before the slowpath was entered.
> >
> >Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
> 
> Acked-by: Vlastimil Babka <vbabka@suse.cz>
> 

Thanks.

-- 
Mel Gorman
SUSE Labs
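
For readers following the thread, the behaviour described in the quoted
changelog can be modelled in a few lines of userspace C. This is only a
sketch and not the patch itself: every toy_* name, the flag values and the
three-entry zonelist are assumptions made up for illustration, loosely named
after gfp_to_alloc_flags(), ALLOC_CPUSET and ac->preferred_zoneref from the
discussion. It shows the fast path picking a preferred zone through the
cpuset mask, and the slowpath restarting the iterator only when the context
is allowed to ignore memory policies.

```c
#include <assert.h>
#include <stddef.h>

/* Toy flag values; the real kernel constants differ. */
#define TOY_GFP_ATOMIC          0x1u
#define TOY_ALLOC_CPUSET        0x2u
#define TOY_ALLOC_NO_WATERMARKS 0x4u

/* One zonelist entry: the node it lives on and a label for readability. */
struct toy_zoneref {
	int node;
	const char *name;
};

struct toy_alloc_context {
	const struct toy_zoneref *zonelist;
	size_t nr_zones;
	const unsigned long *nodemask;	/* NULL means "any node" */
	size_t preferred;		/* index into zonelist */
};

/* Index of the first zone whose node the nodemask permits, or n if none. */
static size_t toy_first_zone(const struct toy_zoneref *zl, size_t n,
			     const unsigned long *nodemask)
{
	assert(zl != NULL);
	for (size_t i = 0; i < n; i++)
		if (!nodemask || (*nodemask & (1UL << zl[i].node)))
			return i;
	return n;
}

/* Loose model of the point raised in the thread: atomic drops ALLOC_CPUSET. */
static unsigned int toy_gfp_to_alloc_flags(unsigned int gfp)
{
	unsigned int alloc_flags = TOY_ALLOC_CPUSET;

	if (gfp & TOY_GFP_ATOMIC)
		alloc_flags &= ~TOY_ALLOC_CPUSET;
	return alloc_flags;
}

/* Slowpath step: restart the iterator only if policies can be ignored. */
static void toy_slowpath_reset(struct toy_alloc_context *ac,
			       unsigned int alloc_flags)
{
	if ((alloc_flags & TOY_ALLOC_NO_WATERMARKS) ||
	    !(alloc_flags & TOY_ALLOC_CPUSET))
		ac->preferred = toy_first_zone(ac->zonelist, ac->nr_zones,
					       ac->nodemask);
}

/* Run the whole scenario and return the preferred index after the slowpath. */
static size_t toy_preferred_after_slowpath(unsigned int gfp)
{
	static const struct toy_zoneref zonelist[] = {
		{ 0, "node0/Normal" }, { 0, "node0/DMA32" }, { 1, "node1/Normal" },
	};
	const unsigned long cpuset_mems_allowed = 1UL << 1;	/* node 1 only */
	struct toy_alloc_context ac = { zonelist, 3, &cpuset_mems_allowed, 0 };

	/* Fast path: the caller's nodemask was NULL, so the cpuset mask is used. */
	ac.preferred = toy_first_zone(ac.zonelist, ac.nr_zones, ac.nodemask);

	/* Before the slowpath, nodemask goes back to the caller's NULL. */
	ac.nodemask = NULL;

	toy_slowpath_reset(&ac, toy_gfp_to_alloc_flags(gfp));
	return ac.preferred;
}
```

In this model toy_preferred_after_slowpath(0) leaves the preferred index at
2 (the node 1 zone chosen through the cpuset mask), while
toy_preferred_after_slowpath(TOY_GFP_ATOMIC) moves it back to index 0,
because the atomic context no longer carries TOY_ALLOC_CPUSET and the
iterator is restarted with a NULL nodemask.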
