From: Michal Hocko <mhocko@kernel.org>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6] mm: Add memory allocation watchdog kernel thread.
Date: Thu, 26 Jan 2017 08:57:53 +0100
Message-ID: <20170126075753.GD8456@dhcp22.suse.cz>
In-Reply-To: <20170125192245.GA19321@cmpxchg.org>

On Wed 25-01-17 14:22:45, Johannes Weiner wrote:
> On Wed, Jan 25, 2017 at 07:45:49PM +0100, Michal Hocko wrote:
> > On Wed 25-01-17 13:11:50, Johannes Weiner wrote:
> > [...]
> > > > From 6420cae52cac8167bd5fb19f45feed2d540bc11d Mon Sep 17 00:00:00 2001
> > > From: Johannes Weiner <hannes@cmpxchg.org>
> > > Date: Wed, 25 Jan 2017 12:57:20 -0500
> > > Subject: [PATCH] mm: page_alloc: __GFP_NOWARN shouldn't suppress stall
> > >  warnings
> > > 
> > > __GFP_NOWARN, which is usually added to avoid warnings from callsites
> > > that expect to fail and have fallbacks, currently also suppresses
> > > allocation stall warnings. These trigger when an allocation is stuck
> > > inside the allocator for 10 seconds or longer.
> > > 
> > > But there is no class of allocations that can get legitimately stuck
> > > in the allocator for this long. This always indicates a problem.
> > > 
> > > Always emit stall warnings. Restrict __GFP_NOWARN to alloc failures.
> > 
> > Tetsuo has already suggested something like this and I didn't really
> > like it because it makes the semantics of the flag confusing. The mask
> > says to not warn while the kernel log might contain an allocation splat.
> > You are right that stalling for 10 seconds means a problem on its own
> > but on the other hand I can imagine somebody might really want to have
> > clean logs and the last thing we want is to have another gfp flag for
> > that purpose.
> 
> I don't think it's confusing. __GFP_NOWARN tells the allocator whether
> an allocation failure can be handled or whether it constitutes a bug.
> 
> If we agree that stalling for 10s is a bug, then we should emit the
> warnings.

Yes, in many cases it would be a bug in the MM. Some of them would be
inherent because the allocator doesn't implement any fairness and
starvation cannot be ruled out (would that be a bug?). In general,
looping/spending a lot of time in the kernel can be seen as a bug. We
have watchdogs to report those cases, and time has taught us that we had
to develop ways to silence those lockup reports because in some cases we
couldn't handle them. I am worried we will eventually find cases like
that for allocation stalls as well. I might be oversensitive because we
have already made some mistakes in gfp flags land and I would like to
prevent more from happening.
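
To make the two semantics under discussion concrete, here is a minimal
userspace sketch, not the actual page allocator code. The SKETCH_* names,
the flag value and both helper functions are hypothetical and exist only
for illustration; the 10 second threshold is the one quoted in the commit
message above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins; the real flag is defined in include/linux/gfp.h. */
#define SKETCH_GFP_NOWARN 0x1u
#define SKETCH_STALL_SECS 10u /* stall threshold quoted in the commit message */

enum sketch_event { SKETCH_ALLOC_FAILURE, SKETCH_ALLOC_STALL };

/* A stall is only worth reporting once it reaches the threshold. */
static bool sketch_is_stall(enum sketch_event ev, unsigned int stalled_secs)
{
	return ev == SKETCH_ALLOC_STALL && stalled_secs >= SKETCH_STALL_SECS;
}

/* Semantics described as current: NOWARN silences failures and stalls. */
static bool sketch_warn_current(unsigned int gfp, enum sketch_event ev,
				unsigned int stalled_secs)
{
	if (gfp & SKETCH_GFP_NOWARN)
		return false;
	return ev == SKETCH_ALLOC_FAILURE || sketch_is_stall(ev, stalled_secs);
}

/* Semantics proposed in the patch: NOWARN only silences failures. */
static bool sketch_warn_proposed(unsigned int gfp, enum sketch_event ev,
				 unsigned int stalled_secs)
{
	if (ev == SKETCH_ALLOC_STALL)
		return sketch_is_stall(ev, stalled_secs);
	return !(gfp & SKETCH_GFP_NOWARN);
}
```

With a 12 second stall and SKETCH_GFP_NOWARN set, sketch_warn_current()
stays silent while sketch_warn_proposed() reports it. That difference is
the one the flag name would no longer describe.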

That being said, I will not stand in the way...
-- 
Michal Hocko
SUSE Labs

