From: Doug Ledford <dledford@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>,
Jonathan Toppins <jtoppins@redhat.com>
Cc: linux-mm@kvack.org, linux-rdma@vger.kernel.org,
Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
Mel Gorman <mgorman@techsingularity.net>,
Hillf Danton <hillf.zj@alibaba-inc.com>,
open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mm: ratelimit PFNs busy info message
Date: Fri, 04 Aug 2017 14:55:06 -0400
Message-ID: <1501872906.79618.10.camel@redhat.com>
In-Reply-To: <20170802141720.228502368b534f517e3107ff@linux-foundation.org>
On Wed, 2017-08-02 at 14:17 -0700, Andrew Morton wrote:
> On Wed, 2 Aug 2017 13:44:57 -0400 Jonathan Toppins
> <jtoppins@redhat.com> wrote:
>
> > The RDMA subsystem can generate several thousand of these messages
> > per second eventually leading to a kernel crash. Ratelimit these
> > messages to prevent this crash.
>
> Well... why are all these EBUSY's occurring? It sounds inefficient
> (at least) but if it is expected, normal and unavoidable then perhaps
> we should just remove that message altogether?
I don't have an answer to that question. To be honest, I haven't
looked real hard. We never had this at all, then it started out of the
blue, but only on our Dell 730xd machines (it hits all of them) and on
no other classes or brands of machines. And we have our 730xd
machines loaded up with different brands and models of cards (for
instance one dedicated to mlx4 hardware, one for qib, one for mlx5, an
ocrdma/cxgb4 combo, etc), so the fact that it hit all of the machines
meant it wasn't tied to any particular brand/model of RDMA hardware.
To me, it always smelled of a hardware oddity specific to maybe the
CPUs or mainboard chipsets in these machines, so given that I'm not an
mm expert anyway, I never chased it down.
A few other relevant details: it showed up somewhere around 4.8/4.9 or
thereabouts. It never happened before, but the printk has been there
since the 3.18 days, so possibly the test to trigger this message was
changed, or something else in the allocator changed such that the
situation started happening on these machines?
And, like I said, it is specific to our 730xd machines. But they are
all identical, so it could be that something like their specific RAM
configuration causes the allocator to hit this on these machines but
not on other machines in the cluster. I don't want to say it's
necessarily the model of chipset or CPU; these machines have other
things in common as well.
--
Doug Ledford <dledford@redhat.com>
GPG KeyID: B826A3330E572FDD
Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD