From: Leon Romanovsky <leon@kernel.org>
To: Jakub Kicinski <kuba@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Cong Wang <xiyou.wangcong@gmail.com>,
	Jamal Hadi Salim <jhs@mojatatu.com>,
	Jiri Pirko <jiri@resnulli.us>,
	netdev@vger.kernel.org
Subject: Re: [PATCH net v1] net/sched: Don't print dump stack in event of transmission timeout
Date: Sun, 12 Apr 2020 22:23:36 +0300
Message-ID: <20200412192336.GD334007@unreal>
In-Reply-To: <20200412115913.14d69a7c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>

On Sun, Apr 12, 2020 at 11:59:13AM -0700, Jakub Kicinski wrote:
> On Sun, 12 Apr 2020 09:08:54 +0300 Leon Romanovsky wrote:
> > Hi Dave,
> >
> > This is a new version of previously sent v0 [1] with change in print error
> > level as was suggested by Jakub and Cong. I'm asking you to reevaluate
> > your previous decision [2] given the fact that this is user triggered
> > bug and very similar scenario was committed by Linus "fs/filesystems.c:
> > downgrade user-reachable WARN_ONCE() to pr_warn_once()" a couple of days
> > ago [3].
> >
> > [1] https://lore.kernel.org/netdev/20200402152336.538433-1-leon@kernel.org
> > [2] https://lore.kernel.org/netdev/20200402.180218.940555077368617365.davem@davemloft.net
> > [3] https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=26c5d78c976ca298e59a56f6101a97b618ba3539
>
> How is it user triggerable? If there's a IB-specific reason maybe ib
> netdev should stop implementing ndo_tx_timeout.

It happens when the device is extremely overloaded with traffic: the HW
internally decreases its performance (a HW bug), which causes the TX
timeouts and, in turn, the WARN_ON splat.

We don't want to stop implementing ndo_tx_timeout, because it works
right most (if not all) of the time.

If it is very important, I will dig into internal bug reports to look
for possible reproduction scenarios, but from what I have seen so far,
it is a statistical failure.

And it is not IB specific, but mlx4 specific.

Thanks

