stable.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>, Roi Dayan <roid@mellanox.com>,
	Mark Bloch <markb@mellanox.com>,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5
Date: Wed, 10 Jun 2020 13:26:30 -0400	[thread overview]
Message-ID: <20200610172630.GA29331@windriver.com> (raw)
In-Reply-To: <20200609172907.GA873279@kroah.com>

[Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5] On 09/06/2020 (Tue 19:29) Greg Kroah-Hartman wrote:

> On Sun, Jun 07, 2020 at 04:34:25PM -0400, Paul Gortmaker wrote:
> > I happened to notice this commit:
> > 
> > 9ca415399dae - "net/mlx5: Annotate mutex destroy for root ns"
> > 
> > ...was backported to 4.19 and 5.4 and v5.6 in linux-stable.
> > 
> > It patches del_sw_root_ns() - which only exists after v5.7-rc7 from:
> > 
> > 6eb7a268a99b - "net/mlx5: Don't maintain a case of del_sw_func being
> > null"
> > 
> > which creates the one line del_sw_root_ns stub function around
> > kfree(node) by breaking it out of tree_put_node().
> > 
> > In the absense of del_sw_root_ns - the backport finds an identical one
> > line kfree stub fcn - named del_sw_prio from this earlier commit:
> > 
> > 139ed6c6c46a - "net/mlx5: Fix steering memory leak"  [in v4.15-rc5]
> > 
> > and then puts the mutex_destroy() into that (wrong) function, instead of
> > putting it into tree_put_node where the root ns case used to be handled.
> 
> Ugh, good catch.  I'll go revert this from everywhere.  If you could,
> can you provide a working backport?

Maybe the simplest option is to just forget the commit, now that
you reverted the mis-applied version.  In doing so, we are assuming
mutex_destroy is just a trap for future bugs/issues and not really a
run-time bug-fix (it is a no-op unless CONFIG_DEBUG_MUTEXES=y).

Or, if you want a valid backport I'd suggest the following:

Patching tree_put_node() in-place with the mutex_destroy results in long
lines, and isn't very future-proof if there are future mellanox backports
looking for del_sw_root_ns() -- and IIRC, the mellanox stuff does seem to
see a higher than average number of stable backports.

Instead as indicated below, establishing del_sw_root_ns() in the various
currently active stable versions can be achieved w/o any real runtime
impact - thus allowing for any future backports to have a higher
probability of applying, and applying properly.

I've read the code, and compile tested the below on x86-64 allmodconfig,
but I'm not familiar with the code and don't have the hardware, so I
defer to the Mellanox guys to double check on what I've outlined below.

Hope this helps,
Paul.

v5.6
=====
-apply 6eb7a268a99b so there is now a del_sw_root_ns().  [applies as-is]
This commit claims that it "Fixes: 2cc43b494a6c" [v4.5-rc1] but my
reading of the commit just says to me that it simplifies the code by
assigning the root ns a trivial delete so it doesn't have to be
special-cased.  Runtime doesn't appear to change IMHO.

-now apply 9ca415399dae - the original mutex_destroy() commit, and it
will apply as-is, and in the right function this time.`

v5.4
=====
-apply 476a028f0a2d ("net/mlx5: Fix cleaning unmanaged flow tables").
While it claims that it "Fixes: 5281a0c90919" [v5.6-rc1] - and I'm
assuming that is true, but what it really does is remove all the fragile
assumptions about who has a parent and how that translates to being the
root ns or not -- and simply replaces it with the more logical "if you
have a delete function, then I'll use it."

-now you can do the v5.6 steps above.

v4.19
=====
-two choices here, because v4.19 doesn't have 476d61b783e5 ("net/mlx5:
Add a locked flag to node removal functions") [v5.1-rc1] which does:
   -static void down_write_ref_node(struct fs_node *node)
   +static void down_write_ref_node(struct fs_node *node, bool locked)
...and the same for up.

Choice one is to strip the ", locked" from the commits used above in
v5.4/5.6; choice #2 is to simply apply this 476d61 as it applies as is
with only one minor fuzz warn, and as the commit says itself, it is inert
since "locked == false" at this stage of development.

-I went with #2 and then the steps above in v5.4 went "hands-free"

--
> 
> thanks,
> 
> greg k-h

      reply	other threads:[~2020-06-10 17:27 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-07 20:34 Possible linux-stable mis-backport in ethernet/mellanox/mlx5 Paul Gortmaker
2020-06-09 17:29 ` Greg Kroah-Hartman
2020-06-10 17:26   ` Paul Gortmaker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200610172630.GA29331@windriver.com \
    --to=paul.gortmaker@windriver.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=markb@mellanox.com \
    --cc=roid@mellanox.com \
    --cc=saeedm@mellanox.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).