All of lore.kernel.org
 help / color / mirror / Atom feed
From: Paul Gortmaker <paul.gortmaker@windriver.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org>, Roi Dayan <roid@mellanox.com>,
	Mark Bloch <markb@mellanox.com>,
	Saeed Mahameed <saeedm@mellanox.com>
Subject: Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5
Date: Wed, 10 Jun 2020 13:26:30 -0400	[thread overview]
Message-ID: <20200610172630.GA29331@windriver.com> (raw)
In-Reply-To: <20200609172907.GA873279@kroah.com>

[Re: Possible linux-stable mis-backport in ethernet/mellanox/mlx5] On 09/06/2020 (Tue 19:29) Greg Kroah-Hartman wrote:

> On Sun, Jun 07, 2020 at 04:34:25PM -0400, Paul Gortmaker wrote:
> > I happened to notice this commit:
> > 
> > 9ca415399dae - "net/mlx5: Annotate mutex destroy for root ns"
> > 
> > ...was backported to 4.19 and 5.4 and v5.6 in linux-stable.
> > 
> > It patches del_sw_root_ns() - which only exists after v5.7-rc7 from:
> > 
> > 6eb7a268a99b - "net/mlx5: Don't maintain a case of del_sw_func being
> > null"
> > 
> > which creates the one line del_sw_root_ns stub function around
> > kfree(node) by breaking it out of tree_put_node().
> > 
> > In the absense of del_sw_root_ns - the backport finds an identical one
> > line kfree stub fcn - named del_sw_prio from this earlier commit:
> > 
> > 139ed6c6c46a - "net/mlx5: Fix steering memory leak"  [in v4.15-rc5]
> > 
> > and then puts the mutex_destroy() into that (wrong) function, instead of
> > putting it into tree_put_node where the root ns case used to be handled.
> 
> Ugh, good catch.  I'll go revert this from everywhere.  If you could,
> can you provide a working backport?

Maybe the simplest option is to just forget the commit, now that
you reverted the mis-applied version.  In doing so, we are assuming
mutex_destroy is just a trap for future bugs/issues and not really a
run-time bug-fix (it is a no-op unless CONFIG_DEBUG_MUTEXES=y).

Or, if you want a valid backport I'd suggest the following:

Patching tree_put_node() in-place with the mutex_destroy results in long
lines, and isn't very future-proof if there are future mellanox backports
looking for del_sw_root_ns() -- and IIRC, the mellanox stuff does seem to
see a higher than average number of stable backports.

Instead as indicated below, establishing del_sw_root_ns() in the various
currently active stable versions can be achieved w/o any real runtime
impact - thus allowing for any future backports to have a higher
probability of applying, and applying properly.

I've read the code, and compile tested the below on x86-64 allmodconfig,
but I'm not familiar with the code and don't have the hardware, so I
defer to the Mellanox guys to double check on what I've outlined below.

Hope this helps,
Paul.

v5.6
=====
-apply 6eb7a268a99b so there is now a del_sw_root_ns().  [applies as-is]
This commit claims that it "Fixes: 2cc43b494a6c" [v4.5-rc1] but my
reading of the commit just says to me that it simplifies the code by
assigning the root ns a trivial delete so it doesn't have to be
special-cased.  Runtime doesn't appear to change IMHO.

-now apply 9ca415399dae - the original mutex_destroy() commit, and it
will apply as-is, and in the right function this time.`

v5.4
=====
-apply 476a028f0a2d ("net/mlx5: Fix cleaning unmanaged flow tables").
While it claims that it "Fixes: 5281a0c90919" [v5.6-rc1] - and I'm
assuming that is true, but what it really does is remove all the fragile
assumptions about who has a parent and how that translates to being the
root ns or not -- and simply replaces it with the more logical "if you
have a delete function, then I'll use it."

-now you can do the v5.6 steps above.

v4.19
=====
-two choices here, because v4.19 doesn't have 476d61b783e5 ("net/mlx5:
Add a locked flag to node removal functions") [v5.1-rc1] which does:
   -static void down_write_ref_node(struct fs_node *node)
   +static void down_write_ref_node(struct fs_node *node, bool locked)
...and the same for up.

Choice one is to strip the ", locked" from the commits used above in
v5.4/5.6; choice #2 is to simply apply this 476d61 as it applies as is
with only one minor fuzz warn, and as the commit says itself, it is inert
since "locked == false" at this stage of development.

-I went with #2 and then the steps above in v5.4 went "hands-free"

--
> 
> thanks,
> 
> greg k-h

      reply	other threads:[~2020-06-10 17:27 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-06-07 20:34 Possible linux-stable mis-backport in ethernet/mellanox/mlx5 Paul Gortmaker
2020-06-09 17:29 ` Greg Kroah-Hartman
2020-06-10 17:26   ` Paul Gortmaker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200610172630.GA29331@windriver.com \
    --to=paul.gortmaker@windriver.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=markb@mellanox.com \
    --cc=roid@mellanox.com \
    --cc=saeedm@mellanox.com \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.