netdev.vger.kernel.org archive mirror
From: Sowmini Varadhan <sowmini.varadhan@oracle.com>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: netdev <netdev@vger.kernel.org>
Subject: Re: [PATCH RFC v2 net-next] rds-tcp: Take explicit refcounts on struct net
Date: Thu, 2 Mar 2017 05:46:33 -0500
Message-ID: <20170302104633.GD23804@oracle.com>
In-Reply-To: <CACT4Y+Zzqq38Av-brhoDYpyY_1cvxTGjxehZkzz9wwF3GjcTdQ@mail.gmail.com>

On (03/02/17 11:07), Dmitry Vyukov wrote:
> 
> The other 2 do not look net-related, but you also mailed patch
> "Cancel any pending connection attempts before taking down
> connection", which looks like it should fix the other 2, right?

No, that patch was still broken because, as you pointed out,
it only takes care of one workq and not the others. Also,
there are a number of cleanup operations performed on the socket
associated with the rds_connection, all of which could potentially
be in jeopardy if the race is happening as suspected. I think the
v2 patch (this subject line) is the more appropriate fix; I see
that the same thing is being done for svc_xprt's xpt_net, for example.

> I now applied both of your patches on bots. But only happened 1+2
> times over the last 2 weeks. So it will require at least a month to
> make a weak conclusion that it might have helped. So I would suggest
> to either (1) re-review the crash reports, the code and the fix and
> commit it if everything looks consistent, or (2) write a stress test
> that provokes the bugs as much as possible, add some sleeps into the
> kernel code, reproduce the crashes and check that the patches fix
> them.

I can try both, but IME reproducing such things is quite challenging.
Even with things like dtrace-chill on other OSes, it can take a long
time to nail it. Let's give it a week, while I try out (1) at least.

--Sowmini


Thread overview: 3+ messages
2017-02-28 16:33 [PATCH RFC v2 net-next] rds-tcp: Take explicit refcounts on struct net Sowmini Varadhan
2017-03-02 10:07 ` Dmitry Vyukov
2017-03-02 10:46   ` Sowmini Varadhan [this message]
