From: Jay Vosburgh <fubar@us.ibm.com>
To: netdev@vger.kernel.org
Cc: "Américo Wang" <xiyou.wangcong@gmail.com>,
"Stephen Hemminger" <shemminger@vyatta.com>,
"Mitsuo Hayasaka" <mitsuo.hayasaka.hu@hitachi.com>,
"Andy Gospodarek" <andy@greyhouse.net>,
linux-kernel@vger.kernel.org, yrl.pp-manager.tt@hitachi.com
Subject: Re: [PATCH net -v2] [BUGFIX] bonding: use flush_delayed_work_sync in bond_close
Date: Fri, 21 Oct 2011 17:59:02 -0700
Message-ID: <14766.1319245142@death>
In-Reply-To: <17144.1319178396@death>
Jay Vosburgh <fubar@us.ibm.com> wrote:
>Américo Wang <xiyou.wangcong@gmail.com> wrote:
>
>>On Thu, Oct 20, 2011 at 3:09 AM, Jay Vosburgh <fubar@us.ibm.com> wrote:
>>> Stephen Hemminger <shemminger@vyatta.com> wrote:
>>>
>>>>On Wed, 19 Oct 2011 11:01:02 -0700
>>>>Jay Vosburgh <fubar@us.ibm.com> wrote:
>>>>
>>>>> Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com> wrote:
>>>>>
>>>>> >bond_close() calls cancel_delayed_work() to cancel delayed works.
>>>>> >It cannot, however, cancel works that were already queued in the workqueue.
>>>>> >bond_open() initializes work->data, and process_one_work() refers to
>>>>> >get_work_cwq(work)->wq->flags. get_work_cwq() returns NULL when
>>>>> >work->data has been initialized. Thus, a panic occurs.
>>>>> >
>>>>> >This patch uses flush_delayed_work_sync() instead of cancel_delayed_work()
>>>>> >in bond_close(). It cancels the delayed timer and waits for the work to
>>>>> >finish executing, which avoids the NULL pointer dereference caused by
>>>>> >process_one_work() running in parallel with the initialization done in
>>>>> >bond_open().
>>>>>
>>>>> I'm setting up to test this. I have a dim recollection that we
>>>>> tried this some years ago, and there was a different deadlock that
>>>>> manifested through the flush path. Perhaps changes since then have
>>>>> removed that problem.
>>>>>
>>>>> -J
>>>>
>>>>Won't this deadlock on RTNL? The problem is that:
>>>>
>>>>   CPU0                            CPU1
>>>> rtnl_lock
>>>>   bond_close
>>>>                                 delayed_work
>>>>                                   mii_work
>>>>                                     read_lock(bond->lock);
>>>>                                     read_unlock(bond->lock);
>>>>                                     rtnl_lock... waiting for CPU0
>>>>     flush_delayed_work_sync
>>>>       waiting for delayed_work to finish...
>>>
>>> Yah, that was it. We discussed this a couple of years ago in
>>> regards to a similar patch:
>>>
>>> http://lists.openwall.net/netdev/2009/12/17/3
>>>
>>> The short version is that we could rework the rtnl_lock inside
>>> the monitors to be conditional and retry on failure (where "retry" means
>>> "reschedule the work and try again later," not "spin retrying on rtnl").
>>> That should permit the use of flush or cancel to terminate the work
>>> items.
>>
>>Yes? Even if we use rtnl_trylock(), doesn't flush_delayed_work_sync()
>>still queue the pending delayed work and wait for it to be finished?
>
> Yes, it does. The original patch wants to use flush instead of
>cancel to wait for the work to finish, because there's evidently a
>possibility of getting back into bond_open before the work item
>executes, and bond_open would reinitialize the work queue and corrupt
>the queued work item.
>
> The original patch series, and recipe for destruction, is here:
>
> http://www.spinics.net/lists/netdev/msg176382.html
>
> I've been unable to reproduce the work queue panic locally,
>although it sounds plausible.
>
> Mitsuo: can you provide the precise bonding configuration you're
>using to induce the problem? Driver options, number and type of slaves,
>etc.
>
>>Maybe I am too blind, why do we need rtnl_lock for cancel_delayed_work()
>>inside bond_close()?
>
> We don't need RTNL for cancel/flush. However, bond_close is an
>ndo_stop operation, and is called in the dev_close path, which always
>occurs under RTNL. The mii / arp monitor work functions separately
>acquire RTNL if they need to perform various failover related
>operations.
>
> I'm working on a patch that should resolve the mii / arp monitor
>RTNL problem as I described above (if rtnl_trylock fails, punt and
>reschedule the work). I need to rearrange the netdev_bonding_change
>stuff a bit as well, since it acquires RTNL separately.
>
> Once these changes are made to mii / arp monitor, then
>bond_close can call flush instead of cancel, which should eliminate the
>original problem described at the top.

Just an update: there are three functions that may deadlock if
the cancel work calls are changed to flush_sync. There are two
rtnl_lock calls in each of the bond_mii_monitor and
bond_activebackup_arp_mon functions, and one more in the
bond_alb_monitor.
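
For illustration only, here is a rough user-space sketch of the "if rtnl_trylock fails, punt and reschedule the work" idea described above. It is not the bonding patch: fake_rtnl, fake_rtnl_trylock(), struct fake_monitor and fake_monitor_run() are all made-up names, and a C11 atomic_flag stands in for the RTNL mutex. The only property carried over from the kernel is that the trylock returns nonzero on success and zero when the lock is already held.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Stand-in for RTNL (assumption: a simple flag, not the kernel mutex). */
static atomic_flag fake_rtnl = ATOMIC_FLAG_INIT;

/* Returns 1 if the lock was taken, 0 if it was already held. */
static int fake_rtnl_trylock(void)
{
	return !atomic_flag_test_and_set(&fake_rtnl);
}

static void fake_rtnl_unlock(void)
{
	atomic_flag_clear(&fake_rtnl);
}

/* Hypothetical monitor bookkeeping; not the real struct bonding. */
struct fake_monitor {
	int failovers_done;	/* passes that ran the locked section */
	int reschedules;	/* passes that punted on a busy lock */
};

/*
 * One pass of a monitor work function.  Returns true if the locked
 * section ran, false if the lock was busy and the caller should
 * re-queue the work for a later attempt instead of blocking.
 */
static bool fake_monitor_run(struct fake_monitor *m)
{
	if (!fake_rtnl_trylock()) {
		/* Never sleep on the lock here: the holder may be waiting for us. */
		m->reschedules++;
		return false;
	}

	m->failovers_done++;	/* failover handling would go here */
	fake_rtnl_unlock();
	return true;
}
```

Because the work function never blocks on the lock, a caller that holds the lock and waits for the work to finish is not left waiting forever: the pass returns immediately and is simply tried again later.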

Still testing to make sure I haven't missed anything, and I
still haven't been able to reproduce Mitsuo's original failure.

-J
---
-Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com
Thread overview: 13+ messages
2011-10-19 8:17 [PATCH net -v2] [BUGFIX] bonding: use flush_delayed_work_sync in bond_close Mitsuo Hayasaka
2011-10-19 18:01 ` Jay Vosburgh
2011-10-19 18:41 ` Stephen Hemminger
2011-10-19 19:09 ` Jay Vosburgh
2011-10-21 5:45 ` Américo Wang
2011-10-21 6:26 ` Jay Vosburgh
2011-10-22 0:59 ` Jay Vosburgh [this message]
2011-10-24 4:00 ` HAYASAKA Mitsuo
2011-10-26 17:31 ` Jay Vosburgh
2011-10-28 1:52 ` HAYASAKA Mitsuo
2011-10-28 3:15 ` David Miller
2011-10-29 1:42 ` [PATCH net-next] bonding: eliminate bond_close race conditions Jay Vosburgh
2011-10-30 7:13 ` David Miller