From: Salil Mehta <salil.mehta@huawei.com>
To: linyunsheng <linyunsheng@huawei.com>,
"davem@davemloft.net" <davem@davemloft.net>
Cc: "hkallweit1@gmail.com" <hkallweit1@gmail.com>,
"f.fainelli@gmail.com" <f.fainelli@gmail.com>,
"stephen@networkplumber.org" <stephen@networkplumber.org>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Linuxarm <linuxarm@huawei.com>
Subject: RE: [PATCH v2 net-next] net: link_watch: prevent starvation when processing linkwatch wq
Date: Fri, 31 May 2019 11:17:00 +0000
Message-ID: <8a93eecf7a7a4ffd81f1b7d08f1a7442@huawei.com>
In-Reply-To: <1559293233-43017-1-git-send-email-linyunsheng@huawei.com>

> From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
> On Behalf Of Yunsheng Lin
> Sent: Friday, May 31, 2019 10:01 AM
> To: davem@davemloft.net
> Cc: hkallweit1@gmail.com; f.fainelli@gmail.com;
> stephen@networkplumber.org; netdev@vger.kernel.org;
> linux-kernel@vger.kernel.org; Linuxarm <linuxarm@huawei.com>
> Subject: [PATCH v2 net-next] net: link_watch: prevent starvation when
> processing linkwatch wq
>
> When a user has configured a large number of virtual netdevs, such
> as 4K vlans, a carrier on/off operation on the real netdev will
> also cause the link state of its virtual netdevs to be processed
> in linkwatch. Currently, the processing is done in a work queue,
> which may cause cpu and rtnl lock starvation.
>
> This patch releases the cpu and the rtnl lock once the link watch
> worker has processed a fixed number of netdev link watch events.
>
> Currently __linkwatch_run_queue is called with the rtnl lock held,
> so enforce that with ASSERT_RTNL();
>
> Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
> ---
> V2: use cond_resched and rtnl_unlock after processing a fixed
> number of events
> ---
> net/core/link_watch.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/net/core/link_watch.c b/net/core/link_watch.c
> index 7f51efb..07eebfb 100644
> --- a/net/core/link_watch.c
> +++ b/net/core/link_watch.c
> @@ -168,9 +168,18 @@ static void linkwatch_do_dev(struct net_device *dev)
>
>  static void __linkwatch_run_queue(int urgent_only)
>  {
> +#define MAX_DO_DEV_PER_LOOP	100
> +
> +	int do_dev = MAX_DO_DEV_PER_LOOP;
>  	struct net_device *dev;
>  	LIST_HEAD(wrk);
>
> +	ASSERT_RTNL();
> +
> +	/* Give urgent case more budget */
> +	if (urgent_only)
> +		do_dev += MAX_DO_DEV_PER_LOOP;
> +
>  	/*
>  	 * Limit the number of linkwatch events to one
>  	 * per second so that a runaway driver does not
> @@ -200,6 +209,14 @@ static void __linkwatch_run_queue(int urgent_only)
>  		}
>  		spin_unlock_irq(&lweventlist_lock);
>  		linkwatch_do_dev(dev);
> +
> +		if (--do_dev < 0) {
> +			rtnl_unlock();
> +			cond_resched();

Sorry, I missed this in my earlier comment. I can see multiple problems
here; please correct me if I am wrong:

1. Releasing the rtnl lock here and then rescheduling might not be safe,
   especially when you are already holding *lweventlist_lock* (which is
   global, not per-netdev) at the point where you try to reschedule.
   This can cause a *deadlock* with itself.

   Reason: once you release the rtnl lock, the similar leg of
   netdev_wait_allrefs() could be called for some other netdevice, which
   might end up waiting for the same global linkwatch event list lock,
   i.e. *lweventlist_lock*.

2. After releasing the rtnl lock we have not ensured that all RCU
   operations are complete. Perhaps we need to call rcu_barrier() before
   retaking the rtnl lock.

> +			do_dev = MAX_DO_DEV_PER_LOOP;

I think rcu_barrier() should be called here.

> +			rtnl_lock();
> +		}
> +
>  		spin_lock_irq(&lweventlist_lock);
>  	}
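
For anyone who wants to experiment with the lock ordering discussed
above outside the kernel, here is a minimal user-space sketch of the
budgeted-yield loop. It is only an analogy and not kernel code:
big_lock stands in for the rtnl lock, list_lock for lweventlist_lock,
sched_yield() for cond_resched(), and struct event, run_queue(),
demo() and MAX_EVENTS are hypothetical names made up for this example.
The order of lock and unlock calls follows the quoted hunk as posted.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define MAX_DO_DEV_PER_LOOP 100	/* budget value taken from the patch */
#define MAX_EVENTS 4096		/* hypothetical cap, for this demo only */

/* Hypothetical stand-ins: big_lock plays rtnl, list_lock plays lweventlist_lock. */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;

struct event {
	struct event *next;
	int handled;
};

/*
 * Drain the list. The caller holds big_lock on entry and holds it again
 * on return. Returns how many times big_lock was dropped and retaken.
 */
static int run_queue(struct event **head, int urgent_only)
{
	int budget = MAX_DO_DEV_PER_LOOP;
	int yields = 0;

	assert(head != NULL);

	/* Give the urgent case more budget, as in the patch. */
	if (urgent_only)
		budget += MAX_DO_DEV_PER_LOOP;

	pthread_mutex_lock(&list_lock);
	while (*head) {
		struct event *ev = *head;

		*head = ev->next;
		pthread_mutex_unlock(&list_lock);

		ev->handled = 1;	/* stand-in for linkwatch_do_dev() */

		if (--budget < 0) {
			/* list_lock is not held at this point in the sketch. */
			pthread_mutex_unlock(&big_lock);
			sched_yield();	/* stand-in for cond_resched() */
			budget = MAX_DO_DEV_PER_LOOP;
			pthread_mutex_lock(&big_lock);
			yields++;
		}

		pthread_mutex_lock(&list_lock);
	}
	pthread_mutex_unlock(&list_lock);

	return yields;
}

/* Build a list of n events, drain it, and return the yield count (-1 on error). */
static int demo(int n, int urgent_only)
{
	static struct event evs[MAX_EVENTS];
	struct event *head = NULL;
	int i, yields;

	if (n < 0 || n > MAX_EVENTS)
		return -1;

	for (i = n - 1; i >= 0; i--) {
		evs[i].handled = 0;
		evs[i].next = head;
		head = &evs[i];
	}

	pthread_mutex_lock(&big_lock);
	yields = run_queue(&head, urgent_only);
	pthread_mutex_unlock(&big_lock);

	for (i = 0; i < n; i++)
		if (!evs[i].handled)
			return -1;

	return yields;
}
```

Calling demo(4096, 0) models the 4K vlan case from the commit message.
The sketch does not model RCU at all, so it says nothing about whether
rcu_barrier() is needed; it only makes it easier to see which lock is
held at each step of the loop.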