From: Jason Wang <jasowang@redhat.com>
To: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Michael Dalton <mwdalton@google.com>,
"Michael S. Tsirkin" <mst@redhat.com>,
netdev@vger.kernel.org,
lf-virt <virtualization@lists.linux-foundation.org>,
Eric Dumazet <edumazet@google.com>,
"David S. Miller" <davem@davemloft.net>
Subject: Re: [PATCH net-next 2/3] virtio-net: use per-receive queue page frag alloc for mergeable bufs
Date: Fri, 27 Dec 2013 14:12:41 +0800
Message-ID: <52BD1A59.9090706@redhat.com>
In-Reply-To: <1388123210.12212.44.camel@edumazet-glaptop2.roam.corp.google.com>
On 12/27/2013 01:46 PM, Eric Dumazet wrote:
> On Fri, 2013-12-27 at 12:55 +0800, Jason Wang wrote:
>> On 12/27/2013 05:56 AM, Eric Dumazet wrote:
>>> On Thu, 2013-12-26 at 13:28 -0800, Michael Dalton wrote:
>>>> On Mon, Dec 23, 2013 at 11:37 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>> So there isn't a conflict with respect to locking.
>>>>>
>>>>> Is it problematic to use same page_frag with both GFP_ATOMIC and with
>>>>> GFP_KERNEL? If yes why?
>>>> I believe it is safe to use the same page_frag and I will send out a
>>>> followup patchset using just the per-receive page_frags. For future
>>>> consideration, Eric noted that disabling NAPI before GFP_KERNEL
>>>> allocs can potentially inhibit virtio-net network processing for some
>>>> time (e.g., during a blocking memory allocation or preemption).
>>> Yep, using napi_disable() in the refill process looks quite inefficient
>>> to me, if not buggy.
>>>
>>> napi_disable() is a big hammer, while the whole idea of having a process
>>> block on GFP_KERNEL allocations is to allow some asynchronous behavior.
>>>
>>> I have a hard time convincing myself virtio_net is safe anyway with this
>>> work queue thing.
>>>
>>> virtnet_open() seems racy for example :
>>>
>>> for (i = 0; i < vi->max_queue_pairs; i++) {
>>>         if (i < vi->curr_queue_pairs)
>>>                 /* Make sure we have some buffers: if oom use wq. */
>>>                 if (!try_fill_recv(&vi->rq[i], GFP_KERNEL))
>>>                         schedule_delayed_work(&vi->refill, 0);
>>>         virtnet_napi_enable(&vi->rq[i]);
>>>
>>>
>>> What if the workqueue is scheduled _before_ the call to virtnet_napi_enable(&vi->rq[i]) ?
>> Then napi_disable() in refill_work() will busy wait until napi is
>> enabled by virtnet_napi_enable(), which looks safe. It looks like the real
>> issue is in virtnet_restore(), which calls try_fill_recv() in neither napi
>> context nor napi-disabled context.
> I think you don't really get the race.
>
> The issue is the following :
>
> CPU0                                  CPU1
>
> schedule_delayed_work()
>                                       napi_disable(&rq->napi);
>                                       try_fill_recv(rq, GFP_KERNEL);
Unless I'm missing something, in this case, for a specific rq,
napi_disable() won't return immediately because the NAPI_STATE_SCHED bit
is still set. It will busy wait until virtnet_napi_enable() clears the
NAPI_STATE_SCHED bit, and then set the bit again. So try_fill_recv() in
CPU1 should only be called after the rq's napi has been enabled by
virtnet_napi_enable() in CPU0. So the following order is guaranteed:
- try_fill_recv(rq, GFP_KERNEL) in CPU0
- virtnet_napi_enable(&vi->rq[i]) in CPU0
- napi_disable(&rq->napi) returned in CPU1
- try_fill_recv(rq, GFP_KERNEL) in CPU1
...
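
To make the ordering argument concrete, here is a minimal user-space
sketch of how the NAPI_STATE_SCHED bit is handed between napi_enable()
and napi_disable(). It is a simplified model under stated assumptions,
not the kernel code: struct model_napi and all model_* helpers are
hypothetical names, only the SCHED bit is modelled, and the sleeping
wait loop in napi_disable() is replaced by a single non-blocking attempt
so the steps can be replayed on one thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct napi_struct: only SCHED is modelled. */
struct model_napi {
	atomic_bool sched;	/* models NAPI_STATE_SCHED */
};

/* A NAPI instance that has not been enabled yet holds the SCHED bit. */
static void model_napi_init(struct model_napi *n)
{
	atomic_init(&n->sched, true);
}

/*
 * Models napi_enable(): clears SCHED. Returns false in the case where
 * the kernel would trip BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state)).
 */
static bool model_napi_enable(struct model_napi *n)
{
	if (!atomic_load(&n->sched))
		return false;
	atomic_store(&n->sched, false);
	return true;
}

/*
 * Models one iteration of the wait loop in napi_disable(): atomically
 * test-and-set SCHED. Succeeds only if the bit was clear, i.e. only
 * once napi has been enabled and is not currently polling.
 */
static bool model_napi_disable_try(struct model_napi *n)
{
	return !atomic_exchange(&n->sched, true);
}

/* Replays the open vs. refill ordering on a single thread. */
static void model_open_vs_refill(void)
{
	struct model_napi n;

	model_napi_init(&n);
	/* CPU1: refill work runs first; disable cannot complete yet. */
	assert(!model_napi_disable_try(&n));
	/* CPU0: virtnet_napi_enable() clears SCHED. */
	assert(model_napi_enable(&n));
	/* CPU1: disable now completes and owns SCHED during the refill. */
	assert(model_napi_disable_try(&n));
	/* CPU1: enable after the refill sees SCHED set, so no BUG_ON. */
	assert(model_napi_enable(&n));
}
```

In this model the BUG_ON condition is reachable only by calling
model_napi_enable() without holding the bit, for example twice in a row.
A disable attempt that has not yet succeeded never gets that far, which
is the point of the ordering listed above.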
>
> virtnet_napi_enable(&vi->rq[i]);
> ...
> try_fill_recv(rq, GFP_ATOMIC);
>
>                                       napi_enable();// crash on :
>                                       BUG_ON(!test_bit(NAPI_STATE_SCHED, &n->state));
>