public inbox for linux-kernel@vger.kernel.org
From: Feng Liu <feliu@nvidia.com>
To: Jason Wang <jasowang@redhat.com>, Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Cc: "Michael S . Tsirkin" <mst@redhat.com>,
	Simon Horman <simon.horman@corigine.com>,
	Bodong Wang <bodong@nvidia.com>, William Tu <witu@nvidia.com>,
	Parav Pandit <parav@nvidia.com>,
	virtualization@lists.linux-foundation.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	bpf@vger.kernel.org
Subject: Re: [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization
Date: Wed, 10 May 2023 07:54:49 -0400
Message-ID: <17f9d4ef-9493-3fcd-65ff-beb92f99f854@nvidia.com>
In-Reply-To: <a13a2d3f-e76e-b6a6-3d30-d5534e2fa917@redhat.com>



On 2023-05-10 a.m.1:00, Jason Wang wrote:
> On 2023/5/9 09:43, Xuan Zhuo wrote:
>> On Mon, 8 May 2023 11:00:10 -0400, Feng Liu <feliu@nvidia.com> wrote:
>>>
>>> On 2023-05-07 p.m.9:45, Xuan Zhuo wrote:
>>>> On Sat, 6 May 2023 08:08:02 -0400, Feng Liu <feliu@nvidia.com> wrote:
>>>>>
>>>>> On 2023-05-05 p.m.10:33, Xuan Zhuo wrote:
>>>>>> On Tue, 2 May 2023 20:35:25 -0400, Feng Liu <feliu@nvidia.com> wrote:
>>>>>>> When initializing XDP in virtnet_open(), the XDP initialization of
>>>>>>> some rq may fail, causing the net device open to fail. However, the
>>>>>>> previous rqs have already initialized XDP and enabled NAPI, which is
>>>>>>> not the expected behavior. Roll back the previous rq initialization
>>>>>>> to avoid leaks in the error unwinding of the init code.
>>>>>>>
>>>>>>> Also extract a helper function that disables a queue pair, and use
>>>>>>> the newly introduced helper in error unwinding and virtnet_close().
>>>>>>>
>>>>>>> Issue: 3383038
>>>>>>> Fixes: 754b8a21a96d ("virtio_net: setup xdp_rxq_info")
>>>>>>> Signed-off-by: Feng Liu <feliu@nvidia.com>
>>>>>>> Reviewed-by: William Tu <witu@nvidia.com>
>>>>>>> Reviewed-by: Parav Pandit <parav@nvidia.com>
>>>>>>> Reviewed-by: Simon Horman <simon.horman@corigine.com>
>>>>>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>>>>>>> Change-Id: Ib4c6a97cb7b837cfa484c593dd43a435c47ea68f
>>>>>>> ---
>>>>>>>     drivers/net/virtio_net.c | 30 ++++++++++++++++++++----------
>>>>>>>     1 file changed, 20 insertions(+), 10 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>>>>>>> index 8d8038538fc4..3737cf120cb7 100644
>>>>>>> --- a/drivers/net/virtio_net.c
>>>>>>> +++ b/drivers/net/virtio_net.c
>>>>>>> @@ -1868,6 +1868,13 @@ static int virtnet_poll(struct napi_struct *napi, int budget)
>>>>>>>          return received;
>>>>>>>     }
>>>>>>>
>>>>>>> +static void virtnet_disable_qp(struct virtnet_info *vi, int qp_index)
>>>>>>> +{
>>>>>>> +     virtnet_napi_tx_disable(&vi->sq[qp_index].napi);
>>>>>>> +     napi_disable(&vi->rq[qp_index].napi);
>>>>>>> +     xdp_rxq_info_unreg(&vi->rq[qp_index].xdp_rxq);
>>>>>>> +}
>>>>>>> +
>>>>>>>     static int virtnet_open(struct net_device *dev)
>>>>>>>     {
>>>>>>>          struct virtnet_info *vi = netdev_priv(dev);
>>>>>>> @@ -1883,20 +1890,26 @@ static int virtnet_open(struct net_device *dev)
>>>>>>>
>>>>>>>                  err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, vi->rq[i].napi.napi_id);
>>>>>>>                  if (err < 0)
>>>>>>> -                     return err;
>>>>>>> +                     goto err_xdp_info_reg;
>>>>>>>
>>>>>>>                  err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
>>>>>>>                                                   MEM_TYPE_PAGE_SHARED, NULL);
>>>>>>> -             if (err < 0) {
>>>>>>> -                     xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>>>>>>> -                     return err;
>>>>>>> -             }
>>>>>>> +             if (err < 0)
>>>>>>> +                     goto err_xdp_reg_mem_model;
>>>>>>>
>>>>>>>                  virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
>>>>>>>                  virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
>>>>>>>          }
>>>>>>>
>>>>>>>          return 0;
>>>>>>> +
>>>>>>> +err_xdp_reg_mem_model:
>>>>>>> +     xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>>>>>>> +err_xdp_info_reg:
>>>>>>> +     for (i = i - 1; i >= 0; i--)
>>>>>>> +             virtnet_disable_qp(vi, i);
>>>>>>
>>>>>> I would like to know whether we should also handle these:
>>>>>>
>>>>>>            disable_delayed_refill(vi);
>>>>>>            cancel_delayed_work_sync(&vi->refill);
>>>>>>
>>>>>>
>>>>>> Maybe we should call virtnet_close() with "i" directly.
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>> We can't use i directly here: if xdp_rxq_info_reg() fails, NAPI has
>>>>> not been enabled for the current qp yet, so I should roll back only
>>>>> the queue pairs where NAPI was already enabled (starting from i - 1);
>>>>> otherwise it will hang in the NAPI disable API.
>>>> That is not the point; the key question is whether we should also handle:
>>>>
>>>>             disable_delayed_refill(vi);
>>>>             cancel_delayed_work_sync(&vi->refill);
>>>>
>>>> Thanks.
>>>>
>>>>
>>> OK, I get the point. Thanks for your careful review. I checked the code
>>> again.
>>>
>>> There are two points that I need to explain:
>>>
>>> 1. All refill delayed-work paths (vi->refill, vi->refill_enabled) assume
>>> that the virtio interface was opened successfully, e.g.
>>> virtnet_receive, virtnet_rx_resize, _virtnet_set_queues, etc. If the
>>> XDP registration fails here, none of these subsequent functions will be
>>> triggered, so there is no need to call disable_delayed_refill() and
>>> cancel_delayed_work_sync().
>> Maybe something is wrong here. I think these lines may schedule the delayed work.
>>
>> static int virtnet_open(struct net_device *dev)
>> {
>>       struct virtnet_info *vi = netdev_priv(dev);
>>       int i, err;
>>
>>       enable_delayed_refill(vi);
>>
>>       for (i = 0; i < vi->max_queue_pairs; i++) {
>>               if (i < vi->curr_queue_pairs)
>>                       /* Make sure we have some buffers: if oom use wq. */
>> -->                   if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL))
>> -->                           schedule_delayed_work(&vi->refill, 0);
>>
>>               err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, vi->rq[i].napi.napi_id);
>>               if (err < 0)
>>                       return err;
>>
>>               err = xdp_rxq_info_reg_mem_model(&vi->rq[i].xdp_rxq,
>>                                                MEM_TYPE_PAGE_SHARED, NULL);
>>               if (err < 0) {
>>                       xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>>                       return err;
>>               }
>>
>>               virtnet_napi_enable(vi->rq[i].vq, &vi->rq[i].napi);
>>               virtnet_napi_tx_enable(vi, vi->sq[i].vq, &vi->sq[i].napi);
>>       }
>>
>>       return 0;
>> }
>>
>>
>> And I think that if virtnet_open() returns an error, the state of
>> virtnet should be the same as the state after virtnet_close().
>>
>> Or does someone have another opinion?
> 
> 
> I agree, we need to disable and sync with the refill work.
> 
> Thanks
> 
> 
OK, got it.
Thanks Jason & Xuan
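
For reference, the direction agreed above (on an open failure, stop the
refill work first, then roll back only the queue pairs that were fully
enabled) can be sketched in user space roughly as below. This is only a
mock under stated assumptions, not the driver code and not necessarily
what the next revision will look like: struct mock_vi and the mock_*
helpers are hypothetical stand-ins for struct virtnet_info, the per-qp
enable/disable steps and virtnet_open(), and the hang in napi_disable()
on a NAPI that was never enabled is modeled as an assert().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MAX_QP 4

/* Hypothetical stand-in for struct virtnet_info; not the kernel type. */
struct mock_vi {
	bool refill_enabled;   /* models vi->refill_enabled */
	bool refill_pending;   /* models a queued vi->refill work item */
	bool napi_on[MAX_QP];  /* models rq/sq NAPI state per queue pair */
	bool xdp_reg[MAX_QP];  /* models xdp_rxq_info registration per rq */
	bool oom;              /* if true, the mocked try_fill_recv() fails */
	int fail_at;           /* qp whose XDP registration fails, -1 for none */
};

/* Models XDP registration followed by NAPI enable for one queue pair. */
int mock_enable_qp(struct mock_vi *vi, int i)
{
	if (i == vi->fail_at)
		return -ENOMEM;  /* registration failed, NAPI was never enabled */
	vi->xdp_reg[i] = true;
	vi->napi_on[i] = true;
	return 0;
}

/* Models virtnet_disable_qp(): NAPI disable, then XDP unregistration. */
void mock_disable_qp(struct mock_vi *vi, int i)
{
	/* Disabling a never-enabled NAPI would hang; model it as an assert. */
	assert(vi->napi_on[i]);
	vi->napi_on[i] = false;
	vi->xdp_reg[i] = false;
}

int mock_open(struct mock_vi *vi)
{
	int i, err;

	vi->refill_enabled = true;  /* enable_delayed_refill() */

	for (i = 0; i < MAX_QP; i++) {
		if (vi->oom)
			vi->refill_pending = true;  /* schedule_delayed_work() */

		err = mock_enable_qp(vi, i);
		if (err < 0)
			goto err_enable_qp;
	}
	return 0;

err_enable_qp:
	vi->refill_enabled = false;  /* disable_delayed_refill() */
	vi->refill_pending = false;  /* cancel_delayed_work_sync() */

	/* qp i itself never enabled NAPI, so start the rollback at i - 1. */
	for (i--; i >= 0; i--)
		mock_disable_qp(vi, i);
	return err;
}

/* True if the mock is in the state a close would leave behind. */
bool mock_is_closed(const struct mock_vi *vi)
{
	int i;

	if (vi->refill_enabled || vi->refill_pending)
		return false;
	for (i = 0; i < MAX_QP; i++)
		if (vi->napi_on[i] || vi->xdp_reg[i])
			return false;
	return true;
}
```

With fail_at = 2 and oom = true, mock_open() returns -ENOMEM and
mock_is_closed() holds afterwards, which is the "same state as after
virtnet_close()" property Xuan asked for.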
>>
>> Thanks.
>>
>>> The logic here is different from that of
>>> virtnet_close. virtnet_close assumes that virtnet_open succeeded and
>>> that tx and rx have been running normally. For error unwinding, only
>>> disabling the qp is needed. I also encapsulated a helper function to
>>> disable a qp, which is used in error unwinding and in virtnet_close.
>>> 2. The current failing qp, which has not enabled NAPI, can only call
>>> the XDP unreg and cannot call the NAPI disable interface; otherwise
>>> the kernel will get stuck. That is why the loop starts from i - 1 and
>>> calls disable qp only on the previous queues.
>>>
>>> Thanks
>>>
>>>>>>> +
>>>>>>> +     return err;
>>>>>>>     }
>>>>>>>
>>>>>>>     static int virtnet_poll_tx(struct napi_struct *napi, int budget)
>>>>>>> @@ -2305,11 +2318,8 @@ static int virtnet_close(struct net_device *dev)
>>>>>>>          /* Make sure refill_work doesn't re-enable napi! */
>>>>>>>          cancel_delayed_work_sync(&vi->refill);
>>>>>>>
>>>>>>> -     for (i = 0; i < vi->max_queue_pairs; i++) {
>>>>>>> -             virtnet_napi_tx_disable(&vi->sq[i].napi);
>>>>>>> -             napi_disable(&vi->rq[i].napi);
>>>>>>> -             xdp_rxq_info_unreg(&vi->rq[i].xdp_rxq);
>>>>>>> -     }
>>>>>>> +     for (i = 0; i < vi->max_queue_pairs; i++)
>>>>>>> +             virtnet_disable_qp(vi, i);
>>>>>>>
>>>>>>>          return 0;
>>>>>>>     }
>>>>>>> -- 
>>>>>>> 2.37.1 (Apple Git-137.1)
>>>>>>>
> 

Thread overview: 12+ messages
2023-05-03  0:35 [PATCH net v3] virtio_net: Fix error unwinding of XDP initialization Feng Liu
2023-05-03  3:14 ` Parav Pandit
2023-05-03 15:00   ` Feng Liu
2023-05-06  2:33 ` Xuan Zhuo
2023-05-06 12:08   ` Feng Liu
2023-05-08  1:45     ` Xuan Zhuo
2023-05-08 15:00       ` Feng Liu
2023-05-09  1:43         ` Xuan Zhuo
2023-05-10  5:00           ` Jason Wang
2023-05-10 11:54             ` Feng Liu [this message]
2023-05-12  1:54             ` Feng Liu
2023-05-12  3:25               ` Xuan Zhuo
