netdev.vger.kernel.org archive mirror
From: Jason Wang <jasowang@redhat.com>
To: wangyunjian <wangyunjian@huawei.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>
Cc: "kuba@kernel.org" <kuba@kernel.org>,
	"davem@davemloft.net" <davem@davemloft.net>,
	"mst@redhat.com" <mst@redhat.com>,
	"virtualization@lists.linux-foundation.org" 
	<virtualization@lists.linux-foundation.org>,
	dingxiaoxiong <dingxiaoxiong@huawei.com>
Subject: Re: [PATCH net-next] virtio_net: set link state down when virtqueue is broken
Date: Fri, 4 Jun 2021 10:37:30 +0800
Message-ID: <1cc933e6-cde4-ba20-3c54-7391db93a9a1@redhat.com>
In-Reply-To: <20a5f1bd8a5a49fa8c0f90875a49631b@huawei.com>


On 2021/6/3 at 7:34 PM, wangyunjian wrote:
>> -----Original Message-----
>> From: Jason Wang [mailto:jasowang@redhat.com]
>> Sent: Monday, May 31, 2021 11:29 AM
>> To: wangyunjian <wangyunjian@huawei.com>; netdev@vger.kernel.org
>> Cc: kuba@kernel.org; davem@davemloft.net; mst@redhat.com;
>> virtualization@lists.linux-foundation.org; dingxiaoxiong
>> <dingxiaoxiong@huawei.com>
>> Subject: Re: [PATCH net-next] virtio_net: set link state down when virtqueue is
>> broken
>>
>>
>> On 2021/5/28 at 6:58 PM, wangyunjian wrote:
>>>> -----Original Message-----
>>>>> From: Yunjian Wang <wangyunjian@huawei.com>
>>>>>
>>>>> The NIC can't receive/send packets if a rx/tx virtqueue is broken.
>>>>> However, the link state of the NIC is still normal. As a result, the
>>>>> user cannot detect the NIC exception.
>>>> Don't we have:
>>>>
>>>>           /* This should not happen! */
>>>>           if (unlikely(err)) {
>>>>                   dev->stats.tx_fifo_errors++;
>>>>                   if (net_ratelimit())
>>>>                           dev_warn(&dev->dev,
>>>>                                    "Unexpected TXQ (%d) queue failure: %d\n",
>>>>                                    qnum, err);
>>>>                   dev->stats.tx_dropped++;
>>>>                   dev_kfree_skb_any(skb);
>>>>                   return NETDEV_TX_OK;
>>>>           }
>>>>
>>>> Which should be sufficient?
>>> There may be other reasons for this error, e.g. -ENOSPC (no free descriptors).
>>
>> This should not happen unless the device or driver is buggy. We always reserve
>> sufficient slots:
>>
>>           if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>>                   netif_stop_subqueue(dev, qnum); ...
>>
>>
>>> And if the rx virtqueue is broken, there are no error statistics.
>>
>> Feel free to add one if it's necessary.
> Currently, on the receive side, it is impossible to tell whether no packets are
> arriving because the virtqueue is broken or because there is simply no traffic.


Can we introduce rx_fifo_errors for that?
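
For illustration, a minimal userspace sketch of that idea (not driver code). The `mock_vq` and `mock_stats` types and the `mock_rx_poll()` helper are hypothetical stand-ins; in the real driver the check would presumably be `virtqueue_is_broken()` on the receive queue, with the counter living in the device statistics. With such a counter, a rising `rx_fifo_errors` would point at a broken queue, while a flat one would suggest there is simply no traffic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a receive virtqueue. */
struct mock_vq {
	bool broken;          /* what virtqueue_is_broken() would report */
	unsigned int pending; /* packets waiting in the mock ring */
};

/* Hypothetical stand-in for the per-device statistics. */
struct mock_stats {
	uint64_t rx_packets;
	uint64_t rx_fifo_errors;
};

/*
 * Poll the queue once with the given budget and return the number of
 * packets received. A broken queue yields no packets and bumps
 * rx_fifo_errors, so "broken" and "idle" become distinguishable.
 */
static unsigned int mock_rx_poll(struct mock_vq *vq, struct mock_stats *stats,
				 unsigned int budget)
{
	unsigned int n;

	assert(vq != NULL && stats != NULL);

	if (vq->broken) {
		stats->rx_fifo_errors++;
		return 0;
	}

	n = vq->pending < budget ? vq->pending : budget;
	vq->pending -= n;
	stats->rx_packets += n;
	return n;
}
```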


>
>> Let's leave the policy decision (link down) to userspace.
>>
>>
>>>>> The driver can set the link state down when the virtqueue is broken.
>>>>> If the state is down, the user can switch over to another NIC.
>>>> Note that, we probably need the watchdog for virtio-net in order to
>>>> be a complete solution.
>>> Yes. What I can think of is detecting the broken virtqueue in the watchdog.
>>> Is there anything else that needs to be done?
>>
>> Basically, it's all about TX stalls, which the watchdog tries to catch. A broken vq is only
>> one of the possible reasons.
> Are there any plans for the watchdog?


Somebody posted a prototype 3 or 4 years ago; you can search for it, and
maybe we can start from there.
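
As a rough illustration of what a TX watchdog checks, here is a minimal userspace sketch. It is not the prototype mentioned above, and `mock_txq` and `mock_tx_stalled()` are hypothetical names. It is loosely modelled on the core networking watchdog, which treats a queue as stalled when it has been stopped without transmit progress for longer than the device timeout, whatever the underlying cause (a broken vq being only one).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a transmit queue's watchdog state. */
struct mock_txq {
	bool stopped;           /* queue stopped, waiting for free descriptors */
	uint64_t last_tx_ticks; /* tick count at the last transmit progress */
};

/*
 * Return true when the queue has been stopped with no transmit progress
 * for more than `timeout` ticks. A running queue is never considered
 * stalled, regardless of how long ago it last transmitted.
 */
static bool mock_tx_stalled(const struct mock_txq *q, uint64_t now,
			    uint64_t timeout)
{
	assert(q != NULL);
	return q->stopped && (now - q->last_tx_ticks) > timeout;
}
```

A driver-side handler invoked on such a stall could then inspect the virtqueue and decide what to report, leaving the policy decision to userspace.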

Thanks


>
> Thanks
>
>> Thanks
>>
>>
>>> Thanks
>>>
>>>> Thanks
>>>>
>>>>



Thread overview: 10+ messages
2021-05-26 11:39 [PATCH net-next] virtio_net: set link state down when virtqueue is broken wangyunjian
2021-05-27  0:28 ` Jakub Kicinski
2021-05-27  3:06   ` wangyunjian
2021-05-27  4:22 ` Jason Wang
2021-05-28 10:58   ` wangyunjian
2021-05-31  3:28     ` Jason Wang
2021-06-03 11:34       ` wangyunjian
2021-06-04  2:37         ` Jason Wang [this message]
2021-06-05  7:10           ` wangyunjian
2021-06-07  2:28             ` Jason Wang