Netdev List


* Re: [PATCH RFC v4 net-next 1/5] virtio_net: enable tx interrupt
From: Jason Wang @ 2014-12-19 10:02 UTC (permalink / raw)
  To: Qin Chuanyu; +Cc: pagupta, mst, netdev, linux-kernel, virtualization, davem
In-Reply-To: <5493D488.7020000@huawei.com>



On Fri, Dec 19, 2014 at 3:32 PM, Qin Chuanyu <qinchuanyu@huawei.com> wrote:
> On 2014/12/1 18:17, Jason Wang wrote:
>> On newer hosts that support delayed tx interrupts,
>> we probably don't have much to gain from orphaning
>> packets early.
>> 
>> Note: this might degrade performance for
>> hosts without event idx support.
>> Should be addressed by the next patch.
>> 
>> Cc: Rusty Russell <rusty@rustcorp.com.au>
>> Cc: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>> ---
>>   drivers/net/virtio_net.c | 132 
>> +++++++++++++++++++++++++++++++----------------
>>   1 file changed, 88 insertions(+), 44 deletions(-)
>> 
>> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
>> index ec2a8b4..f68114e 100644
>> --- a/drivers/net/virtio_net.c
>> +++ b/drivers/net/virtio_net.c
>>   static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>>   {
>>   	struct skb_vnet_hdr *hdr;
>> @@ -912,7 +951,9 @@ static int xmit_skb(struct send_queue *sq, 
>> struct sk_buff *skb)
>>   		sg_set_buf(sq->sg, hdr, hdr_len);
>>   		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
>>   	}
>> -	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, 
>> GFP_ATOMIC);
>> +
>> +	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb,
>> +				    GFP_ATOMIC);
>>   }
>> 
>>   static netdev_tx_t start_xmit(struct sk_buff *skb, struct 
>> net_device *dev)
>> @@ -924,8 +965,7 @@ static netdev_tx_t start_xmit(struct sk_buff 
>> *skb, struct net_device *dev)
>>   	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>>   	bool kick = !skb->xmit_more;
>> 
>> -	/* Free up any pending old buffers before queueing new ones. */
>> -	free_old_xmit_skbs(sq);
> 
> I think there is no need to remove free_old_xmit_skbs here.
> you could add free_old_xmit_skbs in tx_irq's napi func.
> also could do this in start_xmit if you handle the race well.

Note that free_old_xmit_skbs() is already called in tx napi.
That became a must once the tx interrupt was enabled.
> 
> 
> I have done the same thing in ixgbe driver(free skb in ndo_start_xmit 
> and both in tx_irq's poll func), and it seems work well:)

Any performance numbers on this change?
I suspect it reduces the effect of interrupt coalescing.
> 
> I think there would be no so much interrupts in this way, also tx 
> interrupt coalesce is not needed.

Tests (multiple sessions of TCP_RR) do not support this.
Calling free_old_xmit_skbs() in fact hurts performance.

Why do you think it would reduce the number of interrupts?

Thanks
> 
> 
>> +	virtqueue_disable_cb(sq->vq);
>> 
>>   	/* Try to transmit */
>>   	err = xmit_skb(sq, skb);
>> @@ -941,27 +981,19 @@ static netdev_tx_t start_xmit(struct sk_buff 
>> *skb, struct net_device *dev)
>>   		return NETDEV_TX_OK;
>>   	}
>> 
>> -	/* Don't wait up for transmitted skbs to be freed. */
>> -	skb_orphan(skb);
>> -	nf_reset(skb);
>> -
>>   	/* Apparently nice girls don't return TX_BUSY; stop the queue
>>   	 * before it gets out of hand.  Naturally, this wastes entries. */
>> -	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
>> +	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
>>   		netif_stop_subqueue(dev, qnum);
>> -		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>> -			/* More just got used, free them then recheck. */
>> -			free_old_xmit_skbs(sq);
>> -			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
>> -				netif_start_subqueue(dev, qnum);
>> -				virtqueue_disable_cb(sq->vq);
>> -			}
>> -		}
>> -	}
>> 
>>   	if (kick || netif_xmit_stopped(txq))
>>   		virtqueue_kick(sq->vq);
>> 
>> +	if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
>> +		virtqueue_disable_cb(sq->vq);
>> +		napi_schedule(&sq->napi);
>> +	}
>> +
>>   	return NETDEV_TX_OK;
>>   }
>> 
>> @@ -1138,8 +1170,10 @@ static int virtnet_close(struct net_device 
>> *dev)
>>   	/* Make sure refill_work doesn't re-enable napi! */
>>   	cancel_delayed_work_sync(&vi->refill);
>> 
>> -	for (i = 0; i < vi->max_queue_pairs; i++)
>> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>>   		napi_disable(&vi->rq[i].napi);
>> +		napi_disable(&vi->sq[i].napi);
>> +	}
>> 
>>   	return 0;
>>   }
>> @@ -1452,8 +1486,10 @@ static void virtnet_free_queues(struct 
>> virtnet_info *vi)
>>   {
>>   	int i;
>> 
>> -	for (i = 0; i < vi->max_queue_pairs; i++)
>> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>>   		netif_napi_del(&vi->rq[i].napi);
>> +		netif_napi_del(&vi->sq[i].napi);
>> +	}
>> 
>>   	kfree(vi->rq);
>>   	kfree(vi->sq);
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
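
The re-arm logic at the end of start_xmit() in the quoted diff can be
modelled in user space. The sketch below is an illustration only, not the
driver code: struct toy_vq, the toy_* helpers and the value 17 for
MAX_SKB_FRAGS are assumptions, and toy_enable_cb_delayed() simplifies
virtqueue_enable_cb_delayed() to "return false when completions are
already pending".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a virtqueue; NOT the kernel's struct virtqueue. */
struct toy_vq {
	unsigned int num_free;     /* free descriptors */
	unsigned int pending_used; /* completed by the host, not yet reclaimed */
	bool cb_enabled;           /* is the tx callback (interrupt) armed? */
};

#define TOY_MAX_SKB_FRAGS 17 /* assumed value; the real one depends on config */

/* Mirrors the "num_free < 2+MAX_SKB_FRAGS" test from the diff. */
static bool toy_should_stop_queue(const struct toy_vq *vq)
{
	return vq->num_free < 2 + TOY_MAX_SKB_FRAGS;
}

/* Simplified model: arm the callback, report false if work is pending. */
static bool toy_enable_cb_delayed(struct toy_vq *vq)
{
	vq->cb_enabled = true;
	return vq->pending_used == 0;
}

static void toy_disable_cb(struct toy_vq *vq)
{
	vq->cb_enabled = false;
}

/*
 * Tail of the transmit path: returns true when the caller has to
 * schedule tx napi because completions raced with re-arming.
 */
static bool toy_xmit_tail(struct toy_vq *vq)
{
	assert(vq != NULL);

	if (!toy_enable_cb_delayed(vq)) {
		toy_disable_cb(vq);
		return true;
	}
	return false;
}
```

When toy_xmit_tail() returns true, the caller would schedule tx napi,
which is where free_old_xmit_skbs() runs according to the reply above.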


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  9:55 UTC (permalink / raw)
  To: B Viswanath
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAN+pFwJr7TvJEW4x_HYPgTEvYGfi+YHgp=5vek2YiU1GLiH6_g@mail.gmail.com>

Fri, Dec 19, 2014 at 10:35:27AM CET, marichika4@gmail.com wrote:
>On 19 December 2014 at 14:53, Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Dec 19, 2014 at 10:01:46AM CET, marichika4@gmail.com wrote:
>>>On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
>>>> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>>>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
>>><snipped for ease of reading>
>>>>>>>
>>>>>>>
>>>>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>>>>> where sw0 is a bridge representing the hardware switch.
>>>>>>
>>>>>>
>>>>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>>>>> a device
>>>>>> representing the switch asic. This is in the works.
>>>>>
>>>>>
>>>>>Can I assume that on  platforms which house more than one asic (say
>>>>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>>>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>>>>them as a single set of ports, and not as two 'switch ports' ?
>>>>
>>>> Well that really depends on particular implementation and drivers. If you
>>>> have 2 pci-e devices, I think you should expose them as 2 entities. For
>>>> sure, you can have the driver to do the masking for you. I don't believe
>>>> that is correct though.
>>>>
>>>
>>>In a platform that houses two asic chips, IMO, the user is still
>>>expected to manage the router as a single entity. The configuration
>>>being applied on both asic devices need to be matching if not
>>>identical, and may not be conflicting. The FDB is to be synchronized
>>>so that (offloaded) switching can happen across the asics. Some of
>>>this stuff is asic specific anyway. Another example is that of the
>>>learning. The (hardware) learning can't be enabled on one asic, while
>>>being disabled on another one. The general use cases I have seen are
>>>all involving managing the 'router' as a single entity.  That the
>>>'router' is implemented with two asics instead of a single asic (with
>>>more ports) is to be treated as an implementation detail.  This is the
>>>usual router management method that exists today.
>>>
>>>I hope I make sense.
>>>
>>>So I am trying to figure out what this single entity that will be used
>>>from a user perspective. It can be a bridge, but our bridges are more
>>>802.1q bridges. We can use the 'self' mode, but then it means that it
>>>should reflect the entire port count, and not just an asic.
>>>
>>>So I was trying to deduce that in our switchdevice model, the best bet
>>>would be to leave the unification to the driver (i.e., to project the
>>>multiple physical asics as a single virtual switch device). This
>>
>> Is it possible to have the asic as just single one? Or is it possible to
>> connect asics being multiple chips maybe from multiple vendors together?
>
>I didn't understand the first question. Some times, it is possible to

I meant: is there a design with just a single asic of this type,
instead of a pair?

>have a single asic replace two, but its a cost factor, and others that
>are involved.
>
>AFAIK, the answer to the second question is a No. Two asics from
>different vendors may not be connected together. The interconnect
>tends to be proprietary.

Okay. In that case, it might make sense to mask it on driver level.


>
>> I believe that answer is "yes" in both cases. Making two separate asics
>> to appear as one for user is not correct in my opinion. Driver should
>> not do such masking. It is unclean, unextendable.
>>
>
>I am only looking for a single management entity. I am not thinking it
>needs to be at driver level. I am not sure of  any other option apart
>from creating a 'switchdev' that Roopa was mentioning.

Well the thing is there is a common desire to make the offloading as
transparent as possible. For example, have 4 ports of same switch and
put them into br0. Just like that, without need to do anything else
than you would do when bridging ordinary NICs. Introducing some
"management entity" would break this approach.


>
>>
>>>allows any 'switch' level configurations to the bridge in 'self' mode.
>>>
>>>And  then we would need to consider stacking. Stacking differs from
>>>this multi-asic scenario since  there would be multiple CPU involved.
>>>
>>>Thanks
>>>Vissu
>>>
>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: B Viswanath @ 2014-12-19  9:35 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141219092308.GF1848@nanopsycho.orion>

On 19 December 2014 at 14:53, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Dec 19, 2014 at 10:01:46AM CET, marichika4@gmail.com wrote:
>>On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
>><snipped for ease of reading>
>>>>>>
>>>>>>
>>>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>>>> where sw0 is a bridge representing the hardware switch.
>>>>>
>>>>>
>>>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>>>> a device
>>>>> representing the switch asic. This is in the works.
>>>>
>>>>
>>>>Can I assume that on  platforms which house more than one asic (say
>>>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>>>them as a single set of ports, and not as two 'switch ports' ?
>>>
>>> Well that really depends on particular implementation and drivers. If you
>>> have 2 pci-e devices, I think you should expose them as 2 entities. For
>>> sure, you can have the driver to do the masking for you. I don't believe
>>> that is correct though.
>>>
>>
>>In a platform that houses two asic chips, IMO, the user is still
>>expected to manage the router as a single entity. The configuration
>>being applied on both asic devices need to be matching if not
>>identical, and may not be conflicting. The FDB is to be synchronized
>>so that (offloaded) switching can happen across the asics. Some of
>>this stuff is asic specific anyway. Another example is that of the
>>learning. The (hardware) learning can't be enabled on one asic, while
>>being disabled on another one. The general use cases I have seen are
>>all involving managing the 'router' as a single entity.  That the
>>'router' is implemented with two asics instead of a single asic (with
>>more ports) is to be treated as an implementation detail.  This is the
>>usual router management method that exists today.
>>
>>I hope I make sense.
>>
>>So I am trying to figure out what this single entity that will be used
>>from a user perspective. It can be a bridge, but our bridges are more
>>802.1q bridges. We can use the 'self' mode, but then it means that it
>>should reflect the entire port count, and not just an asic.
>>
>>So I was trying to deduce that in our switchdevice model, the best bet
>>would be to leave the unification to the driver (i.e., to project the
>>multiple physical asics as a single virtual switch device). This
>
> Is it possible to have the asic as just single one? Or is it possible to
> connect asics being multiple chips maybe from multiple vendors together?

I didn't understand the first question. Sometimes it is possible to
have a single asic replace two, but cost and other factors are
involved.

AFAIK, the answer to the second question is a No. Two asics from
different vendors may not be connected together. The interconnect
tends to be proprietary.

> I believe that answer is "yes" in both cases. Making two separate asics
> to appear as one for user is not correct in my opinion. Driver should
> not do such masking. It is unclean, unextendable.
>

I am only looking for a single management entity. I am not thinking it
needs to be at driver level. I am not sure of  any other option apart
from creating a 'switchdev' that Roopa was mentioning.

>
>>allows any 'switch' level configurations to the bridge in 'self' mode.
>>
>>And  then we would need to consider stacking. Stacking differs from
>>this multi-asic scenario since  there would be multiple CPU involved.
>>
>>Thanks
>>Vissu
>>
>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  9:35 UTC (permalink / raw)
  To: B Viswanath
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAN+pFwLsSTo2DN=6A=Pxa3tjBnNcc9J7JTEKNnxk_-m4Ha9EMQ@mail.gmail.com>

Fri, Dec 19, 2014 at 10:22:24AM CET, marichika4@gmail.com wrote:
>On 19 December 2014 at 14:31, B Viswanath <marichika4@gmail.com> wrote:
>> On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
>>> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
>> <snipped for ease of reading>
>>>>>>
>>>>>>
>>>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>>>> where sw0 is a bridge representing the hardware switch.
>>>>>
>>>>>
>>>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>>>> a device
>>>>> representing the switch asic. This is in the works.
>>>>
>>>>
>>>>Can I assume that on  platforms which house more than one asic (say
>>>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>>>them as a single set of ports, and not as two 'switch ports' ?
>>>
>>> Well that really depends on particular implementation and drivers. If you
>>> have 2 pci-e devices, I think you should expose them as 2 entities. For
>>> sure, you can have the driver to do the masking for you. I don't believe
>>> that is correct though.
>>>
>>
>> In a platform that houses two asic chips, IMO, the user is still
>> expected to manage the router as a single entity. The configuration
>> being applied on both asic devices need to be matching if not
>> identical, and may not be conflicting. The FDB is to be synchronized
>> so that (offloaded) switching can happen across the asics. Some of
>> this stuff is asic specific anyway. Another example is that of the
>> learning. The (hardware) learning can't be enabled on one asic, while
>> being disabled on another one. The general use cases I have seen are
>> all involving managing the 'router' as a single entity.  That the
>> 'router' is implemented with two asics instead of a single asic (with
>> more ports) is to be treated as an implementation detail.  This is the
>> usual router management method that exists today.
>>
>> I hope I make sense.
>>
>> So I am trying to figure out what this single entity that will be used
>> from a user perspective. It can be a bridge, but our bridges are more
>> 802.1q bridges. We can use the 'self' mode, but then it means that it
>> should reflect the entire port count, and not just an asic.
>>
>> So I was trying to deduce that in our switchdevice model, the best bet
>> would be to leave the unification to the driver (i.e., to project the
>> multiple physical asics as a single virtual switch device). This
>> allows any 'switch' level configurations to the bridge in 'self' mode.
>>
>> And  then we would need to consider stacking. Stacking differs from
>> this multi-asic scenario since  there would be multiple CPU involved.
>>
>> Thanks
>> Vissu
>>
>
>Another example i can provide is that of mirroring. Imagine user
>wanted to mirror all traffic from port 1 of asic 1 to port 2 of asic
>2. This can be offloaded to hardware. However, user would be able to
>enter such a command only if he/she can look at a single management
>entity.

I understand your use case. I think this could be handled by a higher
entity, in this case some userspace agent which would be aware (by
configuration) of all asics and how they are interconnected. Just a
thought. Seems much nicer than doing custom masking in drivers.

>
>Thanks
>Vissu
>
>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
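
The userspace-agent idea can be illustrated with a toy port map. This is
a sketch under stated assumptions, not a proposed interface: it takes the
two 24-port asics from the example in the thread, numbers front-panel
ports 1..48, and all toy_* names are hypothetical. Asics are indexed from
0 here, so "port 2 of asic 2" in the thread is panel port 26.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PORTS_PER_ASIC 24 /* from the 2 x 24-port example */
#define TOY_NUM_ASICS       2

struct toy_port {
	unsigned int asic;       /* 0-based asic index */
	unsigned int local_port; /* 1-based port on that asic */
};

/* Map a 1-based front-panel port to (asic, local port). */
static bool toy_resolve_port(unsigned int panel_port, struct toy_port *out)
{
	assert(out != NULL);

	if (panel_port < 1 || panel_port > TOY_PORTS_PER_ASIC * TOY_NUM_ASICS)
		return false;

	out->asic = (panel_port - 1) / TOY_PORTS_PER_ASIC;
	out->local_port = (panel_port - 1) % TOY_PORTS_PER_ASIC + 1;
	return true;
}

/*
 * A mirror session needs the inter-asic link when source and
 * destination resolve to different asics.
 */
static bool toy_mirror_crosses_asics(unsigned int src, unsigned int dst)
{
	struct toy_port s, d;

	if (!toy_resolve_port(src, &s) || !toy_resolve_port(dst, &d))
		return false;
	return s.asic != d.asic;
}
```

An agent holding such a map could accept a single "mirror port 1 to port
26" request and program each asic separately, without the drivers hiding
the second asic.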


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  9:23 UTC (permalink / raw)
  To: B Viswanath
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAN+pFwJAJ71-+MZjJXPbfu5mzMp6AWNxtYR81Lx6siy3D3HfJg@mail.gmail.com>

Fri, Dec 19, 2014 at 10:01:46AM CET, marichika4@gmail.com wrote:
>On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
><snipped for ease of reading>
>>>>>
>>>>>
>>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>>> where sw0 is a bridge representing the hardware switch.
>>>>
>>>>
>>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>>> a device
>>>> representing the switch asic. This is in the works.
>>>
>>>
>>>Can I assume that on  platforms which house more than one asic (say
>>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>>them as a single set of ports, and not as two 'switch ports' ?
>>
>> Well that really depends on particular implementation and drivers. If you
>> have 2 pci-e devices, I think you should expose them as 2 entities. For
>> sure, you can have the driver to do the masking for you. I don't believe
>> that is correct though.
>>
>
>In a platform that houses two asic chips, IMO, the user is still
>expected to manage the router as a single entity. The configuration
>being applied on both asic devices need to be matching if not
>identical, and may not be conflicting. The FDB is to be synchronized
>so that (offloaded) switching can happen across the asics. Some of
>this stuff is asic specific anyway. Another example is that of the
>learning. The (hardware) learning can't be enabled on one asic, while
>being disabled on another one. The general use cases I have seen are
>all involving managing the 'router' as a single entity.  That the
>'router' is implemented with two asics instead of a single asic (with
>more ports) is to be treated as an implementation detail.  This is the
>usual router management method that exists today.
>
>I hope I make sense.
>
>So I am trying to figure out what this single entity that will be used
>from a user perspective. It can be a bridge, but our bridges are more
>802.1q bridges. We can use the 'self' mode, but then it means that it
>should reflect the entire port count, and not just an asic.
>
>So I was trying to deduce that in our switchdevice model, the best bet
>would be to leave the unification to the driver (i.e., to project the
>multiple physical asics as a single virtual switch device). This

Is it possible to have the asic as just single one? Or is it possible to
connect asics being multiple chips maybe from multiple vendors together?
I believe that answer is "yes" in both cases. Making two separate asics
to appear as one for user is not correct in my opinion. Driver should
not do such masking. It is unclean, unextendable.


>allows any 'switch' level configurations to the bridge in 'self' mode.
>
>And  then we would need to consider stacking. Stacking differs from
>this multi-asic scenario since  there would be multiple CPU involved.
>
>Thanks
>Vissu
>
>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: B Viswanath @ 2014-12-19  9:22 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAN+pFwJAJ71-+MZjJXPbfu5mzMp6AWNxtYR81Lx6siy3D3HfJg@mail.gmail.com>

On 19 December 2014 at 14:31, B Viswanath <marichika4@gmail.com> wrote:
> On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
>> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
> <snipped for ease of reading>
>>>>>
>>>>>
>>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>>> where sw0 is a bridge representing the hardware switch.
>>>>
>>>>
>>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>>> a device
>>>> representing the switch asic. This is in the works.
>>>
>>>
>>>Can I assume that on  platforms which house more than one asic (say
>>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>>them as a single set of ports, and not as two 'switch ports' ?
>>
>> Well that really depends on particular implementation and drivers. If you
>> have 2 pci-e devices, I think you should expose them as 2 entities. For
>> sure, you can have the driver to do the masking for you. I don't believe
>> that is correct though.
>>
>
> In a platform that houses two asic chips, IMO, the user is still
> expected to manage the router as a single entity. The configuration
> being applied on both asic devices need to be matching if not
> identical, and may not be conflicting. The FDB is to be synchronized
> so that (offloaded) switching can happen across the asics. Some of
> this stuff is asic specific anyway. Another example is that of the
> learning. The (hardware) learning can't be enabled on one asic, while
> being disabled on another one. The general use cases I have seen are
> all involving managing the 'router' as a single entity.  That the
> 'router' is implemented with two asics instead of a single asic (with
> more ports) is to be treated as an implementation detail.  This is the
> usual router management method that exists today.
>
> I hope I make sense.
>
> So I am trying to figure out what this single entity that will be used
> from a user perspective. It can be a bridge, but our bridges are more
> 802.1q bridges. We can use the 'self' mode, but then it means that it
> should reflect the entire port count, and not just an asic.
>
> So I was trying to deduce that in our switchdevice model, the best bet
> would be to leave the unification to the driver (i.e., to project the
> multiple physical asics as a single virtual switch device). This
> allows any 'switch' level configurations to the bridge in 'self' mode.
>
> And  then we would need to consider stacking. Stacking differs from
> this multi-asic scenario since  there would be multiple CPU involved.
>
> Thanks
> Vissu
>

Another example I can provide is that of mirroring. Imagine a user
wants to mirror all traffic from port 1 of asic 1 to port 2 of asic
2. This can be offloaded to hardware. However, the user would be able
to enter such a command only if he/she can look at a single management
entity.

Thanks
Vissu

>>>
>>>>
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>> the body of a message to majordomo@vger.kernel.org
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: B Viswanath @ 2014-12-19  9:01 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141219082754.GE1848@nanopsycho.orion>

On 19 December 2014 at 13:57, Jiri Pirko <jiri@resnulli.us> wrote:
> Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
<snipped for ease of reading>
>>>>
>>>>
>>>> We also need an interface to set per-switch attributes. Can this work?
>>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>>> where sw0 is a bridge representing the hardware switch.
>>>
>>>
>>> Not today. We discussed this @ LPC, and one way to do this would be to have
>>> a device
>>> representing the switch asic. This is in the works.
>>
>>
>>Can I assume that on  platforms which house more than one asic (say
>>two 24 port asics, interconnected via a 10G link or equivalent, to get
>>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>>them as a single set of ports, and not as two 'switch ports' ?
>
> Well that really depends on particular implementation and drivers. If you
> have 2 pci-e devices, I think you should expose them as 2 entities. For
> sure, you can have the driver to do the masking for you. I don't believe
> that is correct though.
>

In a platform that houses two asic chips, IMO, the user is still
expected to manage the router as a single entity. The configuration
being applied on both asic devices need to be matching if not
identical, and may not be conflicting. The FDB is to be synchronized
so that (offloaded) switching can happen across the asics. Some of
this stuff is asic specific anyway. Another example is that of the
learning. The (hardware) learning can't be enabled on one asic, while
being disabled on another one. The general use cases I have seen are
all involving managing the 'router' as a single entity.  That the
'router' is implemented with two asics instead of a single asic (with
more ports) is to be treated as an implementation detail.  This is the
usual router management method that exists today.

I hope I make sense.

So I am trying to figure out what single entity will be used
from a user perspective. It can be a bridge, but our bridges are more
802.1q bridges. We can use the 'self' mode, but then it means that it
should reflect the entire port count, and not just an asic.

So I was trying to deduce that in our switchdevice model, the best bet
would be to leave the unification to the driver (i.e., to project the
multiple physical asics as a single virtual switch device). This
allows any 'switch' level configurations to the bridge in 'self' mode.

And then we would need to consider stacking. Stacking differs from
this multi-asic scenario since there would be multiple CPUs involved.

Thanks
Vissu

>>
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


* [PATCH net-next 1/2] r8152: adjust set_carrier
From: Hayes Wang @ 2014-12-19  8:55 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-107-Taiwan-albertk@realtek.com>

Update tp->speed at the beginning of the function. Then
other functions can use it to check the link status.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 2d1c77e..59b70c5 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -2848,26 +2848,25 @@ static void rtl8153_down(struct r8152 *tp)
 static void set_carrier(struct r8152 *tp)
 {
 	struct net_device *netdev = tp->netdev;
-	u8 speed;
+	u8 old_speed = tp->speed;
 
 	clear_bit(RTL8152_LINK_CHG, &tp->flags);
-	speed = rtl8152_get_speed(tp);
+	tp->speed = rtl8152_get_speed(tp);
 
-	if (speed & LINK_STATUS) {
-		if (!(tp->speed & LINK_STATUS)) {
+	if (tp->speed & LINK_STATUS) {
+		if (!(old_speed & LINK_STATUS)) {
 			tp->rtl_ops.enable(tp);
 			set_bit(RTL8152_SET_RX_MODE, &tp->flags);
 			netif_carrier_on(netdev);
 		}
 	} else {
-		if (tp->speed & LINK_STATUS) {
+		if (old_speed & LINK_STATUS) {
 			netif_carrier_off(netdev);
 			tasklet_disable(&tp->tl);
 			tp->rtl_ops.disable(tp);
 			tasklet_enable(&tp->tl);
 		}
 	}
-	tp->speed = speed;
 }
 
 static void rtl_work_func_t(struct work_struct *work)
-- 
2.1.0
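
The effect of the reordering can be seen in a small user-space model. It
is a sketch, not the driver: struct carrier_dev, toy_set_carrier() and
the bit value chosen for TOY_LINK_UP_BIT are assumptions made for
illustration. The point it shows is that the stored speed is refreshed
first, while the transition is still detected against the saved old
value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LINK_UP_BIT 0x02 /* assumed bit, stands in for LINK_STATUS */

enum toy_carrier_action {
	TOY_NO_CHANGE,
	TOY_CARRIER_ON,
	TOY_CARRIER_OFF,
};

struct carrier_dev {
	uint8_t speed; /* last speed/link value read from the hardware */
};

/*
 * hw_speed plays the role of the value returned by rtl8152_get_speed().
 * The stored speed is updated before the comparison, so anything that
 * runs afterwards already sees the current link state.
 */
static enum toy_carrier_action
toy_set_carrier(struct carrier_dev *tp, uint8_t hw_speed)
{
	uint8_t old_speed;

	assert(tp != NULL);

	old_speed = tp->speed;
	tp->speed = hw_speed;

	if (tp->speed & TOY_LINK_UP_BIT)
		return (old_speed & TOY_LINK_UP_BIT) ? TOY_NO_CHANGE
						     : TOY_CARRIER_ON;

	return (old_speed & TOY_LINK_UP_BIT) ? TOY_CARRIER_OFF
					     : TOY_NO_CHANGE;
}
```

Only the link-up and link-down edges yield an action; repeated calls with
an unchanged link state return TOY_NO_CHANGE.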


* [PATCH net-next 2/2] r8152: check the status before submitting rx
From: Hayes Wang @ 2014-12-19  8:56 UTC (permalink / raw)
  To: netdev
  Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang
In-Reply-To: <1394712342-15778-107-Taiwan-albertk@realtek.com>

Don't submit the rx if the device is unplugged, the link is down,
or the device is stopped.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 59b70c5..b39b2e4 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1789,6 +1789,11 @@ int r8152_submit_rx(struct r8152 *tp, struct rx_agg *agg, gfp_t mem_flags)
 {
 	int ret;
 
+	/* The rx would be stopped, so skip submitting */
+	if (test_bit(RTL8152_UNPLUG, &tp->flags) ||
+	    !test_bit(WORK_ENABLE, &tp->flags) || !(tp->speed & LINK_STATUS))
+		return 0;
+
 	usb_fill_bulk_urb(agg->urb, tp->udev, usb_rcvbulkpipe(tp->udev, 1),
 			  agg->head, agg_buf_sz,
 			  (usb_complete_t)read_bulk_callback, agg);
-- 
2.1.0
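
The early return added here can be sketched in userspace as follows.
This is an illustration under stated assumptions, not the driver code:
struct rx_model and the F_* flag bits are hypothetical stand-ins for
tp->flags, RTL8152_UNPLUG and WORK_ENABLE, and the LINK_STATUS value is
assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LINK_STATUS 0x02 /* assumed bit value, for illustration only */

/* Hypothetical flag bits standing in for RTL8152_UNPLUG and WORK_ENABLE. */
enum { F_UNPLUG = 1u << 0, F_WORK_ENABLE = 1u << 1 };

struct rx_model {
	unsigned int flags;
	uint8_t speed;   /* kept current by the carrier handler */
	int submitted;   /* counts transfers that were actually queued */
};

/* True when queueing a receive transfer would be pointless. */
static bool rx_should_skip(const struct rx_model *m)
{
	return (m->flags & F_UNPLUG) ||
	       !(m->flags & F_WORK_ENABLE) ||
	       !(m->speed & LINK_STATUS);
}

/* Returns 0 both when the transfer is queued and when it is skipped. */
int submit_rx_model(struct rx_model *m)
{
	assert(m);
	if (rx_should_skip(m))
		return 0;
	m->submitted++;
	return 0;
}
```

Skipping returns 0 rather than an error, so callers treat the skipped
case like a successful submit, matching the hunk above.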

--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH net-next 0/2] r8152: adjust r8152_submit_rx
From: Hayes Wang @ 2014-12-19  8:55 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, linux-usb, Hayes Wang

Prevent r8152_submit_rx() from submitting rx at unexpected
moments. This could reduce the time needed to stop rx.

Patch #1 updates tp->speed early, so that patch #2 can use it
to check the current link status.

Hayes Wang (2):
  r8152: adjust set_carrier
  r8152: check the status before submitting rx

 drivers/net/usb/r8152.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

-- 
2.1.0

^ permalink raw reply

* [PATCH] bonding: avoid re-entry of bond_release
From: Wengang Wang @ 2014-12-19  8:56 UTC (permalink / raw)
  To: netdev; +Cc: wen.gang.wang

If bond_release is run against an interface which is already detached from
its master, an error message is shown, like
	"<master name> cannot release <slave name>".

The call path is:
	bond_do_ioctl()
		bond_release()
			__bond_release_one()

Though it does no real harm, the message is misleading.
This patch tries to avoid the message.

Signed-off-by: Wengang Wang <wen.gang.wang@oracle.com>
---
 drivers/net/bonding/bond_main.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 184c434..4a71bbd 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -3256,7 +3256,10 @@ static int bond_do_ioctl(struct net_device *bond_dev, struct ifreq *ifr, int cmd
 		break;
 	case BOND_RELEASE_OLD:
 	case SIOCBONDRELEASE:
-		res = bond_release(bond_dev, slave_dev);
+		if (slave_dev->flags & IFF_SLAVE)
+			res = bond_release(bond_dev, slave_dev);
+		else
+			res = 0;
 		break;
 	case BOND_SETHWADDR_OLD:
 	case SIOCBONDSETHWADDR:
-- 
1.8.3.1
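
The effect of the guard can be shown with a small userspace model. It
is only a sketch: struct bond_dev_model, MODEL_IFF_SLAVE and both
functions are hypothetical stand-ins, and the flag value is assumed.
The model checks only the slave flag, as the diff does; it does not
check which master the device belongs to.

```c
#include <assert.h>

#define MODEL_IFF_SLAVE 0x800u /* stand-in for IFF_SLAVE; value assumed */

struct bond_dev_model {
	unsigned int flags;
};

static int release_calls; /* how often the release path was entered */

/* Stand-in for bond_release(): complains when the device is not enslaved. */
static int bond_release_model(struct bond_dev_model *slave)
{
	release_calls++;
	if (!(slave->flags & MODEL_IFF_SLAVE))
		return -1; /* the real code would print "cannot release" here */
	slave->flags &= ~MODEL_IFF_SLAVE;
	return 0;
}

/* The guarded call site from the diff. */
int release_ioctl_model(struct bond_dev_model *slave)
{
	assert(slave);
	if (slave->flags & MODEL_IFF_SLAVE)
		return bond_release_model(slave);
	return 0; /* already detached: succeed quietly */
}
```

Calling release_ioctl_model() twice on the same device enters the
release path once; the second call returns 0 without reaching the
code that would print the message.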

^ permalink raw reply related

* Re: [PATCH Iproute2 next] ip netns: 'ip netns id' cmd without argument should not give error
From: Vadim Kochan @ 2014-12-19  8:31 UTC (permalink / raw)
  To: Mahesh Bandewar; +Cc: netdev, Stephen Hemminger, Vadim Kochan, Saied Kazemi
In-Reply-To: <1418969138-26199-1-git-send-email-maheshb@google.com>

On Thu, Dec 18, 2014 at 10:05:38PM -0800, Mahesh Bandewar wrote:
> The command 'ip netns identify' without PID parameter is supposed to
> use the self PID to identify its ns but a trivial error prevented it
> from doing so. This patch fixes that error.
> 
> So before the patch -
> 	# ip netns id
> 	No pid specified
> 	# echo $?
> 	1
> 	#
> 
> After the patch -
> 	# ip netns id
> 	test-ns
> 	# echo $?
> 	0
> 	#
> 
> Signed-off-by: Mahesh Bandewar <maheshb@google.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: Vadim Kochan <vadim4j@gmail.com> 
> Cc: Saied Kazemi <saied@google.com>
> ---
>  ip/ipnetns.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/ip/ipnetns.c b/ip/ipnetns.c
> index 1c8aa029073e..ec9afa2177a5 100644
> --- a/ip/ipnetns.c
> +++ b/ip/ipnetns.c
> @@ -298,7 +298,7 @@ static int netns_identify(int argc, char **argv)
>  	DIR *dir;
>  	struct dirent *entry;
>  
> -	if (argc < 1) {
> +	if (!argc) {
>  		pidstr = "self";
>  	} else if (argc > 1) {
>  		fprintf(stderr, "extra arguments specified\n");
> -- 
> 2.2.0.rc0.207.ga3a616c
> 

Hi Mahesh,

I did not get such an error on the current master branch.

Did I miss something?

Regards,
Vadim Kochan
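
For reference, the two forms of the test in the quoted hunk can be
compared in isolation. The sketch below assumes that argc is never
negative when the function is called; pick_pidstr() and checks_agree()
are hypothetical helpers written for this comparison, not iproute2
code.

```c
#include <assert.h>
#include <stddef.h>

/* Old and new forms of the test from the diff. */
static int old_check(int argc) { return argc < 1; }
static int new_check(int argc) { return !argc; }

/* Hypothetical helper mirroring the argument handling in the hunk. */
const char *pick_pidstr(int argc, char **argv)
{
	if (new_check(argc))
		return "self";
	if (argc > 1)
		return NULL; /* "extra arguments specified" */
	return argv[0];
}

/* Returns 1 if both checks agree for every argc in [0, limit]. */
int checks_agree(int limit)
{
	assert(limit >= 0);
	for (int argc = 0; argc <= limit; argc++)
		if (old_check(argc) != new_check(argc))
			return 0;
	return 1;
}
```

Under that assumption both conditions select "self" for exactly the
same inputs, so a difference in behaviour would have to come from
somewhere else.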

^ permalink raw reply

* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:27 UTC (permalink / raw)
  To: B Viswanath
  Cc: Roopa Prabhu, Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <CAN+pFwJyfg1+=+tXUpon0Kket56HJ=9BR3XohPhZnaX7hfPgpg@mail.gmail.com>

Fri, Dec 19, 2014 at 06:14:57AM CET, marichika4@gmail.com wrote:
>On 19 December 2014 at 05:18, Roopa Prabhu <roopa@cumulusnetworks.com> wrote:
>> On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
>>>
>>>
>>> On 12/18/2014 3:07 PM, Roopa Prabhu wrote:
>>>>
>>>> On 12/18/14, 11:21 AM, John Fastabend wrote:
>>>>>
>>>>> On 12/18/2014 10:14 AM, Roopa Prabhu wrote:
>>>>>>
>>>>>> On 12/18/14, 10:02 AM, Varlese, Marco wrote:
>>>>>>>
>>>>>>> Removed unnecessary content for ease of reading...
>>>>>>>
>>>>>>>>>>>>> +/* Switch Port Attributes section */
>>>>>>>>>>>>> +
>>>>>>>>>>>>> +enum {
>>>>>>>>>>>>> +    IFLA_ATTR_UNSPEC,
>>>>>>>>>>>>> +    IFLA_ATTR_LEARNING,
>>>>>>>>>>>>
>>>>>>>>>>>> Any reason you want learning here ?. This is covered as part  of
>>>>>>>>>>>> the bridge setlink attributes.
>>>>>>>>>>>>
>>>>>>>>>>> Yes, because the user may _not_ want to go through a bridge
>>>>>>>>>>> interface
>>>>>>>>>>
>>>>>>>>>> necessarily.
>>>>>>>>>> But, the bridge setlink/getlink interface was changed to
>>>>>>>>>> accommodate
>>>>>>>>
>>>>>>>> 'self'
>>>>>>>>>>
>>>>>>>>>> for exactly such cases.
>>>>>>>>>> I kind of understand your case for the other attributes (these are
>>>>>>>>>> per port settings that switch asics provide).
>>>>>>>>>>
>>>>>>>>>> However, i don't understand the reason to pull in bridge attributes
>>>>>>>>>> here.
>>>>>>>>>>
>>>>>>>>> Maybe, I am missing something so you might help. The learning
>>>>>>>>> attribute -
>>>>>>>>
>>>>>>>> in my case - it is like all other attributes: a port attribute (as
>>>>>>>> you said, port
>>>>>>>> settings that the switch provides per port).
>>>>>>>>>
>>>>>>>>> So, what I was saying is "why the user shall go through a bridge to
>>>>>>>>> configure
>>>>>>>>
>>>>>>>> the learning attribute"? From my perspective, it is as any other
>>>>>>>> attribute and
>>>>>>>> as such configurable on the port.
>>>>>>>>
>>>>>>>> Thinking about this some more, i don't see why any of these
>>>>>>>> attributes
>>>>>>>> (except loopback. I dont understand the loopback attribute) cant be
>>>>>>>> part of
>>>>>>>> the birdge port attributes.
>>>>>>>>
>>>>>>>> With this we will end up adding l2 attributes in two places: the
>>>>>>>> general link
>>>>>>>> attributes and bridge attributes.
>>>>>>>>
>>>>>>>> And since we have gone down the path of using
>>>>>>>> ndo_bridge_setlink/getlink
>>>>>>>> with 'self'....we should stick to that for all l2 attributes.
>>>>>>>>
>>>>>>>> The idea of overloading ndo_bridge_set/getlink, was to have the same
>>>>>>>> set of
>>>>>>>> attributes but support both cases where the user wants to go through
>>>>>>>> the
>>>>>>>> bridge driver or directly to the switch port driver. So, you are not
>>>>>>>> really going
>>>>>>>> through the bridge driver if you use 'self' and
>>>>>>>> ndo_bridge_setlink/getlink.
>>>>>>>>
>>>>>>> Roopa, one of the comments I got from Thomas Graf on my v1 patch
>>>>>>> was that your patch and mine were supplementary ("I think Roopa's
>>>>>>> patches are supplementary. Not all switchdev users will be backed
>>>>>>> with a Linux Bridge. I therefore welcome your patches very
>>>>>>> much")... I also understood by others that the patch made sense for
>>>>>>> the same reason. I simply do not understand why these attributes
>>>>>>> (and maybe others in the future) could not be configured directly
>>>>>>> on a standard port but have to go through a bridge.
>>>>>>>
>>>>>> ok, i am very confused in that case. The whole moving of bridge
>>>>>> attributes from the bridge driver to rtnetlink.c was to make the
>>>>>> bridge attributes accessible to any driver who wants to set l2/bridge
>>>>>> attributes on their switch ports. So, its unclear to me why we are
>>>>>> doing this parallel thing again. This move to rtnetlink.c was done
>>>>>> during the recent rocker support. so, maybe scott/jiri can elaborate
>>>>>> more.
>>>>>
>>>>>
>>>>> Not sure if this will add to the confusion or help. But you do not
>>>>> need to have the bridge.ko loaded or netdev's attached to a bridge
>>>>> to use the setlink/getlink ndo ops and netlink messages.
>>>>>
>>>>> This was intentionally done. Its already used with NIC devices to
>>>>> configure embedded bridge settings such as VEB/VEPA.
>>>>
>>>>
>>>> that helps my case, thanks.
>>>
>>>
>>> So the user interface to set/get the per-port attributes will be via
>>> 'bridge', not 'ip'
>>>
>>>     bridge link set dev sw0p1 port_attr bcast_flooding 1 self
>>>     bridge link get dev sw0p1 port_attr bcast_flooding self
>>
>>
>> yes, l2 attributes.
>>>
>>>
>>> We also need an interface to set per-switch attributes. Can this work?
>>>     bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>> where sw0 is a bridge representing the hardware switch.
>>
>>
>> Not today. We discussed this @ LPC, and one way to do this would be to have
>> a device
>> representing the switch asic. This is in the works.
>
>
>Can I assume that on  platforms which house more than one asic (say
>two 24 port asics, interconnected via a 10G link or equivalent, to get
>a 48 port 'switch') , the 'rocker' driver (or similar) should expose
>them as a single set of ports, and not as two 'switch ports' ?

Well, that really depends on the particular implementation and drivers. If you
have 2 pci-e devices, I think you should expose them as 2 entities. For
sure, you can have the driver do the masking for you. I don't believe
that is correct though.

>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:25 UTC (permalink / raw)
  To: Roopa Prabhu
  Cc: Samudrala, Sridhar, John Fastabend, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <549367CC.2080307@cumulusnetworks.com>

Fri, Dec 19, 2014 at 12:48:28AM CET, roopa@cumulusnetworks.com wrote:
>On 12/18/14, 3:26 PM, Samudrala, Sridhar wrote:
>>
>>On 12/18/2014 3:07 PM, Roopa Prabhu wrote:
>>>On 12/18/14, 11:21 AM, John Fastabend wrote:
>>>>On 12/18/2014 10:14 AM, Roopa Prabhu wrote:
>>>>>On 12/18/14, 10:02 AM, Varlese, Marco wrote:
>>>>>>Removed unnecessary content for ease of reading...
>>>>>>
>>>>>>>>>>>>+/* Switch Port Attributes section */
>>>>>>>>>>>>+
>>>>>>>>>>>>+enum {
>>>>>>>>>>>>+    IFLA_ATTR_UNSPEC,
>>>>>>>>>>>>+    IFLA_ATTR_LEARNING,
>>>>>>>>>>>Any reason you want learning here ?. This is covered as part  of
>>>>>>>>>>>the bridge setlink attributes.
>>>>>>>>>>>
>>>>>>>>>>Yes, because the user may _not_ want to go through a bridge
>>>>>>>>>>interface
>>>>>>>>>necessarily.
>>>>>>>>>But, the bridge setlink/getlink interface was changed to
>>>>>>>>>accommodate
>>>>>>>'self'
>>>>>>>>>for exactly such cases.
>>>>>>>>>I kind of understand your case for the other attributes (these are
>>>>>>>>>per port settings that switch asics provide).
>>>>>>>>>
>>>>>>>>>However, i don't understand the reason to pull in bridge
>>>>>>>>>attributes here.
>>>>>>>>>
>>>>>>>>Maybe, I am missing something so you might help. The learning
>>>>>>>>attribute -
>>>>>>>in my case - it is like all other attributes: a port attribute
>>>>>>>(as you said, port
>>>>>>>settings that the switch provides per port).
>>>>>>>>So, what I was saying is "why the user shall go through a
>>>>>>>>bridge to configure
>>>>>>>the learning attribute"? From my perspective, it is as any other
>>>>>>>attribute and
>>>>>>>as such configurable on the port.
>>>>>>>
>>>>>>>Thinking about this some more, i don't see why any of these
>>>>>>>attributes
>>>>>>>(except loopback. I dont understand the loopback attribute) cant
>>>>>>>be part of
>>>>>>>the birdge port attributes.
>>>>>>>
>>>>>>>With this we will end up adding l2 attributes in two places: the
>>>>>>>general link
>>>>>>>attributes and bridge attributes.
>>>>>>>
>>>>>>>And since we have gone down the path of using
>>>>>>>ndo_bridge_setlink/getlink
>>>>>>>with 'self'....we should stick to that for all l2 attributes.
>>>>>>>
>>>>>>>The idea of overloading ndo_bridge_set/getlink, was to have the
>>>>>>>same set of
>>>>>>>attributes but support both cases where the user wants to go
>>>>>>>through the
>>>>>>>bridge driver or directly to the switch port driver. So, you are
>>>>>>>not really going
>>>>>>>through the bridge driver if you use 'self' and
>>>>>>>ndo_bridge_setlink/getlink.
>>>>>>>
>>>>>>Roopa, one of the comments I got from Thomas Graf on my v1 patch
>>>>>>was that your patch and mine were supplementary ("I think Roopa's
>>>>>>patches are supplementary. Not all switchdev users will be backed
>>>>>>with a Linux Bridge. I therefore welcome your patches very
>>>>>>much")... I also understood by others that the patch made sense for
>>>>>>the same reason. I simply do not understand why these attributes
>>>>>>(and maybe others in the future) could not be configured directly
>>>>>>on a standard port but have to go through a bridge.
>>>>>>
>>>>>ok, i am very confused in that case. The whole moving of bridge
>>>>>attributes from the bridge driver to rtnetlink.c was to make the
>>>>>bridge attributes accessible to any driver who wants to set l2/bridge
>>>>>attributes on their switch ports. So, its unclear to me why we are
>>>>>doing this parallel thing again. This move to rtnetlink.c was done
>>>>>during the recent rocker support. so, maybe scott/jiri can elaborate
>>>>>more.
>>>>
>>>>Not sure if this will add to the confusion or help. But you do not
>>>>need to have the bridge.ko loaded or netdev's attached to a bridge
>>>>to use the setlink/getlink ndo ops and netlink messages.
>>>>
>>>>This was intentionally done. Its already used with NIC devices to
>>>>configure embedded bridge settings such as VEB/VEPA.
>>>
>>>that helps my case, thanks.
>>
>>So the user interface to set/get the per-port attributes will be via
>>'bridge', not 'ip'
>>
>>    bridge link set dev sw0p1 port_attr bcast_flooding 1 self
>>    bridge link get dev sw0p1 port_attr bcast_flooding self
>
>yes, l2 attributes.
>>
>>We also need an interface to set per-switch attributes. Can this work?
>>    bridge link set dev sw0 sw_attr bcast_flooding 1 master
>>where sw0 is a bridge representing the hardware switch.

How can a bridge represent the hardware switch? What if you have 2 bridges
on ports of the same hardware switch? Or what if you have a bridge with
a couple of ports from one hardware switch and a couple of ports from a second
hardware switch? Or else, you can have a mixture of switch ports and
other devices (taps for example). Bridge->hardware switch does not map
1:1.

If you want to set any per switch attributes, just use one of the ports
as a "gateway". port->hardware switch mapping is strict, so you should
be okay doing that. There are no ndos for this purpose at the moment,
but they can be easily added.
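
The two points above (a bridge does not identify one switch, but a port
does) can be sketched with a toy data model. Everything here is
hypothetical: struct hw_switch, struct sw_port and both functions exist
only for this illustration and are not kernel types or ndos.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical models; none of these types exist in the kernel. */
struct hw_switch {
	int bcast_flooding;       /* a per-switch attribute */
};

struct sw_port {
	const char *name;
	struct hw_switch *sw;     /* exactly one switch, NULL for e.g. a tap */
	int bridge_id;            /* -1 when not in any bridge */
};

/* Set a per-switch attribute using any port of that switch as the handle. */
int port_set_switch_flooding(struct sw_port *p, int on)
{
	assert(p);
	if (!p->sw)
		return -1;        /* not a switch port */
	p->sw->bcast_flooding = on;
	return 0;
}

#define MAX_SEEN 8

/* Count the distinct hardware switches behind the members of one bridge. */
int bridge_switch_count(const struct sw_port *ports, size_t n, int bridge_id)
{
	const struct hw_switch *seen[MAX_SEEN];
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		int known = 0;

		if (ports[i].bridge_id != bridge_id || !ports[i].sw)
			continue;
		for (int j = 0; j < count; j++)
			if (seen[j] == ports[i].sw)
				known = 1;
		if (!known && count < MAX_SEEN)
			seen[count++] = ports[i].sw;
	}
	return count;
}
```

A bridge whose members come from two switches plus a tap yields a count
of 2, so it cannot serve as the handle for "the" switch, while any
single switch port resolves to exactly one.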

>
>Not today. We discussed this @ LPC, and one way to do this would be to have a
>device
>representing the switch asic. This is in the works.
>

Yep, I still do not find a need to introduce some new device class. I
think it would be just confusing. Example.

br0 --- sw0p0
    --- sw0p1
    --- sw0p2

br1 --- sw0p3
    --- sw0p4
    --- sw0p5

and in parallel, without any connection, you would have some mysterious
device "sw0" which would do something, not clear why.

Btw this device would not be a netdev, so it could not be used for
anything netdevs can be currently used for.

Yes, I still might be missing something, for sure. Nobody told me what yet.

^ permalink raw reply

* Re: [RFC PATCH net-next v2 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:14 UTC (permalink / raw)
  To: Arad, Ronen
  Cc: Fastabend, John R, Roopa Prabhu, Varlese, Marco,
	netdev@vger.kernel.org, Thomas Graf, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <E4CD12F19ABA0C4D8729E087A761DC3505DC9956@ORSMSX101.amr.corp.intel.com>

Thu, Dec 18, 2014 at 11:43:06PM CET, ronen.arad@intel.com wrote:
>
>
>>-----Original Message-----
>>From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org] On
>>Behalf Of John Fastabend
>>Sent: Thursday, December 18, 2014 9:21 PM
>>To: Roopa Prabhu; Varlese, Marco
>>Cc: netdev@vger.kernel.org; Thomas Graf; Jiri Pirko; sfeldma@gmail.com; linux-
>>kernel@vger.kernel.org
>>Subject: Re: [RFC PATCH net-next v2 1/1] net: Support for switch port
>>configuration
>>
>>On 12/18/2014 10:14 AM, Roopa Prabhu wrote:
>>> On 12/18/14, 10:02 AM, Varlese, Marco wrote:
>>>> Removed unnecessary content for ease of reading...
>>>>
>>>>>>>>>> +/* Switch Port Attributes section */
>>>>>>>>>> +
>>>>>>>>>> +enum {
>>>>>>>>>> +    IFLA_ATTR_UNSPEC,
>>>>>>>>>> +    IFLA_ATTR_LEARNING,
>>>>>>>>> Any reason you want learning here ?. This is covered as part  of
>>>>>>>>> the bridge setlink attributes.
>>>>>>>>>
>>>>>>>> Yes, because the user may _not_ want to go through a bridge
>>>>>>>> interface
>>>>>>> necessarily.
>>>>>>> But, the bridge setlink/getlink interface was changed to accommodate
>>>>> 'self'
>>>>>>> for exactly such cases.
>>>>>>> I kind of understand your case for the other attributes (these are
>>>>>>> per port settings that switch asics provide).
>>>>>>>
>>>>>>> However, i don't understand the reason to pull in bridge attributes
>>here.
>>>>>>>
>>>>>> Maybe, I am missing something so you might help. The learning attribute -
>>>>> in my case - it is like all other attributes: a port attribute (as you
>>said, port
>>>>> settings that the switch provides per port).
>>>>>> So, what I was saying is "why the user shall go through a bridge to
>>configure
>>>>> the learning attribute"? From my perspective, it is as any other attribute
>>and
>>>>> as such configurable on the port.
>>>>>
>>>>> Thinking about this some more, i don't see why any of these attributes
>>>>> (except loopback. I dont understand the loopback attribute) cant be part
>>of
>>>>> the birdge port attributes.
>>>>>
>>>>> With this we will end up adding l2 attributes in two places: the general
>>link
>>>>> attributes and bridge attributes.
>>>>>
>>>>> And since we have gone down the path of using ndo_bridge_setlink/getlink
>>>>> with 'self'....we should stick to that for all l2 attributes.
>>>>>
>>>>> The idea of overloading ndo_bridge_set/getlink, was to have the same set
>>of
>>>>> attributes but support both cases where the user wants to go through the
>>>>> bridge driver or directly to the switch port driver. So, you are not
>>really going
>>>>> through the bridge driver if you use 'self' and
>>ndo_bridge_setlink/getlink.
>>>>>
>>
>>>> Roopa, one of the comments I got from Thomas Graf on my v1 patch
>>>> was that your patch and mine were supplementary ("I think Roopa's
>>>> patches are supplementary. Not all switchdev users will be backed
>>>> with a Linux Bridge. I therefore welcome your patches very
>>>> much")... I also understood by others that the patch made sense for
>>>> the same reason. I simply do not understand why these attributes
>>>> (and maybe others in the future) could not be configured directly
>>>> on a standard port but have to go through a bridge.
>>>>
>>> ok, i am very confused in that case. The whole moving of bridge
>>> attributes from the bridge driver to rtnetlink.c was to make the
>>> bridge attributes accessible to any driver who wants to set l2/bridge
>>> attributes on their switch ports. So, its unclear to me why we are
>>> doing this parallel thing again. This move to rtnetlink.c was done
>>> during the recent rocker support. so, maybe scott/jiri can elaborate
>>> more.
>>
>>
>>Not sure if this will add to the confusion or help. But you do not
>>need to have the bridge.ko loaded or netdev's attached to a bridge
>>to use the setlink/getlink ndo ops and netlink messages.
>
>No you don't need bridge.ko to implement ndo_bridge_setlink/getlink. Rtnetlink invokes those ndos from code which does not depend on CONFIG_BRIDGE or the presence of bridge.ko.
>Calling some bridge exported functions such as br_fdb_external_learn_add/del requires the presence of bridge.ko and it only makes sense when the switch port device is enslaved to a bridge.

Note I plan to change br_fdb_external_learn_add/del to a (semi-)generic
notifier very soon.


>
>>
>>This was intentionally done. Its already used with NIC devices to
>>configure embedded bridge settings such as VEB/VEPA.
>>
>>I think I'm just repeating Roopa though.
>>
>>--
>>To unsubscribe from this list: send the line "unsubscribe netdev" in
>>the body of a message to majordomo@vger.kernel.org
>>More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC PATCH net-next v3 1/1] net: Support for switch port configuration
From: Jiri Pirko @ 2014-12-19  8:12 UTC (permalink / raw)
  To: Thomas Graf
  Cc: John Fastabend, Varlese, Marco, netdev@vger.kernel.org,
	Fastabend, John R, roopa@cumulusnetworks.com, sfeldma@gmail.com,
	linux-kernel@vger.kernel.org
In-Reply-To: <20141219003510.GC16239@casper.infradead.org>

Fri, Dec 19, 2014 at 01:35:10AM CET, tgraf@suug.ch wrote:
>On 12/18/14 at 08:03am, John Fastabend wrote:
>> On 12/18/2014 07:30 AM, Varlese, Marco wrote:
>> Could you also document the attributes. I think they are mostly
>> clear but what is IFLA_SW_LOOPBACK. It will help later when we
>> try to read the code in 6months and implement drivers.
>> 
>> I am thinking something like
>> 
>> /* Switch Port Attributes section
>>  * IFLA_SW_LEARNING - turns learning on in the bridge
>>  * IFLA_SW_LOOPBACK - does something interesting
>> 
>>  [...]
>>  */
>
>+1. I would even ask for more than that. While clear in the bridge
>context, "learning" for this API targetting multi layer switches
>is ambigious. The expectation towards the driver must be crystical
>clear.
>
>> >+
>> >+enum {
>> >+	IFLA_SW_UNSPEC,
>> >+	IFLA_SW_LEARNING,
>> 
>> Can you address Roopa's feedback. I'm also a bit confused by the
>> duplication.
>
>Agreed. Can we decide on the ndo first and then build the APIs on
>top of that? While I agree that we should have a non-bridge based
>Netlink API, the underlying ndo should be the same.

Maybe we can use this kind of ndo (as proposed by this patch) and call it
for switchdevs instead of bridge_get/setlink. That would make sense to me:
a single set of ndos, many possible users (userspace/in-kernel).
bridge_get/setlink would be just a wrapper for this.
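
To make the layering concrete, here is a rough userspace sketch of one
driver callback with two entry points on top of it. All names (struct
port_ops, sw_port_set_attr(), bridge_setlink_model(), the attribute
enum) are invented for this illustration; the real ndo signatures are
still to be decided.

```c
#include <assert.h>
#include <stdint.h>

enum sw_attr { SW_ATTR_LEARNING, SW_ATTR_BCAST_FLOODING, SW_ATTR_MAX };

struct port_model;

/* One set of driver callbacks, shared by every caller. */
struct port_ops {
	int (*set_attr)(struct port_model *p, enum sw_attr a, uint64_t val);
};

struct port_model {
	const struct port_ops *ops;
	uint64_t attrs[SW_ATTR_MAX];
};

/* Generic entry point: a netlink API or an in-kernel user lands here. */
int sw_port_set_attr(struct port_model *p, enum sw_attr a, uint64_t val)
{
	assert(p);
	if (a >= SW_ATTR_MAX || !p->ops || !p->ops->set_attr)
		return -1;
	return p->ops->set_attr(p, a, val);
}

/* Bridge-flavoured entry point: only a thin wrapper over the same op. */
int bridge_setlink_model(struct port_model *p, int learning)
{
	return sw_port_set_attr(p, SW_ATTR_LEARNING, learning ? 1 : 0);
}

/* Example driver callback: just stores the value. */
int demo_set_attr(struct port_model *p, enum sw_attr a, uint64_t val)
{
	p->attrs[a] = val;
	return 0;
}
```

The driver implements set_attr once; whether the request arrived through
the bridge path or the generic path is invisible to it.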

>
>> >+static const struct nla_policy ifla_sw_attr_policy[IFLA_SW_ATTR_MAX+1] = {
>> >+	[IFLA_SW_LEARNING]	= { .type = NLA_U64 },
>> >+	[IFLA_SW_LOOPBACK]	= { .type = NLA_U64 },
>> >+	[IFLA_SW_BCAST_FLOODING] = { .type = NLA_U64 },
>> >+	[IFLA_SW_UCAST_FLOODING] = { .type = NLA_U64 },
>> >+	[IFLA_SW_MCAST_FLOODING] = { .type = NLA_U64 },
>> >+};
>> 
>> Why U64 values? What would we pass in these? Are these just boolean
>> bits? Maybe the annotation above will help me understand this.
>
>I think the intent is to keep the ndo API as simple as possible
>but I agree that this is wasteful. I gave this feedback on v2 already.

^ permalink raw reply

* Re: [PATCH RFC v4 net-next 1/5] virtio_net: enable tx interrupt
From: Qin Chuanyu @ 2014-12-19  7:32 UTC (permalink / raw)
  To: Jason Wang, mst, virtualization, netdev, linux-kernel, davem; +Cc: pagupta
In-Reply-To: <1417429028-11971-2-git-send-email-jasowang@redhat.com>

On 2014/12/1 18:17, Jason Wang wrote:
> On newer hosts that support delayed tx interrupts,
> we probably don't have much to gain from orphaning
> packets early.
>
> Note: this might degrade performance for
> hosts without event idx support.
> Should be addressed by the next patch.
>
> Cc: Rusty Russell <rusty@rustcorp.com.au>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>   drivers/net/virtio_net.c | 132 +++++++++++++++++++++++++++++++----------------
>   1 file changed, 88 insertions(+), 44 deletions(-)
>
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ec2a8b4..f68114e 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
>   static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>   {
>   	struct skb_vnet_hdr *hdr;
> @@ -912,7 +951,9 @@ static int xmit_skb(struct send_queue *sq, struct sk_buff *skb)
>   		sg_set_buf(sq->sg, hdr, hdr_len);
>   		num_sg = skb_to_sgvec(skb, sq->sg + 1, 0, skb->len) + 1;
>   	}
> -	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb, GFP_ATOMIC);
> +
> +	return virtqueue_add_outbuf(sq->vq, sq->sg, num_sg, skb,
> +				    GFP_ATOMIC);
>   }
>
>   static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
> @@ -924,8 +965,7 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   	struct netdev_queue *txq = netdev_get_tx_queue(dev, qnum);
>   	bool kick = !skb->xmit_more;
>
> -	/* Free up any pending old buffers before queueing new ones. */
> -	free_old_xmit_skbs(sq);

I think there is no need to remove free_old_xmit_skbs here.
You could add free_old_xmit_skbs to the tx irq's napi func and
also keep it in start_xmit, provided you handle the race well.

I have done the same thing in the ixgbe driver (freeing skbs both in
ndo_start_xmit and in the tx irq's poll func), and it seems to work well :)

I think there would not be so many interrupts this way, and tx
interrupt coalescing would not be needed.
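
A single-threaded userspace sketch of reclaiming from both paths is
shown below. It is a model only: struct txq_model and all functions are
hypothetical, and the race mentioned above is not modelled. In a real
driver both callers of reclaim() would have to be serialized, for
example by the tx queue lock.

```c
#include <assert.h>

struct txq_model {
	unsigned int head;   /* buffers handed to the device so far */
	unsigned int done;   /* buffers the device has finished with */
	unsigned int freed;  /* buffers already reclaimed by the driver */
	unsigned int size;   /* ring capacity */
};

/* Shared reclaim step: free everything the device has completed. */
static unsigned int reclaim(struct txq_model *q)
{
	unsigned int n = q->done - q->freed;

	q->freed = q->done;
	return n;
}

/* Transmit path: reclaim opportunistically, then queue if there is room. */
int xmit_model(struct txq_model *q)
{
	assert(q);
	reclaim(q);
	if (q->head - q->freed >= q->size)
		return -1;   /* ring full */
	q->head++;
	return 0;
}

/* Poll path (tx interrupt): reclaim whatever is left. */
unsigned int tx_poll_model(struct txq_model *q)
{
	return reclaim(q);
}

/* Test helper: the "device" completes up to n pending buffers. */
void device_complete(struct txq_model *q, unsigned int n)
{
	unsigned int pending = q->head - q->done;

	if (n > pending)
		n = pending;
	q->done += n;
}
```

With steady traffic most buffers are freed by the next transmit, and
the poll path only picks up the tail once the sender goes idle.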

> +	virtqueue_disable_cb(sq->vq);
>
>   	/* Try to transmit */
>   	err = xmit_skb(sq, skb);
> @@ -941,27 +981,19 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, struct net_device *dev)
>   		return NETDEV_TX_OK;
>   	}
>
> -	/* Don't wait up for transmitted skbs to be freed. */
> -	skb_orphan(skb);
> -	nf_reset(skb);
> -
>   	/* Apparently nice girls don't return TX_BUSY; stop the queue
>   	 * before it gets out of hand.  Naturally, this wastes entries. */
> -	if (sq->vq->num_free < 2+MAX_SKB_FRAGS) {
> +	if (sq->vq->num_free < 2+MAX_SKB_FRAGS)
>   		netif_stop_subqueue(dev, qnum);
> -		if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> -			/* More just got used, free them then recheck. */
> -			free_old_xmit_skbs(sq);
> -			if (sq->vq->num_free >= 2+MAX_SKB_FRAGS) {
> -				netif_start_subqueue(dev, qnum);
> -				virtqueue_disable_cb(sq->vq);
> -			}
> -		}
> -	}
>
>   	if (kick || netif_xmit_stopped(txq))
>   		virtqueue_kick(sq->vq);
>
> +	if (unlikely(!virtqueue_enable_cb_delayed(sq->vq))) {
> +		virtqueue_disable_cb(sq->vq);
> +		napi_schedule(&sq->napi);
> +	}
> +
>   	return NETDEV_TX_OK;
>   }
>
> @@ -1138,8 +1170,10 @@ static int virtnet_close(struct net_device *dev)
>   	/* Make sure refill_work doesn't re-enable napi! */
>   	cancel_delayed_work_sync(&vi->refill);
>
> -	for (i = 0; i < vi->max_queue_pairs; i++)
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>   		napi_disable(&vi->rq[i].napi);
> +		napi_disable(&vi->sq[i].napi);
> +	}
>
>   	return 0;
>   }
> @@ -1452,8 +1486,10 @@ static void virtnet_free_queues(struct virtnet_info *vi)
>   {
>   	int i;
>
> -	for (i = 0; i < vi->max_queue_pairs; i++)
> +	for (i = 0; i < vi->max_queue_pairs; i++) {
>   		netif_napi_del(&vi->rq[i].napi);
> +		netif_napi_del(&vi->sq[i].napi);
> +	}
>
>   	kfree(vi->rq);
>   	kfree(vi->sq);

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:28 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +void process_noresponse_list(struct octeon_device *oct,
> +			     struct octeon_instr_queue *iq)
> +{
> +	uint32_t get_idx;
> +	struct octeon_soft_command *sc;
> +	struct octeon_instr_ih *ih;

All global function names must start with the same prefix, since the driver may not
always be built as a module.

In this case octeon_process_noresponse_list.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:26 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +/**
> + * \brief Alloc net device
> + * @param size Size to allocate
> + * @param nq how many queues
> + * @returns pointer to net device structure
> + */
> +static inline struct net_device *liquidio_alloc_netdev(int size, int nq)
> +{
> +	if (nq > 1)
> +		return alloc_netdev_mq(size, "lio%d", NET_NAME_UNKNOWN,
> +				       ether_setup, nq);
> +	else
> +		return alloc_netdev(size, "lio%d", NET_NAME_UNKNOWN,
> +				    ether_setup);
> +}

There is no need for the (nq > 1) test, just do:
> +		return alloc_netdev_mq(size, "lio%d", NET_NAME_UNKNOWN,
> +				       ether_setup, nq);

Also, why does this device not have standard ethernet naming?
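
To illustrate why the test is redundant, here is a userspace model of
the allocation helpers. It assumes, as I read include/linux/netdevice.h,
that alloc_netdev() and alloc_netdev_mq() are both thin macros over
alloc_netdev_mqs(), with queue counts (1, 1) and (count, count)
respectively. struct alloc_dev_model and the *_model functions are
hypothetical stand-ins.

```c
#include <assert.h>

struct alloc_dev_model {
	unsigned int num_tx_queues;
	unsigned int num_rx_queues;
};

/* Models alloc_netdev_mqs(): explicit tx and rx queue counts. */
static struct alloc_dev_model alloc_mqs_model(unsigned int txqs,
					      unsigned int rxqs)
{
	struct alloc_dev_model dev = { txqs, rxqs };

	assert(txqs >= 1 && rxqs >= 1);
	return dev;
}

/* Models alloc_netdev(): one tx and one rx queue. */
struct alloc_dev_model alloc_model(void)
{
	return alloc_mqs_model(1, 1);
}

/* Models alloc_netdev_mq(): the same count for tx and rx. */
struct alloc_dev_model alloc_mq_model(unsigned int count)
{
	return alloc_mqs_model(count, count);
}

/* The helper from the patch, reduced to the single call. */
struct alloc_dev_model liquidio_alloc_model(unsigned int nq)
{
	return alloc_mq_model(nq);
}
```

With nq == 1 the multiqueue call yields the same queue counts as the
single-queue one, so the branch adds nothing.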

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:23 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int __init liquidio_init(void)
> +{
> +	int i;
> +	struct handshake *hs;
> +
> +	pr_info("LiquidIO: Starting Network module version %s\n",
> +		LIQUIDIO_VERSION);
> +

Do you really need to log this? Systems are already too chatty.

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:21 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int liquidio_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
> +{
> +	switch (cmd) {
> +	case SIOCSHWTSTAMP:
> +		return hwtstamp_ioctl(netdev, ifr, cmd);
> +	default:
> +		return -EINVAL

Standard error code for unsupported ioctl is -EOPNOTSUPP
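
A minimal userspace sketch of the suggested default case follows. The
command numbers and hwtstamp_model() are hypothetical; only the choice
of return value reflects the comment above.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers, for illustration only. */
enum { CMD_SET_HWTSTAMP = 1, CMD_SOMETHING_ELSE = 2 };

/* Stand-in for the driver's timestamp handler. */
static int hwtstamp_model(void)
{
	return 0;
}

int ioctl_model(int cmd)
{
	switch (cmd) {
	case CMD_SET_HWTSTAMP:
		return hwtstamp_model();
	default:
		/* "not supported here", as opposed to -EINVAL "bad argument" */
		return -EOPNOTSUPP;
	}
}
```

The distinction lets a caller tell "this driver does not implement the
command" apart from "the command is implemented but the argument was
wrong".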

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:14 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static int liquidio_xmit(struct sk_buff *skb, struct net_device *netdev)
> +{
> +	struct lio *lio;
> +	struct octnet_buf_free_info *finfo;
> +	union octnic_cmd_setup cmdsetup;
> +	struct octnic_data_pkt ndata;
> +	struct octeon_device *oct;
> +	int cpu = 0, status = 0;
> +	int q_idx = 0;
> +	int xmit_more;
> +
> +	lio = GET_LIO(netdev);
> +	oct = lio->oct_dev;
> +	atomic64_inc((atomic64_t *)&lio->stats.tx_packets);
> +	lio->stats.tx_bytes += skb->len;
> +
> +	if (!ifstate_check(lio, LIO_IFSTATE_TXENABLED)) {
> +		lio_info(lio, tx_err, "Transmit Busy, not enabled\n");
> +		return NETDEV_TX_BUSY;
> +	}

This kind of busy handling in transmit will cause the higher-level
send path to retry and spin the CPU. Better to do proper flow control,
as other drivers do.
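
As a rough illustration of what flow control means here, the sketch below models a transmit queue in plain userspace C. All names (demo_txq, demo_xmit, demo_tx_complete, DEMO_RING_SIZE) are hypothetical. In a real driver the two marked assignments would be calls to netif_stop_queue() and netif_wake_queue(). The idea is that the driver stops the queue before it can no longer accept packets, so the stack stops calling the transmit routine instead of retrying.

```c
#include <stdbool.h>

#define DEMO_RING_SIZE 4	/* hypothetical descriptor ring size */

struct demo_txq {
	unsigned int in_flight;	/* descriptors currently owned by hardware */
	bool stopped;		/* models the stopped state of the netdev queue */
};

enum demo_tx_status { DEMO_TX_OK, DEMO_TX_BUSY };

/* Transmit path: accept the packet, then stop the queue if the ring is now full. */
static enum demo_tx_status demo_xmit(struct demo_txq *q)
{
	if (q->in_flight == DEMO_RING_SIZE) {
		/* Only reachable if the queue was not stopped in time. */
		q->stopped = true;
		return DEMO_TX_BUSY;
	}

	q->in_flight++;
	if (q->in_flight == DEMO_RING_SIZE)
		q->stopped = true;	/* kernel: netif_stop_queue() */

	return DEMO_TX_OK;
}

/* Completion path: reclaim descriptors and restart the queue once there is room. */
static void demo_tx_complete(struct demo_txq *q, unsigned int done)
{
	if (done > q->in_flight)
		done = q->in_flight;
	q->in_flight -= done;

	if (q->stopped && q->in_flight < DEMO_RING_SIZE)
		q->stopped = false;	/* kernel: netif_wake_queue() */
}
```

With this shape the busy return becomes an exceptional case rather than the normal way of signalling back-pressure.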

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:11 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +		if (packet_was_received) {
> +			lio->stats.rx_bytes += len;
> +			atomic64_inc((atomic64_t *)&lio->stats.rx_packets);
> +			netdev->last_rx = jiffies;
> +		} else {
> +			atomic64_inc((atomic64_t *)&lio->stats.rx_dropped);
> +			lio_info(lio, rx_err,
> +				 "netif_rxXX  error rx_dropped:%ld\n",
> +				 lio->stats.rx_dropped);
> +		}

What is the point of keeping your own statistics in atomics (which are
slower)? Just use dev->stats directly.
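
A small userspace model of the suggestion, assuming the receive accounting runs in a single context so that plain increments are sufficient. demo_netdev and demo_stats are hypothetical stand-ins for struct net_device and its embedded stats member; demo_rx_account() is not a kernel function.

```c
/* Hypothetical stand-in for the counters embedded in struct net_device. */
struct demo_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
	unsigned long rx_dropped;
};

struct demo_netdev {
	struct demo_stats stats;
};

/* Account one receive attempt using plain counters instead of atomics. */
static void demo_rx_account(struct demo_netdev *dev, int received,
			    unsigned int len)
{
	if (received) {
		dev->stats.rx_bytes += len;
		dev->stats.rx_packets++;
	} else {
		dev->stats.rx_dropped++;
	}
}
```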

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:07 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +/** \brief Network device get stats
> + * @param netdev    pointer to network device
> + * @returns pointer stats structure
> + */
> +static struct net_device_stats *liquidio_stats(struct net_device *netdev)
> +{
> +	return &(GET_LIO(netdev)->stats);
> +}
> +

Unnecessary function; this is what the network core already does
if .ndo_get_stats is NULL.
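
The fallback can be pictured with the simplified userspace model below. The types and demo_core_get_stats() are hypothetical; the real core path also copies the counters into a 64-bit structure, which is left out here. The point is only that a driver which keeps its counters in the device's own stats member does not need a callback.

```c
#include <stddef.h>

struct demo_stats {
	unsigned long rx_packets;
	unsigned long tx_packets;
};

struct demo_netdev;

/* Hypothetical, reduced version of the driver operations table. */
struct demo_ops {
	struct demo_stats *(*get_stats)(struct demo_netdev *dev);
};

struct demo_netdev {
	const struct demo_ops *ops;
	struct demo_stats stats;	/* counters embedded in the device */
};

/* Model of the core: use the driver callback if present, else dev->stats. */
static struct demo_stats *demo_core_get_stats(struct demo_netdev *dev)
{
	if (dev->ops && dev->ops->get_stats)
		return dev->ops->get_stats(dev);

	return &dev->stats;
}
```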

^ permalink raw reply

* Re: [PATCH net-next v3] Add support of Cavium Liquidio ethernet adapters
From: Stephen Hemminger @ 2014-12-19  7:05 UTC (permalink / raw)
  To: Raghu Vatsavayi
  Cc: davem, netdev, Derek Chickles, Satanand Burla, Felix Manlunas,
	Raghu Vatsavayi
In-Reply-To: <1418959519-31681-1-git-send-email-rvatsavayi@caviumnetworks.com>

On Thu, 18 Dec 2014 19:25:19 -0800
Raghu Vatsavayi <rvatsavayi@caviumnetworks.com> wrote:

> +static void
> +lio_get_ethtool_stats(struct net_device *netdev,
> +		      struct ethtool_stats *stats, u64 *data)
> +{
> +	struct lio *lio = GET_LIO(netdev);
> +	struct octeon_device *oct_dev = lio->oct_dev;
> +	int i = 0, j;
> +
> +	data[i++] = jiffies;
> +	data[i++] = lio->stats.tx_packets;
> +	data[i++] = lio->stats.tx_bytes;
> +	data[i++] = lio->stats.rx_packets;
> +	data[i++] = lio->stats.rx_bytes;
> +	data[i++] = lio->stats.tx_errors;
> +	data[i++] = lio->stats.tx_dropped;
> +	data[i++] = lio->stats.rx_dropped;
> +

Do not mirror existing netdevice statistics in ethtool.
The purpose of ethtool stats is to provide device-specific information.

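For illustration only, a userspace sketch of an ethtool-style stats fill that exports nothing but device-specific counters. The counter names (fw_tx_retries, dma_map_errors) and every identifier are made up for this example and do not come from the LiquidIO patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical hardware/firmware counters not covered by netdev stats. */
struct demo_hw_counters {
	uint64_t fw_tx_retries;
	uint64_t dma_map_errors;
};

/* Names that would be reported alongside the values, in the same order. */
static const char *const demo_stat_names[] = {
	"fw_tx_retries",
	"dma_map_errors",
};

#define DEMO_N_STATS (sizeof(demo_stat_names) / sizeof(demo_stat_names[0]))

/* Fill data[] with device-specific values only; returns the number written. */
static size_t demo_get_ethtool_stats(const struct demo_hw_counters *hw,
				     uint64_t *data)
{
	size_t i = 0;

	data[i++] = hw->fw_tx_retries;
	data[i++] = hw->dma_map_errors;

	return i;
}
```

Generic packet and byte counts are absent on purpose, since they are already available through the regular interface statistics.
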
^ permalink raw reply
