Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net 0/4] bridge: Fix problems around the PVID
From: Vlad Yasevich @ 2013-09-24 17:55 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: Toshiaki Makita, David Miller, netdev, Fernando Luis Vazquez Cao,
	Patrick McHardy
In-Reply-To: <1380043818.4391.26.camel@localhost.localdomain>

On 09/24/2013 01:30 PM, Toshiaki Makita wrote:
> On Tue, 2013-09-24 at 09:35 -0400, Vlad Yasevich wrote:
>> On 09/24/2013 07:45 AM, Toshiaki Makita wrote:
>>> On Mon, 2013-09-23 at 10:41 -0400, Vlad Yasevich wrote:
>>>> On 09/17/2013 04:12 AM, Toshiaki Makita wrote:
>>>>> On Mon, 2013-09-16 at 13:49 -0400, Vlad Yasevich wrote:
>>>>>> On 09/13/2013 08:06 AM, Toshiaki Makita wrote:
>>>>>>> On Thu, 2013-09-12 at 16:00 -0400, David Miller wrote:
>>>>>>>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>>>>>>> Date: Tue, 10 Sep 2013 19:27:54 +0900
>>>>>>>>
>>>>>>>>> There seem to be some undesirable behaviors related with PVID.
>>>>>>>>> 1. It has no effect assigning PVID to a port. PVID cannot be applied
>>>>>>>>> to any frame regardless of whether we set it or not.
>>>>>>>>> 2. FDB entries learned via frames applied PVID are registered with
>>>>>>>>> VID 0 rather than VID value of PVID.
>>>>>>>>> 3. We can set 0 or 4095 as a PVID that are not allowed in IEEE 802.1Q.
>>>>>>>>> This leads interoperational problems such as sending frames with VID
>>>>>>>>> 4095, which is not allowed in IEEE 802.1Q, and treating frames with VID
>>>>>>>>> 0 as they belong to VLAN 0, which is expected to be handled as they have
>>>>>>>>> no VID according to IEEE 802.1Q.
>>>>>>>>>
>>>>>>>>> Note: 2nd and 3rd problems are potential and not exposed unless 1st problem
>>>>>>>>> is fixed, because we cannot activate PVID due to it.
>>>>>>>>
>>>>>>>> Please work out the issues in patch #2 with Vlad and resubmit this
>>>>>>>> series.
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>
>>>>>>> I'm hovering between whether we should fix the issue by changing vlan 0
>>>>>>> interface behavior in 8021q module or enabling a bridge port to sending
>>>>>>> priority-tagged frames, or another better way.
>>>>>>>
>>>>>>> If you could comment it, I'd appreciate it :)
>>>>>>>
>>>>>>>
>>>>>>> BTW, I think what is discussed in patch #2 is another problem about
>>>>>>> handling priority-tags, and it exists without this patch set applied.
>>>>>>> It looks like that we should prepare another patch set than this to fix
>>>>>>> that problem.
>>>>>>>
>>>>>>> Should I include patches that fix the priority-tags problem in this
>>>>>>> patch set and resubmit them all together?
>>>>>>>
>>>>>>
>>>>>> I am thinking that we might need to do it in bridge and it looks like
>>>>>> the simplest way to do it is to have default priority regeneration table
>>>>>> (table 6-5 from 802.1Q doc).
>>>>>>
>>>>>> That way I think we would conform to the spec.
>>>>>>
>>>>>> -vlad
>>>>>
>>>>> Unfortunately I don't think the default priority regeneration table
>>>>> resolves the problem because IEEE 802.1Q says that a VLAN-aware bridge
>>>>> can transmit untagged or VLAN-tagged frames only (the end of section 7.5
>>>>> and 8.1.7).
>>>>>
>>>>> No mechanism to send priority-tagged frames is found as far as I can see
>>>>> the standard. I think the regenerated priority is used for outgoing PCP
>>>>> field only if egress policy is not untagged (i.e. transmitting as
>>>>> VLAN-tagged), and unused if untagged (Section 6.9.2 3rd/4th Paragraph).
>>>>>
>>>>> If we want to transmit priority-tagged frames from a bridge port, I
>>>>> think we need to implement a new (optional) feature that is above the
>>>>> standard, as I stated previously.
>>>>>
>>>>> How do you feel about adding a per-port policy that enables a bridge to
>>>>> send priority-tagged frames instead of untagged frames when egress
>>>>> policy for the port is untagged?
>>>>> With this change, we can transmit frames for a given vlan as either all
>>>>> untagged, all priority-tagged or all VLAN-tagged.
>>>>
>>>> That would work.  What I am thinking is that we do it by special casing
>>>> the vid 0 egress policy specification.  Let it be untagged by default
>>>> and if it is tagged, then we preserve the priority field and forward
>>>> it on.
>>>>
>>>> This keeps the API stable and doesn't require user/admin from knowing
>>>> exactly what happens.  Default operation conforms to the spec and allows
>>>> simple change to make it backward-compatible.
>>>>
>>>> What do you think.  I've done a simple prototype of this an it seems to
>>>> work with the VMs I am testing with.
>>>
>>> Are you saying that
>>> - by default, set the 0th bit of untagged_bitmap; and
>>> - if we unset the 0th bit and set the "vid"th bit, we transmit frames
>>> classified as belonging to VLAN "vid" as priority-tagged?
>>>
>>> If so, though it's attractive to keep current API, I'm worried about if
>>> it could be a bit confusing and not intuitive for kernel/iproute2
>>> developers that VID 0 has a special meaning only in the egress policy.
>>> Wouldn't it be better to adding a new member to struct net_port_vlans
>>> instead of using VID 0 of untagged_bitmap?
>>>
>>> Or are you saying that we use a new flag in struct net_port_vlans but
>>> use the BRIDGE_VLAN_INFO_UNTAGGED bit with VID 0 in netlink to set the
>>> flag?
>>>
>>> Even in that case, I'm afraid that it might be confusing for developers
>>> for the same reason. We are going to prohibit to specify VID with 0 (and
>>> 4095) in adding/deleting a FDB entry or a vlan filtering entry, but it
>>> would allow us to use VID 0 only when a vlan filtering entry is
>>> configured.
>>> I am thinking a new nlattr is a straightforward approach to configure
>>> it.
>>
>> By making this an explicit attribute it makes vid 0 a special case for
>> any automatic tool that would provision such filtering.  Seeing vid 0
>> would mean that these tools would have to know that this would have to
>> be translated to a different attribute instead of setting the policy
>> values.
>
> Yes, I agree with you that we can do it by the way you explained.
> What I don't understand is the advantage of using vid 0 over another way
> such as adding a new nlattr.
> I think we can indicate transmitting priority-tags explicitly by such a
> nlattr. Using vid 0 seems to be easier to implement than a new nlattr,
> but, for me, it looks less intuitive and more difficult to maintain
> because we have to care about vid 0 instead of simply ignoring it.
>

The point I am trying to make is that regardless of the approach someone
has to know what to do when enabling priority tagged frames.  You 
proposal would require the administrator or config tool to have that 
knowledge.  Example is:
	Admin does: bridge vlan set priority on dev eth0
         Automated app:
		if (vid == 0)
			/* Turn on priority tagged frame support */

My proposal would require the bridge filtering implementation to have it.
	user tool: bridge vlan add vid 0 tagged
	Automated app:  No special case.

IMO its better to have 1 piece code handling the special case then 
putting it multiple places.

Thanks
-vlad

> Thanks,
>
> Toshiaki Makita
>
>>
>> How it is implemented internally in the kernel isn't as big of an issue.
>> We can do it as a separate flag or as part of existing policy.
>>
>> -vlad
>>
>>>
>>> Thanks,
>>>
>>> Toshiaki Makita
>>>
>>>>
>>>> -vlad
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Toshiaki Makita
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Toshiaki Makita
>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>
>>>
>>
>
>

^ permalink raw reply

* Re: [PATCH] net: Delay default_device_exit_batch until no devices are unregistering v2
From: Francesco Ruggeri @ 2013-09-24 17:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, Eric Dumazet, Jiří Pírko,
	Alexander Duyck, Cong Wang, netdev
In-Reply-To: <87a9j26eca.fsf_-_@xmission.com>

The patch looks fine to me.
Thanks again for looking into this.

Francesco

^ permalink raw reply

* Re: [REGRESSION][BISECTED] skge: add dma_mapping check
From: Mikulas Patocka @ 2013-09-24 17:50 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: Igor Gnatenko, Francois Romieu, Mirko Lindner, linux-kernel,
	Stephen Hemminger, netdev, member graysky, Greg KH, David Miller
In-Reply-To: <5241ACEA.6010402@canonical.com>



On Tue, 24 Sep 2013, Joseph Salisbury wrote:

> On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> > Please, send patch.
> >
> The patch is in mainline as of 3.12-rc2 as commit:
> 
> Author: Mikulas Patocka <mpatocka@redhat.com>
> Date:   Thu Sep 19 14:13:17 2013 -0400
> 
>     skge: fix broken driver
> 
> I don't see that the commit was Cc'd to stable.  Mikulas, we might need
> to send a request directly to the stable maintainers and reqeust that
> the commit be pulled into stable, in case they didn't notice the request
> in the commit message.

The patch needs to be added to 3.11 stable tree. David Miller's networking 
tree has a rule that it doesn't want "Cc: stable@kernel.org" in patches 
submitted through it.

So, David hopefully submits the patch to the stable tree on his own.

Mikulas

^ permalink raw reply

* sveiki
From: Kreditu agentura LaPoste @ 2013-09-24 17:24 UTC (permalink / raw)
  To: netdev

sveiki
Mes piedavajam istermina aizdevumus un ilgtermina ar gada likmi 2% par jusu projektu un personigas projektiem un izdevumiem, bankas nav atbildet uz jusu pieprasijumu, tad si ir iespeja izpirkt sevi Paldies Sazinaties lai sanemtu sikaku informaciju
E-pasts: informations.agence@laposte.net
Jusu kredita pieteikumu, ludzu, aizpildiet so pagaidu formu:
vards:
First Family:
Pilna adrese:
Kredits Kredita summa:
Laiks paterins:
pilseta:
Valsts:
profesija:
Gimenes stavoklis:
Talrunis:
Paldies jums, lai iegutu vairak informacijas, ludzu, sazinieties ar mums: informations.agence@laposte.net

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 17:44 UTC (permalink / raw)
  To: Rick Jones
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <5241CDF2.1020601@hp.com>

On Tue, 2013-09-24 at 10:37 -0700, Rick Jones wrote:
> On 09/24/2013 10:34 AM, Eric Dumazet wrote:
> > If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
> > be 0
> 
> If one has a CPU count >> the NIC queue count, mightn't one still want 
> to use RPS/RFS?

Maybe, but UDP stack should not assume anything.

By the way, there is a problem with RFS under stress, I am cooking a
patch to address this.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Rick Jones @ 2013-09-24 17:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <1380044075.3165.108.camel@edumazet-glaptop>

On 09/24/2013 10:34 AM, Eric Dumazet wrote:
> If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
> be 0

If one has a CPU count >> the NIC queue count, mightn't one still want 
to use RPS/RFS?

rick jones

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 17:34 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx_tWS44q8jER+QtR5k4k2ieq99He6YdPF8KKBF4Z2+32w@mail.gmail.com>

On Tue, 2013-09-24 at 10:03 -0700, Tom Herbert wrote:
> On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
> >
> >> We should really be using rxhash for that anyway, eliminate this
> >> ehashfn.  This would entail adding rxhash argument in the various
> >> udp_lookup functions.
> >
> > Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> > destination IP), not L4 (adding source & destination ports)
> >
> Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
> skb_get_rxhash is called.

Yes, but then in this case you add cpu cycles for no reason.

If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
be 0

hash = inet_ehashfn(net, daddr, hnum, saddr, sport);

is faster than the whole flow dissection game.

^ permalink raw reply

* Re: [PATCH net 0/4] bridge: Fix problems around the PVID
From: Toshiaki Makita @ 2013-09-24 17:30 UTC (permalink / raw)
  To: vyasevic
  Cc: Toshiaki Makita, David Miller, netdev, Fernando Luis Vazquez Cao,
	Patrick McHardy
In-Reply-To: <52419509.1020103@redhat.com>

On Tue, 2013-09-24 at 09:35 -0400, Vlad Yasevich wrote:
> On 09/24/2013 07:45 AM, Toshiaki Makita wrote:
> > On Mon, 2013-09-23 at 10:41 -0400, Vlad Yasevich wrote:
> >> On 09/17/2013 04:12 AM, Toshiaki Makita wrote:
> >>> On Mon, 2013-09-16 at 13:49 -0400, Vlad Yasevich wrote:
> >>>> On 09/13/2013 08:06 AM, Toshiaki Makita wrote:
> >>>>> On Thu, 2013-09-12 at 16:00 -0400, David Miller wrote:
> >>>>>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
> >>>>>> Date: Tue, 10 Sep 2013 19:27:54 +0900
> >>>>>>
> >>>>>>> There seem to be some undesirable behaviors related with PVID.
> >>>>>>> 1. It has no effect assigning PVID to a port. PVID cannot be applied
> >>>>>>> to any frame regardless of whether we set it or not.
> >>>>>>> 2. FDB entries learned via frames applied PVID are registered with
> >>>>>>> VID 0 rather than VID value of PVID.
> >>>>>>> 3. We can set 0 or 4095 as a PVID that are not allowed in IEEE 802.1Q.
> >>>>>>> This leads interoperational problems such as sending frames with VID
> >>>>>>> 4095, which is not allowed in IEEE 802.1Q, and treating frames with VID
> >>>>>>> 0 as they belong to VLAN 0, which is expected to be handled as they have
> >>>>>>> no VID according to IEEE 802.1Q.
> >>>>>>>
> >>>>>>> Note: 2nd and 3rd problems are potential and not exposed unless 1st problem
> >>>>>>> is fixed, because we cannot activate PVID due to it.
> >>>>>>
> >>>>>> Please work out the issues in patch #2 with Vlad and resubmit this
> >>>>>> series.
> >>>>>>
> >>>>>> Thank you.
> >>>>>
> >>>>> I'm hovering between whether we should fix the issue by changing vlan 0
> >>>>> interface behavior in 8021q module or enabling a bridge port to sending
> >>>>> priority-tagged frames, or another better way.
> >>>>>
> >>>>> If you could comment it, I'd appreciate it :)
> >>>>>
> >>>>>
> >>>>> BTW, I think what is discussed in patch #2 is another problem about
> >>>>> handling priority-tags, and it exists without this patch set applied.
> >>>>> It looks like that we should prepare another patch set than this to fix
> >>>>> that problem.
> >>>>>
> >>>>> Should I include patches that fix the priority-tags problem in this
> >>>>> patch set and resubmit them all together?
> >>>>>
> >>>>
> >>>> I am thinking that we might need to do it in bridge and it looks like
> >>>> the simplest way to do it is to have default priority regeneration table
> >>>> (table 6-5 from 802.1Q doc).
> >>>>
> >>>> That way I think we would conform to the spec.
> >>>>
> >>>> -vlad
> >>>
> >>> Unfortunately I don't think the default priority regeneration table
> >>> resolves the problem because IEEE 802.1Q says that a VLAN-aware bridge
> >>> can transmit untagged or VLAN-tagged frames only (the end of section 7.5
> >>> and 8.1.7).
> >>>
> >>> No mechanism to send priority-tagged frames is found as far as I can see
> >>> the standard. I think the regenerated priority is used for outgoing PCP
> >>> field only if egress policy is not untagged (i.e. transmitting as
> >>> VLAN-tagged), and unused if untagged (Section 6.9.2 3rd/4th Paragraph).
> >>>
> >>> If we want to transmit priority-tagged frames from a bridge port, I
> >>> think we need to implement a new (optional) feature that is above the
> >>> standard, as I stated previously.
> >>>
> >>> How do you feel about adding a per-port policy that enables a bridge to
> >>> send priority-tagged frames instead of untagged frames when egress
> >>> policy for the port is untagged?
> >>> With this change, we can transmit frames for a given vlan as either all
> >>> untagged, all priority-tagged or all VLAN-tagged.
> >>
> >> That would work.  What I am thinking is that we do it by special casing
> >> the vid 0 egress policy specification.  Let it be untagged by default
> >> and if it is tagged, then we preserve the priority field and forward
> >> it on.
> >>
> >> This keeps the API stable and doesn't require user/admin from knowing
> >> exactly what happens.  Default operation conforms to the spec and allows
> >> simple change to make it backward-compatible.
> >>
> >> What do you think.  I've done a simple prototype of this an it seems to
> >> work with the VMs I am testing with.
> >
> > Are you saying that
> > - by default, set the 0th bit of untagged_bitmap; and
> > - if we unset the 0th bit and set the "vid"th bit, we transmit frames
> > classified as belonging to VLAN "vid" as priority-tagged?
> >
> > If so, though it's attractive to keep current API, I'm worried about if
> > it could be a bit confusing and not intuitive for kernel/iproute2
> > developers that VID 0 has a special meaning only in the egress policy.
> > Wouldn't it be better to adding a new member to struct net_port_vlans
> > instead of using VID 0 of untagged_bitmap?
> >
> > Or are you saying that we use a new flag in struct net_port_vlans but
> > use the BRIDGE_VLAN_INFO_UNTAGGED bit with VID 0 in netlink to set the
> > flag?
> >
> > Even in that case, I'm afraid that it might be confusing for developers
> > for the same reason. We are going to prohibit to specify VID with 0 (and
> > 4095) in adding/deleting a FDB entry or a vlan filtering entry, but it
> > would allow us to use VID 0 only when a vlan filtering entry is
> > configured.
> > I am thinking a new nlattr is a straightforward approach to configure
> > it.
> 
> By making this an explicit attribute it makes vid 0 a special case for
> any automatic tool that would provision such filtering.  Seeing vid 0
> would mean that these tools would have to know that this would have to
> be translated to a different attribute instead of setting the policy
> values.

Yes, I agree with you that we can do it by the way you explained.
What I don't understand is the advantage of using vid 0 over another way
such as adding a new nlattr.
I think we can indicate transmitting priority-tags explicitly by such a
nlattr. Using vid 0 seems to be easier to implement than a new nlattr,
but, for me, it looks less intuitive and more difficult to maintain
because we have to care about vid 0 instead of simply ignoring it.

Thanks,

Toshiaki Makita

> 
> How it is implemented internally in the kernel isn't as big of an issue.
> We can do it as a separate flag or as part of existing policy.
> 
> -vlad
> 
> >
> > Thanks,
> >
> > Toshiaki Makita
> >
> >>
> >> -vlad
> >>
> >>>
> >>> Thanks,
> >>>
> >>> Toshiaki Makita
> >>>
> >>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Toshiaki Makita
> >>>>>
> >>>>>>
> >>>>>> --
> >>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>>>>> the body of a message to majordomo@vger.kernel.org
> >>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>>>
> >>>>>
> >>>>>
> >>>
> >>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe netdev" in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >>>
> >
> >
> 

^ permalink raw reply

* [PATCH net v3] vxlan: Use RCU apis to access sk_user_data.
From: Pravin B Shelar @ 2013-09-24 17:25 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, stephen, Pravin B Shelar, Jesse Gross

Use of RCU api makes vxlan code easier to understand.  It also
fixes bug due to missing ACCESS_ONCE() on sk_user_data dereference.
In rare case without ACCESS_ONCE() compiler might omit vs on
sk_user_data dereference.
Compiler can use vs as alias for sk->sk_user_data, resulting in
multiple sk_user_data dereference in rcu read context which
could change.

CC: Jesse Gross <jesse@nicira.com>
Signed-off-by: Pravin B Shelar <pshelar@nicira.com>
---
v3:
remove extra space
User rcu_deref and rcu_assign api for accessing sk_user_data
v2:
Use RCU_INIT_POINTER
---
 drivers/net/vxlan.c |    9 +++------
 include/net/sock.h  |    5 +++++
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index d1292fe..2ef5b62 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -952,8 +952,7 @@ void vxlan_sock_release(struct vxlan_sock *vs)
 
 	spin_lock(&vn->sock_lock);
 	hlist_del_rcu(&vs->hlist);
-	smp_wmb();
-	vs->sock->sk->sk_user_data = NULL;
+	rcu_assign_sk_user_data(vs->sock->sk, NULL);
 	vxlan_notify_del_rx_port(sk);
 	spin_unlock(&vn->sock_lock);
 
@@ -1048,8 +1047,7 @@ static int vxlan_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
 
 	port = inet_sk(sk)->inet_sport;
 
-	smp_read_barrier_depends();
-	vs = (struct vxlan_sock *)sk->sk_user_data;
+	vs = rcu_dereference_sk_user_data(sk);
 	if (!vs)
 		goto drop;
 
@@ -2302,8 +2300,7 @@ static struct vxlan_sock *vxlan_socket_create(struct net *net, __be16 port,
 	atomic_set(&vs->refcnt, 1);
 	vs->rcv = rcv;
 	vs->data = data;
-	smp_wmb();
-	vs->sock->sk->sk_user_data = vs;
+	rcu_assign_sk_user_data(vs->sock->sk, vs);
 
 	spin_lock(&vn->sock_lock);
 	hlist_add_head_rcu(&vs->hlist, vs_head(net, port));
diff --git a/include/net/sock.h b/include/net/sock.h
index 6ba2e7b..1d37a80 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -409,6 +409,11 @@ struct sock {
 	void                    (*sk_destruct)(struct sock *sk);
 };
 
+#define __sk_user_data(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
+
+#define rcu_dereference_sk_user_data(sk)	rcu_dereference(__sk_user_data((sk)))
+#define rcu_assign_sk_user_data(sk, ptr)	rcu_assign_pointer(__sk_user_data((sk)), ptr)
+
 /*
  * SK_CAN_REUSE and SK_NO_REUSE on a socket mean that the socket is OK
  * or not whether his port will be reused by someone else. SK_FORCE_REUSE
-- 
1.7.1

^ permalink raw reply related

* Re: [PATCH] powerpc/83xx: gianfar_ptp: select 1588 clock source through dts file
From: Richard Cochran @ 2013-09-24 17:14 UTC (permalink / raw)
  To: Aida Mynzhasova; +Cc: linuxppc-dev, devicetree, netdev
In-Reply-To: <1380008397-22980-1-git-send-email-aida.mynzhasova@skitlab.ru>

On Tue, Sep 24, 2013 at 11:39:57AM +0400, Aida Mynzhasova wrote:
> Currently IEEE 1588 timer reference clock source is determined through
> hard-coded value in gianfar_ptp driver. This patch allows to select ptp
> clock source by means of device tree file node.
> 
> For instance:
> 
> 	fsl,cksel = <0>;
> 
> for using external (TSEC_TMR_CLK input) high precision timer
> reference clock.
> 
> Other acceptable values:
> 
> 	<1> : eTSEC system clock
> 	<2> : eTSEC1 transmit clock
> 	<3> : RTC clock input

I think it would be useful to have this table in the binding document
as well.

Thanks,
Richard

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 17:03 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <1380041185.3165.97.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
>
>> We should really be using rxhash for that anyway, eliminate this
>> ehashfn.  This would entail adding rxhash argument in the various
>> udp_lookup functions.
>
> Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> destination IP), not L4 (adding source & destination ports)
>
Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
skb_get_rxhash is called.

> This is again a case where we want our own hashing.
>
>
>

^ permalink raw reply

* Re: [PATCH 1/2] net: Add function to get SW rxhash
From: Eric Dumazet @ 2013-09-24 17:03 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev
In-Reply-To: <alpine.DEB.2.02.1309240915001.24581@tomh.mtv.corp.google.com>

On Tue, 2013-09-24 at 09:20 -0700, Tom Herbert wrote:
> Some uses of skb_get_rxhash expect that the function will return
> a consistent value whether it is called on RX or TX paths. On RX
> path, we will use the rxhash if provided by the NIC, so this
> would not normally be the same result computed in TX path which is
> a software calculation.
> 
> This patch adds skb_get_sw_rxhash to explicitly request a hash
> calculated by the stack, disregarding the hash provided by NIC.
> 
> Signed-off-by: Tom Herbert <therbert@google.com>
> ---

1) Adding another skb field for this corner case, is it worth it ?
   Computing rxhash in software should be real fast for the
   cases where we really care of symmetry.

  ie ignore skb->rxhash and use __skb_get_rxhash()

2) Not copying sw_rxhash in __copy_skb_header()

   -> value will be undefined after a clone()

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 17:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <1380041185.3165.97.camel@edumazet-glaptop>

On Tue, 2013-09-24 at 09:46 -0700, Eric Dumazet wrote:
> On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
> 
> > We should really be using rxhash for that anyway, eliminate this
> > ehashfn.  This would entail adding rxhash argument in the various
> > udp_lookup functions.
> 
> Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> destination IP), not L4 (adding source & destination ports)

Indeed the Microsoft RSS spec doesn't define an L4 hash for UDP/IP,
since it can't work with fragmented datagrams (which TCP avoids).  Using
the TCP/IP hash function for UDP is a non-standard option - presumably
useful where no fragmentation is expected.

Ben.

> This is again a case where we want our own hashing.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: SMSC 9303 support
From: Gary Thomas @ 2013-09-24 16:59 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev
In-Reply-To: <1380041488.2736.41.camel@bwh-desktop.uk.level5networks.com>

On 2013-09-24 10:51, Ben Hutchings wrote:
> On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
>> I need to support the SMSC9303 in an embedded system.  I'm not
>> finding any [explicit] support for this device in the latest
>> mainline kernel.  Did I miss something?
>>
>> To be clear, the SMSC9303 is a 3-port managed ethernet switch
>> capable of supporting 802.1D/802.1Q directly. This switch is
>> driven by a single MAC via MII/RMII and exposes the other two
>> ports via physical PHYs.  What I need it to do is behave like
>> two external, separate devices.  I was thinking that what I need
>> to do is treat these as VLAN devices since the switch can manage
>> the routing.
>>
>> Does this seem like a reasonable approach?
>
> Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
> of packets to indicate which switch port they are sent or received
> through.  This was originally added to support some Marvell switch chips
> and I don't know whether it would be suitable or extensible for this
> one.

I've used the DSA stuff for years (worked directly with the Marvell folks
when it was being developed).  It might work for this device, I'll think
some more about using it although I was hoping for a lighter weight solution.

>> How do I "hook up" my normal ethernet driver to it?  To the hardware
>> it just looks like any other MII/RMII PHY.  The device is managed
>> separately via I2C.  I can have that set up separately if necessary.
>>
>> Thanks for any pointers/ideas
>>
>

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

^ permalink raw reply

* Re: [PATCH 04/10] amd/7990: Remove extern from function prototypes
From: Joe Perches @ 2013-09-24 16:55 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, linux-kernel
In-Reply-To: <65ef2eba58fddc268ba0afb4de7757f073d3bb49.1379974101.git.joe@perches.com>

Hey David.

This ended up in your tree as
commit 44da5c2f523876a02336c18dbcce93449fdbf0ca
without your signoff.

Is there something I need to do differently so
your scripts add your sign-off properly or did
this manage to just slip through via some manual
git am without the -s?

^ permalink raw reply

* Re: SMSC 9303 support
From: Ben Hutchings @ 2013-09-24 16:51 UTC (permalink / raw)
  To: Gary Thomas; +Cc: netdev
In-Reply-To: <524183D6.6040801@mlbassoc.com>

On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
> I need to support the SMSC9303 in an embedded system.  I'm not
> finding any [explicit] support for this device in the latest
> mainline kernel.  Did I miss something?
> 
> To be clear, the SMSC9303 is a 3-port managed ethernet switch
> capable of supporting 802.1D/802.1Q directly. This switch is
> driven by a single MAC via MII/RMII and exposes the other two
> ports via physical PHYs.  What I need it to do is behave like
> two external, separate devices.  I was thinking that what I need
> to do is treat these as VLAN devices since the switch can manage
> the routing.
> 
> Does this seem like a reasonable approach?

Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
of packets to indicate which switch port they are sent or received
through.  This was originally added to support some Marvell switch chips
and I don't know whether it would be suitable or extensible for this
one.

Ben.

> How do I "hook up" my normal ethernet driver to it?  To the hardware
> it just looks like any other MII/RMII PHY.  The device is managed
> separately via I2C.  I can have that set up separately if necessary.
> 
> Thanks for any pointers/ideas
> 

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH net-next v2] dev: always advertise rx_flags changes
From: Stephen Hemminger @ 2013-09-24 16:47 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: netdev, davem
In-Reply-To: <1380039692-8157-1-git-send-email-nicolas.dichtel@6wind.com>

On Tue, 24 Sep 2013 18:21:32 +0200
Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:

> diff --git a/net/core/dev.c b/net/core/dev.c
> index 5c713f2239cc..067bfbe70c4c 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4822,7 +4822,7 @@ static void dev_change_rx_flags(struct net_device *dev, int flags)
>  		ops->ndo_change_rx_flags(dev, flags);
>  }
>  
> -static int __dev_set_promiscuity(struct net_device *dev, int inc)
> +static int __dev_set_promiscuity(struct net_device *dev, int inc, bool notif)
>  {
>  	unsigned int old_flags = dev->flags;
>  	kuid_t uid;
> @@ -4865,6 +4865,10 @@ static int __dev_set_promiscuity(struct net_device *dev, int inc)
>  
>  		dev_change_rx_flags(dev, IFF_PROMISC);
>  	}
> +	if (notif) {
> +		call_netdevice_notifiers(NETDEV_CHANGE, dev);
> +		rtmsg_ifinfo(RTM_NEWLINK, dev, IFF_PROMISC);
> +	}
>  	return 0;
>  }

This is getting uglier as the conditional is added to each
function. Why not create a helper function and change callers.

static void __dev_notify_flag(dev, int flags)
{
        call_netdevice_notifiers(NETDEV_CHANGE, dev);
	rtmsg_ifinfo(RTM_NEWLINK, dev, flags);
}

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 16:46 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx9u3-mnULxTwyWAZA4gZ6Dg-PeYYVGZibwQi7toinHJ-A@mail.gmail.com>

On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:

> We should really be using rxhash for that anyway, eliminate this
> ehashfn.  This would entail adding rxhash argument in the various
> udp_lookup functions.

Nope : Some NICs provide UDP rxhash only using L3  (source IP,
destination IP), not L4 (adding source & destination ports)

This is again a case where we want our own hashing.

^ permalink raw reply

* [PATCH 2/2] tun: call skb_get_sw_rxhash
From: Tom Herbert @ 2013-09-24 16:21 UTC (permalink / raw)
  To: davem; +Cc: netdev

tun calls skb_get_rxhash on both transmit and receive side with the
expectation that the calculation in symmetric. To ensure this
property call skb_get_sw_rxhash instead (ignores hash provided by NIC
on receive).

Signed-off-by: Tom Herbert <therbert@google.com>
---
 drivers/net/tun.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 807815f..0619fdf 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -358,7 +358,7 @@ static u16 tun_select_queue(struct net_device *dev, struct sk_buff *skb)
 	rcu_read_lock();
 	numqueues = ACCESS_ONCE(tun->numqueues);
 
-	txq = skb_get_rxhash(skb);
+	txq = skb_get_sw_rxhash(skb);
 	if (txq) {
 		e = tun_flow_find(&tun->flows[tun_hashfn(txq)], txq);
 		if (e)
@@ -1138,7 +1138,7 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	skb_reset_network_header(skb);
 	skb_probe_transport_header(skb, 0);
 
-	rxhash = skb_get_rxhash(skb);
+	rxhash = skb_get_sw_rxhash(skb);
 	netif_rx_ni(skb);
 
 	tun->dev->stats.rx_packets++;
-- 
1.8.4

^ permalink raw reply related

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 16:38 UTC (permalink / raw)
  To: David Laight; +Cc: Tom Herbert, davem, netdev, jesse.brandeburg
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B7355@saturn3.aculab.com>

On Tue, 2013-09-24 at 09:32 +0100, David Laight wrote:
> > +static inline unsigned int
> > +toeplitz_hash(const unsigned char *bytes,
> > +	      struct toeplitz *toeplitz, int n)
> > +{
> > +	int i;
> > +	unsigned int result = 0;
> > +
> > +	for (i = 0; i < n; i++)
> > +		result ^= toeplitz->key_cache[i][bytes[i]];
> > +
> > +        return result;
> > +};
> 
> That is a horrid hash function to be calculating in software.
> 
> The code looks very much like a simple 32bit CRC.
> It isn't entirely clears exactly where the 'key' gets included,
> but I suspect it is just xored with the data bytes.

Each input bit is multiplied by a 32-bit slice of the key and the
products are then bitwise-summed (i.e. xored).

[...]
> I also thought the hash was arranged so that tx and rx packets
> for a single connection hash to the same value?

Not in general.  I think I saw that one of the BSDs (DragonflyBSD?)
generates Toeplitz keys with this property and has an efficient software
implementation that works for those keys.  But that significantly
weakens the hash.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 16:35 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <1380039299.3165.91.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 9:14 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 18:01 +0200, Hannes Frederic Sowa wrote:
>
>> inet(6)_ehashfn does get used by udp btw. But I do not understand the code,
>> yet. ;)
>
> Oh well, thats the SO_REUSEPORT thing. We do not really care, thats only
> used to try to steer udp flows on a given socket, but its a hint.
>
> Presumably it should use a separate secret key.
>
>
We should really be using rxhash for that anyway, eliminate this
ehashfn.  This would entail adding rxhash argument in the various
udp_lookup functions.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 16:32 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, jesse.brandeburg
In-Reply-To: <alpine.DEB.2.02.1309231535030.23896@tomh.mtv.corp.google.com>

On Mon, 2013-09-23 at 15:41 -0700, Tom Herbert wrote:
> Introduce Toeplitz hash functions. Toeplitz is a hash used primarily in
> NICs to performan RSS flow steering.  This is a software implemenation
> of that. In order to make the hash calculation efficient, we precompute
> the possible hash values for each inidividual byte of input. The input
> length is up to 40 bytes, so we make an array of cache[40][256].

The input length is up to 36 bytes, but we need an extra 31 bits in the
key so that each bit of input is multiplied by 32 bits of the key.  The
first dimension should therefore be 36.

But this is still pretty big. :-/

> The implemenation was verified against MSDN "Verify RSS hash" sample
> values.
[...]
> --- /dev/null
> +++ b/lib/toeplitz.c

I'm not convinced that this is going to be useful outside of net.

> @@ -0,0 +1,66 @@
> +/*
> + * Toeplitz hash implemenation. See include/linux/toeplitz.h
> + *
> + * Copyright (c) 2011, Tom Herbert <therbert@google.com>
> + */
> +#include <linux/types.h>
> +#include <linux/kernel.h>
> +#include <linux/slab.h>
> +#include <linux/random.h>
> +#include <linux/toeplitz.h>
> +
> +struct toeplitz *toeplitz_alloc(void)
> +{
> +	return kmalloc(sizeof(struct toeplitz), GFP_KERNEL);
> +}
> +
> +static u32 toeplitz_get_kval(unsigned char *key, int idx)

unsigned int idx

> +{
> +	u32 v, r;
> +	int off, rem;
> +
> +	off = idx / 8;
> +	rem = idx % 8;

Otherwise this requires a 'real' division if the compiler doesn't work
out that idx >= 0.

> +	v = (((unsigned int)key[off]) << 24) +
> +	    (((unsigned int)key[off + 1]) << 16) +
> +	    (((unsigned int)key[off + 2]) << 8) +
> +	    (((unsigned int)key[off + 3]));
> +
> +	r = v << rem | (unsigned int)key[off + 4] >> (8 - rem);
> +	return r;
> +}
> +
> +static inline int idx8(int idx)
> +{
> +#ifdef __LITTLE_ENDIAN
> +        idx = (idx / 8) * 8 + (8 - (idx % 8 + 1));

Or in short: idx ^= 7

> +#endif
> +        return idx;
> +}
> +
> +void toeplitz_init(struct toeplitz *toeplitz, u8 *key_vals)
> +{
> +	int i;
> +	unsigned long a, j;
> +	unsigned int result = 0;
> +
> +	/* Set up key val table */
> +	if (key_vals)
> +		for (i = 0; i < TOEPLITZ_KEY_LEN; i++)
> +			toeplitz->key_vals[i] = key_vals[i];
> +	else
> +		prandom_bytes(toeplitz->key_vals, TOEPLITZ_KEY_LEN);

Should have braces around the if and else bodies.

Is it worth sacrificing a little randomness here by avoiding long runs
of zeroes in the key?

> +       /* Set up key cache table */
> +       for (i = 0; i < TOEPLITZ_KEY_LEN; i++) {
[...]

As I said above, the outer dimension of key_cache should be 36.  The
current loop causes toeplitz_get_kval() to over-run the bounds of the
key_vals array.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [Xen-devel] [PATCH net-next] xen-netfront: convert to GRO API and advertise this feature
From: Konrad Rzeszutek Wilk @ 2013-09-24 16:30 UTC (permalink / raw)
  To: Wei Liu; +Cc: netdev, Anirban Chakraborty, Ian Campbell, xen-devel
In-Reply-To: <1379779543-27122-1-git-send-email-wei.liu2@citrix.com>

On Sat, Sep 21, 2013 at 05:05:43PM +0100, Wei Liu wrote:
> Anirban was seeing netfront received MTU size packets, which downgraded
> throughput. The following patch makes netfront use GRO API which
> improves throughput for that case.
> 
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Signed-off-by: Anirban Chakraborty <abchak@juniper.net>
> Cc: Ian Campbell <ian.campbell@citrix.com>

Acked-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

> ---
>  drivers/net/xen-netfront.c |    7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
> index 36808bf..5664165 100644
> --- a/drivers/net/xen-netfront.c
> +++ b/drivers/net/xen-netfront.c
> @@ -952,7 +952,7 @@ static int handle_incoming_queue(struct net_device *dev,
>  		u64_stats_update_end(&stats->syncp);
>  
>  		/* Pass it up. */
> -		netif_receive_skb(skb);
> +		napi_gro_receive(&np->napi, skb);
>  	}
>  
>  	return packets_dropped;
> @@ -1051,6 +1051,8 @@ err:
>  	if (work_done < budget) {
>  		int more_to_do = 0;
>  
> +		napi_gro_flush(napi, false);
> +
>  		local_irq_save(flags);
>  
>  		RING_FINAL_CHECK_FOR_RESPONSES(&np->rx, more_to_do);
> @@ -1371,7 +1373,8 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
>  	netif_napi_add(netdev, &np->napi, xennet_poll, 64);
>  	netdev->features        = NETIF_F_IP_CSUM | NETIF_F_RXCSUM |
>  				  NETIF_F_GSO_ROBUST;
> -	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO;
> +	netdev->hw_features	= NETIF_F_IP_CSUM | NETIF_F_SG | NETIF_F_TSO |
> +				  NETIF_F_GRO;
>  
>  	/*
>           * Assume that all hw features are available for now. This set
> -- 
> 1.7.10.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply

* Re: [PATCH net] vxlan: Use RCU apis to access sk_user_data.
From: Pravin Shelar @ 2013-09-24 16:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Jesse Gross
In-Reply-To: <20130924092050.2c80bc74@nehalam.linuxnetplumber.net>

On Tue, Sep 24, 2013 at 9:20 AM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 23 Sep 2013 22:28:21 -0700
> Pravin B Shelar <pshelar@nicira.com> wrote:
>
>>
>> +#define sk_user_data_rcu(sk) ((*((void __rcu **)&(sk)->sk_user_data)))
>
> I don't like the macro, is it just to keep sparse happy?
yes, it is for sparse.
> The name also implies that rcu_dereference() is built in.
ok, I can right separate API for rcu-set and rcu-deref.

^ permalink raw reply

* [RFC PATCH 1/2] vlan: export functions to register/unregister vlan devices
From: James Chapman @ 2013-09-24 16:27 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman
In-Reply-To: <1380040030-6648-1-git-send-email-jchapman@katalix.com>

Export two new symbols to let external code register and unregister
vlan devices. Allow the caller to specify the name template for the
new netdev instance.

The new symbols are:
vlan_register_device
unregister_vlan_dev
---
 include/linux/if_vlan.h |   11 ++++++
 net/8021q/vlan.c        |   82 ++++++++++++++++++++++++++++------------------
 2 files changed, 61 insertions(+), 32 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 715c343..c6431d5 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -101,6 +101,8 @@ extern void vlan_vids_del_by_dev(struct net_device *dev,
 				 const struct net_device *by_dev);
 
 extern bool vlan_uses_dev(const struct net_device *dev);
+extern int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name);
+extern void unregister_vlan_dev(struct net_device *dev, struct list_head *head);
 #else
 static inline struct net_device *
 __vlan_find_dev_deep(struct net_device *real_dev,
@@ -155,6 +157,15 @@ static inline bool vlan_uses_dev(const struct net_device *dev)
 {
 	return false;
 }
+
+static inline int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name)
+{
+	return -EPROTONOSUPPORT;
+}
+
+static inline void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
+{
+}
 #endif
 
 static inline bool vlan_hw_offload_capable(netdev_features_t features,
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 61fc573..272157c 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -121,6 +121,7 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 	/* Get rid of the vlan's reference to real_dev */
 	dev_put(real_dev);
 }
+EXPORT_SYMBOL(unregister_vlan_dev);
 
 int vlan_check_real_dev(struct net_device *real_dev,
 			__be16 protocol, u16 vlan_id)
@@ -204,18 +205,16 @@ out_vid_del:
 	return err;
 }
 
-/*  Attach a VLAN device to a mac address (ie Ethernet Card).
- *  Returns 0 if the device was created or a negative error code otherwise.
- */
-static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
+int vlan_register_device(struct net_device *real_dev, u16 vlan_id, struct net_device **dev, char *name)
 {
 	struct net_device *new_dev;
 	struct vlan_dev_priv *vlan;
 	struct net *net = dev_net(real_dev);
-	struct vlan_net *vn = net_generic(net, vlan_net_id);
-	char name[IFNAMSIZ];
 	int err;
 
+	if (!dev || !name)
+		return -EINVAL;
+
 	if (vlan_id >= VLAN_VID_MASK)
 		return -ERANGE;
 
@@ -223,32 +222,6 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	if (err < 0)
 		return err;
 
-	/* Gotta set up the fields for the device. */
-	switch (vn->name_type) {
-	case VLAN_NAME_TYPE_RAW_PLUS_VID:
-		/* name will look like:	 eth1.0005 */
-		snprintf(name, IFNAMSIZ, "%s.%.4i", real_dev->name, vlan_id);
-		break;
-	case VLAN_NAME_TYPE_PLUS_VID_NO_PAD:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 vlan5
-		 */
-		snprintf(name, IFNAMSIZ, "vlan%i", vlan_id);
-		break;
-	case VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 eth0.5
-		 */
-		snprintf(name, IFNAMSIZ, "%s.%i", real_dev->name, vlan_id);
-		break;
-	case VLAN_NAME_TYPE_PLUS_VID:
-		/* Put our vlan.VID in the name.
-		 * Name will look like:	 vlan0005
-		 */
-	default:
-		snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
-	}
-
 	new_dev = alloc_netdev(sizeof(struct vlan_dev_priv), name, vlan_setup);
 
 	if (new_dev == NULL)
@@ -273,12 +246,57 @@ static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
 	if (err < 0)
 		goto out_free_newdev;
 
+	*dev = new_dev;
+
 	return 0;
 
 out_free_newdev:
 	free_netdev(new_dev);
 	return err;
 }
+EXPORT_SYMBOL(vlan_register_device);
+
+/*  Attach a VLAN device to a mac address (ie Ethernet Card).
+ *  Returns 0 if the device was created or a negative error code otherwise.
+ */
+static int register_vlan_device(struct net_device *real_dev, u16 vlan_id)
+{
+	struct net_device *new_dev;
+	struct net *net = dev_net(real_dev);
+	struct vlan_net *vn = net_generic(net, vlan_net_id);
+	char name[IFNAMSIZ];
+	int err;
+
+	/* Gotta set up the fields for the device. */
+	switch (vn->name_type) {
+	case VLAN_NAME_TYPE_RAW_PLUS_VID:
+		/* name will look like:	 eth1.0005 */
+		snprintf(name, IFNAMSIZ, "%s.%.4i", real_dev->name, vlan_id);
+		break;
+	case VLAN_NAME_TYPE_PLUS_VID_NO_PAD:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 vlan5
+		 */
+		snprintf(name, IFNAMSIZ, "vlan%i", vlan_id);
+		break;
+	case VLAN_NAME_TYPE_RAW_PLUS_VID_NO_PAD:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 eth0.5
+		 */
+		snprintf(name, IFNAMSIZ, "%s.%i", real_dev->name, vlan_id);
+		break;
+	case VLAN_NAME_TYPE_PLUS_VID:
+		/* Put our vlan.VID in the name.
+		 * Name will look like:	 vlan0005
+		 */
+	default:
+		snprintf(name, IFNAMSIZ, "vlan%.4i", vlan_id);
+	}
+
+	err = vlan_register_device(real_dev, vlan_id, &new_dev, name);
+
+	return err;
+}
 
 static void vlan_sync_address(struct net_device *dev,
 			      struct net_device *vlandev)
-- 
1.7.0.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox