Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH] Allow userspace code to use flag IFA_F_SECONDARY to specify an ip address to be primary or secondary ip on an interface
From: David Miller @ 2013-09-24 19:28 UTC (permalink / raw)
  To: vincent.mc.li; +Cc: netdev, linux-kernel
In-Reply-To: <1380046281-6012-1-git-send-email-vincent.mc.li@gmail.com>

From: Vincent Li <vincent.mc.li@gmail.com>
Date: Tue, 24 Sep 2013 11:11:21 -0700

> the current behavior is when an IP is added to an interface, the primary
> or secondary attributes is depending on the order of ip added to the interface
> the first IP will be primary and second, third,... or alias IP will be secondary
> if the IP subnet matches
> 
> this patch add the flexiblity to allow user to specify an argument 'primary' or 'secondary'
> (use 'ip addr add ip/mask primary|secondary dev ethX ' from iproute2 for example) to specify
> an IP address to be  primary or secondary.
> 
> the reason for this patch is that we have a multi blade cluster platform sharing 'floating management ip'
> and also that each blade has its own management ip on the management interface, so whichever blade in the
> cluster becomes primary blade, the 'floating mangaement ip' follows it, and we want any of our traffic
> originated from the primary blade source from the 'floating management ip' for consistency. but in this
> case, since the local blade management ip is always the primary ip on the mangaement interface and 'floating
> management ip' is always secondary, kernel always choose the primary ip as source ip address. thus we would
> like to add the flexibility in kernel to allow us to specify which ip to be primary or secondary.
> 
> Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>

When submitting a patch, please:

1) Specify an appropriate prefix for your subject line, indicating the
   subsystem.  "ipv4: " might be appropriate here.

2) Format your commit message so that lines do not exceed 80 columns.
   People will read using ASCII text based tools in 80 column
   terminals.

^ permalink raw reply

* Re: [PATCH] Allow userspace code to use flag IFA_F_SECONDARY to specify an ip address to be primary or secondary ip on an interface
From: Stephen Hemminger @ 2013-09-24 19:21 UTC (permalink / raw)
  To: Vincent Li; +Cc: netdev, linux-kernel, davem
In-Reply-To: <1380046281-6012-1-git-send-email-vincent.mc.li@gmail.com>

On Tue, 24 Sep 2013 11:11:21 -0700
Vincent Li <vincent.mc.li@gmail.com> wrote:

> -	ifa->ifa_flags &= ~IFA_F_SECONDARY;
>  	last_primary = &in_dev->ifa_list;
>  
> +	if((*last_primary) == NULL)


Minor nit, paren's here are unnecessary

^ permalink raw reply

* Re: [PATCH v4 net-next 06/27] bonding: modify bond_get_slave_by_dev() to use neighbours
From: David Miller @ 2013-09-24 19:19 UTC (permalink / raw)
  To: vfalico; +Cc: netdev, jiri, fubar, andy
In-Reply-To: <1380023227-9576-7-git-send-email-vfalico@redhat.com>

From: Veaceslav Falico <vfalico@redhat.com>
Date: Tue, 24 Sep 2013 13:46:46 +0200

> -	return NULL;
> +	return (struct slave *)netdev_lower_dev_get_private(bond->dev,
> +							    slave_dev);
>  }

void pointers never need to be cast, please remove this "struct slave *"
cast.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 19:14 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Ben Hutchings, David Miller, David Laight, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx83=X6k42oe4=-hD_coRoX72bJ05Kb5dDfRCDCdTzOH+A@mail.gmail.com>

On Tue, 2013-09-24 at 11:24 -0700, Tom Herbert wrote:
> >
> There are some drivers are using fixed keys.  This is not good.  I'll
> take a look at those.  It should be sufficient to randomly set the key
> once at initialization.
> 

If the software fallback (we need this for dumb NIC, or encapsulated
traffic) is more expensive than jhash, I do not think we will ever use
this Toeplitz thing.

We can not use jhash or Toeplitz.

^ permalink raw reply

* Re: SMSC 9303 support
From: Gary Thomas @ 2013-09-24 19:13 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: Ben Hutchings, netdev
In-Reply-To: <CAGVrzcbF+24ouTGxMwsu3-QCi=JeHEcS7BqBceGBX7Jf+e7BdQ@mail.gmail.com>

On 2013-09-24 12:29, Florian Fainelli wrote:
> Hello,
>
> 2013/9/24 Gary Thomas <gary@mlbassoc.com>:
>> On 2013-09-24 10:51, Ben Hutchings wrote:
>>>
>>> On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
>>>>
>>>> I need to support the SMSC9303 in an embedded system.  I'm not
>>>> finding any [explicit] support for this device in the latest
>>>> mainline kernel.  Did I miss something?
>>>>
>>>> To be clear, the SMSC9303 is a 3-port managed ethernet switch
>>>> capable of supporting 802.1D/802.1Q directly. This switch is
>>>> driven by a single MAC via MII/RMII and exposes the other two
>>>> ports via physical PHYs.  What I need it to do is behave like
>>>> two external, separate devices.  I was thinking that what I need
>>>> to do is treat these as VLAN devices since the switch can manage
>>>> the routing.
>>>>
>>>> Does this seem like a reasonable approach?
>>>
>>>
>>> Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
>>> of packets to indicate which switch port they are sent or received
>>> through.  This was originally added to support some Marvell switch chips
>>> and I don't know whether it would be suitable or extensible for this
>>> one.
>>
>>
>> I've used the DSA stuff for years (worked directly with the Marvell folks
>> when it was being developed).  It might work for this device, I'll think
>> some more about using it although I was hoping for a lighter weight
>> solution.
>
> I do not think DSA is suitable for pure 802.1q switches such as this
> one. OpenWrt has an out of tree patch which adds some switch-specific
> operations that can be controlled over netlink (currently trying to
> get them in a shape where they can be submitted for mainline
> inclusion) [1], which I think is much more suitable than DSA or any
> other proprietary switch tagging mechanism.
>
> [1]: https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/swconfig.c
>

This looks interesting.  Do you have any more information on how to
integrate this and/or use it?

-- 
------------------------------------------------------------
Gary Thomas                 |  Consulting for the
MLB Associates              |    Embedded world
------------------------------------------------------------

^ permalink raw reply

* Re: [net 5/6] i40e: better return values
From: Jesse Brandeburg @ 2013-09-24 19:09 UTC (permalink / raw)
  To: David Miller; +Cc: joe, jeffrey.t.kirsher, netdev, gospo, sassmann
In-Reply-To: <20130924.101231.1193830264265403478.davem@davemloft.net>

On Tue, 24 Sep 2013 10:12:31 -0400
David Miller <davem@davemloft.net> wrote:
> > Ick.  post_increment problem.
> > 
> > 	return ++num_tc;
> > 
> > There's nothing wrong with the original code
> > unless this is a bugfix which should be documented
> > better than "better return values".
> 
> Agreed, this style of coding is asking for a bug.
> 
> If you want to return "num_tc PLUS ONE" just say that:
> 
> 	return num_tc + 1;

Oops! Obviously the original post-inc was not the intent. That was my
bad. good catch Joe! will fix and resubmit.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 19:04 UTC (permalink / raw)
  To: Jesse Brandeburg; +Cc: David Miller, David Laight, Linux Netdev List
In-Reply-To: <20130924114853.00003935@unknown>

On Tue, Sep 24, 2013 at 11:48 AM, Jesse Brandeburg
<jesse.brandeburg@intel.com> wrote:
> On Tue, 24 Sep 2013 14:03:12 -0400 David Miller <davem@redhat.com> wrote:
> ...
>> >> For security reasons we absolutely cannot use it for that purpose,
>> >> please stop claiming this.
>> >>
>> >> Any hash function which an attacker can reproduce is attackable.
>> >
> ...
>> > that should be addressed.  It is possible to DoS attack through the
>> > steering mechanism.
>>
>> All of them are using a fixed, defined, key.
>
> We selected the fixed key on purpose.  The existing mechanisms built
> into the stack for preventing the impact of DOS attacks like NAPI
> polling will prevent any actual damage even if someone sends lots of
> packets on a single flow.  If someone overflows a receive queue that
> CPU runs until it can't keep up and then hardware drops further
> packets.  In this case even with a randomized seed key any single flow
> can still be targeted at a queue, which is no different than a single
> queue adapter.
>
> I'm not convinced there is an actual impact in practice.
>
But what is the purpose of using a fixed key, what are the benefits?
A clever attacker could attack a specific queue with different
4-tuples.  While this might be similar to a single flow attack, it
will be much harder to detect and be used to attack multiple servers
simultaneously in the say way (since they all share the same key).  I
don't think we could deploy a NIC with a static RSS key.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: David Miller @ 2013-09-24 18:49 UTC (permalink / raw)
  To: bhutchings; +Cc: therbert, David.Laight, netdev, jesse.brandeburg
In-Reply-To: <1380046245.2736.52.camel@bwh-desktop.uk.level5networks.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Tue, 24 Sep 2013 19:10:45 +0100

> On Tue, 2013-09-24 at 14:03 -0400, David Miller wrote:
>> From: Tom Herbert <therbert@google.com>
>> Date: Tue, 24 Sep 2013 08:54:24 -0700
>> 
>> > On Tue, Sep 24, 2013 at 8:39 AM, David Miller <davem@redhat.com> wrote:
>> >> From: Tom Herbert <therbert@google.com>
>> >> Date: Tue, 24 Sep 2013 08:22:55 -0700
>> >>
>> >>> We use this value for steering, and could use it for other uses like
>> >>> connection lookup.
>> >>
>> >> For security reasons we absolutely cannot use it for that purpose,
>> >> please stop claiming this.
>> >>
>> >> Any hash function which an attacker can reproduce is attackable.
>> > 
>> > The Toeplitz function uses a secret key whose length is based on the
>> > input length.  96 bits in IPv4, 320 bits in IPv6.  I don't see how an
>> > attacker can reproduce this if the key is random.  If the problem is
>> > that devices are not being configured with a sufficiently random key
>> > (some actually are using a fixed key :-( ), that's a separate issue
>> > that should be addressed.  It is possible to DoS attack through the
>> > steering mechanism.
>> 
>> All of them are using a fixed, defined, key.
> 
> This is certainly false, as I know sfc randomises the key.  And the
> Microsoft RSS spec appears to require that the key is programmable.

Then I stand corrected.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Jesse Brandeburg @ 2013-09-24 18:48 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, David.Laight, netdev
In-Reply-To: <20130924.140312.1944338200709799169.davem@redhat.com>

On Tue, 24 Sep 2013 14:03:12 -0400 David Miller <davem@redhat.com> wrote:
...
> >> For security reasons we absolutely cannot use it for that purpose,
> >> please stop claiming this.
> >>
> >> Any hash function which an attacker can reproduce is attackable.
> > 
...
> > that should be addressed.  It is possible to DoS attack through the
> > steering mechanism.
> 
> All of them are using a fixed, defined, key.

We selected the fixed key on purpose.  The existing mechanisms built
into the stack for preventing the impact of DOS attacks like NAPI
polling will prevent any actual damage even if someone sends lots of
packets on a single flow.  If someone overflows a receive queue that
CPU runs until it can't keep up and then hardware drops further
packets.  In this case even with a randomized seed key any single flow
can still be targeted at a queue, which is no different than a single
queue adapter.

I'm not convinced there is an actual impact in practice.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: David Miller @ 2013-09-24 18:48 UTC (permalink / raw)
  To: therbert; +Cc: eric.dumazet, hannes, netdev, jesse.brandeburg
In-Reply-To: <CA+mtBx8PJj1HgaoiOoJjQL6VmU4Rn7ctBpscr1YxpfrWW9VSbA@mail.gmail.com>

From: Tom Herbert <therbert@google.com>
Date: Tue, 24 Sep 2013 11:02:11 -0700

> But if we already have valid l4_rxhash we're just wasting time
> recomputing it (very like the same value).  So just do:

Honestly, you're arguing over 12 or so cycles.  That's about how much
jhash costs.

^ permalink raw reply

* Re: SMSC 9303 support
From: Florian Fainelli @ 2013-09-24 18:29 UTC (permalink / raw)
  To: Gary Thomas; +Cc: Ben Hutchings, netdev
In-Reply-To: <5241C4F2.2040106@mlbassoc.com>

Hello,

2013/9/24 Gary Thomas <gary@mlbassoc.com>:
> On 2013-09-24 10:51, Ben Hutchings wrote:
>>
>> On Tue, 2013-09-24 at 06:21 -0600, Gary Thomas wrote:
>>>
>>> I need to support the SMSC9303 in an embedded system.  I'm not
>>> finding any [explicit] support for this device in the latest
>>> mainline kernel.  Did I miss something?
>>>
>>> To be clear, the SMSC9303 is a 3-port managed ethernet switch
>>> capable of supporting 802.1D/802.1Q directly. This switch is
>>> driven by a single MAC via MII/RMII and exposes the other two
>>> ports via physical PHYs.  What I need it to do is behave like
>>> two external, separate devices.  I was thinking that what I need
>>> to do is treat these as VLAN devices since the switch can manage
>>> the routing.
>>>
>>> Does this seem like a reasonable approach?
>>
>>
>> Linux has 'DSA' (Distributed Switch Architecture) which supports tagging
>> of packets to indicate which switch port they are sent or received
>> through.  This was originally added to support some Marvell switch chips
>> and I don't know whether it would be suitable or extensible for this
>> one.
>
>
> I've used the DSA stuff for years (worked directly with the Marvell folks
> when it was being developed).  It might work for this device, I'll think
> some more about using it although I was hoping for a lighter weight
> solution.

I do not think DSA is suitable for pure 802.1q switches such as this
one. OpenWrt has an out of tree patch which adds some switch-specific
operations that can be controlled over netlink (currently trying to
get them in a shape where they can be submitted for mainline
inclusion) [1], which I think is much more suitable than DSA or any
other proprietary switch tagging mechanism.

[1]: https://dev.openwrt.org/browser/trunk/target/linux/generic/files/drivers/net/phy/swconfig.c
-- 
Florian

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 18:24 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, David Laight, Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <1380046245.2736.52.camel@bwh-desktop.uk.level5networks.com>

On Tue, Sep 24, 2013 at 11:10 AM, Ben Hutchings
<bhutchings@solarflare.com> wrote:
> On Tue, 2013-09-24 at 14:03 -0400, David Miller wrote:
>> From: Tom Herbert <therbert@google.com>
>> Date: Tue, 24 Sep 2013 08:54:24 -0700
>>
>> > On Tue, Sep 24, 2013 at 8:39 AM, David Miller <davem@redhat.com> wrote:
>> >> From: Tom Herbert <therbert@google.com>
>> >> Date: Tue, 24 Sep 2013 08:22:55 -0700
>> >>
>> >>> We use this value for steering, and could use it for other uses like
>> >>> connection lookup.
>> >>
>> >> For security reasons we absolutely cannot use it for that purpose,
>> >> please stop claiming this.
>> >>
>> >> Any hash function which an attacker can reproduce is attackable.
>> >
>> > The Toeplitz function uses a secret key whose length is based on the
>> > input length.  96 bits in IPv4, 320 bits in IPv6.  I don't see how an
>> > attacker can reproduce this if the key is random.  If the problem is
>> > that devices are not being configured with a sufficiently random key
>> > (some actually are using a fixed key :-( ), that's a separate issue
>> > that should be addressed.  It is possible to DoS attack through the
>> > steering mechanism.
>>
>> All of them are using a fixed, defined, key.
>
> This is certainly false, as I know sfc randomises the key.  And the
> Microsoft RSS spec appears to require that the key is programmable.
>
There are some drivers are using fixed keys.  This is not good.  I'll
take a look at those.  It should be sufficient to randomly set the key
once at initialization.


> Ben.
>
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>

^ permalink raw reply

* [PATCH] Allow userspace code to use flag IFA_F_SECONDARY to specify an ip address to be primary or secondary ip on an interface
From: Vincent Li @ 2013-09-24 18:11 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, davem, Vincent Li

the current behavior is when an IP is added to an interface, the primary
or secondary attributes is depending on the order of ip added to the interface
the first IP will be primary and second, third,... or alias IP will be secondary
if the IP subnet matches

this patch add the flexiblity to allow user to specify an argument 'primary' or 'secondary'
(use 'ip addr add ip/mask primary|secondary dev ethX ' from iproute2 for example) to specify
an IP address to be  primary or secondary.

the reason for this patch is that we have a multi blade cluster platform sharing 'floating management ip'
and also that each blade has its own management ip on the management interface, so whichever blade in the
cluster becomes primary blade, the 'floating mangaement ip' follows it, and we want any of our traffic
originated from the primary blade source from the 'floating management ip' for consistency. but in this
case, since the local blade management ip is always the primary ip on the mangaement interface and 'floating
management ip' is always secondary, kernel always choose the primary ip as source ip address. thus we would
like to add the flexibility in kernel to allow us to specify which ip to be primary or secondary.

Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>
---
 net/ipv4/devinet.c |    9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index a1b5bcb..bfc702a 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -440,9 +440,11 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
 		return 0;
 	}

-	ifa->ifa_flags &= ~IFA_F_SECONDARY;
 	last_primary = &in_dev->ifa_list;

+	if((*last_primary) == NULL)
+		ifa->ifa_flags &= ~IFA_F_SECONDARY;
+
 	for (ifap = &in_dev->ifa_list; (ifa1 = *ifap) != NULL;
 	     ifap = &ifa1->ifa_next) {
 		if (!(ifa1->ifa_flags & IFA_F_SECONDARY) &&
@@ -458,7 +460,10 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
 				inet_free_ifa(ifa);
 				return -EINVAL;
 			}
-			ifa->ifa_flags |= IFA_F_SECONDARY;
+                        if (!(ifa->ifa_flags & IFA_F_SECONDARY))
+                                ifa1->ifa_flags |= IFA_F_SECONDARY;
+                        else
+                                ifa->ifa_flags |= IFA_F_SECONDARY;
 		}
 	}

-- 
1.7.9.5

^ permalink raw reply related

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Ben Hutchings @ 2013-09-24 18:10 UTC (permalink / raw)
  To: David Miller; +Cc: therbert, David.Laight, netdev, jesse.brandeburg
In-Reply-To: <20130924.140312.1944338200709799169.davem@redhat.com>

On Tue, 2013-09-24 at 14:03 -0400, David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Tue, 24 Sep 2013 08:54:24 -0700
> 
> > On Tue, Sep 24, 2013 at 8:39 AM, David Miller <davem@redhat.com> wrote:
> >> From: Tom Herbert <therbert@google.com>
> >> Date: Tue, 24 Sep 2013 08:22:55 -0700
> >>
> >>> We use this value for steering, and could use it for other uses like
> >>> connection lookup.
> >>
> >> For security reasons we absolutely cannot use it for that purpose,
> >> please stop claiming this.
> >>
> >> Any hash function which an attacker can reproduce is attackable.
> > 
> > The Toeplitz function uses a secret key whose length is based on the
> > input length.  96 bits in IPv4, 320 bits in IPv6.  I don't see how an
> > attacker can reproduce this if the key is random.  If the problem is
> > that devices are not being configured with a sufficiently random key
> > (some actually are using a fixed key :-( ), that's a separate issue
> > that should be addressed.  It is possible to DoS attack through the
> > steering mechanism.
> 
> All of them are using a fixed, defined, key.

This is certainly false, as I know sfc randomises the key.  And the
Microsoft RSS spec appears to require that the key is programmable.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 18:06 UTC (permalink / raw)
  To: David Miller; +Cc: David Laight, Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <20130924.140312.1944338200709799169.davem@redhat.com>

On Tue, Sep 24, 2013 at 11:03 AM, David Miller <davem@redhat.com> wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Tue, 24 Sep 2013 08:54:24 -0700
>
>> On Tue, Sep 24, 2013 at 8:39 AM, David Miller <davem@redhat.com> wrote:
>>> From: Tom Herbert <therbert@google.com>
>>> Date: Tue, 24 Sep 2013 08:22:55 -0700
>>>
>>>> We use this value for steering, and could use it for other uses like
>>>> connection lookup.
>>>
>>> For security reasons we absolutely cannot use it for that purpose,
>>> please stop claiming this.
>>>
>>> Any hash function which an attacker can reproduce is attackable.
>>
>> The Toeplitz function uses a secret key whose length is based on the
>> input length.  96 bits in IPv4, 320 bits in IPv6.  I don't see how an
>> attacker can reproduce this if the key is random.  If the problem is
>> that devices are not being configured with a sufficiently random key
>> (some actually are using a fixed key :-( ), that's a separate issue
>> that should be addressed.  It is possible to DoS attack through the
>> steering mechanism.
>
> All of them are using a fixed, defined, key.

bnx2x, at least, seems to being trying to set a random key:

prandom_bytes(params.rss_key, T_ETH_RSS_KEY * 4);

^ permalink raw reply

* Re: [PATCH 04/10] amd/7990: Remove extern from function prototypes
From: David Miller @ 2013-09-24 18:06 UTC (permalink / raw)
  To: joe; +Cc: netdev, linux-kernel
In-Reply-To: <1380041755.3575.93.camel@joe-AO722>

From: Joe Perches <joe@perches.com>
Date: Tue, 24 Sep 2013 09:55:55 -0700

> Hey David.
> 
> This ended up in your tree as
> commit 44da5c2f523876a02336c18dbcce93449fdbf0ca
> without your signoff.
> 
> Is there something I need to do differently so
> your scripts add your sign-off properly or did
> this manage to just slip through via some manual
> git am without the -s?

I forgot to "--signoff", that's all.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: David Miller @ 2013-09-24 18:03 UTC (permalink / raw)
  To: therbert; +Cc: David.Laight, netdev, jesse.brandeburg
In-Reply-To: <CA+mtBx_AcmcNBDhRKVB8YDkSNdOwgJ3CXxBqpgxDw23X3m2rvQ@mail.gmail.com>

From: Tom Herbert <therbert@google.com>
Date: Tue, 24 Sep 2013 08:54:24 -0700

> On Tue, Sep 24, 2013 at 8:39 AM, David Miller <davem@redhat.com> wrote:
>> From: Tom Herbert <therbert@google.com>
>> Date: Tue, 24 Sep 2013 08:22:55 -0700
>>
>>> We use this value for steering, and could use it for other uses like
>>> connection lookup.
>>
>> For security reasons we absolutely cannot use it for that purpose,
>> please stop claiming this.
>>
>> Any hash function which an attacker can reproduce is attackable.
> 
> The Toeplitz function uses a secret key whose length is based on the
> input length.  96 bits in IPv4, 320 bits in IPv6.  I don't see how an
> attacker can reproduce this if the key is random.  If the problem is
> that devices are not being configured with a sufficiently random key
> (some actually are using a fixed key :-( ), that's a separate issue
> that should be addressed.  It is possible to DoS attack through the
> steering mechanism.

All of them are using a fixed, defined, key.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Tom Herbert @ 2013-09-24 18:02 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <1380044075.3165.108.camel@edumazet-glaptop>

On Tue, Sep 24, 2013 at 10:34 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On Tue, 2013-09-24 at 10:03 -0700, Tom Herbert wrote:
>> On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> > On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
>> >
>> >> We should really be using rxhash for that anyway, eliminate this
>> >> ehashfn.  This would entail adding rxhash argument in the various
>> >> udp_lookup functions.
>> >
>> > Nope : Some NICs provide UDP rxhash only using L3  (source IP,
>> > destination IP), not L4 (adding source & destination ports)
>> >
>> Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
>> skb_get_rxhash is called.
>
> Yes, but then in this case you add cpu cycles for no reason.
>
> If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
> be 0
>
> hash = inet_ehashfn(net, daddr, hnum, saddr, sport);
>
> is faster than the whole flow dissection game.
>
But if we already have valid l4_rxhash we're just wasting time
recomputing it (very like the same value).  So just do:

if (skb->l4_hash)
    hash = skb->rxhash
else
    hash = inet_ehashfn(net, daddr, hnum, saddr, sport);

Even if we go the inet_ehashfn route in the packet, it's makes sense
to store this in skb->rxhash so that subsequent functions don't need
to compute the hash

By the way, I believe there is an insidious in using the same hash
value or even related hash values in multiple places for steering.
Since we typically do something like hash % numqueues, we introduce
bias if multiple steering levels look at the same bits.  With
SO_REUSEPORT this might actually be beneficial since we could get port
selection to match RSS locality, but this is not by design!

>
>
>
>

^ permalink raw reply

* Re: [PATCH net 0/4] bridge: Fix problems around the PVID
From: Vlad Yasevich @ 2013-09-24 17:55 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: Toshiaki Makita, David Miller, netdev, Fernando Luis Vazquez Cao,
	Patrick McHardy
In-Reply-To: <1380043818.4391.26.camel@localhost.localdomain>

On 09/24/2013 01:30 PM, Toshiaki Makita wrote:
> On Tue, 2013-09-24 at 09:35 -0400, Vlad Yasevich wrote:
>> On 09/24/2013 07:45 AM, Toshiaki Makita wrote:
>>> On Mon, 2013-09-23 at 10:41 -0400, Vlad Yasevich wrote:
>>>> On 09/17/2013 04:12 AM, Toshiaki Makita wrote:
>>>>> On Mon, 2013-09-16 at 13:49 -0400, Vlad Yasevich wrote:
>>>>>> On 09/13/2013 08:06 AM, Toshiaki Makita wrote:
>>>>>>> On Thu, 2013-09-12 at 16:00 -0400, David Miller wrote:
>>>>>>>> From: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
>>>>>>>> Date: Tue, 10 Sep 2013 19:27:54 +0900
>>>>>>>>
>>>>>>>>> There seem to be some undesirable behaviors related with PVID.
>>>>>>>>> 1. It has no effect assigning PVID to a port. PVID cannot be applied
>>>>>>>>> to any frame regardless of whether we set it or not.
>>>>>>>>> 2. FDB entries learned via frames applied PVID are registered with
>>>>>>>>> VID 0 rather than VID value of PVID.
>>>>>>>>> 3. We can set 0 or 4095 as a PVID that are not allowed in IEEE 802.1Q.
>>>>>>>>> This leads interoperational problems such as sending frames with VID
>>>>>>>>> 4095, which is not allowed in IEEE 802.1Q, and treating frames with VID
>>>>>>>>> 0 as they belong to VLAN 0, which is expected to be handled as they have
>>>>>>>>> no VID according to IEEE 802.1Q.
>>>>>>>>>
>>>>>>>>> Note: 2nd and 3rd problems are potential and not exposed unless 1st problem
>>>>>>>>> is fixed, because we cannot activate PVID due to it.
>>>>>>>>
>>>>>>>> Please work out the issues in patch #2 with Vlad and resubmit this
>>>>>>>> series.
>>>>>>>>
>>>>>>>> Thank you.
>>>>>>>
>>>>>>> I'm hovering between whether we should fix the issue by changing vlan 0
>>>>>>> interface behavior in 8021q module or enabling a bridge port to sending
>>>>>>> priority-tagged frames, or another better way.
>>>>>>>
>>>>>>> If you could comment it, I'd appreciate it :)
>>>>>>>
>>>>>>>
>>>>>>> BTW, I think what is discussed in patch #2 is another problem about
>>>>>>> handling priority-tags, and it exists without this patch set applied.
>>>>>>> It looks like that we should prepare another patch set than this to fix
>>>>>>> that problem.
>>>>>>>
>>>>>>> Should I include patches that fix the priority-tags problem in this
>>>>>>> patch set and resubmit them all together?
>>>>>>>
>>>>>>
>>>>>> I am thinking that we might need to do it in bridge and it looks like
>>>>>> the simplest way to do it is to have default priority regeneration table
>>>>>> (table 6-5 from 802.1Q doc).
>>>>>>
>>>>>> That way I think we would conform to the spec.
>>>>>>
>>>>>> -vlad
>>>>>
>>>>> Unfortunately I don't think the default priority regeneration table
>>>>> resolves the problem because IEEE 802.1Q says that a VLAN-aware bridge
>>>>> can transmit untagged or VLAN-tagged frames only (the end of section 7.5
>>>>> and 8.1.7).
>>>>>
>>>>> No mechanism to send priority-tagged frames is found as far as I can see
>>>>> the standard. I think the regenerated priority is used for outgoing PCP
>>>>> field only if egress policy is not untagged (i.e. transmitting as
>>>>> VLAN-tagged), and unused if untagged (Section 6.9.2 3rd/4th Paragraph).
>>>>>
>>>>> If we want to transmit priority-tagged frames from a bridge port, I
>>>>> think we need to implement a new (optional) feature that is above the
>>>>> standard, as I stated previously.
>>>>>
>>>>> How do you feel about adding a per-port policy that enables a bridge to
>>>>> send priority-tagged frames instead of untagged frames when egress
>>>>> policy for the port is untagged?
>>>>> With this change, we can transmit frames for a given vlan as either all
>>>>> untagged, all priority-tagged or all VLAN-tagged.
>>>>
>>>> That would work.  What I am thinking is that we do it by special casing
>>>> the vid 0 egress policy specification.  Let it be untagged by default
>>>> and if it is tagged, then we preserve the priority field and forward
>>>> it on.
>>>>
>>>> This keeps the API stable and doesn't require user/admin from knowing
>>>> exactly what happens.  Default operation conforms to the spec and allows
>>>> simple change to make it backward-compatible.
>>>>
>>>> What do you think.  I've done a simple prototype of this an it seems to
>>>> work with the VMs I am testing with.
>>>
>>> Are you saying that
>>> - by default, set the 0th bit of untagged_bitmap; and
>>> - if we unset the 0th bit and set the "vid"th bit, we transmit frames
>>> classified as belonging to VLAN "vid" as priority-tagged?
>>>
>>> If so, though it's attractive to keep current API, I'm worried about if
>>> it could be a bit confusing and not intuitive for kernel/iproute2
>>> developers that VID 0 has a special meaning only in the egress policy.
>>> Wouldn't it be better to adding a new member to struct net_port_vlans
>>> instead of using VID 0 of untagged_bitmap?
>>>
>>> Or are you saying that we use a new flag in struct net_port_vlans but
>>> use the BRIDGE_VLAN_INFO_UNTAGGED bit with VID 0 in netlink to set the
>>> flag?
>>>
>>> Even in that case, I'm afraid that it might be confusing for developers
>>> for the same reason. We are going to prohibit to specify VID with 0 (and
>>> 4095) in adding/deleting a FDB entry or a vlan filtering entry, but it
>>> would allow us to use VID 0 only when a vlan filtering entry is
>>> configured.
>>> I am thinking a new nlattr is a straightforward approach to configure
>>> it.
>>
>> By making this an explicit attribute it makes vid 0 a special case for
>> any automatic tool that would provision such filtering.  Seeing vid 0
>> would mean that these tools would have to know that this would have to
>> be translated to a different attribute instead of setting the policy
>> values.
>
> Yes, I agree with you that we can do it by the way you explained.
> What I don't understand is the advantage of using vid 0 over another way
> such as adding a new nlattr.
> I think we can indicate transmitting priority-tags explicitly by such a
> nlattr. Using vid 0 seems to be easier to implement than a new nlattr,
> but, for me, it looks less intuitive and more difficult to maintain
> because we have to care about vid 0 instead of simply ignoring it.
>

The point I am trying to make is that regardless of the approach someone
has to know what to do when enabling priority tagged frames.  You 
proposal would require the administrator or config tool to have that 
knowledge.  Example is:
	Admin does: bridge vlan set priority on dev eth0
         Automated app:
		if (vid == 0)
			/* Turn on priority tagged frame support */

My proposal would require the bridge filtering implementation to have it.
	user tool: bridge vlan add vid 0 tagged
	Automated app:  No special case.

IMO its better to have 1 piece code handling the special case then 
putting it multiple places.

Thanks
-vlad

> Thanks,
>
> Toshiaki Makita
>
>>
>> How it is implemented internally in the kernel isn't as big of an issue.
>> We can do it as a separate flag or as part of existing policy.
>>
>> -vlad
>>
>>>
>>> Thanks,
>>>
>>> Toshiaki Makita
>>>
>>>>
>>>> -vlad
>>>>
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Toshiaki Makita
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Toshiaki Makita
>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>>>>> the body of a message to majordomo@vger.kernel.org
>>>>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>>>
>>>
>>>
>>
>
>

^ permalink raw reply

* Re: [PATCH] net: Delay default_device_exit_batch until no devices are unregistering v2
From: Francesco Ruggeri @ 2013-09-24 17:54 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: David Miller, Eric Dumazet, Jiří Pírko,
	Alexander Duyck, Cong Wang, netdev
In-Reply-To: <87a9j26eca.fsf_-_@xmission.com>

The patch looks fine to me.
Thanks again for looking into this.

Francesco

^ permalink raw reply

* Re: [REGRESSION][BISECTED] skge: add dma_mapping check
From: Mikulas Patocka @ 2013-09-24 17:50 UTC (permalink / raw)
  To: Joseph Salisbury
  Cc: Igor Gnatenko, Francois Romieu, Mirko Lindner, linux-kernel,
	Stephen Hemminger, netdev, member graysky, Greg KH, David Miller
In-Reply-To: <5241ACEA.6010402@canonical.com>



On Tue, 24 Sep 2013, Joseph Salisbury wrote:

> On 09/19/2013 05:03 AM, Igor Gnatenko wrote:
> > Please, send patch.
> >
> The patch is in mainline as of 3.12-rc2 as commit:
> 
> Author: Mikulas Patocka <mpatocka@redhat.com>
> Date:   Thu Sep 19 14:13:17 2013 -0400
> 
>     skge: fix broken driver
> 
> I don't see that the commit was Cc'd to stable.  Mikulas, we might need
> to send a request directly to the stable maintainers and reqeust that
> the commit be pulled into stable, in case they didn't notice the request
> in the commit message.

The patch needs to be added to 3.11 stable tree. David Miller's networking 
tree has a rule that it doesn't want "Cc: stable@kernel.org" in patches 
submitted through it.

So, David hopefully submits the patch to the stable tree on his own.

Mikulas

^ permalink raw reply

* sveiki
From: Kreditu agentura LaPoste @ 2013-09-24 17:24 UTC (permalink / raw)
  To: netdev

sveiki
Mes piedavajam istermina aizdevumus un ilgtermina ar gada likmi 2% par jusu projektu un personigas projektiem un izdevumiem, bankas nav atbildet uz jusu pieprasijumu, tad si ir iespeja izpirkt sevi Paldies Sazinaties lai sanemtu sikaku informaciju
E-pasts: informations.agence@laposte.net
Jusu kredita pieteikumu, ludzu, aizpildiet so pagaidu formu:
vards:
First Family:
Pilna adrese:
Kredits Kredita summa:
Laiks paterins:
pilseta:
Valsts:
profesija:
Gimenes stavoklis:
Talrunis:
Paldies jums, lai iegutu vairak informacijas, ludzu, sazinieties ar mums: informations.agence@laposte.net

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 17:44 UTC (permalink / raw)
  To: Rick Jones
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <5241CDF2.1020601@hp.com>

On Tue, 2013-09-24 at 10:37 -0700, Rick Jones wrote:
> On 09/24/2013 10:34 AM, Eric Dumazet wrote:
> > If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
> > be 0
> 
> If one has a CPU count >> the NIC queue count, mightn't one still want 
> to use RPS/RFS?

Maybe, but UDP stack should not assume anything.

By the way, there is a problem with RFS under stress, I am cooking a
patch to address this.

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Rick Jones @ 2013-09-24 17:37 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Tom Herbert, Hannes Frederic Sowa, David Miller,
	Linux Netdev List, Brandeburg, Jesse
In-Reply-To: <1380044075.3165.108.camel@edumazet-glaptop>

On 09/24/2013 10:34 AM, Eric Dumazet wrote:
> If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
> be 0

If one has a CPU count >> the NIC queue count, mightn't one still want 
to use RPS/RFS?

rick jones

^ permalink raw reply

* Re: [PATCH 1/2] net: Toeplitz library functions
From: Eric Dumazet @ 2013-09-24 17:34 UTC (permalink / raw)
  To: Tom Herbert
  Cc: Hannes Frederic Sowa, David Miller, Linux Netdev List,
	Brandeburg, Jesse
In-Reply-To: <CA+mtBx_tWS44q8jER+QtR5k4k2ieq99He6YdPF8KKBF4Z2+32w@mail.gmail.com>

On Tue, 2013-09-24 at 10:03 -0700, Tom Herbert wrote:
> On Tue, Sep 24, 2013 at 9:46 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > On Tue, 2013-09-24 at 09:35 -0700, Tom Herbert wrote:
> >
> >> We should really be using rxhash for that anyway, eliminate this
> >> ehashfn.  This would entail adding rxhash argument in the various
> >> udp_lookup functions.
> >
> > Nope : Some NICs provide UDP rxhash only using L3  (source IP,
> > destination IP), not L4 (adding source & destination ports)
> >
> Then the NIC won't set l4_rxhash and we'll rehash over 4-tuple when
> skb_get_rxhash is called.

Yes, but then in this case you add cpu cycles for no reason.

If you have multiqueue NIC, you do not use RPS/RFS, so skb->rxhash might
be 0

hash = inet_ehashfn(net, daddr, hnum, saddr, sport);

is faster than the whole flow dissection game.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox