Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next 8/8] tg3: Code movement
From: Joe Perches @ 2011-08-31 23:59 UTC (permalink / raw)
  To: Matt Carlson; +Cc: davem, netdev
In-Reply-To: <1314827094-29714-9-git-send-email-mcarlson@broadcom.com>

On Wed, 2011-08-31 at 14:44 -0700, Matt Carlson wrote:
> This patch just moves some code around for better organization.
> diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
[]
> @@ -15541,7 +15542,7 @@ static int __devinit tg3_init_one(struct pci_dev *pdev,
>  		tnapi->tx_pending = TG3_DEF_TX_RING_PENDING;
>  
>  		tnapi->int_mbox = intmbx;
> -		if (i < 4)
> +		if (i <= 4)
>  			intmbx += 0x8;
>  		else
>  			intmbx += 0x4;

Not just code movement.

^ permalink raw reply

* Re: RFC - should network devices trim frames > soft mtu
From: Ben Hutchings @ 2011-08-31 22:45 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Michael Chan, netdev
In-Reply-To: <20110831151823.23cfb7bc@nehalam.ftrdhcpuser.net>

On Wed, 2011-08-31 at 15:18 -0700, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
> 
> 
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
> 
> 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
> 			(ntohs(skb->protocol) != 0x8100)) {
> 
> 			dev_kfree_skb(skb);
> 			goto next_rx;
> 
> 		}
> 
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.

According to netdevices.txt:

"MTU is symmetrical and applies both to receive and transmit.  ...
The device may either: drop, truncate, or pass up oversize packets, but
dropping oversize packets is preferred."

I believe UNH interop tests expect that MRU = MTU and oversize packets
are dropped.  However, I seem to recall that David has said more
recently that it's preferable to always use the maximum possible MRU if
DMA scatter is supported (so that this doesn't require page allocations
of order > 0).

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: RFC - should network devices trim frames > soft mtu
From: Michael Chan @ 2011-08-31 22:27 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, netdev@vger.kernel.org
In-Reply-To: <20110831151823.23cfb7bc@nehalam.ftrdhcpuser.net>


On Wed, 2011-08-31 at 15:18 -0700, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
> 
> 
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
> 
> 		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
> 			(ntohs(skb->protocol) != 0x8100)) {
> 
> 			dev_kfree_skb(skb);
> 			goto next_rx;
> 
> 		}
> 
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.
> 

I guess the reasoning is that we program the RX MTU in our chip to
automatically discard packets bigger than the RX MTU and count them as
over-size packets.  We add 4 bytes to the RX MTU to account for the VLAN
tag which may be stripped or not stripped by the chip depending on
settings.  The extra 4 bytes in the RX MTU setting will allow over-size
packets by up to 4 bytes to get through.

I agree we should move this to the next level.

^ permalink raw reply

* Re: RFC - should network devices trim frames > soft mtu
From: Ben Greear @ 2011-08-31 22:26 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: David Miller, Michael Chan, netdev
In-Reply-To: <20110831151823.23cfb7bc@nehalam.ftrdhcpuser.net>

On 08/31/2011 03:18 PM, Stephen Hemminger wrote:
> I noticed the following in the bnx2 driver.
>
>
> static int
> bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
> {
> ...
> 		skb->protocol = eth_type_trans(skb, bp->dev);
>
> 		if ((len>  (bp->dev->mtu + ETH_HLEN))&&
> 			(ntohs(skb->protocol) != 0x8100)) {
>
> 			dev_kfree_skb(skb);
> 			goto next_rx;
>
> 		}
>
> This means that for non-VLAN tagged frames, the device drops received
> packets if the length is greater than the MTU.  I don't see that in
> other devices. What is the correct method? IMHO the bnx2 driver is
> wrong here and if the policy is desired it should be enforced at
> the next level (netif_receive_skb).  Hardcoding a protocol value is
> kind of a giveaway that something is fishy.

Maybe that lets them use some kind of offload?

Either way, seems the pkt should be allowed to come up the
stack if the NIC can receive it and it's not otherwise funky.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] ethtool: Uncook tg3 regdump output
From: Ben Hutchings @ 2011-08-31 22:25 UTC (permalink / raw)
  To: Matt Carlson; +Cc: netdev
In-Reply-To: <1314842557-32247-1-git-send-email-mcarlson@broadcom.com>

On Wed, 2011-08-31 at 19:02 -0700, Matt Carlson wrote:
> tg3 devices have changed a lot since the bcm5700 days.  Register blocks
> have been added, removed and redefined.  The existing tg3 register dump
> code has not kept up.
> 
> This patch changes the tg3_dump_regs() function to be more simplistic.
> Rather than attempting to locate where meaningful data is, it will
> instead print the registers as 32-bit values, omitting all registers
> that have a value of zero.  By performing the output this way, we hope
> to future-proof the interface.
[...]

Applied.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* RFC - should network devices trim frames > soft mtu
From: Stephen Hemminger @ 2011-08-31 22:18 UTC (permalink / raw)
  To: David Miller, Michael Chan; +Cc: netdev

I noticed the following in the bnx2 driver.


static int
bnx2_rx_int(struct bnx2 *bp, struct bnx2_napi *bnapi, int budget)
{
...
		skb->protocol = eth_type_trans(skb, bp->dev);

		if ((len > (bp->dev->mtu + ETH_HLEN)) &&
			(ntohs(skb->protocol) != 0x8100)) {

			dev_kfree_skb(skb);
			goto next_rx;

		}

This means that for non-VLAN tagged frames, the device drops received
packets if the length is greater than the MTU.  I don't see that in
other devices. What is the correct method? IMHO the bnx2 driver is
wrong here and if the policy is desired it should be enforced at
the next level (netif_receive_skb).  Hardcoding a protocol value is
kind of a giveaway that something is fishy.

^ permalink raw reply

* [PATCH] ethtool: Uncook tg3 regdump output
From: Matt Carlson @ 2011-09-01  2:02 UTC (permalink / raw)
  To: bhutchings; +Cc: netdev, mcarlson

tg3 devices have changed a lot since the bcm5700 days.  Register blocks
have been added, removed and redefined.  The existing tg3 register dump
code has not kept up.

This patch changes the tg3_dump_regs() function to be more simplistic.
Rather than attempting to locate where meaningful data is, it will
instead print the registers as 32-bit values, omitting all registers
that have a value of zero.  By performing the output this way, we hope
to future-proof the interface.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
 tg3.c |   30 +++++++-----------------------
 1 files changed, 7 insertions(+), 23 deletions(-)

diff --git a/tg3.c b/tg3.c
index 1ce07a8..4868504 100644
--- a/tg3.c
+++ b/tg3.c
@@ -26,32 +26,16 @@ tg3_dump_eeprom(struct ethtool_drvinfo *info, struct ethtool_eeprom *ee)
 int
 tg3_dump_regs(struct ethtool_drvinfo *info, struct ethtool_regs *regs)
 {
-	int i, j;
-	int reg_boundaries[] = { 0x015c, 0x0200, 0x0400, 0x0400, 0x08f0, 0x0c00,
-	       			 0x0ce0, 0x1000, 0x1004, 0x1400, 0x1480, 0x1800,
-				 0x1848, 0x1c00, 0x1c04, 0x2000, 0x225c, 0x2400,
-				 0x24c4, 0x2800, 0x2804, 0x2c00, 0x2c20, 0x3000,
-				 0x3014, 0x3400, 0x3408, 0x3800, 0x3808, 0x3c00,
-				 0x3d00, 0x4000, 0x4010, 0x4400, 0x4458, 0x4800,
-				 0x4808, 0x4c00, 0x4c08, 0x5000, 0x5280, 0x5400,
-				 0x5680, 0x5800, 0x5a10, 0x5c00, 0x5d20, 0x6000,
-				 0x600c, 0x6800, 0x6848, 0x7000, 0x7034, 0x7c00,
-				 0x7e40, 0x8000 };
+	int i;
+	u32 reg;
 
 	fprintf(stdout, "Offset\tValue\n");
 	fprintf(stdout, "------\t----------\n");
-	for (i = 0, j = 0; i < regs->len; ) {
-		u32 reg;
-
-		memcpy(&reg, &regs->data[i], 4);
-		fprintf(stdout, "0x%04x\t0x%08x\n", i, reg);
-
-		i += 4;
-		if (i == reg_boundaries[j]) {
-			i = reg_boundaries[j + 1];
-			j += 2;
-			fprintf(stdout, "\n");
-		}
+	for (i = 0; i < regs->len; i += sizeof(reg)) {
+		memcpy(&reg, &regs->data[i], sizeof(reg));
+		if (reg)
+			fprintf(stdout, "0x%04x\t0x%08x\n", i, reg);
+
 	}
 	fprintf(stdout, "\n");
 	return 0;
-- 
1.7.3.4

^ permalink raw reply related

* Re: [PATCH] bridge: mask forwarding of IEEE 802 local multicast groups
From: Ed Swierk @ 2011-08-31 22:00 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Nick Carter, David Lamparter, netdev, Michał Mirosław,
	davem
In-Reply-To: <20110831134904.1a050924@nehalam.ftrdhcpuser.net>

On Wed, Aug 31, 2011 at 1:49 PM, Stephen Hemminger
<shemminger@linux-foundation.org> wrote:
> I prefer the netfilter solution because it is more general. We already have
> a firewall solution why shouldn't this case be part of it?

For my application, I just want the bridge to forward all link-local
frames as well as regular frames between a pair of interfaces. It
seems a little odd to use a mechanism like netfilter to tell the
bridge _not_ to drop frames that would otherwise be dropped.

But other than that minor issue, my strong preference is that either
your or Nick's patch be committed as soon as possible :-)

--Ed

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Ben Greear @ 2011-08-31 21:49 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Nicolas de Pesloüan, Jiri Pirko, Michał Mirosław,
	netdev, davem, eric.dumazet, shemminger
In-Reply-To: <1314826605.3274.34.camel@bwh-desktop>

On 08/31/2011 02:36 PM, Ben Hutchings wrote:
> On Wed, 2011-08-31 at 13:48 -0700, Ben Greear wrote:
>> On 08/31/2011 01:31 PM, Nicolas de Pesloüan wrote:
>>> Le 31/08/2011 22:12, Ben Hutchings a écrit :
>>>> On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
>>>>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
>>>>>
>>>>>>>>> Do you expect drivers using implementation different than just calling
>>>>>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>>>>>>>> Yes, generally it can be used also for en/disable phy, for testing
>>>>>>>> purposes if hw and driver would support it.
>>>>>>>
>>>>>>> I'd like to see this working for GRE tunnel devices (for keepalive
>>>>>>> daemon to be able to indicate to routing daemons whether tunnel is
>>>>>>> really working) - implementation would be identical to dummy's case.
>>>>>>> Should I prepare a patch or can I leave it to you?
>>>>>>
>>>>>> Ok, I can include it to this patchset (I'm going to repost first patch
>>>>>> anyway)
>>>>>
>>>>> Can't we assume that the dummy's case is the default behavior and
>>>>> register this default
>>>>> ndo_change_carrier callback for every device ?
>>>>
>>>> You have got to be joking. No device driver that has real link
>>>> monitoring should use this implementation.
>>>
>>> Well, why not? Arguably, this is probably not the feature one would use every day, but...
>>>
>>> Testing a cluster reaction to a link down event would be easier if one doesn't need to unplug the cable for the test. I understand that one can turn off the
>>> switch port (physical or virtual), but echo 0>  /sys/class/net/eth0/carrier would be nice too.
>>
>> There is special hardware out there that can do bypass, and often it also has a mode
>> that will programatically cut link by throwing some relays.  We use this for our
>> testing equipment...
>>
>> If there is some way to twiddle standard-ish hardware to actually drop link, that
>> would be neat.  I'd think it should be an ethtool type of thing, however.
>
> We need to be able to control this as part of our driver test suite (on
> the peer, not the device under test).  There are various MDIO bits that
> look like they should do this but unfortunately they don't have
> consistent effects.  Besides that, many PHYs are not MDIO-manageable.
> So this would have to be a device-specific operation, whether it's
> exposed through ethtool or sysfs.

Well, I have boxes that can act as the peer.  Nics are intel chipset 1G,
the bypass/link-drop feature is some separate hardware on the NIC, and
there is a hack of a driver & user-space tools that controls those bits.

But, if there is a way to fiddle normal NICs w/out the special bypass/disconnect
hardware, that would be welcome.  Even if it's different for each driver, and
not supported by all, ethtool can deal with that sort of thing.

Here's a link to the hardware we use...it comes with drivers, though we've hacked on
ours a bit.

http://silicom-usa.com/upload/Downloads/Product_102.pdf

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Stephen Hemminger @ 2011-08-31 21:40 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Ben Greear, Nicolas de Pesloüan, Jiri Pirko,
	Michał Mirosław, netdev, davem, eric.dumazet
In-Reply-To: <1314826605.3274.34.camel@bwh-desktop>

On Wed, 31 Aug 2011 22:36:45 +0100
Ben Hutchings <bhutchings@solarflare.com> wrote:

> On Wed, 2011-08-31 at 13:48 -0700, Ben Greear wrote:
> > On 08/31/2011 01:31 PM, Nicolas de Pesloüan wrote:
> > > Le 31/08/2011 22:12, Ben Hutchings a écrit :
> > >> On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
> > >>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
> > >>>
> > >>>>>>> Do you expect drivers using implementation different than just calling
> > >>>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
> > >>>>>> Yes, generally it can be used also for en/disable phy, for testing
> > >>>>>> purposes if hw and driver would support it.
> > >>>>>
> > >>>>> I'd like to see this working for GRE tunnel devices (for keepalive
> > >>>>> daemon to be able to indicate to routing daemons whether tunnel is
> > >>>>> really working) - implementation would be identical to dummy's case.
> > >>>>> Should I prepare a patch or can I leave it to you?
> > >>>>
> > >>>> Ok, I can include it to this patchset (I'm going to repost first patch
> > >>>> anyway)
> > >>>
> > >>> Can't we assume that the dummy's case is the default behavior and
> > >>> register this default
> > >>> ndo_change_carrier callback for every device ?
> > >>
> > >> You have got to be joking. No device driver that has real link
> > >> monitoring should use this implementation.
> > >
> > > Well, why not? Arguably, this is probably not the feature one would use every day, but...
> > >
> > > Testing a cluster reaction to a link down event would be easier if one doesn't need to unplug the cable for the test. I understand that one can turn off the
> > > switch port (physical or virtual), but echo 0 > /sys/class/net/eth0/carrier would be nice too.
> > 
> > There is special hardware out there that can do bypass, and often it also has a mode
> > that will programatically cut link by throwing some relays.  We use this for our
> > testing equipment...
> > 
> > If there is some way to twiddle standard-ish hardware to actually drop link, that
> > would be neat.  I'd think it should be an ethtool type of thing, however.
> 
> We need to be able to control this as part of our driver test suite (on
> the peer, not the device under test).  There are various MDIO bits that
> look like they should do this but unfortunately they don't have
> consistent effects.  Besides that, many PHYs are not MDIO-manageable.
> So this would have to be a device-specific operation, whether it's
> exposed through ethtool or sysfs.
> 
> > Actually dropping link, and letting that naturally propagate up the stack seems
> > more reasonable than lying about the status half way up the stack.
> 

For testing clustering, there are hooks in vmware and QEMU/KVM to allow
dropping carrier on the VM side.

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Ben Hutchings @ 2011-08-31 21:36 UTC (permalink / raw)
  To: Ben Greear
  Cc: Nicolas de Pesloüan, Jiri Pirko, Michał Mirosław,
	netdev, davem, eric.dumazet, shemminger
In-Reply-To: <4E5E9E16.6060502@candelatech.com>

On Wed, 2011-08-31 at 13:48 -0700, Ben Greear wrote:
> On 08/31/2011 01:31 PM, Nicolas de Pesloüan wrote:
> > Le 31/08/2011 22:12, Ben Hutchings a écrit :
> >> On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
> >>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
> >>>
> >>>>>>> Do you expect drivers using implementation different than just calling
> >>>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
> >>>>>> Yes, generally it can be used also for en/disable phy, for testing
> >>>>>> purposes if hw and driver would support it.
> >>>>>
> >>>>> I'd like to see this working for GRE tunnel devices (for keepalive
> >>>>> daemon to be able to indicate to routing daemons whether tunnel is
> >>>>> really working) - implementation would be identical to dummy's case.
> >>>>> Should I prepare a patch or can I leave it to you?
> >>>>
> >>>> Ok, I can include it to this patchset (I'm going to repost first patch
> >>>> anyway)
> >>>
> >>> Can't we assume that the dummy's case is the default behavior and
> >>> register this default
> >>> ndo_change_carrier callback for every device ?
> >>
> >> You have got to be joking. No device driver that has real link
> >> monitoring should use this implementation.
> >
> > Well, why not? Arguably, this is probably not the feature one would use every day, but...
> >
> > Testing a cluster reaction to a link down event would be easier if one doesn't need to unplug the cable for the test. I understand that one can turn off the
> > switch port (physical or virtual), but echo 0 > /sys/class/net/eth0/carrier would be nice too.
> 
> There is special hardware out there that can do bypass, and often it also has a mode
> that will programatically cut link by throwing some relays.  We use this for our
> testing equipment...
> 
> If there is some way to twiddle standard-ish hardware to actually drop link, that
> would be neat.  I'd think it should be an ethtool type of thing, however.

We need to be able to control this as part of our driver test suite (on
the peer, not the device under test).  There are various MDIO bits that
look like they should do this but unfortunately they don't have
consistent effects.  Besides that, many PHYs are not MDIO-manageable.
So this would have to be a device-specific operation, whether it's
exposed through ethtool or sysfs.

> Actually dropping link, and letting that naturally propagate up the stack seems
> more reasonable than lying about the status half way up the stack.

Right.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH 2/2] Add a netlink attribute INET_DIAG_SECCTX
From: Paul Moore @ 2011-08-31 21:18 UTC (permalink / raw)
  To: rongqing.li; +Cc: netdev, selinux, linux-security-module
In-Reply-To: <1314779777-12669-3-git-send-email-rongqing.li@windriver.com>

On Wednesday, August 31, 2011 04:36:17 PM rongqing.li@windriver.com wrote:
> From: Roy.Li <rongqing.li@windriver.com>
> 
> Add a new netlink attribute INET_DIAG_SECCTX to dump the security
> context of TCP sockets.

You'll have to forgive me, I'm not familiar with the netlink code used by 
netstat and friends, but is there anyway to report back the security context 
of UDP sockets?  Or does the code below handle that already?

In general, AF_INET and AF_INET6 sockets, regardless of any upper level 
protocols, have security contexts associated with them and it would be nice to 
see them in netstat.

> diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
> index 389a2e6..1faf752 100644
> --- a/net/ipv4/inet_diag.c
> +++ b/net/ipv4/inet_diag.c
> @@ -34,6 +34,8 @@
> 
>  #include <linux/inet_diag.h>
> 
> +#define MAX_SECCTX_LEN 128

I'll echo Stephen's concerns that this is too small.  A MCS/MLS system with a 
moderate number of categories could bump into this limit without too much 
difficulty.

>  struct inet_diag_entry {
> @@ -108,6 +110,25 @@ static int inet_csk_diag_fill(struct sock *sk,
>  		       icsk->icsk_ca_ops->name);
>  	}
> 
> +	if (ext & (1 << (INET_DIAG_SECCTX - 1))) {
> +		u32 ctxlen = 0;
> +		void *secctx;
> +		int error;
> +
> +		error = security_sk_getsecctx(sk, &secctx, &ctxlen);
> +
> +		if (!error && ctxlen) {
> +			if (ctxlen < MAX_SECCTX_LEN) {
> +				strcpy(INET_DIAG_PUT(skb, INET_DIAG_SECCTX,
> +					ctxlen + 1), secctx);
> +			} else {
> +				strcpy(INET_DIAG_PUT(skb, INET_DIAG_SECCTX,
> +					2), "-");

Is the "-" string a special value already interpreted by the userspace tools?  
If not, you might consider using a string that would indicate an out-of-space 
condition occurred; at first glance I thought the "-" string indicated no 
context.

> +			}
> +			security_release_secctx(secctx, ctxlen);
> +		}
> +	}
> +
>  	r->idiag_family = sk->sk_family;
>  	r->idiag_state = sk->sk_state;
>  	r->idiag_timer = 0;
> @@ -246,7 +267,7 @@ static int sk_diag_fill(struct sock *sk, struct sk_buff
> *skb, static int inet_diag_get_exact(struct sk_buff *in_skb,
>  			       const struct nlmsghdr *nlh)
>  {
> -	int err;
> +	int err, len;
>  	struct sock *sk;
>  	struct inet_diag_req *req = NLMSG_DATA(nlh);
>  	struct sk_buff *rep;
> @@ -293,10 +314,17 @@ static int inet_diag_get_exact(struct sk_buff *in_skb,
> goto out;
> 
>  	err = -ENOMEM;
> -	rep = alloc_skb(NLMSG_SPACE((sizeof(struct inet_diag_msg) +
> -				     sizeof(struct inet_diag_meminfo) +
> -				     handler->idiag_info_size + 64)),
> -			GFP_KERNEL);
> +	len = sizeof(struct inet_diag_msg) + 64;
> +
> +	len += (req->idiag_ext & (1 << (INET_DIAG_MEMINFO - 1))) ?
> +		sizeof(struct inet_diag_meminfo) : 0;
> +	len += (req->idiag_ext & (1 << (INET_DIAG_INFO - 1))) ?
> +		handler->idiag_info_size : 0;
> +	len += (req->idiag_ext & (1 << (INET_DIAG_SECCTX - 1))) ?
> +		MAX_SECCTX_LEN : 0;
> +
> +	rep = alloc_skb(NLMSG_SPACE(len), GFP_KERNEL);

How much of a problem would it be if you just allocated an entire page (or 4k 
in the case of huge pages) and used that?  Is memory usage a concern here?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* GIT networking temporary freeze...
From: David Miller @ 2011-08-31 20:58 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netfilter-devel-u79uwXL29TY76Z2rM5mHXA


As you may or may not have noticed, master.kernel.org is down since
late yesterday with some filesystem problems.

It is being worked on actively, but meanwhile I'm not going to be
applying anything to my tree until it comes back up.

Hopefully it should be going again in the next day.

Just FYI...
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: BQL crap and wireless
From: Luis R. Rodriguez @ 2011-08-31 20:50 UTC (permalink / raw)
  To: Jim Gettys
  Cc: Andrew McGregor, Adrian Chadd, Tom Herbert, Dave Taht,
	linux-wireless, Matt Smith, Kevin Hayes, Derek Smithies,
	netdev-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <4E5E36EE.8080501-CC+yJ3UmIYqDUpFQwHEjaQ@public.gmane.org>

On Wed, Aug 31, 2011 at 6:28 AM, Jim Gettys <jg-CC+yJ3UmIYqDUpFQwHEjaQ@public.gmane.org> wrote:
> On 08/30/2011 05:47 PM, Andrew McGregor wrote:
>> On 31/08/2011, at 1:58 AM, Jim Gettys wrote:
>>
>>> On 08/29/2011 11:42 PM, Adrian Chadd wrote:
>>>> On 30 August 2011 11:34, Tom Herbert <therbert-hpIqsD4AKlfQT0dZR+AlfA@public.gmane.org> wrote:
>>>>
>>>> C(P) is going to be quite variable - a full frame retransmit of a 4ms
>>>> long aggregate frame is SUM(exponential backoff, grab the air,
>>>> preamble, header, 4ms, etc. for each pass.)
>>>>
>>> It's not clear to me that doing heroic measures to compute the cost is
>>> going to be worthwhile due to the rate at which the costs can change on
>>> wireless; just getting into the rough ballpark may be enough. But
>>> buffering algorithms and AQM algorithms are going to need an estimate of
>>> the *time* it will take to transmit data, more than # of bytes or packets.
>> That's not heroic measures; mac80211 needs all the code to calculate these times anyway, it's just a matter of collecting together some things we already know and calling the right function.
>>
>>
>
> Fine; if it's easy, accurate is better (presuming the costs get
> recalculated when circumstances change). We also will need the amount of
> data being transmitted; it is the rate of transmission (the rate at
> which the buffers are draining) that we'll likely need.
>
> Here's what I've gleaned from reading "RED in a different light",  Van
> Jacobson's Mitre talk and several conversations with Kathleen Nichols
> and Van: AQM algorithms that can handle variable bandwidth environments
> will need to be able to know the rate at which buffers empty.  It's the
> direction they are going with their experiments on a "RED light" algorithm.
>
> The fundamental insight as to why classic RED can't work in the wireless
> environment is that the instantaneous queue length has little actual
> information in it.

Can you elaborate? Mind you I know squat about AQM but am curious
about what *is* required here.

> Classic RED is tuned using the queue length as its
> basic parameter.  Their belief as to algorithms that will work is that
> the need to keep track of the running queue length *minimum over time*;
> you want to keep the buffers small on a longer term basis (so they both
> can absorb transients, which is their reason for existence, while
> keeping the latency typically low).

I think I get this.

Now, I do care about the above details of what is needed but -- it
still puzzles me that no algorithm on addressing this has been
developed that simply does trial and error to fixate on the
appropriate queue length. Do we know the outcome of using a "bad queue
length", is it the latency issues? If so then can we not use latency
as a reactive measure to help us fine-tune the desired queue length.
This would be PHY agnostic. This is what I was luring to earlier when
I had mentioned using the approach Minstrel takes to solve the rate
control problem. I've lately been fixating on these sort of algorithms
approaches as solutions to complex problems as inspired by Tim
Harford's elaboration on the God complex on a TED Talk:

http://www.youtube.com/watch?v=K5wCfYujRdE

> The additional major challenge we
> face that core routers do not is the highly variable traffic of mixed
> mice and elephants.  What actually works only time will tell.

Thank you for elaborating on this, it helps to further dissect the
issues. When mixing them together it makes the task harder to address.
As for 802.11 we'd just need to worry about management frames. Dave
and Andrew had also pointed out issues with multicast and the low
rates used for them in comparison to unicast transmissions.

> So in an environment in which the rate of transmission is highly
> variable,

Not only that, remember we have different possible topologies
depending on the type of 802.11 service used, IBSS, the typical AP-STA
environment, Mesh, P2P (AP-STA really), and now with 802.11ad we get
PBSS. What counts here is we may have variable rate of transmission to
different possible peers, as such the queue length will depend on all
of the expected transmissions.

> such as wireless, or even possibly modern broadband with
> PowerBoost, classic RED or similar algorithms that do not take the
> buffer drain rate cannot possibly hack it properly.

Understood, just curious if anyone has tried a Minstrel approach.

Lets try to document all of this here:

http://wireless.kernel.org/en/developers/bufferbloat

  Luis
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] bridge: mask forwarding of IEEE 802 local multicast groups
From: Stephen Hemminger @ 2011-08-31 20:49 UTC (permalink / raw)
  To: Nick Carter
  Cc: David Lamparter, eswierk, netdev, Michał Mirosław,
	davem
In-Reply-To: <CAEJpZP0riOx=7umYgr3fV4JF31+8Jv8m6DRTkyyuiDnioff4Vw@mail.gmail.com>

On Wed, 31 Aug 2011 21:41:26 +0100
Nick Carter <ncarter100@gmail.com> wrote:

> On 15 August 2011 19:25, Stephen Hemminger
> <shemminger@linux-foundation.org> wrote:
> > On Mon, 15 Aug 2011 17:27:12 +0100
> > Nick Carter <ncarter100@gmail.com> wrote:
> >
> >> On 28 July 2011 16:41, Stephen Hemminger
> >> <shemminger@linux-foundation.org> wrote:
> >> > On Wed, 27 Jul 2011 13:17:15 +0200
> >> > David Lamparter <equinox@diac24.net> wrote:
> >> >
> >> >> On Fri, Jul 15, 2011 at 06:33:45PM +0200, David Lamparter wrote:
> >> >> > On Fri, Jul 15, 2011 at 06:03:57PM +0200, David Lamparter wrote:
> >> >> > > On Fri, Jul 15, 2011 at 04:44:50PM +0100, Nick Carter wrote:
> >> >> > > > On 12 July 2011 12:36, David Lamparter <equinox@diac24.net> wrote:
> >> >> > > > > On Mon, Jul 11, 2011 at 08:27:55AM -0700, Stephen Hemminger wrote:
> >> >> > > > >> I am still undecided on this. Understand the need, but don't like idea
> >> >> > > > >> of bridge behaving in non-conforming manner. Will see if IEEE 802 committee
> >> >> > > > >> has any input.
> >> >> > > > >
> >> >> > > > > The patch doesn't make the bridge behave nonconformant. The default mask
> >> >> > > > > is 0, which just keeps the old behaviour.
> >> >> >
> >> >> > P.S.: I'd like to once more stress this. In my opinion the patch should
> >> >> > be merged because it provides desireable functionality at a small cost
> >> >> > (one test, one knob) and __does not change any default behaviour__.
> >> >>
> >> >> Stephen, anything new on this?
> >> >
> >> > No.
> >> > Don't like adding yet another hack user visible API which will have
> >> > to be maintained for too long. But on the other hand I don't have
> >> > a better solution at my finger tips. If better idea doesn't come
> >> > along, then we can go with yours.
> >> >
> >> I have not noticed any other proposals and this thread has been open
> >> for quite a while.  Have we waited long enough ? If so can this patch
> >> be taken ?
> >>
> >
> > I am testing an alternative. The problem with your proposal is that
> > it relies on the multicast address. It turns out there are people using
> > other addresses for the STP group address, so using that as a identifier
> > is incorrect.
> If the chosen STP group address is in the local multicast group range
> this patch will handle it.
> 
> David Lamparter has reviewed this patch and asked for it to be merged.
>  This patch has at least two real world uses.  Ed needs this patch to
> forward LLDP frames and I need this patch to forward 802.1X frames.
> 
> This patch has been out for review for 9 weeks and it still looks like
> the best solution.

I prefer the netfilter solution because it is more general. We already have
a firewall solution why shouldn't this case be part of it?

^ permalink raw reply

* Re: [PATCH 1/2] Define security_sk_getsecctx
From: Casey Schaufler @ 2011-08-31 20:49 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: rongqing.li, netdev, selinux, linux-security-module,
	Casey Schaufler
In-Reply-To: <1314816361.6850.51.camel@moss-pluto>

On 8/31/2011 11:46 AM, Stephen Smalley wrote:
> On Wed, 2011-08-31 at 08:43 -0700, Casey Schaufler wrote:
>> On 8/31/2011 1:36 AM, rongqing.li@windriver.com wrote:
>>> From: Roy.Li <rongqing.li@windriver.com>
>>>
>>> Define security_sk_getsecctx to return the security
>>> context of a sock.
>> So, what is the intended use of the information
>> coming from this hook? If I wanted to write the
>> Smack hook, which of the "contexts" would I want
>> to return? There are potentially three. If I know
>> what the caller is looking for, I can (hopefully)
>> select the correct information.
> The initial use case is for netstat -Z so that it can reliably show the
> security context of the socket rather than inferring it from the owning
> process, which can be inaccurate for security-aware applications.
>
> In your situation, when in != out, which would you rather see in netstat
> -Z output?  Alternatively, if you want them both, perhaps you could
> combine in and out into a single string that is returned, similar to
> what you proposed for handling multiple xattrs with inode_getsecctx()?
>

If we want to use secctx consistently within the kernel, and
I personally think that is a good idea, I would have to chose
the SMACK64IPIN (label checked on packet delivery) value. Putting
both the SMACK64IPOUT and SMACK64IPIN "contexts" into the secctx
would violate the architectural notion that a secctx is the
textual representation of a value used to make access control
decisions. It would mean that calling security_secctx_to_secid()
with the value returned by security_sk_getsecctx would be invalid.
Note that this is different from the mechanism I had suggested
for handling secctx in the multiple LSM case, as the composed
string would map to a single secid which would in turn map back
to that same composed string. If, on the other hand, what netstat -Z
is out to show is all of the LSM base information about the socket
a compound string might make sense, it just would not be a
secctx, it would be an informational string of some other flavor.

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Ben Greear @ 2011-08-31 20:48 UTC (permalink / raw)
  To: Nicolas de Pesloüan
  Cc: Ben Hutchings, Jiri Pirko, Michał Mirosław, netdev,
	davem, eric.dumazet, shemminger
In-Reply-To: <4E5E9A30.1040105@gmail.com>

On 08/31/2011 01:31 PM, Nicolas de Pesloüan wrote:
> Le 31/08/2011 22:12, Ben Hutchings a écrit :
>> On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
>>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
>>>
>>>>>>> Do you expect drivers using implementation different than just calling
>>>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>>>>>> Yes, generally it can be used also for en/disable phy, for testing
>>>>>> purposes if hw and driver would support it.
>>>>>
>>>>> I'd like to see this working for GRE tunnel devices (for keepalive
>>>>> daemon to be able to indicate to routing daemons whether tunnel is
>>>>> really working) - implementation would be identical to dummy's case.
>>>>> Should I prepare a patch or can I leave it to you?
>>>>
>>>> Ok, I can include it to this patchset (I'm going to repost first patch
>>>> anyway)
>>>
>>> Can't we assume that the dummy's case is the default behavior and
>>> register this default
>>> ndo_change_carrier callback for every device ?
>>
>> You have got to be joking. No device driver that has real link
>> monitoring should use this implementation.
>
> Well, why not? Arguably, this is probably not the feature one would use every day, but...
>
> Testing a cluster reaction to a link down event would be easier if one doesn't need to unplug the cable for the test. I understand that one can turn off the
> switch port (physical or virtual), but echo 0 > /sys/class/net/eth0/carrier would be nice too.

There is special hardware out there that can do bypass, and often it also has a mode
that will programatically cut link by throwing some relays.  We use this for our
testing equipment...

If there is some way to twiddle standard-ish hardware to actually drop link, that
would be neat.  I'd think it should be an ethtool type of thing, however.

Actually dropping link, and letting that naturally propagate up the stack seems
more reasonable than lying about the status half way up the stack.

Thanks,
Ben

-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] net: Initialize entire flowi struct
From: David Miller @ 2011-08-31 20:47 UTC (permalink / raw)
  To: ja; +Cc: david.ward, netdev
In-Reply-To: <alpine.LFD.2.00.1108312321400.1826@ja.ssi.bg>

From: Julian Anastasov <ja@ssi.bg>
Date: Wed, 31 Aug 2011 23:51:32 +0300 (EEST)

> 	Such change will cause problems for callers that
> use flowi4 in stack. Examples:
> 
> ip_route_me_harder
> icmp_route_lookup

Right.

^ permalink raw reply

* Re: [PATCH] net: Initialize entire flowi struct
From: Julian Anastasov @ 2011-08-31 20:51 UTC (permalink / raw)
  To: David Ward; +Cc: netdev
In-Reply-To: <1314806703-5275-1-git-send-email-david.ward@ll.mit.edu>


	Hello,

On Wed, 31 Aug 2011, David Ward wrote:

> The entire flowi struct needs to be initialized by afinfo->decode_session,
> because flow_hash_code operates over the entire struct and may otherwise
> return different hash values for what is intended to be the same key.

	Such change will cause problems for callers that
use flowi4 in stack. Examples:

ip_route_me_harder
icmp_route_lookup

	Not sure if adding size as parameter to flow_hash_code
is better approach. May be flow_cache_lookup needs to
determine size from family that can be used for flow_hash_code,
flow_key_compare and the memcpy(&fle->key, key, sizeof(*key))
after fle = kmem_cache_alloc(flow_cachep, GFP_ATOMIC). The
question is how to get size by family.

> Signed-off-by: David Ward <david.ward@ll.mit.edu>
> ---
>  net/ipv4/xfrm4_policy.c |    2 +-
>  net/ipv6/xfrm6_policy.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv4/xfrm4_policy.c b/net/ipv4/xfrm4_policy.c
> index fc5368a..afce24d 100644
> --- a/net/ipv4/xfrm4_policy.c
> +++ b/net/ipv4/xfrm4_policy.c
> @@ -114,7 +114,7 @@ _decode_session4(struct sk_buff *skb, struct flowi *fl, int reverse)
>  	u8 *xprth = skb_network_header(skb) + iph->ihl * 4;
>  	struct flowi4 *fl4 = &fl->u.ip4;
>  
> -	memset(fl4, 0, sizeof(struct flowi4));
> +	memset(fl, 0, sizeof(struct flowi));
>  	fl4->flowi4_mark = skb->mark;
>  
>  	if (!ip_is_fragment(iph)) {
> diff --git a/net/ipv6/xfrm6_policy.c b/net/ipv6/xfrm6_policy.c
> index d879f7e..9088d38 100644
> --- a/net/ipv6/xfrm6_policy.c
> +++ b/net/ipv6/xfrm6_policy.c
> @@ -129,7 +129,7 @@ _decode_session6(struct sk_buff *skb, struct flowi *fl, int reverse)
>  	const unsigned char *nh = skb_network_header(skb);
>  	u8 nexthdr = nh[IP6CB(skb)->nhoff];
>  
> -	memset(fl6, 0, sizeof(struct flowi6));
> +	memset(fl, 0, sizeof(struct flowi));
>  	fl6->flowi6_mark = skb->mark;
>  
>  	ipv6_addr_copy(&fl6->daddr, reverse ? &hdr->saddr : &hdr->daddr);
> -- 
> 1.7.4.1

Regards

--
Julian Anastasov <ja@ssi.bg>

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Ben Hutchings @ 2011-08-31 20:44 UTC (permalink / raw)
  To: Nicolas de Pesloüan
  Cc: Jiri Pirko, Michał Mirosław, netdev, davem,
	eric.dumazet, shemminger
In-Reply-To: <4E5E9A30.1040105@gmail.com>

On Wed, 2011-08-31 at 22:31 +0200, Nicolas de Pesloüan wrote:
> Le 31/08/2011 22:12, Ben Hutchings a écrit :
> > On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
> >> Le 31/08/2011 10:45, Jiri Pirko a écrit :
> >>
> >>>>>> Do you expect drivers using implementation different than just calling
> >>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
> >>>>> Yes, generally it can be used also for en/disable phy, for testing
> >>>>> purposes if hw and driver would support it.
> >>>>
> >>>> I'd like to see this working for GRE tunnel devices (for keepalive
> >>>> daemon to be able to indicate to routing daemons whether tunnel is
> >>>> really working) - implementation would be identical to dummy's case.
> >>>> Should I prepare a patch or can I leave it to you?
> >>>
> >>> Ok, I can include it to this patchset (I'm going to repost first patch
> >>> anyway)
> >>
> >> Can't we assume that the dummy's case is the default behavior and
> >> register this default
> >> ndo_change_carrier callback for every device ?
> >
> > You have got to be joking.  No device driver that has real link
> > monitoring should use this implementation.
> 
> Well, why not? Arguably, this is probably not the feature one would
> use every day, but...
> 
> Testing a cluster reaction to a link down event would be easier if one
> doesn't need to unplug the 
> cable for the test. I understand that one can turn off the switch port
> (physical or virtual), but 
> echo 0 > /sys/class/net/eth0/carrier would be nice too.
> 
> Of course, I assume that netif_carrier_on and netif_carrier_off are
> only called on real link status 
> change. So the value written into /sys/class/net/eth0/carrier would
> stay until the link revert it, 
> due to a double status change (up-down-up or down-up-down). But I may
> miss totally this point.

Think what happens if the user sets carrier on, when the link is
actually down.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] bridge: mask forwarding of IEEE 802 local multicast groups
From: Nick Carter @ 2011-08-31 20:41 UTC (permalink / raw)
  To: Stephen Hemminger, David Lamparter, eswierk
  Cc: netdev, Michał Mirosław, davem
In-Reply-To: <20110815112501.3a3c01ad@nehalam.ftrdhcpuser.net>

On 15 August 2011 19:25, Stephen Hemminger
<shemminger@linux-foundation.org> wrote:
> On Mon, 15 Aug 2011 17:27:12 +0100
> Nick Carter <ncarter100@gmail.com> wrote:
>
>> On 28 July 2011 16:41, Stephen Hemminger
>> <shemminger@linux-foundation.org> wrote:
>> > On Wed, 27 Jul 2011 13:17:15 +0200
>> > David Lamparter <equinox@diac24.net> wrote:
>> >
>> >> On Fri, Jul 15, 2011 at 06:33:45PM +0200, David Lamparter wrote:
>> >> > On Fri, Jul 15, 2011 at 06:03:57PM +0200, David Lamparter wrote:
>> >> > > On Fri, Jul 15, 2011 at 04:44:50PM +0100, Nick Carter wrote:
>> >> > > > On 12 July 2011 12:36, David Lamparter <equinox@diac24.net> wrote:
>> >> > > > > On Mon, Jul 11, 2011 at 08:27:55AM -0700, Stephen Hemminger wrote:
>> >> > > > >> I am still undecided on this. Understand the need, but don't like idea
>> >> > > > >> of bridge behaving in non-conforming manner. Will see if IEEE 802 committee
>> >> > > > >> has any input.
>> >> > > > >
>> >> > > > > The patch doesn't make the bridge behave nonconformant. The default mask
>> >> > > > > is 0, which just keeps the old behaviour.
>> >> >
>> >> > P.S.: I'd like to once more stress this. In my opinion the patch should
>> >> > be merged because it provides desireable functionality at a small cost
>> >> > (one test, one knob) and __does not change any default behaviour__.
>> >>
>> >> Stephen, anything new on this?
>> >
>> > No.
>> > Don't like adding yet another hack user visible API which will have
>> > to be maintained for too long. But on the other hand I don't have
>> > a better solution at my finger tips. If better idea doesn't come
>> > along, then we can go with yours.
>> >
>> I have not noticed any other proposals and this thread has been open
>> for quite a while.  Have we waited long enough ? If so can this patch
>> be taken ?
>>
>
> I am testing an alternative. The problem with your proposal is that
> it relies on the multicast address. It turns out there are people using
> other addresses for the STP group address, so using that as a identifier
> is incorrect.
If the chosen STP group address is in the local multicast group range
this patch will handle it.

David Lamparter has reviewed this patch and asked for it to be merged.
 This patch has at least two real world uses.  Ed needs this patch to
forward LLDP frames and I need this patch to forward 802.1X frames.

This patch has been out for review for 9 weeks and it still looks like
the best solution.

Could this patch be merged please ?

Thanks,
Nick

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Nicolas de Pesloüan @ 2011-08-31 20:31 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Jiri Pirko, Michał Mirosław, netdev, davem,
	eric.dumazet, shemminger
In-Reply-To: <1314821537.3274.24.camel@bwh-desktop>

Le 31/08/2011 22:12, Ben Hutchings a écrit :
> On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
>>
>>>>>> Do you expect drivers using implementation different than just calling
>>>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>>>>> Yes, generally it can be used also for en/disable phy, for testing
>>>>> purposes if hw and driver would support it.
>>>>
>>>> I'd like to see this working for GRE tunnel devices (for keepalive
>>>> daemon to be able to indicate to routing daemons whether tunnel is
>>>> really working) - implementation would be identical to dummy's case.
>>>> Should I prepare a patch or can I leave it to you?
>>>
>>> Ok, I can include it to this patchset (I'm going to repost first patch
>>> anyway)
>>
>> Can't we assume that the dummy's case is the default behavior and
>> register this default
>> ndo_change_carrier callback for every device ?
>
> You have got to be joking.  No device driver that has real link
> monitoring should use this implementation.

Well, why not? Arguably, this is probably not the feature one would use every day, but...

Testing a cluster reaction to a link down event would be easier if one doesn't need to unplug the 
cable for the test. I understand that one can turn off the switch port (physical or virtual), but 
echo 0 > /sys/class/net/eth0/carrier would be nice too.

Of course, I assume that netif_carrier_on and netif_carrier_off are only called on real link status 
change. So the value written into /sys/class/net/eth0/carrier would stay until the link revert it, 
due to a double status change (up-down-up or down-up-down). But I may miss totally this point.

	Nicolas.

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Jiri Pirko @ 2011-08-31 20:26 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: Nicolas de Pesloüan, Michał Mirosław, netdev,
	davem, eric.dumazet, shemminger
In-Reply-To: <1314821537.3274.24.camel@bwh-desktop>

Wed, Aug 31, 2011 at 10:12:17PM CEST, bhutchings@solarflare.com wrote:
>On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
>> Le 31/08/2011 10:45, Jiri Pirko a écrit :
>> 
>> >>>> Do you expect drivers using implementation different than just calling
>> >>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>> >>> Yes, generally it can be used also for en/disable phy, for testing
>> >>> purposes if hw and driver would support it.
>> >>
>> >> I'd like to see this working for GRE tunnel devices (for keepalive
>> >> daemon to be able to indicate to routing daemons whether tunnel is
>> >> really working) - implementation would be identical to dummy's case.
>> >> Should I prepare a patch or can I leave it to you?
>> >
>> > Ok, I can include it to this patchset (I'm going to repost first patch
>> > anyway)
>> 
>> Can't we assume that the dummy's case is the default behavior and
>> register this default 
>> ndo_change_carrier callback for every device ?
>
>You have got to be joking.  No device driver that has real link
>monitoring should use this implementation.
>
>[...]
>> If someone is not confident with this default callback registered for
>> all device, at least, we can 
>> put this code in a common place, so that a driver willing to use it
>> doesn't need to have its own 
>> version of it.
>
>I agree that this might be useful in some other software devices,
>though.

Maybe it would be better to just add priv_flag for this instead of
introducing netdev_op. Then do on/off is this flag is set. Not sure
though...

Jirka

>
>Ben.
>
>-- 
>Ben Hutchings, Staff Engineer, Solarflare
>Not speaking for my employer; that's the marketing department's job.
>They asked us to note that Solarflare product names are trademarked.
>

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Ben Hutchings @ 2011-08-31 20:12 UTC (permalink / raw)
  To: Nicolas de Pesloüan
  Cc: Jiri Pirko, Michał Mirosław, netdev, davem,
	eric.dumazet, shemminger
In-Reply-To: <4E5E939C.5000009@gmail.com>

On Wed, 2011-08-31 at 22:03 +0200, Nicolas de Pesloüan wrote:
> Le 31/08/2011 10:45, Jiri Pirko a écrit :
> 
> >>>> Do you expect drivers using implementation different than just calling
> >>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
> >>> Yes, generally it can be used also for en/disable phy, for testing
> >>> purposes if hw and driver would support it.
> >>
> >> I'd like to see this working for GRE tunnel devices (for keepalive
> >> daemon to be able to indicate to routing daemons whether tunnel is
> >> really working) - implementation would be identical to dummy's case.
> >> Should I prepare a patch or can I leave it to you?
> >
> > Ok, I can include it to this patchset (I'm going to repost first patch
> > anyway)
> 
> Can't we assume that the dummy's case is the default behavior and
> register this default 
> ndo_change_carrier callback for every device ?

You have got to be joking.  No device driver that has real link
monitoring should use this implementation.

[...]
> If someone is not confident with this default callback registered for
> all device, at least, we can 
> put this code in a common place, so that a driver willing to use it
> doesn't need to have its own 
> version of it.

I agree that this might be useful in some other software devices,
though.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [patch net-next-2.6 1/2] net: allow to change carrier via sysfs
From: Nicolas de Pesloüan @ 2011-08-31 20:03 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Michał Mirosław, netdev, davem, eric.dumazet,
	bhutchings, shemminger
In-Reply-To: <20110831084511.GD2010@minipsycho.brq.redhat.com>

Le 31/08/2011 10:45, Jiri Pirko a écrit :

>>>> Do you expect drivers using implementation different than just calling
>>>> netif_carrier_on/off? Or is it supposed to also e.g. power down PHYs?
>>> Yes, generally it can be used also for en/disable phy, for testing
>>> purposes if hw and driver would support it.
>>
>> I'd like to see this working for GRE tunnel devices (for keepalive
>> daemon to be able to indicate to routing daemons whether tunnel is
>> really working) - implementation would be identical to dummy's case.
>> Should I prepare a patch or can I leave it to you?
>
> Ok, I can include it to this patchset (I'm going to repost first patch
> anyway)

Can't we assume that the dummy's case is the default behavior and register this default 
ndo_change_carrier callback for every device ?

Device drivers willing to do something different can install a different callback if appropriate.

This would avoid duplicating the following code in most drivers, that don't need something special.

static int dummy_change_carrier(struct net_device *dev, bool new_carrier)
{
	if (new_carrier)
		netif_carrier_on(dev);
	else
		netif_carrier_off(dev);
	return 0;
}

If someone is not confident with this default callback registered for all device, at least, we can 
put this code in a common place, so that a driver willing to use it doesn't need to have its own 
version of it.

	Nicolas.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox