Netdev List


* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Paul Bolle @ 2012-07-02  9:32 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <20120702090602.GA8286@redhat.com>

On Mon, 2012-07-02 at 11:06 +0200, Stanislaw Gruszka wrote:
> On Sat, Jun 30, 2012 at 03:20:06PM +0200, Paul Bolle wrote:
> For some reason we are not able to authenticate, which is itself a
> problem. Maybe this is the wrong key offset problem, which would
> be fixed by Emmanuel's patch.

Which I can only try if we fix the small problem you mentioned in your
comment on Emmanuel's patch (reference
<20120702082653.GA2479@redhat.com>, not yet archived on lkml.org), can't
I?

> Regarding "Queue 2 stuck", there is another fix in iwlwifi that
> did not make it into iwlegacy, which perhaps could help. If not here
> then maybe on suspend.
> 
> commit 342bbf3fee2fa9a18147e74b2e3c4229a4564912
> Author: Johannes Berg <johannes.berg@intel.com>
> Date:   Sun Mar 4 08:50:46 2012 -0800
> 
>     iwlwifi: always monitor for stuck queues
> 
> I'm attaching iwlegacy version of it.

Thanks, I'll try it. The explanation of the iwlwifi commit makes a lot
of sense: it seems to match the events found in the logs of this laptop.
That's encouraging. Should I report whether the iwlegacy version works
or not?

> > 2) It's always "Queue 2" that's stuck. What does that queue do?
> 
> It's a TX queue, probably the one used for default traffic, i.e. for the
> Best Effort category (the others are Video, Voice and Background).

I see.

Any thoughts on my patch (i.e., the patch that is actually the subject of
this thread)?

Paul Bolle


* Re: [PATCH 1/5] ipv4: Delete routing cache.
From: Eric Dumazet @ 2012-07-02  9:35 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120701.050249.1959437975207758368.davem@davemloft.net>

On Sun, 2012-07-01 at 05:02 -0700, David Miller wrote:
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
>  include/net/route.h     |    5 +-
>  net/ipv4/fib_frontend.c |    5 -
>  net/ipv4/icmp.c         |    5 +-
>  net/ipv4/route.c        | 1053 +++--------------------------------------------
>  4 files changed, 52 insertions(+), 1016 deletions(-)

When testing this patch, I had to disable CONFIG_IP_ROUTE_VERBOSE to
avoid a compile error at the end of ip_rt_redirect()


* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Stanislaw Gruszka @ 2012-07-02  9:39 UTC (permalink / raw)
  To: Paul Bolle
  Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <1341221563.1911.141.camel-uMdlDhfIn7prKue/0VVhAg@public.gmane.org>

On Mon, Jul 02, 2012 at 11:32:43AM +0200, Paul Bolle wrote:
> On Mon, 2012-07-02 at 11:06 +0200, Stanislaw Gruszka wrote:
> > On Sat, Jun 30, 2012 at 03:20:06PM +0200, Paul Bolle wrote:
> > For some reason we are not able to authenticate, which is itself a
> > problem. Maybe this is the wrong key offset problem, which would
> > be fixed by Emmanuel's patch.
> 
> Which I can only try if we fix the small problem you mentioned in your
> comment on Emmanuel's patch (reference
> <20120702082653.GA2479@redhat.com>, not yet archived on lkml.org), can't
> I?
Oh, you were not CCed, and neither was lkml. The v2 patch is here:
http://marc.info/?l=linux-wireless&m=134121823510017&w=2

> > Regarding "Queue 2 stuck", there is another fix in iwlwifi that
> > did not make it into iwlegacy, which perhaps could help. If not here
> > then maybe on suspend.
> > 
> > commit 342bbf3fee2fa9a18147e74b2e3c4229a4564912
> > Author: Johannes Berg <johannes.berg@intel.com>
> > Date:   Sun Mar 4 08:50:46 2012 -0800
> > 
> >     iwlwifi: always monitor for stuck queues
> > 
> > I'm attaching iwlegacy version of it.
> 
> Thanks, I'll try it. The explanation of the iwlwifi commit makes a lot
> of sense: it seems to match the events found in the logs of this laptop.
> That's encouraging. Should I report whether the iwlegacy version works
> or not?
Actually I'll post it anyway. But if this patch and Emmanuel's do not
fix the problem, you should report it.

> > > 2) It's always "Queue 2" that's stuck. What does that queue do?
> > 
> > It's a TX queue, probably the one used for default traffic, i.e. for the
> > Best Effort category (the others are Video, Voice and Background).
> 
> I see.
> 
> Any thoughts on my patch (i.e., the patch that is actually the subject of
> this thread)?
It was applied to wireless-testing.

Stanislaw

* Re: [PATCH] iwlegacy: print how long queue was actually stuck
From: Paul Bolle @ 2012-07-02  9:53 UTC (permalink / raw)
  To: Stanislaw Gruszka; +Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <20120702093913.GA9045@redhat.com>

On Mon, 2012-07-02 at 11:39 +0200, Stanislaw Gruszka wrote:
> On Mon, Jul 02, 2012 at 11:32:43AM +0200, Paul Bolle wrote:
> Oh, you were not CCed, and neither was lkml. The v2 patch is here:
> http://marc.info/?l=linux-wireless&m=134121823510017&w=2

Thanks.

> > Should I report whether the iwlegacy version works
> > or not?
> Actually I'll post it anyway. But if this patch and Emmanuel's do not
> fix the problem, you should report it.

I'll try to report any oddities I notice.

> > Any thoughts on my patch (i.e., the patch that is actually the subject of
> > this thread)?
> It was applied to wireless-testing.

That's nice.

Paul Bolle


* Re: [PATCH 1/5] ipv4: Delete routing cache.
From: David Miller @ 2012-07-02 10:15 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341221712.5269.40.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 11:35:12 +0200

> When testing this patch, I had to disable CONFIG_IP_ROUTE_VERBOSE to
> avoid a compile error at the end of ip_rt_redirect()

I'll try to get that fixed the next time I respin, thanks for the
report Eric.


* RE: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: David Laight @ 2012-07-02 10:09 UTC (permalink / raw)
  To: Dan Carpenter, James E.J. Bottomley, Barak Witkowski
  Cc: Eddie Wai, Michael Chan, linux-scsi, netdev, David S. Miller
In-Reply-To: <20120630114935.GB22767@elgon.mountain>

> Subject: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
> 
> DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
> 12 bytes from the stack so it's a small information leak.
> 
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> ---
> This was just added to linux-next yesterday, but I'm not sure 
> which tree it came from.
> 
> diff --git a/drivers/scsi/bnx2i/bnx2i_init.c b/drivers/scsi/bnx2i/bnx2i_init.c
> index 7729a52..b17637a 100644
> --- a/drivers/scsi/bnx2i/bnx2i_init.c
> +++ b/drivers/scsi/bnx2i/bnx2i_init.c
> @@ -400,7 +400,7 @@ int bnx2i_get_stats(void *handle)
>  	if (!stats)
>  		return -ENOMEM;
>  
> -	memcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
> +	strlcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
>  	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);

Doesn't that leak the original contents of the last bytes of
stats->version instead?

	David


* RE: [PATCH net-next 09/10] net/mlx4_en: Manage flow steering rules with ethtool
From: David Laight @ 2012-07-02 10:19 UTC (permalink / raw)
  To: Joe Perches, Andreas Schwab
  Cc: Ben Hutchings, Or Gerlitz, davem, roland, yevgenyp, oren, netdev,
	Hadar Hen Zion
In-Reply-To: <1341172352.8675.5.camel@joe2Laptop>

 
> > Or write it as (!field || !(typeof(field))~field) which more closely
> > resembles what the macro name expresses.
> 
> Better still, or maybe:
> 
> 	field == 0 || field == (typeof field)~0

Which doesn't work when sizeof(field) > sizeof(int).
Needs another cast.

	field == 0 || field == (typeof field)~(typeof field)0

	David


* Re: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()
From: Xiaotian Feng @ 2012-07-02 10:30 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: netdev, lvs-devel, netfilter-devel, netfilter, coreteam,
	linux-kernel, Xiaotian Feng, Wensong Zhang, Simon Horman,
	Pablo Neira Ayuso, Patrick McHardy, David S. Miller
In-Reply-To: <alpine.LFD.2.00.1206291125270.1690@ja.ssi.bg>

On Fri, Jun 29, 2012 at 5:04 PM, Julian Anastasov <ja@ssi.bg> wrote:
>
>         Hello,
>
> On Fri, 29 Jun 2012, Xiaotian Feng wrote:
>
>> > On Thu, 28 Jun 2012, Xiaotian Feng wrote:
>> >
>> >> We met a kernel panic in 2.6.32.43 kernel:
>> >>
>> >> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
>> >> <snip>
>> >> [2680311.849009] general protection fault: 0000 [#1] SMP
>
>         What we see here is 120 seconds between 2680191 and
> 2680311. It can mean 2 things:
>
> - some state timeout, it depends on your forwarding method.
> What is it? NAT? DR?
>
> - 60 seconds for ip_vs_conn_expire retries
>
>> >> After code review, the only chance that kernel change connection flag without protection is
>> >> in ip_vs_ftp_init_conn().
>> >
>> >        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
>> > from ip_vs_bind_app() in ip_vs_conn_new() before
>> > ip_vs_conn_hash(). It should be another problem with
>> > the flags. How different is IPVS in 2.6.32.43 compared to
>> > recent kernels? If commit aea9d711 is present, I'm not
>> > aware of other similar problems.
>>
>> ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
>> traced to ip_vs_proc_conn().
>> I've checked the changes in upstream, but nothing helps since aea9d711
>> has been taken into 2.6.32.28 kernel.
>
>         OK, this fix should make it safe for master-backup
> sync and it should be applied but I suspect you are not
> using sync, right? And then this fix will not solve the oops.
>

We're using sync.

>         There are not many places that rehash conn:
>
> ip_vs_conn_fill_cport
>         - used for FTP
>
> ip_vs_check_template:
>         - do you have persistence configured?

No.

>
>         After you provide details for the used forwarding
> method, persistence and sync we should think how such races
> with rehashing can lead to double hlist_del. Maybe
> you can modify the debug message in ip_vs_conn_hash, so
> that we can see cp->flags and ntohs of cp->cport, cp->dport
> and cp->vport when oops happens again.

I just found that the 2.6.32.34 kernel differs from the upstream kernel: the
2.6.32 kernel doesn't have ip_vs_try_bind_dest(), but ip_vs_process_message()
might change conn flags without lock protection. This is fixed in
commit f73181c, with the following changes:

@@ -834,6 +843,7 @@ static void ip_vs_proc_conn(struct net *net, struct ip_vs_conn_param *param,
                kfree(param->pe_data);

                dest = cp->dest;
+               spin_lock(&cp->lock);
                if ((cp->flags ^ flags) & IP_VS_CONN_F_INACTIVE &&
                    !(flags & IP_VS_CONN_F_TEMPLATE) && dest) {
                        if (flags & IP_VS_CONN_F_INACTIVE) {
@@ -847,6 +857,7 @@ static void ip_vs_proc_conn(struct net *net, struct ip_vs_conn_param *param,
                flags &= IP_VS_CONN_F_BACKUP_UPD_MASK;
                flags |= cp->flags & ~IP_VS_CONN_F_BACKUP_UPD_MASK;
                cp->flags = flags;
+               spin_unlock(&cp->lock);
                if (!dest) {
                        dest = ip_vs_try_bind_dest(cp);
                        if (dest)

So I took this part into the 2.6.32 kernel. But I still think the patch I
posted is required for the upstream kernel. Even though there are not many
places that rehash conn, this is a potential race as cp->flags is not
protected.
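
To make the locking point concrete, here is a minimal user-space sketch of
the same read-modify-write on a flags word. It is only an illustration: the
struct, the flag values and the function name are hypothetical, and a pthread
mutex stands in for the kernel's spin_lock() on cp->lock.

```c
#include <pthread.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
#define DEMO_F_INACTIVE      0x01u
#define DEMO_F_HASHED        0x02u
/* Bits that a sync message is allowed to overwrite (assumed). */
#define DEMO_BACKUP_UPD_MASK 0x01u

struct demo_conn {
	pthread_mutex_t lock;	/* stands in for cp->lock */
	unsigned int flags;	/* stands in for cp->flags */
};

/*
 * Merge flags from a sync message into the connection: take only the
 * bits in DEMO_BACKUP_UPD_MASK from the message and keep every other
 * bit from the connection. The whole read-modify-write runs under the
 * lock, so a concurrent change to another bit (for example the hashed
 * bit) cannot be lost between the read and the write.
 */
static unsigned int demo_update_flags(struct demo_conn *cp, unsigned int flags)
{
	pthread_mutex_lock(&cp->lock);
	flags &= DEMO_BACKUP_UPD_MASK;
	flags |= cp->flags & ~DEMO_BACKUP_UPD_MASK;
	cp->flags = flags;
	pthread_mutex_unlock(&cp->lock);
	return flags;
}
```

Without the lock, another CPU could set or clear a bit between the read of
cp->flags and the final store, and that change would be overwritten.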

Thanks.

>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>


* Re: [PATCH 0/5] rtcache remove respin
From: Eric Dumazet @ 2012-07-02 10:44 UTC (permalink / raw)
  To: David Miller; +Cc: netdev
In-Reply-To: <20120701.050243.908285695895815999.davem@davemloft.net>

On Sun, 2012-07-01 at 05:02 -0700, David Miller wrote:
> It's been a while and there were of course a lot of merge hassles with
> the most recent set I posted, so I respun these patches tonight
> because I wanted to see the effects of the recent rpfilter hacks on an
> rtcache-less system.
> 
> On a SPARC T3-1:
> 
> 1) Output route lookup: ~2800 cycles
> 2) Input route lookups: ~3000 cycles (rpfilter=0)
>                         ~4300 cycles (rpfilter=1)
> 
> Another nice part is how small struct rtable is after this patch set:
> 
> struct rtable {
> 	struct dst_entry        dst;
> 	int                     rt_genid;
> 	unsigned int            rt_flags;
> 	__u16                   rt_type;
> 	__be32                  rt_dst;
> 	int                     rt_route_iif;
> 	int                     rt_iif;
> 	int                     rt_oif;
> 	__be32                  rt_gateway;
> 	u32                     rt_peer_genid;
> 	unsigned long           _peer;
> 	struct fib_info         *fi;
> };
> 
> which is about 208 bytes on sparc64.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>

Can be <= 192 actually

rcu_head not needed anymore in dst_entry

If we still want __refcnt being on cache line boundary, we might find a
better way to accomplish this.
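
One possible way to keep a counter on a cache line boundary without
hand-counted padding is an alignment attribute on the member. The sketch
below is only an assumption for illustration: the struct is a made-up
stand-in, not the real dst_entry layout, and the 64-byte line size is
assumed.

```c
#include <stddef.h>

#define DEMO_CACHE_LINE 64	/* assumed cache line size in bytes */

/* Hypothetical stand-in for a dst_entry-like structure. */
struct demo_dst {
	void *dev;
	unsigned long expires;
	/* The compiler inserts whatever padding is needed before refcnt. */
	int refcnt __attribute__((aligned(DEMO_CACHE_LINE)));
	int use;
};

/* Fail the build if a later layout change breaks the alignment. */
_Static_assert(offsetof(struct demo_dst, refcnt) % DEMO_CACHE_LINE == 0,
	       "refcnt must start on a cache line boundary");
```

The trade-off is that the struct's own alignment and size are rounded up to
the line size as well, so this does not by itself make the structure smaller.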


* Re: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: Dan Carpenter @ 2012-07-02 10:48 UTC (permalink / raw)
  To: David Laight
  Cc: James E.J. Bottomley, Barak Witkowski, Eddie Wai, Michael Chan,
	linux-scsi, netdev, David S. Miller
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F6F@saturn3.aculab.com>

On Mon, Jul 02, 2012 at 11:09:19AM +0100, David Laight wrote:
> > Subject: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
> > 
> > DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
> > 12 bytes from the stack so it's a small information leak.
> > 
> > Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> > ---
> > This was just added to linux-next yesterday, but I'm not sure 
> > which tree it came from.
> > 
> > diff --git a/drivers/scsi/bnx2i/bnx2i_init.c b/drivers/scsi/bnx2i/bnx2i_init.c
> > index 7729a52..b17637a 100644
> > --- a/drivers/scsi/bnx2i/bnx2i_init.c
> > +++ b/drivers/scsi/bnx2i/bnx2i_init.c
> > @@ -400,7 +400,7 @@ int bnx2i_get_stats(void *handle)
> >  	if (!stats)
> >  		return -ENOMEM;
> >  
> > -	memcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
> > +	strlcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
> >  	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);
> 
> Doesn't that leak the original contents of the last bytes of
> stats->version instead?

I'm pretty sure we set those to zero in bnx2x_handle_drv_info_req().
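
The question above can be checked with a small user-space sketch. It is an
illustration only: demo_bounded_copy() is a hypothetical stand-in that mimics
strlcpy() semantics, and the 12-byte buffer matches the size mentioned in the
patch description. A bounded string copy writes the string and its
terminator and nothing else, so the bytes after the terminator keep whatever
they held before.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with strlcpy()-like semantics. */
static size_t demo_bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size) {
		size_t n = len >= size ? size - 1 : len;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}

/*
 * Copy "2.7.2.2" into a 12-byte buffer and count the non-zero bytes
 * left after the terminator. With prezero == 0 the buffer starts out
 * filled with 0x5a to model stale data; with prezero != 0 it is
 * cleared first, as the reply above expects the caller to have done.
 */
static size_t demo_stale_tail_bytes(int prezero)
{
	char version[12];
	size_t i, stale = 0;

	memset(version, prezero ? 0 : 0x5a, sizeof(version));
	demo_bounded_copy(version, "2.7.2.2", sizeof(version));
	for (i = strlen(version) + 1; i < sizeof(version); i++)
		if (version[i] != 0)
			stale++;
	return stale;
}
```

So the strlcpy() version is only leak-free if the destination was zeroed
beforehand, which is what the reply above relies on.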

regards,
dan carpenter



* Re: [PATCH 0/5] rtcache remove respin
From: David Miller @ 2012-07-02 10:59 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev
In-Reply-To: <1341225841.5269.69.camel@edumazet-glaptop>

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Mon, 02 Jul 2012 12:44:01 +0200

> On Sun, 2012-07-01 at 05:02 -0700, David Miller wrote:
>> It's been a while and there were of course a lot of merge hassles with
>> the most recent set I posted, so I respun these patches tonight
>> because I wanted to see the effects of the recent rpfilter hacks on an
>> rtcache-less system.
>> 
>> On a SPARC T3-1:
>> 
>> 1) Output route lookup: ~2800 cycles
>> 2) Input route lookups: ~3000 cycles (rpfilter=0)
>>                         ~4300 cycles (rpfilter=1)
>> 
>> Another nice part is how small struct rtable is after this patch set:
>> 
>> struct rtable {
>> 	struct dst_entry        dst;
>> 	int                     rt_genid;
>> 	unsigned int            rt_flags;
>> 	__u16                   rt_type;
>> 	__be32                  rt_dst;
>> 	int                     rt_route_iif;
>> 	int                     rt_iif;
>> 	int                     rt_oif;
>> 	__be32                  rt_gateway;
>> 	u32                     rt_peer_genid;
>> 	unsigned long           _peer;
>> 	struct fib_info         *fi;
>> };
>> 
>> which is about 208 bytes on sparc64.
>> 
>> Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> Can be <= 192 actually
> 
> rcu_head not needed anymore in dst_entry
> 
> If we still want __refcnt being on cache line boundary, we might find a
> better way to accomplish this.

Once we can actually check something like this in, rt_dst is
guaranteed to be eliminated as well.

The dst neighbour pointer will also be gone.

So lots of shrinking still to go and yes we'll need to reposition
that __refcnt member carefully.


* Re: [RFC] [TCP 1/3] tcp: Add MSG_NEW_PACKET flag to indicate preferable packet boundaries
From: Andreas Gruenbacher @ 2012-07-02 11:00 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1340990157.21162.5.camel@edumazet-glaptop>

On Fri, 2012-06-29 at 19:15 +0200, Eric Dumazet wrote:
> On Fri, 2012-06-29 at 17:38 +0200, Andreas Gruenbacher wrote:
> 
> > The primary use case is fast Gigabit (10 or more) Ethernet connections
> > with jumbo frames and switches that support them.  There, frames will go
> > through unchanged and you can zero-copy receive all the time.
> > 
> > Not sure how well the approach scales to other kinds of connections; it
> > may work often enough to be worth it.  When things get distorted between
> > the sender and the receiver and tcp_recvbio() fails, the data can still
> > be copied out of the socket as before.
> 
> If you have a packet loss, receiver can and will coalesce frames.

That's alright as long as we'll get "back to normal" eventually; the
only effect will be that we'll copy data out of the socket receive
buffers for a while.  There will be extremely little packet loss on the
kinds of networks that we want to use this on.

Thanks,
Andreas


* Re: one pci_id missing in sky2 driver
From: Mirko Lindner @ 2012-07-02 11:15 UTC (permalink / raw)
  To: Xose Vazquez Perez; +Cc: netdev@vger.kernel.org, shemminger@vyatta.com
In-Reply-To: <4FEEDE8D.1000403@gmail.com>

On Sat, 2012-06-30 at 04:10 -0700, Xose Vazquez Perez wrote:
> Mirko Lindner wrote:
> 
>  > On Sunday, April 01, 2012 12:22:35 PM Stephen Hemminger wrote:
> 
>  >> I would like to use this opportunity to have the developers at Marvell, test
>  >> and submit this. They have the hardware (I don't) and often new chips
>  >> require other special tweaks.  Marvell expressed interest in taking over
>  >> maintaining the sky2 driver, this would be a good first step.
> 
> > Thanks Stephen. The ID belongs to our newest device. We'll include the code
> > into the driver and send a patch to the list as soon the driver has passed our
> > internal tests.
> 
> Mirko, any news on this ?

Sorry for the late reply. The patch will be ready by the end of this
week.

Mirko


* [PATCH v3] ieee802154: verify packet size before trying to allocate it
From: Sasha Levin @ 2012-07-02 11:29 UTC (permalink / raw)
  To: dbaryshkov-Re5JQEeQqe8AvxtiuMwx3w, slapin-9cOl001CZnBAfugRpC6u6w,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f, Sasha Levin

Currently when sending data over datagram, the send function will attempt to
allocate any size passed on from the userspace.

We should make sure that this size is checked and limited. We'll limit it
to the MTU of the device, which is checked later anyway.

Signed-off-by: Sasha Levin <levinsasha928-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 net/ieee802154/dgram.c |   12 ++++++------
 1 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/net/ieee802154/dgram.c b/net/ieee802154/dgram.c
index 6fbb2ad..1670561 100644
--- a/net/ieee802154/dgram.c
+++ b/net/ieee802154/dgram.c
@@ -230,6 +230,12 @@ static int dgram_sendmsg(struct kiocb *iocb, struct sock *sk,
 	mtu = dev->mtu;
 	pr_debug("name = %s, mtu = %u\n", dev->name, mtu);
 
+	if (size > mtu) {
+		pr_debug("size = %Zu, mtu = %u\n", size, mtu);
+		err = -EINVAL;
+		goto out_dev;
+	}
+
 	hlen = LL_RESERVED_SPACE(dev);
 	tlen = dev->needed_tailroom;
 	skb = sock_alloc_send_skb(sk, hlen + tlen + size,
@@ -258,12 +264,6 @@ static int dgram_sendmsg(struct kiocb *iocb, struct sock *sk,
 	if (err < 0)
 		goto out_skb;
 
-	if (size > mtu) {
-		pr_debug("size = %Zu, mtu = %u\n", size, mtu);
-		err = -EINVAL;
-		goto out_skb;
-	}
-
 	skb->dev = dev;
 	skb->sk  = sk;
 	skb->protocol = htons(ETH_P_IEEE802154);
-- 
1.7.8.6



* [PATCH] NFC: Don't warn when closing uninitialized socket
From: Sasha Levin @ 2012-07-02 11:31 UTC (permalink / raw)
  To: lauro.venancio-430g2QfJUUCGglJvpFV4uA,
	aloisio.almeida-430g2QfJUUCGglJvpFV4uA,
	sameo-VuQAYsv1563Yd54FQh9/CA, linville-2XuSBdqkA4R54TAoqtyWWQ
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Sasha Levin

It is possible to close a NFC socket without initializing it first,
there's no need to warn about it.

Signed-off-by: Sasha Levin <levinsasha928-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
---
 net/nfc/llcp/llcp.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/nfc/llcp/llcp.c b/net/nfc/llcp/llcp.c
index 5d503ee..cf84544 100644
--- a/net/nfc/llcp/llcp.c
+++ b/net/nfc/llcp/llcp.c
@@ -118,8 +118,6 @@ static void local_release(struct kref *ref)
 
 int nfc_llcp_local_put(struct nfc_llcp_local *local)
 {
-	WARN_ON(local == NULL);


* Re: [PATCH net-next 09/10] net/mlx4_en: Manage flow steering rules with ethtool
From: Andreas Schwab @ 2012-07-02 11:35 UTC (permalink / raw)
  To: David Laight
  Cc: Joe Perches, Ben Hutchings, Or Gerlitz, davem, roland, yevgenyp,
	oren, netdev, Hadar Hen Zion
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B6F70@saturn3.aculab.com>

"David Laight" <David.Laight@ACULAB.COM> writes:

>  
>> > Or write it as (!field || !(typeof(field))~field) which more closely
>> > resembles what the macro name expresses.
>> 
>> Better still, or maybe:
>> 
>> 	field == 0 || field == (typeof field)~0
>
> Which doesn't work when sizeof(field) > sizeof(int).
> Needs another cast.
>
> 	field == 0 || field == (typeof field)~(typeof field)0

You can avoid that by using (typeof field)-1.

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: Andreas Gruenbacher @ 2012-07-02 11:45 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1340982523.21162.1.camel@edumazet-glaptop>

On Fri, 2012-06-29 at 17:08 +0200, Eric Dumazet wrote:
> This looks like yet another zero copy, needing another couple of hundred
> of lines.

Kind of, yes.  We really want to make no copies at all though; the cpu
just passes buffers from one device to the other.

> Why doesn't the splice infrastructure fit your needs?

The pipe api that splice is based on saves a copy between the kernel and
user space, but it currently writes to files, going through the page
cache.  For that, the alignment of data in the network receive buffers
doesn't matter.

We want to go directly to the block layer instead.  This requires that
the network hardware receives the data into sector aligned buffers.
Hence the proposed MSG_NEW_PACKET flag.

With that, it might be possible to implement a pipe "sink" that goes to
a bio instead of writing to a file.  Going through the pipe
infrastructure doesn't actually help in this case though, it's just
overhead.

Thanks,
Andreas


* RE: [PATCH net-next 09/10] net/mlx4_en: Manage flow steering rules with ethtool
From: David Laight @ 2012-07-02 12:15 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: Joe Perches, Ben Hutchings, Or Gerlitz, davem, roland, yevgenyp,
	oren, netdev, Hadar Hen Zion
In-Reply-To: <m28vf2o0oa.fsf@igel.home>

 
> "David Laight" <David.Laight@ACULAB.COM> writes:
> 
> >  
> >> > Or write it as (!field || !(typeof(field))~field) which more closely
> >> > resembles what the macro name expresses.
> >> 
> >> Better still, or maybe:
> >> 
> >> 	field == 0 || field == (typeof field)~0
> >
> > Which doesn't work when sizeof(field) > sizeof(int).
> > Needs another cast.
> >
> > 	field == 0 || field == (typeof field)~(typeof field)0
> 
> You can avoid that by using (typeof field)-1.

Gah, I thought I knew the integral promotion rules!
-1 and ~0 are both 'integer' and get treated the same.

A quick test shows that gcc does sign extend when converting
32bit int to 64bit unsigned long long.
Which probably means that is required by the standard!
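
A small sketch of the three spellings, for reference. The macro and function
names are made up for illustration, and __typeof__ is the GCC extension. Both
~0 and -1 are plain int with the value -1 (on two's complement machines), and
converting -1 to an unsigned type wraps to that type's maximum value, so both
give all ones at any width. The spelling that would fail for wide fields is
~0U, which is an unsigned int and is zero-extended instead.

```c
#include <limits.h>

/* ~0 is an int equal to -1; the cast wraps it to all ones. */
#define DEMO_ALL_ONES_TILDE(field)   ((field) == (__typeof__(field))~0)
/* Same value, spelled as suggested in the thread. */
#define DEMO_ALL_ONES_MINUS1(field)  ((field) == (__typeof__(field))-1)
/* Counter-example: ~0U is unsigned, so widening it zero-extends. */
#define DEMO_ALL_ONES_TILDE_U(field) ((field) == (__typeof__(field))~0U)

static int demo_tilde(unsigned long long v)
{
	return DEMO_ALL_ONES_TILDE(v);
}

static int demo_minus1(unsigned long long v)
{
	return DEMO_ALL_ONES_MINUS1(v);
}

static int demo_tilde_u(unsigned long long v)
{
	return DEMO_ALL_ONES_TILDE_U(v);
}
```

With a 64-bit unsigned long long set to all ones, the first two helpers
report a match and the third does not, assuming unsigned int is narrower than
unsigned long long.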

	David


* Re: [PATCH v6] sctp: be more restrictive in transport selection on bundled sacks
From: Neil Horman @ 2012-07-02 12:25 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, vyasevich, linux-sctp
In-Reply-To: <20120701234424.GA23935@neilslaptop.think-freely.org>

On Sun, Jul 01, 2012 at 07:44:25PM -0400, Neil Horman wrote:
> On Sun, Jul 01, 2012 at 02:43:19PM -0700, David Miller wrote:
> > From: Neil Horman <nhorman@tuxdriver.com>
> > Date: Sun, 1 Jul 2012 08:47:50 -0400
> > 
> > > Perhaps we could modify the SubmittingPatches document to indicate that an
> > > Acked-by from a subsystem maintainer implicitly confers authority on the
> > > upstream receiver to request reasonable stylistic changes that don't affect the
> > > functionality of the patch in the interests of maintaining coding conventions.
> > 
> > Yes, that would make sense.
> > 
> 
> 
> I'll propose it in a few days.
> Neil
> 
How does this language sound to you?


diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index c379a2a..1eaaeb1 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -414,6 +414,16 @@ the part which affects that maintainer's code.  Judgement should be used here.
 When in doubt people should refer to the original discussion in the mailing
 list archives.
 
+Note that an Acked-by: from a subsystem maintainer on a given patch confers
+upon the tree maintainer integrating the patch the authority to carry those Acks
+forward through subsequent versions of a patch, as long as those versions do not
+significantly impact the functionality of the patch.  For example, say the isdn
+subsystem maintainer sends an Acked-by: on version 1 of a patch bound for the
+networking tree.  The networking maintainer then requests that some comments in
+the code be modified to comply with the CodingStyle document.  The networking
+tree maintainer may reapply the subsystem maintainer's Acked-by: to the new
+version as no significant changes were made to the patch functionality.
+
 If a person has had the opportunity to comment on a patch, but has not
 provided such comments, you may optionally add a "Cc:" tag to the patch.
 This is the only tag which might be added without an explicit action by the


* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: Eric Dumazet @ 2012-07-02 12:36 UTC (permalink / raw)
  To: Andreas Gruenbacher; +Cc: netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1341229532.29646.39.camel@gurkel.linbit>

On Mon, 2012-07-02 at 13:45 +0200, Andreas Gruenbacher wrote:
> On Fri, 2012-06-29 at 17:08 +0200, Eric Dumazet wrote:
> > This looks like yet another zero copy, needing another couple of hundred
> > of lines.
> 
> Kind of, yes.  We really want to make no copies at all though; the cpu
> just passes buffers from one device to the other.
> 
> > Why doesn't the splice infrastructure fit your needs?
> 
> The pipe api that splice is based on saves a copy between the kernel and
> user space, but it currently writes to files, going through the page
> cache.  For that, the alignment of data in the network receive buffers
> doesn't matter.
> 

No files or page cache are needed for splice() usage, for example from
socket to another socket.

It just works (check haproxy for an example), with 10Gb performance out
of the box.

The pipe is only a container for buffers, in case the data fetched from
producer cannot be fully sent to consumer. You don't want to lose this
data.

> We want to go directly to the block layer instead.  This requires that
> the network hardware receives the data into sector aligned buffers.
> Hence the proposed MSG_NEW_PACKET flag.
> 

This only is a hint something is wrong with the approach.

> With that, it might be possible to implement a pipe "sink" that goes to
> a bio instead of writing to a file.  Going through the pipe
> infrastructure doesn't actually help in this case though, it's just
> overhead.

There is no expensive overhead in splice() infrastructure, only some
small details that should be eventually solved instead of designing a
new zero copy mode.

You haven't actually tried splice() if you believe a regular file is
needed.

You only need proper splice() support (from pipe to bio), if not already
there.


* Re: [PATCH net-next 09/10] net/mlx4_en: Manage flow steering rules with ethtool
From: Or Gerlitz @ 2012-07-02 12:57 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem, roland, Yevgeny Petrilin, Oren Duer, netdev,
	Hadar Hen Zion, Amir Vadai
In-Reply-To: <1341158452.4852.107.camel@deadeye.wl.decadent.org.uk>

On 7/1/2012 7:00 PM, Ben Hutchings wrote:
[...]

Hi Ben,

Thanks for the detailed feedback, see below some responses


> +		l4_mask = &cmd->fs.m_u.tcp_ip4_spec;
> +		/* don't allow mask which isn't all 0 or 1 */
> +		if (not_all_zeros_or_all_ones(l4_mask->ip4src, __be32) ||
> +		    not_all_zeros_or_all_ones(l4_mask->ip4dst, __be32) ||
> +		    not_all_zeros_or_all_ones(l4_mask->psrc, __be16) ||
> +		    not_all_zeros_or_all_ones(l4_mask->pdst, __be16))
> +			return -EOPNOTSUPP;
>
> Again, here and in many further instances, the error code should be EINVAL.


OK, will fix to use EINVAL instead of EOPNOTSUPP here and elsewhere as needed.


> +
> +static void add_ip_rule(struct mlx4_en_priv *priv,
> +			struct ethtool_rxnfc *cmd,
> +			struct list_head *list_h)
> +{
> +	struct mlx4_spec_list *spec_l3;
> +
> +	spec_l3 = kzalloc(sizeof *spec_l3, GFP_KERNEL);
> +	if (!spec_l3) {
> +		en_err(priv, "Fail to alloc ethtool rule.\n");
> +		return;
> +	}
>
> This should return an error code as well - logging is not a substitute.

OK, will do.

>
>
>> +	spec_l3->id = MLX4_NET_TRANS_RULE_ID_IPV4;
>> +	spec_l3->ipv4.src_ip = cmd->fs.h_u.usr_ip4_spec.ip4src;
>> +	if (spec_l3->ipv4.src_ip)
>> +		spec_l3->ipv4.src_ip_msk = EN_ETHTOOL_WORD_MASK;
>> +	spec_l3->ipv4.dst_ip = cmd->fs.h_u.usr_ip4_spec.ip4dst;
>> +	if (spec_l3->ipv4.dst_ip)
>> +		spec_l3->ipv4.dst_ip_msk = EN_ETHTOOL_WORD_MASK;
>
> The conditions should be using the mask (cmd->fs.m_u) not the value.

I'd like to clarify things here a bit. The way the code is written is to:

1st, make sure that we can deal with the mask provided by the user in the
ethtool command, i.e. that it's all zeroes or all ones (leaving aside for a
minute the other discussion on how best to implement that check). This check
is done in mlx4_en_validate_flow.

2nd, initialize the mlx4 flow specs, which start out all empty; see the
kzalloc calls for spec_l2/l3/l4.

3rd, import the non-zero values (not masks) from the ethtool command into
the mlx4 flow specs, with a FULL mask.


Under this logic, we can use the values and not the masks, can't we?
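
The two approaches can be compared in a small user-space sketch. The struct and function names below are hypothetical simplifications, and FULL_MASK32 merely stands in for EN_ETHTOOL_WORD_MASK. One case where the variants appear to differ is a request for an exact match on the value 0 (full mask, zero value): keyed on the value, the field is left unmatched; keyed on the mask, it is matched.

```c
#include <stdint.h>
#include <string.h>

#define FULL_MASK32 UINT32_MAX	/* stand-in for EN_ETHTOOL_WORD_MASK */

/* Hypothetical, simplified stand-in for the IPv4 part of an mlx4 flow spec. */
struct ipv4_spec {
	uint32_t src_ip, src_ip_msk;
	uint32_t dst_ip, dst_ip_msk;
};

/* Variant keyed on the value, mirroring the quoted patch. */
static void fill_from_value(struct ipv4_spec *s, uint32_t src, uint32_t dst)
{
	memset(s, 0, sizeof(*s));	/* step 2: start from an empty spec */
	s->src_ip = src;
	if (src)
		s->src_ip_msk = FULL_MASK32;
	s->dst_ip = dst;
	if (dst)
		s->dst_ip_msk = FULL_MASK32;
}

/* Variant keyed on the (already validated) mask, as the review suggests. */
static void fill_from_mask(struct ipv4_spec *s,
			   uint32_t src, uint32_t src_msk,
			   uint32_t dst, uint32_t dst_msk)
{
	memset(s, 0, sizeof(*s));
	if (src_msk) {
		s->src_ip = src;
		s->src_ip_msk = FULL_MASK32;
	}
	if (dst_msk) {
		s->dst_ip = dst;
		s->dst_ip_msk = FULL_MASK32;
	}
}
```

The mask-keyed variant also ignores a non-zero value whose mask is zero, which the value-keyed variant would turn into an exact match.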



>
>
>> +static void add_tcp_udp_rule(struct mlx4_en_priv *priv,
>> +			     struct ethtool_rxnfc *cmd,
>> +			     struct list_head *list_h, int proto)
>> +{
>> +	struct mlx4_spec_list *spec_l3;
>> +	struct mlx4_spec_list *spec_l4;
>> +
>> +	spec_l3 = kzalloc(sizeof *spec_l3, GFP_KERNEL);
>> +	spec_l4 = kzalloc(sizeof *spec_l4, GFP_KERNEL);
>> +	if (!spec_l4 || !spec_l3) {
>> +		en_err(priv, "Fail to alloc ethtool rule.\n");
>
> If one of them was successfully allocated, it will now be leaked.

THANKS, will fix.
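
A minimal sketch of one way to avoid the leak, written for user space with calloc()/free() standing in for kzalloc()/kfree(). The struct and function names are hypothetical. It relies on free(NULL) being a no-op, which also holds for kfree(NULL).

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mlx4_spec_list. */
struct spec {
	int id;
};

/* Allocate both specs or neither; returns 0 on success, -ENOMEM on failure. */
static int alloc_l3_l4(struct spec **l3_out, struct spec **l4_out)
{
	struct spec *l3 = calloc(1, sizeof(*l3));
	struct spec *l4 = calloc(1, sizeof(*l4));

	if (!l3 || !l4) {
		/* Free whichever allocation succeeded; free(NULL) does nothing. */
		free(l3);
		free(l4);
		return -ENOMEM;
	}

	*l3_out = l3;
	*l4_out = l4;
	return 0;
}
```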

>> +static int mlx4_en_ethtool_to_net_trans_rule(struct net_device *dev,
>> +					     struct ethtool_rxnfc *cmd,
>> +					     struct list_head *rule_list_h)
>> +{
>> +	int err;
>> +	u64 mac;
>> +	__be64 be_mac;
>> +	struct ethhdr *eth_spec;
>> +	struct mlx4_en_priv *priv = netdev_priv(dev);
>> +	struct mlx4_spec_list *spec_l2;
>> +	__be64 mac_msk = cpu_to_be64(EN_ETHTOOL_MAC_MASK << 16);
>> +
>> +	err = mlx4_en_validate_flow(dev, cmd);
>> +	if (err)
>> +		return err;
>> +
>> +	spec_l2 = kzalloc(sizeof *spec_l2, GFP_KERNEL);
>> +	if (!spec_l2)
>> +		return -ENOMEM;
>> +
>> +	mac = priv->mac & EN_ETHTOOL_MAC_MASK;
>> +	be_mac = cpu_to_be64(mac << 16);
>> +
>> +	spec_l2->id = MLX4_NET_TRANS_RULE_ID_ETH;
>> +	memcpy(spec_l2->eth.dst_mac_msk, &mac_msk, ETH_ALEN);
>> +	if ((cmd->fs.flow_type & ~FLOW_EXT) != ETHER_FLOW)
>> +		memcpy(spec_l2->eth.dst_mac, &be_mac, ETH_ALEN);
>
> Does the hardware require filtering by MAC address and not only by layer 3/4 addresses?

YES

>
>
>> +	if ((cmd->fs.flow_type & FLOW_EXT) && cmd->fs.m_ext.vlan_tci) {
>> +		spec_l2->eth.vlan_id = cmd->fs.h_ext.vlan_tci;
>> +		spec_l2->eth.vlan_id_msk = cpu_to_be16(0xfff);
>
> This doesn't match mlx4_en_validate_flow(); you are replacing a mask of
> 0xffff with 0xfff.

I need to check how exactly this should be done here. The VLAN ID is only 12
bits in size, so this is probably the source of the 0xfff vs 0xffff
difference.
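
For reference, the 16-bit 802.1Q TCI carries a 3-bit priority, a 1-bit DEI/CFI flag and the 12-bit VLAN ID, which is where 0xfff comes from. The sketch below works on host-order values (the ethtool field itself is big-endian) and uses hypothetical names. The last helper shows one possible way to make validation agree with a VID-only match; it is an assumption, not what the patch does.

```c
#include <stdint.h>

#define TCI_VID_MASK	0x0fffu	/* low 12 bits: VLAN ID */
#define TCI_PCP_SHIFT	13	/* top 3 bits: priority code point */

/* Extract the 12-bit VLAN ID from a host-order TCI. */
static uint16_t tci_vid(uint16_t tci)
{
	return tci & TCI_VID_MASK;
}

/* Extract the 3-bit priority from a host-order TCI. */
static uint8_t tci_pcp(uint16_t tci)
{
	return (uint8_t)(tci >> TCI_PCP_SHIFT);
}

/*
 * Hypothetical policy: accept a user mask only if it is empty or covers
 * exactly the VLAN ID bits, so 0xffff would be rejected up front.
 */
static int vlan_mask_supported(uint16_t mask)
{
	return mask == 0 || mask == TCI_VID_MASK;
}
```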


>
>
>> +	switch (cmd->fs.flow_type & ~FLOW_EXT) {
>> +	case ETHER_FLOW:
>> +		eth_spec = &cmd->fs.h_u.ether_spec;
>> +		memcpy(&spec_l2->eth.dst_mac, eth_spec->h_dest, ETH_ALEN);
>> +		spec_l2->eth.ether_type = eth_spec->h_proto;
>> +		if (eth_spec->h_proto)
>> +			spec_l2->eth.ether_type_enable = 1;
>> +		break;
>> +	case IP_USER_FLOW:
>> +		add_ip_rule(priv, cmd, rule_list_h);
>> +		break;
>> +	case TCP_V4_FLOW:
>> +		add_tcp_udp_rule(priv, cmd, rule_list_h, TCP_V4_FLOW);
>> +		break;
>> +	case UDP_V4_FLOW:
>> +		add_tcp_udp_rule(priv, cmd, rule_list_h, UDP_V4_FLOW);
>> +		break;
>
> All those functions need to be able to return errors.


OK
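
A sketch of how the switch could propagate errors once the helpers return int. Everything here is a hypothetical user-space stand-in: the enum, the stub helpers and the simulate_alloc_failure parameter exist only to make the example self-contained. The -EINVAL default for unknown flow types is likewise an assumption.

```c
#include <errno.h>

/* Hypothetical stand-ins for the ethtool flow types used in the switch. */
enum flow_type { FLOW_ETHER, FLOW_IP_USER, FLOW_TCP_V4, FLOW_UDP_V4 };

/* Stub helpers: report -ENOMEM when an allocation failure is simulated. */
static int add_ip_rule_stub(int simulate_alloc_failure)
{
	return simulate_alloc_failure ? -ENOMEM : 0;
}

static int add_tcp_udp_rule_stub(int simulate_alloc_failure)
{
	return simulate_alloc_failure ? -ENOMEM : 0;
}

/* Dispatch on flow type and hand any helper error back to the caller. */
static int build_rule(enum flow_type type, int simulate_alloc_failure)
{
	int err = 0;

	switch (type) {
	case FLOW_ETHER:
		break;		/* nothing to allocate in this sketch */
	case FLOW_IP_USER:
		err = add_ip_rule_stub(simulate_alloc_failure);
		break;
	case FLOW_TCP_V4:
	case FLOW_UDP_V4:
		err = add_tcp_udp_rule_stub(simulate_alloc_failure);
		break;
	default:
		err = -EINVAL;	/* unknown flow type */
		break;
	}
	return err;
}
```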

^ permalink raw reply

* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: Andreas Gruenbacher @ 2012-07-02 13:02 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1341232568.22621.8.camel@edumazet-glaptop>

On Mon, 2012-07-02 at 14:36 +0200, Eric Dumazet wrote:
> No files or page cache are needed for splice() usage, for example from
> socket to another socket.
> 
> It just works (check haproxy for an example), with 10Gb performance out
> of the box.

bio_vec's have some alignment requirements that must be met, and
anything that doesn't meet those requirements can't be passed to the
block layer (without copying it first).  Additional layers between the
network and block layers, like a pipe, won't make that problem go away.
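
To illustrate the kind of constraint meant here, a small user-space sketch that tests whether a buffer segment is sector aligned. The 512-byte sector size and the struct are assumptions for illustration; the real requirements depend on the block device and are not spelled out in this thread.

```c
#include <stddef.h>

#define SECTOR_SIZE 512u	/* assumed sector size for this sketch */

/* Hypothetical description of one data segment within a page. */
struct segment {
	size_t offset;	/* start of the data within its page */
	size_t len;	/* number of bytes */
};

/* Nonzero if both the start and the length are multiples of a sector. */
static int segment_is_sector_aligned(const struct segment *s)
{
	return s->len > 0 &&
	       s->offset % SECTOR_SIZE == 0 &&
	       s->len % SECTOR_SIZE == 0;
}
```

A payload that starts right after the protocol headers, at some arbitrary offset and with a length such as 1448 bytes, would fail this test, while a full 4096-byte page starting at offset 0 would pass.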

> The pipe is only a container for buffers, in case the data fetched from
> producer cannot be fully sent to consumer. You don't want to lose this
> data.

Stuff that isn't pulled out of a socket receive buffer will stay there,
it won't magically be lost.

> > We want to go directly to the block layer instead.  This requires that
> > the network hardware receives the data into sector aligned buffers.
> > Hence the proposed MSG_NEW_PACKET flag.
> 
> This only is a hint something is wrong with the approach.

It just means that I'm trying to do something that isn't currently
supported.

> You only need proper splice() support (from pipe to bio), if not already
> there.

It's not already there; it requires the alignment issue to be addressed
first.

Andreas

^ permalink raw reply

* Re: [PATCH net-next 06/10] {NET,IB}/mlx4: Add device managed flow steering firmware API
From: Or Gerlitz @ 2012-07-02 13:00 UTC (permalink / raw)
  To: David Miller; +Cc: roland, yevgenyp, oren, netdev, hadarh, Amir Vadai
In-Reply-To: <20120702.013626.13785263551119133.davem@davemloft.net>

On 7/2/2012 11:36 AM, David Miller wrote:
> So you must create a real generic interface that other chipsets in similar situations can hook into as well.

OK, understood.

We will remove the module param from the patch set, so that at this point
of submission there's no run-time setting for the flow steering hash
function.

Or.

^ permalink raw reply

* Re: [PATCH net-next 09/10] net/mlx4_en: Manage flow steering rules with ethtool
From: Or Gerlitz @ 2012-07-02 13:32 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: davem, roland, yevgenyp, oren, netdev, Hadar Hen Zion, Amir Vadai
In-Reply-To: <1341158452.4852.107.camel@deadeye.wl.decadent.org.uk>

On 7/1/2012 7:00 PM, Ben Hutchings wrote:
>> static int mlx4_en_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
>> >  			     u32 *rule_locs)
>> >  {
>> >  	struct mlx4_en_priv *priv = netdev_priv(dev);
>> >+	struct mlx4_en_dev *mdev = priv->mdev;
>> >  	int err = 0;
>> >+	int i = 0, priority = 0;
>> >+
>> >+	if (mdev->dev->caps.steering_mode != MLX4_STEERING_MODE_DEVICE_MANAGED)
>> >+		return -EOPNOTSUPP;
>> >  
>> >  	switch (cmd->cmd) {
>> >  	case ETHTOOL_GRXRINGS:
>> >  		cmd->data = priv->rx_ring_num;
>> >  		break;
>> >+	case ETHTOOL_GRXCLSRLCNT:
>> >+		cmd->rule_cnt = mlx4_en_get_num_flows(priv);
>> >+		break;
>> >+	case ETHTOOL_GRXCLSRULE:
>> >+		err = mlx4_en_get_flow(dev, cmd, cmd->fs.location);
>> >+		break;
>> >+	case ETHTOOL_GRXCLSRLALL:
>> >+		while (!err || err == -ENOENT) {
>> >+			err = mlx4_en_get_flow(dev, cmd, i);
>> >+			if (!err)
>> >+				((u32 *)(rule_locs))[priority++] = i;
> I don't see any range check against cmd->rule_cnt.

OK, will fix to make sure we don't cross cmd->rule_cnt
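
A user-space sketch of a bounded version of that loop. The in_use array is a hypothetical stand-in for "mlx4_en_get_flow() succeeded for slot i", and returning -EMSGSIZE when the caller's buffer is too small is an assumption about the desired error code. An empty rule table returns 0 with zero rules found.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy the locations of all in-use rule slots into rule_locs, never
 * writing more than rule_cnt entries. Returns 0 on success (including
 * the case of no rules) or -EMSGSIZE if the buffer is too small.
 */
static int collect_rule_locs(const int *in_use, size_t n_slots,
			     uint32_t *rule_locs, uint32_t rule_cnt,
			     uint32_t *n_found)
{
	uint32_t found = 0;
	size_t i;

	for (i = 0; i < n_slots; i++) {
		if (!in_use[i])
			continue;	/* empty slot, comparable to -ENOENT */
		if (found == rule_cnt)
			return -EMSGSIZE;
		rule_locs[found++] = (uint32_t)i;
	}

	*n_found = found;
	return 0;
}
```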

>
>
> Why are you casting rule_locs?

It doesn't seem to be needed; will remove that cast.

>
>
> Also, are the rules really prioritised by location, so that if rule 0
> and 1 match a packet then only rule 0 is applied?  If they are actually
> prioritised by the match type then you need to assign locations on the
> driver side that reflect the actual priorities.


Rules are prioritized by a scheme made of "domain" X location; see below and
in patch #6 the MLX4_DOMAIN_yyy entries. A higher domain value means lower
priority, so for instance rules placed by ethtool would have higher priority
over rules added by the L2 NIC or by a future RFS patch. Within a domain, the
location matters.

You can see that simple L2 rules (e.g. containing dest-mac, possibly vlan)
added by the driver use the NIC domain, whereas those added to serve ethtool
commands use the ETHTOOL one.

Within the ethtool domain, we maintain the priority as the location
specified by the user. Hope this explains things.

> +enum {
> +	MLX4_DOMAIN_UVERBS	= 0x1000,
> +	MLX4_DOMAIN_ETHTOOL     = 0x2000,
> +	MLX4_DOMAIN_RFS         = 0x3000,
> +	MLX4_DOMAIN_NIC    = 0x5000,
> +};
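
A sketch of how such a combined priority could be computed and compared, under the assumption that the domain and the location are simply OR-ed together and that locations stay below 0x1000. The rule_priority() helper is hypothetical; only the enum values are taken from the quoted patch.

```c
#include <stdint.h>

/* Domain values as quoted from the patch. */
enum {
	MLX4_DOMAIN_UVERBS	= 0x1000,
	MLX4_DOMAIN_ETHTOOL	= 0x2000,
	MLX4_DOMAIN_RFS		= 0x3000,
	MLX4_DOMAIN_NIC		= 0x5000,
};

/*
 * Hypothetical helper: combine domain and location into one number,
 * where a lower result means a higher priority. Assumes the location
 * fits in the low 12 bits.
 */
static uint32_t rule_priority(uint32_t domain, uint32_t location)
{
	return domain | location;
}
```

With this encoding, any ethtool rule sorts ahead of any NIC-domain rule, and within the ethtool domain location 0 sorts ahead of location 1.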

>> >+			i++;
>> >+		}
>> >+		if (priority)
>> >+			err = 0;
> [...]
>
> But if there are no rules defined, this is an error?  That's not right.
> I think you should unconditionally set err = 0 here.

OK, will do.

Or.

^ permalink raw reply

* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: saeed bishara @ 2012-07-02 13:39 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Eric Dumazet, netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1341229532.29646.39.camel@gurkel.linbit>

On Mon, Jul 2, 2012 at 2:45 PM, Andreas Gruenbacher <agruen@linbit.com> wrote:

>
> We want to go directly to the block layer instead.  This requires that
> the network hardware receives the data into sector aligned buffers.
> Hence the proposed MSG_NEW_PACKET flag.
Andreas,
 I didn't read your patches, but what you are looking for can't be
achieved using "normal" NICs; for that you need to use the RDMA protocol
with hardware that supports RDMA.

saeed

^ permalink raw reply
