Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] sctp: check dst validity after IPsec operations
From: Vlad Yasevich @ 2012-09-06 16:04 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: sri, linux-sctp, netdev
In-Reply-To: <1346953229-3825-1-git-send-email-nicolas.dichtel@6wind.com>

On 09/06/2012 01:40 PM, Nicolas Dichtel wrote:
> dst stored in struct sctp_transport needs to be recalculated when ipsec policy
> are updated. We use flow_cache_genid for that.
>
> For example, if a SCTP connection is established and then an IPsec policy is
> set, the old SCTP flow will not be updated and thus will not use the new
> IPsec policy.
>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>

why doesn't this need to be done for TCP?  What makes SCTP special in 
this case?

ip_queue_xmit does an __sk_dst_check() which is essentially what 
sctp_transport_dst_check() does.  That should determine if the currently 
cached route is valid or not.

Looks like sctp may need to change to using ip_route_output_ports() call
as ip_route_output_key may not do all that is necessary

-vlad

> ---
>   include/net/sctp/sctp.h    | 3 ++-
>   include/net/sctp/structs.h | 3 +++
>   net/sctp/ipv6.c            | 1 +
>   net/sctp/protocol.c        | 1 +
>   4 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 9c6414f..3267246 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -712,7 +712,8 @@ static inline void sctp_v4_map_v6(union sctp_addr *addr)
>    */
>   static inline struct dst_entry *sctp_transport_dst_check(struct sctp_transport *t)
>   {
> -	if (t->dst && !dst_check(t->dst, 0)) {
> +	if ((t->dst && !dst_check(t->dst, 0)) ||
> +	    (t->flow_cache_genid != atomic_read(&flow_cache_genid))) {
>   		dst_release(t->dst);
>   		t->dst = NULL;
>   	}
> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
> index 0fef00f..9dab882 100644
> --- a/include/net/sctp/structs.h
> +++ b/include/net/sctp/structs.h
> @@ -948,6 +948,9 @@ struct sctp_transport {
>
>   	/* 64-bit random number sent with heartbeat. */
>   	__u64 hb_nonce;
> +
> +	/* used to check validity of dst */
> +	__u32 flow_cache_genid;
>   };
>
>   struct sctp_transport *sctp_transport_new(struct net *, const union sctp_addr *,
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index ea14cb4..2ed7410 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -349,6 +349,7 @@ out:
>   		struct rt6_info *rt;
>   		rt = (struct rt6_info *)dst;
>   		t->dst = dst;
> +		t->flow_cache_genid = atomic_read(&flow_cache_genid);
>   		SCTP_DEBUG_PRINTK("rt6_dst:%pI6 rt6_src:%pI6\n",
>   			&rt->rt6i_dst.addr, &fl6->saddr);
>   	} else {
> diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
> index 2d51842..4795a1a 100644
> --- a/net/sctp/protocol.c
> +++ b/net/sctp/protocol.c
> @@ -512,6 +512,7 @@ out_unlock:
>   	rcu_read_unlock();
>   out:
>   	t->dst = dst;
> +	t->flow_cache_genid = atomic_read(&flow_cache_genid);
>   	if (dst)
>   		SCTP_DEBUG_PRINTK("rt_dst:%pI4, rt_src:%pI4\n",
>   				  &fl4->daddr, &fl4->saddr);
>

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Steven Rostedt @ 2012-09-06 16:00 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Josh Triplett, Mathieu Desnoyers, Pedro Alves, Tejun Heo,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q, mingo-X9Un+BFzKDI,
	ebiederm-aS9lmoZGLiVWk0Htik3J/w, aarcange-H+wXaHxf7aLQT0dZR+AlfA,
	ericvh-Re5JQEeQqe8AvxtiuMwx3w, netdev-u79uwXL29TY76Z2rM5mHXA,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w, axboe-tSWWG44O7X1aa/9Udqfwiw,
	agk-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
	neilb-l3A5Bk7waGM, ccaulfie-H+wXaHxf7aLQT0dZR+AlfA,
	teigland-H+wXaHxf7aLQT0dZR+AlfA,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA,
	bfields-uC3wQj2KruNg9hUCZPvPmw, fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	jesse-l0M0P4e3n4LQT0dZR+AlfA,
	venkat.x.venkatsubra-QHcLZuEGTsvQT0dZR+AlfA,
	ejt-H+wXaHxf7aLQT0dZR+AlfA, snitzer-H+wXaHxf7aLQT0dZR+AlfA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	dev-yBygre7rU0TnMu66kgdUjQ, rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	lw-BthXqXjhjHXQFUHtdCDX3A
In-Reply-To: <5048C615.4070204-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

On Thu, 2012-09-06 at 17:49 +0200, Sasha Levin wrote:
>  
> > Looks reasonable.  However, it would break (or rather, not break) on
> > code like this:
> > 
> > 	hash_for_each_entry(...) {
> > 		if (...) {
> > 			foo(node);
> > 			node = NULL;

ug, I didn't even notice this. Ignore my last email :-p

/me needs to wake-up a bit more.

> > 			break;
> > 		}
> > 	}
> > 
> > Hiding the double loop still seems error-prone.
> 
> I think that that code doesn't make sense. The users of hlist_for_each_* aren't
> supposed to be changing the loop cursor.

I totally agree. Modifying the 'node' pointer is just asking for issues.
Yes that is error prone, but not due to the double loop. It's due to the
modifying of the node pointer that is used internally by the loop
counter. Don't do that :-)


> 
> We have three options here:
> 
>  1. Stuff everything into a single for(). While not too difficult, it will make
> the readability of the code difficult as it will force us to abandon using
> hlist_for_each_* macros.
> 
>  2. Over-complicate everything, and check for 'node == NULL && obj &&
> obj->member.next == NULL' instead. That one will fail only if the user has
> specifically set the object as the last object in the list and the node as NULL.
> 
>  3. Use 2 loops which might not work properly if the user does something odd,
> with a big fat warning above them.
> 
> 
> To sum it up, I'd rather go with 3 and let anyone who does things he shouldn't
> be doing break.

I agree, the double loop itself is not error prone. If you are modifying
'node' you had better know what the hell you are doing.

Actually, it may be something that is legitimate. That is, if you want
to skip to the next bucket, just set node to NULL and do the break (as
Josh had done). This would break if the macro loop changed later on, but
hey, like I said, it's error prone ;-) If you really want to do that,
then hand coding the double loop would be a better bet. IOW, don't use
the macro loop.

-- Steve


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Sasha Levin @ 2012-09-06 15:49 UTC (permalink / raw)
  To: Josh Triplett
  Cc: snitzer-H+wXaHxf7aLQT0dZR+AlfA, neilb-l3A5Bk7waGM,
	fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA,
	bfields-uC3wQj2KruNg9hUCZPvPmw,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA,
	aarcange-H+wXaHxf7aLQT0dZR+AlfA, rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	venkat.x.venkatsubra-QHcLZuEGTsvQT0dZR+AlfA,
	ccaulfie-H+wXaHxf7aLQT0dZR+AlfA, mingo-X9Un+BFzKDI,
	dev-yBygre7rU0TnMu66kgdUjQ, ericvh-Re5JQEeQqe8AvxtiuMwx3w,
	Steven Rostedt, lw-BthXqXjhjHXQFUHtdCDX3A, Mathieu Desnoyers,
	axboe-tSWWG44O7X1aa/9Udqfwiw, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Pedro Alves, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ejt-H+wXaHxf7aLQT0dZR+AlfA, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	teigland-H+wXaHxf7aLQT0dZR+AlfA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <20120906145545.GA17332@leaf>

On 09/06/2012 04:55 PM, Josh Triplett wrote:
> On Thu, Sep 06, 2012 at 03:53:58PM +0200, Sasha Levin wrote:
>> On 09/04/2012 07:01 PM, Mathieu Desnoyers wrote:
>>>> #define do_for_each_ftrace_rec(pg, rec)                                          \
>>>>>         for (pg = ftrace_pages_start, rec = &pg->records[pg->index];             \
>>>>>              pg && rec == &pg->records[pg->index];                               \
>>>>>              pg = pg->next)                                                      \
>>>>>           for (rec = pg->records; rec < &pg->records[pg->index]; rec++)
>>> Maybe in some cases there might be ways to combine the two loops into
>>> one ? I'm not seeing exactly how to do it for this one, but it should
>>> not be impossible. If the inner loop condition can be moved to the outer
>>> loop, and if we use (blah ? loop1_conf : loop2_cond) to test for
>>> different conditions depending on the context, and do the same for the
>>> 3rd argument of the for() loop. The details elude me for now though, so
>>> maybe it's complete non-sense ;)
>>>
>>> It might not be that useful for do_for_each_ftrace_rec, but if we can do
>>> it for the hash table iterator, it might be worth it.
>>
>> So I think that for the hash iterator it might actually be simpler.
>>
>> My solution to making 'break' work in the iterator is:A code like that doesn
>>
>> 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
>> 		hlist_for_each_entry(obj, node, &name[bkt], member)
>>
>> We initialize our node loop cursor with NULL in the external loop, and the
>> external loop will have a new condition to loop while that cursor is NULL.
>>
>> My logic is that we can only 'break' when we are iterating over an object in the
>> internal loop. If we're iterating over an object in that loop then 'node != NULL'.
>>
>> This way, if we broke from within the internal loop, the external loop will see
>> node as not NULL, and so it will stop looping itself. On the other hand, if the
>> internal loop has actually ended, then node will be NULL, and the outer loop
>> will keep running.
>>
>> Is there anything I've missed?
> 
> Looks reasonable.  However, it would break (or rather, not break) on
> code like this:
> 
> 	hash_for_each_entry(...) {
> 		if (...) {
> 			foo(node);
> 			node = NULL;
> 			break;
> 		}
> 	}
> 
> Hiding the double loop still seems error-prone.

I think that that code doesn't make sense. The users of hlist_for_each_* aren't
supposed to be changing the loop cursor.

We have three options here:

 1. Stuff everything into a single for(). While not too difficult, it will make
the readability of the code difficult as it will force us to abandon using
hlist_for_each_* macros.

 2. Over-complicate everything, and check for 'node == NULL && obj &&
obj->member.next == NULL' instead. That one will fail only if the user has
specifically set the object as the last object in the list and the node as NULL.

 3. Use 2 loops which might not work properly if the user does something odd,
with a big fat warning above them.


To sum it up, I'd rather go with 3 and let anyone who does things he shouldn't
be doing break.


Thanks,
Sasha

^ permalink raw reply

* Re: [PATCH 04/10] net/macb: Fix a race in macb_start_xmit()
From: Havard Skinnemoen @ 2012-09-06 15:49 UTC (permalink / raw)
  To: David Miller
  Cc: nicolas.ferre, netdev, linux-arm-kernel, plagnioj, jamie,
	linux-kernel, patrice.vilchez
In-Reply-To: <20120905.173023.2235590074897156746.davem@davemloft.net>

On Wed, Sep 5, 2012 at 11:30 PM, David Miller <davem@davemloft.net> wrote:
> From: Nicolas Ferre <nicolas.ferre@atmel.com>
> Date: Wed, 5 Sep 2012 10:19:11 +0200
>
>> From: Havard Skinnemoen <havard@skinnemoen.net>
>>
>> Fix a race in macb_start_xmit() where we unconditionally set the TSTART bit.
>> If an underrun just happened (we do this with interrupts disabled, so it might
>> not have been handled yet), the controller starts transmitting from the first
>> entry in the ring, which is usually wrong.
>> Restart the controller after error handling.
>>
>> Signed-off-by: Havard Skinnemoen <havard@skinnemoen.net>
>> [nicolas.ferre@atmel.com: split patch in topics]
>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
>
> Accumulating special case code and checks into the hot path of TX packet
> processing is extremely unwise.
>
> Instead, when you handle the TX error conditions and reset the chip you
> should first ensure that there are no flows of control in the transmit
> function of your driver by using the appropriate locking et al. facilities.

IIRC, the hardware resets the ring pointers when an error happens, and
if we set TSTART right after that happens, the hardware will happily
transmit whatever is sitting in the beginning of the ring. This is
what I was trying to avoid.

The details are a bit hazy as it's been a while since I looked at
this, so it could be that simply letting it happen and using a bigger
hammer during reset processing might work just as well. Just want to
make sure y'all understand that we're talking about a race against
hardware, not against interrupt handlers, threads or anything that can
be solved by locking :)

Havard

^ permalink raw reply

* [PATCH] sctp: check dst validity after IPsec operations
From: Nicolas Dichtel @ 2012-09-06 17:40 UTC (permalink / raw)
  To: vyasevich, sri, linux-sctp; +Cc: netdev, Nicolas Dichtel

dst stored in struct sctp_transport needs to be recalculated when ipsec policy
are updated. We use flow_cache_genid for that.

For example, if a SCTP connection is established and then an IPsec policy is
set, the old SCTP flow will not be updated and thus will not use the new
IPsec policy.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 include/net/sctp/sctp.h    | 3 ++-
 include/net/sctp/structs.h | 3 +++
 net/sctp/ipv6.c            | 1 +
 net/sctp/protocol.c        | 1 +
 4 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
index 9c6414f..3267246 100644
--- a/include/net/sctp/sctp.h
+++ b/include/net/sctp/sctp.h
@@ -712,7 +712,8 @@ static inline void sctp_v4_map_v6(union sctp_addr *addr)
  */
 static inline struct dst_entry *sctp_transport_dst_check(struct sctp_transport *t)
 {
-	if (t->dst && !dst_check(t->dst, 0)) {
+	if ((t->dst && !dst_check(t->dst, 0)) ||
+	    (t->flow_cache_genid != atomic_read(&flow_cache_genid))) {
 		dst_release(t->dst);
 		t->dst = NULL;
 	}
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 0fef00f..9dab882 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -948,6 +948,9 @@ struct sctp_transport {
 
 	/* 64-bit random number sent with heartbeat. */
 	__u64 hb_nonce;
+
+	/* used to check validity of dst */
+	__u32 flow_cache_genid;
 };
 
 struct sctp_transport *sctp_transport_new(struct net *, const union sctp_addr *,
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index ea14cb4..2ed7410 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -349,6 +349,7 @@ out:
 		struct rt6_info *rt;
 		rt = (struct rt6_info *)dst;
 		t->dst = dst;
+		t->flow_cache_genid = atomic_read(&flow_cache_genid);
 		SCTP_DEBUG_PRINTK("rt6_dst:%pI6 rt6_src:%pI6\n",
 			&rt->rt6i_dst.addr, &fl6->saddr);
 	} else {
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 2d51842..4795a1a 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -512,6 +512,7 @@ out_unlock:
 	rcu_read_unlock();
 out:
 	t->dst = dst;
+	t->flow_cache_genid = atomic_read(&flow_cache_genid);
 	if (dst)
 		SCTP_DEBUG_PRINTK("rt_dst:%pI4, rt_src:%pI4\n",
 				  &fl4->daddr, &fl4->saddr);
-- 
1.7.12

^ permalink raw reply related

* WARNING: at net/ipv4/tcp.c:1303 tcp_cleanup_rbuf+0x54/0x120()
From: Dave Jones @ 2012-09-06 15:32 UTC (permalink / raw)
  To: netdev; +Cc: Fedora Kernel Team
In-Reply-To: <bug-854367-176318@bugzilla.redhat.com>

We've had about 20 reports of this bug in the last 2 days. 
No idea why it only just started showing up, but it may coincide with
users updating from 3.4->3.5 when we pushed an update.

Seems to be affecting multiple different network drivers.
(The original trace is a 'crap' driver from staging, but there are
 reports from e1000e and iwlwifi for eg)

 > https://bugzilla.redhat.com/show_bug.cgi?id=854367
 > 
 > backtrace:
 > :WARNING: at net/ipv4/tcp.c:1303 tcp_cleanup_rbuf+0x54/0x120()
 > :Hardware name: GA-MA785GM-US2H
 > :cleanup rbuf bug: copied A48FD80 seq A48FD80 rcvnxt A48FD80
 > :Modules linked in: fuse ebtable_nat ebtables ipt_MASQUERADE
 > nf_conntrack_netbios_ns nf_conntrack_broadcast ip6table_mangle ip6t_REJECT
 > nf_conntrack_ipv6 nf_defrag_ipv6 lockd ip6table_filter sunrpc ip6_tables
 > iptable_nat nf_nat rfcomm iptable_mangle nf_conntrack_ipv4 nf_defrag_ipv4 bnep
 > xt_conntrack nf_conntrack binfmt_misc btusb bluetooth snd_hda_codec_hdmi
 > snd_hda_codec_realtek snd_hda_intel joydev r8712u(C) snd_hda_codec rfkill
 > snd_hwdep shpchp sp5100_tco r8169 mii serio_raw snd_pcm snd_page_alloc
 > snd_timer snd soundcore i2c_piix4 edac_core edac_mce_amd ppdev parport_pc
 > parport microcode k10temp vhost_net tun macvtap macvlan kvm_amd kvm uinput
 > ata_generic pata_acpi firewire_ohci firewire_core crc_itu_t pata_atiixp wmi
 > usb_storage radeon i2c_algo_bit drm_kms_helper ttm drm i2c_core [last unloaded:
 > scsi_wait_scan]
 > :Pid: 2241, comm: Chrome_IOThread Tainted: G         C   3.5.3-1.fc17.x86_64 #1
 > :Call Trace:
 > : [<ffffffff810584bf>] warn_slowpath_common+0x7f/0xc0
 > : [<ffffffff810585b6>] warn_slowpath_fmt+0x46/0x50
 > : [<ffffffff815443c4>] tcp_cleanup_rbuf+0x54/0x120
 > : [<ffffffff8154564c>] tcp_recvmsg+0x76c/0xd30
 > : [<ffffffff81569c1b>] inet_recvmsg+0x6b/0x80
 > : [<ffffffff814e7612>] sock_aio_read.part.9+0x142/0x170
 > : [<ffffffff814e7665>] sock_aio_read+0x25/0x40
 > : [<ffffffff81187a06>] do_sync_read+0xe6/0x120
 > : [<ffffffff8127948a>] ? inode_has_perm.isra.31.constprop.61+0x2a/0x30
 > : [<ffffffff812764d2>] ? security_file_permission+0x92/0xb0
 > : [<ffffffff81187ea1>] ? rw_verify_area+0x61/0xf0
 > : [<ffffffff811883ed>] vfs_read+0x15d/0x180
 > : [<ffffffff8118845a>] sys_read+0x4a/0x90
 > : [<ffffffff81614ae9>] system_call_fastpath+0x16/0x1b

Could this be related to the "recvmsg bug" WARN's we've also been seeing recently ?

	Dave

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Steven Rostedt @ 2012-09-06 15:11 UTC (permalink / raw)
  To: Josh Triplett
  Cc: snitzer-H+wXaHxf7aLQT0dZR+AlfA, neilb-l3A5Bk7waGM,
	fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA,
	bfields-uC3wQj2KruNg9hUCZPvPmw,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA,
	aarcange-H+wXaHxf7aLQT0dZR+AlfA, rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	Tejun-/PVsmBQoxgPKo9QCiBeYKEEOCMrvLtNR,
	venkat.x.venkatsubra-QHcLZuEGTsvQT0dZR+AlfA,
	ccaulfie-H+wXaHxf7aLQT0dZR+AlfA, mingo-X9Un+BFzKDI,
	dev-yBygre7rU0TnMu66kgdUjQ, ericvh-Re5JQEeQqe8AvxtiuMwx3w,
	lw-BthXqXjhjHXQFUHtdCDX3A, Mathieu Desnoyers, Sasha Levin,
	axboe-tSWWG44O7X1aa/9Udqfwiw, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Pedro Alves, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ejt-H+wXaHxf7aLQT0dZR+AlfA, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA, Heo,
	teigland-H+wXaHxf7aLQT0dZR+AlfA,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <20120906145545.GA17332@leaf>

On Thu, 2012-09-06 at 07:55 -0700, Josh Triplett wrote:

> > My solution to making 'break' work in the iterator is:
> > 
> > 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
> > 		hlist_for_each_entry(obj, node, &name[bkt], member)
> > 
> 
> Looks reasonable.  However, it would break (or rather, not break) on
> code like this:
> 
> 	hash_for_each_entry(...) {
> 		if (...) {
> 			foo(node);
> 			node = NULL;
> 			break;
> 		}
> 	}
> 
> Hiding the double loop still seems error-prone.

We've already had this conversation ;-)  A guess a big comment is in
order:

/*
 * NOTE!  Although this is a double loop, 'break' still works because of
 *        the 'node == NULL' condition in the outer loop. On break of
 *        the inner loop, node will be !NULL, and the outer loop will
 *        exit as well.
 */

-- Steve

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Josh Triplett @ 2012-09-06 14:55 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Mathieu Desnoyers, Pedro Alves, Steven Rostedt, Tejun Heo,
	torvalds, akpm, linux-kernel, linux-mm, paul.gortmaker, davem,
	mingo, ebiederm, aarcange, ericvh, netdev, eric.dumazet, axboe,
	agk, dm-devel, neilb, ccaulfie, teigland, Trond.Myklebust,
	bfields, fweisbec, jesse, venkat.x.venkatsubra, ejt, snitzer,
	edumazet, linux-nfs, dev, rds-devel, lw
In-Reply-To: <5048AAF6.5090101@gmail.com>

On Thu, Sep 06, 2012 at 03:53:58PM +0200, Sasha Levin wrote:
> On 09/04/2012 07:01 PM, Mathieu Desnoyers wrote:
> >> #define do_for_each_ftrace_rec(pg, rec)                                          \
> >> >         for (pg = ftrace_pages_start, rec = &pg->records[pg->index];             \
> >> >              pg && rec == &pg->records[pg->index];                               \
> >> >              pg = pg->next)                                                      \
> >> >           for (rec = pg->records; rec < &pg->records[pg->index]; rec++)
> > Maybe in some cases there might be ways to combine the two loops into
> > one ? I'm not seeing exactly how to do it for this one, but it should
> > not be impossible. If the inner loop condition can be moved to the outer
> > loop, and if we use (blah ? loop1_conf : loop2_cond) to test for
> > different conditions depending on the context, and do the same for the
> > 3rd argument of the for() loop. The details elude me for now though, so
> > maybe it's complete non-sense ;)
> > 
> > It might not be that useful for do_for_each_ftrace_rec, but if we can do
> > it for the hash table iterator, it might be worth it.
> 
> So I think that for the hash iterator it might actually be simpler.
> 
> My solution to making 'break' work in the iterator is:
> 
> 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
> 		hlist_for_each_entry(obj, node, &name[bkt], member)
> 
> We initialize our node loop cursor with NULL in the external loop, and the
> external loop will have a new condition to loop while that cursor is NULL.
> 
> My logic is that we can only 'break' when we are iterating over an object in the
> internal loop. If we're iterating over an object in that loop then 'node != NULL'.
> 
> This way, if we broke from within the internal loop, the external loop will see
> node as not NULL, and so it will stop looping itself. On the other hand, if the
> internal loop has actually ended, then node will be NULL, and the outer loop
> will keep running.
> 
> Is there anything I've missed?

Looks reasonable.  However, it would break (or rather, not break) on
code like this:

	hash_for_each_entry(...) {
		if (...) {
			foo(node);
			node = NULL;
			break;
		}
	}

Hiding the double loop still seems error-prone.

- Josh Triplett

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 04/10] net/macb: Fix a race in macb_start_xmit()
From: Nicolas Ferre @ 2012-09-06 14:52 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, linux-arm-kernel, havard, plagnioj, jamie, linux-kernel,
	patrice.vilchez
In-Reply-To: <20120905.173023.2235590074897156746.davem@davemloft.net>

On 09/05/2012 11:30 PM, David Miller :
> From: Nicolas Ferre <nicolas.ferre@atmel.com>
> Date: Wed, 5 Sep 2012 10:19:11 +0200
> 
>> From: Havard Skinnemoen <havard@skinnemoen.net>
>>
>> Fix a race in macb_start_xmit() where we unconditionally set the TSTART bit.
>> If an underrun just happened (we do this with interrupts disabled, so it might
>> not have been handled yet), the controller starts transmitting from the first
>> entry in the ring, which is usually wrong.
>> Restart the controller after error handling.
>>
>> Signed-off-by: Havard Skinnemoen <havard@skinnemoen.net>
>> [nicolas.ferre@atmel.com: split patch in topics]
>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
> 
> Accumulating special case code and checks into the hot path of TX packet
> processing is extremely unwise.
> 
> Instead, when you handle the TX error conditions and reset the chip you
> should first ensure that there are no flows of control in the transmit
> function of your driver by using the appropriate locking et al. facilities.
> 
> For example, you can quiesce the transmit path by handling the chip error
> interrupt as follows:
> 
> 1) Disable chip interrupt generation.
> 
> 2) Schedule a workqueue so you can process the reset outside of hard
>    interrupt context.
> 
> 3) In the workqueue function, disable NAPI and perform a
>    netif_tx_disable() to guarentee there are no threads of
>    execution trying to queue up packets for TX into the driver.
> 
> 4) Perform your chip reset and whatever else is necessary.
> 
> 5) Re-enable NAPI and TX.
> 
> Then you don't need any special checks in your xmit method at all.

I see... I will rework the series and try to implement this as part of
the "[PATCH 06/10] net/macb: better manage tx errors"

So this patch will disappear in future v2 series and patch 06 will be
seriously modified. In fact I will also try to stack "cosmetic" patches
at the beginning of the series.

Thanks, best regards,
-- 
Nicolas Ferre

^ permalink raw reply

* Re: NULL pointer dereference in xt_register_target()
From: Pablo Neira Ayuso @ 2012-09-06 14:44 UTC (permalink / raw)
  To: Cong Wang; +Cc: Eric Dumazet, netfilter-devel, Linux Kernel Network Developers
In-Reply-To: <CAM_iQpVAZwKhZxLdk-BLMjnK3eX4DD_o9KvzuAoTttyxvmCsJA@mail.gmail.com>

On Thu, Sep 06, 2012 at 10:27:22PM +0800, Cong Wang wrote:
> On Thu, Sep 6, 2012 at 12:48 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > On Wed, Sep 05, 2012 at 05:55:06PM +0200, Eric Dumazet wrote:
> >> On Wed, 2012-09-05 at 23:43 +0800, Cong Wang wrote:
> >> > Hi, folks,
> >> >
> >> > The latest net-next tree can't boot due to a NULL ptr def
> >> > bug in the kernel, the full backtrack is:
> >> >
> >> > http://img1.douban.com/view/photo/photo/public/p1697139550.jpg
> >> >
> >> > the kernel .config file is:
> >> >
> >> > http://pastebin.com/9YTnkqKN
> >> >
> >> > I don't have time to look into the issue. If you need other info,
> >> > please let me know.
> >>
> >> It seems xt_nat_init() is called before xt_init(), so xt array is not
> >> yet setup.
> >
> > I have enqueued the following patch to fix this:
> >
> > http://1984.lsi.us.es/git/nf-next/commit/?id=00545bec9412d130c77f72a08d6c8b6ad21d4a1
> > e
> > commit 00545bec9412d130c77f72a08d6c8b6ad21d4a1e
> > Author: Pablo Neira Ayuso <pablo@netfilter.org>
> > Date:   Wed Sep 5 18:24:55 2012 +0200
> >
> >     netfilter: fix crash during boot if NAT has been compiled built-in
> >
> 
> Yeah, this indeed fixes the bug.
> 
> Please push it to net-next as soon as possible?

Will do, thanks for testing.

^ permalink raw reply

* RE: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: David Laight @ 2012-09-06 14:36 UTC (permalink / raw)
  To: Sasha Levin, Mathieu Desnoyers
  Cc: Pedro Alves, Steven Rostedt, Tejun Heo, torvalds, akpm,
	linux-kernel, linux-mm, paul.gortmaker, davem, mingo, ebiederm,
	aarcange, ericvh, netdev, josh, eric.dumazet, axboe, agk,
	dm-devel, neilb, ccaulfie, teigland, Trond.Myklebust, bfields,
	fweisbec, jesse, venkat.x.venkatsubra, ejt, snitzer, edumazet,
	linux-nfs, dev, rds-devel, lw
In-Reply-To: <5048AAF6.5090101@gmail.com>

> My solution to making 'break' work in the iterator is:
> 
> 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node ==
NULL; bkt++)
> 		hlist_for_each_entry(obj, node, &name[bkt], member)

I'd take a look at the generated code.
Might come out a bit better if the condition is changed to:
	node == NULL && bkt < HASH_SIZE(name)
you might find the compiler always optimises out the
node == NULL comparison.
(It might anyway, but switching the order gives it a better
chance.)

	David



--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH v2 09/10] net/macb: ethtool interface: add register dump feature
From: Ben Hutchings @ 2012-09-06 14:34 UTC (permalink / raw)
  To: Nicolas Ferre
  Cc: netdev, davem, linux-arm-kernel, havard, plagnioj,
	patrice.vilchez, linux-kernel
In-Reply-To: <1346941256-15676-1-git-send-email-nicolas.ferre@atmel.com>

On Thu, 2012-09-06 at 16:20 +0200, Nicolas Ferre wrote:
> Add macb_get_regs() ethtool function and its helper function:
> macb_get_regs_len().
> The version field is deduced from the IP revision which gives the
> "MACB or GEM" information. An additional version field is reserved.
> 
> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
> ---
> v2: - modify MACB_GREGS_NBR name and adapt to number of registers
>       actually displayed.
>     - change version format to reflect register layout and
>       add a version number to be future proof.
> 
>  drivers/net/ethernet/cadence/macb.c |   40 +++++++++++++++++++++++++++++++++++
>  drivers/net/ethernet/cadence/macb.h |    3 +++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index dc34ff1..cab42e7 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -1223,9 +1223,49 @@ static int macb_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
>  	return phy_ethtool_sset(phydev, cmd);
>  }
>  
> +static int macb_get_regs_len(struct net_device *netdev)
> +{
> +	return MACB_GREGS_NBR * sizeof(u32);
> +}
> +
> +static void macb_get_regs(struct net_device *dev, struct ethtool_regs *regs,
> +			  void *p)
> +{
> +	struct macb *bp = netdev_priv(dev);
> +	unsigned int tail, head;
> +	u32 *regs_buff = p;
> +
> +	regs->version = (macb_readl(bp, MID) & ((1 << MACB_REV_SIZE) - 1))
> +			| MACB_GREGS_VERSION;
> +
> +	tail = macb_tx_ring_wrap(bp->tx_tail);
> +	head = macb_tx_ring_wrap(bp->tx_head);
> +
> +	regs_buff[0]  = macb_readl(bp, NCR);
> +	regs_buff[1]  = macb_or_gem_readl(bp, NCFGR);
> +	regs_buff[2]  = macb_readl(bp, NSR);
> +	regs_buff[3]  = macb_readl(bp, TSR);
> +	regs_buff[4]  = macb_readl(bp, RBQP);
> +	regs_buff[5]  = macb_readl(bp, TBQP);
> +	regs_buff[6]  = macb_readl(bp, RSR);
> +	regs_buff[7]  = macb_readl(bp, IMR);
> +
> +	regs_buff[8]  = tail;
> +	regs_buff[9]  = head;
> +	regs_buff[10] = macb_tx_dma(bp, tail);
> +	regs_buff[11] = macb_tx_dma(bp, head);
> +
> +	if (macb_is_gem(bp)) {
> +		regs_buff[12] = gem_readl(bp, USRIO);
> +		regs_buff[13] = gem_readl(bp, DMACFG);
> +	}
> +}
> +
>  static const struct ethtool_ops macb_ethtool_ops = {
>  	.get_settings		= macb_get_settings,
>  	.set_settings		= macb_set_settings,
> +	.get_regs_len		= macb_get_regs_len,
> +	.get_regs		= macb_get_regs,
>  	.get_link		= ethtool_op_get_link,
>  };
>  
> diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
> index f69ceef..bcadc3c 100644
> --- a/drivers/net/ethernet/cadence/macb.h
> +++ b/drivers/net/ethernet/cadence/macb.h
> @@ -10,6 +10,9 @@
>  #ifndef _MACB_H
>  #define _MACB_H
>  
> +#define MACB_GREGS_NBR 16
> +#define MACB_GREGS_VERSION 1
> +
>  /* MACB register offsets */
>  #define MACB_NCR				0x0000
>  #define MACB_NCFGR				0x0004

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Mathieu Desnoyers @ 2012-09-06 14:33 UTC (permalink / raw)
  To: Sasha Levin
  Cc: snitzer-H+wXaHxf7aLQT0dZR+AlfA, neilb-l3A5Bk7waGM,
	fweisbec-Re5JQEeQqe8AvxtiuMwx3w,
	Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA,
	bfields-uC3wQj2KruNg9hUCZPvPmw,
	paul.gortmaker-CWA4WttNNZF54TAoqtyWWQ,
	dm-devel-H+wXaHxf7aLQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA,
	aarcange-H+wXaHxf7aLQT0dZR+AlfA, rds-devel-N0ozoZBvEnrZJqsBc5GL+g,
	eric.dumazet-Re5JQEeQqe8AvxtiuMwx3w,
	venkat.x.venkatsubra-QHcLZuEGTsvQT0dZR+AlfA,
	ccaulfie-H+wXaHxf7aLQT0dZR+AlfA, mingo-X9Un+BFzKDI,
	dev-yBygre7rU0TnMu66kgdUjQ, ericvh-Re5JQEeQqe8AvxtiuMwx3w,
	josh-iaAMLnmF4UmaiuxdJuQwMA, Steven Rostedt,
	lw-BthXqXjhjHXQFUHtdCDX3A, teigland-H+wXaHxf7aLQT0dZR+AlfA,
	axboe-tSWWG44O7X1aa/9Udqfwiw, linux-nfs-u79uwXL29TY76Z2rM5mHXA,
	edumazet-hpIqsD4AKlfQT0dZR+AlfA, linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	Pedro Alves, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	ejt-H+wXaHxf7aLQT0dZR+AlfA, ebiederm-aS9lmoZGLiVWk0Htik3J/w,
	netdev-u79uwXL29TY76Z2rM5mHXA, Tejun Heo,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	davem-fT/PcQaiUtIeIZ0/mPfg9Q
In-Reply-To: <5048AAF6.5090101-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>

* Sasha Levin (levinsasha928-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org) wrote:
> On 09/04/2012 07:01 PM, Mathieu Desnoyers wrote:
> >> #define do_for_each_ftrace_rec(pg, rec)                                          \
> >> >         for (pg = ftrace_pages_start, rec = &pg->records[pg->index];             \
> >> >              pg && rec == &pg->records[pg->index];                               \
> >> >              pg = pg->next)                                                      \
> >> >           for (rec = pg->records; rec < &pg->records[pg->index]; rec++)
> > Maybe in some cases there might be ways to combine the two loops into
> > one ? I'm not seeing exactly how to do it for this one, but it should
> > not be impossible. If the inner loop condition can be moved to the outer
> > loop, and if we use (blah ? loop1_conf : loop2_cond) to test for
> > different conditions depending on the context, and do the same for the
> > 3rd argument of the for() loop. The details elude me for now though, so
> > maybe it's complete non-sense ;)
> > 
> > It might not be that useful for do_for_each_ftrace_rec, but if we can do
> > it for the hash table iterator, it might be worth it.
> 
> So I think that for the hash iterator it might actually be simpler.
> 
> My solution to making 'break' work in the iterator is:
> 
> 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
> 		hlist_for_each_entry(obj, node, &name[bkt], member)
> 
> We initialize our node loop cursor with NULL in the external loop, and the
> external loop will have a new condition to loop while that cursor is NULL.
> 
> My logic is that we can only 'break' when we are iterating over an object in the
> internal loop. If we're iterating over an object in that loop then 'node != NULL'.
> 
> This way, if we broke from within the internal loop, the external loop will see
> node as not NULL, and so it will stop looping itself. On the other hand, if the
> internal loop has actually ended, then node will be NULL, and the outer loop
> will keep running.
> 
> Is there anything I've missed?

This sounds good. Unless I'm missing something too.

Thanks!

Mathieu

> 
> 
> Thanks,
> Sasha

-- 
Mathieu Desnoyers
Operating System Efficiency R&D Consultant
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply

* Re: NULL pointer dereference in xt_register_target()
From: Cong Wang @ 2012-09-06 14:27 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Eric Dumazet, netfilter-devel, Linux Kernel Network Developers
In-Reply-To: <20120905164831.GA21836@1984>

On Thu, Sep 6, 2012 at 12:48 AM, Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> On Wed, Sep 05, 2012 at 05:55:06PM +0200, Eric Dumazet wrote:
>> On Wed, 2012-09-05 at 23:43 +0800, Cong Wang wrote:
>> > Hi, folks,
>> >
>> > The latest net-next tree can't boot due to a NULL ptr def
>> > bug in the kernel, the full backtrack is:
>> >
>> > http://img1.douban.com/view/photo/photo/public/p1697139550.jpg
>> >
>> > the kernel .config file is:
>> >
>> > http://pastebin.com/9YTnkqKN
>> >
>> > I don't have time to look into the issue. If you need other info,
>> > please let me know.
>>
>> It seems xt_nat_init() is called before xt_init(), so xt array is not
>> yet setup.
>
> I have enqueued the following patch to fix this:
>
> http://1984.lsi.us.es/git/nf-next/commit/?id=00545bec9412d130c77f72a08d6c8b6ad21d4a1
> e
> commit 00545bec9412d130c77f72a08d6c8b6ad21d4a1e
> Author: Pablo Neira Ayuso <pablo@netfilter.org>
> Date:   Wed Sep 5 18:24:55 2012 +0200
>
>     netfilter: fix crash during boot if NAT has been compiled built-in
>

Yeah, this indeed fixes the bug.

Please push it to net-next as soon as possible?

Thanks!

^ permalink raw reply

* [PATCH v2 09/10] net/macb: ethtool interface: add register dump feature
From: Nicolas Ferre @ 2012-09-06 14:20 UTC (permalink / raw)
  To: netdev, bhutchings, davem
  Cc: linux-arm-kernel, havard, plagnioj, patrice.vilchez, linux-kernel,
	Nicolas Ferre
In-Reply-To: <1346888174.5325.55.camel@bwh-desktop.uk.solarflarecom.com>

Add macb_get_regs() ethtool function and its helper function:
macb_get_regs_len().
The version field is deduced from the IP revision which gives the
"MACB or GEM" information. An additional version field is reserved.

Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
---
v2: - modify MACB_GREGS_NBR name and adapt to number of registers
      actually displayed.
    - change version format to reflect register layout and
      add a version number to be future proof.

 drivers/net/ethernet/cadence/macb.c |   40 +++++++++++++++++++++++++++++++++++
 drivers/net/ethernet/cadence/macb.h |    3 +++
 2 files changed, 43 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index dc34ff1..cab42e7 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -1223,9 +1223,49 @@ static int macb_set_settings(struct net_device *dev, struct ethtool_cmd *cmd)
 	return phy_ethtool_sset(phydev, cmd);
 }
 
+static int macb_get_regs_len(struct net_device *netdev)
+{
+	return MACB_GREGS_NBR * sizeof(u32);
+}
+
+static void macb_get_regs(struct net_device *dev, struct ethtool_regs *regs,
+			  void *p)
+{
+	struct macb *bp = netdev_priv(dev);
+	unsigned int tail, head;
+	u32 *regs_buff = p;
+
+	regs->version = (macb_readl(bp, MID) & ((1 << MACB_REV_SIZE) - 1))
+			| MACB_GREGS_VERSION;
+
+	tail = macb_tx_ring_wrap(bp->tx_tail);
+	head = macb_tx_ring_wrap(bp->tx_head);
+
+	regs_buff[0]  = macb_readl(bp, NCR);
+	regs_buff[1]  = macb_or_gem_readl(bp, NCFGR);
+	regs_buff[2]  = macb_readl(bp, NSR);
+	regs_buff[3]  = macb_readl(bp, TSR);
+	regs_buff[4]  = macb_readl(bp, RBQP);
+	regs_buff[5]  = macb_readl(bp, TBQP);
+	regs_buff[6]  = macb_readl(bp, RSR);
+	regs_buff[7]  = macb_readl(bp, IMR);
+
+	regs_buff[8]  = tail;
+	regs_buff[9]  = head;
+	regs_buff[10] = macb_tx_dma(bp, tail);
+	regs_buff[11] = macb_tx_dma(bp, head);
+
+	if (macb_is_gem(bp)) {
+		regs_buff[12] = gem_readl(bp, USRIO);
+		regs_buff[13] = gem_readl(bp, DMACFG);
+	}
+}
+
 static const struct ethtool_ops macb_ethtool_ops = {
 	.get_settings		= macb_get_settings,
 	.set_settings		= macb_set_settings,
+	.get_regs_len		= macb_get_regs_len,
+	.get_regs		= macb_get_regs,
 	.get_link		= ethtool_op_get_link,
 };
 
diff --git a/drivers/net/ethernet/cadence/macb.h b/drivers/net/ethernet/cadence/macb.h
index f69ceef..bcadc3c 100644
--- a/drivers/net/ethernet/cadence/macb.h
+++ b/drivers/net/ethernet/cadence/macb.h
@@ -10,6 +10,9 @@
 #ifndef _MACB_H
 #define _MACB_H
 
+#define MACB_GREGS_NBR 16
+#define MACB_GREGS_VERSION 1
+
 /* MACB register offsets */
 #define MACB_NCR				0x0000
 #define MACB_NCFGR				0x0004
-- 
1.7.10

^ permalink raw reply related

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Pedro Alves @ 2012-09-06 14:19 UTC (permalink / raw)
  To: Sasha Levin
  Cc: Mathieu Desnoyers, Steven Rostedt, Tejun Heo, torvalds, akpm,
	linux-kernel, linux-mm, paul.gortmaker, davem, mingo, ebiederm,
	aarcange, ericvh, netdev, josh, eric.dumazet, axboe, agk,
	dm-devel, neilb, ccaulfie, teigland, Trond.Myklebust, bfields,
	fweisbec, jesse, venkat.x.venkatsubra, ejt, snitzer, edumazet,
	linux-nfs, dev, rds-devel, lw
In-Reply-To: <5048AAF6.5090101@gmail.com>

On 09/06/2012 02:53 PM, Sasha Levin wrote:

> So I think that for the hash iterator it might actually be simpler.
> 
> My solution to making 'break' work in the iterator is:
> 
> 	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
> 		hlist_for_each_entry(obj, node, &name[bkt], member)
> 
> We initialize our node loop cursor with NULL in the external loop, and the
> external loop will have a new condition to loop while that cursor is NULL.
> 
> My logic is that we can only 'break' when we are iterating over an object in the
> internal loop. If we're iterating over an object in that loop then 'node != NULL'.
> 
> This way, if we broke from within the internal loop, the external loop will see
> node as not NULL, and so it will stop looping itself. On the other hand, if the
> internal loop has actually ended, then node will be NULL, and the outer loop
> will keep running.
> 
> Is there anything I've missed?

Looks right to me, from a cursory look at hlist_for_each_entry.  That's exactly
what I meant with this most often being trivial when the inner loop's iterator
is a pointer that goes NULL at the end.

-- 
Pedro Alves

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 08/10] net/macb: macb_get_drvinfo: add GEM/MACB suffix to differentiate revision
From: Nicolas Ferre @ 2012-09-06 14:01 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: netdev, linux-arm-kernel, davem, havard, plagnioj, jamie,
	linux-kernel, patrice.vilchez
In-Reply-To: <1346887671.5325.47.camel@bwh-desktop.uk.solarflarecom.com>

On 09/06/2012 01:27 AM, Ben Hutchings :
> On Wed, 2012-09-05 at 11:00 +0200, Nicolas Ferre wrote:
>> Add an indication about which revision of the hardware we are running in
>> info->driver string.
>>
>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
>> ---
>>  drivers/net/ethernet/cadence/macb.c |    4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
>> index bd331fd..c7c39f1 100644
>> --- a/drivers/net/ethernet/cadence/macb.c
>> +++ b/drivers/net/ethernet/cadence/macb.c
>> @@ -1313,6 +1313,10 @@ static void macb_get_drvinfo(struct net_device *dev,
>>  	struct macb *bp = netdev_priv(dev);
>>  
>>  	strcpy(info->driver, bp->pdev->dev.driver->name);
>> +	if (macb_is_gem(bp))
>> +		strcat(info->driver, " GEM");
>> +	else
>> +		strcat(info->driver, " MACB");
>>  	strcpy(info->version, "$Revision: 1.14 $");
> 
> Related to hardware revisions (which don't belong here, as David said),
> I rather doubt this CVS ID is very useful as a driver version.
> 
> If the driver doesn't have a meaningful version (aside from the kernel
> version) then you can remove this function and let the ethtool core fill
> in the other two fields automatically.

Absolutely, I will do this.

Thanks for the tip.

Best regards,
-- 
Nicolas Ferre

^ permalink raw reply

* Re: [PATCH net-next] ipv6: fix handling of throw routes
From: Eric Dumazet @ 2012-09-06 13:58 UTC (permalink / raw)
  To: Nicolas Dichtel; +Cc: davem, netdev, markus.stenberg
In-Reply-To: <1346946815-3094-1-git-send-email-nicolas.dichtel@6wind.com>

On Thu, 2012-09-06 at 11:53 -0400, Nicolas Dichtel wrote:
> It's the same problem that previous fix about blackhole and prohibit routes.
> 
> When adding a throw route, it was handled like a classic route.
> Moreover, it was only possible to add this kind of routes by specifying
> an interface.
> 
> Before the patch:
>   $ ip route add throw 2001::2/128
>   RTNETLINK answers: No such device
>   $ ip route add throw 2001::2/128 dev eth0
>   $ ip -6 route | grep 2001::2
>   2001::2 dev eth0  metric 1024
> 
> After:
>   $ ip route add throw 2001::2/128
>   $ ip -6 route | grep 2001::2
>   throw 2001::2 dev lo  metric 1024  error -11
> 
> Reported-by: Markus Stenberg <markus.stenberg@iki.fi>
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> ---
>  net/ipv6/route.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)

Acked-by: Eric Dumazet <edumazet@google.com>

Thanks Nicolas

^ permalink raw reply

* Re: [PATCH] mac80211: use list_move instead of list_del/list_add
From: Wei Yongjun @ 2012-09-06 13:57 UTC (permalink / raw)
  To: johannes; +Cc: linville, yongjun_wei, linux-wireless, netdev

On 09/06/2012 05:56 PM, Johannes Berg wrote:

Hi Johannes,

> On Thu, 2012-09-06 at 13:20 +0800, Wei Yongjun wrote:
>> From: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
>>
>> Using list_move() instead of list_del() + list_add().
>>
>> spatch with a semantic match is used to found this problem.
>> (http://coccinelle.lip6.fr/)
>>
>> Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
> Applied. FWIW, I don't think it's really a "problem" rather than a
> simplification or something like that, but anyway.

That is right, I will change the patch description if I send some other
patchs like this cleanup.

Thanks,
Yongjun Wei

>
> johannes
>
>
>

^ permalink raw reply

* Re: CBQ(but probably u32 filter bug), kernel "freeze", at least from 3.2.0
From: Eric Dumazet @ 2012-09-06 13:56 UTC (permalink / raw)
  To: Denys Fedoryshchenko; +Cc: netdev
In-Reply-To: <3012b13d6556b629a84089b68bce9a6b@visp.net.lb>

On Thu, 2012-09-06 at 16:47 +0300, Denys Fedoryshchenko wrote:

> Dear Eric
> 
> Very sorry for delay, most of time in desert, without decent internet.
> I will try to test today or tomorrow.

No problem, I reproduced the bug on my dev machine, but its always
better to have bug reporter adding its own 'Tested-by:' tag ;)

Thanks

^ permalink raw reply

* Re: CBQ(but probably u32 filter bug), kernel "freeze", at least from 3.2.0
From: Denys Fedoryshchenko @ 2012-09-06 13:47 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1346920806.13121.180.camel@edumazet-glaptop>

On 2012-09-06 11:40, Eric Dumazet wrote:
> On Tue, 2012-08-28 at 03:59 -0700, Eric Dumazet wrote:
>> On Tue, 2012-08-28 at 07:50 +0300, Denys Fedoryshchenko wrote:
>> > Hi
>> >
>> > Got information from friend, confirmed that it crashed at least 
>> two my
>> > boxes also :)
>> > 3.0.5-rc1 is working fine, 3.4.1 , 3.2.0 from ubuntu  - crashing
>> > No watchdog fired, and didn't got yet significant debugging
>> > information.
>> >
>> > Very easy to reproduce:
>> > 1)run the script
>> > 2)ping 192.168.3.234
>> >
>> > script:
>> > DEV_OUT=eth0
>> > ICMP="match ip protocol 1 0xff"
>> > U32="protocol ip u32"
>> > DST="match ip dst"
>> > tc qdisc add dev $DEV_OUT root handle 1: cbq avpkt 1000 bandwidth
>> > 100mbit
>> > tc class add dev $DEV_OUT parent 1: classid 1:1 cbq rate 512kbit 
>> allot
>> > 1500 prio 5 bounded isolated
>> > tc filter add dev $DEV_OUT parent 1:              prio 3 $U32 
>> $ICMP
>> > $DST 192.168.3.234 flowid 1:
>> > tc qdisc add dev $DEV_OUT parent 1:1 sfq perturb 10
>>
>> Not sure what your friend expected from this buggy configuration.
>>
>> It probably never worked at all.
>>
>> CBQ needs at least one child class and one leaf class.
>>
>> This scripts creates a loop inside CBQ, so cpu is probably looping 
>> in
>> cbq_enqueue() (or more exactly cbq_classify()), as instructed by the
>> sysadmin ;)
>>
>> u32 (or sfq) seems ok.
>>
>> Could you try the following patch ?
>>
>> Thanks !
>>
>> diff --git a/net/sched/sch_cbq.c b/net/sched/sch_cbq.c
>> index 6aabd77..564b9fc 100644
>> --- a/net/sched/sch_cbq.c
>> +++ b/net/sched/sch_cbq.c
>> @@ -250,10 +250,11 @@ cbq_classify(struct sk_buff *skb, struct Qdisc 
>> *sch, int *qerr)
>>  			else if ((cl = defmap[res.classid & TC_PRIO_MAX]) == NULL)
>>  				cl = defmap[TC_PRIO_BESTEFFORT];
>>
>> -			if (cl == NULL || cl->level >= head->level)
>> +			if (cl == NULL)
>>  				goto fallback;
>>  		}
>> -
>> +		if (cl->level >= head->level)
>> +			goto fallback;
>>  #ifdef CONFIG_NET_CLS_ACT
>>  		switch (result) {
>>  		case TC_ACT_QUEUED:
>>
>
> Hi Denys
>
> Any feedback on the suggested patch ?
>
> Thanks !
Dear Eric

Very sorry for delay, most of time in desert, without decent internet.
I will try to test today or tomorrow.

---
Denys Fedoryshchenko, Network Engineer, Virtual ISP S.A.L.

^ permalink raw reply

* Re: [PATCH v3 01/17] hashtable: introduce a small and naive hashtable
From: Sasha Levin @ 2012-09-06 13:53 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Pedro Alves, Steven Rostedt, Tejun Heo, torvalds, akpm,
	linux-kernel, linux-mm, paul.gortmaker, davem, mingo, ebiederm,
	aarcange, ericvh, netdev, josh, eric.dumazet, axboe, agk,
	dm-devel, neilb, ccaulfie, teigland, Trond.Myklebust, bfields,
	fweisbec, jesse, venkat.x.venkatsubra, ejt, snitzer, edumazet,
	linux-nfs, dev, rds-devel, lw
In-Reply-To: <20120904170138.GB31934@Krystal>

On 09/04/2012 07:01 PM, Mathieu Desnoyers wrote:
>> #define do_for_each_ftrace_rec(pg, rec)                                          \
>> >         for (pg = ftrace_pages_start, rec = &pg->records[pg->index];             \
>> >              pg && rec == &pg->records[pg->index];                               \
>> >              pg = pg->next)                                                      \
>> >           for (rec = pg->records; rec < &pg->records[pg->index]; rec++)
> Maybe in some cases there might be ways to combine the two loops into
> one ? I'm not seeing exactly how to do it for this one, but it should
> not be impossible. If the inner loop condition can be moved to the outer
> loop, and if we use (blah ? loop1_conf : loop2_cond) to test for
> different conditions depending on the context, and do the same for the
> 3rd argument of the for() loop. The details elude me for now though, so
> maybe it's complete non-sense ;)
> 
> It might not be that useful for do_for_each_ftrace_rec, but if we can do
> it for the hash table iterator, it might be worth it.

So I think that for the hash iterator it might actually be simpler.

My solution to making 'break' work in the iterator is:

	for (bkt = 0, node = NULL; bkt < HASH_SIZE(name) && node == NULL; bkt++)
		hlist_for_each_entry(obj, node, &name[bkt], member)

We initialize our node loop cursor with NULL in the external loop, and the
external loop will have a new condition to loop while that cursor is NULL.

My logic is that we can only 'break' when we are iterating over an object in the
internal loop. If we're iterating over an object in that loop then 'node != NULL'.

This way, if we broke from within the internal loop, the external loop will see
node as not NULL, and so it will stop looping itself. On the other hand, if the
internal loop has actually ended, then node will be NULL, and the outer loop
will keep running.

Is there anything I've missed?


Thanks,
Sasha

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* [PATCH net-next] ipv6: fix handling of throw routes
From: Nicolas Dichtel @ 2012-09-06 15:53 UTC (permalink / raw)
  To: davem; +Cc: netdev, markus.stenberg, eric.dumazet, Nicolas Dichtel
In-Reply-To: <5048A374.60005@6wind.com>

It's the same problem that previous fix about blackhole and prohibit routes.

When adding a throw route, it was handled like a classic route.
Moreover, it was only possible to add this kind of routes by specifying
an interface.

Before the patch:
  $ ip route add throw 2001::2/128
  RTNETLINK answers: No such device
  $ ip route add throw 2001::2/128 dev eth0
  $ ip -6 route | grep 2001::2
  2001::2 dev eth0  metric 1024

After:
  $ ip route add throw 2001::2/128
  $ ip -6 route | grep 2001::2
  throw 2001::2 dev lo  metric 1024  error -11

Reported-by: Markus Stenberg <markus.stenberg@iki.fi>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
---
 net/ipv6/route.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index fa26444..339d921 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1471,6 +1471,9 @@ int ip6_route_add(struct fib6_config *cfg)
 		case RTN_PROHIBIT:
 			rt->dst.error = -EACCES;
 			break;
+		case RTN_THROW:
+			rt->dst.error = -EAGAIN;
+			break;
 		default:
 			rt->dst.error = -ENETUNREACH;
 			break;
@@ -2275,7 +2278,8 @@ static int rtm_to_fib6_config(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	if (rtm->rtm_type == RTN_UNREACHABLE ||
 	    rtm->rtm_type == RTN_BLACKHOLE ||
-	    rtm->rtm_type == RTN_PROHIBIT)
+	    rtm->rtm_type == RTN_PROHIBIT ||
+	    rtm->rtm_type == RTN_THROW)
 		cfg->fc_flags |= RTF_REJECT;
 
 	if (rtm->rtm_type == RTN_LOCAL)
@@ -2412,6 +2416,9 @@ static int rt6_fill_node(struct net *net,
 		case -EACCES:
 			rtm->rtm_type = RTN_PROHIBIT;
 			break;
+		case -EAGAIN:
+			rtm->rtm_type = RTN_THROW;
+			break;
 		default:
 			rtm->rtm_type = RTN_UNREACHABLE;
 			break;
-- 
1.7.12

^ permalink raw reply related

* Re: Increased multicast packet drops in 3.4
From: Eric Dumazet @ 2012-09-06 13:31 UTC (permalink / raw)
  To: Shawn Bohrer; +Cc: netdev
In-Reply-To: <1346937667.2484.33.camel@edumazet-glaptop>

On Thu, 2012-09-06 at 15:21 +0200, Eric Dumazet wrote:

> 
> Are you receiving fragmented UDP frames ?
> 
> I ask this because with latest kernels (linux-3.5), we should no longer
> build a list of skb, but a single skb with page fragments.
> 
> commit 3cc4949269e01f39443d0fcfffb5bc6b47878d45
> Author: Eric Dumazet <edumazet@google.com>
> Date:   Sat May 19 03:02:20 2012 +0000
> 
>     ipv4: use skb coalescing in defragmentation
>     
>     ip_frag_reasm() can use skb_try_coalesce() to build optimized skb,
>     reducing memory used by them (truesize), and reducing number of cache
>     line misses and overhead for the consumer.
>     
>     Signed-off-by: Eric Dumazet <edumazet@google.com>
>     Cc: Alexander Duyck <alexander.h.duyck@intel.com>
>     Signed-off-by: David S. Miller <davem@davemloft.net>
> 
> 

Unfortunately mlx4 pulls too many bytes from the frame to skb->head, so
it defeats coalescing completely.

Try following patch (if you also try linux-3.5)

diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 9d27e42..700e70e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -150,7 +150,7 @@ enum {
 #define ETH_LLC_SNAP_SIZE	8
 
 #define SMALL_PACKET_SIZE      (256 - NET_IP_ALIGN)
-#define HEADER_COPY_SIZE       (128 - NET_IP_ALIGN)
+#define HEADER_COPY_SIZE       ETH_HLEN
 #define MLX4_LOOPBACK_TEST_PAYLOAD (HEADER_COPY_SIZE - ETH_HLEN)
 
 #define MLX4_EN_MIN_MTU		46

^ permalink raw reply related

* Re: IPv6 routing type - not at par with IPv4 one?
From: Nicolas Dichtel @ 2012-09-06 13:21 UTC (permalink / raw)
  To: Markus; +Cc: Eric Dumazet, Markus Stenberg, netdev
In-Reply-To: <7F0FC925-D0A4-4E27-AE8A-F2E0A619FB36@iki.fi>

Le 06/09/2012 12:54, Markus a écrit :
> On 6.9.2012, at 13.35, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> Well, it  seems you missed this :
>>
>> http://git.kernel.org/?p=linux/kernel/git/davem/net-next.git;a=commitdiff;h=ef2c7d7b59708d54213c7556a82d14de9a7e4475
>>
>> At least the blackhole is now supported on IPv6, so you probably have to
>> add the 'throw' bit, if it makes any sense.
>
>
> Ah, cool, teaches me to refresh my git trees before posting ;-)
> Nicolas, want to update for that too or will I?
Ok, I will send another patch for this.


Thank you,
Nicolas

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox