Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: linux-next: Tree for Sep 5 (netfilter: xt_socket.c)
From: Randy Dunlap @ 2013-09-05 20:23 UTC (permalink / raw)
  To: David Miller; +Cc: sfr, linux-next, linux-kernel, netdev, netfilter-devel
In-Reply-To: <20130905.143830.660252697193033003.davem@davemloft.net>

On 09/05/13 11:38, David Miller wrote:
> From: Randy Dunlap <rdunlap@infradead.org>
> Date: Thu, 05 Sep 2013 10:37:04 -0700
> 
>> On 09/05/13 02:32, Stephen Rothwell wrote:
>>> Hi all,
>>>
>>> Please do not add any code for v3.13 to your linux-next included branches
>>> until after v3.12-rc1 is released.
>>>
>>> Changes since 20130904:
>>>
>>
>> on x86_64:
>>
>> when CONFIG_IPV6=m
>> and CONFIG_NETFILTER_XT_MATCH_SOCKET=y:
>>
>> net/built-in.o: In function `socket_mt6_v1_v2':
>> xt_socket.c:(.text+0x51b55): undefined reference to `udp6_lib_lookup'
>> net/built-in.o: In function `socket_mt_init':
>> xt_socket.c:(.init.text+0x1ef8): undefined reference to `nf_defrag_ipv6_enable'
> 
> I just commited the following to fix this:
> 
> --------------------
> [PATCH] netfilter: Fix build errors with xt_socket.c
> 
> As reported by Randy Dunlap:
> 
> ====================
> when CONFIG_IPV6=m
> and CONFIG_NETFILTER_XT_MATCH_SOCKET=y:
> 
> net/built-in.o: In function `socket_mt6_v1_v2':
> xt_socket.c:(.text+0x51b55): undefined reference to `udp6_lib_lookup'
> net/built-in.o: In function `socket_mt_init':
> xt_socket.c:(.init.text+0x1ef8): undefined reference to `nf_defrag_ipv6_enable'
> ====================
> 
> Like several other modules under net/netfilter/ we have to
> have a dependency "IPV6 disabled or set compatibly with this
> module" clause.
> 
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: David S. Miller <davem@davemloft.net>

Acked-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
>  net/netfilter/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
> index 62a171a..6e839b6 100644
> --- a/net/netfilter/Kconfig
> +++ b/net/netfilter/Kconfig
> @@ -1175,6 +1175,7 @@ config NETFILTER_XT_MATCH_SOCKET
>  	depends on NETFILTER_XTABLES
>  	depends on NETFILTER_ADVANCED
>  	depends on !NF_CONNTRACK || NF_CONNTRACK
> +	depends on (IPV6 || IPV6=n)
>  	select NF_DEFRAG_IPV4
>  	select NF_DEFRAG_IPV6 if IP6_NF_IPTABLES
>  	help
> 


-- 
~Randy

^ permalink raw reply

* Re: [PATCHv2] am33xx: cpsw: default to ethernet hwaddr from efuse if not defined in dt
From: Olof Johansson @ 2013-09-05 20:22 UTC (permalink / raw)
  To: Matt Porter
  Cc: Peter Korsgaard, Mark Jackson, Tony Lindgren, Mugunthan V N,
	linux-omap, Network Development,
	devicetree-discuss@lists.ozlabs.org, hvaibhav, Richard Cochran,
	Kevin Hilman, michal.bachraty, michal.bachraty
In-Reply-To: <20130905201637.GB10973@ohporter.com>

On Thu, Sep 5, 2013 at 1:16 PM, Matt Porter <matt.porter@linaro.org> wrote:
> On Mon, Jul 15, 2013 at 07:31:18AM +0200, Peter Korsgaard wrote:
>> >>>>> "Mark" == Mark Jackson <mpfj-list@newflow.co.uk> writes:
>>
>> Hi,
>>
>>  Mark> On 08/07/13 13:42, Mark Jackson wrote:
>>  >> On 18/01/13 05:14, Mugunthan V N wrote:
>>  >>> On 1/18/2013 3:48 AM, Peter Korsgaard wrote:
>>  >>>> When booting with CONFIG_ARM_APPENDED_DTB (either because of using an old
>>  >>>> U-Boot, not wanting the hassle of 2 files or when using Falcon fast boot
>>  >>>> mode in U-Boot), nothing updates the ethernet hwaddr specified for the
>>  >>>> CPSW slaves, causing the driver to use a random hwaddr, which is some times
>>  >>>> troublesome.
>>  >>>>
>>  >>>> The am33xx has unique ethernet hwaddrs programmed in the efuse, so it makes
>>  >>>> more sense to default to these rather than random ones. Add a fixup step
>>  >>>> which adds mac-address dt properties using the efuse addresses if the DTB
>>  >>>> didn't contain valid ones.
>>  >>>>
>>  >>>> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
>>  >>>
>>  >>> This implementation looks fine.
>>  >>> Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
>>  >>
>>  >> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk>
>>
>>  Mark> Is this ever going to be put into the mainline code ?
>>
>> Good question. Tony, could you please pick this up? It has been pending
>> since January and has a number of acks.
>>
>> Do you want me to resend?
>
> Also working nicely here on 3.11.
>
> Tested-by: Matt Porter <matt.porter@linaro.org>
>
> Kevin or Olof: can you apply? Seems to be continuing no response after
> Ack back in January.

I have no idea what the patch is, it's no longer in the email. Can you
resend it, please?


-Olof

^ permalink raw reply

* Re: [PATCHv2] am33xx: cpsw: default to ethernet hwaddr from efuse if not defined in dt
From: Matt Porter @ 2013-09-05 20:16 UTC (permalink / raw)
  To: Peter Korsgaard
  Cc: Mark Jackson, tony, Mugunthan V N, linux-omap, netdev,
	devicetree-discuss, hvaibhav, richardcochran, Olof Johansson,
	Kevin Hilman, michal.bachraty, michal.bachraty
In-Reply-To: <871u70bd95.fsf@dell.be.48ers.dk>

On Mon, Jul 15, 2013 at 07:31:18AM +0200, Peter Korsgaard wrote:
> >>>>> "Mark" == Mark Jackson <mpfj-list@newflow.co.uk> writes:
> 
> Hi,
> 
>  Mark> On 08/07/13 13:42, Mark Jackson wrote:
>  >> On 18/01/13 05:14, Mugunthan V N wrote:
>  >>> On 1/18/2013 3:48 AM, Peter Korsgaard wrote:
>  >>>> When booting with CONFIG_ARM_APPENDED_DTB (either because of using an old
>  >>>> U-Boot, not wanting the hassle of 2 files or when using Falcon fast boot
>  >>>> mode in U-Boot), nothing updates the ethernet hwaddr specified for the
>  >>>> CPSW slaves, causing the driver to use a random hwaddr, which is some times
>  >>>> troublesome.
>  >>>> 
>  >>>> The am33xx has unique ethernet hwaddrs programmed in the efuse, so it makes
>  >>>> more sense to default to these rather than random ones. Add a fixup step
>  >>>> which adds mac-address dt properties using the efuse addresses if the DTB
>  >>>> didn't contain valid ones.
>  >>>> 
>  >>>> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
>  >>> 
>  >>> This implementation looks fine.
>  >>> Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
>  >> 
>  >> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk>
> 
>  Mark> Is this ever going to be put into the mainline code ?
> 
> Good question. Tony, could you please pick this up? It has been pending
> since January and has a number of acks.
> 
> Do you want me to resend?

Also working nicely here on 3.11.

Tested-by: Matt Porter <matt.porter@linaro.org>

Kevin or Olof: can you apply? Seems to be continuing no response after
Ack back in January.

-Matt

^ permalink raw reply

* [PATCH] bnx2x: avoid atomic allocations during initialization
From: Michal Schmidt @ 2013-09-05 20:13 UTC (permalink / raw)
  To: netdev; +Cc: Ariel Elior, Eilon Greenstein, David Miller

During initialization bnx2x allocates significant amounts of memory
(for rx data, rx SGEs, TPA pool) using atomic allocations.

I received a report where bnx2x failed to allocate SGEs and it had
to fall back to TPA-less operation.

Let's use GFP_KERNEL allocations during initialization, which runs
in process context. Add gfp_t parameters to functions that are used
both in initialization and in the receive path.

Use an unlikely branch in bnx2x_frag_alloc() to avoid atomic allocation
by netdev_alloc_frag(). The branch is taken several thousands of times
during initialization, but then never more. Note that fp->rx_frag_size
is never greater than PAGE_SIZE, so __get_free_page() can be used here.

Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 38 +++++++++++++++----------
 1 file changed, 23 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 8d726f6..884e8ad 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -484,10 +484,10 @@ static void bnx2x_set_gro_params(struct sk_buff *skb, u16 parsing_flags,
 	NAPI_GRO_CB(skb)->count = num_of_coalesced_segs;
 }
 
-static int bnx2x_alloc_rx_sge(struct bnx2x *bp,
-			      struct bnx2x_fastpath *fp, u16 index)
+static int bnx2x_alloc_rx_sge(struct bnx2x *bp, struct bnx2x_fastpath *fp,
+			      u16 index, gfp_t gfp_mask)
 {
-	struct page *page = alloc_pages(GFP_ATOMIC, PAGES_PER_SGE_SHIFT);
+	struct page *page = alloc_pages(gfp_mask, PAGES_PER_SGE_SHIFT);
 	struct sw_rx_page *sw_buf = &fp->rx_page_ring[index];
 	struct eth_rx_sge *sge = &fp->rx_sge_ring[index];
 	dma_addr_t mapping;
@@ -566,7 +566,7 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 
 		/* If we fail to allocate a substitute page, we simply stop
 		   where we are and drop the whole packet */
-		err = bnx2x_alloc_rx_sge(bp, fp, sge_idx);
+		err = bnx2x_alloc_rx_sge(bp, fp, sge_idx, GFP_ATOMIC);
 		if (unlikely(err)) {
 			bnx2x_fp_qstats(bp, fp)->rx_skb_alloc_failed++;
 			return err;
@@ -610,12 +610,17 @@ static void bnx2x_frag_free(const struct bnx2x_fastpath *fp, void *data)
 		kfree(data);
 }
 
-static void *bnx2x_frag_alloc(const struct bnx2x_fastpath *fp)
+static void *bnx2x_frag_alloc(const struct bnx2x_fastpath *fp, gfp_t gfp_mask)
 {
-	if (fp->rx_frag_size)
+	if (fp->rx_frag_size) {
+		/* GFP_KERNEL allocations are used only during initialization */
+		if (unlikely(gfp_mask & __GFP_WAIT))
+			return (void *)__get_free_page(gfp_mask);
+
 		return netdev_alloc_frag(fp->rx_frag_size);
+	}
 
-	return kmalloc(fp->rx_buf_size + NET_SKB_PAD, GFP_ATOMIC);
+	return kmalloc(fp->rx_buf_size + NET_SKB_PAD, gfp_mask);
 }
 
 #ifdef CONFIG_INET
@@ -695,7 +700,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 		goto drop;
 
 	/* Try to allocate the new data */
-	new_data = bnx2x_frag_alloc(fp);
+	new_data = bnx2x_frag_alloc(fp, GFP_ATOMIC);
 	/* Unmap skb in the pool anyway, as we are going to change
 	   pool entry status to BNX2X_TPA_STOP even if new skb allocation
 	   fails. */
@@ -746,15 +751,15 @@ drop:
 	bnx2x_fp_stats(bp, fp)->eth_q_stats.rx_skb_alloc_failed++;
 }
 
-static int bnx2x_alloc_rx_data(struct bnx2x *bp,
-			       struct bnx2x_fastpath *fp, u16 index)
+static int bnx2x_alloc_rx_data(struct bnx2x *bp, struct bnx2x_fastpath *fp,
+			       u16 index, gfp_t gfp_mask)
 {
 	u8 *data;
 	struct sw_rx_bd *rx_buf = &fp->rx_buf_ring[index];
 	struct eth_rx_bd *rx_bd = &fp->rx_desc_ring[index];
 	dma_addr_t mapping;
 
-	data = bnx2x_frag_alloc(fp);
+	data = bnx2x_frag_alloc(fp, gfp_mask);
 	if (unlikely(data == NULL))
 		return -ENOMEM;
 
@@ -947,7 +952,8 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 			memcpy(skb->data, data + pad, len);
 			bnx2x_reuse_rx_data(fp, bd_cons, bd_prod);
 		} else {
-			if (likely(bnx2x_alloc_rx_data(bp, fp, bd_prod) == 0)) {
+			if (likely(bnx2x_alloc_rx_data(bp, fp, bd_prod,
+						       GFP_ATOMIC) == 0)) {
 				dma_unmap_single(&bp->pdev->dev,
 						 dma_unmap_addr(rx_buf, mapping),
 						 fp->rx_buf_size,
@@ -1307,7 +1313,8 @@ void bnx2x_init_rx_rings(struct bnx2x *bp)
 				struct sw_rx_bd *first_buf =
 					&tpa_info->first_buf;
 
-				first_buf->data = bnx2x_frag_alloc(fp);
+				first_buf->data =
+					bnx2x_frag_alloc(fp, GFP_KERNEL);
 				if (!first_buf->data) {
 					BNX2X_ERR("Failed to allocate TPA skb pool for queue[%d] - disabling TPA on this queue!\n",
 						  j);
@@ -1329,7 +1336,8 @@ void bnx2x_init_rx_rings(struct bnx2x *bp)
 			for (i = 0, ring_prod = 0;
 			     i < MAX_RX_SGE_CNT*NUM_RX_SGE_PAGES; i++) {
 
-				if (bnx2x_alloc_rx_sge(bp, fp, ring_prod) < 0) {
+				if (bnx2x_alloc_rx_sge(bp, fp, ring_prod,
+						       GFP_KERNEL) < 0) {
 					BNX2X_ERR("was only able to allocate %d rx sges\n",
 						  i);
 					BNX2X_ERR("disabling TPA for queue[%d]\n",
@@ -4214,7 +4222,7 @@ static int bnx2x_alloc_rx_bds(struct bnx2x_fastpath *fp,
 	 * fp->eth_q_stats.rx_skb_alloc_failed = 0
 	 */
 	for (i = 0; i < rx_ring_size; i++) {
-		if (bnx2x_alloc_rx_data(bp, fp, ring_prod) < 0) {
+		if (bnx2x_alloc_rx_data(bp, fp, ring_prod, GFP_KERNEL) < 0) {
 			failure_cnt++;
 			continue;
 		}
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCHv2] am33xx: cpsw: default to ethernet hwaddr from efuse if not defined in dt
From: Koen Kooi @ 2013-09-05 20:11 UTC (permalink / raw)
  To: Mark Jackson
  Cc: Mugunthan V N, Peter Korsgaard, linux-omap, netdev,
	devicetree-discuss, hvaibhav, richardcochran, tony,
	michal.bachraty, michal.bachraty
In-Reply-To: <51DAB39C.7090106@newflow.co.uk>


Op 8 jul. 2013, om 14:42 heeft Mark Jackson <mpfj-list@newflow.co.uk> het volgende geschreven:

> On 18/01/13 05:14, Mugunthan V N wrote:
>> On 1/18/2013 3:48 AM, Peter Korsgaard wrote:
>>> When booting with CONFIG_ARM_APPENDED_DTB (either because of using an old
>>> U-Boot, not wanting the hassle of 2 files or when using Falcon fast boot
>>> mode in U-Boot), nothing updates the ethernet hwaddr specified for the
>>> CPSW slaves, causing the driver to use a random hwaddr, which is some times
>>> troublesome.
>>> 
>>> The am33xx has unique ethernet hwaddrs programmed in the efuse, so it makes
>>> more sense to default to these rather than random ones. Add a fixup step
>>> which adds mac-address dt properties using the efuse addresses if the DTB
>>> didn't contain valid ones.
>>> 
>>> Signed-off-by: Peter Korsgaard <jacmet@sunsite.dk>
>>> 
>> 
>> This implementation looks fine.
>> Acked-by: Mugunthan V N <mugunthanvnm@ti.com>
> 
> Tested-by: Mark Jackson <mpfj-list@newflow.co.uk>

Tested-by: Koen Kooi <koen@dominion.thruhere.net>




^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: David Miller @ 2013-09-05 20:07 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, stephen
In-Reply-To: <5228E2A0.4010607@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 05 Sep 2013 21:59:28 +0200

> On 09/05/2013 09:54 PM, David Miller wrote:
>> From: Daniel Borkmann <dborkman@redhat.com>
>> Date: Thu, 05 Sep 2013 21:48:00 +0200
>>
>>> With socket(PF_PACKET, ..., htons(ETH_P_ALL)) you will already get
>>> all users from the suggested white-list of the patch, which is the
>>> majority of netlink users I believe. Hence, you do not need to have
>>> one socket per protocol. skbs from there should get dragged into
>>> pf_packet via dev_queue_xmit_nit() which works on ptype_all list.
>>
>> What about user level netlink protocols?
> 
> If you are referring to NETLINK_USERSOCK, then we let this pass here,
> so nothing changes.

Ok I need to think about this some more, I moved your patch back into under review
state.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: Daniel Borkmann @ 2013-09-05 19:59 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, stephen
In-Reply-To: <20130905.155417.2121308426258876038.davem@davemloft.net>

On 09/05/2013 09:54 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu, 05 Sep 2013 21:48:00 +0200
>
>> With socket(PF_PACKET, ..., htons(ETH_P_ALL)) you will already get
>> all users from the suggested white-list of the patch, which is the
>> majority of netlink users I believe. Hence, you do not need to have
>> one socket per protocol. skbs from there should get dragged into
>> pf_packet via dev_queue_xmit_nit() which works on ptype_all list.
>
> What about user level netlink protocols?

If you are referring to NETLINK_USERSOCK, then we let this pass here,
so nothing changes.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: David Miller @ 2013-09-05 19:54 UTC (permalink / raw)
  To: jesse; +Cc: netdev, dev, azhou, fengguang.wu, geert
In-Reply-To: <1378408625-18415-1-git-send-email-jesse@nicira.com>

From: Jesse Gross <jesse@nicira.com>
Date: Thu,  5 Sep 2013 12:17:05 -0700

> sw_flow_key alignment was declared as " __aligned(__alignof__(long))".
> However, this breaks on the m68k architecture where long is 32 bit in
> size but 16 bit aligned by default. This aligns to the size of a long to
> ensure that we can always do comparsions in full long-sized chunks. It
> also adds an additional build check to catch any reduction in alignment.
> 
> CC: Andy Zhou <azhou@nicira.com>
> Reported-by: Fengguang Wu <fengguang.wu@intel.com>
> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Signed-off-by: Jesse Gross <jesse@nicira.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: David Miller @ 2013-09-05 19:54 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, stephen
In-Reply-To: <5228DFF0.7070106@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu, 05 Sep 2013 21:48:00 +0200

> With socket(PF_PACKET, ..., htons(ETH_P_ALL)) you will already get
> all users from the suggested white-list of the patch, which is the
> majority of netlink users I believe. Hence, you do not need to have
> one socket per protocol. skbs from there should get dragged into
> pf_packet via dev_queue_xmit_nit() which works on ptype_all list.

What about user level netlink protocols?

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: David Miller @ 2013-09-05 19:50 UTC (permalink / raw)
  To: stephen; +Cc: dborkman, netdev, dborkmann
In-Reply-To: <20130905124437.6b3e14ad@nehalam.linuxnetplumber.net>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Thu, 5 Sep 2013 12:44:37 -0700

> On Thu, 05 Sep 2013 15:03:41 -0400 (EDT)
> David Miller <davem@davemloft.net> wrote:
> 
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Date: Thu, 5 Sep 2013 11:57:55 -0700
>> 
>> > If you want filtering, why not add BPF (sk_filter) support to this?
>> 
>> That's one idea.
>> 
>> But using sk_protocol is at least consistent with how AF_PACKET does
>> things.
> 
> Why not both.

That's fine too.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: David Miller @ 2013-09-05 19:49 UTC (permalink / raw)
  To: geert; +Cc: jesse, netdev, dev, azhou, fengguan.wu
In-Reply-To: <CAMuHMdVgd499_oAU0oodE8ie+h+Oc6M-dRcyC4tT6QDB=01GSg@mail.gmail.com>

From: Geert Uytterhoeven <geert@linux-m68k.org>
Date: Thu, 5 Sep 2013 21:20:16 +0200

> Why don't you abort the loop if a difference is found?

We are optimizing for a match.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: Daniel Borkmann @ 2013-09-05 19:48 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20130905.144442.2085221662776542385.davem@davemloft.net>

On 09/05/2013 08:44 PM, David Miller wrote:
> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu,  5 Sep 2013 17:48:47 +0200
>
>> From: Daniel Borkmann <dborkmann@redhat.com>
>>
>> Fix finer-grained control and let only a whitelist of allowed netlink
>> protocols pass, in our case related to networking. If later on, other
>> subsystems decide they want to add their protocol as well to the list
>> of allowed protocols they shall simply add it. While at it, we also
>> need to tell what protocol is in use otherwise BPF_S_ANC_PROTOCOL can
>> not pick it up (as it's not filled out).
>>
>> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

To answer Stephen and Dave in one message ... ;-)

> This takes away functionality that I'd be more interesting in using,
> namely being able to listen to all netlink protocols using one tap.
>
> Seriously, when I first saw this feature, that was the first way I'd
> imagine myself using it, as a tcpdump for netlink traffic, all of
> it.
>
> If I just want to hear all netlink traffic, don't make me be forced to
> know every single NETLINK_* protocol value and have to open that many
> sockets just to do so.
>
> It also makes it so that I can't listen to userlevel custom netlink
> protocols, another minus of filtering.
>
> At the very least, allow an sk_protocol of zero or similar to have this
> meaning of "everything".

I agree with you with all the above and I think with having this in
tcpdump would be great to debug netlink traffic of course ...

With socket(PF_PACKET, ..., htons(ETH_P_ALL)) you will already get all
users from the suggested white-list of the patch, which is the majority
of netlink users I believe. Hence, you do not need to have one socket
per protocol. skbs from there should get dragged into pf_packet via
dev_queue_xmit_nit() which works on ptype_all list.

Plus, with the nskb->protocol addition in this patch, they can then either
take all netlink traffic from that, or filter with BPF e.g. for particular
protocols via BPF_S_ANC_PROTOCOL or more advanced stuff, for example. Also,
they can select particular dissectors with that information that comes in
via sll_protocol which is useful, of course.

I think for out-of-tree modules, we never really cared much, and they
could use generic netlink probably anyway. The thing why I wanted to
do this is to keep security related messages (audit, selinux) generally
under the radar, which is "more or less" what's remaining, so we still get
main users. This patch would accomplish those things, that's why I think
it's useful.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Joe Perches @ 2013-09-05 19:47 UTC (permalink / raw)
  To: David Miller; +Cc: jesse, netdev, dev, azhou, fengguan.wu, geert
In-Reply-To: <20130905.144044.1053960608071929025.davem@davemloft.net>

On Thu, 2013-09-05 at 14:40 -0400, David Miller wrote:
> From: Jesse Gross <jesse@nicira.com>
> Date: Thu, 5 Sep 2013 11:36:19 -0700
> > On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem@davemloft.net> wrote:
> >> From: Jesse Gross <jesse@nicira.com>
> >> Date: Thu,  5 Sep 2013 10:41:27 -0700
> >>
> >>> -} __aligned(__alignof__(long));
> >>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
> >>
> >> This kind of stuff drives me crazy.
> >>
> >> If the issue is the type, therefore at least use an expression that
> >> mentions the type explicitly.  And mention the actual type that
> >> matters.  "long" isn't it.
> > 
> > 'long' actually is the real type here.
> > 
> > When doing comparisons, this structure is being accessed as a byte
> > array in 'long' sized chunks, not by its members. Therefore, the
> > compiler's alignment does not necessarily correspond to anything for
> > this purpose. It could be a struct full of u16's and we would still
> > want to access it in chunks of 'long'.
> > 
> > To completely honest, I think the correct alignment should be
> > sizeof(long) because I know that 'long' is not always 8 bytes on all
> > architectures. However, you made the point before that this could
> > break the alignment of the 64-bit values on architectures where 'long'
> > is 32 bits wide, so 8 bytes is the generic solution.
> 
> Look at net/core/flow.c:flow_key_compare().
> 
> And then we annotate struct flowi with
> 
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
> 
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

I think using the BITS_PER_LONG #define is pretty odd
as it's set using a test for CONFIG_64BIT and that

	__aligned(__alignof__(long))
or
	__aligned(sizeof(long))

may be simpler to read. That first would allow
architectures that can read any long on an arbitrary
boundary to pack the structure as densely as possible.
That may not work if the entire structure is always
read with via long *.

If so, my choice would be to standardize using

} __aligned(sizeof(long));

Currently, what happens to a struct that isn't
actually a multiple of long bytes?  Are these
structs guaranteed to be zero padded right now?

Also, what happens for structs like:

struct foo {
	u8	bar[6];
	long	baz;
	u8	qux[2];
} __aligned(sizeof(long));

where the padding may or may not exist if
__packed is also specified?

btw:

all the __attribute__((__aligned__(foo)...)) and
__aligned(...) / __packed(...) uses could be rewritten
to a more standard style.

$ grep -rP --include=*.[ch] -oh "\b__aligned[^\(\n_]*\s*\([^;\n]+[;\n]" * | \
  sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c
      3 __aligned(1)
      3 __aligned(16)
      1 __aligned(1<<WORK_STRUCT_FLAG_BITS)
     12 __aligned(2)
      7 __aligned(32)
     32 __aligned(4)
      8 __aligned(64)
     15 __aligned(8)
      1 __aligned(__alignof__(long))
      3 __aligned(__CACHE_WRITEBACK_GRANULE)
      1 __aligned((IOR_DMA_OFFSET_BITS/IOR_BPC))
      1 __aligned((IOR_PHYS_OFFSET_BITS/IOR_BPC))
      1 __aligned(LOG_ALIGN)
      2 __aligned(NETDEV_ALIGN)
     10 __aligned(PAGE_SIZE)
      2 __aligned(sizeof(s64))
      4 __aligned(sizeof(u16))
      2 __aligned(sizeof(u32))
      7 __aligned(sizeof(void*))

$ grep -rP --include=*.[ch] -oh "\b__attribute.*aligned[^\(\n]*\s*\([^;\n]+[;\n]" * | \
  sed -r -e 's/\s//g' -e 's/\)[^\)]*$/\)/' | sort | uniq -c
      2 __attribute__((aligned(0x100)))
      1 __attribute__((aligned(0x2000)))
      2 __attribute__((aligned(0x20000)))
      1 __attribute__((__aligned__(0x400)))
      3 __attribute__((aligned(0x80)))
      1 __attribute__((aligned(1024),packed))
      2 __attribute__((__aligned__(128)))
      9 __attribute__((__aligned__(16)))
     30 __attribute__((aligned(16)))
      2 __attribute__((aligned(1),packed))
      1 __attribute__((aligned(2)))
      1 __attribute__((aligned(2048)))
     14 __attribute__((aligned(256)))
      1 __attribute__((aligned(256),packed))
      5 __attribute__((aligned(2),packed))
      1 __attribute__((aligned(2*sizeof(long))))
      1 __attribute((aligned(2*sizeof(unsignedlong))))
      3 __attribute__((__aligned__(32)))
     53 __attribute__((aligned(32)))
      2 __attribute__((aligned(32),packed))
      2 __attribute__((__aligned__(4)))
     22 __attribute__((aligned(4)))
      2 __attribute__((aligned(4096)))
      1 __attribute__((aligned(4<<10)))
      1 __attribute__((aligned(4),packed))
     18 __attribute__((aligned(64)))
      1 __attribute__((aligned(65536)))
     15 __attribute__((__aligned__(8)))
     89 __attribute__((aligned(8)))
      2 __attribute__((aligned(a)))
      5 __attribute__((aligned(__alignof__(structebt_replace))))
      1 __attribute__((aligned(__alignof__(structhash_alg_common))))
      1 __attribute__((aligned(__alignof__(u32))))
      1 __attribute__((aligned(ATOMIC_HASH_L2_SIZE*4)))
      4 __attribute__((__aligned__(BITS_PER_LONG/8)))
      1 __attribute__((aligned(BUFFER_SIZE)))
      2 __attribute__((aligned(CHIP_L2_LINE_SIZE())))
      1 __attribute__((aligned(CPPI_DESCRIPTOR_ALIGN)))
      1 __attribute__((aligned(DM_IO_MAX_REGIONS)))
      1 __attribute__((aligned(HV_PAGE_TABLE_ALIGN)))
      1 __attribute__((aligned(_K_SS_ALIGNSIZE)))
      1 __attribute__((aligned(L1_CACHE_BYTES)))
      3 __attribute__((aligned(L2_CACHE_BYTES)))
      1 __attribute__((__aligned__(NETDEV_ALIGN)))
      3 __attribute__((__aligned__(PADLOCK_ALIGNMENT)))
      4 __attribute__((__aligned__(PAGE_SIZE)))
      7 __attribute__((aligned(PAGE_SIZE)))
      1 __attribute__((aligned(PS3_BMP_MINALIGN)))
      2 __attribute__((aligned(sizeof(Elf##size##_Word))))
      1 __attribute__((aligned(sizeof(int))))
      1 __attribute__((aligned(sizeof(__kernel_size_t))))
      1 __attribute__((aligned(sizeof(kernel_ulong_t))))
      7 __attribute__((aligned(sizeof(long))))
      1 __attribute__((aligned(sizeof(s64))))
      1 __attribute__((__aligned__(sizeof(structxencomm_mini))))
      1 __attribute__((aligned(sizeof(__u64))))
      8 __attribute__((aligned(sizeof(uint64_t))))
      6 __attribute__((aligned(sizeof(unsignedlong))))
      1 __attribute__((__aligned__(sizeof(void*))))
      1 __attribute__((aligned(sizeof(void*))))
      7 __attribute__((__aligned__(SMP_CACHE_BYTES)))
      6 __attribute__((aligned(SMP_CACHE_BYTES)))
      1 __attribute__((aligned(stackalign)))
      1 __attribute__((aligned(THREAD_SIZE)))
      1 __attribute__((aligned(TSB_ENTRY_ALIGNMENT)))
      1 __attribute__((aligned(XCHAL_CP0_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP1_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP2_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP3_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP4_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP5_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP6_SA_ALIGN)))
      1 __attribute__((aligned(XCHAL_CP7_SA_ALIGN)))
      2 __attribute__((aligned(XCHAL_NCP_SA_ALIGN)))
      1 __attribute__((packed,aligned(1024)))
      1 __attribute__((packed,__aligned__(16)))
      4 __attribute__((packed,aligned(16)))
      7 __attribute__((packed,aligned(2)))
      1 __attribute__((packed,aligned(2048)))
      4 __attribute__((packed,aligned(256)))
    177 __attribute__((packed,aligned(4)))
      2 __attribute__((packed,aligned(4096)))
      2 __attribute__((packed,aligned(64)))
     47 __attribute__((packed,aligned(8)))
      2 __attribute__((packed,aligned(PAGE_SIZE)))
      1 __attribute__((packed,aligned(PMCRAID_IOADL_ALIGNMENT)))
      2 __attribute__((packed,aligned(PMCRAID_IOARCB_ALIGNMENT)))
      1 __attribute__((__section__(".data..vm0.pgd"),aligned(PAGE_SIZE)))
      1 __attribute__((__section__(".data..vm0.pmd"),aligned(PAGE_SIZE)))
      1 __attribute__((__section__(".data..vm0.pte"),aligned(PAGE_SIZE)))

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: Stephen Hemminger @ 2013-09-05 19:44 UTC (permalink / raw)
  To: David Miller; +Cc: dborkman, netdev, dborkmann
In-Reply-To: <20130905.150341.1090416835512463609.davem@davemloft.net>

On Thu, 05 Sep 2013 15:03:41 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Thu, 5 Sep 2013 11:57:55 -0700
> 
> > If you want filtering, why not add BPF (sk_filter) support to this?
> 
> That's one idea.
> 
> But using sk_protocol is at least consistent with how AF_PACKET does
> things.

Why not both.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Jesse Gross @ 2013-09-05 19:25 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: David Miller, netdev@vger.kernel.org, dev@openvswitch.org,
	Andy Zhou, fengguan.wu
In-Reply-To: <CAMuHMdVgd499_oAU0oodE8ie+h+Oc6M-dRcyC4tT6QDB=01GSg@mail.gmail.com>

On Thu, Sep 5, 2013 at 12:20 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> Why don't you abort the loop if a difference is found?
> Or is this a security-related struct where you want to protect against
> timing attacks?

It's more expensive to test for a difference on every iteration in the
common case where the comparison succeeds.

> Furthermore, as you compare the raw bytes, I hope you always
> initialize all gaps in the struct to zero.
> E.g. there's a 2-byte gap immediately after "ip", as the next member
> is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned).

It's initialized to zero.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Geert Uytterhoeven @ 2013-09-05 19:25 UTC (permalink / raw)
  To: Jesse Gross
  Cc: David Miller, netdev@vger.kernel.org, dev@openvswitch.org,
	Andy Zhou, Fengguang Wu
In-Reply-To: <1378408625-18415-1-git-send-email-jesse@nicira.com>

On Thu, Sep 5, 2013 at 9:17 PM, Jesse Gross <jesse@nicira.com> wrote:
> --- a/net/openvswitch/flow.c
> +++ b/net/openvswitch/flow.c
> @@ -1981,6 +1981,7 @@ nla_put_failure:
>   * Returns zero if successful or a negative error code. */
>  int ovs_flow_init(void)
>  {
> +       BUILD_BUG_ON(__alignof__(struct sw_flow_key) % __alignof__(long));

> -} __aligned(__alignof__(long));
> +} __aligned(BITS_PER_LONG/8); /* Ensure that we can do comparisons as longs. */

These don't match: the struct definition says aligned to 4 or 8 bytes, the check
checks for a multiple of the alignment of "long", which is 2, 4 or 8.

Anyway, BITS_PER_LONG/8 is always a multiple of __alignof__(long), so your
check will never fail.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Geert Uytterhoeven @ 2013-09-05 19:20 UTC (permalink / raw)
  To: David Miller
  Cc: Jesse Gross, netdev@vger.kernel.org, dev@openvswitch.org,
	Andy Zhou, fengguan.wu
In-Reply-To: <20130905.144044.1053960608071929025.davem@davemloft.net>

On Thu, Sep 5, 2013 at 8:40 PM, David Miller <davem@davemloft.net> wrote:
> When doing comparisons, this structure is being accessed as a byte
> array in 'long' sized chunks, not by its members. Therefore, the

Like ... an optimized memcmp() does? Which handles all (un)alignment
well?

>> To completely honest, I think the correct alignment should be
>> sizeof(long) because I know that 'long' is not always 8 bytes on all
>> architectures. However, you made the point before that this could
>> break the alignment of the 64-bit values on architectures where 'long'
>> is 32 bits wide, so 8 bytes is the generic solution.
>
> Look at net/core/flow.c:flow_key_compare().
>
> And then we annotate struct flowi with
>
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
>
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

This may still break the alignment of 64-bit values on 32-bit architectures
where __alignof__(u64) == 8.

Now let's look at the comparison function:

static bool __cmp_key(const struct sw_flow_key *key1,
                const struct sw_flow_key *key2,  int key_start, int key_end)
{
        const long *cp1 = (long *)((u8 *)key1 + key_start);
        const long *cp2 = (long *)((u8 *)key2 + key_start);
        long diffs = 0;
        int i;

        for (i = key_start; i < key_end;  i += sizeof(long))
                diffs |= *cp1++ ^ *cp2++;

        return diffs == 0;
}

Why don't you abort the loop if a difference is found?
Or is this a security-related struct where you want to protect against
timing attacks?

Furthermore, as you compare the raw bytes, I hope you always
initialize all gaps in the struct to zero.
E.g. there's a 2-byte gap immediately after "ip", as the next member
is 32-bit (except op m68k, where the 32-bit member will be 2-byte aligned).

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Jesse Gross @ 2013-09-05 19:17 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, dev, Andy Zhou, Fengguang Wu, Geert Uytterhoeven,
	Jesse Gross

sw_flow_key alignment was declared as " __aligned(__alignof__(long))".
However, this breaks on the m68k architecture where long is 32 bit in
size but 16 bit aligned by default. This aligns to the size of a long to
ensure that we can always do comparsions in full long-sized chunks. It
also adds an additional build check to catch any reduction in alignment.

CC: Andy Zhou <azhou@nicira.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Jesse Gross <jesse@nicira.com>
---
 net/openvswitch/flow.c | 1 +
 net/openvswitch/flow.h | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/openvswitch/flow.c b/net/openvswitch/flow.c
index ad1aeeb..fb36f85 100644
--- a/net/openvswitch/flow.c
+++ b/net/openvswitch/flow.c
@@ -1981,6 +1981,7 @@ nla_put_failure:
  * Returns zero if successful or a negative error code. */
 int ovs_flow_init(void)
 {
+	BUILD_BUG_ON(__alignof__(struct sw_flow_key) % __alignof__(long));
 	BUILD_BUG_ON(sizeof(struct sw_flow_key) % sizeof(long));
 
 	flow_cache = kmem_cache_create("sw_flow", sizeof(struct sw_flow), 0,
diff --git a/net/openvswitch/flow.h b/net/openvswitch/flow.h
index b65f885..212fbf7 100644
--- a/net/openvswitch/flow.h
+++ b/net/openvswitch/flow.h
@@ -125,7 +125,7 @@ struct sw_flow_key {
 			} nd;
 		} ipv6;
 	};
-} __aligned(__alignof__(long));
+} __aligned(BITS_PER_LONG/8); /* Ensure that we can do comparisons as longs. */
 
 struct sw_flow {
 	struct rcu_head rcu;
-- 
1.8.1.2

^ permalink raw reply related

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: Jesse Gross @ 2013-09-05 19:14 UTC (permalink / raw)
  To: David Miller
  Cc: dev-yBygre7rU0TnMu66kgdUjQ@public.gmane.org, netdev,
	Geert Uytterhoeven
In-Reply-To: <20130905.144044.1053960608071929025.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Thu, Sep 5, 2013 at 11:40 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
> From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
> Date: Thu, 5 Sep 2013 11:36:19 -0700
>
>> On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>>> From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
>>> Date: Thu,  5 Sep 2013 10:41:27 -0700
>>>
>>>> -} __aligned(__alignof__(long));
>>>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
>>>
>>> This kind of stuff drives me crazy.
>>>
>>> If the issue is the type, therefore at least use an expression that
>>> mentions the type explicitly.  And mention the actual type that
>>> matters.  "long" isn't it.
>>
>> 'long' actually is the real type here.
>>
>> When doing comparisons, this structure is being accessed as a byte
>> array in 'long' sized chunks, not by its members. Therefore, the
>> compiler's alignment does not necessarily correspond to anything for
>> this purpose. It could be a struct full of u16's and we would still
>> want to access it in chunks of 'long'.
>>
>> To completely honest, I think the correct alignment should be
>> sizeof(long) because I know that 'long' is not always 8 bytes on all
>> architectures. However, you made the point before that this could
>> break the alignment of the 64-bit values on architectures where 'long'
>> is 32 bits wide, so 8 bytes is the generic solution.
>
> Look at net/core/flow.c:flow_key_compare().
>
> And then we annotate struct flowi with
>
> } __attribute__((__aligned__(BITS_PER_LONG/8)));
>
> Don't reinvent the wheel, either mimick how existing code does
> this kind of thing or provide a justification for doing it differently
> and update the existing cases to match and be consistent.

Sure, I'll send a new patch to use BITS_PER_LONG.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: David Miller @ 2013-09-05 19:03 UTC (permalink / raw)
  To: stephen; +Cc: dborkman, netdev, dborkmann
In-Reply-To: <20130905115755.7aec533c@nehalam.linuxnetplumber.net>

From: Stephen Hemminger <stephen@networkplumber.org>
Date: Thu, 5 Sep 2013 11:57:55 -0700

> If you want filtering, why not add BPF (sk_filter) support to this?

That's one idea.

But using sk_protocol is at least consistent with how AF_PACKET does
things.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: Stephen Hemminger @ 2013-09-05 18:57 UTC (permalink / raw)
  To: David Miller; +Cc: dborkman, netdev, dborkmann
In-Reply-To: <20130905.144442.2085221662776542385.davem@davemloft.net>

On Thu, 05 Sep 2013 14:44:42 -0400 (EDT)
David Miller <davem@davemloft.net> wrote:

> From: Daniel Borkmann <dborkman@redhat.com>
> Date: Thu,  5 Sep 2013 17:48:47 +0200
> 
> > From: Daniel Borkmann <dborkmann@redhat.com>
> > 
> > Fix finer-grained control and let only a whitelist of allowed netlink
> > protocols pass, in our case related to networking. If later on, other
> > subsystems decide they want to add their protocol as well to the list
> > of allowed protocols they shall simply add it. While at it, we also
> > need to tell what protocol is in use otherwise BPF_S_ANC_PROTOCOL can
> > not pick it up (as it's not filled out).
> > 
> > Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> 
> This takes away functionality that I'd be more interesting in using,
> namely being able to listen to all netlink protocols using one tap.
> 
> Seriously, when I first saw this feature, that was the first way I'd
> imagine myself using it, as a tcpdump for netlink traffic, all of
> it.
> 
> If I just want to hear all netlink traffic, don't make me be forced to
> know every single NETLINK_* protocol value and have to open that many
> sockets just to do so.
> 
> It also makes it so that I can't listen to userlevel custom netlink
> protocols, another minus of filtering.
> 
> At the very least, allow an sk_protocol of zero or similar to have this
> meaning of "everything".
> 
> I'm not applying this patch, sorry.
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


If you want filtering, why not add BPF (sk_filter) support to this?

^ permalink raw reply

* Re: [PATCH net] vxlan: Fix kernel panic on device delete.
From: David Miller @ 2013-09-05 18:51 UTC (permalink / raw)
  To: pshelar; +Cc: netdev
In-Reply-To: <1378329141-11868-1-git-send-email-pshelar@nicira.com>

From: Pravin B Shelar <pshelar@nicira.com>
Date: Wed,  4 Sep 2013 14:12:21 -0700

> On vxlan device create if socket create fails vxlan device is not
> added to hash table. Therefore we need to check if device
> is in hashtable before we delete it from hlist.
> Following patch avoid the crash. net-next already has this fix.
 ...
> Signed-off-by: Pravin B Shelar <pshelar@nicira.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net: mvneta: properly disable HW PHY polling and ensure adjust_link() works
From: David Miller @ 2013-09-05 18:51 UTC (permalink / raw)
  To: jason
  Cc: thomas.petazzoni, netdev, alior, jochen.armkernel, simon.guinot,
	ryan, vdonnefort, ethan, stable, ezequiel.garcia, yves,
	gregory.clement, psanford, w, linux-arm-kernel
In-Reply-To: <20130904145051.GO19598@titan.lakedaemon.net>

From: Jason Cooper <jason@lakedaemon.net>
Date: Wed, 4 Sep 2013 10:50:51 -0400

> On Wed, Sep 04, 2013 at 04:21:18PM +0200, Thomas Petazzoni wrote:
>> This commit fixes a long-standing bug that has been reported by many
>> users: on some Armada 370 platforms, only the network interface that
>> has been used in U-Boot to tftp the kernel works properly in
>> Linux. The other network interfaces can see a 'link up', but are
>> unable to transmit data. The reports were generally made on the Armada
>> 370-based Mirabox, but have also been given on the Armada 370-RD
>> board.
 ...
> 
> David,
> 
> Offending patch is:
> 
>   c5aff18 net: mvneta: driver for Marvell Armada 370/XP network unit
> 
> Applies and builds cleanly against v3.8.13, v3.9.11, v3.10.10, and v3.11
> 
> Acked-by: Jason Cooper <jason@lakedaemon.net>

Applied.

^ permalink raw reply

* Re: [PATCH net] net: netlink: filter particular protocols from analyzers
From: David Miller @ 2013-09-05 18:44 UTC (permalink / raw)
  To: dborkman; +Cc: netdev, dborkmann
In-Reply-To: <1378396127-8342-1-git-send-email-dborkman@redhat.com>

From: Daniel Borkmann <dborkman@redhat.com>
Date: Thu,  5 Sep 2013 17:48:47 +0200

> From: Daniel Borkmann <dborkmann@redhat.com>
> 
> Fix finer-grained control and let only a whitelist of allowed netlink
> protocols pass, in our case related to networking. If later on, other
> subsystems decide they want to add their protocol as well to the list
> of allowed protocols they shall simply add it. While at it, we also
> need to tell what protocol is in use otherwise BPF_S_ANC_PROTOCOL can
> not pick it up (as it's not filled out).
> 
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>

This takes away functionality that I'd be more interesting in using,
namely being able to listen to all netlink protocols using one tap.

Seriously, when I first saw this feature, that was the first way I'd
imagine myself using it, as a tcpdump for netlink traffic, all of
it.

If I just want to hear all netlink traffic, don't make me be forced to
know every single NETLINK_* protocol value and have to open that many
sockets just to do so.

It also makes it so that I can't listen to userlevel custom netlink
protocols, another minus of filtering.

At the very least, allow an sk_protocol of zero or similar to have this
meaning of "everything".

I'm not applying this patch, sorry.

^ permalink raw reply

* Re: [PATCH] openvswitch: Fix alignment of struct sw_flow_key.
From: David Miller @ 2013-09-05 18:40 UTC (permalink / raw)
  To: jesse-l0M0P4e3n4LQT0dZR+AlfA
  Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
	fengguan.wu-ral2JQCrhuEAvxtiuMwx3w, geert-Td1EMuHUCqxL1ZNQvxDV9g
In-Reply-To: <CAEP_g=8gcCtkv=no=HkVmS+NDWPM-Uu-SVyR+vb-YQ=7TWJsXA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
Date: Thu, 5 Sep 2013 11:36:19 -0700

> On Thu, Sep 5, 2013 at 11:17 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:
>> From: Jesse Gross <jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
>> Date: Thu,  5 Sep 2013 10:41:27 -0700
>>
>>> -} __aligned(__alignof__(long));
>>> +} __aligned(8); /* 8 byte alignment ensures this can be accessed as a long */
>>
>> This kind of stuff drives me crazy.
>>
>> If the issue is the type, therefore at least use an expression that
>> mentions the type explicitly.  And mention the actual type that
>> matters.  "long" isn't it.
> 
> 'long' actually is the real type here.
> 
> When doing comparisons, this structure is being accessed as a byte
> array in 'long' sized chunks, not by its members. Therefore, the
> compiler's alignment does not necessarily correspond to anything for
> this purpose. It could be a struct full of u16's and we would still
> want to access it in chunks of 'long'.
> 
> To completely honest, I think the correct alignment should be
> sizeof(long) because I know that 'long' is not always 8 bytes on all
> architectures. However, you made the point before that this could
> break the alignment of the 64-bit values on architectures where 'long'
> is 32 bits wide, so 8 bytes is the generic solution.

Look at net/core/flow.c:flow_key_compare().

And then we annotate struct flowi with

} __attribute__((__aligned__(BITS_PER_LONG/8)));

Don't reinvent the wheel, either mimick how existing code does
this kind of thing or provide a justification for doing it differently
and update the existing cases to match and be consistent.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox