Netdev List


* Re: [PATCH] net: dont use __netdev_alloc_skb for bounce buffer
From: Stefan Bader @ 2012-07-02 20:03 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1341254172.22621.456.camel@edumazet-glaptop>

[-- Attachment #1: Type: text/plain, Size: 2780 bytes --]

I can confirm that, with the below patch applied, at least the b44 regression is
fixed and network is usable again.

-Stefan

On 02.07.2012 20:36, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> commit a1c7fff7e1 (net: netdev_alloc_skb() use build_skb()) broke b44 on
> some 64bit machines.
> 
> It appears b44 and b43 use __netdev_alloc_skb() instead of alloc_skb()
> for their bounce buffers.
> 
> There is no need to add an extra NET_SKB_PAD reservation for bounce
> buffers :
> 
> - In TX path, NET_SKB_PAD is useless
> 
> - In RX path in b44, we force a copy of incoming frames if
>   GFP_DMA allocations were needed.
> 
> Reported-and-bisected-by: Stefan Bader <stefan.bader@canonical.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
>  drivers/net/ethernet/broadcom/b44.c  |    4 ++--
>  drivers/net/wireless/b43legacy/dma.c |    2 +-
>  2 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
> index 46b8b7d..d09c6b5 100644
> --- a/drivers/net/ethernet/broadcom/b44.c
> +++ b/drivers/net/ethernet/broadcom/b44.c
> @@ -656,7 +656,7 @@ static int b44_alloc_rx_skb(struct b44 *bp, int src_idx, u32 dest_idx_unmasked)
>  			dma_unmap_single(bp->sdev->dma_dev, mapping,
>  					     RX_PKT_BUF_SZ, DMA_FROM_DEVICE);
>  		dev_kfree_skb_any(skb);
> -		skb = __netdev_alloc_skb(bp->dev, RX_PKT_BUF_SZ, GFP_ATOMIC|GFP_DMA);
> +		skb = alloc_skb(RX_PKT_BUF_SZ, GFP_ATOMIC | GFP_DMA);
>  		if (skb == NULL)
>  			return -ENOMEM;
>  		mapping = dma_map_single(bp->sdev->dma_dev, skb->data,
> @@ -967,7 +967,7 @@ static netdev_tx_t b44_start_xmit(struct sk_buff *skb, struct net_device *dev)
>  			dma_unmap_single(bp->sdev->dma_dev, mapping, len,
>  					     DMA_TO_DEVICE);
>  
> -		bounce_skb = __netdev_alloc_skb(dev, len, GFP_ATOMIC | GFP_DMA);
> +		bounce_skb = alloc_skb(len, GFP_ATOMIC | GFP_DMA);
>  		if (!bounce_skb)
>  			goto err_out;
>  
> diff --git a/drivers/net/wireless/b43legacy/dma.c b/drivers/net/wireless/b43legacy/dma.c
> index f1f8bd0..c8baf02 100644
> --- a/drivers/net/wireless/b43legacy/dma.c
> +++ b/drivers/net/wireless/b43legacy/dma.c
> @@ -1072,7 +1072,7 @@ static int dma_tx_fragment(struct b43legacy_dmaring *ring,
>  	meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
>  	/* create a bounce buffer in zone_dma on mapping failure. */
>  	if (b43legacy_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
> -		bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
> +		bounce_skb = alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
>  		if (!bounce_skb) {
>  			ring->current_slot = old_top_slot;
>  			ring->used_slots = old_used_slots;
> 
> 




[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 900 bytes --]

^ permalink raw reply

* [PATCH] sctp: refactor sctp_packet_append_chunk and cleanup some memory leaks
From: Neil Horman @ 2012-07-02 19:59 UTC (permalink / raw)
  To: netdev; +Cc: Neil Horman, Vlad Yasevich, David S. Miller, linux-sctp

While doing some recent work on sctp sack bundling I noted that
sctp_packet_append_chunk was pretty inefficient.  Specifically, it was called
recursively while trying to bundle auth and sack chunks.  Because of that we
call sctp_packet_bundle_sack and sctp_packet_bundle_auth a total of 4 times for
every call to sctp_packet_append_chunk, knowing that at least 3 of those calls
will do nothing.

So let's refactor sctp_packet_append_chunk to have an outer part that does the
attempted bundling, and an inner part that just does the chunk appends.  This
saves us several calls per iteration that we just don't need.

Also, I noticed that the auth and sack bundling fail to free the chunks they
allocate if the append fails, so make sure we add that in.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Vlad Yasevich <vyasevich@gmail.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: linux-sctp@vger.kernel.org
---
 net/sctp/output.c |   80 +++++++++++++++++++++++++++++++++++------------------
 1 files changed, 53 insertions(+), 27 deletions(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 0de6cd5..0b62f6c 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -64,6 +64,8 @@
 #include <net/sctp/checksum.h>
 
 /* Forward declarations for private helpers. */
+static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
+					      struct sctp_chunk *chunk);
 static sctp_xmit_t sctp_packet_can_append_data(struct sctp_packet *packet,
 					   struct sctp_chunk *chunk);
 static void sctp_packet_append_data(struct sctp_packet *packet,
@@ -224,7 +226,10 @@ static sctp_xmit_t sctp_packet_bundle_auth(struct sctp_packet *pkt,
 	if (!auth)
 		return retval;
 
-	retval = sctp_packet_append_chunk(pkt, auth);
+	retval = __sctp_packet_append_chunk(pkt, auth);
+
+	if (retval != SCTP_XMIT_OK)
+		sctp_chunk_free(auth);
 
 	return retval;
 }
@@ -256,48 +261,31 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
 			asoc->a_rwnd = asoc->rwnd;
 			sack = sctp_make_sack(asoc);
 			if (sack) {
-				retval = sctp_packet_append_chunk(pkt, sack);
+				retval = __sctp_packet_append_chunk(pkt, sack);
+				if (retval != SCTP_XMIT_OK) {
+					sctp_chunk_free(sack);
+					goto out;
+				}
 				asoc->peer.sack_needed = 0;
 				if (del_timer(timer))
 					sctp_association_put(asoc);
 			}
 		}
 	}
+out:
 	return retval;
 }
 
+
 /* Append a chunk to the offered packet reporting back any inability to do
  * so.
  */
-sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
-				     struct sctp_chunk *chunk)
+static sctp_xmit_t __sctp_packet_append_chunk(struct sctp_packet *packet,
+					      struct sctp_chunk *chunk)
 {
 	sctp_xmit_t retval = SCTP_XMIT_OK;
 	__u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
 
-	SCTP_DEBUG_PRINTK("%s: packet:%p chunk:%p\n", __func__, packet,
-			  chunk);
-
-	/* Data chunks are special.  Before seeing what else we can
-	 * bundle into this packet, check to see if we are allowed to
-	 * send this DATA.
-	 */
-	if (sctp_chunk_is_data(chunk)) {
-		retval = sctp_packet_can_append_data(packet, chunk);
-		if (retval != SCTP_XMIT_OK)
-			goto finish;
-	}
-
-	/* Try to bundle AUTH chunk */
-	retval = sctp_packet_bundle_auth(packet, chunk);
-	if (retval != SCTP_XMIT_OK)
-		goto finish;
-
-	/* Try to bundle SACK chunk */
-	retval = sctp_packet_bundle_sack(packet, chunk);
-	if (retval != SCTP_XMIT_OK)
-		goto finish;
-
 	/* Check to see if this chunk will fit into the packet */
 	retval = sctp_packet_will_fit(packet, chunk, chunk_len);
 	if (retval != SCTP_XMIT_OK)
@@ -339,6 +327,44 @@ finish:
 	return retval;
 }
 
+/* Append a chunk to the offered packet reporting back any inability to do
+ * so.
+ */
+sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
+				     struct sctp_chunk *chunk)
+{
+	sctp_xmit_t retval = SCTP_XMIT_OK;
+	__u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
+
+	SCTP_DEBUG_PRINTK("%s: packet:%p chunk:%p\n", __func__, packet,
+			  chunk);
+
+	/* Data chunks are special.  Before seeing what else we can
+	 * bundle into this packet, check to see if we are allowed to
+	 * send this DATA.
+	 */
+	if (sctp_chunk_is_data(chunk)) {
+		retval = sctp_packet_can_append_data(packet, chunk);
+		if (retval != SCTP_XMIT_OK)
+			goto finish;
+	}
+
+	/* Try to bundle AUTH chunk */
+	retval = sctp_packet_bundle_auth(packet, chunk);
+	if (retval != SCTP_XMIT_OK)
+		goto finish;
+
+	/* Try to bundle SACK chunk */
+	retval = sctp_packet_bundle_sack(packet, chunk);
+	if (retval != SCTP_XMIT_OK)
+		goto finish;
+
+	retval = __sctp_packet_append_chunk(packet, chunk);
+
+finish:
+	return retval;
+}
+
 /* All packets are sent to the network through this function from
  * sctp_outq_tail().
  *
-- 
1.7.7.6

^ permalink raw reply related
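
The outer/inner split and the free-on-failure rule described in the patch
above can be sketched in plain userspace C. This is only an illustration under
stated assumptions: struct packet, struct chunk, append_inner(), bundle_extra()
and append_outer() are made-up stand-ins, not the SCTP code, and the capacity
check is a simplification of the real "will it fit" logic.

```c
#include <stdbool.h>
#include <stdlib.h>

#define MAX_CHUNKS 8                 /* hypothetical per-packet limit */

enum xmit { XMIT_OK, XMIT_FULL };

struct chunk { int len; };

struct packet {
	int used, capacity, nchunks;
	bool need_extra;             /* stands in for "a SACK/AUTH is pending" */
	struct chunk *chunks[MAX_CHUNKS];
};

static int live_chunks;              /* allocation counter, makes leaks visible */

static struct chunk *chunk_new(int len)
{
	struct chunk *c = malloc(sizeof(*c));

	if (c) {
		c->len = len;
		live_chunks++;
	}
	return c;
}

static void chunk_free(struct chunk *c)
{
	free(c);
	live_chunks--;
}

/* Inner part: only the fit check and the append, no bundling. */
static enum xmit append_inner(struct packet *p, struct chunk *c)
{
	if (p->nchunks == MAX_CHUNKS || p->used + c->len > p->capacity)
		return XMIT_FULL;
	p->chunks[p->nchunks++] = c; /* the packet owns c from here on */
	p->used += c->len;
	return XMIT_OK;
}

/* Bundling helper: it allocates the extra chunk itself, so it must free
 * that chunk when the append fails. */
static enum xmit bundle_extra(struct packet *p)
{
	struct chunk *extra;
	enum xmit ret;

	if (!p->need_extra)
		return XMIT_OK;
	extra = chunk_new(4);
	if (!extra)
		return XMIT_OK;      /* nothing to bundle this time */
	ret = append_inner(p, extra);
	if (ret != XMIT_OK) {
		chunk_free(extra);
		return ret;
	}
	p->need_extra = false;
	return XMIT_OK;
}

/* Outer part: attempt the bundling once, then append the caller's chunk. */
static enum xmit append_outer(struct packet *p, struct chunk *c)
{
	enum xmit ret = bundle_extra(p);

	if (ret != XMIT_OK)
		return ret;
	return append_inner(p, c);
}
```

Because bundle_extra() calls append_inner() rather than append_outer(), the
bundling attempt is not re-entered for the bundled chunk itself, which is the
saving the commit message describes.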

* Re: [PATCH 00/13] drivers: hv: kvp
From: Ben Hutchings @ 2012-07-02 19:57 UTC (permalink / raw)
  To: KY Srinivasan
  Cc: Olaf Hering, Greg KH, apw@canonical.com,
	devel@linuxdriverproject.org, virtualization@lists.osdl.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <426367E2313C2449837CD2DE46E7EAF9155EF399@SN2PRD0310MB382.namprd03.prod.outlook.com>

On Mon, Jul 02, 2012 at 03:22:25PM +0000, KY Srinivasan wrote:
> 
> 
> > -----Original Message-----
> > From: Olaf Hering [mailto:olaf@aepfle.de]
> > Sent: Thursday, June 28, 2012 10:24 AM
> > To: KY Srinivasan
> > Cc: Greg KH; apw@canonical.com; devel@linuxdriverproject.org;
> > virtualization@lists.osdl.org; linux-kernel@vger.kernel.org
> > Subject: Re: [PATCH 00/13] drivers: hv: kvp
> > 
> > On Tue, Jun 26, KY Srinivasan wrote:
> > 
> > > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > > The fact that it was Red Hat specific was the main part, this should be
> > > > done in a standard way, with standard tools, right?
> > >
> > > The reason I asked this question was to make sure I address these
> > > issues in addition to whatever I am debugging now. I use the standard
> > > tools and calls to retrieve all the IP configuration. As I look at
> > > each distribution the files they keep persistent IP configuration
> > > Information is different and that is the reason I chose to start with
> > > RedHat. If there is a standard way to store the configuration, I will
> > > do that.
> > 
> > 
> > KY,
> > 
> > instead of using system() in kvp_get_ipconfig_info and kvp_set_ip_info,
> > wouldn't it be easier to call an external helper script which does all
> > the distribution specific work? Just define some API to pass values to
> > the script, and something to read values collected by the script back
> > into the daemon.
> 
> On the "Get" side I mostly use standard commands/APIs to get all the information:
> 
> 1) IP address information and subnet mask: getifaddrs()
> 2) DNS information:  Parsing /etc/resolv.conf
> 3) /sbin/ip command for all the routing information

If you're interested in the *current* configuration then (1) and (3)
are OK but you should really use the rtnetlink API.

However, I suspect that Hyper-V assumes that current and persistent
configuration are the same thing, which is obviously not true in
general on Linux.  But if NetworkManager is running then you can
assume they are.

> 4)  Parse /etc/sysconfig/network-scripts/ifcfg-ethx for boot protocol
> 
> As you can see, all but the boot protocol is gathered using "standard"
> distro-independent mechanisms. I was looking at NetworkManager cli and it looks
> like I could gather all the information except the boot protocol information. I am 
> not sure how to gather the boot protocol information in a distro independent fashion.
> 
> On the SET side, I need to persistently store the settings in an appropriate configuration
> file and flush these settings down so that the interface is appropriately configured. It is here
> that I am struggling to find a distro independent way of doing things. It would be great if I can
> use NetworkManager cli (nmcli) to accomplish this. Any help here would be greatly appreciated.
[...]

What was wrong with the NetworkManager D-Bus API I pointed you at?
I don't see how it makes sense to use nmcli as an API.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
                                                              - Albert Camus

^ permalink raw reply
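
For item (1) in the message above, a minimal userspace sketch of reading the
current IPv4 addresses and netmasks with getifaddrs() could look like this. The
helper name print_ipv4_config() is made up for illustration. As noted in the
reply, this only reports the *current* configuration, says nothing about the
persistent one, and is not a substitute for the rtnetlink API.

```c
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Print "<ifname> <address>/<netmask>" for every IPv4 address currently
 * configured.  Returns the number of addresses printed, or -1 on error. */
static int print_ipv4_config(FILE *out)
{
	struct ifaddrs *list, *ifa;
	int count = 0;

	if (getifaddrs(&list) < 0)
		return -1;

	for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
		char addr[INET_ADDRSTRLEN], mask[INET_ADDRSTRLEN];

		/* Skip entries without an IPv4 address or netmask. */
		if (!ifa->ifa_addr || !ifa->ifa_netmask ||
		    ifa->ifa_addr->sa_family != AF_INET)
			continue;

		if (!inet_ntop(AF_INET,
			       &((struct sockaddr_in *)ifa->ifa_addr)->sin_addr,
			       addr, sizeof(addr)) ||
		    !inet_ntop(AF_INET,
			       &((struct sockaddr_in *)ifa->ifa_netmask)->sin_addr,
			       mask, sizeof(mask)))
			continue;

		fprintf(out, "%s %s/%s\n", ifa->ifa_name, addr, mask);
		count++;
	}

	freeifaddrs(list);
	return count;
}
```

Calling print_ipv4_config(stdout) lists one line per configured IPv4 address;
the output depends on the machine it runs on.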

* Re: [PATCH v2 0/2] Part 1: handle addr_assign_type for random addresses
From: Shuah Khan @ 2012-07-02 19:41 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: Danny Kukawka, David S. Miller, Danny Kukawka, netdev,
	linux-kernel
In-Reply-To: <1341257012.1987.2.camel@jtkirshe-mobl>

On Mon, 2012-07-02 at 12:23 -0700, Jeff Kirsher wrote:

> 
> It looks like it was accepted into kernel 3.3.  I am not aware of any
> stable kernels earlier than 3.3 that picked up the patch.

I checked the 3.3 source and didn't find it; it is in 3.4.

-- Shuah

^ permalink raw reply

* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: chetan loke @ 2012-07-02 19:41 UTC (permalink / raw)
  To: Andreas Gruenbacher
  Cc: Eric Dumazet, netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1341245191.2177.40.camel@schurl.lan>

On Mon, Jul 2, 2012 at 12:06 PM, Andreas Gruenbacher <agruen@linbit.com> wrote:
> On Mon, 2012-07-02 at 15:54 +0200, Eric Dumazet wrote:
>> So I will just say no to your patches, unless you demonstrate the
>> splice() problems, and how you can fix the alignment problem in a new
>> layer instead of in the existing zero copy standard one.
>
> Again, splice or not is not the issue here. It does not, by itself, allow zero
> copy from the network directly to disk but it could likely be made to support
> that if we can get the alignment right first.  The proposed MSG_NEW_PACKET flag
> helps with that, but maybe someone has a better idea.
>

Eric - by using splice do you mean something like:

int filedes[2];
#define PIPE_SIZE (64*1024)
pipe(filedes);
ret = splice (sock_fd_from, &from_offset, filedes [1], NULL, PIPE_SIZE,
                     SPLICE_F_MORE | SPLICE_F_MOVE);

ret = splice (filedes [0], NULL, file_fd_to,
                         &to_offset, ret,
                         SPLICE_F_MORE | SPLICE_F_MOVE);

i.e. splice-in from socket to pipe, and splice-out from pipe to destination?

Andreas - if the above assumption is true then can you apply the
'MSG_NEW_PACKET' on the sender and see if the above pseudo-splice code
achieves something similar to what you expect on the receive side(you
can also play w/ F_SETPIPE_SZ -  although I found very little
reduction in CPU usage)? Note: My personal experience - using splice
from an input-file-A to output-file-B bought very minimal cpu
reduction(yes, both the files used O_DIRECT). Instead, a simple
read/write w/ O_DIRECT from file-A to file-B was much much faster.

Chetan

^ permalink raw reply
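
The splice-in/splice-out pattern from the pseudo-code above can be written as
a small, self-contained helper. This is a sketch under assumptions, not code
from the thread: splice_copy() and PIPE_CHUNK are made-up names, error handling
is minimal, and NULL offsets are passed throughout because a socket is not
seekable (the file descriptors' own positions are used instead). It says
nothing about the alignment problem being discussed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* splice() is a Linux-specific extension */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#define PIPE_CHUNK (64 * 1024)  /* illustrative, mirrors the 64 KiB above */

/* Move up to len bytes from in_fd to out_fd through an intermediate pipe.
 * Returns the number of bytes moved (short on EOF), or -1 on error. */
static ssize_t splice_copy(int in_fd, int out_fd, size_t len)
{
	int pipefd[2];
	ssize_t total = 0;

	if (pipe(pipefd) < 0)
		return -1;

	while ((size_t)total < len) {
		size_t want = len - (size_t)total;
		ssize_t in, out;

		if (want > PIPE_CHUNK)
			want = PIPE_CHUNK;

		/* Step 1: source -> pipe. */
		in = splice(in_fd, NULL, pipefd[1], NULL, want,
			    SPLICE_F_MOVE | SPLICE_F_MORE);
		if (in == 0)
			break;          /* end of input */
		if (in < 0) {
			total = -1;
			break;
		}

		/* Step 2: pipe -> destination, draining everything queued. */
		for (ssize_t left = in; left > 0; left -= out) {
			out = splice(pipefd[0], NULL, out_fd, NULL,
				     (size_t)left,
				     SPLICE_F_MOVE | SPLICE_F_MORE);
			if (out <= 0) {
				in = -1;
				break;
			}
		}
		if (in < 0) {
			total = -1;
			break;
		}
		total += in;
	}

	close(pipefd[0]);
	close(pipefd[1]);
	return total;
}
```

The same helper works with a regular file as the source, which makes it easy
to try the file-A to file-B comparison mentioned above without a network peer.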

* Re: [PATCH v2 0/2] Part 1: handle addr_assign_type for random addresses
From: Jeff Kirsher @ 2012-07-02 19:23 UTC (permalink / raw)
  To: shuah.khan
  Cc: Danny Kukawka, David S. Miller, Danny Kukawka, netdev,
	linux-kernel
In-Reply-To: <1341255355.2750.23.camel@lorien2>

[-- Attachment #1: Type: text/plain, Size: 528 bytes --]

On Mon, 2012-07-02 at 12:55 -0600, Shuah Khan wrote:
> On Thu, 2012-02-09 at 12:09 -0800, Jeff Kirsher wrote:
> 
> > 
> > Thanks Danny, I will add both patches to my queue so that we can
> > validate the changes for ixgbevf and igbvf.
> 
> Jeff,
> 
> Which upstream kernel did this patch end up in? Also did it make it into
> any of the stable releases?
> 
> Thanks,
> -- Shuah
> 

It looks like it was accepted into kernel 3.3.  I am not aware of any
stable kernels earlier than 3.3 that picked up the patch.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH v2 0/2] Part 1: handle addr_assign_type for random addresses
From: Shuah Khan @ 2012-07-02 18:55 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: Danny Kukawka, David S. Miller, Danny Kukawka, netdev,
	linux-kernel
In-Reply-To: <1328818164.3639.15.camel@jtkirshe-mobl>

On Thu, 2012-02-09 at 12:09 -0800, Jeff Kirsher wrote:

> 
> Thanks Danny, I will add both patches to my queue so that we can
> validate the changes for ixgbevf and igbvf.

Jeff,

Which upstream kernel did this patch end up in? Also did it make it into
any of the stable releases?

Thanks,
-- Shuah

^ permalink raw reply

* [PATCH] net: dont use __netdev_alloc_skb for bounce buffer
From: Eric Dumazet @ 2012-07-02 18:36 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Stefan Bader

From: Eric Dumazet <edumazet@google.com>

commit a1c7fff7e1 (net: netdev_alloc_skb() use build_skb()) broke b44 on
some 64bit machines.

It appears b44 and b43 use __netdev_alloc_skb() instead of alloc_skb()
for their bounce buffers.

There is no need to add an extra NET_SKB_PAD reservation for bounce
buffers :

- In TX path, NET_SKB_PAD is useless

- In RX path in b44, we force a copy of incoming frames if
  GFP_DMA allocations were needed.

Reported-and-bisected-by: Stefan Bader <stefan.bader@canonical.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 drivers/net/ethernet/broadcom/b44.c  |    4 ++--
 drivers/net/wireless/b43legacy/dma.c |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/b44.c b/drivers/net/ethernet/broadcom/b44.c
index 46b8b7d..d09c6b5 100644
--- a/drivers/net/ethernet/broadcom/b44.c
+++ b/drivers/net/ethernet/broadcom/b44.c
@@ -656,7 +656,7 @@ static int b44_alloc_rx_skb(struct b44 *bp, int src_idx, u32 dest_idx_unmasked)
 			dma_unmap_single(bp->sdev->dma_dev, mapping,
 					     RX_PKT_BUF_SZ, DMA_FROM_DEVICE);
 		dev_kfree_skb_any(skb);
-		skb = __netdev_alloc_skb(bp->dev, RX_PKT_BUF_SZ, GFP_ATOMIC|GFP_DMA);
+		skb = alloc_skb(RX_PKT_BUF_SZ, GFP_ATOMIC | GFP_DMA);
 		if (skb == NULL)
 			return -ENOMEM;
 		mapping = dma_map_single(bp->sdev->dma_dev, skb->data,
@@ -967,7 +967,7 @@ static netdev_tx_t b44_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			dma_unmap_single(bp->sdev->dma_dev, mapping, len,
 					     DMA_TO_DEVICE);
 
-		bounce_skb = __netdev_alloc_skb(dev, len, GFP_ATOMIC | GFP_DMA);
+		bounce_skb = alloc_skb(len, GFP_ATOMIC | GFP_DMA);
 		if (!bounce_skb)
 			goto err_out;
 
diff --git a/drivers/net/wireless/b43legacy/dma.c b/drivers/net/wireless/b43legacy/dma.c
index f1f8bd0..c8baf02 100644
--- a/drivers/net/wireless/b43legacy/dma.c
+++ b/drivers/net/wireless/b43legacy/dma.c
@@ -1072,7 +1072,7 @@ static int dma_tx_fragment(struct b43legacy_dmaring *ring,
 	meta->dmaaddr = map_descbuffer(ring, skb->data, skb->len, 1);
 	/* create a bounce buffer in zone_dma on mapping failure. */
 	if (b43legacy_dma_mapping_error(ring, meta->dmaaddr, skb->len, 1)) {
-		bounce_skb = __dev_alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
+		bounce_skb = alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
 		if (!bounce_skb) {
 			ring->current_slot = old_top_slot;
 			ring->used_slots = old_used_slots;

^ permalink raw reply related

* Re: [PATCH] NFC: Prevent NULL deref when getting socket name
From: John W. Linville @ 2012-07-02 18:24 UTC (permalink / raw)
  To: Sasha Levin
  Cc: lauro.venancio, aloisio.almeida, sameo, linux-wireless, netdev,
	linux-kernel
In-Reply-To: <1341050207-13145-1-git-send-email-levinsasha928@gmail.com>

On Sat, Jun 30, 2012 at 11:56:47AM +0200, Sasha Levin wrote:
> llcp_sock_getname can be called without a device attached to the nfc_llcp_sock.
> 
> This would lead to the following BUG:
> 
> [  362.341807] BUG: unable to handle kernel NULL pointer dereference at           (null)
> [  362.341815] IP: [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
> [  362.341818] PGD 31b35067 PUD 30631067 PMD 0
> [  362.341821] Oops: 0000 [#627] PREEMPT SMP DEBUG_PAGEALLOC
> [  362.341826] CPU 3
> [  362.341827] Pid: 7816, comm: trinity-child55 Tainted: G      D W    3.5.0-rc4-next-20120628-sasha-00005-g9f23eb7 #479
> [  362.341831] RIP: 0010:[<ffffffff836258e5>]  [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
> [  362.341832] RSP: 0018:ffff8800304fde88  EFLAGS: 00010286
> [  362.341834] RAX: 0000000000000000 RBX: ffff880033cb8000 RCX: 0000000000000001
> [  362.341835] RDX: ffff8800304fdec4 RSI: ffff8800304fdec8 RDI: ffff8800304fdeda
> [  362.341836] RBP: ffff8800304fdea8 R08: 7ebcebcb772b7ffb R09: 5fbfcb9c35bdfd53
> [  362.341838] R10: 4220020c54326244 R11: 0000000000000246 R12: ffff8800304fdec8
> [  362.341839] R13: ffff8800304fdec4 R14: ffff8800304fdec8 R15: 0000000000000044
> [  362.341841] FS:  00007effa376e700(0000) GS:ffff880035a00000(0000) knlGS:0000000000000000
> [  362.341843] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  362.341844] CR2: 0000000000000000 CR3: 0000000030438000 CR4: 00000000000406e0
> [  362.341851] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  362.341856] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [  362.341858] Process trinity-child55 (pid: 7816, threadinfo ffff8800304fc000, task ffff880031270000)
> [  362.341858] Stack:
> [  362.341862]  ffff8800304fdea8 ffff880035156780 0000000000000000 0000000000001000
> [  362.341865]  ffff8800304fdf78 ffffffff83183b40 00000000304fdec8 0000006000000000
> [  362.341868]  ffff8800304f0027 ffffffff83729649 ffff8800304fdee8 ffff8800304fdf48
> [  362.341869] Call Trace:
> [  362.341874]  [<ffffffff83183b40>] sys_getpeername+0xa0/0x110
> [  362.341877]  [<ffffffff83729649>] ? _raw_spin_unlock_irq+0x59/0x80
> [  362.341882]  [<ffffffff810f342b>] ? do_setitimer+0x23b/0x290
> [  362.341886]  [<ffffffff81985ede>] ? trace_hardirqs_on_thunk+0x3a/0x3f
> [  362.341889]  [<ffffffff8372a539>] system_call_fastpath+0x16/0x1b
> [  362.341921] Code: 84 00 00 00 00 00 b8 b3 ff ff ff 48 85 db 74 54 66 41 c7 04 24 27 00 49 8d 7c 24 12 41 c7 45 00 60 00 00 00 48 8b 83 28 05 00 00 <8b> 00 41 89 44 24 04 0f b6 83 41 05 00 00 41 88 44 24 10 0f b6
> [  362.341924] RIP  [<ffffffff836258e5>] llcp_sock_getname+0x75/0xc0
> [  362.341925]  RSP <ffff8800304fde88>
> [  362.341926] CR2: 0000000000000000
> [  362.341928] ---[ end trace 6d450e935ee18bf3 ]---
> 
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

Samuel, I'm taking this one directly.

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

* Re: [PATCH net-next 06/10] {NET,IB}/mlx4: Add device managed flow steering firmware API
From: Ben Hutchings @ 2012-07-02 18:07 UTC (permalink / raw)
  To: David Miller; +Cc: ogerlitz, roland, yevgenyp, oren, netdev, hadarh
In-Reply-To: <20120702.013445.1273332212099485403.davem@davemloft.net>

On Mon, 2012-07-02 at 01:34 -0700, David Miller wrote:
> From: Or Gerlitz <ogerlitz@mellanox.com>
> Date: Mon, 2 Jul 2012 10:55:28 +0300
> 
> > On 7/2/2012 12:42 AM, David Miller wrote:
> >> [...] Module parameters stink because every driver is going to provide
> >> the knob differently, with a different name, and different
> >> semantics. This creates a terrible user experience, and I will not
> >> allow it.
> > 
> > OK, so if looking on what we are left with on the table, seems that
> > sysfs entry on the mlx4_core
> > level (as we do for the port link type {IB, Eth} or IB port MTU) could
> > be fine here, Roland, agree?
> 
> No way.
> 
> You have to create a real interface, that other vendors with similar
> chips can consistently use.

But there may not be enough commonality to define a non-vendor-specific
API.  And ethtool really isn't a good way to expose parameters that are
per-controller rather than per-net-device, particularly if changing them
may disrupt all running net devices on that controller and not just the
one used to invoke SIOCETHTOOL.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH net-next v3] em_canid: Ematch rule to match CAN frames according to their identifiers
From: Oliver Hartkopp @ 2012-07-02 18:03 UTC (permalink / raw)
  To: Rostislav Lisovy; +Cc: netdev, linux-can, lartc, pisa, sojkam1
In-Reply-To: <4FF1DF65.5080306@hartkopp.net>

One more simplification:


>> +	rulescnt = len / sizeof(struct can_filter);
>> +
>> +	cm = kzalloc(sizeof(struct canid_match) + sizeof(struct can_filter) *
>> +		rulescnt, GFP_KERNEL);


No need to multiply the value again  ... you can take the len value as-is:

cm = kzalloc(sizeof(struct canid_match) + len, GFP_KERNEL);


> 
> *cm is no longer a fixed structure as it was in the first patches.
> 
> Must be:
> 
> m->datalen = sizeof(struct canid_match) + sizeof(struct can_filter) * rulescnt


ditto:

m->datalen = sizeof(struct canid_match) + len;

Regards,
Oliver

^ permalink raw reply
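
The simplification above relies on len already being validated as a multiple
of the filter size, so multiplying the rule count back out gives len again. A
userspace sketch of that arithmetic follows. The struct layouts,
RULES_MAX_SKETCH and canid_alloc_size() are assumptions made for illustration
and are not taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for struct can_filter and struct canid_match. */
struct can_filter_sketch {
	uint32_t can_id;
	uint32_t can_mask;
};

struct canid_match_sketch {
	int rules_count;
	int eff_rules_count;
	int sff_rules_count;
	struct can_filter_sketch rules_raw[]; /* variable-length tail */
};

#define RULES_MAX_SKETCH 32 /* placeholder limit, not the real EM_CAN_RULES_MAX */

/* Return the total allocation size for a rule blob of len bytes,
 * or 0 when len fails the same checks the quoted code performs. */
static size_t canid_alloc_size(size_t len)
{
	if (len == 0 || len % sizeof(struct can_filter_sketch) != 0)
		return 0;
	if (len > sizeof(struct can_filter_sketch) * RULES_MAX_SKETCH)
		return 0;

	/* len == rulescnt * sizeof(filter) here, so use it as-is. */
	return sizeof(struct canid_match_sketch) + len;
}
```

The same value is what m->datalen would carry, since the header plus len
describes the whole variable-length object.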

* [PATCH net] net: qmi_wwan: add ZTE MF60
From: Bjørn Mork @ 2012-07-02 17:53 UTC (permalink / raw)
  To: netdev; +Cc: Bjørn Mork

Adding a device with limited QMI support. It does not support
normal QMI_WDS commands for connection management. Instead,
sending a QMI_CTL SET_INSTANCE_ID command is required to
enable the network interface:

  01 0f 00 00 00 00 00 00  20 00 04 00 01 01 00 00

A number of QMI_DMS and QMI_NAS commands are also supported
for optional device management.

Signed-off-by: Bjørn Mork <bjorn@mork.no>
---
 drivers/net/usb/qmi_wwan.c |   18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index b01960f..5badafd 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -346,6 +346,15 @@ static const struct driver_info	qmi_wwan_force_int1 = {
 	.data		= BIT(1), /* interface whitelist bitmap */
 };
 
+static const struct driver_info	qmi_wwan_force_int2 = {
+	.description	= "Qualcomm WWAN/QMI device",
+	.flags		= FLAG_WWAN,
+	.bind		= qmi_wwan_bind_shared,
+	.unbind		= qmi_wwan_unbind_shared,
+	.manage_power	= qmi_wwan_manage_power,
+	.data		= BIT(2), /* interface whitelist bitmap */
+};
+
 static const struct driver_info	qmi_wwan_force_int3 = {
 	.description	= "Qualcomm WWAN/QMI device",
 	.flags		= FLAG_WWAN,
@@ -498,6 +507,15 @@ static const struct usb_device_id products[] = {
 		.bInterfaceProtocol = 0xff,
 		.driver_info        = (unsigned long)&qmi_wwan_force_int4,
 	},
+	{	/* ZTE MF60 */
+		.match_flags	    = USB_DEVICE_ID_MATCH_DEVICE | USB_DEVICE_ID_MATCH_INT_INFO,
+		.idVendor           = 0x19d2,
+		.idProduct          = 0x1402,
+		.bInterfaceClass    = 0xff,
+		.bInterfaceSubClass = 0xff,
+		.bInterfaceProtocol = 0xff,
+		.driver_info        = (unsigned long)&qmi_wwan_force_int2,
+	},
 	{	/* Sierra Wireless MC77xx in QMI mode */
 		.match_flags	    = USB_DEVICE_ID_MATCH_DEVICE | USB_DEVICE_ID_MATCH_INT_INFO,
 		.idVendor           = 0x1199,
-- 
1.7.10

^ permalink raw reply related
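
The .data field in the patch above is commented as an "interface whitelist
bitmap", with BIT(2) selecting interface number 2. How the driver consults that
bitmap is not shown in the patch, so the check below is only an assumption
about the idea, with made-up names (BIT_SKETCH, iface_whitelisted).

```c
#include <stdbool.h>

#define BIT_SKETCH(n) (1UL << (n)) /* local stand-in for the kernel's BIT() */

/* Return true when interface number ifnum is set in the whitelist bitmap. */
static bool iface_whitelisted(unsigned long bitmap, unsigned int ifnum)
{
	if (ifnum >= sizeof(bitmap) * 8)
		return false;      /* out of range for the bitmap */
	return (bitmap & BIT_SKETCH(ifnum)) != 0;
}
```

With a bitmap of BIT_SKETCH(2), only interface 2 passes the check, which
matches the intent of the qmi_wwan_force_int2 name.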

* Re: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: Eddie Wai @ 2012-07-02 17:53 UTC (permalink / raw)
  To: Michael Chan
  Cc: Dan Carpenter, David Laight, James E.J. Bottomley,
	Barak Witkowski, linux-scsi, netdev, David S. Miller
In-Reply-To: <1341242018.7472.5.camel@LTIRV-MCHAN1.corp.ad.broadcom.com>


On Mon, 2012-07-02 at 08:13 -0700, Michael Chan wrote:
> On Mon, 2012-07-02 at 13:48 +0300, Dan Carpenter wrote: 
> > On Mon, Jul 02, 2012 at 11:09:19AM +0100, David Laight wrote:
> > > > Subject: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for
> > > strings
> > > > 
> > > > DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
> > > > 12 bytes from the stack so it's a small information leak.
> > > > 
> > > > Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > > ---
> > > > This was just added to linux-next yesterday, but I'm not sure 
> > > > which tree it came from.
> > > > 
> > > > diff --git a/drivers/scsi/bnx2i/bnx2i_init.c 
> > > > b/drivers/scsi/bnx2i/bnx2i_init.c
> > > > index 7729a52..b17637a 100644
> > > > --- a/drivers/scsi/bnx2i/bnx2i_init.c
> > > > +++ b/drivers/scsi/bnx2i/bnx2i_init.c
> > > > @@ -400,7 +400,7 @@ int bnx2i_get_stats(void *handle)
> > > >  	if (!stats)
> > > >  		return -ENOMEM;
> > > >  
> > > > -	memcpy(stats->version, DRV_MODULE_VERSION,
> > > sizeof(stats->version));
> > > > +	strlcpy(stats->version, DRV_MODULE_VERSION,
> > > sizeof(stats->version));
> > > >  	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);
> > > 
> > > Doesn't that leak the original contents of the last bytes of
> > > stats->version instead?
> > 
> > I'm pretty sure we set those to zero in bnx2x_handle_drv_info_req().
> > 
> 
> Yes, bnx2x zeros the whole stats structure, so strlcpy() is correct.
> 
> This came from the net-next tree, so David is the right person to apply
> this.  Thanks.
> 
> Acked-by: Michael Chan <mchan@broadcom.com>
> 
True.  strlcpy() is the correct routine to use (instead of strncpy) as
this needs to be NUL-terminated.  Thanks.

Acked-by: Eddie Wai <eddie.wai@broadcom.com>




^ permalink raw reply
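
The point discussed above is that memcpy() with sizeof(stats->version) reads
12 bytes from an 8-byte string literal, while a strlcpy()-style copy stops at
the terminator. A userspace sketch of such a bounded copy follows. The names
bounded_copy() and VERSION_SKETCH are made up, and the helper is a stand-in
for strlcpy() rather than the kernel implementation.

```c
#include <stddef.h>
#include <string.h>

#define VERSION_SKETCH "2.7.2.2" /* 7 characters plus NUL, as in the thread */

/* Copy src into dst (capacity size), always NUL-terminating when size > 0.
 * Never reads past the end of src.  Returns strlen(src). */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t len = strlen(src);

	if (size > 0) {
		size_t n = len < size - 1 ? len : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return len;
}
```

Bytes of dst beyond the terminator are left untouched, which is why the
question about the previous contents of stats->version matters: the copy is
only leak-free if the destination was zeroed beforehand, as the thread says
bnx2x does.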

* Re: [PATCH net-next v3] em_canid: Ematch rule to match CAN frames according to their identifiers
From: Oliver Hartkopp @ 2012-07-02 17:50 UTC (permalink / raw)
  To: Rostislav Lisovy; +Cc: netdev, linux-can, lartc, pisa, sojkam1
In-Reply-To: <1341241568-13438-1-git-send-email-lisovy@gmail.com>

Ugh - sorry.

I still found some issues ...

On 02.07.2012 17:06, Rostislav Lisovy wrote:



> +
> +static int em_canid_change(struct tcf_proto *tp, void *data, int len,
> +			  struct tcf_ematch *m)
> +{
> +	struct can_filter *conf = data; /* Array with rules,
> +					 * fixed size EM_CAN_RULES_SIZE
> +					 */


Remove this comment.

It's only an "array with rules" - but EM_CAN_RULES_SIZE is absent in the code now.

> +	struct canid_match *cm;
> +	struct canid_match *cm_old = (struct canid_match *)m->data;
> +	int i;
> +	int rulescnt;
> +
> +	if (!len)
> +		return -EINVAL;
> +
> +	if (len % sizeof(struct can_filter))
> +		return -EINVAL;
> +
> +	if (len > sizeof(struct can_filter) * EM_CAN_RULES_MAX)
> +		return -EINVAL;
> +
> +	rulescnt = len / sizeof(struct can_filter);
> +
> +	cm = kzalloc(sizeof(struct canid_match) + sizeof(struct can_filter) *
> +		rulescnt, GFP_KERNEL);
> +	if (!cm)
> +		return -ENOMEM;
> +
> +	cm->sff_rules_count = 0;
> +	cm->eff_rules_count = 0;


These two lines are obsolete as you used kzalloc(), right?

> +	cm->rules_count = rulescnt;
> +
> +	/*
> +	 * We need two for() loops for copying rules into
> +	 * two contiguous areas in rules_raw
> +	 */
> +
> +	/* Process EFF frame rules*/
> +	for (i = 0; i < cm->rules_count; i++) {


use rulescnt instead of cm->rules_count (no need to dereference data)

> +		if (((conf[i].can_id & CAN_EFF_FLAG) &&
> +		    (conf[i].can_mask & CAN_EFF_FLAG)) ||
> +		    !(conf[i].can_mask & CAN_EFF_FLAG)) {
> +			memcpy(cm->rules_raw + cm->eff_rules_count,
> +				&conf[i],
> +				sizeof(struct can_filter));
> +
> +			cm->eff_rules_count++;
> +		} else {
> +			continue;
> +		}
> +	}
> +
> +	/* Process SFF frame rules */
> +	for (i = 0; i < cm->rules_count; i++) {


use rulescnt instead of cm->rules_count (no need to dereference data)

> +		if ((conf[i].can_id & CAN_EFF_FLAG) &&
> +		    (conf[i].can_mask & CAN_EFF_FLAG)) {



|| !(conf[i].can_mask & CAN_EFF_FLAG)) {

is missing here (must be the same as the condition above!)

> +			continue;
> +		} else {
> +			memcpy(cm->rules_raw
> +				+ cm->eff_rules_count
> +				+ cm->sff_rules_count,
> +				&conf[i], sizeof(struct can_filter));
> +
> +			cm->sff_rules_count++;
> +
> +			em_canid_sff_match_add(cm,
> +				conf[i].can_id, conf[i].can_mask);
> +		}
> +	}
> +
> +	m->datalen = sizeof(*cm);


*cm is no longer a fixed structure as it was in the first patches.

Must be:

m->datalen = sizeof(struct canid_match) + sizeof(struct can_filter) * rulescnt

> +	m->data = (unsigned long)cm;
> +


Sorry, that I didn't see that before :-(

Regards,
Oliver


^ permalink raw reply
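
The review comment above asks that the second loop skip exactly the rules the
first loop copied. One way to keep the two conditions from drifting apart is
to put the test in a single predicate and negate it in the second pass. The
sketch below shows that idea in userspace. struct filter_sketch,
is_eff_rule() and partition_rules() are made-up names, and EFF_FLAG_SKETCH is
a local stand-in for CAN_EFF_FLAG.

```c
#include <stdint.h>

#define EFF_FLAG_SKETCH 0x80000000U /* stand-in for CAN_EFF_FLAG */

struct filter_sketch {
	uint32_t can_id;
	uint32_t can_mask;
};

/* The condition from the first loop in the quoted patch. */
static int is_eff_rule(const struct filter_sketch *f)
{
	return ((f->can_id & EFF_FLAG_SKETCH) &&
		(f->can_mask & EFF_FLAG_SKETCH)) ||
	       !(f->can_mask & EFF_FLAG_SKETCH);
}

/* Copy conf[0..n) into out[] as two contiguous areas: first the rules for
 * which is_eff_rule() holds, then all the others.  Returns the size of the
 * first area; the second area holds the remaining n - (return value). */
static int partition_rules(const struct filter_sketch *conf, int n,
			   struct filter_sketch *out)
{
	int i, eff = 0, sff = 0;

	for (i = 0; i < n; i++)
		if (is_eff_rule(&conf[i]))
			out[eff++] = conf[i];

	for (i = 0; i < n; i++)
		if (!is_eff_rule(&conf[i]))
			out[eff + sff++] = conf[i];

	return eff;
}
```

Since both passes use the same predicate, every rule lands in exactly one of
the two areas and the counts always add up to n.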

* Re: [PATCH net-next 09/15] net: bus: Add garbage collector for AF_BUS sockets.
From: Ben Hutchings @ 2012-07-02 17:44 UTC (permalink / raw)
  To: Vincent Sanders
  Cc: netdev, linux-kernel, David S. Miller, Javier Martinez Canillas
In-Reply-To: <1340988354-26981-10-git-send-email-vincent.sanders@collabora.co.uk>

On Fri, 2012-06-29 at 17:45 +0100, Vincent Sanders wrote:
> From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
> 
> This patch adds a garbage collector for AF_BUS sockets.
[...]
> +struct sock *bus_get_socket(struct file *filp)
> +{
> +	struct sock *u_sock = NULL;
> +	struct inode *inode = filp->f_path.dentry->d_inode;
> +
> +	/*
> +	 *	Socket ?
> +	 */
> +	if (S_ISSOCK(inode->i_mode) && !(filp->f_mode & FMODE_PATH)) {
> +		struct socket *sock = SOCKET_I(inode);
> +		struct sock *s = sock->sk;
> +
> +		/*
> +		 *	PF_BUS ?
> +		 */
> +		if (s && sock->ops && sock->ops->family == PF_BUS)
> +			u_sock = s;
> +	}
> +	return u_sock;
> +}
[...]

What about reference cycles involving both AF_BUS and AF_UNIX sockets?
I think you must either specifically prevent passing AF_UNIX sockets
through AF_BUS sockets, or make a single garbage collector handle them
both.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: AW: AW: RFC: replace packets already in queue
From: Rick Jones @ 2012-07-02 17:25 UTC (permalink / raw)
  To: Erdt, Ralph; +Cc: Eric Dumazet, netdev@vger.kernel.org
In-Reply-To: <FB112703C4930F4ABEBB5B763F96491139379643@MAILSERV2A.lorien.fkie.fgan.de>

On 07/02/2012 01:38 AM, Erdt, Ralph wrote:
> I did not talking about W-LAN (802.11). I'm talking about an
> property technology which is able to send over KILOMETERs (WLAN <
> 100m) but with VERY low bandwidth: 9600 bit (no Mega, Giga or Kilo!)
> (W-LAN: slowest: 1Mbit). The devices is loosely connected to our
> boxes: No linux driver but a program which create an virtual network
> device. This just sends one packet to the devices and then waits for
> the acknowledgement that the packet was sent. THEN the next packet
> will be send. There is no further queue, because the wireless is so
> lame, that there is no need for that! (BTW: the qdisc and the
> connector are distinct problems/programs. There is no dependency.)

>
>
>> Most packets don't stay in qdisc but are sitting in wireless
>> driver, unless you really flood it. If it happens, you already are
>> in trouble.
>
> We ARE in trouble... :-/

While you may need to tweak some of the constants based on your bitrate
and MTU size (which if the former is 9600 bits per second I trust the
latter is rather smaller than 1500 bytes since that would be well over a
second to transmit... and so the RTT there will be large), it sounds 
like codel will do good things for you.  It won't replace the packet, 
but its use of drop on dequeue will I believe accomplish substantially 
the same effect.
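
To put a number on "well over a second": a small sketch of the serialisation delay for one frame on a slow link. The helper name tx_time_ms() is made up for illustration and ignores any framing overhead of the radio.

```c
#include <assert.h>
#include <stdint.h>

/* Time to serialise one frame onto the link, in milliseconds (rounded up). */
static uint64_t tx_time_ms(uint32_t frame_bytes, uint32_t link_bps)
{
	assert(link_bps > 0);
	uint64_t bits = (uint64_t)frame_bytes * 8;
	return (bits * 1000 + link_bps - 1) / link_bps;
}
```

With these assumptions, tx_time_ms(1500, 9600) gives 1250 ms, i.e. a full-size Ethernet frame occupies the link for 1.25 seconds, which is why a much smaller MTU and adjusted queueing constants are likely needed.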

rick jones

^ permalink raw reply

* Re: [PATCH net-next v3] em_canid: Ematch rule to match CAN frames according to their identifiers
From: Oliver Hartkopp @ 2012-07-02 17:04 UTC (permalink / raw)
  To: Rostislav Lisovy; +Cc: netdev, linux-can, lartc, pisa, sojkam1
In-Reply-To: <1341241568-13438-1-git-send-email-lisovy@gmail.com>

On 02.07.2012 17:06, Rostislav Lisovy wrote:

> This ematch makes it possible to classify CAN frames (AF_CAN) according
> to their identifiers. This functionality can not be easily achieved with
> existing classifiers, such as u32, because CAN identifier is always stored
> in native endianness, whereas u32 expects Network byte order.
> 
> Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>


Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>

Thanks Rostislav!

^ permalink raw reply

* Re: AF_BUS socket address family
From: Alban Crequy @ 2012-07-02 16:46 UTC (permalink / raw)
  To: Hans-Peter Jansen; +Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller
In-Reply-To: <201206302241.08662.hpj@urpla.net>

Sat, 30 Jun 2012 22:41:08 +0200,
"Hans-Peter Jansen" <hpj@urpla.net> wrote :

> Dear Vincent,
> 
> On Friday 29 June 2012, 18:45:39 Vincent Sanders wrote:
> > This series adds the bus address family (AF_BUS) it is against
> > net-next as of yesterday.
> >
> > AF_BUS is a message oriented inter process communication system.
> >
> > The principle features are:
> >
> >  - Reliable datagram based communication (all sockets are of type
> >    SOCK_SEQPACKET)
> >
> >  - Multicast message delivery (one to many, unicast as a subset)
> >
> >  - Strict ordering (messages are delivered to every client in the
> > same order)
> >
> >  - Ability to pass file descriptors
> >
> >  - Ability to pass credentials
> >
> > The basic concept is to provide a virtual bus on which multiple
> > processes can communicate and policy is imposed by a "bus master".
> >
> > Introduction
> > ------------
> >
> > AF_BUS is based upon AF_UNIX but extended for multicast operation and
> > removes stream operation, responding to extensive feedback on
> > previous approaches we have made the implementation as isolated as
> > possible. There are opportunities in the future to integrate the
> > socket garbage collector with that of the unix socket implementation.
> >
> > The impetus for creating this IPC mechanism is to replace the
> > underlying transport for D-Bus. The D-Bus system currently emulates
> > this IPC mechanism using AF_UNIX sockets in userspace and has
> > numerous undesirable behaviours. D-Bus is now widely deployed in many
> > areas and has become a de-facto IPC standard. Using this IPC
> > mechanism as a transport gives a significant (100% or more)
> > improvement to throughput with comparable improvement to latency.
> 
> Your introduction is missing a comprehensive "Discussion" section, where 
> you compare the AF_UNIX based implementation with AF_BUS ones. 
> 
> You should elaborate on each of the above noted undesirable behaviours, 
> why and how AF_BUS is advantageous. Show the workarounds, that are 
> needed by AF_UNIX to operate (properly?!?) and how the new 
> implementation is going to improve this situation.

Hi Hans-Peter,

Thanks for your feedback. I would like to elaborate on the priority
inversion and on the latency.

Priority inversion:
===================

A bus can have users with different priorities. The classical example was
Nokia's N900 phone. An incoming phone call should query the contact
database, start the correct ringtone, display the correct avatar very
quickly. Other background tasks don't have the same priority. Since all
messages go through dbus-daemon, it is a single bottleneck and the
kernel has no way to schedule the processes with the correct
priorities. Low priority messages are waking up dbus-daemon as much as
high priority messages.

A workaround was to set the nice level of dbus-daemon to -5. It didn't
really address the priority inversion, but it reduces the number of
context switches on multicast messages, and that helped a bit. The
diagram "Experiment #3" on this blog post shows dbus-daemon is no
longer context switched for every recipient of a multicast message:
http://alban-apinc.blogspot.co.uk/2011/12/importance-of-scheduling-priority-in-d.html
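
For reference, a sketch of how such a nice-level workaround can be applied from C with the standard setpriority()/getpriority() calls. The wrapper names set_nice_level() and get_nice_level() are invented for this example, and this is not how the N900 setup actually did it.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/types.h>

/*
 * Set the nice value of process `pid` (0 means the calling process).
 * Returns 0 on success, or -1 with errno set. Lowering the value below
 * its current level (e.g. to -5) normally needs CAP_SYS_NICE or root.
 */
static int set_nice_level(pid_t pid, int nice_level)
{
	assert(nice_level >= -20 && nice_level <= 19);
	return setpriority(PRIO_PROCESS, (id_t)pid, nice_level);
}

/* Read the nice value; getpriority() may legitimately return -1, so check errno. */
static int get_nice_level(pid_t pid, int *nice_level)
{
	errno = 0;
	int value = getpriority(PRIO_PROCESS, (id_t)pid);
	if (value == -1 && errno != 0)
		return -1;
	*nice_level = value;
	return 0;
}
```

A privileged launcher would call set_nice_level(daemon_pid, -5), where daemon_pid is whatever PID dbus-daemon runs under.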

With AF_BUS, there is no single process that has to receive all messages
from low priority processes and high priority processes. The kernel can
schedule the high priority processes and they can progress in their
communication without having dbus-daemon involved.

Latency:
========

On AF_UNIX, a message round-trip would go like this:
- the sender sends a message to dbus-daemon
- dbus-daemon receives it and forwards it to the correct recipient
- the recipient receives it and replies with a new message sent to
  dbus-daemon
- dbus-daemon receives the reply and forwards it to the initial sender
- the sender receives the reply.
There is a total of 4 context switches.

On AF_BUS, the messages are most of the time not routed by dbus-daemon,
which halves the number of context switches. This reduced the latency and
brought the performance improvement mentioned by Vincent.

> This will help to get some progress into the indurated discussion here.
> 
> Please also note, that, while your aims are nice and sound, it's even 
> more important for IPC mechanisms to operate properly - even during 
> persisting error conditions (crashed bus master and clients, 
> misbehaving or even abusing members). It would be cool to create a 
> D-BUS test rig, that not only measures performance numbers, but also 
> checks for dead locks, corner cases and abuse attempts in both IPC 
> implementations.
> 
> It's a juggling act: while AF_UNIX might suffer from downsides, the code 
> is heavily exercised in every aspect. Your implementation will only be 
> exercised by a handful of users (basically one lib), but in order to 
> rectify its existence in kernel space, such extensions need different 
> kinds of users, and the basic concepts need to fit in the whole kernel 
> picture as well, or you need to call it AF_DBUS with even less chance 
> to get it into mainstream.

I am hoping there will be more users with different use-cases and it
should help to improve AF_BUS and fix the unavoidable bugs in young
code. I would be happy if AF_BUS reduces the cost of maintaining the
out-of-tree multicast messaging protocol family based on AF_UNIX
mentioned by Chris Friesen.

Thank you!
Alban

^ permalink raw reply

* Re: linux-next: Tree for July 2 (pktgen)
From: Randy Dunlap @ 2012-07-02 16:19 UTC (permalink / raw)
  To: Stephen Rothwell; +Cc: linux-next, LKML, netdev
In-Reply-To: <20120702172334.2618cae84cc57b4ec5a63ed7@canb.auug.org.au>

On 07/02/2012 12:23 AM, Stephen Rothwell wrote:

> Hi all,
> 
> Changes since 20120629:



on i386:

net/built-in.o: In function `pktgen_if_write':
pktgen.c:(.text+0x1eaed): undefined reference to `__divdi3'


-- 

~Randy

^ permalink raw reply

* Re: [RFC] [TCP 0/3] Receive from socket into bio without copying
From: Andreas Gruenbacher @ 2012-07-02 16:06 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-kernel, Herbert Xu, David S. Miller
In-Reply-To: <1341237299.22621.88.camel@edumazet-glaptop>

On Mon, 2012-07-02 at 15:54 +0200, Eric Dumazet wrote:
> On Mon, 2012-07-02 at 15:02 +0200, Andreas Gruenbacher wrote:
> > bio_vec's have some alignment requirements that must be met, and
> > anything that doesn't meet those requirements can't be passed to the
> > block layer (without copying it first). Additional layers between
> > the
> > network and block layers, like a pipe, won't make that problem go
> > away.
> >
> 
> What are the "some alignment requirements" exactly, and how do you use
> TCP exactly to meet them ? (MSS= multiple of 512 ?)

Sectors of 512 bytes must be contiguous; some devices have additional
requirements (like 4k sectors).  I'm not sure if sectors always need to be
aligned, but if buffers are allocated page wise and handed out as half /
full pages, you get that automatically.
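
A small sketch of the check this implies for a received payload. It takes the conservative assumption that both the start address and the length must be multiples of the sector size, which the paragraph above leaves open. The name is_sector_aligned() and the SECTOR_SIZE constant are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u	/* classic sector size; some devices use 4096 */

/* True if `len` bytes at `addr` could be handed on as whole, aligned sectors. */
static int is_sector_aligned(uintptr_t addr, size_t len, size_t sector_size)
{
	/* sector_size must be a power of two for the mask test to be valid */
	assert(sector_size != 0 && (sector_size & (sector_size - 1)) == 0);
	return len != 0 &&
	       (addr & (sector_size - 1)) == 0 &&
	       (len & (sector_size - 1)) == 0;
}
```

A half page starting at a page-relative offset of 2048 passes for 512-byte sectors but fails for 4k sectors, while a payload that begins right after the protocol headers in the same buffer fails in both cases.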

> I believe you try to escape from the real problem.
> 
> If the NIC driver provides non aligned data, neither splice() or your
> new stuff will magically align it. You _need_ a copy in either cases.

Yes, the NIC must provide aligned data.  A prerequisite for that is that the
NIC knows how to align things.  With no knowledge of the application protocol,
the NIC can only use the packet boundaries as hints.  I'm trying to get tcp
to start new packets at specific points in the protocol so that the packet
boundaries will coincide with alignment boundaries.  With that, NICs that do
header splitting can receive packets into appropriately aligned buffers, and
the problem is solved.

> If NIC driver provides aligned data, splice(socket -> pipe) will keep
> this alignment for you at 0 cost.

Yes of course.  That is not the real issue here though.

> > It's not already there, it requires the alignment issue to be
> > addresses first.
> 
> There is no guarantee TCP payload is aligned to a bio, ever, in linux
> ethernet/ip/tcp stack.
> 
> Really, your patches work for you, by pure luck, because you use one
> particular NIC driver that happens to prepare things for you
> (presumably doing header split). Nothing guarantee this wont change even
> for the same hardware in linux-3.8

NICs with header splitting are common enough that you don't have to resort
to pure luck to get one.

> So I will just say no to your patches, unless you demonstrate the
> splice() problems, and how you can fix the alignment problem in a new
> layer instead of in the existing zero copy standard one.

Again, splice or not is not the issue here. It does not, by itself, allow zero
copy from the network directly to disk but it could likely be made to support
that if we can get the alignment right first.  The proposed MSG_NEW_PACKET flag
helps with that, but maybe someone has a better idea.

This doesn't have to work with arbitrary NICs and it most likely will hurt with
small MTUs, but then you can still choose not to use it.  It just has to almost
always work with some particular NICs and with large MTUs.

Andreas

^ permalink raw reply

* Re: [PATCH net-next 13/15] netfilter: nfdbus: Add D-bus message parsing
From: Javier Martinez Canillas @ 2012-07-02 15:43 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: Vincent Sanders, netdev, linux-kernel, David S. Miller,
	Alban Crequy
In-Reply-To: <20120629171108.GA6287@1984>

On 06/29/2012 07:11 PM, Pablo Neira Ayuso wrote:
> On Fri, Jun 29, 2012 at 05:45:52PM +0100, Vincent Sanders wrote:
>> From: Javier Martinez Canillas <javier.martinez@collabora.co.uk>
>> 
>> The netfilter D-Bus module needs to parse D-bus messages sent by
>> applications to decide whether a peer can receive or not a D-Bus
>> message. Add D-bus message parsing logic to be able to analyze.
> 
> Not talking about the entire patchset, only about the part I'm
> responsible for.
> 
> I don't see why you think this belong to netfilter at all.
> 
> This doesn't integrate into the existing filtering infrastructure,
> neither it extends it in any way.
> 

Hello Pablo,

Thanks a lot for your feedback.

This is the first of a set of patches that adds a netfilter module to parse
D-Bus messages, the complete patch-set is:

[PATCH 13/15] netfilter: nfdbus: Add D-bus message parsing
[PATCH 14/15] netfilter: nfdbus: Add D-bus match rule implementation
[PATCH 15/15] netfilter: add netfilter D-Bus module

Patches 13 and 14 just include D-Bus helper code to be used by the netfilter
module (added in patch 15), especially by the dbus_filter netfilter hook function.

For the next version of the series we will reorganize the patches so that the
D-Bus netfilter module is added first, with an empty dbus_filter function, and
the D-Bus helper code is added afterwards.

Also, we will move the nfdbus netfilter module to net/bus so that it is not
inside the netfilter core code.

Thanks a lot and best regards,
Javier

^ permalink raw reply

* RE: [PATCH 00/13] drivers: hv: kvp
From: KY Srinivasan @ 2012-07-02 15:22 UTC (permalink / raw)
  To: Olaf Hering
  Cc: Greg KH, apw@canonical.com, devel@linuxdriverproject.org,
	virtualization@lists.osdl.org, linux-kernel@vger.kernel.org,
	netdev@vger.kernel.org
In-Reply-To: <20120628142340.GA21537@aepfle.de>

> -----Original Message-----
> From: Olaf Hering [mailto:olaf@aepfle.de]
> Sent: Thursday, June 28, 2012 10:24 AM
> To: KY Srinivasan
> Cc: Greg KH; apw@canonical.com; devel@linuxdriverproject.org;
> virtualization@lists.osdl.org; linux-kernel@vger.kernel.org
> Subject: Re: [PATCH 00/13] drivers: hv: kvp
> 
> On Tue, Jun 26, KY Srinivasan wrote:
> 
> > > From: Greg KH [mailto:gregkh@linuxfoundation.org]
> > > The fact that it was Red Hat specific was the main part, this should be
> > > done in a standard way, with standard tools, right?
> >
> > The reason I asked this question was to make sure I address these
> > issues in addition to whatever I am debugging now. I use the standard
> > tools and calls to retrieve all the IP configuration. As I look at
> > each distribution the files they keep persistent IP configuration
> > Information is different and that is the reason I chose to start with
> > RedHat. If there is a standard way to store the configuration, I will
> > do that.
> 
> 
> KY,
> 
> instead of using system() in kvp_get_ipconfig_info and kvp_set_ip_info,
> wouldnt it be easier to call an external helper script which does all
> the distribution specific work? Just define some API to pass values to
> the script, and something to read values collected by the script back
> into the daemon.

On the "Get" side I mostly use standard commands/APIs to get all the information:

1) IP address information and subnet mask: getifaddrs()
2) DNS information:  Parsing /etc/resolv.conf
3) /sbin/ip command for all the routing information
4)  Parse /etc/sysconfig/network-scripts/ifcfg-ethx for boot protocol

As you can see, all but the boot protocol is gathered using the "standard" distro
independent mechanisms. I was looking at NetworkManager cli and it looks
like I could gather all the information except the boot protocol information. I am 
not sure how to gather the boot protocol information in a distro independent fashion.
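
As an illustration of item 2 in the list above, a sketch of pulling the DNS servers out of resolv.conf-style text. This is not the daemon's code: parse_nameservers() and NS_ADDR_LEN are names made up here, and reading /etc/resolv.conf into the buffer is left to the caller.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define NS_ADDR_LEN 64	/* enough for a textual IPv4 or IPv6 address */

/*
 * Collect the addresses named on "nameserver" lines of resolv.conf-style
 * text. Returns the number of addresses stored in out[] (at most max).
 */
static int parse_nameservers(const char *text, char out[][NS_ADDR_LEN], int max)
{
	static const char key[] = "nameserver";
	const size_t keylen = sizeof(key) - 1;
	int count = 0;

	assert(text != NULL && out != NULL && max >= 0);

	while (*text != '\0' && count < max) {
		const char *eol = strchr(text, '\n');
		size_t linelen = eol ? (size_t)(eol - text) : strlen(text);

		/* The keyword must start the line and be followed by a blank. */
		if (linelen > keylen && strncmp(text, key, keylen) == 0 &&
		    (text[keylen] == ' ' || text[keylen] == '\t')) {
			const char *p = text + keylen;
			const char *end = text + linelen;
			size_t n = 0;

			while (p < end && (*p == ' ' || *p == '\t'))
				p++;
			/* Copy up to the next whitespace or the end of the line. */
			while (p + n < end && !isspace((unsigned char)p[n]) &&
			       n < NS_ADDR_LEN - 1)
				n++;
			if (n > 0) {
				memcpy(out[count], p, n);
				out[count][n] = '\0';
				count++;
			}
		}

		text += linelen;
		if (*text == '\n')
			text++;
	}
	return count;
}
```

Comment lines (starting with '#' or ';') and other keywords such as "search" are skipped simply because they do not begin with the keyword.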

On the SET side, I need to persistently store the settings in an appropriate configuration
file and flush these settings down so that the interface is appropriately configured. It is here
that I am struggling to find a distro independent way of doing things. It would be great if I can
use NetworkManager cli (nmcli) to accomplish this. Any help here would be greatly appreciated.

While I toyed with your proposal, I feel it just pushes the problem out of the daemon code -
we would still need to write distro specific scripts. If this approach is something that everybody
is comfortable with, I can take a stab at implementing that. 

> 
> If the work is done in a script it will be much easier for an admin to
> debug and adjust it.
> 
> I think there is no standard way to configure all relevant distros in
> the same way. Maybe one day NetworkManager can finally handle all
> possible ways to configure network related things. But until that
> happens the config files need to be adjusted manually.
> 
> 
> 
> Some of the functions have deep indention levels due to 'while() {
> switch() }' usage. Perhaps such code could be moved into its own
> function so that lines dont need to be wrapped that much due to the odd
> 80 column limit.

I will take care of this. As suggested by Greg, I am adding netdev developers here to
seek their input. 

Regards,

K. Y

^ permalink raw reply

* Re: AF_BUS socket address family
From: Javier Martinez Canillas @ 2012-07-02 15:18 UTC (permalink / raw)
  To: Chris Friesen, David Miller, vincent.sanders, netdev,
	linux-kernel

On Mon, Jul 2, 2012 at 6:49 AM, Chris Friesen <chris.friesen@genband.com> wrote:
> On 06/29/2012 05:18 PM, David Miller wrote:
>>
>> From: Vincent Sanders<vincent.sanders@collabora.co.uk>
>> Date: Sat, 30 Jun 2012 00:12:37 +0100
>>
>>> I had hoped you would have at least read the opening list where I
>>> outlined the underlying features which explain why none of the
>>> existing IPC match the requirements.
>>
>> I had hoped that you had read the part we told you last time where
>> we explained why multicast and "reliable delivery" are fundamentally
>> incompatible attributes.
>>
>> We are not creating a full address family in the kernel which exists
>> for one, and only one, specific and difficult user.
>
>
> For what it's worth, the company I work for (and a number of other
> companies) currently use an out-of-tree datagram multicast messaging
> protocol family based on AF_UNIX.
>
> If AF_BUS were to be accepted, it would be essentially trivial for us to
> port our existing userspace messaging library to use it instead of our
> current protocol family, and we would almost certainly do so.
>
> I'd love to see AF_BUS go in.
>
> Chris Friesen
>

Hi Chris,

Thanks a lot for your comments and feedback.

We tried different approaches before developing the AF_BUS socket family and one
of them was extending AF_UNIX to support multicast. We posted our patches [1]
and the feedback was that the AF_UNIX code was already a complex and difficult
code to maintain. So, we decided to implement a new family (AF_BUS) that is
orthogonal to the rest of the networking stack, so that a user not using our
IPC solution pays neither added complexity nor a performance penalty.

Looking at the netdev archives I saw that you both raised the question about
multicast on unix sockets and posted an implementation in early 2003. So if I
understand correctly you have been maintaining an out-of-tree solution for
around 9 years now.

We developed AF_BUS to improve the performance of the D-Bus IPC system (and our
results show us a 2X speedup) but designed it to be as generic as possible so
other users can take advantage of it.

It would be a great help if you can join the discussion and explain the
arguments of your company (and the other companies you were talking about) in
favor of a simpler multicast socket family.

The fact that your company spent lots of engineering resources to maintain an
out-of-tree patch-set for 9 years should raise some eyebrows and convince more
than one person that a simpler local multicast solution is needed in the Linux
kernel (which was one of the reasons why Google also developed Binder I guess).

[1]: https://lkml.org/lkml/2012/2/20/84
[2]: https://lkml.org/lkml/2003/2/27/150
[3]: http://lwn.net/Articles/27001/

Thanks a lot and best regards,

Javier

^ permalink raw reply

* Re: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for strings
From: Michael Chan @ 2012-07-02 15:13 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: David Laight, James E.J. Bottomley, Barak Witkowski, Eddie Wai,
	linux-scsi, netdev, David S. Miller
In-Reply-To: <20120702104843.GB4519@mwanda>

On Mon, 2012-07-02 at 13:48 +0300, Dan Carpenter wrote: 
> On Mon, Jul 02, 2012 at 11:09:19AM +0100, David Laight wrote:
> > > Subject: [patch] [SCSI] bnx2i: use strlcpy() instead of memcpy() for
> > strings
> > > 
> > > DRV_MODULE_VERSION here is "2.7.2.2" which is only 8 chars but we copy
> > > 12 bytes from the stack so it's a small information leak.
> > > 
> > > Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> > > ---
> > > This was just added to linux-next yesterday, but I'm not sure 
> > > which tree it came from.
> > > 
> > > diff --git a/drivers/scsi/bnx2i/bnx2i_init.c 
> > > b/drivers/scsi/bnx2i/bnx2i_init.c
> > > index 7729a52..b17637a 100644
> > > --- a/drivers/scsi/bnx2i/bnx2i_init.c
> > > +++ b/drivers/scsi/bnx2i/bnx2i_init.c
> > > @@ -400,7 +400,7 @@ int bnx2i_get_stats(void *handle)
> > >  	if (!stats)
> > >  		return -ENOMEM;
> > >  
> > > -	memcpy(stats->version, DRV_MODULE_VERSION,
> > sizeof(stats->version));
> > > +	strlcpy(stats->version, DRV_MODULE_VERSION,
> > sizeof(stats->version));
> > >  	memcpy(stats->mac_add1 + 2, hba->cnic->mac_addr, ETH_ALEN);
> > 
> > Doesn't that leak the original contents of the last bytes of
> > stats->version instead?
> 
> I'm pretty sure we set those to zero in bnx2x_handle_drv_info_req().
> 

Yes, bnx2x zeros the whole stats structure, so strlcpy() is correct.
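
A userspace sketch of the point being discussed, under stated assumptions: struct drv_stats is a made-up stand-in with a 12-byte version field, and bounded_copy() is a local helper that mimics strlcpy() behaviour, since not every libc ships strlcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DRV_MODULE_VERSION "2.7.2.2"	/* 7 characters + NUL = 8 bytes */

/* Hypothetical stand-in for the stats structure: a 12-byte version field. */
struct drv_stats {
	char version[12];
};

/*
 * Userspace stand-in with strlcpy()-like behaviour: copy at most
 * size - 1 characters, always NUL-terminate, return strlen(src).
 * Bytes of dst after the terminator are left untouched.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
	size_t srclen = strlen(src);

	assert(dst != NULL);
	if (size != 0) {
		size_t n = srclen < size - 1 ? srclen : size - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';
	}
	return srclen;
}

/* Fill in the version field of a structure that the caller has zeroed. */
static void fill_version(struct drv_stats *stats)
{
	/*
	 * memcpy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version))
	 * would read 12 bytes from an 8-byte string: 4 bytes past its end.
	 */
	bounded_copy(stats->version, DRV_MODULE_VERSION, sizeof(stats->version));
}
```

The bounded copy never reads past the source string, and because it leaves the tail of the destination alone, the result is only leak-free when the destination was zeroed beforehand, which is the condition confirmed above.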

This came from the net-next tree, so David is the right person to apply
this.  Thanks.

Acked-by: Michael Chan <mchan@broadcom.com>



^ permalink raw reply
