Netdev List
* [PATCH 1/2] netfilter: add xt_priority xtables match
From: Willem de Bruijn @ 2012-12-05 19:22 UTC
  To: netfilter-devel, netdev, edumazet, davem, kaber, pablo; +Cc: Willem de Bruijn
In-Reply-To: <1354735339-13402-1-git-send-email-willemb@google.com>

Add an iptables match based on the skb->priority field. This field
can be set by socket option SO_PRIORITY, among others.

The match supports range based matching on packet priority, with
optional inversion. Before matching, a mask can be applied to the
priority field to handle the case where different regions of the
bitfield are reserved for unrelated uses.
---
 include/linux/netfilter/xt_priority.h |   13 ++++++++
 net/netfilter/Kconfig                 |    9 ++++++
 net/netfilter/Makefile                |    1 +
 net/netfilter/xt_priority.c           |   51 +++++++++++++++++++++++++++++++++
 4 files changed, 74 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/netfilter/xt_priority.h
 create mode 100644 net/netfilter/xt_priority.c

diff --git a/include/linux/netfilter/xt_priority.h b/include/linux/netfilter/xt_priority.h
new file mode 100644
index 0000000..da9a288
--- /dev/null
+++ b/include/linux/netfilter/xt_priority.h
@@ -0,0 +1,13 @@
+#ifndef _XT_PRIORITY_H
+#define _XT_PRIORITY_H
+
+#include <linux/types.h>
+
+struct xt_priority_info {
+	__u32 min;
+	__u32 max;
+	__u32 mask;
+	__u8  invert;
+};
+
+#endif /*_XT_PRIORITY_H */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index fefa514..c9739c6 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -1093,6 +1093,15 @@ config NETFILTER_XT_MATCH_PKTTYPE
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config NETFILTER_XT_MATCH_PRIORITY
+	tristate '"priority" match support'
+	depends on NETFILTER_ADVANCED
+	help
+	  This option adds a match based on the value of the sk_buff
+	  priority field.
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_QUOTA
 	tristate '"quota" match support'
 	depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index 3259697..8e5602f 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -124,6 +124,7 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_OWNER) += xt_owner.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PHYSDEV) += xt_physdev.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_PKTTYPE) += xt_pkttype.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_POLICY) += xt_policy.o
+obj-$(CONFIG_NETFILTER_XT_MATCH_PRIORITY) += xt_priority.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_QUOTA) += xt_quota.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_RATEEST) += xt_rateest.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_REALM) += xt_realm.o
diff --git a/net/netfilter/xt_priority.c b/net/netfilter/xt_priority.c
new file mode 100644
index 0000000..4982eee
--- /dev/null
+++ b/net/netfilter/xt_priority.c
@@ -0,0 +1,51 @@
+/* Xtables module to match packets based on their sk_buff priority field.
+ * Copyright 2012 Google Inc.
+ * Written by Willem de Bruijn <willemb@google.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+
+#include <linux/netfilter/xt_priority.h>
+#include <linux/netfilter/x_tables.h>
+
+MODULE_AUTHOR("Willem de Bruijn <willemb@google.com>");
+MODULE_DESCRIPTION("Xtables: priority filter match");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("ipt_priority");
+MODULE_ALIAS("ip6t_priority");
+
+static bool priority_mt(const struct sk_buff *skb,
+			struct xt_action_param *par)
+{
+	const struct xt_priority_info *info = par->matchinfo;
+
+	__u32 priority = skb->priority & info->mask;
+	return (priority >= info->min && priority <= info->max) ^ info->invert;
+}
+
+static struct xt_match priority_mt_reg __read_mostly = {
+	.name		= "priority",
+	.revision	= 0,
+	.family		= NFPROTO_UNSPEC,
+	.match		= priority_mt,
+	.matchsize	= sizeof(struct xt_priority_info),
+	.me		= THIS_MODULE,
+};
+
+static int __init priority_mt_init(void)
+{
+	return xt_register_match(&priority_mt_reg);
+}
+
+static void __exit priority_mt_exit(void)
+{
+	xt_unregister_match(&priority_mt_reg);
+}
+
+module_init(priority_mt_init);
+module_exit(priority_mt_exit);
-- 
1.7.7.3
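
The matching rule in priority_mt() can be checked outside the kernel with a small userspace model. This is a minimal sketch and not part of the patch: struct prio_rule and prio_match() are hypothetical names that mirror struct xt_priority_info and the match function, and the sketch normalizes invert to 0 or 1, which the posted code does not do.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of struct xt_priority_info. */
struct prio_rule {
	uint32_t min;
	uint32_t max;
	uint32_t mask;
	uint8_t  invert;
};

/* Mask the priority, test the inclusive range, then apply the inversion. */
static bool prio_match(uint32_t skb_priority, const struct prio_rule *rule)
{
	uint32_t priority = skb_priority & rule->mask;
	bool in_range = priority >= rule->min && priority <= rule->max;

	/* Assumption: any non-zero invert value means "invert". */
	return in_range != (rule->invert != 0);
}
```

With min 2, max 4 and mask 0xff, a priority of 0x103 matches because only the low byte (3) is compared against the range.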


* [PATCH rfc] netfilter: two xtables matches
From: Willem de Bruijn @ 2012-12-05 19:22 UTC
  To: netfilter-devel, netdev, edumazet, davem, kaber, pablo

The second patch is more speculative and aims to be a more general
workaround, as well as a performance optimization: support
(preferably JIT compiled) BPF programs as iptables match rules.

Potentially, the skb->priority match can be implemented by applying
only the second patch and adding a new BPF_S_ANC ancillary field to
Linux Socket Filters.

I also wrote corresponding userspace patches to iptables. The process
for submitting both kernel and user patches is not 100% clear to me.
Sending the kernel bits to both netdev and netfilter-devel for
initial feedback. Please correct me if you want it another way.

The patches apply to net-next.



* [PATCH net-next] ipv6: avoid taking locks at socket dismantle
From: Eric Dumazet @ 2012-12-05 19:18 UTC
  To: David Miller; +Cc: netdev

From: Eric Dumazet <edumazet@google.com>

ipv6_sock_mc_close() is called for ipv6 sockets at close time, and most
of them don't use multicast.

Add a test to avoid contention on a shared spinlock.

Same heuristic applies for ipv6_sock_ac_close(), to avoid contention
on a shared rwlock.

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 net/ipv6/anycast.c |    3 +++
 net/ipv6/mcast.c   |    3 +++
 2 files changed, 6 insertions(+)

diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 2f4f584..757a810 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -189,6 +189,9 @@ void ipv6_sock_ac_close(struct sock *sk)
 	struct net *net = sock_net(sk);
 	int	prev_index;
 
+	if (!np->ipv6_ac_list)
+		return;
+
 	write_lock_bh(&ipv6_sk_ac_lock);
 	pac = np->ipv6_ac_list;
 	np->ipv6_ac_list = NULL;
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index b19ed51..28dfa5f 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -284,6 +284,9 @@ void ipv6_sock_mc_close(struct sock *sk)
 	struct ipv6_mc_socklist *mc_lst;
 	struct net *net = sock_net(sk);
 
+	if (!rcu_access_pointer(np->ipv6_mc_list))
+		return;
+
 	spin_lock(&ipv6_sk_mc_lock);
 	while ((mc_lst = rcu_dereference_protected(np->ipv6_mc_list,
 				lockdep_is_held(&ipv6_sk_mc_lock))) != NULL) {
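
The pattern in both hunks is "test the list head before taking the shared lock". The following is a userspace sketch of that idea only, not kernel code: toy_sock, mc_entry and the counting spinlock are hypothetical stand-ins, and the sketch assumes nothing else can add entries to the socket's list while it is being closed.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical list entry and socket, standing in for the kernel types. */
struct mc_entry { struct mc_entry *next; };
struct toy_sock { struct mc_entry *mc_list; };

/* A shared spinlock, with a counter so the fast path can be observed. */
static atomic_flag mc_lock = ATOMIC_FLAG_INIT;
static unsigned long mc_lock_acquisitions;

static void mc_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&mc_lock, memory_order_acquire))
		; /* spin until the lock is free */
	mc_lock_acquisitions++;
}

static void mc_lock_release(void)
{
	atomic_flag_clear_explicit(&mc_lock, memory_order_release);
}

/* Close-time cleanup: sockets with an empty list never touch the lock. */
static void toy_sock_mc_close(struct toy_sock *sk)
{
	if (!sk->mc_list)
		return;

	mc_lock_acquire();
	sk->mc_list = NULL; /* the toy does not own or free the entries */
	mc_lock_release();
}
```

Closing a socket whose list is empty leaves mc_lock_acquisitions unchanged, which is the contention saving the patch is after.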


* Re: [PATCH] net/macb: increase RX buffer size for GEM
From: David Miller @ 2012-12-05 17:58 UTC
  To: nicolas.ferre; +Cc: netdev, linux-arm-kernel, linux-kernel, manabian, plagnioj
In-Reply-To: <50BF6366.8080600@atmel.com>

From: Nicolas Ferre <nicolas.ferre@atmel.com>
Date: Wed, 5 Dec 2012 16:08:22 +0100

> On 12/04/2012 07:22 PM, David Miller :
>> From: Nicolas Ferre <nicolas.ferre@atmel.com>
>> Date: Mon, 3 Dec 2012 13:15:43 +0100
>> 
>>> Macb Ethernet controller requires a RX buffer of 128 bytes. It is
>>> highly sub-optimal for Gigabit-capable GEM that is able to use
>>> a bigger DMA buffer. Change this constant and associated macros
>>> with data stored in the private structure.
>>> I also kept the result of buffers per page calculation to lower the
>>> impact of this move to a variable rx buffer size on rx hot path.
>>> RX DMA buffer size has to be multiple of 64 bytes as indicated in
>>> DMA Configuration Register specification.
>>>
>>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
>> 
>> This looks like it will waste a couple hundred bytes for 1500 MTU
>> frames, am I right?
> 
> Yep! But buffers get recycled, and with the current memory management by
> pages, it seems that I have to rework some part of it to optimize this
> memory usage (8KB memory blocks split into 5 buffers each as David said...).
> 
> Do you think it is worth digging this way or may I rework the rx buffer
> management in case of the GEM interface. If I implement a different path
> for GEM interface, I will have the possibility to tailor rx DMA buffers
> from 1500 Bytes up to 10KB jumbo frames...

I almost think you have to.


* Re: [PATCH] 3com: make 3c59x depend on HAS_IOPORT
From: David Miller @ 2012-12-05 17:58 UTC
  To: jang; +Cc: netdev
In-Reply-To: <1354716280.29038.2.camel@hal>

From: Jan Glauber <jang@linux.vnet.ibm.com>
Date: Wed, 05 Dec 2012 15:04:40 +0100

> From: Jan Glauber <jang@linux.vnet.ibm.com>
> 
> The 3com driver for 3c59x requires ioport_map. Since not all
> architectures support IO port mapping make 3c59x dependent on HAS_IOPORT.
> 
> Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>

Which platforms support PCI or EISA yet do not set HAS_IOPORT?


* Re: [PATCH] net: ICMPv6 packets transmitted on wrong interface if nfmark is mangled
From: David Miller @ 2012-12-05 17:57 UTC
  To: dries.dewinter; +Cc: pablo, kaber, netdev, netfilter-devel
In-Reply-To: <CA+e04fgiMDoaCx0fpdK5F6YfJ_CSOZ13q1fn9xs7De4JO02c6g@mail.gmail.com>

From: Dries De Winter <dries.dewinter@gmail.com>
Date: Wed, 5 Dec 2012 14:41:59 +0100

> My "noreroute" patch will not fix this. Therefore it's indeed maybe
> better to add a simple check to ip6_route_me_harder(): not a check for
> ICMPv6, but a check for (ipv6_addr_type(&iph->daddr) &
> IPV6_ADDR_LINKLOCAL) instead. What do you think?

What if a packet is rewritten from a non-link-local destination address
into a link-local one?  Or vice versa?

Your test will fail in those cases.


* Re: [PATCH net-next 0/7] Allow to monitor multicast cache event via rtnetlink
From: David Miller @ 2012-12-05 17:54 UTC
  To: David.Laight; +Cc: nicolas.dichtel, netdev
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70DA@saturn3.aculab.com>

From: "David Laight" <David.Laight@ACULAB.COM>
Date: Wed, 5 Dec 2012 11:41:33 -0000

> Probably worth commenting that the 64bit items might only be 32bit aligned.
> Just to stop anyone trying to read/write them with pointer casts.

Rather, let's not create this situation at all.

It's totally inappropriate to have special code to handle every single
time we want to put 64-bit values into netlink messages.

We need a real solution to this issue.


* Re: [PATCH net-next 0/7] Allow to monitor multicast cache event via rtnetlink
From: David Miller @ 2012-12-05 17:53 UTC
  To: nicolas.dichtel; +Cc: netdev
In-Reply-To: <50BF29DA.7020903@6wind.com>

From: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Date: Wed, 05 Dec 2012 12:02:50 +0100

> On 04/12/2012 21:02, Nicolas Dichtel wrote:
>> I can have a try on a tile platform. I don't have access to sparc or
>> mips.
> Hmm, I've read arm instead of mips! So I've tried on mips. Data are
> aligned on 32-bit, like for all netlink messages. nla_put_u64() will
> do the same, as it calls nla_put().
> 
> And the kernel will only use memcpy() to treat this attribute. Reader
> will be in userland.

Then userland will trap if the 64-bit values are only 32-bit aligned.

That's the problem I'm talking about.

I don't want to export any more unaligned 64-bit values in netlink
messages, it's a complete mess.
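
For reference, the trap discussed above comes from dereferencing a 64-bit pointer into a payload that is only 4-byte aligned. A minimal userspace sketch of the safe alternative is to copy the bytes out with memcpy(); read_u64_unaligned() is a hypothetical helper name, not an existing netlink API.

```c
#include <stdint.h>
#include <string.h>

/*
 * Read a 64-bit value from a payload that may be only 4-byte aligned.
 * Casting the payload to (uint64_t *) and dereferencing it can fault on
 * strict-alignment architectures; copying the bytes does not.
 */
static uint64_t read_u64_unaligned(const void *payload)
{
	uint64_t value;

	memcpy(&value, payload, sizeof(value));
	return value;
}
```

This only shows what every reader has to do today; it does not address the request for a general solution.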


* Re: WARNING: drivers/net/ethernet/dlink/sundance.o(.text+0x2e87): Section mismatch in reference from the function sundance_probe1() to the variable .devinit.rodata:sundance_pci_tbl
From: David Miller @ 2012-12-05 17:50 UTC
  To: kda; +Cc: fengguang.wu, wfp5p, netdev, gregkh
In-Reply-To: <CAOJe8K3xBAuhVND-9dJgSxShhpHQ3zko=bE2CNrAKVa-gvDouA@mail.gmail.com>

From: Denis Kirjanov <kda@linux-powerpc.org>
Date: Wed, 5 Dec 2012 11:12:32 +0300

> I'll fix it.

You can't.

It's only going to get fixed by a change in Greg KH's device tree
which updates the PCI table macros to not use __dev* section tags.

Fengguang, _PLEASE_, as Greg requested, stop reporting these section
mismatch errors, they aren't helpful and are wasting valuable
developer time.


* RE: kernel BUG at /build/buildd/linux-2.6.32/mm/mempolicy.c
From: Devendra C @ 2012-12-05 16:46 UTC
  To: Bokhan Artem, linux-kernel@vger.kernel.org, Borislav Petkov (
  Cc: netdev@vger.kernel.org
In-Reply-To: <50BF66A0.1050503@eml.ru>

> Date: Wed, 5 Dec 2012 22:22:08 +0700
> From: art@eml.ru
> To: linux-kernel@vger.kernel.org
> Subject: kernel BUG at /build/buildd/linux-2.6.32/mm/mempolicy.c
> 
> Hello.
> 
> We have several servers with mongodb running. Each server has several mongodb 
> instances. Mongodb dataset is larger than available memory (mongodb uses 
> memory-mapped files for all disk I/O).
> 2.6.32 and 2.6.38 kernels periodically crash and crash happens only with mongodb 
> servers.
> 
> 2.6.38's trace is in attachment.
> For 2.6.32 I only have "kernel BUG at 
> /build/buildd/linux-2.6.32/mm/mempolicy.c:1489!"
> 
> Need help! :)

Adding netdev to Cc.

This seems like a networking bug?

If so, please attach some tcpdump traces and the output of lspci -vv -n.

Please forgive me if it's not related to networking. :(

Thanks,


* Re: [RFC PATCH 1/2] tun: correctly report an error in tun_flow_init()
From: Paul Moore @ 2012-12-05 16:02 UTC
  To: jasowang; +Cc: netdev, linux-security-module, selinux
In-Reply-To: <20121129220629.30020.99947.stgit@sifl>

On Thursday, November 29, 2012 05:06:29 PM Paul Moore wrote:
> On error, the error code from tun_flow_init() is lost inside
> tun_set_iff(). This patch fixes this by assigning the tun_flow_init()
> error code to the "err" variable which is returned by
> the tun_set_iff() function on error.
> 
> Signed-off-by: Paul Moore <pmoore@redhat.com>

Jason, we've had some good discussion around patch 2/2 but nothing on this 
fix; can I assume you are okay with this patch?  If so I think we should go 
ahead and apply this ...

> ---
>  drivers/net/tun.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 607a3a5..877ffe2 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1605,7 +1605,8 @@ static int tun_set_iff(struct net *net, struct file *file, struct ifreq *ifr)
> 
>  		tun_net_init(dev);
> 
> -		if (tun_flow_init(tun))
> +		err = tun_flow_init(tun);
> +		if (err < 0)
>  			goto err_free_dev;
> 
>  		dev->hw_features = NETIF_F_SG | NETIF_F_FRAGLIST |
-- 
paul moore
security and virtualization @ redhat



* Re: [RFC PATCH 2/2] tun: fix LSM/SELinux labeling of tun/tap devices
From: Paul Moore @ 2012-12-05 16:00 UTC
  To: Jason Wang; +Cc: Michael S. Tsirkin, netdev, linux-security-module, selinux
In-Reply-To: <2433879.zRVUYBGg1f@jason-thinkpad-t430s>

On Wednesday, December 05, 2012 10:01:31 PM Jason Wang wrote:
> On Wednesday, December 05, 2012 01:44:55 PM Michael S. Tsirkin wrote:
> > On Wed, Dec 05, 2012 at 02:19:22PM +0800, Jason Wang wrote:
> > > On 12/05/2012 02:17 AM, Paul Moore wrote:
> > > > On Tuesday, December 04, 2012 07:36:26 PM Michael S. Tsirkin wrote:
> > > >> On Tue, Dec 04, 2012 at 11:18:57AM -0500, Paul Moore wrote:
> > > > >>> Okay, based on your explanation of TUNSETQUEUE, the steps below
> > > > >>> are what I believe we need to do ... if you disagree speak up
> > > > >>> quickly please.
> > > > >>> 
> > > > >>> A. TUNSETIFF (new, non-persistent device)
> > > > >>> 
> > > > >>> [Allocate and initialize the tun_struct LSM state based on the
> > > > >>> calling process, use this state to label the TUN socket.]
> > > > >>> 
> > > > >>> 1. Call security_tun_dev_create() which authorizes the action.
> > > > >>> 2. Call security_tun_dev_alloc_security() which allocates the
> > > > >>> tun_struct LSM blob and SELinux sets some internal blob state to
> > > > >>> record the label of the calling process.
> > > > >>> 3. Call security_tun_dev_attach() which sets the label of the TUN
> > > > >>> socket to match the label stored in the tun_struct LSM blob
> > > > >>> during A2.  No authorization is done at this point since the
> > > > >>> socket is new/unlabeled.
> > > > >>> 
> > > > >>> B. TUNSETIFF (existing, persistent device)
> > > > >>> 
> > > > >>> [Relabel the existing tun_struct LSM state based on the calling
> > > > >>> process, use this state to label the TUN socket.]
> > > > >>> 
> > > > >>> 1. Attempt to relabel/reset the tun_struct LSM blob from the
> > > > >>> currently stored value, set during A2, to the label of the
> > > > >>> current calling process.
> > > > >>> *** THIS IS NOT CURRENTLY DONE IN THE RFC PATCH ***
> > > > >>> 2. Call security_tun_dev_attach() which sets the label of the TUN
> > > > >>> socket to match the label stored in the tun_struct LSM blob
> > > > >>> during B1. No authorization is done at this point since the
> > > > >>> socket is new/unlabeled.
> > > > >>> 
> > > > >>> C. TUNSETQUEUE
> > > > >>> 
> > > > >>> [Use the existing tun_struct LSM state to label the new TUN socket.]
> > > > >>> 
> > > > >>> 1. Call security_tun_dev_attach() which sets the label of the TUN
> > > > >>> socket to match the label stored in the tun_struct LSM blob set
> > > > >>> during either A2 or B1. No authorization is done at this point
> > > > >>> since the socket is new/unlabeled.
> > > > >> 
> > > > >> Here's what bothers me. libvirt currently opens tun and passes
> > > > >> fd to qemu. What would prevent qemu from attaching fd using
> > > > >> TUNSETQUEUE to another device it does not own?
> > > > > 
> > > > > True, assuming all the above is correct and that I'm understanding it
> > > > > correctly (Jason?), we should probably add a new SELinux access
> > > > > control for TUNSETQUEUE.
> > > 
> > > Yes, we need make sure qemu can call TUNSETQUEUE for the device it does
> > > not own.
> > 
> > Meaning can *not* call?
> 
> Sorry for not being clear, I mean qemu can call TUNSETQUEUE for the device
> it owns and for the device it does not own, it can't call.

Okay, let me add an access control for TUNSETQUEUE and I'll post an updated
patchset later today.

-- 
paul moore
security and virtualization @ redhat
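
The label flow in steps A, B and C quoted above, plus the TUNSETQUEUE check agreed on here, can be modeled in a few lines of userspace C. This is a toy sketch under loud assumptions: label_t, toy_tun, toy_sock and the toy_* functions are all hypothetical, the authorization in A1 and B1 is omitted, and "ownership" is reduced to label equality, whereas a real SELinux decision would come from policy.

```c
#include <errno.h>

typedef unsigned int label_t;	/* toy stand-in for an LSM security label */

struct toy_tun  { label_t label; };	/* models the tun_struct LSM blob */
struct toy_sock { label_t label; };	/* models the TUN socket label */

/* Steps A2/A3: record the creator's label, then label the first socket. */
static void toy_tunsetiff_new(struct toy_tun *tun, struct toy_sock *sk,
			      label_t caller)
{
	tun->label = caller;
	sk->label = tun->label;
}

/* Steps B1/B2: relabel the persistent device to the caller, label the socket. */
static void toy_tunsetiff_persistent(struct toy_tun *tun, struct toy_sock *sk,
				     label_t caller)
{
	tun->label = caller;
	sk->label = tun->label;
}

/*
 * Step C1 plus the proposed access control: the new queue's socket inherits
 * the device label, but only if the caller is allowed to use the device.
 * Assumption: "allowed" is modeled as an exact label match.
 */
static int toy_tunsetqueue(const struct toy_tun *tun, struct toy_sock *sk,
			   label_t caller)
{
	if (caller != tun->label)
		return -EACCES;

	sk->label = tun->label;
	return 0;
}
```

In this model a process that was handed an fd but whose label differs from the device's gets -EACCES from toy_tunsetqueue(), which is the behavior Michael and Jason ask for above.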



* [RFC PATCH] Bug on AT91 macb driver rx with high network traffic
From: matteo.fortini @ 2012-12-05 15:48 UTC
  To: netdev; +Cc: nicolas.ferre

We are testing the robustness of the driver with UDP packets of 
increasing length, up to the maximum allowed.

We have a UDP echo server listening on a port on the AT91 board, and
we send it UDP packets of increasing length, waiting for the echo reply
before sending the next one.
When packets get largish, in the >40000 bytes range, we see that the
server is not receiving packets anymore, so the client does not receive
the reply and the test stops. Pinging the interface once resumes the
test, meaning that the packet has actually been received, but the driver
is waiting for an interrupt that is not coming.

We traced this down to slow/missing IRQ response, and we fixed it as in 
the following patch, which calls napi_reschedule() before leaving the 
polling loop if the loop condition is still valid at the end of the 
polling loop, as other net drivers appear to do.

We don't know if this is the perfectly right way to do it, and we'd like 
your opinion on this before submitting a proper patch.

I have included the udp_server.c and udp_client.c test programs, which may be useful.

Thank you in advance,
Matteo Fortini

===================================================================

 From 2d8895022a0668f6a3c1112f15ebe471db1a471e Mon Sep 17 00:00:00 2001
From: Matteo Fortini <matteo.fortini@sadel.it>
Date: Thu, 8 Nov 2012 16:12:10 +0100
Subject: [PATCH] AT91 macb: Fix lost rx packets on high rx traffic

---
  drivers/net/ethernet/cadence/macb.c |    8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index 033064b..348a20f 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -522,8 +522,16 @@ static int macb_poll(struct napi_struct *napi, int budget)

         work_done = macb_rx(bp, budget);
         if (work_done < budget) {
+               u32 addr;
+
                 napi_complete(napi);

+               addr = bp->rx_ring[bp->rx_tail].addr;
+
+               if ((addr & MACB_BIT(RX_USED))) {
+                       netdev_warn(bp->dev, "poll: reschedule");
+                       napi_reschedule(napi);
+               }
                 /*
                  * We've done what we can to clean the buffers. Make sure we
                  * get notified when new packets arrive.
-- 
1.7.10.4
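
The idea behind the patch (re-check the ring after completing the poll, and reschedule if a frame slipped in) can be illustrated with a toy userspace model. This is only a sketch under stated assumptions: toy_ring, toy_rx() and toy_poll_complete() are hypothetical names, the "scheduled" flag stands in for the NAPI state, and no real interrupt masking or hardware behavior is modeled.

```c
#include <stdbool.h>

#define TOY_RING_SIZE 8
#define TOY_RX_USED   0x1u	/* descriptor holds a received frame */

struct toy_ring {
	unsigned int addr[TOY_RING_SIZE];
	unsigned int tail;	/* next descriptor the driver will look at */
	bool scheduled;		/* stand-in for "poll is scheduled" */
};

/* Consume used descriptors from the tail, up to budget frames. */
static int toy_rx(struct toy_ring *ring, int budget)
{
	int done = 0;

	while (done < budget && (ring->addr[ring->tail] & TOY_RX_USED)) {
		ring->addr[ring->tail] &= ~TOY_RX_USED;
		ring->tail = (ring->tail + 1) % TOY_RING_SIZE;
		done++;
	}
	return done;
}

/*
 * Finish a poll pass. A frame that arrived after toy_rx() returned but
 * before completion would otherwise sit in the ring until the next
 * interrupt, so the tail descriptor is checked once more.
 */
static void toy_poll_complete(struct toy_ring *ring)
{
	ring->scheduled = false;
	if (ring->addr[ring->tail] & TOY_RX_USED)
		ring->scheduled = true;	/* poll again instead of waiting */
}
```

Marking the tail descriptor as used between toy_rx() and toy_poll_complete() reproduces the window the report describes: the toy ends up scheduled again rather than idle.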

====================================================

TEST PROGRAMS:

/*
  * UDP client
  * Copyright 2012 SADEL SpA
  * Castel Maggiore
  * Bologna
  * Italy
  */

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <strings.h>

#define BUFFER_LEN      70000   ///< Length of buffer
#define LISTEN_PORT     32000   ///< Listen port
#define MIN_PACKET_LEN      8   ///< Min length of a packet
#define MAX_PACKET_LEN  65499   ///< Max length of a packet
#define TIMEOUT             0   ///< Default wait_time between each packet send

static int port         = LISTEN_PORT;
static int minPacketLen = MIN_PACKET_LEN;
static int maxPacketLen = MAX_PACKET_LEN;
static int packetLen    = 0;
static int wait_time    = TIMEOUT;
static char *server     = NULL;

static void usage(const char *program)
{
     fprintf(stderr,"usage: %s [options]\n",program);
     fprintf(stderr,"          -c <ipaddr>   Server IP address\n");
     fprintf(stderr,"          -p <port>     Specify port number (default: %d)\n", LISTEN_PORT);
     fprintf(stderr,"          -t <waittime> Wait time between each send action in usec\n");
     fprintf(stderr,"          -m <min_len>  Min packet length (default: %d)\n",MIN_PACKET_LEN);
     fprintf(stderr,"          -M <max_len>  Max packet length (default: %d)\n",MAX_PACKET_LEN);
     fprintf(stderr,"          -l <length>   Length of packet (between %d and %d). If not specified, variable-length packets are sent\n", MIN_PACKET_LEN, MAX_PACKET_LEN);
     fprintf(stderr,"          -h            Show this help and exit\n");
}

static void check_parameter(int argc, char **argv)
{
     int opt;
     while ((opt = getopt(argc, argv, "c:l:t:p:m:M:h")) != -1) {
         switch (opt) {
             case 'c':
                 server = optarg;
                 break;
             case 'p':
                 port = atoi(optarg);
                 if( port <= 0 ) {
                     fprintf(stderr, "Wrong parameter -p\n");
                     usage(argv[0]);
                     exit(EXIT_FAILURE);
                 }
                 break;
             case 't':
                 wait_time = atoi(optarg);
                 break;
             case 'm':
                 minPacketLen = atoi(optarg);
                 if( minPacketLen < MIN_PACKET_LEN ) {
                     fprintf(stderr, "Min packet length too low. Set to: %d\n", MIN_PACKET_LEN);
                     minPacketLen = MIN_PACKET_LEN;
                 }
                 break;
             case 'M':
                 maxPacketLen = atoi(optarg);
                 if( maxPacketLen > MAX_PACKET_LEN ) {
                     fprintf(stderr, "Max packet length too high. Set to: %d\n", MAX_PACKET_LEN);
                     maxPacketLen = MAX_PACKET_LEN;
                 }
                 break;
             case 'l':
                 packetLen = atoi(optarg);
                 break;
             case 'h':
                 usage(argv[0]);
                 exit(0);
             default: /* '?' */
                 /* getopt() returns '?' here; the offending option is in optopt */
                 fprintf(stderr, "Unrecognized parameter '%c'\n", optopt);
                 usage(argv[0]);
                 exit(EXIT_FAILURE);
         }
     }

     if( server == NULL ) {
         fprintf(stderr, "Wrong or missing -c parameter\n");
         usage(argv[0]);
         exit(EXIT_FAILURE);
     }
     if( minPacketLen > maxPacketLen ) {
         fprintf(stderr, "minPacketLen cannot be greater than maxPacketLen: %d, %d\n", minPacketLen, maxPacketLen);
         usage(argv[0]);
         exit(EXIT_FAILURE);
     }

     fprintf(stderr, "UDP Client - server: %s, port: %d, wait_time: %d", server, port, wait_time);
     if( packetLen > 0 ) {
         fprintf(stderr, " - fixed packet len: %d bytes\n", packetLen);
     } else {
         fprintf(stderr, " - variable packet len from %d to %d bytes\n", minPacketLen, maxPacketLen);
     }
}

static int sendAndCheck(int sockfd, struct sockaddr_in *servaddr, const char *msg_tx, size_t msg_len) {
     int t, n, iRetVal = 0;
     char msg_rx[BUFFER_LEN];
     socklen_t len = sizeof(*servaddr);

     /* Send the packet and report how many bytes actually went out */
     t = sendto(sockfd, msg_tx, msg_len, 0, (struct sockaddr *)servaddr, len);
     fprintf(stderr, "TX [%d] bytes, ", t);

     /* Wait for the echo reply */
     n = recvfrom(sockfd,msg_rx,BUFFER_LEN,0,(struct sockaddr *)servaddr,&len);
     fprintf(stderr, "RX [%d] bytes - ",n);

     if( t < 0 || n < 0 ) {
         fprintf(stderr, "send/receive error\n");
         return -1;
     }

     /* A reply of a different size is a failure */
     if( t != n ) {
         fprintf(stderr, "size differs!!!\n");
         iRetVal = -1;
     }

     if( memcmp(msg_tx, msg_rx, msg_len) != 0 ) {
         fprintf(stderr, "data = ko\n");
         iRetVal = -1;
     } else {
         fprintf(stderr, "data = ok\n");
     }

     return iRetVal;
}

static int fixed_send(int sockfd, struct sockaddr_in *servaddr)
{
     int i = 0, iRetVal = 0;
     char msg_tx[BUFFER_LEN];

     memset(msg_tx,'a',sizeof(msg_tx));

     while(1) {
         fprintf(stderr, "[%d] - ",i);
         if( sendAndCheck(sockfd, servaddr, msg_tx, packetLen) != 0 ) {
             break;
         }

         usleep(wait_time);
         i++;
     }
     return iRetVal;
}

static int variable_send(int sockfd, struct sockaddr_in *servaddr)
{
     int i, iRetVal = 0;
     char msg_tx[BUFFER_LEN];

     memset(msg_tx,'a',sizeof(msg_tx));

     for ( i = 0; i <= maxPacketLen - minPacketLen; i++) {
         fprintf(stderr, "[%d] - ",i);
         if( sendAndCheck(sockfd, servaddr, msg_tx, minPacketLen+i) != 0 ) {
             break;
         }

         usleep(wait_time);
     }
     return iRetVal;
}

int main(int argc, char**argv)
{
     int sockfd;
     struct sockaddr_in servaddr;

     check_parameter(argc,argv);

     bzero(&servaddr,sizeof(servaddr));
     servaddr.sin_family = AF_INET;
     servaddr.sin_addr.s_addr=inet_addr(server);
     servaddr.sin_port=htons(port);

     sockfd = socket(AF_INET,SOCK_DGRAM,0);

     // Optional
     if( connect(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr)) != 0 ) {
         fprintf(stderr, "Failed to connect\n");
         exit(1);
     }

     if ( packetLen > 0 ) {
         fixed_send(sockfd,&servaddr);
     } else {
         variable_send(sockfd,&servaddr);
     }


     return 0;
}

=================================================================
/*
  * UDP server
  * Copyright 2012 SADEL SpA
  * Castel Maggiore
  * Bologna
  * Italy
  */

#include <sys/socket.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <strings.h>

#define BUFFER_LEN      70000   ///< Length of buffer
#define LISTEN_PORT     32000   ///< Listen port
#define TIMEOUT             0   ///< Default wait_time between each packet send

static int port         = LISTEN_PORT;
static int wait_time    = TIMEOUT;

static void usage(const char *program)
{
     fprintf(stderr,"usage: %s [options]\n",program);
     fprintf(stderr,"          -p <port>     Specify port number (default: %d)\n", LISTEN_PORT);
     fprintf(stderr,"          -t <waittime> Wait time between each send action in usec\n");
     fprintf(stderr,"          -h            Show this help and exit\n");
}

static void check_parameter(int argc, char **argv)
{
     int opt;
     while ((opt = getopt(argc, argv, "t:p:h")) != -1) {
         switch (opt) {
             case 'p':
                 port = atoi(optarg);
                 if( port <= 0 ) {
                     fprintf(stderr, "Wrong parameter -p\n");
                     usage(argv[0]);
                     exit(EXIT_FAILURE);
                 }
                 break;
             case 't':
                 wait_time = atoi(optarg);
                 break;
             case 'h':
                 usage(argv[0]);
                 exit(0);
             default: /* '?' */
                 /* getopt() returns '?' here; the offending option is in optopt */
                 fprintf(stderr, "Unrecognized parameter '%c'\n", optopt);
                 usage(argv[0]);
                 exit(EXIT_FAILURE);
         }
     }

     fprintf(stderr, "UDP Server - port: %d, wait_time: %d\n", port, wait_time);
}

int main(int argc, char**argv)
{
     int sockfd,n;
     struct sockaddr_in servaddr,cliaddr;
     socklen_t len;
     char mesg[BUFFER_LEN];
     int i = 1;
     int t;

     check_parameter(argc, argv);

     sockfd=socket(AF_INET,SOCK_DGRAM,0);

     bzero(&servaddr,sizeof(servaddr));
     servaddr.sin_family = AF_INET;
     servaddr.sin_addr.s_addr=htonl(INADDR_ANY);
     servaddr.sin_port=htons(port);
     bind(sockfd,(struct sockaddr *)&servaddr,sizeof(servaddr));

     for (;;)
     {
         len = sizeof(cliaddr);
         n = recvfrom(sockfd,mesg,BUFFER_LEN,0,(struct sockaddr *)&cliaddr,&len);
         if (n < 0) {
             /* Nothing usable was received; do not echo or index with n */
             perror("recvfrom");
             continue;
         }
         fprintf(stderr, "[%d] - RX [%d] bytes ",i, n);
         usleep(wait_time);
         /* Echo the datagram back to the sender */
         t = sendto(sockfd,mesg,n,0,(struct sockaddr *)&cliaddr,len);
         fprintf(stderr, "- TX [%d] bytes\n",t);
         mesg[n] = 0;
         i++;
     }
}




* Re: [B.A.T.M.A.N.] net, batman: lockdep circular dependency warning
From: Simon Wunderlich @ 2012-12-05 15:33 UTC
  To: The list for a Better Approach To Mobile Ad-hoc Networking
  Cc: netdev, Sasha Levin
In-Reply-To: <2538946.bxqpERrOoj@bentobox>

[-- Attachment #1: Type: text/plain, Size: 6742 bytes --]

Hey Sven,

thanks for showing these approaches! Comments inline ...

On Tue, Dec 04, 2012 at 03:51:55PM +0100, Sven Eckelmann wrote:
> Hi,
> 
> thanks for your report. It seems nobody else wanted to give an answer... so I 
> try to give a small overview.
> 
> On Monday 12 November 2012 15:37:47 Sasha Levin wrote:
> > Hi all,
> > 
> > While fuzzing with trinity inside a KVM tools (lkvm) guest running latest
> > -next kernel, I've stumbled on the following:
> > 
> > [ 1002.969392] ======================================================
> > [ 1002.971639] [ INFO: possible circular locking dependency detected ]
> > [ 1002.975805] 3.7.0-rc5-next-20121112-sasha-00018-g2f4ce0e #127 Tainted: G 
> >       W [ 1002.983691]
> > ------------------------------------------------------- [ 1002.983691]
> > trinity-child18/8149 is trying to acquire lock:
> > [ 1002.983691]  (s_active#313){++++.+}, at: [<ffffffff812f9941>]
> > sysfs_addrm_finish+0x31/0x60 [ 1002.983691]
> > [ 1002.983691] but task is already holding lock:
> > [ 1002.983691]  (rtnl_mutex){+.+.+.}, at: [<ffffffff834fcc62>]
> > rtnl_lock+0x12/0x20 [ 1002.983691]
> > [ 1002.983691] which lock already depends on the new lock.
> 
> It is known that batman-adv has a problem with the attaching/detaching of 
> interfaces over this sysfs. The cause of this problem is related to the fact 
> that batman-adv not only creates its own net_devices, but also unregisters 
> net_devices. This unregister will add a new element in the net_todo_list. This 
> will cause a rtnl_lock when it calls netdev_wait_allrefs (there are some 
> conditions, but we just ignore them for now). So the whole exercise of using 
> rtnl_trylock was useless.
> 
> This extra rtnl_lock can cause a deadlock as you found out because it is 
> activated through a sysfs file and therefore the s_active mutex is locked (we 
> have the dependency s_active -> rtnl_mutex, but other users have rtnl_mutex -> 
> s_active).
> 
> So, what to do? There are different possibilities. We have to keep in mind
> that there is a patchset (not yet accepted by the batman-adv maintainers)
> which allows using `ip link` or compatible tools to create/destroy batman-adv
> devices and attach/detach other devices.
> 
> 1. Remove the sysfs interface to attach/detach net_devices (which
>    destroys/creates batman-adv devices)
> 
>    This is not really backward compatible and therefore not really acceptable.
>    Marek Lindner and Simon Wunderlich are also against forcing users to
>    require special tools to add/configure batman-adv devices (even batctl, ip
>    and so on).
> 

Yeah, at least I think we should keep what we have for now and fix it before
moving to the next interface. It has its merits I would like to keep, having
text output is one of them. :)

> 2. Ignore the possible deadlock
> 
>    (sry, fill in your own comment...)
> 

That probably won't help. :)

> 3. Add workarounds in the core net code
> 
>    Simon Wunderlich already tried it... I personally think it is not the right
>    way because it is more likely to introduce more bugs by hiding a batman-adv
>    bug. And these bugs are a lot harder to find... trust me
> 
>    For example, the usage of __rtnl_unlock will let this bug appear in
>    other places which use rtnl_trylock. This is caused by the fact that the
>    todo item isn't processed by __rtnl_unlock (this is the whole idea of
>    calling it) and therefore the todo work stays in net_todo_list. Another
>    user of rtnl_trylock will now call rtnl_unlock and won't expect an entry in
>    net_todo_list because he never unregistered a device. So he now has the
>    problem of batman-adv (what an unsocial läderlappen).
> 
>    And moving everybody using rtnl_trylock to __rtnl_unlock still has the
>    problem that batman-adv doesn't immediately work on its todo and so
>    maybe causes other side effects because... the notifications weren't
>    sent and therefore the refcount of the unregistered device didn't go
>    to zero.
> 
>    (I'll leave other side effects as homework for the reader)
> 

You are right, it can probably not be solved as easily as I thought before. Also,
it seems the bridge code is not affected, contrary to what I thought at first.
Although I still don't like the rtnl_unlock() concept in general, I can't provide
an alternative here, so I shouldn't moan about that. :)

> 4. Don't automatically remove batman-adv devices
> 
>    The current approach is to automatically unregister batman-adv devices
>    when they don't have attached slave-devices (hardif called by batman-adv).
>    Removing this will slightly change the behaviour, but the interface can
>    still be removed using `ip link del dev bat0` or a similar tool.
> 

That would be possible, but we must at least make sure that the initialization
is done for all internal tables (tt, bla, ...), counters, seqnos, etc. when the
first device is added. Otherwise old users might assume that the device is
reset correctly when removing all hard interfaces of one soft interface
and adding them again under the same soft interface name.

> 5. Add a workaround solution and promote the use of the standard interface
> 
>    So, the basic problem is the s_active mutex locked by the sysfs interface.
>    An idea is to postpone the part which needs the rtnl_mutex to a later time.
>    This obviously has the problem that we cannot return an error code to the
>    caller when the device creation failed in the postponed part. This problem
>    can be reduced slightly by moving only the unregister part, but for now I'll
>    leave this out for simplicity of the description.

We probably won't need the return code anyway - usually it should never fail,
and if it does, we don't handle it now either.

> 
>    A possible implementation would create a work_struct and add it to
>    batadv_event_workqueue. This work item has to contain all information given
>    by the user (which hardif, name of the batman-adv device).

Sounds good.

> 
>    Simon Wunderlich already disliked this workaround, but Antonio Quartulli
>    tried to encourage an RFC implementation. I've preferred a textual
>    description rather than a patch missing explanations of the other
>    alternatives.

Well, actually that doesn't sound so bad - I currently don't have an overview
of how "big" this change would be - this was one concern, the return code was
another, but it appears that this isn't a problem. If we don't add too much bloat,
this one would probably be the best alternative. At least as long as rtnl_unlock()
behaves like this. :)
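
For illustration only, the deferral idea from option 5 can be modelled in plain userspace C. This is a sketch under stated assumptions, not batman-adv code: `attach_request`, `request_queue`, `queue_attach()` and `dequeue_attach()` are hypothetical names, and a simple FIFO stands in for the kernel workqueue. The store handler only records what the user asked for; the part that would need rtnl_mutex runs later, outside the sysfs context.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_LEN 16 /* same size as the kernel's IFNAMSIZ */

/* Everything the user wrote to the sysfs file, captured for later. */
struct attach_request {
	char hardif[NAME_LEN];  /* slave device, e.g. "eth0" */
	char softif[NAME_LEN];  /* batman-adv device, e.g. "bat0" */
	struct attach_request *next;
};

/* Simple FIFO standing in for a kernel workqueue (hypothetical). */
struct request_queue {
	struct attach_request *head;
	struct attach_request *tail;
	size_t pending;
};

/*
 * Modelled sysfs store handler: it only records the request, so nothing
 * here needs the rtnl lock. Returns 0 on success, -1 on bad input or
 * allocation failure. Errors from the deferred part cannot be reported
 * to the caller, which is the drawback named in the discussion.
 */
int queue_attach(struct request_queue *q, const char *hardif,
		 const char *softif)
{
	struct attach_request *req;

	assert(q != NULL);
	if (!hardif || !softif ||
	    strlen(hardif) >= NAME_LEN || strlen(softif) >= NAME_LEN)
		return -1;

	req = calloc(1, sizeof(*req));
	if (!req)
		return -1;
	strcpy(req->hardif, hardif);
	strcpy(req->softif, softif);

	if (q->tail)
		q->tail->next = req;
	else
		q->head = req;
	q->tail = req;
	q->pending++;
	return 0;
}

/*
 * Modelled worker side: pops the oldest request into *out. The caller
 * would take the rtnl lock around the real attach/unregister work.
 * Returns 1 if a request was copied to *out, 0 if the queue was empty.
 */
int dequeue_attach(struct request_queue *q, struct attach_request *out)
{
	struct attach_request *req;

	assert(q != NULL && out != NULL);
	req = q->head;
	if (!req)
		return 0;

	q->head = req->next;
	if (!q->head)
		q->tail = NULL;
	q->pending--;

	*out = *req;
	out->next = NULL;
	free(req);
	return 1;
}
```

In the kernel the queue would be a work_struct embedded in a request structure and handed to the existing workqueue; the model only shows which data has to travel with the request.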

What do others think?

Cheers,
	Simon

* Re: WARNING: drivers/net/ethernet/dlink/sundance.o(.text+0x2e87): Section mismatch in reference from the function sundance_probe1() to the variable .devinit.rodata:sundance_pci_tbl
From: Greg Kroah-Hartman @ 2012-12-05 15:32 UTC (permalink / raw)
  To: Denis Kirjanov; +Cc: kbuild test robot, Bill Pemberton, netdev
In-Reply-To: <CAOJe8K3xBAuhVND-9dJgSxShhpHQ3zko=bE2CNrAKVa-gvDouA@mail.gmail.com>

On Wed, Dec 05, 2012 at 11:12:32AM +0300, Denis Kirjanov wrote:
> I'll fix it.
> 
> Thanks.
> 
> On 12/5/12, kbuild test robot <fengguang.wu@intel.com> wrote:
> > tree:   git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
> > master
> > head:   193c1e478cc496844fcbef402a10976c95a634ff
> > commit: 64bc40de134bb5c7826ff384016f654219ed3956 dlink: remove __dev*
> > attributes
> > date:   27 hours ago
> > config: make ARCH=x86_64 allmodconfig
> >
> > All warnings:
> >
> > WARNING: drivers/net/ethernet/dlink/sundance.o(.text+0x2e87): Section
> > mismatch in reference from the function sundance_probe1() to the variable
> > .devinit.rodata:sundance_pci_tbl
> > The function sundance_probe1() references
> > the variable __devinitconst sundance_pci_tbl.
> > This is often because sundance_probe1 lacks a __devinitconst
> > annotation or the annotation of sundance_pci_tbl is wrong.


No, no need to do this. It's fallout of the big dev* removal that is in
net-next right now; the warnings will go away when it is merged with my
driver-core-next tree, which will happen in 3.8-rc1.

So please don't worry about it, it's a harmless message at the moment.

greg k-h

* Webmail Limit
From: CORREO @ 2012-12-05 12:29 UTC (permalink / raw)


Your webmail quota has exceeded the set quota / limit, which is 20GB. You are currently running at 23SE due to hidden files and folders in your mailbox. Please fill in the link below to confirm your mailbox and increase your quota.
Username:
Old password:
New password:

* RE: [PATCH v2] net/macb: Use non-coherent memory for rx buffers
From: David Laight @ 2012-12-05 15:22 UTC (permalink / raw)
  To: Nicolas Ferre
  Cc: David S. Miller, netdev, linux-arm-kernel, linux-kernel,
	Joachim Eastwood, Jean-Christophe PLAGNIOL-VILLARD,
	Havard Skinnemoen
In-Reply-To: <50BF6467.5060701@atmel.com>

> Well, for the 10/100 MACB interface, I am stuck with 128 Bytes buffers!
> So this use of pages seems sensible.

If you have dma coherent memory you can make the rx buffer space
be an array of short buffers referenced by adjacent ring entries
(possibly with the last one slightly short to allow for the
2 byte offset).
Then, when a big frame ends up in multiple buffers, you need
a maximum of two copies to extract the data.
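
The "maximum of two copies" follows from the buffers being contiguous: a frame occupies one run of bytes unless it wraps past the end of the ring. A minimal sketch, assuming the hardware fills adjacent ring entries back to back and ignoring the 2 byte offset mentioned above; `ring_copy_frame()` and its parameters are hypothetical names, not macb driver code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy a received frame out of an RX area made of `nbufs` equally sized
 * buffers laid out back to back in one contiguous block.
 *
 * area      start of the contiguous buffer block
 * nbufs     number of ring entries
 * buf_size  size of each buffer in bytes (128 for the MACB case)
 * first     index of the ring entry holding the start of the frame
 * len       frame length in bytes (must fit in the ring)
 * dst       destination, at least `len` bytes
 *
 * Returns the number of memcpy() calls used: 1, or 2 if the frame wraps.
 */
int ring_copy_frame(const unsigned char *area, size_t nbufs, size_t buf_size,
		    size_t first, size_t len, unsigned char *dst)
{
	size_t total = nbufs * buf_size;
	size_t start = first * buf_size;
	size_t to_end = total - start;  /* bytes before the ring wraps */

	assert(first < nbufs && len <= total);

	if (len <= to_end) {
		memcpy(dst, area + start, len);
		return 1;
	}
	memcpy(dst, area + start, to_end);         /* up to the end of the block */
	memcpy(dst + to_end, area, len - to_end);  /* wrapped remainder */
	return 2;
}
```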

	David

* iputils-s20121205
From: YOSHIFUJI Hideaki @ 2012-12-05 15:13 UTC (permalink / raw)
  To: netdev; +Cc: yoshfuji

Hello,

iputils-s20121205 has been released.  Diffstat and changelog below.

Files:
        https://sourceforge.net/projects/iputils/files/
        http://www.skbuff.net/iputils/
Tree:
        http://www.linux-ipv6.org/gitweb/gitweb.cgi?p=gitroot/iputils.git
        https://sourceforge.net/p/iputils/code/ci/HEAD/tree/

Regards,

--yoshfuji

[Diffstat]

 Makefile                |   82 +++++++++++++++++++---------
 RELNOTES                |   43 +++++++++++++++
 SNAPSHOT.h              |    2 -
 arping.c                |  138 +++++++++++++++++++++++++++++------------------
 doc/docbook2man-spec.pl |    4 +
 doc/ping.sgml           |   14 +++--
 doc/snapshot.db         |    2 -
 doc/tracepath.sgml      |    3 +
 iputils.spec            |   97 +++++++++++++++++++--------------
 ping.c                  |   42 ++++++++++++++
 ping6.c                 |   31 ++++++++++-
 ping_common.c           |   74 ++++++++++++++-----------
 rdisc.c                 |   10 +++
 tracepath.c             |   20 ++++---
 tracepath6.c            |   46 +++++++++++-----
 15 files changed, 421 insertions(+), 187 deletions(-)

[Changelog]
Jan Synacek (1):
      ping,tracepath doc: Fix missing end tags.

YOSHIFUJI Hideaki (36):
      tracepath6: packet length option (-l) did not have any effect.
      tracepath,tracepath6: Fix pktlen message.
      tracepath,tracepath6: Use calloc(3) instead of using stack.
      tracepath6: Ignore families other than IPv4 and IPv6.
      ping6: Improve randomness of NI Nonce.
      tracepath,tracepath6 doc: Fix default pktlen.
      ping,rdisc: Optimize checksumming.
      makefile: Static link support for crypto, resolv, cap and sysfs.
      doc: Ajdust spaces around sqare brackets.
      ping,rdisc: Use macro to get odd byte when checksumming.
      ping6: Do not try to free memory pointed by uninitialized variable on error path.
      arping: Allow building without default interface.
      arping: No default interface by default.
      arping: Allow printing usage without permission errors.
      ping,ping6: Allow printing usage without permission errors.
      ping,ping6: Fix cap_t leakage.
      arping,ping,ping6: Do not ideologically check return value from cap_free,cap_{set,get}_flag().
      arping: Fix sysfs_class leakage on error path.
      arping: Some comments for new functions for finding devices support.
      arping: Typo in type declaration.
      makefile: Use call function for external libraries.
      makefile: Add more comments.
      arping: Ensure to fail if no appropriate device found with sysfs.
      arping: Enforce user to specify device (-I) if multiple devices found.
      Makefile: parameterize options for linking libraries.
      Makefile: Use shell function instead if backquotes.
      Makefile: Ensure to have same date when making snapshot.
      spec: Maintainer does not use ipsec.spec.
      spec: partially sync with fedora.
      Makefile: Bump date in iputils.spec as well.
      spec: Add exmple lines for suid-root installation
      spec: Sort changelog.
      ping: Exit on SO_BINDTODEVICE failure.
      ping: Warn if kernel has selected source address from other interface.
      ping: Clarify difference between -I device and -I addr.
      iputils-s20121205

* Re: [PATCH v2] net/macb: Use non-coherent memory for rx buffers
From: Nicolas Ferre @ 2012-12-05 15:12 UTC (permalink / raw)
  To: David Laight
  Cc: David S. Miller, netdev, linux-arm-kernel, linux-kernel,
	Joachim Eastwood, Jean-Christophe PLAGNIOL-VILLARD,
	Havard Skinnemoen
In-Reply-To: <AE90C24D6B3A694183C094C60CF0A2F6026B70D9@saturn3.aculab.com>

On 12/05/2012 10:35 AM, David Laight :
>> If I understand well, you mean that the call to:
>>
>> 		dma_sync_single_range_for_device(&bp->pdev->dev, phys,
>> 				pg_offset, frag_len, DMA_FROM_DEVICE);
>>
>> in the rx path after having copied the data to skb is not needed?
>> That is also the conclusion I reached after having thought about
>> this again... I will check this.
> 
> You need to make sure that the memory isn't in the data cache
> when you give the rx buffer back to the MAC.
> (and ensure the cpu doesn't read it until the rx is complete.)
> I've NFI what that dma_sync call does - you need to invalidate
> the cache lines.

The cache line invalidation is done by
dma_sync_single_range_for_device(..., DMA_FROM_DEVICE), so I need to keep it.

>> For the CRC, my driver is not using the CRC offloading feature for the
>> moment. So no CRC is written by the device.
> 
> I was thinking it would matter if the MAC wrote the CRC into the
> buffer (even though it was excluded from the length).
> It doesn't - you only need to worry about data you've read.
> 
>>> I was wondering if the code needs to do per page allocations?
>>> Perhaps that is necessary to avoid needing a large block of
>>> contiguous physical memory (and virtual addresses)?
>>
>> The page management seems interesting for future management of RX
>> buffers as skb fragments: that will allow to avoid copying received data.
> 
> Dunno - the complexities of such buffer loaning schemes often
> exceed the gain of avoiding the data copy.
> Using buffers allocated to the skb is a bit different - since
> you completely forget about the memory once you pass the skb
> upstream.
> 
> Some quick sums indicate you might want to allocate 8k memory
> blocks and split into 5 buffers.

Well, for the 10/100 MACB interface, I am stuck with 128 Bytes buffers!
So this use of pages seems sensible.
On the other hand, it is true that I may have to reconsider the GEM
memory management (this one is able to cover 128 bytes to 10KB rx DMA buffers)...

Best regards,
-- 
Nicolas Ferre

* Re: [PATCH] net/macb: increase RX buffer size for GEM
From: Nicolas Ferre @ 2012-12-05 15:08 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-arm-kernel, linux-kernel, manabian, plagnioj
In-Reply-To: <20121204.132227.1430662061932892582.davem@davemloft.net>

On 12/04/2012 07:22 PM, David Miller :
> From: Nicolas Ferre <nicolas.ferre@atmel.com>
> Date: Mon, 3 Dec 2012 13:15:43 +0100
> 
>> Macb Ethernet controller requires a RX buffer of 128 bytes. It is
>> highly sub-optimal for Gigabit-capable GEM that is able to use
>> a bigger DMA buffer. Change this constant and associated macros
>> with data stored in the private structure.
>> I also kept the result of buffers per page calculation to lower the
>> impact of this move to a variable rx buffer size on rx hot path.
>> RX DMA buffer size has to be multiple of 64 bytes as indicated in
>> DMA Configuration Register specification.
>>
>> Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com>
> 
> This looks like it will waste a couple hundred bytes for 1500 MTU
> frames, am I right?

Yep! But buffers get recycled, and with the current memory management by
pages, it seems that I have to rework some part of it to optimize this
memory usage (8KB memory blocks split into 5 buffers each as David said...).

Do you think it is worth digging in this direction, or should I rework the rx
buffer management for the GEM interface? If I implement a different path
for the GEM interface, I will be able to tailor rx DMA buffers
from 1500 bytes up to 10KB jumbo frames...
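
To make the numbers concrete, here is a small sketch of the sizing arithmetic. It assumes a 1518 byte maximum Ethernet frame plus the 2 byte offset (1520 bytes) and uses the 64 byte multiple required by the DMA Configuration Register; the helper names are hypothetical and the driver's actual constants may differ. Under these assumptions a buffer is 1536 bytes, five fit in an 8KB block with 512 bytes left over, and only two fit in a 4KB page with 1024 bytes left over.

```c
#include <assert.h>
#include <stddef.h>

#define RX_BUF_ALIGN 64u  /* RX DMA buffer size must be a multiple of 64 bytes */

/* Round `len` up to the next multiple of RX_BUF_ALIGN. */
size_t rx_buf_size(size_t len)
{
	return (len + RX_BUF_ALIGN - 1) / RX_BUF_ALIGN * RX_BUF_ALIGN;
}

/* Number of buffers of `buf_size` bytes that fit in `block_size` bytes. */
size_t bufs_per_block(size_t block_size, size_t buf_size)
{
	assert(buf_size > 0);
	return block_size / buf_size;
}

/* Bytes left unused at the end of a block after packing whole buffers. */
size_t block_slack(size_t block_size, size_t buf_size)
{
	return block_size - bufs_per_block(block_size, buf_size) * buf_size;
}
```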

Best regards,
-- 
Nicolas Ferre

* Re: [Suggestion] net/atm : for sprintf, need check the total write length whether larger than a page.
From: chas williams - CONTRACTOR @ 2012-12-05 14:55 UTC (permalink / raw)
  To: Chen Gang; +Cc: David Miller, netdev
In-Reply-To: <50BEE2BE.2030704@asianux.com>

On Wed, 05 Dec 2012 13:59:26 +0800
Chen Gang <gang.chen@asianux.com> wrote:

> On 2012-12-05 13:40, Chen Gang wrote:
> > On 2012-12-05 12:56, Chen Gang wrote:
> >>>>>>>> -		pos += sprintf(pos, "\n");
> >>>>>>>> +		count += scnprintf(buf + count, PAGE_SIZE - count, "\n");
> >>>> ..
> >>>>>>  need we judge whether count >= PAGE_SIZE ?
> >>>>
> >>>> count will eventually make PAGE_SIZE - count reach 0 at which point,
> >>>> scnprintf() won't be able to write into the buffer.
> >>   I also think so.
> >>
> >>   I think maybe it will be better to break the loop when we already
> >> know that "count >= PAGE_SIZE" (it can save wasted looping, although it
> >> seems unlikely to happen, for example, using unlikely(...) ).

it doesn't seem like optimizing for this corner case is a huge
concern.  the list cannot be infinitely long.

> >>
> >> By the way:
> >>   will it be better that always let "\n" at the end ?
> >>   (if count == PAGE_SIZE in a loop, we can not let "\n" at the end).
> > 
> >    oh, sorry ! count will never >= PAGE_SIZE.
> > 
> >    I think let "PAGE_SIZE - 2" instead of "PAGE_SIZE" in the loop, so we
> > can make the room for the end of "\n".
> > 
> > 
> > 
>    sorry, "PAGE_SIZE - 1" is enough, not need "PAGE_SIZE - 2".

did you mean '\0' instead of '\n'?  scnprintf() considers the trailing
'\0' when formatting.
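
A userspace sketch of the accumulation pattern may help here. `my_scnprintf()` is a stand-in written for this example that mimics the kernel's scnprintf() return value (characters actually stored, excluding the trailing '\0'); `fill_list()` and its "item" lines are hypothetical. The point it illustrates: count can reach at most size - 1, so the remaining space never drops below the one byte kept for the terminator and no explicit bounds check is needed in the loop.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace stand-in for the kernel's scnprintf(): like snprintf(), but
 * returns the number of characters actually stored in buf, not counting
 * the trailing '\0'. Returns 0 when size is 0.
 */
int my_scnprintf(char *buf, size_t size, const char *fmt, ...)
{
	va_list args;
	int n;

	if (size == 0)
		return 0;

	va_start(args, fmt);
	n = vsnprintf(buf, size, fmt, args);
	va_end(args);

	if (n < 0)
		return 0;
	if ((size_t)n >= size)
		return (int)(size - 1);  /* output was truncated */
	return n;
}

/*
 * Append `nitems` lines of the form "item <i>\n" to buf without ever
 * writing more than `size` bytes. Returns the number of characters stored.
 */
size_t fill_list(char *buf, size_t size, int nitems)
{
	size_t count = 0;
	int i;

	assert(buf != NULL && size > 0);
	buf[0] = '\0';
	for (i = 0; i < nitems; i++)
		count += (size_t)my_scnprintf(buf + count, size - count,
					      "item %d\n", i);
	return count;
}
```

Once the buffer is full, each further call stores nothing and returns 0, so the loop keeps running harmlessly until the list ends; a truncated buffer may then end without a final "\n".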

* [PATCH] net: fixup tx time stamping for uml vde driver.
From: Paul Chavent @ 2012-12-05 14:20 UTC (permalink / raw)
  To: jdike, richard, user-mode-linux-devel, netdev; +Cc: Paul Chavent

Call skb_tx_timestamp after write completion.

Signed-off-by: Paul Chavent <paul.chavent@onera.fr>
---
 arch/um/drivers/vde_kern.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/um/drivers/vde_kern.c b/arch/um/drivers/vde_kern.c
index 6a365fa..38fea2f 100644
--- a/arch/um/drivers/vde_kern.c
+++ b/arch/um/drivers/vde_kern.c
@@ -52,9 +52,13 @@ static int vde_write(int fd, struct sk_buff *skb, struct uml_net_private *lp)
 {
 	struct vde_data *pri = (struct vde_data *) &lp->user;
 
-	if (pri->conn != NULL)
-		return vde_user_write((void *)pri->conn, skb->data,
-				      skb->len);
+	if (pri->conn != NULL) {
+		int count;
+		count = vde_user_write((void *)pri->conn, skb->data,
+				       skb->len);
+		skb_tx_timestamp(skb);
+		return count;
+	}
 
 	printk(KERN_ERR "vde_write - we have no VDECONN to write to");
 	return -EBADF;
-- 
1.7.12.1

* [PATCH] 3com: make 3c59x depend on HAS_IOPORT
From: Jan Glauber @ 2012-12-05 14:04 UTC (permalink / raw)
  To: netdev

From: Jan Glauber <jang@linux.vnet.ibm.com>

The 3com driver for 3c59x requires ioport_map. Since not all
architectures support IO port mapping, make 3c59x dependent on HAS_IOPORT.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
---
 drivers/net/ethernet/3com/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/3com/Kconfig b/drivers/net/ethernet/3com/Kconfig
index bad4fa6..eb56174 100644
--- a/drivers/net/ethernet/3com/Kconfig
+++ b/drivers/net/ethernet/3com/Kconfig
@@ -80,7 +80,7 @@ config PCMCIA_3C589
 
 config VORTEX
 	tristate "3c590/3c900 series (592/595/597) \"Vortex/Boomerang\" support"
-	depends on (PCI || EISA)
+	depends on (PCI || EISA) && HAS_IOPORT
 	select NET_CORE
 	select MII
 	---help---

* Re: [RFC PATCH 2/2] tun: fix LSM/SELinux labeling of tun/tap devices
From: Jason Wang @ 2012-12-05 14:01 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Paul Moore, netdev, linux-security-module, selinux
In-Reply-To: <20121205114455.GB26649@redhat.com>

On Wednesday, December 05, 2012 01:44:55 PM Michael S. Tsirkin wrote:
> On Wed, Dec 05, 2012 at 02:19:22PM +0800, Jason Wang wrote:
> > On 12/05/2012 02:17 AM, Paul Moore wrote:
> > > On Tuesday, December 04, 2012 07:36:26 PM Michael S. Tsirkin wrote:
> > >> On Tue, Dec 04, 2012 at 11:18:57AM -0500, Paul Moore wrote:
> > >>> Okay, based on your explanation of TUNSETQUEUE, the steps below are
> > >>> what I
> > >>> believe we need to do ... if you disagree speak up quickly please.
> > >>> 
> > >>> A. TUNSETIFF (new, non-persistent device)
> > >>> 
> > >>> [Allocate and initialize the tun_struct LSM state based on the calling
> > >>> process, use this state to label the TUN socket.]
> > >>> 
> > >>> 1. Call security_tun_dev_create() which authorizes the action.
> > >>> 2. Call security_tun_dev_alloc_security() which allocates the
> > >>> tun_struct
> > >>> LSM blob and SELinux sets some internal blob state to record the label
> > >>> of
> > >>> the calling process.
> > >>> 3. Call security_tun_dev_attach() which sets the label of the TUN
> > >>> socket
> > >>> to match the label stored in the tun_struct LSM blob during A2.  No
> > >>> authorization is done at this point since the socket is new/unlabeled.
> > >>> 
> > >>> B. TUNSETIFF (existing, persistent device)
> > >>> 
> > >>> [Relabel the existing tun_struct LSM state based on the calling
> > >>> process,
> > >>> use this state to label the TUN socket.]
> > >>> 
> > >>> 1. Attempt to relabel/reset the tun_struct LSM blob from the currently
> > >>> stored value, set during A2, to the label of the current calling
> > >>> process.
> > >>> *** THIS IS NOT CURRENTLY DONE IN THE RFC PATCH ***
> > >>> 2. Call security_tun_dev_attach() which sets the label of the TUN
> > >>> socket
> > >>> to match the label stored in the tun_struct LSM blob during B1. No
> > >>> authorization is done at this point since the socket is new/unlabeled.
> > >>> 
> > >>> C. TUNSETQUEUE
> > >>> 
> > >>> [Use the existing tun_struct LSM state to label the new TUN socket.]
> > >>> 
> > >>> 1. Call security_tun_dev_attach() which sets the label of the TUN
> > >>> socket
> > >>> to match the label stored in the tun_struct LSM blob set during either
> > >>> A2
> > >>> or B1. No authorization is done at this point since the socket is
> > >>> new/unlabeled.
> > >> 
> > >> Here's what bothers me. libvirt currently opens tun and passes
> > >> fd to qemu. What would prevent qemu from attaching fd using TUNSETQUEUE
> > >> to another device it does not own?
> > > 
> > > True, assuming all the above is correct and that I'm understanding it
> > > correctly (Jason?), we should probably add a new SELinux access control
> > > for
> > > TUNSETQUEUE.
> > 
> > Yes, we need make sure qemu can call TUNSETQUEUE for the device it does
> > not own.
> 
> Meaning can *not* call?

Sorry for not being clear. I mean qemu can call TUNSETQUEUE for a device it
owns; for a device it does not own, it can't.
> 
> > > The current DAC code exists in tun_not_capable().
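
The labeling flow in steps A to C, plus the kind of TUNSETQUEUE check being asked for, can be modelled as a toy in userspace C. This is only a sketch under loud assumptions: every type and function below is hypothetical, labels are plain integers, and label equality stands in for whatever policy check SELinux would actually perform. It is not the RFC patch and not LSM code.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int label_t;  /* stand-in for an LSM security label */
#define LABEL_NONE 0u

struct tun_model {
	label_t blob;   /* label stored in the tun_struct LSM blob */
};

struct sock_model {
	label_t label;  /* label of one TUN queue socket */
};

/* Steps A2/B1: record the calling process's label in the device blob. */
void tun_set_blob(struct tun_model *tun, label_t caller)
{
	assert(tun != NULL && caller != LABEL_NONE);
	tun->blob = caller;
}

/* Steps A3/B2/C1: a new socket inherits the label stored in the blob. */
void tun_attach(const struct tun_model *tun, struct sock_model *sk)
{
	assert(tun != NULL && sk != NULL);
	sk->label = tun->blob;
}

/*
 * Hypothetical TUNSETQUEUE access check: allow the attach only when the
 * caller's label matches the label stored for the device (assumption:
 * equality models "owns the device"). Returns 0 on success, -1 if refused.
 */
int tun_set_queue(const struct tun_model *tun, struct sock_model *sk,
		  label_t caller)
{
	assert(tun != NULL && sk != NULL);
	if (caller != tun->blob)
		return -1;
	tun_attach(tun, sk);
	return 0;
}
```

Without the check in `tun_set_queue()`, step C1 alone would label the socket of any caller, which is the gap raised in the libvirt/qemu question.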
