* [PATCH net-next] net: netdev_alloc_skb() use build_skb()
From: Eric Dumazet @ 2012-05-17 17:34 UTC (permalink / raw)
To: Willy Tarreau, David Miller; +Cc: netdev
In-Reply-To: <1337273387.3403.24.camel@edumazet-glaptop>
From: Eric Dumazet <edumazet@google.com>
Please note I haven't tested this patch yet, as I lack the hardware for it.
(tg3/bnx2/bnx2x use build_skb, r8169 does a copy of incoming frames,
ixgbe uses fragments...)
Any volunteers?
Thanks
[PATCH net-next] net: netdev_alloc_skb() use build_skb()
netdev_alloc_skb() is used by network drivers in their RX path to
allocate an skb to receive an incoming frame.
With recent skb->head_frag infrastructure, it makes sense to change
netdev_alloc_skb() to use build_skb() and a frag allocator.
This permits a zero copy splice(socket->pipe), and better GRO or TCP
coalescing.
Signed-off-by: Eric Dumazet <edumazet@google.com>
---
net/core/skbuff.c | 32 +++++++++++++++++++++++++++++++-
1 file changed, 31 insertions(+), 1 deletion(-)
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2a18719..c02a8ec 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -293,6 +293,12 @@ struct sk_buff *build_skb(void *data, unsigned int frag_size)
}
EXPORT_SYMBOL(build_skb);
+struct netdev_alloc_cache {
+ struct page *page;
+ unsigned int offset;
+};
+static DEFINE_PER_CPU(struct netdev_alloc_cache, netdev_alloc_cache);
+
/**
* __netdev_alloc_skb - allocate an skbuff for rx on a specific device
* @dev: network device to receive on
@@ -310,8 +316,32 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev,
unsigned int length, gfp_t gfp_mask)
{
struct sk_buff *skb;
+ unsigned int fragsz = SKB_DATA_ALIGN(length + NET_SKB_PAD) +
+ SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
- skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, NUMA_NO_NODE);
+ if (fragsz <= PAGE_SIZE && !(gfp_mask & __GFP_WAIT)) {
+ struct netdev_alloc_cache *nc;
+ void *data = NULL;
+
+ nc = &get_cpu_var(netdev_alloc_cache);
+ if (!nc->page) {
+refill: nc->page = alloc_page(gfp_mask);
+ nc->offset = 0;
+ }
+ if (likely(nc->page)) {
+ if (nc->offset + fragsz > PAGE_SIZE) {
+ put_page(nc->page);
+ goto refill;
+ }
+ data = page_address(nc->page) + nc->offset;
+ nc->offset += fragsz;
+ get_page(nc->page);
+ }
+ put_cpu_var(netdev_alloc_cache);
+ skb = data ? build_skb(data, fragsz) : NULL;
+ } else {
+ skb = __alloc_skb(length + NET_SKB_PAD, gfp_mask, 0, NUMA_NO_NODE);
+ }
if (likely(skb)) {
skb_reserve(skb, NET_SKB_PAD);
skb->dev = dev;
^ permalink raw reply related
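A note on the allocation scheme in the patch above: each CPU keeps one page and an offset into it, hands out consecutive fragments, takes one page reference per fragment, and drops its own reference once the page cannot fit the next request. The following is only a userspace sketch of that idea, not the kernel code. Every `sketch_` name, the fixed 4096-byte page, and the use of malloc()/free() in place of alloc_page()/put_page() are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for the sketch */

/* Stand-in for struct page: a reference count plus the page memory. */
struct sketch_page {
    unsigned int refcount;
    unsigned char data[SKETCH_PAGE_SIZE];
};

/* Stand-in for the per-CPU cache: current page and next free offset. */
struct sketch_frag_cache {
    struct sketch_page *page;
    unsigned int offset;
};

/* Drop one reference; free the page when the last one goes away. */
static void sketch_put_page(struct sketch_page *p)
{
    if (--p->refcount == 0)
        free(p);
}

/*
 * Carve fragsz bytes out of the cached page. On success, *owner is the
 * page that backs the fragment; the caller must sketch_put_page() it
 * when the fragment is no longer needed.
 */
static void *sketch_frag_alloc(struct sketch_frag_cache *nc,
                               unsigned int fragsz,
                               struct sketch_page **owner)
{
    void *data;

    if (fragsz > SKETCH_PAGE_SIZE)
        return NULL;  /* the patch falls back to __alloc_skb() here */

    /* Current page exhausted: drop the cache's reference and refill. */
    if (nc->page && nc->offset + fragsz > SKETCH_PAGE_SIZE) {
        sketch_put_page(nc->page);
        nc->page = NULL;
    }
    if (!nc->page) {
        nc->page = malloc(sizeof(*nc->page));
        if (!nc->page)
            return NULL;
        nc->page->refcount = 1;  /* reference held by the cache itself */
        nc->offset = 0;
    }

    data = nc->page->data + nc->offset;
    nc->offset += fragsz;
    nc->page->refcount++;        /* reference held by the new fragment */
    *owner = nc->page;
    return data;
}
```

Because each fragment holds its own reference, a page is released only after the cache has moved on to a new page and every fragment carved from the old one has been put.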
* Re: [PATCH net-next] net: netdev_alloc_skb() use build_skb()
From: Willy Tarreau @ 2012-05-17 17:45 UTC (permalink / raw)
To: Eric Dumazet; +Cc: David Miller, netdev
In-Reply-To: <1337276056.3403.37.camel@edumazet-glaptop>
On Thu, May 17, 2012 at 07:34:16PM +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
>
> Please note I havent tested yet this patch, lacking hardware for this.
>
> (tg3/bnx2/bnx2x use build_skb, r8169 does a copy of incoming frames,
> ixgbe uses fragments...)
>
> Any volunteer ?
>
> Thanks
>
> [PATCH net-next] net: netdev_alloc_skb() use build_skb()
>
> netdev_alloc_skb() is used by networks driver in their RX path to
> allocate an skb to receive an incoming frame.
>
> With recent skb->head_frag infrastructure, it makes sense to change
> netdev_alloc_skb() to use build_skb() and a frag allocator.
>
> This permits a zero copy splice(socket->pipe), and better GRO or TCP
> coalescing.
Impressed!
For the first time I could proxy HTTP traffic at gigabit speed on this
little USB-powered box! I've long believed that proper splicing
would make this possible, and now I'm seeing that it does. Congrats, Eric!
I'm still observing some stalls on medium-sized files (e.g. 100k,
smaller than the pipe size; I don't know yet whether the two are related).
I'll look closer and try to report something more usable.
Cheers,
Willy
^ permalink raw reply
* [PATCH v2] drop_monitor: convert to modular building
From: Neil Horman @ 2012-05-17 17:49 UTC (permalink / raw)
To: netdev; +Cc: Neil Horman, David S. Miller, Eric Dumazet, Ben Hutchings
In-Reply-To: <1337178426-2470-1-git-send-email-nhorman@tuxdriver.com>
When I first wrote drop monitor I wrote it to just build monolithically. There
is no reason it can't be built modularly as well, so let's give it that
flexibility.
I've tested this by building it as both a module and monolithically, and it
seems to work quite well.
Change notes:
v2)
* fixed for_each_present_cpu loops to be more correct as per Eric D.
* Converted exit path failures to BUG_ON as per Ben H.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
net/Kconfig | 2 +-
net/core/drop_monitor.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/net/Kconfig b/net/Kconfig
index e07272d..76ad6fa 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -295,7 +295,7 @@ config NET_TCPPROBE
module will be called tcp_probe.
config NET_DROP_MONITOR
- boolean "Network packet drop alerting service"
+ tristate "Network packet drop alerting service"
depends on INET && EXPERIMENTAL && TRACEPOINTS
---help---
This feature provides an alerting service to userspace in the
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index cfeeef8..b6760a6 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -24,6 +24,7 @@
#include <linux/timer.h>
#include <linux/bitops.h>
#include <linux/slab.h>
+#include <linux/module.h>
#include <net/genetlink.h>
#include <net/netevent.h>
@@ -225,9 +226,15 @@ static int set_all_monitor_traces(int state)
switch (state) {
case TRACE_ON:
+ if (!try_module_get(THIS_MODULE)) {
+ rc = -ENODEV;
+ break;
+ }
+
rc |= register_trace_kfree_skb(trace_kfree_skb_hit, NULL);
rc |= register_trace_napi_poll(trace_napi_poll_hit, NULL);
break;
+
case TRACE_OFF:
rc |= unregister_trace_kfree_skb(trace_kfree_skb_hit, NULL);
rc |= unregister_trace_napi_poll(trace_napi_poll_hit, NULL);
@@ -243,6 +250,9 @@ static int set_all_monitor_traces(int state)
kfree_rcu(new_stat, rcu);
}
}
+
+ module_put(THIS_MODULE);
+
break;
default:
rc = 1;
@@ -368,7 +378,7 @@ static int __init init_net_drop_monitor(void)
rc = 0;
- for_each_present_cpu(cpu) {
+ for_each_possible_cpu(cpu) {
data = &per_cpu(dm_cpu_data, cpu);
reset_per_cpu_data(data);
INIT_WORK(&data->dm_alert_work, send_dm_alert);
@@ -385,4 +395,36 @@ out:
return rc;
}
-late_initcall(init_net_drop_monitor);
+static void exit_net_drop_monitor(void)
+{
+ struct per_cpu_dm_data *data;
+ int cpu;
+
+ BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier));
+
+ /*
+ * Because of the module_get/put we do in the trace state change path
+ * we are guaranteed not to have any current users when we get here
+ * all we need to do is make sure that we don't have any running timers
+ * or pending schedule calls
+ */
+
+ for_each_possible_cpu(cpu) {
+ data = &per_cpu(dm_cpu_data, cpu);
+ del_timer(&data->send_timer);
+ cancel_work_sync(&data->dm_alert_work);
+ /*
+ * At this point, we should have exclusive access
+ * to this struct and can free the skb inside it
+ */
+ kfree_skb(data->skb);
+ }
+
+ BUG_ON(genl_unregister_family(&net_drop_monitor_family));
+}
+
+module_init(init_net_drop_monitor);
+module_exit(exit_net_drop_monitor);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Neil Horman <nhorman@tuxdriver.com>");
--
1.7.7.6
^ permalink raw reply related
* Re: [RFC 13/13] USB: Disable hub-initiated LPM for comms devices.
From: Andre Bella @ 2012-05-17 17:50 UTC (permalink / raw)
To: Tilman Schmidt, Sarah Sharp
Cc: gigaset307x-common, libertas-dev, Greg Kroah-Hartman, linux-usb,
linux-wireless, users, linux-bluetooth, ath9k-devel, Alan Stern,
Hansjoerg Lipp, netdev
In-Reply-To: <20120517173150.GE4967@xanatos>
I don't know, sorry.
--- On Thu, 5/17/12, Sarah Sharp <sarah.a.sharp@linux.intel.com> wrote:
From: Sarah Sharp <sarah.a.sharp@linux.intel.com>
Subject: Re: [RFC 13/13] USB: Disable hub-initiated LPM for comms devices.
To: "Tilman Schmidt" <tilman@imap.cc>
Cc: gigaset307x-common@lists.sourceforge.net, libertas-dev@lists.infradead.org, "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>, linux-usb@vger.kernel.org, linux-wireless@vger.kernel.org, users@rt2x00.serialmonkey.com, linux-bluetooth@vger.kernel.org, ath9k-devel@lists.ath9k.org, "Alan Stern" <stern@rowland.harvard.edu>, "Hansjoerg Lipp" <hjlipp@web.de>, netdev@vger.kernel.org
Date: Thursday, May 17, 2012, 1:31 PM
On Thu, May 17, 2012 at 07:07:32PM +0200, Tilman Schmidt wrote:
> Am 16.05.2012 23:55, schrieb Sarah Sharp:
> > Set the disable_hub_initiated_lpm flag for for all USB communications
> > drivers. I know there aren't currently any USB 3.0 devices that
> > implement these class specifications, but we should be ready if they do.
>
> I follow the argument for class drivers. But this patch also
> modifies drivers for specific existing USB 2.0 only devices
> which are unlikely to ever grow USB 3.0 support, such as the
> Gigaset ISDN driver:
>
> > drivers/isdn/gigaset/bas-gigaset.c | 1 +
> > drivers/isdn/gigaset/usb-gigaset.c | 1 +
Is there a particular reason why you think that driver is unlikely to
ever get USB 3.0 support? I pretty much grabbed any USB driver that
looked like a communications driver without looking too closely at the
code.
> What is the interest of setting the disable_hub_initiated_lpm
> flag for these?
It's partially to lay the foundation for anyone who wants to make a USB
3.0 communications driver in the future. They're likely to start from
some USB 2.0 class driver, and copy a lot of code. If they notice that
flag is set in all the USB communications class drivers, they're likely
to set it as well.
I'm not quite sure where the best place to provide documentation on the
flag is. I've added the kernel doc comments to the structure, but maybe
it needs to be documented somewhere in Documentation/usb/?
Sarah Sharp
^ permalink raw reply
* Re: tcp timestamp issues with google servers
From: Eric Dumazet @ 2012-05-17 18:12 UTC (permalink / raw)
To: Miklos Szeredi; +Cc: netdev, linux-kernel
In-Reply-To: <87r4ujno34.fsf@tucsk.pomaz.szeredi.hu>
On Thu, 2012-05-17 at 11:39 +0200, Miklos Szeredi wrote:
> Sometimes connection to google.com, gmail.com and other google servers
> doesn't work or takes ages to connect. When this hits it hits all
> google servers at the same time and it's persistent. It never happens
> to anything other than google. Rebooting helps. Rarely it goes away
> spontaneously.
>
> Apparently google is sometimes replying with an invalid TSecr timestamp
> value (smaller than the one sent in the last packet) and this confuses
> the Linux TCP stack which either discards the packet or sends a Reset.
>
> Network dump attached.
>
> I found only a couple of references to this issue:
>
> http://gotchas.livejournal.com/3028.html
>
> http://groups.google.com/group/comp.os.linux.networking/browse_thread/thread/29f56feded11b42a
>
> Turning off tcp timestamps fixes the issue:
>
> sysctl -w net.ipv4.tcp_timestamps=0
>
> Not sure why this happens only to me and a very few others.
>
> It appears to be an issue with google TCP stack (is it a modified
> stack?) but I thought about issues in my network switch (restarting it
> doesn't help) or something in the ISP, but those look unlikely.
>
> Any ideas?
>
> Thanks,
> Miklos
>
>
>
> 1 0.000000 192.168.28.100 -> 74.125.232.226 TCP 51303 > http [SYN] Seq=0 Win=14600 Len=0 MSS=1460 SACK_PERM=1 TSV=35355050 TSER=0 WS=5
> 2 0.002730 74.125.232.226 -> 192.168.28.100 TCP http > 51303 [SYN, ACK] Seq=0 Ack=1 Win=14180 Len=0 MSS=1430 SACK_PERM=1 TSV=1184565067 TSER=35325344 WS=6
Do you really have a 2730 usec RTT between you and this (Google?)
server?
Are you sure this is not a broken middlebox?
^ permalink raw reply
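For readers following the trace above: the SYN carries TSV=35355050 while the SYN-ACK echoes TSER=35325344, which is 29706 ticks older than anything this connection sent. A receiver can detect that with a wraparound-safe window test on the 32-bit timestamps. The sketch below is an assumption about how such a check can be written, not a copy of any stack's code; the function names and the choice of window bounds are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if ts lies in the window [low, high], computed modulo 2^32 so
 * that the test still works when the timestamp clock wraps around.
 */
static bool sketch_ts_between(uint32_t ts, uint32_t low, uint32_t high)
{
    return (uint32_t)(high - low) >= (uint32_t)(ts - low);
}

/*
 * An echoed timestamp (TSecr) is plausible only if it falls between the
 * first TSval we sent on this connection and our current clock value.
 */
static bool sketch_tsecr_plausible(uint32_t tsecr,
                                   uint32_t first_tsval_sent,
                                   uint32_t tsval_now)
{
    return sketch_ts_between(tsecr, first_tsval_sent, tsval_now);
}
```

With the values from the trace, the echoed 35325344 falls outside any window that starts at 35355050, which is consistent with the packet being dropped or answered with a reset as described in the report.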
* Re: [PATCH v2] drop_monitor: convert to modular building
From: Ben Hutchings @ 2012-05-17 18:14 UTC (permalink / raw)
To: Neil Horman; +Cc: netdev, David S. Miller, Eric Dumazet
In-Reply-To: <1337276940-5025-1-git-send-email-nhorman@tuxdriver.com>
On Thu, 2012-05-17 at 13:49 -0400, Neil Horman wrote:
> When I first wrote drop monitor I wrote it to just build monolithically. There
> is no reason it can't be built modularly as well, so lets give it that
> flexibiity.
>
> I've tested this by building it as both a module and monolithically, and it
> seems to work quite well
>
> Change notes:
>
> v2)
> * fixed for_each_present_cpu loops to be more correct as per Eric D.
> * Converted exit path failures to BUG_ON as per Ben H.
Sorry I didn't pick up on this the first time:
[...]
> -late_initcall(init_net_drop_monitor);
> +static void exit_net_drop_monitor(void)
> +{
> + struct per_cpu_dm_data *data;
> + int cpu;
> +
> + BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier));
> +
> + /*
> + * Because of the module_get/put we do in the trace state change path
> + * we are guarnateed not to have any current users when we get here
> + * all we need to do is make sure that we don't have any running timers
> + * or pending schedule calls
> + */
> +
> + for_each_possible_cpu(cpu) {
> + data = &per_cpu(dm_cpu_data, cpu);
> + del_timer(&data->send_timer);
Doesn't this need to be del_timer_sync()?
Ben.
> + cancel_work_sync(&data->dm_alert_work);
> + /*
> + * At this point, we should have exclusive access
> + * to this struct and can free the skb inside it
> + */
> + kfree_skb(data->skb);
> + }
> +
> + BUG_ON(genl_unregister_family(&net_drop_monitor_family));
> +}
> +
> +module_init(init_net_drop_monitor);
> +module_exit(exit_net_drop_monitor);
> +
> +MODULE_LICENSE("GPL v2");
> +MODULE_AUTHOR("Neil Horman <nhorman@tuxdriver.com>");
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Ben Hutchings @ 2012-05-17 18:38 UTC (permalink / raw)
To: Willy Tarreau; +Cc: Eric Dumazet, netdev
In-Reply-To: <20120517155621.GK14498@1wt.eu>
On Thu, 2012-05-17 at 17:56 +0200, Willy Tarreau wrote:
[...]
> The NIC does not support TSO but I've seen an alternate driver for this
> NIC which pretends to do TSO and in fact builds header frags so that the
> NIC is able to send all frames at once. I think it's already what GSO is
> doing but I'm wondering whether it would be possible to get more speed
> by doing this than by relying on GSO to (possibly) split the frags earlier.
[...]
Yes, GSO has some overhead for skb allocation and additional function
calls that you can avoid by doing 'TSO' in the driver.
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* Re: [PATCH net-next] tcp: bool conversions
From: David Miller @ 2012-05-17 19:03 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev
In-Reply-To: <1337246134.4740.5.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 17 May 2012 11:15:34 +0200
> From: Eric Dumazet <edumazet@google.com>
>
> bool conversions where possible.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, with commit message updated to reflect reality :-)
^ permalink raw reply
* Re: [net-next 0/4][pull request] Intel Wired LAN Driver Updates
From: David Miller @ 2012-05-17 19:12 UTC (permalink / raw)
To: jeffrey.t.kirsher; +Cc: netdev, gospo, sassmann
In-Reply-To: <1337254070-32500-1-git-send-email-jeffrey.t.kirsher@intel.com>
From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Date: Thu, 17 May 2012 04:27:46 -0700
> This series of patches contains updates for e1000, e1000e and igb.
>
> The following are changes since commit dc6b9b78234fecdc6d2ca5e1629185718202bcf5:
> net: include/net/sock.h cleanup
> and are available in the git repository at:
> git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master
Pulled, thanks.
^ permalink raw reply
* Re: [PATCH v2] drop_monitor: convert to modular building
From: Neil Horman @ 2012-05-17 19:33 UTC (permalink / raw)
To: Ben Hutchings; +Cc: netdev, David S. Miller, Eric Dumazet
In-Reply-To: <1337278445.2496.17.camel@bwh-desktop.uk.solarflarecom.com>
On Thu, May 17, 2012 at 07:14:05PM +0100, Ben Hutchings wrote:
> On Thu, 2012-05-17 at 13:49 -0400, Neil Horman wrote:
> > When I first wrote drop monitor I wrote it to just build monolithically. There
> > is no reason it can't be built modularly as well, so lets give it that
> > flexibiity.
> >
> > I've tested this by building it as both a module and monolithically, and it
> > seems to work quite well
> >
> > Change notes:
> >
> > v2)
> > * fixed for_each_present_cpu loops to be more correct as per Eric D.
> > * Converted exit path failures to BUG_ON as per Ben H.
>
> Sorry I didn't pick up on this the first time:
>
> [...]
> > -late_initcall(init_net_drop_monitor);
> > +static void exit_net_drop_monitor(void)
> > +{
> > + struct per_cpu_dm_data *data;
> > + int cpu;
> > +
> > + BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier));
> > +
> > + /*
> > + * Because of the module_get/put we do in the trace state change path
> > + * we are guarnateed not to have any current users when we get here
> > + * all we need to do is make sure that we don't have any running timers
> > + * or pending schedule calls
> > + */
> > +
> > + for_each_possible_cpu(cpu) {
> > + data = &per_cpu(dm_cpu_data, cpu);
> > + del_timer(&data->send_timer);
>
> Doesn't this need to be del_timer_sync()?
>
Yeah, good catch. I was thinking it didn't need to be, as the timer doesn't
re-arm itself and the cancel_work_sync would undo anything that a running timer
did. But thinking about it, it's possible that a timer could fire on cpu A, and
cpu B could execute and complete the cancel_work_sync before cpu A schedules
the work, so there is a race window there. I'll fix that up.
Neil
^ permalink raw reply
* Re: [net-next 2/4] e1000: remove workaround for Errata 23 from jumbo alloc
From: David Miller @ 2012-05-17 19:32 UTC (permalink / raw)
To: bhutchings; +Cc: jeffrey.t.kirsher, bigeasy, netdev, gospo, sassmann
In-Reply-To: <1337265630.2496.11.camel@bwh-desktop.uk.solarflarecom.com>
From: Ben Hutchings <bhutchings@solarflare.com>
Date: Thu, 17 May 2012 15:40:30 +0100
> I don't believe PAGE_SIZE is >64K on any architecture, but perhaps you
> should replace the run-time check with:
powerpc can be built with PAGE_SHIFT == 18
^ permalink raw reply
* Re: [PATCH net-next-2.6] pppoe: remove unused return value from two methods.
From: David Miller @ 2012-05-17 19:35 UTC (permalink / raw)
To: ramirose; +Cc: netdev
In-Reply-To: <CAHLOa7RiNcZBLE7z7JTY-7WJGOL_DTYq33hmwyTdM0HWqn_WgA@mail.gmail.com>
From: Rami Rosen <ramirose@gmail.com>
Date: Thu, 17 May 2012 18:04:50 +0300
> Hi,
>
> The patch removes unused return value from __delete_item() and
> delete_item() methods in drivers/net/ppp/pppoe.c.
>
> Signed-off-by: Rami Rosen <ramirose@gmail.com>
Applied, but please in the future:
1) Don't put "hi," "how are you" and things like that in your message
body, I have to edit them out every time I commit your
changes.
2) Remove that leading space from your commit message lines, I have
to edit those out as well.
Thank you.
^ permalink raw reply
* Re: [PATCH net-next] etherdevice: fix comments
From: David Miller @ 2012-05-17 19:37 UTC (permalink / raw)
To: shemminger; +Cc: netdev
In-Reply-To: <20120517081728.54719129@nehalam.linuxnetplumber.net>
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Thu, 17 May 2012 08:17:28 -0700
> Fix some minor problems in comments of etherdevice.h
> * Warning is out dated, file hasn't moved or disappeared in many years and
> is unlikely to do so soon.
> * Capitalize Ethernet consistently since it is a proper name
> * Fix descriptive comment of padding
> * Spelling and grammar fix for alignment comment
>
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
Applied, thanks.
^ permalink raw reply
* Re: [PATCH] ipv6: correct the ipv6 option name - Pad0 to Pad1
From: David Miller @ 2012-05-17 19:50 UTC (permalink / raw)
To: eldad
Cc: yoshfuji, netdev, bridge, hadi, jmorris, linux-kernel, kuznet,
shemminger
In-Reply-To: <1337270425-21873-1-git-send-email-eldad@fogrefinery.com>
From: Eldad Zack <eldad@fogrefinery.com>
Date: Thu, 17 May 2012 18:00:25 +0200
> The padding destination or hop-by-hop option is called Pad1 and not Pad0.
>
> See RFC2460 (4.2) or the IANA ipv6-parameters registry:
> http://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xml
>
> Signed-off-by: Eldad Zack <eldad@fogrefinery.com>
Applied to net-next.
^ permalink raw reply
* Re: [PATCH net-next] net: netdev_alloc_skb() use build_skb()
From: David Miller @ 2012-05-17 19:53 UTC (permalink / raw)
To: eric.dumazet; +Cc: w, netdev
In-Reply-To: <1337276056.3403.37.camel@edumazet-glaptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 17 May 2012 19:34:16 +0200
> [PATCH net-next] net: netdev_alloc_skb() use build_skb()
>
> netdev_alloc_skb() is used by networks driver in their RX path to
> allocate an skb to receive an incoming frame.
>
> With recent skb->head_frag infrastructure, it makes sense to change
> netdev_alloc_skb() to use build_skb() and a frag allocator.
>
> This permits a zero copy splice(socket->pipe), and better GRO or TCP
> coalescing.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Applied, we can sort out any fallout very easily before 3.5 is released.
Awesome work Eric.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: David Miller @ 2012-05-17 19:55 UTC (permalink / raw)
To: w; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517150157.GA19274@1wt.eu>
From: Willy Tarreau <w@1wt.eu>
Date: Thu, 17 May 2012 17:01:57 +0200
>>From 6da6a21798d0156e647a993c31782eec739fa5df Mon Sep 17 00:00:00 2001
> From: Willy Tarreau <w@1wt.eu>
> Date: Thu, 17 May 2012 16:48:56 +0200
> Subject: [PATCH] tcp: force push data out when buffers are missing
>
> Commit 2f533844242 (tcp: allow splice() to build full TSO packets)
> significantly improved splice() performance for some workloads but
> caused stalls when pipe buffers were larger than socket buffers.
>
> The issue seems to happen when no data can be copied at all due to
> lack of buffers, which results in pending data never being pushed.
>
> This change checks if all pending data has been pushed or not and
> pushes them when waiting for send buffers.
Eric, please indicate whether we need Willy's patch here.
I want to propagate this fix as fast as possible if so.
^ permalink raw reply
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: Willy Tarreau @ 2012-05-17 20:04 UTC (permalink / raw)
To: David Miller; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517.155503.2294382162578627387.davem@davemloft.net>
Hi David,
On Thu, May 17, 2012 at 03:55:03PM -0400, David Miller wrote:
> From: Willy Tarreau <w@1wt.eu>
> Date: Thu, 17 May 2012 17:01:57 +0200
>
> >>From 6da6a21798d0156e647a993c31782eec739fa5df Mon Sep 17 00:00:00 2001
> > From: Willy Tarreau <w@1wt.eu>
> > Date: Thu, 17 May 2012 16:48:56 +0200
> > Subject: [PATCH] tcp: force push data out when buffers are missing
> >
> > Commit 2f533844242 (tcp: allow splice() to build full TSO packets)
> > significantly improved splice() performance for some workloads but
> > caused stalls when pipe buffers were larger than socket buffers.
> >
> > The issue seems to happen when no data can be copied at all due to
> > lack of buffers, which results in pending data never being pushed.
> >
> > This change checks if all pending data has been pushed or not and
> > pushes them when waiting for send buffers.
>
> Eric, please indicate whether we need Willy's patch here.
>
> I want to propagate this fix as fast as possible if so.
I think you should hold off for now, because it's possible that my patch
hides another issue instead of fixing it.
I'm having the same stall issue again since I applied Eric's build_skb
patch, but not for all data sizes. So if the same issue is still there,
it's possible that we're playing hide-and-seek with it. That's rather
strange.
Thanks,
Willy
^ permalink raw reply
* [PATCH v3] drop_monitor: convert to modular building
From: Neil Horman @ 2012-05-17 20:04 UTC (permalink / raw)
To: netdev; +Cc: Neil Horman, David S. Miller, Eric Dumazet, Ben Hutchings
In-Reply-To: <1337178426-2470-1-git-send-email-nhorman@tuxdriver.com>
When I first wrote drop monitor I wrote it to just build monolithically. There
is no reason it can't be built modularly as well, so let's give it that
flexibility.
I've tested this by building it as both a module and monolithically, and it
seems to work quite well.
Change notes:
v2)
* fixed for_each_present_cpu loops to be more correct as per Eric D.
* Converted exit path failures to BUG_ON as per Ben H.
v3)
* Converted del_timer to del_timer_sync to close race noted by Ben H.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: "David S. Miller" <davem@davemloft.net>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: Ben Hutchings <bhutchings@solarflare.com>
---
net/Kconfig | 2 +-
net/core/drop_monitor.c | 46 ++++++++++++++++++++++++++++++++++++++++++++--
2 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/net/Kconfig b/net/Kconfig
index e07272d..76ad6fa 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -295,7 +295,7 @@ config NET_TCPPROBE
module will be called tcp_probe.
config NET_DROP_MONITOR
- boolean "Network packet drop alerting service"
+ tristate "Network packet drop alerting service"
depends on INET && EXPERIMENTAL && TRACEPOINTS
---help---
This feature provides an alerting service to userspace in the
diff --git a/net/core/drop_monitor.c b/net/core/drop_monitor.c
index cfeeef8..f93f985 100644
--- a/net/core/drop_monitor.c
+++ b/net/core/drop_monitor.c
@@ -24,6 +24,7 @@
#include <linux/timer.h>
#include <linux/bitops.h>
#include <linux/slab.h>
+#include <linux/module.h>
#include <net/genetlink.h>
#include <net/netevent.h>
@@ -225,9 +226,15 @@ static int set_all_monitor_traces(int state)
switch (state) {
case TRACE_ON:
+ if (!try_module_get(THIS_MODULE)) {
+ rc = -ENODEV;
+ break;
+ }
+
rc |= register_trace_kfree_skb(trace_kfree_skb_hit, NULL);
rc |= register_trace_napi_poll(trace_napi_poll_hit, NULL);
break;
+
case TRACE_OFF:
rc |= unregister_trace_kfree_skb(trace_kfree_skb_hit, NULL);
rc |= unregister_trace_napi_poll(trace_napi_poll_hit, NULL);
@@ -243,6 +250,9 @@ static int set_all_monitor_traces(int state)
kfree_rcu(new_stat, rcu);
}
}
+
+ module_put(THIS_MODULE);
+
break;
default:
rc = 1;
@@ -368,7 +378,7 @@ static int __init init_net_drop_monitor(void)
rc = 0;
- for_each_present_cpu(cpu) {
+ for_each_possible_cpu(cpu) {
data = &per_cpu(dm_cpu_data, cpu);
reset_per_cpu_data(data);
INIT_WORK(&data->dm_alert_work, send_dm_alert);
@@ -385,4 +395,36 @@ out:
return rc;
}
-late_initcall(init_net_drop_monitor);
+static void exit_net_drop_monitor(void)
+{
+ struct per_cpu_dm_data *data;
+ int cpu;
+
+ BUG_ON(unregister_netdevice_notifier(&dropmon_net_notifier));
+
+ /*
+ * Because of the module_get/put we do in the trace state change path
+ * we are guaranteed not to have any current users when we get here
+ * all we need to do is make sure that we don't have any running timers
+ * or pending schedule calls
+ */
+
+ for_each_possible_cpu(cpu) {
+ data = &per_cpu(dm_cpu_data, cpu);
+ del_timer_sync(&data->send_timer);
+ cancel_work_sync(&data->dm_alert_work);
+ /*
+ * At this point, we should have exclusive access
+ * to this struct and can free the skb inside it
+ */
+ kfree_skb(data->skb);
+ }
+
+ BUG_ON(genl_unregister_family(&net_drop_monitor_family));
+}
+
+module_init(init_net_drop_monitor);
+module_exit(exit_net_drop_monitor);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Neil Horman <nhorman@tuxdriver.com>");
--
1.7.7.6
^ permalink raw reply related
* [PATCH] netfilter: xt_recent: Add optional mask option for xt_recent
From: Denys Fedoryshchenko @ 2012-05-17 20:07 UTC (permalink / raw)
To: Linux netdev; +Cc: Pablo Neira Ayuso, Denys Fedoryshchenko
Use cases for this feature:
1) On some occasions you need to allow, block, or match a specific subnet.
2) With mask 0.0.0.0, recent can be used as a trigger when a netfilter rule matches.
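To make the two use cases concrete, here is a small userspace sketch of the masking step, modelled on the nf_inet_addr_mask() helper this patch adds. The `sketch_` names are hypothetical, and addresses are written in host byte order for readability (the kernel keeps them in network byte order, which does not change the result of a bitwise AND as long as address and mask use the same order).

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for union nf_inet_addr: four 32-bit words (IPv4 uses word 0). */
union sketch_inet_addr {
    uint32_t all[4];
};

/* result = a1 & mask, word by word, as in the helper added by the patch. */
static void sketch_addr_mask(const union sketch_inet_addr *a1,
                             union sketch_inet_addr *result,
                             const union sketch_inet_addr *mask)
{
    for (int i = 0; i < 4; i++)
        result->all[i] = a1->all[i] & mask->all[i];
}

/* Two addresses share one table entry if they are equal after masking. */
static bool sketch_same_entry(const union sketch_inet_addr *a,
                              const union sketch_inet_addr *b,
                              const union sketch_inet_addr *mask)
{
    union sketch_inet_addr ma, mb;

    sketch_addr_mask(a, &ma, mask);
    sketch_addr_mask(b, &mb, mask);
    return memcmp(&ma, &mb, sizeof(ma)) == 0;
}
```

With a /24 mask, every host of a subnet maps to the same entry (use case 1). With an all-zero mask every address maps to a single entry, so the table behaves like a global trigger (use case 2). An all-ones mask keeps the old per-address behaviour.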
Tested for backward compatibility:
1) old (userspace) iptables, new kernel
2) old kernel, new iptables
3) new kernel, new iptables
For v2:
As Pablo Neira Ayuso suggested, moved nf_inet_addr_mask to xt_recent.h
and made info_v1 a stack variable.
Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
CC: Pablo Neira Ayuso <pablo@netfilter.org>
---
include/linux/netfilter/xt_recent.h | 20 +++++++++++
net/netfilter/xt_recent.c | 62 ++++++++++++++++++++++++++++++----
2 files changed, 74 insertions(+), 8 deletions(-)
diff --git a/include/linux/netfilter/xt_recent.h b/include/linux/netfilter/xt_recent.h
index 83318e0..5f69ebc 100644
--- a/include/linux/netfilter/xt_recent.h
+++ b/include/linux/netfilter/xt_recent.h
@@ -32,4 +32,24 @@ struct xt_recent_mtinfo {
__u8 side;
};
+struct xt_recent_mtinfo_v1 {
+ __u32 seconds;
+ __u32 hit_count;
+ __u8 check_set;
+ __u8 invert;
+ char name[XT_RECENT_NAME_LEN];
+ __u8 side;
+ union nf_inet_addr mask;
+};
+
+static inline void nf_inet_addr_mask(const union nf_inet_addr *a1,
+ union nf_inet_addr *result,
+ const union nf_inet_addr *mask)
+{
+ result->all[0] = a1->all[0] & mask->all[0];
+ result->all[1] = a1->all[1] & mask->all[1];
+ result->all[2] = a1->all[2] & mask->all[2];
+ result->all[3] = a1->all[3] & mask->all[3];
+}
+
#endif /* _LINUX_NETFILTER_XT_RECENT_H */
diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index fc0d6db..ca4375c 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -75,6 +75,7 @@ struct recent_entry {
struct recent_table {
struct list_head list;
char name[XT_RECENT_NAME_LEN];
+ union nf_inet_addr mask;
unsigned int refcnt;
unsigned int entries;
struct list_head lru_list;
@@ -228,10 +229,11 @@ recent_mt(const struct sk_buff *skb, struct xt_action_param *par)
{
struct net *net = dev_net(par->in ? par->in : par->out);
struct recent_net *recent_net = recent_pernet(net);
- const struct xt_recent_mtinfo *info = par->matchinfo;
+ const struct xt_recent_mtinfo_v1 *info = par->matchinfo;
struct recent_table *t;
struct recent_entry *e;
union nf_inet_addr addr = {};
+ union nf_inet_addr addr_masked;
u_int8_t ttl;
bool ret = info->invert;
@@ -261,12 +263,15 @@ recent_mt(const struct sk_buff *skb, struct xt_action_param *par)
spin_lock_bh(&recent_lock);
t = recent_table_lookup(recent_net, info->name);
- e = recent_entry_lookup(t, &addr, par->family,
+
+ nf_inet_addr_mask(&addr, &addr_masked, &t->mask);
+
+ e = recent_entry_lookup(t, &addr_masked, par->family,
(info->check_set & XT_RECENT_TTL) ? ttl : 0);
if (e == NULL) {
if (!(info->check_set & XT_RECENT_SET))
goto out;
- e = recent_entry_init(t, &addr, par->family, ttl);
+ e = recent_entry_init(t, &addr_masked, par->family, ttl);
if (e == NULL)
par->hotdrop = true;
ret = !ret;
@@ -306,10 +311,10 @@ out:
return ret;
}
-static int recent_mt_check(const struct xt_mtchk_param *par)
+static int recent_mt_check(const struct xt_mtchk_param *par,
+ const struct xt_recent_mtinfo_v1 *info)
{
struct recent_net *recent_net = recent_pernet(par->net);
- const struct xt_recent_mtinfo *info = par->matchinfo;
struct recent_table *t;
#ifdef CONFIG_PROC_FS
struct proc_dir_entry *pde;
@@ -361,6 +366,8 @@ static int recent_mt_check(const struct xt_mtchk_param *par)
goto out;
}
t->refcnt = 1;
+
+ memcpy(&t->mask, &info->mask, sizeof(t->mask));
strcpy(t->name, info->name);
INIT_LIST_HEAD(&t->lru_list);
for (i = 0; i < ip_list_hash_size; i++)
@@ -385,10 +392,29 @@ out:
return ret;
}
+static int recent_mt_check_v0(const struct xt_mtchk_param *par)
+{
+ const struct xt_recent_mtinfo_v0 *info_v0 = par->matchinfo;
+ struct xt_recent_mtinfo_v1 info_v1;
+ int ret;
+
+ /* Copy old data */
+ memcpy(&info_v1, info_v0, sizeof(struct xt_recent_mtinfo));
+ /* Default mask will make same behavior as old recent */
+ memset(info_v1.mask.all, 0xFF, sizeof(info_v1.mask.all));
+ ret = recent_mt_check(par, &info_v1);
+ return ret;
+}
+
+static int recent_mt_check_v1(const struct xt_mtchk_param *par)
+{
+ return recent_mt_check(par, par->matchinfo);
+}
+
static void recent_mt_destroy(const struct xt_mtdtor_param *par)
{
struct recent_net *recent_net = recent_pernet(par->net);
- const struct xt_recent_mtinfo *info = par->matchinfo;
+ const struct xt_recent_mtinfo_v1 *info = par->matchinfo;
struct recent_table *t;
mutex_lock(&recent_mutex);
@@ -625,7 +651,7 @@ static struct xt_match recent_mt_reg[] __read_mostly = {
.family = NFPROTO_IPV4,
.match = recent_mt,
.matchsize = sizeof(struct xt_recent_mtinfo),
- .checkentry = recent_mt_check,
+ .checkentry = recent_mt_check_v0,
.destroy = recent_mt_destroy,
.me = THIS_MODULE,
},
@@ -635,10 +661,30 @@ static struct xt_match recent_mt_reg[] __read_mostly = {
.family = NFPROTO_IPV6,
.match = recent_mt,
.matchsize = sizeof(struct xt_recent_mtinfo),
- .checkentry = recent_mt_check,
+ .checkentry = recent_mt_check_v0,
+ .destroy = recent_mt_destroy,
+ .me = THIS_MODULE,
+ },
+ {
+ .name = "recent",
+ .revision = 1,
+ .family = NFPROTO_IPV4,
+ .match = recent_mt,
+ .matchsize = sizeof(struct xt_recent_mtinfo_v1),
+ .checkentry = recent_mt_check_v1,
.destroy = recent_mt_destroy,
.me = THIS_MODULE,
},
+ {
+ .name = "recent",
+ .revision = 1,
+ .family = NFPROTO_IPV6,
+ .match = recent_mt,
+ .matchsize = sizeof(struct xt_recent_mtinfo_v1),
+ .checkentry = recent_mt_check_v1,
+ .destroy = recent_mt_destroy,
+ .me = THIS_MODULE,
+ }
};
static int __init recent_mt_init(void)
--
1.7.3.4
^ permalink raw reply related
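As an illustration of the kernel patch above: recent_mt() now ANDs the packet address with the table's mask before recent_entry_lookup() and recent_entry_init(), so every address that agrees on the masked bits lands on one entry. The following is a minimal userspace sketch of that idea, not the kernel code. The demo_* names are hypothetical, the union is a stand-in for union nf_inet_addr, and the helper builds addresses in host byte order for readability (the kernel stores them in network order, which does not change the result of a bitwise AND).

```c
#include <assert.h>   /* only needed by the usage checks described below */
#include <stdint.h>
#include <string.h>

/* Stand-in for the kernel's union nf_inet_addr: 128 bits as four words.
 * An IPv4 address occupies all[0]; the remaining words stay zero. */
union demo_inet_addr {
	uint32_t all[4];
};

/* Same word-by-word AND that nf_inet_addr_mask() performs in the patch. */
static void demo_addr_mask(const union demo_inet_addr *a1,
			   union demo_inet_addr *result,
			   const union demo_inet_addr *mask)
{
	for (int i = 0; i < 4; i++)
		result->all[i] = a1->all[i] & mask->all[i];
}

/* Hypothetical helper: build an IPv4 address or netmask from dotted parts.
 * Host byte order is used here purely for illustration. */
static union demo_inet_addr demo_ipv4(uint8_t a, uint8_t b,
				      uint8_t c, uint8_t d)
{
	union demo_inet_addr addr;

	memset(&addr, 0, sizeof(addr));
	addr.all[0] = ((uint32_t)a << 24) | ((uint32_t)b << 16) |
		      ((uint32_t)c << 8) | (uint32_t)d;
	return addr;
}

/* Returns 1 when x and y yield the same lookup key under mask, else 0. */
static int demo_same_key(const union demo_inet_addr *x,
			 const union demo_inet_addr *y,
			 const union demo_inet_addr *mask)
{
	union demo_inet_addr kx, ky;

	demo_addr_mask(x, &kx, mask);
	demo_addr_mask(y, &ky, mask);
	return memcmp(&kx, &ky, sizeof(kx)) == 0;
}
```

With a 255.255.255.0 mask, 192.168.1.10 and 192.168.1.200 share a key while 192.168.2.10 does not. With 0.0.0.0 every address shares one key, and with an all-ones mask each address keeps its own key, which is the pre-patch behaviour that recent_mt_check_v0() preserves.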
* Re: Stable regression with 'tcp: allow splice() to build full TSO packets'
From: David Miller @ 2012-05-17 20:07 UTC (permalink / raw)
To: w; +Cc: eric.dumazet, netdev
In-Reply-To: <20120517200404.GO14498@1wt.eu>
From: Willy Tarreau <w@1wt.eu>
Date: Thu, 17 May 2012 22:04:04 +0200
> I'm having the same stall issue again since I applied Eric's build_skb
> patch, but not for all data sizes. So if the same issue is still there,
> it's possible that we're playing hide-and-seek with it. That's rather
> strange.
Ok, a Heisenbug :-) Let me know when you guys resolve this.
^ permalink raw reply
* Re: [PATCH v3] drop_monitor: convert to modular building
From: Ben Hutchings @ 2012-05-17 20:08 UTC (permalink / raw)
To: Neil Horman; +Cc: netdev, David S. Miller, Eric Dumazet
In-Reply-To: <1337285040-20848-1-git-send-email-nhorman@tuxdriver.com>
On Thu, 2012-05-17 at 16:04 -0400, Neil Horman wrote:
> When I first wrote drop monitor I wrote it to just build monolithically. There
> is no reason it can't be built modularly as well, so let's give it that
> flexibility.
>
> I've tested this by building it as both a module and monolithically, and it
> seems to work quite well.
>
> Change notes:
>
> v2)
> * fixed for_each_present_cpu loops to be more correct as per Eric D.
> * Converted exit path failures to BUG_ON as per Ben H.
>
> v3)
> * Converted del_timer to del_timer_sync to close race noted by Ben H.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
> CC: "David S. Miller" <davem@davemloft.net>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> CC: Ben Hutchings <bhutchings@solarflare.com>
[...]
Reviewed-by: Ben Hutchings <bhutchings@solarflare.com>
Thanks,
Ben.
--
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.
^ permalink raw reply
* [PATCH] iptables: xt_recent: Add optional mask option for xt_recent
From: Denys Fedoryshchenko @ 2012-05-17 20:08 UTC (permalink / raw)
To: netfilter-devel; +Cc: Linux netdev, Pablo Neira Ayuso, Denys Fedoryshchenko
Use cases for this feature:
1) On some occasions you need to allow, block, or match a specific subnet.
2) recent can be used as a trigger when a netfilter rule matches, with mask 0.0.0.0
(every address then maps to the same entry).
Tested for backward compatibility:
1) old (userspace) iptables, new kernel
2) old kernel, new iptables
3) new kernel, new iptables
Signed-off-by: Denys Fedoryshchenko <denys@visp.net.lb>
---
extensions/libxt_recent.c | 152 ++++++++++++++++++++++++++++++----
include/linux/netfilter/xt_recent.h | 11 +++-
2 files changed, 144 insertions(+), 19 deletions(-)
diff --git a/extensions/libxt_recent.c b/extensions/libxt_recent.c
index c7dce4e..930da29 100644
--- a/extensions/libxt_recent.c
+++ b/extensions/libxt_recent.c
@@ -16,6 +16,7 @@ enum {
O_NAME,
O_RSOURCE,
O_RDEST,
+ O_MASK,
F_SET = 1 << O_SET,
F_RCHECK = 1 << O_RCHECK,
F_UPDATE = 1 << O_UPDATE,
@@ -25,7 +26,7 @@ enum {
};
#define s struct xt_recent_mtinfo
-static const struct xt_option_entry recent_opts[] = {
+static const struct xt_option_entry recent_opts_v0[] = {
{.name = "set", .id = O_SET, .type = XTTYPE_NONE,
.excl = F_ANY_OP, .flags = XTOPT_INVERT},
{.name = "rcheck", .id = O_RCHECK, .type = XTTYPE_NONE,
@@ -50,6 +51,33 @@ static const struct xt_option_entry recent_opts[] = {
};
#undef s
+#define s struct xt_recent_mtinfo_v1
+static const struct xt_option_entry recent_opts_v1[] = {
+ {.name = "set", .id = O_SET, .type = XTTYPE_NONE,
+ .excl = F_ANY_OP, .flags = XTOPT_INVERT},
+ {.name = "rcheck", .id = O_RCHECK, .type = XTTYPE_NONE,
+ .excl = F_ANY_OP, .flags = XTOPT_INVERT},
+ {.name = "update", .id = O_UPDATE, .type = XTTYPE_NONE,
+ .excl = F_ANY_OP, .flags = XTOPT_INVERT},
+ {.name = "remove", .id = O_REMOVE, .type = XTTYPE_NONE,
+ .excl = F_ANY_OP, .flags = XTOPT_INVERT},
+ {.name = "seconds", .id = O_SECONDS, .type = XTTYPE_UINT32,
+ .flags = XTOPT_PUT, XTOPT_POINTER(s, seconds)},
+ {.name = "hitcount", .id = O_HITCOUNT, .type = XTTYPE_UINT32,
+ .flags = XTOPT_PUT, XTOPT_POINTER(s, hit_count)},
+ {.name = "rttl", .id = O_RTTL, .type = XTTYPE_NONE,
+ .excl = F_SET | F_REMOVE},
+ {.name = "name", .id = O_NAME, .type = XTTYPE_STRING,
+ .flags = XTOPT_PUT, XTOPT_POINTER(s, name)},
+ {.name = "rsource", .id = O_RSOURCE, .type = XTTYPE_NONE},
+ {.name = "rdest", .id = O_RDEST, .type = XTTYPE_NONE},
+ {.name = "mask", .id = O_MASK, .type = XTTYPE_HOST,
+ .flags = XTOPT_PUT, XTOPT_POINTER(s, mask)},
+ XTOPT_TABLEEND,
+};
+#undef s
+
+
static void recent_help(void)
{
printf(
@@ -74,24 +102,27 @@ static void recent_help(void)
" --name name Name of the recent list to be used. DEFAULT used if none given.\n"
" --rsource Match/Save the source address of each packet in the recent list table (default).\n"
" --rdest Match/Save the destination address of each packet in the recent list table.\n"
+" --mask netmask Netmask that will be applied to this recent list.\n"
"xt_recent by: Stephen Frost <sfrost@snowman.net>. http://snowman.net/projects/ipt_recent/\n");
}
-static void recent_init(struct xt_entry_match *match)
+static void recent_init(struct xt_entry_match *match,unsigned int family)
{
- struct xt_recent_mtinfo *info = (void *)(match)->data;
+ struct xt_recent_mtinfo *info_v0 = (void *)(match)->data;
+ struct xt_recent_mtinfo_v1 *info_v1 = (void *)(match)->data;
- strncpy(info->name,"DEFAULT", XT_RECENT_NAME_LEN);
+ strncpy(info_v0->name,"DEFAULT", XT_RECENT_NAME_LEN);
/* even though XT_RECENT_NAME_LEN is currently defined as 200,
* better be safe, than sorry */
- info->name[XT_RECENT_NAME_LEN-1] = '\0';
- info->side = XT_RECENT_SOURCE;
+ info_v0->name[XT_RECENT_NAME_LEN-1] = '\0';
+ info_v0->side = XT_RECENT_SOURCE;
+ if (family == NFPROTO_IPV6)
+ memset(&info_v1->mask,0xFF,sizeof(info_v1->mask));
}
static void recent_parse(struct xt_option_call *cb)
{
struct xt_recent_mtinfo *info = cb->data;
-
xtables_option_parse(cb);
switch (cb->entry->id) {
case O_SET:
@@ -140,9 +171,9 @@ static void recent_check(struct xt_fcheck_call *cb)
}
static void recent_print(const void *ip, const struct xt_entry_match *match,
- int numeric)
+ unsigned int family)
{
- const struct xt_recent_mtinfo *info = (const void *)match->data;
+ const struct xt_recent_mtinfo_v1 *info = (const void *)match->data;
if (info->invert)
printf(" !");
@@ -167,11 +198,17 @@ static void recent_print(const void *ip, const struct xt_entry_match *match,
printf(" side: source");
if (info->side == XT_RECENT_DEST)
printf(" side: dest");
+ if (family == NFPROTO_IPV4)
+ printf(" mask: %s",
+ xtables_ipaddr_to_numeric(&info->mask.in));
+ if (family == NFPROTO_IPV6)
+ printf(" mask: %s",
+ xtables_ip6addr_to_numeric(&info->mask.in6));
}
-static void recent_save(const void *ip, const struct xt_entry_match *match)
+static void recent_save(const void *ip, const struct xt_entry_match *match,unsigned int family)
{
- const struct xt_recent_mtinfo *info = (const void *)match->data;
+ const struct xt_recent_mtinfo_v1 *info = (const void *)match->data;
if (info->invert)
printf(" !");
@@ -191,28 +228,107 @@ static void recent_save(const void *ip, const struct xt_entry_match *match)
if (info->check_set & XT_RECENT_TTL)
printf(" --rttl");
if(info->name) printf(" --name %s",info->name);
+ if (family == NFPROTO_IPV4)
+ printf(" --mask %s",
+ xtables_ipaddr_to_numeric(&info->mask.in));
+ if (family == NFPROTO_IPV6)
+ printf(" --mask %s",
+ xtables_ip6addr_to_numeric(&info->mask.in6));
+
if (info->side == XT_RECENT_SOURCE)
printf(" --rsource");
if (info->side == XT_RECENT_DEST)
printf(" --rdest");
}
-static struct xtables_match recent_mt_reg = {
- .name = "recent",
+static void recent_init_v0(struct xt_entry_match *match) {
+ recent_init(match,NFPROTO_UNSPEC);
+}
+
+static void recent_init_v1(struct xt_entry_match *match) {
+ recent_init(match,NFPROTO_IPV6);
+}
+
+static void recent_save_v0(const void *ip, const struct xt_entry_match *match)
+{
+ recent_save(ip,match,NFPROTO_UNSPEC);
+}
+
+static void recent_save_v4(const void *ip, const struct xt_entry_match *match)
+{
+ recent_save(ip,match,NFPROTO_IPV4);
+}
+
+static void recent_save_v6(const void *ip, const struct xt_entry_match *match)
+{
+ recent_save(ip,match,NFPROTO_IPV6);
+}
+
+static void recent_print_v0(const void *ip, const struct xt_entry_match *match,
+ int numeric)
+{
+ recent_print(ip,match,NFPROTO_UNSPEC);
+}
+
+static void recent_print_v4(const void *ip, const struct xt_entry_match *match,
+ int numeric)
+{
+ recent_print(ip,match,NFPROTO_IPV4);
+}
+
+static void recent_print_v6(const void *ip, const struct xt_entry_match *match,
+ int numeric)
+{
+ recent_print(ip,match,NFPROTO_IPV6);
+}
+
+static struct xtables_match recent_mt_reg[] = {
+ { .name = "recent",
.version = XTABLES_VERSION,
+ .revision = 0,
.family = NFPROTO_UNSPEC,
.size = XT_ALIGN(sizeof(struct xt_recent_mtinfo)),
.userspacesize = XT_ALIGN(sizeof(struct xt_recent_mtinfo)),
.help = recent_help,
- .init = recent_init,
+ .init = recent_init_v0,
+ .x6_parse = recent_parse,
+ .x6_fcheck = recent_check,
+ .print = recent_print_v0,
+ .save = recent_save_v0,
+ .x6_options = recent_opts_v0,
+ },
+ { .name = "recent",
+ .version = XTABLES_VERSION,
+ .revision = 1,
+ .family = NFPROTO_IPV4,
+ .size = XT_ALIGN(sizeof(struct xt_recent_mtinfo_v1)),
+ .userspacesize = XT_ALIGN(sizeof(struct xt_recent_mtinfo_v1)),
+ .help = recent_help,
+ .init = recent_init_v1,
+ .x6_parse = recent_parse,
+ .x6_fcheck = recent_check,
+ .print = recent_print_v4,
+ .save = recent_save_v4,
+ .x6_options = recent_opts_v1,
+ },
+ { .name = "recent",
+ .version = XTABLES_VERSION,
+ .revision = 1,
+ .family = NFPROTO_IPV6,
+ .size = XT_ALIGN(sizeof(struct xt_recent_mtinfo_v1)),
+ .userspacesize = XT_ALIGN(sizeof(struct xt_recent_mtinfo_v1)),
+ .help = recent_help,
+ .init = recent_init_v1,
.x6_parse = recent_parse,
.x6_fcheck = recent_check,
- .print = recent_print,
- .save = recent_save,
- .x6_options = recent_opts,
+ .print = recent_print_v6,
+ .save = recent_save_v6,
+ .x6_options = recent_opts_v1,
+ }
};
void _init(void)
{
- xtables_register_match(&recent_mt_reg);
+ xtables_register_matches(recent_mt_reg,
+ ARRAY_SIZE(recent_mt_reg));
}
diff --git a/include/linux/netfilter/xt_recent.h b/include/linux/netfilter/xt_recent.h
index 83318e0..b8d58c6 100644
--- a/include/linux/netfilter/xt_recent.h
+++ b/include/linux/netfilter/xt_recent.h
@@ -22,7 +22,6 @@ enum {
#define XT_RECENT_VALID_FLAGS (XT_RECENT_CHECK|XT_RECENT_SET|XT_RECENT_UPDATE|\
XT_RECENT_REMOVE|XT_RECENT_TTL|XT_RECENT_REAP)
-
struct xt_recent_mtinfo {
__u32 seconds;
__u32 hit_count;
@@ -32,4 +31,14 @@ struct xt_recent_mtinfo {
__u8 side;
};
+struct xt_recent_mtinfo_v1 {
+ __u32 seconds;
+ __u32 hit_count;
+ __u8 check_set;
+ __u8 invert;
+ char name[XT_RECENT_NAME_LEN];
+ __u8 side;
+ union nf_inet_addr mask;
+};
+
#endif /* _LINUX_NETFILTER_XT_RECENT_H */
--
1.7.3.4
^ permalink raw reply related
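The header change above makes struct xt_recent_mtinfo_v1 a copy of the old structure with a mask appended at the end. That layout is what allows old match data to be upgraded by copying the old bytes and then filling the mask with ones. The sketch below shows this under stated assumptions: the demo_* names are hypothetical, a plain array of four 32-bit words stands in for union nf_inet_addr, and DEMO_NAME_LEN is set to 200 because a comment in the patch says XT_RECENT_NAME_LEN is currently 200.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NAME_LEN 200	/* assumed value of XT_RECENT_NAME_LEN */

/* Mirrors the field order of the old (revision 0) match info. */
struct demo_mtinfo_v0 {
	uint32_t seconds;
	uint32_t hit_count;
	uint8_t  check_set;
	uint8_t  invert;
	char     name[DEMO_NAME_LEN];
	uint8_t  side;
};

/* Revision 1: identical prefix, plus a trailing 128-bit mask. */
struct demo_mtinfo_v1 {
	uint32_t seconds;
	uint32_t hit_count;
	uint8_t  check_set;
	uint8_t  invert;
	char     name[DEMO_NAME_LEN];
	uint8_t  side;
	uint32_t mask[4];	/* stand-in for union nf_inet_addr */
};

/* Upgrade old match data: copy the shared prefix, then set an all-ones
 * mask so that masking an address leaves it unchanged. */
static void demo_upgrade_v0(const struct demo_mtinfo_v0 *v0,
			    struct demo_mtinfo_v1 *v1)
{
	/* The raw copy is only valid while both layouts share the prefix
	 * and the old structure ends before the new mask begins. */
	assert(offsetof(struct demo_mtinfo_v1, side) ==
	       offsetof(struct demo_mtinfo_v0, side));
	assert(sizeof(*v0) <= offsetof(struct demo_mtinfo_v1, mask));

	memcpy(v1, v0, sizeof(*v0));
	memset(v1->mask, 0xFF, sizeof(v1->mask));
}
```

The two assertions spell out the layout assumption the raw copy relies on. If a field were ever inserted before the mask rather than after it, they would fail instead of letting the copy silently misplace data.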
* Re: [PATCH 08/17] net: Introduce sk_gfp_atomic() to allow addition of GFP flags depending on the individual socket
From: David Miller @ 2012-05-17 20:10 UTC (permalink / raw)
To: mgorman
Cc: akpm, linux-mm, netdev, linux-kernel, neilb, a.p.zijlstra,
michaelc, emunson
In-Reply-To: <1337266231-8031-9-git-send-email-mgorman@suse.de>
From: Mel Gorman <mgorman@suse.de>
Date: Thu, 17 May 2012 15:50:22 +0100
> Introduce sk_gfp_atomic(). This function allows sock-specific flags to be
> injected into each sock-related allocation. It is only used on allocation
> paths that may be required for writing pages back to network storage.
>
> [davem@davemloft.net: Use sk_gfp_atomic only when necessary]
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> Signed-off-by: Mel Gorman <mgorman@suse.de>
Acked-by: David S. Miller <davem@davemloft.net>
^ permalink raw reply
* Re: [PATCH v3] drop_monitor: convert to modular building
From: David Miller @ 2012-05-17 20:09 UTC (permalink / raw)
To: nhorman; +Cc: netdev, eric.dumazet, bhutchings
In-Reply-To: <1337285040-20848-1-git-send-email-nhorman@tuxdriver.com>
From: Neil Horman <nhorman@tuxdriver.com>
Date: Thu, 17 May 2012 16:04:00 -0400
> When I first wrote drop monitor I wrote it to just build monolithically. There
> is no reason it can't be built modularly as well, so let's give it that
> flexibility.
>
> I've tested this by building it as both a module and monolithically, and it
> seems to work quite well.
>
> Change notes:
>
> v2)
> * fixed for_each_present_cpu loops to be more correct as per Eric D.
> * Converted exit path failures to BUG_ON as per Ben H.
>
> v3)
> * Converted del_timer to del_timer_sync to close race noted by Ben H.
>
> Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Applied, although it didn't apply cleanly to net-next.
^ permalink raw reply