Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: 3.4-rc: NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out
From: Tomas Papan @ 2012-06-12 14:34 UTC (permalink / raw)
  To: Francois Romieu; +Cc: netdev
In-Reply-To: <CAMGsXDSsSMWfrV1vjnqkWH45ypZw+ZtYF7=ytc79MWRJ5RF5jQ@mail.gmail.com>

Hi Francois,

Unfortunately after the warning occurred  once again after 28 000 seconds.
Is there anything else what can I do?

Regards
Tomas

[28914.344765] ------------[ cut here ]------------
[28914.344775] WARNING: at net/sched/sch_generic.c:256
dev_watchdog+0x16b/0x20f()
[28914.344779] Hardware name: SH55J
[28914.344782] NETDEV WATCHDOG: eth1 (r8169): transmit queue 0 timed out
[28914.344785] Modules linked in: vhost_net cls_route cls_u32 cls_fw
sch_sfq sch_htb ipt_REDIRECT ipt_MASQUERADE xt_limit xt_tcpudp
nf_conntrack_ipv6 nf_defrag_ipv6 xt_state iptable_nat nf_nat
nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip6table_filter
ip6_tables iptable_filter ip_tables x_tables kvm_intel kvm fuse r8169
[28914.344819] Pid: 0, comm: swapper/2 Not tainted 3.4.2 #1
[28914.344822] Call Trace:
[28914.344825]  <IRQ>  [<ffffffff81026881>] ? warn_slowpath_common+0x78/0x8c
[28914.344837]  [<ffffffff81026936>] ? warn_slowpath_fmt+0x45/0x4a
[28914.344843]  [<ffffffff81042fc2>] ? scheduler_tick+0xaf/0xc3
[28914.344850]  [<ffffffff81049f70>] ? ktime_get+0x5f/0xb9
[28914.344855]  [<ffffffff812535a0>] ? dev_watchdog+0x16b/0x20f
[28914.344861]  [<ffffffff8102f3ae>] ? run_timer_softirq+0x177/0x209
[28914.344868]  [<ffffffff8103e7b3>] ? hrtimer_interrupt+0x100/0x19b
[28914.344872]  [<ffffffff81253435>] ? qdisc_reset+0x35/0x35
[28914.344879]  [<ffffffff8102b256>] ? __do_softirq+0x7f/0x106
[28914.344884]  [<ffffffff812e298c>] ? call_softirq+0x1c/0x30
[28914.344890]  [<ffffffff81003376>] ? do_softirq+0x31/0x67
[28914.344895]  [<ffffffff8102b503>] ? irq_exit+0x44/0x75
[28914.344899]  [<ffffffff810032b5>] ? do_IRQ+0x94/0xad
[28914.344905]  [<ffffffff812e10a7>] ? common_interrupt+0x67/0x67
[28914.344908]  <EOI>  [<ffffffff81163f07>] ? intel_idle+0xde/0x103
[28914.344919]  [<ffffffff81163ee3>] ? intel_idle+0xba/0x103
[28914.344926]  [<ffffffff81220bfa>] ? cpuidle_idle_call+0x7e/0xc4
[28914.344933]  [<ffffffff81008e92>] ? cpu_idle+0x53/0x7c
[28914.344937] ---[ end trace 3d8459064a9171b4 ]---
[28914.347829] r8169 0000:01:00.0: eth1: link up

^ permalink raw reply

* Re: [PATCH RFC] c_can_pci: generic module for c_can on PCI
From: Marc Kleine-Budde @ 2012-06-12 14:46 UTC (permalink / raw)
  To: Federico Vaga
  Cc: Wolfgang Grandegger, Giancarlo Asnaghi, Alan Cox,
	Alessandro Rubini, linux-can, netdev, linux-kernel
In-Reply-To: <2105333.GQGP4hKIYU@harkonnen>

[-- Attachment #1: Type: text/plain, Size: 684 bytes --]

On 06/12/2012 04:25 PM, Federico Vaga wrote:
>>> +out_free_clock:
>>> +	if (!priv->priv)
>>
>>            ^^^
>>
>> looks fishy
> 
> Also c_can_platform.c use priv->priv when it needs to get clk. I can add 
> a comment to specify what the statement do.

> +out_free_clock:
> +	if (!priv->priv)
> +		clk_put(priv->priv);

Why do you call clk_put on priv->priv, if priv->priv is NULL?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Re: [RFC] net/sched/em_canid: Ematch rule to match CAN frames according to their CAN IDs
From: Rostislav Lisovy @ 2012-06-12 14:50 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev, linux-can, lartc, pisa, sojkam1
In-Reply-To: <1339511593.22704.157.camel@edumazet-glaptop>

On Tue, 2012-06-12 at 16:33 +0200, Eric Dumazet wrote: 
> On Tue, 2012-06-12 at 16:03 +0200, Eric Dumazet wrote:
> > On Tue, 2012-06-12 at 15:48 +0200, Rostislav Lisovy wrote:
> > > em_canid is an ematch capable of classifying CAN frames according to
> > > their CAN IDs.
> > > 
> > > This RFC/Patch contains a reworked classifier initially posted in
> > > http://www.spinics.net/lists/netdev/msg200114.html
> > > The functionality is the same however there is almost 50% reduction
> > > in the source code length.
> > > 
> > > There is a slight difference between this ematch and other available
> > > ematches. Other ematches implement only a simple match operation and
> > > are meant to be combined with logic conjunctions (e.g. AND, OR).
> > > Our ematch makes it possible to use up to 32 rules in single
> > > 'configuration statement' (with OR semantics). This allows us to take
> > > the advantage of the bit field data-structure in the implementation of
> > > the match function.
> > > 
> > > Example: canid(sff 0x123 eff 0x124 sff 0x230:0x7f0)
> > > This ematch would match CAN SFF frames with the following IDs:
> > > 0x123, 0x230--0x23f or EFF frame with ID 0x124.
> > > 
> > > Signed-off-by: Rostislav Lisovy <lisovy@gmail.com>
> > > ---
> > >  include/linux/can.h     |    3 +
> > >  include/linux/pkt_cls.h |    5 +-
> > >  net/sched/Kconfig       |   10 +++
> > >  net/sched/Makefile      |    1 +
> > >  net/sched/em_canid.c    |  217 +++++++++++++++++++++++++++++++++++++++++++++++
> > >  5 files changed, 234 insertions(+), 2 deletions(-)
> > >  create mode 100644 net/sched/em_canid.c
> > > 
> > > diff --git a/include/linux/can.h b/include/linux/can.h
> > > index 9a19bcb..08d1610 100644
> > > --- a/include/linux/can.h
> > > +++ b/include/linux/can.h
> > > @@ -38,6 +38,9 @@
> > >   */
> > >  typedef __u32 canid_t;
> > >  
> > > +#define CAN_SFF_ID_BITS		11
> > > +#define CAN_EFF_ID_BITS		29
> > > +
> > 
> > > +struct canid_match {
> > > +	struct can_filter rules_raw[EM_CAN_RULES_SIZE]; /* Raw rules copied
> > > +		from netlink message; Used for sending information to
> > > +		userspace (when 'tc filter show' is invoked) AND when
> > > +		matching EFF frames*/
> > > +	DECLARE_BITMAP(match_sff, (1 << CAN_SFF_ID_BITS)); /* For each SFF CAN
> > > +		ID (11 bit) there is one record in this	bitfield */
> > > +	int rules_count;
> > > +	int eff_rules_count;
> > > +	int sff_rules_count;
> > > +
> > > +	struct rcu_head rcu;
> > > +};
> > 
> > The size of kmalloc() blob to hold this structure is 4 Mbytes
> > 
> > This is a huge cost, and unlikely to succeed but shortly after boot...
> > 
> > (this happens to be the largest possible kmalloc() allocation by the way
> > on x86)
> > 
> > 
> 
> Oh well, I mixed CAN_SFF_ID_BITS / CAN_EFF_ID_BITS
> 
> 

Indeed; On x86_64 the sizeof(struct canid_match) is 544 bytes;

Regards;
Rostislav Lisovy

^ permalink raw reply

* Re: [PATCH RFC] c_can_pci: generic module for c_can on PCI
From: Federico Vaga @ 2012-06-12 14:53 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: Wolfgang Grandegger, Giancarlo Asnaghi, Alan Cox,
	Alessandro Rubini, linux-can, netdev, linux-kernel
In-Reply-To: <4FD75648.7060702@pengutronix.de>


> > +out_free_clock:
> > +	if (!priv->priv)
> > +		clk_put(priv->priv);
> 
> Why do you call clk_put on priv->priv, if priv->priv is NULL?

Right!

if(priv->priv)
    clk_put(priv->priv);

-- 
Federico Vaga

^ permalink raw reply

* RE: [PATCH] netpoll: Fix skb tail pointer in netpoll_send_udp()
From: Hamciuc Bogdan-BHAMCIU1 @ 2012-06-12 15:08 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <1339508049.22704.143.camel@edumazet-glaptop>

Hi Eric,

> -----Original Message-----
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Tuesday, June 12, 2012 4:34 PM
> To: Hamciuc Bogdan-BHAMCIU1
> Cc: davem@davemloft.net; netdev@vger.kernel.org
> Subject: Re: [PATCH] netpoll: Fix skb tail pointer in netpoll_send_udp()
> 
> On Tue, 2012-06-12 at 15:15 +0200, Eric Dumazet wrote:
> > On Tue, 2012-06-12 at 14:22 +0200, Eric Dumazet wrote:
> 
> > > Hmm, real question is why skb_realloc_headroom() is even necessary...
Our driver (Freescale P4080, unfortunately not upstream yet) needs a minimum amount of headroom in order to communicate metadata to the NIC.
For instance, frame offsets to the protocol headers are stored there, which the NIC then uses to fill in the egress checksum.
Originating frames, on the other hand, don't always have that amount of headroom, so we occasionally need to realloc. 
> > >
> > > I suspect we need to reserve more bytes.
> > >
> > > 	total_len = ip_len + ETH_HLEN + NET_IP_ALIGN + NET_SKB_PAD;
> > >
> > > or something like that ?
Indeed, your counter-proposal patch (adding LL_RESERVED_SPACE()) worked fine.

> > >
> > > Which driver triggers the bug ?
> > >
> 
> In case you wonder why I try so hard to avoid the
> skb_realloc_headroom() :
> 
> netpoll has complicated^Wspecial^Wnice skb cache, to make sure it can
> work even if memory is exhausted.
> 
> But if we trigger skb_realloc_headroom() the whole thing is useless.
> 
Thank you,
Bogdan


^ permalink raw reply

* Re: [RFC] net/sched/em_canid: Ematch rule to match CAN frames according to their CAN IDs
From: Oliver Hartkopp @ 2012-06-12 15:42 UTC (permalink / raw)
  To: Rostislav Lisovy, Eric Dumazet; +Cc: netdev, linux-can, lartc, pisa, sojkam1
In-Reply-To: <1339512622.8536.3.camel@lolumad>


Hello Rostislav, hello Eric,

On 12.06.2012 16:50, Rostislav Lisovy wrote:

> On Tue, 2012-06-12 at 16:33 +0200, Eric Dumazet wrote: 
>> On Tue, 2012-06-12 at 16:03 +0200, Eric Dumazet wrote:
>>> On Tue, 2012-06-12 at 15:48 +0200, Rostislav Lisovy wrote:
>>>> em_canid is an ematch capable of classifying CAN frames according to
>>>> their CAN IDs.
>>>>
>>>> This RFC/Patch contains a reworked classifier initially posted in
>>>> http://www.spinics.net/lists/netdev/msg200114.html
>>>> The functionality is the same however there is almost 50% reduction
>>>> in the source code length.
>>>>
>>>> There is a slight difference between this ematch and other available
>>>> ematches. Other ematches implement only a simple match operation and
>>>> are meant to be combined with logic conjunctions (e.g. AND, OR).
>>>> Our ematch makes it possible to use up to 32 rules in single
>>>> 'configuration statement' (with OR semantics). This allows us to take
>>>> the advantage of the bit field data-structure in the implementation of
>>>> the match function.
>>>>
>>>> Example: canid(sff 0x123 eff 0x124 sff 0x230:0x7f0)
>>>> This ematch would match CAN SFF frames with the following IDs:
>>>> 0x123, 0x230--0x23f or EFF frame with ID 0x124.



i tried to figure out the difference between the implementation as classifier
(link above) and this implementation as an ematch.

Do we have any disadvantages using the ematch? E.g. is it still possible to
add additional ematches like checking for patterns inside can_frame.data[]
(which is located in skb->data) with ematch_u32 or e.g. ematch_text ??


...

Btw.

>>>> +struct canid_match {
>>>> +	struct can_filter rules_raw[EM_CAN_RULES_SIZE]; /* Raw rules copied
>>>> +		from netlink message; Used for sending information to
>>>> +		userspace (when 'tc filter show' is invoked) AND when
>>>> +		matching EFF frames*/
>>>> +	DECLARE_BITMAP(match_sff, (1 << CAN_SFF_ID_BITS)); /* For each SFF CAN
>>>> +		ID (11 bit) there is one record in this	bitfield */


these comments have a styling problem :-)

Additional to these checkpatch warnings:

WARNING: line over 80 characters
#144: FILE: net/sched/em_canid.c:48:
+static void em_canid_sff_match_add(struct canid_match *cm, u32 can_id, u32 can_mask)

WARNING: line over 80 characters
#231: FILE: net/sched/em_canid.c:135:
+	if (cm->rules_count > EM_CAN_RULES_SIZE) /* Be sure to fit into the array */

WARNING: line over 80 characters
#256: FILE: net/sched/em_canid.c:160:
+			memcpy(cm->rules_raw + cm->eff_rules_count + cm->sff_rules_count,



Tnx & regards,
Oliver

^ permalink raw reply

* Re: net/netfilter/nf_conntrack_proto_tcp.c:1606:9: error: ‘struct nf_proto_net’ has no member named ‘user’
From: Pablo Neira Ayuso @ 2012-06-12 16:03 UTC (permalink / raw)
  To: Gao feng; +Cc: David Miller, wfg, netdev
In-Reply-To: <4FD72203.9090005@cn.fujitsu.com>

On Tue, Jun 12, 2012 at 07:03:31PM +0800, Gao feng wrote:
> 于 2012年06月12日 17:29, Pablo Neira Ayuso 写道:
> 
> >> nf_proto_net.users has different meaning when SYSCTL enabled or disabled.
> >>
> >> when SYSCTL enabled,it means if both tcpv4 and tcpv6 register the sysctl,
> >> it is increased when register sysctl success and decreased when unregister sysctl.
> >> we can regard it as the refcnt of ctl_table.
> >>
> >> when SYSCTL disabled,it just used to identify if the proto's pernet data
> >> has been initialized.
> > 
> > We have to use two different counters for this. The conditional
> > meaning of that variable is really confusing.
> > 
> Hi David & Pablo
> 
> Please have a look at this patch and tell me if it's OK.
> it base on Pable's patch.

I think we have to merge those tcpv4_init_net and tcpv6_init_net
functions into one single function tcp_init_net. Then, we can pass
l4proto->l3proto to init_net:

        if (proto->init_net) {
                ret = proto->init_net(net, l4proto->l3proto);
                if (ret < 0)
                        return ret;
        }

Thus, we can check if this is IPv4 or IPv6 and initialize the compat
part accordingly.

Still, we have that pn->users thing:

        if (!pn->users++) {
                for (i = 0; i < TCP_CONNTRACK_TIMEOUT_MAX; i++)
                        tn->timeouts[i] = tcp_timeouts[i];

                tn->tcp_loose = nf_ct_tcp_loose;
                tn->tcp_be_liberal = nf_ct_tcp_be_liberal;
                tn->tcp_max_retrans = nf_ct_tcp_max_retrans;
        }

Define some pn->initialized boolean. Set it to true at the end of
the new tcp_init_net.

Similar thing for other protocol trackers.

Let me know if you are going to send me patches. In that case, please
do it on top of the current tree.

Once that has been cleaned up, we can prepare follow-up patches to
move the sysctl code to nf_conntrack_proto_*_sysctl.c to reduce the
ifdef pollution.

^ permalink raw reply

* [PATCH v2] bonding: Fix corrupted queue_mapping
From: Eric Dumazet @ 2012-06-12 16:03 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Tom Herbert, John Fastabend, Roland Dreier

From: Eric Dumazet <edumazet@google.com>

In the transmit path of the bonding driver, skb->cb is used to
stash the skb->queue_mapping so that the bonding device can set its
own queue mapping.  This value becomes corrupted since the skb->cb is
also used in __dev_xmit_skb.

When transmitting through bonding driver, bond_select_queue is
called from dev_queue_xmit.  In bond_select_queue the original
skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
and skb->queue_mapping is overwritten with the bond driver queue.

Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
the packet length into skb->cb, thereby overwriting the stashed
queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
the queue mapping for the skb is set to the stashed value which is now
the skb length and hence is an invalid queue for the slave device.

If we want to save skb->queue_mapping into skb->cb[], best place is to
add a field in struct qdisc_skb_cb, to make sure it wont conflict with
other layers (eg : Qdiscc, Infiniband...)

This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8
bytes :

netem qdisc for example assumes it can store an u64 in it, without
misalignment penalty.

Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
The largest user is CHOKe and it fills it.

Based on a previous patch from Tom Herbert.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Tom Herbert <therbert@google.com>
Cc: John Fastabend <john.r.fastabend@intel.com>
Cc: Roland Dreier <roland@kernel.org>
---
 drivers/net/bonding/bond_main.c |    9 +++++----
 include/net/sch_generic.h       |    7 +++++--
 2 files changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2ee8cf9..b9c2ae6 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -76,6 +76,7 @@
 #include <net/route.h>
 #include <net/net_namespace.h>
 #include <net/netns/generic.h>
+#include <net/pkt_sched.h>
 #include "bonding.h"
 #include "bond_3ad.h"
 #include "bond_alb.h"
@@ -381,8 +382,6 @@ struct vlan_entry *bond_next_vlan(struct bonding *bond, struct vlan_entry *curr)
 	return next;
 }
 
-#define bond_queue_mapping(skb) (*(u16 *)((skb)->cb))
-
 /**
  * bond_dev_queue_xmit - Prepare skb for xmit.
  *
@@ -395,7 +394,9 @@ int bond_dev_queue_xmit(struct bonding *bond, struct sk_buff *skb,
 {
 	skb->dev = slave_dev;
 
-	skb->queue_mapping = bond_queue_mapping(skb);
+	BUILD_BUG_ON(sizeof(skb->queue_mapping) !=
+		     sizeof(qdisc_skb_cb(skb)->bond_queue_mapping));
+	skb->queue_mapping = qdisc_skb_cb(skb)->bond_queue_mapping;
 
 	if (unlikely(netpoll_tx_running(slave_dev)))
 		bond_netpoll_send_skb(bond_get_slave_by_dev(bond, slave_dev), skb);
@@ -4171,7 +4172,7 @@ static u16 bond_select_queue(struct net_device *dev, struct sk_buff *skb)
 	/*
 	 * Save the original txq to restore before passing to the driver
 	 */
-	bond_queue_mapping(skb) = skb->queue_mapping;
+	qdisc_skb_cb(skb)->bond_queue_mapping = skb->queue_mapping;
 
 	if (unlikely(txq >= dev->real_num_tx_queues)) {
 		do {
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 55ce96b..9d7d54a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -220,13 +220,16 @@ struct tcf_proto {
 
 struct qdisc_skb_cb {
 	unsigned int		pkt_len;
-	unsigned char		data[24];
+	u16			bond_queue_mapping;
+	u16			_pad;
+	unsigned char		data[20];
 };
 
 static inline void qdisc_cb_private_validate(const struct sk_buff *skb, int sz)
 {
 	struct qdisc_skb_cb *qcb;
-	BUILD_BUG_ON(sizeof(skb->cb) < sizeof(unsigned int) + sz);
+
+	BUILD_BUG_ON(sizeof(skb->cb) < offsetof(struct qdisc_skb_cb, data) + sz);
 	BUILD_BUG_ON(sizeof(qcb->data) < sz);
 }
 

^ permalink raw reply related

* Re: [net-next PATCH 00/02] net/ipv4: Add support for new tunnel type VTI.
From: Nicolas Dichtel @ 2012-06-12 16:17 UTC (permalink / raw)
  To: Saurabh; +Cc: netdev
In-Reply-To: <20120608173225.GA11928@debian-saurabh-64.vyatta.com>

Hi,

thank you for pushing this feature upstream Saurabh.
This feature is very usefull, we have implemented something similar in our system.

Regards,
Nicolas

Le 08/06/2012 19:32, Saurabh a écrit :
>
>
> Introduction:
> Virtual tunnel interface is a way to represent policy based IPsec tunnels as virtual interfaces in linux. This is similar to Cisco's VTI (virtual tunnel interface) and Juniper's representaion of secure tunnel (st.xx). The advantage of representing an IPsec tunnel as an interface is that it is possible to plug Ipsec tunnels into the routing protocol infrastructure of a router. Therefore it becomes possible to influence the packet path by toggling the link state of the tunnel or based on routing metrics.
>
> Overview:
> Natively linux kernel does not support ipsec as an interface. Also secure interface assume a ipsec policy 4 tupple of {dst-ip-any, src-ip-any, dst-port-any, src-port-any}. Applying this 4 tuple in linux would result in all traffic matching the ipsec policy. What is needed is a tunnel distinguisher. The linux kernel skbuff has fwmark which is used for policy based routing (PBR). Linux kernel version 2.6.35 enhanced SPD/SADB to use fwmark as part of the IPsec policy. Strongswan has also introduced support for this kernel feature with version 4.5.0. We can therefore use the fwmark as the distinguisher for tunnel interface. We can also create a light weight tunnel kernel module (vti) to give the notion of an interface for rest of the kernel routing system. The tunnel module does not do any encapsulation/decapsulation. The kernel's xfrm modules still do the esp encryption/decryption.
>
> Usage:
> ip tunnel add sti15 mode vti remote 12.0.0.1 local 12.0.0.3 ikey 15
> or
> ip link add sti15 type vti key 15 remote 12.0.0.1 local 12.0.0.3
>
> Signed-off-by: Saurabh Mohan<saurabh.mohan@vyatta.com>
> Reviewed-by: Stephen Hemminger<shemminger@vyatta.com>
>
> ---
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [net-next.git 4/4 (v4)] stmmac: add the Energy Efficient Ethernet support
From: Ben Hutchings @ 2012-06-12 16:42 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev, rayagond, davem, yuvalmin
In-Reply-To: <1339505153-26731-5-git-send-email-peppe.cavallaro@st.com>

On Tue, 2012-06-12 at 14:45 +0200, Giuseppe CAVALLARO wrote:
> This patch adds the Energy Efficient Ethernet support to the stmmac.
> 
> Please see the driver's documentation for further details about this support
> in the driver.
[...]
> +static int stmmac_ethtool_op_set_eee(struct net_device *dev,
> +				     struct ethtool_eee *edata)
> +{
> +	struct stmmac_priv *priv = netdev_priv(dev);
> +
> +	priv->eee_enabled = edata->eee_enabled;
> +
> +	if (!priv->eee_enabled)
> +		stmmac_disable_eee_mode(priv);
> +	else {
> +		/* We are asking for enabling the EEE but it is safe
> +		 * to verify all by invoking the eee_init function.
> +		 * In case of failure it will return an error.
> +		 */

It would be better if you could avoid changing any settings in the case
of failure, though.

> +		priv->tx_lpi_timer = edata->tx_lpi_timer;
> +		priv->eee_enabled = stmmac_eee_init(priv);
> +		if (!priv->eee_enabled)
> +			return -EPERM;
[...]

Surely -EOPNOTSUPP?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [net-next.git 1/4 (v4)] phy: add the EEE support and the way to access to the MMD registers.
From: Ben Hutchings @ 2012-06-12 16:50 UTC (permalink / raw)
  To: Giuseppe CAVALLARO; +Cc: netdev, rayagond, davem, yuvalmin
In-Reply-To: <1339505153-26731-2-git-send-email-peppe.cavallaro@st.com>

On Tue, 2012-06-12 at 14:45 +0200, Giuseppe CAVALLARO wrote:
> This patch adds the support for the Energy-Efficient Ethernet (EEE)
> to the Physical Abstraction Layer.
> To support the EEE we have to access to the MMD registers 3.20 and
> 7.60/61. So two new functions have been added to read/write the MMD
> registers (clause 45).
> 
> An Ethernet driver (I tested the stmmac) can invoke the phy_init_eee to properly
> check if the EEE is supported by the PHYs and it can also set the clock
> stop enable bit in the 3.0 register.
> The phy_get_eee_err can be used for reporting the number of time where
> the PHY failed to complete its normal wake sequence.
> 
> In the end, this patch also adds the EEE ethtool support implementing:
>  o phy_ethtool_set_eee
>  o phy_ethtool_get_eee
[...]
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
[...]
> +static int phy_eee_to_adv(int eee_adv)
> +{
> +	int adv = 0;
> +
> +	if (eee_adv & MDIO_EEE_100TX)
> +		adv |= ADVERTISED_100baseT_Full;
> +	if (eee_adv & MDIO_EEE_1000T)
> +		adv |= ADVERTISED_1000baseT_Full;
> +	if (eee_adv & MDIO_EEE_10GT)
> +		adv |= ADVERTISED_10000baseT_Full;
> +	if (eee_adv & MDIO_EEE_1000KX)
> +		adv |= ADVERTISED_1000baseKX_Full;
> +	if (eee_adv & MDIO_EEE_10GKX4)
> +		adv |= ADVERTISED_10000baseKX4_Full;
> +	if (eee_adv & MDIO_EEE_10GKR)
> +		adv |= ADVERTISED_10000baseKR_Full;
> +
> +	return adv;
> +}

The type of 'adv' and the return type should be u32 (per ethtool API);
the type of 'eee_adv' should be u16 (16-bit register).

Similarly in phy_eee_to_supported() and phy_adv_to_eee().

[...]
> --- a/include/linux/mii.h
> +++ b/include/linux/mii.h
[...]
> @@ -141,6 +143,15 @@
>  #define FLOW_CTRL_TX		0x01
>  #define FLOW_CTRL_RX		0x02
>  
> +/* MMD Access Control register fields */
> +#define MII_MMD_CTRL_DEVAD_MASK			0x1f	/* Mask MMD DEVAD*/
> +#define MII_MMD_CTRL_FUNC_ADDR			0x0000	/* Address */
> +#define MII_MMD_CTRL_FUNC_DATA_NOINCR		0x4000	/* no post increment */
> +#define MII_MMD_CTRL_FUNC_DATA_INCR_ON_RDWT	0x8000	/* post increment on
> +							 * reads & writes */
> +#define MII_MMD_CTRL_FUNC_DATA_INCR_ON_WT	0xC000	/* post increment on
> +							 * writes only */
> +
[...]

These names are quite long; could they reasonably be shortened, e.g. by
dropping 'FUNC_' and 'DATA_' parts?

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* RE: kernel ipsec error
From: Ben Hutchings @ 2012-06-12 16:53 UTC (permalink / raw)
  To: Marco Berizzi; +Cc: netdev
In-Reply-To: <DUB108-W54AAF31BD3E442DF275349B2F60@phx.gbl>

On Tue, 2012-06-12 at 08:31 +0200, Marco Berizzi wrote:
> bhutchings@solarflare.com wrote:
> > On Mon, 2012-06-11 at 14:45 +0200, Marco Berizzi wrote:
> > > Hello everybody.
> > >
> > > After 12 days uptime I got this message.
> > > Linux is 3.3.5 32bit, running openswan
> > > (this is an ipsec gateway/netfilter
> > > firewall) and squid.
> > >
> > > Jun 11 12:53:02 Pleiadi kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x20)
> > > Jun 11 12:53:02 Pleiadi kernel: cache: kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min order: 0
> > > Jun 11 12:53:02 Pleiadi kernel: node 0: slabs: 61, objs: 476, free: 0
> > > Jun 11 12:53:02 Pleiadi kernel: kworker/0:2: page allocation failure: order:0, mode:0x4020
> >
> > So a single page allocation in atomic (non-sleeping) context failed.
> >
> > [...]
> > > Jun 11 12:53:02 Pleiadi kernel: Free swap = 130908kB
> > > Jun 11 12:53:02 Pleiadi kernel: Total swap = 151164kB
> > > Jun 11 12:53:02 Pleiadi kernel: 40944 pages RAM
> > [...]
> >
> > Not too surprising with this little RAM available (swap didn't help
> > since we couldn't wait for swap-out).
> 
> yes, this is really a very old hardware.
> 
> > There could be a memory leak, but you would need to read /proc/meminfo
> > and /proc/slabinfo at intervals to work out whether that was the case.
> 
> Kindly, may you tell me which should be the intervals? One second? one minute?

Given that you saw this after 12 days, I would think somewhere between
an hour and a day would be appropriate.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH v2] bonding: Fix corrupted queue_mapping
From: Neil Horman @ 2012-06-12 17:16 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Tom Herbert, John Fastabend, Roland Dreier
In-Reply-To: <1339517031.22704.202.camel@edumazet-glaptop>

On Tue, Jun 12, 2012 at 06:03:51PM +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> In the transmit path of the bonding driver, skb->cb is used to
> stash the skb->queue_mapping so that the bonding device can set its
> own queue mapping.  This value becomes corrupted since the skb->cb is
> also used in __dev_xmit_skb.
> 
> When transmitting through bonding driver, bond_select_queue is
> called from dev_queue_xmit.  In bond_select_queue the original
> skb->queue_mapping is copied into skb->cb (via bond_queue_mapping)
> and skb->queue_mapping is overwritten with the bond driver queue.
> 
> Subsequently in dev_queue_xmit, __dev_xmit_skb is called which writes
> the packet length into skb->cb, thereby overwriting the stashed
> queue mappping.  In bond_dev_queue_xmit (called from hard_start_xmit),
> the queue mapping for the skb is set to the stashed value which is now
> the skb length and hence is an invalid queue for the slave device.
> 
> If we want to save skb->queue_mapping into skb->cb[], best place is to
> add a field in struct qdisc_skb_cb, to make sure it wont conflict with
> other layers (eg : Qdiscc, Infiniband...)
> 
> This patchs also makes sure (struct qdisc_skb_cb)->data is aligned on 8
> bytes :
> 
> netem qdisc for example assumes it can store an u64 in it, without
> misalignment penalty.
> 
> Note : we only have 20 bytes left in (struct qdisc_skb_cb)->data[].
> The largest user is CHOKe and it fills it.
> 
> Based on a previous patch from Tom Herbert.
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: Tom Herbert <therbert@google.com>
> Cc: John Fastabend <john.r.fastabend@intel.com>
> Cc: Roland Dreier <roland@kernel.org>
Acked-by: Neil Horman <nhorman@tuxdriver.com>

Looking at this, it would be really nice if we could have some sort of layer
independent space in an skb.  It seems we're often looking to shoehorn more
stuff into the control bock.
Neil

^ permalink raw reply

* Bug in net/ipv6/ip6_fib.c:fib6_dump_table()
From: Debabrata Banerjee @ 2012-06-12 17:22 UTC (permalink / raw)
  To: netdev, davem, kaber, yoshfuji, jmorris, pekkas, kuznet
  Cc: linux-kernel, eric.dumazet, Joshua Hunt

Looks like commit 2bec5a369ee79576a3eea2c23863325089785a2c "ipv6: fib:
fix crash when changing large fib while dumping" is the culprit. The
result of this code is that if there is a tree addition while a dump
has suspended because the netlink skb is full, it will simply go back
to the top of the tree and you end up with duplicate/triplicate/etc
routes. It looks like the code attempts to count nodes, but it's a
linear count and the data structure is a tree so that's a big problem.
The net result is potentially DOSable, since if route table updates
happen often enough in proportion to table size, a dump will attempt
to return an infinite amount of routes (observed). So this commit
should be reverted. However I am interested in the problem that commit
tried to solve, if anyone has more information on that. My assumption
is the fib tree gets corrupted and eventually it crashes in
fib6_dump_table(), which I assume can still happen.

I can easily demonstrate the bug by adding cloned/cache routes while I
check the results of fib6_dump_table:

root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
593
189
root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
884
16
root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
888
78
root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
507
507
root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
533
533
root@a172-25-43-12.deploy.akamaitechnologies.com:~# ip -6 -o route
show table cache |tee tmp | wc -l; sort tmp | uniq -u | wc -l
571
571

Thanks,
Debabrata

^ permalink raw reply

* [PATCH 1/2] netfilter: xt_HMARK: fix endianness and provide consistent hashing
From: pablo @ 2012-06-12 17:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1339522563-9227-1-git-send-email-pablo@netfilter.org>

From: Hans Schillstrom <hans@schillstrom.com>

This patch addresses two issues:

a) Fix usage of u32 and __be32 that causes endianess warnings via sparse.
b) Ensure consistent hashing in a cluster that is composed of big and
   little endian systems. Thus, we obtain the same hash mark in an
   heterogeneous cluster.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Hans Schillstrom <hans@schillstrom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/xt_HMARK.h |    5 +++
 net/netfilter/xt_HMARK.c           |   72 ++++++++++++++++++++----------------
 2 files changed, 46 insertions(+), 31 deletions(-)

diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
index abb1650..826fc58 100644
--- a/include/linux/netfilter/xt_HMARK.h
+++ b/include/linux/netfilter/xt_HMARK.h
@@ -27,7 +27,12 @@ union hmark_ports {
 		__u16	src;
 		__u16	dst;
 	} p16;
+	struct {
+		__be16	src;
+		__be16	dst;
+	} b16;
 	__u32	v32;
+	__be32	b32;
 };
 
 struct xt_hmark_info {
diff --git a/net/netfilter/xt_HMARK.c b/net/netfilter/xt_HMARK.c
index 0a96a43..1686ca1 100644
--- a/net/netfilter/xt_HMARK.c
+++ b/net/netfilter/xt_HMARK.c
@@ -32,13 +32,13 @@ MODULE_ALIAS("ipt_HMARK");
 MODULE_ALIAS("ip6t_HMARK");
 
 struct hmark_tuple {
-	u32			src;
-	u32			dst;
+	__be32			src;
+	__be32			dst;
 	union hmark_ports	uports;
-	uint8_t			proto;
+	u8			proto;
 };
 
-static inline u32 hmark_addr6_mask(const __u32 *addr32, const __u32 *mask)
+static inline __be32 hmark_addr6_mask(const __be32 *addr32, const __be32 *mask)
 {
 	return (addr32[0] & mask[0]) ^
 	       (addr32[1] & mask[1]) ^
@@ -46,8 +46,8 @@ static inline u32 hmark_addr6_mask(const __u32 *addr32, const __u32 *mask)
 	       (addr32[3] & mask[3]);
 }
 
-static inline u32
-hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
+static inline __be32
+hmark_addr_mask(int l3num, const __be32 *addr32, const __be32 *mask)
 {
 	switch (l3num) {
 	case AF_INET:
@@ -58,6 +58,22 @@ hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
 	return 0;
 }
 
+static inline void hmark_swap_ports(union hmark_ports *uports,
+				    const struct xt_hmark_info *info)
+{
+	union hmark_ports hp;
+	u16 src, dst;
+
+	hp.b32 = (uports->b32 & info->port_mask.b32) | info->port_set.b32;
+	src = ntohs(hp.b16.src);
+	dst = ntohs(hp.b16.dst);
+
+	if (dst > src)
+		uports->v32 = (dst << 16) | src;
+	else
+		uports->v32 = (src << 16) | dst;
+}
+
 static int
 hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
 		    const struct xt_hmark_info *info)
@@ -74,22 +90,19 @@ hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
 	otuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
 	rtuple = &ct->tuplehash[IP_CT_DIR_REPLY].tuple;
 
-	t->src = hmark_addr_mask(otuple->src.l3num, otuple->src.u3.all,
-				 info->src_mask.all);
-	t->dst = hmark_addr_mask(otuple->src.l3num, rtuple->src.u3.all,
-				 info->dst_mask.all);
+	t->src = hmark_addr_mask(otuple->src.l3num, otuple->src.u3.ip6,
+				 info->src_mask.ip6);
+	t->dst = hmark_addr_mask(otuple->src.l3num, rtuple->src.u3.ip6,
+				 info->dst_mask.ip6);
 
 	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
 		return 0;
 
 	t->proto = nf_ct_protonum(ct);
 	if (t->proto != IPPROTO_ICMP) {
-		t->uports.p16.src = otuple->src.u.all;
-		t->uports.p16.dst = rtuple->src.u.all;
-		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
-				info->port_set.v32;
-		if (t->uports.p16.dst < t->uports.p16.src)
-			swap(t->uports.p16.dst, t->uports.p16.src);
+		t->uports.b16.src = otuple->src.u.all;
+		t->uports.b16.dst = rtuple->src.u.all;
+		hmark_swap_ports(&t->uports, info);
 	}
 
 	return 0;
@@ -98,15 +111,19 @@ hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
 #endif
 }
 
+/* This hash function is endian independent, to ensure consistent hashing if
+ * the cluster is composed of big and little endian systems. */
 static inline u32
 hmark_hash(struct hmark_tuple *t, const struct xt_hmark_info *info)
 {
 	u32 hash;
+	u32 src = ntohl(t->src);
+	u32 dst = ntohl(t->dst);
 
-	if (t->dst < t->src)
-		swap(t->src, t->dst);
+	if (dst < src)
+		swap(src, dst);
 
-	hash = jhash_3words(t->src, t->dst, t->uports.v32, info->hashrnd);
+	hash = jhash_3words(src, dst, t->uports.v32, info->hashrnd);
 	hash = hash ^ (t->proto & info->proto_mask);
 
 	return (((u64)hash * info->hmodulus) >> 32) + info->hoffset;
@@ -126,11 +143,7 @@ hmark_set_tuple_ports(const struct sk_buff *skb, unsigned int nhoff,
 	if (skb_copy_bits(skb, nhoff, &t->uports, sizeof(t->uports)) < 0)
 		return;
 
-	t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
-			info->port_set.v32;
-
-	if (t->uports.p16.dst < t->uports.p16.src)
-		swap(t->uports.p16.dst, t->uports.p16.src);
+	hmark_swap_ports(&t->uports, info);
 }
 
 #if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
@@ -178,8 +191,8 @@ hmark_pkt_set_htuple_ipv6(const struct sk_buff *skb, struct hmark_tuple *t,
 			return -1;
 	}
 noicmp:
-	t->src = hmark_addr6_mask(ip6->saddr.s6_addr32, info->src_mask.all);
-	t->dst = hmark_addr6_mask(ip6->daddr.s6_addr32, info->dst_mask.all);
+	t->src = hmark_addr6_mask(ip6->saddr.s6_addr32, info->src_mask.ip6);
+	t->dst = hmark_addr6_mask(ip6->daddr.s6_addr32, info->dst_mask.ip6);
 
 	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
 		return 0;
@@ -255,11 +268,8 @@ hmark_pkt_set_htuple_ipv4(const struct sk_buff *skb, struct hmark_tuple *t,
 		}
 	}
 
-	t->src = (__force u32) ip->saddr;
-	t->dst = (__force u32) ip->daddr;
-
-	t->src &= info->src_mask.ip;
-	t->dst &= info->dst_mask.ip;
+	t->src = ip->saddr & info->src_mask.ip;
+	t->dst = ip->daddr & info->dst_mask.ip;
 
 	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
 		return 0;
-- 
1.7.10


^ permalink raw reply related

* [PATCH 0/2] netfilter updates for net tree (3.5-rc2)
From: pablo @ 2012-06-12 17:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>

Hi David,

The following two patches are netfilter fixes for your net tree, they
provide:

* endianess fixes to calm down warnings spotted by the sparse utility for
  xt_HMARK from Hans Schillstrom.

* One fix for h323 conntrack helper from Sanket Shah and Jagdish Motwani.
  The log says it's been me instead because they sent an incomplete patch
  without header, so I had to manually include all the information, but
  forgot to change the GIT_AUTHOR thing.

You can pull these two changes from:

git://1984.lsi.us.es/net master

Thanks!

Hans Schillstrom (1):
  netfilter: xt_HMARK: fix endianness and provide consistent hashing

Pablo Neira Ayuso (1):
  netfilter: nf_ct_h323: fix bug in rtcp natting

 include/linux/netfilter/xt_HMARK.h     |    5 +++
 net/netfilter/nf_conntrack_h323_main.c |    5 +--
 net/netfilter/xt_HMARK.c               |   72 ++++++++++++++++++--------------
 3 files changed, 48 insertions(+), 34 deletions(-)

-- 
1.7.10

^ permalink raw reply

* [PATCH 2/2] netfilter: nf_ct_h323: fix bug in rtcp natting
From: pablo @ 2012-06-12 17:36 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1339522563-9227-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

The nat_rtp_rtcp hook takes two separate parameters port and rtp_port.

port is expected to be the real h245 address (found inside the packet).
rtp_port is the even number closest to port (RTP ports are even and
RTCP ports are odd).

However currently, both port and rtp_port are having same value (both are
rounded to nearest even numbers).

This works well in case of openlogicalchannel with media (RTP/even) port.

But in case of openlogicalchannel for media control (RTCP/odd) port,
h245 address in the packet is wrongly modified to have an even port.

I am attaching a pcap demonstrating the problem, for any further analysis.

This behavior was introduced around v2.6.19 while rewriting the helper.

Signed-off-by: Jagdish Motwani <jagdish.motwani@elitecore.com>
Signed-off-by: Sanket Shah <sanket.shah@elitecore.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conntrack_h323_main.c |    5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c
index 46d69d7..31f50bc 100644
--- a/net/netfilter/nf_conntrack_h323_main.c
+++ b/net/netfilter/nf_conntrack_h323_main.c
@@ -270,9 +270,8 @@ static int expect_rtp_rtcp(struct sk_buff *skb, struct nf_conn *ct,
 		return 0;

 	/* RTP port is even */
-	port &= htons(~1);
-	rtp_port = port;
-	rtcp_port = htons(ntohs(port) + 1);
+	rtp_port = port & ~htons(1);
+	rtcp_port = port | htons(1);

 	/* Create expect for RTP */
 	if ((rtp_exp = nf_ct_expect_alloc(ct)) == NULL)
-- 
1.7.10

^ permalink raw reply related

* [PATCH 0/7] [RFCv2] new user-space connection tracking helper infrastructure
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>

Hi!

This the second round for the kernel patches that add the newly revamped
user-space cthelper infrastructure as described by:

http://lwn.net/Articles/500196/

This round fixes issues that were spotted by Jan Engelhardt, Joe Perches and
Ferenc Wagner. This also includes some improvements I made myself.

Comments welcome.

Pablo Neira Ayuso (7):
  netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names
  netfilter: nf_ct_ext: support variable length extensions
  netfilter: nf_ct_helper: implement variable length helper private  data
  netfilter: add glue code to integrate nfnetlink_queue and ctnetlink
  netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled
  netfilter: ctnetlink: add CTA_HELP_INFO attribute
  netfilter: add user-space connection tracking helper infrastructure

 include/linux/netfilter.h                      |   12 +
 include/linux/netfilter/Kbuild                 |    1 +
 include/linux/netfilter/nf_conntrack_sip.h     |    2 +
 include/linux/netfilter/nfnetlink.h            |    3 +-
 include/linux/netfilter/nfnetlink_conntrack.h  |    1 +
 include/linux/netfilter/nfnetlink_cthelper.h   |   55 ++
 include/linux/netfilter/nfnetlink_queue.h      |    3 +
 include/linux/netfilter_ipv4.h                 |    1 +
 include/linux/netfilter_ipv6.h                 |    1 +
 include/net/netfilter/nf_conntrack.h           |   35 +-
 include/net/netfilter/nf_conntrack_expect.h    |    4 +-
 include/net/netfilter/nf_conntrack_extend.h    |    7 +-
 include/net/netfilter/nf_conntrack_helper.h    |   29 +-
 include/net/netfilter/nf_nat_helper.h          |    4 +
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   49 +-
 net/ipv4/netfilter/nf_nat_amanda.c             |    4 +-
 net/ipv4/netfilter/nf_nat_h323.c               |    8 +-
 net/ipv4/netfilter/nf_nat_helper.c             |   13 +
 net/ipv4/netfilter/nf_nat_pptp.c               |    6 +-
 net/ipv4/netfilter/nf_nat_tftp.c               |    4 +-
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   49 +-
 net/netfilter/Kconfig                          |    8 +
 net/netfilter/Makefile                         |    1 +
 net/netfilter/core.c                           |    4 +
 net/netfilter/nf_conntrack_core.c              |    3 +-
 net/netfilter/nf_conntrack_extend.c            |   16 +-
 net/netfilter/nf_conntrack_ftp.c               |   11 +-
 net/netfilter/nf_conntrack_h323_main.c         |   16 +-
 net/netfilter/nf_conntrack_helper.c            |   32 +-
 net/netfilter/nf_conntrack_irc.c               |    8 +-
 net/netfilter/nf_conntrack_netlink.c           |  177 ++++++-
 net/netfilter/nf_conntrack_pptp.c              |   17 +-
 net/netfilter/nf_conntrack_proto_gre.c         |   16 +-
 net/netfilter/nf_conntrack_sane.c              |   12 +-
 net/netfilter/nf_conntrack_sip.c               |   32 +-
 net/netfilter/nf_conntrack_tftp.c              |    8 +-
 net/netfilter/nfnetlink_cthelper.c             |  674 ++++++++++++++++++++++++
 net/netfilter/nfnetlink_queue.c                |   58 +-
 net/netfilter/xt_CT.c                          |   44 +-
 39 files changed, 1247 insertions(+), 181 deletions(-)
 create mode 100644 include/linux/netfilter/nfnetlink_cthelper.h
 create mode 100644 net/netfilter/nfnetlink_cthelper.c

-- 
1.7.10


^ permalink raw reply

* [PATCH 2/7] netfilter: nf_ct_ext: support variable length extensions
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

We can now define conntrack extensions of variable size. This
patch is useful to get rid of these unions:

union nf_conntrack_help
union nf_conntrack_proto
union nf_conntrack_nat_help

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_extend.h |    7 ++++++-
 net/netfilter/nf_conntrack_extend.c         |   16 +++++++++-------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_extend.h b/include/net/netfilter/nf_conntrack_extend.h
index 96755c3..ecedd5f 100644
--- a/include/net/netfilter/nf_conntrack_extend.h
+++ b/include/net/netfilter/nf_conntrack_extend.h
@@ -79,11 +79,16 @@ static inline void nf_ct_ext_free(struct nf_conn *ct)
 		kfree(ct->ext);
 }
 
+void *__nf_ct_ext_add_length(struct nf_conn *ct, enum nf_ct_ext_id id,
+			     size_t var_alloc_len, gfp_t gfp);
+
 /* Add this type, returns pointer to data or NULL. */
 void *
 __nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp);
 #define nf_ct_ext_add(ct, id, gfp) \
-	((id##_TYPE *)__nf_ct_ext_add((ct), (id), (gfp)))
+	((id##_TYPE *)__nf_ct_ext_add_length((ct), (id), 0, (gfp)))
+#define nf_ct_ext_add_length(ct, id, len, gfp) \
+	((id##_TYPE *)__nf_ct_ext_add_length((ct), (id), (len), (gfp)))
 
 #define NF_CT_EXT_F_PREALLOC	0x0001
 
diff --git a/net/netfilter/nf_conntrack_extend.c b/net/netfilter/nf_conntrack_extend.c
index 641ff5f..1a95459 100644
--- a/net/netfilter/nf_conntrack_extend.c
+++ b/net/netfilter/nf_conntrack_extend.c
@@ -44,7 +44,8 @@ void __nf_ct_ext_destroy(struct nf_conn *ct)
 EXPORT_SYMBOL(__nf_ct_ext_destroy);
 
 static void *
-nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id, gfp_t gfp)
+nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id,
+		 size_t var_alloc_len, gfp_t gfp)
 {
 	unsigned int off, len;
 	struct nf_ct_ext_type *t;
@@ -54,8 +55,8 @@ nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id, gfp_t gfp)
 	t = rcu_dereference(nf_ct_ext_types[id]);
 	BUG_ON(t == NULL);
 	off = ALIGN(sizeof(struct nf_ct_ext), t->align);
-	len = off + t->len;
-	alloc_size = t->alloc_size;
+	len = off + t->len + var_alloc_len;
+	alloc_size = t->alloc_size + var_alloc_len;
 	rcu_read_unlock();
 
 	*ext = kzalloc(alloc_size, gfp);
@@ -68,7 +69,8 @@ nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id, gfp_t gfp)
 	return (void *)(*ext) + off;
 }
 
-void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
+void *__nf_ct_ext_add_length(struct nf_conn *ct, enum nf_ct_ext_id id,
+			     size_t var_alloc_len, gfp_t gfp)
 {
 	struct nf_ct_ext *old, *new;
 	int i, newlen, newoff;
@@ -79,7 +81,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
 
 	old = ct->ext;
 	if (!old)
-		return nf_ct_ext_create(&ct->ext, id, gfp);
+		return nf_ct_ext_create(&ct->ext, id, var_alloc_len, gfp);
 
 	if (__nf_ct_ext_exist(old, id))
 		return NULL;
@@ -89,7 +91,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
 	BUG_ON(t == NULL);
 
 	newoff = ALIGN(old->len, t->align);
-	newlen = newoff + t->len;
+	newlen = newoff + t->len + var_alloc_len;
 	rcu_read_unlock();
 
 	new = __krealloc(old, newlen, gfp);
@@ -117,7 +119,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct, enum nf_ct_ext_id id, gfp_t gfp)
 	memset((void *)new + newoff, 0, newlen - newoff);
 	return (void *)new + newoff;
 }
-EXPORT_SYMBOL(__nf_ct_ext_add);
+EXPORT_SYMBOL(__nf_ct_ext_add_length);
 
 static void update_alloc_size(struct nf_ct_ext_type *type)
 {
-- 
1.7.10


^ permalink raw reply related

* [PATCH 1/7] netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

This patch modifies the struct nf_conntrack_helper to allocate
the room for the helper name. The maximum length is 16 bytes
(this was already introduced in 2.6.24).

For the maximum length for expectation policy names, I have
also selected 16 bytes.

This patch is required by the follow-up patch to support
user-space connection tracking helpers.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_expect.h |    4 +++-
 include/net/netfilter/nf_conntrack_helper.h |    2 +-
 net/netfilter/nf_conntrack_ftp.c            |    8 ++------
 net/netfilter/nf_conntrack_irc.c            |    8 ++------
 net/netfilter/nf_conntrack_sane.c           |    8 ++------
 net/netfilter/nf_conntrack_sip.c            |    7 ++-----
 net/netfilter/nf_conntrack_tftp.c           |    8 ++------
 7 files changed, 14 insertions(+), 31 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index 4619caa..983f002 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -59,10 +59,12 @@ static inline struct net *nf_ct_exp_net(struct nf_conntrack_expect *exp)
 	return nf_ct_net(exp->master);
 }
 
+#define NF_CT_EXP_POLICY_NAME_LEN	16
+
 struct nf_conntrack_expect_policy {
 	unsigned int	max_expected;
 	unsigned int	timeout;
-	const char	*name;
+	char		name[NF_CT_EXP_POLICY_NAME_LEN];
 };
 
 #define NF_CT_EXPECT_CLASS_DEFAULT	0
diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 1d18894..5f5a4d9 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -19,7 +19,7 @@ struct module;
 struct nf_conntrack_helper {
 	struct hlist_node hnode;	/* Internal use. */
 
-	const char *name;		/* name of the module */
+	char name[NF_CT_HELPER_NAME_LEN]; /* name of the module */
 	struct module *me;		/* pointer to self */
 	const struct nf_conntrack_expect_policy *expect_policy;
 
diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c
index 8c5c95c..44e47c9 100644
--- a/net/netfilter/nf_conntrack_ftp.c
+++ b/net/netfilter/nf_conntrack_ftp.c
@@ -512,7 +512,6 @@ out_update_nl:
 }
 
 static struct nf_conntrack_helper ftp[MAX_PORTS][2] __read_mostly;
-static char ftp_names[MAX_PORTS][2][sizeof("ftp-65535")] __read_mostly;
 
 static const struct nf_conntrack_expect_policy ftp_exp_policy = {
 	.max_expected	= 1,
@@ -541,7 +540,6 @@ static void nf_conntrack_ftp_fini(void)
 static int __init nf_conntrack_ftp_init(void)
 {
 	int i, j = -1, ret = 0;
-	char *tmpname;
 
 	ftp_buffer = kmalloc(65536, GFP_KERNEL);
 	if (!ftp_buffer)
@@ -561,12 +559,10 @@ static int __init nf_conntrack_ftp_init(void)
 			ftp[i][j].expect_policy = &ftp_exp_policy;
 			ftp[i][j].me = THIS_MODULE;
 			ftp[i][j].help = help;
-			tmpname = &ftp_names[i][j][0];
 			if (ports[i] == FTP_PORT)
-				sprintf(tmpname, "ftp");
+				sprintf(ftp[i][j].name, "ftp");
 			else
-				sprintf(tmpname, "ftp-%d", ports[i]);
-			ftp[i][j].name = tmpname;
+				sprintf(ftp[i][j].name, "ftp-%d", ports[i]);
 
 			pr_debug("nf_ct_ftp: registering helper for pf: %d "
 				 "port: %d\n",
diff --git a/net/netfilter/nf_conntrack_irc.c b/net/netfilter/nf_conntrack_irc.c
index 81366c1..009c52c 100644
--- a/net/netfilter/nf_conntrack_irc.c
+++ b/net/netfilter/nf_conntrack_irc.c
@@ -221,7 +221,6 @@ static int help(struct sk_buff *skb, unsigned int protoff,
 }
 
 static struct nf_conntrack_helper irc[MAX_PORTS] __read_mostly;
-static char irc_names[MAX_PORTS][sizeof("irc-65535")] __read_mostly;
 static struct nf_conntrack_expect_policy irc_exp_policy;
 
 static void nf_conntrack_irc_fini(void);
@@ -229,7 +228,6 @@ static void nf_conntrack_irc_fini(void);
 static int __init nf_conntrack_irc_init(void)
 {
 	int i, ret;
-	char *tmpname;
 
 	if (max_dcc_channels < 1) {
 		printk(KERN_ERR "nf_ct_irc: max_dcc_channels must not be zero\n");
@@ -255,12 +253,10 @@ static int __init nf_conntrack_irc_init(void)
 		irc[i].me = THIS_MODULE;
 		irc[i].help = help;
 
-		tmpname = &irc_names[i][0];
 		if (ports[i] == IRC_PORT)
-			sprintf(tmpname, "irc");
+			sprintf(irc[i].name, "irc");
 		else
-			sprintf(tmpname, "irc-%u", i);
-		irc[i].name = tmpname;
+			sprintf(irc[i].name, "irc-%u", i);
 
 		ret = nf_conntrack_helper_register(&irc[i]);
 		if (ret) {
diff --git a/net/netfilter/nf_conntrack_sane.c b/net/netfilter/nf_conntrack_sane.c
index 8501823..ec3fc18 100644
--- a/net/netfilter/nf_conntrack_sane.c
+++ b/net/netfilter/nf_conntrack_sane.c
@@ -163,7 +163,6 @@ out:
 }
 
 static struct nf_conntrack_helper sane[MAX_PORTS][2] __read_mostly;
-static char sane_names[MAX_PORTS][2][sizeof("sane-65535")] __read_mostly;
 
 static const struct nf_conntrack_expect_policy sane_exp_policy = {
 	.max_expected	= 1,
@@ -190,7 +189,6 @@ static void nf_conntrack_sane_fini(void)
 static int __init nf_conntrack_sane_init(void)
 {
 	int i, j = -1, ret = 0;
-	char *tmpname;
 
 	sane_buffer = kmalloc(65536, GFP_KERNEL);
 	if (!sane_buffer)
@@ -210,12 +208,10 @@ static int __init nf_conntrack_sane_init(void)
 			sane[i][j].expect_policy = &sane_exp_policy;
 			sane[i][j].me = THIS_MODULE;
 			sane[i][j].help = help;
-			tmpname = &sane_names[i][j][0];
 			if (ports[i] == SANE_PORT)
-				sprintf(tmpname, "sane");
+				sprintf(sane[i][j].name, "sane");
 			else
-				sprintf(tmpname, "sane-%d", ports[i]);
-			sane[i][j].name = tmpname;
+				sprintf(sane[i][j].name, "sane-%d", ports[i]);
 
 			pr_debug("nf_ct_sane: registering helper for pf: %d "
 				 "port: %d\n",
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index 93faf6a..dfd3ff3 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1556,7 +1556,6 @@ static void nf_conntrack_sip_fini(void)
 static int __init nf_conntrack_sip_init(void)
 {
 	int i, j, ret;
-	char *tmpname;
 
 	if (ports_c == 0)
 		ports[ports_c++] = SIP_PORT;
@@ -1584,12 +1583,10 @@ static int __init nf_conntrack_sip_init(void)
 			sip[i][j].expect_class_max = SIP_EXPECT_MAX;
 			sip[i][j].me = THIS_MODULE;
 
-			tmpname = &sip_names[i][j][0];
 			if (ports[i] == SIP_PORT)
-				sprintf(tmpname, "sip");
+				sprintf(sip_names[i][j], "sip");
 			else
-				sprintf(tmpname, "sip-%u", i);
-			sip[i][j].name = tmpname;
+				sprintf(sip_names[i][j], "sip-%u", i);
 
 			pr_debug("port #%u: %u\n", i, ports[i]);
 
diff --git a/net/netfilter/nf_conntrack_tftp.c b/net/netfilter/nf_conntrack_tftp.c
index 75466fd..81fc61c 100644
--- a/net/netfilter/nf_conntrack_tftp.c
+++ b/net/netfilter/nf_conntrack_tftp.c
@@ -92,7 +92,6 @@ static int tftp_help(struct sk_buff *skb,
 }
 
 static struct nf_conntrack_helper tftp[MAX_PORTS][2] __read_mostly;
-static char tftp_names[MAX_PORTS][2][sizeof("tftp-65535")] __read_mostly;
 
 static const struct nf_conntrack_expect_policy tftp_exp_policy = {
 	.max_expected	= 1,
@@ -112,7 +111,6 @@ static void nf_conntrack_tftp_fini(void)
 static int __init nf_conntrack_tftp_init(void)
 {
 	int i, j, ret;
-	char *tmpname;
 
 	if (ports_c == 0)
 		ports[ports_c++] = TFTP_PORT;
@@ -129,12 +127,10 @@ static int __init nf_conntrack_tftp_init(void)
 			tftp[i][j].me = THIS_MODULE;
 			tftp[i][j].help = tftp_help;
 
-			tmpname = &tftp_names[i][j][0];
 			if (ports[i] == TFTP_PORT)
-				sprintf(tmpname, "tftp");
+				sprintf(tftp[i][j].name, "tftp");
 			else
-				sprintf(tmpname, "tftp-%u", i);
-			tftp[i][j].name = tmpname;
+				sprintf(tftp[i][j].name, "tftp-%u", i);
 
 			ret = nf_conntrack_helper_register(&tftp[i][j]);
 			if (ret) {
-- 
1.7.10

^ permalink raw reply related

* [PATCH 3/7] netfilter: nf_ct_helper: implement variable length helper private data
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

This patch uses the new variable length conntrack extensions.

Instead of using union nf_conntrack_help that contain all the
helper private data information, we allocate variable length
area to store the private helper data.

This patch includes the modification of all existing helpers.
It also includes a couple of include header to avoid compilation
warnings.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/nf_conntrack_sip.h  |    2 ++
 include/net/netfilter/nf_conntrack.h        |   35 ++-------------------
 include/net/netfilter/nf_conntrack_helper.h |   15 ++++++++-
 net/ipv4/netfilter/nf_nat_amanda.c          |    4 +--
 net/ipv4/netfilter/nf_nat_h323.c            |    8 ++---
 net/ipv4/netfilter/nf_nat_pptp.c            |    6 ++--
 net/ipv4/netfilter/nf_nat_tftp.c            |    4 +--
 net/netfilter/nf_conntrack_core.c           |    3 +-
 net/netfilter/nf_conntrack_ftp.c            |    3 +-
 net/netfilter/nf_conntrack_h323_main.c      |   16 ++++++----
 net/netfilter/nf_conntrack_helper.c         |   11 ++++---
 net/netfilter/nf_conntrack_netlink.c        |    4 +--
 net/netfilter/nf_conntrack_pptp.c           |   17 ++++++-----
 net/netfilter/nf_conntrack_proto_gre.c      |   16 +++++-----
 net/netfilter/nf_conntrack_sane.c           |    4 +--
 net/netfilter/nf_conntrack_sip.c            |   25 +++++++--------
 net/netfilter/xt_CT.c                       |   44 ++++++++++++++++-----------
 17 files changed, 111 insertions(+), 106 deletions(-)

diff --git a/include/linux/netfilter/nf_conntrack_sip.h b/include/linux/netfilter/nf_conntrack_sip.h
index 0ce91d5..0dfc8b7 100644
--- a/include/linux/netfilter/nf_conntrack_sip.h
+++ b/include/linux/netfilter/nf_conntrack_sip.h
@@ -2,6 +2,8 @@
 #define __NF_CONNTRACK_SIP_H__
 #ifdef __KERNEL__
 
+#include <net/netfilter/nf_conntrack_expect.h>
+
 #define SIP_PORT	5060
 #define SIP_TIMEOUT	3600
 
diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index cce7f6a..f1494fe 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -39,36 +39,6 @@ union nf_conntrack_expect_proto {
 	/* insert expect proto private data here */
 };
 
-/* Add protocol helper include file here */
-#include <linux/netfilter/nf_conntrack_ftp.h>
-#include <linux/netfilter/nf_conntrack_pptp.h>
-#include <linux/netfilter/nf_conntrack_h323.h>
-#include <linux/netfilter/nf_conntrack_sane.h>
-#include <linux/netfilter/nf_conntrack_sip.h>
-
-/* per conntrack: application helper private data */
-union nf_conntrack_help {
-	/* insert conntrack helper private data (master) here */
-#if defined(CONFIG_NF_CONNTRACK_FTP) || defined(CONFIG_NF_CONNTRACK_FTP_MODULE)
-	struct nf_ct_ftp_master ct_ftp_info;
-#endif
-#if defined(CONFIG_NF_CONNTRACK_PPTP) || \
-    defined(CONFIG_NF_CONNTRACK_PPTP_MODULE)
-	struct nf_ct_pptp_master ct_pptp_info;
-#endif
-#if defined(CONFIG_NF_CONNTRACK_H323) || \
-    defined(CONFIG_NF_CONNTRACK_H323_MODULE)
-	struct nf_ct_h323_master ct_h323_info;
-#endif
-#if defined(CONFIG_NF_CONNTRACK_SANE) || \
-    defined(CONFIG_NF_CONNTRACK_SANE_MODULE)
-	struct nf_ct_sane_master ct_sane_info;
-#endif
-#if defined(CONFIG_NF_CONNTRACK_SIP) || defined(CONFIG_NF_CONNTRACK_SIP_MODULE)
-	struct nf_ct_sip_master ct_sip_info;
-#endif
-};
-
 #include <linux/types.h>
 #include <linux/skbuff.h>
 #include <linux/timer.h>
@@ -89,12 +59,13 @@ struct nf_conn_help {
 	/* Helper. if any */
 	struct nf_conntrack_helper __rcu *helper;
 
-	union nf_conntrack_help help;
-
 	struct hlist_head expectations;
 
 	/* Current number of expected connections */
 	u8 expecting[NF_CT_MAX_EXPECT_CLASSES];
+
+	/* private helper information. */
+	char data[];
 };
 
 #include <net/netfilter/ipv4/nf_conntrack_ipv4.h>
diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 5f5a4d9..061352f 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -11,6 +11,7 @@
 #define _NF_CONNTRACK_HELPER_H
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_extend.h>
+#include <net/netfilter/nf_conntrack_expect.h>
 
 struct module;
 
@@ -23,6 +24,9 @@ struct nf_conntrack_helper {
 	struct module *me;		/* pointer to self */
 	const struct nf_conntrack_expect_policy *expect_policy;
 
+	/* length of internal data, ie. sizeof(struct nf_ct_*_master) */
+	size_t data_len;
+
 	/* Tuple of things we will help (compared against server response) */
 	struct nf_conntrack_tuple tuple;
 
@@ -48,7 +52,7 @@ nf_conntrack_helper_try_module_get(const char *name, u16 l3num, u8 protonum);
 extern int nf_conntrack_helper_register(struct nf_conntrack_helper *);
 extern void nf_conntrack_helper_unregister(struct nf_conntrack_helper *);
 
-extern struct nf_conn_help *nf_ct_helper_ext_add(struct nf_conn *ct, gfp_t gfp);
+extern struct nf_conn_help *nf_ct_helper_ext_add(struct nf_conn *ct, struct nf_conntrack_helper *helper, gfp_t gfp);
 
 extern int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
 				     gfp_t flags);
@@ -60,6 +64,15 @@ static inline struct nf_conn_help *nfct_help(const struct nf_conn *ct)
 	return nf_ct_ext_find(ct, NF_CT_EXT_HELPER);
 }
 
+static inline void *nfct_help_data(const struct nf_conn *ct)
+{
+	struct nf_conn_help *help;
+
+	help = nf_ct_ext_find(ct, NF_CT_EXT_HELPER);
+
+	return (void *)help->data;
+}
+
 extern int nf_conntrack_helper_init(struct net *net);
 extern void nf_conntrack_helper_fini(struct net *net);
 
diff --git a/net/ipv4/netfilter/nf_nat_amanda.c b/net/ipv4/netfilter/nf_nat_amanda.c
index 7b22382..3c04d24 100644
--- a/net/ipv4/netfilter/nf_nat_amanda.c
+++ b/net/ipv4/netfilter/nf_nat_amanda.c
@@ -13,10 +13,10 @@
 #include <linux/skbuff.h>
 #include <linux/udp.h>
 
-#include <net/netfilter/nf_nat_helper.h>
-#include <net/netfilter/nf_nat_rule.h>
 #include <net/netfilter/nf_conntrack_helper.h>
 #include <net/netfilter/nf_conntrack_expect.h>
+#include <net/netfilter/nf_nat_helper.h>
+#include <net/netfilter/nf_nat_rule.h>
 #include <linux/netfilter/nf_conntrack_amanda.h>
 
 MODULE_AUTHOR("Brian J. Murrell <netfilter@interlinx.bc.ca>");
diff --git a/net/ipv4/netfilter/nf_nat_h323.c b/net/ipv4/netfilter/nf_nat_h323.c
index cad29c1..c6784a1 100644
--- a/net/ipv4/netfilter/nf_nat_h323.c
+++ b/net/ipv4/netfilter/nf_nat_h323.c
@@ -95,7 +95,7 @@ static int set_sig_addr(struct sk_buff *skb, struct nf_conn *ct,
 			unsigned char **data,
 			TransportAddress *taddr, int count)
 {
-	const struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	const struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	int i;
 	__be16 port;
@@ -178,7 +178,7 @@ static int nat_rtp_rtcp(struct sk_buff *skb, struct nf_conn *ct,
 			struct nf_conntrack_expect *rtp_exp,
 			struct nf_conntrack_expect *rtcp_exp)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	int i;
 	u_int16_t nated_port;
@@ -330,7 +330,7 @@ static int nat_h245(struct sk_buff *skb, struct nf_conn *ct,
 		    TransportAddress *taddr, __be16 port,
 		    struct nf_conntrack_expect *exp)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	u_int16_t nated_port = ntohs(port);
 
@@ -419,7 +419,7 @@ static int nat_q931(struct sk_buff *skb, struct nf_conn *ct,
 		    unsigned char **data, TransportAddress *taddr, int idx,
 		    __be16 port, struct nf_conntrack_expect *exp)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	u_int16_t nated_port = ntohs(port);
 	union nf_inet_addr addr;
diff --git a/net/ipv4/netfilter/nf_nat_pptp.c b/net/ipv4/netfilter/nf_nat_pptp.c
index c273d58..3881408 100644
--- a/net/ipv4/netfilter/nf_nat_pptp.c
+++ b/net/ipv4/netfilter/nf_nat_pptp.c
@@ -49,7 +49,7 @@ static void pptp_nat_expected(struct nf_conn *ct,
 	const struct nf_nat_pptp *nat_pptp_info;
 	struct nf_nat_ipv4_range range;
 
-	ct_pptp_info = &nfct_help(master)->help.ct_pptp_info;
+	ct_pptp_info = nfct_help_data(master);
 	nat_pptp_info = &nfct_nat(master)->help.nat_pptp_info;
 
 	/* And here goes the grand finale of corrosion... */
@@ -123,7 +123,7 @@ pptp_outbound_pkt(struct sk_buff *skb,
 	__be16 new_callid;
 	unsigned int cid_off;
 
-	ct_pptp_info  = &nfct_help(ct)->help.ct_pptp_info;
+	ct_pptp_info = nfct_help_data(ct);
 	nat_pptp_info = &nfct_nat(ct)->help.nat_pptp_info;
 
 	new_callid = ct_pptp_info->pns_call_id;
@@ -192,7 +192,7 @@ pptp_exp_gre(struct nf_conntrack_expect *expect_orig,
 	struct nf_ct_pptp_master *ct_pptp_info;
 	struct nf_nat_pptp *nat_pptp_info;
 
-	ct_pptp_info  = &nfct_help(ct)->help.ct_pptp_info;
+	ct_pptp_info = nfct_help_data(ct);
 	nat_pptp_info = &nfct_nat(ct)->help.nat_pptp_info;
 
 	/* save original PAC call ID in nat_info */
diff --git a/net/ipv4/netfilter/nf_nat_tftp.c b/net/ipv4/netfilter/nf_nat_tftp.c
index a2901bf..9dbb8d2 100644
--- a/net/ipv4/netfilter/nf_nat_tftp.c
+++ b/net/ipv4/netfilter/nf_nat_tftp.c
@@ -8,10 +8,10 @@
 #include <linux/module.h>
 #include <linux/udp.h>
 
-#include <net/netfilter/nf_nat_helper.h>
-#include <net/netfilter/nf_nat_rule.h>
 #include <net/netfilter/nf_conntrack_helper.h>
 #include <net/netfilter/nf_conntrack_expect.h>
+#include <net/netfilter/nf_nat_helper.h>
+#include <net/netfilter/nf_nat_rule.h>
 #include <linux/netfilter/nf_conntrack_tftp.h>
 
 MODULE_AUTHOR("Magnus Boden <mb@ozaba.mine.nu>");
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 068f2e0..f8d848f 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -819,7 +819,8 @@ init_conntrack(struct net *net, struct nf_conn *tmpl,
 		__set_bit(IPS_EXPECTED_BIT, &ct->status);
 		ct->master = exp->master;
 		if (exp->helper) {
-			help = nf_ct_helper_ext_add(ct, GFP_ATOMIC);
+			help = nf_ct_helper_ext_add(ct, exp->helper,
+						    GFP_ATOMIC);
 			if (help)
 				rcu_assign_pointer(help->helper, exp->helper);
 		}
diff --git a/net/netfilter/nf_conntrack_ftp.c b/net/netfilter/nf_conntrack_ftp.c
index 44e47c9..4bb771d 100644
--- a/net/netfilter/nf_conntrack_ftp.c
+++ b/net/netfilter/nf_conntrack_ftp.c
@@ -358,7 +358,7 @@ static int help(struct sk_buff *skb,
 	u32 seq;
 	int dir = CTINFO2DIR(ctinfo);
 	unsigned int uninitialized_var(matchlen), uninitialized_var(matchoff);
-	struct nf_ct_ftp_master *ct_ftp_info = &nfct_help(ct)->help.ct_ftp_info;
+	struct nf_ct_ftp_master *ct_ftp_info = nfct_help_data(ct);
 	struct nf_conntrack_expect *exp;
 	union nf_inet_addr *daddr;
 	struct nf_conntrack_man cmd = {};
@@ -554,6 +554,7 @@ static int __init nf_conntrack_ftp_init(void)
 		ftp[i][0].tuple.src.l3num = PF_INET;
 		ftp[i][1].tuple.src.l3num = PF_INET6;
 		for (j = 0; j < 2; j++) {
+			ftp[i][j].data_len = sizeof(struct nf_ct_ftp_master);
 			ftp[i][j].tuple.src.u.tcp.port = htons(ports[i]);
 			ftp[i][j].tuple.dst.protonum = IPPROTO_TCP;
 			ftp[i][j].expect_policy = &ftp_exp_policy;
diff --git a/net/netfilter/nf_conntrack_h323_main.c b/net/netfilter/nf_conntrack_h323_main.c
index 46d69d7..ed21992 100644
--- a/net/netfilter/nf_conntrack_h323_main.c
+++ b/net/netfilter/nf_conntrack_h323_main.c
@@ -114,7 +114,7 @@ static int get_tpkt_data(struct sk_buff *skb, unsigned int protoff,
 			 struct nf_conn *ct, enum ip_conntrack_info ctinfo,
 			 unsigned char **data, int *datalen, int *dataoff)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	const struct tcphdr *th;
 	struct tcphdr _tcph;
@@ -618,6 +618,7 @@ static const struct nf_conntrack_expect_policy h245_exp_policy = {
 static struct nf_conntrack_helper nf_conntrack_helper_h245 __read_mostly = {
 	.name			= "H.245",
 	.me			= THIS_MODULE,
+	.data_len		= sizeof(struct nf_ct_h323_master),
 	.tuple.src.l3num	= AF_UNSPEC,
 	.tuple.dst.protonum	= IPPROTO_UDP,
 	.help			= h245_help,
@@ -1170,6 +1171,7 @@ static struct nf_conntrack_helper nf_conntrack_helper_q931[] __read_mostly = {
 	{
 		.name			= "Q.931",
 		.me			= THIS_MODULE,
+		.data_len		= sizeof(struct nf_ct_h323_master),
 		.tuple.src.l3num	= AF_INET,
 		.tuple.src.u.tcp.port	= cpu_to_be16(Q931_PORT),
 		.tuple.dst.protonum	= IPPROTO_TCP,
@@ -1245,7 +1247,7 @@ static int expect_q931(struct sk_buff *skb, struct nf_conn *ct,
 		       unsigned char **data,
 		       TransportAddress *taddr, int count)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	int ret = 0;
 	int i;
@@ -1360,7 +1362,7 @@ static int process_rrq(struct sk_buff *skb, struct nf_conn *ct,
 		       enum ip_conntrack_info ctinfo,
 		       unsigned char **data, RegistrationRequest *rrq)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int ret;
 	typeof(set_ras_addr_hook) set_ras_addr;
 
@@ -1395,7 +1397,7 @@ static int process_rcf(struct sk_buff *skb, struct nf_conn *ct,
 		       enum ip_conntrack_info ctinfo,
 		       unsigned char **data, RegistrationConfirm *rcf)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	int ret;
 	struct nf_conntrack_expect *exp;
@@ -1444,7 +1446,7 @@ static int process_urq(struct sk_buff *skb, struct nf_conn *ct,
 		       enum ip_conntrack_info ctinfo,
 		       unsigned char **data, UnregistrationRequest *urq)
 {
-	struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	int ret;
 	typeof(set_sig_addr_hook) set_sig_addr;
@@ -1476,7 +1478,7 @@ static int process_arq(struct sk_buff *skb, struct nf_conn *ct,
 		       enum ip_conntrack_info ctinfo,
 		       unsigned char **data, AdmissionRequest *arq)
 {
-	const struct nf_ct_h323_master *info = &nfct_help(ct)->help.ct_h323_info;
+	const struct nf_ct_h323_master *info = nfct_help_data(ct);
 	int dir = CTINFO2DIR(ctinfo);
 	__be16 port;
 	union nf_inet_addr addr;
@@ -1743,6 +1745,7 @@ static struct nf_conntrack_helper nf_conntrack_helper_ras[] __read_mostly = {
 	{
 		.name			= "RAS",
 		.me			= THIS_MODULE,
+		.data_len		= sizeof(struct nf_ct_h323_master),
 		.tuple.src.l3num	= AF_INET,
 		.tuple.src.u.udp.port	= cpu_to_be16(RAS_PORT),
 		.tuple.dst.protonum	= IPPROTO_UDP,
@@ -1752,6 +1755,7 @@ static struct nf_conntrack_helper nf_conntrack_helper_ras[] __read_mostly = {
 	{
 		.name			= "RAS",
 		.me			= THIS_MODULE,
+		.data_len		= sizeof(struct nf_ct_h323_master),
 		.tuple.src.l3num	= AF_INET6,
 		.tuple.src.u.udp.port	= cpu_to_be16(RAS_PORT),
 		.tuple.dst.protonum	= IPPROTO_UDP,
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 4fa2ff9..9c18ecb 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -161,11 +161,14 @@ nf_conntrack_helper_try_module_get(const char *name, u16 l3num, u8 protonum)
 }
 EXPORT_SYMBOL_GPL(nf_conntrack_helper_try_module_get);
 
-struct nf_conn_help *nf_ct_helper_ext_add(struct nf_conn *ct, gfp_t gfp)
+struct nf_conn_help *
+nf_ct_helper_ext_add(struct nf_conn *ct,
+		     struct nf_conntrack_helper *helper, gfp_t gfp)
 {
 	struct nf_conn_help *help;
 
-	help = nf_ct_ext_add(ct, NF_CT_EXT_HELPER, gfp);
+	help = nf_ct_ext_add_length(ct, NF_CT_EXT_HELPER,
+				    helper->data_len, gfp);
 	if (help)
 		INIT_HLIST_HEAD(&help->expectations);
 	else
@@ -218,13 +221,13 @@ int __nf_ct_try_assign_helper(struct nf_conn *ct, struct nf_conn *tmpl,
 	}
 
 	if (help == NULL) {
-		help = nf_ct_helper_ext_add(ct, flags);
+		help = nf_ct_helper_ext_add(ct, helper, flags);
 		if (help == NULL) {
 			ret = -ENOMEM;
 			goto out;
 		}
 	} else {
-		memset(&help->help, 0, sizeof(help->help));
+		memset(help->data, 0, helper->data_len);
 	}
 
 	rcu_assign_pointer(help->helper, helper);
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 6f4b00a..a088920 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1218,7 +1218,7 @@ ctnetlink_change_helper(struct nf_conn *ct, const struct nlattr * const cda[])
 		if (help->helper)
 			return -EBUSY;
 		/* need to zero data of old helper */
-		memset(&help->help, 0, sizeof(help->help));
+		memset(help->data, 0, help->helper->data_len);
 	} else {
 		/* we cannot set a helper for an existing conntrack */
 		return -EOPNOTSUPP;
@@ -1440,7 +1440,7 @@ ctnetlink_create_conntrack(struct net *net, u16 zone,
 		} else {
 			struct nf_conn_help *help;
 
-			help = nf_ct_helper_ext_add(ct, GFP_ATOMIC);
+			help = nf_ct_helper_ext_add(ct, helper, GFP_ATOMIC);
 			if (help == NULL) {
 				err = -ENOMEM;
 				goto err2;
diff --git a/net/netfilter/nf_conntrack_pptp.c b/net/netfilter/nf_conntrack_pptp.c
index 31d56b2..6fed9ec 100644
--- a/net/netfilter/nf_conntrack_pptp.c
+++ b/net/netfilter/nf_conntrack_pptp.c
@@ -174,7 +174,7 @@ static int destroy_sibling_or_exp(struct net *net, struct nf_conn *ct,
 static void pptp_destroy_siblings(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
-	const struct nf_conn_help *help = nfct_help(ct);
+	const struct nf_ct_pptp_master *ct_pptp_info = nfct_help_data(ct);
 	struct nf_conntrack_tuple t;
 
 	nf_ct_gre_keymap_destroy(ct);
@@ -182,16 +182,16 @@ static void pptp_destroy_siblings(struct nf_conn *ct)
 	/* try original (pns->pac) tuple */
 	memcpy(&t, &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple, sizeof(t));
 	t.dst.protonum = IPPROTO_GRE;
-	t.src.u.gre.key = help->help.ct_pptp_info.pns_call_id;
-	t.dst.u.gre.key = help->help.ct_pptp_info.pac_call_id;
+	t.src.u.gre.key = ct_pptp_info->pns_call_id;
+	t.dst.u.gre.key = ct_pptp_info->pac_call_id;
 	if (!destroy_sibling_or_exp(net, ct, &t))
 		pr_debug("failed to timeout original pns->pac ct/exp\n");
 
 	/* try reply (pac->pns) tuple */
 	memcpy(&t, &ct->tuplehash[IP_CT_DIR_REPLY].tuple, sizeof(t));
 	t.dst.protonum = IPPROTO_GRE;
-	t.src.u.gre.key = help->help.ct_pptp_info.pac_call_id;
-	t.dst.u.gre.key = help->help.ct_pptp_info.pns_call_id;
+	t.src.u.gre.key = ct_pptp_info->pac_call_id;
+	t.dst.u.gre.key = ct_pptp_info->pns_call_id;
 	if (!destroy_sibling_or_exp(net, ct, &t))
 		pr_debug("failed to timeout reply pac->pns ct/exp\n");
 }
@@ -269,7 +269,7 @@ pptp_inbound_pkt(struct sk_buff *skb,
 		 struct nf_conn *ct,
 		 enum ip_conntrack_info ctinfo)
 {
-	struct nf_ct_pptp_master *info = &nfct_help(ct)->help.ct_pptp_info;
+	struct nf_ct_pptp_master *info = nfct_help_data(ct);
 	u_int16_t msg;
 	__be16 cid = 0, pcid = 0;
 	typeof(nf_nat_pptp_hook_inbound) nf_nat_pptp_inbound;
@@ -396,7 +396,7 @@ pptp_outbound_pkt(struct sk_buff *skb,
 		  struct nf_conn *ct,
 		  enum ip_conntrack_info ctinfo)
 {
-	struct nf_ct_pptp_master *info = &nfct_help(ct)->help.ct_pptp_info;
+	struct nf_ct_pptp_master *info = nfct_help_data(ct);
 	u_int16_t msg;
 	__be16 cid = 0, pcid = 0;
 	typeof(nf_nat_pptp_hook_outbound) nf_nat_pptp_outbound;
@@ -506,7 +506,7 @@ conntrack_pptp_help(struct sk_buff *skb, unsigned int protoff,
 
 {
 	int dir = CTINFO2DIR(ctinfo);
-	const struct nf_ct_pptp_master *info = &nfct_help(ct)->help.ct_pptp_info;
+	const struct nf_ct_pptp_master *info = nfct_help_data(ct);
 	const struct tcphdr *tcph;
 	struct tcphdr _tcph;
 	const struct pptp_pkt_hdr *pptph;
@@ -592,6 +592,7 @@ static const struct nf_conntrack_expect_policy pptp_exp_policy = {
 static struct nf_conntrack_helper pptp __read_mostly = {
 	.name			= "pptp",
 	.me			= THIS_MODULE,
+	.data_len		= sizeof(struct nf_ct_pptp_master),
 	.tuple.src.l3num	= AF_INET,
 	.tuple.src.u.tcp.port	= cpu_to_be16(PPTP_CONTROL_PORT),
 	.tuple.dst.protonum	= IPPROTO_TCP,
diff --git a/net/netfilter/nf_conntrack_proto_gre.c b/net/netfilter/nf_conntrack_proto_gre.c
index 25ba5a2..5cac41c 100644
--- a/net/netfilter/nf_conntrack_proto_gre.c
+++ b/net/netfilter/nf_conntrack_proto_gre.c
@@ -117,10 +117,10 @@ int nf_ct_gre_keymap_add(struct nf_conn *ct, enum ip_conntrack_dir dir,
 {
 	struct net *net = nf_ct_net(ct);
 	struct netns_proto_gre *net_gre = gre_pernet(net);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_pptp_master *ct_pptp_info = nfct_help_data(ct);
 	struct nf_ct_gre_keymap **kmp, *km;
 
-	kmp = &help->help.ct_pptp_info.keymap[dir];
+	kmp = &ct_pptp_info->keymap[dir];
 	if (*kmp) {
 		/* check whether it's a retransmission */
 		read_lock_bh(&net_gre->keymap_lock);
@@ -158,19 +158,19 @@ void nf_ct_gre_keymap_destroy(struct nf_conn *ct)
 {
 	struct net *net = nf_ct_net(ct);
 	struct netns_proto_gre *net_gre = gre_pernet(net);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_pptp_master *ct_pptp_info = nfct_help_data(ct);
 	enum ip_conntrack_dir dir;
 
 	pr_debug("entering for ct %p\n", ct);
 
 	write_lock_bh(&net_gre->keymap_lock);
 	for (dir = IP_CT_DIR_ORIGINAL; dir < IP_CT_DIR_MAX; dir++) {
-		if (help->help.ct_pptp_info.keymap[dir]) {
+		if (ct_pptp_info->keymap[dir]) {
 			pr_debug("removing %p from list\n",
-				 help->help.ct_pptp_info.keymap[dir]);
-			list_del(&help->help.ct_pptp_info.keymap[dir]->list);
-			kfree(help->help.ct_pptp_info.keymap[dir]);
-			help->help.ct_pptp_info.keymap[dir] = NULL;
+				 ct_pptp_info->keymap[dir]);
+			list_del(&ct_pptp_info->keymap[dir]->list);
+			kfree(ct_pptp_info->keymap[dir]);
+			ct_pptp_info->keymap[dir] = NULL;
 		}
 	}
 	write_unlock_bh(&net_gre->keymap_lock);
diff --git a/net/netfilter/nf_conntrack_sane.c b/net/netfilter/nf_conntrack_sane.c
index ec3fc18..295429f 100644
--- a/net/netfilter/nf_conntrack_sane.c
+++ b/net/netfilter/nf_conntrack_sane.c
@@ -69,13 +69,12 @@ static int help(struct sk_buff *skb,
 	void *sb_ptr;
 	int ret = NF_ACCEPT;
 	int dir = CTINFO2DIR(ctinfo);
-	struct nf_ct_sane_master *ct_sane_info;
+	struct nf_ct_sane_master *ct_sane_info = nfct_help_data(ct);
 	struct nf_conntrack_expect *exp;
 	struct nf_conntrack_tuple *tuple;
 	struct sane_request *req;
 	struct sane_reply_net_start *reply;
 
-	ct_sane_info = &nfct_help(ct)->help.ct_sane_info;
 	/* Until there's been traffic both ways, don't look in packets. */
 	if (ctinfo != IP_CT_ESTABLISHED &&
 	    ctinfo != IP_CT_ESTABLISHED_REPLY)
@@ -203,6 +202,7 @@ static int __init nf_conntrack_sane_init(void)
 		sane[i][0].tuple.src.l3num = PF_INET;
 		sane[i][1].tuple.src.l3num = PF_INET6;
 		for (j = 0; j < 2; j++) {
+			sane[i][j].data_len = sizeof(struct nf_ct_sane_master);
 			sane[i][j].tuple.src.u.tcp.port = htons(ports[i]);
 			sane[i][j].tuple.dst.protonum = IPPROTO_TCP;
 			sane[i][j].expect_policy = &sane_exp_policy;
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index dfd3ff3..758a1ba 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1075,12 +1075,12 @@ static int process_invite_response(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 
 	if ((code >= 100 && code <= 199) ||
 	    (code >= 200 && code <= 299))
 		return process_sdp(skb, dataoff, dptr, datalen, cseq);
-	else if (help->help.ct_sip_info.invite_cseq == cseq)
+	else if (ct_sip_info->invite_cseq == cseq)
 		flush_expectations(ct, true);
 	return NF_ACCEPT;
 }
@@ -1091,12 +1091,12 @@ static int process_update_response(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 
 	if ((code >= 100 && code <= 199) ||
 	    (code >= 200 && code <= 299))
 		return process_sdp(skb, dataoff, dptr, datalen, cseq);
-	else if (help->help.ct_sip_info.invite_cseq == cseq)
+	else if (ct_sip_info->invite_cseq == cseq)
 		flush_expectations(ct, true);
 	return NF_ACCEPT;
 }
@@ -1107,12 +1107,12 @@ static int process_prack_response(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 
 	if ((code >= 100 && code <= 199) ||
 	    (code >= 200 && code <= 299))
 		return process_sdp(skb, dataoff, dptr, datalen, cseq);
-	else if (help->help.ct_sip_info.invite_cseq == cseq)
+	else if (ct_sip_info->invite_cseq == cseq)
 		flush_expectations(ct, true);
 	return NF_ACCEPT;
 }
@@ -1123,13 +1123,13 @@ static int process_invite_request(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 	unsigned int ret;
 
 	flush_expectations(ct, true);
 	ret = process_sdp(skb, dataoff, dptr, datalen, cseq);
 	if (ret == NF_ACCEPT)
-		help->help.ct_sip_info.invite_cseq = cseq;
+		ct_sip_info->invite_cseq = cseq;
 	return ret;
 }
 
@@ -1154,7 +1154,7 @@ static int process_register_request(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 	enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
 	unsigned int matchoff, matchlen;
 	struct nf_conntrack_expect *exp;
@@ -1235,7 +1235,7 @@ static int process_register_request(struct sk_buff *skb, unsigned int dataoff,
 
 store_cseq:
 	if (ret == NF_ACCEPT)
-		help->help.ct_sip_info.register_cseq = cseq;
+		ct_sip_info->register_cseq = cseq;
 	return ret;
 }
 
@@ -1245,7 +1245,7 @@ static int process_register_response(struct sk_buff *skb, unsigned int dataoff,
 {
 	enum ip_conntrack_info ctinfo;
 	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
-	struct nf_conn_help *help = nfct_help(ct);
+	struct nf_ct_sip_master *ct_sip_info = nfct_help_data(ct);
 	enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
 	union nf_inet_addr addr;
 	__be16 port;
@@ -1262,7 +1262,7 @@ static int process_register_response(struct sk_buff *skb, unsigned int dataoff,
 	 * responses, so we store the sequence number of the last valid
 	 * request and compare it here.
 	 */
-	if (help->help.ct_sip_info.register_cseq != cseq)
+	if (ct_sip_info->register_cseq != cseq)
 		return NF_ACCEPT;
 
 	if (code >= 100 && code <= 199)
@@ -1578,6 +1578,7 @@ static int __init nf_conntrack_sip_init(void)
 		sip[i][3].help = sip_help_tcp;
 
 		for (j = 0; j < ARRAY_SIZE(sip[i]); j++) {
+			sip[i][j].data_len = sizeof(struct nf_ct_sip_master);
 			sip[i][j].tuple.src.u.udp.port = htons(ports[i]);
 			sip[i][j].expect_policy = sip_exp_policy;
 			sip[i][j].expect_class_max = SIP_EXPECT_MAX;
diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
index a51de9b..1160185 100644
--- a/net/netfilter/xt_CT.c
+++ b/net/netfilter/xt_CT.c
@@ -112,6 +112,8 @@ static int xt_ct_tg_check_v0(const struct xt_tgchk_param *par)
 		goto err3;
 
 	if (info->helper[0]) {
+		struct nf_conntrack_helper *helper;
+
 		ret = -ENOENT;
 		proto = xt_ct_find_proto(par);
 		if (!proto) {
@@ -120,19 +122,21 @@ static int xt_ct_tg_check_v0(const struct xt_tgchk_param *par)
 			goto err3;
 		}
 
-		ret = -ENOMEM;
-		help = nf_ct_helper_ext_add(ct, GFP_KERNEL);
-		if (help == NULL)
-			goto err3;
-
 		ret = -ENOENT;
-		help->helper = nf_conntrack_helper_try_module_get(info->helper,
-								  par->family,
-								  proto);
-		if (help->helper == NULL) {
+		helper = nf_conntrack_helper_try_module_get(info->helper,
+							    par->family,
+							    proto);
+		if (helper == NULL) {
 			pr_info("No such helper \"%s\"\n", info->helper);
 			goto err3;
 		}
+
+		ret = -ENOMEM;
+		help = nf_ct_helper_ext_add(ct, helper, GFP_KERNEL);
+		if (help == NULL)
+			goto err3;
+
+		help->helper = helper;
 	}
 
 	__set_bit(IPS_TEMPLATE_BIT, &ct->status);
@@ -202,6 +206,8 @@ static int xt_ct_tg_check_v1(const struct xt_tgchk_param *par)
 		goto err3;
 
 	if (info->helper[0]) {
+		struct nf_conntrack_helper *helper;
+
 		ret = -ENOENT;
 		proto = xt_ct_find_proto(par);
 		if (!proto) {
@@ -210,19 +216,21 @@ static int xt_ct_tg_check_v1(const struct xt_tgchk_param *par)
 			goto err3;
 		}
 
-		ret = -ENOMEM;
-		help = nf_ct_helper_ext_add(ct, GFP_KERNEL);
-		if (help == NULL)
-			goto err3;
-
 		ret = -ENOENT;
-		help->helper = nf_conntrack_helper_try_module_get(info->helper,
-								  par->family,
-								  proto);
-		if (help->helper == NULL) {
+		helper = nf_conntrack_helper_try_module_get(info->helper,
+							    par->family,
+							    proto);
+		if (helper == NULL) {
 			pr_info("No such helper \"%s\"\n", info->helper);
 			goto err3;
 		}
+
+		ret = -ENOMEM;
+		help = nf_ct_helper_ext_add(ct, helper, GFP_KERNEL);
+		if (help == NULL)
+			goto err3;
+
+		help->helper = helper;
 	}
 
 #ifdef CONFIG_NF_CONNTRACK_TIMEOUT
-- 
1.7.10

^ permalink raw reply related

* Re: [PATCH net-next 3/4] 6lowpan: the two bytes of u16 tag needs to be stored/accessed the other way around
From: Alexander Smirnov @ 2012-06-12 18:07 UTC (permalink / raw)
  To: Tony Cheneau
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-zigbee-devel-5NWGOfrQmneRv+LV9MX5uipxlwaOVQ5f
In-Reply-To: <20120611003953.0725d77a@dualbox>

2012/6/11 Tony Cheneau <tony.cheneauh-jNfjcPZKvDhg9hUCZPvPmw@public.gmane.org>:
> Or else, tag values, as displayed with a trafic analyser, such a
> Wireshark, are not properly displayed (e.g. 0x01 00 insted of 0x00 01,
> and so on). ---
>  net/ieee802154/6lowpan.c |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
> index af4f29b..af2f12e 100644
> --- a/net/ieee802154/6lowpan.c
> +++ b/net/ieee802154/6lowpan.c
> @@ -307,7 +307,7 @@ static u16 lowpan_fetch_skb_u16(struct sk_buff *skb)
>
>        BUG_ON(!pskb_may_pull(skb, 2));
>
> -       ret = skb->data[0] | (skb->data[1] << 8);
> +       ret = (skb->data[0] << 8) | skb->data[1];

This function aimed to obtain u16 from skb, why did you change the
byte-order here instead of 'tag' variable byte-shifting? Will this
code work properly on the both little and big-endian machines? Please
rework this to keep order properly for all the architectures.

>        skb_pull(skb, 2);
>        return ret;
>  }
> @@ -1002,8 +1002,8 @@ lowpan_skb_fragmentation(struct sk_buff *skb)
>        /* first fragment header */
>        head[0] = LOWPAN_DISPATCH_FRAG1 | (payload_length & 0x7);
>        head[1] = (payload_length >> 3) & 0xff;
> -       head[2] = tag & 0xff;
> -       head[3] = tag >> 8;
> +       head[2] = tag >> 8;
> +       head[3] = tag & 0xff;
>
>        err = lowpan_fragment_xmit(skb, head, header_length, 0, 0);
>
> --
> 1.7.3.4
>
>

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/

^ permalink raw reply

* [PATCH 5/7] netfilter: nfnetlink_queue: add NAT TCP sequence adjustment if packet mangled
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

User-space programs that receive traffic via NFQUEUE may mangle packets.
If NAT is enabled, this usually puzzles sequence tracking, leading to
traffic disruptions.

With this patch, nfnl_queue will make the corresponding NAT TCP sequence
adjustment if:

1) The packet has been mangled,
2) the NFQA_CFG_F_CONNTRACK flag has been set, and
3) NAT is detected.

There are some records on the Internet complaning about this issue:
http://stackoverflow.com/questions/260757/packet-mangling-utilities-besides-iptables

By now, we only support TCP since we have no helpers for DCCP or SCTP.
Better to add this if we ever have some helper over those layer 4 protocols.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h             |    2 ++
 include/net/netfilter/nf_nat_helper.h |    4 ++++
 net/ipv4/netfilter/nf_nat_helper.c    |   13 +++++++++++++
 net/netfilter/nf_conntrack_netlink.c  |    4 ++++
 net/netfilter/nfnetlink_queue.c       |   19 +++++++++++--------
 5 files changed, 34 insertions(+), 8 deletions(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index ba65bfb..dca19e6 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -401,6 +401,8 @@ struct nfq_ct_hook {
 	size_t (*build_size)(const struct nf_conn *ct);
 	int (*build)(struct sk_buff *skb, struct nf_conn *ct);
 	int (*parse)(const struct nlattr *attr, struct nf_conn *ct);
+	void (*seq_adjust)(struct sk_buff *skb, struct nf_conn *ct,
+			   u32 ctinfo, int off);
 };
 extern struct nfq_ct_hook *nfq_ct_hook;
 #else
diff --git a/include/net/netfilter/nf_nat_helper.h b/include/net/netfilter/nf_nat_helper.h
index 02bb6c2..7d8fb7b 100644
--- a/include/net/netfilter/nf_nat_helper.h
+++ b/include/net/netfilter/nf_nat_helper.h
@@ -54,4 +54,8 @@ extern void nf_nat_follow_master(struct nf_conn *ct,
 extern s16 nf_nat_get_offset(const struct nf_conn *ct,
 			     enum ip_conntrack_dir dir,
 			     u32 seq);
+
+extern void nf_nat_tcp_seq_adjust(struct sk_buff *skb, struct nf_conn *ct,
+				  u32 dir, int off);
+
 #endif
diff --git a/net/ipv4/netfilter/nf_nat_helper.c b/net/ipv4/netfilter/nf_nat_helper.c
index af65958..2e59ad0 100644
--- a/net/ipv4/netfilter/nf_nat_helper.c
+++ b/net/ipv4/netfilter/nf_nat_helper.c
@@ -153,6 +153,19 @@ void nf_nat_set_seq_adjust(struct nf_conn *ct, enum ip_conntrack_info ctinfo,
 }
 EXPORT_SYMBOL_GPL(nf_nat_set_seq_adjust);
 
+void nf_nat_tcp_seq_adjust(struct sk_buff *skb, struct nf_conn *ct,
+			   u32 ctinfo, int off)
+{
+	const struct tcphdr *th;
+
+	if (nf_ct_protonum(ct) != IPPROTO_TCP)
+		return;
+
+	th = (struct tcphdr *)(skb_network_header(skb)+ ip_hdrlen(skb));
+	nf_nat_set_seq_adjust(ct, ctinfo, th->seq, off);
+}
+EXPORT_SYMBOL_GPL(nf_nat_tcp_seq_adjust);
+
 static void nf_nat_csum(struct sk_buff *skb, const struct iphdr *iph, void *data,
 			int datalen, __sum16 *check, int oldlen)
 {
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 3b03a86..206d297 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -46,6 +46,7 @@
 #ifdef CONFIG_NF_NAT_NEEDED
 #include <net/netfilter/nf_nat_core.h>
 #include <net/netfilter/nf_nat_protocol.h>
+#include <net/netfilter/nf_nat_helper.h>
 #endif
 
 #include <linux/netfilter/nfnetlink.h>
@@ -1751,6 +1752,9 @@ static struct nfq_ct_hook ctnetlink_nfqueue_hook = {
 	.build_size	= ctnetlink_nfqueue_build_size,
 	.build		= ctnetlink_nfqueue_build,
 	.parse		= ctnetlink_nfqueue_parse,
+#ifdef CONFIG_NF_NAT_NEEDED
+	.seq_adjust	= nf_nat_tcp_seq_adjust,
+#endif
 };
 #endif /* CONFIG_NETFILTER_NETLINK_QUEUE */
 
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index ab1fd1c..52ebf02 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -502,12 +502,10 @@ err_out:
 }
 
 static int
-nfqnl_mangle(void *data, int data_len, struct nf_queue_entry *e)
+nfqnl_mangle(void *data, int data_len, struct nf_queue_entry *e, int diff)
 {
 	struct sk_buff *nskb;
-	int diff;
 
-	diff = data_len - e->skb->len;
 	if (diff < 0) {
 		if (pskb_trim(e->skb, data_len))
 			return -ENOMEM;
@@ -766,6 +764,8 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 	unsigned int verdict;
 	struct nf_queue_entry *entry;
 	struct nfq_ct_hook *nfq_ct;
+	enum ip_conntrack_info uninitialized_var(ctinfo);
+	struct nf_conn *ct = NULL;
 
 	queue = instance_lookup(queue_num);
 	if (!queue)
@@ -788,20 +788,23 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 	nfq_ct = rcu_dereference(nfq_ct_hook);
 	if (nfq_ct != NULL &&
 	    (queue->flags & NFQA_CFG_F_CONNTRACK) && nfqa[NFQA_CT]) {
-		enum ip_conntrack_info ctinfo;
-		struct nf_conn *ct;
-
 		ct = nf_ct_get(entry->skb, &ctinfo);
 		if (ct && !nf_ct_is_untracked(ct))
 			nfq_ct->parse(nfqa[NFQA_CT], ct);
 	}
-	rcu_read_unlock();
 
 	if (nfqa[NFQA_PAYLOAD]) {
+		u16 payload_len = nla_len(nfqa[NFQA_PAYLOAD]);
+		int diff = payload_len - entry->skb->len;
+
 		if (nfqnl_mangle(nla_data(nfqa[NFQA_PAYLOAD]),
-				 nla_len(nfqa[NFQA_PAYLOAD]), entry) < 0)
+				 payload_len, entry, diff) < 0)
 			verdict = NF_DROP;
+
+		if (ct && (ct->status & IPS_NAT_MASK) && diff)
+			nfq_ct->seq_adjust(skb, ct, ctinfo, diff);
 	}
+	rcu_read_unlock();
 
 	if (nfqa[NFQA_MARK])
 		entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
-- 
1.7.10

^ permalink raw reply related

* [PATCH 6/7] netfilter: ctnetlink: add CTA_HELP_INFO attribute
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

This attribute can be used to modify and to dump the internal
protocol information.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter/nfnetlink_conntrack.h |    1 +
 include/net/netfilter/nf_conntrack_helper.h   |    1 +
 net/netfilter/nf_conntrack_netlink.c          |   23 ++++++++++++++++++-----
 3 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink_conntrack.h b/include/linux/netfilter/nfnetlink_conntrack.h
index e58e4b9..7688833 100644
--- a/include/linux/netfilter/nfnetlink_conntrack.h
+++ b/include/linux/netfilter/nfnetlink_conntrack.h
@@ -191,6 +191,7 @@ enum ctattr_expect_nat {
 enum ctattr_help {
 	CTA_HELP_UNSPEC,
 	CTA_HELP_NAME,
+	CTA_HELP_INFO,
 	__CTA_HELP_MAX
 };
 #define CTA_HELP_MAX (__CTA_HELP_MAX - 1)
diff --git a/include/net/netfilter/nf_conntrack_helper.h b/include/net/netfilter/nf_conntrack_helper.h
index 061352f..84b24c3 100644
--- a/include/net/netfilter/nf_conntrack_helper.h
+++ b/include/net/netfilter/nf_conntrack_helper.h
@@ -39,6 +39,7 @@ struct nf_conntrack_helper {
 
 	void (*destroy)(struct nf_conn *ct);
 
+	int (*from_nlattr)(struct nlattr *attr, struct nf_conn *ct);
 	int (*to_nlattr)(struct sk_buff *skb, const struct nf_conn *ct);
 	unsigned int expect_class_max;
 };
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 206d297..17bd96b 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -902,7 +902,8 @@ static const struct nla_policy help_nla_policy[CTA_HELP_MAX+1] = {
 };
 
 static inline int
-ctnetlink_parse_help(const struct nlattr *attr, char **helper_name)
+ctnetlink_parse_help(const struct nlattr *attr, char **helper_name,
+		     struct nlattr **helpinfo)
 {
 	struct nlattr *tb[CTA_HELP_MAX+1];
 
@@ -913,6 +914,9 @@ ctnetlink_parse_help(const struct nlattr *attr, char **helper_name)
 
 	*helper_name = nla_data(tb[CTA_HELP_NAME]);
 
+	if (tb[CTA_HELP_INFO])
+		*helpinfo = tb[CTA_HELP_INFO];
+
 	return 0;
 }
 
@@ -1173,13 +1177,14 @@ ctnetlink_change_helper(struct nf_conn *ct, const struct nlattr * const cda[])
 	struct nf_conntrack_helper *helper;
 	struct nf_conn_help *help = nfct_help(ct);
 	char *helpname = NULL;
+	struct nlattr *helpinfo = NULL;
 	int err;
 
 	/* don't change helper of sibling connections */
 	if (ct->master)
 		return -EBUSY;
 
-	err = ctnetlink_parse_help(cda[CTA_HELP], &helpname);
+	err = ctnetlink_parse_help(cda[CTA_HELP], &helpname, &helpinfo);
 	if (err < 0)
 		return err;
 
@@ -1214,8 +1219,12 @@ ctnetlink_change_helper(struct nf_conn *ct, const struct nlattr * const cda[])
 	}
 
 	if (help) {
-		if (help->helper == helper)
+		if (help->helper == helper) {
+			/* update private helper data if allowed. */
+			if (helper->from_nlattr && helpinfo)
+				helper->from_nlattr(helpinfo, ct);
 			return 0;
+		}
 		if (help->helper)
 			return -EBUSY;
 		/* need to zero data of old helper */
@@ -1411,8 +1420,9 @@ ctnetlink_create_conntrack(struct net *net, u16 zone,
 	rcu_read_lock();
  	if (cda[CTA_HELP]) {
 		char *helpname = NULL;
- 
- 		err = ctnetlink_parse_help(cda[CTA_HELP], &helpname);
+		struct nlattr *helpinfo = NULL;
+
+		err = ctnetlink_parse_help(cda[CTA_HELP], &helpname, &helpinfo);
  		if (err < 0)
 			goto err2;
 
@@ -1446,6 +1456,9 @@ ctnetlink_create_conntrack(struct net *net, u16 zone,
 				err = -ENOMEM;
 				goto err2;
 			}
+			/* set private helper data if allowed. */
+			if (helper->from_nlattr && helpinfo)
+				helper->from_nlattr(helpinfo, ct);
 
 			/* not in hash table yet so not strictly necessary */
 			RCU_INIT_POINTER(help->helper, helper);
-- 
1.7.10

^ permalink raw reply related

* [PATCH 4/7] netfilter: add glue code to integrate nfnetlink_queue and ctnetlink
From: pablo @ 2012-06-12 18:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: netdev
In-Reply-To: <1339524380-2707-1-git-send-email-pablo@netfilter.org>

From: Pablo Neira Ayuso <pablo@netfilter.org>

This patch allows you to include the conntrack information together
with the packet that is sent to user-space via NFQUEUE.

Previously, there was no integration between ctnetlink and
nfnetlink_queue. If you wanted to access conntrack information
from your libnetfilter_queue program, you required to query
ctnetlink from user-space to obtain it. Thus, delaying the packet
processing even more.

Including the conntrack information is optional, you can set it
via NFQA_CFG_F_CONNTRACK flag with the new NFQA_CFG_FLAGS attribute.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/netfilter.h                 |   10 ++
 include/linux/netfilter/nfnetlink_queue.h |    3 +
 net/netfilter/core.c                      |    4 +
 net/netfilter/nf_conntrack_netlink.c      |  146 ++++++++++++++++++++++++++++-
 net/netfilter/nfnetlink_queue.c           |   47 ++++++++++
 5 files changed, 209 insertions(+), 1 deletion(-)

diff --git a/include/linux/netfilter.h b/include/linux/netfilter.h
index 4541f33..ba65bfb 100644
--- a/include/linux/netfilter.h
+++ b/include/linux/netfilter.h
@@ -393,6 +393,16 @@ nf_nat_decode_session(struct sk_buff *skb, struct flowi *fl, u_int8_t family)
 extern void (*ip_ct_attach)(struct sk_buff *, struct sk_buff *) __rcu;
 extern void nf_ct_attach(struct sk_buff *, struct sk_buff *);
 extern void (*nf_ct_destroy)(struct nf_conntrack *) __rcu;
+
+struct nf_conn;
+struct nlattr;
+
+struct nfq_ct_hook {
+	size_t (*build_size)(const struct nf_conn *ct);
+	int (*build)(struct sk_buff *skb, struct nf_conn *ct);
+	int (*parse)(const struct nlattr *attr, struct nf_conn *ct);
+};
+extern struct nfq_ct_hook *nfq_ct_hook;
 #else
 static inline void nf_ct_attach(struct sk_buff *new, struct sk_buff *skb) {}
 #endif
diff --git a/include/linux/netfilter/nfnetlink_queue.h b/include/linux/netfilter/nfnetlink_queue.h
index a6c1dda..e0d8fd8 100644
--- a/include/linux/netfilter/nfnetlink_queue.h
+++ b/include/linux/netfilter/nfnetlink_queue.h
@@ -42,6 +42,8 @@ enum nfqnl_attr_type {
 	NFQA_IFINDEX_PHYSOUTDEV,	/* __u32 ifindex */
 	NFQA_HWADDR,			/* nfqnl_msg_packet_hw */
 	NFQA_PAYLOAD,			/* opaque data payload */
+	NFQA_CT,			/* nf_conntrack_netlink.h */
+	NFQA_CT_INFO,			/* enum ip_conntrack_info */
 
 	__NFQA_MAX
 };
@@ -92,5 +94,6 @@ enum nfqnl_attr_config {
 
 /* Flags for NFQA_CFG_FLAGS */
 #define NFQA_CFG_F_FAIL_OPEN			(1 << 0)
+#define NFQA_CFG_F_CONNTRACK			(1 << 1)
 
 #endif /* _NFNETLINK_QUEUE_H */
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index e19f365..7eef845 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -264,6 +264,10 @@ void nf_conntrack_destroy(struct nf_conntrack *nfct)
 	rcu_read_unlock();
 }
 EXPORT_SYMBOL(nf_conntrack_destroy);
+
+struct nfq_ct_hook *nfq_ct_hook;
+EXPORT_SYMBOL_GPL(nfq_ct_hook);
+
 #endif /* CONFIG_NF_CONNTRACK */
 
 #ifdef CONFIG_PROC_FS
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index a088920..3b03a86 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1620,6 +1620,140 @@ ctnetlink_new_conntrack(struct sock *ctnl, struct sk_buff *skb,
 	return err;
 }
 
+#if defined(CONFIG_NETFILTER_NETLINK_QUEUE) ||	\
+    defined(CONFIG_NETFILTER_NETLINK_QUEUE_MODULE)
+static size_t
+ctnetlink_nfqueue_build_size(const struct nf_conn *ct)
+{
+	return 3 * nla_total_size(0) /* CTA_TUPLE_ORIG|REPL|MASTER */
+	       + 3 * nla_total_size(0) /* CTA_TUPLE_IP */
+	       + 3 * nla_total_size(0) /* CTA_TUPLE_PROTO */
+	       + 3 * nla_total_size(sizeof(u_int8_t)) /* CTA_PROTO_NUM */
+	       + nla_total_size(sizeof(u_int32_t)) /* CTA_ID */
+	       + nla_total_size(sizeof(u_int32_t)) /* CTA_STATUS */
+	       + nla_total_size(sizeof(u_int32_t)) /* CTA_TIMEOUT */
+	       + nla_total_size(0) /* CTA_PROTOINFO */
+	       + nla_total_size(0) /* CTA_HELP */
+	       + nla_total_size(NF_CT_HELPER_NAME_LEN) /* CTA_HELP_NAME */
+	       + ctnetlink_secctx_size(ct)
+#ifdef CONFIG_NF_NAT_NEEDED
+	       + 2 * nla_total_size(0) /* CTA_NAT_SEQ_ADJ_ORIG|REPL */
+	       + 6 * nla_total_size(sizeof(u_int32_t)) /* CTA_NAT_SEQ_OFFSET */
+#endif
+#ifdef CONFIG_NF_CONNTRACK_MARK
+	       + nla_total_size(sizeof(u_int32_t)) /* CTA_MARK */
+#endif
+	       + ctnetlink_proto_size(ct)
+	       ;
+}
+
+static int
+ctnetlink_nfqueue_build(struct sk_buff *skb, struct nf_conn *ct)
+{
+	struct nlattr *nest_parms;
+
+	rcu_read_lock();
+	nest_parms = nla_nest_start(skb, CTA_TUPLE_ORIG | NLA_F_NESTED);
+	if (!nest_parms)
+		goto nla_put_failure;
+	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_ORIGINAL)) < 0)
+		goto nla_put_failure;
+	nla_nest_end(skb, nest_parms);
+
+	nest_parms = nla_nest_start(skb, CTA_TUPLE_REPLY | NLA_F_NESTED);
+	if (!nest_parms)
+		goto nla_put_failure;
+	if (ctnetlink_dump_tuples(skb, nf_ct_tuple(ct, IP_CT_DIR_REPLY)) < 0)
+		goto nla_put_failure;
+	nla_nest_end(skb, nest_parms);
+
+	if (nf_ct_zone(ct)) {
+		if (nla_put_be16(skb, CTA_ZONE, htons(nf_ct_zone(ct))))
+			goto nla_put_failure;
+	}
+
+	if (ctnetlink_dump_id(skb, ct) < 0)
+		goto nla_put_failure;
+
+	if (ctnetlink_dump_status(skb, ct) < 0)
+		goto nla_put_failure;
+
+	if (ctnetlink_dump_timeout(skb, ct) < 0)
+		goto nla_put_failure;
+
+	if (ctnetlink_dump_protoinfo(skb, ct) < 0)
+		goto nla_put_failure;
+
+	if (ctnetlink_dump_helpinfo(skb, ct) < 0)
+		goto nla_put_failure;
+
+#ifdef CONFIG_NF_CONNTRACK_SECMARK
+	if (ct->secmark && ctnetlink_dump_secctx(skb, ct) < 0)
+		goto nla_put_failure;
+#endif
+	if (ct->master && ctnetlink_dump_master(skb, ct) < 0)
+		goto nla_put_failure;
+
+	if ((ct->status & IPS_SEQ_ADJUST) &&
+	    ctnetlink_dump_nat_seq_adj(skb, ct) < 0)
+		goto nla_put_failure;
+
+#ifdef CONFIG_NF_CONNTRACK_MARK
+	if (ct->mark && ctnetlink_dump_mark(skb, ct) < 0)
+		goto nla_put_failure;
+#endif
+	rcu_read_unlock();
+	return 0;
+
+nla_put_failure:
+	rcu_read_unlock();
+	return -ENOSPC;
+}
+
+static int
+ctnetlink_nfqueue_parse_ct(const struct nlattr *cda[], struct nf_conn *ct)
+{
+	int err;
+
+	if (cda[CTA_TIMEOUT]) {
+		err = ctnetlink_change_timeout(ct, cda);
+		if (err < 0)
+			return err;
+	}
+	if (cda[CTA_STATUS]) {
+		err = ctnetlink_change_status(ct, cda);
+		if (err < 0)
+			return err;
+	}
+	if (cda[CTA_HELP]) {
+		err = ctnetlink_change_helper(ct, cda);
+		if (err < 0)
+			return err;
+	}
+#if defined(CONFIG_NF_CONNTRACK_MARK)
+	if (cda[CTA_MARK])
+		ct->mark = ntohl(nla_get_be32(cda[CTA_MARK]));
+#endif
+	return 0;
+}
+
+static int
+ctnetlink_nfqueue_parse(const struct nlattr *attr, struct nf_conn *ct)
+{
+	struct nlattr *cda[CTA_MAX+1];
+
+	nla_parse_nested(cda, CTA_MAX, attr, ct_nla_policy);
+
+	return ctnetlink_nfqueue_parse_ct((const struct nlattr **)cda, ct);
+}
+
+static struct nfq_ct_hook ctnetlink_nfqueue_hook = {
+	.build_size	= ctnetlink_nfqueue_build_size,
+	.build		= ctnetlink_nfqueue_build,
+	.parse		= ctnetlink_nfqueue_parse,
+};
+#endif /* CONFIG_NETFILTER_NETLINK_QUEUE */
+
 /***********************************************************************
  * EXPECT
  ***********************************************************************/
@@ -2424,7 +2558,12 @@ static int __init ctnetlink_init(void)
 		pr_err("ctnetlink_init: cannot register pernet operations\n");
 		goto err_unreg_exp_subsys;
 	}
-
+#if defined(CONFIG_NETFILTER_NETLINK_QUEUE) ||	\
+    defined(CONFIG_NETFILTER_NETLINK_QUEUE_MODULE)
+	/* setup interaction between nf_queue and nf_conntrack_netlink. */
+	RCU_INIT_POINTER(nfq_ct_hook, &ctnetlink_nfqueue_hook);
+	printk("registering nf_queue and ctnetlink interaction\n");
+#endif
 	return 0;
 
 err_unreg_exp_subsys:
@@ -2442,6 +2581,11 @@ static void __exit ctnetlink_exit(void)
 	unregister_pernet_subsys(&ctnetlink_net_ops);
 	nfnetlink_subsys_unregister(&ctnl_exp_subsys);
 	nfnetlink_subsys_unregister(&ctnl_subsys);
+#if defined(CONFIG_NETFILTER_NETLINK_QUEUE) ||	\
+    defined(CONFIG_NETFILTER_NETLINK_QUEUE_MODULE)
+	RCU_INIT_POINTER(nfq_ct_hook, NULL);
+	printk("unregistering nf_queue and ctnetlink interaction\n");
+#endif
 }
 
 module_init(ctnetlink_init);
diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index 630da3d..ab1fd1c 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -30,6 +30,7 @@
 #include <linux/list.h>
 #include <net/sock.h>
 #include <net/netfilter/nf_queue.h>
+#include <net/netfilter/nf_conntrack.h>
 
 #include <linux/atomic.h>
 
@@ -233,6 +234,9 @@ nfqnl_build_packet_message(struct nfqnl_instance *queue,
 	struct sk_buff *entskb = entry->skb;
 	struct net_device *indev;
 	struct net_device *outdev;
+	struct nfq_ct_hook *nfq_ct;
+	struct nf_conn *ct = NULL;
+	enum ip_conntrack_info uninitialized_var(ctinfo);
 
 	size =    NLMSG_SPACE(sizeof(struct nfgenmsg))
 		+ nla_total_size(sizeof(struct nfqnl_msg_packet_hdr))
@@ -266,6 +270,17 @@ nfqnl_build_packet_message(struct nfqnl_instance *queue,
 		break;
 	}
 
+	/* rcu_read_lock()ed by __nf_queue already. */
+	nfq_ct = rcu_dereference(nfq_ct_hook);
+	if (nfq_ct != NULL && (queue->flags & NFQA_CFG_F_CONNTRACK)) {
+		ct = nf_ct_get(entskb, &ctinfo);
+		if (ct) {
+			if (!nf_ct_is_untracked(ct))
+				size += nfq_ct->build_size(ct);
+			else
+				ct = NULL;
+		}
+	}
 
 	skb = alloc_skb(size, GFP_ATOMIC);
 	if (!skb)
@@ -389,6 +404,24 @@ nfqnl_build_packet_message(struct nfqnl_instance *queue,
 			BUG();
 	}
 
+	if (ct) {
+		struct nlattr *nest_parms;
+		u_int32_t tmp;
+
+		nest_parms = nla_nest_start(skb, NFQA_CT | NLA_F_NESTED);
+		if (!nest_parms)
+			goto nla_put_failure;
+
+		if (nfq_ct->build(skb, ct) < 0)
+			goto nla_put_failure;
+
+		nla_nest_end(skb, nest_parms);
+
+		tmp = ctinfo;
+		if (nla_put_u32(skb, NFQA_CT_INFO, htonl(ctinfo)))
+			goto nla_put_failure;
+	}
+
 	nlh->nlmsg_len = skb->tail - old_tail;
 	return skb;
 
@@ -732,6 +765,7 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 	struct nfqnl_instance *queue;
 	unsigned int verdict;
 	struct nf_queue_entry *entry;
+	struct nfq_ct_hook *nfq_ct;
 
 	queue = instance_lookup(queue_num);
 	if (!queue)
@@ -750,6 +784,19 @@ nfqnl_recv_verdict(struct sock *ctnl, struct sk_buff *skb,
 	if (entry == NULL)
 		return -ENOENT;
 
+	rcu_read_lock();
+	nfq_ct = rcu_dereference(nfq_ct_hook);
+	if (nfq_ct != NULL &&
+	    (queue->flags & NFQA_CFG_F_CONNTRACK) && nfqa[NFQA_CT]) {
+		enum ip_conntrack_info ctinfo;
+		struct nf_conn *ct;
+
+		ct = nf_ct_get(entry->skb, &ctinfo);
+		if (ct && !nf_ct_is_untracked(ct))
+			nfq_ct->parse(nfqa[NFQA_CT], ct);
+	}
+	rcu_read_unlock();
+
 	if (nfqa[NFQA_PAYLOAD]) {
 		if (nfqnl_mangle(nla_data(nfqa[NFQA_PAYLOAD]),
 				 nla_len(nfqa[NFQA_PAYLOAD]), entry) < 0)
-- 
1.7.10

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox