Netdev List
* Re: [PATCH net-next-2.6 1/5] jme: Fix PHY power-off error
From: David Miller @ 2010-10-21 10:12 UTC (permalink / raw)
  To: cooldavid; +Cc: netdev, stable
In-Reply-To: <1287447044-24471-1-git-send-email-cooldavid@cooldavid.org>


All 5 patches applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] bnx2: Increase max rx ring size from 1K to 2K
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: mchan; +Cc: andy, jfeeney, netdev
In-Reply-To: <1287448254-14173-1-git-send-email-mchan@broadcom.com>

From: "Michael Chan" <mchan@broadcom.com>
Date: Mon, 18 Oct 2010 17:30:54 -0700

> A number of customers are reporting packet loss under certain workloads
> (e.g. heavy bursts of small packets) with flow control disabled.  A larger
> rx ring helps to prevent these losses.
> 
> No change in default rx ring size and memory consumption.
> 
> Signed-off-by: Andy Gospodarek <andy@greyhouse.net>
> Acked-by: John Feeney <jfeeney@redhat.com>
> Signed-off-by: Michael Chan <mchan@broadcom.com>

Ok, since the new limit is not the default, applied.

Thanks for the explanation Michael.

^ permalink raw reply

* Re: [PATCH net-next] sfc: make functions static
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: bhutchings; +Cc: shemminger, shodgson, linux-net-drivers, netdev
In-Reply-To: <1287421518.2252.219.camel@achroite.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Mon, 18 Oct 2010 18:05:18 +0100

> On Mon, 2010-10-18 at 08:27 -0700, Stephen Hemminger wrote:
>> Make local functions and variable static. Do some rearrangement
>> of the string table stuff to put it where it gets used.
>> 
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
> 
> Acked-by: Ben Hutchings <bhutchings@solarflare.com>
> 
> We might have to change some of these back in future, but I suppose
> there is no harm in making them static now.

Applied.

^ permalink raw reply

* Re: [PATCH net-next-2.6] mlx4: make functions local and remove dead code.
From: David Miller @ 2010-10-21 10:13 UTC (permalink / raw)
  To: yevgenyp; +Cc: shemminger, netdev, eli
In-Reply-To: <E113D394D7C5DB4F8FF691FA7EE9DB443CC3716F65@MTLMAIL.mtl.com>

From: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
Date: Tue, 19 Oct 2010 09:37:44 +0200

>> 
>> There is a whole section of code in this driver related to vlan tables
>> which is not accessed from any kernel code.
>> 
>> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
>>
> 
> Thanks for sending this, 
> There are patches under review at the moment (RDMA over Ethernet) that use this code:
> http://www.spinics.net/lists/linux-rdma/msg05512.html, which will be broken if the VLAN tables code is removed.

ok.

^ permalink raw reply

* Re: [PATCH net-next] bonding: make bond_resend_igmp_join_requests_delayed static
From: David Miller @ 2010-10-21 10:14 UTC (permalink / raw)
  To: shemminger; +Cc: fubar, netdev, bonding-devel
In-Reply-To: <20101015140256.4192fd34@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 14:02:56 -0700

> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] vmxnet3: make bit twiddle routines inline
From: David Miller @ 2010-10-21 10:14 UTC (permalink / raw)
  To: shemminger; +Cc: sbhatewara, pv-drivers, netdev
In-Reply-To: <20101015140620.63d0a615@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 14:06:20 -0700

> Gcc doesn't usually handle inline across compilation units, and the
> functions don't have to be global in scope. Move the set/reset flag
> functions into the existing vmxnet3 header.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] pch_gbe: make local functions static
From: David Miller @ 2010-10-21 10:14 UTC (permalink / raw)
  To: shemminger; +Cc: masa-korg, netdev
In-Reply-To: <20101015140914.01d8ef08@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 14:09:14 -0700

> Make routines that are only used in one file static.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] bonding: make release_and_destroy static
From: David Miller @ 2010-10-21 10:15 UTC (permalink / raw)
  To: shemminger; +Cc: fubar, bonding-devel, netdev
In-Reply-To: <20101015080934.6dc28388@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 08:09:34 -0700

> Only used in main file.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] rtnetlink: remove rtnl_kill_links
From: David Miller @ 2010-10-21 10:15 UTC (permalink / raw)
  To: shemminger; +Cc: netdev
In-Reply-To: <20101015081201.55eae031@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 08:12:01 -0700

> The function rtnl_kill_links is defined but never used.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] xfrm: make xfrm_bundle_ok local
From: David Miller @ 2010-10-21 10:15 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, netdev
In-Reply-To: <20101015081419.79252990@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 08:14:19 -0700

> Only used in one place.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] xfrm6: make xfrm6_tunnel_free_spi local
From: David Miller @ 2010-10-21 10:15 UTC (permalink / raw)
  To: shemminger; +Cc: herbert, yoshfuji, netdev
In-Reply-To: <20101015081559.6ede35a0@nehalam>

From: Stephen Hemminger <shemminger@vyatta.com>
Date: Fri, 15 Oct 2010 08:15:59 -0700

> Function only defined and used in one file.
> 
> Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

Applied.

^ permalink raw reply

* Re: [PATCH] net/nuc900: change dev_warn to dev_dbg when link down occurs
From: David Miller @ 2010-10-21 10:16 UTC (permalink / raw)
  To: mcuos.com; +Cc: netdev, yachen, linux-kernel
In-Reply-To: <4CB86C87.5010207@gmail.com>

From: Wan ZongShun <mcuos.com@gmail.com>
Date: Fri, 15 Oct 2010 23:00:23 +0800

> Hi David,
> 
> When the network cable is not connected, the warning messages are
> shown continuously,
> so I changed the dev_warn to dev_dbg.
> 
> Signed-off-by: Wan ZongShun <mcuos.com@gmail.com>

Your patch does not apply to the current tree.

^ permalink raw reply

* Re: [PATCH] net_sched: remove the unused parameter of qdisc_create_dflt()
From: David Miller @ 2010-10-21 10:16 UTC (permalink / raw)
  To: hadi; +Cc: xiaosuo, shemminger, netdev, netem
In-Reply-To: <1287404425.3664.186.camel@bigi>

From: jamal <hadi@cyberus.ca>
Date: Mon, 18 Oct 2010 08:20:25 -0400

> On Sun, 2010-10-17 at 07:04 +0800, Changli Gao wrote:
>> The first parameter dev isn't in use in qdisc_create_dflt().
> 
> And we can derive it from netdev_queue if ever needed in the future.
> 
> Acked-by: Jamal Hadi Salim <hadi@cyberus.ca>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH] smsc95xx: generate random MAC address once, not every ifup
From: David Miller @ 2010-10-21 10:17 UTC (permalink / raw)
  To: bernard; +Cc: steve.glendinning, netdev, linux-omap
In-Reply-To: <20101018231639.GA15955@mersenne.largestprime.net>

From: Bernard Blackham <bernard@largestprime.net>
Date: Tue, 19 Oct 2010 10:16:39 +1100

> The smsc95xx driver currently generates a new random MAC address
> every time the interface is brought up. This makes it impossible to
> override using the standard `ifconfig hw ether` approach.
> 
> Past patches tried to make the MAC address a module parameter or
> base it off the die ID, but it seems to me much simpler (and
> hopefully less controversial) to stick with the current random
> generation scheme, but allow the user to change the address.
> 
> This patch does exactly that - it moves the random address
> generation from smsc95xx_reset() into smsc95xx_bind(), so that it is
> done once on module load, not on every ifup. The user can then
> override this using the standard mechanisms.
> 
> Applies against 2.6.35 and linux-2.6 head.
> 
> Signed-off-by: Bernard Blackham <b-omap@largestprime.net>

Applied.

^ permalink raw reply

* Re: [PATCH 0/2] bonding: checkpatch.pl cleanups in bond_3ad.c
From: David Miller @ 2010-10-21 10:17 UTC (permalink / raw)
  To: bandan.das; +Cc: linux-kernel, netdev, joe, eric.dumazet
In-Reply-To: <1287296399-25142-1-git-send-email-bandan.das@stratus.com>

From: Bandan Das <bandan.das@stratus.com>
Date: Sun, 17 Oct 2010 02:19:57 -0400

> Just some cleanups based on errors/warnings reported by checkpatch.pl.
> There are still a ton of lines greater than 80 chars and I tried to fix 
> a few that I came across.

Both patches applied, thank you.

^ permalink raw reply

* Re: [PATCH] ipv4: synchronize bind() with RTM_NEWADDR notifications
From: Eric Dumazet @ 2010-10-21 10:25 UTC (permalink / raw)
  To: Timo Teräs; +Cc: netdev
In-Reply-To: <1287655930-16879-1-git-send-email-timo.teras@iki.fi>

Le jeudi 21 octobre 2010 à 13:12 +0300, Timo Teräs a écrit :
> Otherwise we have a race condition visible to userland:
>  1. process A changes IP address
>  2. kernel sends RTM_NEWADDR
>  3. process B gets notification
>  4. process B tries to bind() to new IP but that fails with
> EADDRNOTAVAIL because FIB is not yet updated and inet_addr_type() in
> inet_bind() does not recognize the IP as local
>  5. kernel calls inetaddr_chain notifiers which updates FIB
> 
> IPv6 side seems to handle the notifications properly: bind()
> immediately after RTM_NEWADDR succeeds as expected. This is because
> ipv6_chk_addr() uses inet6_addr_lst which is updated before address
> notification.
> 
> Signed-off-by: Timo Teräs <timo.teras@iki.fi>
> ---
>  net/ipv4/af_inet.c  |    9 +++++++++
>  net/ipv6/af_inet6.c |    4 +++-
>  2 files changed, 12 insertions(+), 1 deletions(-)
> 
> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
> index 6a1100c..21200e4 100644
> --- a/net/ipv4/af_inet.c
> +++ b/net/ipv4/af_inet.c
> @@ -466,6 +466,15 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
>  	if (addr_len < sizeof(struct sockaddr_in))
>  		goto out;
>  
> +	/* Acquire rtnl_lock to synchronize with possible simultaneous
> +	 * IP-address changes. This is needed because when RTM_NEWADDR
> +	 * is sent the new IP is not yet in FIB, but alas inet_addr_type
> +	 * checks the address type using FIB. Acquiring rtnl lock once
> +	 * makes sure that any address for which RTM_NEWADDR was sent
> +	 * earlier exists also in FIB. */
> +	rtnl_lock();
> +	rtnl_unlock();

You must be kidding ?

Really, this is a hot path...




^ permalink raw reply

* Re: [PATCH 9/9] tproxy: use the interface primary IP address as a default value for --on-ip
From: KOVACS Krisztian @ 2010-10-21 10:32 UTC (permalink / raw)
  To: Jan Engelhardt; +Cc: netdev, netfilter-devel, Patrick McHardy, David Miller
In-Reply-To: <alpine.LNX.2.01.1010211051560.22922@obet.zrqbmnf.qr>

Hi,

On Thu, 2010-10-21 at 11:12 +0200, Jan Engelhardt wrote:
> On Wednesday 2010-10-20 13:21, KOVACS Krisztian wrote:
> >+
> >+	if (!ipv6_addr_any(user_laddr))
> >+		return user_laddr;
> >+	laddr = NULL;
> >+
> >+	rcu_read_lock();
> >+	indev = __in6_dev_get(skb->dev);
> >+	if (indev)
> >+		list_for_each_entry(ifa, &indev->addr_list, if_list) {
> >+			/* FIXME: address selection */
> 
> Per our realworld discussion, I believe we should add checks for
> some conditions (RFC 4862 section 2):
> 
> 1. ignore tentative addresses
> 
> 	if (ifa->ifa_flags & IFA_F_TENTATIVE)
> 		continue;

Yep, we should definitely do that.

> 2. tests for when the interface's ifa->preferred_lft == 0/deprecatedness:
> 
> 	if (ctinfo == IP_CT_NEW/RELATED && (ifa->ifa_flags & IFA_F_DEPRECATED))
> 		continue;

Well, we can use the TPROXY target without conntrack so we cannot really
check if ctinfo==IP_CT_NEW/RELATED. However, we do address selection
only in case no established socket was found for the skb, so I think
it's safe to ignore addresses with IFA_F_DEPRECATED set.

Something like

	if (ifa->ifa_flags & (IFA_F_TENTATIVE | IFA_F_DEPRECATED))
		continue;

> 3. check for invalid addresses
> (There might be a flag like tentative..)
> 
> 	if (ifa->valid_lft == 0)
> 		continue;

I didn't find such a flag and it seems that valid_lft contains the
lifetime in seconds, so I think it's not that simple. The addrconf.c
code does something like this:

	unsigned long age = (jiffies - ifp->tstamp) / HZ;
	if (age >= ifp->valid_lft) {
		/* not valid */
	}

The code also sets the deprecated flag when the preferred lifetime
expires, and I think prefered_lft must be less than or equal to valid_lft.
If that is so, an address past its valid lifetime is already deprecated,
so this case should be covered by checking for IFA_F_DEPRECATED above.
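
To make the proposed filtering concrete, here is a minimal standalone sketch
of the flag check and of the lifetime age test quoted from addrconf.c. It is
not the patch code: the sketch_* names and the struct are hypothetical
stand-ins for the kernel's interface address list, and the two flag values
are copied from <linux/if_addr.h> only so that the example builds outside
the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Same numeric values as IFA_F_DEPRECATED / IFA_F_TENTATIVE in
 * <linux/if_addr.h>; renamed because this is a userland illustration. */
#define SKETCH_F_DEPRECATED 0x20u
#define SKETCH_F_TENTATIVE  0x40u

/* Hypothetical stand-in for one entry of an interface's address list. */
struct sketch_ifaddr {
	int id;              /* illustration only: identifies the entry */
	unsigned int flags;  /* SKETCH_F_* bits */
};

/* Return the first address that is neither tentative nor deprecated,
 * or NULL if every address is filtered out. */
const struct sketch_ifaddr *
sketch_pick_laddr(const struct sketch_ifaddr *addrs, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (addrs[i].flags & (SKETCH_F_TENTATIVE | SKETCH_F_DEPRECATED))
			continue;
		return &addrs[i];
	}
	return NULL;
}

/* Age test modeled on the addrconf.c fragment: "now" and "tstamp" are in
 * ticks, "hz" is ticks per second, "valid_lft" is in seconds. */
int sketch_lifetime_expired(unsigned long now, unsigned long tstamp,
			    unsigned long hz, unsigned long valid_lft)
{
	unsigned long age;

	assert(hz != 0);
	age = (now - tstamp) / hz;
	return age >= valid_lft;
}
```

With a list holding a tentative, a deprecated and a plain address, the
selection returns the plain one. If the lifetime reasoning above holds, the
separate age test is redundant for address selection and is shown only to
illustrate what the flag check is assumed to cover.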

--KK



^ permalink raw reply

* Re: [PATCH] ipv4: synchronize bind() with RTM_NEWADDR notifications
From: Timo Teräs @ 2010-10-21 10:41 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: netdev
In-Reply-To: <1287656753.6871.46.camel@edumazet-laptop>

On 10/21/2010 01:25 PM, Eric Dumazet wrote:
> Le jeudi 21 octobre 2010 à 13:12 +0300, Timo Teräs a écrit :
>> Otherwise we have a race condition visible to userland:
>>  1. process A changes IP address
>>  2. kernel sends RTM_NEWADDR
>>  3. process B gets notification
>>  4. process B tries to bind() to new IP but that fails with
>> EADDRNOTAVAIL because FIB is not yet updated and inet_addr_type() in
>> inet_bind() does not recognize the IP as local
>>  5. kernel calls inetaddr_chain notifiers which updates FIB
>>
>> IPv6 side seems to handle the notifications properly: bind()
>> immediately after RTM_NEWADDR succeeds as expected. This is because
>> ipv6_chk_addr() uses inet6_addr_lst which is updated before address
>> notification.
>>
>> Signed-off-by: Timo Teräs <timo.teras@iki.fi>
>> ---
>>  net/ipv4/af_inet.c  |    9 +++++++++
>>  net/ipv6/af_inet6.c |    4 +++-
>>  2 files changed, 12 insertions(+), 1 deletions(-)
>>
>> diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
>> index 6a1100c..21200e4 100644
>> --- a/net/ipv4/af_inet.c
>> +++ b/net/ipv4/af_inet.c
>> @@ -466,6 +466,15 @@ int inet_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
>>  	if (addr_len < sizeof(struct sockaddr_in))
>>  		goto out;
>>  
>> +	/* Acquire rtnl_lock to synchronize with possible simultaneous
>> +	 * IP-address changes. This is needed because when RTM_NEWADDR
>> +	 * is sent the new IP is not yet in FIB, but alas inet_addr_type
>> +	 * checks the address type using FIB. Acquiring rtnl lock once
>> +	 * makes sure that any address for which RTM_NEWADDR was sent
>> +	 * earlier exists also in FIB. */
>> +	rtnl_lock();
>> +	rtnl_unlock();
> 
> You must be kidding ?
> 
> Really, this is a hot path...

Is inet_bind() called from non-userland context? If yes, then this is a
bad idea. Otherwise I don't think it's that hot a path...

The other idea of doing notifier calls before RTM_NEWADDR sending is
worse because it changes ordering of userland visible netlink notifications.

This looked like the easiest way out. If this is unacceptable, I guess
we are left with changing inet_addr_type() to not use FIB.

Or are there better ideas?
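
For readers unfamiliar with the idiom, the empty lock/unlock pair acts as a
barrier: it cannot return until any critical section that was already
running has finished. Below is a rough userland sketch of that idea using
pthreads. It assumes, as the patch comment does, that the notification and
the FIB update happen inside the same rtnl-protected section. All sketch_*
names are hypothetical and none of this is kernel code.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

static pthread_mutex_t sketch_rtnl = PTHREAD_MUTEX_INITIALIZER; /* plays rtnl_lock */
static atomic_int sketch_notified;  /* plays "RTM_NEWADDR has been sent" */
static int sketch_fib_has_addr;     /* plays the FIB entry, written under the lock */

/* Plays the address change: notify first, update the "FIB" later,
 * both inside one critical section. */
static void *sketch_writer(void *arg)
{
	int i;

	(void)arg;
	pthread_mutex_lock(&sketch_rtnl);
	atomic_store(&sketch_notified, 1);   /* step 2: notification goes out */
	for (i = 0; i < 1000; i++)
		sched_yield();               /* widen the race window */
	sketch_fib_has_addr = 1;             /* step 5: FIB updated afterwards */
	pthread_mutex_unlock(&sketch_rtnl);
	return NULL;
}

/* Plays process B: wait for the notification, pass the barrier, then look
 * at the "FIB". Returns what a bind()-like check would observe, or -1 if
 * the writer thread could not be started. */
int sketch_bind_after_barrier(void)
{
	pthread_t t;
	int seen;

	if (pthread_create(&t, NULL, sketch_writer, NULL) != 0)
		return -1;

	while (!atomic_load(&sketch_notified))
		sched_yield();               /* step 3: notification received */

	/* The barrier: blocks until the writer leaves its critical section. */
	pthread_mutex_lock(&sketch_rtnl);
	pthread_mutex_unlock(&sketch_rtnl);

	seen = sketch_fib_has_addr;          /* ordered after the writer's unlock */
	pthread_join(t, NULL);
	return seen;
}
```

Without the lock/unlock pair the reader could observe the flag before the
update, which is the EADDRNOTAVAIL case in step 4. The cost is the one
raised in the objection: every caller takes a global lock, even when no
address change is in flight.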



^ permalink raw reply

* Re: [PATCH 1/3] tproxy: kick out TIME_WAIT sockets in case a new connection comes in with the same tuple
From: Patrick McHardy @ 2010-10-21 10:45 UTC (permalink / raw)
  To: KOVACS Krisztian; +Cc: netdev, netfilter-devel, David Miller
In-Reply-To: <20101020112142.6538.90622.stgit@este.odu>

Am 20.10.2010 13:21, schrieb KOVACS Krisztian:
> Without tproxy redirections an incoming SYN kicks out conflicting
> TIME_WAIT sockets, in order to handle clients that reuse ports
> within the TIME_WAIT period.
> 
> The same mechanism didn't work in case TProxy is involved in finding
> the proper socket, as the time_wait processing code looked up the
> listening socket assuming that the listener addr/port matches those
> of the established connection.
> 
> This is not the case with TProxy as the listener addr/port is possibly
> changed with the tproxy rule.

Applied, thanks.

^ permalink raw reply

* [PATCH v2 0/9] tproxy: add IPv6 support
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller

The following series adds IPv6 support for tproxy. The parts touching
non-Netfilter code include exporting the UDP lookup function, adding the
sockopt infrastructure for getting the original destination address and
allowing non-local binds if the IP_TRANSPARENT socket option is set.

Netfilter changes are splitting the defragmentation code off of conntrack,
adding IPv6 socket lookup helpers to the tproxy core module and updating the
socket match and the TPROXY target.

The last patch in the series tries to make it easier to use the TPROXY target
by selecting a meaningful address to redirect to in case the user did not
explicitly specify it with '--on-ip'.

v2 of the patches incorporates fixes suggested by Jan Engelhardt.

---

Balazs Scheidler (9):
      tproxy: split off ipv6 defragmentation to a separate module
      tproxy: added const specifiers to udp lookup functions
      tproxy: added udp6_lib_lookup function
      tproxy: added tproxy sockopt interface in the IPV6 layer
      tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
      tproxy: added IPv6 socket lookup function to nf_tproxy_core
      tproxy: added IPv6 support to the TPROXY target
      tproxy: added IPv6 support to the socket match
      tproxy: use the interface primary IP address as a default value for --on-ip


 include/linux/in6.h                            |    4 
 include/linux/ipv6.h                           |    4 
 include/linux/netfilter/xt_TPROXY.h            |   13 +
 include/net/netfilter/ipv6/nf_defrag_ipv6.h    |    6 
 include/net/netfilter/nf_tproxy_core.h         |   72 +++++
 include/net/udp.h                              |    3 
 net/ipv6/af_inet6.c                            |    2 
 net/ipv6/datagram.c                            |   19 +
 net/ipv6/ipv6_sockglue.c                       |   23 ++
 net/ipv6/netfilter/Makefile                    |    5 
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   78 ------
 net/ipv6/netfilter/nf_conntrack_reasm.c        |   12 +
 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c      |  131 ++++++++++
 net/ipv6/udp.c                                 |   16 +
 net/netfilter/xt_TPROXY.c                      |  328 +++++++++++++++++++++---
 net/netfilter/xt_socket.c                      |  165 +++++++++++-
 16 files changed, 743 insertions(+), 138 deletions(-)
 create mode 100644 include/net/netfilter/ipv6/nf_defrag_ipv6.h
 create mode 100644 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c

-- 
KOVACS Krisztian


^ permalink raw reply

* [PATCH v2 3/9] tproxy: added udp6_lib_lookup function
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller
In-Reply-To: <20101021104709.5192.31249.stgit@este.odu>

From: Balazs Scheidler <bazsi@balabit.hu>

Just like with IPv4, we need access to the UDP hash table to look up local
sockets, but instead of exporting the global udp_table, export a lookup
function.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
 include/net/udp.h |    3 +++
 net/ipv6/udp.c    |    8 ++++++++
 2 files changed, 11 insertions(+), 0 deletions(-)

diff --git a/include/net/udp.h b/include/net/udp.h
index a184d34..200b828 100644
--- a/include/net/udp.h
+++ b/include/net/udp.h
@@ -183,6 +183,9 @@ extern int udp_lib_setsockopt(struct sock *sk, int level, int optname,
 extern struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
 				    __be32 daddr, __be16 dport,
 				    int dif);
+extern struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be16 sport,
+				    const struct in6_addr *daddr, __be16 dport,
+				    int dif);
 
 /*
  * 	SNMP statistics for UDP and UDP-Lite
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 33e3683..c84dad4 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -320,6 +320,14 @@ static struct sock *__udp6_lib_lookup_skb(struct sk_buff *skb,
 				 udptable);
 }
 
+struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be16 sport,
+			     const struct in6_addr *daddr, __be16 dport, int dif)
+{
+	return __udp6_lib_lookup(net, saddr, sport, daddr, dport, dif, &udp_table);
+}
+EXPORT_SYMBOL_GPL(udp6_lib_lookup);
+
+
 /*
  * 	This should be easy, if there is something there we
  * 	return it, otherwise we block.



^ permalink raw reply related

* [PATCH v2 1/9] tproxy: split off ipv6 defragmentation to a separate module
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller
In-Reply-To: <20101021104709.5192.31249.stgit@este.odu>

From: Balazs Scheidler <bazsi@balabit.hu>

Like with IPv4, TProxy needs IPv6 defragmentation but does not
require connection tracking. Since defragmentation was coupled
with conntrack, I split off the two, creating an nf_defrag_ipv6 module,
similar to the already existing nf_defrag_ipv4.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
 include/net/netfilter/ipv6/nf_defrag_ipv6.h    |    6 +
 net/ipv6/netfilter/Makefile                    |    5 +
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   78 +-------------
 net/ipv6/netfilter/nf_conntrack_reasm.c        |   12 ++
 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c      |  131 ++++++++++++++++++++++++
 5 files changed, 154 insertions(+), 78 deletions(-)
 create mode 100644 include/net/netfilter/ipv6/nf_defrag_ipv6.h
 create mode 100644 net/ipv6/netfilter/nf_defrag_ipv6_hooks.c

diff --git a/include/net/netfilter/ipv6/nf_defrag_ipv6.h b/include/net/netfilter/ipv6/nf_defrag_ipv6.h
new file mode 100644
index 0000000..94dd54d
--- /dev/null
+++ b/include/net/netfilter/ipv6/nf_defrag_ipv6.h
@@ -0,0 +1,6 @@
+#ifndef _NF_DEFRAG_IPV6_H
+#define _NF_DEFRAG_IPV6_H
+
+extern void nf_defrag_ipv6_enable(void);
+
+#endif /* _NF_DEFRAG_IPV6_H */
diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
index aafbba3..3f8e4a3 100644
--- a/net/ipv6/netfilter/Makefile
+++ b/net/ipv6/netfilter/Makefile
@@ -11,10 +11,11 @@ obj-$(CONFIG_IP6_NF_RAW) += ip6table_raw.o
 obj-$(CONFIG_IP6_NF_SECURITY) += ip6table_security.o
 
 # objects for l3 independent conntrack
-nf_conntrack_ipv6-objs  :=  nf_conntrack_l3proto_ipv6.o nf_conntrack_proto_icmpv6.o nf_conntrack_reasm.o
+nf_conntrack_ipv6-objs  :=  nf_conntrack_l3proto_ipv6.o nf_conntrack_proto_icmpv6.o
+nf_defrag_ipv6-objs := nf_defrag_ipv6_hooks.o nf_conntrack_reasm.o
 
 # l3 independent conntrack
-obj-$(CONFIG_NF_CONNTRACK_IPV6) += nf_conntrack_ipv6.o
+obj-$(CONFIG_NF_CONNTRACK_IPV6) += nf_conntrack_ipv6.o nf_defrag_ipv6.o
 
 # matches
 obj-$(CONFIG_IP6_NF_MATCH_AH) += ip6t_ah.o
diff --git a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
index ff43461..c8af58b 100644
--- a/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
+++ b/net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c
@@ -16,7 +16,6 @@
 #include <linux/module.h>
 #include <linux/skbuff.h>
 #include <linux/icmp.h>
-#include <linux/sysctl.h>
 #include <net/ipv6.h>
 #include <net/inet_frag.h>
 
@@ -29,6 +28,7 @@
 #include <net/netfilter/nf_conntrack_core.h>
 #include <net/netfilter/nf_conntrack_zones.h>
 #include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
+#include <net/netfilter/ipv6/nf_defrag_ipv6.h>
 #include <net/netfilter/nf_log.h>
 
 static bool ipv6_pkt_to_tuple(const struct sk_buff *skb, unsigned int nhoff,
@@ -189,53 +189,6 @@ out:
 	return nf_conntrack_confirm(skb);
 }
 
-static enum ip6_defrag_users nf_ct6_defrag_user(unsigned int hooknum,
-						struct sk_buff *skb)
-{
-	u16 zone = NF_CT_DEFAULT_ZONE;
-
-	if (skb->nfct)
-		zone = nf_ct_zone((struct nf_conn *)skb->nfct);
-
-#ifdef CONFIG_BRIDGE_NETFILTER
-	if (skb->nf_bridge &&
-	    skb->nf_bridge->mask & BRNF_NF_BRIDGE_PREROUTING)
-		return IP6_DEFRAG_CONNTRACK_BRIDGE_IN + zone;
-#endif
-	if (hooknum == NF_INET_PRE_ROUTING)
-		return IP6_DEFRAG_CONNTRACK_IN + zone;
-	else
-		return IP6_DEFRAG_CONNTRACK_OUT + zone;
-
-}
-
-static unsigned int ipv6_defrag(unsigned int hooknum,
-				struct sk_buff *skb,
-				const struct net_device *in,
-				const struct net_device *out,
-				int (*okfn)(struct sk_buff *))
-{
-	struct sk_buff *reasm;
-
-	/* Previously seen (loopback)?  */
-	if (skb->nfct && !nf_ct_is_template((struct nf_conn *)skb->nfct))
-		return NF_ACCEPT;
-
-	reasm = nf_ct_frag6_gather(skb, nf_ct6_defrag_user(hooknum, skb));
-	/* queued */
-	if (reasm == NULL)
-		return NF_STOLEN;
-
-	/* error occured or not fragmented */
-	if (reasm == skb)
-		return NF_ACCEPT;
-
-	nf_ct_frag6_output(hooknum, reasm, (struct net_device *)in,
-			   (struct net_device *)out, okfn);
-
-	return NF_STOLEN;
-}
-
 static unsigned int __ipv6_conntrack_in(struct net *net,
 					unsigned int hooknum,
 					struct sk_buff *skb,
@@ -288,13 +241,6 @@ static unsigned int ipv6_conntrack_local(unsigned int hooknum,
 
 static struct nf_hook_ops ipv6_conntrack_ops[] __read_mostly = {
 	{
-		.hook		= ipv6_defrag,
-		.owner		= THIS_MODULE,
-		.pf		= NFPROTO_IPV6,
-		.hooknum	= NF_INET_PRE_ROUTING,
-		.priority	= NF_IP6_PRI_CONNTRACK_DEFRAG,
-	},
-	{
 		.hook		= ipv6_conntrack_in,
 		.owner		= THIS_MODULE,
 		.pf		= NFPROTO_IPV6,
@@ -309,13 +255,6 @@ static struct nf_hook_ops ipv6_conntrack_ops[] __read_mostly = {
 		.priority	= NF_IP6_PRI_CONNTRACK,
 	},
 	{
-		.hook		= ipv6_defrag,
-		.owner		= THIS_MODULE,
-		.pf		= NFPROTO_IPV6,
-		.hooknum	= NF_INET_LOCAL_OUT,
-		.priority	= NF_IP6_PRI_CONNTRACK_DEFRAG,
-	},
-	{
 		.hook		= ipv6_confirm,
 		.owner		= THIS_MODULE,
 		.pf		= NFPROTO_IPV6,
@@ -387,10 +326,6 @@ struct nf_conntrack_l3proto nf_conntrack_l3proto_ipv6 __read_mostly = {
 	.nlattr_to_tuple	= ipv6_nlattr_to_tuple,
 	.nla_policy		= ipv6_nla_policy,
 #endif
-#ifdef CONFIG_SYSCTL
-	.ctl_table_path		= nf_net_netfilter_sysctl_path,
-	.ctl_table		= nf_ct_ipv6_sysctl_table,
-#endif
 	.me			= THIS_MODULE,
 };
 
@@ -403,16 +338,12 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
 	int ret = 0;
 
 	need_conntrack();
+	nf_defrag_ipv6_enable();
 
-	ret = nf_ct_frag6_init();
-	if (ret < 0) {
-		pr_err("nf_conntrack_ipv6: can't initialize frag6.\n");
-		return ret;
-	}
 	ret = nf_conntrack_l4proto_register(&nf_conntrack_l4proto_tcp6);
 	if (ret < 0) {
 		pr_err("nf_conntrack_ipv6: can't register tcp.\n");
-		goto cleanup_frag6;
+		return ret;
 	}
 
 	ret = nf_conntrack_l4proto_register(&nf_conntrack_l4proto_udp6);
@@ -450,8 +381,6 @@ static int __init nf_conntrack_l3proto_ipv6_init(void)
 	nf_conntrack_l4proto_unregister(&nf_conntrack_l4proto_udp6);
  cleanup_tcp:
 	nf_conntrack_l4proto_unregister(&nf_conntrack_l4proto_tcp6);
- cleanup_frag6:
-	nf_ct_frag6_cleanup();
 	return ret;
 }
 
@@ -463,7 +392,6 @@ static void __exit nf_conntrack_l3proto_ipv6_fini(void)
 	nf_conntrack_l4proto_unregister(&nf_conntrack_l4proto_icmpv6);
 	nf_conntrack_l4proto_unregister(&nf_conntrack_l4proto_udp6);
 	nf_conntrack_l4proto_unregister(&nf_conntrack_l4proto_tcp6);
-	nf_ct_frag6_cleanup();
 }
 
 module_init(nf_conntrack_l3proto_ipv6_init);
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 138a8b3..bb669b4 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -73,7 +73,7 @@ static struct inet_frags nf_frags;
 static struct netns_frags nf_init_frags;
 
 #ifdef CONFIG_SYSCTL
-struct ctl_table nf_ct_ipv6_sysctl_table[] = {
+struct ctl_table nf_ct_frag6_sysctl_table[] = {
 	{
 		.procname	= "nf_conntrack_frag6_timeout",
 		.data		= &nf_init_frags.timeout,
@@ -97,6 +97,8 @@ struct ctl_table nf_ct_ipv6_sysctl_table[] = {
 	},
 	{ }
 };
+
+static struct ctl_table_header *nf_ct_frag6_sysctl_header;
 #endif
 
 static unsigned int nf_hashfn(struct inet_frag_queue *q)
@@ -623,11 +625,19 @@ int nf_ct_frag6_init(void)
 	inet_frags_init_net(&nf_init_frags);
 	inet_frags_init(&nf_frags);
 
+	nf_ct_frag6_sysctl_header = register_sysctl_paths(nf_net_netfilter_sysctl_path,
+							  nf_ct_frag6_sysctl_table);
+	if (!nf_ct_frag6_sysctl_header)
+		return -ENOMEM;
+
 	return 0;
 }
 
 void nf_ct_frag6_cleanup(void)
 {
+	unregister_sysctl_table(nf_ct_frag6_sysctl_header);
+	nf_ct_frag6_sysctl_header = NULL;
+
 	inet_frags_fini(&nf_frags);
 
 	nf_init_frags.low_thresh = 0;
diff --git a/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
new file mode 100644
index 0000000..99abfb5
--- /dev/null
+++ b/net/ipv6/netfilter/nf_defrag_ipv6_hooks.c
@@ -0,0 +1,131 @@
+/* (C) 1999-2001 Paul `Rusty' Russell
+ * (C) 2002-2004 Netfilter Core Team <coreteam@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/types.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <linux/netfilter.h>
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/icmp.h>
+#include <linux/sysctl.h>
+#include <net/ipv6.h>
+#include <net/inet_frag.h>
+
+#include <linux/netfilter_ipv6.h>
+#include <linux/netfilter_bridge.h>
+#include <net/netfilter/nf_conntrack.h>
+#include <net/netfilter/nf_conntrack_helper.h>
+#include <net/netfilter/nf_conntrack_l4proto.h>
+#include <net/netfilter/nf_conntrack_l3proto.h>
+#include <net/netfilter/nf_conntrack_core.h>
+#include <net/netfilter/nf_conntrack_zones.h>
+#include <net/netfilter/ipv6/nf_conntrack_ipv6.h>
+#include <net/netfilter/ipv6/nf_defrag_ipv6.h>
+
+static enum ip6_defrag_users nf_ct6_defrag_user(unsigned int hooknum,
+						struct sk_buff *skb)
+{
+	u16 zone = NF_CT_DEFAULT_ZONE;
+
+	if (skb->nfct)
+		zone = nf_ct_zone((struct nf_conn *)skb->nfct);
+
+#ifdef CONFIG_BRIDGE_NETFILTER
+	if (skb->nf_bridge &&
+	    skb->nf_bridge->mask & BRNF_NF_BRIDGE_PREROUTING)
+		return IP6_DEFRAG_CONNTRACK_BRIDGE_IN + zone;
+#endif
+	if (hooknum == NF_INET_PRE_ROUTING)
+		return IP6_DEFRAG_CONNTRACK_IN + zone;
+	else
+		return IP6_DEFRAG_CONNTRACK_OUT + zone;
+
+}
+
+static unsigned int ipv6_defrag(unsigned int hooknum,
+				struct sk_buff *skb,
+				const struct net_device *in,
+				const struct net_device *out,
+				int (*okfn)(struct sk_buff *))
+{
+	struct sk_buff *reasm;
+
+	/* Previously seen (loopback)?	*/
+	if (skb->nfct && !nf_ct_is_template((struct nf_conn *)skb->nfct))
+		return NF_ACCEPT;
+
+	reasm = nf_ct_frag6_gather(skb, nf_ct6_defrag_user(hooknum, skb));
+	/* queued */
+	if (reasm == NULL)
+		return NF_STOLEN;
+
+	/* error occured or not fragmented */
+	if (reasm == skb)
+		return NF_ACCEPT;
+
+	nf_ct_frag6_output(hooknum, reasm, (struct net_device *)in,
+			   (struct net_device *)out, okfn);
+
+	return NF_STOLEN;
+}
+
+static struct nf_hook_ops ipv6_defrag_ops[] = {
+	{
+		.hook		= ipv6_defrag,
+		.owner		= THIS_MODULE,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_PRE_ROUTING,
+		.priority	= NF_IP6_PRI_CONNTRACK_DEFRAG,
+	},
+	{
+		.hook		= ipv6_defrag,
+		.owner		= THIS_MODULE,
+		.pf		= NFPROTO_IPV6,
+		.hooknum	= NF_INET_LOCAL_OUT,
+		.priority	= NF_IP6_PRI_CONNTRACK_DEFRAG,
+	},
+};
+
+static int __init nf_defrag_init(void)
+{
+	int ret = 0;
+
+	ret = nf_ct_frag6_init();
+	if (ret < 0) {
+		pr_err("nf_defrag_ipv6: can't initialize frag6.\n");
+		return ret;
+	}
+	ret = nf_register_hooks(ipv6_defrag_ops, ARRAY_SIZE(ipv6_defrag_ops));
+	if (ret < 0) {
+		pr_err("nf_defrag_ipv6: can't register hooks\n");
+		goto cleanup_frag6;
+	}
+	return ret;
+
+cleanup_frag6:
+	nf_ct_frag6_cleanup();
+	return ret;
+
+}
+
+static void __exit nf_defrag_fini(void)
+{
+	nf_unregister_hooks(ipv6_defrag_ops, ARRAY_SIZE(ipv6_defrag_ops));
+	nf_ct_frag6_cleanup();
+}
+
+void nf_defrag_ipv6_enable(void)
+{
+}
+EXPORT_SYMBOL_GPL(nf_defrag_ipv6_enable);
+
+module_init(nf_defrag_init);
+module_exit(nf_defrag_fini);
+
+MODULE_LICENSE("GPL");



^ permalink raw reply related

* [PATCH v2 4/9] tproxy: added tproxy sockopt interface in the IPV6 layer
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller
In-Reply-To: <20101021104709.5192.31249.stgit@este.odu>

From: Balazs Scheidler <bazsi@balabit.hu>

Support for the IPV6_RECVORIGDSTADDR sockopt for UDP sockets was contributed by
Harry Mason.

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
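For reviewers who want to exercise the new receive path from userspace, a
minimal sketch follows. It is not part of the patch. The helper names
enable_orig_dst() and find_orig_dst() are made up for illustration, the
fallback option values are copied from the in6.h hunk below, and
IPPROTO_IPV6 is used as the level because it has the same value (41) as
SOL_IPV6.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Fallbacks for libc headers that predate this patch (values from in6.h). */
#ifndef IPV6_ORIGDSTADDR
#define IPV6_ORIGDSTADDR 74
#endif
#ifndef IPV6_RECVORIGDSTADDR
#define IPV6_RECVORIGDSTADDR IPV6_ORIGDSTADDR
#endif

/* Hypothetical helper: ask the kernel to attach the original destination
 * address as ancillary data on received datagrams.
 * Returns the setsockopt() result (0 on success, -1 on error).
 */
int enable_orig_dst(int fd)
{
	int on = 1;

	return setsockopt(fd, IPPROTO_IPV6, IPV6_RECVORIGDSTADDR,
			  &on, sizeof(on));
}

/* Hypothetical helper: scan the control messages filled in by recvmsg()
 * for an IPV6_ORIGDSTADDR entry and copy it to *out.
 * Returns 0 if one was found, -1 otherwise.
 */
int find_orig_dst(struct msghdr *msg, struct sockaddr_in6 *out)
{
	struct cmsghdr *cmsg;

	assert(msg != NULL && out != NULL);

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level != IPPROTO_IPV6 ||
		    cmsg->cmsg_type != IPV6_ORIGDSTADDR)
			continue;
		/* Ignore entries too short to hold a sockaddr_in6. */
		if (cmsg->cmsg_len < CMSG_LEN(sizeof(*out)))
			continue;
		memcpy(out, CMSG_DATA(cmsg), sizeof(*out));
		return 0;
	}
	return -1;
}
```

A caller would run enable_orig_dst() once on a UDP socket, pass a control
buffer of at least CMSG_SPACE(sizeof(struct sockaddr_in6)) bytes to
recvmsg(), and then hand the msghdr to find_orig_dst(). The sin6_port field
of the result is in network byte order, as in the datagram.c hunk.
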
 include/linux/in6.h      |    4 ++++
 include/linux/ipv6.h     |    4 +++-
 net/ipv6/datagram.c      |   19 +++++++++++++++++++
 net/ipv6/ipv6_sockglue.c |   23 +++++++++++++++++++++++
 4 files changed, 49 insertions(+), 1 deletions(-)

diff --git a/include/linux/in6.h b/include/linux/in6.h
index c4bf46f..097a34b 100644
--- a/include/linux/in6.h
+++ b/include/linux/in6.h
@@ -268,6 +268,10 @@ struct in6_flowlabel_req {
 /* RFC5082: Generalized Ttl Security Mechanism */
 #define IPV6_MINHOPCOUNT		73
 
+#define IPV6_ORIGDSTADDR        74
+#define IPV6_RECVORIGDSTADDR    IPV6_ORIGDSTADDR
+#define IPV6_TRANSPARENT        75
+
 /*
  * Multicast Routing:
  * see include/linux/mroute6.h.
diff --git a/include/linux/ipv6.h b/include/linux/ipv6.h
index e62683b..8e429d0 100644
--- a/include/linux/ipv6.h
+++ b/include/linux/ipv6.h
@@ -341,7 +341,9 @@ struct ipv6_pinfo {
 				odstopts:1,
                                 rxflow:1,
 				rxtclass:1,
-				rxpmtu:1;
+				rxpmtu:1,
+				rxorigdstaddr:1;
+				/* 2 bits hole */
 		} bits;
 		__u16		all;
 	} rxopt;
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index ef371aa..320bdb8 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -577,6 +577,25 @@ int datagram_recv_ctl(struct sock *sk, struct msghdr *msg, struct sk_buff *skb)
 		u8 *ptr = nh + opt->dst1;
 		put_cmsg(msg, SOL_IPV6, IPV6_2292DSTOPTS, (ptr[1]+1)<<3, ptr);
 	}
+	if (np->rxopt.bits.rxorigdstaddr) {
+		struct sockaddr_in6 sin6;
+		u16 *ports = (u16 *) skb_transport_header(skb);
+
+		if (skb_transport_offset(skb) + 4 <= skb->len) {
+			/* All current transport protocols have the port numbers in the
+			 * first four bytes of the transport header and this function is
+			 * written with this assumption in mind.
+			 */
+
+			sin6.sin6_family = AF_INET6;
+			ipv6_addr_copy(&sin6.sin6_addr, &ipv6_hdr(skb)->daddr);
+			sin6.sin6_port = ports[1];
+			sin6.sin6_flowinfo = 0;
+			sin6.sin6_scope_id = 0;
+
+			put_cmsg(msg, SOL_IPV6, IPV6_ORIGDSTADDR, sizeof(sin6), &sin6);
+		}
+	}
 	return 0;
 }
 
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index a7f66bc..0553867 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -342,6 +342,21 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 		retv = 0;
 		break;
 
+	case IPV6_TRANSPARENT:
+		if (optlen < sizeof(int))
+			goto e_inval;
+		/* IPv6 has no separate transparent bit; use the one in the IPv4 socket */
+		inet_sk(sk)->transparent = valbool;
+		retv = 0;
+		break;
+
+	case IPV6_RECVORIGDSTADDR:
+		if (optlen < sizeof(int))
+			goto e_inval;
+		np->rxopt.bits.rxorigdstaddr = valbool;
+		retv = 0;
+		break;
+
 	case IPV6_HOPOPTS:
 	case IPV6_RTHDRDSTOPTS:
 	case IPV6_RTHDR:
@@ -1104,6 +1119,14 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 	}
 
+	case IPV6_TRANSPARENT:
+		val = inet_sk(sk)->transparent;
+		break;
+
+	case IPV6_RECVORIGDSTADDR:
+		val = np->rxopt.bits.rxorigdstaddr;
+		break;
+
 	case IPV6_UNICAST_HOPS:
 	case IPV6_MULTICAST_HOPS:
 	{



^ permalink raw reply related

* [PATCH v2 5/9] tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller
In-Reply-To: <20101021104709.5192.31249.stgit@este.odu>

From: Balazs Scheidler <bazsi@balabit.hu>

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
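As an illustration of what this change permits, here is a hedged userspace
sketch, not part of the patch. It sets the transparent flag through the
IPV6_TRANSPARENT sockopt from patch 4/9 and then binds to an address that
need not be configured locally. The helper names make_sin6() and
bind_nonlocal_v6() are hypothetical, and the fallback option value is taken
from patch 4/9. On kernels that restrict the option to privileged callers,
the setsockopt() call is expected to fail for ordinary users.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#ifndef IPV6_TRANSPARENT
#define IPV6_TRANSPARENT 75	/* value from patch 4/9 */
#endif

/* Hypothetical helper: fill *sin6 from a textual IPv6 address and a
 * host-order port. Returns 0 on success, -1 if the text does not parse.
 */
int make_sin6(const char *text, uint16_t port, struct sockaddr_in6 *sin6)
{
	assert(text != NULL && sin6 != NULL);

	memset(sin6, 0, sizeof(*sin6));
	sin6->sin6_family = AF_INET6;
	sin6->sin6_port = htons(port);
	return inet_pton(AF_INET6, text, &sin6->sin6_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: create a TCP socket, mark it transparent and bind
 * it to the given (possibly non-local) address.
 * Returns the bound descriptor, or -1 on any failure.
 */
int bind_nonlocal_v6(const char *text, uint16_t port)
{
	struct sockaddr_in6 sin6;
	int on = 1;
	int fd;

	if (make_sin6(text, port, &sin6) < 0)
		return -1;

	fd = socket(AF_INET6, SOCK_STREAM, 0);
	if (fd < 0)
		return -1;

	/* With the flag set, inet6_bind() skips the ipv6_chk_addr() test. */
	if (setsockopt(fd, IPPROTO_IPV6, IPV6_TRANSPARENT, &on, sizeof(on)) < 0 ||
	    bind(fd, (struct sockaddr *)&sin6, sizeof(sin6)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Without the transparent flag, the same bind() to a non-local unicast address
fails with EADDRNOTAVAIL, which is the branch the hunk below relaxes.
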
 net/ipv6/af_inet6.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 6022098..9480572 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -343,7 +343,7 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
 			 */
 			v4addr = LOOPBACK4_IPV6;
 			if (!(addr_type & IPV6_ADDR_MULTICAST))	{
-				if (!ipv6_chk_addr(net, &addr->sin6_addr,
+				if (!inet->transparent && !ipv6_chk_addr(net, &addr->sin6_addr,
 						   dev, 0)) {
 					err = -EADDRNOTAVAIL;
 					goto out_unlock;



^ permalink raw reply related

* [PATCH v2 6/9] tproxy: added IPv6 socket lookup function to nf_tproxy_core
From: KOVACS Krisztian @ 2010-10-21 10:47 UTC (permalink / raw)
  To: netdev, netfilter-devel; +Cc: Balazs Scheidler, Patrick McHardy, David Miller
In-Reply-To: <20101021104709.5192.31249.stgit@este.odu>

From: Balazs Scheidler <bazsi@balabit.hu>

Signed-off-by: Balazs Scheidler <bazsi@balabit.hu>
Signed-off-by: KOVACS Krisztian <hidden@balabit.hu>
---
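The UDP branch of the new lookup function filters the socket it finds by
lookup type. The standalone sketch below restates that condition as a
predicate so it can be checked in isolation. It is not part of the patch.
The function name udp_match_kept() and the LOOKUP_* names are made up here,
and only the value 0 for the "any" type is visible in this diff, so the
other two values are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the NFT_LOOKUP_* macros. Only ANY == 0 is
 * shown in the patch context; the other values are assumed.
 */
enum {
	LOOKUP_ANY = 0,
	LOOKUP_LISTENER = 1,
	LOOKUP_ESTABLISHED = 2,
};

/* Returns true if a UDP socket found by the lookup would be returned to
 * the caller, false if the patch would sock_put() it and return NULL.
 *
 * connected: sk->sk_state == TCP_ESTABLISHED
 * wildcard:  the socket is bound to the unspecified address
 *
 * Unlike the patch, this sketch asserts on unknown lookup types instead
 * of passing the socket through.
 */
bool udp_match_kept(int lookup_type, bool connected, bool wildcard)
{
	assert(lookup_type >= LOOKUP_ANY && lookup_type <= LOOKUP_ESTABLISHED);

	if (lookup_type == LOOKUP_ANY)
		return true;
	if (lookup_type == LOOKUP_ESTABLISHED)
		return connected && !wildcard;
	/* LOOKUP_LISTENER: wildcard binds are kept, connected sockets are not. */
	return !connected;
}
```

This matches the note in the code below: listeners bound to the wildcard
address are still returned, and it is left to xt_socket to filter them.
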
 include/net/netfilter/nf_tproxy_core.h |   72 ++++++++++++++++++++++++++++++++
 1 files changed, 71 insertions(+), 1 deletions(-)

diff --git a/include/net/netfilter/nf_tproxy_core.h b/include/net/netfilter/nf_tproxy_core.h
index 1027d7f..cd85b3b 100644
--- a/include/net/netfilter/nf_tproxy_core.h
+++ b/include/net/netfilter/nf_tproxy_core.h
@@ -5,7 +5,8 @@
 #include <linux/in.h>
 #include <linux/skbuff.h>
 #include <net/sock.h>
-#include <net/inet_sock.h>
+#include <net/inet_hashtables.h>
+#include <net/inet6_hashtables.h>
 #include <net/tcp.h>
 
 #define NFT_LOOKUP_ANY         0
@@ -130,6 +131,75 @@ nf_tproxy_get_sock_v4(struct net *net, const u8 protocol,
 	return sk;
 }
 
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+static inline struct sock *
+nf_tproxy_get_sock_v6(struct net *net, const u8 protocol,
+		      const struct in6_addr *saddr, const struct in6_addr *daddr,
+		      const __be16 sport, const __be16 dport,
+		      const struct net_device *in, int lookup_type)
+{
+	struct sock *sk;
+
+	/* look up socket */
+	switch (protocol) {
+	case IPPROTO_TCP:
+		switch (lookup_type) {
+		case NFT_LOOKUP_ANY:
+			sk = inet6_lookup(net, &tcp_hashinfo,
+					  saddr, sport, daddr, dport,
+					  in->ifindex);
+			break;
+		case NFT_LOOKUP_LISTENER:
+			sk = inet6_lookup_listener(net, &tcp_hashinfo,
+						   daddr, ntohs(dport),
+						   in->ifindex);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+
+			break;
+		case NFT_LOOKUP_ESTABLISHED:
+			sk = __inet6_lookup_established(net, &tcp_hashinfo,
+							saddr, sport, daddr, ntohs(dport),
+							in->ifindex);
+			break;
+		default:
+			WARN_ON(1);
+			sk = NULL;
+			break;
+		}
+		break;
+	case IPPROTO_UDP:
+		sk = udp6_lib_lookup(net, saddr, sport, daddr, dport,
+				     in->ifindex);
+		if (sk && lookup_type != NFT_LOOKUP_ANY) {
+			int connected = (sk->sk_state == TCP_ESTABLISHED);
+			int wildcard = ipv6_addr_any(&inet6_sk(sk)->rcv_saddr);
+
+			/* NOTE: we return listeners even if bound to
+			 * 0.0.0.0, those are filtered out in
+			 * xt_socket, since xt_TPROXY needs 0 bound
+			 * listeners too */
+			if ((lookup_type == NFT_LOOKUP_ESTABLISHED && (!connected || wildcard)) ||
+			    (lookup_type == NFT_LOOKUP_LISTENER && connected)) {
+				sock_put(sk);
+				sk = NULL;
+			}
+		}
+		break;
+	default:
+		WARN_ON(1);
+		sk = NULL;
+	}
+
+	pr_debug("tproxy socket lookup: proto %u %pI6:%u -> %pI6:%u, lookup type: %d, sock %p\n",
+		 protocol, saddr, ntohs(sport), daddr, ntohs(dport), lookup_type, sk);
+
+	return sk;
+}
+#endif
 
 static inline void
 nf_tproxy_put_sock(struct sock *sk)



^ permalink raw reply related

