Netdev List

* Re: [Discussion] About over-MTU-sized skb in virtualized env
From: Florian Westphal @ 2014-12-03 10:50 UTC (permalink / raw)
  To: Du Fan
  Cc: Florian Westphal, Thomas Graf, Michael S. Tsirkin, Jesse Gross,
	Flavio Leitner, davem@davemloft.net, pshelar, netdev,
	dev@openvswitch.org, Du, Fan
In-Reply-To: <547EB029.5010102@gmail.com>

Du Fan <fengyuleidian0615@gmail.com> wrote:
> Sorry for resending this mail; my company email was rejected by netdev.
> 
> 
> Hi Florian
> 
> static int ip_finish_output_gso(struct sk_buff *skb)
> {
>     netdev_features_t features;
>     struct sk_buff *segs;
>     int ret = 0;
>
>     /* common case: locally created skb or seglen is <= mtu */
>     if (((IPCB(skb)->flags & IPSKB_FORWARDED) == 0) ||
>           skb_gso_network_seglen(skb) <= ip_skb_dst_mtu(skb))
>         return ip_finish_output2(skb);
> 
> Could you please state the _concrete_ reason why a locally created skb's
> length _always_ fits within the MTU size, or why we need this
> check?

We don't "need" this check.  It's just there to avoid the
skb_gso_network_seglen() computation for the common (local-out) case.

A locally generated GSO packet is not supposed to exceed dst_mtu, as that
is the PMTU discovery starting point in the absence of a lower/learned value.
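
A minimal userspace sketch of the short-circuit described above. The names
struct pkt_model, model_seglen() and takes_fast_path() are hypothetical
stand-ins, not kernel APIs. It only illustrates that `||` stops at the first
true operand, so the segment-length computation is paid for forwarded
packets only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few skb properties the check looks at. */
struct pkt_model {
	bool forwarded;            /* models IPCB(skb)->flags & IPSKB_FORWARDED */
	size_t gso_seglen;         /* models skb_gso_network_seglen(skb) */
	unsigned int seglen_calls; /* counts how often seglen was computed */
};

/* Models the per-segment length computation that the check tries to avoid. */
static size_t model_seglen(struct pkt_model *p)
{
	p->seglen_calls++;
	return p->gso_seglen;
}

/* True when the packet may go straight to output without software
 * segmentation. */
static bool takes_fast_path(struct pkt_model *p, size_t mtu)
{
	assert(p != NULL);
	/* Local-out packets short-circuit here; seglen is never computed. */
	return !p->forwarded || model_seglen(p) <= mtu;
}
```

For a local-out packet seglen_calls stays at zero; for a forwarded packet
it is incremented once and the result is compared against the MTU.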

* How do I update Ericsson F5521gw firmware from Linux? / Ericsson F5521gw Random Disconnect Issue
From: Richard Yao @ 2014-12-03 10:58 UTC (permalink / raw)
  To: netdev

I purchased an Ericsson F5521gw (Lenovo part 60Y3279) so that my Lenovo T520
could connect to China Unicom for internet access during a stay in China.
Unfortunately, it tends to fail every 4 to 8 hours with the following printed
to the system log:

Dec  3 04:28:27 t520 kernel: [85827.909187] cdc_ncm 2-1.4:1.6 wwan0: network connection: disconnected
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> (ttyACM1): modem state changed, 'connected' --> 'registered' (reason: user-requested)
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> (ttyACM1): device state change: activated -> failed (reason 'modem-no-carrier') [100 120 25]
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> NetworkManager state is now CONNECTED_LOCAL
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> NetworkManager state is now CONNECTED_GLOBAL
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> Policy set 'tun0' (tun0) as default for IPv4 routing and DNS.
Dec  3 04:28:27 t520 dbus[3912]: [system] Activating service name='org.freedesktop.nm_dispatcher' (using servicehelper)
Dec  3 04:28:27 t520 nm-openvpn[4200]: MANAGEMENT: Client disconnected
Dec  3 04:28:27 t520 nm-openvpn[4200]: SIGTERM received, sending exit notification to peer
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> (tun0): link disconnected (deferring action for 4 seconds)
Dec  3 04:28:27 t520 NetworkManager[4134]: <error> [1417598907.695056] [platform/nm-linux-platform.c:1714] add_object(): Netlink error adding 0.0.0.0/0 via 10.8.0.5 dev tun0 metric 1024 mss 0 src user: Unspecific failure
Dec  3 04:28:27 t520 NetworkManager[4134]: <error> [1417598907.695124] [platform/nm-linux-platform.c:1714] add_object(): Netlink error adding 10.8.0.5/32 via 0.0.0.0 dev tun0 metric 1024 mss 0 src user: Unspecific failure
Dec  3 04:28:27 t520 NetworkManager[4134]: <error> [1417598907.695165] [platform/nm-linux-platform.c:1714] add_object(): Netlink error adding 0.0.0.0/0 via 10.8.0.5 dev tun0 metric 1024 mss 0 src user: Unspecific failure
Dec  3 04:28:27 t520 NetworkManager[4134]: <error> [1417598907.695187] [nm-policy.c:693] update_ip4_routing(): Failed to set default route.
Dec  3 04:28:27 t520 NetworkManager[4134]: <warn> Activation (ttyACM1) failed for connection 'China Unicom'
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> (ttyACM1): device state change: failed -> disconnected (reason 'none') [120 30 0]
Dec  3 04:28:27 t520 NetworkManager[4134]: <info> (ttyACM1): deactivating device (reason 'none') [0]
Dec  3 04:28:27 t520 dbus[3912]: [system] Successfully activated service 'org.freedesktop.nm_dispatcher'
Dec  3 04:28:27 t520 nm-dispatcher: Dispatching action 'vpn-down' for tun0
Dec  3 04:28:27 t520 nm-dispatcher: Dispatching action 'down' for wwan0

Here is the modem's description of itself:

mmcli -m 0 --command='AT*EEVINFO=99'
response: '*EEVINFO:
Model.................... F5521gw
IMEI Data................ <REDACTED>
SVN...................... 05
Serial Number............ <REDACTED>
Product Number........... KRD 131 18/221
Revision................. R1C
FW Product............... CXP 901 7640/1
FW Version............... R2A07
FW Build Date/Time....... 2010-12-03/12:17
Cust. Product............ CXC 173 0424/22
Cust. Version............ R1B02
Customization Descr...... Lenovo
Format................... 1
Base Product Number...... 1/KRD 131 18/1
Base Product Revision.... R1N
SIMLock Deployment....... 0.0
SIMLock Description...... Unlocked
SIMLock Product.......... CXC 173 0839/01
SIMLock Revision......... R1F
Model Description........ F5521gw Mobile Broadband Module
Vendor Name.............. Lenovo
Config. Set Product...... CXP 901 7629/1
Config. Set Revision..... R3A02
Network Customization.... Default;46001
Customization State...... 0
Configuration Product.... CXP 901 7640/1
Configuration Revision... R2A07
Protocol FW Product...... CXC 173 0063/1
Protocol FW Version...... R2A07
Application FW Product... CXC 173 0064/1
Application FW Version... R2A07
Network List Product..... CXC 173 1116/1
Network List Revision.... R1A
Individualization........ 189.191
Domain................... 3.3
Upgrade State............ 1
Volume info.............. 66 MB total / 43.9% free'

Posts on the Lenovo forums suggest that this can be resolved by updating the
firmware:

http://forums.lenovo.com/t5/X-Series-Tablet-ThinkPad-Laptops/Ericsson-F5521gw-WWAN-disconnects-intermittently-and-cannot/td-p/565597

Unfortunately, the official firmware updater only runs on Windows and I am
unable to find a way to update the firmware from Linux. I also cannot find any
hardware documentation. Does anyone have any suggestions?

* Re: [PATCH iproute2] ip link: Show devices by link type
From: Vadim Kochan @ 2014-12-03 10:59 UTC (permalink / raw)
  To: Roopa Prabhu; +Cc: netdev@vger.kernel.org
In-Reply-To: <20141203011305.GA5945@angus-think.lan>

Will re-send v2 with changed man page + link_kind filter variable as
suggested by Roopa.

Regards,
Vadim

* [PATCH net] net: sctp: use MAX_HEADER for headroom reserve in output path
From: Daniel Borkmann @ 2014-12-03 11:13 UTC (permalink / raw)
  To: davem; +Cc: linux-sctp, netdev, robert

To accommodate enough headroom for tunnels, use MAX_HEADER instead
of LL_MAX_HEADER. Robert reported that, after roughly 40 hours of
running trinity, he hit an skb_under_panic() via the SCTP output path
(see reference). I couldn't reproduce it from here, but not using
MAX_HEADER, as other protocols do, might be one possible cause.

In any case, accounting on the chunks themselves looks good, as the
skb already passed the SCTP output path and did not hit any
skb_over_panic(). Given tunneling was enabled in his .config, the
headroom would have been expanded by MAX_HEADER in this case.

Reported-by: Robert Święcki <robert@swiecki.net>
Reference: https://lkml.org/lkml/2014/12/1/507
Fixes: 594ccc14dfe4d ("[SCTP] Replace incorrect use of dev_alloc_skb with alloc_skb in sctp_packet_transmit().")
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
---
 net/sctp/output.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sctp/output.c b/net/sctp/output.c
index 42dffd4..fc5e45b 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -401,12 +401,12 @@ int sctp_packet_transmit(struct sctp_packet *packet)
 	sk = chunk->skb->sk;

 	/* Allocate the new skb.  */
-	nskb = alloc_skb(packet->size + LL_MAX_HEADER, GFP_ATOMIC);
+	nskb = alloc_skb(packet->size + MAX_HEADER, GFP_ATOMIC);
 	if (!nskb)
 		goto nomem;

 	/* Make sure the outbound skb has enough header room reserved. */
-	skb_reserve(nskb, packet->overhead + LL_MAX_HEADER);
+	skb_reserve(nskb, packet->overhead + MAX_HEADER);

 	/* Set the owning socket so that we know where to get the
 	 * destination IP address.
-- 
1.7.11.7
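
A rough userspace model of the headroom reasoning in the patch above. The
struct buf_model type and the buf_*() helpers are hypothetical, and the
MODEL_* values are illustrative only (the real constants depend on the
kernel configuration). The sketch also leaves out packet->overhead for
brevity. It shows why a buffer reserved with only link-layer headroom can
underflow once a tunnel header is prepended.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real constants depend on the kernel config. */
#define MODEL_LL_MAX_HEADER 32
#define MODEL_MAX_HEADER    (MODEL_LL_MAX_HEADER + 48)

/* Hypothetical model of a linear buffer: [headroom | data | tailroom]. */
struct buf_model {
	size_t headroom;
	size_t len;
	size_t tailroom;
};

/* Models alloc_skb(): all space starts out as tailroom. */
static struct buf_model buf_alloc(size_t size)
{
	struct buf_model b = { 0, 0, size };

	return b;
}

/* Models skb_reserve(): moves space from tailroom to headroom. */
static void buf_reserve(struct buf_model *b, size_t n)
{
	assert(n <= b->tailroom);
	b->headroom += n;
	b->tailroom -= n;
}

/* Models skb_push(): prepends a header. A false return marks the point
 * where the kernel would call skb_under_panic(). */
static bool buf_push(struct buf_model *b, size_t n)
{
	if (n > b->headroom)
		return false;
	b->headroom -= n;
	b->len += n;
	return true;
}
```

With MODEL_LL_MAX_HEADER reserved, a 14-byte push followed by a 40-byte
push fails in this model; with MODEL_MAX_HEADER reserved, both fit.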

* Re: [PATCH net-next 1/2] net: bcmgenet: add support for new GENET PHY revision scheme
From: Sergei Shtylyov @ 2014-12-03 11:23 UTC (permalink / raw)
  To: Florian Fainelli, netdev; +Cc: davem
In-Reply-To: <1417562882-2511-2-git-send-email-f.fainelli@gmail.com>

Hello.

On 12/3/2014 2:28 AM, Florian Fainelli wrote:

> Starting with GPHY revision G0, the GENET register layout has changed to
> use the same numbering scheme as the Starfighter 2 switch. This means
> that GPHY major revision is in bits 15:12, minor in bits 11:8 and patch
> level is in bits 7:4.

> Introduce a small heuristic which checks for the old scheme first, tests
> for the new scheme and finally attempts to catch reserved values and
> aborts.

> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>   drivers/net/ethernet/broadcom/genet/bcmgenet.c | 24 +++++++++++++++++++++++-
>   1 file changed, 23 insertions(+), 1 deletion(-)

> diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> index f2fadb053d52..23e283174c4e 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
[...]
> @@ -2551,8 +2552,29 @@ static void bcmgenet_set_hw_params(struct bcmgenet_priv *priv)
>   	 * to pass this information to the PHY driver. The PHY driver expects
>   	 * to find the PHY major revision in bits 15:8 while the GENET register
>   	 * stores that information in bits 7:0, account for that.
> +	 *
> +	 * On newer chips, starting with PHY revision G0, a new scheme is
> +	 * deployed similar to the Starfighter 2 switch with GPHY major
> +	 * revision in bits 15:8 and patch level in bits 7:0. Major revision 0
> +	 * is reserved as well as special value 0x01ff, we have a small
> +	 * heuristic to check for the new GPHY revision and re-arrange things
> +	 * so the GPHY driver is happy.
>   	 */
> -	priv->gphy_rev = (reg & 0xffff) << 8;
> +	gphy_rev = (reg & 0xffff);

    Parens not needed anymore.

> +
> +	/* This the good old scheme, just GPHY major, no minor nor patch */

    Missing "is" after "This"?

> +	if ((gphy_rev & 0xf0) != 0)
> +		priv->gphy_rev = gphy_rev << 8;
> +
> +	/* This is the new scheme, GPHY major rolls over with 0x10 = rev G0 */
> +	else if ((gphy_rev & 0xff00) != 0)
> +		priv->gphy_rev = gphy_rev;
> +
> +	/* This is reserved so should require special treatment */
> +	else if (gphy_rev == 0 || gphy_rev == 0x01ff) {
> +		pr_warn("Invalid GPHY revision detected: 0x%04x\n", gphy_rev);
> +		return;
> +	}

    Hm, {} are needed on all *if* branches.

[...]

WBR, Sergei

* Re: [PATCH net v1 1/2] amd-xgbe: Do not clear interrupt indicator
From: Sergei Shtylyov @ 2014-12-03 11:34 UTC (permalink / raw)
  To: Tom Lendacky, netdev; +Cc: David Miller
In-Reply-To: <20141203001648.17582.48766.stgit@tlendack-t1.amdoffice.net>

Hello.

On 12/3/2014 3:16 AM, Tom Lendacky wrote:

> The interrupt value within the xgbe_ring_data structure is used as an
> indicator of which Rx descriptor should have the INTE bit set to
> generate an interrupt when that Rx descriptor is used.  This bit was
> mistakenly cleared in the xgbe_unmap_rdata function, effectively

    Not xgbe_unmap_skb() (as seems to follow from the patch)?

> nullifying the ethtool rx-frames support.

> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
> ---
>   drivers/net/ethernet/amd/xgbe/xgbe-desc.c |    1 -
>   1 file changed, 1 deletion(-)

> diff --git a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
> index 6fc5da0..43b7d2e 100644
> --- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
> +++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c
> @@ -356,7 +356,6 @@ static void xgbe_unmap_skb(struct xgbe_prv_data *pdata,
>
>   	rdata->tso_header = 0;
>   	rdata->len = 0;
> -	rdata->interrupt = 0;
>   	rdata->mapped_as_page = 0;
>
>   	if (rdata->state_saved) {

WBR, Sergei

* Re: [PATCH] net: less interrupt masking in NAPI
From: Eric Dumazet @ 2014-12-03 11:52 UTC (permalink / raw)
  To: Yang Yingliang; +Cc: David Miller, netdev, willemb
In-Reply-To: <547ED728.2010703@huawei.com>

On Wed, 2014-12-03 at 17:26 +0800, Yang Yingliang wrote:

> Before this patch, when a large network flow arrives, some other processes
> respond slowly or don't respond at all because the CPU is busy with softirq.
> 
> After this patch, under pressure, much more softirq work is done in
> ksoftirqd, so the other processes get scheduled.
> 
> My system is dual-core.

Which NIC driver are you using ?

Thanks

* [PATCH (net.git)] stmmac: fix max coal timer parameter
From: Giuseppe Cavallaro @ 2014-12-03 11:32 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro

This patch fixes the maximum coalesce timer value that can be set via
ethtool.
The default value (STMMAC_COAL_TX_TIMER) was used as the upper bound in the
set_coalesce helper instead of the maximum one (STMMAC_MAX_COAL_TX_TICK,
which was defined but never used).

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 .../net/ethernet/stmicro/stmmac/stmmac_ethtool.c   |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
index 3a08a1f..771cda2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c
@@ -696,7 +696,7 @@ static int stmmac_set_coalesce(struct net_device *dev,
 	    (ec->tx_max_coalesced_frames == 0))
 		return -EINVAL;
 
-	if ((ec->tx_coalesce_usecs > STMMAC_COAL_TX_TIMER) ||
+	if ((ec->tx_coalesce_usecs > STMMAC_MAX_COAL_TX_TICK) ||
 	    (ec->tx_max_coalesced_frames > STMMAC_TX_MAX_FRAMES))
 		return -EINVAL;
 
-- 
1.7.4.4

* Re: [PATCH v2 02/19] kbuild: kselftest_install - add a new make target to install selftests
From: Michal Marek @ 2014-12-03 12:09 UTC (permalink / raw)
  To: Shuah Khan, gregkh, akpm, davem, keescook, tranmanphong,
	dh.herrmann, hughd, bobby.prani, ebiederm, serge.hallyn
  Cc: linux-kbuild, linux-kernel, linux-api, netdev,
	masami.hiramatsu.pt@hitachi.com >> Masami Hiramatsu
In-Reply-To: <547C99B6.7070903@osg.samsung.com>

On 2014-12-01 17:39, Shuah Khan wrote:
> On 12/01/2014 08:47 AM, Michal Marek wrote:
>> On 2014-11-11 21:27, Shuah Khan wrote:
>>> diff --git a/Makefile b/Makefile
>>> index 05d67af..ccbd2e1 100644
>>> --- a/Makefile
>>> +++ b/Makefile
>>> @@ -1071,12 +1071,26 @@ headers_check: headers_install
>>>  	$(Q)$(MAKE) $(hdr-inst)=arch/$(hdr-arch)/include/uapi/asm $(hdr-dst) HDRCHECK=1
>>>  
>>>  # ---------------------------------------------------------------------------
>>> -# Kernel selftest
>>> +# Kernel selftest targets
>>> +
>>> +PHONY += __kselftest_configure
>>> +INSTALL_KSFT_PATH=$(INSTALL_MOD_PATH)/lib/kselftest/$(KERNELRELEASE)
>>> +export INSTALL_KSFT_PATH
>>> +KSELFTEST=$(INSTALL_KSFT_PATH)/kselftest.sh
>>> +export KSELFTEST
>>
>> Can this be moved to tools/testing/selftests/Makefile? It's only used in
>> this part of the tree.
> 
> I looked into doing that. KERNELRELEASE will have to be exported for
> tools/testing/selftests/Makefile to use it? Does that sound okay?

In fact, KERNELRELEASE is already exported. So go ahead.


> Also, it might be easier to get this series in, if you can Ack the main
> Makefile patch (when we are ready i.e), so I can take it through
> kselftest tree.

Sure. The Makefile change will only consist of redirecting the
kselftest_install target to tools/testing/selftests, right?

Michal

* [PATCH net] bnx2x: Limit 1G link enforcement
From: Yuval Mintz @ 2014-12-03 12:15 UTC (permalink / raw)
  To: davem, netdev; +Cc: Ariel.Elior, Yaniv Rosner, Yuval Mintz

From: Yaniv Rosner <Yaniv.Rosner@qlogic.com>

Change 1G-SFP module detection by verifying not only that it's not
compliant with 10G-Ethernet, but also that it's 1G-ethernet compliant.

Signed-off-by: Yaniv Rosner <Yaniv.Rosner@qlogic.com>
Signed-off-by: Yuval Mintz <Yuval.Mintz@qlogic.com>
---
Hi Dave,

Please consider applying this to `net'.

Thanks,
Yuval Mintz
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
index 549549e..778e4cd 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_link.c
@@ -8119,10 +8119,11 @@ static int bnx2x_get_edc_mode(struct bnx2x_phy *phy,
 	case SFP_EEPROM_CON_TYPE_VAL_LC:
 	case SFP_EEPROM_CON_TYPE_VAL_RJ45:
 		check_limiting_mode = 1;
-		if ((val[SFP_EEPROM_10G_COMP_CODE_ADDR] &
+		if (((val[SFP_EEPROM_10G_COMP_CODE_ADDR] &
 		     (SFP_EEPROM_10G_COMP_CODE_SR_MASK |
 		      SFP_EEPROM_10G_COMP_CODE_LR_MASK |
-		      SFP_EEPROM_10G_COMP_CODE_LRM_MASK)) == 0) {
+		       SFP_EEPROM_10G_COMP_CODE_LRM_MASK)) == 0) &&
+		    (val[SFP_EEPROM_1G_COMP_CODE_ADDR] != 0)) {
 			DP(NETIF_MSG_LINK, "1G SFP module detected\n");
 			phy->media_type = ETH_PHY_SFP_1G_FIBER;
 			if (phy->req_line_speed != SPEED_1000) {
-- 
1.9.3

* Re: [patch net-next 3/6] net_sched: cls_bpf: remove faulty use of list_for_each_entry_rcu
From: Pablo Neira Ayuso @ 2014-12-03 12:19 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, davem, jhs
In-Reply-To: <1417539636-12710-4-git-send-email-jiri@resnulli.us>

On Tue, Dec 02, 2014 at 06:00:33PM +0100, Jiri Pirko wrote:
> rcu variant is not correct here. The code is called by updater (rtnl
> lock is held), not by reader (no rcu_read_lock is held).
> 
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>  net/sched/cls_bpf.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
> index cbfaf6f..d0de979 100644
> --- a/net/sched/cls_bpf.c
> +++ b/net/sched/cls_bpf.c
> @@ -141,7 +141,7 @@ static unsigned long cls_bpf_get(struct tcf_proto *tp, u32 handle)
>  	if (head == NULL)
>  		return 0UL;
>  
> -	list_for_each_entry_rcu(prog, &head->plist, link) {
> +	list_for_each_entry(prog, &head->plist, link) {
>  		if (prog->handle == handle) {
>  			ret = (unsigned long) prog;
>  			break;
> @@ -337,7 +337,7 @@ static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg)
>  	struct cls_bpf_head *head = rtnl_dereference(tp->root);
>  	struct cls_bpf_prog *prog;
>  
> -	list_for_each_entry_rcu(prog, &head->plist, link) {
> +	list_for_each_entry(prog, &head->plist, link) {

We still need the _rcu here in the walk path. IIRC, this is called from the
dump path and we hold no rtnl_lock there.

>  		if (arg->count < arg->skip)
>  			goto skip;
>  		if (arg->fn(tp, (unsigned long) prog, arg) < 0) {
> -- 
> 1.9.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

* Re: BCM4313 & brcmsmac & 3.12: only semi-working?
From: Arend van Spriel @ 2014-12-03 12:43 UTC (permalink / raw)
  To: Michael Tokarev
  Cc: Maximilian Engelhardt, Rafał Miłecki, Seth Forshee,
	brcm80211 development, linux-wireless@vger.kernel.org,
	Network Development
In-Reply-To: <547E31BD.8090302@msgid.tls.msk.ru>

On 12/02/14 22:40, Michael Tokarev wrote:
> 30.11.2014 15:04, Arend van Spriel wrote:
>
>> Thanks. Did not find what I was looking for, but I started working on
>> integrating btcoex related functionality. The attached patch will print
>> some info so I can focus on the required functionality for your device.
>> It is based on 3.18-rc5.
>
> With this patch applied against 3.18-rc5, the machine instantly reboots
> once brcmsmac module is loaded.  I'm still debugging this.

Argh. Probably the register accesses I added end up in limbo land, or I
made some other stupid mistake. I will double check my patch.

Regards,
Arend

> Thanks,
>
> /mjt

* Re: [patch net-next 1/6] net_sched: cls_basic: remove unnecessary iteration and use passed arg
From: Jamal Hadi Salim @ 2014-12-03 12:45 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem
In-Reply-To: <1417539636-12710-2-git-send-email-jiri@resnulli.us>

On 12/02/14 12:00, Jiri Pirko wrote:
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>

Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

* Re: [patch net-next 2/6] net_sched: cls_bpf: remove unnecessary iteration and use passed arg
From: Jamal Hadi Salim @ 2014-12-03 12:46 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem
In-Reply-To: <1417539636-12710-3-git-send-email-jiri@resnulli.us>

On 12/02/14 12:00, Jiri Pirko wrote:
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>


It's TheLinuxWay(tm). I am sure this was derived from cls_basic;->


Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>

cheers,
jamal

* Re: [patch net-next 2/2] rocker: fix eth_type tybe in struct rocker_ctrl
From: Sergei Shtylyov @ 2014-12-03 12:48 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem, sfeldma
In-Reply-To: <1417602753-3084-2-git-send-email-jiri@resnulli.us>

Hello.

On 12/3/2014 1:32 PM, Jiri Pirko wrote:

    s/tybe/type in the subject.

> Signed-off-by: Jiri Pirko <jiri@resnulli.us>

WBR, Sergei

* Re: [patch net-next 3/6] net_sched: cls_bpf: remove faulty use of list_for_each_entry_rcu
From: Jamal Hadi Salim @ 2014-12-03 12:51 UTC (permalink / raw)
  To: Jiri Pirko, netdev; +Cc: davem
In-Reply-To: <1417539636-12710-4-git-send-email-jiri@resnulli.us>

On 12/02/14 12:00, Jiri Pirko wrote:
> rcu variant is not correct here. The code is called by updater (rtnl
> lock is held), not by reader (no rcu_read_lock is held).
>
> Signed-off-by: Jiri Pirko <jiri@resnulli.us>
> ---
>   net/sched/cls_bpf.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
> index cbfaf6f..d0de979 100644
> --- a/net/sched/cls_bpf.c
> +++ b/net/sched/cls_bpf.c
> @@ -141,7 +141,7 @@ static unsigned long cls_bpf_get(struct tcf_proto *tp, u32 handle)
>   	if (head == NULL)
>   		return 0UL;
>
> -	list_for_each_entry_rcu(prog, &head->plist, link) {
> +	list_for_each_entry(prog, &head->plist, link) {
>   		if (prog->handle == handle) {
>   			ret = (unsigned long) prog;

The above is OK, I think: only one user-space entrant at a time,
and the datapath is not affected because no modification is happening.

>   			break;
> @@ -337,7 +337,7 @@ static void cls_bpf_walk(struct tcf_proto *tp, struct tcf_walker *arg)
>   	struct cls_bpf_head *head = rtnl_dereference(tp->root);
>   	struct cls_bpf_prog *prog;
>
> -	list_for_each_entry_rcu(prog, &head->plist, link) {
> +	list_for_each_entry(prog, &head->plist, link) {
>   		if (arg->count < arg->skip)
>   			goto skip;
>   		if (arg->fn(tp, (unsigned long) prog, arg) < 0) {
>

I think this may be problematic. Doesn't a flush operation also use the
walker?

cheers,
jamal

* [PATCH 01/12] netfilter: xt_recent: relax ip_pkt_list_tot restrictions
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

The maximum value for the hitcount parameter is given by
"ip_pkt_list_tot" parameter (default: 20).

Exceeding this value on the command line will cause the rule to be
rejected.  The parameter is also readonly, i.e. it cannot be changed
without module unload or reboot.

Store size per table, then base nstamps[] size on the hitcount instead.

The module parameter is retained for backwards compatibility.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/xt_recent.c |   64 +++++++++++++++++++++++++++++++++------------
 1 file changed, 47 insertions(+), 17 deletions(-)

diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index a9faae8..30dbe34 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -43,25 +43,29 @@ MODULE_LICENSE("GPL");
 MODULE_ALIAS("ipt_recent");
 MODULE_ALIAS("ip6t_recent");
 
-static unsigned int ip_list_tot = 100;
-static unsigned int ip_pkt_list_tot = 20;
-static unsigned int ip_list_hash_size = 0;
-static unsigned int ip_list_perms = 0644;
-static unsigned int ip_list_uid = 0;
-static unsigned int ip_list_gid = 0;
+static unsigned int ip_list_tot __read_mostly = 100;
+static unsigned int ip_list_hash_size __read_mostly;
+static unsigned int ip_list_perms __read_mostly = 0644;
+static unsigned int ip_list_uid __read_mostly;
+static unsigned int ip_list_gid __read_mostly;
 module_param(ip_list_tot, uint, 0400);
-module_param(ip_pkt_list_tot, uint, 0400);
 module_param(ip_list_hash_size, uint, 0400);
 module_param(ip_list_perms, uint, 0400);
 module_param(ip_list_uid, uint, S_IRUGO | S_IWUSR);
 module_param(ip_list_gid, uint, S_IRUGO | S_IWUSR);
 MODULE_PARM_DESC(ip_list_tot, "number of IPs to remember per list");
-MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP address to remember (max. 255)");
 MODULE_PARM_DESC(ip_list_hash_size, "size of hash table used to look up IPs");
 MODULE_PARM_DESC(ip_list_perms, "permissions on /proc/net/xt_recent/* files");
 MODULE_PARM_DESC(ip_list_uid, "default owner of /proc/net/xt_recent/* files");
 MODULE_PARM_DESC(ip_list_gid, "default owning group of /proc/net/xt_recent/* files");
 
+/* retained for backwards compatibility */
+static unsigned int ip_pkt_list_tot __read_mostly;
+module_param(ip_pkt_list_tot, uint, 0400);
+MODULE_PARM_DESC(ip_pkt_list_tot, "number of packets per IP address to remember (max. 255)");
+
+#define XT_RECENT_MAX_NSTAMPS	256
+
 struct recent_entry {
 	struct list_head	list;
 	struct list_head	lru_list;
@@ -79,6 +83,7 @@ struct recent_table {
 	union nf_inet_addr	mask;
 	unsigned int		refcnt;
 	unsigned int		entries;
+	u8			nstamps_max_mask;
 	struct list_head	lru_list;
 	struct list_head	iphash[0];
 };
@@ -90,7 +95,8 @@ struct recent_net {
 #endif
 };
 
-static int recent_net_id;
+static int recent_net_id __read_mostly;
+
 static inline struct recent_net *recent_pernet(struct net *net)
 {
 	return net_generic(net, recent_net_id);
@@ -171,12 +177,15 @@ recent_entry_init(struct recent_table *t, const union nf_inet_addr *addr,
 		  u_int16_t family, u_int8_t ttl)
 {
 	struct recent_entry *e;
+	unsigned int nstamps_max = t->nstamps_max_mask;
 
 	if (t->entries >= ip_list_tot) {
 		e = list_entry(t->lru_list.next, struct recent_entry, lru_list);
 		recent_entry_remove(t, e);
 	}
-	e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * ip_pkt_list_tot,
+
+	nstamps_max += 1;
+	e = kmalloc(sizeof(*e) + sizeof(e->stamps[0]) * nstamps_max,
 		    GFP_ATOMIC);
 	if (e == NULL)
 		return NULL;
@@ -197,7 +206,7 @@ recent_entry_init(struct recent_table *t, const union nf_inet_addr *addr,
 
 static void recent_entry_update(struct recent_table *t, struct recent_entry *e)
 {
-	e->index %= ip_pkt_list_tot;
+	e->index &= t->nstamps_max_mask;
 	e->stamps[e->index++] = jiffies;
 	if (e->index > e->nstamps)
 		e->nstamps = e->index;
@@ -326,6 +335,7 @@ static int recent_mt_check(const struct xt_mtchk_param *par,
 	kuid_t uid;
 	kgid_t gid;
 #endif
+	unsigned int nstamp_mask;
 	unsigned int i;
 	int ret = -EINVAL;
 	size_t sz;
@@ -349,19 +359,33 @@ static int recent_mt_check(const struct xt_mtchk_param *par,
 		return -EINVAL;
 	if ((info->check_set & XT_RECENT_REAP) && !info->seconds)
 		return -EINVAL;
-	if (info->hit_count > ip_pkt_list_tot) {
-		pr_info("hitcount (%u) is larger than "
-			"packets to be remembered (%u)\n",
-			info->hit_count, ip_pkt_list_tot);
+	if (info->hit_count >= XT_RECENT_MAX_NSTAMPS) {
+		pr_info("hitcount (%u) is larger than allowed maximum (%u)\n",
+			info->hit_count, XT_RECENT_MAX_NSTAMPS - 1);
 		return -EINVAL;
 	}
 	if (info->name[0] == '\0' ||
 	    strnlen(info->name, XT_RECENT_NAME_LEN) == XT_RECENT_NAME_LEN)
 		return -EINVAL;
 
+	if (ip_pkt_list_tot && info->hit_count < ip_pkt_list_tot)
+		nstamp_mask = roundup_pow_of_two(ip_pkt_list_tot) - 1;
+	else if (info->hit_count)
+		nstamp_mask = roundup_pow_of_two(info->hit_count) - 1;
+	else
+		nstamp_mask = 32 - 1;
+
 	mutex_lock(&recent_mutex);
 	t = recent_table_lookup(recent_net, info->name);
 	if (t != NULL) {
+		if (info->hit_count > t->nstamps_max_mask) {
+			pr_info("hitcount (%u) is larger than packets to be remembered (%u) for table %s\n",
+				info->hit_count, t->nstamps_max_mask + 1,
+				info->name);
+			ret = -EINVAL;
+			goto out;
+		}
+
 		t->refcnt++;
 		ret = 0;
 		goto out;
@@ -377,6 +401,7 @@ static int recent_mt_check(const struct xt_mtchk_param *par,
 		goto out;
 	}
 	t->refcnt = 1;
+	t->nstamps_max_mask = nstamp_mask;
 
 	memcpy(&t->mask, &info->mask, sizeof(t->mask));
 	strcpy(t->name, info->name);
@@ -497,9 +522,12 @@ static void recent_seq_stop(struct seq_file *s, void *v)
 static int recent_seq_show(struct seq_file *seq, void *v)
 {
 	const struct recent_entry *e = v;
+	struct recent_iter_state *st = seq->private;
+	const struct recent_table *t = st->table;
 	unsigned int i;
 
-	i = (e->index - 1) % ip_pkt_list_tot;
+	i = (e->index - 1) & t->nstamps_max_mask;
+
 	if (e->family == NFPROTO_IPV4)
 		seq_printf(seq, "src=%pI4 ttl: %u last_seen: %lu oldest_pkt: %u",
 			   &e->addr.ip, e->ttl, e->stamps[i], e->index);
@@ -717,7 +745,9 @@ static int __init recent_mt_init(void)
 {
 	int err;
 
-	if (!ip_list_tot || !ip_pkt_list_tot || ip_pkt_list_tot > 255)
+	BUILD_BUG_ON_NOT_POWER_OF_2(XT_RECENT_MAX_NSTAMPS);
+
+	if (!ip_list_tot || ip_pkt_list_tot >= XT_RECENT_MAX_NSTAMPS)
 		return -EINVAL;
 	ip_list_hash_size = 1 << fls(ip_list_tot);
 
-- 
1.7.10.4
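
A userspace sketch of the mask-based ring indexing introduced by the patch
above. All model_* names are hypothetical; model_roundup_pow2() stands in
for the kernel's roundup_pow_of_two(). The mask selection and the update
step follow the hunks in recent_mt_check() and recent_entry_update(). The
underlying identity is that for a power-of-two size n, x & (n - 1) equals
x % n, which is why the table stores a mask rather than a count.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_NSTAMPS 256 /* mirrors XT_RECENT_MAX_NSTAMPS in the patch */

/* Userspace stand-in for roundup_pow_of_two(); n must be in 1..256. */
static unsigned int model_roundup_pow2(unsigned int n)
{
	unsigned int p = 1;

	assert(n >= 1 && n <= MODEL_MAX_NSTAMPS);
	while (p < n)
		p <<= 1;
	return p;
}

/* Mirrors the mask selection added to recent_mt_check(). */
static unsigned int model_nstamp_mask(unsigned int ip_pkt_list_tot,
				      unsigned int hit_count)
{
	if (ip_pkt_list_tot && hit_count < ip_pkt_list_tot)
		return model_roundup_pow2(ip_pkt_list_tot) - 1;
	if (hit_count)
		return model_roundup_pow2(hit_count) - 1;
	return 32 - 1;
}

struct model_entry {
	unsigned int index;
	unsigned int nstamps;
	unsigned long stamps[MODEL_MAX_NSTAMPS];
};

/* Mirrors recent_entry_update(): wrap with a mask instead of '%'. */
static void model_entry_update(struct model_entry *e, unsigned int mask,
			       unsigned long now)
{
	e->index &= mask;
	e->stamps[e->index++] = now;
	if (e->index > e->nstamps)
		e->nstamps = e->index;
}
```

With a mask of 7, the ninth update overwrites slot 0 and nstamps stays at 8.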

* [PATCH 02/12] netfilter: conntrack: avoid zeroing timer
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

Add a __nfct_init_offset annotation member to struct nf_conn to make
it clear which members are covered by the memset when the conntrack
is allocated.

This avoids zeroing timer_list and ct_net; both are already initialized
explicitly.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack.h |   15 +++++++++------
 net/netfilter/nf_conntrack_core.c    |   11 ++++-------
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h b/include/net/netfilter/nf_conntrack.h
index c8a7db6..f0daed2 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -92,12 +92,18 @@ struct nf_conn {
 	/* Have we seen traffic both ways yet? (bitset) */
 	unsigned long status;
 
-	/* If we were expected by an expectation, this will be it */
-	struct nf_conn *master;
-
 	/* Timer function; drops refcnt when it goes off. */
 	struct timer_list timeout;
 
+#ifdef CONFIG_NET_NS
+	struct net *ct_net;
+#endif
+	/* all members below initialized via memset */
+	u8 __nfct_init_offset[0];
+
+	/* If we were expected by an expectation, this will be it */
+	struct nf_conn *master;
+
 #if defined(CONFIG_NF_CONNTRACK_MARK)
 	u_int32_t mark;
 #endif
@@ -108,9 +114,6 @@ struct nf_conn {
 
 	/* Extensions */
 	struct nf_ct_ext *ext;
-#ifdef CONFIG_NET_NS
-	struct net *ct_net;
-#endif
 
 	/* Storage reserved for other modules, must be the last member */
 	union nf_conntrack_proto proto;
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 2c69975..9ef88c8 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -826,22 +826,19 @@ __nf_conntrack_alloc(struct net *net, u16 zone,
 		atomic_dec(&net->ct.count);
 		return ERR_PTR(-ENOMEM);
 	}
-	/*
-	 * Let ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.next
-	 * and ct->tuplehash[IP_CT_DIR_REPLY].hnnode.next unchanged.
-	 */
-	memset(&ct->tuplehash[IP_CT_DIR_MAX], 0,
-	       offsetof(struct nf_conn, proto) -
-	       offsetof(struct nf_conn, tuplehash[IP_CT_DIR_MAX]));
 	spin_lock_init(&ct->lock);
 	ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple = *orig;
 	ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode.pprev = NULL;
 	ct->tuplehash[IP_CT_DIR_REPLY].tuple = *repl;
 	/* save hash for reusing when confirming */
 	*(unsigned long *)(&ct->tuplehash[IP_CT_DIR_REPLY].hnnode.pprev) = hash;
+	ct->status = 0;
 	/* Don't set timer yet: wait for confirmation */
 	setup_timer(&ct->timeout, death_by_timeout, (unsigned long)ct);
 	write_pnet(&ct->ct_net, net);
+	memset(&ct->__nfct_init_offset[0], 0,
+	       offsetof(struct nf_conn, proto) -
+	       offsetof(struct nf_conn, __nfct_init_offset[0]));
 #ifdef CONFIG_NF_CONNTRACK_ZONES
 	if (zone) {
 		struct nf_conntrack_zone *nf_ct_zone;
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 03/12] netfilter: nf_tables_bridge: export nft_reject_ip*hdr_validate functions
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Alvaro Neira <alvaroneay@gmail.com>

This patch exports the functions nft_reject_iphdr_validate and
nft_reject_ip6hdr_validate, renamed to nft_bridge_iphdr_validate and
nft_bridge_ip6hdr_validate, for use in follow-up patches.
These functions check whether the IPv4/IPv6 header is well-formed.
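
As an illustration only, the IPv4 checks can be sketched in userspace on
a flat byte buffer. ipv4_hdr_ok() is a hypothetical helper; it mirrors
the version, header length and total length tests, but has no
equivalent of pskb_may_pull() because a flat buffer is already linear:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if buf holds a plausible IPv4 header, 0 otherwise. */
static int ipv4_hdr_ok(const uint8_t *buf, size_t buflen)
{
	size_t hdrlen, totlen;

	assert(buf != NULL);

	/* Need at least the fixed 20-byte header. */
	if (buflen < 20)
		return 0;

	/* Version is the high nibble, IHL (in 32-bit words) the low one. */
	if ((buf[0] >> 4) != 4)
		return 0;
	hdrlen = (size_t)(buf[0] & 0x0f) * 4;
	if (hdrlen < 20)
		return 0;

	/* Total length is a big-endian 16-bit field at offset 2. */
	totlen = ((size_t)buf[2] << 8) | buf[3];
	if (buflen < totlen)
		return 0;	/* packet claims more data than we have */
	if (totlen < hdrlen)
		return 0;	/* header cannot be longer than the packet */

	return 1;
}
```

A 20-byte buffer starting with 0x45 and carrying a total length of 20
passes; changing the version nibble, shrinking the IHL below 5, or
claiming a total length larger than the buffer makes it fail.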

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables_bridge.h |    7 ++++
 net/bridge/netfilter/nf_tables_bridge.c  |   48 +++++++++++++++++++++++++++
 net/bridge/netfilter/nft_reject_bridge.c |   52 +++---------------------------
 3 files changed, 60 insertions(+), 47 deletions(-)
 create mode 100644 include/net/netfilter/nf_tables_bridge.h

diff --git a/include/net/netfilter/nf_tables_bridge.h b/include/net/netfilter/nf_tables_bridge.h
new file mode 100644
index 0000000..511fb79
--- /dev/null
+++ b/include/net/netfilter/nf_tables_bridge.h
@@ -0,0 +1,7 @@
+#ifndef _NET_NF_TABLES_BRIDGE_H
+#define _NET_NF_TABLES_BRIDGE_H
+
+int nft_bridge_iphdr_validate(struct sk_buff *skb);
+int nft_bridge_ip6hdr_validate(struct sk_buff *skb);
+
+#endif /* _NET_NF_TABLES_BRIDGE_H */
diff --git a/net/bridge/netfilter/nf_tables_bridge.c b/net/bridge/netfilter/nf_tables_bridge.c
index 074c557..d468c19 100644
--- a/net/bridge/netfilter/nf_tables_bridge.c
+++ b/net/bridge/netfilter/nf_tables_bridge.c
@@ -13,6 +13,54 @@
 #include <linux/module.h>
 #include <linux/netfilter_bridge.h>
 #include <net/netfilter/nf_tables.h>
+#include <net/netfilter/nf_tables_bridge.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+
+int nft_bridge_iphdr_validate(struct sk_buff *skb)
+{
+	struct iphdr *iph;
+	u32 len;
+
+	if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+		return 0;
+
+	iph = ip_hdr(skb);
+	if (iph->ihl < 5 || iph->version != 4)
+		return 0;
+
+	len = ntohs(iph->tot_len);
+	if (skb->len < len)
+		return 0;
+	else if (len < (iph->ihl*4))
+		return 0;
+
+	if (!pskb_may_pull(skb, iph->ihl*4))
+		return 0;
+
+	return 1;
+}
+EXPORT_SYMBOL_GPL(nft_bridge_iphdr_validate);
+
+int nft_bridge_ip6hdr_validate(struct sk_buff *skb)
+{
+	struct ipv6hdr *hdr;
+	u32 pkt_len;
+
+	if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
+		return 0;
+
+	hdr = ipv6_hdr(skb);
+	if (hdr->version != 6)
+		return 0;
+
+	pkt_len = ntohs(hdr->payload_len);
+	if (pkt_len + sizeof(struct ipv6hdr) > skb->len)
+		return 0;
+
+	return 1;
+}
+EXPORT_SYMBOL_GPL(nft_bridge_ip6hdr_validate);
 
 static unsigned int
 nft_do_chain_bridge(const struct nf_hook_ops *ops,
diff --git a/net/bridge/netfilter/nft_reject_bridge.c b/net/bridge/netfilter/nft_reject_bridge.c
index 48da2c5..b0330ae 100644
--- a/net/bridge/netfilter/nft_reject_bridge.c
+++ b/net/bridge/netfilter/nft_reject_bridge.c
@@ -14,6 +14,7 @@
 #include <linux/netfilter/nf_tables.h>
 #include <net/netfilter/nf_tables.h>
 #include <net/netfilter/nft_reject.h>
+#include <net/netfilter/nf_tables_bridge.h>
 #include <net/netfilter/ipv4/nf_reject.h>
 #include <net/netfilter/ipv6/nf_reject.h>
 #include <linux/ip.h>
@@ -35,30 +36,6 @@ static void nft_reject_br_push_etherhdr(struct sk_buff *oldskb,
 	skb_pull(nskb, ETH_HLEN);
 }
 
-static int nft_reject_iphdr_validate(struct sk_buff *oldskb)
-{
-	struct iphdr *iph;
-	u32 len;
-
-	if (!pskb_may_pull(oldskb, sizeof(struct iphdr)))
-		return 0;
-
-	iph = ip_hdr(oldskb);
-	if (iph->ihl < 5 || iph->version != 4)
-		return 0;
-
-	len = ntohs(iph->tot_len);
-	if (oldskb->len < len)
-		return 0;
-	else if (len < (iph->ihl*4))
-		return 0;
-
-	if (!pskb_may_pull(oldskb, iph->ihl*4))
-		return 0;
-
-	return 1;
-}
-
 static void nft_reject_br_send_v4_tcp_reset(struct sk_buff *oldskb, int hook)
 {
 	struct sk_buff *nskb;
@@ -66,7 +43,7 @@ static void nft_reject_br_send_v4_tcp_reset(struct sk_buff *oldskb, int hook)
 	const struct tcphdr *oth;
 	struct tcphdr _oth;
 
-	if (!nft_reject_iphdr_validate(oldskb))
+	if (!nft_bridge_iphdr_validate(oldskb))
 		return;
 
 	oth = nf_reject_ip_tcphdr_get(oldskb, &_oth, hook);
@@ -101,7 +78,7 @@ static void nft_reject_br_send_v4_unreach(struct sk_buff *oldskb, int hook,
 	void *payload;
 	__wsum csum;
 
-	if (!nft_reject_iphdr_validate(oldskb))
+	if (!nft_bridge_iphdr_validate(oldskb))
 		return;
 
 	/* IP header checks: fragment. */
@@ -146,25 +123,6 @@ static void nft_reject_br_send_v4_unreach(struct sk_buff *oldskb, int hook,
 	br_deliver(br_port_get_rcu(oldskb->dev), nskb);
 }
 
-static int nft_reject_ip6hdr_validate(struct sk_buff *oldskb)
-{
-	struct ipv6hdr *hdr;
-	u32 pkt_len;
-
-	if (!pskb_may_pull(oldskb, sizeof(struct ipv6hdr)))
-		return 0;
-
-	hdr = ipv6_hdr(oldskb);
-	if (hdr->version != 6)
-		return 0;
-
-	pkt_len = ntohs(hdr->payload_len);
-	if (pkt_len + sizeof(struct ipv6hdr) > oldskb->len)
-		return 0;
-
-	return 1;
-}
-
 static void nft_reject_br_send_v6_tcp_reset(struct net *net,
 					    struct sk_buff *oldskb, int hook)
 {
@@ -174,7 +132,7 @@ static void nft_reject_br_send_v6_tcp_reset(struct net *net,
 	unsigned int otcplen;
 	struct ipv6hdr *nip6h;
 
-	if (!nft_reject_ip6hdr_validate(oldskb))
+	if (!nft_bridge_ip6hdr_validate(oldskb))
 		return;
 
 	oth = nf_reject_ip6_tcphdr_get(oldskb, &_oth, &otcplen, hook);
@@ -207,7 +165,7 @@ static void nft_reject_br_send_v6_unreach(struct net *net,
 	unsigned int len;
 	void *payload;
 
-	if (!nft_reject_ip6hdr_validate(oldskb))
+	if (!nft_bridge_ip6hdr_validate(oldskb))
 		return;
 
 	/* Include "As much of invoking packet as possible without the ICMPv6
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 04/12] netfilter: nf_tables_bridge: set the pktinfo for IPv4/IPv6 traffic
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Alvaro Neira <alvaroneay@gmail.com>

This patch adds the missing bits needed to match on meta l4proto from
the bridge family. Example:

  nft add rule bridge filter input ether type {ip, ip6} meta l4proto udp counter

Signed-off-by: Alvaro Neira Ayuso <alvaroneay@gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/bridge/netfilter/nf_tables_bridge.c |   40 ++++++++++++++++++++++++++++++-
 1 file changed, 39 insertions(+), 1 deletion(-)

diff --git a/net/bridge/netfilter/nf_tables_bridge.c b/net/bridge/netfilter/nf_tables_bridge.c
index d468c19..19473a9 100644
--- a/net/bridge/netfilter/nf_tables_bridge.c
+++ b/net/bridge/netfilter/nf_tables_bridge.c
@@ -16,6 +16,8 @@
 #include <net/netfilter/nf_tables_bridge.h>
 #include <linux/ip.h>
 #include <linux/ipv6.h>
+#include <net/netfilter/nf_tables_ipv4.h>
+#include <net/netfilter/nf_tables_ipv6.h>
 
 int nft_bridge_iphdr_validate(struct sk_buff *skb)
 {
@@ -62,6 +64,32 @@ int nft_bridge_ip6hdr_validate(struct sk_buff *skb)
 }
 EXPORT_SYMBOL_GPL(nft_bridge_ip6hdr_validate);
 
+static inline void nft_bridge_set_pktinfo_ipv4(struct nft_pktinfo *pkt,
+					       const struct nf_hook_ops *ops,
+					       struct sk_buff *skb,
+					       const struct net_device *in,
+					       const struct net_device *out)
+{
+	if (nft_bridge_iphdr_validate(skb))
+		nft_set_pktinfo_ipv4(pkt, ops, skb, in, out);
+	else
+		nft_set_pktinfo(pkt, ops, skb, in, out);
+}
+
+static inline void nft_bridge_set_pktinfo_ipv6(struct nft_pktinfo *pkt,
+					      const struct nf_hook_ops *ops,
+					      struct sk_buff *skb,
+					      const struct net_device *in,
+					      const struct net_device *out)
+{
+#if IS_ENABLED(CONFIG_IPV6)
+	if (nft_bridge_ip6hdr_validate(skb) &&
+	    nft_set_pktinfo_ipv6(pkt, ops, skb, in, out) == 0)
+		return;
+#endif
+	nft_set_pktinfo(pkt, ops, skb, in, out);
+}
+
 static unsigned int
 nft_do_chain_bridge(const struct nf_hook_ops *ops,
 		    struct sk_buff *skb,
@@ -71,7 +99,17 @@ nft_do_chain_bridge(const struct nf_hook_ops *ops,
 {
 	struct nft_pktinfo pkt;
 
-	nft_set_pktinfo(&pkt, ops, skb, in, out);
+	switch (eth_hdr(skb)->h_proto) {
+	case htons(ETH_P_IP):
+		nft_bridge_set_pktinfo_ipv4(&pkt, ops, skb, in, out);
+		break;
+	case htons(ETH_P_IPV6):
+		nft_bridge_set_pktinfo_ipv6(&pkt, ops, skb, in, out);
+		break;
+	default:
+		nft_set_pktinfo(&pkt, ops, skb, in, out);
+		break;
+	}
 
 	return nft_do_chain(&pkt, ops);
 }
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 05/12] netfilter: combine IPv4 and IPv6 nf_nat_redirect code in one module
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

This resolves linking problems with CONFIG_IPV6=n:

net/built-in.o: In function `redirect_tg6':
xt_REDIRECT.c:(.text+0x6d021): undefined reference to `nf_nat_redirect_ipv6'

Reported-by: Andreas Ruprecht <rupran@einserver.de>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/ipv4/nf_nat_redirect.h |    9 --
 include/net/netfilter/ipv6/nf_nat_redirect.h |    8 --
 include/net/netfilter/nf_nat_redirect.h      |   12 +++
 net/ipv4/netfilter/Kconfig                   |    8 +-
 net/ipv4/netfilter/Makefile                  |    1 -
 net/ipv4/netfilter/nf_nat_redirect_ipv4.c    |   82 -----------------
 net/ipv4/netfilter/nft_redir_ipv4.c          |    2 +-
 net/ipv6/netfilter/Kconfig                   |    8 +-
 net/ipv6/netfilter/Makefile                  |    1 -
 net/ipv6/netfilter/nf_nat_redirect_ipv6.c    |   75 ---------------
 net/ipv6/netfilter/nft_redir_ipv6.c          |    2 +-
 net/netfilter/Kconfig                        |   10 +-
 net/netfilter/Makefile                       |    1 +
 net/netfilter/nf_nat_redirect.c              |  127 ++++++++++++++++++++++++++
 net/netfilter/xt_REDIRECT.c                  |    3 +-
 15 files changed, 153 insertions(+), 196 deletions(-)
 delete mode 100644 include/net/netfilter/ipv4/nf_nat_redirect.h
 delete mode 100644 include/net/netfilter/ipv6/nf_nat_redirect.h
 create mode 100644 include/net/netfilter/nf_nat_redirect.h
 delete mode 100644 net/ipv4/netfilter/nf_nat_redirect_ipv4.c
 delete mode 100644 net/ipv6/netfilter/nf_nat_redirect_ipv6.c
 create mode 100644 net/netfilter/nf_nat_redirect.c

diff --git a/include/net/netfilter/ipv4/nf_nat_redirect.h b/include/net/netfilter/ipv4/nf_nat_redirect.h
deleted file mode 100644
index 19e1df3..0000000
--- a/include/net/netfilter/ipv4/nf_nat_redirect.h
+++ /dev/null
@@ -1,9 +0,0 @@
-#ifndef _NF_NAT_REDIRECT_IPV4_H_
-#define _NF_NAT_REDIRECT_IPV4_H_
-
-unsigned int
-nf_nat_redirect_ipv4(struct sk_buff *skb,
-		     const struct nf_nat_ipv4_multi_range_compat *mr,
-		     unsigned int hooknum);
-
-#endif /* _NF_NAT_REDIRECT_IPV4_H_ */
diff --git a/include/net/netfilter/ipv6/nf_nat_redirect.h b/include/net/netfilter/ipv6/nf_nat_redirect.h
deleted file mode 100644
index 1ebdffc..0000000
--- a/include/net/netfilter/ipv6/nf_nat_redirect.h
+++ /dev/null
@@ -1,8 +0,0 @@
-#ifndef _NF_NAT_REDIRECT_IPV6_H_
-#define _NF_NAT_REDIRECT_IPV6_H_
-
-unsigned int
-nf_nat_redirect_ipv6(struct sk_buff *skb, const struct nf_nat_range *range,
-		     unsigned int hooknum);
-
-#endif /* _NF_NAT_REDIRECT_IPV6_H_ */
diff --git a/include/net/netfilter/nf_nat_redirect.h b/include/net/netfilter/nf_nat_redirect.h
new file mode 100644
index 0000000..73b7295
--- /dev/null
+++ b/include/net/netfilter/nf_nat_redirect.h
@@ -0,0 +1,12 @@
+#ifndef _NF_NAT_REDIRECT_H_
+#define _NF_NAT_REDIRECT_H_
+
+unsigned int
+nf_nat_redirect_ipv4(struct sk_buff *skb,
+		     const struct nf_nat_ipv4_multi_range_compat *mr,
+		     unsigned int hooknum);
+unsigned int
+nf_nat_redirect_ipv6(struct sk_buff *skb, const struct nf_nat_range *range,
+		     unsigned int hooknum);
+
+#endif /* _NF_NAT_REDIRECT_H_ */
diff --git a/net/ipv4/netfilter/Kconfig b/net/ipv4/netfilter/Kconfig
index 8358b2d..59f883d 100644
--- a/net/ipv4/netfilter/Kconfig
+++ b/net/ipv4/netfilter/Kconfig
@@ -104,12 +104,6 @@ config NF_NAT_MASQUERADE_IPV4
 	  This is the kernel functionality to provide NAT in the masquerade
 	  flavour (automatic source address selection).
 
-config NF_NAT_REDIRECT_IPV4
-	tristate "IPv4 redirect support"
-	help
-	  This is the kernel functionality to provide NAT in the redirect
-	  flavour (redirect packets to local machine).
-
 config NFT_MASQ_IPV4
 	tristate "IPv4 masquerading support for nf_tables"
 	depends on NF_TABLES_IPV4
@@ -123,7 +117,7 @@ config NFT_REDIR_IPV4
 	tristate "IPv4 redirect support for nf_tables"
 	depends on NF_TABLES_IPV4
 	depends on NFT_REDIR
-	select NF_NAT_REDIRECT_IPV4
+	select NF_NAT_REDIRECT
 	help
 	  This is the expression that provides IPv4 redirect support for
 	  nf_tables.
diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index 902bcd1..7fe6c70 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -31,7 +31,6 @@ obj-$(CONFIG_NF_NAT_H323) += nf_nat_h323.o
 obj-$(CONFIG_NF_NAT_PPTP) += nf_nat_pptp.o
 obj-$(CONFIG_NF_NAT_SNMP_BASIC) += nf_nat_snmp_basic.o
 obj-$(CONFIG_NF_NAT_MASQUERADE_IPV4) += nf_nat_masquerade_ipv4.o
-obj-$(CONFIG_NF_NAT_REDIRECT_IPV4) += nf_nat_redirect_ipv4.o
 
 # NAT protocols (nf_nat)
 obj-$(CONFIG_NF_NAT_PROTO_GRE) += nf_nat_proto_gre.o
diff --git a/net/ipv4/netfilter/nf_nat_redirect_ipv4.c b/net/ipv4/netfilter/nf_nat_redirect_ipv4.c
deleted file mode 100644
index a220552..0000000
--- a/net/ipv4/netfilter/nf_nat_redirect_ipv4.c
+++ /dev/null
@@ -1,82 +0,0 @@
-/*
- * (C) 1999-2001 Paul `Rusty' Russell
- * (C) 2002-2006 Netfilter Core Team <coreteam@netfilter.org>
- * Copyright (c) 2011 Patrick McHardy <kaber@trash.net>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on Rusty Russell's IPv4 REDIRECT target. Development of IPv6
- * NAT funded by Astaro.
- */
-
-#include <linux/if.h>
-#include <linux/inetdevice.h>
-#include <linux/ip.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/netdevice.h>
-#include <linux/netfilter.h>
-#include <linux/types.h>
-#include <linux/netfilter_ipv4.h>
-#include <linux/netfilter/x_tables.h>
-#include <net/addrconf.h>
-#include <net/checksum.h>
-#include <net/protocol.h>
-#include <net/netfilter/nf_nat.h>
-#include <net/netfilter/ipv4/nf_nat_redirect.h>
-
-unsigned int
-nf_nat_redirect_ipv4(struct sk_buff *skb,
-		     const struct nf_nat_ipv4_multi_range_compat *mr,
-		     unsigned int hooknum)
-{
-	struct nf_conn *ct;
-	enum ip_conntrack_info ctinfo;
-	__be32 newdst;
-	struct nf_nat_range newrange;
-
-	NF_CT_ASSERT(hooknum == NF_INET_PRE_ROUTING ||
-		     hooknum == NF_INET_LOCAL_OUT);
-
-	ct = nf_ct_get(skb, &ctinfo);
-	NF_CT_ASSERT(ct && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED));
-
-	/* Local packets: make them go to loopback */
-	if (hooknum == NF_INET_LOCAL_OUT) {
-		newdst = htonl(0x7F000001);
-	} else {
-		struct in_device *indev;
-		struct in_ifaddr *ifa;
-
-		newdst = 0;
-
-		rcu_read_lock();
-		indev = __in_dev_get_rcu(skb->dev);
-		if (indev != NULL) {
-			ifa = indev->ifa_list;
-			newdst = ifa->ifa_local;
-		}
-		rcu_read_unlock();
-
-		if (!newdst)
-			return NF_DROP;
-	}
-
-	/* Transfer from original range. */
-	memset(&newrange.min_addr, 0, sizeof(newrange.min_addr));
-	memset(&newrange.max_addr, 0, sizeof(newrange.max_addr));
-	newrange.flags	     = mr->range[0].flags | NF_NAT_RANGE_MAP_IPS;
-	newrange.min_addr.ip = newdst;
-	newrange.max_addr.ip = newdst;
-	newrange.min_proto   = mr->range[0].min;
-	newrange.max_proto   = mr->range[0].max;
-
-	/* Hand modified range to generic setup. */
-	return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST);
-}
-EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv4);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
diff --git a/net/ipv4/netfilter/nft_redir_ipv4.c b/net/ipv4/netfilter/nft_redir_ipv4.c
index 643c596..ff2d23d 100644
--- a/net/ipv4/netfilter/nft_redir_ipv4.c
+++ b/net/ipv4/netfilter/nft_redir_ipv4.c
@@ -14,7 +14,7 @@
 #include <linux/netfilter/nf_tables.h>
 #include <net/netfilter/nf_tables.h>
 #include <net/netfilter/nf_nat.h>
-#include <net/netfilter/ipv4/nf_nat_redirect.h>
+#include <net/netfilter/nf_nat_redirect.h>
 #include <net/netfilter/nft_redir.h>
 
 static void nft_redir_ipv4_eval(const struct nft_expr *expr,
diff --git a/net/ipv6/netfilter/Kconfig b/net/ipv6/netfilter/Kconfig
index 0dbe5c7..a069822 100644
--- a/net/ipv6/netfilter/Kconfig
+++ b/net/ipv6/netfilter/Kconfig
@@ -82,12 +82,6 @@ config NF_NAT_MASQUERADE_IPV6
 	  This is the kernel functionality to provide NAT in the masquerade
 	  flavour (automatic source address selection) for IPv6.
 
-config NF_NAT_REDIRECT_IPV6
-	tristate "IPv6 redirect support"
-	help
-	  This is the kernel functionality to provide NAT in the redirect
-	  flavour (redirect packet to local machine) for IPv6.
-
 config NFT_MASQ_IPV6
 	tristate "IPv6 masquerade support for nf_tables"
 	depends on NF_TABLES_IPV6
@@ -101,7 +95,7 @@ config NFT_REDIR_IPV6
 	tristate "IPv6 redirect support for nf_tables"
 	depends on NF_TABLES_IPV6
 	depends on NFT_REDIR
-	select NF_NAT_REDIRECT_IPV6
+	select NF_NAT_REDIRECT
 	help
 	  This is the expression that provides IPv4 redirect support for
 	  nf_tables.
diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
index d2ac9f5..c36e0a5 100644
--- a/net/ipv6/netfilter/Makefile
+++ b/net/ipv6/netfilter/Makefile
@@ -19,7 +19,6 @@ obj-$(CONFIG_NF_CONNTRACK_IPV6) += nf_conntrack_ipv6.o
 nf_nat_ipv6-y		:= nf_nat_l3proto_ipv6.o nf_nat_proto_icmpv6.o
 obj-$(CONFIG_NF_NAT_IPV6) += nf_nat_ipv6.o
 obj-$(CONFIG_NF_NAT_MASQUERADE_IPV6) += nf_nat_masquerade_ipv6.o
-obj-$(CONFIG_NF_NAT_REDIRECT_IPV6) += nf_nat_redirect_ipv6.o
 
 # defrag
 nf_defrag_ipv6-y := nf_defrag_ipv6_hooks.o nf_conntrack_reasm.o
diff --git a/net/ipv6/netfilter/nf_nat_redirect_ipv6.c b/net/ipv6/netfilter/nf_nat_redirect_ipv6.c
deleted file mode 100644
index ea1308a..0000000
--- a/net/ipv6/netfilter/nf_nat_redirect_ipv6.c
+++ /dev/null
@@ -1,75 +0,0 @@
-/*
- * (C) 1999-2001 Paul `Rusty' Russell
- * (C) 2002-2006 Netfilter Core Team <coreteam@netfilter.org>
- * Copyright (c) 2011 Patrick McHardy <kaber@trash.net>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Based on Rusty Russell's IPv4 REDIRECT target. Development of IPv6
- * NAT funded by Astaro.
- */
-
-#include <linux/if.h>
-#include <linux/inetdevice.h>
-#include <linux/ip.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/netdevice.h>
-#include <linux/netfilter.h>
-#include <linux/types.h>
-#include <linux/netfilter_ipv6.h>
-#include <linux/netfilter/x_tables.h>
-#include <net/addrconf.h>
-#include <net/checksum.h>
-#include <net/protocol.h>
-#include <net/netfilter/nf_nat.h>
-#include <net/netfilter/ipv6/nf_nat_redirect.h>
-
-static const struct in6_addr loopback_addr = IN6ADDR_LOOPBACK_INIT;
-
-unsigned int
-nf_nat_redirect_ipv6(struct sk_buff *skb, const struct nf_nat_range *range,
-		     unsigned int hooknum)
-{
-	struct nf_nat_range newrange;
-	struct in6_addr newdst;
-	enum ip_conntrack_info ctinfo;
-	struct nf_conn *ct;
-
-	ct = nf_ct_get(skb, &ctinfo);
-	if (hooknum == NF_INET_LOCAL_OUT) {
-		newdst = loopback_addr;
-	} else {
-		struct inet6_dev *idev;
-		struct inet6_ifaddr *ifa;
-		bool addr = false;
-
-		rcu_read_lock();
-		idev = __in6_dev_get(skb->dev);
-		if (idev != NULL) {
-			list_for_each_entry(ifa, &idev->addr_list, if_list) {
-				newdst = ifa->addr;
-				addr = true;
-				break;
-			}
-		}
-		rcu_read_unlock();
-
-		if (!addr)
-			return NF_DROP;
-	}
-
-	newrange.flags		= range->flags | NF_NAT_RANGE_MAP_IPS;
-	newrange.min_addr.in6	= newdst;
-	newrange.max_addr.in6	= newdst;
-	newrange.min_proto	= range->min_proto;
-	newrange.max_proto	= range->max_proto;
-
-	return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST);
-}
-EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv6);
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
diff --git a/net/ipv6/netfilter/nft_redir_ipv6.c b/net/ipv6/netfilter/nft_redir_ipv6.c
index 83420ee..2433a6b 100644
--- a/net/ipv6/netfilter/nft_redir_ipv6.c
+++ b/net/ipv6/netfilter/nft_redir_ipv6.c
@@ -15,7 +15,7 @@
 #include <net/netfilter/nf_tables.h>
 #include <net/netfilter/nf_nat.h>
 #include <net/netfilter/nft_redir.h>
-#include <net/netfilter/ipv6/nf_nat_redirect.h>
+#include <net/netfilter/nf_nat_redirect.h>
 
 static void nft_redir_ipv6_eval(const struct nft_expr *expr,
 				struct nft_data data[NFT_REG_MAX + 1],
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 57f15a9..b02660f 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -411,6 +411,13 @@ config NF_NAT_TFTP
 	depends on NF_CONNTRACK && NF_NAT
 	default NF_NAT && NF_CONNTRACK_TFTP
 
+config NF_NAT_REDIRECT
+        tristate "IPv4/IPv6 redirect support"
+	depends on NF_NAT
+        help
+          This is the kernel functionality to redirect packets to local
+          machine through NAT.
+
 config NETFILTER_SYNPROXY
 	tristate
 
@@ -844,8 +851,7 @@ config NETFILTER_XT_TARGET_RATEEST
 config NETFILTER_XT_TARGET_REDIRECT
 	tristate "REDIRECT target support"
 	depends on NF_NAT
-	select NF_NAT_REDIRECT_IPV4 if NF_NAT_IPV4
-	select NF_NAT_REDIRECT_IPV6 if NF_NAT_IPV6
+	select NF_NAT_REDIRECT
 	---help---
 	REDIRECT is a special case of NAT: all incoming connections are
 	mapped onto the incoming interface's address, causing the packets to
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index f3eb468..89f73a9 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -51,6 +51,7 @@ nf_nat-y	:= nf_nat_core.o nf_nat_proto_unknown.o nf_nat_proto_common.o \
 obj-$(CONFIG_NF_LOG_COMMON) += nf_log_common.o
 
 obj-$(CONFIG_NF_NAT) += nf_nat.o
+obj-$(CONFIG_NF_NAT_REDIRECT) += nf_nat_redirect.o
 
 # NAT protocols (nf_nat)
 obj-$(CONFIG_NF_NAT_PROTO_DCCP) += nf_nat_proto_dccp.o
diff --git a/net/netfilter/nf_nat_redirect.c b/net/netfilter/nf_nat_redirect.c
new file mode 100644
index 0000000..97b75f9
--- /dev/null
+++ b/net/netfilter/nf_nat_redirect.c
@@ -0,0 +1,127 @@
+/*
+ * (C) 1999-2001 Paul `Rusty' Russell
+ * (C) 2002-2006 Netfilter Core Team <coreteam@netfilter.org>
+ * Copyright (c) 2011 Patrick McHardy <kaber@trash.net>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * Based on Rusty Russell's IPv4 REDIRECT target. Development of IPv6
+ * NAT funded by Astaro.
+ */
+
+#include <linux/if.h>
+#include <linux/inetdevice.h>
+#include <linux/ip.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/netdevice.h>
+#include <linux/netfilter.h>
+#include <linux/types.h>
+#include <linux/netfilter_ipv4.h>
+#include <linux/netfilter_ipv6.h>
+#include <linux/netfilter/x_tables.h>
+#include <net/addrconf.h>
+#include <net/checksum.h>
+#include <net/protocol.h>
+#include <net/netfilter/nf_nat.h>
+#include <net/netfilter/nf_nat_redirect.h>
+
+unsigned int
+nf_nat_redirect_ipv4(struct sk_buff *skb,
+		     const struct nf_nat_ipv4_multi_range_compat *mr,
+		     unsigned int hooknum)
+{
+	struct nf_conn *ct;
+	enum ip_conntrack_info ctinfo;
+	__be32 newdst;
+	struct nf_nat_range newrange;
+
+	NF_CT_ASSERT(hooknum == NF_INET_PRE_ROUTING ||
+		     hooknum == NF_INET_LOCAL_OUT);
+
+	ct = nf_ct_get(skb, &ctinfo);
+	NF_CT_ASSERT(ct && (ctinfo == IP_CT_NEW || ctinfo == IP_CT_RELATED));
+
+	/* Local packets: make them go to loopback */
+	if (hooknum == NF_INET_LOCAL_OUT) {
+		newdst = htonl(0x7F000001);
+	} else {
+		struct in_device *indev;
+		struct in_ifaddr *ifa;
+
+		newdst = 0;
+
+		rcu_read_lock();
+		indev = __in_dev_get_rcu(skb->dev);
+		if (indev != NULL) {
+			ifa = indev->ifa_list;
+			newdst = ifa->ifa_local;
+		}
+		rcu_read_unlock();
+
+		if (!newdst)
+			return NF_DROP;
+	}
+
+	/* Transfer from original range. */
+	memset(&newrange.min_addr, 0, sizeof(newrange.min_addr));
+	memset(&newrange.max_addr, 0, sizeof(newrange.max_addr));
+	newrange.flags	     = mr->range[0].flags | NF_NAT_RANGE_MAP_IPS;
+	newrange.min_addr.ip = newdst;
+	newrange.max_addr.ip = newdst;
+	newrange.min_proto   = mr->range[0].min;
+	newrange.max_proto   = mr->range[0].max;
+
+	/* Hand modified range to generic setup. */
+	return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST);
+}
+EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv4);
+
+static const struct in6_addr loopback_addr = IN6ADDR_LOOPBACK_INIT;
+
+unsigned int
+nf_nat_redirect_ipv6(struct sk_buff *skb, const struct nf_nat_range *range,
+		     unsigned int hooknum)
+{
+	struct nf_nat_range newrange;
+	struct in6_addr newdst;
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct;
+
+	ct = nf_ct_get(skb, &ctinfo);
+	if (hooknum == NF_INET_LOCAL_OUT) {
+		newdst = loopback_addr;
+	} else {
+		struct inet6_dev *idev;
+		struct inet6_ifaddr *ifa;
+		bool addr = false;
+
+		rcu_read_lock();
+		idev = __in6_dev_get(skb->dev);
+		if (idev != NULL) {
+			list_for_each_entry(ifa, &idev->addr_list, if_list) {
+				newdst = ifa->addr;
+				addr = true;
+				break;
+			}
+		}
+		rcu_read_unlock();
+
+		if (!addr)
+			return NF_DROP;
+	}
+
+	newrange.flags		= range->flags | NF_NAT_RANGE_MAP_IPS;
+	newrange.min_addr.in6	= newdst;
+	newrange.max_addr.in6	= newdst;
+	newrange.min_proto	= range->min_proto;
+	newrange.max_proto	= range->max_proto;
+
+	return nf_nat_setup_info(ct, &newrange, NF_NAT_MANIP_DST);
+}
+EXPORT_SYMBOL_GPL(nf_nat_redirect_ipv6);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
diff --git a/net/netfilter/xt_REDIRECT.c b/net/netfilter/xt_REDIRECT.c
index b6ec67e..03f0b37 100644
--- a/net/netfilter/xt_REDIRECT.c
+++ b/net/netfilter/xt_REDIRECT.c
@@ -26,8 +26,7 @@
 #include <net/checksum.h>
 #include <net/protocol.h>
 #include <net/netfilter/nf_nat.h>
-#include <net/netfilter/ipv4/nf_nat_redirect.h>
-#include <net/netfilter/ipv6/nf_nat_redirect.h>
+#include <net/netfilter/nf_nat_redirect.h>
 
 static unsigned int
 redirect_tg6(struct sk_buff *skb, const struct xt_action_param *par)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 06/12] netfilter: nf_log_ipv6: correct typo in module description
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Steven Noonan <steven@uplinklabs.net>

It incorrectly identifies itself as "IPv4" packet logging.

Signed-off-by: Steven Noonan <steven@uplinklabs.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv6/netfilter/nf_log_ipv6.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/netfilter/nf_log_ipv6.c b/net/ipv6/netfilter/nf_log_ipv6.c
index 7fc34d1..ddf07e6 100644
--- a/net/ipv6/netfilter/nf_log_ipv6.c
+++ b/net/ipv6/netfilter/nf_log_ipv6.c
@@ -422,6 +422,6 @@ module_init(nf_log_ipv6_init);
 module_exit(nf_log_ipv6_exit);
 
 MODULE_AUTHOR("Netfilter Core Team <coreteam@netfilter.org>");
-MODULE_DESCRIPTION("Netfilter IPv4 packet logging");
+MODULE_DESCRIPTION("Netfilter IPv6 packet logging");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS_NF_LOGGER(AF_INET6, 0);
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 11/12] netfilter: ipset: Allocate the proper size of memory when /0 networks are supported
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_gen.h |    3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index 8ef9135..974ff38 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -1101,8 +1101,7 @@ IPSET_TOKEN(HTYPE, _create)(struct net *net, struct ip_set *set,
 
 	hsize = sizeof(*h);
 #ifdef IP_SET_HASH_WITH_NETS
-	hsize += sizeof(struct net_prefixes) *
-		(set->family == NFPROTO_IPV4 ? 32 : 128);
+	hsize += sizeof(struct net_prefixes) * NLEN(set->family);
 #endif
 	h = kzalloc(hsize, GFP_KERNEL);
 	if (!h)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 07/12] netfilter: ipset: Support updating extensions when the set is full
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

When the set was full (hash type and maxelem reached), it was not
possible to update the extension part of already existing elements.
The patch removes this limitation.
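
The reordering can be pictured with a small userspace sketch: look up
the element first, and only apply the maxelem check when a new slot is
really needed. struct set, set_add() and ERR_FULL are hypothetical
names, the timeout field stands in for the extension part, and the
sketch always updates an existing element (the real code also honours
the IPSET_FLAG_EXIST flag and forceadd):

```c
#include <assert.h>
#include <stddef.h>

#define SET_MAX		4	/* plays the role of maxelem */
#define ERR_FULL	(-1)

struct elem {
	unsigned int key;
	unsigned int timeout;	/* stand-in for the extension part */
};

struct set {
	struct elem e[SET_MAX];
	size_t n;		/* number of stored elements */
};

static int set_add(struct set *s, unsigned int key, unsigned int timeout)
{
	size_t i;

	assert(s != NULL);

	/* 1. An existing element only needs its extension refreshed. */
	for (i = 0; i < s->n; i++) {
		if (s->e[i].key == key) {
			s->e[i].timeout = timeout;
			return 0;
		}
	}

	/* 2. Only a genuinely new element can hit the size limit. */
	if (s->n >= SET_MAX)
		return ERR_FULL;

	s->e[s->n].key = key;
	s->e[s->n].timeout = timeout;
	s->n++;
	return 0;
}
```

With the fullness check placed before the lookup, the update of an
existing key in a full set would be rejected with ERR_FULL, which is
the limitation described above.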

Fixes: https://bugzilla.netfilter.org/show_bug.cgi?id=880
Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_gen.h |   40 ++++++++++++++-------------------
 1 file changed, 17 insertions(+), 23 deletions(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index fee7c64e..a12ee04 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -633,29 +633,6 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 	bool flag_exist = flags & IPSET_FLAG_EXIST;
 	u32 key, multi = 0;
 
-	if (h->elements >= h->maxelem && SET_WITH_FORCEADD(set)) {
-		rcu_read_lock_bh();
-		t = rcu_dereference_bh(h->table);
-		key = HKEY(value, h->initval, t->htable_bits);
-		n = hbucket(t,key);
-		if (n->pos) {
-			/* Choosing the first entry in the array to replace */
-			j = 0;
-			goto reuse_slot;
-		}
-		rcu_read_unlock_bh();
-	}
-	if (SET_WITH_TIMEOUT(set) && h->elements >= h->maxelem)
-		/* FIXME: when set is full, we slow down here */
-		mtype_expire(set, h, NLEN(set->family), set->dsize);
-
-	if (h->elements >= h->maxelem) {
-		if (net_ratelimit())
-			pr_warn("Set %s is full, maxelem %u reached\n",
-				set->name, h->maxelem);
-		return -IPSET_ERR_HASH_FULL;
-	}
-
 	rcu_read_lock_bh();
 	t = rcu_dereference_bh(h->table);
 	key = HKEY(value, h->initval, t->htable_bits);
@@ -680,6 +657,23 @@ mtype_add(struct ip_set *set, void *value, const struct ip_set_ext *ext,
 		    j != AHASH_MAX(h) + 1)
 			j = i;
 	}
+	if (h->elements >= h->maxelem && SET_WITH_FORCEADD(set) && n->pos) {
+		/* Choosing the first entry in the array to replace */
+		j = 0;
+		goto reuse_slot;
+	}
+	if (SET_WITH_TIMEOUT(set) && h->elements >= h->maxelem)
+		/* FIXME: when set is full, we slow down here */
+		mtype_expire(set, h, NLEN(set->family), set->dsize);
+
+	if (h->elements >= h->maxelem) {
+		if (net_ratelimit())
+			pr_warn("Set %s is full, maxelem %u reached\n",
+				set->name, h->maxelem);
+		ret = -IPSET_ERR_HASH_FULL;
+		goto out;
+	}
+
 reuse_slot:
 	if (j != AHASH_MAX(h) + 1) {
 		/* Fill out reused slot */
-- 
1.7.10.4
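
The reordering in this patch can be illustrated with a small standalone
sketch. It is not the ipset code: toy_set, toy_elem, toy_add, TOY_MAXELEM
and TOY_ERR_FULL are hypothetical names, and a flat array stands in for
the hash buckets. The only point it tries to show is that searching for
an existing element before checking the size limit lets an update succeed
on a full set, while a genuinely new element is still rejected.

```c
#include <stddef.h>

#define TOY_MAXELEM  2     /* hypothetical capacity, stands in for h->maxelem */
#define TOY_ERR_FULL (-1)  /* hypothetical, stands in for -IPSET_ERR_HASH_FULL */

struct toy_elem {
	unsigned int key;
	unsigned int ext;  /* stands in for the extension part of an element */
};

struct toy_set {
	struct toy_elem elem[TOY_MAXELEM];
	size_t elements;
};

/* Add key, or update its extension if the key is already present. */
static int toy_add(struct toy_set *s, unsigned int key, unsigned int ext)
{
	size_t i;

	/* 1. Look for an existing element first, even when the set is full. */
	for (i = 0; i < s->elements; i++) {
		if (s->elem[i].key == key) {
			s->elem[i].ext = ext;
			return 0;
		}
	}

	/* 2. Only a new element is subject to the size limit. */
	if (s->elements >= TOY_MAXELEM)
		return TOY_ERR_FULL;

	s->elem[s->elements].key = key;
	s->elem[s->elements].ext = ext;
	s->elements++;
	return 0;
}
```

With the size check placed before the search, as in the removed hunk, the
update of an existing key on a full set would return the "full" error
instead of succeeding.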

^ permalink raw reply related

* [PATCH 09/12] netfilter: ipset: Indicate when /0 networks are supported
From: Pablo Neira Ayuso @ 2014-12-03 12:55 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1417611342-25257-1-git-send-email-pablo@netfilter.org>

From: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/ipset/ip_set_hash_gen.h      |    2 +-
 net/netfilter/ipset/ip_set_hash_netiface.c |    1 +
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/netfilter/ipset/ip_set_hash_gen.h b/net/netfilter/ipset/ip_set_hash_gen.h
index a12ee04..9428fa5 100644
--- a/net/netfilter/ipset/ip_set_hash_gen.h
+++ b/net/netfilter/ipset/ip_set_hash_gen.h
@@ -156,7 +156,7 @@ hbucket_elem_add(struct hbucket *n, u8 ahash_max, size_t dsize)
 
 #define SET_HOST_MASK(family)	(family == AF_INET ? 32 : 128)
 
-#ifdef IP_SET_HASH_WITH_MULTI
+#ifdef IP_SET_HASH_WITH_NET0
 #define NLEN(family)		(SET_HOST_MASK(family) + 1)
 #else
 #define NLEN(family)		SET_HOST_MASK(family)
diff --git a/net/netfilter/ipset/ip_set_hash_netiface.c b/net/netfilter/ipset/ip_set_hash_netiface.c
index 35dd358..758b002 100644
--- a/net/netfilter/ipset/ip_set_hash_netiface.c
+++ b/net/netfilter/ipset/ip_set_hash_netiface.c
@@ -115,6 +115,7 @@ iface_add(struct rb_root *root, const char **iface)
 #define IP_SET_HASH_WITH_NETS
 #define IP_SET_HASH_WITH_RBTREE
 #define IP_SET_HASH_WITH_MULTI
+#define IP_SET_HASH_WITH_NET0
 
 #define STREQ(a, b)	(strcmp(a, b) == 0)
 
-- 
1.7.10.4
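
As a rough illustration of what the NLEN() change means for sizing: when
/0 is a valid prefix, IPv4 prefix lengths run from 0 to 32, which is 33
distinct values, so one extra slot is needed. The sketch below mirrors
the two macro variants as plain functions. toy_host_mask, toy_nlen and
the with_net0 parameter are hypothetical; in the patch the choice is made
at compile time through IP_SET_HASH_WITH_NET0.

```c
#include <sys/socket.h>  /* AF_INET, AF_INET6 */

/* Mirrors SET_HOST_MASK(family): the longest prefix for the family. */
static unsigned int toy_host_mask(int family)
{
	return family == AF_INET ? 32 : 128;
}

/*
 * Mirrors NLEN(family). with_net0 is a hypothetical runtime stand-in for
 * the compile-time IP_SET_HASH_WITH_NET0 switch: one extra slot is
 * reserved when /0 networks are supported.
 */
static unsigned int toy_nlen(int family, int with_net0)
{
	return toy_host_mask(family) + (with_net0 ? 1 : 0);
}
```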

^ permalink raw reply related
