Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] sky2: override for PCI legacy power management
From: Jonathan Nieder @ 2012-05-06 22:26 UTC (permalink / raw)
  To: Knut Petersen
  Cc: Bjorn Helgaas, Stephen Hemminger, David S. Miller, Linus Torvalds,
	arekm, Jared, dilieto, linux-kernel, netdev
In-Reply-To: <4F6BA960.6020003@t-online.de>

Knut Petersen wrote, a few months ago:

> It´s easy to dmi_match() known broken systems - I have dmidecode
> outputs of four systems that definitely need the patch.
>
> Two ASUSTek P5* mainboards with AMI BIOSes, two AOpen i915G*
> mainboards with Award/Phoenix BIOSes.

Yes, please.  Could you attach those to [1]?

Thanks,
Jonathan

[1] https://bugzilla.kernel.org/show_bug.cgi?id=19492

^ permalink raw reply

* Re: [v12 PATCH 2/3] NETFILTER module xt_hmark, new target for HASH based fwmark
From: Pablo Neira Ayuso @ 2012-05-06 22:57 UTC (permalink / raw)
  To: Hans Schillstrom
  Cc: kaber@trash.net, jengelh@medozas.de,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	hans@schillstrom.com
In-Reply-To: <201205021949.48741.hans.schillstrom@ericsson.com>

[-- Attachment #1: Type: text/plain, Size: 9477 bytes --]

Hi Hans,

[...]
> > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > 
> > > Yes why not, I can give it a try.
> > > 
> 
> I think we wait with this one..

I see. This is easy to add for the conntrack side, but it will require
some extra code for the packet-based solution.

Not directly related to this but, I know that your intention is to
make this as flexible as possible. However, I still don't find how I
would use the port mask feature in any of my setups.  Basically, I
don't come up with any useful example for this situation.

I'm also telling this because I think that ICMP support will be
easier to add if port masking is removed.

[...]
> This is what I have done.
> 
> - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
>   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
>   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
>   (it's not set in the rtuple)

Good one, this made the code even smaller.

> - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.

Not really, you don't need for the conntrack part. The original tuple
is always the same, not matter where the packet is coming from. I have
removed this again so it only affects packet-based hashing.

> - Moved the L3 check a little bit earlier.

good.

> - changed return values for fragments.

With this, you're giving up on trying to classify fragments. Do you
really want this?

>From my point of view, if your firewalls (assuming they are the HMARK
classification) are stateless, it still makes sense to me to classify
fragments using the XT_HMARK_METHOD_L3_4.

> - Added nhoffs to: hmark_set_tuple_ports(skb, (ip->ihl * 4) + nhoff, t, info);
>   to get icmp working

good catch.

Below, some minor changes that I made to your patch (you can find a
new version enclosed to this email).

[...]
> +#ifndef XT_HMARK_H_
> +#define XT_HMARK_H_
> +
> +#include <linux/types.h>
> +
> +enum {
> +	XT_HMARK_NONE,
> +	XT_HMARK_SADR_AND,
> +	XT_HMARK_DADR_AND,
> +	XT_HMARK_SPI_AND,
> +	XT_HMARK_SPI_OR,
> +	XT_HMARK_SPORT_AND,
> +	XT_HMARK_DPORT_AND,
> +	XT_HMARK_SPORT_OR,
> +	XT_HMARK_DPORT_OR,
> +	XT_HMARK_PROTO_AND,
> +	XT_HMARK_RND,
> +	XT_HMARK_MODULUS,
> +	XT_HMARK_OFFSET,
> +	XT_HMARK_CT,
> +	XT_HMARK_METHOD_L3,
> +	XT_HMARK_METHOD_L3_4,
> +	XT_F_HMARK_SADR_AND    = 1 << XT_HMARK_SADR_AND,
> +	XT_F_HMARK_DADR_AND    = 1 << XT_HMARK_DADR_AND,
> +	XT_F_HMARK_SPI_AND     = 1 << XT_HMARK_SPI_AND,
> +	XT_F_HMARK_SPI_OR      = 1 << XT_HMARK_SPI_OR,
> +	XT_F_HMARK_SPORT_AND   = 1 << XT_HMARK_SPORT_AND,
> +	XT_F_HMARK_DPORT_AND   = 1 << XT_HMARK_DPORT_AND,
> +	XT_F_HMARK_SPORT_OR    = 1 << XT_HMARK_SPORT_OR,
> +	XT_F_HMARK_DPORT_OR    = 1 << XT_HMARK_DPORT_OR,
> +	XT_F_HMARK_PROTO_AND   = 1 << XT_HMARK_PROTO_AND,
> +	XT_F_HMARK_RND         = 1 << XT_HMARK_RND,
> +	XT_F_HMARK_MODULUS     = 1 << XT_HMARK_MODULUS,
> +	XT_F_HMARK_OFFSET      = 1 << XT_HMARK_OFFSET,
> +	XT_F_HMARK_CT          = 1 << XT_HMARK_CT,
> +	XT_F_HMARK_METHOD_L3   = 1 << XT_HMARK_METHOD_L3,
> +	XT_F_HMARK_METHOD_L3_4 = 1 << XT_HMARK_METHOD_L3_4,

I've defined:

#define XT_HMARK_FLAG(flag) (1 << flag)

So we save all those extra _F_ defintions, they look redundant.

[...]
> diff --git a/net/netfilter/xt_HMARK.c b/net/netfilter/xt_HMARK.c
> new file mode 100644
> index 0000000..76a3fa7
> --- /dev/null
> +++ b/net/netfilter/xt_HMARK.c
> +/*
> + * xt_HMARK - Netfilter module to set mark as hash value
> + *
> + * (C) 2012 by Hans Schillstrom <hans.schillstrom@ericsson.com>
> + * (C) 2012 by Pablo Neira Ayuso <pablo@netfilter.org>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License version 2 as published by
> + * the Free Software Foundation.
> + *
> + * Description:
> + *
> + * This module calculates a hash value that can be modified by modulus and an
> + * offset, i.e. it is possible to produce a skb->mark within a range The hash
> + * value is based on a direction independent five tuple: src & dst addr src &
> + * dst ports and protocol.
> + */
> +
> +#include <linux/module.h>
> +#include <linux/skbuff.h>
> +#include <linux/icmp.h>
> +
> +#include <linux/netfilter/x_tables.h>
> +#include <linux/netfilter/xt_HMARK.h>
> +
> +#include <net/ip.h>
> +#if IS_ENABLED(CONFIG_NF_CONNTRACK)
> +#include <net/netfilter/nf_conntrack.h>
> +#endif
> +#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
> +#include <net/ipv6.h>
> +#include <linux/netfilter_ipv6/ip6_tables.h>
> +#endif
> +
> +

I removed this extra blank line above.

> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Hans Schillstrom <hans.schillstrom@ericsson.com>");
> +MODULE_DESCRIPTION("Xtables: packet marking using hash calculation");
> +MODULE_ALIAS("ipt_HMARK");
> +MODULE_ALIAS("ip6t_HMARK");
> +
> +struct hmark_tuple {
> +	u32			src;
> +	u32			dst;
> +	union hmark_ports	uports;
> +	uint8_t			proto;
> +};
> +
> +static int
> +hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
> +			 const struct xt_hmark_info *info);
> +static inline u32
> +hmark_hash(struct hmark_tuple *t, const struct xt_hmark_info *info)
> +{
> +	u32 hash;
> +
> +	if (t->dst < t->src)
> +		swap(t->src, t->dst);
> +
> +	hash = jhash_3words(t->src, t->dst, t->uports.v32, info->hashrnd);
> +	hash = hash ^ (t->proto & info->proto_mask);
> +
> +	return (hash % info->hmodulus) + info->hoffset;
> +}
> +
> +static void
> +hmark_set_tuple_ports(const struct sk_buff *skb, unsigned int nhoff,
> +		      struct hmark_tuple *t, const struct xt_hmark_info *info)
> +{
> +	int protoff;
> +
> +	protoff = proto_ports_offset(t->proto);
> +	if (protoff < 0)
> +		return;
> +
> +	nhoff += protoff;
> +	if (skb_copy_bits(skb, nhoff, &t->uports, sizeof(t->uports)) < 0)
> +		return;
> +
> +	if (t->proto == IPPROTO_ESP || t->proto == IPPROTO_AH)
> +		t->uports.v32 = (t->uports.v32 & info->spi_mask) |
> +				info->spi_set;
> +	else {
> +		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
> +				info->port_set.v32;
> +
> +		if (t->uports.p16.dst < t->uports.p16.src)
> +			swap(t->uports.p16.dst, t->uports.p16.src);
> +	}
> +}
> +
> +#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
> +static int get_inner6_hdr(const struct sk_buff *skb, int *offset)
> +{
> +	struct icmp6hdr *icmp6h, _ih6;
> +
> +	icmp6h = skb_header_pointer(skb, *offset, sizeof(_ih6), &_ih6);
> +	if (icmp6h == NULL)
> +		return 0;
> +
> +	if (icmp6h->icmp6_type && icmp6h->icmp6_type < 128) {
> +		*offset += sizeof(struct icmp6hdr);
> +		return 1;
> +	}
> +	return 0;
> +}
> +
> +static inline u32 hmark_addr6_mask(const __u32 *addr32, const __u32 *mask)
> +{
> +	return  (addr32[0] & mask[0]) ^
> +		(addr32[1] & mask[1]) ^
> +		(addr32[2] & mask[2]) ^
> +		(addr32[3] & mask[3]);
> +}
> +
> +static int
> +hmark_pkt_set_htuple_ipv6(const struct sk_buff *skb, struct hmark_tuple *t,
> +			  const struct xt_hmark_info *info)
> +{
> +	struct ipv6hdr *ip6, _ip6;
> +	int flag = IP6T_FH_F_AUTH; /* Ports offset, find_hdr flags */
> +	unsigned int nhoff = 0;
> +	u16 fragoff = 0;
> +	int nexthdr;
> +
> +	ip6 = (struct ipv6hdr *) (skb->data + skb_network_offset(skb));
> +	nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
> +	if (nexthdr < 0)
> +		return 0;
> +	/* No need to check for icmp errors on fragments */
> +	if ((flag & IP6T_FH_F_FRAG) || (nexthdr != IPPROTO_ICMPV6))
> +		goto noicmp;
> +	/* if an icmp error, use the inner header */
> +	if (get_inner6_hdr(skb, &nhoff)) {
> +		ip6 = skb_header_pointer(skb, nhoff, sizeof(_ip6), &_ip6);
> +		if (ip6 == NULL)
> +			return -1;
> +		/* Treat AH as ESP, use SPI nothing else. */
> +		flag = IP6T_FH_F_AUTH;
> +		nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
> +		if (nexthdr < 0)
> +			return -1;
> +	}
> +noicmp:
> +	t->src = hmark_addr6_mask(ip6->saddr.s6_addr32, info->src_mask.all);
> +	t->dst = hmark_addr6_mask(ip6->daddr.s6_addr32, info->dst_mask.all);
> +
> +	if (info->flags & XT_F_HMARK_METHOD_L3)
> +		return 0;
> +
> +	t->proto = nexthdr;
> +
> +	if (t->proto == IPPROTO_ICMPV6)
> +		return 0;
> +
> +	if (flag & IP6T_FH_F_FRAG)
> +		return -1;
> +
> +	hmark_set_tuple_ports(skb, nhoff, t, info);
> +
> +	return 0;
> +}
> +
> +static unsigned int
> +hmark_tg_v6(struct sk_buff *skb, const struct xt_action_param *par)
> +{
> +	const struct xt_hmark_info *info = par->targinfo;
> +	struct hmark_tuple t;
> +
> +	memset(&t, 0, sizeof(struct hmark_tuple));
> +
> +	if (info->flags & XT_F_HMARK_CT) {
> +		if (hmark_ct_set_htuple(skb, &t, info) < 0)
> +			return XT_CONTINUE;
> +	} else {
> +		if (hmark_pkt_set_htuple_ipv6(skb, &t, info) < 0)
> +			return XT_CONTINUE;
> +	}
> +
> +	skb->mark = hmark_hash(&t, info);
> +	return XT_CONTINUE;
> +}
> +
> +static inline u32
> +hmark_addr_any_mask(int l3num, const __u32 *addr32, const __u32 *mask)
> +{
> +	if (l3num == AF_INET)
> +		return *addr32 & *mask;
> +
> +	return hmark_addr6_mask(addr32, mask);
> +}
> +#else
> +static inline u32
> +hmark_addr_any_mask(int l3num, const __u32 *addr32, const __u32 *mask)
> +{
> +	return *addr32 & *mask;
> +}
> +
> +#endif

This is ugly. I think you will not find any section of the Netfilter
code with something similar. I have declared this function out of the
#ifdef section, those are static inline, the compiler will put them
out if unused with no further complain.

Please, find a new takeover patch enclosed.

[-- Attachment #2: 0001-netfilter-add-xt_hmark-target-for-hash-based-skb-mar.patch --]
[-- Type: text/x-diff, Size: 13655 bytes --]

>From d5065af3988cc7561a02f30bae8342e1a89126a4 Mon Sep 17 00:00:00 2001
From: Hans Schillstrom <hans.schillstrom@ericsson.com>
Date: Wed, 2 May 2012 07:49:47 +0000
Subject: netfilter: add xt_hmark target for hash-based skb
 marking

The target allows you to create rules in the "raw" and "mangle" tables
which set the skbuff mark by means of hash calculation within a given
range. The nfmark can influence the routing method (see "Use netfilter
MARK value as routing key") and can also be used by other subsystems to
change their behaviour.

Some examples:

* Default rule handles all TCP, UDP, SCTP, ESP & AH

 iptables -t mangle -A PREROUTING -m state --state NEW,ESTABLISHED,RELATED \
	-j HMARK --hmark-offset 10000 --hmark-mod 10

* Handle SCTP and hash dest port only and produce a nfmark between 100-119.

 iptables -t mangle -A PREROUTING -p SCTP -j HMARK --src-mask 0 --dst-mask 0 \
	--sp-mask 0 --offset 100 --mod 20

* Fragment safe Layer 3 only, that keep a class C network flow together

 iptables -t mangle -A PREROUTING -j HMARK --method L3 \
	--src-mask 24 --mod 20 --offset 100

[ A big part of this patch has been refactorized by Pablo Neira Ayuso ]

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
 include/linux/netfilter/xt_HMARK.h |   48 +++++
 net/netfilter/Kconfig              |   15 ++
 net/netfilter/Makefile             |    1 +
 net/netfilter/xt_HMARK.c           |  358 ++++++++++++++++++++++++++++++++++++
 4 files changed, 422 insertions(+)
 create mode 100644 include/linux/netfilter/xt_HMARK.h
 create mode 100644 net/netfilter/xt_HMARK.c

diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
new file mode 100644
index 0000000..05e43ba
--- /dev/null
+++ b/include/linux/netfilter/xt_HMARK.h
@@ -0,0 +1,48 @@
+#ifndef XT_HMARK_H_
+#define XT_HMARK_H_
+
+#include <linux/types.h>
+
+enum {
+	XT_HMARK_NONE,
+	XT_HMARK_SADR_AND,
+	XT_HMARK_DADR_AND,
+	XT_HMARK_SPI_AND,
+	XT_HMARK_SPI_OR,
+	XT_HMARK_SPORT_AND,
+	XT_HMARK_DPORT_AND,
+	XT_HMARK_SPORT_OR,
+	XT_HMARK_DPORT_OR,
+	XT_HMARK_PROTO_AND,
+	XT_HMARK_RND,
+	XT_HMARK_MODULUS,
+	XT_HMARK_OFFSET,
+	XT_HMARK_CT,
+	XT_HMARK_METHOD_L3,
+	XT_HMARK_METHOD_L3_4,
+};
+#define XT_HMARK_FLAG(flag)	(1 << flag)
+
+union hmark_ports {
+	struct {
+		__u16	src;
+		__u16	dst;
+	} p16;
+	__u32	v32;
+};
+
+struct xt_hmark_info {
+	union nf_inet_addr	src_mask;	/* Source address mask */
+	union nf_inet_addr	dst_mask;	/* Dest address mask */
+	union hmark_ports	port_mask;
+	union hmark_ports	port_set;
+	__u32			spi_mask;
+	__u32			spi_set;
+	__u32			flags;		/* Print out only */
+	__u16			proto_mask;	/* L4 Proto mask */
+	__u32			hashrnd;
+	__u32			hmodulus;	/* Modulus */
+	__u32			hoffset;	/* Offset */
+};
+
+#endif /* XT_HMARK_H_ */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 0c6f67e..209c1ed 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -509,6 +509,21 @@ config NETFILTER_XT_TARGET_HL
 	since you can easily create immortal packets that loop
 	forever on the network.
 
+config NETFILTER_XT_TARGET_HMARK
+	tristate '"HMARK" target support'
+	depends on (IP6_NF_IPTABLES || IP6_NF_IPTABLES=n)
+	depends on NETFILTER_ADVANCED
+	---help---
+	This option adds the "HMARK" target.
+
+	The target allows you to create rules in the "raw" and "mangle" tables
+	which set the skbuff mark by means of hash calculation within a given
+	range. The nfmark can influence the routing method (see "Use netfilter
+	MARK value as routing key") and can also be used by other subsystems to
+	change their behaviour.
+
+	To compile it as a module, choose M here. If unsure, say N.
+
 config NETFILTER_XT_TARGET_IDLETIMER
 	tristate  "IDLETIMER target support"
 	depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index ca36765..4e7960c 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_HL) += xt_HL.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_HMARK) += xt_HMARK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_LED) += xt_LED.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_LOG) += xt_LOG.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NFLOG) += xt_NFLOG.o
diff --git a/net/netfilter/xt_HMARK.c b/net/netfilter/xt_HMARK.c
new file mode 100644
index 0000000..b4aa912
--- /dev/null
+++ b/net/netfilter/xt_HMARK.c
@@ -0,0 +1,358 @@
+/*
+ * xt_HMARK - Netfilter module to set mark by means of hashing
+ *
+ * (C) 2012 by Hans Schillstrom <hans.schillstrom@ericsson.com>
+ * (C) 2012 by Pablo Neira Ayuso <pablo@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/icmp.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_HMARK.h>
+
+#include <net/ip.h>
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+#include <net/netfilter/nf_conntrack.h>
+#endif
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+#include <net/ipv6.h>
+#include <linux/netfilter_ipv6/ip6_tables.h>
+#endif
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Hans Schillstrom <hans.schillstrom@ericsson.com>");
+MODULE_DESCRIPTION("Xtables: packet marking using hash calculation");
+MODULE_ALIAS("ipt_HMARK");
+MODULE_ALIAS("ip6t_HMARK");
+
+struct hmark_tuple {
+	u32			src;
+	u32			dst;
+	union hmark_ports	uports;
+	uint8_t			proto;
+};
+
+static inline u32 hmark_addr6_mask(const __u32 *addr32, const __u32 *mask)
+{
+	return (addr32[0] & mask[0]) ^
+	       (addr32[1] & mask[1]) ^
+	       (addr32[2] & mask[2]) ^
+	       (addr32[3] & mask[3]);
+}
+
+static inline u32
+hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
+{
+	switch(l3num) {
+	case AF_INET:
+		return *addr32 & *mask;
+	case AF_INET6:
+		return hmark_addr6_mask(addr32, mask);
+	}
+	return 0;
+}
+
+static int
+hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
+		    const struct xt_hmark_info *info)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
+	struct nf_conntrack_tuple *otuple;
+	struct nf_conntrack_tuple *rtuple;
+
+	if (ct == NULL || nf_ct_is_untracked(ct))
+		return -1;
+
+	otuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
+	rtuple = &ct->tuplehash[IP_CT_DIR_REPLY].tuple;
+
+	t->src = hmark_addr_mask(otuple->src.l3num, otuple->src.u3.all,
+				 info->src_mask.all);
+	t->dst = hmark_addr_mask(otuple->src.l3num, rtuple->src.u3.all,
+				 info->dst_mask.all);
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = nf_ct_protonum(ct);
+	if (t->proto != IPPROTO_ICMP) {
+		t->uports.p16.src = otuple->src.u.all;
+		t->uports.p16.dst = rtuple->src.u.all;
+		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
+				info->port_set.v32;
+	}
+
+	return 0;
+#else
+	return -1;
+#endif
+}
+
+static inline u32
+hmark_hash(struct hmark_tuple *t, const struct xt_hmark_info *info)
+{
+	u32 hash;
+
+	hash = jhash_3words(t->src, t->dst, t->uports.v32, info->hashrnd);
+	hash = hash ^ (t->proto & info->proto_mask);
+
+	return (hash % info->hmodulus) + info->hoffset;
+}
+
+static void
+hmark_set_tuple_ports(const struct sk_buff *skb, unsigned int nhoff,
+		      struct hmark_tuple *t, const struct xt_hmark_info *info)
+{
+	int protoff;
+
+	protoff = proto_ports_offset(t->proto);
+	if (protoff < 0)
+		return;
+
+	nhoff += protoff;
+	if (skb_copy_bits(skb, nhoff, &t->uports, sizeof(t->uports)) < 0)
+		return;
+
+	if (t->proto == IPPROTO_ESP || t->proto == IPPROTO_AH)
+		t->uports.v32 = (t->uports.v32 & info->spi_mask) |
+				info->spi_set;
+	else {
+		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
+				info->port_set.v32;
+
+		if (t->uports.p16.dst < t->uports.p16.src)
+			swap(t->uports.p16.dst, t->uports.p16.src);
+	}
+}
+
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+static int get_inner6_hdr(const struct sk_buff *skb, int *offset)
+{
+	struct icmp6hdr *icmp6h, _ih6;
+
+	icmp6h = skb_header_pointer(skb, *offset, sizeof(_ih6), &_ih6);
+	if (icmp6h == NULL)
+		return 0;
+
+	if (icmp6h->icmp6_type && icmp6h->icmp6_type < 128) {
+		*offset += sizeof(struct icmp6hdr);
+		return 1;
+	}
+	return 0;
+}
+
+static int
+hmark_pkt_set_htuple_ipv6(const struct sk_buff *skb, struct hmark_tuple *t,
+			  const struct xt_hmark_info *info)
+{
+	struct ipv6hdr *ip6, _ip6;
+	int flag = IP6T_FH_F_AUTH; /* Ports offset, find_hdr flags */
+	unsigned int nhoff = 0;
+	u16 fragoff = 0;
+	int nexthdr;
+
+	ip6 = (struct ipv6hdr *) (skb->data + skb_network_offset(skb));
+	nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
+	if (nexthdr < 0)
+		return 0;
+	/* No need to check for icmp errors on fragments */
+	if ((flag & IP6T_FH_F_FRAG) || (nexthdr != IPPROTO_ICMPV6))
+		goto noicmp;
+	/* if an icmp error, use the inner header */
+	if (get_inner6_hdr(skb, &nhoff)) {
+		ip6 = skb_header_pointer(skb, nhoff, sizeof(_ip6), &_ip6);
+		if (ip6 == NULL)
+			return -1;
+		/* Treat AH as ESP, use SPI nothing else. */
+		flag = IP6T_FH_F_AUTH;
+		nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
+		if (nexthdr < 0)
+			return -1;
+	}
+noicmp:
+	t->src = hmark_addr6_mask(ip6->saddr.s6_addr32, info->src_mask.all);
+	t->dst = hmark_addr6_mask(ip6->daddr.s6_addr32, info->dst_mask.all);
+
+	if (t->dst < t->src)
+		swap(t->src, t->dst);
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = nexthdr;
+
+	if (t->proto == IPPROTO_ICMPV6)
+		return 0;
+
+	if (flag & IP6T_FH_F_FRAG)
+		return -1;
+
+	hmark_set_tuple_ports(skb, nhoff, t, info);
+
+	return 0;
+}
+
+static unsigned int
+hmark_tg_v6(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+	struct hmark_tuple t;
+
+	memset(&t, 0, sizeof(struct hmark_tuple));
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT)) {
+		if (hmark_ct_set_htuple(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	} else {
+		if (hmark_pkt_set_htuple_ipv6(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	}
+
+	skb->mark = hmark_hash(&t, info);
+	return XT_CONTINUE;
+}
+#endif
+
+static int get_inner_hdr(const struct sk_buff *skb, int iphsz, int *nhoff)
+{
+	const struct icmphdr *icmph;
+	struct icmphdr _ih;
+
+	/* Not enough header? */
+	icmph = skb_header_pointer(skb, *nhoff + iphsz, sizeof(_ih), &_ih);
+	if (icmph == NULL && icmph->type > NR_ICMP_TYPES)
+		return 0;
+
+	/* Error message? */
+	if (icmph->type != ICMP_DEST_UNREACH &&
+	    icmph->type != ICMP_SOURCE_QUENCH &&
+	    icmph->type != ICMP_TIME_EXCEEDED &&
+	    icmph->type != ICMP_PARAMETERPROB &&
+	    icmph->type != ICMP_REDIRECT)
+		return 0;
+
+	*nhoff += iphsz + sizeof(_ih);
+	return 1;
+}
+
+static int
+hmark_pkt_set_htuple_ipv4(const struct sk_buff *skb, struct hmark_tuple *t,
+			  const struct xt_hmark_info *info)
+{
+	struct iphdr *ip, _ip;
+	int nhoff = skb_network_offset(skb);
+
+	ip = (struct iphdr *) (skb->data + nhoff);
+	if (ip->protocol == IPPROTO_ICMP) {
+		/* use inner header in case of ICMP errors */
+		if (get_inner_hdr(skb, ip->ihl * 4, &nhoff)) {
+			ip = skb_header_pointer(skb, nhoff, sizeof(_ip), &_ip);
+			if (ip == NULL)
+				return -1;
+		}
+	}
+
+	t->src = (__force u32) ip->saddr;
+	t->dst = (__force u32) ip->daddr;
+
+	t->src &= info->src_mask.ip;
+	t->dst &= info->dst_mask.ip;
+
+	if (t->dst < t->src)
+		swap(t->src, t->dst);
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = ip->protocol;
+
+	/* ICMP has no ports, skip */
+	if (t->proto == IPPROTO_ICMP)
+		return 0;
+
+	/* follow-up fragments don't contain ports, skip */
+	if (ip->frag_off & htons(IP_MF | IP_OFFSET))
+		return -1;
+
+	hmark_set_tuple_ports(skb, (ip->ihl * 4) + nhoff, t, info);
+
+	return 0;
+}
+
+static unsigned int
+hmark_tg_v4(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+	struct hmark_tuple t;
+
+	memset(&t, 0, sizeof(struct hmark_tuple));
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT)) {
+		if (hmark_ct_set_htuple(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	} else {
+		if (hmark_pkt_set_htuple_ipv4(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	}
+
+	skb->mark = hmark_hash(&t, info);
+	return XT_CONTINUE;
+}
+
+static int hmark_tg_check(const struct xt_tgchk_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+
+	if (!info->hmodulus) {
+		pr_info("xt_HMARK: hash modulus can't be zero\n");
+		return -EINVAL;
+	}
+	if (info->proto_mask &&
+	    (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))) {
+		pr_info("xt_HMARK: proto mask must be zero with L3 mode\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static struct xt_target hmark_tg_reg[] __read_mostly = {
+	{
+		.name		= "HMARK",
+		.family		= NFPROTO_IPV4,
+		.target		= hmark_tg_v4,
+		.targetsize	= sizeof(struct xt_hmark_info),
+		.checkentry	= hmark_tg_check,
+		.me		= THIS_MODULE,
+	},
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+	{
+		.name		= "HMARK",
+		.family		= NFPROTO_IPV6,
+		.target		= hmark_tg_v6,
+		.targetsize	= sizeof(struct xt_hmark_info),
+		.checkentry	= hmark_tg_check,
+		.me		= THIS_MODULE,
+	},
+#endif
+};
+
+static int __init hmark_tg_init(void)
+{
+	return xt_register_targets(hmark_tg_reg, ARRAY_SIZE(hmark_tg_reg));
+}
+
+static void __exit hmark_tg_exit(void)
+{
+	xt_unregister_targets(hmark_tg_reg, ARRAY_SIZE(hmark_tg_reg));
+}
+
+module_init(hmark_tg_init);
+module_exit(hmark_tg_exit);
-- 
1.7.9.5


^ permalink raw reply related

* [PATCH RESEND 0/5] Adopt pinctrl support for a few outstanding imx drivers
From: Shawn Guo @ 2012-05-07  0:53 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Olof Johansson, Sascha Hauer, Dong Aisheng,
	Shawn Guo, spi-devel-general, Grant Likely, linux-i2c,
	Wolfram Sang, linux-can, Marc Kleine-Budde, netdev,
	David S. Miller, linux-serial, Greg Kroah-Hartman

With patch 5b3aa5f (pinctrl: add pinctrl_provide_dummies interface for
platforms to use) applied on pinctrl tree, and patch "ARM: imx: enable
pinctrl dummy states" [1] being there, we are ready to adopt pinctrl
API for imx drivers.  So let's start from a few outstanding ones.

I would expect to ask Arnd and Olof to pull pinctrl tree into arm-soc
as a dependency and then have series [1] and this patch set go through
arm-soc tree to ease the merge process.

Resend to have subsystem lists Cc-ed.

Regards,
Shawn

[1] http://thread.gmane.org/gmane.linux.kernel.mmc/14180

Shawn Guo (5):
  tty: serial: imx: adopt pinctrl support
  net: fec: adopt pinctrl support
  can: flexcan: adopt pinctrl support
  i2c: imx: adopt pinctrl support
  spi/imx: adopt pinctrl support

 drivers/i2c/busses/i2c-imx.c         |    8 ++++++++
 drivers/net/can/flexcan.c            |    6 ++++++
 drivers/net/ethernet/freescale/fec.c |    9 +++++++++
 drivers/spi/spi-imx.c                |    8 ++++++++
 drivers/tty/serial/imx.c             |    8 ++++++++
 5 files changed, 39 insertions(+), 0 deletions(-)

-- 
1.7.5.4


^ permalink raw reply

* [PATCH RESEND 2/5] net: fec: adopt pinctrl support
From: Shawn Guo @ 2012-05-07  0:53 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Arnd Bergmann, Olof Johansson, Sascha Hauer, Dong Aisheng,
	Shawn Guo, netdev, David S. Miller
In-Reply-To: <1336352040-28447-1-git-send-email-shawn.guo@linaro.org>

Cc: netdev@vger.kernel.org
Cc: David S. Miller <davem@davemloft.net>
Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
---
 drivers/net/ethernet/freescale/fec.c |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/freescale/fec.c b/drivers/net/ethernet/freescale/fec.c
index a12b3f5..500c106 100644
--- a/drivers/net/ethernet/freescale/fec.c
+++ b/drivers/net/ethernet/freescale/fec.c
@@ -48,6 +48,7 @@
 #include <linux/of_device.h>
 #include <linux/of_gpio.h>
 #include <linux/of_net.h>
+#include <linux/pinctrl/consumer.h>
 
 #include <asm/cacheflush.h>
 
@@ -1542,6 +1543,7 @@ fec_probe(struct platform_device *pdev)
 	struct resource *r;
 	const struct of_device_id *of_id;
 	static int dev_id;
+	struct pinctrl *pinctrl;
 
 	of_id = of_match_device(fec_dt_ids, &pdev->dev);
 	if (of_id)
@@ -1609,6 +1611,12 @@ fec_probe(struct platform_device *pdev)
 		}
 	}
 
+	pinctrl = devm_pinctrl_get_select_default(&pdev->dev);
+	if (IS_ERR(pinctrl)) {
+		ret = PTR_ERR(pinctrl);
+		goto failed_pin;
+	}
+
 	fep->clk = clk_get(&pdev->dev, NULL);
 	if (IS_ERR(fep->clk)) {
 		ret = PTR_ERR(fep->clk);
@@ -1639,6 +1647,7 @@ failed_mii_init:
 failed_init:
 	clk_disable_unprepare(fep->clk);
 	clk_put(fep->clk);
+failed_pin:
 failed_clk:
 	for (i = 0; i < FEC_IRQ_NUM; i++) {
 		irq = platform_get_irq(pdev, i);
-- 
1.7.5.4

^ permalink raw reply related

* Re: [PATCH] 9p: disconnect channel when PCI device is removed
From: Rusty Russell @ 2012-05-07  3:15 UTC (permalink / raw)
  To: Sasha Levin, davem, ericvh, aneesh.kumar, jvrao
  Cc: netdev, linux-kernel, davej, Sasha Levin
In-Reply-To: <1334353716-19483-1-git-send-email-levinsasha928@gmail.com>

On Fri, 13 Apr 2012 17:48:36 -0400, Sasha Levin <levinsasha928@gmail.com> wrote:
> When a virtio_9p pci device is being removed, we should close down any
> active channels and free up resources, we're not supposed to BUG() if there's
> still an open channel since it's a valid case when removing the PCI device.
> 
> Otherwise, removing the PCI device with an open channel would cause the
> following BUG():

(Damn changed notmuch.el bindings!  Previous reply went only to Sasha).

Applied thanks,
Rusty.

^ permalink raw reply

* Re: [net-next 1/4 (V3)] net: ethtool: add the EEE support
From: Giuseppe CAVALLARO @ 2012-05-07  5:25 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Yuval Mintz, netdev, davem
In-Reply-To: <1335736615.2424.42.camel@bwh-desktop.uk.solarflarecom.com>

Hello Ben

On 4/29/2012 11:56 PM, Ben Hutchings wrote:
> On Sun, 2012-04-29 at 12:20 +0300, Yuval Mintz wrote:
>> On 04/27/2012 05:11 PM, Giuseppe CAVALLARO wrote:
>>
>>> On 4/26/2012 7:17 PM, Ben Hutchings wrote:
>>>> On Thu, 2012-04-26 at 09:48 +0200, Giuseppe CAVALLARO wrote:
>>>>> Hello Ben
>>>>>
>>>>> On 4/19/2012 5:30 PM, Ben Hutchings wrote:
>>>>> [snip]
>>>>>>> I'm changing the code for getting/setting the EEE capability and trying
>>>>>>> to follow your suggestions.
>>>>>>>
>>>>>>> The "get" will show the following things; this is a bit different of the
>>>>>>> points "a" "b" and "c" we had discussed. Maybe, this could also be a
>>>>>>> more complete (*) .
>>>>>>> The ethtool (see output below as example) could report the phy
>>>>>>> (supported/advertised/lp_advertised) and mac eee capabilities separately.
>>>>>> Sounds reasonable.
>>>>>>
>>>>>>> The "set" will be useful for some eth devices (like the stmmac) that can
>>>>>>> stop/enable internally the eee capability (at mac level).
>>>>>> I don't know much about EEE, but shouldn't the driver take care of
>>>>>> configuring the MAC for this whenever the PHY is set to advertise EEE
>>>>>> capability?
>>>>> Yes indeed this can be done at driver level. So could I definitely
>>>>> remove it from ethtool? What do you suggest?
>>>>>
>>>>> In case of the stmmac I could add a specific driver option via sys to
>>>>> enable/disable the eee and set timer.
>>>> Generally, ethtool doesn't distinguish MAC and PHY settings because they
>>>> have to be configured consistently for the device to do anything useful.
>>>> If there is some good use for enabling EEE in the MAC and not the PHY,
>>>> or vice versa, then this should be exposed in the ethtool interface.
>>>> But if not then I don't believe it needs to be in either an ethtool or a
>>>> driver-specific interface.
>>> Thanks Ben for this clarification: in case of the stmmac the option is
>>> useful to stop a timer to enter in lpi state for the tx.
>>> So it's worth having that and from ethtool.
> 
> I think I finally get it.  If we negotiate a 100BASE-TX link (or one of
> the various backplane modes) with EEE enabled, we allow the link partner
> to assert LPI but we might still not want to assert it in the transmit
> direction.  Right?  (Whereas for 1000BASE-T and 10GBASE-T this would be
> useless, since both sides must assert LPI before any transition can
> happen.)
> 
>> How will a user turn off EEE support using this implementation?
> 
> At the ethtool API level this would be done by clearing the EEE
> advertising mask.  At the command-line level there could be a shortcut
> for this, just as you can use 'autoneg on' and 'autoneg off' rather than
> specifying a mask of link modes.
> 
>> Are you suggesting a "set" that works similarly to the control of the pause
>> parameters - that is, a user could either shutdown EEE or only Tx, which
>> will mean to the driver "don't enter Tx LPI mode"?
>>
>> Keep in mind that if later an interface controlling the LPI timers would be
>> added (as a measure of user control to the power saving vs. latency issue),
>> it could make this 'partial' closure interface redundant.
>>
>> Perhaps "set" should only turn the EEE feature on/off entirely (adv. them or
>> not, since clearly the link will have to be re-established afterwards), and
>> we should have a different function that prevents entry into LPI mode in Tx
>> - one whose functionality could later on be extended.
> 
> It sounds like this might as well be included, even if not all
> drivers/hardware would allow the values to be changed.  So the command
> structure would have at least:
> 
> 1. EEE link mode supported flags (get-only)
> 2. EEE link mode advertising flags (get/set)
> 3. Ditto for link partner (get-only)
> 4. TX LPI enable flag (get/set)
> 5. TX LPI timer values (get/set but driver may reject changes)

Ok I'll try to rework all following the points above. Just a note for
the timer and point 5 below.

> But if it's not yet clear exactly what timer parameters will be useful,
> we could leave some reserved space and then later define them along with
> flags to indicate whether the driver understands them.

I can use and test the LPI timer parameters that I intends, in case of
the stmmac d.d., the values added in a mac core register. These two timers:
1) specify the minimum time for which the link-status from the PHY
   should be up. The default value 1 sec as defined in the IEEE
   standard.
2) specify the minimum time for which the MAC waits after it has
   stopped transmitting the LPI pattern to the PHY

Peppe

> 
> Ben.
> 

^ permalink raw reply

* Re: ipctl - new tool for efficient read/write of net related sysctl
From: Thomas Graf @ 2012-05-07  6:14 UTC (permalink / raw)
  To: Oskar Berggren; +Cc: Stephen Hemminger, netdev
In-Reply-To: <CAHOuc7PjmcF=EhEUDEqfF_RShezxDqzc+53frNwWqhww8YQ+mA@mail.gmail.com>

On Sun, May 06, 2012 at 02:46:01PM +0200, Oskar Berggren wrote:
> 2012/5/6 Stephen Hemminger <stephen.hemminger@vyatta.com>:
> >
> >>
> >> In a project of mine I need to read (and possibly set) many of the
> >> properties
> >> found under /proc/sys/net/ipv4/conf/. This is simple enough, except
> >> that
> >> when you have hundreds of interfaces, it is really slow. In my tests
> >> it takes
> >> about 4 seconds to read a single variable for 700 interfaces. For a
> >> while I
> >> worked around this using the binary sysctl() interface, but this is
> >> deprecated.
> >>
> >
> > What about exposing these as NETLINK attributes? That would be faster
> > and you could do bulk updates.
> 
> 
> This is my first attempt at using NETLINK, so could you please elaborate?
> Below is the generic netlink interface I implemented so far. Any pointers
> on how I should do this differently?

What Stephen means is to use the existing message types RTM_SETLINK
and RTM_GETLINK in the NETLINK_ROUTE family.

This is already partially implemented. See the IFLA_AF_SPEC attribute
carrying IPV4_DEVCONF_ and DEVCONF_ (IPv6). Grep for rtnl_af_register()
and you will find the corresponding implementations.

Feel free to complete these existing interfaces, such as adding write
support to IPv6 or adding support to iproute2 which is currently
lacking.

src/nl-link-list.c in the libnl sources allows you to display the
configurations:

$ src/nl-link-list --details --name virbr0-nic
virbr0-nic ether 52:54:00:cb:da:db master virbr0 <broadcast,multicast> 
    mtu 1500 txqlen 500 weight 0 qdisc noop index 7 
    brd ff:ff:ff:ff:ff:ff state down mode default
    ipv4 devconf:
      forwarding            1  mc_forwarding         0  proxy_arp             0
      accept_redirects      1  secure_redirects      1  send_redirects        1
      shared_media          1  rp_filter             1  accept_source_route   0
      bootp_relay           0  log_martians          0  tag                   0
      arpfilter             0  medium_id             0  noxfrm                0
      nopolicy              0  force_igmp_version    0  arp_announce          0
      arp_ignore            0  promote_secondaries   0  arp_accept            0
      arp_notify            0  accept_local          0  src_vmark             0
      proxy_arp_pvlan       0  
    ipv6 max-reasm-len 64KiB <>
      create-stamp 13.35s reachable-time 40s 898msec retrans-time 1s
      devconf:
      forwarding            1  hoplimit             64  mtu6               1500
      accept_ra             1  accept_redirects      1  autoconf              1
      dad_transmits         1  rtr_solicits          3  rtr_solicit_interval 4s
      rtr_solicit_delay    1s  use_tempaddr          0  temp_valid_lft       7d
      temp_prefered_lft    1d  regen_max_retry       3  max_desync_factor   600
      max_addresses        16  force_mld_version     0  accept_ra_defrtr      1
      accept_ra_pinfo       1  accept_ra_rtr_pref    1  rtr_probe_interval   1m
      accept_ra_rt_info     0  proxy_ndp             0  optimistic_dad        0
      accept_source_route   0  mc_forwarding         0  disable_ipv6          0
      accept_dad            1  force_tllao           0  

^ permalink raw reply

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Deng-Cheng Zhu @ 2012-05-07  6:48 UTC (permalink / raw)
  To: Tom Herbert; +Cc: davem, netdev, eric.dumazet
In-Reply-To: <CA+mtBx-CErRU=3ewkAjVrGN3dGzjsTz8Q-E8J+Xa+529OVEvwA@mail.gmail.com>

On 05/04/2012 11:31 PM, Tom Herbert wrote:
>> I think the mechanisms of rps_dev_flow_table and cpu_flow (in this
>> patch) are different: The former works along with rps_sock_flow_table
>> whose CPU info is based on recvmsg by the application. But for the tests
>> like what I did, there's no application involved.
>>
> While rps_sock_flow_table is currently only managed by recvmsg, it
> still is the general mechanism that maps flows to CPUs for steering.
> There should be nothing preventing you from populating and managing
> entries in other ways.

Well, even using rps_sock_flow_table to map the sparse flows to CPUs,
we still need a data structure to describe a single flow -- that's what
struct cpu_flow is doing. Besides, rps_sock_flow_table, by its meaning,
does not seem to make sense for our purpose. How about keeping the patch
as is but renaming struct cpu_flow to struct rps_sparse_flow? It's like:

---------------------------------------------
In include/linux/netdevice.h:

struct rps_sparse_flow {
        struct net_device *dev;
        u32 rxhash;
        unsigned long ts;
};

In net/core/dev.c:

static DEFINE_PER_CPU(struct rps_sparse_flow [CONFIG_NR_RPS_MAP_LOOPS],
rps_sparse_flow_table);
---------------------------------------------

The above looks similar to rps_dev_flow/rps_dev_flow_table, and we do
not necessarily go with rps_sock_flow_table.


Thanks,

Deng-Cheng

^ permalink raw reply

* Re: [PATCH RESEND 0/5] Adopt pinctrl support for a few outstanding imx drivers
From: Dong Aisheng @ 2012-05-07  6:50 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-arm-kernel, Arnd Bergmann, netdev, Sascha Hauer,
	Wolfram Sang, linux-can, Grant Likely, Marc Kleine-Budde,
	linux-i2c, linux-serial, Greg Kroah-Hartman, Olof Johansson,
	spi-devel-general, Dong Aisheng, David S. Miller
In-Reply-To: <1336352040-28447-1-git-send-email-shawn.guo@linaro.org>

On Mon, May 07, 2012 at 08:53:55AM +0800, Shawn Guo wrote:
> With patch 5b3aa5f (pinctrl: add pinctrl_provide_dummies interface for
> platforms to use) applied on pinctrl tree, and patch "ARM: imx: enable
> pinctrl dummy states" [1] being there, we are ready to adopt pinctrl
> API for imx drivers.  So let's start from a few outstanding ones.
> 
> I would expect to ask Arnd and Olof to pull pinctrl tree into arm-soc
> as a dependency and then have series [1] and this patch set go through
> arm-soc tree to ease the merge process.
> 
Shouldn't we add the pinctrl states in dts file at the same time
with this patch series or using another separate patch to add them
before this series to avoid breaking the exist mx6q platforms?

> Resend to have subsystem lists Cc-ed.
> 
> Regards,
> Shawn
> 
> [1] http://thread.gmane.org/gmane.linux.kernel.mmc/14180
> 
> Shawn Guo (5):
>   tty: serial: imx: adopt pinctrl support
...

>   net: fec: adopt pinctrl support
>   can: flexcan: adopt pinctrl support
This two also depends on another patch you sent.
[PATCH RESEND 1/9] ARM: mxs: enable pinctrl dummy states
http://www.spinics.net/lists/arm-kernel/msg173341.html

Maybe you can put this two in the mxs convert series to avoid breaking
mxs platforms.
[PATCH 0/9] Enable pinctrl support for mach-mxs
http://www.spinics.net/lists/arm-kernel/msg173312.html

Regards
Dong Aisheng

>   i2c: imx: adopt pinctrl support
>   spi/imx: adopt pinctrl support
> 
>  drivers/i2c/busses/i2c-imx.c         |    8 ++++++++
>  drivers/net/can/flexcan.c            |    6 ++++++
>  drivers/net/ethernet/freescale/fec.c |    9 +++++++++
>  drivers/spi/spi-imx.c                |    8 ++++++++
>  drivers/tty/serial/imx.c             |    8 ++++++++
>  5 files changed, 39 insertions(+), 0 deletions(-)
> 
> -- 
> 1.7.5.4
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
> 


^ permalink raw reply

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Deng-Cheng Zhu @ 2012-05-07  6:51 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, davem, netdev
In-Reply-To: <1336117669.3752.49.camel@edumazet-glaptop>

On 05/04/2012 03:47 PM, Eric Dumazet wrote:
> On Fri, 2012-05-04 at 12:25 +0800, Deng-Cheng Zhu wrote:
>> On 05/04/2012 11:22 AM, Tom Herbert wrote:
>>>> +struct cpu_flow {
>>>> +       struct net_device *dev;
>>>> +       u32 rxhash;
>>>> +       unsigned long ts;
>>>> +};
>>>
>>> This seems like overkill, we already have the rps_flow_table and this
>>> used in accelerated RFS so the device can also take advantage of
>>> steering.
>>
>> I think the mechanisms of rps_dev_flow_table and cpu_flow (in this
>> patch) are different: The former works along with rps_sock_flow_table
>> whose CPU info is based on recvmsg by the application. But for the tests
>> like what I did, there's no application involved.
>>
>>
>> Deng-Cheng
>
> I really suggest you speak with MIPS arch maintainers about these IRQ
> being all serviced by CPU0.

This is not about arch, it's about system design. Even with IRQ affinity
working, for single queue NICs, RPS still has its value in this test.

>
> Adding tweaks in network stack to lower the impact of this huge problem
> is a no go.

This is merely an option, to whom care about sparse flow throughput. And
it's absolutely sort of tradeoff, and not selected by default.


Deng-Cheng

^ permalink raw reply

* Re: [net-next 0/4][pull request] Intel Wired LAN Driver Updates
From: Jeff Kirsher @ 2012-05-07  7:12 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, gospo, sassmann
In-Reply-To: <20120506.132513.36895742709809565.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 1042 bytes --]

On Sun, 2012-05-06 at 13:25 -0400, David Miller wrote:
> From: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> Date: Sat,  5 May 2012 05:38:09 -0700
> 
> > This series of patches contains updates for e1000e and ixgbe.
> > 
> > NOTE- The ixgbe patch can and probably should be applied to
> > David Miller's net tree as well.
> > 
> > The following are changes since commit bd14b1b2e29bd6812597f896dde06eaf7c6d2f24:
> >   tcp: be more strict before accepting ECN negociation
> > and are available in the git repository at:
> >   git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master
> 
> No new changes there?
> 
> [davem@drr net-next]$ git pull git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next master
> From git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/net-next
>  * branch            master     -> FETCH_HEAD
> Already up-to-date.

Sorry Dave, I thought I had pushed the changes but it appears I did not.
I have rectified that and now my net-next tree contains the four
patches.

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply

* Re: [PATCH] 9p: disconnect channel when PCI device is removed
From: Aneesh Kumar K.V @ 2012-05-07  7:14 UTC (permalink / raw)
  To: Rusty Russell, Sasha Levin, davem, ericvh, jvrao
  Cc: netdev, linux-kernel, davej, Sasha Levin
In-Reply-To: <87397chebf.fsf@rustcorp.com.au>

Rusty Russell <rusty@rustcorp.com.au> writes:

> On Fri, 13 Apr 2012 17:48:36 -0400, Sasha Levin <levinsasha928@gmail.com> wrote:
>> When a virtio_9p pci device is being removed, we should close down any
>> active channels and free up resources, we're not supposed to BUG() if there's
>> still an open channel since it's a valid case when removing the PCI device.
>> 
>> Otherwise, removing the PCI device with an open channel would cause the
>> following BUG():
>
> (Damn changed notmuch.el bindings!  Previous reply went only to Sasha).
>
> Applied thanks,
> Rusty.

I am not sure whether the patch is sufficient, p9_virtio_remove does a
kfree(chan) and since we are not doing anything at the file system
level, we would still allow new 9p client request. That means
p9_virtio_request would be dereferencing a freed memory.

-aneesh

^ permalink raw reply

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Eric Dumazet @ 2012-05-07  7:22 UTC (permalink / raw)
  To: Deng-Cheng Zhu; +Cc: Tom Herbert, davem, netdev
In-Reply-To: <4FA770DA.6000104@mips.com>

On Mon, 2012-05-07 at 14:51 +0800, Deng-Cheng Zhu wrote:

> This is not about arch, it's about system design. Even with IRQ affinity
> working, for single queue NICs, RPS still has its value in this test.
> 

Post your numbers when proper affinities are done.

Having several cpus fighting on output path for qdisc/device lock is a
killer. Extra IPI are not worth the pain.

^ permalink raw reply

* Re: [PATCH RESEND 0/5] Adopt pinctrl support for a few outstanding imx drivers
From: Shawn Guo @ 2012-05-07  7:34 UTC (permalink / raw)
  To: Dong Aisheng
  Cc: linux-arm-kernel, Arnd Bergmann, netdev, Sascha Hauer,
	Wolfram Sang, linux-can, Grant Likely, Marc Kleine-Budde,
	linux-i2c, linux-serial, Greg Kroah-Hartman, Olof Johansson,
	spi-devel-general, Dong Aisheng, David S. Miller
In-Reply-To: <20120507065001.GA23607@shlinux2.ap.freescale.net>

On Mon, May 07, 2012 at 02:50:02PM +0800, Dong Aisheng wrote:
> Shouldn't we add the pinctrl states in dts file at the same time
> with this patch series or using another separate patch to add them
> before this series to avoid breaking the exist mx6q platforms?
> 
Ah, I just noticed that your patch "ARM: imx: enable pinctrl dummy
states" did not cover imx6q.  I think we should do the same for imx6q,
so that we can separate dts update from the driver change.  When all
imx6q boards' dts files get updated to have pins defined for the
devices, we can then remove dummy state for imx6q.  Doing so will ease
the pinctrl migration for those imx6q boards.

Will update your patch on my branch to have dummy state enabled for
imx6q.

> >   net: fec: adopt pinctrl support
> >   can: flexcan: adopt pinctrl support
> This two also depends on another patch you sent.
> [PATCH RESEND 1/9] ARM: mxs: enable pinctrl dummy states
> http://www.spinics.net/lists/arm-kernel/msg173341.html
> 
> Maybe you can put this two in the mxs convert series to avoid breaking
> mxs platforms.
> [PATCH 0/9] Enable pinctrl support for mach-mxs
> http://www.spinics.net/lists/arm-kernel/msg173312.html
> 
Right.  I'm going to merge these two series into one since there are
device drives shared between imx and mxs.

-- 
Regards,
Shawn

^ permalink raw reply

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Eric Dumazet @ 2012-05-07  7:38 UTC (permalink / raw)
  To: Deng-Cheng Zhu; +Cc: Tom Herbert, davem, netdev
In-Reply-To: <4FA77051.20804@mips.com>

On Mon, 2012-05-07 at 14:48 +0800, Deng-Cheng Zhu wrote:
> On 05/04/2012 11:31 PM, Tom Herbert wrote:
> >> I think the mechanisms of rps_dev_flow_table and cpu_flow (in this
> >> patch) are different: The former works along with rps_sock_flow_table
> >> whose CPU info is based on recvmsg by the application. But for the tests
> >> like what I did, there's no application involved.
> >>
> > While rps_sock_flow_table is currently only managed by recvmsg, it
> > still is the general mechanism that maps flows to CPUs for steering.
> > There should be nothing preventing you from populating and managing
> > entries in other ways.
> 
> Well, even using rps_sock_flow_table to map the sparse flows to CPUs,
> we still need a data structure to describe a single flow -- that's what
> struct cpu_flow is doing. Besides, rps_sock_flow_table, by its meaning,
> does not seem to make sense for our purpose. How about keeping the patch
> as is but renaming struct cpu_flow to struct rps_sparse_flow? It's like:
> 

sock_flow_table is about mapping a flow (by its rxhash) to cpu.

If you feel 'sock' is bad name, you can rename it.

You dont need adding new data structure and code in fast path.

Only the first packet of a new flow might be handled by 'the wrong cpu'.

If you add code in forward path to change flow_table for next packets,
added cost in fast path is null.

^ permalink raw reply

* Re: [PATCH RESEND 0/5] Adopt pinctrl support for a few outstanding imx drivers
From: Dong Aisheng @ 2012-05-07  7:53 UTC (permalink / raw)
  To: Shawn Guo
  Cc: Dong Aisheng-B29396, linux-arm-kernel@lists.infradead.org,
	Arnd Bergmann, netdev@vger.kernel.org, Sascha Hauer, Wolfram Sang,
	linux-can@vger.kernel.org, Grant Likely, Marc Kleine-Budde,
	linux-i2c@vger.kernel.org, linux-serial@vger.kernel.org,
	Greg Kroah-Hartman, Olof Johansson,
	spi-devel-general@lists.sourceforge.net, Dong Aisheng,
	David S. Miller
In-Reply-To: <20120507073403.GG19389@S2101-09.ap.freescale.net>

On Mon, May 07, 2012 at 03:34:06PM +0800, Shawn Guo wrote:
> On Mon, May 07, 2012 at 02:50:02PM +0800, Dong Aisheng wrote:
> > Shouldn't we add the pinctrl states in dts file at the same time
> > with this patch series or using another separate patch to add them
> > before this series to avoid breaking the exist mx6q platforms?
> > 
> Ah, I just noticed that your patch "ARM: imx: enable pinctrl dummy
> states" did not cover imx6q.  I think we should do the same for imx6q,
Yes, doing that was to force people to add pinctrl states in dts file
rather than using dummy state since mx6 supports pinctrl driver.

> so that we can separate dts update from the driver change.  When all
> imx6q boards' dts files get updated to have pins defined for the
> devices, we can then remove dummy state for imx6q.  Doing so will ease
> the pinctrl migration for those imx6q boards.
> 
Well, considering we have several mx6 boards, i think i can also be fine
with this way to ease the mx6q pinctrl migration.

> Will update your patch on my branch to have dummy state enabled for
> imx6q.
> 
Then go ahead.

Regards
Dong Aisheng


^ permalink raw reply

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Deng-Cheng Zhu @ 2012-05-07  8:01 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Tom Herbert, davem, netdev
In-Reply-To: <1336376282.3752.2252.camel@edumazet-glaptop>

On 05/07/2012 03:38 PM, Eric Dumazet wrote:
> On Mon, 2012-05-07 at 14:48 +0800, Deng-Cheng Zhu wrote:
>> On 05/04/2012 11:31 PM, Tom Herbert wrote:
>>>> I think the mechanisms of rps_dev_flow_table and cpu_flow (in this
>>>> patch) are different: The former works along with rps_sock_flow_table
>>>> whose CPU info is based on recvmsg by the application. But for the tests
>>>> like what I did, there's no application involved.
>>>>
>>> While rps_sock_flow_table is currently only managed by recvmsg, it
>>> still is the general mechanism that maps flows to CPUs for steering.
>>> There should be nothing preventing you from populating and managing
>>> entries in other ways.
>>
>> Well, even using rps_sock_flow_table to map the sparse flows to CPUs,
>> we still need a data structure to describe a single flow -- that's what
>> struct cpu_flow is doing. Besides, rps_sock_flow_table, by its meaning,
>> does not seem to make sense for our purpose. How about keeping the patch
>> as is but renaming struct cpu_flow to struct rps_sparse_flow? It's like:
>>
>
> sock_flow_table is about mapping a flow (by its rxhash) to cpu.
>
> If you feel 'sock' is bad name, you can rename it.
>
> You dont need adding new data structure and code in fast path.
>
> Only the first packet of a new flow might be handled by 'the wrong cpu'.
>
> If you add code in forward path to change flow_table for next packets,
> added cost in fast path is null.

Did you really read my patch and understand what I commented? When I was
talking about using rps_sparse_flow (initially cpu_flow), neither
rps_sock_flow_table nor rps_dev_flow_table is activated (number of
entries: 0).

FYI below:

On 05/04/2012 11:39 AM, Deng-Cheng Zhu wrote:
 > On 05/04/2012 11:22 AM, Tom Herbert wrote:
 >>> +struct cpu_flow {
 >>> + struct net_device *dev;
 >>> + u32 rxhash;
 >>> + unsigned long ts;
 >>> +};
 >>
 >> This seems like overkill, we already have the rps_flow_table and this
 >> used in accelerated RFS so the device can also take advantage of
 >> steering. Maybe somehow program that table for your sparse flows?
 >
 > In fact I did ever try something different in rps_flow_cnt (except for
 > rps_cpus, the only tunable thing relating to RPS in sysfs, am I
 > missing something?) and found no effect in my tests (iperf between 2
 > PCs via Malta which works as router and uses iptables/NAT+RPS)...


Deng-Cheng

^ permalink raw reply

* Re: [PATCH RESEND 3/5] can: flexcan: adopt pinctrl support
From: Marc Kleine-Budde @ 2012-05-07  8:06 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-arm-kernel, Arnd Bergmann, Olof Johansson, Sascha Hauer,
	Dong Aisheng, linux-can, Linux Netdev List
In-Reply-To: <1336352040-28447-4-git-send-email-shawn.guo@linaro.org>

[-- Attachment #1: Type: text/plain, Size: 1078 bytes --]

Hello,

On 05/07/2012 02:53 AM, Shawn Guo wrote:
> Cc: linux-can@vger.kernel.org
> Cc: Marc Kleine-Budde <mkl@pengutronix.de>
> Signed-off-by: Shawn Guo <shawn.guo@linaro.org>
> ---

It doesn't compile yet against net-next/master, which is based on v3.4-rc4:

/home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c: In
function 'flexcan_probe':
/home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
error: implicit declaration of function 'devm_pinctrl_get_select_default'
/home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
warning: assignment makes pointer from integer without a cast

Which tree does this series depend on? If this should go over the
linux-can tree, I have to ask David first to merge this.

regards,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Re: [PATCH RESEND 3/5] can: flexcan: adopt pinctrl support
From: Shawn Guo @ 2012-05-07  8:17 UTC (permalink / raw)
  To: Marc Kleine-Budde
  Cc: linux-arm-kernel, Arnd Bergmann, Olof Johansson, Sascha Hauer,
	Dong Aisheng, linux-can, Linux Netdev List
In-Reply-To: <4FA78298.4080401@pengutronix.de>

On 7 May 2012 16:06, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
> It doesn't compile yet against net-next/master, which is based on v3.4-rc4:
>
> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c: In
> function 'flexcan_probe':
> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
> error: implicit declaration of function 'devm_pinctrl_get_select_default'
> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
> warning: assignment makes pointer from integer without a cast
>
> Which tree does this series depend on? If this should go over the
> linux-can tree, I have to ask David first to merge this.
>
Thanks for the response, Marc.  The patch depends on pinctrl tree and
a couple of other patches that will go through arm-soc tree, so I
would like to ask for your ack to have the patch go over arm-soc tree.

Regards,
Shawn

^ permalink raw reply

* Re: [v12 PATCH 2/3] NETFILTER module xt_hmark, new target for HASH based fwmark
From: Hans Schillstrom @ 2012-05-07  8:20 UTC (permalink / raw)
  To: Pablo Neira Ayuso
  Cc: kaber@trash.net, jengelh@medozas.de,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	hans@schillstrom.com
In-Reply-To: <20120506225738.GA23009@1984>

[-- Attachment #1: Type: text/plain, Size: 4731 bytes --]

On Monday 07 May 2012 00:57:38 Pablo Neira Ayuso wrote:
> Hi Hans,
> 
> [...]
> > > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > >
> > > > Yes why not, I can give it a try.
> > > >
> >
> > I think we wait with this one..
> 
> I see. This is easy to add for the conntrack side, but it will require
> some extra code for the packet-based solution.

Actually I think there is very little gain to spread with type 
and then we must add a user mode possibility to turn it off 
i.e. a --hmark-icmp-type-mask 

> Not directly related to this but, I know that your intention is to
> make this as flexible as possible. However, I still don't find how I
> would use the port mask feature in any of my setups.  Basically, I
> don't come up with any useful example for this situation.

We have plenty of rules where just source port mask is zero.
and the dest-port-mask is 0xfffc (or 0xffff)


> I'm also telling this because I think that ICMP support will be
> easier to add if port masking is removed.
> 
> [...]
> > This is what I have done.
> >
> > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> >   (it's not set in the rtuple)
> 
> Good one, this made the code even smaller.
> 
> > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> 
> Not really, you don't need for the conntrack part. The original tuple
> is always the same, not matter where the packet is coming from. I have
> removed this again so it only affects packet-based hashing.

Yes original tuple is always the same but not always less than the rtuple.
If you have two nodes that should produce the same hmark,
one with conntrack an one without you must make a compare to make it consistent.

> 
> > - Moved the L3 check a little bit earlier.
> 
> good.
> 
> > - changed return values for fragments.
> 
> With this, you're giving up on trying to classify fragments. Do you
> really want this?
> 
> From my point of view, if your firewalls (assuming they are the HMARK
> classification) are stateless, it still makes sense to me to classify
> fragments using the XT_HMARK_METHOD_L3_4.

I do agree, it is back to "return 0" again.

> 
> > - Added nhoffs to: hmark_set_tuple_ports(skb, (ip->ihl * 4) + nhoff, t, info);
> >   to get icmp working
> 
> good catch.
> 
> Below, some minor changes that I made to your patch (you can find a
> new version enclosed to this email).
> 
> [...]
> > +#ifndef XT_HMARK_H_
> > +#define XT_HMARK_H_
> > +
> > +#include <linux/types.h>
> > +
> > +enum {
> > +     XT_HMARK_NONE,
> > +     XT_HMARK_SADR_AND,
> > +     XT_HMARK_DADR_AND,
> > +     XT_HMARK_SPI_AND,
> > +     XT_HMARK_SPI_OR,
> > +     XT_HMARK_SPORT_AND,
> > +     XT_HMARK_DPORT_AND,
> > +     XT_HMARK_SPORT_OR,
> > +     XT_HMARK_DPORT_OR,
> > +     XT_HMARK_PROTO_AND,
> > +     XT_HMARK_RND,
> > +     XT_HMARK_MODULUS,
> > +     XT_HMARK_OFFSET,
> > +     XT_HMARK_CT,
> > +     XT_HMARK_METHOD_L3,
> > +     XT_HMARK_METHOD_L3_4,
> > +     XT_F_HMARK_SADR_AND    = 1 << XT_HMARK_SADR_AND,
> > +     XT_F_HMARK_DADR_AND    = 1 << XT_HMARK_DADR_AND,
> > +     XT_F_HMARK_SPI_AND     = 1 << XT_HMARK_SPI_AND,
> > +     XT_F_HMARK_SPI_OR      = 1 << XT_HMARK_SPI_OR,
> > +     XT_F_HMARK_SPORT_AND   = 1 << XT_HMARK_SPORT_AND,
> > +     XT_F_HMARK_DPORT_AND   = 1 << XT_HMARK_DPORT_AND,
> > +     XT_F_HMARK_SPORT_OR    = 1 << XT_HMARK_SPORT_OR,
> > +     XT_F_HMARK_DPORT_OR    = 1 << XT_HMARK_DPORT_OR,
> > +     XT_F_HMARK_PROTO_AND   = 1 << XT_HMARK_PROTO_AND,
> > +     XT_F_HMARK_RND         = 1 << XT_HMARK_RND,
> > +     XT_F_HMARK_MODULUS     = 1 << XT_HMARK_MODULUS,
> > +     XT_F_HMARK_OFFSET      = 1 << XT_HMARK_OFFSET,
> > +     XT_F_HMARK_CT          = 1 << XT_HMARK_CT,
> > +     XT_F_HMARK_METHOD_L3   = 1 << XT_HMARK_METHOD_L3,
> > +     XT_F_HMARK_METHOD_L3_4 = 1 << XT_HMARK_METHOD_L3_4,
> 
> I've defined:
> 
> #define XT_HMARK_FLAG(flag) (1 << flag)
> 
> So we save all those extra _F_ defintions, they look redundant.

OK, I had to change the user mode code to keep up with this change...
The user code part is also included now.

[snip]

>+static inline u32
>+hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
>+{
>+       switch (l3num) {
              ^
Added a space here

>+       case AF_INET:
>+               return *addr32 & *mask;
>+       case AF_INET6:
>+               return hmark_addr6_mask(addr32, mask);


-- 
Regards
Hans Schillstrom <hans.schillstrom@ericsson.com>

[-- Attachment #2: 0001-netfilter-add-xt_hmark-target-for-hash-based-skb-mar.patch --]
[-- Type: text/x-patch, Size: 14126 bytes --]

From 04cc88b2eec677fd8eab3fbf620ed9209b883b8c Mon Sep 17 00:00:00 2001
From: Hans Schillstrom <hans.schillstrom@ericsson.com>
Date: Mon, 7 May 2012 08:33:08 +0200
Subject: [PATCH 1/1] netfilter: add xt_hmark target for hash-based skb marking

The target allows you to create rules in the "raw" and "mangle" tables
which set the skbuff mark by means of hash calculation within a given
range. The nfmark can influence the routing method (see "Use netfilter
MARK value as routing key") and can also be used by other subsystems to
change their behaviour.

Some examples:

* Default rule handles all TCP, UDP, SCTP, ESP & AH

 iptables -t mangle -A PREROUTING -m state --state NEW,ESTABLISHED,RELATED \
	-j HMARK --hmark-offset 10000 --hmark-mod 10

* Handle SCTP and hash dest port only and produce a nfmark between 100-119.

 iptables -t mangle -A PREROUTING -p SCTP -j HMARK --src-mask 0 --dst-mask 0 \
	--sp-mask 0 --offset 100 --mod 20

* Fragment safe Layer 3 only, that keep a class C network flow together

 iptables -t mangle -A PREROUTING -j HMARK --method L3 \
	--src-mask 24 --mod 20 --offset 100

[ A big part of this patch has been refactorized by Pablo Neira Ayuso ]

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
 include/linux/netfilter/xt_HMARK.h |   48 +++++
 net/netfilter/Kconfig              |   15 ++
 net/netfilter/Makefile             |    1 +
 net/netfilter/xt_HMARK.c           |  355 ++++++++++++++++++++++++++++++++++++
 4 files changed, 419 insertions(+), 0 deletions(-)
 create mode 100644 include/linux/netfilter/xt_HMARK.h
 create mode 100644 net/netfilter/xt_HMARK.c

diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
new file mode 100644
index 0000000..05e43ba
--- /dev/null
+++ b/include/linux/netfilter/xt_HMARK.h
@@ -0,0 +1,48 @@
+#ifndef XT_HMARK_H_
+#define XT_HMARK_H_
+
+#include <linux/types.h>
+
+enum {
+	XT_HMARK_NONE,
+	XT_HMARK_SADR_AND,
+	XT_HMARK_DADR_AND,
+	XT_HMARK_SPI_AND,
+	XT_HMARK_SPI_OR,
+	XT_HMARK_SPORT_AND,
+	XT_HMARK_DPORT_AND,
+	XT_HMARK_SPORT_OR,
+	XT_HMARK_DPORT_OR,
+	XT_HMARK_PROTO_AND,
+	XT_HMARK_RND,
+	XT_HMARK_MODULUS,
+	XT_HMARK_OFFSET,
+	XT_HMARK_CT,
+	XT_HMARK_METHOD_L3,
+	XT_HMARK_METHOD_L3_4,
+};
+#define XT_HMARK_FLAG(flag)	(1 << flag)
+
+union hmark_ports {
+	struct {
+		__u16	src;
+		__u16	dst;
+	} p16;
+	__u32	v32;
+};
+
+struct xt_hmark_info {
+	union nf_inet_addr	src_mask;	/* Source address mask */
+	union nf_inet_addr	dst_mask;	/* Dest address mask */
+	union hmark_ports	port_mask;
+	union hmark_ports	port_set;
+	__u32			spi_mask;
+	__u32			spi_set;
+	__u32			flags;		/* Print out only */
+	__u16			proto_mask;	/* L4 Proto mask */
+	__u32			hashrnd;
+	__u32			hmodulus;	/* Modulus */
+	__u32			hoffset;	/* Offset */
+};
+
+#endif /* XT_HMARK_H_ */
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 0c6f67e..209c1ed 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -509,6 +509,21 @@ config NETFILTER_XT_TARGET_HL
 	since you can easily create immortal packets that loop
 	forever on the network.
 
+config NETFILTER_XT_TARGET_HMARK
+	tristate '"HMARK" target support'
+	depends on (IP6_NF_IPTABLES || IP6_NF_IPTABLES=n)
+	depends on NETFILTER_ADVANCED
+	---help---
+	This option adds the "HMARK" target.
+
+	The target allows you to create rules in the "raw" and "mangle" tables
+	which set the skbuff mark by means of hash calculation within a given
+	range. The nfmark can influence the routing method (see "Use netfilter
+	MARK value as routing key") and can also be used by other subsystems to
+	change their behaviour.
+
+	To compile it as a module, choose M here. If unsure, say N.
+
 config NETFILTER_XT_TARGET_IDLETIMER
 	tristate  "IDLETIMER target support"
 	depends on NETFILTER_ADVANCED
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index ca36765..4e7960c 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_CONNSECMARK) += xt_CONNSECMARK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_HL) += xt_HL.o
+obj-$(CONFIG_NETFILTER_XT_TARGET_HMARK) += xt_HMARK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_LED) += xt_LED.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_LOG) += xt_LOG.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NFLOG) += xt_NFLOG.o
diff --git a/net/netfilter/xt_HMARK.c b/net/netfilter/xt_HMARK.c
new file mode 100644
index 0000000..6954d40
--- /dev/null
+++ b/net/netfilter/xt_HMARK.c
@@ -0,0 +1,355 @@
+/*
+ * xt_HMARK - Netfilter module to set mark by means of hashing
+ *
+ * (C) 2012 by Hans Schillstrom <hans.schillstrom@ericsson.com>
+ * (C) 2012 by Pablo Neira Ayuso <pablo@netfilter.org>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/module.h>
+#include <linux/skbuff.h>
+#include <linux/icmp.h>
+
+#include <linux/netfilter/x_tables.h>
+#include <linux/netfilter/xt_HMARK.h>
+
+#include <net/ip.h>
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+#include <net/netfilter/nf_conntrack.h>
+#endif
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+#include <net/ipv6.h>
+#include <linux/netfilter_ipv6/ip6_tables.h>
+#endif
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Hans Schillstrom <hans.schillstrom@ericsson.com>");
+MODULE_DESCRIPTION("Xtables: packet marking using hash calculation");
+MODULE_ALIAS("ipt_HMARK");
+MODULE_ALIAS("ip6t_HMARK");
+
+struct hmark_tuple {
+	u32			src;
+	u32			dst;
+	union hmark_ports	uports;
+	uint8_t			proto;
+};
+
+static inline u32 hmark_addr6_mask(const __u32 *addr32, const __u32 *mask)
+{
+	return (addr32[0] & mask[0]) ^
+	       (addr32[1] & mask[1]) ^
+	       (addr32[2] & mask[2]) ^
+	       (addr32[3] & mask[3]);
+}
+
+static inline u32
+hmark_addr_mask(int l3num, const __u32 *addr32, const __u32 *mask)
+{
+	switch (l3num) {
+	case AF_INET:
+		return *addr32 & *mask;
+	case AF_INET6:
+		return hmark_addr6_mask(addr32, mask);
+	}
+	return 0;
+}
+
+static int
+hmark_ct_set_htuple(const struct sk_buff *skb, struct hmark_tuple *t,
+		    const struct xt_hmark_info *info)
+{
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+	enum ip_conntrack_info ctinfo;
+	struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
+	struct nf_conntrack_tuple *otuple;
+	struct nf_conntrack_tuple *rtuple;
+
+	if (ct == NULL || nf_ct_is_untracked(ct))
+		return -1;
+
+	otuple = &ct->tuplehash[IP_CT_DIR_ORIGINAL].tuple;
+	rtuple = &ct->tuplehash[IP_CT_DIR_REPLY].tuple;
+
+	t->src = hmark_addr_mask(otuple->src.l3num, otuple->src.u3.all,
+				 info->src_mask.all);
+	t->dst = hmark_addr_mask(otuple->src.l3num, rtuple->src.u3.all,
+				 info->dst_mask.all);
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = nf_ct_protonum(ct);
+	if (t->proto != IPPROTO_ICMP) {
+		t->uports.p16.src = otuple->src.u.all;
+		t->uports.p16.dst = rtuple->src.u.all;
+		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
+				info->port_set.v32;
+	}
+
+	return 0;
+#else
+	return -1;
+#endif
+}
+
+static inline u32
+hmark_hash(struct hmark_tuple *t, const struct xt_hmark_info *info)
+{
+	u32 hash;
+
+	if (t->dst < t->src)
+		swap(t->src, t->dst);
+
+	hash = jhash_3words(t->src, t->dst, t->uports.v32, info->hashrnd);
+	hash = hash ^ (t->proto & info->proto_mask);
+
+	return (hash % info->hmodulus) + info->hoffset;
+}
+
+static void
+hmark_set_tuple_ports(const struct sk_buff *skb, unsigned int nhoff,
+		      struct hmark_tuple *t, const struct xt_hmark_info *info)
+{
+	int protoff;
+
+	protoff = proto_ports_offset(t->proto);
+	if (protoff < 0)
+		return;
+
+	nhoff += protoff;
+	if (skb_copy_bits(skb, nhoff, &t->uports, sizeof(t->uports)) < 0)
+		return;
+
+	if (t->proto == IPPROTO_ESP || t->proto == IPPROTO_AH)
+		t->uports.v32 = (t->uports.v32 & info->spi_mask) |
+				info->spi_set;
+	else {
+		t->uports.v32 = (t->uports.v32 & info->port_mask.v32) |
+				info->port_set.v32;
+
+		if (t->uports.p16.dst < t->uports.p16.src)
+			swap(t->uports.p16.dst, t->uports.p16.src);
+	}
+}
+
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+static int get_inner6_hdr(const struct sk_buff *skb, int *offset)
+{
+	struct icmp6hdr *icmp6h, _ih6;
+
+	icmp6h = skb_header_pointer(skb, *offset, sizeof(_ih6), &_ih6);
+	if (icmp6h == NULL)
+		return 0;
+
+	if (icmp6h->icmp6_type && icmp6h->icmp6_type < 128) {
+		*offset += sizeof(struct icmp6hdr);
+		return 1;
+	}
+	return 0;
+}
+
+static int
+hmark_pkt_set_htuple_ipv6(const struct sk_buff *skb, struct hmark_tuple *t,
+			  const struct xt_hmark_info *info)
+{
+	struct ipv6hdr *ip6, _ip6;
+	int flag = IP6T_FH_F_AUTH; /* Ports offset, find_hdr flags */
+	unsigned int nhoff = 0;
+	u16 fragoff = 0;
+	int nexthdr;
+
+	ip6 = (struct ipv6hdr *) (skb->data + skb_network_offset(skb));
+	nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
+	if (nexthdr < 0)
+		return 0;
+	/* No need to check for icmp errors on fragments */
+	if ((flag & IP6T_FH_F_FRAG) || (nexthdr != IPPROTO_ICMPV6))
+		goto noicmp;
+	/* if an icmp error, use the inner header */
+	if (get_inner6_hdr(skb, &nhoff)) {
+		ip6 = skb_header_pointer(skb, nhoff, sizeof(_ip6), &_ip6);
+		if (ip6 == NULL)
+			return -1;
+		/* Treat AH as ESP, use SPI nothing else. */
+		flag = IP6T_FH_F_AUTH;
+		nexthdr = ipv6_find_hdr(skb, &nhoff, -1, &fragoff, &flag);
+		if (nexthdr < 0)
+			return -1;
+	}
+noicmp:
+	t->src = hmark_addr6_mask(ip6->saddr.s6_addr32, info->src_mask.all);
+	t->dst = hmark_addr6_mask(ip6->daddr.s6_addr32, info->dst_mask.all);
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = nexthdr;
+
+	if (t->proto == IPPROTO_ICMPV6)
+		return 0;
+
+	if (flag & IP6T_FH_F_FRAG)
+		return 0;
+
+	hmark_set_tuple_ports(skb, nhoff, t, info);
+
+	return 0;
+}
+
+static unsigned int
+hmark_tg_v6(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+	struct hmark_tuple t;
+
+	memset(&t, 0, sizeof(struct hmark_tuple));
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT)) {
+		if (hmark_ct_set_htuple(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	} else {
+		if (hmark_pkt_set_htuple_ipv6(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	}
+
+	skb->mark = hmark_hash(&t, info);
+	return XT_CONTINUE;
+}
+#endif
+
+static int get_inner_hdr(const struct sk_buff *skb, int iphsz, int *nhoff)
+{
+	const struct icmphdr *icmph;
+	struct icmphdr _ih;
+
+	/* Not enough header? */
+	icmph = skb_header_pointer(skb, *nhoff + iphsz, sizeof(_ih), &_ih);
+	if (icmph == NULL && icmph->type > NR_ICMP_TYPES)
+		return 0;
+
+	/* Error message? */
+	if (icmph->type != ICMP_DEST_UNREACH &&
+	    icmph->type != ICMP_SOURCE_QUENCH &&
+	    icmph->type != ICMP_TIME_EXCEEDED &&
+	    icmph->type != ICMP_PARAMETERPROB &&
+	    icmph->type != ICMP_REDIRECT)
+		return 0;
+
+	*nhoff += iphsz + sizeof(_ih);
+	return 1;
+}
+
+static int
+hmark_pkt_set_htuple_ipv4(const struct sk_buff *skb, struct hmark_tuple *t,
+			  const struct xt_hmark_info *info)
+{
+	struct iphdr *ip, _ip;
+	int nhoff = skb_network_offset(skb);
+
+	ip = (struct iphdr *) (skb->data + nhoff);
+	if (ip->protocol == IPPROTO_ICMP) {
+		/* use inner header in case of ICMP errors */
+		if (get_inner_hdr(skb, ip->ihl * 4, &nhoff)) {
+			ip = skb_header_pointer(skb, nhoff, sizeof(_ip), &_ip);
+			if (ip == NULL)
+				return -1;
+		}
+	}
+
+	t->src = (__force u32) ip->saddr;
+	t->dst = (__force u32) ip->daddr;
+
+	t->src &= info->src_mask.ip;
+	t->dst &= info->dst_mask.ip;
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))
+		return 0;
+
+	t->proto = ip->protocol;
+
+	/* ICMP has no ports, skip */
+	if (t->proto == IPPROTO_ICMP)
+		return 0;
+
+	/* follow-up fragments don't contain ports, skip */
+	if (ip->frag_off & htons(IP_MF | IP_OFFSET))
+		return 0;
+
+	hmark_set_tuple_ports(skb, (ip->ihl * 4) + nhoff, t, info);
+
+	return 0;
+}
+
+static unsigned int
+hmark_tg_v4(struct sk_buff *skb, const struct xt_action_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+	struct hmark_tuple t;
+
+	memset(&t, 0, sizeof(struct hmark_tuple));
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT)) {
+		if (hmark_ct_set_htuple(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	} else {
+		if (hmark_pkt_set_htuple_ipv4(skb, &t, info) < 0)
+			return XT_CONTINUE;
+	}
+
+	skb->mark = hmark_hash(&t, info);
+	return XT_CONTINUE;
+}
+
+static int hmark_tg_check(const struct xt_tgchk_param *par)
+{
+	const struct xt_hmark_info *info = par->targinfo;
+
+	if (!info->hmodulus) {
+		pr_info("xt_HMARK: hash modulus can't be zero\n");
+		return -EINVAL;
+	}
+	if (info->proto_mask &&
+	    (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3))) {
+		pr_info("xt_HMARK: proto mask must be zero with L3 mode\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static struct xt_target hmark_tg_reg[] __read_mostly = {
+	{
+		.name		= "HMARK",
+		.family		= NFPROTO_IPV4,
+		.target		= hmark_tg_v4,
+		.targetsize	= sizeof(struct xt_hmark_info),
+		.checkentry	= hmark_tg_check,
+		.me		= THIS_MODULE,
+	},
+#if IS_ENABLED(CONFIG_IP6_NF_IPTABLES)
+	{
+		.name		= "HMARK",
+		.family		= NFPROTO_IPV6,
+		.target		= hmark_tg_v6,
+		.targetsize	= sizeof(struct xt_hmark_info),
+		.checkentry	= hmark_tg_check,
+		.me		= THIS_MODULE,
+	},
+#endif
+};
+
+static int __init hmark_tg_init(void)
+{
+	return xt_register_targets(hmark_tg_reg, ARRAY_SIZE(hmark_tg_reg));
+}
+
+static void __exit hmark_tg_exit(void)
+{
+	xt_unregister_targets(hmark_tg_reg, ARRAY_SIZE(hmark_tg_reg));
+}
+
+module_init(hmark_tg_init);
+module_exit(hmark_tg_exit);
-- 
1.7.2.3


[-- Attachment #3: 0001-netfilter-userspace-part-for-target-HMARK.patch --]
[-- Type: text/x-patch, Size: 24699 bytes --]

From edcb596187a50172481d1e9fa11ae062337c69eb Mon Sep 17 00:00:00 2001
From: Hans Schillstrom <hans.schillstrom@ericsson.com>
Date: Mon, 7 May 2012 09:46:38 +0200
Subject: [PATCH 1/1] netfilter: userspace part for target HMARK

    The target allows you to create rules in the "raw" and "mangle" tables
    which alter the netfilter mark (nfmark) field within a given range.
    First a 32 bit hash value is generated then modulus by <limit> and
    finally an offset is added before it's written to nfmark.
    Prior to routing, the nfmark can influence the routing method (see
    "Use netfilter MARK value as routing key") and can also be used by
    other subsystems to change their behaviour.

    The mark match can also be used to match nfmark produced by this module.
Ver 13
    Name change of defines.

Ver 12
    Reset option flag in some cases, where option is disabled by value.

Ver 10
    conntrack reduced to --hmark-ct switch
    renaming of vars in xt_hmark_info
    Adding helptext and updated man due to --hmark-ct switc

Ver 9
    Formating changes.

Ver 8
    Syntax changes more descriptive options
    --hmark-method added.

Ver 6-7 -

Ver 5
      smask and dmask changed to length

Ver 4
      xtoptions used for parsing.

Ver 3
       -

Ver 2
      IPv4 NAT added
      iptables ver 1.4.12.1 adaptions.

Signed-off-by: Hans Schillstrom <hans.schillstrom@ericsson.com>
---
 extensions/libxt_HMARK.c           |  510 ++++++++++++++++++++++++++++++++++++
 extensions/libxt_HMARK.man         |   84 ++++++
 include/linux/netfilter/xt_HMARK.h |   48 ++++
 3 files changed, 642 insertions(+), 0 deletions(-)
 create mode 100644 extensions/libxt_HMARK.c
 create mode 100644 extensions/libxt_HMARK.man
 create mode 100644 include/linux/netfilter/xt_HMARK.h

diff --git a/extensions/libxt_HMARK.c b/extensions/libxt_HMARK.c
new file mode 100644
index 0000000..4b13cd3
--- /dev/null
+++ b/extensions/libxt_HMARK.c
@@ -0,0 +1,510 @@
+/*
+ * Shared library add-on to iptables to add HMARK target support.
+ *
+ * The kernel module calculates a hash value that can be modified by modulus
+ * and an offset. The hash value is based on a direction independent
+ * five tuple: src & dst addr src & dst ports and protocol.
+ * However src & dst port can be masked and are not used for fragmented
+ * packets, ESP and AH don't have ports so SPI will be used instead.
+ * For ICMP error messages the hash mark values will be calculated on
+ * the source packet i.e. the packet caused the error (If sufficient
+ * amount of data exists).
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include <stdbool.h>
+#include <stdio.h>
+#include <string.h>
+
+#include "xtables.h"
+#include <linux/netfilter/xt_HMARK.h>
+
+
+#define DEF_HRAND 0xc175a3b8	/* Default "random" value to jhash */
+
+#define XT_F_HMARK_L4_OPTS \
+		(XT_HMARK_FLAG(XT_HMARK_SPI_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPI_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPORT_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_SPORT_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_DPORT_AND) |\
+		 XT_HMARK_FLAG(XT_HMARK_DPORT_OR) |\
+		 XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+
+static void HMARK_help(void)
+{
+	printf(
+"HMARK target options, i.e. modify hash calculation by:\n"
+"  --hmark-method <method>          Overall L3/L4 and fragment behavior\n"
+"                 L3                Fragment safe, do not use ports or proto\n"
+"                                   i.e. Fragments don't need special care.\n"
+"                 L3-4 (Default)    Fragment unsafe, use ports and proto\n"
+"                                   if defrag off in conntrack\n"
+"                                      no hmark on any part of a fragment\n"
+"  Limit/modify the calculated hash mark by:\n"
+"  --hmark-mod value                nfmark modulus value\n"
+"  --hmark-offset value             Last action add value to nfmark\n\n"
+" Fine tuning of what will be included in hash calculation\n"
+"  --hmark-src-mask length          Source address mask length\n"
+"  --hmark-dst-mask length          Dest address mask length\n"
+"  --hmark-sport-mask value         Mask src port with value\n"
+"  --hmark-dport-mask value         Mask dst port with value\n"
+"  --hmark-spi-mask value           For esp and ah AND spi with value\n"
+"  --hmark-sport-set value          OR src port with value\n"
+"  --hmark-dport-set value          OR dst port with value\n"
+"  --hmark-spi-set value            For esp and ah OR spi with value\n"
+"  --hmark-proto-mask value         Mask Protocol with value\n"
+"  --hmark-rnd                      Initial Random value to hash cacl.\n"
+" For NAT in IPv4: src part from original/reply tuple will always be used\n"
+" i.e. orig src part will be used as src address/port.\n"
+"     reply src part will be used as dst address/port\n"
+" Make sure to qualify the rule in a proper way when using NAT flag\n"
+" When --ct is used only tracked connections will match\n"
+"  --hmark-ct                       Force conntrack orig and rely tuples as\n"
+"                                   source and destination.\n\n"
+" In many cases hmark can be omitted i.e. --src-mask can be used\n");
+}
+
+#define hi struct xt_hmark_info
+
+static const struct xt_option_entry HMARK_opts[] = {
+	{ .name  = "hmark-method",
+	  .type  = XTTYPE_STRING,
+	  .id    = XT_HMARK_METHOD_L3
+	},
+	{ .name  = "hmark-src-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_SADR_AND,
+	  .flags = XTOPT_PUT, XTOPT_POINTER(hi, src_mask)
+	},
+	{ .name  = "hmark-dst-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_DADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, dst_mask)
+	},
+	{ .name  = "hmark-sport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.src)
+	},
+	{ .name  = "hmark-dport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.dst)
+	},
+	{ .name  = "hmark-spi-mask",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_mask)
+	},
+	{ .name  = "hmark-sport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.src)
+	},
+	{ .name  = "hmark-dport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.dst)
+	},
+	{ .name  = "hmark-spi-set",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_set)
+	},
+	{ .name  = "hmark-proto-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_PROTO_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, proto_mask)
+	},
+	{ .name  = "hmark-rnd",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_RND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hashrnd)
+	},
+	{ .name = "hmark-mod",
+	  .type = XTTYPE_UINT32,
+	  .id = XT_HMARK_MODULUS,
+	  .min = 1,
+	  .flags = XTOPT_PUT | XTOPT_MAND,
+	  XTOPT_POINTER(hi, hmodulus)
+	},
+	{ .name  = "hmark-offset",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_OFFSET,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hoffset)
+	},
+	{ .name  = "hmark-ct",
+	  .type  = XTTYPE_NONE,
+	  .id    = XT_HMARK_CT
+	},
+
+	{ .name  = "method",
+	  .type  = XTTYPE_STRING,
+	  .id    = XT_HMARK_METHOD_L3
+	},
+	{ .name  = "src-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_SADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, src_mask)
+	},
+	{ .name  = "dst-mask",
+	  .type  = XTTYPE_PLENMASK,
+	  .id    = XT_HMARK_DADR_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, dst_mask)
+	},
+	{ .name  = "sport-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.src)
+	},
+	{ .name  = "dport-mask", .type = XTTYPE_UINT16,
+	  .id = XT_HMARK_DPORT_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_mask.p16.dst)
+	},
+	{ .name  = "spi-mask",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_mask)
+	},
+	{ .name  = "sport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_SPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.src)
+	},
+	{ .name  = "dport-set",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_DPORT_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, port_set.p16.dst)
+	},
+	{ .name  = "spi-set",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_SPI_OR,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, spi_set)
+	},
+	{ .name  = "proto-mask",
+	  .type  = XTTYPE_UINT16,
+	  .id    = XT_HMARK_PROTO_AND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, proto_mask)
+	},
+	{ .name  = "rnd",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_RND,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hashrnd)
+	},
+	{ .name  = "mod",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_MODULUS,
+	  .min   = 1,
+	  .flags = XTOPT_PUT,
+	  XTOPT_MAND, XTOPT_POINTER(hi, hmodulus)
+	},
+	{ .name  = "offset",
+	  .type  = XTTYPE_UINT32,
+	  .id    = XT_HMARK_OFFSET,
+	  .flags = XTOPT_PUT,
+	  XTOPT_POINTER(hi, hoffset)
+	},
+	{ .name  = "ct",
+	  .type  = XTTYPE_NONE,
+	  .id    = XT_HMARK_CT
+	},
+	XTOPT_TABLEEND,
+};
+
+static void HMARK_parse(struct xt_option_call *cb, int plen)
+{
+	struct xt_hmark_info *info = cb->data;
+
+	if (!cb->xflags) {
+		memset(info, 0xff, sizeof(struct xt_hmark_info));
+		info->port_set.v32 = 0;
+		info->flags = 0;
+		info->spi_set = 0;
+		info->hoffset = 0;
+		info->hashrnd = DEF_HRAND;
+	}
+	xtables_option_parse(cb);
+
+	switch (cb->entry->id) {
+	case XT_HMARK_SADR_AND:
+		if (cb->val.hlen == plen)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SADR_AND);
+		break;
+	case XT_HMARK_DADR_AND:
+		if (cb->val.hlen == plen)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DADR_AND);
+		break;
+	case XT_HMARK_SPI_AND:
+		info->spi_mask = htonl(cb->val.u32);
+		if (cb->val.u32 == 0xffffffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPI_AND);
+		break;
+	case XT_HMARK_SPI_OR:
+		info->spi_set = htonl(cb->val.u32);
+		if (cb->val.u32 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPI_OR);
+		break;
+	case XT_HMARK_SPORT_AND:
+		info->port_mask.p16.src = htons(cb->val.u16);
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPORT_AND);
+		break;
+	case XT_HMARK_DPORT_AND:
+		info->port_mask.p16.dst = htons(cb->val.u16);
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DPORT_AND);
+		break;
+	case XT_HMARK_SPORT_OR:
+		info->port_set.p16.src = htons(cb->val.u16);
+		if (cb->val.u16 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_SPORT_OR);
+		break;
+	case XT_HMARK_DPORT_OR:
+		info->port_set.p16.dst = htons(cb->val.u16);
+		if (cb->val.u16 == 0)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_DPORT_OR);
+		break;
+	case XT_HMARK_PROTO_AND:
+		if (cb->val.u16 == 0xffff)
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_PROTO_AND);
+		break;
+	case XT_HMARK_MODULUS:
+		if (info->hmodulus == 0) {
+			xtables_error(PARAMETER_PROBLEM,
+				      "xxx modulus 0 ? "
+				      "thats a div by 0");
+			info->hmodulus = 0xffffffff;
+		}
+		break;
+	case XT_HMARK_METHOD_L3:
+		if (strcmp(cb->arg, "L3") == 0) {
+			info->proto_mask = 0;
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4);
+		} else if (strcmp(cb->arg, "L3-4") == 0) {
+			cb->xflags &= ~XT_HMARK_FLAG(XT_HMARK_METHOD_L3);
+			cb->xflags |= XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4);
+		}
+		break;
+	}
+	info->flags = cb->xflags;
+}
+
+static void HMARK_ip4_parse(struct xt_option_call *cb)
+{
+	HMARK_parse(cb, 32);
+}
+static void HMARK_ip6_parse(struct xt_option_call *cb)
+{
+	HMARK_parse(cb, 128);
+}
+
+static void HMARK_check(struct xt_fcheck_call *cb)
+{
+	if (!(cb->xflags & XT_HMARK_FLAG(XT_HMARK_MODULUS)))
+		xtables_error(PARAMETER_PROBLEM, "HMARK: the --hmark-mod, "
+			      "is not set, or zero wich is a div by zero");
+	/* Check for invalid options */
+	if (cb->xflags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3) &&
+	   (cb->xflags & XT_F_HMARK_L4_OPTS))
+		xtables_error(PARAMETER_PROBLEM, "HMARK: --hmark-method L3, "
+			      "can not be combined by an Layer 4 options: "
+			      "port, spi or proto ");
+}
+/*
+ * Common print for IPv4 & IPv6
+ */
+static void HMARK_print(const struct xt_hmark_info *info)
+{
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3)) {
+		printf("method L3 ");
+	} else {
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4))
+			printf("method L3-4 ");
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_AND))
+			printf("sport-mask 0x%x ",
+			       htons(info->port_mask.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_AND))
+			printf("dport-mask 0x%x ",
+			       htons(info->port_mask.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_AND))
+			printf("spi-mask 0x%x ", htonl(info->spi_mask));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_OR))
+			printf("sport-set 0x%x ",
+			       htons(info->port_set.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_OR))
+			printf("dport-set 0x%x ",
+			       htons(info->port_set.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_OR))
+			printf("spi-set 0x%x ", htonl(info->spi_set));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+			printf("proto-mask 0x%x ", info->proto_mask);
+	}
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_RND))
+		printf("rnd 0x%x ", info->hashrnd);
+
+}
+
+static void HMARK_ip6_print(const void *ip,
+			    const struct xt_entry_target *target, int numeric)
+{
+	const struct xt_hmark_info *info =
+			(const struct xt_hmark_info *)target->data;
+
+	printf(" HMARK ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf("%% 0x%x ", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf("+ 0x%x ", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf("ct, ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf("src-mask %s ",
+		       xtables_ip6mask_to_numeric(&info->src_mask.in6) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf("dst-mask %s ",
+		       xtables_ip6mask_to_numeric(&info->dst_mask.in6) + 1);
+	HMARK_print(info);
+}
+static void HMARK_ip4_print(const void *ip,
+			    const struct xt_entry_target *target, int numeric)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	printf(" HMARK ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf("%% 0x%x ", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf("+ 0x%x ", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf("ct, ");
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf("src-mask %s ",
+		       xtables_ipmask_to_numeric(&info->src_mask.in) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf("dst-mask %s ",
+		       xtables_ipmask_to_numeric(&info->dst_mask.in) + 1);
+	HMARK_print(info);
+}
+static void HMARK_save(const struct xt_hmark_info *info)
+{
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3)) {
+		printf(" --hmark-method L3");
+	} else {
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_METHOD_L3_4))
+			printf(" --hmark-method L3-4");
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_AND))
+			printf(" --hmark-sport-mask 0x%x",
+			       htons(info->port_mask.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_AND))
+			printf(" --hmark-dport-mask 0x%x",
+			       htons(info->port_mask.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_AND))
+			printf(" --hmark-spi-mask 0x%x",
+			       htonl(info->spi_mask));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPORT_OR))
+			printf(" --hmark-sport-set 0x%x",
+			       htons(info->port_set.p16.src));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_DPORT_OR))
+			printf(" --hmark-dport-set 0x%x",
+			       htons(info->port_set.p16.dst));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_SPI_OR))
+			printf(" --hmark-spi-set 0x%x", htonl(info->spi_set));
+		if (info->flags & XT_HMARK_FLAG(XT_HMARK_PROTO_AND))
+			printf(" --hmark-proto-mask 0x%x", info->proto_mask);
+	}
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_RND))
+		printf(" --hmark-rnd 0x%x", info->hashrnd);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_MODULUS))
+		printf(" --hmark-mod 0x%x", info->hmodulus);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_OFFSET))
+		printf(" --hmark-offset 0x%x", info->hoffset);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_CT))
+		printf(" --hmark-ct");
+}
+
+static void HMARK_ip6_save(const void *ip, const struct xt_entry_target *target)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf(" --hmark-src-mask %s",
+		       xtables_ip6mask_to_numeric(&info->src_mask.in6) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf(" --hmark-dst-mask %s",
+		       xtables_ip6mask_to_numeric(&info->dst_mask.in6) + 1);
+	HMARK_save(info);
+}
+
+static void HMARK_ip4_save(const void *ip, const struct xt_entry_target *target)
+{
+	const struct xt_hmark_info *info =
+		(const struct xt_hmark_info *)target->data;
+
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_SADR_AND))
+		printf(" --hmark-src-mask %s",
+		       xtables_ipmask_to_numeric(&info->src_mask.in) + 1);
+	if (info->flags & XT_HMARK_FLAG(XT_HMARK_DADR_AND))
+		printf(" --hmark-dst-mask %s",
+		       xtables_ipmask_to_numeric(&info->dst_mask.in) + 1);
+	HMARK_save(info);
+}
+
+static struct xtables_target mark_tg_reg[] = {
+	{
+		.family        = NFPROTO_IPV4,
+		.name          = "HMARK",
+		.version       = XTABLES_VERSION,
+		.revision      = 0,
+		.size          = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.userspacesize = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.help          = HMARK_help,
+		.print         = HMARK_ip4_print,
+		.save          = HMARK_ip4_save,
+		.x6_parse      = HMARK_ip4_parse,
+		.x6_fcheck     = HMARK_check,
+		.x6_options    = HMARK_opts,
+	},
+	{
+		.family        = NFPROTO_IPV6,
+		.name          = "HMARK",
+		.version       = XTABLES_VERSION,
+		.revision      = 0,
+		.size          = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.userspacesize = XT_ALIGN(sizeof(struct xt_hmark_info)),
+		.help          = HMARK_help,
+		.print         = HMARK_ip6_print,
+		.save          = HMARK_ip6_save,
+		.x6_parse      = HMARK_ip6_parse,
+		.x6_fcheck     = HMARK_check,
+		.x6_options    = HMARK_opts,
+	},
+};
+
+void _init(void)
+{
+	xtables_register_targets(mark_tg_reg, ARRAY_SIZE(mark_tg_reg));
+}
diff --git a/extensions/libxt_HMARK.man b/extensions/libxt_HMARK.man
new file mode 100644
index 0000000..c258e59
--- /dev/null
+++ b/extensions/libxt_HMARK.man
@@ -0,0 +1,84 @@
+This module does the same as MARK, i.e. set an fwmark, but the mark is based on a hash value.
+The hash is based on src-addr, dst-addr, sport, dport and proto. The same mark will be produced independent of direction if no masks is set or the same masks is used for src and dest.
+The hash mark could be adjusted by modulus and finally an offset could be added, i.e the final mark will be within a range.
+ICMP error will use the the original message for hash calculation not the icmp it self.
+
+Note: IPv4 packets with nf_defrag_ipv4 loaded will be defragmented before they reach hmark,
+      IPv6 nf_defrag is not implemented this way, hence fragmented ipv6 packets will reach hmark.
+      Default behavior is to completely ignore any fragment if it reach hmark.
+      --hmark-method L3 is fragment safe since neither ports or L4 protocol field is used.
+      None of the parameters effect the packet it self only the calculated hash value.
+
+.PP
+Parameters:
+Short hand methods
+.TP
+\fB\-\-hmark\-method\fP \fIL3\fP
+Do not use L4 protocol field, ports or spi, only Layer 3 addresses, mask length
+of L3 addresses can still be used. Fragment or not does not matter in
+this case since only L3 address can be used in calc. of hash value.
+.TP
+\fB\-\-hmark\-method\fP \fIL3-4\fP (Default)
+Include L4 in calculation. of hash value i.e. all masks below are valid.
+Fragments will be ignored. (i.e no hash value produced)
+.PP
+For all masks default is all "1:s", to disable a field use mask 0
+.TP
+\fB\-\-hmark\-src\-mask\fP \fIlength\fP
+The length of the mask to AND the source address with (saddr & value).
+.TP
+\fB\-\-hmark\-dst\-mask\fP \fIlength\fP
+The length of the mask to AND the dest. address with (daddr & value).
+.TP
+\fB\-\-hmark\-sport\-mask\fP \fIvalue\fP
+A 16 bit value to AND the src port with (sport & value).
+.TP
+\fB\-\-hmark\-dport\-mask\fP \fIvalue\fP
+A 16 bit value to AND the dest port with (dport & value).
+.TP
+\fB\-\-hmark\-sport\-set\fP \fIvalue\fP
+A 16 bit value to OR the src port with (sport | value).
+.TP
+\fB\-\-hmark\-dport\-set\fP \fIvalue\fP
+A 16 bit value to OR the dest port with (dport | value).
+.TP
+\fB\-\-hmark\-spi\-mask\fP \fIvalue\fP
+Value to AND the spi field with (spi & value) valid for proto esp or ah.
+.TP
+\fB\-\-hmark\-spi\-set\fP \fIvalue\fP
+Value to OR the spi field with (spi | value) valid for proto esp or ah.
+.TP
+\fB\-\-hmark\-proto\-mask\fP \fIvalue\fP
+An 8 bit value to AND the L4 proto field with (proto & value).
+.TP
+\fB\-\-hmark\-ct\fP
+When flag is set, conntrack data should be used. Useful when NAT internal addressed should be used in calculation.
+Be careful when using DNAT since mangle table is handled before nat table. I.e it will not work as expected to put HMARK in table mangle and PREROUTING chain. The initial packet will have it's hash based on the original address, while the rest of the flow will use the NAT:ed address.
+.TP
+\fB\-\-hmark\-rnd\fP \fIvalue\fP
+A 32 bit initial value for hash calc, default is 0xc175a3b8.
+.PP
+Final processing of the mark in order of execution.
+.TP
+\fB\-\-hmark\-mod\fP \fIvalue (must be > 0)\fP
+The easiest way to describe this is:  hash = hash mod <value>
+.TP
+\fB\-\-hmark\-offset\fP \fIvalue\fP
+The easiest way to describe this is:  hash = hash + <value>
+.PP
+\fIExamples:\fP
+.PP
+Default rule handles all TCP, UDP, SCTP, ESP & AH
+.IP
+iptables \-t mangle \-A PREROUTING \-m state \-\-state NEW,ESTABLISHED,RELATED
+ \-j HMARK \-\-hmark-offs 10000 \-\-hmark-mod 10
+.PP
+Handle SCTP and hash dest port only and produce a nfmark between 100-119.
+.IP
+iptables \-t mangle \-A PREROUTING -p SCTP \-j HMARK \-\-src\-mask 0 \-\-dst\-mask 0
+ \-\-sp\-mask 0 \-\-offset 100 \-\-mod 20
+.PP
+Fragment safe Layer 3 only that keep a class C network flow together
+.IP
+iptables \-t mangle \-A PREROUTING \-j HMARK \-\-method L3 \-\-src\-mask 24 \-\-mod 20 \-\-offset 100
+
diff --git a/include/linux/netfilter/xt_HMARK.h b/include/linux/netfilter/xt_HMARK.h
new file mode 100644
index 0000000..05e43ba
--- /dev/null
+++ b/include/linux/netfilter/xt_HMARK.h
@@ -0,0 +1,48 @@
+#ifndef XT_HMARK_H_
+#define XT_HMARK_H_
+
+#include <linux/types.h>
+
+enum {
+	XT_HMARK_NONE,
+	XT_HMARK_SADR_AND,
+	XT_HMARK_DADR_AND,
+	XT_HMARK_SPI_AND,
+	XT_HMARK_SPI_OR,
+	XT_HMARK_SPORT_AND,
+	XT_HMARK_DPORT_AND,
+	XT_HMARK_SPORT_OR,
+	XT_HMARK_DPORT_OR,
+	XT_HMARK_PROTO_AND,
+	XT_HMARK_RND,
+	XT_HMARK_MODULUS,
+	XT_HMARK_OFFSET,
+	XT_HMARK_CT,
+	XT_HMARK_METHOD_L3,
+	XT_HMARK_METHOD_L3_4,
+};
+#define XT_HMARK_FLAG(flag)	(1 << flag)
+
+union hmark_ports {
+	struct {
+		__u16	src;
+		__u16	dst;
+	} p16;
+	__u32	v32;
+};
+
+struct xt_hmark_info {
+	union nf_inet_addr	src_mask;	/* Source address mask */
+	union nf_inet_addr	dst_mask;	/* Dest address mask */
+	union hmark_ports	port_mask;
+	union hmark_ports	port_set;
+	__u32			spi_mask;
+	__u32			spi_set;
+	__u32			flags;		/* Print out only */
+	__u16			proto_mask;	/* L4 Proto mask */
+	__u32			hashrnd;
+	__u32			hmodulus;	/* Modulus */
+	__u32			hoffset;	/* Offset */
+};
+
+#endif /* XT_HMARK_H_ */
-- 
1.7.2.3


^ permalink raw reply related

* Re: [PATCH v2] RPS: Sparse connection optimizations - v2
From: Eric Dumazet @ 2012-05-07  8:25 UTC (permalink / raw)
  To: Deng-Cheng Zhu; +Cc: Tom Herbert, davem, netdev
In-Reply-To: <4FA7815F.8030101@mips.com>

On Mon, 2012-05-07 at 16:01 +0800, Deng-Cheng Zhu wrote:

> Did you really read my patch and understand what I commented? When I was
> talking about using rps_sparse_flow (initially cpu_flow), neither
> rps_sock_flow_table nor rps_dev_flow_table is activated (number of
> entries: 0).

I read your patch and am concerned of performance issues when handling
typical workload. Say between 0.1 and 20 Mpps on current hardware.

The argument "oh its only selected when
CONFIG_RPS_SPARSE_FLOW_OPTIMIZATION is set" is wrong.

CONFIG_NR_RPS_MAP_LOOPS is wrong.

Your HZ timeout is yet another dark side of your patch. 

Your (flow->dev == skb->dev) test is wrong.

Your : flow->ts = now; is wrong (dirtying memory for each packet)

Really I dont like your patch.

You are kindly asked to find another way to solve your problem, a
generic mechanism that can help others, not only you.

We do think activating RFS is the way to go. Its the standard layer we
added below RPS, its configurable and scales. It can be expanded at will
with configurable plugins.

For example, using single queue NICS, it makes sense to select cpu on
the output device only, not on the rxhash by itself (a modulo or
something), to reduce false sharing and qdisc/device lock on tx path.

If your machine has 4 cpus, and 4 nics, you can instruct RFS table to
prefer cpu on the NIC that packet will use for output.

^ permalink raw reply

* Re: [PATCH RESEND 3/5] can: flexcan: adopt pinctrl support
From: Marc Kleine-Budde @ 2012-05-07  8:29 UTC (permalink / raw)
  To: Shawn Guo
  Cc: linux-arm-kernel, Arnd Bergmann, Olof Johansson, Sascha Hauer,
	Dong Aisheng, linux-can, Linux Netdev List
In-Reply-To: <CAAQ0ZWSzrfFm+=m+s8S3FDG5k_OrULN58VUVCBE=fwU0r=ZD+g@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1376 bytes --]

On 05/07/2012 10:17 AM, Shawn Guo wrote:
> On 7 May 2012 16:06, Marc Kleine-Budde <mkl@pengutronix.de> wrote:
>> It doesn't compile yet against net-next/master, which is based on v3.4-rc4:
>>
>> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c: In
>> function 'flexcan_probe':
>> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
>> error: implicit declaration of function 'devm_pinctrl_get_select_default'
>> /home/frogger/pengutronix/socketcan/linux/drivers/net/can/flexcan.c:937:
>> warning: assignment makes pointer from integer without a cast
>>
>> Which tree does this series depend on? If this should go over the
>> linux-can tree, I have to ask David first to merge this.
>>
> Thanks for the response, Marc.  The patch depends on pinctrl tree and
> a couple of other patches that will go through arm-soc tree, so I
> would like to ask for your ack to have the patch go over arm-soc tree.

Fine with me. David, is this okay? Shawn, better wait for David's okay.

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>

regards, Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]

^ permalink raw reply

* Re: [v12 PATCH 2/3] NETFILTER module xt_hmark, new target for HASH based fwmark
From: Pablo Neira Ayuso @ 2012-05-07  9:03 UTC (permalink / raw)
  To: Hans Schillstrom
  Cc: kaber@trash.net, jengelh@medozas.de,
	netfilter-devel@vger.kernel.org, netdev@vger.kernel.org,
	hans@schillstrom.com
In-Reply-To: <201205071020.44449.hans.schillstrom@ericsson.com>

On Mon, May 07, 2012 at 10:20:42AM +0200, Hans Schillstrom wrote:
> On Monday 07 May 2012 00:57:38 Pablo Neira Ayuso wrote:
> > Hi Hans,
> > 
> > [...]
> > > > > > Regarding ICMP traffic, I think we can use the ID field for the
> > > > > > hashing as well. Thus, we handle ICMP like other protocols.
> > > > >
> > > > > Yes why not, I can give it a try.
> > > > >
> > >
> > > I think we wait with this one..
> > 
> > I see. This is easy to add for the conntrack side, but it will require
> > some extra code for the packet-based solution.
> 
> Actually I think there is very little gain to spread with type 
> and then we must add a user mode possibility to turn it off 
> i.e. a --hmark-icmp-type-mask 
> 
> > Not directly related to this but, I know that your intention is to
> > make this as flexible as possible. However, I still don't find how I
> > would use the port mask feature in any of my setups.  Basically, I
> > don't come up with any useful example for this situation.
> 
> We have plenty of rules where just source port mask is zero.
> and the dest-port-mask is 0xfffc (or 0xffff)

0xffff and 0x0000 means on/off respectively.

Still curious, how can 0xfffc be useful?

> > I'm also telling this because I think that ICMP support will be
> > easier to add if port masking is removed.
> > 
> > [...]
> > > This is what I have done.
> > >
> > > - I reduced the code size a little bit by combining the hmark_ct_set_htuple_ipvX into one func.
> > >   by adding a hmark_addr6_mask() and hmark_addr_any_mask()
> > >   Note that using "otuple->src.l3num" as param 1 in both src and dst is not a typo.
> > >   (it's not set in the rtuple)
> > 
> > Good one, this made the code even smaller.
> > 
> > > - Made the if (dst < src) swap() in the hmark_hash() since it should be used by every caller.
> > 
> > Not really, you don't need for the conntrack part. The original tuple
> > is always the same, not matter where the packet is coming from. I have
> > removed this again so it only affects packet-based hashing.
> 
> Yes original tuple is always the same but not always less than the rtuple.
> If you have two nodes that should produce the same hmark,
> one with conntrack an one without you must make a compare to make it consistent.

I see, for consistency still makes sense although this seems to me
like still strange configuration. In what scenario would you use two
different approaches?

> > > - Moved the L3 check a little bit earlier.
> > 
> > good.
> > 
> > > - changed return values for fragments.
> > 
> > With this, you're giving up on trying to classify fragments. Do you
> > really want this?
> > 
> > From my point of view, if your firewalls (assuming they are the HMARK
> > classification) are stateless, it still makes sense to me to classify
> > fragments using the XT_HMARK_METHOD_L3_4.
> 
> I do agree, it is back to "return 0" again.

OK.

^ permalink raw reply

* [net-next 1/2] stmmac: extend mac addr reg and fix perfect filering
From: Giuseppe CAVALLARO @ 2012-05-07  9:12 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro, Gianni Antoniazzi

This patch is to extend the number of MAC address registers
for 16 to 32. In fact, other new 16 registers are available in new
chips and this can help on perfect filter mode for unicast.

This patch also fixes the perfect filtering mode by setting the
bit 31 in the MAC address registers.

Signed-off-by: Gianni Antoniazzi <gianni.antoniazzi-ext@st.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/common.h       |    2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |    8 +++++---
 .../net/ethernet/stmicro/stmmac/dwmac1000_core.c   |   11 +++++++++--
 .../net/ethernet/stmicro/stmmac/dwmac100_core.c    |    2 +-
 drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c    |    7 ++++++-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h       |    1 +
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |    4 ++--
 7 files changed, 25 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index f5dedcb..7164509 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -280,7 +280,7 @@ struct stmmac_ops {
 	/* Handle extra events on specific interrupts hw dependent */
 	void (*host_irq_status) (void __iomem *ioaddr);
 	/* Multicast filter setting */
-	void (*set_filter) (struct net_device *dev);
+	void (*set_filter) (struct net_device *dev, int id);
 	/* Flow control setting */
 	void (*flow_ctrl) (void __iomem *ioaddr, unsigned int duplex,
 			   unsigned int fc, unsigned int pause_time);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 54339a7..53ed56b 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -61,9 +61,11 @@ enum power_event {
 };
 
 /* GMAC HW ADDR regs */
-#define GMAC_ADDR_HIGH(reg)		(0x00000040+(reg * 8))
-#define GMAC_ADDR_LOW(reg)		(0x00000044+(reg * 8))
-#define GMAC_MAX_UNICAST_ADDRESSES	16
+#define GMAC_ADDR_HIGH(reg)	(((reg > 15) ? 0x00000800 : 0x00000040) + \
+				(reg * 8))
+#define GMAC_ADDR_LOW(reg)	(((reg > 15) ? 0x00000804 : 0x00000044) + \
+				(reg * 8))
+#define GMAC_MAX_PERFECT_ADDRESSES	32
 
 #define GMAC_AN_CTRL	0x000000c0	/* AN control */
 #define GMAC_AN_STATUS	0x000000c4	/* AN status */
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
index e7cbcd9..c32af05 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_core.c
@@ -84,10 +84,11 @@ static void dwmac1000_get_umac_addr(void __iomem *ioaddr, unsigned char *addr,
 				GMAC_ADDR_LOW(reg_n));
 }
 
-static void dwmac1000_set_filter(struct net_device *dev)
+static void dwmac1000_set_filter(struct net_device *dev, int id)
 {
 	void __iomem *ioaddr = (void __iomem *) dev->base_addr;
 	unsigned int value = 0;
+	unsigned int perfect_addr_number;
 
 	CHIP_DBG(KERN_INFO "%s: # mcasts %d, # unicast %d\n",
 		 __func__, netdev_mc_count(dev), netdev_uc_count(dev));
@@ -121,8 +122,14 @@ static void dwmac1000_set_filter(struct net_device *dev)
 		writel(mc_filter[1], ioaddr + GMAC_HASH_HIGH);
 	}
 
+	/* Extra 16 regs are available in cores newer than the 3.40. */
+	if (id > 34)
+		perfect_addr_number = GMAC_MAX_PERFECT_ADDRESSES;
+	else
+		perfect_addr_number = GMAC_MAX_PERFECT_ADDRESSES / 2;
+
 	/* Handle multiple unicast addresses (perfect filtering)*/
-	if (netdev_uc_count(dev) > GMAC_MAX_UNICAST_ADDRESSES)
+	if (netdev_uc_count(dev) > perfect_addr_number)
 		/* Switch to promiscuous mode is more than 16 addrs
 		   are required */
 		value |= GMAC_FRAME_FILTER_PR;
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
index efde50f..19e0f4e 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_core.c
@@ -89,7 +89,7 @@ static void dwmac100_get_umac_addr(void __iomem *ioaddr, unsigned char *addr,
 	stmmac_get_mac_addr(ioaddr, addr, MAC_ADDR_HIGH, MAC_ADDR_LOW);
 }
 
-static void dwmac100_set_filter(struct net_device *dev)
+static void dwmac100_set_filter(struct net_device *dev, int id)
 {
 	void __iomem *ioaddr = (void __iomem *) dev->base_addr;
 	u32 value = readl(ioaddr + MAC_CONTROL);
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
index f20aa12..3edbfa2 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac_lib.c
@@ -31,6 +31,8 @@
 #define DWMAC_LIB_DBG(fmt, args...)  do { } while (0)
 #endif
 
+#define GMAC_HI_REG_AE		0x80000000
+
 /* CSR1 enables the transmit DMA to check for new descriptor */
 void dwmac_enable_dma_transmission(void __iomem *ioaddr)
 {
@@ -233,7 +235,10 @@ void stmmac_set_mac_addr(void __iomem *ioaddr, u8 addr[6],
 	unsigned long data;
 
 	data = (addr[5] << 8) | addr[4];
-	writel(data, ioaddr + high);
+	/* For MAC Addr registers se have to set the Address Enable (AE)
+	 * bit that has no effect on the High Reg 0 where the bit 31 (MO)
+	 * is RO. */
+	writel(data | GMAC_HI_REG_AE, ioaddr + high);
 	data = (addr[3] << 24) | (addr[2] << 16) | (addr[1] << 8) | addr[0];
 	writel(data, ioaddr + low);
 }
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
index db2de9a..6b5d060 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h
@@ -85,6 +85,7 @@ struct stmmac_priv {
 	struct clk *stmmac_clk;
 #endif
 	int clk_csr;
+	int synopsys_id;
 };
 
 extern int phyaddr;
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 1a4cf81..a9699ae 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -1465,7 +1465,7 @@ static void stmmac_set_rx_mode(struct net_device *dev)
 	struct stmmac_priv *priv = netdev_priv(dev);
 
 	spin_lock(&priv->lock);
-	priv->hw->mac->set_filter(dev);
+	priv->hw->mac->set_filter(dev, priv->synopsys_id);
 	spin_unlock(&priv->lock);
 }
 
@@ -1806,7 +1806,7 @@ static int stmmac_hw_init(struct stmmac_priv *priv)
 	priv->hw->ring = &ring_mode_ops;
 
 	/* Get and dump the chip ID */
-	stmmac_get_synopsys_id(priv);
+	priv->synopsys_id = stmmac_get_synopsys_id(priv);
 
 	/* Get the HW capability (new GMAC newer than 3.50a) */
 	priv->hw_cap_support = stmmac_get_hw_features(priv);
-- 
1.7.4.4

^ permalink raw reply related

* [net-next 2/2] stmmac: add mixed burst for DMA
From: Giuseppe CAVALLARO @ 2012-05-07  9:12 UTC (permalink / raw)
  To: netdev; +Cc: Giuseppe Cavallaro
In-Reply-To: <1336381953-18041-1-git-send-email-peppe.cavallaro@st.com>

In mixed burst (MB) mode, the AHB master always initiates
the bursts with fixed-size when the DMA requests transfers
of size less than or equal to 16 beats.
This patch adds the MB support and the flag that can be
passed from the platform to select it.
MB mode can also give some benefits in terms of performances
on some platforms.

Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
 drivers/net/ethernet/stmicro/stmmac/common.h       |    4 ++--
 drivers/net/ethernet/stmicro/stmmac/dwmac1000.h    |    1 +
 .../net/ethernet/stmicro/stmmac/dwmac1000_dma.c    |    6 +++++-
 drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c |    2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c  |    6 ++++--
 include/linux/stmmac.h                             |    1 +
 6 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index 7164509..b343b71 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -247,8 +247,8 @@ struct stmmac_desc_ops {
 
 struct stmmac_dma_ops {
 	/* DMA core initialization */
-	int (*init) (void __iomem *ioaddr, int pbl, int fb, int burst_len,
-			u32 dma_tx, u32 dma_rx);
+	int (*init) (void __iomem *ioaddr, int pbl, int fb, int mb,
+			int burst_len, u32 dma_tx, u32 dma_rx);
 	/* Dump DMA registers */
 	void (*dump_regs) (void __iomem *ioaddr);
 	/* Set tx/rx threshold in the csr6 register
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
index 53ed56b..f02162f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000.h
@@ -141,6 +141,7 @@ enum rx_tx_priority_ratio {
 };
 
 #define DMA_BUS_MODE_FB		0x00010000	/* Fixed burst */
+#define DMA_BUS_MODE_MB		0x04000000	/* Mixed burst */
 #define DMA_BUS_MODE_RPBL_MASK	0x003e0000	/* Rx-Programmable Burst Len */
 #define DMA_BUS_MODE_RPBL_SHIFT	17
 #define DMA_BUS_MODE_USP	0x00800000
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
index 3675c57..15e1aa1 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac1000_dma.c
@@ -30,7 +30,7 @@
 #include "dwmac1000.h"
 #include "dwmac_dma.h"
 
-static int dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb,
+static int dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
 			      int burst_len, u32 dma_tx, u32 dma_rx)
 {
 	u32 value = readl(ioaddr + DMA_BUS_MODE);
@@ -66,6 +66,10 @@ static int dwmac1000_dma_init(void __iomem *ioaddr, int pbl, int fb,
 	if (fb)
 		value |= DMA_BUS_MODE_FB;
 
+	/* Mixed Burst has no effect when fb is set */
+	if (mb)
+		value |= DMA_BUS_MODE_MB;
+
 #ifdef CONFIG_STMMAC_DA
 	value |= DMA_BUS_MODE_DA;	/* Rx has priority over tx */
 #endif
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
index 92ed2e0..e4eca62 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac100_dma.c
@@ -32,7 +32,7 @@
 #include "dwmac100.h"
 #include "dwmac_dma.h"
 
-static int dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb,
+static int dwmac100_dma_init(void __iomem *ioaddr, int pbl, int fb, int mb,
 			     int burst_len, u32 dma_tx, u32 dma_rx)
 {
 	u32 value = readl(ioaddr + DMA_BUS_MODE);
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a9699ae..4fd62ff 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -924,7 +924,8 @@ static void stmmac_check_ether_addr(struct stmmac_priv *priv)
 
 static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 {
-	int pbl = DEFAULT_DMA_PBL, fixed_burst = 0, burst_len = 0;
+	int pbl = DEFAULT_DMA_PBL, fixed_burst = 0, burst_len = 0,
+	    mixed_burst = 0;
 
 	/* Some DMA parameters can be passed from the platform;
 	 * in case of these are not passed we keep a default
@@ -932,10 +933,11 @@ static int stmmac_init_dma_engine(struct stmmac_priv *priv)
 	if (priv->plat->dma_cfg) {
 		pbl = priv->plat->dma_cfg->pbl;
 		fixed_burst = priv->plat->dma_cfg->fixed_burst;
+		mixed_burst = priv->plat->dma_cfg->mixed_burst;
 		burst_len = priv->plat->dma_cfg->burst_len;
 	}
 
-	return priv->hw->dma->init(priv->ioaddr, pbl, fixed_burst,
+	return priv->hw->dma->init(priv->ioaddr, pbl, fixed_burst, mixed_burst,
 				   burst_len, priv->dma_tx_phy,
 				   priv->dma_rx_phy);
 }
diff --git a/include/linux/stmmac.h b/include/linux/stmmac.h
index f85c93d..b69bdb1 100644
--- a/include/linux/stmmac.h
+++ b/include/linux/stmmac.h
@@ -86,6 +86,7 @@ struct stmmac_mdio_bus_data {
 struct stmmac_dma_cfg {
 	int pbl;
 	int fixed_burst;
+	int mixed_burst;
 	int burst_len;
 };
 
-- 
1.7.4.4

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox