Netdev List
 help / color / mirror / Atom feed
* [PATCH 2/3] netfilter: ipt_ULOG: fix info leaks
From: Pablo Neira Ayuso @ 2013-10-23  9:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1382519724-3953-1-git-send-email-pablo@netfilter.org>

From: Mathias Krause <minipli@googlemail.com>

The ulog messages leak heap bytes by the means of padding bytes and
incompletely filled string arrays. Fix those by memset(0)'ing the
whole struct before filling it.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/ipv4/netfilter/ipt_ULOG.c |    7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/net/ipv4/netfilter/ipt_ULOG.c b/net/ipv4/netfilter/ipt_ULOG.c
index cbc2215..9cb993c 100644
--- a/net/ipv4/netfilter/ipt_ULOG.c
+++ b/net/ipv4/netfilter/ipt_ULOG.c
@@ -220,6 +220,7 @@ static void ipt_ulog_packet(struct net *net,
 	ub->qlen++;
 
 	pm = nlmsg_data(nlh);
+	memset(pm, 0, sizeof(*pm));
 
 	/* We might not have a timestamp, get one */
 	if (skb->tstamp.tv64 == 0)
@@ -238,8 +239,6 @@ static void ipt_ulog_packet(struct net *net,
 	}
 	else if (loginfo->prefix[0] != '\0')
 		strncpy(pm->prefix, loginfo->prefix, sizeof(pm->prefix));
-	else
-		*(pm->prefix) = '\0';
 
 	if (in && in->hard_header_len > 0 &&
 	    skb->mac_header != skb->network_header &&
@@ -251,13 +250,9 @@ static void ipt_ulog_packet(struct net *net,
 
 	if (in)
 		strncpy(pm->indev_name, in->name, sizeof(pm->indev_name));
-	else
-		pm->indev_name[0] = '\0';
 
 	if (out)
 		strncpy(pm->outdev_name, out->name, sizeof(pm->outdev_name));
-	else
-		pm->outdev_name[0] = '\0';
 
 	/* copy_len <= skb->len, so can't fail. */
 	if (skb_copy_bits(skb, 0, pm->payload, copy_len) < 0)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/3] netfilter: ebt_ulog: fix info leaks
From: Pablo Neira Ayuso @ 2013-10-23  9:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1382519724-3953-1-git-send-email-pablo@netfilter.org>

From: Mathias Krause <minipli@googlemail.com>

The ulog messages leak heap bytes by the means of padding bytes and
incompletely filled string arrays. Fix those by memset(0)'ing the
whole struct before filling it.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/bridge/netfilter/ebt_ulog.c |    9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/bridge/netfilter/ebt_ulog.c b/net/bridge/netfilter/ebt_ulog.c
index 5180938..7c470c3 100644
--- a/net/bridge/netfilter/ebt_ulog.c
+++ b/net/bridge/netfilter/ebt_ulog.c
@@ -181,6 +181,7 @@ static void ebt_ulog_packet(struct net *net, unsigned int hooknr,
 	ub->qlen++;
 
 	pm = nlmsg_data(nlh);
+	memset(pm, 0, sizeof(*pm));
 
 	/* Fill in the ulog data */
 	pm->version = EBT_ULOG_VERSION;
@@ -193,8 +194,6 @@ static void ebt_ulog_packet(struct net *net, unsigned int hooknr,
 	pm->hook = hooknr;
 	if (uloginfo->prefix != NULL)
 		strcpy(pm->prefix, uloginfo->prefix);
-	else
-		*(pm->prefix) = '\0';
 
 	if (in) {
 		strcpy(pm->physindev, in->name);
@@ -204,16 +203,14 @@ static void ebt_ulog_packet(struct net *net, unsigned int hooknr,
 			strcpy(pm->indev, br_port_get_rcu(in)->br->dev->name);
 		else
 			strcpy(pm->indev, in->name);
-	} else
-		pm->indev[0] = pm->physindev[0] = '\0';
+	}
 
 	if (out) {
 		/* If out exists, then out is a bridge port */
 		strcpy(pm->physoutdev, out->name);
 		/* rcu_read_lock()ed by nf_hook_slow */
 		strcpy(pm->outdev, br_port_get_rcu(out)->br->dev->name);
-	} else
-		pm->outdev[0] = pm->physoutdev[0] = '\0';
+	}
 
 	if (skb_copy_bits(skb, -ETH_HLEN, pm->data, copy_len) < 0)
 		BUG();
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/3] netfilter fixes for net
From: Pablo Neira Ayuso @ 2013-10-23  9:15 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains three netfilter fixes for your net
tree, they are:

* A couple of fixes to resolve info leak to userspace due to uninitialized
  memory area in ulogd, from Mathias Krause.

* Fix instruction ordering issues that may lead to the access of
  uninitialized data in x_tables. The problem involves the table update
 (producer) and the main packet matching (consumer) routines. Detected in
  SMP ARMv7, from Will Deacon.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

Thanks!

----------------------------------------------------------------

The following changes since commit c31eeaced22ce8bd61268a3c595d542bb38c0a4f:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2013-10-01 12:58:48 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git master

for you to fetch changes up to b416c144f46af1a30ddfa4e4319a8f077381ad63:

  netfilter: x_tables: fix ordering of jumpstack allocation and table update (2013-10-22 10:11:29 +0200)

----------------------------------------------------------------
Mathias Krause (2):
      netfilter: ebt_ulog: fix info leaks
      netfilter: ipt_ULOG: fix info leaks

Will Deacon (1):
      netfilter: x_tables: fix ordering of jumpstack allocation and table update

 net/bridge/netfilter/ebt_ulog.c |    9 +++------
 net/ipv4/netfilter/arp_tables.c |    5 +++++
 net/ipv4/netfilter/ip_tables.c  |    5 +++++
 net/ipv4/netfilter/ipt_ULOG.c   |    7 +------
 net/ipv6/netfilter/ip6_tables.c |    5 +++++
 net/netfilter/x_tables.c        |    7 ++++++-
 6 files changed, 25 insertions(+), 13 deletions(-)

^ permalink raw reply

* [PATCH net-next 3/3] inet: remove old fragmentation hash initializing
From: Hannes Frederic Sowa @ 2013-10-23  9:06 UTC (permalink / raw)
  To: netdev; +Cc: netfilter-devel, Hannes Frederic Sowa, David S. Miller,
	Eric Dumazet
In-Reply-To: <1382519217-750-1-git-send-email-hannes@stressinduktion.org>

All fragmentation hash secrets now get initialized by their
corresponding hash function with net_get_random_once. Thus we can
eliminate the initial seeding.

Also provide a comment that hash secret seeding happens at the first
call to the corresponding hashing function.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/inet_frag.h  | 4 ++++
 net/ipv4/inet_fragment.c | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index bfcbc00..6f59de9 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -64,6 +64,10 @@ struct inet_frags {
 	rwlock_t		lock ____cacheline_aligned_in_smp;
 	int			secret_interval;
 	struct timer_list	secret_timer;
+
+	/* The first call to hashfn is responsible to initialize
+	 * rnd. This is best done with net_get_random_once.
+	 */
 	u32			rnd;
 	int			qsize;
 
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index c5313a9..bb075fc 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -93,9 +93,6 @@ void inet_frags_init(struct inet_frags *f)
 	}
 	rwlock_init(&f->lock);
 
-	f->rnd = (u32) ((totalram_pages ^ (totalram_pages >> 7)) ^
-				   (jiffies ^ (jiffies >> 6)));
-
 	setup_timer(&f->secret_timer, inet_frag_secret_rebuild,
 			(unsigned long)f);
 	f->secret_timer.expires = jiffies + f->secret_interval;
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 2/3] ipv6: split inet6_hash_frag for netfilter and initialize secrets with net_get_random_once
From: Hannes Frederic Sowa @ 2013-10-23  9:06 UTC (permalink / raw)
  To: netdev
  Cc: netfilter-devel, Hannes Frederic Sowa, David S. Miller,
	Eric Dumazet, Pablo Neira Ayuso
In-Reply-To: <1382519217-750-1-git-send-email-hannes@stressinduktion.org>

Defer the fragmentation hash secret initialization for IPv6 like the
previous patch did for IPv4.

Because the netfilter logic reuses the hash secret we have to split it
first. Thus introduce a new nf_hash_frag function which takes care to
seed the hash secret.

Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 include/net/ipv6.h                      |  2 --
 net/ipv6/netfilter/nf_conntrack_reasm.c | 16 ++++++++++++++--
 net/ipv6/reassembly.c                   | 12 ++++++------
 3 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index a35055f..dd96638 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -805,8 +805,6 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf);
 int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf,
 		  struct group_filter __user *optval, int __user *optlen);
-unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
-			     const struct in6_addr *daddr, u32 rnd);
 
 #ifdef CONFIG_PROC_FS
 int ac6_proc_init(struct net *net);
diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index dffdc1a..4a25826 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -144,12 +144,24 @@ static inline u8 ip6_frag_ecn(const struct ipv6hdr *ipv6h)
 	return 1 << (ipv6_get_dsfield(ipv6h) & INET_ECN_MASK);
 }
 
+static unsigned int nf_hash_frag(__be32 id, const struct in6_addr *saddr,
+				 const struct in6_addr *daddr)
+{
+	u32 c;
+
+	net_get_random_once(&nf_frags.rnd, sizeof(nf_frags.rnd));
+	c = jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
+			 (__force u32)id, nf_frags.rnd);
+	return c & (INETFRAGS_HASHSZ - 1);
+}
+
+
 static unsigned int nf_hashfn(struct inet_frag_queue *q)
 {
 	const struct frag_queue *nq;
 
 	nq = container_of(q, struct frag_queue, q);
-	return inet6_hash_frag(nq->id, &nq->saddr, &nq->daddr, nf_frags.rnd);
+	return nf_hash_frag(nq->id, &nq->saddr, &nq->daddr);
 }
 
 static void nf_skb_free(struct sk_buff *skb)
@@ -185,7 +197,7 @@ static inline struct frag_queue *fq_find(struct net *net, __be32 id,
 	arg.ecn = ecn;
 
 	read_lock_bh(&nf_frags.lock);
-	hash = inet6_hash_frag(id, src, dst, nf_frags.rnd);
+	hash = nf_hash_frag(id, src, dst);
 
 	q = inet_frag_find(&net->nf_frag.frags, &nf_frags, &arg, hash);
 	local_bh_enable();
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 1aeb473..cc85a9b 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -82,24 +82,24 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *prev,
  * callers should be careful not to use the hash value outside the ipfrag_lock
  * as doing so could race with ipfrag_hash_rnd being recalculated.
  */
-unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
-			     const struct in6_addr *daddr, u32 rnd)
+static unsigned int inet6_hash_frag(__be32 id, const struct in6_addr *saddr,
+				    const struct in6_addr *daddr)
 {
 	u32 c;
 
+	net_get_random_once(&ip6_frags.rnd, sizeof(ip6_frags.rnd));
 	c = jhash_3words(ipv6_addr_hash(saddr), ipv6_addr_hash(daddr),
-			 (__force u32)id, rnd);
+			 (__force u32)id, ip6_frags.rnd);
 
 	return c & (INETFRAGS_HASHSZ - 1);
 }
-EXPORT_SYMBOL_GPL(inet6_hash_frag);
 
 static unsigned int ip6_hashfn(struct inet_frag_queue *q)
 {
 	struct frag_queue *fq;
 
 	fq = container_of(q, struct frag_queue, q);
-	return inet6_hash_frag(fq->id, &fq->saddr, &fq->daddr, ip6_frags.rnd);
+	return inet6_hash_frag(fq->id, &fq->saddr, &fq->daddr);
 }
 
 bool ip6_frag_match(struct inet_frag_queue *q, void *a)
@@ -193,7 +193,7 @@ fq_find(struct net *net, __be32 id, const struct in6_addr *src,
 	arg.ecn = ecn;
 
 	read_lock(&ip6_frags.lock);
-	hash = inet6_hash_frag(id, src, dst, ip6_frags.rnd);
+	hash = inet6_hash_frag(id, src, dst);
 
 	q = inet_frag_find(&net->ipv6.frags, &ip6_frags, &arg, hash);
 	if (IS_ERR_OR_NULL(q)) {
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 1/3] ipv4: initialize ip4_frags hash secret as late as possible
From: Hannes Frederic Sowa @ 2013-10-23  9:06 UTC (permalink / raw)
  To: netdev; +Cc: netfilter-devel, Hannes Frederic Sowa, Eric Dumazet,
	David S. Miller
In-Reply-To: <1382519217-750-1-git-send-email-hannes@stressinduktion.org>

Defer the generation of the first hash secret for the ipv4 fragmentation
cache as late as possible.

ip4_frags.rnd gets initial seeded by inet_frags_init and regulary
reseeded by inet_frag_secret_rebuild. Either we call ipqhashfn directly
from ip_fragment.c in which case we initialize the secret directly.

If we first get called by inet_frag_secret_rebuild we install a new secret
by a manual call to get_random_bytes. This secret will be overwritten
as soon as the first call to ipqhashfn happens. This is safe because we
won't race while publishing the new secrets with anyone else.

Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/ipv4/ip_fragment.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index b66910a..2481993 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -106,6 +106,7 @@ struct ip4_create_arg {
 
 static unsigned int ipqhashfn(__be16 id, __be32 saddr, __be32 daddr, u8 prot)
 {
+	net_get_random_once(&ip4_frags.rnd, sizeof(ip4_frags.rnd));
 	return jhash_3words((__force u32)id << 16 | prot,
 			    (__force u32)saddr, (__force u32)daddr,
 			    ip4_frags.rnd) & (INETFRAGS_HASHSZ - 1);
-- 
1.8.3.1

^ permalink raw reply related

* [PATCH net-next 0/3] initialize fragment hash secrets with net_get_random_once
From: Hannes Frederic Sowa @ 2013-10-23  9:06 UTC (permalink / raw)
  To: netdev; +Cc: netfilter-devel

Hi!

This series switches the inet_frag.rnd hash initialization to
net_get_random_once.

Included patches:
 ipv4: initialize ip4_frags hash secret as late
 ipv6: split inet6_hash_frag for netfilter and
 inet: remove old fragmentation hash initializing

Diffstat:
 include/net/inet_frag.h                 |  4 ++++
 include/net/ipv6.h                      |  2 --
 net/ipv4/inet_fragment.c                |  3 ---
 net/ipv4/ip_fragment.c                  |  1 +
 net/ipv6/netfilter/nf_conntrack_reasm.c | 16 ++++++++++++++--
 net/ipv6/reassembly.c                   | 12 ++++++------
 6 files changed, 25 insertions(+), 13 deletions(-)

Greetings,

  Hannes


^ permalink raw reply

* Re: [PATCH net] net: sctp: fix ASCONF to allow non SCTP_ADDR_SRC addresses in ipv6
From: Vladislav Yasevich @ 2013-10-23  8:54 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: David Miller, netdev@vger.kernel.org, linux-sctp@vger.kernel.org,
	Michio Honda
In-Reply-To: <1382459696-1732-1-git-send-email-dborkman@redhat.com>

On Tue, Oct 22, 2013 at 12:34 PM, Daniel Borkmann <dborkman@redhat.com> wrote:
>
> Commit 8a07eb0a50 ("sctp: Add ASCONF operation on the single-homed host")
> implemented possible use of IPv4 addresses with non SCTP_ADDR_SRC state
> as source address when sending ASCONF (ADD) packets, but IPv6 part for
> that was not implemented in 8a07eb0a50. Therefore, as this is not restricted
> to IPv4-only, fix this up to allow the same for IPv6 addresses in SCTP.
>
> Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
> Cc: Michio Honda <micchie@sfc.wide.ad.jp>

[stupid gmail]

Acked-by: Vlad Yasevich <vyasevich@gmail.com>

> ---
>  net/sctp/ipv6.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index e7b2d4f..96a5591 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -279,7 +279,9 @@ static void sctp_v6_get_dst(struct sctp_transport *t, union sctp_addr *saddr,
>                 sctp_v6_to_addr(&dst_saddr, &fl6->saddr, htons(bp->port));
>                 rcu_read_lock();
>                 list_for_each_entry_rcu(laddr, &bp->address_list, list) {
> -                       if (!laddr->valid || (laddr->state != SCTP_ADDR_SRC))
> +                       if (!laddr->valid || laddr->state == SCTP_ADDR_DEL ||
> +                           (laddr->state != SCTP_ADDR_SRC &&
> +                            !asoc->src_out_of_asoc_ok))
>                                 continue;
>
>                         /* Do not compare against v4 addrs */
> --
> 1.8.3.1
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* RE: [PATCH net] netpoll: fix rx_hook() interface by passing the skb
From: David Laight @ 2013-10-23  8:33 UTC (permalink / raw)
  To: Antonio Quartulli; +Cc: David S. Miller, netdev
In-Reply-To: <20131022171314.GL1544@neomailbox.net>

...
> > I can't remember which value you passed as 'offset' (and my mailer makes
> > it hard to find), but to ease the code changes the offset of the udp data
> > would make sense.
> > In that case you still need to pass the source port.
> 
> I decided not to pass the source port because if the user is really interested
> in it, it is still possible to get the udp_hdr from the skb and read its value.

It just seemed that there was no need to require that the hook re-parse
the ip header just to find the source port.
(ok it could assume that the udp header is just before the data)
 
> > If you do rx_hook(np, source_port, skb, offset) then if anyone manages to
> > load an old module (or code that casts the assignement to rx_poll)
> > at least it won't go 'bang'.
> > Renaming the structure member will guarantee to generate compile errors.
> 
> so you suggest to rename rx_hook to something else to warn people about the
> change?

Yes.

> If we go for the "no udp port" approach they will get an error any way because
> of the mismatching arguments.

No - you only get a warning when you assign a function pointer of the wrong type.
And that is true even if you just change the type of the pointer.
However code might already have a cast on the function pointer (eg because the
hook has 'unsigned char *') - so you won't even get a warning.
You then get an OOPS when the hook tries to read the buffer.

It is a really bad interface...
There isn't even a flags/options (etc) word that can be used
to detect enhancements.

	David


^ permalink raw reply

* [PATCH net v3] be2net: Warn users of possible broken functionality on BE2 cards with very old FW versions with latest driver
From: Somnath Kotur @ 2013-10-23  8:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, Somnath Kotur

On very old FW versions < 4.0, the mailbox command to set interrupts
on the card succeeds even though it is not supported and should have
failed, leading to a scenario where interrupts do not work.
Hence warn users to upgrade to a suitable FW version to avoid seeing
broken functionality.

Signed-off-by: Somnath Kotur <somnath.kotur@emulex.com>
---
v3: Incorporated comments from Ben Hutchings and Joe Perches

 drivers/net/ethernet/emulex/benet/be.h      |   14 ++++++++++++++
 drivers/net/ethernet/emulex/benet/be_main.c |    6 ++++++
 2 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/drivers/net/ethernet/emulex/benet/be.h b/drivers/net/ethernet/emulex/benet/be.h
index db02023..6a57051 100644
--- a/drivers/net/ethernet/emulex/benet/be.h
+++ b/drivers/net/ethernet/emulex/benet/be.h
@@ -696,6 +696,20 @@ static inline int qnq_async_evt_rcvd(struct be_adapter *adapter)
 	return adapter->flags & BE_FLAGS_QNQ_ASYNC_EVT_RCVD;
 }
 
+static inline u32 fw_major_num(char *fw_ver_str)
+{
+	u32 fw_major;
+	char *next, *cp;
+	char tmp_fw_ver[FW_VER_LEN];
+
+	strncpy(tmp_fw_ver, fw_ver_str, strlen(fw_ver_str));
+	next = tmp_fw_ver;
+	cp = strsep(&next, ".");
+	sscanf(cp, "%i", &fw_major);
+
+	return fw_major;
+}
+
 extern void be_cq_notify(struct be_adapter *adapter, u16 qid, bool arm,
 		u16 num_popped);
 extern void be_link_status_update(struct be_adapter *adapter, u8 link_status);
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index 2c38cc4..d8da961 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -3247,6 +3247,12 @@ static int be_setup(struct be_adapter *adapter)
 
 	be_cmd_get_fw_ver(adapter, adapter->fw_ver, adapter->fw_on_flash);
 
+	if (BE2_chip(adapter) && fw_major_num(adapter->fw_ver) < 4) {
+		dev_err(dev, "Firmware on card is old(%s), IRQs may not work.",
+			adapter->fw_ver);
+		dev_err(dev, "Please upgrade firmware to version >= 4.0\n");
+	}
+
 	if (adapter->vlans_added)
 		be_vid_config(adapter);
 
-- 
1.6.0.2

^ permalink raw reply related

* Re: Big performance loss from 3.4.63 to 3.10.13 when routing ipv4
From: Steffen Klassert @ 2013-10-23  8:12 UTC (permalink / raw)
  To: David Miller; +Cc: linux, hannes, netdev, klassert
In-Reply-To: <20131022.154638.1989281525248160764.davem@davemloft.net>

On Tue, Oct 22, 2013 at 03:46:38PM -0400, David Miller wrote:
> From: Wolfgang Walter <linux@stwm.de>
> Date: Tue, 22 Oct 2013 21:07:41 +0200
> 
> > Am Mittwoch, 2. Oktober 2013, 00:20:02 schrieb Hannes Frederic Sowa:
> >> On Tue, Oct 01, 2013 at 06:39:32PM +0200, Wolfgang Walter wrote:
> >> > All network traffic over the router become slow and sluggish. If one pings
> >> > the router there is a packet loss. After about 2 minutes the traffic
> >> > completely stalls for about 1 minute. Then it works again as in the
> >> > beginning to then stall again. And so on.
> >> 
> >> Maybe dropwatch can give a first hint?
> >> 
> > 
> > I finally found the problem:
> > 
> > In 3.10.x and 3.11.x the value of /proc/sys/net/ipv4/xfrm4_gc_thresh is 1024.
> > 
> > It is much higher in 3.4.x. If I increase this value in 3.10.x to the one I 
> > see on 3.4.x all works fine with 3.10.x
> 
> Steffen, here is yet another report about this issue.
> 
> I think we should resolve this soon, even bumping it to 2048 or 4096
> and leaving it at that would be I think acceptable.
> 

Yes, of course. Let's use 4096 as the default for ipv4 and ipv6.
I'll take care of it next week.

Thanks!

^ permalink raw reply

* Re: [PATCH net-next 0/2] Removal of struct esp_data
From: Steffen Klassert @ 2013-10-23  8:07 UTC (permalink / raw)
  To: David Miller; +Cc: mathias.krause, netdev, herbert
In-Reply-To: <20131022.140510.1879404122981800534.davem@davemloft.net>

On Tue, Oct 22, 2013 at 02:05:10PM -0400, David Miller wrote:
> From: Steffen Klassert <steffen.klassert@secunet.com>
> Date: Tue, 22 Oct 2013 15:08:22 +0200
> 
> > On Fri, Oct 18, 2013 at 01:55:36PM -0400, David Miller wrote:
> >> From: Mathias Krause <mathias.krause@secunet.com>
> >> Date: Fri, 18 Oct 2013 12:09:03 +0200
> >> 
> >> > This series removes one level of indirection when accessing the aead
> >> > crypto algorithm in ESP transforms by simply removing struct esp_data.
> >> > This results in smaller code and less memory usage per xfrm state.
> >> > 
> >> > Please apply!
> >> 
> >> No objections from me, I'll let Steffen pick this up.
> > 
> > I'm a bit hesitating with removing the padlen field. We resisted
> > several attempts to remove it in the past. It is currenly unused,
> > but it provides the infrastructure for ESP padding as defined
> > in RFC 4303. However, RFC 4303 recommends the use of TFC padding
> > instead to conceal the actual length of the packet. So I'm not
> > sure what's the actual usecase for ESP padding. I'll reconsider
> > this next week when I'm back at office.
> 
> Steffen, is it really the case that we cannot add it back later if we
> really need to?
> 

Well, I thought to either take this as a reminder to implement the
missing stuff or to take the removing patches if this is really obsolete.
I'll do one of both once I'm back from conference week in Edinburgh.

^ permalink raw reply

* Re: [Xen-devel] [PATCH net] xen-netback: add the scenario which now beyond the range time_after_eq().
From: jianhai luan @ 2013-10-23  8:02 UTC (permalink / raw)
  To: Wei Liu, David Laight
  Cc: Jan Beulich, annie li, david.vrabel, ian.campbell, xen-devel,
	netdev
In-Reply-To: <20131018112417.GA20185@zion.uk.xensource.com>


On 2013-10-18 19:24, Wei Liu wrote:
> On Fri, Oct 18, 2013 at 09:40:33AM +0100, David Laight wrote:
>>>> My understanding is this patch does not simply double the span, it is
>>>> just stricter than the original one. Please check my previous comments,
>>>> I paste it here.
>>> No, the code (on a 32-bit arch) just _can't_ handle jiffies differences
>>> beyond 2^32, no matter how cleverly you use the respective macros.
>>> All arithmetic there is done modulo 2^32.
>> I haven't followed this discussion very closely but it might be possible
>> to arrange that the 'incorrect lack of credit' only occurs for a few
>> seconds every time 'jiffies' wraps - instead of half of the time.
>> Then you'd have to be extremely unlucky to hit the timing window.
>>
> As I understand it, this is the idea of this patch -- to narrow down the
> timing window.
Jan,  do you agree the idea or have better suggestion to me.

Thanks,
Jason
>
> Wei.
>
>> 	David
>>
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe netdev" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: BUG: scheduling while atomic dev_set_promiscuity->__dev_notify_flags
From: Nicolas Dichtel @ 2013-10-23  7:48 UTC (permalink / raw)
  To: Alexei Starovoitov, Cong Wang; +Cc: netdev
In-Reply-To: <CAADnVQLY7-crQLO1Tc4MBLX8R66wAZztw-arnfO3NWO9sLW++Q@mail.gmail.com>

Le 23/10/2013 07:34, Alexei Starovoitov a écrit :
> On Tue, Oct 22, 2013 at 8:53 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
>> On Tue, 22 Oct 2013 at 01:04 GMT, Alexei Starovoitov <ast@plumgrid.com> wrote:
>>>
>>> packet_notifier() does rcu_read_lock() before calling into packet_dev_mc() .
>>>
>>> Not sure how to fix it cleanly, other than disabling a notify here.
>>> Any suggestion?
>>>
>>
>> Passing a gfp flag to rtmsg_ifinfo() seems a right fix for me, but I don't
>> know if there is other better way to fix it.
>
> Indeed. rtnl_notify() already accepts gfp_t.
>
> The following diff fixes it for me:
> ---
>   include/linux/rtnetlink.h |    3 ++-
>   net/core/dev.c            |    2 +-
>   net/core/rtnetlink.c      |   12 +++++++++---
>   3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
> index f28544b..0180523 100644
> --- a/include/linux/rtnetlink.h
> +++ b/include/linux/rtnetlink.h
> @@ -15,7 +15,8 @@ extern int rtnetlink_put_metrics(struct sk_buff
> *skb, u32 *metrics);
>   extern int rtnl_put_cacheinfo(struct sk_buff *skb, struct dst_entry *dst,
>                     u32 id, long expires, u32 error);
>
> -extern void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change);
> +void __rtmsg_ifinfo(int type, struct net_device *dev, unsigned
> change, gfp_t flags);
Just nitpiking: putting the type and the name on the same line is better 
'unsigned change'.

> +void rtmsg_ifinfo(int type, struct net_device *dev, unsigned change);
>
>   /* RTNL is used as a global lock for all changes to network configuration  */
>   extern void rtnl_lock(void);
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 0918aad..59b90fe 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -5257,7 +5257,7 @@ void __dev_notify_flags(struct net_device *dev,
> unsigned int old_flags,
>       unsigned int changes = dev->flags ^ old_flags;
>
>       if (gchanges)
> -        rtmsg_ifinfo(RTM_NEWLINK, dev, gchanges);
> +        __rtmsg_ifinfo(RTM_NEWLINK, dev, gchanges, GFP_ATOMIC);
>
>       if (changes & IFF_UP) {
>           if (dev->flags & IFF_UP)
> diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
> index 4aedf03..5931af9 100644
> --- a/net/core/rtnetlink.c
> +++ b/net/core/rtnetlink.c
> @@ -1984,14 +1984,15 @@ static int rtnl_dump_all(struct sk_buff *skb,
> struct netlink_callback *cb)
>       return skb->len;
>   }
>
> -void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change)
> +void __rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change,
> +            gfp_t flags)
'gfp_t flags' should be aligned with 'int type'

>   {
>       struct net *net = dev_net(dev);
>       struct sk_buff *skb;
>       int err = -ENOBUFS;
>       size_t if_info_size;
>
> -    skb = nlmsg_new((if_info_size = if_nlmsg_size(dev, 0)), GFP_KERNEL);
> +    skb = nlmsg_new((if_info_size = if_nlmsg_size(dev, 0)), flags);
>       if (skb == NULL)
>           goto errout;
>
> @@ -2002,12 +2003,17 @@ void rtmsg_ifinfo(int type, struct net_device
> *dev, unsigned int change)
>           kfree_skb(skb);
>           goto errout;
>       }
> -    rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, GFP_KERNEL);
> +    rtnl_notify(skb, net, 0, RTNLGRP_LINK, NULL, flags);
>       return;
>   errout:
>       if (err < 0)
>           rtnl_set_sk_err(net, RTNLGRP_LINK, err);
>   }
> +
> +void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change)
> +{
> +    __rtmsg_ifinfo(type, dev, change, GFP_KERNEL);
> +}
>   EXPORT_SYMBOL(rtmsg_ifinfo);
>
>   static int nlmsg_populate_fdb_fill(struct sk_buff *skb,
> --
>
> Nicolas, I've sent you my .config.
I got some pb with my testbed, hence I still didn't reproduce the bug.
I will make more test today, but I think this is the right patch.

> Any better ideas?
No and the patch is right for me (__dev_set_promiscuity() call
audit_log(..., GFP_ATOMIC) already).

^ permalink raw reply

* [PATCH 7/7] net: via-rhine: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:09 UTC (permalink / raw)
  To: 'David S. Miller'
  Cc: netdev, 'Jingoo Han', 'Roger Luethi'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/via/via-rhine.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/via/via-rhine.c b/drivers/net/ethernet/via/via-rhine.c
index bdf697b..4a7293e 100644
--- a/drivers/net/ethernet/via/via-rhine.c
+++ b/drivers/net/ethernet/via/via-rhine.c
@@ -2292,7 +2292,6 @@ static void rhine_remove_one(struct pci_dev *pdev)
 
 	free_netdev(dev);
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 }
 
 static void rhine_shutdown (struct pci_dev *pdev)
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 6/7] net: tc35815: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:09 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: netdev, 'Jingoo Han'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/toshiba/tc35815.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/toshiba/tc35815.c b/drivers/net/ethernet/toshiba/tc35815.c
index a971b9c..1322546 100644
--- a/drivers/net/ethernet/toshiba/tc35815.c
+++ b/drivers/net/ethernet/toshiba/tc35815.c
@@ -887,7 +887,6 @@ static void tc35815_remove_one(struct pci_dev *pdev)
 	mdiobus_free(lp->mii_bus);
 	unregister_netdev(dev);
 	free_netdev(dev);
-	pci_set_drvdata(pdev, NULL);
 }
 
 static int
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 5/7] net: spider_net: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:09 UTC (permalink / raw)
  To: 'David S. Miller'
  Cc: netdev, 'Jingoo Han', 'Ishizaki Kou',
	'Jens Osterkamp'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/toshiba/spider_net.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/toshiba/spider_net.c b/drivers/net/ethernet/toshiba/spider_net.c
index 5734480..3f4a32e 100644
--- a/drivers/net/ethernet/toshiba/spider_net.c
+++ b/drivers/net/ethernet/toshiba/spider_net.c
@@ -2478,7 +2478,6 @@ out_release_regions:
 	pci_release_regions(pdev);
 out_disable_dev:
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 	return NULL;
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 4/7] net: tlan: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:08 UTC (permalink / raw)
  To: 'David S. Miller'
  Cc: netdev, 'Jingoo Han', 'Samuel Chessman'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/ti/tlan.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/ti/tlan.c b/drivers/net/ethernet/ti/tlan.c
index 591437e..62b19be 100644
--- a/drivers/net/ethernet/ti/tlan.c
+++ b/drivers/net/ethernet/ti/tlan.c
@@ -319,7 +319,6 @@ static void tlan_remove_one(struct pci_dev *pdev)
 
 	free_netdev(dev);
 
-	pci_set_drvdata(pdev, NULL);
 	cancel_work_sync(&priv->tlan_tqueue);
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 3/7] net: tehuti: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:07 UTC (permalink / raw)
  To: 'David S. Miller'
  Cc: netdev, 'Jingoo Han', 'Andy Gospodarek'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/tehuti/tehuti.c |    1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/tehuti/tehuti.c b/drivers/net/ethernet/tehuti/tehuti.c
index 571452e..dd0dd627 100644
--- a/drivers/net/ethernet/tehuti/tehuti.c
+++ b/drivers/net/ethernet/tehuti/tehuti.c
@@ -2447,7 +2447,6 @@ static void bdx_remove(struct pci_dev *pdev)
 	iounmap(nic->regs);
 	pci_release_regions(pdev);
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 	vfree(nic);
 
 	RET();
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 2/7] net: niu: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:07 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: netdev, 'Jingoo Han'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/sun/niu.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/sun/niu.c b/drivers/net/ethernet/sun/niu.c
index f28460c..388540f 100644
--- a/drivers/net/ethernet/sun/niu.c
+++ b/drivers/net/ethernet/sun/niu.c
@@ -9875,7 +9875,6 @@ err_out_free_res:
 
 err_out_disable_pdev:
 	pci_disable_device(pdev);
-	pci_set_drvdata(pdev, NULL);
 
 	return err;
 }
@@ -9900,7 +9899,6 @@ static void niu_pci_remove_one(struct pci_dev *pdev)
 		free_netdev(dev);
 		pci_release_regions(pdev);
 		pci_disable_device(pdev);
-		pci_set_drvdata(pdev, NULL);
 	}
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 1/7] net: sungem: remove unnecessary pci_set_drvdata()
From: Jingoo Han @ 2013-10-23  7:06 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: netdev, 'Jingoo Han'
In-Reply-To: <000001cecfbe$44ce9c10$ce6bd430$%han@samsung.com>

The driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

Signed-off-by: Jingoo Han <jg1.han@samsung.com>
---
 drivers/net/ethernet/sun/sungem.c |    2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/sun/sungem.c b/drivers/net/ethernet/sun/sungem.c
index a235bd9..b5655b7 100644
--- a/drivers/net/ethernet/sun/sungem.c
+++ b/drivers/net/ethernet/sun/sungem.c
@@ -2806,8 +2806,6 @@ static void gem_remove_one(struct pci_dev *pdev)
 		iounmap(gp->regs);
 		pci_release_regions(pdev);
 		free_netdev(dev);
-
-		pci_set_drvdata(pdev, NULL);
 	}
 }
 
-- 
1.7.10.4

^ permalink raw reply related

* [PATCH 0/7] net: ethernet: remove unnecessary pci_set_drvdata() part 4
From: Jingoo Han @ 2013-10-23  7:05 UTC (permalink / raw)
  To: 'David S. Miller'; +Cc: netdev, 'Jingoo Han'

Since commit 0998d0631001288a5974afc0b2a5f568bcdecb4d
(device-core: Ensure drvdata = NULL when no driver is bound),
the driver core clears the driver data to NULL after device_release
or on probe failure. Thus, it is not needed to manually clear the
device driver data to NULL.

---
 drivers/net/ethernet/sun/niu.c            |    2 --
 drivers/net/ethernet/sun/sungem.c         |    2 --
 drivers/net/ethernet/tehuti/tehuti.c      |    1 -
 drivers/net/ethernet/ti/tlan.c            |    1 -
 drivers/net/ethernet/toshiba/spider_net.c |    1 -
 drivers/net/ethernet/toshiba/tc35815.c    |    1 -
 drivers/net/ethernet/via/via-rhine.c      |    1 -
 7 files changed, 9 deletions(-)

^ permalink raw reply

* Re: [PATCH] packet: Deliver VLAN TPID to userspace
From: Atzm Watanabe @ 2013-10-23  6:51 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: Ben Hutchings, netdev
In-Reply-To: <20131022154110.518282ba@nehalam.linuxnetplumber.net>

At Tue, 22 Oct 2013 15:41:10 -0700,
Stephen Hemminger wrote:
> 
> On Tue, 22 Oct 2013 11:56:31 +0900
> Atzm Watanabe <atzm@stratosphere.co.jp> wrote:
> 
> > Hmm...  I think TPACKET{,2,3}_HDRLEN should not be removed without
> > careful considerations.  Because some userspace programs (e.g libpcap)
> > are using them in order to check mmap ability of the kernel...
> 
> 
> That's bad because it means the library then depends on the headers
> of the machine it was built on, not the machine it is running on. I often
> build software on boxes where /usr/include version of kernel headers is out dated.

Yes, this means that building such library with the new header
(TPACKET{,2,3}_HDRLEN were deleted) will make the binary that does
not support PACKET_MMAP. :-(
So, IMHO, we should be prudent in deleting TPACKET{,2,3}_HDRLEN,
otherwise we will break such userspace programs very easily...

e.g. snippet of the latest libpcap-1.4.0 (pcap-linux.c):
   /* check for memory mapped access avaibility. We assume every needed 
    * struct is defined if the macro TPACKET_HDRLEN is defined, because it
    * uses many ring related structs and macros */
  # ifdef TPACKET_HDRLEN
  #  define HAVE_PACKET_RING
  #  ifdef TPACKET2_HDRLEN
  #   define HAVE_TPACKET2
  #  else
  #   define TPACKET_V1   0
  #  endif /* TPACKET2_HDRLEN */
  # endif /* TPACKET_HDRLEN */

It looks like PACKET_MMAP is used when HAVE_PACKET_RING is defined.

^ permalink raw reply

* [PATCH net-next] net: always inline net_secret_init
From: Hannes Frederic Sowa @ 2013-10-23  6:44 UTC (permalink / raw)
  To: netdev

Currently net_secret_init does not get inlined, so we always have a call
to net_secret_init even in the fast path.

Let's specify net_secret_init as __always_inline so we have the nop in
the fast-path without the call to net_secret_init and the unlikely path
at the epilogue of the function.

jump_labels handle the inlining correctly.

Signed-off-by: Hannes Frederic Sowa <hannes@stressinduktion.org>
---
 net/core/secure_seq.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/secure_seq.c b/net/core/secure_seq.c
index b02fd16..90e8a82 100644
--- a/net/core/secure_seq.c
+++ b/net/core/secure_seq.c
@@ -15,7 +15,7 @@
 
 static u32 net_secret[NET_SECRET_SIZE] ____cacheline_aligned;
 
-static void net_secret_init(void)
+static __always_inline void net_secret_init(void)
 {
 	net_get_random_once(net_secret, sizeof(net_secret));
 }
-- 
1.8.3.1

^ permalink raw reply related

* Re: [PATCH v2.44 1/5] odp: Allow VLAN actions after MPLS actions
From: Simon Horman @ 2013-10-23  6:30 UTC (permalink / raw)
  To: Ben Pfaff
  Cc: Joe Stringer, dev, netdev, Jesse Gross, Pravin B Shelar, Ravi K,
	Isaku Yamahata, Joe Stringer
In-Reply-To: <20131022205541.GA29754@nicira.com>

On Tue, Oct 22, 2013 at 01:55:41PM -0700, Ben Pfaff wrote:
> On Tue, Oct 22, 2013 at 11:30:26AM -0700, Joe Stringer wrote:
> > You're quite right. I think for OF1.2, this is similar to existing
> > behaviour, but for OF1.3 it's just incorrect. There is an additional issue
> > with the LOAD action.
> > 
> > OF1.2:
> > "push_vlan(A),push_mpls,push_vlan(B)"
> > In OF1.2, this has the same result as:-
> > "push_mpls,push_vlan(A),push_vlan(B)"
> > 
> > When translated, this boils down to "(pop_vlan,)push_mpls,push_vlan(B)".
> > Correct me if I am wrong, but I think this is similar to the existing
> > behaviour, as we currently only support one layer of VLAN. It doesn't
> > matter if A == B.
> > 
> > OF1.3:
> > "mod_vlan_vid:A,push_mpls:0x8847,mod_vlan_vid:A" should work, but the
> > second mod_vlan is being dropped as it has the same VID as the first. This
> > is incorrect, as you point out.
> > 
> > LOAD:
> > "load:A->OXM_OF_VLAN_VID,push_mpls:0x8847,load:A->OXM_OF_VLAN_VID" doesn't
> > result in the correct vlan_vid from the first load action. I'm not sure
> > that vlan_tci_restore() is clear or correct---I believe its original
> > purpose was to properly handle the case where MPLS changes are made from
> > REG_LOAD and friends, in the way that the PUSH_MPLS case works. Instead, it
> > is handling all cases where MPLS actions have been applied in the past,
> > whether the current action modifies MPLS or not.
> > 
> > It's probably also worthwhile to make use of the ovs-ofctl monitor "-m"
> > option in the tests to actually verify these, rather than the current tests
> > where we just check the size of the resulting packet.
> 
> Thanks for the careful analysis.

Indeed, thanks Joe, I'll fix this up.

The idea that I have at this time is to track the VLAN depth delta
in a similar way to what is done for MPLS. But it is not a particularly
well thought out idea at this stage.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox