* Re: [PATCH] net-netlink: Add a new attribute to expose TCLASS values via netlink
From: Maciej Żenczykowski @ 2011-11-08 0:25 UTC (permalink / raw)
To: Maciej Żenczykowski
Cc: netdev, Murali Raja, Stephen Hemminger, Eric Dumazet,
David S. Miller
In-Reply-To: <1320711791-11005-1-git-send-email-zenczykowski@gmail.com>
FYI, this obviously requires a follow-up patch which will add TOS and
TCLASS info to the appropriate dual-stack sockets (for example, listening
TCP v6 sockets that are not ipv6only).
On Mon, Nov 7, 2011 at 4:23 PM, Maciej Żenczykowski
<zenczykowski@gmail.com> wrote:
> From: Maciej Żenczykowski <maze@google.com>
>
> commit 3ceca749668a52bd795585e0f71c6f0b04814f7b added a TOS attribute.
>
> Unfortunately TOS and TCLASS are both present in a dual-stack v6 socket,
> furthermore they can have different values. As such one cannot in a
> sane way expose both through a single attribute.
>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> CC: Murali Raja <muralira@google.com>
> CC: Stephen Hemminger <shemminger@vyatta.com>
> CC: Eric Dumazet <eric.dumazet@gmail.com>
> CC: David S. Miller <davem@davemloft.net>
> ---
> include/linux/inet_diag.h | 3 ++-
> net/ipv4/inet_diag.c | 4 ++--
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/inet_diag.h b/include/linux/inet_diag.h
> index 80b480c..abf5028 100644
> --- a/include/linux/inet_diag.h
> +++ b/include/linux/inet_diag.h
> @@ -98,9 +98,10 @@ enum {
> INET_DIAG_VEGASINFO,
> INET_DIAG_CONG,
> INET_DIAG_TOS,
> + INET_DIAG_TCLASS,
> };
>
> -#define INET_DIAG_MAX INET_DIAG_TOS
> +#define INET_DIAG_MAX INET_DIAG_TCLASS
>
>
> /* INET_DIAG_MEM */
> diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
> index f5e2bda..68e8ac5 100644
> --- a/net/ipv4/inet_diag.c
> +++ b/net/ipv4/inet_diag.c
> @@ -133,8 +133,8 @@ static int inet_csk_diag_fill(struct sock *sk,
> &np->rcv_saddr);
> ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
> &np->daddr);
> - if (ext & (1 << (INET_DIAG_TOS - 1)))
> - RTA_PUT_U8(skb, INET_DIAG_TOS, np->tclass);
> + if (ext & (1 << (INET_DIAG_TCLASS - 1)))
> + RTA_PUT_U8(skb, INET_DIAG_TCLASS, np->tclass);
> }
> #endif
>
> --
> 1.7.3.1
>
>
--
Maciej A. Żenczykowski
Kernel Networking Developer @ Google
1600 Amphitheatre Parkway, Mountain View, CA 94043
tel: +1 (650) 253-0062
^ permalink raw reply
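The extension-mask test used in the patch above, `ext & (1 << (INET_DIAG_X - 1))`, can be sketched in isolation as follows. This is a minimal userspace illustration, not kernel code: the enum entries before INET_DIAG_VEGASINFO are not visible in the diff and are assumed here, and the two helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Attribute numbers as the patched enum would define them.
 * The first three entries are an assumption; the diff only shows
 * INET_DIAG_VEGASINFO onwards.
 */
enum {
	INET_DIAG_NONE,
	INET_DIAG_MEMINFO,
	INET_DIAG_INFO,
	INET_DIAG_VEGASINFO,
	INET_DIAG_CONG,
	INET_DIAG_TOS,
	INET_DIAG_TCLASS,
};

/* Hypothetical helper: the bit that selects one attribute in the mask. */
static inline uint32_t inet_diag_ext_bit(int attr)
{
	return (uint32_t)1 << (attr - 1);
}

/* Hypothetical helper: same test as the open-coded one in the patch. */
static inline bool inet_diag_ext_requested(uint32_t ext, int attr)
{
	return (ext & inet_diag_ext_bit(attr)) != 0;
}
```

Under these assumptions TOS and TCLASS occupy separate bits, so a requester can ask for either one or both, which is the point of adding a distinct attribute.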
* Re: [PATCH] net-netlink: Add a new attribute to expose TCLASS values via netlink
From: Stephen Hemminger @ 2011-11-08 0:27 UTC (permalink / raw)
To: Maciej Żenczykowski
Cc: Maciej Żenczykowski, netdev, Murali Raja, Eric Dumazet,
David S. Miller
In-Reply-To: <1320711791-11005-1-git-send-email-zenczykowski@gmail.com>
On Mon, 7 Nov 2011 16:23:11 -0800
Maciej Żenczykowski <zenczykowski@gmail.com> wrote:
> commit 3ceca749668a52bd795585e0f71c6f0b04814f7b added a TOS attribute.
>
> Unfortunately TOS and TCLASS are both present in a dual-stack v6 socket,
> furthermore they can have different values. As such one cannot in a
> sane way expose both through a single attribute.
Do we really want to continue to expose that as a supported
API? I would argue it was a mistake in the original implementation.
^ permalink raw reply
* Re: [PATCH] net-netlink: Add a new attribute to expose TCLASS values via netlink
From: Maciej Żenczykowski @ 2011-11-08 0:29 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: netdev, Murali Raja, Eric Dumazet, David S. Miller
In-Reply-To: <20111107162704.0b569074@nehalam.linuxnetplumber.net>
> Do we really want to continue to expose that as a supported
> API? I would argue it was a mistake in the original implementation.
Yes, that's an interesting question...
However, can this be changed at this point in time?
Theoretically network fabric hardware could interpret v4 and v6 dscp
code points differently
(I don't think anyone sane would really want to do this, though...)
^ permalink raw reply
* Re: commit 0bdb0bd0 breaks shutdown/reboot
From: Stephen Hemminger @ 2011-11-08 0:57 UTC (permalink / raw)
To: Dominik Brodowski; +Cc: davem, netdev
In-Reply-To: <20111107170814.GA7657@comet.dominikbrodowski.net>
I can reproduce on my laptop, let me investigate.
^ permalink raw reply
* [PATCH 1/2] net: make ipv6 bind honour freebind
From: Maciej Żenczykowski @ 2011-11-08 0:57 UTC (permalink / raw)
To: Maciej Żenczykowski; +Cc: netdev, Maciej Żenczykowski
In-Reply-To: <CAHo-Oow3LhhvMEO8ph7ZM2TO48KtTak+VZjY56ceWdhxeyUzgA@mail.gmail.com>
From: Maciej Żenczykowski <maze@google.com>
This makes native ipv6 bind follow the precedent set by:
- native ipv4 bind behaviour
- dual stack ipv4-mapped ipv6 bind behaviour.
This does allow an unprivileged process to spoof its source IPv6
address, just like it currently can spoof its source IPv4 address
(for example when using UDP).
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
net/ipv6/af_inet6.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d27c797..1040424 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -347,7 +347,7 @@ int inet6_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
*/
v4addr = LOOPBACK4_IPV6;
if (!(addr_type & IPV6_ADDR_MULTICAST)) {
- if (!inet->transparent &&
+ if (!(inet->freebind || inet->transparent) &&
!ipv6_chk_addr(net, &addr->sin6_addr,
dev, 0)) {
err = -EADDRNOTAVAIL;
--
1.7.3.1
^ permalink raw reply related
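From userspace, the behaviour the patch above enables might be exercised roughly as follows. This is a sketch under assumptions: it assumes inet->freebind is set through the IPv4-level IP_FREEBIND socket option even on an AF_INET6 socket, and both helper names are hypothetical. Whether the bind() succeeds for a non-local address depends on the running kernel carrying the change.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Fill *sa from a textual IPv6 address; 0 on success, -1 on a bad address. */
static inline int make_sockaddr6(const char *text, uint16_t port,
				 struct sockaddr_in6 *sa)
{
	memset(sa, 0, sizeof(*sa));
	sa->sin6_family = AF_INET6;
	sa->sin6_port = htons(port);
	return inet_pton(AF_INET6, text, &sa->sin6_addr) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: open a UDP/IPv6 socket, request freebind, and bind
 * to an address that need not be configured on this host.
 * Returns the socket fd, or -1 on any failure.
 */
static inline int bind_freebind_udp6(const char *text, uint16_t port)
{
	struct sockaddr_in6 sa;
	int one = 1;
	int fd;

	if (make_sockaddr6(text, port, &sa) < 0)
		return -1;

	fd = socket(AF_INET6, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	/* Assumption: the IPv4-level option is what sets inet->freebind. */
	if (setsockopt(fd, IPPROTO_IP, IP_FREEBIND, &one, sizeof(one)) < 0 ||
	    bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Without the patch, the bind() step would be expected to fail with EADDRNOTAVAIL for an address that is not local, as the hunk shows.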
* [PATCH 2/2] net: make ipv6 PKTINFO honour freebind
From: Maciej Żenczykowski @ 2011-11-08 0:57 UTC (permalink / raw)
To: Maciej Żenczykowski; +Cc: netdev, Maciej Żenczykowski
In-Reply-To: <CAHo-Oow3LhhvMEO8ph7ZM2TO48KtTak+VZjY56ceWdhxeyUzgA@mail.gmail.com>
From: Maciej Żenczykowski <maze@google.com>
This just makes it possible to spoof source IPv6 address on a socket
without having to create and bind a new socket for every source IP
we wish to spoof.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
net/ipv6/datagram.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index e248069..83037af 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -654,7 +654,7 @@ int datagram_send_ctl(struct net *net, struct sock *sk,
if (addr_type != IPV6_ADDR_ANY) {
int strict = __ipv6_addr_src_scope(addr_type) <= IPV6_ADDR_SCOPE_LINKLOCAL;
- if (!inet_sk(sk)->transparent &&
+ if (!(inet_sk(sk)->freebind || inet_sk(sk)->transparent) &&
!ipv6_chk_addr(net, &src_info->ipi6_addr,
strict ? dev : NULL, 0))
err = -EINVAL;
--
1.7.3.1
^ permalink raw reply related
* [PATCH] [RFC] net-netlink: fix tos/tclass for dual-stack ipv6 sockets
From: Maciej Żenczykowski @ 2011-11-08 1:46 UTC (permalink / raw)
To: Maciej Żenczykowski; +Cc: netdev, Maciej Żenczykowski
In-Reply-To: <1320711791-11005-1-git-send-email-zenczykowski@gmail.com>
From: Maciej Żenczykowski <maze@google.com>
Something along the following lines would be needed.
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
include/net/ipv6.h | 52 ++++++++++++++++++++++++++++++++++++++++++++++++++
net/ipv4/inet_diag.c | 11 +++++----
2 files changed, 58 insertions(+), 5 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 3b5ac1f..50c7a3b 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -465,6 +465,58 @@ static inline int ipv6_addr_diff(const struct in6_addr *a1, const struct in6_add
extern void ipv6_select_ident(struct frag_hdr *fhdr, struct rt6_info *rt);
+/* Return true for:
+ * - an IPv4 socket (listening or connected)
+ * - an IPv4 connection on a dual-stack IPv6 socket
+ * - an IPv6 dual-stack listening socket -> can later accept IPv4 connection
+ */
+static inline bool sk_might_be_ipv4(struct sock *sk) {
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ const struct ipv6_pinfo *np;
+
+ if (!sk) return false;
+ if (sk->sk_family == AF_INET) return true;
+ if (sk->sk_family != AF_INET6) return false;
+ np = inet6_sk(sk);
+ if (np->ipv6only) return false;
+
+ if (ipv6_addr_v4mapped(&np->rcv_saddr)) return true;
+ if (!ipv6_addr_any(&np->rcv_saddr)) return false;
+
+ if (sk->sk_state == TCP_LISTEN) return true;
+
+ if (ipv6_addr_v4mapped(&np->saddr)) return true;
+ return false;
+#else
+ return sk && (sk->sk_family == AF_INET);
+#endif
+}
+
+/* Return true for:
+ * - a native IPv6 connection
+ * - a listening IPv6 socket
+ */
+static inline bool sk_might_be_ipv6(struct sock *sk) {
+#if defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)
+ const struct ipv6_pinfo *np;
+
+ if (!sk) return false;
+ if (sk->sk_family != AF_INET6) return false;
+ np = inet6_sk(sk);
+ if (np->ipv6only) return true;
+
+ if (ipv6_addr_v4mapped(&np->rcv_saddr)) return false;
+ if (!ipv6_addr_any(&np->rcv_saddr)) return true;
+
+ if (sk->sk_state == TCP_LISTEN) return true;
+
+ if (ipv6_addr_v4mapped(&np->saddr)) return false;
+ return true;
+#else
+ return false;
+#endif
+}
+
/*
* Prototypes exported by ipv6
*/
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 68e8ac5..39bc97c 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,9 +108,6 @@ static int inet_csk_diag_fill(struct sock *sk,
icsk->icsk_ca_ops->name);
}
- if ((ext & (1 << (INET_DIAG_TOS - 1))) && (sk->sk_family != AF_INET6))
- RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
-
r->idiag_family = sk->sk_family;
r->idiag_state = sk->sk_state;
r->idiag_timer = 0;
@@ -125,7 +122,13 @@ static int inet_csk_diag_fill(struct sock *sk,
r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;
+ if ((ext & (1 << (INET_DIAG_TOS - 1))) && sk_might_be_ipv4(sk))
+ RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
+
#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
+ if ((ext & (1 << (INET_DIAG_TCLASS - 1))) && sk_might_be_ipv6(sk))
+ RTA_PUT_U8(skb, INET_DIAG_TCLASS, inet6_sk(sk)->tclass);
+
if (r->idiag_family == AF_INET6) {
const struct ipv6_pinfo *np = inet6_sk(sk);
@@ -133,8 +136,6 @@ static int inet_csk_diag_fill(struct sock *sk,
&np->rcv_saddr);
ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
&np->daddr);
- if (ext & (1 << (INET_DIAG_TCLASS - 1)))
- RTA_PUT_U8(skb, INET_DIAG_TCLASS, np->tclass);
}
#endif
--
1.7.3.1
^ permalink raw reply related
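The decision logic of the two helpers in the RFC patch above can be modelled in userspace to check which socket kinds would report TOS, TCLASS, or both. This is only an illustration: struct sock_model and the model_* names are hypothetical stand-ins for the kernel structures, and the CONFIG_IPV6 conditionals are left out.

```c
#include <netinet/in.h>
#include <stdbool.h>
#include <sys/socket.h>

/* Simplified stand-in for the fields the two kernel helpers inspect. */
struct sock_model {
	int family;                /* AF_INET or AF_INET6 */
	bool ipv6only;             /* IPV6_V6ONLY is set */
	bool listening;            /* sk_state == TCP_LISTEN */
	struct in6_addr rcv_saddr; /* bound local address */
	struct in6_addr saddr;     /* source address of the connection */
};

/* Mirrors sk_might_be_ipv4() from the patch, minus the NULL check. */
static inline bool model_might_be_ipv4(const struct sock_model *sk)
{
	if (sk->family == AF_INET)
		return true;
	if (sk->family != AF_INET6)
		return false;
	if (sk->ipv6only)
		return false;
	if (IN6_IS_ADDR_V4MAPPED(&sk->rcv_saddr))
		return true;
	if (!IN6_IS_ADDR_UNSPECIFIED(&sk->rcv_saddr))
		return false;
	if (sk->listening)
		return true;
	return IN6_IS_ADDR_V4MAPPED(&sk->saddr);
}

/* Mirrors sk_might_be_ipv6() from the patch, minus the NULL check. */
static inline bool model_might_be_ipv6(const struct sock_model *sk)
{
	if (sk->family != AF_INET6)
		return false;
	if (sk->ipv6only)
		return true;
	if (IN6_IS_ADDR_V4MAPPED(&sk->rcv_saddr))
		return false;
	if (!IN6_IS_ADDR_UNSPECIFIED(&sk->rcv_saddr))
		return true;
	if (sk->listening)
		return true;
	return !IN6_IS_ADDR_V4MAPPED(&sk->saddr);
}
```

In this model a dual-stack listening socket bound to the unspecified address satisfies both predicates, so it would carry both attributes, while a plain AF_INET socket or an ipv6only socket carries exactly one.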
* Re: Query on usage of multicast as source IPv6 address
From: Brian Haley @ 2011-11-08 2:11 UTC (permalink / raw)
To: Kumar Sanghvi; +Cc: netdev
In-Reply-To: <20111107204550.GB2980@kumar.asicdesigners.com>
On 11/07/2011 03:45 PM, Kumar Sanghvi wrote:
> Hi,
>
> I am trying to understand IPv6 behavior in Linux.
> And I have a doubt related to use of multicast address
> as source address.
>
> RFC 4291 in Section 2.7 states that:
> "Multicast addresses must not be used as source addresses
> in IPv6 packets or appear in any Routing header."
>
> However, what should be the behavior if a host receives a
> packet (probably from a malicious host with pktgen abilities)
> having a multicast address in source address field:
> 1) Should the receiving host discard the packet?
I believe other *nixes silently drop it, can you try this patch?
-Brian
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 027c7ff..a46c64e 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -111,6 +111,14 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_type *pt
ipv6_addr_loopback(&hdr->daddr))
goto err;
+ /*
+ * RFC4291 2.7
+ * Multicast addresses must not be used as source addresses in IPv6
+ * packets or appear in any Routing header.
+ */
+ if (ipv6_addr_is_multicast(&hdr->saddr))
+ goto err;
+
skb->transport_header = skb->network_header + sizeof(*hdr);
IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
^ permalink raw reply related
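The check added in the patch above can be mirrored in userspace with the standard IN6_IS_ADDR_MULTICAST macro. This is a small sketch for experimenting with addresses, not the kernel code; the helper name is hypothetical, and an unparsable string is simply treated as not acceptable.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns false if the textual address is multicast
 * (RFC 4291 section 2.7 forbids multicast source addresses) or cannot be
 * parsed as IPv6 at all; returns true otherwise.
 */
static inline bool ipv6_source_acceptable(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return false;
	return !IN6_IS_ADDR_MULTICAST(&addr);
}
```

For example, ff02::1 (all-nodes multicast) is rejected as a source, while a unicast address such as 2001:db8::1 passes this particular test.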
* Add IPSec IP Range in Linux kernel
From: Daniil Stolnikov @ 2011-11-08 3:10 UTC (permalink / raw)
To: linux-kernel; +Cc: netdev, linux-crypto, linux-security-module, davem
Hello!
I found that the IPsec stack in Linux does not support arbitrary IP ranges. Many people ask about this. According to the archives, strongSwan says that its daemon supports ranges, but the Linux IPsec stack supports only subnets. I am writing to ask you to implement support for IP ranges in Linux. I think that many people would appreciate this feature.
Regards
Daniil Stolnikov.
^ permalink raw reply
* Re: Query on usage of multicast as source IPv6 address
From: Kumar Sanghvi @ 2011-11-08 4:35 UTC (permalink / raw)
To: Brian Haley; +Cc: netdev
In-Reply-To: <4EB88FCC.9000509@hp.com>
Hi Brian,
On Mon, Nov 07, 2011 at 21:11:24 -0500, Brian Haley wrote:
> On 11/07/2011 03:45 PM, Kumar Sanghvi wrote:
> > Hi,
> >
> > I am trying to understand IPv6 behavior in Linux.
> > And I have a doubt related to use of multicast address
> > as source address.
> >
> > RFC 4291 in Section 2.7 states that:
> > "Multicast addresses must not be used as source addresses
> > in IPv6 packets or appear in any Routing header."
> >
> > However, what should be the behavior if a host receives a
> > packet (probably from a malicious host with pktgen abilities)
> > having a multicast address in source address field:
> > 1) Should the receiving host discard the packet?
>
> I believe other *nixes silently drop it, can you try this patch?
>
> -Brian
>
> diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
> index 027c7ff..a46c64e 100644
> --- a/net/ipv6/ip6_input.c
> +++ b/net/ipv6/ip6_input.c
> @@ -111,6 +111,14 @@ int ipv6_rcv(struct sk_buff *skb, struct net_device *dev,
> struct packet_type *pt
> ipv6_addr_loopback(&hdr->daddr))
> goto err;
>
> + /*
> + * RFC4291 2.7
> + * Multicast addresses must not be used as source addresses in IPv6
> + * packets or appear in any Routing header.
> + */
> + if (ipv6_addr_is_multicast(&hdr->saddr))
> + goto err;
> +
> skb->transport_header = skb->network_header + sizeof(*hdr);
> IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
>
Tested this patch on 3.1 kernel.
The patch works fine, and Linux no longer sends a response
to the multicast address.
Thanks Brian for the patch!
Reported-and-Tested-by: Kumar Sanghvi <divinekumar@gmail.com>
Thanks,
Kumar.
^ permalink raw reply
* Re: How to identify the real physical network interface?
From: Peter P Waskiewicz Jr @ 2011-11-08 5:06 UTC (permalink / raw)
To: santosh; +Cc: netdev@vger.kernel.org
In-Reply-To: <CAFcFeQJsnqPvgQs+hAvrUmEeMVaD4U1V2xbN68SMqxsjpdXvOg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 2313 bytes --]
On Mon, 2011-11-07 at 05:59 -0800, santosh wrote:
> Hi,
>
> I am posting this question to "netdev" mailing list because I could
> not find "linux-net" mailing list as suggested at
> http://kernelnewbies.org/ML .
I'm not aware of a networking "user" mailing list; this is the
development mailing list.
>
> I have a wireless device running on Linux 2.6.15. (Can't upgrade to
> latest at this time).
>
> It has 3 interfaces.
> ath0 - Wireless interface.
> eth0 - Ethernet interface.
> br0 - Bridge interface joining ath0 and eth0.
>
> I have a user space socket program that listens to the broadcast
> messages and responds.
> My socket is opened as sock = socket(PF_INET, SOCK_DGRAM, 0).
>
> I need this socket program to listen for the packet coming from
> Ethernet interface only.
> Or, this socket program should be able to figure out the actual
> interface the packet came from.
>
> I tried the methods below, but neither helps because the kernel
> reports the bridge as the interface and does not give the real
> interface to the socket program.
>
> 1. //setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, "br0", 3)
> setsockopt(sock, SOL_SOCKET, SO_BINDTODEVICE, "eth0", 4)
>
> 2. setsockopt(sock, SOL_IP, IP_PKTINFO, (char *) &on, sizeof on)
> // Use recvmsg instead of recvfrom and read the interface index.
> // If interface is not Ethernet do not respond.
>
> Can you please let me know if there is a way to identify the actual
> interface in a UDP socket program when traffic is being controlled by
> a Bridge?
Not in the kernel you're using, that's a very old kernel. I made some
changes in the 2.6.22 kernel (also ancient) that would allow the "real"
device to be returned instead of a bridge or bond device. I think
you're SOL if you want this behavior, but can't upgrade to a much more
sane kernel. The issue is the "real" device is resolved to the bridge
device in the socket code, since that is what the routing table
eventually sends stuff out on (before resolving to the "real" device in
the networking core layer).
Upgrade to a much more recent kernel to at least test. Otherwise
there's nothing anyone here can do.
Cheers,
-PJ
--
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
LAN Access Division, Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4394 bytes --]
^ permalink raw reply
* Re: Bug? GRE tunnel periodically won't transmit some packets
From: Chris Siebenmann @ 2011-11-08 6:17 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Chris Siebenmann, netdev
In-Reply-To: <1320684905.2361.25.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
| Le lundi 07 novembre 2011 à 11:21 -0500, Chris Siebenmann a écrit :
| > I have a weird problem where a GRE tunnel periodically won't transmit
| > some (TCP) packets, while at the same time it will transmit others just
| > fine. This is happening in the current kernel.org git head kernel as
| > well as earlier ones.
[...]
| Do you have any errors on :
|
| ip -s -d link show dev greXXXX
I do indeed. When the problem is happening, I see TX errors counting
up one-for-one with packets that are not transmitted (and no RX
errors). Otherwise I don't see any errors. The other end of the GRE
tunnel shows no errors (TX or RX).
Further information: when the problem is not happening, SSH doesn't
seem to transmit 500-data-octet packets during startup. Instead I see:
IP 128.100.3.52.42538 > 128.100.3.51.ssh: Flags [.], seq 22:824, ack 22, win 91, options [nop,nop,TS val 1393299 ecr 29703771], length 802
I have also once seen an 'ip route show table cache' entry for a route
through the GRE tunnel with 552-byte MTU listed:
24.173.24.46 from 128.100.3.52 dev extun
cache expires 21333540sec ipid 0x9e5c mtu 552
I haven't been able to reproduce this. I have seen listed mtu figures
in 'ip route show table cache' output routinely drop to 774, though.
(I would like to have more data on this but inconveniently the problem
is now not reproducing itself. When it comes back I'll capture more
information about route cache mtu values and error counts and see if
there's anything interesting.)
- cks
^ permalink raw reply
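The numbers in the message above can be related with some simple header arithmetic. The sketch below assumes an IPv4 header without options, a TCP header carrying only the 12-byte timestamp option seen in the trace, and a basic 4-byte GRE header; the constant and function names are hypothetical. It is arithmetic only, not a diagnosis of the bug.

```c
enum {
	IPV4_HDR_LEN = 20,   /* IPv4 header, no options (assumed) */
	TCP_HDR_LEN = 20,    /* TCP header, no options */
	TCP_TS_OPT_LEN = 12, /* nop,nop,timestamp as shown in the trace */
	GRE_HDR_LEN = 4,     /* basic GRE header, no key/csum/seq (assumed) */
};

/* TCP payload bytes that fit in one packet of the given path MTU. */
static inline int tcp_payload_for_mtu(int mtu, int tcp_opt_len)
{
	return mtu - IPV4_HDR_LEN - TCP_HDR_LEN - tcp_opt_len;
}

/* MTU left inside a GRE-over-IPv4 tunnel on top of the underlying MTU. */
static inline int gre_tunnel_mtu(int underlying_mtu)
{
	return underlying_mtu - IPV4_HDR_LEN - GRE_HDR_LEN;
}
```

Under these assumptions a cached route MTU of 552 leaves exactly 500 bytes of TCP payload, which is at least consistent with the 500-data-octet packets mentioned, and a 1500-byte underlying MTU gives a 1476-byte GRE MTU.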
* Re: softirq oops from b44_poll
From: Peter P Waskiewicz Jr @ 2011-11-08 6:21 UTC (permalink / raw)
To: Josh Boyer
Cc: Gary Zambrano, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, kernel-team@fedoraproject.org
In-Reply-To: <20111107205647.GC14216@zod.bos.redhat.com>
[-- Attachment #1: Type: text/plain, Size: 3591 bytes --]
On Mon, 2011-11-07 at 12:56 -0800, Josh Boyer wrote:
> Hi all,
>
> We've had two reports of a WARN_ON being spit out from kernel/softirq.c
> that seem fairly related in symptoms. Both seem to involve b44_poll
> during the middle of some disk I/O. An example of the output is
> here:
>
> :WARNING: at kernel/softirq.c:159 _local_bh_enable_ip+0x44/0x8e()
> :Hardware name: Vostro 1500
> :Modules linked in: fuse lockd ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
> ip6table_filter ip6_tables nf_conntrack_ipv4 nf_defrag_ipv4 xt_state
> nf_conntrack sunrpc uinput snd_hda_codec_idt snd_hda_intel snd_hda_codec
> snd_hwdep snd_seq snd_seq_device snd_pcm dell_wmi sparse_keymap dell_laptop
> joydev dcdbas microcode r852 sm_common nand nand_ids b44 nand_ecc r592 mtd ssb
> mii memstick arc4 i2c_i801 iTCO_wdt iTCO_vendor_support iwl3945 iwl_legacy
> mac80211 cfg80211 rfkill snd_timer snd soundcore snd_page_alloc firewire_ohci
> firewire_core crc_itu_t uas usb_storage sdhci_pci sdhci mmc_core nouveau ttm
> drm_kms_helper drm i2c_algo_bit i2c_core mxm_wmi wmi video [last unloaded:
> scsi_wait_scan]
> :Pid: 1511, comm: nepomukservices Not tainted 3.1.0-1.fc16.x86_64 #1
> :Call Trace:
> : <IRQ> [<ffffffff81057a56>] warn_slowpath_common+0x83/0x9b
> : [<ffffffff81057a88>] warn_slowpath_null+0x1a/0x1c
> : [<ffffffff8105d462>] _local_bh_enable_ip+0x44/0x8e
> : [<ffffffff8105d4ba>] local_bh_enable_ip+0xe/0x10
> : [<ffffffff814b5af4>] _raw_spin_unlock_bh+0x15/0x17
> : [<ffffffffa03cc969>] destroy_conntrack+0x9d/0xdc [nf_conntrack]
> : [<ffffffff813fa083>] nf_conntrack_destroy+0x19/0x1b
> : [<ffffffff813ce4ed>] skb_release_head_state+0xa7/0xef
> : [<ffffffff813ce2f1>] __kfree_skb+0x13/0x83
> : [<ffffffff813ce3b7>] consume_skb+0x56/0x6b
> : [<ffffffffa02e48c4>] b44_poll+0xaf/0x3ec [b44]
> : [<ffffffff813d8137>] net_rx_action+0xa9/0x1b8
> : [<ffffffffa02e202e>] ? br32+0x19/0x1d [b44]
> : [<ffffffff8105d6b3>] __do_softirq+0xc9/0x1b5
> : [<ffffffff81027719>] ? ack_APIC_irq+0x15/0x17
> : [<ffffffff814be32c>] call_softirq+0x1c/0x30
> : [<ffffffff81010b45>] do_softirq+0x46/0x81
> : [<ffffffff8105d97b>] irq_exit+0x57/0xb1
> : [<ffffffff814bec0e>] do_IRQ+0x8e/0xa5
> : [<ffffffff814b5d2e>] common_interrupt+0x6e/0x6e
> : <EOI> [<ffffffff814bc1f4>] ? sysret_audit+0x16/0x20
>
> You can find the original bug reports in the URLs below. This has happened
> on two different machines, one 32-bit and another 64-bit. I'm fairly sure
> both reports are the same issue, but I haven't a clue what that issue might
> be at the moment.
>
> Thoughts?
I don't have the hardware to play with, but from inspection, I suspect a
thread is getting stuck on that CPU from the spin_lock_irqsave() in
b44_poll(). There are some calls that are mapping and unmapping memory,
which could be blocking. NAPI should be offering protection under
softirq context, so I'm not sure why that spinlock is even there. And
comparing with a number of other NAPI poll routines in other drivers,
they are also not locking.
This is entirely a theory that I can't test though.
Cheers,
-PJ
> https://bugzilla.redhat.com/show_bug.cgi?id=749856
> https://bugzilla.redhat.com/show_bug.cgi?id=741117
>
> josh
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
--
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
LAN Access Division, Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4394 bytes --]
^ permalink raw reply
* Re: Add IPSec IP Range in Linux kernel
From: Peter P Waskiewicz Jr @ 2011-11-08 6:24 UTC (permalink / raw)
To: Daniil Stolnikov
Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
linux-crypto@vger.kernel.org,
linux-security-module@vger.kernel.org, davem@davemloft.net
In-Reply-To: <92909814.20111108111036@mail.ru>
[-- Attachment #1: Type: text/plain, Size: 857 bytes --]
On Mon, 2011-11-07 at 19:10 -0800, Daniil Stolnikov wrote:
> Hello!
>
> I found that the IPsec stack in Linux does not support arbitrary IP ranges. Many people ask about this. According to the archives, strongSwan says that its daemon supports ranges, but the Linux IPsec stack supports only subnets. I am writing to ask you to implement support for IP ranges in Linux. I think that many people would appreciate this feature.
It'd be even better if you could write a patch for us to review.
Cheers,
-PJ
>
> Regards
> Daniil Stolnikov.
>
--
Peter P Waskiewicz Jr <peter.p.waskiewicz.jr@intel.com>
LAN Access Division, Intel Corporation
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 4394 bytes --]
^ permalink raw reply
* Re: Bug? GRE tunnel periodically won't transmit some packets
From: Eric Dumazet @ 2011-11-08 6:43 UTC (permalink / raw)
To: Chris Siebenmann; +Cc: netdev
In-Reply-To: <20111108061737.B04D236221@apps0.cs.toronto.edu>
Le mardi 08 novembre 2011 à 01:17 -0500, Chris Siebenmann a écrit :
> | Le lundi 07 novembre 2011 à 11:21 -0500, Chris Siebenmann a écrit :
> | > I have a weird problem where a GRE tunnel periodically won't transmit
> | > some (TCP) packets, while at the same time it will transmit others just
> | > fine. This is happening in the current kernel.org git head kernel as
> | > well as earlier ones.
> [...]
> | Do you have any errors on :
> |
> | ip -s -d link show dev greXXXX
>
> I do indeed. When the problem is happening, I see TX errors counting
> up one-for-one with packets that are not transmitted (and no RX
> errors). Otherwise I don't see any errors. The other end of the GRE
> tunnel shows no errors (TX or RX).
>
> Further information: when the problem is not happening, SSH doesn't
> seem to transmit 500-data-octet packets during startup. Instead I see:
>
> IP 128.100.3.52.42538 > 128.100.3.51.ssh: Flags [.], seq 22:824, ack 22, win 91, options [nop,nop,TS val 1393299 ecr 29703771], length 802
>
> I have also once seen an 'ip route show table cache' entry for a route
> through the GRE tunnel with 552-byte MTU listed:
>
> 24.173.24.46 from 128.100.3.52 dev extun
> cache expires 21333540sec ipid 0x9e5c mtu 552
>
> I haven't been able to reproduce this. I have seen listed mtu figures
> in 'ip route show table cache' output routinely drop to 774, though.
>
> (I would like to have more data on this but inconveniently the problem
> is now not reproducing itself. When it comes back I'll capture more
> information about route cache mtu values and error counts and see if
> there's anything interesting.)
>
OK, but could you please report the exact "ip -s -d link gre..."
output ?
^ permalink raw reply
* [PATCH] r8169: increase the delay parameter of pm_schedule_suspend
From: Hayes Wang @ 2011-11-08 6:44 UTC (permalink / raw)
To: romieu; +Cc: netdev, linux-kernel, Hayes Wang
A link-down event occurs when the PHY is reset, and it takes about 2 to 5 seconds
to go from link down to link up. If the delay passed to pm_schedule_suspend is not
long enough, the device enters runtime suspend before the link comes up. After link
up, the device wakes up and resets the PHY again, so the driver gets stuck in a loop
of runtime_suspend and runtime_resume.
Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
drivers/net/ethernet/realtek/r8169.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 92b45f0..6f06aa1 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -1292,7 +1292,7 @@ static void __rtl8169_check_link_status(struct net_device *dev,
netif_carrier_off(dev);
netif_info(tp, ifdown, dev, "link down\n");
if (pm)
- pm_schedule_suspend(&tp->pci_dev->dev, 100);
+ pm_schedule_suspend(&tp->pci_dev->dev, 5000);
}
spin_unlock_irqrestore(&tp->lock, flags);
}
--
1.7.6.2
^ permalink raw reply related
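The timing argument in the commit message above can be written as a one-line comparison. This is a toy model with a hypothetical function name; it assumes the delay argument of pm_schedule_suspend is in milliseconds and takes the 2 to 5 second link-up time from the description.

```c
#include <stdbool.h>

/*
 * Toy model: a suspend is scheduled delay_ms after link down, and the
 * link needs link_up_ms to come back. The suspend/resume loop described
 * in the commit message can start only if the suspend fires first.
 */
static inline bool suspend_fires_before_link_up(unsigned int delay_ms,
						unsigned int link_up_ms)
{
	return delay_ms < link_up_ms;
}
```

With the old 100 ms delay the suspend fires first for any link-up time in the quoted range; with 5000 ms it does not, although a link that takes the full 5 seconds sits right at the boundary.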
* Re: Bug? GRE tunnel periodically won't transmit some packets
From: Chris Siebenmann @ 2011-11-08 7:08 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Chris Siebenmann, netdev
In-Reply-To: <1320734606.8976.0.camel@edumazet-laptop>
| Le mardi 08 novembre 2011 à 01:17 -0500, Chris Siebenmann a écrit :
| > | Le lundi 07 novembre 2011 à 11:21 -0500, Chris Siebenmann a écrit :
| > | > I have a weird problem where a GRE tunnel periodically won't transmit
| > | > some (TCP) packets, while at the same time it will transmit others just
| > | > fine. This is happening in the current kernel.org git head kernel as
| > | > well as earlier ones.
| > [...]
| > | Do you have any errors on :
| > |
| > | ip -s -d link show dev greXXXX
| >
| > I do indeed. When the problem is happening, I see TX errors counting
| > up one-for-one with packets that are not transmitted (and no RX
| > errors). Otherwise I don't see any errors. The other end of the GRE
| > tunnel shows no errors (TX or RX).
[..]
| OK, but could you please report the exact "ip -s -d link gre..."
| output ?
Sure. Here it is:
7: extun: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1200 qdisc noqueue state UNKNOWN
link/gre 66.96.18.208 peer 128.100.3.58
gre remote 128.100.3.58 local 66.96.18.208 dev ppp0 ttl inherit
RX: bytes packets errors dropped overrun mcast
2793721 17015 0 0 0 0
TX: bytes packets errors dropped carrier collsns
2242148 26824 18 0 0 0
(The packet and byte counts were much lower when the problem happened;
this time around it happened relatively soon after I rebooted my machine
and brought up the PPPoE and GRE links, but hasn't happened since and
I've been using the GRE link.)
For vaguely historical reasons I don't use 'greXXX' as the name of
the GRE tunnel. I have a 'gre0' device, but it is not up. In case it
matters, its output is:
6: gre0: <NOARP> mtu 1476 qdisc noop state DOWN
link/gre 0.0.0.0 brd 0.0.0.0
gre remote any local any ttl inherit nopmtudisc
RX: bytes packets errors dropped overrun mcast
0 0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
0 0 0 0 0 0
Let me know if you want a full dump of 'ip link show' (with or
without verbosity).
- cks
^ permalink raw reply
* [PATCH] net: fsl_pq_mdio: fix oops when using uninitialized mutex
From: Baruch Siach @ 2011-11-08 7:23 UTC (permalink / raw)
To: netdev; +Cc: linuxppc-dev, Baruch Siach, Andy Fleming
The get_phy_id() routine (called via fsl_pq_mdio_find_free()) tries to acquire
the mdio_lock mutex which is only initialized when of_mdiobus_register() gets
called later. This causes the following oops:
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc02eda74
Oops: Kernel access of bad area, sig: 11 [#1]
P1020 RDB
NIP: c02eda74 LR: c01b3aa4 CTR: 00000007
REGS: cf039d70 TRAP: 0300 Not tainted (3.2.0-rc1-00004-gdc9d867-dirty)
MSR: 00029000 <EE,ME,CE> CR: 24024028 XER: 00000000
DEAR: 00000000, ESR: 00800000
TASK = cf034000[1] 'swapper' THREAD: cf038000
GPR00: cf039e28 cf039e20 cf034000 cf368228 00000020 00000002 ffeb02ad 000000d0
GPR08: 00001083 00000000 d1080000 cf039e90 00000000 100ae780 00000000 00000000
GPR16: c0000900 00000012 0fffffff 00ffa000 00000015 00000001 c0470000 00000000
GPR24: 00000000 00000000 c03b4e89 d1072030 cf034000 00000020 cf36822c cf368228
NIP [c02eda74] __mutex_lock_slowpath+0x30/0xb0
LR [c01b3aa4] mdiobus_read+0x38/0x68
Call Trace:
[cf039e20] [ffeb0000] 0xffeb0000 (unreliable)
[cf039e50] [c01b3aa4] mdiobus_read+0x38/0x68
[cf039e70] [c01b2af0] get_phy_id+0x24/0x70
[cf039e90] [c01b4128] fsl_pq_mdio_probe+0x364/0x414
[cf039ec0] [c0195050] platform_drv_probe+0x20/0x30
[cf039ed0] [c0193a70] driver_probe_device+0xc8/0x170
[cf039ef0] [c0193b88] __driver_attach+0x70/0x98
[cf039f10] [c019294c] bus_for_each_dev+0x60/0x90
[cf039f40] [c0193cc8] driver_attach+0x24/0x34
[cf039f50] [c0192f88] bus_add_driver+0xbc/0x230
[cf039f70] [c0194594] driver_register+0xb8/0x13c
[cf039f90] [c0195b40] platform_driver_register+0x6c/0x7c
[cf039fa0] [c03e433c] fsl_pq_mdio_init+0x18/0x28
[cf039fb0] [c03ce824] do_one_initcall+0xdc/0x1b4
[cf039fe0] [c03ce984] kernel_init+0x88/0x118
[cf039ff0] [c000bd5c] kernel_thread+0x4c/0x68
Instruction dump:
9421ffd0 7c0802a6 81230008 bf61001c 3bc30004 7c7f1b78 90010034 38010008
7c5c1378 90030008 93c10008 9121000c
3800ffff 90410010 7d201828
Fix this by moving the of_mdiobus_register() call earlier.
Cc: Andy Fleming <afleming@freescale.com>
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
---
drivers/net/ethernet/freescale/fsl_pq_mdio.c | 14 +++++++-------
1 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fsl_pq_mdio.c b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
index 52f4e8a..e17fd2f 100644
--- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c
+++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c
@@ -385,6 +385,13 @@ static int fsl_pq_mdio_probe(struct platform_device *ofdev)
tbiaddr = *prop;
}
+ err = of_mdiobus_register(new_bus, np);
+ if (err) {
+ printk (KERN_ERR "%s: Cannot register as MDIO bus\n",
+ new_bus->name);
+ goto err_free_irqs;
+ }
+
if (tbiaddr == -1) {
out_be32(tbipa, 0);
@@ -403,13 +410,6 @@ static int fsl_pq_mdio_probe(struct platform_device *ofdev)
out_be32(tbipa, tbiaddr);
- err = of_mdiobus_register(new_bus, np);
- if (err) {
- printk (KERN_ERR "%s: Cannot register as MDIO bus\n",
- new_bus->name);
- goto err_free_irqs;
- }
-
return 0;
err_free_irqs:
--
1.7.7.1
^ permalink raw reply related
* Re: Bug? GRE tunnel periodically won't transmit some packets
From: Eric Dumazet @ 2011-11-08 7:34 UTC (permalink / raw)
To: Chris Siebenmann; +Cc: netdev
In-Reply-To: <20111108070819.C643236221@apps0.cs.toronto.edu>
On Tuesday, 08 November 2011 at 02:08 -0500, Chris Siebenmann wrote:
> Let me know if you want a full dump of 'ip link show' (with or
> without verbosity).
>
> - cks
Oh yes, I meant "ip -s -s link show dev extun"
to get detailed info:
# ip -s -s link show dev ppp0
8: ppp0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc
pfifo_fast state UNKNOWN qlen 3
link/ppp
RX: bytes packets errors dropped overrun mcast
21120910 21103 0 0 0 0
RX errors: length crc frame fifo missed
0 0 0 0 0
TX: bytes packets errors dropped carrier collsns
1310652 13736 0 0 0 0
TX errors: aborted fifo window heartbeat
0 0 0 0
^ permalink raw reply
* [PATCH] ipv4: fix a bug in SRR option matching.
From: Li Wei @ 2011-11-08 7:56 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
Since commit 7be799a7 (ipv4: Remove rt->rt_dst reference from
ip_forward_options()) and commit 0374d9ce (ipv4: Kill spurious
write to iph->daddr in ip_forward_options()) we use iph->daddr
for SRR option matching and assume that iph->daddr equals rt->rt_dst.
Unfortunately, skb_rtable(skb) has been updated in ip_options_rcv_srr()
for the nexthop in the SRR option, but iph->daddr has *not* been updated.
We should use the updated rt->rt_dst for SRR option matching
and update iph->daddr here.
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
---
net/ipv4/ip_options.c | 3 ++-
1 files changed, 2 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/ip_options.c b/net/ipv4/ip_options.c
index ec93335..8dca67c 100644
--- a/net/ipv4/ip_options.c
+++ b/net/ipv4/ip_options.c
@@ -568,12 +568,13 @@ void ip_forward_options(struct sk_buff *skb)
) {
if (srrptr + 3 > srrspace)
break;
- if (memcmp(&ip_hdr(skb)->daddr, &optptr[srrptr-1], 4) == 0)
+ if (memcmp(&rt->rt_dst, &optptr[srrptr-1], 4) == 0)
break;
}
if (srrptr + 3 <= srrspace) {
opt->is_changed = 1;
ip_rt_get_source(&optptr[srrptr-1], skb, rt);
+ ip_hdr(skb)->daddr = rt->rt_dst;
optptr[2] = srrptr+4;
} else if (net_ratelimit())
printk(KERN_CRIT "ip_forward(): Argh! Destination lost!\n");
--
1.7.3.2
^ permalink raw reply related
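The matching loop touched by the patch above can be modelled in userspace. This is only a minimal sketch, assuming the usual SRR layout (type byte, length byte, 1-based pointer byte, then 4-byte addresses); `srr_find_hop` is a hypothetical name and not a kernel function. It walks the remaining addresses and reports where the given address sits, which is why searching with a stale header destination finds nothing while searching with the route's updated destination succeeds.

```c
#include <stdint.h>
#include <string.h>

/*
 * Userspace model of the SRR matching loop (hypothetical helper, not
 * kernel code). optptr points at the start of the option:
 *   optptr[0] = option type, optptr[1] = option length,
 *   optptr[2] = 1-based pointer to the next address to process.
 * Returns the 1-based offset of the entry equal to addr, or 0 if the
 * address is not among the remaining entries.
 */
int srr_find_hop(const uint8_t *optptr, const uint8_t addr[4])
{
	int srrspace = optptr[1];
	int srrptr;

	for (srrptr = optptr[2]; srrptr <= srrspace; srrptr += 4) {
		/* Not enough room left for a full 4-byte address. */
		if (srrptr + 3 > srrspace)
			break;
		if (memcmp(addr, &optptr[srrptr - 1], 4) == 0)
			return srrptr;
	}
	return 0;
}
```

In this model, passing the next hop that the route lookup selected corresponds to comparing against rt->rt_dst, and passing the address the packet arrived with corresponds to comparing against the unmodified iph->daddr.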
* [PATCH] ipv4: fix a bug in strict route gateway comparison.
From: Li Wei @ 2011-11-08 8:44 UTC (permalink / raw)
To: David S. Miller; +Cc: netdev
Since commit def57687 (ipv4: Elide use of rt->rt_dst in ip_forward())
we use iph->daddr for the strict route gateway comparison. Unfortunately,
skb_rtable(skb) has been updated in ip_options_rcv_srr() for the
nexthop in the SRR option, but iph->daddr has *not* been updated, so
rt->rt_dst is not equal to iph->daddr. We should use the updated
rt->rt_dst instead.
Signed-off-by: Li Wei <lw@cn.fujitsu.com>
---
net/ipv4/ip_forward.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/net/ipv4/ip_forward.c b/net/ipv4/ip_forward.c
index 3b34d1c..99461f0 100644
--- a/net/ipv4/ip_forward.c
+++ b/net/ipv4/ip_forward.c
@@ -84,7 +84,7 @@ int ip_forward(struct sk_buff *skb)
rt = skb_rtable(skb);
- if (opt->is_strictroute && ip_hdr(skb)->daddr != rt->rt_gateway)
+ if (opt->is_strictroute && rt->rt_dst != rt->rt_gateway)
goto sr_failed;
if (unlikely(skb->len > dst_mtu(&rt->dst) && !skb_is_gso(skb) &&
--
1.7.3.2
^ permalink raw reply related
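The strict route test changed by the patch above reduces to a single comparison. The following is a minimal sketch with plain integers standing in for the addresses; `strict_route_fails` is a hypothetical helper, not a kernel function. It shows why a stale header destination makes the check fail even though the route itself is fine.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the strict source route check (hypothetical helper).
 * The check fails when strict routing is requested and the compared
 * destination differs from the route's gateway.
 */
bool strict_route_fails(bool is_strictroute, uint32_t dst, uint32_t gateway)
{
	return is_strictroute && dst != gateway;
}
```

Passing the route's own destination models the patched code (rt->rt_dst), while passing the address still present in the IP header models the code before the patch (iph->daddr).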
* Re: [PATCH] r8169: increase the delay parameter of pm_schedule_suspend
From: Francois Romieu @ 2011-11-08 9:07 UTC (permalink / raw)
To: Hayes Wang; +Cc: netdev, linux-kernel, Rafael J. Wysocki
In-Reply-To: <1320734677-1372-1-git-send-email-hayeswang@realtek.com>
Hayes Wang <hayeswang@realtek.com> :
> A link down occurs when resetting the PHY, and it takes about 2 ~ 5 seconds
> to go from link down to link up. If the delay of pm_schedule_suspend is not long enough,
> the device enters runtime_suspend before link up. After link up, the device
> wakes up and resets the PHY again. The driver then stays in a loop
> of runtime_suspend and runtime_resume.
Acked-by: Francois Romieu <romieu@fr.zoreil.com>
So far the worst offender here is the 8111evl (RTL_GIGA_MAC_VER_34) with a
max delay a bit below 4000 ms.
[...]
[ 4195.444121] r8169 0000:03:00.0: 8111e-vl-0: link up
[ 4195.549396] r8169 0000:03:00.0: 8111e-vl-0: link down
[ 4195.888002] r8169 0000:03:00.0: 8111e-vl-0: link up
[ 4199.444120] r8169 0000:03:00.0: 8111e-vl-0: link up
[ 4199.582073] r8169 0000:03:00.0: 8111e-vl-0: link down
[...]
[ 4171.580422] r8169 0000:03:00.0: 8111e-vl-0: link down
[ 4171.904002] r8169 0000:03:00.0: 8111e-vl-0: link up
[ 4175.444131] r8169 0000:03:00.0: 8111e-vl-0: link up
[ 4175.547453] r8169 0000:03:00.0: 8111e-vl-0: link down
The 8168d-vb-gr (RTL_GIGA_MAC_VER_26) and the 8168b (RTL_GIGA_MAC_VER_12)
worked out of the box without the patch - at least with a kernel including
Rafael's recent changes - and the old PCI 8169 always worked.
I have not tested the 8168f nor the 810x yet.
Increasing the delay over and over in the driver alone will not ensure
that the system is always stable but it should send the stuff on a far
enough orbit for some time.
--
Ueimor
^ permalink raw reply
* [PATCH] net/temac: FIX segfault when process old irqs
From: Ricardo Ribalda Delgado @ 2011-11-08 9:29 UTC (permalink / raw)
To: open list:NETWORKING DRIVERS, open list; +Cc: Ricardo Ribalda Delgado
Do not enable the irq until the scatter gather registers are ready to
handle the data. Otherwise an irq from a packet sent/received before
the last close can lead to an access to an invalid memory region in the
irq handler.
Also, stop the dma engine on close.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---
drivers/net/ethernet/xilinx/ll_temac_main.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 4d1658e..c8db76a 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -203,6 +203,9 @@ static void temac_dma_bd_release(struct net_device *ndev)
struct temac_local *lp = netdev_priv(ndev);
int i;
+ /* Reset Local Link (DMA) */
+ lp->dma_out(lp, DMA_CONTROL_REG, DMA_CONTROL_RST);
+
for (i = 0; i < RX_BD_NUM; i++) {
if (!lp->rx_skb[i])
break;
@@ -860,6 +863,8 @@ static int temac_open(struct net_device *ndev)
phy_start(lp->phy_dev);
}
+ temac_device_reset(ndev);
+
rc = request_irq(lp->tx_irq, ll_temac_tx_irq, 0, ndev->name, ndev);
if (rc)
goto err_tx_irq;
@@ -867,7 +872,6 @@ static int temac_open(struct net_device *ndev)
if (rc)
goto err_rx_irq;
- temac_device_reset(ndev);
return 0;
err_rx_irq:
--
1.7.7.1
^ permalink raw reply related
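The ordering problem addressed by the patch above can be illustrated with a toy model. This is only a sketch under stated assumptions: `dev_model`, `model_reset` and `model_request_irq` are hypothetical stand-ins, and the model assumes that the reset both prepares the descriptor ring and discards any interrupt left over from before the last close. It is not the driver's actual logic.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy device state (hypothetical, for illustration only). */
struct dev_model {
	int *ring;         /* stands in for the buffer descriptor ring */
	bool irq_enabled;
	bool irq_pending;  /* stale interrupt from before the last close */
};

static int ring_storage[4];

/* Assumed behaviour: reset prepares the ring and drops stale interrupts. */
void model_reset(struct dev_model *d)
{
	d->ring = ring_storage;
	d->irq_pending = false;
}

/*
 * Enabling the irq lets a pending interrupt fire at once. Returns false
 * if the handler would run against a ring that is not set up yet.
 */
bool model_request_irq(struct dev_model *d)
{
	d->irq_enabled = true;
	if (d->irq_pending && d->ring == NULL)
		return false;
	return true;
}

/* Order before the patch: irq first, reset afterwards. */
bool open_buggy(struct dev_model *d)
{
	bool ok = model_request_irq(d);

	model_reset(d);
	return ok;
}

/* Order after the patch: reset first, then enable the irq. */
bool open_fixed(struct dev_model *d)
{
	model_reset(d);
	return model_request_irq(d);
}
```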
* [PATCH] net/temac: FIX segfault when process old irqs
From: Ricardo Ribalda Delgado @ 2011-11-08 9:31 UTC (permalink / raw)
To: davem, ian.campbell, eric.dumazet, jeffrey.t.kirsher, jpirko,
netdev, linux-kernel
Cc: Ricardo Ribalda Delgado
Do not enable the irq until the scatter gather registers are ready to
handle the data. Otherwise an irq from a packet sent/received before
the last close can lead to an access to an invalid memory region in the
irq handler.
Also, stop the dma engine on close.
Signed-off-by: Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
---
drivers/net/ethernet/xilinx/ll_temac_main.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/ll_temac_main.c b/drivers/net/ethernet/xilinx/ll_temac_main.c
index 4d1658e..c8db76a 100644
--- a/drivers/net/ethernet/xilinx/ll_temac_main.c
+++ b/drivers/net/ethernet/xilinx/ll_temac_main.c
@@ -203,6 +203,9 @@ static void temac_dma_bd_release(struct net_device *ndev)
struct temac_local *lp = netdev_priv(ndev);
int i;
+ /* Reset Local Link (DMA) */
+ lp->dma_out(lp, DMA_CONTROL_REG, DMA_CONTROL_RST);
+
for (i = 0; i < RX_BD_NUM; i++) {
if (!lp->rx_skb[i])
break;
@@ -860,6 +863,8 @@ static int temac_open(struct net_device *ndev)
phy_start(lp->phy_dev);
}
+ temac_device_reset(ndev);
+
rc = request_irq(lp->tx_irq, ll_temac_tx_irq, 0, ndev->name, ndev);
if (rc)
goto err_tx_irq;
@@ -867,7 +872,6 @@ static int temac_open(struct net_device *ndev)
if (rc)
goto err_rx_irq;
- temac_device_reset(ndev);
return 0;
err_rx_irq:
--
1.7.7.1
^ permalink raw reply related
* Re: dst->obsolete has become pointless
From: Steffen Klassert @ 2011-11-08 9:34 UTC (permalink / raw)
To: David Miller; +Cc: netdev, timo.teras
In-Reply-To: <20111104.230910.520924516201406800.davem@davemloft.net>
On Fri, Nov 04, 2011 at 11:09:10PM -0400, David Miller wrote:
>
> While researching the things unearthed by Steffen Klassert wrt. PMTU
> handling in the current tree I went to do some research on what the
> real story is wrt. dst->obsolete.
>
> And sure enough EVERY SINGLE ipv4 and ipv6 route is created with
> obsolete set to -1, so we unconditionally always invoke ->dst_check().
>
> This makes it completely pointless as an optimization to avoid calling
> the dst_ops->dst_check() method. It never triggers.
>
> This stems from Timo's change to make route expiry properly visible
> to IPSEC stacked routes:
>
> --
> commit d11a4dc18bf41719c9f0d7ed494d295dd2973b92
> Author: Timo Teräs <timo.teras@iki.fi>
> Date: Thu Mar 18 23:20:20 2010 +0000
>
> ipv4: check rt_genid in dst_check
> ...
> --
>
> Only DecNET creates routes with obsolete initially set to zero, and
> therefore only hits ->dst_check() when dst_free is invoked on the route
> during a flush of the decnet routing tables.
>
> And actually this is how ipv4 operated before we started using
> generation counts instead of flushing the entire table. IPV6 seems to
> always have used the FIB6 tree serial numbers for expiration checking
> and therefore always set obsolete to -1 on new routes.
>
> So we can't just get rid of the dst->obsolete check in dst_check() and
> __sk_dst_check() because that will break DecNET because DecNET's
> ->dst_check() handler assumes that if it was called then the route
> is obsolete and it just plainly returns NULL to tell the caller the
> route is in fact invalid.
>
I don't know what to do with DecNET, but we probably need to decide
about the future of dst->obsolete before we can fix the ipv4 PMTU
problems. Possible fixes might depend on whether ->dst_check() is
always invoked or not.
^ permalink raw reply
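The interaction discussed in the quoted message can be sketched in userspace. This is a minimal model, assuming the check has the shape described in the thread (the ->dst_check() method is only called when obsolete is non-zero); the `*_model` types and `decnet_style_check` are hypothetical names, not kernel definitions.

```c
#include <stddef.h>
#include <stdint.h>

struct dst_model;

/* Stand-in for the per-protocol ops table (hypothetical). */
struct dst_ops_model {
	struct dst_model *(*check)(struct dst_model *dst, uint32_t cookie);
};

struct dst_model {
	int obsolete;
	const struct dst_ops_model *ops;
};

/* Only consult the protocol's check method when obsolete is non-zero. */
struct dst_model *dst_check_model(struct dst_model *dst, uint32_t cookie)
{
	if (dst->obsolete)
		dst = dst->ops->check(dst, cookie);
	return dst;
}

/*
 * DECnet-style handler as described in the thread: being called at all
 * is taken to mean the route is obsolete, so it returns NULL.
 */
struct dst_model *decnet_style_check(struct dst_model *dst, uint32_t cookie)
{
	(void)dst;
	(void)cookie;
	return NULL;
}
```

With obsolete set to -1 on every new ipv4 and ipv6 route, the branch in `dst_check_model` is always taken, so the test no longer saves any calls. Removing the test outright would make a handler like `decnet_style_check` reject routes that are still valid.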