* Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1
From: Ari Savolainen @ 2011-11-22 5:27 UTC (permalink / raw)
To: Linus Torvalds, Rafael J. Wysocki
Cc: Linux Kernel Mailing List, Maciej Rutecki, Florian Mickler,
Andrew Morton, Kernel Testers List, Network Development,
Linux ACPI, Linux PM List, Linux SCSI List, Linux Wireless List,
DRI
In-Reply-To: <CA+55aFyceMKgS-YRv=r=FrHQ1P9=z7=2PC5gvZcCHoYKnzn_Aw@mail.gmail.com>
2011/11/22 Linus Torvalds <torvalds@linux-foundation.org>:
> On Mon, Nov 21, 2011 at 1:49 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>
>> Subject : lockdep warning after aa6afca5bcab: "proc: fix races against execve() of /proc/PID/fd**"
>> Submitter : Ari Savolainen <ari.m.savolainen@gmail.com>
>> Date : 2011-11-08 3:47
>> Message-ID : CAEbykaXYZEFhTgWMm2AfaWQ2SaXYuO_ypTnw+6AVWScOYSCuuw@mail.gmail.com
>> References : http://marc.info/?l=linux-kernel&m=132072413125099&w=2
>
> Commit aa6afca5bcab was reverted by commit 5e442a493fc5, so this one
> is presumably stale.
>
> Linus
Yes, this went away after the reversion.
Ari
^ permalink raw reply
* Re: [PATCH 1/1] AF_UNIX: Fix poll locking problem when reading from a stream socket
From: Eric Dumazet @ 2011-11-22 5:23 UTC (permalink / raw)
To: Alexey Moiseytsev; +Cc: David S. Miller, netdev, linux-kernel
In-Reply-To: <1321918525-5078-1-git-send-email-himeraster@gmail.com>
On Tuesday, 22 November 2011 at 03:35 +0400, Alexey Moiseytsev wrote:
> poll() call may be locked by concurrent reading from the same stream
> socket.
>
> Signed-off-by: Alexey Moiseytsev <himeraster@gmail.com>
> ---
> net/unix/af_unix.c | 4 ++++
> 1 files changed, 4 insertions(+), 0 deletions(-)
>
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 466fbcc..b595a3d 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1957,6 +1957,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
> if ((UNIXCB(skb).pid != siocb->scm->pid) ||
> (UNIXCB(skb).cred != siocb->scm->cred)) {
> skb_queue_head(&sk->sk_receive_queue, skb);
> + sk->sk_data_ready(sk, skb->len);
> break;
> }
> } else {
> @@ -1974,6 +1975,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
> chunk = min_t(unsigned int, skb->len, size);
> if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
> skb_queue_head(&sk->sk_receive_queue, skb);
> + sk->sk_data_ready(sk, skb->len);
> if (copied == 0)
> copied = -EFAULT;
> break;
> @@ -1991,6 +1993,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
> /* put the skb back if we didn't use it up.. */
> if (skb->len) {
> skb_queue_head(&sk->sk_receive_queue, skb);
> + sk->sk_data_ready(sk, skb->len);
> break;
> }
>
> @@ -2006,6 +2009,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
>
> /* put message back and return */
> skb_queue_head(&sk->sk_receive_queue, skb);
> + sk->sk_data_ready(sk, skb->len);
> break;
> }
> } while (size);
Fine, the fix is technically correct: since we own the u->readlock mutex,
another thread cannot consume the just-requeued skb.
Small note: the words "locking" and "locked" are normally used to describe
taking a spinlock/mutex/rwlock or similar, while the bug
you fixed is about the poll() system call being blocked/frozen forever.
Thanks !
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply
* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Christoph Lameter @ 2011-11-22 3:18 UTC (permalink / raw)
To: Christian Kujau
Cc: Benjamin Herrenschmidt, Markus Trippelsdorf, Eric Dumazet,
Alex,Shi, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Pekka Enberg, Matt Mackall, netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <alpine.DEB.2.01.1111211617220.8000@trent.utfs.org>
On Mon, 21 Nov 2011, Christian Kujau wrote:
> On Tue, 22 Nov 2011 at 07:27, Benjamin Herrenschmidt wrote:
> > Note that I hit a similar looking crash (sorry, I couldn't capture a
> > backtrace back then) on a PowerMac G5 (ppc64) while doing a large rsync
> > transfer yesterday with -rc2-something (cfcfc9ec) and
> > Christian Kujau (CC) seems to be able to reproduce something similar on
> > some other ppc platform (Christian, what is your setup ?)
>
> I seem to hit it when heavy disk & cpu IO is in progress on this PowerBook
> G4. Full dmesg & .config: http://nerdbynature.de/bits/3.2.0-rc1/oops/
>
> I've enabled some debug options and now it really points to slub.c:2166
Hmmm... That means that c->page points to a page that is not frozen. Per cpu
partial pages are frozen until they are reused or until the partial list
is flushed.
Does this ever happen on x86 or only on other platforms? In put_cpu_partial() the
this_cpu_cmpxchg really needs to be irq safe. this_cpu_cmpxchg is
only preempt safe.
Index: linux-2.6/mm/slub.c
===================================================================
--- linux-2.6.orig/mm/slub.c 2011-11-21 21:15:41.575673204 -0600
+++ linux-2.6/mm/slub.c 2011-11-21 21:16:33.442336849 -0600
@@ -1969,7 +1969,7 @@
page->pobjects = pobjects;
page->next = oldpage;
- } while (this_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
+ } while (irqsafe_cpu_cmpxchg(s->cpu_slab->partial, oldpage, page) != oldpage);
stat(s, CPU_PARTIAL_FREE);
return pobjects;
}
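Editor's sketch, not part of the original patch: the retry loop in put_cpu_partial() has the shape of a compare-and-swap push onto a singly linked list. The userspace C11 analogue below shows that shape only. The names model_page, partial_head and model_put_partial are hypothetical, and C11 atomics stand in for the kernel's per-cpu operations. It does not model interrupts, which is what the patch above is concerned with.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page as used on the per-cpu partial list. */
struct model_page {
	struct model_page *next;
	int pobjects;	/* running count of free objects on the list */
};

/* Hypothetical stand-in for s->cpu_slab->partial. */
static _Atomic(struct model_page *) partial_head;

/* Push 'page', which holds 'free_objects' free objects, and return the
 * new running total. */
int model_put_partial(struct model_page *page, int free_objects)
{
	struct model_page *oldpage;
	int pobjects;

	assert(page != NULL);
	oldpage = atomic_load(&partial_head);
	do {
		pobjects = (oldpage ? oldpage->pobjects : 0) + free_objects;
		page->pobjects = pobjects;
		page->next = oldpage;
		/* On failure, oldpage is refreshed with the current head
		 * and the loop body runs again with the new value. */
	} while (!atomic_compare_exchange_weak(&partial_head, &oldpage, page));

	return pobjects;
}
```

Pushing one page with 3 free objects and then another with 4 leaves the second page at the head with a running total of 7. The loop is only correct if the final compare-and-swap is atomic with respect to everything that can modify the head in between, which is the property the message above asks for.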
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* [PATCH] tcp: do not scale TSO segment size with reordering degree
From: Neal Cardwell @ 2011-11-22 3:15 UTC (permalink / raw)
To: David Miller
Cc: netdev, ilpo.jarvinen, Nandita Dukkipati, Yuchung Cheng,
Jerry Chu, Tom Herbert, Neal Cardwell
Since 2005 (c1b4a7e69576d65efc31a8cea0714173c2841244)
tcp_tso_should_defer has been using tcp_max_burst() as a target limit
for deciding how large to make outgoing TSO packets when not using
sysctl_tcp_tso_win_divisor. But since 2008
(dd9e0dda66ba38a2ddd1405ac279894260dc5c36) tcp_max_burst() returns the
reordering degree. We should not have tcp_tso_should_defer attempt to
build larger segments just because there is more reordering. This
commit splits the notion of deferral size used in TSO from the notion
of burst size used in cwnd moderation, and returns the TSO deferral
limit to its original value.
Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
include/net/tcp.h | 8 ++++++++
net/ipv4/tcp_cong.c | 2 +-
net/ipv4/tcp_output.c | 2 +-
3 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/include/net/tcp.h b/include/net/tcp.h
index 113160b..87e3c80 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -834,6 +834,14 @@ static inline __u32 tcp_current_ssthresh(const struct sock *sk)
extern void tcp_enter_cwr(struct sock *sk, const int set_ssthresh);
extern __u32 tcp_init_cwnd(const struct tcp_sock *tp, const struct dst_entry *dst);
+/* The maximum number of MSS of available cwnd for which TSO defers
+ * sending if not using sysctl_tcp_tso_win_divisor.
+ */
+static inline __u32 tcp_max_tso_deferred_mss(const struct tcp_sock *tp)
+{
+ return 3;
+}
+
/* Slow start with delack produces 3 packets of burst, so that
* it is safe "de facto". This will be the default - same as
* the default reordering threshold - but if reordering increases,
diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index 850c737..fc6d475 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -292,7 +292,7 @@ int tcp_is_cwnd_limited(const struct sock *sk, u32 in_flight)
left * sysctl_tcp_tso_win_divisor < tp->snd_cwnd &&
left * tp->mss_cache < sk->sk_gso_max_size)
return 1;
- return left <= tcp_max_burst(tp);
+ return left <= tcp_max_tso_deferred_mss(tp);
}
EXPORT_SYMBOL_GPL(tcp_is_cwnd_limited);
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 63170e2..58f69ac 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1581,7 +1581,7 @@ static int tcp_tso_should_defer(struct sock *sk, struct sk_buff *skb)
* frame, so if we have space for more than 3 frames
* then send now.
*/
- if (limit > tcp_max_burst(tp) * tp->mss_cache)
+ if (limit > tcp_max_tso_deferred_mss(tp) * tp->mss_cache)
goto send_now;
}
--
1.7.3.1
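Editor's sketch, not part of the original patch: a small userspace model of the comparison changed in tcp_tso_should_defer(). The model_ names are hypothetical; "limit" stands for the usable window in bytes, and the "old" variant assumes, as the commit message states, that tcp_max_burst() returns the reordering degree.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the constant returned by tcp_max_tso_deferred_mss() in the patch. */
static uint32_t model_max_tso_deferred_mss(void)
{
	return 3;
}

/* Models tcp_max_burst() as described in the commit message: it returns
 * the reordering degree. */
static uint32_t model_max_burst(uint32_t reordering)
{
	return reordering;
}

/* Pre-patch check: returns 1 if TSO would send now instead of deferring. */
int model_send_now_old(uint32_t limit, uint32_t mss, uint32_t reordering)
{
	assert(mss > 0);
	return limit > model_max_burst(reordering) * mss;
}

/* Post-patch check: the threshold no longer depends on reordering. */
int model_send_now_new(uint32_t limit, uint32_t mss)
{
	assert(mss > 0);
	return limit > model_max_tso_deferred_mss() * mss;
}
```

With an MSS of 1460 bytes and 7300 bytes of usable window, the old check sends immediately when reordering is 3 but defers when reordering is 10, because the threshold grows to 14600 bytes. The new check sends immediately in both cases, since its threshold stays at 4380 bytes.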
^ permalink raw reply related
* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Benjamin Herrenschmidt @ 2011-11-22 2:17 UTC (permalink / raw)
To: Christian Kujau
Cc: Markus Trippelsdorf, Eric Dumazet, Alex,Shi,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Christoph Lameter, Pekka Enberg, Matt Mackall,
netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <alpine.DEB.2.01.1111211617220.8000@trent.utfs.org>
On Mon, 2011-11-21 at 16:21 -0800, Christian Kujau wrote:
> On Tue, 22 Nov 2011 at 07:27, Benjamin Herrenschmidt wrote:
> > Note that I hit a similar looking crash (sorry, I couldn't capture a
> > backtrace back then) on a PowerMac G5 (ppc64) while doing a large rsync
> > transfer yesterday with -rc2-something (cfcfc9ec) and
> > Christian Kujau (CC) seems to be able to reproduce something similar on
> > some other ppc platform (Christian, what is your setup ?)
>
> I seem to hit it when heavy disk & cpu IO is in progress on this PowerBook
> G4. Full dmesg & .config: http://nerdbynature.de/bits/3.2.0-rc1/oops/
>
> I've enabled some debug options and now it really points to slub.c:2166
>
> http://nerdbynature.de/bits/3.2.0-rc1/oops/oops4m.jpg
>
> With debug options enabled I'm currently in the xmon debugger, not sure
> what to make of it yet, I'll try to get something useful out of it :)
Is your powerbook one of those that can actually use xmon? (i.e., is the
keyboard working? If it's USB it won't, but if it's ADB it will.)
You probably landed there too late tho, after the corruption happened.
What would be useful would be to see if you can reproduce with SLAB
and/or after backing out the cpu partial functionality.
Cheers,
Ben.
^ permalink raw reply
* Re: [Devel] Re: [PATCH v5 00/10] per-cgroup tcp memory pressure
From: KAMEZAWA Hiroyuki @ 2011-11-22 2:07 UTC (permalink / raw)
To: Glauber Costa
Cc: David Miller, jbottomley, eric.dumazet, linux-kernel, netdev,
paul, lizf, linux-mm, devel, kirill, gthelen
In-Reply-To: <4EC6B457.4010502@parallels.com>
On Fri, 18 Nov 2011 17:39:03 -0200
Glauber Costa <glommer@parallels.com> wrote:
> On 11/17/2011 07:35 PM, David Miller wrote:
> > TCP specific stuff in mm/memcontrol.c, at best that's not nice at all.
>
> How crucial is that? Thing is that as far as I am concerned, all the
> memcg people really want the inner layout of struct mem_cgroup to be
> private to memcontrol.c
This is because memcg is only about memory management and I don't
want its internals to spread widely; 'struct mem_cgroup' has changed often.
But I don't like having TCP code in memcontrol.c either.
New ideas are welcome.
> This means that at some point, we need to have
> at least a wrapper in memcontrol.c that is able to calculate the offset
> of the tcp structure, and since most functions are actually quite
> simple, that would just make us do more function calls.
>
> Well, an alternative to that would be to use a void pointer in the newly
> added struct cg_proto to an already parsed memcg-related field
> (in this case tcp_memcontrol), that would be passed to the functions
> instead of the whole memcg structure. Do you think this would be
> preferable ?
>
like this ?
struct memcg_sub_controls {
	struct mem_cgroup *mem;
	union {
		struct tcp_mem_control tcp;
	} data;
};
/* for loosely coupled controls for memcg */
struct memcg_sub_controls_function {
	struct memcg_sub_controls *(*create)(struct mem_cgroup *);
	struct memcg_sub_controls *(*destroy)(struct mem_cgroup *);
};
int register_memcg_sub_controls(char *name,
		struct memcg_sub_controls_function *abis);
struct mem_cgroup {
	.....
	.....
	/* Root memcg will have no sub_controls! */
	struct memcg_sub_controls *sub_controls[NR_MEMCG_SUB_CONTROLS];
};
Maybe some functions should be exported.
Thanks,
-Kame
^ permalink raw reply
* Re: [PATCH] net-netlink: fix diag to export IPv4 tos for dual-stack IPv6 sockets
From: Maciej Żenczykowski @ 2011-11-22 1:52 UTC (permalink / raw)
To: David S. Miller
Cc: netdev, Maciej Żenczykowski, Murali Raja, Stephen Hemminger,
Eric Dumazet
In-Reply-To: <1321926620-28753-1-git-send-email-zenczykowski@gmail.com>
As you see, I've decided to go with the simplest 'punt the problem to
userspace' solution.
Hope this is acceptable...
- Maciej
^ permalink raw reply
* [PATCH] net-netlink: fix diag to export IPv4 tos for dual-stack IPv6 sockets
From: Maciej Żenczykowski @ 2011-11-22 1:50 UTC (permalink / raw)
To: Maciej Żenczykowski
Cc: netdev, Maciej Żenczykowski, Murali Raja, Stephen Hemminger,
Eric Dumazet, David S. Miller
In-Reply-To: <CAHo-OozYtgN9_TdOTs+40JKpSZmi2c-6yrbjFS2hViFv_QY-5Q@mail.gmail.com>
From: Maciej Żenczykowski <maze@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
CC: Murali Raja <muralira@google.com>
CC: Stephen Hemminger <shemminger@vyatta.com>
CC: Eric Dumazet <eric.dumazet@gmail.com>
CC: David S. Miller <davem@davemloft.net>
---
net/ipv4/inet_diag.c | 14 +++++++++-----
1 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c
index 68e8ac5..ccee270 100644
--- a/net/ipv4/inet_diag.c
+++ b/net/ipv4/inet_diag.c
@@ -108,9 +108,6 @@ static int inet_csk_diag_fill(struct sock *sk,
icsk->icsk_ca_ops->name);
}
- if ((ext & (1 << (INET_DIAG_TOS - 1))) && (sk->sk_family != AF_INET6))
- RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
-
r->idiag_family = sk->sk_family;
r->idiag_state = sk->sk_state;
r->idiag_timer = 0;
@@ -125,16 +122,23 @@ static int inet_csk_diag_fill(struct sock *sk,
r->id.idiag_src[0] = inet->inet_rcv_saddr;
r->id.idiag_dst[0] = inet->inet_daddr;
+ /* IPv6 dual-stack sockets use inet->tos for IPv4 connections,
+ * hence this needs to be included regardless of socket family.
+ */
+ if (ext & (1 << (INET_DIAG_TOS - 1)))
+ RTA_PUT_U8(skb, INET_DIAG_TOS, inet->tos);
+
#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
if (r->idiag_family == AF_INET6) {
const struct ipv6_pinfo *np = inet6_sk(sk);
+ if (ext & (1 << (INET_DIAG_TCLASS - 1)))
+ RTA_PUT_U8(skb, INET_DIAG_TCLASS, np->tclass);
+
ipv6_addr_copy((struct in6_addr *)r->id.idiag_src,
&np->rcv_saddr);
ipv6_addr_copy((struct in6_addr *)r->id.idiag_dst,
&np->daddr);
- if (ext & (1 << (INET_DIAG_TCLASS - 1)))
- RTA_PUT_U8(skb, INET_DIAG_TCLASS, np->tclass);
}
#endif
--
1.7.3.1
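Editor's sketch, not part of the original patch: a userspace model of the extension-mask test used in inet_csk_diag_fill() and of which attributes are reported after this change. The model_ names are hypothetical, and the attribute numbers (TOS = 5, TCLASS = 6) are the values as I recall them from include/linux/inet_diag.h, so treat them as assumptions. The CONFIG_IPV6 conditional is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed attribute numbers; see the note above. */
enum {
	MODEL_INET_DIAG_TOS = 5,
	MODEL_INET_DIAG_TCLASS = 6,
};

/* Attribute N is requested through bit (N - 1) of the ext mask. */
static uint8_t model_ext_bit(int attr)
{
	assert(attr >= 1 && attr <= 8);
	return (uint8_t)(1u << (attr - 1));
}

int model_ext_requested(uint8_t ext, int attr)
{
	return (ext & model_ext_bit(attr)) != 0;
}

struct model_report {
	int tos;	/* 1 if INET_DIAG_TOS would be emitted */
	int tclass;	/* 1 if INET_DIAG_TCLASS would be emitted */
};

/* Post-patch behaviour: TOS is reported for any socket family when
 * requested, TCLASS only for AF_INET6 sockets. */
struct model_report model_fill(uint8_t ext, int is_ipv6_socket)
{
	struct model_report r = { 0, 0 };

	r.tos = model_ext_requested(ext, MODEL_INET_DIAG_TOS);
	r.tclass = is_ipv6_socket &&
		   model_ext_requested(ext, MODEL_INET_DIAG_TCLASS);
	return r;
}
```

Under these assumptions a request mask of 0x30 asks for both attributes: an IPv6 socket gets both, an IPv4 socket gets only TOS. Userspace then decides which value is meaningful for a dual-stack socket, which matches the "punt the problem to userspace" approach described in the follow-up message.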
^ permalink raw reply related
* Get TOS from IP header
From: Naveen B N (nbn) @ 2011-11-22 1:36 UTC (permalink / raw)
To: netdev
Hi All,
I want to read the TOS field of the IP header in
my application, which uses the SCTP protocol.
I know that it can be done with a RAW socket, but
that has the overhead of implementing SCTP in the
application layer.
Is there a better way of getting the TOS field
that we received in the IP header?
I thought of adding code to the kernel to recognize a new socket option:
if this option is set, the TOS contents are returned on receive.
Please let me know if this is the way to go about it, or whether there is
a better way to achieve the same.
Thanks and Regards
Naveen
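Editor's sketch, not part of the original message: for UDP sockets on Linux, the IP_RECVTOS socket option already delivers the received TOS byte as ancillary data, without a RAW socket. The helper below, udp_loopback_recv_tos(), is a hypothetical name; it sends one datagram to itself over loopback and returns the TOS the receiver sees, or -1 on any error. Whether SCTP sockets deliver the same ancillary data is not confirmed here, so this illustrates the mechanism rather than answering the SCTP question.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: returns the received TOS byte, or -1 on error. */
int udp_loopback_recv_tos(int send_tos)
{
	int rx, tx, result = -1, on = 1;
	struct sockaddr_in addr;
	socklen_t alen = sizeof(addr);
	struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };
	char payload = 'x', data;
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} ctl;
	struct iovec iov = { .iov_base = &data, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cm;

	rx = socket(AF_INET, SOCK_DGRAM, 0);
	tx = socket(AF_INET, SOCK_DGRAM, 0);
	if (rx < 0 || tx < 0)
		goto out;

	/* Bind the receiver to an ephemeral loopback port. */
	memset(&addr, 0, sizeof(addr));
	addr.sin_family = AF_INET;
	addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
	if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
		goto out;

	/* Ask for the TOS byte on receive; bound the wait to one second. */
	if (setsockopt(rx, IPPROTO_IP, IP_RECVTOS, &on, sizeof(on)) < 0 ||
	    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0 ||
	    setsockopt(tx, IPPROTO_IP, IP_TOS, &send_tos, sizeof(send_tos)) < 0)
		goto out;

	if (sendto(tx, &payload, 1, 0,
		   (struct sockaddr *)&addr, sizeof(addr)) != 1)
		goto out;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);
	if (recvmsg(rx, &msg, 0) < 0)
		goto out;

	/* The TOS arrives as an IPPROTO_IP / IP_TOS control message. */
	for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) {
		if (cm->cmsg_level == IPPROTO_IP && cm->cmsg_type == IP_TOS) {
			unsigned char tos;

			memcpy(&tos, CMSG_DATA(cm), sizeof(tos));
			result = tos;
		}
	}
out:
	if (rx >= 0)
		close(rx);
	if (tx >= 0)
		close(tx);
	return result;
}
```

Calling udp_loopback_recv_tos(0x10) on a host with a working loopback interface is expected to return 0x10; on a host where the sockets cannot be set up it returns -1.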
^ permalink raw reply
* [PATCH net-next 0/4] tg3: Cleanups, corrections, and MDI-X
From: Matt Carlson @ 2011-11-22 1:01 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
This patchset cleans up 1000Base-X flow control resolution, then
tunes some thresholds, and finally adds MDI-X reporting.
^ permalink raw reply
* [PATCH net-next 3/4] tg3: Restrict large prod ring cap devices
From: Matt Carlson @ 2011-11-22 1:01 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
Future devices may or may not be capable of supporting larger rx
producer rings. This patch changes the code so that this flag is set on
an ASIC rev to ASIC rev basis. Also, this patch changes a place where
the LRG_PROD_RING_CAP flag was not controlling how the rx standard
producer ring size was set.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
drivers/net/ethernet/broadcom/tg3.c | 9 ++++-----
1 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 096a67e..ea3c5f0 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -8553,10 +8553,7 @@ static int tg3_reset_hw(struct tg3 *tp, int reset_phy)
}
if (tg3_flag(tp, 57765_PLUS)) {
- if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_57765)
- val = TG3_RX_STD_MAX_SIZE_5700;
- else
- val = TG3_RX_STD_MAX_SIZE_5717;
+ val = TG3_RX_STD_RING_SIZE(tp);
val <<= BDINFO_FLAGS_MAXLEN_SHIFT;
val |= (TG3_RX_STD_DMA_SZ << 2);
} else
@@ -14009,7 +14006,9 @@ static int __devinit tg3_get_invariants(struct tg3 *tp)
if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719)
tg3_flag_set(tp, 4K_FIFO_LIMIT);
- if (tg3_flag(tp, 5717_PLUS))
+ if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5717 ||
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5719 ||
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5720)
tg3_flag_set(tp, LRG_PROD_RING_CAP);
if (tg3_flag(tp, 57765_PLUS) &&
--
1.7.3.4
^ permalink raw reply related
* [PATCH net-next 4/4] tg3: Add MDI-X reporting
From: Matt Carlson @ 2011-11-22 1:01 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
This patch adds MDI-X state reporting.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
drivers/net/ethernet/broadcom/tg3.c | 24 +++++++++++++++++++++++-
drivers/net/ethernet/broadcom/tg3.h | 5 +++++
2 files changed, 28 insertions(+), 1 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index ea3c5f0..7ff5c72 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -3932,6 +3932,7 @@ static int tg3_setup_copper_phy(struct tg3 *tp, int force_reset)
current_link_up = 0;
current_speed = SPEED_INVALID;
current_duplex = DUPLEX_INVALID;
+ tp->phy_flags &= ~TG3_PHYFLG_MDIX_STATE;
if (tp->phy_flags & TG3_PHYFLG_CAPACITIVE_COUPLING) {
err = tg3_phy_auxctl_read(tp,
@@ -4004,8 +4005,22 @@ static int tg3_setup_copper_phy(struct tg3 *tp, int force_reset)
}
if (current_link_up == 1 &&
- tp->link_config.active_duplex == DUPLEX_FULL)
+ tp->link_config.active_duplex == DUPLEX_FULL) {
+ u32 reg, bit;
+
+ if (tp->phy_flags & TG3_PHYFLG_IS_FET) {
+ reg = MII_TG3_FET_GEN_STAT;
+ bit = MII_TG3_FET_GEN_STAT_MDIXSTAT;
+ } else {
+ reg = MII_TG3_EXT_STAT;
+ bit = MII_TG3_EXT_STAT_MDIX;
+ }
+
+ if (!tg3_readphy(tp, reg, &val) && (val & bit))
+ tp->phy_flags |= TG3_PHYFLG_MDIX_STATE;
+
tg3_setup_flow_control(tp, lcl_adv, rmt_adv);
+ }
}
relink:
@@ -10290,9 +10305,16 @@ static int tg3_get_settings(struct net_device *dev, struct ethtool_cmd *cmd)
if (netif_running(dev)) {
ethtool_cmd_speed_set(cmd, tp->link_config.active_speed);
cmd->duplex = tp->link_config.active_duplex;
+ if (!(tp->phy_flags & TG3_PHYFLG_ANY_SERDES)) {
+ if (tp->phy_flags & TG3_PHYFLG_MDIX_STATE)
+ cmd->eth_tp_mdix = ETH_TP_MDI_X;
+ else
+ cmd->eth_tp_mdix = ETH_TP_MDI;
+ }
} else {
ethtool_cmd_speed_set(cmd, SPEED_INVALID);
cmd->duplex = DUPLEX_INVALID;
+ cmd->eth_tp_mdix = ETH_TP_MDI_INVALID;
}
cmd->phy_address = tp->phy_addr;
cmd->transceiver = XCVR_INTERNAL;
diff --git a/drivers/net/ethernet/broadcom/tg3.h b/drivers/net/ethernet/broadcom/tg3.h
index 8e2f380..9cc10a8 100644
--- a/drivers/net/ethernet/broadcom/tg3.h
+++ b/drivers/net/ethernet/broadcom/tg3.h
@@ -2174,6 +2174,7 @@
#define MII_TG3_EXT_CTRL_TBI 0x8000
#define MII_TG3_EXT_STAT 0x11 /* Extended status register */
+#define MII_TG3_EXT_STAT_MDIX 0x2000
#define MII_TG3_EXT_STAT_LPASS 0x0100
#define MII_TG3_RXR_COUNTERS 0x14 /* Local/Remote Receiver Counts */
@@ -2277,6 +2278,9 @@
#define MII_TG3_FET_PTEST_FRC_TX_LINK 0x1000
#define MII_TG3_FET_PTEST_FRC_TX_LOCK 0x0800
+#define MII_TG3_FET_GEN_STAT 0x1c
+#define MII_TG3_FET_GEN_STAT_MDIXSTAT 0x2000
+
#define MII_TG3_FET_TEST 0x1f
#define MII_TG3_FET_SHADOW_EN 0x0080
@@ -3135,6 +3139,7 @@ struct tg3 {
#define TG3_PHYFLG_SERDES_PREEMPHASIS 0x00010000
#define TG3_PHYFLG_PARALLEL_DETECT 0x00020000
#define TG3_PHYFLG_EEE_CAP 0x00040000
+#define TG3_PHYFLG_MDIX_STATE 0x00200000
u32 led_ctrl;
u32 phy_otp;
--
1.7.3.4
^ permalink raw reply related
* [PATCH net-next 1/4] tg3: Make 1000Base-X FC resolution look like 1000T
From: Matt Carlson @ 2011-11-22 1:01 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
This patch changes tg3's 1000Base-X flow control resolution to look like
the 1000Base-T flow control resolution code.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
drivers/net/ethernet/broadcom/tg3.c | 18 ++++++------------
1 files changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index d9e9c8c..438d099 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -1706,18 +1706,12 @@ static u8 tg3_resolve_flowctrl_1000X(u16 lcladv, u16 rmtadv)
{
u8 cap = 0;
- if (lcladv & ADVERTISE_1000XPAUSE) {
- if (lcladv & ADVERTISE_1000XPSE_ASYM) {
- if (rmtadv & LPA_1000XPAUSE)
- cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
- else if (rmtadv & LPA_1000XPAUSE_ASYM)
- cap = FLOW_CTRL_RX;
- } else {
- if (rmtadv & LPA_1000XPAUSE)
- cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
- }
- } else if (lcladv & ADVERTISE_1000XPSE_ASYM) {
- if ((rmtadv & LPA_1000XPAUSE) && (rmtadv & LPA_1000XPAUSE_ASYM))
+ if (lcladv & rmtadv & ADVERTISE_1000XPAUSE) {
+ cap = FLOW_CTRL_TX | FLOW_CTRL_RX;
+ } else if (lcladv & rmtadv & ADVERTISE_1000XPSE_ASYM) {
+ if (lcladv & ADVERTISE_1000XPAUSE)
+ cap = FLOW_CTRL_RX;
+ if (rmtadv & ADVERTISE_1000XPAUSE)
cap = FLOW_CTRL_TX;
}
--
1.7.3.4
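Editor's sketch, not part of the original patch: the new body of tg3_resolve_flowctrl_1000X() lifted into a standalone userspace function so the resolution can be checked case by case. The MODEL_ constants are local copies of the values I believe ADVERTISE_1000XPAUSE, ADVERTISE_1000XPSE_ASYM, FLOW_CTRL_TX and FLOW_CTRL_RX have in the kernel headers; treat them as assumptions.

```c
#include <stdint.h>

/* Assumed values; see the note above. */
#define MODEL_1000XPAUSE	0x0080	/* symmetric pause advertised */
#define MODEL_1000XPSE_ASYM	0x0100	/* asymmetric pause advertised */
#define MODEL_FLOW_CTRL_TX	0x01
#define MODEL_FLOW_CTRL_RX	0x02

/* Same decision structure as the patched function: symmetric pause on
 * both sides enables both directions; otherwise, if both sides advertise
 * asymmetric pause, the side that also advertises symmetric pause is the
 * one that receives pause frames. */
uint8_t model_resolve_flowctrl_1000X(uint16_t lcladv, uint16_t rmtadv)
{
	uint8_t cap = 0;

	if (lcladv & rmtadv & MODEL_1000XPAUSE) {
		cap = MODEL_FLOW_CTRL_TX | MODEL_FLOW_CTRL_RX;
	} else if (lcladv & rmtadv & MODEL_1000XPSE_ASYM) {
		if (lcladv & MODEL_1000XPAUSE)
			cap = MODEL_FLOW_CTRL_RX;
		if (rmtadv & MODEL_1000XPAUSE)
			cap = MODEL_FLOW_CTRL_TX;
	}

	return cap;
}
```

In the second branch at most one side can advertise symmetric pause, because the first branch already handled the case where both do, so the two inner tests never both fire.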
^ permalink raw reply related
* [PATCH net-next 2/4] tg3: Adjust BD replenish thresholds
From: Matt Carlson @ 2011-11-22 1:01 UTC (permalink / raw)
To: davem; +Cc: netdev, mcarlson
The BD replenish thresholds for the 57765 and newer ASIC revs are a
little strict. They were tuned for a mode that is currently unused.
This patch relaxes the thresholds so that they are set to values more
in line with the resources available.
Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
---
drivers/net/ethernet/broadcom/tg3.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 438d099..096a67e 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -8171,7 +8171,8 @@ static void tg3_setup_rxbd_thresholds(struct tg3 *tp)
if (!tg3_flag(tp, 5750_PLUS) ||
tg3_flag(tp, 5780_CLASS) ||
GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5750 ||
- GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5752)
+ GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5752 ||
+ tg3_flag(tp, 57765_PLUS))
bdcache_maxcnt = TG3_SRAM_RX_STD_BDCACHE_SIZE_5700;
else if (GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5755 ||
GET_ASIC_REV(tp->pci_chip_rev_id) == ASIC_REV_5787)
@@ -8191,10 +8192,7 @@ static void tg3_setup_rxbd_thresholds(struct tg3 *tp)
if (!tg3_flag(tp, JUMBO_CAPABLE) || tg3_flag(tp, 5780_CLASS))
return;
- if (!tg3_flag(tp, 5705_PLUS))
- bdcache_maxcnt = TG3_SRAM_RX_JMB_BDCACHE_SIZE_5700;
- else
- bdcache_maxcnt = TG3_SRAM_RX_JMB_BDCACHE_SIZE_5717;
+ bdcache_maxcnt = TG3_SRAM_RX_JMB_BDCACHE_SIZE_5700;
host_rep_thresh = max_t(u32, tp->rx_jumbo_pending / 8, 1);
--
1.7.3.4
^ permalink raw reply related
* Anyone tried the 40G Mellanox card?
From: Ben Greear @ 2011-11-22 1:01 UTC (permalink / raw)
To: netdev
I just noticed that Mellanox has a 40G Ethernet NIC. Seems it supports
pci-e 3.0 (8GT/s), if someone could find a motherboard and processor
to go with it...
Anyway, if anyone has done any performance tests with this NIC I'd
love to hear how it went...
Thanks,
Ben
--
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc http://www.candelatech.com
^ permalink raw reply
* Re: [PATCH net-next] virtio_net: return already tracked tx_fifo_errors via virtnet_getstats()
From: Rusty Russell @ 2011-11-22 0:56 UTC (permalink / raw)
To: Rick Jones, netdev, mst
In-Reply-To: <20111121192817.F2CAE290041F@tardy>
On Mon, 21 Nov 2011 11:28:17 -0800 (PST), raj@tardy.cup.hp.com (Rick Jones) wrote:
> From: Rick Jones <rick.jones2@hp.com>
>
> Tx_fifo_errors are tracked in start_xmit_ for virtio_net, but not
> reported in the tallies returned by virtnet_stats(). Return them
> as the rx "sub-stats" rx_length_errors and rx_frame_errors are.
>
> Signed-off-by: Rick Jones <rick.jones2@hp.com>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Cheers,
Rusty.
^ permalink raw reply
* Job offers for you
From: duloiscarole @ 2011-11-22 0:51 UTC (permalink / raw)
To: netdev
Hello,
new job offers for you http://www.universfreeads.com/emplois.php
..
^ permalink raw reply
* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Christian Kujau @ 2011-11-22 0:42 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Markus Trippelsdorf, Eric Dumazet, Alex,Shi,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Christoph Lameter, Pekka Enberg, Matt Mackall,
netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <alpine.DEB.2.01.1111211617220.8000@trent.utfs.org>
On Mon, 21 Nov 2011 at 16:21, Christian Kujau wrote:
> With debug options enabled I'm currently in the xmon debugger, not sure
> what to make of it yet, I'll try to get something useful out of it :)
Only screenshots, see the *xmon jpegs in:
http://nerdbynature.de/bits/3.2.0-rc1/oops/
> SLAB does not have the same capabilities to detect corruption.
> You can disable most the cpu partial functionality by setting
> /sys/kernel/slab/*/cpu_partial
> to 0
I'll try to reproduce with those set to 0 (I'm on UP here, in case that
matters).
Thanks,
Christian.
--
BOFH excuse #296:
The hardware bus needs a new token.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* Re: [PATCH v2 5/5] net: Add Open vSwitch kernel components.
From: Stephen Hemminger @ 2011-11-22 0:27 UTC (permalink / raw)
To: Jesse Gross
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David S. Miller
In-Reply-To: <1321911029-20707-6-git-send-email-jesse-l0M0P4e3n4LQT0dZR+AlfA@public.gmane.org>
One more comment...
Shouldn't this device be using netdev_increment_features(), like bridging and bonding,
to have the features of the pseudo device reflect those of the underlying hardware?
This would make the device have TSO only if the underlying hardware supported it, etc.
^ permalink raw reply
* Re: WARNING: at mm/slub.c:3357, kernel BUG at mm/slub.c:3413
From: Christian Kujau @ 2011-11-22 0:21 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Markus Trippelsdorf, Eric Dumazet, Alex,Shi,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
Christoph Lameter, Pekka Enberg, Matt Mackall,
netdev@vger.kernel.org, Tejun Heo
In-Reply-To: <1321907275.13860.12.camel@pasglop>
On Tue, 22 Nov 2011 at 07:27, Benjamin Herrenschmidt wrote:
> Note that I hit a similar looking crash (sorry, I couldn't capture a
> backtrace back then) on a PowerMac G5 (ppc64) while doing a large rsync
> transfer yesterday with -rc2-something (cfcfc9ec) and
> Christian Kujau (CC) seems to be able to reproduce something similar on
> some other ppc platform (Christian, what is your setup ?)
I seem to hit it when heavy disk & cpu IO is in progress on this PowerBook
G4. Full dmesg & .config: http://nerdbynature.de/bits/3.2.0-rc1/oops/
I've enabled some debug options and now it really points to slub.c:2166
http://nerdbynature.de/bits/3.2.0-rc1/oops/oops4m.jpg
With debug options enabled I'm currently in the xmon debugger, not sure
what to make of it yet, I'll try to get something useful out of it :)
Christian.
--
BOFH excuse #399:
We are a 100% Microsoft Shop.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply
* Re: sky2 tx watchdog timeout with 1Gb speed
From: Stephen Hemminger @ 2011-11-22 0:05 UTC (permalink / raw)
To: Milan Kocian; +Cc: netdev
In-Reply-To: <20111120232118.GA29748@ntm.wq.cz>
On Mon, 21 Nov 2011 00:21:18 +0100
Milan Kocian <milon@wq.cz> wrote:
> hi all,
>
> I switched my home pc from 100Mb/s to 1000Mb/s and I see
> this warning below.
>
> The original kernel was 2.6.39.4 then I tested 3.1.1 with the same
> result. (self compiled 32bit vanilla). The workaround is to force 10/100 speed
> on my new switch (hp).
>
> lspci:
>
> 03:00.0 Ethernet controller: Marvell Technology Group Ltd. 88E8056 PCI-E Gigabit Ethernet Controller (rev 13)
> Subsystem: Giga-byte Technology Device e000
> Flags: bus master, fast devsel, latency 0, IRQ 45
> Memory at f5000000 (64-bit, non-prefetchable) [size=16K]
> I/O ports at 9000 [size=256]
> [virtual] Expansion ROM at 80300000 [disabled] [size=128K]
> Capabilities: [48] Power Management version 3
> Capabilities: [50] Vital Product Data
> Capabilities: [5c] MSI: Enable+ Count=1/1 Maskable- 64bit+
> Capabilities: [e0] Express Legacy Endpoint, MSI 00
> Capabilities: [100] Advanced Error Reporting
> Kernel driver in use: sky2
>
>
> Nov 20 21:32:54 milu kernel: sky2 0000:03:00.0: eth0: Link is up at 1000 Mbps, full duplex, flow control both
> Nov 20 21:35:29 milu kernel: ------------[ cut here ]------------
> Nov 20 21:35:29 milu kernel: WARNING: at net/sched/sch_generic.c:255 dev_watchdog+0x1fa/0x206()
> Nov 20 21:35:29 milu kernel: Hardware name: 965GM-S2
> Nov 20 21:35:29 milu kernel: NETDEV WATCHDOG: eth0 (sky2): transmit queue 0 timed out
> Nov 20 21:35:29 milu kernel: Modules linked in: parport_pc parport fuse nfsd ipv6 nfs lockd auth_rpcgss nfs_acl sunrpc usbhid snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_intel8x0 sg snd_ac97_codec sr_mod ac97_bus cdrom sky2 snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss intel_agp snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device snd bitrev i2c_i801 crc32 intel_gtt uhci_hcd i2c_core ehci_hcd soundcore usbcore agpgart evdev snd_page_alloc
> Nov 20 21:35:29 milu kernel: Pid: 0, comm: swapper Not tainted 3.1.1 #2
> Nov 20 21:35:29 milu kernel: Call Trace:
> Nov 20 21:35:29 milu kernel: [<c102cd5d>] ? warn_slowpath_common+0x6c/0x94
> Nov 20 21:35:29 milu kernel: [<c1254deb>] ? dev_watchdog+0x1fa/0x206
> Nov 20 21:35:29 milu kernel: [<c1254deb>] ? dev_watchdog+0x1fa/0x206
> Nov 20 21:35:29 milu kernel: [<c102ce0e>] ? warn_slowpath_fmt+0x33/0x37
> Nov 20 21:35:29 milu kernel: [<c1254deb>] ? dev_watchdog+0x1fa/0x206
> Nov 20 21:35:29 milu kernel: [<c1254bf1>] ? qdisc_reset+0x2d/0x2d
> Nov 20 21:35:29 milu kernel: [<c1036434>] ? run_timer_softirq+0xc6/0x1c4
> Nov 20 21:35:29 milu kernel: [<c1027e9b>] ? run_rebalance_domains+0x148/0x169
> Nov 20 21:35:29 milu kernel: [<c103163b>] ? __do_softirq+0x6e/0xea
> Nov 20 21:35:29 milu kernel: [<c10315cd>] ? remote_softirq_receive+0x11/0x11
> Nov 20 21:35:29 milu kernel: <IRQ> [<c1031906>] ? irq_exit+0x5b/0x67
> Nov 20 21:35:29 milu kernel: [<c101631f>] ? smp_apic_timer_interrupt+0x51/0x81
> Nov 20 21:35:29 milu kernel: [<c12ccd96>] ? apic_timer_interrupt+0x2a/0x30
> Nov 20 21:35:29 milu kernel: [<c13f007b>] ? asus_hides_smbus_hostbridge+0xcb/0x249
> Nov 20 21:35:29 milu kernel: [<c1008732>] ? mwait_idle+0x41/0x51
> Nov 20 21:35:29 milu kernel: [<c10015d8>] ? cpu_idle+0x74/0x84
> Nov 20 21:35:29 milu kernel: [<c13d6638>] ? start_kernel+0x28a/0x28f
> Nov 20 21:35:29 milu kernel: [<c13d615e>] ? loglevel+0x2b/0x2b
> Nov 20 21:35:29 milu kernel: ---[ end trace ef84175f674c7842 ]---
> Nov 20 21:35:29 milu kernel: sky2 0000:03:00.0: eth0: tx timeout
> Nov 20 21:35:29 milu kernel: sky2 0000:03:00.0: eth0: transmit ring 52 .. 30 report=52 done=52
> Nov 20 21:35:32 milu kernel: sky2 0000:03:00.0: eth0: Link is up at 1000 Mbps, full duplex, flow control both
> Nov 20 21:37:13 milu kernel: sky2 0000:03:00.0: eth0: tx timeout
> Nov 20 21:37:13 milu kernel: sky2 0000:03:00.0: eth0: transmit ring 37 .. 15 report=37 done=37
> Nov 20 21:37:16 milu kernel: sky2 0000:03:00.0: eth0: Link is up at 1000 Mbps, full duplex, flow control both
>
> Any suggestions? As I said, it's a home machine, so I can test whatever you want :-).
I haven't seen this. Is it under heavy or light traffic?
Are you running something that might cause the device to miss interrupts?
^ permalink raw reply
* Re: [PATCH net-next v1 1/4] net-e1000e: fix ethtool set_features taking new features into account too late
From: David Decotigny @ 2011-11-22 0:04 UTC (permalink / raw)
To: Michał Mirosław
Cc: Ian Campbell, Paul Gortmaker, e1000-devel, Bruce Allan,
Jesse Brandeburg, linux-kernel, John Ronciak, netdev,
David S. Miller
In-Reply-To: <CAHXqBFKGym1+AkXmEGeTzXeF_hvzfXoMEKrHBNSK5htHKf9PoQ@mail.gmail.com>
Hello,
2011/11/21 Michał Mirosław <mirqus@gmail.com>:
>> static int e1000_set_features(struct net_device *netdev,
>> - netdev_features_t features)
>> + netdev_features_t features)
>> {
>> struct e1000_adapter *adapter = netdev_priv(netdev);
>> netdev_features_t changed = features ^ netdev->features;
>> + int retval = 1; /* telling netdev that we are updating
>> + * netdev->features by ourselves */
>> +
>> + netdev->features = features;
>>
>> if (changed & (NETIF_F_TSO | NETIF_F_TSO6))
>> adapter->flags |= FLAG_TSO_FORCE;
>>
>> if (!(changed & (NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_TX |
>> NETIF_F_RXCSUM)))
>> - return 0;
>> + return retval;
>>
>
> Would be less code if you set netdev->features here...
>
>> if (netif_running(netdev))
>> e1000e_reinit_locked(adapter);
>> else
>> e1000e_reset(adapter);
>>
>> - return 0;
>> + return retval;
>
> ... and return 1 here, noting in a comment that e1000e_reinit_locked()
> might have changed netdev->features.
This would work, although I preferred the systematic approach for code
management reasons. But I will follow your recommendations. Waiting a
little (review of other patches) before sending the updated version.
Thanks!
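For readers following the thread, here is a minimal userspace sketch of the return-value convention being discussed. It is not the e1000e driver and not the updated patch: every mock_* and MOCK_* name is a hypothetical stand-in, the flag values are arbitrary, and the caller's behaviour is an assumption modelled on the patch comment ("telling netdev that we are updating netdev->features by ourselves"). It only illustrates the suggested shape: return 0 on the early path so the caller stores the features, and store them in the driver just before the reset, returning 1.

```c
#include <stdint.h>

/* Arbitrary illustrative bits; the kernel defines the real NETIF_F_* flags. */
#define MOCK_F_TSO          (1ULL << 0)
#define MOCK_F_TSO6         (1ULL << 1)
#define MOCK_F_RXCSUM       (1ULL << 2)
#define MOCK_F_HW_VLAN_RX   (1ULL << 3)
#define MOCK_F_HW_VLAN_TX   (1ULL << 4)
#define MOCK_FLAG_TSO_FORCE (1U << 0)

/* Hypothetical stand-in for net_device plus the adapter private data. */
struct mock_netdev {
	uint64_t features;       /* currently active features */
	uint64_t seen_by_reset;  /* features visible during the last reset */
	unsigned int flags;
	int resets;
};

/* Stands in for e1000e_reinit_locked()/e1000e_reset(): it reads
 * dev->features, so the new value must be stored before it runs. */
static void mock_reset(struct mock_netdev *dev)
{
	dev->seen_by_reset = dev->features;
	dev->resets++;
}

/* Driver hook: 0 = "caller, please store the features",
 * 1 = "features were already stored by the driver". */
static int mock_set_features(struct mock_netdev *dev, uint64_t features)
{
	uint64_t changed = features ^ dev->features;

	if (changed & (MOCK_F_TSO | MOCK_F_TSO6))
		dev->flags |= MOCK_FLAG_TSO_FORCE;

	if (!(changed & (MOCK_F_HW_VLAN_RX | MOCK_F_HW_VLAN_TX |
			 MOCK_F_RXCSUM)))
		return 0;  /* no reset needed */

	dev->features = features;  /* the reset below must see the new value */
	mock_reset(dev);
	return 1;
}

/* Assumed caller behaviour: store the features only when the hook
 * returns 0, and leave them alone when it returns a positive value. */
static int mock_update_features(struct mock_netdev *dev, uint64_t features)
{
	int err = mock_set_features(dev, features);

	if (err < 0)
		return err;
	if (err == 0)
		dev->features = features;
	return 0;
}
```

With this shape only the reset path writes the features itself, which is the "less code" variant suggested in the review; the early-return path leaves the assignment to the caller.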
------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure
contains a definitive record of customers, application performance,
security threats, fraudulent activity, and more. Splunk takes this
data and makes sense of it. IT sense. And common sense.
http://p.sf.net/sfu/splunk-novd2d
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
^ permalink raw reply
* Re: [PATCH net-next v1 1/4] net-e1000e: fix ethtool set_features taking new features into account too late
From: Michał Mirosław @ 2011-11-22 0:00 UTC (permalink / raw)
To: David Decotigny
Cc: Ian Campbell, Paul Gortmaker, e1000-devel, Bruce Allan,
Jesse Brandeburg, linux-kernel, John Ronciak, netdev,
David S. Miller
In-Reply-To: <22b29c09790d922f6e84ec623b8a691a7c3cd0a3.1321917278.git.david.decotigny@google.com>
2011/11/22 David Decotigny <david.decotigny@google.com>:
[...]
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index a5bd7a3..b63f316 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5901,24 +5901,28 @@ static void e1000_eeprom_checks(struct e1000_adapter *adapter)
> }
>
> static int e1000_set_features(struct net_device *netdev,
> - netdev_features_t features)
> + netdev_features_t features)
> {
> struct e1000_adapter *adapter = netdev_priv(netdev);
> netdev_features_t changed = features ^ netdev->features;
> + int retval = 1; /* telling netdev that we are updating
> + * netdev->features by ourselves */
> +
> + netdev->features = features;
>
> if (changed & (NETIF_F_TSO | NETIF_F_TSO6))
> adapter->flags |= FLAG_TSO_FORCE;
>
> if (!(changed & (NETIF_F_HW_VLAN_RX | NETIF_F_HW_VLAN_TX |
> NETIF_F_RXCSUM)))
> - return 0;
> + return retval;
>
Would be less code if you set netdev->features here...
> if (netif_running(netdev))
> e1000e_reinit_locked(adapter);
> else
> e1000e_reset(adapter);
>
> - return 0;
> + return retval;
... and return 1 here, noting in a comment that e1000e_reinit_locked()
might have changed netdev->features.
> }
Best Regards,
Michał Mirosław
^ permalink raw reply
* Re: [PATCH v2 5/5] net: Add Open vSwitch kernel components.
From: Michał Mirosław @ 2011-11-21 23:49 UTC (permalink / raw)
To: Stephen Hemminger
Cc: dev-yBygre7rU0TnMu66kgdUjQ, netdev-u79uwXL29TY76Z2rM5mHXA,
David S. Miller
In-Reply-To: <20111121152518.79e82eb8-We1ePj4FEcvRI77zikRAJc56i+j3xesD0e7PPNI6Mm0@public.gmane.org>
2011/11/22 Stephen Hemminger <shemminger@vyatta.com>:
> On Mon, 21 Nov 2011 15:18:43 -0800
> Jesse Gross <jesse@nicira.com> wrote:
>
>> On Mon, Nov 21, 2011 at 1:59 PM, Stephen Hemminger
>> <shemminger@vyatta.com> wrote:
>> > On Mon, 21 Nov 2011 13:30:29 -0800
>> > Jesse Gross <jesse@nicira.com> wrote:
>> >
>> >> +/**
>> >> + * vport_record_error - indicate device error to generic stats layer
>> >> + *
>> >> + * @vport: vport that encountered the error
>> >> + * @err_type: one of enum vport_err_type types to indicate the error type
>> >> + *
>> >> + * If using the vport generic stats layer indicate that an error of the given
>> >> + * type has occurred.
>> >> + */
>> >> +void vport_record_error(struct vport *vport, enum vport_err_type err_type)
>> >> +{
>> >> + spin_lock(&vport->stats_lock);
>> >
>> > Sorry for over analyzing this... but I don't think the stats_lock
>> > is necessary either. The only thing it is protecting is against 64 bit
>> > wrap. If you used another u64_stat_sync for that one, it could be eliminated.
>> >
>> > Maybe?
>>
>> The reason for stats_lock is that the error stats are not expected to
>> be contended so in order to save some memory they're not per-cpu and
>> we just use a spin lock to protect them.
>
> Assignment or increment of native type size (64 bit on 64 bit cpu)
> is always atomic.
It might be, but it is not always. For example, on load-store
architectures a normal increment (load, inc, store) is not atomic unless
it is done with a special instruction sequence (like LDREX/STREX on ARM).
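To make the point concrete, here is a small userspace sketch (C11, not kernel code). It replays in a single thread the order in which two CPUs could each execute the three steps of a plain increment, and compares that with an atomic read-modify-write. The struct and function names are hypothetical and are not taken from the Open vSwitch patch.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for one error counter; not the vport struct. */
struct err_stats {
	uint64_t plain;           /* updated with a separate load and store */
	_Atomic uint64_t atomic;  /* updated with one read-modify-write */
};

/* Replay load, load, store, store: one order in which two CPUs could
 * each run "plain++" when the increment is three separate steps. */
static uint64_t interleaved_increment(struct err_stats *s)
{
	uint64_t cpu0 = s->plain;  /* CPU 0 loads the old value */
	uint64_t cpu1 = s->plain;  /* CPU 1 loads the same old value */

	s->plain = cpu0 + 1;       /* CPU 0 stores old + 1 */
	s->plain = cpu1 + 1;       /* CPU 1 stores old + 1 again: one update lost */
	return s->plain;
}

/* Two increments done as indivisible read-modify-write operations. */
static uint64_t atomic_increment_twice(struct err_stats *s)
{
	atomic_fetch_add(&s->atomic, 1);
	atomic_fetch_add(&s->atomic, 1);
	return atomic_load(&s->atomic);
}
```

Starting from zero, the interleaved version ends at 1 although two increments ran, while the atomic version ends at 2. A spin lock around the plain increment, as in the patch, avoids the lost update as well.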
Best Regards,
Michał Mirosław
_______________________________________________
dev mailing list
dev@openvswitch.org
http://openvswitch.org/mailman/listinfo/dev
^ permalink raw reply
* [PATCH 1/1] AF_UNIX: Fix poll locking problem when reading from a stream socket
From: Alexey Moiseytsev @ 2011-11-21 23:35 UTC (permalink / raw)
To: David S. Miller, Eric Dumazet; +Cc: netdev, linux-kernel, Alexey Moiseytsev
A poll() call may block because of a concurrent read from the same
stream socket.
Signed-off-by: Alexey Moiseytsev <himeraster@gmail.com>
---
net/unix/af_unix.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 466fbcc..b595a3d 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1957,6 +1957,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
if ((UNIXCB(skb).pid != siocb->scm->pid) ||
(UNIXCB(skb).cred != siocb->scm->cred)) {
skb_queue_head(&sk->sk_receive_queue, skb);
+ sk->sk_data_ready(sk, skb->len);
break;
}
} else {
@@ -1974,6 +1975,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
chunk = min_t(unsigned int, skb->len, size);
if (memcpy_toiovec(msg->msg_iov, skb->data, chunk)) {
skb_queue_head(&sk->sk_receive_queue, skb);
+ sk->sk_data_ready(sk, skb->len);
if (copied == 0)
copied = -EFAULT;
break;
@@ -1991,6 +1993,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
/* put the skb back if we didn't use it up.. */
if (skb->len) {
skb_queue_head(&sk->sk_receive_queue, skb);
+ sk->sk_data_ready(sk, skb->len);
break;
}
@@ -2006,6 +2009,7 @@ static int unix_stream_recvmsg(struct kiocb *iocb, struct socket *sock,
/* put message back and return */
skb_queue_head(&sk->sk_receive_queue, skb);
+ sk->sk_data_ready(sk, skb->len);
break;
}
} while (size);
--
1.7.2.5
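One possible reading of the problem, sketched as a single-threaded userspace simulation. This is an interpretation of the diff above, not a statement of the author's analysis: the reader takes the skb off the receive queue, a concurrent poll() sees an empty queue and goes to sleep, and the reader then puts the skb back without waking anyone. All mock_* names are hypothetical; only the idea of notifying waiters after the requeue comes from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the socket state relevant to the race. */
struct mock_sock {
	int queued;          /* skbs on the receive queue */
	bool poller_asleep;  /* a poll() caller is waiting for data */
};

/* poll(): report readable, or go to sleep if the queue looks empty. */
static bool mock_poll(struct mock_sock *sk)
{
	if (sk->queued > 0)
		return true;
	sk->poller_asleep = true;
	return false;
}

/* Stands in for sk->sk_data_ready(): wake any sleeping poller. */
static void mock_data_ready(struct mock_sock *sk)
{
	sk->poller_asleep = false;
}

/* Replay the interleaving; returns true if the poller is left asleep
 * even though data is back on the queue. */
static bool mock_race(bool notify_on_requeue)
{
	struct mock_sock sk = { .queued = 1, .poller_asleep = false };

	sk.queued--;      /* reader dequeues the skb */
	mock_poll(&sk);   /* concurrent poll() sees an empty queue and sleeps */
	sk.queued++;      /* reader puts the skb back */
	if (notify_on_requeue)
		mock_data_ready(&sk);  /* the call the patch adds after requeueing */

	return sk.poller_asleep;
}
```

In this model mock_race(false) leaves the poller asleep with data queued, and mock_race(true) does not.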
^ permalink raw reply related