* [PATCH] Infiniband: Randomize local port allocation.
From: penguin-kernel @ 2010-04-14 2:01 UTC (permalink / raw)
To: rolandd, sean.hefty
Cc: amwang, opurdila, eric.dumazet, netdev, nhorman, davem, ebiederm,
linux-kernel
In-Reply-To: <21DAC78125424ED291B5D6477CFF9657@amr.corp.intel.com>
Sean Hefty wrote:
> Sean and Roland, is below patch correct?
> >inet_is_reserved_local_port() is the new function proposed in this patchset.
>
> It looks correct to me. I didn't test the patch series, but if I comment out
> the call to inet_is_reserved_local_port() in the patch provided below, the changes
> worked fine for me.
>
> Acked-by: Sean Hefty <sean.hefty@intel.com>
>
Thank you for testing.
I think it is better to split this patch into
Part 1: Make cma_alloc_any_port() use cma_alloc_port().
Part 2: Insert the "!inet_is_reserved_local_port(rover) &&" line.
to help future "git bisect" runs.
Roland, will you review the patch below for part 1?
--------------------
[PATCH] Infiniband: Randomize local port allocation.
Randomize local port allocation in the way sctp_get_port_local() does.
Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
---
drivers/infiniband/core/cma.c | 69 ++++++++++++++----------------------------
1 file changed, 24 insertions(+), 45 deletions(-)
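
The scan used by the new cma_alloc_any_port() can be sketched in userspace
roughly as follows. This is only an illustration under assumptions:
pick_random_port() and the two example predicates are hypothetical names,
rand() stands in for net_random(), a callback stands in for the IDR lookup,
and the last_used_port check from the patch is left out for brevity.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the IDR lookup: true if the port is taken. */
typedef bool (*port_in_use_fn)(unsigned int port);

/*
 * Pick a free port in [low, high]. The scan starts at a random offset
 * and wraps around, so every port in the range is visited exactly once.
 * Returns the port, or -1 if the whole range is in use.
 */
static int pick_random_port(int low, int high, port_in_use_fn in_use)
{
	int remaining = (high - low) + 1;
	unsigned int rover = (unsigned int)rand() % remaining + low;

	do {
		rover++;
		if (rover < (unsigned int)low || rover > (unsigned int)high)
			rover = low;
		if (!in_use(rover))
			return (int)rover;
	} while (--remaining > 0);
	return -1;
}

/* Example predicates (hypothetical): one free port, and no free port. */
static bool only_4242_free(unsigned int port) { return port != 4242; }
static bool all_taken(unsigned int port) { (void)port; return true; }
```

Because the loop runs (high - low) + 1 times, the randomly chosen start port
itself is the last one examined, so no port in the range is skipped.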
--- linux-2.6.34-rc4.orig/drivers/infiniband/core/cma.c
+++ linux-2.6.34-rc4/drivers/infiniband/core/cma.c
@@ -79,7 +79,6 @@ static DEFINE_IDR(sdp_ps);
static DEFINE_IDR(tcp_ps);
static DEFINE_IDR(udp_ps);
static DEFINE_IDR(ipoib_ps);
-static int next_port;
struct cma_device {
struct list_head list;
@@ -1970,47 +1969,32 @@ err1:
static int cma_alloc_any_port(struct idr *ps, struct rdma_id_private *id_priv)
{
- struct rdma_bind_list *bind_list;
- int port, ret, low, high;
-
- bind_list = kzalloc(sizeof *bind_list, GFP_KERNEL);
- if (!bind_list)
- return -ENOMEM;
-
-retry:
- /* FIXME: add proper port randomization per like inet_csk_get_port */
- do {
- ret = idr_get_new_above(ps, bind_list, next_port, &port);
- } while ((ret == -EAGAIN) && idr_pre_get(ps, GFP_KERNEL));
-
- if (ret)
- goto err1;
+ static unsigned int last_used_port;
+ int low, high, remaining;
+ unsigned int rover;
inet_get_local_port_range(&low, &high);
- if (port > high) {
- if (next_port != low) {
- idr_remove(ps, port);
- next_port = low;
- goto retry;
+ remaining = (high - low) + 1;
+ rover = net_random() % remaining + low;
+ do {
+ rover++;
+ if ((rover < low) || (rover > high))
+ rover = low;
+ if (last_used_port != rover &&
+ !idr_find(ps, (unsigned short) rover)) {
+ int ret = cma_alloc_port(ps, id_priv, rover);
+ /*
+ * Remember previously used port number in order to
+ * avoid re-using same port immediately after it is
+ * closed.
+ */
+ if (!ret)
+ last_used_port = rover;
+ if (ret != -EADDRNOTAVAIL)
+ return ret;
}
- ret = -EADDRNOTAVAIL;
- goto err2;
- }
-
- if (port == high)
- next_port = low;
- else
- next_port = port + 1;
-
- bind_list->ps = ps;
- bind_list->port = (unsigned short) port;
- cma_bind_port(bind_list, id_priv);
- return 0;
-err2:
- idr_remove(ps, port);
-err1:
- kfree(bind_list);
- return ret;
+ } while (--remaining > 0);
+ return -EADDRNOTAVAIL;
}
static int cma_use_port(struct idr *ps, struct rdma_id_private *id_priv)
@@ -2995,12 +2979,7 @@ static void cma_remove_one(struct ib_dev
static int __init cma_init(void)
{
- int ret, low, high, remaining;
-
- get_random_bytes(&next_port, sizeof next_port);
- inet_get_local_port_range(&low, &high);
- remaining = (high - low) + 1;
- next_port = ((unsigned int) next_port % remaining) + low;
+ int ret;
cma_wq = create_singlethread_workqueue("rdma_cm");
if (!cma_wq)
^ permalink raw reply
* Re: linux-next: manual merge of the net tree with Linus' tree
From: David Miller @ 2010-04-14 2:00 UTC (permalink / raw)
To: sfr; +Cc: netdev, linux-next, linux-kernel, ken_kawasaki, jpirko
In-Reply-To: <20100414115244.f97ca080.sfr@canb.auug.org.au>
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Wed, 14 Apr 2010 11:52:44 +1000
> Hi Dave,
>
> On Tue, 13 Apr 2010 18:47:24 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
>>
>> Thanks a lot Stephen, I'll merge net-2.6 into net-next-2.6 to
>> fix this up for you.
>
> Thanks.
>
> There was another conflict in drivers/net/virtio_net.c (because there is
> a patch in both your tree and Linus' (via net-current)) that git did not
> quite resolve correctly. The sg_init_table() in add_recvbuf_small() was
> reinserted by the automatic merge ... I removed it in my merge.
Yes I expected that, the cherrypicked fix gets changed by a subsequent
commit in net-next-2.6 that changes where the scatterlist entries are
stored in that driver.
Anyways, thanks for the heads up.
^ permalink raw reply
* Re: linux-next: manual merge of the net tree with Linus' tree
From: Stephen Rothwell @ 2010-04-14 1:52 UTC (permalink / raw)
To: David Miller; +Cc: netdev, linux-next, linux-kernel, ken_kawasaki, jpirko
In-Reply-To: <20100413.184724.112842393.davem@davemloft.net>
Hi Dave,
On Tue, 13 Apr 2010 18:47:24 -0700 (PDT) David Miller <davem@davemloft.net> wrote:
>
> Thanks a lot Stephen, I'll merge net-2.6 into net-next-2.6 to
> fix this up for you.
Thanks.
There was another conflict in drivers/net/virtio_net.c (because there is
a patch in both your tree and Linus' (via net-current)) that git did not
quite resolve correctly. The sg_init_table() in add_recvbuf_small() was
reinserted by the automatic merge ... I removed it in my merge.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
http://www.canb.auug.org.au/~sfr/
^ permalink raw reply
* Re: linux-next: manual merge of the net tree with Linus' tree
From: David Miller @ 2010-04-14 1:47 UTC (permalink / raw)
To: sfr; +Cc: netdev, linux-next, linux-kernel, ken_kawasaki, jpirko
In-Reply-To: <20100414114556.97d7583d.sfr@canb.auug.org.au>
From: Stephen Rothwell <sfr@canb.auug.org.au>
Date: Wed, 14 Apr 2010 11:45:56 +1000
> Hi all,
>
> Today's linux-next merge of the net tree got a conflict in
> drivers/net/pcmcia/smc91c92_cs.c between commit
> a6d37024de02e7cb2b2333e438e71355a9c32a0a ("smc91c92_cs: define
> multicast_table as unsigned char") from Linus' tree and commit
> 22bedad3ce112d5ca1eaf043d4990fa2ed698c87 ("net: convert multicast list to
> list_head") from the net tree.
>
> I fixed it up (see below) and can carry the fix for a while.
Thanks a lot Stephen, I'll merge net-2.6 into net-next-2.6 to
fix this up for you.
^ permalink raw reply
* linux-next: manual merge of the net tree with Linus' tree
From: Stephen Rothwell @ 2010-04-14 1:45 UTC (permalink / raw)
To: David Miller, netdev; +Cc: linux-next, linux-kernel, Ken Kawasaki, Jiri Pirko
Hi all,
Today's linux-next merge of the net tree got a conflict in
drivers/net/pcmcia/smc91c92_cs.c between commit
a6d37024de02e7cb2b2333e438e71355a9c32a0a ("smc91c92_cs: define
multicast_table as unsigned char") from Linus' tree and commit
22bedad3ce112d5ca1eaf043d4990fa2ed698c87 ("net: convert multicast list to
list_head") from the net tree.
I fixed it up (see below) and can carry the fix for a while.
--
Cheers,
Stephen Rothwell sfr@canb.auug.org.au
diff --cc drivers/net/pcmcia/smc91c92_cs.c
index fd9d6e3,ad22676..0000000
--- a/drivers/net/pcmcia/smc91c92_cs.c
+++ b/drivers/net/pcmcia/smc91c92_cs.c
@@@ -1621,10 -1618,14 +1621,10 @@@ static void set_rx_mode(struct net_devi
rx_cfg_setting = RxStripCRC | RxEnable | RxAllMulti;
else {
if (!netdev_mc_empty(dev)) {
- struct dev_mc_list *mc_addr;
+ struct netdev_hw_addr *ha;
- netdev_for_each_mc_addr(mc_addr, dev) {
- u_int position = ether_crc(6, mc_addr->dmi_addr);
+ netdev_for_each_mc_addr(ha, dev) {
+ u_int position = ether_crc(6, ha->addr);
-#ifndef final_version /* Verify multicast address. */
- if ((ha->addr[0] & 1) == 0)
- continue;
-#endif
multicast_table[position >> 29] |= 1 << ((position >> 26) & 7);
}
}
^ permalink raw reply
* Re: forcedeth driver hangs under heavy load
From: David Miller @ 2010-04-14 1:41 UTC (permalink / raw)
To: aabdulla; +Cc: eric.dumazet, smulcahy, bhutchings, netdev, ben, 572201
In-Reply-To: <4BC5539B.6050908@nvidia.com>
From: Ayaz Abdulla <aabdulla@nvidia.com>
Date: Wed, 14 Apr 2010 01:33:15 -0400
> Attached fix has been submitted to netdev.
Thanks!
I'll apply this soon.
^ permalink raw reply
* Re: [PATCH] tun: orphan an skb on tx
From: Herbert Xu @ 2010-04-14 0:58 UTC (permalink / raw)
To: Eric Dumazet
Cc: Michael S. Tsirkin, Jan Kiszka, David S. Miller, Paul Moore,
David Woodhouse, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, qemu-devel
In-Reply-To: <1271183463.16881.545.camel@edumazet-laptop>
On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
>
> Herbert Acked your patch, so I guess it's OK, but I think it can be
> dangerous.
The tun socket accounting was never designed to stop it from
flooding another tun interface. It's there to stop it from
transmitting above a destination interface's TX bandwidth and
causing unnecessary packet drops. It also limits the total amount
of kernel memory that can be pinned down by a single tun interface.
In this case, all we're doing is shifting the accounting from the
"hardware" queue to the qdisc queue.
So your ability to flood a tun interface is essentially unchanged.
BTW we do the same thing in a number of hardware drivers, as well
as virtio-net.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH] Fix SCTP failure with ipv6 source address routing
From: Vlad Yasevich @ 2010-04-14 0:47 UTC (permalink / raw)
To: Paul Gortmaker; +Cc: netdev
In-Reply-To: <1271198256-20477-1-git-send-email-paul.gortmaker@windriver.com>
Paul Gortmaker wrote:
> From: Weixing Shi <Weixing.Shi@windriver.com>
>
> Given the below test case, using source address routing, SCTP
> does not work.
>
> Node-A:
> 1)ifconfig eth0 inet6 add 2001:1::1/64
> 2)ip -6 rule add from 2001:1::1 table 100 pref 100
> 3)ip -6 route add 2001:2::1 dev eth0 table 100
> 4)sctp_darn -H 2001:1::1 -P 250 -l &
>
> Node-B:
> 1)ifconfig eth0 inet6 add 2001:2::1/64
> 2)ip -6 rule add from 2001:2::1 table 100 pref 100
> 3)ip -6 route add 2001:1::1 dev eth0 table 100
> 4)sctp_darn -H 2001:2::1 -P 250 -h 2001:1::1 -p 250 -s
>
> Root cause:
> Node-A and Node-B use source address routing, and in the
> beginning, the source address will be NULL. So SCTP will search
> the routing table by the destination address (because it is using
> the source address routing table), and hence the resulting dst_entry
> will be NULL.
>
> Solution:
> After SCTP gets the correct source address, then we search for
> dst_entry again, and then we will get the correct value.
The problem here is that ipv6 route lookup code in sctp doesn't bother
searching for the source address, unlike the v4 route lookup code.
Compare sctp_v4_get_dst() and sctp_v6_get_dst(). The v4 version bends over
backwards trying to get the correct route, while the v6 version simply does
a single lookup and returns the result.
The v6 route lookup code needs to be fixed to take into account the bound
address list.
-vlad
>
> Signed-off-by: Weixing Shi <Weixing.Shi@windriver.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> ---
> net/sctp/transport.c | 11 +++++++++--
> 1 files changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/net/sctp/transport.c b/net/sctp/transport.c
> index be4d63d..b5ae18c 100644
> --- a/net/sctp/transport.c
> +++ b/net/sctp/transport.c
> @@ -295,9 +295,16 @@ void sctp_transport_route(struct sctp_transport *transport,
>
> if (saddr)
> memcpy(&transport->saddr, saddr, sizeof(union sctp_addr));
> - else
> + else {
> af->get_saddr(opt, asoc, dst, daddr, &transport->saddr);
> -
> + /* When using source address routing, since dst was
> + * looked up prior to filling in the source address, dst
> + * needs to be looked up again to get the correct dst
> + */
> + if (dst)
> + dst_release(dst);
> + dst = af->get_dst(asoc, daddr, &transport->saddr);
> + }
> transport->dst = dst;
> if ((transport->param_flags & SPP_PMTUD_DISABLE) && transport->pathmtu) {
> return;
^ permalink raw reply
* Phylib polling when doing mdio_read will cause system response and transfer speed drop
From: Bryan Wu @ 2010-04-14 0:27 UTC (permalink / raw)
To: afleming, davem; +Cc: netdev, LKML
Hi Andy and David,
After I posted a patch to add phylib support to drivers/net/fec.c, we found
a performance regression on the Freescale i.MX51 Babbage board.
The patch is
http://git.kernel.org/?p=linux/kernel/git/davem/net-next-2.6.git;a=commitdiff;h=e6b043d512fa8d9a3801bf5d72bfa3b8fc3b3cc8.
Bug tracker is here:
https://bugs.launchpad.net/ubuntu/+source/linux-fsl-imx51/+bug/546649
I found that the root cause is the polling operation in the mdio_read function.
When we transfer large files, we hit the polling timeout many times. So I have
several questions:
1. Do I need to return -ETIMEDOUT when polling times out? If I don't return
-ETIMEDOUT, performance improves a lot. After checking other drivers, some don't
return anything, some return 0, and some return a negative value. What is the
rule for this mdio_read polling timeout case?
2. How should the polling busy wait be done? Normally, we won't busy wait very
long when polling. But hardware is not always perfect. Running cpu_relax() 10000
times while polling makes our system respond very badly when the hardware doesn't
set the flag as we expect. Maybe udelay(25) 10 times or msleep(1) 10 times is
better than that.
I have a patch that works around this issue:
http://kernel.ubuntu.com/git?p=roc/ubuntu-lucid.git;a=commitdiff;h=5d77e3409b319ca84183bf1d2fd158a9c864e03f.
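
The second question, a bounded poll that sleeps between attempts instead of
spinning, can be sketched in userspace as below. This is a sketch under
assumptions, not the fec driver code: poll_with_timeout(), ready_fn and
ready_after() are hypothetical names, and nanosleep() stands in for
udelay()/msleep().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for reading the "transfer done" status flag. */
typedef bool (*ready_fn)(void *ctx);

/*
 * Poll ready() up to max_tries times, sleeping delay_us microseconds
 * (must be below 1000000) between attempts instead of spinning.
 * Returns 0 once ready() reports true, -ETIMEDOUT otherwise.
 */
static int poll_with_timeout(ready_fn ready, void *ctx,
			     int max_tries, long delay_us)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = delay_us * 1000L };
	int i;

	for (i = 0; i < max_tries; i++) {
		if (ready(ctx))
			return 0;
		nanosleep(&ts, NULL);
	}
	return -ETIMEDOUT;
}

/* Example predicate (hypothetical): ready once the countdown reaches 0. */
static bool ready_after(void *ctx)
{
	int *countdown = ctx;

	return (*countdown)-- <= 0;
}
```

With max_tries = 10 and delay_us = 25 the worst case is bounded at roughly
250 microseconds of sleeping, which is the udelay(25) x 10 budget mentioned
above; whether the caller should propagate -ETIMEDOUT is the open question.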
Thanks a lot,
--
Bryan Wu <bryan.wu@canonical.com>
Kernel Developer +86.138-1617-6545 Mobile
Ubuntu Kernel Team | Hardware Enablement Team
Canonical Ltd. www.canonical.com
Ubuntu - Linux for human beings | www.ubuntu.com
^ permalink raw reply
* Re: [PATCH] Add somaxconn to Documentation/sysctl/net.txt
From: Rob Landley @ 2010-04-13 23:54 UTC (permalink / raw)
To: Eric Dumazet; +Cc: linux-kernel, linux-doc, netdev
In-Reply-To: <1271184012.16881.549.camel@edumazet-laptop>
On Tuesday 13 April 2010 13:40:12 Eric Dumazet wrote:
> Le mardi 13 avril 2010 à 13:25 -0500, Rob Landley a écrit :
> > From: Rob Landley <rob@landley.net>
> >
> > Add somaxconn to Documentation/sysctl/net.txt
> >
> > Signed-off-by: Rob Landley <rob@landley.net>
> > ---
> >
> > Documentation/sysctl/net.txt | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/Documentation/sysctl/net.txt b/Documentation/sysctl/net.txt
> > index df38ef0..2740085 100644
> > --- a/Documentation/sysctl/net.txt
> > +++ b/Documentation/sysctl/net.txt
> > @@ -90,6 +90,12 @@ optmem_max
> > Maximum ancillary buffer size allowed per socket. Ancillary data is a
> > sequence of struct cmsghdr structures with appended data.
> >
> > +somaxconn
> > +---------
> > +
> > +Maximum backlog of unanswered connections for a listening socket. Provides
> > +an upper bound on the "backlog" parameter of the listen() syscall.
> > +
> > 2. /proc/sys/net/unix - Parameters for Unix domain sockets
> > -------------------------------------------------------
>
> Please cc netdev for such patches
>
> Extract of Documentation/networking/ip-sysctl.txt
>
> somaxconn - INTEGER
> Limit of socket listen() backlog, known in userspace as SOMAXCONN.
> Defaults to 128. See also tcp_max_syn_backlog for additional tuning
> for TCP sockets.
>
> I guess you need to change both files ?
Dunno. I just got a question on the busybox mailing list:
http://lists.busybox.net/pipermail/busybox/2010-April/072090.html
Looked in Documentation to see what /proc/sys/net/core/somaxconn actually
_did_, found it was undocumented, grepped the kernel source for somaxconn,
found just one chunk of code actually using it, replied to the guy's question:
http://lists.busybox.net/pipermail/busybox/2010-April/072096.html
And then tweaked the documentation with what I'd found, and sent in a doc
patch so I wouldn't have to do that twice.
It's quite possible I got it wrong. Maybe it's per interface or something?
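
For reference, the "upper bound" behaviour described in the doc patch can be
sketched like this. It is a userspace illustration that assumes the limit is
applied as a plain clamp on the requested backlog; clamp_backlog() is a
hypothetical name, not a kernel function.

```c
/*
 * Clamp a requested listen() backlog to the somaxconn limit.
 * Comparing as unsigned means a negative request is clamped as well.
 */
static int clamp_backlog(int backlog, int somaxconn)
{
	if ((unsigned int)backlog > (unsigned int)somaxconn)
		backlog = somaxconn;
	return backlog;
}
```

With the default of 128 quoted above, a listen(fd, 1024) request would end up
with an effective backlog of 128 under this model.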
Rob
--
Latency is more important than throughput. It's that simple. - Linus Torvalds
^ permalink raw reply
* Re: [PATCH net-next-2.6] net: sk_dst_cache RCUification
From: David Miller @ 2010-04-13 23:11 UTC (permalink / raw)
To: eric.dumazet; +Cc: netdev, paulmck
In-Reply-To: <1271199845.16881.586.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 14 Apr 2010 01:04:05 +0200
> Instead of using rcu on whole "struct socket", my plan is to use a small
> structure :
>
> struct wait_queue_head_rcu {
> wait_queue_head_t wait;
> struct rcu_head rcu;
> } ____cacheline_aligned_in_smp;
>
> and make sk->sk_sleep point to this 'wait' field.
So you're relying upon the fact that in the non-FASYNC case
the struct socket's wait queue is never actually used?
^ permalink raw reply
* Re: [PATCH net-next-2.6] net: sk_dst_cache RCUification
From: Eric Dumazet @ 2010-04-13 23:04 UTC (permalink / raw)
To: David Miller; +Cc: netdev, paulmck
In-Reply-To: <20100413.015232.67916764.davem@davemloft.net>
Le mardi 13 avril 2010 à 01:52 -0700, David Miller a écrit :
> Applied, thanks for doing this work Eric.
Thanks David :)
I am now working on the sk_callback_lock case, to speed up
sock_def_readable() and sock_def_write_space() in typical cases
(SOCK_FASYNC not set).
Instead of using rcu on whole "struct socket", my plan is to use a small
structure :
struct wait_queue_head_rcu {
wait_queue_head_t wait;
struct rcu_head rcu;
} ____cacheline_aligned_in_smp;
and make sk->sk_sleep point to this 'wait' field.
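
A rough userspace sketch of why pointing at the embedded 'wait' field is
enough: the enclosing structure, and therefore its rcu_head, can be recovered
from that pointer. The type definitions below are minimal stand-ins, and
wq_from_wait() is a hypothetical helper; the assumption is that the rcu field
would be used to defer freeing.

```c
#include <stddef.h>

/* Minimal stand-ins for the kernel types, for illustration only. */
typedef struct { int dummy; } wait_queue_head_t;
struct rcu_head { void *next; };

struct wait_queue_head_rcu {
	wait_queue_head_t wait;
	struct rcu_head rcu;
};

/* Same idea as the kernel's container_of(), written with offsetof(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Recover the enclosing structure from a pointer to its 'wait' field. */
static struct wait_queue_head_rcu *wq_from_wait(wait_queue_head_t *wait)
{
	return container_of(wait, struct wait_queue_head_rcu, wait);
}
```

Existing users of sk->sk_sleep keep seeing a plain wait_queue_head_t pointer,
while the release path can still reach the rcu member.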
^ permalink raw reply
* Re: [PATCH v2] net: batch skb dequeueing from softnet input_pkt_queue
From: Changli Gao @ 2010-04-13 22:43 UTC (permalink / raw)
To: paulmck; +Cc: Eric Dumazet, David S. Miller, netdev
In-Reply-To: <20100413155227.GC2538@linux.vnet.ibm.com>
On Tue, Apr 13, 2010 at 11:52 PM, Paul E. McKenney
<paulmck@linux.vnet.ibm.com> wrote:
> On Tue, Apr 13, 2010 at 05:50:29PM +0800, Changli Gao wrote:
>> On Tue, Apr 13, 2010 at 4:08 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>> >
>> > Probably not necessary.
>> >
>> >> + volatile bool flush_processing_queue;
>> >
>> > Use of 'volatile' is strongly discouraged, I would say, forbidden.
>>
>> volatile is used to avoid compiler optimization.
>
> Would it be reasonable to use ACCESS_ONCE() where this variable is used?
Oh, thanks. ACCESS_ONCE() is just what I need.
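
For readers following along, a small sketch of the idea: the variable itself
stays non-volatile, and only the accesses that must not be optimized away go
through a volatile cast. The macro below mirrors the usual GCC-style
definition; flush_flag, request_flush() and flush_requested() are
hypothetical names for illustration.

```c
/* Force exactly one access through a volatile-qualified lvalue (GCC/Clang). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

static int flush_flag;	/* hypothetical flag, deliberately not volatile */

/* Writer side: the store cannot be elided or merged by the compiler. */
static void request_flush(void)
{
	ACCESS_ONCE(flush_flag) = 1;
}

/* Reader side: the flag is re-read from memory on every call. */
static int flush_requested(void)
{
	return ACCESS_ONCE(flush_flag);
}
```

Note that this only constrains the compiler; it does not add any memory
barrier or atomicity guarantee.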
--
Regards,
Changli Gao(xiaosuo@gmail.com)
^ permalink raw reply
* [PATCH] Fix SCTP failure with ipv6 source address routing
From: Paul Gortmaker @ 2010-04-13 22:37 UTC (permalink / raw)
To: netdev; +Cc: vladislav.yasevich
From: Weixing Shi <Weixing.Shi@windriver.com>
Given the below test case, using source address routing, SCTP
does not work.
Node-A:
1)ifconfig eth0 inet6 add 2001:1::1/64
2)ip -6 rule add from 2001:1::1 table 100 pref 100
3)ip -6 route add 2001:2::1 dev eth0 table 100
4)sctp_darn -H 2001:1::1 -P 250 -l &
Node-B:
1)ifconfig eth0 inet6 add 2001:2::1/64
2)ip -6 rule add from 2001:2::1 table 100 pref 100
3)ip -6 route add 2001:1::1 dev eth0 table 100
4)sctp_darn -H 2001:2::1 -P 250 -h 2001:1::1 -p 250 -s
Root cause:
Node-A and Node-B use source address routing, and in the
begining, the source address will be NULL. So SCTP will search
the routing table by the destination address (because it is using
the source address routing table), and hence the resulting dst_entry
will be NULL.
Solution:
After SCTP gets the correct source address, then we search for
dst_entry again, and then we will get the correct value.
Signed-off-by: Weixing Shi <Weixing.Shi@windriver.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
net/sctp/transport.c | 11 +++++++++--
1 files changed, 9 insertions(+), 2 deletions(-)
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index be4d63d..b5ae18c 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -295,9 +295,16 @@ void sctp_transport_route(struct sctp_transport *transport,
if (saddr)
memcpy(&transport->saddr, saddr, sizeof(union sctp_addr));
- else
+ else {
af->get_saddr(opt, asoc, dst, daddr, &transport->saddr);
-
+ /* When using source address routing, since dst was
+ * looked up prior to filling in the source address, dst
+ * needs to be looked up again to get the correct dst
+ */
+ if (dst)
+ dst_release(dst);
+ dst = af->get_dst(asoc, daddr, &transport->saddr);
+ }
transport->dst = dst;
if ((transport->param_flags & SPP_PMTUD_DISABLE) && transport->pathmtu) {
return;
--
1.6.5.2
^ permalink raw reply related
* Re: [PATCH 0/9] net: support multiple independant multicast routing instances
From: David Miller @ 2010-04-13 21:51 UTC (permalink / raw)
To: kaber; +Cc: netdev
In-Reply-To: <1271171003-11901-1-git-send-email-kaber@trash.net>
From: Patrick McHardy <kaber@trash.net>
Date: Tue, 13 Apr 2010 17:03:14 +0200
> this is an updated patchset of my patches to support multiple independent
> multicast routing instances. Changes since the last posting are:
>
> - rebase to the current net-next-2.6.git tree
> - fix up patch subjects to consistently refer to "ipv4: ipmr:"
> - fix up list_head conversion patch to add new elements at the head of
> the list instead of at the tail
>
> Please apply or pull from:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/kaber/ipmr-2.6.git master
I applied the patches instead of pulling just to check your email
patch submission format, and it was perfect! :-)
I'll do a git pull next time.
All applied to net-next-2.6, thanks!
^ permalink raw reply
* Re: forcedeth driver hangs under heavy load
From: Eric Dumazet @ 2010-04-13 21:46 UTC (permalink / raw)
To: David Miller; +Cc: smulcahy, bhutchings, netdev, ben, aabdulla, 572201
In-Reply-To: <20100413.144340.138717714.davem@davemloft.net>
Le mardi 13 avril 2010 à 14:43 -0700, David Miller a écrit :
> Do you really come to the conclusion that TSO is broken with the above
> test results?
>
> I would conclude that there is a TX checksumming issue, since merely
> turning TSO off does not fix the problem whereas turning TX
> checksumming off does.
Indeed, we clarified the point and it is a TX checksum issue.
^ permalink raw reply
* Re: forcedeth driver hangs under heavy load
From: David Miller @ 2010-04-13 21:43 UTC (permalink / raw)
To: eric.dumazet; +Cc: smulcahy, bhutchings, netdev, ben, aabdulla, 572201
In-Reply-To: <1271169741.16881.437.camel@edumazet-laptop>
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 13 Apr 2010 16:42:21 +0200
> Le mardi 13 avril 2010 à 15:27 +0100, stephen mulcahy a écrit :
>> Ok, I've tried both of the following with my reproducer
>>
>> 1. ethtool -K eth0 tso off
>>
>> RESULT: reproducer causes multiple hosts to become unresponsive on
>> first run.
>>
>> 2. ethtool -K eth0 tx off
>>
>> RESULT: reproducer runs three times without any hosts becoming unresponsive.
>>
>> -stephen
>
> Thanks Stephen !
>
> Now some brave fouls to check the 6410 lines of this driver ? ;)
>
> Question of the day : Why TSO is broken in forcedeth ?
> Is it generically broken or is it broken for specific NICS ?
Do you really come to the conclusion that TSO is broken with the above
test results?
I would conclude that there is a TX checksumming issue, since merely
turning TSO off does not fix the problem whereas turning TX
checksumming off does.
^ permalink raw reply
* RE: [PATCH 2/3] cxgb4i: main driver files
From: Karen Xie @ 2010-04-13 21:41 UTC (permalink / raw)
To: Mike Christie, open-iscsi
Cc: Rakesh Ranjan, netdev, linux-scsi, linux-kernel, davem,
James.Bottomley
In-Reply-To: <4BC4D711.5030802@cs.wisc.edu>
Hi, Mike,
Yes, will do that for the next submission.
Thanks,
Karen
-----Original Message-----
From: Mike Christie [mailto:michaelc@cs.wisc.edu]
Sent: Tuesday, April 13, 2010 1:42 PM
To: open-iscsi@googlegroups.com
Cc: Rakesh Ranjan; netdev@vger.kernel.org; linux-scsi@vger.kernel.org;
linux-kernel@vger.kernel.org; Karen Xie; davem@davemloft.net;
James.Bottomley@hansenpartnership.com
Subject: Re: [PATCH 2/3] cxgb4i: main driver files
On 04/08/2010 07:14 AM, Rakesh Ranjan wrote:
> +static inline int cxgb4i_ddp_gl_map(struct pci_dev *pdev,
> + struct cxgb4i_gather_list *gl)
> +{
> + int i;
> +
> + for (i = 0; i< gl->nelem; i++) {
> + gl->phys_addr[i] = pci_map_page(pdev, gl->pages[i], 0,
> + PAGE_SIZE,
Hey Rakesh,
I guess we are trying to move away from the pci mapping functions and move
to the dma ones. On your next submission, could you fix those up too?
^ permalink raw reply
* Re: [PATCH Resubmission] drivers/net/usb: Add new driver ipheth
From: David Miller @ 2010-04-13 21:29 UTC (permalink / raw)
To: agimenez
Cc: linux-kernel, dgiagio, dborca, gregkh, jonas.sjoquist,
steve.glendinning, torgny.johansson, dbrownell, omar.oberthur,
linux-usb, netdev
In-Reply-To: <4BC4BFFD.9040802@sysvalve.es>
From: "L. Alberto Giménez" <agimenez@sysvalve.es>
Date: Tue, 13 Apr 2010 21:03:25 +0200
> Thanks for the info. I didn't know that I had to add an entry on the
> upper level Makefile. I guess that something like
> obj-$(CONFIG_USB_IPHETH) += usb/ should be enough? (I got it from the
> other USB net drivers).
Yes.
^ permalink raw reply
* [PATCH net-next-2.6] drivers: net: use skb_headlen()
From: Eric Dumazet @ 2010-04-13 20:48 UTC (permalink / raw)
To: David Miller; +Cc: netdev
Replace occurrences of (skb->len - skb->data_len) with skb_headlen(skb).
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
---
drivers/atm/eni.c | 2 +-
drivers/atm/he.c | 4 ++--
drivers/net/3c59x.c | 4 ++--
drivers/net/atl1e/atl1e_main.c | 2 +-
drivers/net/atlx/atl1.c | 4 ++--
drivers/net/benet/be_main.c | 4 ++--
drivers/net/chelsio/sge.c | 8 ++++----
drivers/net/e1000/e1000_main.c | 4 ++--
drivers/net/e1000e/netdev.c | 4 ++--
drivers/net/ehea/ehea_main.c | 10 +++++-----
drivers/net/forcedeth.c | 4 ++--
drivers/net/ixgbevf/ixgbevf_main.c | 2 +-
drivers/net/ksz884x.c | 2 +-
drivers/net/myri10ge/myri10ge.c | 2 +-
drivers/net/s2io.c | 4 ++--
drivers/net/tehuti.c | 2 +-
drivers/net/tsi108_eth.c | 4 ++--
17 files changed, 33 insertions(+), 33 deletions(-)
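
For reference, the helper computes the same value as the open-coded
expression it replaces. A self-contained sketch, with a cut-down struct
sk_buff that holds only the two fields involved (the real structure has many
more members):

```c
/* Minimal stand-in for the two struct sk_buff fields involved. */
struct sk_buff {
	unsigned int len;	/* total bytes: linear area + page fragments */
	unsigned int data_len;	/* bytes held in page fragments only */
};

/* Length of the linear (header) area of the buffer. */
static inline unsigned int skb_headlen(const struct sk_buff *skb)
{
	return skb->len - skb->data_len;
}
```

For a fully linear skb data_len is 0, so skb_headlen() simply equals len.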
diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 719ec5a..90a5a7c 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -1131,7 +1131,7 @@ DPRINTK("doing direct send\n"); /* @@@ well, this doesn't work anyway */
if (i == -1)
put_dma(tx->index,eni_dev->dma,&j,(unsigned long)
skb->data,
- skb->len - skb->data_len);
+ skb_headlen(skb));
else
put_dma(tx->index,eni_dev->dma,&j,(unsigned long)
skb_shinfo(skb)->frags[i].page + skb_shinfo(skb)->frags[i].page_offset,
diff --git a/drivers/atm/he.c b/drivers/atm/he.c
index c213e0d..56c2e99 100644
--- a/drivers/atm/he.c
+++ b/drivers/atm/he.c
@@ -2664,8 +2664,8 @@ he_send(struct atm_vcc *vcc, struct sk_buff *skb)
#ifdef USE_SCATTERGATHER
tpd->iovec[slot].addr = pci_map_single(he_dev->pci_dev, skb->data,
- skb->len - skb->data_len, PCI_DMA_TODEVICE);
- tpd->iovec[slot].len = skb->len - skb->data_len;
+ skb_headlen(skb), PCI_DMA_TODEVICE);
+ tpd->iovec[slot].len = skb_headlen(skb);
++slot;
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index 5f92fdb..9752530 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2129,8 +2129,8 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev)
int i;
vp->tx_ring[entry].frag[0].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data,
- skb->len-skb->data_len, PCI_DMA_TODEVICE));
- vp->tx_ring[entry].frag[0].length = cpu_to_le32(skb->len-skb->data_len);
+ skb_headlen(skb), PCI_DMA_TODEVICE));
+ vp->tx_ring[entry].frag[0].length = cpu_to_le32(skb_headlen(skb));
for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
diff --git a/drivers/net/atl1e/atl1e_main.c b/drivers/net/atl1e/atl1e_main.c
index b6605d4..d45356f 100644
--- a/drivers/net/atl1e/atl1e_main.c
+++ b/drivers/net/atl1e/atl1e_main.c
@@ -1679,7 +1679,7 @@ static void atl1e_tx_map(struct atl1e_adapter *adapter,
{
struct atl1e_tpd_desc *use_tpd = NULL;
struct atl1e_tx_buffer *tx_buffer = NULL;
- u16 buf_len = skb->len - skb->data_len;
+ u16 buf_len = skb_headlen(skb);
u16 map_len = 0;
u16 mapped_len = 0;
u16 hdr_len = 0;
diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
index 0ebd820..33448a0 100644
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -2347,7 +2347,7 @@ static netdev_tx_t atl1_xmit_frame(struct sk_buff *skb,
{
struct atl1_adapter *adapter = netdev_priv(netdev);
struct atl1_tpd_ring *tpd_ring = &adapter->tpd_ring;
- int len = skb->len;
+ int len;
int tso;
int count = 1;
int ret_val;
@@ -2359,7 +2359,7 @@ static netdev_tx_t atl1_xmit_frame(struct sk_buff *skb,
unsigned int f;
unsigned int proto_hdr_len;
- len -= skb->data_len;
+ len = skb_headlen(skb);
if (unlikely(skb->len <= 0)) {
dev_kfree_skb_any(skb);
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index 18e0a80..fa10f13 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -432,7 +432,7 @@ static int make_tx_wrbs(struct be_adapter *adapter,
map_head = txq->head;
if (skb->len > skb->data_len) {
- int len = skb->len - skb->data_len;
+ int len = skb_headlen(skb);
busaddr = pci_map_single(pdev, skb->data, len,
PCI_DMA_TODEVICE);
if (pci_dma_mapping_error(pdev, busaddr))
@@ -1098,7 +1098,7 @@ static void be_tx_compl_process(struct be_adapter *adapter, u16 last_index)
cur_index = txq->tail;
wrb = queue_tail_node(txq);
unmap_tx_frag(adapter->pdev, wrb, (unmap_skb_hdr &&
- sent_skb->len > sent_skb->data_len));
+ skb_headlen(sent_skb)));
unmap_skb_hdr = false;
num_wrbs++;
diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c
index a8ffc1e..f01cfdb 100644
--- a/drivers/net/chelsio/sge.c
+++ b/drivers/net/chelsio/sge.c
@@ -1123,7 +1123,7 @@ static inline unsigned int compute_large_page_tx_descs(struct sk_buff *skb)
if (PAGE_SIZE > SGE_TX_DESC_MAX_PLEN) {
unsigned int nfrags = skb_shinfo(skb)->nr_frags;
- unsigned int i, len = skb->len - skb->data_len;
+ unsigned int i, len = skb_headlen(skb);
while (len > SGE_TX_DESC_MAX_PLEN) {
count++;
len -= SGE_TX_DESC_MAX_PLEN;
@@ -1219,10 +1219,10 @@ static inline void write_tx_descs(struct adapter *adapter, struct sk_buff *skb,
ce = &q->centries[pidx];
mapping = pci_map_single(adapter->pdev, skb->data,
- skb->len - skb->data_len, PCI_DMA_TODEVICE);
+ skb_headlen(skb), PCI_DMA_TODEVICE);
desc_mapping = mapping;
- desc_len = skb->len - skb->data_len;
+ desc_len = skb_headlen(skb);
flags = F_CMD_DATAVALID | F_CMD_SOP |
V_CMD_EOP(nfrags == 0 && desc_len <= SGE_TX_DESC_MAX_PLEN) |
@@ -1258,7 +1258,7 @@ static inline void write_tx_descs(struct adapter *adapter, struct sk_buff *skb,
ce->skb = NULL;
dma_unmap_addr_set(ce, dma_addr, mapping);
- dma_unmap_len_set(ce, dma_len, skb->len - skb->data_len);
+ dma_unmap_len_set(ce, dma_len, skb_headlen(skb));
for (i = 0; nfrags--; i++) {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 47da5fc..974a02d 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2929,7 +2929,7 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
unsigned int first, max_per_txd = E1000_MAX_DATA_PER_TXD;
unsigned int max_txd_pwr = E1000_MAX_TXD_PWR;
unsigned int tx_flags = 0;
- unsigned int len = skb->len - skb->data_len;
+ unsigned int len = skb_headlen(skb);
unsigned int nr_frags;
unsigned int mss;
int count = 0;
@@ -2980,7 +2980,7 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}
- len = skb->len - skb->data_len;
+ len = skb_headlen(skb);
break;
default:
/* do nothing */
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 38390b5..214db04 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -4130,7 +4130,7 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
unsigned int max_per_txd = E1000_MAX_PER_TXD;
unsigned int max_txd_pwr = E1000_MAX_TXD_PWR;
unsigned int tx_flags = 0;
- unsigned int len = skb->len - skb->data_len;
+ unsigned int len = skb_headlen(skb);
unsigned int nr_frags;
unsigned int mss;
int count = 0;
@@ -4180,7 +4180,7 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
dev_kfree_skb_any(skb);
return NETDEV_TX_OK;
}
- len = skb->len - skb->data_len;
+ len = skb_headlen(skb);
}
}
diff --git a/drivers/net/ehea/ehea_main.c b/drivers/net/ehea/ehea_main.c
index e2d25fb..3f445ef 100644
--- a/drivers/net/ehea/ehea_main.c
+++ b/drivers/net/ehea/ehea_main.c
@@ -1618,7 +1618,7 @@ static void write_swqe2_TSO(struct sk_buff *skb,
{
struct ehea_vsgentry *sg1entry = &swqe->u.immdata_desc.sg_entry;
u8 *imm_data = &swqe->u.immdata_desc.immediate_data[0];
- int skb_data_size = skb->len - skb->data_len;
+ int skb_data_size = skb_headlen(skb);
int headersize;
/* Packet is TCP with TSO enabled */
@@ -1629,7 +1629,7 @@ static void write_swqe2_TSO(struct sk_buff *skb,
*/
headersize = ETH_HLEN + ip_hdrlen(skb) + tcp_hdrlen(skb);
- skb_data_size = skb->len - skb->data_len;
+ skb_data_size = skb_headlen(skb);
if (skb_data_size >= headersize) {
/* copy immediate data */
@@ -1651,7 +1651,7 @@ static void write_swqe2_TSO(struct sk_buff *skb,
static void write_swqe2_nonTSO(struct sk_buff *skb,
struct ehea_swqe *swqe, u32 lkey)
{
- int skb_data_size = skb->len - skb->data_len;
+ int skb_data_size = skb_headlen(skb);
u8 *imm_data = &swqe->u.immdata_desc.immediate_data[0];
struct ehea_vsgentry *sg1entry = &swqe->u.immdata_desc.sg_entry;
@@ -2108,8 +2108,8 @@ static void ehea_xmit3(struct sk_buff *skb, struct net_device *dev,
} else {
/* first copy data from the skb->data buffer ... */
skb_copy_from_linear_data(skb, imm_data,
- skb->len - skb->data_len);
- imm_data += skb->len - skb->data_len;
+ skb_headlen(skb));
+ imm_data += skb_headlen(skb);
/* ... then copy data from the fragments */
for (i = 0; i < nfrags; i++) {
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 3267b23..6c18834 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2148,7 +2148,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
unsigned int i;
u32 offset = 0;
u32 bcnt;
- u32 size = skb->len-skb->data_len;
+ u32 size = skb_headlen(skb);
u32 entries = (size >> NV_TX2_TSO_MAX_SHIFT) + ((size & (NV_TX2_TSO_MAX_SIZE-1)) ? 1 : 0);
u32 empty_slots;
struct ring_desc* put_tx;
@@ -2269,7 +2269,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
unsigned int i;
u32 offset = 0;
u32 bcnt;
- u32 size = skb->len-skb->data_len;
+ u32 size = skb_headlen(skb);
u32 entries = (size >> NV_TX2_TSO_MAX_SHIFT) + ((size & (NV_TX2_TSO_MAX_SIZE-1)) ? 1 : 0);
u32 empty_slots;
struct ring_desc_ex* put_tx;
diff --git a/drivers/net/ixgbevf/ixgbevf_main.c b/drivers/net/ixgbevf/ixgbevf_main.c
index 960e985..f484161 100644
--- a/drivers/net/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ixgbevf/ixgbevf_main.c
@@ -604,7 +604,7 @@ static bool ixgbevf_clean_rx_irq(struct ixgbevf_q_vector *q_vector,
* packets not getting split correctly
*/
if (staterr & IXGBE_RXD_STAT_LB) {
- u32 header_fixup_len = skb->len - skb->data_len;
+ u32 header_fixup_len = skb_headlen(skb);
if (header_fixup_len < 14)
skb_push(skb, header_fixup_len);
}
diff --git a/drivers/net/ksz884x.c b/drivers/net/ksz884x.c
index 4a231bd..cc0bc8a 100644
--- a/drivers/net/ksz884x.c
+++ b/drivers/net/ksz884x.c
@@ -4684,7 +4684,7 @@ static void send_packet(struct sk_buff *skb, struct net_device *dev)
int frag;
skb_frag_t *this_frag;
- dma_buf->len = skb->len - skb->data_len;
+ dma_buf->len = skb_headlen(skb);
dma_buf->dma = pci_map_single(
hw_priv->pdev, skb->data, dma_buf->len,
diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
index 958dc28..e0b47cc 100644
--- a/drivers/net/myri10ge/myri10ge.c
+++ b/drivers/net/myri10ge/myri10ge.c
@@ -2757,7 +2757,7 @@ again:
}
/* map the skb for DMA */
- len = skb->len - skb->data_len;
+ len = skb_headlen(skb);
idx = tx->req & tx->mask;
tx->info[idx].skb = skb;
bus = pci_map_single(mgp->pdev, skb->data, len, PCI_DMA_TODEVICE);
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index bab0061..f155928 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -2400,7 +2400,7 @@ static struct sk_buff *s2io_txdl_getskb(struct fifo_info *fifo_data,
return NULL;
}
pci_unmap_single(nic->pdev, (dma_addr_t)txds->Buffer_Pointer,
- skb->len - skb->data_len, PCI_DMA_TODEVICE);
+ skb_headlen(skb), PCI_DMA_TODEVICE);
frg_cnt = skb_shinfo(skb)->nr_frags;
if (frg_cnt) {
txds++;
@@ -4202,7 +4202,7 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
txdp->Control_2 |= TXD_VLAN_TAG(vlan_tag);
}
- frg_len = skb->len - skb->data_len;
+ frg_len = skb_headlen(skb);
if (offload_type == SKB_GSO_UDP) {
int ufo_size;
diff --git a/drivers/net/tehuti.c b/drivers/net/tehuti.c
index a38aede..93affdc 100644
--- a/drivers/net/tehuti.c
+++ b/drivers/net/tehuti.c
@@ -1508,7 +1508,7 @@ bdx_tx_map_skb(struct bdx_priv *priv, struct sk_buff *skb,
int nr_frags = skb_shinfo(skb)->nr_frags;
int i;
- db->wptr->len = skb->len - skb->data_len;
+ db->wptr->len = skb_headlen(skb);
db->wptr->addr.dma = pci_map_single(priv->pdev, skb->data,
db->wptr->len, PCI_DMA_TODEVICE);
pbl->len = CPU_CHIP_SWAP32(db->wptr->len);
diff --git a/drivers/net/tsi108_eth.c b/drivers/net/tsi108_eth.c
index 1292d23..a03730b 100644
--- a/drivers/net/tsi108_eth.c
+++ b/drivers/net/tsi108_eth.c
@@ -704,8 +704,8 @@ static int tsi108_send_packet(struct sk_buff * skb, struct net_device *dev)
if (i == 0) {
data->txring[tx].buf0 = dma_map_single(NULL, skb->data,
- skb->len - skb->data_len, DMA_TO_DEVICE);
- data->txring[tx].len = skb->len - skb->data_len;
+ skb_headlen(skb), DMA_TO_DEVICE);
+ data->txring[tx].len = skb_headlen(skb);
misc |= TSI108_TX_SOF;
} else {
skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1];
^ permalink raw reply related
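Every hunk in the patch above makes the same substitution: the open-coded
"skb->len - skb->data_len" becomes skb_headlen(skb). A minimal user-space
sketch of that equivalence follows. It assumes a cut-down stand-in struct,
and mock_skb, mock_skb_headlen and the self-check are hypothetical names
used only for illustration, not the kernel's definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for struct sk_buff: only the two length
 * fields the patch touches are modelled here. */
struct mock_skb {
	unsigned int len;      /* total bytes: linear area plus paged fragments */
	unsigned int data_len; /* bytes held in paged fragments only */
};

/* Mirrors what skb_headlen() computes: bytes in the linear area. */
static inline unsigned int mock_skb_headlen(const struct mock_skb *skb)
{
	return skb->len - skb->data_len;
}

/* Self-check: the helper agrees with the open-coded expression. */
static void mock_skb_headlen_selfcheck(void)
{
	struct mock_skb skb = { .len = 1514, .data_len = 1448 };

	assert(mock_skb_headlen(&skb) == skb.len - skb.data_len);
}
```

Since the helper returns exactly the expression it replaces, the conversion
should not change driver behaviour. It only names the quantity each driver
passes to its DMA mapping call for the linear part of the packet.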
* Re: [PATCH] tun: orphan an skb on tx
From: Michael S. Tsirkin @ 2010-04-13 20:43 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jan Kiszka, David S. Miller, Herbert Xu, Paul Moore,
David Woodhouse, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, qemu-devel
In-Reply-To: <1271191086.16881.570.camel@edumazet-laptop>
On Tue, Apr 13, 2010 at 10:38:06PM +0200, Eric Dumazet wrote:
> Le mardi 13 avril 2010 à 23:25 +0300, Michael S. Tsirkin a écrit :
> > On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
> > > Le mardi 13 avril 2010 à 20:39 +0300, Michael S. Tsirkin a écrit :
> > >
> > > > > When a socket with inflight tx packets is closed, we dont block the
> > > > > close, we only delay the socket freeing once all packets were delivered
> > > > > and freed.
> > > > >
> > > >
> > > > Which is wrong, since this is under userspace control, so you get
> > > > unkillable processes.
> > > >
> > >
> > > We do not get unkillable processes, at least with sockets I was thinking
> > > about (TCP/UDP ones).
> > >
> > > Maybe tun sockets can behave the same ?
> >
> > Looks like that's what my patch does: ip_rcv seems to call
> > skb_orphan too.
>
> Well, I was speaking of tx side, you speak of receiving side.
Point is, both ip_rcv and my patch call skb_orphan.
> An external flood (coming from another domain) is another problem.
>
> A sender might flood the 'network' inside our domain. How can we
> reasonably limit the sender ?
>
> Maybe the answer is 'We can not', but it should be stated somewhere, so
> that someone can address this point later.
>
And whatever's done should ideally work for tap to IP
and IP to IP sockets as well, not just tap to tap.
--
MST
^ permalink raw reply
* Re: [PATCH 2/3] cxgb4i: main driver files
From: Mike Christie @ 2010-04-13 20:41 UTC (permalink / raw)
To: open-iscsi
Cc: Rakesh Ranjan, netdev, linux-scsi, linux-kernel, kxie, davem,
James.Bottomley
In-Reply-To: <1270728855-20951-3-git-send-email-rakesh@chelsio.com>
On 04/08/2010 07:14 AM, Rakesh Ranjan wrote:
> +static inline int cxgb4i_ddp_gl_map(struct pci_dev *pdev,
> + struct cxgb4i_gather_list *gl)
> +{
> + int i;
> +
> + for (i = 0; i< gl->nelem; i++) {
> + gl->phys_addr[i] = pci_map_page(pdev, gl->pages[i], 0,
> + PAGE_SIZE,
Hey Rakesh,
I guess we are trying to move away from the pci mapping functions and move
to the dma ones. On your next submission, could you fix those up too?
^ permalink raw reply
* Re: [PATCH] tun: orphan an skb on tx
From: Eric Dumazet @ 2010-04-13 20:38 UTC (permalink / raw)
To: Michael S. Tsirkin
Cc: Jan Kiszka, David S. Miller, Herbert Xu, Paul Moore,
David Woodhouse, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, qemu-devel
In-Reply-To: <20100413202548.GA3582@redhat.com>
Le mardi 13 avril 2010 à 23:25 +0300, Michael S. Tsirkin a écrit :
> On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
> > Le mardi 13 avril 2010 à 20:39 +0300, Michael S. Tsirkin a écrit :
> >
> > > > When a socket with inflight tx packets is closed, we dont block the
> > > > close, we only delay the socket freeing once all packets were delivered
> > > > and freed.
> > > >
> > >
> > > Which is wrong, since this is under userspace control, so you get
> > > unkillable processes.
> > >
> >
> > We do not get unkillable processes, at least with sockets I was thinking
> > about (TCP/UDP ones).
> >
> > Maybe tun sockets can behave the same ?
>
> Looks like that's what my patch does: ip_rcv seems to call
> skb_orphan too.
Well, I was speaking of tx side, you speak of receiving side.
An external flood (coming from another domain) is another problem.
A sender might flood the 'network' inside our domain. How can we
reasonably limit the sender ?
Maybe the answer is 'We can not', but it should be stated somewhere, so
that someone can address this point later.
^ permalink raw reply
* Re: [PATCH] tun: orphan an skb on tx
From: Michael S. Tsirkin @ 2010-04-13 20:25 UTC (permalink / raw)
To: Eric Dumazet
Cc: Jan Kiszka, David S. Miller, Herbert Xu, Paul Moore,
David Woodhouse, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, qemu-devel
In-Reply-To: <1271183463.16881.545.camel@edumazet-laptop>
On Tue, Apr 13, 2010 at 08:31:03PM +0200, Eric Dumazet wrote:
> Le mardi 13 avril 2010 à 20:39 +0300, Michael S. Tsirkin a écrit :
>
> > > When a socket with inflight tx packets is closed, we dont block the
> > > close, we only delay the socket freeing once all packets were delivered
> > > and freed.
> > >
> >
> > Which is wrong, since this is under userspace control, so you get
> > unkillable processes.
> >
>
> We do not get unkillable processes, at least with sockets I was thinking
> about (TCP/UDP ones).
>
> Maybe tun sockets can behave the same ?
Looks like that's what my patch does: ip_rcv seems to call
skb_orphan too.
> Herbert Acked your patch, so I guess its OK, but I think it can be
> dangerous.
> Anyway my feeling is that we try to add various mechanisms to keep a
> hostile user from flooding another one.
>
> For example, UDP got memory accounting quite recently, and we added
> socket backlog limits very recently. It was considered not needed few
> years ago.
>
^ permalink raw reply
* usb-sound circular locking again?
From: Richard Zidlicky @ 2010-04-13 20:30 UTC (permalink / raw)
To: Takashi Iwai; +Cc: Andrew Morton, linux-kernel, netdev
In-Reply-To: <s5hocm9som6.wl%tiwai@suse.de>
Hi,
is this the same old issue? Any way to fix it? Seeing it triggered in a sync
syscall does not make me comfortable.
Apr 13 02:01:36 localhost kernel: [ 8569.449882] PM: Syncing filesystems ...
Apr 13 02:01:36 localhost kernel: [ 8569.449998] =======================================================
Apr 13 02:01:36 localhost kernel: [ 8569.450049] [ INFO: possible circular locking dependency detected ]
Apr 13 02:01:36 localhost kernel: [ 8569.450078] 2.6.33.2v2 #4
Apr 13 02:01:36 localhost kernel: [ 8569.450101] -------------------------------------------------------
Apr 13 02:01:36 localhost kernel: [ 8569.450130] pm-hibernate/17348 is trying to acquire lock:
Apr 13 02:01:36 localhost kernel: [ 8569.450158] (mutex){+.+...}, at: [<c04e6670>] sync_filesystems+0x14/0xd6
Apr 13 02:01:36 localhost kernel: [ 8569.450252]
Apr 13 02:01:36 localhost kernel: [ 8569.450253] but task is already holding lock:
Apr 13 02:01:36 localhost kernel: [ 8569.450266] (pm_mutex){+.+.+.}, at: [<c0466658>] hibernate+0x13/0x18d
Apr 13 02:01:36 localhost kernel: [ 8569.450266]
Apr 13 02:01:36 localhost kernel: [ 8569.450266] which lock already depends on the new lock.
Apr 13 02:01:36 localhost kernel: [ 8569.450266]
Apr 13 02:01:36 localhost kernel: [ 8569.450266]
Apr 13 02:01:36 localhost kernel: [ 8569.450266] the existing dependency chain (in reverse order) is:
Apr 13 02:01:36 localhost kernel: [ 8569.450266]
Apr 13 02:01:36 localhost kernel: [ 8569.450266] -> #6 (pm_mutex){+.+.+.}:
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c0736d84>] __mutex_lock_common+0x35/0x2f3
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c07370e0>] mutex_lock_nested+0x30/0x38
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c0466658>] hibernate+0x13/0x18d
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c046551c>] state_store+0x56/0xa8
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c05acb19>] kobj_attr_store+0x1a/0x22
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c050f306>] sysfs_write_file+0xb9/0xe4
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c04cc821>] vfs_write+0x84/0xdf
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c04cc915>] sys_write+0x3b/0x60
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c0738295>] syscall_call+0x7/0xb
Apr 13 02:01:36 localhost kernel: [ 8569.450266]
Apr 13 02:01:36 localhost kernel: [ 8569.450266] -> #5 (s_active){++++.+}:
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c05102f8>] sysfs_addrm_finish+0x89/0xde
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c050eaf7>] sysfs_hash_and_remove+0x3d/0x4f
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c0511100>] sysfs_remove_group+0x74/0xa3
Apr 13 02:01:36 localhost kernel: [ 8569.450266] [<c062e16c>] dpm_sysfs_remove+0x10/0x12
Apr 13 09:39:32 localhost kernel: [ 8569.450266] [<c062933f>] device_del+0x33/0x154
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c0629488>] device_unregister+0x28/0x4b
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c067b7c5>] usb_remove_ep_devs+0x15/0x1f
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c0675c92>] remove_intf_ep_devs+0x21/0x32
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c0676d53>] usb_set_interface+0x18c/0x22c
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<f8302c46>] snd_usb_capture_close+0x26/0x3f [snd_usb_audio]
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<f80fbb08>] snd_pcm_release_substream+0x3d/0x66 [snd_pcm]
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<f80fbb8d>] snd_pcm_release+0x5c/0x9e [snd_pcm]
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c04cd12a>] __fput+0xf0/0x187
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c04cd1da>] fput+0x19/0x1b
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c04b2e9f>] remove_vma+0x3e/0x5d
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c04b3b2a>] do_munmap+0x23c/0x259
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c04b3b77>] sys_munmap+0x30/0x3f
Apr 13 09:39:34 localhost kernel: [ 8569.450266] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.450266]
Apr 13 09:39:34 localhost kernel: [ 8569.450266] -> #4 (&pcm->open_mutex){+.+.+.}:
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c0736d84>] __mutex_lock_common+0x35/0x2f3
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c07370e0>] mutex_lock_nested+0x30/0x38
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<f80fbb86>] snd_pcm_release+0x55/0x9e [snd_pcm]
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c04cd12a>] __fput+0xf0/0x187
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c04cd1da>] fput+0x19/0x1b
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c04b2e9f>] remove_vma+0x3e/0x5d
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c04b3b2a>] do_munmap+0x23c/0x259
Apr 13 09:39:34 localhost kernel: [ 8569.454127] [<c04b3b77>] sys_munmap+0x30/0x3f
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.455127]
Apr 13 09:39:34 localhost kernel: [ 8569.455127] -> #3 (&mm->mmap_sem){++++++}:
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c04add1a>] might_fault+0x64/0x81
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c05b3828>] copy_to_user+0x2c/0xfc
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c04d784b>] filldir64+0x97/0xcd
Apr 13 09:39:34 localhost kernel: [ 8569.455127] [<c04e299c>] dcache_readdir+0x5a/0x1af
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04d7a5d>] vfs_readdir+0x68/0x94
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04d7aec>] sys_getdents64+0x63/0xa0
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.456129]
Apr 13 09:39:34 localhost kernel: [ 8569.456129] -> #2 (&sb->s_type->i_mutex_key#3){+.+.+.}:
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c0736d84>] __mutex_lock_common+0x35/0x2f3
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c07370e0>] mutex_lock_nested+0x30/0x38
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c051164f>] devpts_get_sb+0x1c0/0x29f
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04ce0db>] vfs_kern_mount+0x86/0x11f
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04ce1b8>] do_kern_mount+0x32/0xbe
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04e02c2>] do_mount+0x671/0x6d0
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c04e0382>] sys_mount+0x61/0x8f
Apr 13 09:39:34 localhost kernel: [ 8569.456129] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.456129]
Apr 13 09:39:34 localhost kernel: [ 8569.456129] -> #1 (&type->s_umount_key#19){++++..}:
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c045b60a>] __lock_acquire+0xa2d/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c0737310>] down_read+0x31/0x45
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c04e66cf>] sync_filesystems+0x73/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c04e676e>] sys_sync+0x11/0x2d
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.458127]
Apr 13 09:39:34 localhost kernel: [ 8569.458127] -> #0 (mutex){+.+...}:
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c045b517>] __lock_acquire+0x93a/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c0736d84>] __mutex_lock_common+0x35/0x2f3
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c07370e0>] mutex_lock_nested+0x30/0x38
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c04e6670>] sync_filesystems+0x14/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c04e676e>] sys_sync+0x11/0x2d
Apr 13 09:39:34 localhost kernel: [ 8569.458127] [<c04666c2>] hibernate+0x7d/0x18d
Apr 13 09:39:34 localhost kernel: [ 8569.459761] [<c046551c>] state_store+0x56/0xa8
Apr 13 09:39:34 localhost kernel: [ 8569.459761] [<c05acb19>] kobj_attr_store+0x1a/0x22
Apr 13 09:39:34 localhost kernel: [ 8569.459761] [<c050f306>] sysfs_write_file+0xb9/0xe4
Apr 13 09:39:34 localhost kernel: [ 8569.459761] [<c04cc821>] vfs_write+0x84/0xdf
Apr 13 09:39:34 localhost kernel: [ 8569.460128] [<c04cc915>] sys_write+0x3b/0x60
Apr 13 09:39:34 localhost kernel: [ 8569.460128] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.460128]
Apr 13 09:39:34 localhost kernel: [ 8569.460128] other info that might help us debug this:
Apr 13 09:39:34 localhost kernel: [ 8569.460128]
Apr 13 09:39:34 localhost kernel: [ 8569.460128] 4 locks held by pm-hibernate/17348:
Apr 13 09:39:34 localhost kernel: [ 8569.460128] #0: (&buffer->mutex){+.+.+.}, at: [<c050f272>] sysfs_write_file+0x25/0xe4
Apr 13 09:39:34 localhost kernel: [ 8569.460128] #1: (s_active){++++.+}, at: [<c0510544>] sysfs_get_active_two+0x16/0x36
Apr 13 09:39:34 localhost kernel: [ 8569.461127] #2: (s_active){++++.+}, at: [<c051054f>] sysfs_get_active_two+0x21/0x36
Apr 13 09:39:34 localhost kernel: [ 8569.461127] #3: (pm_mutex){+.+.+.}, at: [<c0466658>] hibernate+0x13/0x18d
Apr 13 09:39:34 localhost kernel: [ 8569.461127]
Apr 13 09:39:34 localhost kernel: [ 8569.461127] stack backtrace:
Apr 13 09:39:34 localhost kernel: [ 8569.461127] Pid: 17348, comm: pm-hibernate Not tainted 2.6.33.2v2 #4
Apr 13 09:39:34 localhost kernel: [ 8569.461127] Call Trace:
Apr 13 09:39:34 localhost kernel: [ 8569.461127] [<c0735b79>] ? printk+0xf/0x16
Apr 13 09:39:34 localhost kernel: [ 8569.461127] [<c045a8a0>] print_circular_bug+0x90/0x9c
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c045b517>] __lock_acquire+0x93a/0xbb7
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c042730d>] ? update_curr+0x177/0x17f
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c0459bf5>] ? mark_lock+0x1e/0x1ea
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c045b828>] lock_acquire+0x94/0xb1
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e6670>] ? sync_filesystems+0x14/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c0736d84>] __mutex_lock_common+0x35/0x2f3
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e6670>] ? sync_filesystems+0x14/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e3423>] ? bdi_alloc_queue_work+0x84/0xa0
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c07370e0>] mutex_lock_nested+0x30/0x38
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e6670>] ? sync_filesystems+0x14/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e6670>] sync_filesystems+0x14/0xd6
Apr 13 09:39:34 localhost kernel: [ 8569.462128] [<c04e676e>] sys_sync+0x11/0x2d
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c04666c2>] hibernate+0x7d/0x18d
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c04654c6>] ? state_store+0x0/0xa8
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c046551c>] state_store+0x56/0xa8
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c04654c6>] ? state_store+0x0/0xa8
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c05acb19>] kobj_attr_store+0x1a/0x22
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c050f306>] sysfs_write_file+0xb9/0xe4
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c050f24d>] ? sysfs_write_file+0x0/0xe4
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c04cc821>] vfs_write+0x84/0xdf
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c04cc915>] sys_write+0x3b/0x60
Apr 13 09:39:34 localhost kernel: [ 8569.463127] [<c0738295>] syscall_call+0x7/0xb
Apr 13 09:39:34 localhost kernel: [ 8569.484133] done.
Apr 13 09:39:34 localhost kernel: [ 8569.484223] Freezing user space processes ... (elapsed 0.04 seconds) done.
Apr 13 09:39:34 localhost kernel: [ 8569.528142] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Apr 13 09:39:34 localhost kernel: [ 8569.539272] PM: Preallocating image memory... done (allocated 349210 pages)
Apr 13 09:39:34 localhost kernel: [ 8583.627118] PM: Allocated 1396840 kbytes in 14.08 seconds (99.20 MB/s)
Regards,
Richard
^ permalink raw reply