* [PATCH 1/6 V2] mlx4: Fix vlan table overflow
From: Yevgeny Petrilin @ 2011-10-18 11:50 UTC (permalink / raw)
To: davem; +Cc: netdev, yevgenyp
Prevent overflow when trying to register more Vlans then the Vlan table in
HW is configured to.
Need to take into acount that the first 2 entries are reserved.
Signed-off-by: Yevgeny Petrilin <yevgenyp@mellanox.co.il>
---
drivers/net/ethernet/mellanox/mlx4/port.c | 15 ++++++++-------
1 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/port.c b/drivers/net/ethernet/mellanox/mlx4/port.c
index 609e0ec..163a314 100644
--- a/drivers/net/ethernet/mellanox/mlx4/port.c
+++ b/drivers/net/ethernet/mellanox/mlx4/port.c
@@ -65,7 +65,7 @@ void mlx4_init_vlan_table(struct mlx4_dev *dev, struct mlx4_vlan_table *table)
table->entries[i] = 0;
table->refs[i] = 0;
}
- table->max = 1 << dev->caps.log_num_vlans;
+ table->max = (1 << dev->caps.log_num_vlans) - MLX4_VLAN_REGULAR;
table->total = 0;
}
@@ -354,6 +354,13 @@ int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index)
int free = -1;
mutex_lock(&table->mutex);
+
+ if (table->total == table->max) {
+ /* No free vlan entries */
+ err = -ENOSPC;
+ goto out;
+ }
+
for (i = MLX4_VLAN_REGULAR; i < MLX4_MAX_VLAN_NUM; i++) {
if (free < 0 && (table->refs[i] == 0)) {
free = i;
@@ -375,12 +382,6 @@ int mlx4_register_vlan(struct mlx4_dev *dev, u8 port, u16 vlan, int *index)
goto out;
}
- if (table->total == table->max) {
- /* No free vlan entries */
- err = -ENOSPC;
- goto out;
- }
-
/* Register new MAC */
table->refs[free] = 1;
table->entries[free] = cpu_to_be32(vlan | MLX4_VLAN_VALID);
--
1.7.7
^ permalink raw reply related
* [PATCH 0/6 V2] mlx4_en: Data path optimizations and bug fix
From: Yevgeny Petrilin @ 2011-10-18 11:50 UTC (permalink / raw)
To: davem; +Cc: netdev, yevgenyp
Hello,
This is V2 of the patches sent yesterday.
The differences comparing to V1 are:
1. Remove patch 3/7 : "Incoming traffic alignment optimizations" following Eric's comments for a rework.
2. Using static seed value for Toeplitz function, rather then randomizing the value each time
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/en_port.c | 6 ++++++
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 23 ++++++++++++++++++++---
drivers/net/ethernet/mellanox/mlx4/en_tx.c | 2 +-
drivers/net/ethernet/mellanox/mlx4/fw.c | 1 +
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 7 +++++--
drivers/net/ethernet/mellanox/mlx4/port.c | 15 ++++++++-------
include/linux/mlx4/device.h | 1 +
Thanks,
Yevgeny
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Herbert Xu @ 2011-10-18 13:45 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <1318942560.2657.69.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Tue, Oct 18, 2011 at 02:56:00PM +0200, Eric Dumazet wrote:
>
> I am ok by this way, but we might hit another similar problem elsewhere.
>
> (igmp.c ip6_output, ...)
>
> We effectively want to remove LL_ALLOCATED_SPACE() usage and obfuscate
> code...
Here's another idea, provide a helper to do the skb allocation
and the skb_reserve in one go. That way this ugliness would only
need to be done once.
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: Possible via-velocity bug.
From: Eric Dumazet @ 2011-10-18 13:38 UTC (permalink / raw)
To: n00bsys0p; +Cc: linux-kernel, netdev
In-Reply-To: <j7js28$u0m$1@dough.gmane.org>
Le mardi 18 octobre 2011 à 13:39 +0100, n00bsys0p a écrit :
> Hi everyone, I have a nice simple question.
>
> I've found what I think is a bug in the 3.0+ kernel's via-velocity LAN
> driver, which causes the leaking of data into the video RAM during PXE boot.
>
> How do I go about reporting this to the relevant people?
>
> Thanks in advance!
Here and netdev is fine.
^ permalink raw reply
* Re: [PATCH] Fix guest memory leak and panic
From: Krishna Kumar2 @ 2011-10-18 13:22 UTC (permalink / raw)
To: Ian Campbell
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org, rusty@rustcorp.com.au,
virtualization@lists.linux-foundation.org
In-Reply-To: <1318935055.3385.0.camel@zakaz.uk.xensource.com>
Ian Campbell <Ian.Campbell@citrix.com> wrote on 10/18/2011 04:20:55 PM:
> > > Sigh, only one out of the ten callers of (__)skb_frag_set_page
expects
> > > skb_frag_set_page to take a new reference. I think that's pretty
> > > comprehensive evidence that the current behaviour is unexpected and
> > > wrong.
> >
> > Looks good!
>
> Thanks.
Tested this patch for virtio_net - hang/panic is fixed.
MemFree also doesn't reduce at end of test compared to
the start.
Tested-by: krkumar2@in.ibm.com
- KK
> > Does it make sense to commit both of these patches? The
> > reason being - my patch becomes a cleanup of set_skb_frag()
> > in virtio_net driver.
>
> I think your patch remains a valid cleanup but I don't maintain that
> code.
>
> Ian.
>
>
^ permalink raw reply
* Re: [PATCH] Fix guest memory leak and panic
From: Eric Dumazet @ 2011-10-18 13:20 UTC (permalink / raw)
To: Ian Campbell
Cc: Krishna Kumar, rusty@rustcorp.com.au, mst@redhat.com,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org, davem@davemloft.net
In-Reply-To: <1318931232.16132.65.camel@zakaz.uk.xensource.com>
Le mardi 18 octobre 2011 à 10:47 +0100, Ian Campbell a écrit :
> On Tue, 2011-10-18 at 09:36 +0100, Ian Campbell wrote:
> > I think the best thing might be to remove the additional ref taking from
> > the setter function and audit the previous changes to ensure they
> > conform. I'll do that right away and post a fixup patch ASAP.
>
> Sigh, only one out of the ten callers of (__)skb_frag_set_page expects
> skb_frag_set_page to take a new reference. I think that's pretty
> comprehensive evidence that the current behaviour is unexpected and
> wrong.
>
> Sorry about this.
>
> Ian.
>
> 8<---------------------------------------------------------
>
> From 42c26b7ca640bd5cb6f9c3bc76db96c92ed8ff82 Mon Sep 17 00:00:00 2001
> From: Ian Campbell <ian.campbell@citrix.com>
> Date: Tue, 18 Oct 2011 09:59:37 +0100
> Subject: [PATCH] net: do not take an additional reference in skb_frag_set_page
>
> I audited all of the callers in the tree and only one of them (pktgen) expects
> it to do so. Taking this reference is pretty obviously confusing and error
> prone.
>
> In particular I looked at the following commits which switched callers of
> (__)skb_frag_set_page to the skb paged fragment api:
>
> 6a930b9f163d7e6d9ef692e05616c4ede65038ec cxgb3: convert to SKB paged frag API.
> 5dc3e196ea21e833128d51eb5b788a070fea1f28 myri10ge: convert to SKB paged frag API.
> 0e0634d20dd670a89af19af2a686a6cce943ac14 vmxnet3: convert to SKB paged frag API.
> 86ee8130a46769f73f8f423f99dbf782a09f9233 virtionet: convert to SKB paged frag API.
> 4a22c4c919c201c2a7f4ee09e672435a3072d875 sfc: convert to SKB paged frag API.
> 18324d690d6a5028e3c174fc1921447aedead2b8 cassini: convert to SKB paged frag API.
> b061b39e3ae18ad75466258cf2116e18fa5bbd80 benet: convert to SKB paged frag API.
> b7b6a688d217936459ff5cf1087b2361db952509 bnx2: convert to SKB paged frag API.
> 804cf14ea5ceca46554d5801e2817bba8116b7e5 net: xfrm: convert to SKB frag APIs
> ea2ab69379a941c6f8884e290fdd28c93936a778 net: convert core to skb paged frag APIs
>
> Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
> ---
> include/linux/skbuff.h | 1 -
> net/core/pktgen.c | 1 +
> 2 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 64f8695..78741da 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1765,7 +1765,6 @@ static inline void *skb_frag_address_safe(const skb_frag_t *frag)
> static inline void __skb_frag_set_page(skb_frag_t *frag, struct page *page)
> {
> frag->page = page;
> - __skb_frag_ref(frag);
> }
>
> /**
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index 796044a..c4effd4 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -2603,6 +2603,7 @@ static void pktgen_finalize_skb(struct pktgen_dev *pkt_dev, struct sk_buff *skb,
> break;
> }
> skb_frag_set_page(skb, i, pkt_dev->page);
> + skb_frag_ref(skb, i);
> skb_shinfo(skb)->frags[i].page_offset = 0;
> /*last fragment, fill rest of data*/
> if (i == (frags - 1))
I am ok with this patch, since it makes sense to let driver do the
page->_count change itself (it can use a batched add() in case a page is
splitted in many frags)
But I suggest using get_page(pkt_dev->page), this seems more obvious to
me [ This was how I wrote the thing ;) ]
^ permalink raw reply
* [PATCH] net: Fix driver name for mdio-gpio.c
From: Dirk Eibach @ 2011-10-18 13:04 UTC (permalink / raw)
To: linux-kernel; +Cc: netdev, grant.likely, Dirk Eibach
Since commit
"7488876... dt/net: Eliminate users of of_platform_{,un}register_driver"
there are two platform drivers named "mdio-gpio" registered.
I renamed the of variant to "mdio-ofgpio".
Signed-off-by: Dirk Eibach <eibach@gdsys.de>
---
drivers/net/phy/mdio-gpio.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/drivers/net/phy/mdio-gpio.c b/drivers/net/phy/mdio-gpio.c
index f2d122f..3d66d9f 100644
--- a/drivers/net/phy/mdio-gpio.c
+++ b/drivers/net/phy/mdio-gpio.c
@@ -241,7 +241,7 @@ MODULE_DEVICE_TABLE(of, mdio_ofgpio_match);
static struct platform_driver mdio_ofgpio_driver = {
.driver = {
- .name = "mdio-gpio",
+ .name = "mdio-ofgpio",
.owner = THIS_MODULE,
.of_match_table = mdio_ofgpio_match,
},
--
1.5.6.5
^ permalink raw reply related
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Eric Dumazet @ 2011-10-18 12:56 UTC (permalink / raw)
To: Herbert Xu; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <20111018114945.GA16359@gondor.apana.org.au>
Le mardi 18 octobre 2011 à 13:49 +0200, Herbert Xu a écrit :
> On Tue, Oct 18, 2011 at 01:37:58PM +0200, Eric Dumazet wrote:
> >
> > In the bug we try to fix, we have :
> >
> > skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)
> >
> > ... < increase of dev->needed_headroom by another cpu/task >
> >
> > skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
>
> OK, in that case one fix would be to replace LL_ALLOCATED_SPACE
> with its two constiuents so that they may be stored in local
> variables for later use.
>
> hlen = LL_HEADROOM(skb);
> tlen = LL_TAILROOM(skb);
> skb_alloc_send_skb(sk, ... + LL_ALIGN(hlen + tlen));
>
> skb_reserve(skb, LL_ALIGN(hlen));
>
> Cheers,
I am ok by this way, but we might hit another similar problem elsewhere.
(igmp.c ip6_output, ...)
We effectively want to remove LL_ALLOCATED_SPACE() usage and obfuscate
code...
[PATCH] raw: allow dev->needed_headroom dynamic change
It seems ip_gre is able to change dev->needed_headroom on the fly.
It triggers a BUG in raw_sendmsg()
skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)
< another cpu change dev->needed_headromm (making it bigger)
...
skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
We end with LL_RESERVED_SPACE() being bigger than LL_ALLOCATED_SPACE()
-> we crash later because skb head is exhausted.
Bug introduced in commit 243aad83 in 2.6.34 (ip_gre: include route
header_len in max_headroom calculation)
Reported-by: Reported-by: Elmar Vonlanthen <evonlanthen@gmail.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: Timo Teräs <timo.teras@iki.fi>
CC: Herbert Xu <herbert@gondor.apana.org.au>
---
include/linux/netdevice.h | 10 +++++++---
net/ipv4/raw.c | 9 +++++++--
net/ipv6/raw.c | 9 +++++++--
3 files changed, 21 insertions(+), 7 deletions(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ddee79b..dba2399 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -276,12 +276,16 @@ struct hh_cache {
* LL_ALLOCATED_SPACE also takes into account the tailroom the device
* may need.
*/
+#define LL_ALIGN(__len) (((__len)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
+
#define LL_RESERVED_SPACE(dev) \
- ((((dev)->hard_header_len+(dev)->needed_headroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
+ LL_ALIGN((dev)->hard_header_len + (dev)->needed_headroom))
+
#define LL_RESERVED_SPACE_EXTRA(dev,extra) \
- ((((dev)->hard_header_len+(dev)->needed_headroom+(extra))&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
+ LL_ALIGN((dev)->hard_header_len + (dev)->needed_headroom + (extra))
+
#define LL_ALLOCATED_SPACE(dev) \
- ((((dev)->hard_header_len+(dev)->needed_headroom+(dev)->needed_tailroom)&~(HH_DATA_MOD - 1)) + HH_DATA_MOD)
+ LL_ALIGN((dev)->hard_header_len + (dev)->needed_headroom + (dev)->needed_tailroom)
struct header_ops {
int (*create) (struct sk_buff *skb, struct net_device *dev,
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 61714bd..4ed4eda 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -326,6 +326,9 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
unsigned int iphlen;
int err;
struct rtable *rt = *rtp;
+ unsigned int hard_header_len = rt->dst.dev->hard_header_len;
+ unsigned int needed_headroom = rt->dst.dev->needed_headroom;
+ unsigned int needed_tailroom = rt->dst.dev->needed_tailroom;
if (length > rt->dst.dev->mtu) {
ip_local_error(sk, EMSGSIZE, fl4->daddr, inet->inet_dport,
@@ -336,11 +339,13 @@ static int raw_send_hdrinc(struct sock *sk, struct flowi4 *fl4,
goto out;
skb = sock_alloc_send_skb(sk,
- length + LL_ALLOCATED_SPACE(rt->dst.dev) + 15,
+ length + LL_ALIGN(hard_header_len +
+ needed_headroom +
+ needed_tailroom) + 15,
flags & MSG_DONTWAIT, &err);
if (skb == NULL)
goto error;
- skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
+ skb_reserve(skb, LL_ALIGN(hard_header_len + needed_headroom));
skb->priority = sk->sk_priority;
skb->mark = sk->sk_mark;
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 343852e..eb0a797 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -610,6 +610,9 @@ static int rawv6_send_hdrinc(struct sock *sk, void *from, int length,
struct sk_buff *skb;
int err;
struct rt6_info *rt = (struct rt6_info *)*dstp;
+ unsigned int hard_header_len = rt->dst.dev->hard_header_len;
+ unsigned int needed_headroom = rt->dst.dev->needed_headroom;
+ unsigned int needed_tailroom = rt->dst.dev->needed_tailroom;
if (length > rt->dst.dev->mtu) {
ipv6_local_error(sk, EMSGSIZE, fl6, rt->dst.dev->mtu);
@@ -619,11 +622,13 @@ static int rawv6_send_hdrinc(struct sock *sk, void *from, int length,
goto out;
skb = sock_alloc_send_skb(sk,
- length + LL_ALLOCATED_SPACE(rt->dst.dev) + 15,
+ length + LL_ALIGN(hard_header_len +
+ needed_headroom +
+ needed_tailroom) + 15,
flags & MSG_DONTWAIT, &err);
if (skb == NULL)
goto error;
- skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
+ skb_reserve(skb, LL_ALIGN(hard_header_len + needed_headroom));
skb->priority = sk->sk_priority;
skb->mark = sk->sk_mark;
^ permalink raw reply related
* Re: [PATCH v13 0/6] flexcan: Add support for powerpc flexcan (freescale p1010)
From: Robin Holt @ 2011-10-18 12:30 UTC (permalink / raw)
To: Kumar Gala
Cc: Grant Likely, netdev-u79uwXL29TY76Z2rM5mHXA, U Bhaskar-B22300,
socketcan-core-0fE9KPoRgkgATYTw5x5z8w, Marc Kleine-Budde,
PPC list, David S. Miller, Wolfgang Grandegger
In-Reply-To: <D79CB818-C14E-4C8D-9A8D-42B39ADE20B2-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
On Tue, Oct 18, 2011 at 06:43:13AM -0500, Kumar Gala wrote:
>
> >> Robin,
> >>
> >> Do you remember why we went with just 'fsl,p1010-flexcan' as the device tree compatible? Do we feel the flex can on P1010 isn't the same as on MPC5xxx? or the ARM SoCs?
> >
> > The decision was due to the fact there is no true "generic" fsl.flexcan
> > chip free of any SOC implementation and therefore not something which
> > could be separately defined. That decision was made by Grant Likely.
> > I will inline that email below.
> >
> > Robin
>
>
> Thanks, I'll look into this internally at FSL. I think its confusing as hell to have "fsl,p1010-flexcan" in an ARM .dts and don't think any reasonable ARM customer of FSL would know to put a PPC SOC name in their .dts. I'll ask the HW guys what's going on so we can come up with a bit more generic name so we don't have to constantly change this. Even if its just:
Grants argument was that there should then be a fsl,zeba-flexcan which
would define each arm based soc. The match string could be there and
the devicetree binding would match on each equivalent.
Robin
>
> fsl,ppc-flexcan & fsl,arm-flexcan.
>
> > On Mon, Aug 15, 2011 at 09:13:50AM -0600, Grant Likely wrote:
> >> On Mon, Aug 15, 2011 at 9:03 AM, Robin Holt <holt-sJ/iWh9BUns@public.gmane.org> wrote:
> >>> Grant,
> >>>
> >>> Earlier, you had asked for a more specific name for the compatible
> >>> property of the Freescale flexcan device. I still have not gotten a
> >>> more specific answer. Hopefully Marc can give you more details about
> >>> the flexcan implementations.
> >>
> >> If there is no ip core version, then just stick with the
> >> fsl,<soc>-flexcan name and drop "fsl,flexcan". Marketing may say
> >> flexcan is flexcan, but hardware engineers like to change things.
> >> Trying to be too generic in compatible values will just lead to
> >> problems in the future.
> >
> > Thanks,
> > Robin
>
> - k
^ permalink raw reply
* Re: [net-next] stmmac: limit max_mtu in case of 4KiB and use __netdev_alloc_skb (V2)
From: Eric Dumazet @ 2011-10-18 12:26 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: netdev, davem
In-Reply-To: <1318937995-14228-1-git-send-email-peppe.cavallaro@st.com>
Le mardi 18 octobre 2011 à 13:39 +0200, Giuseppe CAVALLARO a écrit :
> Problem using big mtu around 4096 bytes is you end allocating (4096
> +NET_SKB_PAD + NET_IP_ALIGN + sizeof(struct skb_shared_info) bytes ->
> 8192 bytes : order-1 pages
>
> It's better to limit the mtu to SKB_MAX_HEAD(NET_SKB_PAD),
> to have no more than one page per skb.
>
> Also the patch changes the netdev_alloc_skb_ip_align() done in
> init_dma_desc_rings() and uses a variant allowing GFP_KERNEL allocations
> allowing the driver to load even in case of memory pressure.
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 ++++--
> 1 files changed, 4 insertions(+), 2 deletions(-)
>
Seems good, thanks :)
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
^ permalink raw reply
* Re: BUG: All network processes hang (brcmsmac/wpa_supplicant)
From: Eric Dumazet @ 2011-10-18 12:24 UTC (permalink / raw)
To: Nico Schottelius; +Cc: LKML, netdev
In-Reply-To: <20111018111422.GA1979@schottelius.org>
Le mardi 18 octobre 2011 à 13:14 +0200, Nico Schottelius a écrit :
> Dear LKML,
>
> there is a rather nasty bug in all recent kernels (probably older as
> well):
>
> If wpa_supplicant looses the connection to an AP, all network calls
> hang, uninterruptle.
>
> This is a snippet of running fetchmail after the problem started:
>
> socket(PF_NETLINK, SOCK_RAW, 0) = 4
> bind(4, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12) = 0
> getsockname(4, {sa_family=AF_NETLINK, pid=9734, groups=00000000}, [12]) = 0
> sendto(4, "\24\0\0\0\26\0\1\3\225\\\235N\0\0\0\0\0\0\0\0", 20, 0, {sa_family=AF_NETLINK, pid=0, groups=00000000}, 12
>
> [12:59] brief:linux-2.6% ps aux | grep fetchmail
> nico 9063 0.0 0.0 20652 1492 pts/3 D+ 12:55 0:00 fetchmail
> nico 9564 0.0 0.0 20652 1496 pts/15 D+ 13:01 0:00 fetchmail
> nico 9731 0.0 0.0 4288 724 pts/34 S+ 13:01 0:00 strace -fF fetchmail
> nico 9734 0.0 0.0 20652 1492 pts/34 D+ 13:01 0:00 fetchmail
>
> This is particular nasty, because even calling "sudo -i" does a sendto() call
> and thus hangs, as well as trying to send this mail with sendmail makes mutt hang.
>
> I've seen that on stock Archlinux kernel (linux 3.0.6-2),
> latest from Linus (v3.1-rc9-93-ga84a79e), merged trees from Keith and Jiri
> (3.1.0-rc6-gbee709a).
>
> Listing of hanging processes:
>
> [13:05] brief:linux-2.6% ps aux | grep D
> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
> root 9056 0.2 0.0 0 0 ? D 12:54 0:02 [kworker/u:0]
> nico 9063 0.0 0.0 20652 1492 pts/3 D+ 12:55 0:00 fetchmail
> nico 9066 0.0 0.0 16548 1180 ? D 12:56 0:00 sendmail -oem -oi -f ME@inf.ethz.ch -- RECIPIENT
> root 9067 0.0 0.0 6060 320 pts/14 DN 12:56 0:00 ip r
> root 9124 0.0 0.0 6060 320 pts/14 DN 12:57 0:00 ip r add 192.168.42.1/24 dev lo
> root 9390 0.0 0.0 11996 784 pts/12 D+ 13:01 0:00 sudo -i
> nico 9564 0.0 0.0 20652 1496 pts/15 D+ 13:01 0:00 fetchmail
> nico 9734 0.0 0.0 20652 1492 pts/34 D+ 13:01 0:00 fetchmail
> root 12304 0.0 0.0 28624 2228 ? Ds 09:16 0:00 wpa_supplicant -B -P /run/wpa_supplicant_wlan0.pid -i wlan0 -D nl80211,wext -c /run/network//wpa.wlan0/wpa.conf
> [13:07] brief:linux-2.6%
>
> The hardware is again the (in)famous MacBook Air 4,2, so I imagine this may be
> related to brcmsmac.
>
> Attached are config, dmesg, lsmod.
Must be a mutex_unlock(some_mltex) missing somewhere.
Then later, a process holding RTNL is blocking on mutex_lock(some_mutex)
Try a "CONFIG_LOCKDEP=y" enabled build
^ permalink raw reply
* Re: [patch] pktgen: bug when calling ndelay in x86 architectures
From: Eric Dumazet @ 2011-10-18 11:56 UTC (permalink / raw)
To: Daniel Turull
Cc: David Miller, netdev, Robert Olsson, Voravit Tanyingyong,
Jens Laas
In-Reply-To: <4E9D5E1B.3080704@gmail.com>
Le mardi 18 octobre 2011 à 13:08 +0200, Daniel Turull a écrit :
> The value selected to delay the transmission in pktgen with the ndelay function should be lower.
> In Linux/arch/x86/include/asm/delay.h and Linux/arch/sh/include/asm/delay.h
> the maximal expected value for a constant is 20000 ns.
>
> Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
> ---
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index 796044a..e17bd41 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -2145,7 +2145,7 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
> }
>
> start_time = ktime_now();
> - if (remaining < 100000)
> + if (remaining < 20000)
> ndelay(remaining); /* really small just spin */
> else {
> /* see do_nanosleep */
But 'remaining' is not a constant.
If we want exactly 40.000 packets per second rate (25 us between
packets), your patch makes this not quite possible without
CONFIG_HIGH_RES_TIMERS and probable high jitter because of scheduler
effects.
pktgen is kind of special, we _want_ a cpu for our exclusive use.
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Herbert Xu @ 2011-10-18 11:49 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <1318937878.2657.50.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Tue, Oct 18, 2011 at 01:37:58PM +0200, Eric Dumazet wrote:
>
> In the bug we try to fix, we have :
>
> skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)
>
> ... < increase of dev->needed_headroom by another cpu/task >
>
> skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
OK, in that case one fix would be to replace LL_ALLOCATED_SPACE
with its two constiuents so that they may be stored in local
variables for later use.
hlen = LL_HEADROOM(skb);
tlen = LL_TAILROOM(skb);
skb_alloc_send_skb(sk, ... + LL_ALIGN(hlen + tlen));
skb_reserve(skb, LL_ALIGN(hlen));
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH v13 0/6] flexcan: Add support for powerpc flexcan (freescale p1010)
From: Marc Kleine-Budde @ 2011-10-18 11:48 UTC (permalink / raw)
To: Kumar Gala
Cc: Grant Likely, netdev-u79uwXL29TY76Z2rM5mHXA, U Bhaskar-B22300,
linux-can-u79uwXL29TY76Z2rM5mHXA,
socketcan-core-0fE9KPoRgkgATYTw5x5z8w, PPC list, David S. Miller,
Wolfgang Grandegger
In-Reply-To: <D79CB818-C14E-4C8D-9A8D-42B39ADE20B2-XVmvHMARGAS8U2dJNN8I7kB+6BGkLq7r@public.gmane.org>
[-- Attachment #1.1: Type: text/plain, Size: 857 bytes --]
On 10/18/2011 01:43 PM, Kumar Gala wrote:
> Thanks, I'll look into this internally at FSL. I think its confusing
> as hell to have "fsl,p1010-flexcan" in an ARM .dts and don't think
> any reasonable ARM customer of FSL would know to put a PPC SOC name
> in their .dts. I'll ask the HW guys what's going on so we can come
> up with a bit more generic name so we don't have to constantly change
> this. Even if its just:
>
> fsl,ppc-flexcan & fsl,arm-flexcan.
If you talk the HW guys ask them for a name for the coldfire flexcan
core, too please ;)
cheers, Marc
--
Pengutronix e.K. | Marc Kleine-Budde |
Industrial Linux Solutions | Phone: +49-231-2826-924 |
Vertretung West/Dortmund | Fax: +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686 | http://www.pengutronix.de |
[-- Attachment #1.2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 262 bytes --]
[-- Attachment #2: Type: text/plain, Size: 188 bytes --]
_______________________________________________
Socketcan-core mailing list
Socketcan-core-0fE9KPoRgkgATYTw5x5z8w@public.gmane.org
https://lists.berlios.de/mailman/listinfo/socketcan-core
^ permalink raw reply
* Re: [PATCH v13 0/6] flexcan: Add support for powerpc flexcan (freescale p1010)
From: Kumar Gala @ 2011-10-18 11:43 UTC (permalink / raw)
To: Robin Holt
Cc: Grant Likely, netdev-u79uwXL29TY76Z2rM5mHXA, U Bhaskar-B22300,
socketcan-core-0fE9KPoRgkgATYTw5x5z8w, Marc Kleine-Budde,
PPC list, David S. Miller, Wolfgang Grandegger
In-Reply-To: <20111018094328.GC22814-sJ/iWh9BUns@public.gmane.org>
>> Robin,
>>
>> Do you remember why we went with just 'fsl,p1010-flexcan' as the device tree compatible? Do we feel the flex can on P1010 isn't the same as on MPC5xxx? or the ARM SoCs?
>
> The decision was due to the fact there is no true "generic" fsl.flexcan
> chip free of any SOC implementation and therefore not something which
> could be separately defined. That decision was made by Grant Likely.
> I will inline that email below.
>
> Robin
Thanks, I'll look into this internally at FSL. I think its confusing as hell to have "fsl,p1010-flexcan" in an ARM .dts and don't think any reasonable ARM customer of FSL would know to put a PPC SOC name in their .dts. I'll ask the HW guys what's going on so we can come up with a bit more generic name so we don't have to constantly change this. Even if its just:
fsl,ppc-flexcan & fsl,arm-flexcan.
> On Mon, Aug 15, 2011 at 09:13:50AM -0600, Grant Likely wrote:
>> On Mon, Aug 15, 2011 at 9:03 AM, Robin Holt <holt-sJ/iWh9BUns@public.gmane.org> wrote:
>>> Grant,
>>>
>>> Earlier, you had asked for a more specific name for the compatible
>>> property of the Freescale flexcan device. I still have not gotten a
>>> more specific answer. Hopefully Marc can give you more details about
>>> the flexcan implementations.
>>
>> If there is no ip core version, then just stick with the
>> fsl,<soc>-flexcan name and drop "fsl,flexcan". Marketing may say
>> flexcan is flexcan, but hardware engineers like to change things.
>> Trying to be too generic in compatible values will just lead to
>> problems in the future.
>
> Thanks,
> Robin
- k
^ permalink raw reply
* [net-next] stmmac: limit max_mtu in case of 4KiB and use __netdev_alloc_skb (V2)
From: Giuseppe CAVALLARO @ 2011-10-18 11:39 UTC (permalink / raw)
To: netdev; +Cc: davem, eric.dumazet, Giuseppe Cavallaro
In-Reply-To: <1318932085-14927-9-git-send-email-peppe.cavallaro@st.com>
Problem using big mtu around 4096 bytes is you end allocating (4096
+NET_SKB_PAD + NET_IP_ALIGN + sizeof(struct skb_shared_info) bytes ->
8192 bytes : order-1 pages
It's better to limit the mtu to SKB_MAX_HEAD(NET_SKB_PAD),
to have no more than one page per skb.
Also the patch changes the netdev_alloc_skb_ip_align() done in
init_dma_desc_rings() and uses a variant allowing GFP_KERNEL allocations
allowing the driver to load even in case of memory pressure.
Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 6 ++++--
1 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 1848a16..6b8f455 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -474,11 +474,13 @@ static void init_dma_desc_rings(struct net_device *dev)
for (i = 0; i < rxsize; i++) {
struct dma_desc *p = priv->dma_rx + i;
- skb = netdev_alloc_skb_ip_align(dev, bfsize);
+ skb = __netdev_alloc_skb(dev, bfsize + NET_IP_ALIGN,
+ GFP_KERNEL);
if (unlikely(skb == NULL)) {
pr_err("%s: Rx init fails; skb is NULL\n", __func__);
break;
}
+ skb_reserve(skb, NET_IP_ALIGN);
priv->rx_skbuff[i] = skb;
priv->rx_skbuff_dma[i] = dma_map_single(priv->device, skb->data,
bfsize, DMA_FROM_DEVICE);
@@ -1401,7 +1403,7 @@ static int stmmac_change_mtu(struct net_device *dev, int new_mtu)
if (priv->plat->enh_desc)
max_mtu = JUMBO_LEN;
else
- max_mtu = BUF_SIZE_4KiB;
+ max_mtu = SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN);
if ((new_mtu < 46) || (new_mtu > max_mtu)) {
pr_err("%s: invalid MTU, max MTU is: %d\n", dev->name, max_mtu);
--
1.7.4.4
^ permalink raw reply related
* Re: [net-next 8/8] stmmac: limit max_mtu in case of 4KiB and use __netdev_alloc_skb.
From: Giuseppe CAVALLARO @ 2011-10-18 11:33 UTC (permalink / raw)
To: Eric Dumazet; +Cc: netdev, davem
In-Reply-To: <1318933110.2657.39.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On 10/18/2011 12:18 PM, Eric Dumazet wrote:
> Le mardi 18 octobre 2011 à 12:01 +0200, Giuseppe CAVALLARO a écrit :
>> Problem using big mtu around 4096 bytes is you end allocating (4096
>> +NET_SKB_PAD + NET_IP_ALIGN + sizeof(struct skb_shared_info) bytes ->
>> 8192 bytes : order-1 pages
>>
>> It's better to limit the mtu to SKB_MAX_HEAD(NET_SKB_PAD),
>> to have no more than one page per skb.
>>
>> Also the patch changes the netdev_alloc_skb_ip_align() done in
>> init_dma_desc_rings() and uses a variant allowing GFP_KERNEL allocations
>> allowing the driver to load even in case of memory pressure.
>>
>> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
>> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++----
>> 1 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 1848a16..f5ca3be 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -474,11 +474,13 @@ static void init_dma_desc_rings(struct net_device *dev)
>> for (i = 0; i < rxsize; i++) {
>> struct dma_desc *p = priv->dma_rx + i;
>>
>> - skb = netdev_alloc_skb_ip_align(dev, bfsize);
>> + skb = __netdev_alloc_skb(dev, bfsize + NET_IP_ALIGN,
>> + GFP_KERNEL);
>> if (unlikely(skb == NULL)) {
>> pr_err("%s: Rx init fails; skb is NULL\n", __func__);
>> break;
>> }
>> + skb_reserve(skb, NET_IP_ALIGN);
>> priv->rx_skbuff[i] = skb;
>> priv->rx_skbuff_dma[i] = dma_map_single(priv->device, skb->data,
>> bfsize, DMA_FROM_DEVICE);
>> @@ -1176,12 +1178,15 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
>>
>> skb = __skb_dequeue(&priv->rx_recycle);
>> if (skb == NULL)
>> - skb = netdev_alloc_skb_ip_align(priv->dev,
>> - bfsize);
>> + skb = __netdev_alloc_skb(priv->dev, bfsize +
>> + NET_IP_ALIGN,
>> + GFP_KERNEL);
>>
>
> No, you cant do that in softirq context. We cant sleep here and must use
> GFP_ATOMIC
> Only the init_dma_desc_rings() part is OK, we run in process context and
> are allowed to sleep in memory allocations (GFP_KERNEL)
>
>
>> if (unlikely(skb == NULL))
>> break;
>>
>> + skb_reserve(skb, NET_IP_ALIGN);
>> +
>> priv->rx_skbuff[entry] = skb;
>> priv->rx_skbuff_dma[entry] =
>> dma_map_single(priv->device, skb->data, bfsize,
>> @@ -1401,7 +1406,7 @@ static int stmmac_change_mtu(struct net_device *dev, int new_mtu)
>> if (priv->plat->enh_desc)
>> max_mtu = JUMBO_LEN;
>> else
>> - max_mtu = BUF_SIZE_4KiB;
>> + max_mtu = SKB_MAX_HEAD(NET_SKB_PAD);
>>
>
> minor nit (since NET_IP_ALIGN is 0 anyway on modern x86)
> max_mtu = SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN);
>
>
>> if ((new_mtu < 46) || (new_mtu > max_mtu)) {
>> pr_err("%s: invalid MTU, max MTU is: %d\n", dev->name, max_mtu);
>
>
>
Ok! I'm reworking and sending it again.
Thx
Peppe
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Eric Dumazet @ 2011-10-18 11:37 UTC (permalink / raw)
To: Herbert Xu; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <20111018104504.GA15913@gondor.apana.org.au>
Le mardi 18 octobre 2011 à 12:45 +0200, Herbert Xu a écrit :
> On Tue, Oct 18, 2011 at 12:23:43PM +0200, Eric Dumazet wrote:
> >
> > You're right, if reallocations are OK in all paths.
>
> If it wasn't OK then making needed_headroom constant won't work
> anyway.
>
> > We'll need to change LL_RESERVED_SPACE() / LL_RESERVED_SPACE_EXTRA() /
> > LL_ALLOCATED_SPACE() macros and provide the [read once] values, instead
> > of a [read once] pointer to values.
>
> I'm not sure what you mean here. I don't see any need to change
> these macros. All we need is to save the value in a local variable:
>
> hh_len = LL_RESERVED_SPACE(dev);
>
> skb = alloc_skb(hh_len + len);
> skb_reserve(skb, hh_len);
>
Not really Herbert. Please read again my patch changelog.
In the bug we try to fix, we have :
skb = sock_alloc_send_skb(sk, ... + LL_ALLOCATED_SPACE(rt->dst.dev)
... < increase of dev->needed_headroom by another cpu/task >
skb_reserve(skb, LL_RESERVED_SPACE(rt->dst.dev));
skb_put() -> crash because we reserved too much space
So we really want LL_ALLOCATED_SPACE() and LL_RESERVED_SPACE() use the
same needed_headroom, or else you can have LL_RESERVED_SPACE() >
LL_ALLOCATED_SPACE().
There are several way to fix this, but this kind of code assumed the
dev->needed... values were consistent for the whole block.
^ permalink raw reply
* [patch] pktgen: bug when calling ndelay in x86 architectures
From: Daniel Turull @ 2011-10-18 11:08 UTC (permalink / raw)
To: David Miller; +Cc: netdev, Robert Olsson, Voravit Tanyingyong, Jens Laas
The value selected to delay the transmission in pktgen with the ndelay function should be lower.
In Linux/arch/x86/include/asm/delay.h and Linux/arch/sh/include/asm/delay.h
the maximal expected value for a constant is 20000 ns.
Signed-off-by: Daniel Turull <daniel.turull@gmail.com>
---
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 796044a..e17bd41 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2145,7 +2145,7 @@ static void spin(struct pktgen_dev *pkt_dev, ktime_t spin_until)
}
start_time = ktime_now();
- if (remaining < 100000)
+ if (remaining < 20000)
ndelay(remaining); /* really small just spin */
else {
/* see do_nanosleep */
^ permalink raw reply related
* Re: [PATCH] Fix guest memory leak and panic
From: Ian Campbell @ 2011-10-18 10:50 UTC (permalink / raw)
To: Krishna Kumar2
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org, rusty@rustcorp.com.au,
virtualization@lists.linux-foundation.org
In-Reply-To: <OFB632D60B.6534A38B-ON6525792D.003A1B21-6525792D.003B0DB8@in.ibm.com>
On Tue, 2011-10-18 at 11:46 +0100, Krishna Kumar2 wrote:
> Ian Campbell <Ian.Campbell@citrix.com> wrote on 10/18/2011 03:17:11 PM:
>
> > > I think the best thing might be to remove the additional ref taking
> from
> > > the setter function and audit the previous changes to ensure they
> > > conform. I'll do that right away and post a fixup patch ASAP.
> >
> > Sigh, only one out of the ten callers of (__)skb_frag_set_page expects
> > skb_frag_set_page to take a new reference. I think that's pretty
> > comprehensive evidence that the current behaviour is unexpected and
> > wrong.
>
> Looks good!
Thanks.
> Does it make sense to commit both of these patches? The
> reason being - my patch becomes a cleanup of set_skb_frag()
> in virtio_net driver.
I think your patch remains a valid cleanup but I don't maintain that
code.
Ian.
^ permalink raw reply
* Re: [PATCH] Fix guest memory leak and panic
From: Krishna Kumar2 @ 2011-10-18 10:46 UTC (permalink / raw)
To: Ian Campbell
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org, rusty@rustcorp.com.au,
virtualization@lists.linux-foundation.org
In-Reply-To: <1318931232.16132.65.camel@zakaz.uk.xensource.com>
Ian Campbell <Ian.Campbell@citrix.com> wrote on 10/18/2011 03:17:11 PM:
> > I think the best thing might be to remove the additional ref taking
from
> > the setter function and audit the previous changes to ensure they
> > conform. I'll do that right away and post a fixup patch ASAP.
>
> Sigh, only one out of the ten callers of (__)skb_frag_set_page expects
> skb_frag_set_page to take a new reference. I think that's pretty
> comprehensive evidence that the current behaviour is unexpected and
> wrong.
Looks good!
Does it make sense to commit both of these patches? The
reason being - my patch becomes a cleanup of set_skb_frag()
in virtio_net driver.
thanks,
- KK
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Herbert Xu @ 2011-10-18 10:45 UTC (permalink / raw)
To: Eric Dumazet; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <1318933423.2657.44.camel@edumazet-HP-Compaq-6005-Pro-SFF-PC>
On Tue, Oct 18, 2011 at 12:23:43PM +0200, Eric Dumazet wrote:
>
> You're right, if reallocations are OK in all paths.
If it wasn't OK then making needed_headroom constant won't work
anyway.
> We'll need to change LL_RESERVED_SPACE() / LL_RESERVED_SPACE_EXTRA() /
> LL_ALLOCATED_SPACE() macros and provide the [read once] values, instead
> of a [read once] pointer to values.
I'm not sure what you mean here. I don't see any need to change
these macros. All we need is to save the value in a local variable:
hh_len = LL_RESERVED_SPACE(dev);
skb = alloc_skb(hh_len + len);
skb_reserve(skb, hh_len);
Cheers,
--
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
^ permalink raw reply
* Re: [PATCH] Fix guest memory leak and panic
From: Krishna Kumar2 @ 2011-10-18 10:28 UTC (permalink / raw)
To: Ian Campbell
Cc: davem@davemloft.net, linux-kernel@vger.kernel.org, mst@redhat.com,
netdev@vger.kernel.org, rusty@rustcorp.com.au,
virtualization@lists.linux-foundation.org
In-Reply-To: <1318927016.16132.49.camel@zakaz.uk.xensource.com>
Ian Campbell <Ian.Campbell@citrix.com> wrote on 10/18/2011 02:06:56 PM:
Hi Ian,
> On Tue, 2011-10-18 at 09:05 +0100, Krishna Kumar wrote:
> > The extra reference to the fragment pages causes those pages to
> > not get freed in skb_release_data(). The following patch fixes
> > the bug. I have not checked if any other converted driver has
> > the same issue.
>
> Damn. You are completely correct and I appear to have made this same
> mistake several times. A quick look suggests that at least cxbg,
> myriage, vmxnet, cassini and bnx2 may potentially have a similar
> issue :-( (I stopped looking at that point, I'll obviously do a full
> audit).
>
> I considered quite carefully whether (__)skb_frag_set_page should take a
> reference or not and decided yes but I'm starting to reconsider whether
> I made the right choice. It seems that is just confusing and violates
> the principal of least surprise to have a function called "set" take a
> new reference. In reality all existing drivers expect that adding a page
> to an SKB frag will just take over the existing reference.
>
> I think the best thing might be to remove the additional ref taking from
> the setter function and audit the previous changes to ensure they
> conform. I'll do that right away and post a fixup patch ASAP.
>
> > Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
>
> This change is correct as things stand today, so:
>
> Acked-by: Ian Campbell <ian.campbell@citrix.com>
>
> but perhaps it would be better to hold off and let me fix all of these
> all at once.
Either way is fine with me. However, besides having a fix,
I would like to remove the manual initializations in
set_skb_frag(), and substitute with __skb_fill_page_desc,
like other drivers do.
thanks,
- KK
^ permalink raw reply
* Re: PROBLEM: System call 'sendmsg' of process ospfd (quagga) causes kernel oops
From: Eric Dumazet @ 2011-10-18 10:23 UTC (permalink / raw)
To: Herbert Xu; +Cc: Elmar Vonlanthen, linux-kernel, netdev, Timo Teräs
In-Reply-To: <20111018100527.GA15383@gondor.apana.org.au>
Le mardi 18 octobre 2011 à 12:05 +0200, Herbert Xu a écrit :
> On Tue, Oct 18, 2011 at 12:01:33PM +0200, Eric Dumazet wrote:
> >
> > Adding an RCU protected structure to hold hard_header_len /
> > needed_headroom / needed_tailroom should be possible, but this adds yet
> > another pointer dereference...
>
> I don't think we need RCU here since the problem is simply that
> we're using two different values for skb allocations and skb_reserve.
>
> As long as we use one and the same value it should work. The value
> will rarely be incorrect and when it is, automatic reallocation will
> occur.
>
You're right, if reallocations are OK in all paths.
We'll need to change LL_RESERVED_SPACE() / LL_RESERVED_SPACE_EXTRA() /
LL_ALLOCATED_SPACE() macros and provide the [read once] values, instead
of a [read once] pointer to values.
Thats a bit complex change, but doable.
^ permalink raw reply
* Re: [net-next 8/8] stmmac: limit max_mtu in case of 4KiB and use __netdev_alloc_skb.
From: Eric Dumazet @ 2011-10-18 10:18 UTC (permalink / raw)
To: Giuseppe CAVALLARO; +Cc: netdev, davem
In-Reply-To: <1318932085-14927-9-git-send-email-peppe.cavallaro@st.com>
Le mardi 18 octobre 2011 à 12:01 +0200, Giuseppe CAVALLARO a écrit :
> Problem using big mtu around 4096 bytes is you end allocating (4096
> +NET_SKB_PAD + NET_IP_ALIGN + sizeof(struct skb_shared_info) bytes ->
> 8192 bytes : order-1 pages
>
> It's better to limit the mtu to SKB_MAX_HEAD(NET_SKB_PAD),
> to have no more than one page per skb.
>
> Also the patch changes the netdev_alloc_skb_ip_align() done in
> init_dma_desc_rings() and uses a variant allowing GFP_KERNEL allocations
> allowing the driver to load even in case of memory pressure.
>
> Reported-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Giuseppe Cavallaro <peppe.cavallaro@st.com>
> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++----
> 1 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 1848a16..f5ca3be 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -474,11 +474,13 @@ static void init_dma_desc_rings(struct net_device *dev)
> for (i = 0; i < rxsize; i++) {
> struct dma_desc *p = priv->dma_rx + i;
>
> - skb = netdev_alloc_skb_ip_align(dev, bfsize);
> + skb = __netdev_alloc_skb(dev, bfsize + NET_IP_ALIGN,
> + GFP_KERNEL);
> if (unlikely(skb == NULL)) {
> pr_err("%s: Rx init fails; skb is NULL\n", __func__);
> break;
> }
> + skb_reserve(skb, NET_IP_ALIGN);
> priv->rx_skbuff[i] = skb;
> priv->rx_skbuff_dma[i] = dma_map_single(priv->device, skb->data,
> bfsize, DMA_FROM_DEVICE);
> @@ -1176,12 +1178,15 @@ static inline void stmmac_rx_refill(struct stmmac_priv *priv)
>
> skb = __skb_dequeue(&priv->rx_recycle);
> if (skb == NULL)
> - skb = netdev_alloc_skb_ip_align(priv->dev,
> - bfsize);
> + skb = __netdev_alloc_skb(priv->dev, bfsize +
> + NET_IP_ALIGN,
> + GFP_KERNEL);
>
No, you cant do that in softirq context. We cant sleep here and must use
GFP_ATOMIC
Only the init_dma_desc_rings() part is OK, we run in process context and
are allowed to sleep in memory allocations (GFP_KERNEL)
> if (unlikely(skb == NULL))
> break;
>
> + skb_reserve(skb, NET_IP_ALIGN);
> +
> priv->rx_skbuff[entry] = skb;
> priv->rx_skbuff_dma[entry] =
> dma_map_single(priv->device, skb->data, bfsize,
> @@ -1401,7 +1406,7 @@ static int stmmac_change_mtu(struct net_device *dev, int new_mtu)
> if (priv->plat->enh_desc)
> max_mtu = JUMBO_LEN;
> else
> - max_mtu = BUF_SIZE_4KiB;
> + max_mtu = SKB_MAX_HEAD(NET_SKB_PAD);
>
minor nit (since NET_IP_ALIGN is 0 anyway on modern x86)
max_mtu = SKB_MAX_HEAD(NET_SKB_PAD + NET_IP_ALIGN);
> if ((new_mtu < 46) || (new_mtu > max_mtu)) {
> pr_err("%s: invalid MTU, max MTU is: %d\n", dev->name, max_mtu);
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox