public inbox for netdev@vger.kernel.org
 help / color / mirror / Atom feed
* [RFC] skb truesize last offenders
@ 2012-03-23 12:51 Eric Dumazet
  2012-03-23 18:38 ` David Miller
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-03-23 12:51 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Wey-Yi Guy

iwlwifi and some other users of skb_add_rx_frag() lie about skb
truesize, because of API being lazy.

iwlwifi can use order-0 or order-1 pages and skb truesize is wrong.
network sbs can use uncharged kernel memory and eventually crash machine
in OOM.

I plan : 

1) adding a @truesize parameter to skb_add_rx_frag() in a neutral patch

2) fix drivers that dont charge the real size of fragment

3) niu could then use skb_add_rx_frag() instead of custom code.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] skb truesize last offenders
  2012-03-23 12:51 [RFC] skb truesize last offenders Eric Dumazet
@ 2012-03-23 18:38 ` David Miller
  2012-03-23 18:55   ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: David Miller @ 2012-03-23 18:38 UTC (permalink / raw)
  To: eric.dumazet; +Cc: netdev, wey-yi.w.guy

From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 23 Mar 2012 05:51:52 -0700

> iwlwifi and some other users of skb_add_rx_frag() lie about skb
> truesize, because of API being lazy.
> 
> iwlwifi can use order-0 or order-1 pages and skb truesize is wrong.
> network sbs can use uncharged kernel memory and eventually crash machine
> in OOM.
> 
> I plan : 
> 
> 1) adding a @truesize parameter to skb_add_rx_frag() in a neutral patch
> 
> 2) fix drivers that dont charge the real size of fragment
> 
> 3) niu could then use skb_add_rx_frag() instead of custom code.

No objections from me.  But we really have to fix that
tcp_grow_window() issue, it's quite urgent if you ask me.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC] skb truesize last offenders
  2012-03-23 18:38 ` David Miller
@ 2012-03-23 18:55   ` Eric Dumazet
  2012-03-24  0:05     ` [PATCH] iwlwifi: fix skb truesize underestimation Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-03-23 18:55 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, wey-yi.w.guy, Neal Cardwell

On Fri, 2012-03-23 at 14:38 -0400, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Fri, 23 Mar 2012 05:51:52 -0700
> 
> > iwlwifi and some other users of skb_add_rx_frag() lie about skb
> > truesize, because of API being lazy.
> > 
> > iwlwifi can use order-0 or order-1 pages and skb truesize is wrong.
> > network sbs can use uncharged kernel memory and eventually crash machine
> > in OOM.
> > 
> > I plan : 
> > 
> > 1) adding a @truesize parameter to skb_add_rx_frag() in a neutral patch
> > 
> > 2) fix drivers that dont charge the real size of fragment
> > 
> > 3) niu could then use skb_add_rx_frag() instead of custom code.
> 
> No objections from me.  But we really have to fix that
> tcp_grow_window() issue, it's quite urgent if you ask me.

Sure, I can work on this issue with Neal asap, we do have some ideas on
this subject.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] iwlwifi: fix skb truesize underestimation
  2012-03-23 18:55   ` Eric Dumazet
@ 2012-03-24  0:05     ` Eric Dumazet
  2012-03-24  0:22       ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-03-24  0:05 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, wey-yi.w.guy, Neal Cardwell, John W. Linville

By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
but doesnt say so in skb->truesize.

This makes very possible to exhaust kernel memory since these skb evade
normal socket memory accounting.

As struct ieee80211_hdr is going to be pulled before calling IP stack,
there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
alloc_skb() is ok in this driver, allowing more tailroom.

Pull beginning of frame in skb header, in the hope we can reuse order-1
pages in the driver immediately for small frames and reduce their
truesize to the minimum (linear skbs)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
Depends on the "net: add a truesize parameter to skb_add_rx_frag()"
prior patch.

 drivers/net/wireless/iwlwifi/iwl-agn-rx.c        |   25 ++++++++-----
 drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c |    3 +
 drivers/net/wireless/iwlwifi/iwl-trans.h         |    1 
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-rx.c b/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
index f4b84d1..9ecd768f 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
@@ -773,8 +773,7 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
 	struct sk_buff *skb;
 	__le16 fc = hdr->frame_control;
 	struct iwl_rxon_context *ctx;
-	struct page *p;
-	int offset;
+	unsigned int hdrlen;
 
 	/* We only process data packets if the interface is open */
 	if (unlikely(!priv->is_open)) {
@@ -788,16 +787,24 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
 	    iwlagn_set_decrypted_flag(priv, hdr, ampdu_status, stats))
 		return;
 
-	skb = dev_alloc_skb(128);
+	/* Dont use dev_alloc_skb(), we'll have enough headroom once
+	 * ieee80211_hdr pulled.
+	 */
+	skb = alloc_skb(128);
 	if (!skb) {
-		IWL_ERR(priv, "dev_alloc_skb failed\n");
+		IWL_ERR(priv, "alloc_skb failed\n");
 		return;
 	}
-
-	offset = (void *)hdr - rxb_addr(rxb);
-	p = rxb_steal_page(rxb);
-	skb_add_rx_frag(skb, 0, p, offset, len, len);
-
+	hdrlen = min_t(unsigned int, len, skb_tailroom(skb));
+	memcpy(skb_put(skb, hdrlen), hdr, hdrlen);
+	len -= hdrlen;
+
+	if (len) {
+		int offset = (void *)hdr + hdrlen - rxb_addr(rxb);
+ 
+		skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset,
+				len, rxb->truesize);
+	}
 	iwl_update_stats(priv, false, fc, len);
 
 	/*
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c b/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
index 8b1a798..aa7aea1 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
@@ -374,8 +374,9 @@ static void iwl_rx_handle_rxbuf(struct iwl_trans *trans,
 	if (WARN_ON(!rxb))
 		return;
 
+	rxcb.truesize = PAGE_SIZE << hw_params(trans).rx_page_order;
 	dma_unmap_page(trans->dev, rxb->page_dma,
-		       PAGE_SIZE << hw_params(trans).rx_page_order,
+		       rxcb.truesize,
 		       DMA_FROM_DEVICE);
 
 	rxcb._page = rxb->page;
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h b/drivers/net/wireless/iwlwifi/iwl-trans.h
index 0c81cba..fdf9788 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans.h
+++ b/drivers/net/wireless/iwlwifi/iwl-trans.h
@@ -260,6 +260,7 @@ static inline void iwl_free_resp(struct iwl_host_cmd *cmd)
 
 struct iwl_rx_cmd_buffer {
 	struct page *_page;
+	unsigned int truesize;
 };
 
 static inline void *rxb_addr(struct iwl_rx_cmd_buffer *r)

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] iwlwifi: fix skb truesize underestimation
  2012-03-24  0:05     ` [PATCH] iwlwifi: fix skb truesize underestimation Eric Dumazet
@ 2012-03-24  0:22       ` Eric Dumazet
  2012-03-24  0:29         ` [PATCH v2] " Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-03-24  0:22 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, wey-yi.w.guy, Neal Cardwell, John W. Linville

On Fri, 2012-03-23 at 17:05 -0700, Eric Dumazet wrote:
> +	if (len) {
> +		int offset = (void *)hdr + hdrlen - rxb_addr(rxb);
> + 
> +		skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset,
> +				len, rxb->truesize);
> +	}
>  	iwl_update_stats(priv, false, fc, len);
>  

Please delete this patch, I'll send a v2 : iwl_update_stats() should be
called with original "len", not "len - hdrlen"

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] iwlwifi: fix skb truesize underestimation
  2012-03-24  0:22       ` Eric Dumazet
@ 2012-03-24  0:29         ` Eric Dumazet
  2012-03-24 15:35           ` Guy, Wey-Yi
  0 siblings, 1 reply; 8+ messages in thread
From: Eric Dumazet @ 2012-03-24  0:29 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, wey-yi.w.guy, Neal Cardwell, John W. Linville

By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
but doesnt say so in skb->truesize.

This makes very possible to exhaust kernel memory since these skb evade
normal socket memory accounting.

As struct ieee80211_hdr is going to be pulled before calling IP stack,
there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
alloc_skb() is ok in this driver, allowing more tailroom.

Pull beginning of frame in skb header, in the hope we can reuse order-1
pages in the driver immediately for small frames and reduce their
truesize to the minimum (linear skbs)

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
Cc: "John W. Linville" <linville@tuxdriver.com>
Cc: Neal Cardwell <ncardwell@google.com>
---
Depends on the "net: add a truesize parameter to skb_add_rx_frag()"
prior patch.

v2: fix the iwl_update_stats() call

 drivers/net/wireless/iwlwifi/iwl-agn-rx.c        |   25 ++++++++-----
 drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c |    3 +
 drivers/net/wireless/iwlwifi/iwl-trans.h         |    1 
 3 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/net/wireless/iwlwifi/iwl-agn-rx.c b/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
index f4b84d1..19a9499 100644
--- a/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-agn-rx.c
@@ -773,8 +773,7 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
 	struct sk_buff *skb;
 	__le16 fc = hdr->frame_control;
 	struct iwl_rxon_context *ctx;
-	struct page *p;
-	int offset;
+	unsigned int hdrlen, fraglen;
 
 	/* We only process data packets if the interface is open */
 	if (unlikely(!priv->is_open)) {
@@ -788,16 +787,24 @@ static void iwlagn_pass_packet_to_mac80211(struct iwl_priv *priv,
 	    iwlagn_set_decrypted_flag(priv, hdr, ampdu_status, stats))
 		return;
 
-	skb = dev_alloc_skb(128);
+	/* Dont use dev_alloc_skb(), we'll have enough headroom once
+	 * ieee80211_hdr pulled.
+	 */
+	skb = alloc_skb(128, GFP_ATOMIC);
 	if (!skb) {
-		IWL_ERR(priv, "dev_alloc_skb failed\n");
+		IWL_ERR(priv, "alloc_skb failed\n");
 		return;
 	}
-
-	offset = (void *)hdr - rxb_addr(rxb);
-	p = rxb_steal_page(rxb);
-	skb_add_rx_frag(skb, 0, p, offset, len, len);
-
+	hdrlen = min_t(unsigned int, len, skb_tailroom(skb));
+	memcpy(skb_put(skb, hdrlen), hdr, hdrlen);
+	fraglen = len - hdrlen;
+
+	if (fraglen) {
+		int offset = (void *)hdr + hdrlen - rxb_addr(rxb);
+ 
+		skb_add_rx_frag(skb, 0, rxb_steal_page(rxb), offset,
+				fraglen, rxb->truesize);
+	}
 	iwl_update_stats(priv, false, fc, len);
 
 	/*
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c b/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
index 8b1a798..aa7aea1 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-trans-pcie-rx.c
@@ -374,8 +374,9 @@ static void iwl_rx_handle_rxbuf(struct iwl_trans *trans,
 	if (WARN_ON(!rxb))
 		return;
 
+	rxcb.truesize = PAGE_SIZE << hw_params(trans).rx_page_order;
 	dma_unmap_page(trans->dev, rxb->page_dma,
-		       PAGE_SIZE << hw_params(trans).rx_page_order,
+		       rxcb.truesize,
 		       DMA_FROM_DEVICE);
 
 	rxcb._page = rxb->page;
diff --git a/drivers/net/wireless/iwlwifi/iwl-trans.h b/drivers/net/wireless/iwlwifi/iwl-trans.h
index 0c81cba..fdf9788 100644
--- a/drivers/net/wireless/iwlwifi/iwl-trans.h
+++ b/drivers/net/wireless/iwlwifi/iwl-trans.h
@@ -260,6 +260,7 @@ static inline void iwl_free_resp(struct iwl_host_cmd *cmd)
 
 struct iwl_rx_cmd_buffer {
 	struct page *_page;
+	unsigned int truesize;
 };
 
 static inline void *rxb_addr(struct iwl_rx_cmd_buffer *r)

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] iwlwifi: fix skb truesize underestimation
  2012-03-24  0:29         ` [PATCH v2] " Eric Dumazet
@ 2012-03-24 15:35           ` Guy, Wey-Yi
  2012-03-24 16:12             ` Eric Dumazet
  0 siblings, 1 reply; 8+ messages in thread
From: Guy, Wey-Yi @ 2012-03-24 15:35 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: David Miller, netdev, Neal Cardwell, John W. Linville

Hi Eric,

On Fri, 2012-03-23 at 17:29 -0700, Eric Dumazet wrote:
> By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
> but doesnt say so in skb->truesize.
> 
> This makes very possible to exhaust kernel memory since these skb evade
> normal socket memory accounting.
> 
> As struct ieee80211_hdr is going to be pulled before calling IP stack,
> there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
> alloc_skb() is ok in this driver, allowing more tailroom.
> 
> Pull beginning of frame in skb header, in the hope we can reuse order-1
> pages in the driver immediately for small frames and reduce their
> truesize to the minimum (linear skbs)
> 
> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
> Cc: "John W. Linville" <linville@tuxdriver.com>
> Cc: Neal Cardwell <ncardwell@google.com>
> ---

Is it ok I pull your patch into our internal tree first for regression
testing, once the test is done, I will push your patch along with all
our other patches together to wireless-next(John Linville).

By doing so, first, we know the patch is not breaking any functionality
for all the devices(legacy/new), second, it is easier for us to sync-up
with all the other changes we are making now.

Thanks
Wey

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] iwlwifi: fix skb truesize underestimation
  2012-03-24 15:35           ` Guy, Wey-Yi
@ 2012-03-24 16:12             ` Eric Dumazet
  0 siblings, 0 replies; 8+ messages in thread
From: Eric Dumazet @ 2012-03-24 16:12 UTC (permalink / raw)
  To: Guy, Wey-Yi; +Cc: David Miller, netdev, Neal Cardwell, John W. Linville

Le samedi 24 mars 2012 à 08:35 -0700, Guy, Wey-Yi a écrit :
> Hi Eric,
> 
> On Fri, 2012-03-23 at 17:29 -0700, Eric Dumazet wrote:
> > By default, iwlwifi uses order-1 pages (8 KB) to store incoming frames,
> > but doesnt say so in skb->truesize.
> > 
> > This makes very possible to exhaust kernel memory since these skb evade
> > normal socket memory accounting.
> > 
> > As struct ieee80211_hdr is going to be pulled before calling IP stack,
> > there is no need to use dev_alloc_skb() to reserve NET_SKB_PAD bytes.
> > alloc_skb() is ok in this driver, allowing more tailroom.
> > 
> > Pull beginning of frame in skb header, in the hope we can reuse order-1
> > pages in the driver immediately for small frames and reduce their
> > truesize to the minimum (linear skbs)
> > 
> > Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> > Cc: Wey-Yi Guy <wey-yi.w.guy@intel.com>
> > Cc: "John W. Linville" <linville@tuxdriver.com>
> > Cc: Neal Cardwell <ncardwell@google.com>
> > ---
> 
> Is it ok I pull your patch into our internal tree first for regression
> testing, once the test is done, I will push your patch along with all
> our other patches together to wireless-next(John Linville).
> 
> By doing so, first, we know the patch is not breaking any functionality
> for all the devices(legacy/new), second, it is easier for us to sync-up
> with all the other changes we are making now.
> 

I am absolutely fine with this plan, thanks a lot !

BTW, why order-1 pages are the default on this driver ?

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-03-24 16:12 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-23 12:51 [RFC] skb truesize last offenders Eric Dumazet
2012-03-23 18:38 ` David Miller
2012-03-23 18:55   ` Eric Dumazet
2012-03-24  0:05     ` [PATCH] iwlwifi: fix skb truesize underestimation Eric Dumazet
2012-03-24  0:22       ` Eric Dumazet
2012-03-24  0:29         ` [PATCH v2] " Eric Dumazet
2012-03-24 15:35           ` Guy, Wey-Yi
2012-03-24 16:12             ` Eric Dumazet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox