Netdev List


* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Lennert Buytenhek @ 2012-04-27 10:17 UTC
  To: sedat.dilek; +Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <CA+icZUWuynSaPE4pdxzoP783pw7r=5x27ewXN_pkR5VS=0kKUg@mail.gmail.com>

On Fri, Apr 27, 2012 at 12:12:13PM +0200, Sedat Dilek wrote:

> >> >> >> Are you planning to or even working on support (for) 8361 devices?
> >> >> >
> >> >> > I don't have any such plans, and I don't know of anyone who does.
> >> >>
> >> >> Does 8361 require firmware support?
> >> >
> >> > What do you mean by that?
> >>
> >> IIRC ath5k for example needs no external firmware file.
> >
> > The 8361 requires firmware to be loaded into it before it'll do
> > anything useful.
> >
> >
> >> >> Does a firmware file exist (name?)?
> >> >
> >> > There's firmware for the 8361 out there, however, that version of
> >> > the firmware implements a firmware API that is different from the
> >> > one that mwl8k currently implements.
> >> >
> >> > You could add 8361 support to mwl8k, but then you'd have to go over
> >> > all the firmware command invocations in mwl8k and make sure that they
> >> > will work on the 8361 firmware that you're trying to support as well.
> >>
> >> Without having an 8361 this will be even harder to walk through.
> >>
> >> Anyway, thanks for your detailed explanations.
> >>
> >> What's the alternative for such affected users?
> >> Use ndis-wrapper?
> >
> > I'm not sure.  I've never tried to get an 8361 to work under Linux.
> 
> Just found on [1] this same wrong patch in [2] :-).
> Dunno if [3] worked and from where they have stolen fw-files.
> 
> - Sedat -
> 
> [1] https://dev.openwrt.org/ticket/7209
> [2] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_fix_pci_id.patch
> [3] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch

I doubt that [3] is really all that's needed to make it work.  But if
there's someone for whom it works, I'd like them to run some tests on
mwl8k + [3] on 88w8361p.


* [PATCH] usbnet: fix skb traversing races during unlink
From: Ming Lei @ 2012-04-27 10:21 UTC
  To: David S. Miller, Greg Kroah-Hartman
  Cc: netdev, linux-usb, Ming Lei, Huajun Li, Oliver Neukum, stable

Commit 4231d47e6fe69f061f96c98c30eaf9fb4c14b96d (net/usbnet: avoid
recursive locking in usbnet_stop()) fixes the recursive locking
problem by releasing the skb queue lock before unlink, but may
cause skb traversing races:
	- after the URB is unlinked and the queue lock is released,
	the referred skb and skb->next may be moved to the done queue,
	or even be released
	- in skb_queue_walk_safe, the next skb is still obtained
	via the next pointer of the last skb
	- so this may trigger an oops or other problems

This patch extends the usage of entry->state to describe the
'unlink_start' state, so the queue (rx/tx) lock is always held when
changing the state while the referred skb is in the rx or tx queue,
because unlink_urbs needs to know whether unlinking of the referred
URB has already been started.

Also, the patch uses usb_block_urb, introduced by Oliver, to block
URB resubmission in the completion handler if the URB is going to be
unlinked.

The other part of this patch is based on Huajun's patch:
always traverse from the head of the tx/rx queue to find an skb that
is to be unlinked but whose unlinking has not yet been started.
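
The traversal strategy described above can be sketched in userspace C.
This is only an illustrative model, not the driver code: the list type,
the state names and the complete_cur_and_next() helper are hypothetical,
and the comments mark where the real code would drop and retake the
queue lock. The point it shows is that the loop never keeps a saved
"next" pointer across the unlocked window; it rescans from the head.

```c
#include <assert.h>
#include <stddef.h>

enum entry_state { QUEUED, UNLINK_START };

struct entry {
	struct entry *next;
	enum entry_state state;
};

/* Return the first entry whose unlink has not been started, or NULL. */
static struct entry *first_unstarted(struct entry *head)
{
	struct entry *e;

	for (e = head; e != NULL; e = e->next)
		if (e->state != UNLINK_START)
			return e;
	return NULL;
}

/* Remove victim from the list if present (models a completion handler). */
static void list_remove(struct entry **head, struct entry *victim)
{
	struct entry **pp;

	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return;
		}
	}
}

/* Models the race window: the current entry and its successor both
 * leave the queue while the lock is not held. */
static void complete_cur_and_next(struct entry **head, struct entry *cur)
{
	if (cur->next != NULL)
		list_remove(head, cur->next);
	list_remove(head, cur);
}

typedef void (*unlocked_fn)(struct entry **head, struct entry *cur);

/* Start unlinking every entry; returns how many unlinks were started. */
static int unlink_all(struct entry **head, unlocked_fn while_unlocked)
{
	struct entry *e;
	int count = 0;

	while ((e = first_unstarted(*head)) != NULL) {
		e->state = UNLINK_START;	/* set while the lock is held */
		/* the real code drops the queue lock here */
		if (while_unlocked != NULL)
			while_unlocked(head, e);
		count++;
		/* lock retaken: rescan from the head, no stale pointer */
	}
	return count;
}

/* Three queued entries; the second one vanishes during the first unlink. */
static int demo_concurrent_removal(void)
{
	struct entry c = { NULL, QUEUED };
	struct entry b = { &c, QUEUED };
	struct entry a = { &b, QUEUED };
	struct entry *head = &a;
	int count = unlink_all(&head, complete_cur_and_next);

	return head == NULL ? count : -1;
}
```

With a saved next pointer, the second iteration of the demo would have
touched an entry that had already left the queue; rescanning from the
head skips it.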

Signed-off-by: Huajun Li <huajun.li.lee@gmail.com>
Signed-off-by: Ming Lei <tom.leiming@gmail.com>
Cc: Oliver Neukum <oneukum@suse.de>
Cc: stable@kernel.org

---

This patch depends on the usb_block_urb patch from Oliver.

 drivers/net/usb/usbnet.c   |   54 ++++++++++++++++++++++++++++++++------------
 include/linux/usb/usbnet.h |    3 ++-
 2 files changed, 41 insertions(+), 16 deletions(-)

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index db99536..dff5e1b 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -281,17 +281,30 @@ int usbnet_change_mtu (struct net_device *net, int new_mtu)
 }
 EXPORT_SYMBOL_GPL(usbnet_change_mtu);
 
+/*The caller must hold list->lock*/
+static void __usbnet_queue_skb(struct sk_buff_head *list,
+			struct sk_buff *newsk, enum skb_state state)
+{
+	struct skb_data *entry = (struct skb_data *) newsk->cb;
+
+	__skb_queue_tail(list, newsk);
+	entry->state = state;
+}
+
 /*-------------------------------------------------------------------------*/
 
 /* some LK 2.4 HCDs oopsed if we freed or resubmitted urbs from
  * completion callbacks.  2.5 should have fixed those bugs...
  */
 
-static void defer_bh(struct usbnet *dev, struct sk_buff *skb, struct sk_buff_head *list)
+static void defer_bh(struct usbnet *dev, struct sk_buff *skb,
+		struct sk_buff_head *list, enum skb_state state)
 {
 	unsigned long		flags;
+	struct skb_data *entry = (struct skb_data *) skb->cb;
 
 	spin_lock_irqsave(&list->lock, flags);
+	entry->state = state;
 	__skb_unlink(skb, list);
 	spin_unlock(&list->lock);
 	spin_lock(&dev->done.lock);
@@ -339,7 +352,6 @@ static int rx_submit (struct usbnet *dev, struct urb *urb, gfp_t flags)
 	entry = (struct skb_data *) skb->cb;
 	entry->urb = urb;
 	entry->dev = dev;
-	entry->state = rx_start;
 	entry->length = 0;
 
 	usb_fill_bulk_urb (urb, dev->udev, dev->in,
@@ -371,7 +383,7 @@ static int rx_submit (struct usbnet *dev, struct urb *urb, gfp_t flags)
 			tasklet_schedule (&dev->bh);
 			break;
 		case 0:
-			__skb_queue_tail (&dev->rxq, skb);
+			__usbnet_queue_skb(&dev->rxq, skb, rx_start);
 		}
 	} else {
 		netif_dbg(dev, ifdown, dev->net, "rx: stopped\n");
@@ -422,16 +434,17 @@ static void rx_complete (struct urb *urb)
 	struct skb_data		*entry = (struct skb_data *) skb->cb;
 	struct usbnet		*dev = entry->dev;
 	int			urb_status = urb->status;
+	enum skb_state		state;
 
 	skb_put (skb, urb->actual_length);
-	entry->state = rx_done;
+	state = rx_done;
 	entry->urb = NULL;
 
 	switch (urb_status) {
 	/* success */
 	case 0:
 		if (skb->len < dev->net->hard_header_len) {
-			entry->state = rx_cleanup;
+			state = rx_cleanup;
 			dev->net->stats.rx_errors++;
 			dev->net->stats.rx_length_errors++;
 			netif_dbg(dev, rx_err, dev->net,
@@ -470,7 +483,7 @@ static void rx_complete (struct urb *urb)
 				  "rx throttle %d\n", urb_status);
 		}
 block:
-		entry->state = rx_cleanup;
+		state = rx_cleanup;
 		entry->urb = urb;
 		urb = NULL;
 		break;
@@ -481,13 +494,13 @@ block:
 		// FALLTHROUGH
 
 	default:
-		entry->state = rx_cleanup;
+		state = rx_cleanup;
 		dev->net->stats.rx_errors++;
 		netif_dbg(dev, rx_err, dev->net, "rx status %d\n", urb_status);
 		break;
 	}
 
-	defer_bh(dev, skb, &dev->rxq);
+	defer_bh(dev, skb, &dev->rxq, state);
 
 	if (urb) {
 		if (netif_running (dev->net) &&
@@ -578,16 +591,24 @@ EXPORT_SYMBOL_GPL(usbnet_purge_paused_rxq);
 static int unlink_urbs (struct usbnet *dev, struct sk_buff_head *q)
 {
 	unsigned long		flags;
-	struct sk_buff		*skb, *skbnext;
+	struct sk_buff		*skb;
 	int			count = 0;
 
 	spin_lock_irqsave (&q->lock, flags);
-	skb_queue_walk_safe(q, skb, skbnext) {
+	while (!skb_queue_empty(q)) {
 		struct skb_data		*entry;
 		struct urb		*urb;
 		int			retval;
 
-		entry = (struct skb_data *) skb->cb;
+		skb_queue_walk(q, skb) {
+			entry = (struct skb_data *) skb->cb;
+			if (entry->state != unlink_start)
+				break;
+		}
+		if (skb == (struct sk_buff *)q)
+			break;
+
+		entry->state = unlink_start;
 		urb = entry->urb;
 
 		/*
@@ -598,6 +619,10 @@ static int unlink_urbs (struct usbnet *dev, struct sk_buff_head *q)
 		 * handler(include defer_bh).
 		 */
 		usb_get_urb(urb);
+
+		/*speedup unlink by blocking resubmit*/
+		usb_block_urb(urb);
+
 		spin_unlock_irqrestore(&q->lock, flags);
 		// during some PM-driven resume scenarios,
 		// these (async) unlinks complete immediately
@@ -606,6 +631,7 @@ static int unlink_urbs (struct usbnet *dev, struct sk_buff_head *q)
 			netdev_dbg(dev->net, "unlink urb err, %d\n", retval);
 		else
 			count++;
+		usb_unblock_urb(urb);
 		usb_put_urb(urb);
 		spin_lock_irqsave(&q->lock, flags);
 	}
@@ -1039,8 +1065,7 @@ static void tx_complete (struct urb *urb)
 	}
 
 	usb_autopm_put_interface_async(dev->intf);
-	entry->state = tx_done;
-	defer_bh(dev, skb, &dev->txq);
+	defer_bh(dev, skb, &dev->txq, tx_done);
 }
 
 /*-------------------------------------------------------------------------*/
@@ -1096,7 +1121,6 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
 	entry = (struct skb_data *) skb->cb;
 	entry->urb = urb;
 	entry->dev = dev;
-	entry->state = tx_start;
 	entry->length = length;
 
 	usb_fill_bulk_urb (urb, dev->udev, dev->out,
@@ -1155,7 +1179,7 @@ netdev_tx_t usbnet_start_xmit (struct sk_buff *skb,
 		break;
 	case 0:
 		net->trans_start = jiffies;
-		__skb_queue_tail (&dev->txq, skb);
+		__usbnet_queue_skb(&dev->txq, skb, tx_start);
 		if (dev->txq.qlen >= TX_QLEN (dev))
 			netif_stop_queue (net);
 	}
diff --git a/include/linux/usb/usbnet.h b/include/linux/usb/usbnet.h
index 605b0aa..76f4396 100644
--- a/include/linux/usb/usbnet.h
+++ b/include/linux/usb/usbnet.h
@@ -191,7 +191,8 @@ extern void usbnet_cdc_status(struct usbnet *, struct urb *);
 enum skb_state {
 	illegal = 0,
 	tx_start, tx_done,
-	rx_start, rx_done, rx_cleanup
+	rx_start, rx_done, rx_cleanup,
+	unlink_start
 };
 
 struct skb_data {	/* skb->cb is one of these */
-- 
1.7.9.5


* Re: [PATCH 1/8] isdn/gigaset: ratelimit CAPI message dumps
From: Tilman Schmidt @ 2012-04-27 10:29 UTC
  To: Karsten Keil
  Cc: Karsten Keil, David Miller, Hansjoerg Lipp, i4ldeveloper, netdev,
	linux-kernel
In-Reply-To: <4F98EDB1.5090702@b1-systems.de>


On 26.04.2012 08:39, Karsten Keil wrote:
> On 26.04.2012 01:02, Tilman Schmidt wrote:
>> Introduce a global ratelimit for CAPI message dumps to protect
>> against possible log flood.
>> Drop the ratelimit for ignored messages which is now covered by the
>> global one.
> 
> Hmm, I think the only CAPI messages that would need a ratelimit are
> the DATA_B3 messages. If you need CAPI debug messages, in most cases
> you do not need all of the DATA_B3 messages, but you do not want to miss
> any other message related to call control. With a general rate limit
> you have no control over which messages are logged and which are not.

The ratelimit introduced by this patch only applies to messages
other than DATA_B3. Logging DATA_B3 messages is not done via
dump_cmsg().

I'd like to ratelimit specifically non-DATA_B3 messages because I
saw a (possibly buggy) CAPI application flooding the log with
FACILITY messages. Equally important, I'd like to make the
ratelimit in do_nothing() / do_unsupported() bursty because I had
a case where I needed to see several ignored/unhandled CAPI
messages in quick succession. So this patch is killing two birds
with one stone for me.

The burst limit of 20 messages in 20 seconds is chosen to allow a
complete call setup sequence to be logged, while limiting to one
message per second in the long run.
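
To illustrate the arithmetic, here is a minimal userspace sketch of a
windowed burst limiter with interval = 20 s and burst = 20. It is a
simplified model written for this explanation, not the kernel's
ratelimit code; the struct and function names are made up, and time is
passed in explicitly as whole seconds.

```c
#include <assert.h>
#include <stdbool.h>

struct ratelimit_model {
	long interval;	/* window length in seconds */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages let through in this window */
	bool started;	/* false until the first message arrives */
};

/* Return true if a message arriving at time 'now' may be logged. */
static bool ratelimit_ok(struct ratelimit_model *rl, long now)
{
	if (!rl->started || now - rl->begin >= rl->interval) {
		/* open a fresh window */
		rl->begin = now;
		rl->printed = 0;
		rl->started = true;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	return false;
}

/* Offer 'attempts' messages at the same instant; count those allowed. */
static int count_allowed(struct ratelimit_model *rl, long now, int attempts)
{
	int i, allowed = 0;

	for (i = 0; i < attempts; i++)
		if (ratelimit_ok(rl, now))
			allowed++;
	return allowed;
}
```

A burst of up to 20 messages passes at once, further messages in the
same 20 second window are dropped, and the long-run average cannot
exceed 20 messages per 20 seconds, i.e. one per second.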

> And there may be some cases where even the DATA_B3 messages are
> important (e.g. when searching for bugs in flow control), so I would
> still make it conditional, to allow printing all messages.

DATA_B3 dumps produce an enormous amount of log data and are
therefore controlled separately by the DEBUG_MCMD flag.
Someone who enables that should know what she or he is doing.
But if you need them, you need them all. A ratelimit doesn't
make sense there in my experience.

> And I'm not sure, if this is really something for stable.

It's pretty simple and localized, a net simplification, and only
affects generation of debugging messages, so I think it's safe.
But if you see a problem there I can drop the "CC: stable" line.

Thanks,
Tilman

-- 
Tilman Schmidt                    E-Mail: tilman@imap.cc
Bonn, Germany
This message consists of 100% recycled bits.
Unopened, best before: (see reverse side)



* [PATCH 1/4 net-next] net: allow skb->head to be a page fragment
From: Eric Dumazet @ 2012-04-27 10:33 UTC
  To: David Miller
  Cc: netdev, Ilpo Järvinen, Neal Cardwell, Tom Herbert,
	Maciej Żenczykowski, Jeff Kirsher, Ben Hutchings,
	Matt Carlson, Michael Chan, Herbert Xu

From: Eric Dumazet <edumazet@google.com>

skb->head is currently allocated from kmalloc(). This is convenient but
has the drawback that the data cannot be converted to a page fragment if
needed.

We have three spots where it hurts:

1) GRO aggregation

 When a linear skb must be appended to another skb, GRO uses the
frag_list fallback, which is very inefficient since we keep all the
struct sk_buff around. So drivers enabling GRO but delivering linear
skbs to the network stack aren't getting the full power of GRO.

2) splice(socket -> pipe).

 We must copy the linear part to a page fragment.
 This kind of defeats the purpose of splice() (zero copy claim).

3) TCP coalescing.

 Recently introduced, this permits grouping several contiguous segments
into a single skb. This shortens queue lengths, saves kernel memory,
and greatly reduces the probability of TCP collapses. This coalescing
doesn't work on linear skbs (or we would need to copy data, which would
be too slow).

Given all these issues, the following patch introduces the possibility
of having skb->head be a fragment in itself. We use a new skb flag,
skb->head_frag to carry this information.

build_skb() is changed to accept a frag_size argument. Drivers willing
to provide a page fragment instead of kmalloc() data will set a non zero
value, set to the fragment size.

Then, in situations where we need to convert the skb head to a frag in
itself, we can check if skb->head_frag is set and avoid the copies or
various fallbacks we have.

This means drivers currently using frags could be updated to avoid the
current skb->head allocation and reduce their memory footprint (aka skb
truesize); that's 512 or 1024 bytes saved per skb. This also makes
bpf/netfilter faster since the 'first frag' will be part of the skb
linear part, with no need to copy data.
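
The frag_size convention can be modelled in a few lines of userspace C.
This is a hedged sketch, not kernel code: struct buf_model, build_buf()
and release_path_for() are hypothetical names, and since ksize() does
not exist in userspace the kmalloc'ed size is passed in explicitly as
alloc_size. shinfo_size stands in for the aligned size of struct
skb_shared_info.

```c
#include <assert.h>
#include <stddef.h>

struct buf_model {
	void *head;		/* start of the data block */
	unsigned int size;	/* bytes usable before the shared info */
	unsigned int head_frag;	/* 1 if head lives in a page fragment */
};

enum release_path { RELEASE_KFREE, RELEASE_PUT_PAGE };

/* frag_size == 0 means "head was kmalloc'ed"; non-zero means the
 * caller handed in a page fragment of that size. */
static struct buf_model build_buf(void *data, unsigned int frag_size,
				  unsigned int alloc_size,
				  unsigned int shinfo_size)
{
	struct buf_model b;
	unsigned int size = frag_size ? frag_size : alloc_size;

	b.head = data;
	b.size = size - shinfo_size;	/* tail is reserved for shared info */
	b.head_frag = frag_size != 0;
	return b;
}

/* The flag later selects how the head must be released. */
static enum release_path release_path_for(const struct buf_model *b)
{
	return b->head_frag ? RELEASE_PUT_PAGE : RELEASE_KFREE;
}
```

The single flag is what lets the free path pick between dropping a page
reference and calling kfree() without any per-driver knowledge.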

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/bnx2.c            |    2 -
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c |    4 +-
 drivers/net/ethernet/broadcom/tg3.c             |    2 -
 include/linux/skbuff.h                          |    5 +-
 net/core/skbuff.c                               |   24 ++++++++++----
 5 files changed, 25 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnx2.c b/drivers/net/ethernet/broadcom/bnx2.c
index ab55979..ac7b744 100644
--- a/drivers/net/ethernet/broadcom/bnx2.c
+++ b/drivers/net/ethernet/broadcom/bnx2.c
@@ -3006,7 +3006,7 @@ error:
 
 	dma_unmap_single(&bp->pdev->dev, dma_addr, bp->rx_buf_use_size,
 			 PCI_DMA_FROMDEVICE);
-	skb = build_skb(data);
+	skb = build_skb(data, 0);
 	if (!skb) {
 		kfree(data);
 		goto error;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 60d5b54..9c5bc5d 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -513,7 +513,7 @@ static inline void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
 	dma_unmap_single(&bp->pdev->dev, dma_unmap_addr(rx_buf, mapping),
 			 fp->rx_buf_size, DMA_FROM_DEVICE);
 	if (likely(new_data))
-		skb = build_skb(data);
+		skb = build_skb(data, 0);
 
 	if (likely(skb)) {
 #ifdef BNX2X_STOP_ON_ERROR
@@ -721,7 +721,7 @@ int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
 						 dma_unmap_addr(rx_buf, mapping),
 						 fp->rx_buf_size,
 						 DMA_FROM_DEVICE);
-				skb = build_skb(data);
+				skb = build_skb(data, 0);
 				if (unlikely(!skb)) {
 					kfree(data);
 					fp->eth_q_stats.rx_skb_alloc_failed++;
diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index 0c3e7c7..d481b0a 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -5844,7 +5844,7 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 			pci_unmap_single(tp->pdev, dma_addr, skb_size,
 					 PCI_DMA_FROMDEVICE);
 
-			skb = build_skb(data);
+			skb = build_skb(data, 0);
 			if (!skb) {
 				kfree(data);
 				goto drop_it_no_recycle;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4a656b5..9d28a22 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -470,7 +470,8 @@ struct sk_buff {
 	__u8			wifi_acked_valid:1;
 	__u8			wifi_acked:1;
 	__u8			no_fcs:1;
-	/* 9/11 bit hole (depending on ndisc_nodetype presence) */
+	__u8			head_frag:1;
+	/* 8/10 bit hole (depending on ndisc_nodetype presence) */
 	kmemcheck_bitfield_end(flags2);
 
 #ifdef CONFIG_NET_DMA
@@ -562,7 +563,7 @@ extern void consume_skb(struct sk_buff *skb);
 extern void	       __kfree_skb(struct sk_buff *skb);
 extern struct sk_buff *__alloc_skb(unsigned int size,
 				   gfp_t priority, int fclone, int node);
-extern struct sk_buff *build_skb(void *data);
+extern struct sk_buff *build_skb(void *data, unsigned int frag_size);
 static inline struct sk_buff *alloc_skb(unsigned int size,
 					gfp_t priority)
 {
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2342a72..effa75d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -245,6 +245,7 @@ EXPORT_SYMBOL(__alloc_skb);
 /**
  * build_skb - build a network buffer
  * @data: data buffer provided by caller
+ * @frag_size: size of fragment, or 0 if head was kmalloced
  *
  * Allocate a new &sk_buff. Caller provides space holding head and
  * skb_shared_info. @data must have been allocated by kmalloc()
@@ -258,20 +259,21 @@ EXPORT_SYMBOL(__alloc_skb);
  *  before giving packet to stack.
  *  RX rings only contains data buffers, not full skbs.
  */
-struct sk_buff *build_skb(void *data)
+struct sk_buff *build_skb(void *data, unsigned int frag_size)
 {
 	struct skb_shared_info *shinfo;
 	struct sk_buff *skb;
-	unsigned int size;
+	unsigned int size = frag_size ? : ksize(data);
 
 	skb = kmem_cache_alloc(skbuff_head_cache, GFP_ATOMIC);
 	if (!skb)
 		return NULL;
 
-	size = ksize(data) - SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+	size -= SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
 
 	memset(skb, 0, offsetof(struct sk_buff, tail));
 	skb->truesize = SKB_TRUESIZE(size);
+	skb->head_frag = frag_size != 0;
 	atomic_set(&skb->users, 1);
 	skb->head = data;
 	skb->data = data;
@@ -376,6 +378,14 @@ static void skb_clone_fraglist(struct sk_buff *skb)
 		skb_get(list);
 }
 
+static void skb_free_head(struct sk_buff *skb)
+{
+	if (skb->head_frag)
+		put_page(virt_to_head_page(skb->head));
+	else
+		kfree(skb->head);
+}
+
 static void skb_release_data(struct sk_buff *skb)
 {
 	if (!skb->cloned ||
@@ -402,7 +412,7 @@ static void skb_release_data(struct sk_buff *skb)
 		if (skb_has_frag_list(skb))
 			skb_drop_fraglist(skb);
 
-		kfree(skb->head);
+		skb_free_head(skb);
 	}
 }
 
@@ -644,6 +654,7 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
 	C(tail);
 	C(end);
 	C(head);
+	C(head_frag);
 	C(data);
 	C(truesize);
 	atomic_set(&n->users, 1);
@@ -940,7 +951,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 		fastpath = atomic_read(&skb_shinfo(skb)->dataref) == delta;
 	}
 
-	if (fastpath &&
+	if (fastpath && !skb->head_frag &&
 	    size + sizeof(struct skb_shared_info) <= ksize(skb->head)) {
 		memmove(skb->head + size, skb_shinfo(skb),
 			offsetof(struct skb_shared_info,
@@ -967,7 +978,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	       offsetof(struct skb_shared_info, frags[skb_shinfo(skb)->nr_frags]));
 
 	if (fastpath) {
-		kfree(skb->head);
+		skb_free_head(skb);
 	} else {
 		/* copy this zero copy skb frags */
 		if (skb_shinfo(skb)->tx_flags & SKBTX_DEV_ZEROCOPY) {
@@ -985,6 +996,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 	off = (data + nhead) - skb->head;
 
 	skb->head     = data;
+	skb->head_frag = 0;
 adjust_others:
 	skb->data    += off;
 #ifdef NET_SKBUFF_DATA_USES_OFFSET


* [PATCH 2/4 net-next] tg3: provide frags as skb head
From: Eric Dumazet @ 2012-04-27 10:34 UTC
  To: David Miller
  Cc: netdev, Ilpo Järvinen, Tom Herbert, Neal Cardwell,
	Ben Hutchings, Matt Carlson, Jeff Kirsher, Michael Chan,
	Herbert Xu, Maciej Żenczykowski

From: Eric Dumazet <edumazet@google.com>

This patch converts the tg3 driver, one of our reference drivers, to use
the new build_skb() API in frag mode.

Instead of using kmalloc() to allocate the memory block that will be
used by build_skb() as skb->head, we use a page fragment.

This is a follow-up to the patch "net: allow skb->head to be a page fragment".

This allows GRO, TCP coalescing, and splice() to be more efficient.

Incidentally, this also removes SLUB slow path contention in kfree().
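
The page-carving idea can be sketched in userspace C as follows. This
is an illustrative model under stated assumptions, not the driver code:
struct page_model stands in for struct page, malloc() stands in for
alloc_page(), and the caller keeps the owning page pointer explicitly
because there is no virt_to_head_page() outside the kernel. All names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_FRAG_SIZE 2048u

struct page_model {
	unsigned char *mem;	/* backing storage for one page */
	unsigned int refcount;	/* one reference per fragment */
};

struct frag_cache {
	struct page_model *page;	/* page currently being carved */
	size_t offset;			/* next free byte in that page */
	size_t remaining;		/* bytes left in that page */
};

static int pages_released;	/* lets a caller observe page frees */

/* Hand out the next fragment, grabbing a new page when needed. */
static void *frag_alloc(struct frag_cache *fc, struct page_model **owner)
{
	void *data;

	if (fc->remaining < MODEL_FRAG_SIZE) {
		struct page_model *pg = malloc(sizeof(*pg));

		if (pg == NULL)
			return NULL;
		pg->mem = malloc(MODEL_PAGE_SIZE);
		if (pg->mem == NULL) {
			free(pg);
			return NULL;
		}
		/* take all fragment references up front */
		pg->refcount = MODEL_PAGE_SIZE / MODEL_FRAG_SIZE;
		fc->page = pg;
		fc->offset = 0;
		fc->remaining = MODEL_PAGE_SIZE;
	}
	data = fc->page->mem + fc->offset;
	*owner = fc->page;
	fc->offset += MODEL_FRAG_SIZE;
	fc->remaining -= MODEL_FRAG_SIZE;
	return data;
}

/* Drop one fragment reference; the page goes away with the last one. */
static void frag_free(struct page_model *pg)
{
	if (--pg->refcount == 0) {
		free(pg->mem);
		free(pg);
		pages_released++;
	}
}
```

Taking every reference when the page is obtained means each fragment
can be released independently, in any order, and the page is returned
only after the last one is gone.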

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
---
 drivers/net/ethernet/broadcom/tg3.c |   74 ++++++++++++++++++++++----
 drivers/net/ethernet/broadcom/tg3.h |    2 
 2 files changed, 66 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/tg3.c b/drivers/net/ethernet/broadcom/tg3.c
index d481b0a..c1a580e 100644
--- a/drivers/net/ethernet/broadcom/tg3.c
+++ b/drivers/net/ethernet/broadcom/tg3.c
@@ -195,6 +195,15 @@ static inline void _tg3_flag_clear(enum TG3_FLAGS flag, unsigned long *bits)
 #define TG3_RX_OFFSET(tp)	(NET_SKB_PAD)
 #endif
 
+/* This driver uses the new build_skb() API providing a frag as skb->head
+ * This strategy permits better GRO aggregation, better TCP coalescing, and
+ * better splice() implementation (avoids a copy from head to a page), at
+ * minimal memory cost.
+ * In this 2048 bytes block, we have enough room to store the MTU=1500 frame
+ * and the struct skb_shared_info.
+ */
+#define TG3_FRAGSIZE 2048
+
 /* minimum number of free TX descriptors required to wake up TX process */
 #define TG3_TX_WAKEUP_THRESH(tnapi)		((tnapi)->tx_pending / 4)
 #define TG3_TX_BD_DMA_MAX_2K		2048
@@ -5617,17 +5626,48 @@ static void tg3_tx(struct tg3_napi *tnapi)
 	}
 }
 
+static void *tg3_frag_alloc(struct tg3_rx_prodring_set *tpr)
+{
+	void *data;
+
+	if (tpr->rx_page_size < TG3_FRAGSIZE) {
+		struct page *page = alloc_page(GFP_ATOMIC);
+
+		if (!page)
+			return NULL;
+		atomic_add((PAGE_SIZE / TG3_FRAGSIZE) - 1, &page->_count); 
+		tpr->rx_page_addr = page_address(page);
+		tpr->rx_page_size = PAGE_SIZE;
+	}
+	data = tpr->rx_page_addr;
+	tpr->rx_page_addr += TG3_FRAGSIZE;
+	tpr->rx_page_size -= TG3_FRAGSIZE;
+	return data;
+}
+
+static void tg3_frag_free(bool is_frag, void *data)
+{
+	if (is_frag)
+		put_page(virt_to_head_page(data));
+	else
+		kfree(data);
+}
+
 static void tg3_rx_data_free(struct tg3 *tp, struct ring_info *ri, u32 map_sz)
 {
+	unsigned int skb_size = SKB_DATA_ALIGN(map_sz + TG3_RX_OFFSET(tp)) +
+		   SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
+
 	if (!ri->data)
 		return;
 
 	pci_unmap_single(tp->pdev, dma_unmap_addr(ri, mapping),
 			 map_sz, PCI_DMA_FROMDEVICE);
-	kfree(ri->data);
+	tg3_frag_free(skb_size <= TG3_FRAGSIZE, ri->data);
 	ri->data = NULL;
 }
 
+
 /* Returns size of skb allocated or < 0 on error.
  *
  * We only need to fill in the address because the other members
@@ -5640,7 +5680,8 @@ static void tg3_rx_data_free(struct tg3 *tp, struct ring_info *ri, u32 map_sz)
  * (to fetch the error flags, vlan tag, checksum, and opaque cookie).
  */
 static int tg3_alloc_rx_data(struct tg3 *tp, struct tg3_rx_prodring_set *tpr,
-			    u32 opaque_key, u32 dest_idx_unmasked)
+			     u32 opaque_key, u32 dest_idx_unmasked,
+			     unsigned int *frag_size)
 {
 	struct tg3_rx_buffer_desc *desc;
 	struct ring_info *map;
@@ -5675,7 +5716,13 @@ static int tg3_alloc_rx_data(struct tg3 *tp, struct tg3_rx_prodring_set *tpr,
 	 */
 	skb_size = SKB_DATA_ALIGN(data_size + TG3_RX_OFFSET(tp)) +
 		   SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
-	data = kmalloc(skb_size, GFP_ATOMIC);
+	if (skb_size <= TG3_FRAGSIZE) {
+		data = tg3_frag_alloc(tpr);
+		*frag_size = TG3_FRAGSIZE;
+	} else {
+		data = kmalloc(skb_size, GFP_ATOMIC);
+		*frag_size = 0;
+	}
 	if (!data)
 		return -ENOMEM;
 
@@ -5683,8 +5730,8 @@ static int tg3_alloc_rx_data(struct tg3 *tp, struct tg3_rx_prodring_set *tpr,
 				 data + TG3_RX_OFFSET(tp),
 				 data_size,
 				 PCI_DMA_FROMDEVICE);
-	if (pci_dma_mapping_error(tp->pdev, mapping)) {
-		kfree(data);
+	if (unlikely(pci_dma_mapping_error(tp->pdev, mapping))) {
+		tg3_frag_free(skb_size <= TG3_FRAGSIZE, data);
 		return -EIO;
 	}
 
@@ -5835,18 +5882,19 @@ static int tg3_rx(struct tg3_napi *tnapi, int budget)
 
 		if (len > TG3_RX_COPY_THRESH(tp)) {
 			int skb_size;
+			unsigned int frag_size;
 
 			skb_size = tg3_alloc_rx_data(tp, tpr, opaque_key,
-						    *post_ptr);
+						    *post_ptr, &frag_size);
 			if (skb_size < 0)
 				goto drop_it;
 
 			pci_unmap_single(tp->pdev, dma_addr, skb_size,
 					 PCI_DMA_FROMDEVICE);
 
-			skb = build_skb(data, 0);
+			skb = build_skb(data, frag_size);
 			if (!skb) {
-				kfree(data);
+				tg3_frag_free(frag_size != 0, data);
 				goto drop_it_no_recycle;
 			}
 			skb_reserve(skb, TG3_RX_OFFSET(tp));
@@ -7279,7 +7327,10 @@ static int tg3_rx_prodring_alloc(struct tg3 *tp,
 
 	/* Now allocate fresh SKBs for each rx ring. */
 	for (i = 0; i < tp->rx_pending; i++) {
-		if (tg3_alloc_rx_data(tp, tpr, RXD_OPAQUE_RING_STD, i) < 0) {
+		unsigned int frag_size;
+
+		if (tg3_alloc_rx_data(tp, tpr, RXD_OPAQUE_RING_STD, i,
+				      &frag_size) < 0) {
 			netdev_warn(tp->dev,
 				    "Using a smaller RX standard ring. Only "
 				    "%d out of %d buffers were allocated "
@@ -7311,7 +7362,10 @@ static int tg3_rx_prodring_alloc(struct tg3 *tp,
 	}
 
 	for (i = 0; i < tp->rx_jumbo_pending; i++) {
-		if (tg3_alloc_rx_data(tp, tpr, RXD_OPAQUE_RING_JUMBO, i) < 0) {
+		unsigned int frag_size;
+
+		if (tg3_alloc_rx_data(tp, tpr, RXD_OPAQUE_RING_JUMBO, i,
+				      &frag_size) < 0) {
 			netdev_warn(tp->dev,
 				    "Using a smaller RX jumbo ring. Only %d "
 				    "out of %d buffers were allocated "
diff --git a/drivers/net/ethernet/broadcom/tg3.h b/drivers/net/ethernet/broadcom/tg3.h
index 93865f8..7c85545 100644
--- a/drivers/net/ethernet/broadcom/tg3.h
+++ b/drivers/net/ethernet/broadcom/tg3.h
@@ -2815,6 +2815,8 @@ struct tg3_rx_prodring_set {
 	struct ring_info		*rx_jmb_buffers;
 	dma_addr_t			rx_std_mapping;
 	dma_addr_t			rx_jmb_mapping;
+	void				*rx_page_addr;
+	unsigned int			rx_page_size;
 };
 
 #define TG3_IRQ_MAX_VECS_RSS		5


* [PATCH 3/4 net-next] net: make GRO aware of skb->head_frag
From: Eric Dumazet @ 2012-04-27 10:37 UTC
  To: David Miller
  Cc: netdev, Neal Cardwell, Tom Herbert, Jeff Kirsher, Michael Chan,
	Matt Carlson, Herbert Xu, Ben Hutchings, Ilpo Järvinen,
	Maciej Żenczykowski

From: Eric Dumazet <edumazet@google.com>

GRO can check whether the skb to be merged has its skb->head mapped to a
page fragment instead of a kmalloc() area.

If so, we 'upgrade' skb->head to a fragment in itself.

This avoids the frag_list fallback and permits building a true GRO skb
(one sk_buff and up to 16 fragments), using less memory.

This reduces the number of cache misses when the user makes its copy,
since a single sk_buff is fetched.

This is a follow-up to the patch "net: allow skb->head to be a page fragment".
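
The 'upgrade' amounts to describing the payload in the linear head as a
(page, offset, size) triple. A minimal userspace sketch of that
arithmetic follows; struct frag_model and head_to_frag() are
hypothetical names for this illustration, page_base plays the role of
the page's start address, and offset is the number of header bytes
already consumed.

```c
#include <assert.h>
#include <stddef.h>

struct frag_model {
	const unsigned char *page;	/* start of the containing page */
	size_t page_offset;		/* where the payload starts */
	size_t size;			/* payload length in bytes */
};

/* Describe the payload part of a linear head as a page fragment.
 * head must point inside the page starting at page_base, and
 * offset must not exceed headlen. */
static struct frag_model head_to_frag(const unsigned char *page_base,
				      const unsigned char *head,
				      size_t headlen, size_t offset)
{
	struct frag_model f;

	f.page = page_base;
	f.page_offset = (size_t)(head - page_base) + offset;
	f.size = headlen - offset;
	return f;
}
```

For a head placed 2048 bytes into its page, holding a 1514 byte frame
with 66 bytes of headers already pulled, the fragment starts at page
offset 2114 and is 1448 bytes long.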

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
---
 include/linux/netdevice.h |    2 ++
 include/linux/skbuff.h    |    1 +
 net/core/dev.c            |    5 ++++-
 net/core/skbuff.c         |   27 ++++++++++++++++++++++++++-

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index e0b70e9..7f377fb 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1509,6 +1509,8 @@ struct napi_gro_cb {
 
 	/* Free the skb? */
 	int free;
+#define NAPI_GRO_FREE		  1
+#define NAPI_GRO_FREE_STOLEN_HEAD 2
 };
 
 #define NAPI_GRO_CB(skb) ((struct napi_gro_cb *)(skb)->cb)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 9d28a22..2c75e98 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -561,6 +561,7 @@ static inline struct rtable *skb_rtable(const struct sk_buff *skb)
 extern void kfree_skb(struct sk_buff *skb);
 extern void consume_skb(struct sk_buff *skb);
 extern void	       __kfree_skb(struct sk_buff *skb);
+extern struct kmem_cache *skbuff_head_cache;
 extern struct sk_buff *__alloc_skb(unsigned int size,
 				   gfp_t priority, int fclone, int node);
 extern struct sk_buff *build_skb(void *data, unsigned int frag_size);
diff --git a/net/core/dev.c b/net/core/dev.c
index 501f3cc..a2be59f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3546,7 +3546,10 @@ gro_result_t napi_skb_finish(gro_result_t ret, struct sk_buff *skb)
 		break;
 
 	case GRO_MERGED_FREE:
-		consume_skb(skb);
+		if (NAPI_GRO_CB(skb)->free == NAPI_GRO_FREE_STOLEN_HEAD)
+			kmem_cache_free(skbuff_head_cache, skb);
+		else
+			__kfree_skb(skb);
 		break;
 
 	case GRO_HELD:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index effa75d..2ad1ee7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -69,7 +69,7 @@
 #include <trace/events/skb.h>
 #include <linux/highmem.h>
 
-static struct kmem_cache *skbuff_head_cache __read_mostly;
+struct kmem_cache *skbuff_head_cache __read_mostly;
 static struct kmem_cache *skbuff_fclone_cache __read_mostly;
 
 static void sock_pipe_buf_release(struct pipe_inode_info *pipe,
@@ -2901,6 +2901,31 @@ int skb_gro_receive(struct sk_buff **head, struct sk_buff *skb)
 
 		NAPI_GRO_CB(skb)->free = 1;
 		goto done;
+	} else if (skb->head_frag) {
+		int nr_frags = pinfo->nr_frags;
+		skb_frag_t *frag = pinfo->frags + nr_frags;
+		struct page *page = virt_to_head_page(skb->head);
+		unsigned int first_size = headlen - offset;
+		unsigned int first_offset;
+
+		if (nr_frags + 1 + skbinfo->nr_frags > MAX_SKB_FRAGS)
+			return -E2BIG;
+
+		first_offset = skb->head -
+			       (unsigned char *)page_address(page) +
+			       offset;
+
+		pinfo->nr_frags = nr_frags + 1 + skbinfo->nr_frags;
+
+		frag->page.p	  = page;
+		frag->page_offset = first_offset;
+		skb_frag_size_set(frag, first_size);
+
+		memcpy(frag + 1, skbinfo->frags, sizeof(*frag) * skbinfo->nr_frags);
+		/* We dont need to clear skbinfo->nr_frags here */
+
+		NAPI_GRO_CB(skb)->free = NAPI_GRO_FREE_STOLEN_HEAD;
+		goto done;
 	} else if (skb_gro_len(p) != pinfo->gso_size)
 		return -E2BIG;
 


* [PATCH 4/4 net-next] tcp: makes tcp_try_coalesce aware of skb->head_frag
From: Eric Dumazet @ 2012-04-27 10:38 UTC
  To: David Miller
  Cc: netdev, Neal Cardwell, Jeff Kirsher, Tom Herbert, Herbert Xu,
	Ben Hutchings, Matt Carlson, Michael Chan,
	Maciej Żenczykowski, Ilpo Järvinen

From: Eric Dumazet <edumazet@google.com>

TCP coalesce can check if skb to be merged has its skb->head mapped to a
page fragment, instead of a kmalloc() area.

We had to disable coalescing in this case, for performance reasons.

We 'upgrade' skb->head as a fragment in itself.

This reduces the number of cache misses when the user makes its copies,
since fewer sk_buffs are fetched.

This makes the receive and ofo queues shorter and thus reduces cache line
misses in the TCP stack.

This is a followup of patch "net: allow skb->head to be a page fragment"

Tested with tg3 nic, with GRO on or off. We can see "TCPRcvCoalesce"
counter being incremented.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
---
 net/ipv4/tcp_input.c |   55 ++++++++++++++++++++++++++++++++---------
 1 file changed, 43 insertions(+), 12 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index c93b0cb..96a631d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4464,10 +4464,12 @@ static inline int tcp_try_rmem_schedule(struct sock *sk, unsigned int size)
  */
 static bool tcp_try_coalesce(struct sock *sk,
 			     struct sk_buff *to,
-			     struct sk_buff *from)
+			     struct sk_buff *from,
+			     bool *fragstolen)
 {
-	int len = from->len;
+	int delta, len = from->len;
 
+	*fragstolen = false;
 	if (tcp_hdr(from)->fin)
 		return false;
 	if (len <= skb_tailroom(to)) {
@@ -4478,15 +4480,19 @@ merge:
 		TCP_SKB_CB(to)->ack_seq = TCP_SKB_CB(from)->ack_seq;
 		return true;
 	}
+
+	if (skb_has_frag_list(to) || skb_has_frag_list(from))
+		return false;
+
 	if (skb_headlen(from) == 0 &&
-	    !skb_has_frag_list(to) &&
-	    !skb_has_frag_list(from) &&
 	    (skb_shinfo(to)->nr_frags +
 	     skb_shinfo(from)->nr_frags <= MAX_SKB_FRAGS)) {
-		int delta = from->truesize - ksize(from->head) -
-			    SKB_DATA_ALIGN(sizeof(struct sk_buff));
+		WARN_ON_ONCE(from->head_frag);
+		delta = from->truesize - ksize(from->head) -
+			SKB_DATA_ALIGN(sizeof(struct sk_buff));
 
 		WARN_ON_ONCE(delta < len);
+copyfrags:
 		memcpy(skb_shinfo(to)->frags + skb_shinfo(to)->nr_frags,
 		       skb_shinfo(from)->frags,
 		       skb_shinfo(from)->nr_frags * sizeof(skb_frag_t));
@@ -4499,6 +4505,20 @@ merge:
 		to->data_len += len;
 		goto merge;
 	}
+	if (from->head_frag) {
+		struct page *page;
+		unsigned int offset;
+
+		if (skb_shinfo(to)->nr_frags + skb_shinfo(from)->nr_frags >= MAX_SKB_FRAGS)
+			return false;
+		page = virt_to_head_page(from->head);
+		offset = from->data - (unsigned char *)page_address(page);
+		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
+				   page, offset, skb_headlen(from));
+		*fragstolen = true;
+		delta = len; /* we dont know real truesize... */
+		goto copyfrags;
+	}
 	return false;
 }
 
@@ -4540,10 +4560,15 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
 	end_seq = TCP_SKB_CB(skb)->end_seq;
 
 	if (seq == TCP_SKB_CB(skb1)->end_seq) {
-		if (!tcp_try_coalesce(sk, skb1, skb)) {
+		bool fragstolen;
+
+		if (!tcp_try_coalesce(sk, skb1, skb, &fragstolen)) {
 			__skb_queue_after(&tp->out_of_order_queue, skb1, skb);
 		} else {
-			__kfree_skb(skb);
+			if (fragstolen)
+				kmem_cache_free(skbuff_head_cache, skb);
+			else
+				__kfree_skb(skb);
 			skb = NULL;
 		}
 
@@ -4626,6 +4651,7 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
 	const struct tcphdr *th = tcp_hdr(skb);
 	struct tcp_sock *tp = tcp_sk(sk);
 	int eaten = -1;
+	bool fragstolen = false;
 
 	if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq)
 		goto drop;
@@ -4672,7 +4698,9 @@ queue_and_out:
 				goto drop;
 
 			tail = skb_peek_tail(&sk->sk_receive_queue);
-			eaten = (tail && tcp_try_coalesce(sk, tail, skb)) ? 1 : 0;
+			eaten = (tail &&
+				 tcp_try_coalesce(sk, tail, skb,
+						  &fragstolen)) ? 1 : 0;
 			if (eaten <= 0) {
 				skb_set_owner_r(skb, sk);
 				__skb_queue_tail(&sk->sk_receive_queue, skb);
@@ -4699,9 +4727,12 @@ queue_and_out:
 
 		tcp_fast_path_check(sk);
 
-		if (eaten > 0)
-			__kfree_skb(skb);
-		else if (!sock_flag(sk, SOCK_DEAD))
+		if (eaten > 0) {
+			if (fragstolen)
+				kmem_cache_free(skbuff_head_cache, skb);
+			else
+				__kfree_skb(skb);
+		} else if (!sock_flag(sk, SOCK_DEAD))
 			sk->sk_data_ready(sk, 0);
 		return;
 	}
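
The head-stealing branch of tcp_try_coalesce() above can be sketched in
userspace roughly as follows. This is only a simplified model under stated
assumptions, not the kernel code: struct model_skb, struct model_frag,
MODEL_MAX_FRAGS and model_try_steal_head() are hypothetical names, and
head_page stands in for what virt_to_head_page()/page_address() would give.
It shows why the limit check uses ">=" (the stolen head needs one extra
fragment slot) and how the offset into the page is derived.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FRAGS 17 /* hypothetical stand-in for MAX_SKB_FRAGS */

struct model_frag {
	const unsigned char *page; /* start of the backing page */
	size_t offset;             /* offset of the data inside that page */
	size_t size;
};

struct model_skb {
	const unsigned char *head_page; /* page backing the linear head */
	const unsigned char *data;      /* start of payload in the head */
	size_t headlen;
	bool head_frag;                 /* head lives in a page fragment */
	int nr_frags;
	struct model_frag frags[MODEL_MAX_FRAGS];
};

/* Attach from's head and fragments to 'to'. On success *fragstolen is set,
 * meaning the caller must release only the bookkeeping struct of 'from',
 * because its data is now referenced by 'to'.
 */
bool model_try_steal_head(struct model_skb *to, const struct model_skb *from,
			  bool *fragstolen)
{
	struct model_frag *f;
	int i;

	*fragstolen = false;
	if (!from->head_frag)
		return false;

	/* The head itself consumes one slot, hence ">=" and not ">". */
	if (to->nr_frags + from->nr_frags >= MODEL_MAX_FRAGS)
		return false;

	assert(from->data >= from->head_page);

	/* The linear head becomes a regular fragment of 'to'. */
	f = &to->frags[to->nr_frags++];
	f->page = from->head_page;
	f->offset = (size_t)(from->data - from->head_page);
	f->size = from->headlen;

	/* Then append the fragments 'from' already had. */
	for (i = 0; i < from->nr_frags; i++)
		to->frags[to->nr_frags++] = from->frags[i];

	*fragstolen = true;
	return true;
}
```

In the patch, the callers react to fragstolen by freeing only the sk_buff
head structure (kmem_cache_free) instead of calling __kfree_skb().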

^ permalink raw reply related

* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Sedat Dilek @ 2012-04-27 10:57 UTC (permalink / raw)
  To: Lennert Buytenhek; +Cc: John W. Linville, linux-wireless, netdev, linux-kernel
In-Reply-To: <20120427101732.GX3157@wantstofly.org>

[-- Attachment #1: Type: text/plain, Size: 2084 bytes --]

On Fri, Apr 27, 2012 at 12:17 PM, Lennert Buytenhek
<buytenh@wantstofly.org> wrote:
> On Fri, Apr 27, 2012 at 12:12:13PM +0200, Sedat Dilek wrote:
>
>> >> >> >> Are you planning to or even working on support (for) 8361 devices?
>> >> >> >
>> >> >> > I don't have any such plans, and I don't know of anyone who does.
>> >> >>
>> >> >> Does 8361 require firmware support?
>> >> >
>> >> > What do you mean by that?
>> >>
>> >> IIRC ath5k for example needs no external firmware file.
>> >
>> > The 8361 requires firmware to be loaded into it before it'll do
>> > anything useful.
>> >
>> >
>> >> >> Does a firmware file exist (name?)?
>> >> >
>> >> > There's firmware for the 8361 out there, however, that version of
>> >> > the firmware implements a firmware API that is different from the
>> >> > one that mwl8k currently implements.
>> >> >
>> >> > You could add 8361 support to mwl8k, but then you'd have to go over
>> >> > all the firmware command invocations in mwl8k and make sure that they
>> >> > will work on the 8361 firmware that you're trying to support as well.
>> >>
>> >> Without having a 8361 this will be even harder to walk through.
>> >>
>> >> Anyway, thanks for your detailed explanations.
>> >>
>> >> What's the alternative for such affected users?
>> >> Use ndis-wrapper?
>> >
>> > I'm not sure.  I've never tried to get a 8361 work under Linux.
>>
>> Just found on [1] this same wrong patch in [2] :-).
>> Dunno if [3] worked and from where they have stolen fw-files.
>>
>> - Sedat -
>>
>> [1] https://dev.openwrt.org/ticket/7209
>> [2] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_fix_pci_id.patch
>> [3] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch
>
> I doubt that [3] is really all that's needed to make it work.  But if
> there's someone for whom it works, I'd like them to run some tests on
> mwl8k + [3] on 88w8361p.

Just asked lautriv to do so (he will test against linux-3.3.3), let's see.
Also, I refreshed and slightly adapted the instructions (see attachment).

- Sedat -

[-- Attachment #2: mwl8k_extract_firmware.txt --]
[-- Type: text/plain, Size: 1159 bytes --]

### HELP-1: https://forum.openwrt.org/viewtopic.php?pid=103243#p103243
### HELP-2: https://forum.openwrt.org/viewtopic.php?pid=110004#p110004

### From HELP-1:

To extract the firmware for the mwl8k driver, open your Ubuntu terminal:

sudo apt-get install cabextract unshield

mkdir mwl8k
cd mwl8k

wget -O mrv8k_extract_fw.c 'http://marc.info/?l=linux-wireless&m=126540674419330&q=p3'
wget ftp://downloads.netgear.com/files/wn311t_4_2_setup.exe

mkdir mess
cabextract -F Disk1/data*.* -d mess wn311t_4_2_setup.exe
unshield -g Driver -d mess x mess/Disk1/data1.cab
gcc -o mrv8k_extract_fw mrv8k_extract_fw.c
./mrv8k_extract_fw mess/Driver/netmw145.sys
rm -r mess

md5sum -b *.fw
mv fmimage_8XX1.fw fmimage_8363.fw
mv helper_8XX1.fw helper_8363.fw

### From HELP-2: For Marvell 8361P

md5sum -b *.fw | egrep '48c79b085f7f5a590d3dbc15647e519f|0fe11f415adbbd5e8ca03641705c4a6c'
[ OUTPUT ]
48c79b085f7f5a590d3dbc15647e519f *fmimage_8XX0.fw
0fe11f415adbbd5e8ca03641705c4a6c *helper_8XX0.fw

mv fmimage_8XX0.fw fmimage_8361p.fw
mv helper_8XX0.fw helper_8361p.fw

### PATCH: https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch


-dileks // 27-Apr-2012


^ permalink raw reply

* Re: [PATCH 02/14 v2] usb/net: rndis: break out <linux/rndis.h> defines
From: Linus Walleij @ 2012-04-27 11:24 UTC (permalink / raw)
  To: David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-usb-u79uwXL29TY76Z2rM5mHXA,
	gregkh-hQyY1W1yCW8ekmWlsbkhG0B+6BGkLq7r, balbi-l0cyMroinI0,
	jussi.kivilinna-E01nCVcF24I, haiyangz-0li6OtcxBFHby3iVrkZq2A,
	yongjun_wei-zrsr2BFq86L20UzCJQGyNP8+0UxHXcjY,
	ben-/+tVBieCtBitmTQ+vhA3Yw
In-Reply-To: <20120425.194447.1519804940735129000.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On Thu, Apr 26, 2012 at 1:44 AM, David Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org> wrote:

> After this patch the build emits tons of warnings because you've
> left a situation where there will be duplicate definitions:

Sure, that was on purpose: doing it step-wise so we could
bisect if something broke in the step of ironing out the duplicate
defines. So actually it was being cautious (maybe in a bad way).

But I'll just squash the duplication-removal patches now that
we know it works, so we end up with warning-free patches.

Yours,
Linus Walleij
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH 5/5 net-next] net: makes skb_splice_bits() aware of skb->head_frag
From: Eric Dumazet @ 2012-04-27 12:10 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Neal Cardwell, Jeff Kirsher, Tom Herbert, Herbert Xu,
	Ben Hutchings, Matt Carlson, Michael Chan,
	Maciej Żenczykowski, Ilpo Järvinen

From: Eric Dumazet <edumazet@google.com>

__skb_splice_bits() can check if skb to be spliced has its skb->head
mapped to a page fragment, instead of a kmalloc() area.

If so, we can avoid a copy of the skb head and take a reference on the
underlying page instead.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Maciej Żenczykowski <maze@google.com>
Cc: Neal Cardwell <ncardwell@google.com>
Cc: Tom Herbert <therbert@google.com>
Cc: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: Matt Carlson <mcarlson@broadcom.com>
Cc: Michael Chan <mchan@broadcom.com>
---
 net/core/skbuff.c |   10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2ad1ee7..e5bd086 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1699,14 +1699,18 @@ static bool __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
 			      struct splice_pipe_desc *spd, struct sock *sk)
 {
 	int seg;
+	bool head_is_linear = !skb->head_frag;
 
-	/*
-	 * map the linear part
+	/* map the linear part :
+	 * If skb->head_frag is set, this 'linear' part is backed
+	 * by a fragment, and we can avoid a copy.
 	 */
 	if (__splice_segment(virt_to_page(skb->data),
 			     (unsigned long) skb->data & (PAGE_SIZE - 1),
 			     skb_headlen(skb),
-			     offset, len, skb, spd, true, sk, pipe))
+			     offset, len, skb, spd,
+			     head_is_linear,
+			     sk, pipe))
 		return true;
 
 	/*

^ permalink raw reply related

* Re: [PATCH 00/14 v2] usb/net: rndis: first step toward consolidation
From: Linus Walleij @ 2012-04-27 12:28 UTC (permalink / raw)
  To: Haiyang Zhang
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Greg Kroah-Hartman, David S. Miller, Felipe Balbi,
	Jussi Kivilinna, Wei Yongjun, Ben Hutchings, KY Srinivasan
In-Reply-To: <A1F3067C9B68744AA19F6802BAB8FFDC0DD36C9E-Jfd81uAzPQthtgIzimG1dFir+X/St4rqwBk/1ggFUS45P9zcU8sUGwC/G2K4zDHf@public.gmane.org>

On Wed, Apr 25, 2012 at 10:36 PM, Haiyang Zhang <haiyangz-0li6OtcxBFHby3iVrkZq2A@public.gmane.org> wrote:

> I tested Hyper-V netvsc driver, it works fine with the patches.

Thanks! Can I add your Tested-by: on them?

Yours,
Linus Walleij

^ permalink raw reply

* Re: [PATCH 03/15] batman-adv: add biggest_unsigned_int(x) macro
From: Antonio Quartulli @ 2012-04-27 12:51 UTC (permalink / raw)
  To: David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	david.laight-JxhZ9S5GRejQT0dZR+AlfA,
	b.a.t.m.a.n-ZwoEplunGu2X36UT3dwllkB+6BGkLq7r
In-Reply-To: <20120425.140904.935584797119761625.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On 25.04.2012 20:09, David Miller wrote:
> From: "David Laight" <David.Laight-ZS65k/vG3HxXrIkS9f7CXA@public.gmane.org>
> Date: Wed, 25 Apr 2012 15:57:39 +0100
>
>>
>>> > +/* Returns the biggest unsigned integer with the sizeof x */
>>> > +#define biggest_unsigned_int(x) (~(x)0)
>>>
>>> Nope - biggest_unsigned_int(unsigned char) is probably 0xffffffff
>>
>> To correct myself, it is actually ((int)(-1)) (ie a signed value).
>
> In any event this is just a terrible macro, and too stupid to live.

Hello David,

OK. So the suggestion is to use the ~ operator directly in the code,
without creating a macro.
If the rest of the patchset is OK, I will drop this macro and
resend the pull request.

Cheers,


-- 
Antonio Quartulli
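
The objection to the macro comes down to C integer promotion: for a type
narrower than int, the operand of ~ is promoted to int first, so the result
is a signed -1 rather than the largest value of that type. A minimal sketch
of this, where biggest_unsigned_int_cast() is a hypothetical variant shown
only for comparison and not something proposed in the thread:

```c
#include <assert.h>
#include <limits.h>

/* The macro as posted: the cast binds first, then ~ is applied to a
 * value that has already been promoted to int.
 */
#define biggest_unsigned_int(x) (~(x)0)

/* Hypothetical variant: cast the complemented value back to the type. */
#define biggest_unsigned_int_cast(x) ((x)~(x)0)

void check_promotion(void)
{
	/* (unsigned char)0 is promoted to int, so the result is int -1. */
	assert(biggest_unsigned_int(unsigned char) < 0);

	/* For types at least as wide as int the posted macro does work. */
	assert(biggest_unsigned_int(unsigned int) == UINT_MAX);

	/* Casting back truncates -1 to the all-ones value of the type. */
	assert(biggest_unsigned_int_cast(unsigned char) == UCHAR_MAX);
}
```

Using ~ directly at the call site, as suggested, has the same promotion
behaviour, so the result still needs to end up in a variable (or cast) of
the intended unsigned type.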

^ permalink raw reply

* RE: [PATCH 00/14 v2] usb/net: rndis: first step toward consolidation
From: Haiyang Zhang @ 2012-04-27 12:56 UTC (permalink / raw)
  To: Linus Walleij
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Greg Kroah-Hartman, David S. Miller, Felipe Balbi,
	Jussi Kivilinna, Wei Yongjun, Ben Hutchings, KY Srinivasan
In-Reply-To: <CACRpkdYmeRweAe3YJmZztAJ9A5C_qvVwCmG3M-qwmrsv9Sq79A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>



> -----Original Message-----
> From: Linus Walleij [mailto:linus.walleij-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org]
> Sent: Friday, April 27, 2012 8:29 AM
> To: Haiyang Zhang
> Cc: netdev-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; linux-usb-u79uwXL29TY76Z2rM5mHXA@public.gmane.org; Greg Kroah-
> Hartman; David S. Miller; Felipe Balbi; Jussi Kivilinna; Wei Yongjun; Ben
> Hutchings; KY Srinivasan
> Subject: Re: [PATCH 00/14 v2] usb/net: rndis: first step toward consolidation
> 
> On Wed, Apr 25, 2012 at 10:36 PM, Haiyang Zhang
> <haiyangz-0li6OtcxBFHby3iVrkZq2A@public.gmane.org> wrote:
> 
> > I tested Hyper-V netvsc driver, it works fine with the patches.
> 
> Thanks! Can I add your Tested-by: on them?

Yes, you can, after I test version 3.

Thanks,
- Haiyang

^ permalink raw reply

* Re: [PATCH] mwl8k: Add 0x2a02 PCI device-id (Marvell 88W8361)
From: Sedat Dilek @ 2012-04-27 13:29 UTC (permalink / raw)
  To: Lennert Buytenhek
  Cc: John W. Linville, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, lautriv, Jim Cromie
In-Reply-To: <CA+icZUUnJPeDmL9xXfvj62an40bXwwaOnrJmxmLN_cBt9PyCZA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

[-- Attachment #1: Type: text/plain, Size: 5077 bytes --]

On Fri, Apr 27, 2012 at 12:57 PM, Sedat Dilek
<sedat.dilek-gM/Ye1E23mwN+BqQ9rBEUg@public.gmane.org> wrote:
> On Fri, Apr 27, 2012 at 12:17 PM, Lennert Buytenhek
> <buytenh-OLH4Qvv75CYX/NnBR394Jw@public.gmane.org> wrote:
>> On Fri, Apr 27, 2012 at 12:12:13PM +0200, Sedat Dilek wrote:
>>
>>> >> >> >> Are you planning to or even working on support (for) 8361 devices?
>>> >> >> >
>>> >> >> > I don't have any such plans, and I don't know of anyone who does.
>>> >> >>
>>> >> >> Does 8361 require firmware support?
>>> >> >
>>> >> > What do you mean by that?
>>> >>
>>> >> IIRC ath5k for example needs no external firmware file.
>>> >
>>> > The 8361 requires firmware to be loaded into it before it'll do
>>> > anything useful.
>>> >
>>> >
>>> >> >> Does a firmware file exist (name?)?
>>> >> >
>>> >> > There's firmware for the 8361 out there, however, that version of
>>> >> > the firmware implements a firmware API that is different from the
>>> >> > one that mwl8k currently implements.
>>> >> >
>>> >> > You could add 8361 support to mwl8k, but then you'd have to go over
>>> >> > all the firmware command invocations in mwl8k and make sure that they
>>> >> > will work on the 8361 firmware that you're trying to support as well.
>>> >>
>>> >> Without having a 8361 this will be even harder to walk through.
>>> >>
>>> >> Anyway, thanks for your detailed explanations.
>>> >>
>>> >> What's the alternative for such affected users?
>>> >> Use ndis-wrapper?
>>> >
>>> > I'm not sure.  I've never tried to get a 8361 work under Linux.
>>>
>>> Just found on [1] this same wrong patch in [2] :-).
>>> Dunno if [3] worked and from where they have stolen fw-files.
>>>
>>> - Sedat -
>>>
>>> [1] https://dev.openwrt.org/ticket/7209
>>> [2] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_fix_pci_id.patch
>>> [3] https://dev.openwrt.org/attachment/ticket/7209/mwl8k_8361p.patch
>>
>> I doubt that [3] is really all that's needed to make it work.  But if
>> there's someone for whom it works, I'd like them to run some tests on
>> mwl8k + [3] on 88w8361p.
>
> Just asked lautriv to do so (he will test against linux-3.3.3), let's see.
> Also, I refreshed and slightly adapted the instructions (see attachment).
>
> - Sedat -

[ CC Jim Cromie ]

Now, with the extracted firmware files renamed and copied to
/lib/firmware/mwl8k/ and the patch (see attachment) applied against
Linux-3.4-rc4+, modinfo shows:

$ sudo modinfo mwl8k
filename:       /lib/modules/3.4.0-rc4-5-generic/kernel/drivers/net/wireless/mwl8k.ko
license:        GPL
author:         Lennert Buytenhek <buytenh-eYqpPyKDWXRBDgjK7y7TUQ@public.gmane.org>
version:        0.13
description:    Marvell TOPDOG(R) 802.11 Wireless Network Driver
firmware:       mwl8k/fmimage_8366_ap-2.fw
firmware:       mwl8k/fmimage_8366.fw
firmware:       mwl8k/helper_8366.fw
firmware:       mwl8k/fmimage_8687.fw
firmware:       mwl8k/helper_8687.fw
firmware:       mwl8k/fmimage_8363.fw
firmware:       mwl8k/helper_8363.fw
firmware:       mwl8k/fmimage_8361p.fw
firmware:       mwl8k/helper_8361p.fw
srcversion:     9E1479A05C8D67E6AE90746
alias:          pci:v000011ABd00002A43sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A40sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A30sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A2Bsv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A24sv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A0Csv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A0Asv*sd*bc*sc*i*
alias:          pci:v000011ABd00002A02sv*sd*bc*sc*i* <--- 8361P: 0x2a02 PCI device-id
depends:        mac80211,cfg80211
vermagic:       3.4.0-rc4-5-generic SMP mod_unload modversions
parm:           ap_mode_default:Set to 1 to make ap mode the default instead of sta mode (bool)

$ ls -l /lib/firmware/mwl8k/
total 456
-rw-r--r-- 1 root root  75848 Apr 27 13:49 fmimage_8361p.fw <--- 8361P: Firmware image
-rw-r--r-- 1 root root 101780 Mar 19 19:32 fmimage_8366_ap-1.fw
-rw-r--r-- 1 root root 101976 Mar 19 19:32 fmimage_8366_ap-2.fw
-rw-r--r-- 1 root root  96664 Mar 19 19:32 fmimage_8366.fw
-rw-r--r-- 1 root root  73252 Feb 23 20:07 fmimage_8687.fw
-rw-r--r-- 1 root root   2476 Apr 27 13:49 helper_8361p.fw <--- 8361P: Helper image
-rw-r--r-- 1 root root   2476 Mar 19 19:32 helper_8366.fw
-rw-r--r-- 1 root root   2476 Feb 23 20:07 helper_8687.fw

As said... ***compile-tested*** only here.

I had a short query with lautriv on #linux-wireless this afternoon:
* mwl8k kernel-module was autoloaded
* wlan0 interface got fired up
* ESSID was accepted
* logs reported 802.11bgn support is active
* (he tested on Linux-3.3.3)

With WPA/WPA2 lautriv had some problems, as his installation was
missing wpasupplicant.
He set up a classic /etc/network/interfaces.
After resetting his router, mwl8k worked nicely.

Unfortunately, he was in a hurry and promised to send logs and do
more testing this evening.

So, Lennert, if you want more testing, tell me what to test and how.

Thanks in advance.

- Sedat -

[-- Attachment #2: 0001-mwl8k-Add-support-for-MWL8361P.patch --]
[-- Type: application/octet-stream, Size: 2171 bytes --]

From 1a586a1f21f820de0ea06c5382a69191d93d80fe Mon Sep 17 00:00:00 2001
From: Sedat Dilek <sedat.dilek@gmail.com>
Date: Fri, 27 Apr 2012 13:46:02 +0200
Subject: [PATCH] mwl8k: Add support for MWL8361P

See the discussion in [1] for more details about adding driver
support and how to extract the firmware files.

$ sudo modinfo mwl8k | egrep -i '8361p|2a02'
firmware:       mwl8k/fmimage_8361p.fw
firmware:       mwl8k/helper_8361p.fw
alias:          pci:v000011ABd00002A02sv*sd*bc*sc*i*

Compile-tested against Linux-3.4-rc4+.

[1] http://marc.info/?t=133551792600004&r=1&w=2

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
---
 drivers/net/wireless/mwl8k.c |    9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/net/wireless/mwl8k.c b/drivers/net/wireless/mwl8k.c
index b48674b..3ea9b8e 100644
--- a/drivers/net/wireless/mwl8k.c
+++ b/drivers/net/wireless/mwl8k.c
@@ -5225,6 +5225,7 @@ static void mwl8k_finalize_join_worker(struct work_struct *work)
 }
 
 enum {
+	MWL8361P = 0,
 	MWL8363 = 0,
 	MWL8687,
 	MWL8366,
@@ -5235,6 +5236,11 @@ enum {
 #define MWL8K_8366_AP_FW(api) _MWL8K_8366_AP_FW(api)
 
 static struct mwl8k_device_info mwl8k_info_tbl[] __devinitdata = {
+	[MWL8361P] = {
+		.part_name      = "88w8361p",
+		.helper_image   = "mwl8k/helper_8361p.fw",
+		.fw_image_sta	= "mwl8k/fmimage_8361p.fw",
+	},
 	[MWL8363] = {
 		.part_name	= "88w8363",
 		.helper_image	= "mwl8k/helper_8363.fw",
@@ -5255,6 +5261,8 @@ static struct mwl8k_device_info mwl8k_info_tbl[] __devinitdata = {
 	},
 };
 
+MODULE_FIRMWARE("mwl8k/helper_8361p.fw");
+MODULE_FIRMWARE("mwl8k/fmimage_8361p.fw");
 MODULE_FIRMWARE("mwl8k/helper_8363.fw");
 MODULE_FIRMWARE("mwl8k/fmimage_8363.fw");
 MODULE_FIRMWARE("mwl8k/helper_8687.fw");
@@ -5264,6 +5272,7 @@ MODULE_FIRMWARE("mwl8k/fmimage_8366.fw");
 MODULE_FIRMWARE(MWL8K_8366_AP_FW(MWL8K_8366_AP_FW_API));
 
 static DEFINE_PCI_DEVICE_TABLE(mwl8k_pci_id_table) = {
+	{ PCI_VDEVICE(MARVELL, 0x2a02), .driver_data = MWL8361P, },
 	{ PCI_VDEVICE(MARVELL, 0x2a0a), .driver_data = MWL8363, },
 	{ PCI_VDEVICE(MARVELL, 0x2a0c), .driver_data = MWL8363, },
 	{ PCI_VDEVICE(MARVELL, 0x2a24), .driver_data = MWL8363, },
-- 
1.7.9.5
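
One thing worth checking in the first hunk: MWL8361P and MWL8363 are both
given the value 0, so both designated initializers in mwl8k_info_tbl[]
target the same slot, and in C the later initializer for a slot overrides
the earlier one. A reduced sketch of that effect, assuming nothing beyond
standard C (the MODEL_* names, struct model_info and model_part_name() are
hypothetical and only mirror the shape of the patch):

```c
#include <assert.h>
#include <string.h>

enum {
	MODEL_8361P = 0,
	MODEL_8363 = 0, /* same value as MODEL_8361P, as in the hunk above */
	MODEL_8687,
};

struct model_info {
	const char *part_name;
};

static const struct model_info model_tbl[] = {
	[MODEL_8361P] = { .part_name = "88w8361p" },
	[MODEL_8363]  = { .part_name = "88w8363" }, /* overrides slot 0 */
	[MODEL_8687]  = { .part_name = "88w8687" },
};

/* Look up the part name the way a driver_data index would. */
const char *model_part_name(int id)
{
	assert(id >= 0 && id < (int)(sizeof(model_tbl) / sizeof(model_tbl[0])));
	return model_tbl[id].part_name;
}
```

With this layout a lookup through MODEL_8361P returns the 8363 entry, so a
0x2a02 device would presumably be handed the 8363 firmware names. Dropping
the "= 0" from the second enumerator would give each part its own slot;
some compilers can also flag the duplicate (for example with GCC's
-Woverride-init).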


^ permalink raw reply related

* Re: [PATCH v4 1/3] make jump_labels wait while updates are in place
From: Jason Baron @ 2012-04-27 13:53 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Glauber Costa, cgroups, netdev, linux-kernel, Li Zefan, Tejun Heo,
	kamezawa.hiroyu, linux-mm, devel, Johannes Weiner, Michal Hocko,
	Ingo Molnar
In-Reply-To: <20120427004305.GC23877@home.goodmis.org>

On Thu, Apr 26, 2012 at 08:43:06PM -0400, Steven Rostedt wrote:
> On Thu, Apr 26, 2012 at 07:51:05PM -0300, Glauber Costa wrote:
> > In mem cgroup, we need to guarantee that two concurrent updates
> > of the jump_label interface wait for each other. IOW, we can't have
> > other updates returning while the first one is still patching the
> > kernel around, otherwise we'll race.
> 
> But it shouldn't. The code as is should prevent that.
> 
> > 
> > I believe this is something that can fit well in the static branch
> > API, without noticeable disadvantages:
> > 
> > * in the common case, it will be a quite simple lock/unlock operation
> > * Every context that calls static_branch_slow* already expects to be
> >   in sleeping context because it will mutex_lock the unlikely case.
> > * static_key_slow_inc is not expected to be called in any fast path,
> >   otherwise it would be expected to have quite a different name. Therefore
> >   the mutex + atomic combination instead of just an atomic should not kill
> >   us.
> > 
> > Signed-off-by: Glauber Costa <glommer@parallels.com>
> > CC: Tejun Heo <tj@kernel.org>
> > CC: Li Zefan <lizefan@huawei.com>
> > CC: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
> > CC: Johannes Weiner <hannes@cmpxchg.org>
> > CC: Michal Hocko <mhocko@suse.cz>
> > CC: Ingo Molnar <mingo@elte.hu>
> > CC: Jason Baron <jbaron@redhat.com>
> > ---
> >  kernel/jump_label.c |   21 +++++++++++----------
> >  1 files changed, 11 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/jump_label.c b/kernel/jump_label.c
> > index 4304919..5d09cb4 100644
> > --- a/kernel/jump_label.c
> > +++ b/kernel/jump_label.c
> > @@ -57,17 +57,16 @@ static void jump_label_update(struct static_key *key, int enable);
> >  
> >  void static_key_slow_inc(struct static_key *key)
> >  {
> > +	jump_label_lock();
> >  	if (atomic_inc_not_zero(&key->enabled))
> > -		return;
> 
> If key->enabled is not zero, there's nothing to be done. As the jump
> label has already been enabled. Note, the key->enabled doesn't get set
> until after the jump label is updated. Thus, if two tasks were to come
> in, they both would be locked on the jump_label_lock().
> 

Right, for x86 which uses stop_machine currently, we guarantee that all
cpus are going to see the updated code, before the inc of key->enabled.
However, other arches (sparc, mips, powerpc, for example), seem to be
using much lighter weight updates, which I hope are ok :)

Thanks,

-Jason

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH v4 1/3] make jump_labels wait while updates are in place
From: Steven Rostedt @ 2012-04-27 14:07 UTC (permalink / raw)
  To: Jason Baron
  Cc: Glauber Costa, cgroups, netdev, linux-kernel, Li Zefan, Tejun Heo,
	kamezawa.hiroyu, linux-mm, devel, Johannes Weiner, Michal Hocko,
	Ingo Molnar
In-Reply-To: <20120427135320.GA13762@redhat.com>

On Fri, 2012-04-27 at 09:53 -0400, Jason Baron wrote:

> Right, for x86 which uses stop_machine currently, we guarantee that all
> cpus are going to see the updated code, before the inc of key->enabled.
> However, other arches (sparc, mips, powerpc, for example), seem to be
> using much lighter weight updates, which I hope are ok :)

And x86 will soon be removing stop_machine() from its path too. But all
archs should perform some kind of memory sync after patching code. Thus
the update should be treated as if a memory barrier was added after it,
and before the inc.

-- Steve
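
The ordering discussed above (lock-free fast path when the key is already
enabled, otherwise patch under the lock and only then make the count
non-zero) can be modelled in userspace with C11 atomics. This is only a
sketch under assumptions, not the kernel implementation: struct model_key,
model_lock, model_inc_not_zero() and model_key_slow_inc() are hypothetical
names, a spinning atomic_flag stands in for the jump label mutex, and
patch_count stands in for the actual code patching.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct model_key {
	atomic_int enabled; /* reference count; non-zero means "patched in" */
	int patch_count;    /* how often the (pretend) code patching ran */
};

/* Stand-in for the jump label mutex. */
static atomic_flag model_lock = ATOMIC_FLAG_INIT;

/* Increment *v unless it is zero; returns true if it was incremented. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure 'old' is reloaded and the loop re-checks it. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

void model_key_slow_inc(struct model_key *key)
{
	/* Fast path: already enabled, so only the count changes. */
	if (model_inc_not_zero(&key->enabled))
		return;

	while (atomic_flag_test_and_set(&model_lock))
		; /* spin until the lock is free */

	if (atomic_load(&key->enabled) == 0) {
		key->patch_count++;             /* patch the code first ... */
		atomic_store(&key->enabled, 1); /* ... then publish the count */
	} else {
		/* Someone else enabled the key while we waited. */
		atomic_fetch_add(&key->enabled, 1);
	}

	atomic_flag_clear(&model_lock);
	assert(atomic_load(&key->enabled) > 0);
}
```

Because the count only becomes non-zero after the patching step, a second
caller either takes the fast path (patching is complete) or waits on the
lock. The default sequentially consistent atomics play the role of the
memory barrier mentioned above.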


^ permalink raw reply

* Re: [net-next 1/4 (V3)] net: ethtool: add the EEE support
From: Giuseppe CAVALLARO @ 2012-04-27 14:11 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, rayagond, davem
In-Reply-To: <1335460660.2712.15.camel@bwh-desktop.uk.solarflarecom.com>

On 4/26/2012 7:17 PM, Ben Hutchings wrote:
> On Thu, 2012-04-26 at 09:48 +0200, Giuseppe CAVALLARO wrote:
>> Hello Ben
>>
>> On 4/19/2012 5:30 PM, Ben Hutchings wrote:
>> [snip]
>>>> I'm changing the code for getting/setting the EEE capability and trying
>>>> to follow your suggestions.
>>>>
>>>> The "get" will show the following things; this is a bit different of the
>>>> points "a" "b" and "c" we had discussed. Maybe, this could also be a
>>>> more complete (*) .
>>>> The ethtool (see output below as example) could report the phy
>>>> (supported/advertised/lp_advertised) and mac eee capabilities separately.
>>>
>>> Sounds reasonable.
>>>
>>>> The "set" will be useful for some eth devices (like the stmmac) that can
>>>> stop/enable internally the eee capability (at mac level).
>>>
>>> I don't know much about EEE, but shouldn't the driver take care of
>>> configuring the MAC for this whenever the PHY is set to advertise EEE
>>> capability?
>>
>> Yes indeed this can be done at driver level. So could I definitely
>> remove it from ethtool? What do you suggest?
>>
>> In case of the stmmac I could add a specific driver option via sys to
>> enable/disable the eee and set timer.
> 
> Generally, ethtool doesn't distinguish MAC and PHY settings because they
> have to be configured consistently for the device to do anything useful.
> If there is some good use for enabling EEE in the MAC and not the PHY,
> or vice versa, then this should be exposed in the ethtool interface.
> But if not then I don't believe it needs to be in either an ethtool or a
> driver-specific interface.

Thanks, Ben, for this clarification: in the case of stmmac, the option is
useful to stop the timer used to enter the LPI state for tx.
So it is worth having, and worth exposing through ethtool.

> 
>>>> [snip]
>>>>
>>>> 	Current message level: 0x0000003f (63)
>>>> 			       drv probe link timer ifdown ifup
>>>> 	Link detected: yes
>>>> 	Energy-Efficient Ethernet:           -------------------------
>>>> 		MAC supports: yes            |-> related to MAC side  |
>>>> 		phy supports modes: ...      |-> from MMD 3.20        |
>>>> 		phy advertising modes: ...   |-> from MMD 7.60        |
>>>> 		LP advertising modes: ...    |-> from MMD 7.61        |
>>>> 					     --------------------------
>>>>                                                     (*)
>>>> PS. The "..." above means that we can actually dump: 100BASE-TX EEE etc
>>>> for each advertising modes and also for phy support (reg 3.20).
>>>
>>> Yes, that's the sort of information I would expect to see (but try to be
>>> consistent with the wording used for AN).:
>>
>> e.g. SUPPORTED_100baseT_EEE ... ADVERTISED_<...>
> 
> I meant the wording used in the ethtool output: 'supported',
> 'advertised', 'link partner advertised' rather than 'phy supports',
> 'phy advertising', 'LP advertising'.

ok :-)

> 
>>> The EEE advertising mask presumably should be changeable similarly to
>>> the AN advertising mask ('ethtool -s <devname> eeeadv <mask>').  But I
>>> don't know how the two advertising masks interact.  Is one supposed to
>>> be a subset of the other?  Currently ethtool automatically changes the
>>> AN advertising mask in response to a speed/duplex change; should it also
>>> try to change the EEE advertising mask?
>>
>> I've just verified the IEEE (Table 45–150a—EEE advertisement register
>> (Register 7.60) bit definitions) and sorry for my delay in reply but I
>> was in trouble because looking at the registers for the phy (I am using)
>> the reg 7.60 was in RO and I couldn't understand how to set the mask.
>> I confirm that the Adv reg from the std is R/W and the mask as you
>> suggest could be set according to the speed.
>> The EEE should work on duplex mode only.
>>
>> I wonder so if if the final patch I should have no new option for the
>> ethtool command and EEE info are directly passed from the kernel like
>> speed and duplex when call get_settings.
> 
> Are you suggesting to define EEE mode flags in the existing supported,
> advertising and lp_advertising masks?

Yes but I was wrong, I can use the existing flags.

Regards
Peppe

> 
> Ben.
> 
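
How the three register values discussed above (MMD 3.20 supported, 7.60
advertised, 7.61 link partner advertised) could be combined is sketched
below. Treat it as an illustration under assumptions: the MODEL_EEE_* bit
values are how I understand the 802.3az layout and should be checked
against the standard, and model_eee_resolved() is a hypothetical helper,
not an existing ethtool or phylib function.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, shared by registers 3.20, 7.60 and 7.61. */
#define MODEL_EEE_100TX 0x0002u /* 100BASE-TX EEE */
#define MODEL_EEE_1000T 0x0004u /* 1000BASE-T EEE */
#define MODEL_EEE_10GT  0x0008u /* 10GBASE-T EEE */

/* EEE is only usable for modes that the local PHY supports and that
 * both link partners advertise.
 */
uint16_t model_eee_resolved(uint16_t supported, uint16_t advertised,
			    uint16_t lp_advertised)
{
	/* Advertising an unsupported mode would be a configuration error. */
	assert((advertised & (uint16_t)~supported) == 0);

	return supported & advertised & lp_advertised;
}
```

For example, a PHY that supports and advertises 100BASE-TX and 1000BASE-T
EEE, facing a partner that advertises 1000BASE-T and 10GBASE-T EEE, ends up
with 1000BASE-T EEE only.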

^ permalink raw reply

* Re: [PATCH v4 1/3] make jump_labels wait while updates are in place
From: Glauber Costa @ 2012-04-27 14:59 UTC (permalink / raw)
  To: Jason Baron
  Cc: Steven Rostedt, cgroups, netdev, linux-kernel, Li Zefan,
	Tejun Heo, kamezawa.hiroyu, linux-mm, devel, Johannes Weiner,
	Michal Hocko, Ingo Molnar
In-Reply-To: <20120427135320.GA13762@redhat.com>

On 04/27/2012 10:53 AM, Jason Baron wrote:
> On Thu, Apr 26, 2012 at 08:43:06PM -0400, Steven Rostedt wrote:
>> On Thu, Apr 26, 2012 at 07:51:05PM -0300, Glauber Costa wrote:
>>> In mem cgroup, we need to guarantee that two concurrent updates
>>> of the jump_label interface wait for each other. IOW, we can't have
>>> other updates returning while the first one is still patching the
>>> kernel around, otherwise we'll race.
>>
>> But it shouldn't. The code as is should prevent that.
>>
>>>
>>> I believe this is something that can fit well in the static branch
>>> API, without noticeable disadvantages:
>>>
>>> * in the common case, it will be a quite simple lock/unlock operation
>>> * Every context that calls static_branch_slow* already expects to be
>>>    in sleeping context because it will mutex_lock the unlikely case.
>>> * static_key_slow_inc is not expected to be called in any fast path,
>>>    otherwise it would be expected to have quite a different name. Therefore
>>>    the mutex + atomic combination instead of just an atomic should not kill
>>>    us.
>>>
>>> Signed-off-by: Glauber Costa<glommer@parallels.com>
>>> CC: Tejun Heo<tj@kernel.org>
>>> CC: Li Zefan<lizefan@huawei.com>
>>> CC: Kamezawa Hiroyuki<kamezawa.hiroyu@jp.fujitsu.com>
>>> CC: Johannes Weiner<hannes@cmpxchg.org>
>>> CC: Michal Hocko<mhocko@suse.cz>
>>> CC: Ingo Molnar<mingo@elte.hu>
>>> CC: Jason Baron<jbaron@redhat.com>
>>> ---
>>>   kernel/jump_label.c |   21 +++++++++++----------
>>>   1 files changed, 11 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/kernel/jump_label.c b/kernel/jump_label.c
>>> index 4304919..5d09cb4 100644
>>> --- a/kernel/jump_label.c
>>> +++ b/kernel/jump_label.c
>>> @@ -57,17 +57,16 @@ static void jump_label_update(struct static_key *key, int enable);
>>>
>>>   void static_key_slow_inc(struct static_key *key)
>>>   {
>>> +	jump_label_lock();
>>>   	if (atomic_inc_not_zero(&key->enabled))
>>> -		return;
>>
>> If key->enabled is not zero, there's nothing to be done. As the jump
>> label has already been enabled. Note, the key->enabled doesn't get set
>> until after the jump label is updated. Thus, if two tasks were to come
>> in, they both would be locked on the jump_label_lock().
>>
>

Okay, we seem to have been tricked by the usage of atomic while 
analyzing this. The fact that the atomic update happens after the code 
is patched seems enough to guarantee what we need, now that I read it 
again (and it seems so obvious =p )
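
For the record, the ordering argument can be sketched in userspace roughly as
follows. This is only a model loosely based on how static_key_slow_inc()
reads to me: the model_* names are invented, a pthread mutex stands in for
jump_label_lock(), and a counter stands in for jump_label_update().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; this is not kernel code. */
struct model_key {
	atomic_int enabled;	/* reference count; 0 means "branch disabled" */
	int patch_count;	/* how many times the "code" was patched */
};

static pthread_mutex_t model_jump_label_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Like atomic_inc_not_zero(): increment only if the value is non-zero. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

static void model_slow_inc(struct model_key *key)
{
	assert(key != NULL);

	/* Fast path: non-zero means an earlier caller finished patching. */
	if (model_inc_not_zero(&key->enabled))
		return;

	pthread_mutex_lock(&model_jump_label_mutex);
	if (atomic_load(&key->enabled) == 0)
		key->patch_count++;		/* stands in for jump_label_update() */
	atomic_fetch_add(&key->enabled, 1);	/* published only after patching */
	pthread_mutex_unlock(&model_jump_label_mutex);
}
```

Because the count only becomes non-zero after the patching step, a second
caller either takes the fast path (patching is already done) or blocks on the
mutex until the first caller has finished.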


^ permalink raw reply

* [PATCH] tcp: clean up use of jiffies in tcp_rcv_rtt_measure()
From: Neal Cardwell @ 2012-04-27 15:29 UTC (permalink / raw)
  To: David Miller
  Cc: netdev, Eric Dumazet, Nandita Dukkipati, Yuchung Cheng,
	ilpo.jarvinen, Tom Herbert, Neal Cardwell

Clean up a reference to jiffies in tcp_rcv_rtt_measure() that should
instead reference tcp_time_stamp. Since the result of the subtraction
is passed into a function taking u32, this should not change any
behavior (and indeed the generated assembly does not change on
x86_64). However, it seems worth cleaning this up for consistency and
clarity (and perhaps to avoid bugs if this is copied and pasted
somewhere else).

Signed-off-by: Neal Cardwell <ncardwell@google.com>
---
 net/ipv4/tcp_input.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 3ff36406..2a702e3 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -495,7 +495,7 @@ static inline void tcp_rcv_rtt_measure(struct tcp_sock *tp)
 		goto new_measure;
 	if (before(tp->rcv_nxt, tp->rcv_rtt_est.seq))
 		return;
-	tcp_rcv_rtt_update(tp, jiffies - tp->rcv_rtt_est.time, 1);
+	tcp_rcv_rtt_update(tp, tcp_time_stamp - tp->rcv_rtt_est.time, 1);

 new_measure:
 	tp->rcv_rtt_est.seq = tp->rcv_nxt + tp->rcv_wnd;
-- 
1.7.7.3
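
A small userspace sketch of why the change is behavior-neutral, assuming (as
on x86_64) that jiffies is 64 bits wide, that tcp_time_stamp is jiffies
truncated to u32, and that the stored timestamp is a u32. The function names
are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Old form: 64-bit subtraction, truncated when passed as a u32 argument. */
static uint32_t delta_from_wide(uint64_t jiffies_now, uint32_t stamp)
{
	return (uint32_t)(jiffies_now - stamp);
}

/* New form: truncate jiffies to u32 first, then subtract modulo 2^32. */
static uint32_t delta_from_u32(uint64_t jiffies_now, uint32_t stamp)
{
	return (uint32_t)jiffies_now - stamp;
}
```

Both forms compute the difference modulo 2^32, so they agree even when the
low 32 bits of jiffies have wrapped since the timestamp was taken.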

^ permalink raw reply related

* Re: [PATCH] tcp: clean up use of jiffies in tcp_rcv_rtt_measure()
From: Eric Dumazet @ 2012-04-27 15:56 UTC (permalink / raw)
  To: Neal Cardwell
  Cc: David Miller, netdev, Eric Dumazet, Nandita Dukkipati,
	Yuchung Cheng, ilpo.jarvinen, Tom Herbert
In-Reply-To: <1335540577-32554-1-git-send-email-ncardwell@google.com>

On Fri, 2012-04-27 at 11:29 -0400, Neal Cardwell wrote:
> Clean up a reference to jiffies in tcp_rcv_rtt_measure() that should
> instead reference tcp_time_stamp. Since the result of the subtraction
> is passed into a function taking u32, this should not change any
> behavior (and indeed the generated assembly does not change on
> x86_64). However, it seems worth cleaning this up for consistency and
> clarity (and perhaps to avoid bugs if this is copied and pasted
> somewhere else).
> 
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> ---

Acked-by: Eric Dumazet <edumazet@google.com>

^ permalink raw reply

* Re: [PATCH 2/4 net-next] tg3: provide frags as skb head
From: Ben Hutchings @ 2012-04-27 16:33 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Ilpo Järvinen, Tom Herbert,
	Neal Cardwell, Matt Carlson, Jeff Kirsher, Michael Chan,
	Herbert Xu, Maciej Żenczykowski
In-Reply-To: <1335522889.2775.231.camel@edumazet-glaptop>

On Fri, 2012-04-27 at 12:34 +0200, Eric Dumazet wrote:
> From: Eric Dumazet <edumazet@google.com>
> 
> This patch converts tg3 driver, one of our reference drivers, to use new
> build_skb() api in frag mode.
> 
> Instead of using kmalloc() to allocate the memory block that will be
> used by build_skb() as skb->head, we use a page fragment.
> 
> This is a followup of patch "net: allow skb->head to be a page fragment"
> 
> This allows GRO, TCP coalescing, and splice() to be more efficient.
> 
> Incidentally, this also removes SLUB slow path contention in kfree()
[...]
> @@ -5617,17 +5626,48 @@ static void tg3_tx(struct tg3_napi *tnapi)
>  	}
>  }
>  
> +static void *tg3_frag_alloc(struct tg3_rx_prodring_set *tpr)
> +{
> +	void *data;
> +
> +	if (tpr->rx_page_size < TG3_FRAGSIZE) {
> +		struct page *page = alloc_page(GFP_ATOMIC);
> +
> +		if (!page)
> +			return NULL;
> +		atomic_add((PAGE_SIZE / TG3_FRAGSIZE) - 1, &page->_count); 
> +		tpr->rx_page_addr = page_address(page);
[...]

page_address() may return NULL if CONFIG_HIGHMEM is enabled.  And that's
going to be a general problem with this scheme on 32-bit machines.
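
For readers following along, the reference counting idea in the quoted hunk
(one page carved into several fragments, with the page count raised by the
number of extra fragments) can be modeled in userspace like this. It is not
the elided remainder of tg3_frag_alloc(); the model_* names, the 4 KiB page
size, and the 2048-byte fragment size are all assumptions made for the
sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */
#define MODEL_FRAGSIZE  2048u	/* hypothetical, not TG3_FRAGSIZE's real value */

struct model_page {
	int refcount;		/* stands in for page->_count */
	unsigned char *mem;
};

struct model_ring {
	struct model_page *page;	/* page currently being carved up */
	size_t remaining;		/* bytes of it not yet handed out */
};

/* Hand out the next fragment, starting a new page when the old one is used up. */
static void *model_frag_alloc(struct model_ring *r)
{
	void *frag;

	if (r->remaining < MODEL_FRAGSIZE) {
		struct model_page *p = malloc(sizeof(*p));

		if (!p)
			return NULL;
		p->mem = malloc(MODEL_PAGE_SIZE);
		if (!p->mem) {
			free(p);
			return NULL;
		}
		/* A fresh page holds one reference; add one per extra fragment. */
		p->refcount = 1 + (int)(MODEL_PAGE_SIZE / MODEL_FRAGSIZE - 1);
		r->page = p;
		r->remaining = MODEL_PAGE_SIZE;
	}
	frag = r->page->mem + (MODEL_PAGE_SIZE - r->remaining);
	r->remaining -= MODEL_FRAGSIZE;
	return frag;
}

/* Drop one fragment's reference; returns 1 once the whole page was freed. */
static int model_frag_free(struct model_page *p)
{
	if (--p->refcount > 0)
		return 0;
	free(p->mem);
	free(p);
	return 1;
}
```

With these sizes each page yields two fragments, and the page is only
released after both fragments have been freed.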

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: [PATCH] tcp: clean up use of jiffies in tcp_rcv_rtt_measure()
From: David Miller @ 2012-04-27 16:35 UTC (permalink / raw)
  To: ncardwell; +Cc: netdev, edumazet, nanditad, ycheng, ilpo.jarvinen, therbert
In-Reply-To: <1335540577-32554-1-git-send-email-ncardwell@google.com>

From: Neal Cardwell <ncardwell@google.com>
Date: Fri, 27 Apr 2012 11:29:37 -0400

> Clean up a reference to jiffies in tcp_rcv_rtt_measure() that should
> instead reference tcp_time_stamp. Since the result of the subtraction
> is passed into a function taking u32, this should not change any
> behavior (and indeed the generated assembly does not change on
> x86_64). However, it seems worth cleaning this up for consistency and
> clarity (and perhaps to avoid bugs if this is copied and pasted
> somewhere else).
> 
> Signed-off-by: Neal Cardwell <ncardwell@google.com>

Applied, thanks.

^ permalink raw reply

* Re: [PATCH 2/4 net-next] tg3: provide frags as skb head
From: David Miller @ 2012-04-27 16:38 UTC (permalink / raw)
  To: bhutchings
  Cc: eric.dumazet, netdev, ilpo.jarvinen, therbert, ncardwell,
	mcarlson, jeffrey.t.kirsher, mchan, herbert, maze
In-Reply-To: <1335544386.2618.50.camel@bwh-desktop.uk.solarflarecom.com>

From: Ben Hutchings <bhutchings@solarflare.com>
Date: Fri, 27 Apr 2012 17:33:06 +0100

> page_address() may return NULL if CONFIG_HIGHMEM is enabled.  And that's
> going to be a general problem with this scheme on 32-bit machines.

It can only be NULL for highmem pages, which he is not going to
get from the kind of page allocation calls he is using.

^ permalink raw reply

* Re: [PATCH 2/4 net-next] tg3: provide frags as skb head
From: Eric Dumazet @ 2012-04-27 16:42 UTC (permalink / raw)
  To: Ben Hutchings
  Cc: David Miller, netdev, Ilpo Järvinen, Tom Herbert,
	Neal Cardwell, Matt Carlson, Jeff Kirsher, Michael Chan,
	Herbert Xu, Maciej Żenczykowski
In-Reply-To: <1335544386.2618.50.camel@bwh-desktop.uk.solarflarecom.com>

On Fri, 2012-04-27 at 17:33 +0100, Ben Hutchings wrote:

> page_address() may return NULL if CONFIG_HIGHMEM is enabled.  And that's
> going to be a general problem with this scheme on 32-bit machines.

kmalloc(2048, GFP_ATOMIC) and alloc_page(GFP_ATOMIC) use the same pool
of LOWMEM pages, so there is no difference here. page_address() is
guaranteed to work.

Thanks

^ permalink raw reply

* Re: [PATCH 2/4 net-next] tg3: provide frags as skb head
From: Ben Hutchings @ 2012-04-27 16:53 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David Miller, netdev, Ilpo Järvinen, Tom Herbert,
	Neal Cardwell, Matt Carlson, Jeff Kirsher, Michael Chan,
	Herbert Xu, Maciej Żenczykowski
In-Reply-To: <1335544964.2775.250.camel@edumazet-glaptop>

On Fri, 2012-04-27 at 18:42 +0200, Eric Dumazet wrote:
> On Fri, 2012-04-27 at 17:33 +0100, Ben Hutchings wrote:
> 
> > page_address() may return NULL if CONFIG_HIGHMEM is enabled.  And that's
> > going to be a general problem with this scheme on 32-bit machines.
> 
> kmalloc(2048, GFP_ATOMIC) and alloc_page(GFP_ATOMIC) use the same pool
> of LOWMEM pages, so there is no difference here. page_address() is
> guaranteed to work.

Oh, of course, __GFP_HIGH is not __GFP_HIGHMEM...
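
To spell the distinction out, a tiny model of the flag bits follows. The
numeric values are illustrative only (the authoritative definitions are in
include/linux/gfp.h) and the MODEL_* names are invented; the point is simply
that the "high priority" bit and the "highmem zone" bit are unrelated, and
that an atomic allocation mask contains only the former.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; see include/linux/gfp.h for the real ones. */
#define MODEL_GFP_HIGHMEM 0x02u	/* zone modifier: highmem pages allowed */
#define MODEL_GFP_HIGH    0x20u	/* priority: may use emergency reserves */
#define MODEL_GFP_ATOMIC  (MODEL_GFP_HIGH)

/* The atomic mask must not carry the highmem zone modifier. */
static_assert((MODEL_GFP_ATOMIC & MODEL_GFP_HIGHMEM) == 0,
	      "atomic allocations never ask for highmem");

/* Only masks carrying the highmem bit can yield pages without a mapping. */
static bool model_may_return_highmem(unsigned int gfp_mask)
{
	return (gfp_mask & MODEL_GFP_HIGHMEM) != 0;
}
```

So under this model page_address() can only come back NULL for allocations
that explicitly opt in to highmem, which the allocation in the patch does
not.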

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply
