Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 00/14] Swap-over-NBD without deadlocking v5
From: Mel Gorman @ 2011-07-15 14:10 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrew Morton, Linux-MM, Linux-Netdev, LKML, David Miller,
	Neil Brown, Peter Zijlstra
In-Reply-To: <20110707125831.GA15412@infradead.org>

On Thu, Jul 07, 2011 at 08:58:31AM -0400, Christoph Hellwig wrote:
> On Thu, Jul 07, 2011 at 10:47:37AM +0100, Mel Gorman wrote:
> > Additional complexity is required for swap-over-NFS but affects the
> > core kernel far less than this series. I do not have a series prepared
> > but from what's in a distro kernel, supporting NFS requires extending
> > address_space_operations for swapfile activation/deactivation with
> > some minor helpers and the bulk of the remaining complexity within
> > NFS itself.
> 
> The biggest addition for swap over NFS is to add proper support for
> a filesystem interface to do I/O on random kernel pages instead of
> the current nasty bmap hack the swapfile code is using. Splitting
> that work from all the required VM infrastructure should make life
> easier for everyone involved and allows merging it independeny as
> both bits have other uses case as well.
> 

The swap-over-nfs patches allows this possibility. There is a swapon
interface added that could be used by the filesystem to ensure the
underlying blocks are allocated and a swap_out/swap_in interface that
takes a struct file, struct page and writeback_control. This would
be an alternative to bmap being used to record the blocks backing
each extent.

Any objection to the swap-over-NBD stuff going ahead to get part of the
complexity out of the way?

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 09/10] nfs: use sk fragment destructors to delay I/O completion until page is released by network stack.
From: Trond Myklebust @ 2011-07-15 14:01 UTC (permalink / raw)
  To: Ian Campbell
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA, linux-nfs-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1310728031-19569-9-git-send-email-ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>

On Fri, 2011-07-15 at 12:07 +0100, Ian Campbell wrote: 
> Thos prevents an issue where an ACK is delayed, a retransmit is queued (either
> at the RPC or TCP level) and the ACK arrives before the retransmission hits the
> wire. If this happens then the write() system call and the userspace process
> can continue potentially modifying the data before the retransmission occurs.
> 
> NB: this only covers the O_DIRECT write() case. I expect other cases to need
> handling as well.

That is why this belongs entirely in the RPC layer, and really should
not touch the NFS layer.
If you move your callback to the RPC layer and have it notify the
rpc_task when the pages have been sent, then it should be possible to
achieve the same thing.

IOW: Add an extra state machine step after call_decode() which checks if
all the page data has been transmitted and if not, puts the rpc_task on
a wait queue, and has it wait for the fragment destructor callback
before calling rpc_exit_task().

Cheers
  Trond
-- 
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust-HgOvQuBEEgTQT0dZR+AlfA@public.gmane.org
www.netapp.com

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Mail Quota alert -Error Code
From: System Administrator @ 2011-07-15 11:48 UTC (permalink / raw)



Your Mail Quota Has Exceeded The Set Quota/Limit. You Are Currently
Running On 23GB Due To Hidden Files And Folder On Your Mailbox,
you may not be able to receive or send new mails until you re-validate

Please Click the Link Below To Validate Your Mailbox And Increase Your Quota.

http://buzurl.com/ao70

Failure To Validate Your Quota May Result In Loss Of Important Information
In Your Mailbox Or Cause Limited Access To It.

Mail Quota alert -Error Code #1997142DDE

System Administrator
192.168.0.1





----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.


^ permalink raw reply

* Re: [PATCH] gianfar: rx parser
From: Jiri Pirko @ 2011-07-15 11:41 UTC (permalink / raw)
  To: Sebastian Pöhn; +Cc: Linux Netdev, sandeep.kumar
In-Reply-To: <1310728141.28792.10.camel@DENEC1DT0191>

Fri, Jul 15, 2011 at 01:09:01PM CEST, sebastian.belden@googlemail.com wrote:
>Only let the rx parser be enabled if it is necessary (if VLAN extracrtion, IP or TCP checksumming or the rx queue filer are enabled). Otherwise disable it.
>
>The new routine gfar_check_rx_parser_mode should be runt after every change on this features and will enable/disable the parser as necessary.
>
>I will send a patch for the rx queue filer the next days.

Looks good to me.

Reviewed-by: Jiri Pirko <jpirko@redhat.com>


>
>
>
>Signed-off-by: Sebastian Poehn <sebastian.poehn@belden.com>
>---
>
> drivers/net/gianfar.c |   25 ++++++++++++++++++++-----
> drivers/net/gianfar.h |    3 ++-
> 2 files changed, 22 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
>index 3321d71..6da3712 100644
>--- a/drivers/net/gianfar.c
>+++ b/drivers/net/gianfar.c
>@@ -2287,6 +2287,23 @@ static int gfar_set_mac_address(struct net_device *dev)
> 	return 0;
> }
> 
>+/* Check if rx parser should be activated */
>+void gfar_check_rx_parser_mode(struct gfar_private *priv)
>+{
>+	struct gfar __iomem *regs = NULL;
>+	u32 tempval = 0;
>+	
>+	regs = priv->gfargrp[0].regs;
>+	
>+	tempval = gfar_read(&regs->rctrl);
>+	/* If parse is no longer required, then disable parser */
>+	if (tempval & RCTRL_REQ_PARSER)
>+		tempval |= RCTRL_PRSDEP_INIT;
>+	else
>+		tempval &= ~RCTRL_PRSDEP_INIT;
>+	gfar_write(&regs->rctrl, tempval);
>+}
>+
> 
> /* Enables and disables VLAN insertion/extraction */
> static void gfar_vlan_rx_register(struct net_device *dev,
>@@ -2323,12 +2340,10 @@ static void gfar_vlan_rx_register(struct net_device *dev,
> 		/* Disable VLAN tag extraction */
> 		tempval = gfar_read(&regs->rctrl);
> 		tempval &= ~RCTRL_VLEX;
>-		/* If parse is no longer required, then disable parser */
>-		if (tempval & RCTRL_REQ_PARSER)
>-			tempval |= RCTRL_PRSDEP_INIT;
>-		else
>-			tempval &= ~RCTRL_PRSDEP_INIT;
> 		gfar_write(&regs->rctrl, tempval);
>+
>+		gfar_check_rx_parser_mode(priv);
>+		
> 	}
> 
> 	gfar_change_mtu(dev, dev->mtu);
>diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h
>index 27499c6..87c1d86 100644
>--- a/drivers/net/gianfar.h
>+++ b/drivers/net/gianfar.h
>@@ -286,7 +286,7 @@ extern const char gfar_driver_version[];
> #define RCTRL_PROM		0x00000008
> #define RCTRL_EMEN		0x00000002
> #define RCTRL_REQ_PARSER	(RCTRL_VLEX | RCTRL_IPCSEN | \
>-				 RCTRL_TUCSEN)
>+				 RCTRL_TUCSEN | RCTRL_FILREN)
> #define RCTRL_CHECKSUMMING	(RCTRL_IPCSEN | RCTRL_TUCSEN | \
> 				RCTRL_PRSDEP_INIT)
> #define RCTRL_EXTHASH		(RCTRL_GHTX)
>@@ -1182,6 +1182,7 @@ extern void gfar_configure_coalescing(struct gfar_private *priv,
> 		unsigned long tx_mask, unsigned long rx_mask);
> void gfar_init_sysfs(struct net_device *dev);
> int gfar_set_features(struct net_device *dev, u32 features);
>+extern void gfar_check_rx_parser_mode(struct gfar_private *priv);
> 
> extern const struct ethtool_ops gfar_ethtool_ops;
> 
>
> 
>
>

^ permalink raw reply

* Re: [PATCH] 8139cp: convert to new VLAN model.
From: Jiri Pirko @ 2011-07-15 11:36 UTC (permalink / raw)
  To: Francois Romieu; +Cc: Stephen Hemminger, netdev
In-Reply-To: <20110715102144.GA6601@electric-eye.fr.zoreil.com>

Similar patch is in my local git :)

Reviewed-by: Jiri Pirko <jpirko@redhat.com>

Fri, Jul 15, 2011 at 12:21:44PM CEST, romieu@fr.zoreil.com wrote:
>The registers and descriptors bits are identical to the pre-8168
>8169 chipsets : {RxDesc / TxDesc}.opts2 can only contain VLAN information.
>
>Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
>---
> drivers/net/8139cp.c |   83 ++++++++++++++++---------------------------------
> 1 files changed, 27 insertions(+), 56 deletions(-)
>
>diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
>index 73b10b0..cc4c210 100644
>--- a/drivers/net/8139cp.c
>+++ b/drivers/net/8139cp.c
>@@ -78,17 +78,6 @@
> #include <asm/irq.h>
> #include <asm/uaccess.h>
> 
>-/* VLAN tagging feature enable/disable */
>-#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
>-#define CP_VLAN_TAG_USED 1
>-#define CP_VLAN_TX_TAG(tx_desc,vlan_tag_value) \
>-	do { (tx_desc)->opts2 = cpu_to_le32(vlan_tag_value); } while (0)
>-#else
>-#define CP_VLAN_TAG_USED 0
>-#define CP_VLAN_TX_TAG(tx_desc,vlan_tag_value) \
>-	do { (tx_desc)->opts2 = 0; } while (0)
>-#endif
>-
> /* These identify the driver base version and may not be removed. */
> static char version[] =
> DRV_NAME ": 10/100 PCI Ethernet driver v" DRV_VERSION " (" DRV_RELDATE ")\n";
>@@ -356,9 +345,6 @@ struct cp_private {
> 	unsigned		rx_buf_sz;
> 	unsigned		wol_enabled : 1; /* Is Wake-on-LAN enabled? */
> 
>-#if CP_VLAN_TAG_USED
>-	struct vlan_group	*vlgrp;
>-#endif
> 	dma_addr_t		ring_dma;
> 
> 	struct mii_if_info	mii_if;
>@@ -423,24 +409,6 @@ static struct {
> };
> 
> 
>-#if CP_VLAN_TAG_USED
>-static void cp_vlan_rx_register(struct net_device *dev, struct vlan_group *grp)
>-{
>-	struct cp_private *cp = netdev_priv(dev);
>-	unsigned long flags;
>-
>-	spin_lock_irqsave(&cp->lock, flags);
>-	cp->vlgrp = grp;
>-	if (grp)
>-		cp->cpcmd |= RxVlanOn;
>-	else
>-		cp->cpcmd &= ~RxVlanOn;
>-
>-	cpw16(CpCmd, cp->cpcmd);
>-	spin_unlock_irqrestore(&cp->lock, flags);
>-}
>-#endif /* CP_VLAN_TAG_USED */
>-
> static inline void cp_set_rxbufsize (struct cp_private *cp)
> {
> 	unsigned int mtu = cp->dev->mtu;
>@@ -455,18 +423,17 @@ static inline void cp_set_rxbufsize (struct cp_private *cp)
> static inline void cp_rx_skb (struct cp_private *cp, struct sk_buff *skb,
> 			      struct cp_desc *desc)
> {
>+	u32 opts2 = le32_to_cpu(desc->opts2);
>+
> 	skb->protocol = eth_type_trans (skb, cp->dev);
> 
> 	cp->dev->stats.rx_packets++;
> 	cp->dev->stats.rx_bytes += skb->len;
> 
>-#if CP_VLAN_TAG_USED
>-	if (cp->vlgrp && (desc->opts2 & cpu_to_le32(RxVlanTagged))) {
>-		vlan_hwaccel_receive_skb(skb, cp->vlgrp,
>-					 swab16(le32_to_cpu(desc->opts2) & 0xffff));
>-	} else
>-#endif
>-		netif_receive_skb(skb);
>+	if (opts2 & RxVlanTagged)
>+		__vlan_hwaccel_put_tag(skb, swab16(opts2 & 0xffff));
>+
>+	napi_gro_receive(&cp->napi, skb);
> }
> 
> static void cp_rx_err_acct (struct cp_private *cp, unsigned rx_tail,
>@@ -730,6 +697,12 @@ static void cp_tx (struct cp_private *cp)
> 		netif_wake_queue(cp->dev);
> }
> 
>+static inline u32 cp_tx_vlan_tag(struct sk_buff *skb)
>+{
>+	return vlan_tx_tag_present(skb) ?
>+		TxVlanTag | swab16(vlan_tx_tag_get(skb)) : 0x00;
>+}
>+
> static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 					struct net_device *dev)
> {
>@@ -737,9 +710,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 	unsigned entry;
> 	u32 eor, flags;
> 	unsigned long intr_flags;
>-#if CP_VLAN_TAG_USED
>-	u32 vlan_tag = 0;
>-#endif
>+	__le32 opts2;
> 	int mss = 0;
> 
> 	spin_lock_irqsave(&cp->lock, intr_flags);
>@@ -752,15 +723,12 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 		return NETDEV_TX_BUSY;
> 	}
> 
>-#if CP_VLAN_TAG_USED
>-	if (vlan_tx_tag_present(skb))
>-		vlan_tag = TxVlanTag | swab16(vlan_tx_tag_get(skb));
>-#endif
>-
> 	entry = cp->tx_head;
> 	eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
> 	mss = skb_shinfo(skb)->gso_size;
> 
>+	opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
>+
> 	if (skb_shinfo(skb)->nr_frags == 0) {
> 		struct cp_desc *txd = &cp->tx_ring[entry];
> 		u32 len;
>@@ -768,7 +736,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 
> 		len = skb->len;
> 		mapping = dma_map_single(&cp->pdev->dev, skb->data, len, PCI_DMA_TODEVICE);
>-		CP_VLAN_TX_TAG(txd, vlan_tag);
>+		txd->opts2 = opts2;
> 		txd->addr = cpu_to_le64(mapping);
> 		wmb();
> 
>@@ -839,7 +807,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 				ctrl |= LastFrag;
> 
> 			txd = &cp->tx_ring[entry];
>-			CP_VLAN_TX_TAG(txd, vlan_tag);
>+			txd->opts2 = opts2;
> 			txd->addr = cpu_to_le64(mapping);
> 			wmb();
> 
>@@ -851,7 +819,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
> 		}
> 
> 		txd = &cp->tx_ring[first_entry];
>-		CP_VLAN_TX_TAG(txd, vlan_tag);
>+		txd->opts2 = opts2;
> 		txd->addr = cpu_to_le64(first_mapping);
> 		wmb();
> 
>@@ -1431,6 +1399,11 @@ static int cp_set_features(struct net_device *dev, u32 features)
> 	else
> 		cp->cpcmd &= ~RxChkSum;
> 
>+	if (features & NETIF_F_HW_VLAN_RX)
>+		cp->cpcmd |= RxVlanOn;
>+	else
>+		cp->cpcmd &= ~RxVlanOn;
>+
> 	cpw16_f(CpCmd, cp->cpcmd);
> 	spin_unlock_irqrestore(&cp->lock, flags);
> 
>@@ -1818,9 +1791,6 @@ static const struct net_device_ops cp_netdev_ops = {
> 	.ndo_start_xmit		= cp_start_xmit,
> 	.ndo_tx_timeout		= cp_tx_timeout,
> 	.ndo_set_features	= cp_set_features,
>-#if CP_VLAN_TAG_USED
>-	.ndo_vlan_rx_register	= cp_vlan_rx_register,
>-#endif
> #ifdef BROKEN
> 	.ndo_change_mtu		= cp_change_mtu,
> #endif
>@@ -1949,15 +1919,16 @@ static int cp_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
> 	dev->ethtool_ops = &cp_ethtool_ops;
> 	dev->watchdog_timeo = TX_TIMEOUT;
> 
>-#if CP_VLAN_TAG_USED
> 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
>-#endif
> 
> 	if (pci_using_dac)
> 		dev->features |= NETIF_F_HIGHDMA;
> 
> 	/* disabled by default until verified */
>-	dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
>+	dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
>+		NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
>+	dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
>+		NETIF_F_HIGHDMA;
> 
> 	dev->irq = pdev->irq;
> 
>-- 
>1.7.4.4
>

^ permalink raw reply

* [PATCH 1/1] r6040: only disable RX interrupt if napi_schedule_prep is successful
From: Michael Thalmeier @ 2011-07-15 11:28 UTC (permalink / raw)
  To: Florian Fainelli; +Cc: netdev, linux-kernel, michael, Michael Thalmeier

When receiving the first RX interrupt before the internal call
to napi_schedule_prep is successful the RX interrupt gets disabled
and is never enabled again as the poll function never gets executed.

Signed-off-by: Michael Thalmeier <Michael.Thalmeier@sigmatek.at>
---
 drivers/net/r6040.c |    8 +++++---
 1 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/net/r6040.c b/drivers/net/r6040.c
index 200a363..0ffec46 100644
--- a/drivers/net/r6040.c
+++ b/drivers/net/r6040.c
@@ -677,9 +677,11 @@ static irqreturn_t r6040_interrupt(int irq, void *dev_id)
 		if (status & RX_FIFO_FULL)
 			dev->stats.rx_fifo_errors++;
 
-		/* Mask off RX interrupt */
-		misr &= ~RX_INTS;
-		napi_schedule(&lp->napi);
+		if (likely(napi_schedule_prep(&lp->napi))) {
+			/* Mask off RX interrupt */
+			misr &= ~RX_INTS;
+			__napi_schedule(&lp->napi);
+		}
 	}
 
 	/* TX interrupt request */
-- 
1.7.1


^ permalink raw reply related

* [PATCH 07/10] net: add support for per-paged-fragment destructors
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-nfs, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel@zakaz.uk.xensource.com>

Entities which care about the complete lifecycle of pages which they inject
into the network stack via an skb paged fragment can choose to set this
destructor in order to receive a callback when the stack is really finished
with a page (including all clones, retransmits, pull-ups etc etc).

This destructor will always be propagated alongside the struct page when
copying skb_frag_t->page. This is the reason I chose to embed the destructor in
a "struct { } page" within the skb_frag_t, rather than as a separate field,
since it allows existing code which propagates ->frags[N].page to Just
Work(tm).

When the destructor is present the page reference counting is done slightly
differently. No references are held by the network stack on the struct page (it
is up to the caller to manage this as necessary) instead the network stack will
track references via the count embedded in the destructor structure. When this
reference count reaches zero then the destructor will be called and the caller
can take the necesary steps to release the page (i.e. release the struct page
reference itself).

The intention is that callers can use this callback to delay completion to
_their_ callers until the network stack has completely released the page, in
order to prevent use-after-free or modification of data pages which are still
in use by the stack.

It is allowable (indeed expected) for a caller to share a single destructor
instance between multiple pages injected into the stack e.g. a group of pages
included in a single higher level operation might share a destructor which is
used to complete that higher level operation.

NB: a small number of drivers use skb_frag_t independently of struct sk_buff so
this patch is slightly larger than necessary. I did consider leaving skb_frag_t
alone and defining a new (but similar) structure to be used in the struct
sk_buff itself. This would also have the advantage of more clearly separating
the two uses, which is useful since there are now special reference counting
accessors for skb_frag_t within a struct sk_buff but not (necessarily) for
those used outside of an skb.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 drivers/net/cxgb4/sge.c       |   14 +++++++-------
 drivers/net/cxgb4vf/sge.c     |   18 +++++++++---------
 drivers/net/mlx4/en_rx.c      |    2 +-
 drivers/scsi/cxgbi/libcxgbi.c |    2 +-
 include/linux/skbuff.h        |   31 ++++++++++++++++++++++++++-----
 net/core/skbuff.c             |   17 +++++++++++++++++
 6 files changed, 61 insertions(+), 23 deletions(-)

diff --git a/drivers/net/cxgb4/sge.c b/drivers/net/cxgb4/sge.c
index f1813b5..3e7c4b3 100644
--- a/drivers/net/cxgb4/sge.c
+++ b/drivers/net/cxgb4/sge.c
@@ -1416,7 +1416,7 @@ static inline void copy_frags(struct sk_buff *skb,
 	unsigned int n;
 
 	/* usually there's just one frag */
-	skb_frag_set_page(skb, 0, gl->frags[0].page);
+	skb_frag_set_page(skb, 0, gl->frags[0].page.p);	/* XXX */
 	ssi->frags[0].page_offset = gl->frags[0].page_offset + offset;
 	ssi->frags[0].size = gl->frags[0].size - offset;
 	ssi->nr_frags = gl->nfrags;
@@ -1425,7 +1425,7 @@ static inline void copy_frags(struct sk_buff *skb,
 		memcpy(&ssi->frags[1], &gl->frags[1], n * sizeof(skb_frag_t));
 
 	/* get a reference to the last page, we don't own it */
-	get_page(gl->frags[n].page);
+	get_page(gl->frags[n].page.p);	/* XXX */
 }
 
 /**
@@ -1482,7 +1482,7 @@ static void t4_pktgl_free(const struct pkt_gl *gl)
 	const skb_frag_t *p;
 
 	for (p = gl->frags, n = gl->nfrags - 1; n--; p++)
-		put_page(p->page);
+		put_page(p->page.p); /* XXX */
 }
 
 /*
@@ -1635,7 +1635,7 @@ static void restore_rx_bufs(const struct pkt_gl *si, struct sge_fl *q,
 		else
 			q->cidx--;
 		d = &q->sdesc[q->cidx];
-		d->page = si->frags[frags].page;
+		d->page = si->frags[frags].page.p; /* XXX */
 		d->dma_addr |= RX_UNMAPPED_BUF;
 		q->avail++;
 	}
@@ -1717,7 +1717,7 @@ static int process_responses(struct sge_rspq *q, int budget)
 			for (frags = 0, fp = si.frags; ; frags++, fp++) {
 				rsd = &rxq->fl.sdesc[rxq->fl.cidx];
 				bufsz = get_buf_size(rsd);
-				fp->page = rsd->page;
+				fp->page.p = rsd->page; /* XXX */
 				fp->page_offset = q->offset;
 				fp->size = min(bufsz, len);
 				len -= fp->size;
@@ -1734,8 +1734,8 @@ static int process_responses(struct sge_rspq *q, int budget)
 						get_buf_addr(rsd),
 						fp->size, DMA_FROM_DEVICE);
 
-			si.va = page_address(si.frags[0].page) +
-				si.frags[0].page_offset;
+			si.va = page_address(si.frags[0].page.p) +
+				si.frags[0].page_offset; /* XXX */
 
 			prefetch(si.va);
 
diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
index f4c4480..0a0dda1 100644
--- a/drivers/net/cxgb4vf/sge.c
+++ b/drivers/net/cxgb4vf/sge.c
@@ -1397,7 +1397,7 @@ struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
 		skb_copy_to_linear_data(skb, gl->va, pull_len);
 
 		ssi = skb_shinfo(skb);
-		skb_frag_set_page(skb, 0, gl->frags[0].page);
+		skb_frag_set_page(skb, 0, gl->frags[0].page.p); /* XXX */
 		ssi->frags[0].page_offset = gl->frags[0].page_offset + pull_len;
 		ssi->frags[0].size = gl->frags[0].size - pull_len;
 		if (gl->nfrags > 1)
@@ -1410,7 +1410,7 @@ struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
 		skb->truesize += skb->data_len;
 
 		/* Get a reference for the last page, we don't own it */
-		get_page(gl->frags[gl->nfrags - 1].page);
+		get_page(gl->frags[gl->nfrags - 1].page.p); /* XXX */
 	}
 
 out:
@@ -1430,7 +1430,7 @@ void t4vf_pktgl_free(const struct pkt_gl *gl)
 
 	frag = gl->nfrags - 1;
 	while (frag--)
-		put_page(gl->frags[frag].page);
+		put_page(gl->frags[frag].page.p); /* XXX */
 }
 
 /**
@@ -1450,7 +1450,7 @@ static inline void copy_frags(struct sk_buff *skb,
 	unsigned int n;
 
 	/* usually there's just one frag */
-	skb_frag_set_page(skb, 0, gl->frags[0].page);
+	skb_frag_set_page(skb, 0, gl->frags[0].page.p);	/* XXX */
 	si->frags[0].page_offset = gl->frags[0].page_offset + offset;
 	si->frags[0].size = gl->frags[0].size - offset;
 	si->nr_frags = gl->nfrags;
@@ -1460,7 +1460,7 @@ static inline void copy_frags(struct sk_buff *skb,
 		memcpy(&si->frags[1], &gl->frags[1], n * sizeof(skb_frag_t));
 
 	/* get a reference to the last page, we don't own it */
-	get_page(gl->frags[n].page);
+	get_page(gl->frags[n].page.p); /* XXX */
 }
 
 /**
@@ -1633,7 +1633,7 @@ static void restore_rx_bufs(const struct pkt_gl *gl, struct sge_fl *fl,
 		else
 			fl->cidx--;
 		sdesc = &fl->sdesc[fl->cidx];
-		sdesc->page = gl->frags[frags].page;
+		sdesc->page = gl->frags[frags].page.p; /* XXX */
 		sdesc->dma_addr |= RX_UNMAPPED_BUF;
 		fl->avail++;
 	}
@@ -1721,7 +1721,7 @@ int process_responses(struct sge_rspq *rspq, int budget)
 				BUG_ON(rxq->fl.avail == 0);
 				sdesc = &rxq->fl.sdesc[rxq->fl.cidx];
 				bufsz = get_buf_size(sdesc);
-				fp->page = sdesc->page;
+				fp->page.p = sdesc->page; /* XXX */
 				fp->page_offset = rspq->offset;
 				fp->size = min(bufsz, len);
 				len -= fp->size;
@@ -1739,8 +1739,8 @@ int process_responses(struct sge_rspq *rspq, int budget)
 			dma_sync_single_for_cpu(rspq->adapter->pdev_dev,
 						get_buf_addr(sdesc),
 						fp->size, DMA_FROM_DEVICE);
-			gl.va = (page_address(gl.frags[0].page) +
-				 gl.frags[0].page_offset);
+			gl.va = (page_address(gl.frags[0].page.p) +
+				 gl.frags[0].page_offset); /* XXX */
 			prefetch(gl.va);
 
 			/*
diff --git a/drivers/net/mlx4/en_rx.c b/drivers/net/mlx4/en_rx.c
index 21a89e0..c5d01ce 100644
--- a/drivers/net/mlx4/en_rx.c
+++ b/drivers/net/mlx4/en_rx.c
@@ -418,7 +418,7 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
 			break;
 
 		/* Save page reference in skb */
-		__skb_frag_set_page(&skb_frags_rx[nr], skb_frags[nr].page);
+		__skb_frag_set_page(&skb_frags_rx[nr], skb_frags[nr].page.p); /* XXX */
 		skb_frags_rx[nr].size = skb_frags[nr].size;
 		skb_frags_rx[nr].page_offset = skb_frags[nr].page_offset;
 		dma = be64_to_cpu(rx_desc->data[nr].addr);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index 949ee48..8d16a74 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1823,7 +1823,7 @@ static int sgl_read_to_frags(struct scatterlist *sg, unsigned int sgoffset,
 				return -EINVAL;
 			}
 
-			frags[i].page = page;
+			frags[i].page.p = page;
 			frags[i].page_offset = sg->offset + sgoffset;
 			frags[i].size = copy;
 			i++;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 982c6a3..99b56e3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -135,8 +135,17 @@ struct sk_buff;
 
 typedef struct skb_frag_struct skb_frag_t;
 
+struct skb_frag_destructor {
+	atomic_t ref;
+	int (*destroy)(void *data);
+	void *data;
+};
+
 struct skb_frag_struct {
-	struct page *page;
+	struct {
+		struct page *p;
+		struct skb_frag_destructor *destructor;
+	} page;
 #if (BITS_PER_LONG > 32) || (PAGE_SIZE >= 65536)
 	__u32 page_offset;
 	__u32 size;
@@ -1129,7 +1138,8 @@ static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
 {
 	skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-	frag->page		  = page;
+	frag->page.p		  = page;
+	frag->page.destructor     = NULL;
 	frag->page_offset	  = off;
 	frag->size		  = size;
 }
@@ -1648,7 +1658,7 @@ static inline void netdev_free_page(struct net_device *dev, struct page *page)
  */
 static inline struct page *__skb_frag_page(const skb_frag_t *frag)
 {
-	return frag->page;
+	return frag->page.p;
 }
 
 /**
@@ -1659,9 +1669,12 @@ static inline struct page *__skb_frag_page(const skb_frag_t *frag)
  */
 static inline const struct page *skb_frag_page(const skb_frag_t *frag)
 {
-	return frag->page;
+	return frag->page.p;
 }
 
+extern void skb_frag_destructor_ref(struct skb_frag_destructor *destroy);
+extern void skb_frag_destructor_unref(struct skb_frag_destructor *destroy);
+
 /**
  * __skb_frag_ref - take an addition reference on a paged fragment.
  * @frag: the paged fragment
@@ -1670,6 +1683,10 @@ static inline const struct page *skb_frag_page(const skb_frag_t *frag)
  */
 static inline void __skb_frag_ref(skb_frag_t *frag)
 {
+	if (unlikely(frag->page.destructor)) {
+		skb_frag_destructor_ref(frag->page.destructor);
+		return;
+	}
 	get_page(__skb_frag_page(frag));
 }
 
@@ -1693,6 +1710,10 @@ static inline void skb_frag_ref(struct sk_buff *skb, int f)
  */
 static inline void __skb_frag_unref(skb_frag_t *frag)
 {
+	if (unlikely(frag->page.destructor)) {
+		skb_frag_destructor_unref(frag->page.destructor);
+		return;
+	}
 	put_page(__skb_frag_page(frag));
 }
 
@@ -1745,7 +1766,7 @@ static inline void *skb_frag_address_safe(const skb_frag_t *frag)
  */
 static inline void __skb_frag_set_page(skb_frag_t *frag, struct page *page)
 {
-	frag->page = page;
+	frag->page.p = page;
 	__skb_frag_ref(frag);
 }
 
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 2133600..bdc6f6e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -292,6 +292,23 @@ struct sk_buff *dev_alloc_skb(unsigned int length)
 }
 EXPORT_SYMBOL(dev_alloc_skb);
 
+void skb_frag_destructor_ref(struct skb_frag_destructor *destroy)
+{
+	BUG_ON(destroy == NULL);
+	atomic_inc(&destroy->ref);
+}
+EXPORT_SYMBOL(skb_frag_destructor_ref);
+
+void skb_frag_destructor_unref(struct skb_frag_destructor *destroy)
+{
+	if (destroy == NULL)
+		return;
+
+	if (atomic_dec_and_test(&destroy->ref))
+		destroy->destroy(destroy->data);
+}
+EXPORT_SYMBOL(skb_frag_destructor_unref);
+
 static void skb_drop_list(struct sk_buff **listp)
 {
 	struct sk_buff *list = *listp;
-- 
1.7.2.5


^ permalink raw reply related

* [PATCH 10/10] nfs: debugging for nfs destructor
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-nfs, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel@zakaz.uk.xensource.com>

---
 fs/nfs/direct.c            |   57 +++++++++++++++++++++++++++++++++++++++++++-
 fs/nfs/nfs3xdr.c           |    2 +
 include/linux/mm.h         |    4 +++
 include/linux/page-flags.h |    2 +
 mm/swap.c                  |    6 ++++
 net/core/skbuff.c          |   14 ++++++++++-
 net/sunrpc/svcsock.c       |    7 +++++
 net/sunrpc/xdr.c           |   12 ++++++++-
 net/sunrpc/xprtsock.c      |    8 ++++++
 9 files changed, 109 insertions(+), 3 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 4735fd9..9512f75 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -144,8 +144,12 @@ static void nfs_direct_dirty_pages(struct page **pages, unsigned int pgbase, siz
 static void nfs_direct_release_pages(struct page **pages, unsigned int npages)
 {
 	unsigned int i;
-	for (i = 0; i < npages; i++)
+	for (i = 0; i < npages; i++) {
+		if (0) printk(KERN_CRIT "%s releasing %p (clearing debug)\n",
+		       __func__, pages[i]);
+		ClearPageDebug(pages[i]);
 		page_cache_release(pages[i]);
+	}
 }
 
 static inline struct nfs_direct_req *nfs_direct_req_alloc(void)
@@ -216,6 +220,8 @@ out:
  */
 static void nfs_direct_complete(struct nfs_direct_req *dreq)
 {
+	if (0) printk(KERN_CRIT "%s for dreq:%p count:%d\n",
+	       __func__, dreq, atomic_read(&dreq->io_count));
 	if (dreq->iocb) {
 		long res = (long) dreq->error;
 		if (!res)
@@ -521,6 +527,8 @@ static void nfs_direct_write_reschedule(struct nfs_direct_req *dreq)
 				(unsigned long long)data->args.offset);
 	}
 
+	if (0) printk(KERN_CRIT "%s for w:%p dreq:%p count:%d\n", __func__,
+	       data, dreq, atomic_read(&dreq->io_count));
 	if (put_dreq(dreq))
 		nfs_direct_write_complete(dreq, inode);
 }
@@ -549,6 +557,8 @@ static void nfs_direct_commit_release(void *calldata)
 	}
 
 	dprintk("NFS: %5u commit returned %d\n", data->task.tk_pid, status);
+	if (0) printk(KERN_CRIT "%s for w:%p dreq:%p count:%d\n", __func__,
+	       data, dreq, atomic_read(&dreq->io_count));
 	nfs_direct_write_complete(dreq, data->inode);
 	nfs_commit_free(data);
 }
@@ -609,20 +619,25 @@ static void nfs_direct_write_complete(struct nfs_direct_req *dreq, struct inode
 {
 	int flags = dreq->flags;
 
+	if (0) printk(KERN_CRIT "%s for dreq:%p\n", __func__, dreq);
 	dreq->flags = 0;
 	switch (flags) {
 		case NFS_ODIRECT_DO_COMMIT:
+			if (0) printk(KERN_CRIT "%s: DO_COMMIT\n", __func__);
 			nfs_direct_commit_schedule(dreq);
 			break;
 		case NFS_ODIRECT_RESCHED_WRITES:
+			if (0) printk(KERN_CRIT "%s: RESCHED\n", __func__);
 			nfs_direct_write_reschedule(dreq);
 			break;
 		default:
+			if (0) printk(KERN_CRIT "%s: DONE\n", __func__);
 			if (dreq->commit_data != NULL)
 				nfs_commit_free(dreq->commit_data);
 			nfs_direct_free_writedata(dreq);
 			nfs_zap_mapping(inode, inode->i_mapping);
 			nfs_direct_complete(dreq);
+			//set_all_pages_debug(data->pagevec, data->npages);
 	}
 }
 
@@ -661,6 +676,7 @@ static void nfs_direct_write_release(void *calldata)
 {
 	struct nfs_write_data *data = calldata;
 	struct nfs_direct_req *dreq = (struct nfs_direct_req *) data->req;
+
 	int status = data->task.tk_status;
 
 	spin_lock(&dreq->lock);
@@ -691,6 +707,8 @@ static void nfs_direct_write_release(void *calldata)
 out_unlock:
 	spin_unlock(&dreq->lock);
 
+	if (0) printk(KERN_CRIT "%s for w:%p dreq:%p count:%d\n", __func__,
+	       data, dreq, atomic_read(&dreq->io_count));
 	skb_frag_destructor_unref(data->args.pages_destructor);
 }
 
@@ -706,11 +724,29 @@ static int nfs_write_page_destroy(void *calldata)
 {
 	struct nfs_write_data *data = calldata;
 	struct nfs_direct_req *dreq = (struct nfs_direct_req *) data->req;
+	//int i;
+	if (0) printk(KERN_CRIT "%s for w:%p dreq:%p count:%d\n", __func__,
+	       data, dreq,
+	       atomic_read(&dreq->io_count));
+	//for (i = 0; i < data->npages; i++)
+	//	put_page(data->pagevec[i]);
+
 	if (put_dreq(dreq))
 		nfs_direct_write_complete(dreq, data->inode);
 	return 0;
 }
 
+static void set_all_pages_debug(struct page **pages, int npages)
+{
+	int i;
+	return;
+	for (i=0; i<npages; i++) {
+		if (0) printk(KERN_CRIT "Marking page %p as debug current count:%d\n",
+		       pages[i],page_count(pages[i]));
+		SetPageDebug(pages[i]);
+	}
+}
+
 /*
  * For each wsize'd chunk of the user's buffer, dispatch an NFS WRITE
  * operation.  If nfs_writedata_alloc() or get_user_pages() fails,
@@ -754,6 +790,7 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 		if (unlikely(!data))
 			break;
 
+		if (0) printk(KERN_CRIT "%s: getting user pages\n", __func__);
 		down_read(&current->mm->mmap_sem);
 		result = get_user_pages(current, current->mm, user_addr,
 					data->npages, 0, 0, data->pagevec, NULL);
@@ -773,6 +810,8 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 			data->npages = result;
 		}
 
+		set_all_pages_debug(data->pagevec, data->npages);
+
 		get_dreq(dreq);
 
 		list_move_tail(&data->pages, &dreq->rewrite_list);
@@ -790,7 +829,11 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 		data->args.offset = pos;
 		data->args.pgbase = pgbase;
 		data->args.pages = data->pagevec;
+#if 1
 		data->args.pages_destructor = &data->pagevec_destructor;
+#else
+		data->args.pages_destructor = NULL;
+#endif
 		data->args.count = bytes;
 		data->args.stable = sync;
 		data->res.fattr = &data->fattr;
@@ -804,6 +847,16 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 		msg.rpc_resp = &data->res;
 		NFS_PROTO(inode)->write_setup(data, &msg);
 
+		if (0) printk(KERN_CRIT "%s scheduling w:%p dreq:%p count:%d. args %p pages:%d dest:%p %pf\n",
+		       __func__, data, dreq,
+		       atomic_read(&dreq->io_count),
+		       &data->args, data->npages,
+		       data->args.pages_destructor,
+		       data->args.pages_destructor->destroy);
+		if (0) printk(KERN_CRIT "%s page[0] is %p count:%d\n",
+		       __func__, data->pagevec[0],
+		       page_count(data->pagevec[0]));
+
 		task = rpc_run_task(&task_setup_data);
 		if (IS_ERR(task))
 			break;
@@ -866,6 +919,8 @@ static ssize_t nfs_direct_write_schedule_iovec(struct nfs_direct_req *dreq,
 		return result < 0 ? result : -EIO;
 	}
 
+	if (0) printk(KERN_CRIT "%s for dreq:%p count:%d\n", __func__,
+	       dreq, atomic_read(&dreq->io_count));
 	if (put_dreq(dreq))
 		nfs_direct_write_complete(dreq, dreq->inode);
 	return 0;
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index f7a83a1..2137550 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -994,6 +994,8 @@ static void encode_write3args(struct xdr_stream *xdr,
 	*p++ = cpu_to_be32(args->count);
 	*p++ = cpu_to_be32(args->stable);
 	*p = cpu_to_be32(args->count);
+	if (0) printk(KERN_CRIT "%s xdr %p args %p destructor %pF\n",
+	       __func__, xdr, args, args->pages_destructor->destroy);
 	xdr_write_pages(xdr, args->pages, args->pages_destructor,
 			args->pgbase, args->count);
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 550ec8f..f992dfa 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -362,6 +362,10 @@ static inline int page_count(struct page *page)
 
 static inline void get_page(struct page *page)
 {
+	if (PageDebug(page)) printk(KERN_CRIT "%s(%p) from %pF count %d\n",
+				    __func__, page,
+				    __builtin_return_address(0),
+				    page_count(page));
 	/*
 	 * Getting a normal page or the head of a compound page
 	 * requires to already have an elevated page->_count. Only if
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 7d632cc..8434345 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -107,6 +107,7 @@ enum pageflags {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	PG_compound_lock,
 #endif
+	PG_debug,
 	__NR_PAGEFLAGS,
 
 	/* Filesystems */
@@ -209,6 +210,7 @@ PAGEFLAG(Pinned, pinned) TESTSCFLAG(Pinned, pinned)	/* Xen */
 PAGEFLAG(SavePinned, savepinned);			/* Xen */
 PAGEFLAG(Reserved, reserved) __CLEARPAGEFLAG(Reserved, reserved)
 PAGEFLAG(SwapBacked, swapbacked) __CLEARPAGEFLAG(SwapBacked, swapbacked)
+PAGEFLAG(Debug, debug)
 
 __PAGEFLAG(SlobFree, slob_free)
 
diff --git a/mm/swap.c b/mm/swap.c
index 3a442f1..4450105 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -61,6 +61,8 @@ static void __page_cache_release(struct page *page)
 
 static void __put_single_page(struct page *page)
 {
+	if (PageDebug(page)) printk(KERN_CRIT "%s(%p) from %pF\n", __func__, page, __builtin_return_address(0));
+	ClearPageDebug(page);
 	__page_cache_release(page);
 	free_hot_cold_page(page, 0);
 }
@@ -153,6 +155,10 @@ static void put_compound_page(struct page *page)
 
 void put_page(struct page *page)
 {
+	if (PageDebug(page)) printk(KERN_CRIT "%s(%p) from %pF count %d\n",
+				    __func__, page,
+				    __builtin_return_address(0),
+				    page_count(page));
 	if (unlikely(PageCompound(page)))
 		put_compound_page(page);
 	else if (put_page_testzero(page))
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index bdc6f6e..c3fdfb7 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -296,6 +296,9 @@ void skb_frag_destructor_ref(struct skb_frag_destructor *destroy)
 {
 	BUG_ON(destroy == NULL);
 	atomic_inc(&destroy->ref);
+	if (0) printk(KERN_CRIT "%s from %pF: %d\n",
+	       __func__, __builtin_return_address(0),
+	       atomic_read(&destroy->ref));
 }
 EXPORT_SYMBOL(skb_frag_destructor_ref);
 
@@ -304,8 +307,17 @@ void skb_frag_destructor_unref(struct skb_frag_destructor *destroy)
 	if (destroy == NULL)
 		return;
 
-	if (atomic_dec_and_test(&destroy->ref))
+	if (atomic_dec_and_test(&destroy->ref)) {
+		if (0) printk(KERN_CRIT "%s from %pF: calling destructor %p %pf(%p)\n",
+		       __func__, __builtin_return_address(0),
+		       destroy, destroy->destroy, destroy->data);
 		destroy->destroy(destroy->data);
+	} else {
+		if (0) printk(KERN_CRIT "%s from %pF: destructor %p %pf(%p) not called %d remaining\n",
+		       __func__, __builtin_return_address(0),
+		       destroy, destroy->destroy, destroy->data,
+		       atomic_read(&destroy->ref));
+	}
 }
 EXPORT_SYMBOL(skb_frag_destructor_unref);
 
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 40c2420..64ce9907 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -176,6 +176,10 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 	int		slen;
 	int		len = 0;
 
+	if (xdr->destructor)
+		if (0) printk(KERN_CRIT "%s sending xdr %p with destructor %pF\n",
+		       __func__, xdr, xdr->destructor->destroy);
+
 	slen = xdr->len;
 
 	/* send head */
@@ -194,6 +198,9 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 	while (pglen > 0) {
 		if (slen == size)
 			flags = 0;
+		if (xdr->destructor)
+			if (0) printk(KERN_CRIT "%s sending xdr %p page %p with destructor %pF\n",
+			       __func__, xdr, *ppage, xdr->destructor);
 		result = kernel_sendpage(sock, *ppage, xdr->destructor, base, size, flags);
 		if (result > 0)
 			len += result;
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 9c7dded..fcaea6a 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -135,6 +135,10 @@ xdr_encode_pages(struct xdr_buf *xdr, struct page **pages, unsigned int base,
 	struct kvec *tail = xdr->tail;
 	u32 *p;
 
+	if (xdr->destructor)
+		if (0) printk(KERN_CRIT "%s xdr %p with destructor %p\n",
+		       __func__, xdr, xdr->destructor);
+
 	xdr->pages = pages;
 	xdr->page_base = base;
 	xdr->page_len = len;
@@ -165,6 +169,11 @@ xdr_inline_pages(struct xdr_buf *xdr, unsigned int offset,
 	char *buf = (char *)head->iov_base;
 	unsigned int buflen = head->iov_len;
 
+
+	if (xdr->destructor)
+		if (0) printk(KERN_CRIT "%s xdr %p with destructor %p\n",
+		       __func__, xdr, xdr->destructor);
+
 	head->iov_len  = offset;
 
 	xdr->pages = pages;
@@ -535,7 +544,8 @@ void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
 	buf->destructor = destroy;
 	buf->page_base = base;
 	buf->page_len = len;
-
+	if (destroy)
+		if (0) printk(KERN_CRIT "%s xdr %p %p destructor %p\n", __func__, xdr, buf, destroy);
 	iov->iov_base = (char *)xdr->p;
 	iov->iov_len  = 0;
 	xdr->iov = iov;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index aa31294..336d787 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -386,6 +386,10 @@ static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned i
 	unsigned int remainder;
 	int err, sent = 0;
 
+	if (xdr->destructor)
+		if (0) printk(KERN_CRIT "%s sending xdr %p with destructor %pF\n",
+		       __func__, xdr, xdr->destructor->destroy);
+
 	remainder = xdr->page_len - base;
 	base += xdr->page_base;
 	ppage = xdr->pages + (base >> PAGE_SHIFT);
@@ -425,6 +429,10 @@ static int xs_sendpages(struct socket *sock, struct sockaddr *addr, int addrlen,
 	unsigned int remainder = xdr->len - base;
 	int err, sent = 0;
 
+	if (xdr->destructor)
+		if (0) printk(KERN_CRIT "%s sending xdr %p with destructor %pF\n",
+		       __func__, xdr, xdr->destructor->destroy);
+
 	if (unlikely(!sock))
 		return -ENOTSOCK;
 
-- 
1.7.2.5


^ permalink raw reply related

* [PATCH 01/10] mm: Make some struct page's const.
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-nfs, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel@zakaz.uk.xensource.com>

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 include/linux/mm.h |   10 +++++-----
 mm/sparse.c        |    2 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 9670f71..550ec8f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -636,7 +636,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
 #define SECTIONS_MASK		((1UL << SECTIONS_WIDTH) - 1)
 #define ZONEID_MASK		((1UL << ZONEID_SHIFT) - 1)
 
-static inline enum zone_type page_zonenum(struct page *page)
+static inline enum zone_type page_zonenum(const struct page *page)
 {
 	return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
 }
@@ -664,15 +664,15 @@ static inline int zone_to_nid(struct zone *zone)
 }
 
 #ifdef NODE_NOT_IN_PAGE_FLAGS
-extern int page_to_nid(struct page *page);
+extern int page_to_nid(const struct page *page);
 #else
-static inline int page_to_nid(struct page *page)
+static inline int page_to_nid(const struct page *page)
 {
 	return (page->flags >> NODES_PGSHIFT) & NODES_MASK;
 }
 #endif
 
-static inline struct zone *page_zone(struct page *page)
+static inline struct zone *page_zone(const struct page *page)
 {
 	return &NODE_DATA(page_to_nid(page))->node_zones[page_zonenum(page)];
 }
@@ -717,7 +717,7 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
  */
 #include <linux/vmstat.h>
 
-static __always_inline void *lowmem_page_address(struct page *page)
+static __always_inline void *lowmem_page_address(const struct page *page)
 {
 	return __va(PFN_PHYS(page_to_pfn(page)));
 }
diff --git a/mm/sparse.c b/mm/sparse.c
index aa64b12..858e1df 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -40,7 +40,7 @@ static u8 section_to_node_table[NR_MEM_SECTIONS] __cacheline_aligned;
 static u16 section_to_node_table[NR_MEM_SECTIONS] __cacheline_aligned;
 #endif
 
-int page_to_nid(struct page *page)
+int page_to_nid(const struct page *page)
 {
 	return section_to_node_table[page_to_section(page)];
 }
-- 
1.7.2.5


^ permalink raw reply related

* [PATCH 03/10] net: add APIs for manipulating skb page fragments.
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev; +Cc: linux-nfs, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel@zakaz.uk.xensource.com>

The primary aim is to add skb_frag_(ref|unref) in order to remove the use of
bare get/put_page on SKB pages fragments and to isolate users from subsequent
changes to the skb_frag_t data structure.

The API also includes an accessor for the struct page itself. The default
variant of this returns a *const* struct page in an attempt to catch bare uses
of get/put_page (which take a non-const struct page).

Also included are helper APIs for passing a paged fragment to kmap and
(pci|dma)_map_page since I was seeing the same pattern a lot.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
---
 include/linux/skbuff.h |  225 +++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 223 insertions(+), 2 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c0a4f3a..c061257 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -29,6 +29,8 @@
 #include <linux/rcupdate.h>
 #include <linux/dmaengine.h>
 #include <linux/hrtimer.h>
+#include <linux/highmem.h>
+#include <linux/pci.h>
 
 /* Don't change this without changing skb_csum_unnecessary! */
 #define CHECKSUM_NONE 0
@@ -1109,14 +1111,47 @@ static inline int skb_pagelen(const struct sk_buff *skb)
 	return len + skb_headlen(skb);
 }
 
-static inline void skb_fill_page_desc(struct sk_buff *skb, int i,
-				      struct page *page, int off, int size)
+/**
+ * __skb_fill_page_desc - initialise a paged fragment in an skb
+ * @skb: buffer containing fragment to be initialised
+ * @i: paged fragment index to initialise
+ * @page: the page to use for this fragment
+ * @off: the offset to the data with @page
+ * @size: the length of the data
+ *
+ * Initialises the @i'th fragment of @skb to point to &size bytes at
+ * offset @off within @page.
+ *
+ * Does not take any additional reference on the fragment.
+ */
+static inline void __skb_fill_page_desc(struct sk_buff *skb, int i,
+					struct page *page, int off, int size)
 {
 	skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
 	frag->page		  = page;
 	frag->page_offset	  = off;
 	frag->size		  = size;
+}
+
+/**
+ * skb_fill_page_desc - initialise a paged fragment in an skb
+ * @skb: buffer containing fragment to be initialised
+ * @i: paged fragment index to initialise
+ * @page: the page to use for this fragment
+ * @off: the offset to the data with @page
+ * @size: the length of the data
+ *
+ * As per __skb_fill_page_desc() -- initialises the @i'th fragment of
+ * @skb to point to &size bytes at offset @off within @page. In
+ * addition updates @skb such that @i is the last fragment.
+ *
+ * Does not take any additional reference on the fragment.
+ */
+static inline void skb_fill_page_desc(struct sk_buff *skb, int i,
+				      struct page *page, int off, int size)
+{
+	__skb_fill_page_desc(skb, i, page, off, size);
 	skb_shinfo(skb)->nr_frags = i + 1;
 }
 
@@ -1605,6 +1640,192 @@ static inline void netdev_free_page(struct net_device *dev, struct page *page)
 }
 
 /**
+ * __skb_frag_page - retrieve the page refered to by a paged fragment
+ * @frag: the paged fragment
+ *
+ * Returns the &struct page associated with @frag. Where possible you
+ * should use skb_frag_page() which returns a const &struct page.
+ */
+static inline struct page *__skb_frag_page(const skb_frag_t *frag)
+{
+	return frag->page;
+}
+
+/**
+ * __skb_frag_page - retrieve the page refered to by a paged fragment
+ * @frag: the paged fragment
+ *
+ * Returns the &struct page associated with @frag as a const.
+ */
+static inline const struct page *skb_frag_page(const skb_frag_t *frag)
+{
+	return frag->page;
+}
+
+/**
+ * __skb_frag_ref - take an addition reference on a paged fragment.
+ * @frag: the paged fragment
+ *
+ * Takes an additional reference on the paged fragment @frag.
+ */
+static inline void __skb_frag_ref(skb_frag_t *frag)
+{
+	get_page(__skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_ref - take an addition reference on a paged fragment of an skb.
+ * @skb: the buffer
+ * @f: the fragment offset.
+ *
+ * Takes an additional reference on the @f'th paged fragment of @skb.
+ */
+static inline void skb_frag_ref(struct sk_buff *skb, int f)
+{
+	__skb_frag_ref(&skb_shinfo(skb)->frags[f]);
+}
+
+/**
+ * __skb_frag_unref - release a reference on a paged fragment.
+ * @frag: the paged fragment
+ *
+ * Releases a reference on the paged fragment @frag.
+ */
+static inline void __skb_frag_unref(skb_frag_t *frag)
+{
+	put_page(__skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_unref - release a reference on a paged fragment of an skb.
+ * @skb: the buffer
+ * @f: the fragment offset
+ *
+ * Releases a reference on the @f'th paged fragment of @skb.
+ */
+static inline void skb_frag_unref(struct sk_buff *skb, int f)
+{
+	__skb_frag_unref(&skb_shinfo(skb)->frags[f]);
+}
+
+/**
+ * skb_frag_address - gets the address of the data contained in a paged fragment
+ * @frag: the paged fragment buffer
+ *
+ * Returns the address of the data within @frag. The page must already
+ * be mapped.
+ */
+static inline void *skb_frag_address(const skb_frag_t *frag)
+{
+	return page_address(skb_frag_page(frag)) + frag->page_offset;
+}
+
+/**
+ * skb_frag_address_safe - gets the address of the data contained in a paged fragment
+ * @frag: the paged fragment buffer
+ *
+ * Returns the address of the data within @frag. Checks that the page
+ * is mapped and returns %NULL otherwise.
+ */
+static inline void *skb_frag_address_safe(const skb_frag_t *frag)
+{
+	void *ptr = page_address(skb_frag_page(frag));
+	if (unlikely(!ptr))
+		return NULL;
+
+	return ptr + frag->page_offset;
+}
+
+/**
+ * __skb_frag_set_page - sets the page contained in a paged fragment
+ * @frag: the paged fragment
+ * @page: the page to set
+ *
+ * Sets the fragment @frag to contain @page.
+ */
+static inline void __skb_frag_set_page(skb_frag_t *frag, struct page *page)
+{
+	frag->page = page;
+	__skb_frag_ref(frag);
+}
+
+/**
+ * skb_frag_set_page - sets the page contained in a paged fragment of an skb
+ * @skb: the buffer
+ * @f: the fragment offset
+ * @page: the page to set
+ *
+ * Sets the @f'th fragment of @skb to contain @page.
+ */
+static inline void skb_frag_set_page(struct sk_buff *skb, int f,
+				     struct page *page)
+{
+	__skb_frag_set_page(&skb_shinfo(skb)->frags[f], page);
+}
+
+/**
+ * skb_frag_kmap - kmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kmap()s the paged fragment @frag and returns the virtual address.
+ */
+static inline void *skb_frag_kmap(skb_frag_t *frag)
+{
+	return kmap(__skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_kmap - kunmaps a paged fragment
+ * @frag: the paged fragment
+ *
+ * kunmap()s the paged fragment @frag.
+ */
+static inline void skb_frag_kunmap(skb_frag_t *frag)
+{
+	kunmap(__skb_frag_page(frag));
+}
+
+/**
+ * skb_frag_pci_map - maps a paged fragment to a PCI device
+ * @hwdev: the PCI device to map the fragment to
+ * @frag: the paged fragment to map
+ * @offset: the offset within the fragment (starting at the fragments own offset)
+ * @size: the number of bytes to map
+ * @direction: the direction of the mapping (%PCI_DMA_*)
+ *
+ * Maps the page associated with @frag to the PCI device @hwdev.
+ */
+static inline dma_addr_t skb_frag_pci_map(struct pci_dev *hwdev,
+					  const skb_frag_t *frag,
+					  unsigned long offset,
+					  size_t size,
+					  int direction)
+
+{
+	return pci_map_page(hwdev, __skb_frag_page(frag),
+			    frag->page_offset + offset, size, direction);
+}
+
+/**
+ * skb_frag_dma_map - maps a paged fragment via the DMA API
+ * @device: the device to map the fragment to
+ * @frag: the paged fragment to map
+ * @offset: the offset within the fragment (starting at the fragments own offset)
+ * @size: the number of bytes to map
+ * @direction: the direction of the mapping (%PCI_DMA_*)
+ *
+ * Maps the page associated with @frag to @device.
+ */
+static inline dma_addr_t skb_frag_dma_map(struct device *dev,
+					  const skb_frag_t *frag,
+					  size_t offset, size_t size,
+					  enum dma_data_direction dir)
+{
+	return dma_map_page(dev, __skb_frag_page(frag),
+			    frag->page_offset + offset, size, dir);
+}
+
+/**
  *	skb_clone_writable - is the header of a clone writable
  *	@skb: buffer to check
  *	@len: length up to which to write
-- 
1.7.2.5


^ permalink raw reply related

* [PATCH] gianfar: rx parser
From: Sebastian Pöhn @ 2011-07-15 11:09 UTC (permalink / raw)
  To: Linux Netdev, jpirko; +Cc: sandeep.kumar

Only let the rx parser be enabled if it is necessary (if VLAN extracrtion, IP or TCP checksumming or the rx queue filer are enabled). Otherwise disable it.

The new routine gfar_check_rx_parser_mode should be runt after every change on this features and will enable/disable the parser as necessary.

I will send a patch for the rx queue filer the next days.



Signed-off-by: Sebastian Poehn <sebastian.poehn@belden.com>
---

 drivers/net/gianfar.c |   25 ++++++++++++++++++++-----
 drivers/net/gianfar.h |    3 ++-
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 3321d71..6da3712 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2287,6 +2287,23 @@ static int gfar_set_mac_address(struct net_device *dev)
 	return 0;
 }
 
+/* Check if rx parser should be activated */
+void gfar_check_rx_parser_mode(struct gfar_private *priv)
+{
+	struct gfar __iomem *regs = NULL;
+	u32 tempval = 0;
+	
+	regs = priv->gfargrp[0].regs;
+	
+	tempval = gfar_read(&regs->rctrl);
+	/* If parse is no longer required, then disable parser */
+	if (tempval & RCTRL_REQ_PARSER)
+		tempval |= RCTRL_PRSDEP_INIT;
+	else
+		tempval &= ~RCTRL_PRSDEP_INIT;
+	gfar_write(&regs->rctrl, tempval);
+}
+
 
 /* Enables and disables VLAN insertion/extraction */
 static void gfar_vlan_rx_register(struct net_device *dev,
@@ -2323,12 +2340,10 @@ static void gfar_vlan_rx_register(struct net_device *dev,
 		/* Disable VLAN tag extraction */
 		tempval = gfar_read(&regs->rctrl);
 		tempval &= ~RCTRL_VLEX;
-		/* If parse is no longer required, then disable parser */
-		if (tempval & RCTRL_REQ_PARSER)
-			tempval |= RCTRL_PRSDEP_INIT;
-		else
-			tempval &= ~RCTRL_PRSDEP_INIT;
 		gfar_write(&regs->rctrl, tempval);
+
+		gfar_check_rx_parser_mode(priv);
+		
 	}
 
 	gfar_change_mtu(dev, dev->mtu);
diff --git a/drivers/net/gianfar.h b/drivers/net/gianfar.h
index 27499c6..87c1d86 100644
--- a/drivers/net/gianfar.h
+++ b/drivers/net/gianfar.h
@@ -286,7 +286,7 @@ extern const char gfar_driver_version[];
 #define RCTRL_PROM		0x00000008
 #define RCTRL_EMEN		0x00000002
 #define RCTRL_REQ_PARSER	(RCTRL_VLEX | RCTRL_IPCSEN | \
-				 RCTRL_TUCSEN)
+				 RCTRL_TUCSEN | RCTRL_FILREN)
 #define RCTRL_CHECKSUMMING	(RCTRL_IPCSEN | RCTRL_TUCSEN | \
 				RCTRL_PRSDEP_INIT)
 #define RCTRL_EXTHASH		(RCTRL_GHTX)
@@ -1182,6 +1182,7 @@ extern void gfar_configure_coalescing(struct gfar_private *priv,
 		unsigned long tx_mask, unsigned long rx_mask);
 void gfar_init_sysfs(struct net_device *dev);
 int gfar_set_features(struct net_device *dev, u32 features);
+extern void gfar_check_rx_parser_mode(struct gfar_private *priv);
 
 extern const struct ethtool_ops gfar_ethtool_ops;
 

 



^ permalink raw reply related

* [PATCH 09/10] nfs: use sk fragment destructors to delay I/O completion until page is released by network stack.
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Thos prevents an issue where an ACK is delayed, a retransmit is queued (either
at the RPC or TCP level) and the ACK arrives before the retransmission hits the
wire. If this happens then the write() system call and the userspace process
can continue potentially modifying the data before the retransmission occurs.

NB: this only covers the O_DIRECT write() case. I expect other cases to need
handling as well.

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 fs/nfs/direct.c            |   17 +++++++++++++++--
 fs/nfs/nfs2xdr.c           |    4 ++--
 fs/nfs/nfs3xdr.c           |    7 ++++---
 fs/nfs/nfs4xdr.c           |    6 +++---
 include/linux/nfs_xdr.h    |    4 ++++
 include/linux/sunrpc/xdr.h |    2 ++
 net/sunrpc/svcsock.c       |    2 +-
 net/sunrpc/xdr.c           |    4 +++-
 net/sunrpc/xprtsock.c      |    2 +-
 9 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
index 8eea253..4735fd9 100644
--- a/fs/nfs/direct.c
+++ b/fs/nfs/direct.c
@@ -691,8 +691,7 @@ static void nfs_direct_write_release(void *calldata)
 out_unlock:
 	spin_unlock(&dreq->lock);
 
-	if (put_dreq(dreq))
-		nfs_direct_write_complete(dreq, data->inode);
+	skb_frag_destructor_unref(data->args.pages_destructor);
 }
 
 static const struct rpc_call_ops nfs_write_direct_ops = {
@@ -703,6 +702,15 @@ static const struct rpc_call_ops nfs_write_direct_ops = {
 	.rpc_release = nfs_direct_write_release,
 };
 
+static int nfs_write_page_destroy(void *calldata)
+{
+	struct nfs_write_data *data = calldata;
+	struct nfs_direct_req *dreq = (struct nfs_direct_req *) data->req;
+	if (put_dreq(dreq))
+		nfs_direct_write_complete(dreq, data->inode);
+	return 0;
+}
+
 /*
  * For each wsize'd chunk of the user's buffer, dispatch an NFS WRITE
  * operation.  If nfs_writedata_alloc() or get_user_pages() fails,
@@ -769,6 +777,10 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 
 		list_move_tail(&data->pages, &dreq->rewrite_list);
 
+		atomic_set(&data->pagevec_destructor.ref, 1);
+		data->pagevec_destructor.destroy = nfs_write_page_destroy;
+		data->pagevec_destructor.data = data;
+
 		data->req = (struct nfs_page *) dreq;
 		data->inode = inode;
 		data->cred = msg.rpc_cred;
@@ -778,6 +790,7 @@ static ssize_t nfs_direct_write_schedule_segment(struct nfs_direct_req *dreq,
 		data->args.offset = pos;
 		data->args.pgbase = pgbase;
 		data->args.pages = data->pagevec;
+		data->args.pages_destructor = &data->pagevec_destructor;
 		data->args.count = bytes;
 		data->args.stable = sync;
 		data->res.fattr = &data->fattr;
diff --git a/fs/nfs/nfs2xdr.c b/fs/nfs/nfs2xdr.c
index 792cb13..6dc77f0 100644
--- a/fs/nfs/nfs2xdr.c
+++ b/fs/nfs/nfs2xdr.c
@@ -431,7 +431,7 @@ static void encode_path(struct xdr_stream *xdr, struct page **pages, u32 length)
 	BUG_ON(length > NFS2_MAXPATHLEN);
 	p = xdr_reserve_space(xdr, 4);
 	*p = cpu_to_be32(length);
-	xdr_write_pages(xdr, pages, 0, length);
+	xdr_write_pages(xdr, pages, NULL, 0, length);
 }
 
 static int decode_path(struct xdr_stream *xdr)
@@ -659,7 +659,7 @@ static void encode_writeargs(struct xdr_stream *xdr,
 
 	/* nfsdata */
 	*p = cpu_to_be32(count);
-	xdr_write_pages(xdr, args->pages, args->pgbase, count);
+	xdr_write_pages(xdr, args->pages, NULL, args->pgbase, count);
 }
 
 static void nfs2_xdr_enc_writeargs(struct rpc_rqst *req,
diff --git a/fs/nfs/nfs3xdr.c b/fs/nfs/nfs3xdr.c
index 183c6b1..f7a83a1 100644
--- a/fs/nfs/nfs3xdr.c
+++ b/fs/nfs/nfs3xdr.c
@@ -238,7 +238,7 @@ static void encode_nfspath3(struct xdr_stream *xdr, struct page **pages,
 {
 	BUG_ON(length > NFS3_MAXPATHLEN);
 	encode_uint32(xdr, length);
-	xdr_write_pages(xdr, pages, 0, length);
+	xdr_write_pages(xdr, pages, NULL, 0, length);
 }
 
 static int decode_nfspath3(struct xdr_stream *xdr)
@@ -994,7 +994,8 @@ static void encode_write3args(struct xdr_stream *xdr,
 	*p++ = cpu_to_be32(args->count);
 	*p++ = cpu_to_be32(args->stable);
 	*p = cpu_to_be32(args->count);
-	xdr_write_pages(xdr, args->pages, args->pgbase, args->count);
+	xdr_write_pages(xdr, args->pages, args->pages_destructor,
+			args->pgbase, args->count);
 }
 
 static void nfs3_xdr_enc_write3args(struct rpc_rqst *req,
@@ -1331,7 +1332,7 @@ static void nfs3_xdr_enc_setacl3args(struct rpc_rqst *req,
 
 	base = req->rq_slen;
 	if (args->npages != 0)
-		xdr_write_pages(xdr, args->pages, 0, args->len);
+		xdr_write_pages(xdr, args->pages, NULL, 0, args->len);
 	else
 		xdr_reserve_space(xdr, NFS_ACL_INLINE_BUFSIZE);
 
diff --git a/fs/nfs/nfs4xdr.c b/fs/nfs/nfs4xdr.c
index 6870bc6..ac9931d 100644
--- a/fs/nfs/nfs4xdr.c
+++ b/fs/nfs/nfs4xdr.c
@@ -1031,7 +1031,7 @@ static void encode_create(struct xdr_stream *xdr, const struct nfs4_create_arg *
 	case NF4LNK:
 		p = reserve_space(xdr, 4);
 		*p = cpu_to_be32(create->u.symlink.len);
-		xdr_write_pages(xdr, create->u.symlink.pages, 0, create->u.symlink.len);
+		xdr_write_pages(xdr, create->u.symlink.pages, NULL, 0, create->u.symlink.len);
 		break;
 
 	case NF4BLK: case NF4CHR:
@@ -1573,7 +1573,7 @@ encode_setacl(struct xdr_stream *xdr, struct nfs_setaclargs *arg, struct compoun
 	BUG_ON(arg->acl_len % 4);
 	p = reserve_space(xdr, 4);
 	*p = cpu_to_be32(arg->acl_len);
-	xdr_write_pages(xdr, arg->acl_pages, arg->acl_pgbase, arg->acl_len);
+	xdr_write_pages(xdr, arg->acl_pages, NULL, arg->acl_pgbase, arg->acl_len);
 	hdr->nops++;
 	hdr->replen += decode_setacl_maxsz;
 }
@@ -1647,7 +1647,7 @@ static void encode_write(struct xdr_stream *xdr, const struct nfs_writeargs *arg
 	*p++ = cpu_to_be32(args->stable);
 	*p = cpu_to_be32(args->count);
 
-	xdr_write_pages(xdr, args->pages, args->pgbase, args->count);
+	xdr_write_pages(xdr, args->pages, NULL, args->pgbase, args->count);
 	hdr->nops++;
 	hdr->replen += decode_write_maxsz;
 }
diff --git a/include/linux/nfs_xdr.h b/include/linux/nfs_xdr.h
index 00848d8..4bbc2bf 100644
--- a/include/linux/nfs_xdr.h
+++ b/include/linux/nfs_xdr.h
@@ -5,6 +5,8 @@
 #include <linux/nfs3.h>
 #include <linux/sunrpc/gss_api.h>
 
+#include <linux/skbuff.h>
+
 /*
  * To change the maximum rsize and wsize supported by the NFS client, adjust
  * NFS_MAX_FILE_IO_SIZE.  64KB is a typical maximum, but some servers can
@@ -470,6 +472,7 @@ struct nfs_writeargs {
 	enum nfs3_stable_how	stable;
 	unsigned int		pgbase;
 	struct page **		pages;
+	struct skb_frag_destructor *pages_destructor;
 	const u32 *		bitmask;
 	struct nfs4_sequence_args	seq_args;
 };
@@ -1121,6 +1124,7 @@ struct nfs_write_data {
 	struct list_head	pages;		/* Coalesced requests we wish to flush */
 	struct nfs_page		*req;		/* multi ops per nfs_page */
 	struct page		**pagevec;
+	struct skb_frag_destructor pagevec_destructor;
 	unsigned int		npages;		/* Max length of pagevec */
 	struct nfs_writeargs	args;		/* argument struct */
 	struct nfs_writeres	res;		/* result struct */
diff --git a/include/linux/sunrpc/xdr.h b/include/linux/sunrpc/xdr.h
index a20970e..cebb531 100644
--- a/include/linux/sunrpc/xdr.h
+++ b/include/linux/sunrpc/xdr.h
@@ -57,6 +57,7 @@ struct xdr_buf {
 			tail[1];	/* Appended after page data */
 
 	struct page **	pages;		/* Array of contiguous pages */
+	struct skb_frag_destructor *destructor;
 	unsigned int	page_base,	/* Start of page data */
 			page_len,	/* Length of page data */
 			flags;		/* Flags for data disposition */
@@ -214,6 +215,7 @@ typedef int	(*kxdrdproc_t)(void *rqstp, struct xdr_stream *xdr, void *obj);
 extern void xdr_init_encode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p);
 extern __be32 *xdr_reserve_space(struct xdr_stream *xdr, size_t nbytes);
 extern void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
+		struct skb_frag_destructor *destroy,
 		unsigned int base, unsigned int len);
 extern void xdr_init_decode(struct xdr_stream *xdr, struct xdr_buf *buf, __be32 *p);
 extern void xdr_init_decode_pages(struct xdr_stream *xdr, struct xdr_buf *buf,
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index a80b1d3..40c2420 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -194,7 +194,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 	while (pglen > 0) {
 		if (slen == size)
 			flags = 0;
-		result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
+		result = kernel_sendpage(sock, *ppage, xdr->destructor, base, size, flags);
 		if (result > 0)
 			len += result;
 		if (result != size)
diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index f008c14..9c7dded 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -525,12 +525,14 @@ EXPORT_SYMBOL_GPL(xdr_reserve_space);
  * @len: length of data in bytes
  *
  */
-void xdr_write_pages(struct xdr_stream *xdr, struct page **pages, unsigned int base,
+void xdr_write_pages(struct xdr_stream *xdr, struct page **pages,
+		 struct skb_frag_destructor *destroy, unsigned int base,
 		 unsigned int len)
 {
 	struct xdr_buf *buf = xdr->buf;
 	struct kvec *iov = buf->tail;
 	buf->pages = pages;
+	buf->destructor = destroy;
 	buf->page_base = base;
 	buf->page_len = len;
 
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 72abb73..aa31294 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -397,7 +397,7 @@ static int xs_send_pagedata(struct socket *sock, struct xdr_buf *xdr, unsigned i
 		remainder -= len;
 		if (remainder != 0 || more)
 			flags |= MSG_MORE;
-		err = sock->ops->sendpage(sock, *ppage, base, len, flags);
+		err = sock->ops->sendpage_destructor(sock, *ppage, xdr->destructor, base, len, flags);
 		if (remainder == 0 || err != len)
 			break;
 		sent += err;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 08/10] net: add paged frag destructor support to kernel_sendpage.
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

NB: I added a separate sendpage_destructor to struct proto_ops and struct proto
for this PoC but expect that it would be preferable to just add the new
parameter and update all callers in the tree for the final version.

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 drivers/staging/pohmelfs/trans.c |    2 +-
 fs/dlm/lowcomms.c                |    2 +-
 include/linux/net.h              |    9 ++++++++-
 include/net/inet_common.h        |    4 ++++
 include/net/sock.h               |    4 ++++
 include/net/tcp.h                |    4 ++++
 net/ceph/messenger.c             |    2 +-
 net/ipv4/af_inet.c               |   20 ++++++++++++++++++--
 net/ipv4/tcp.c                   |   29 ++++++++++++++++++++++++-----
 net/ipv4/tcp_ipv4.c              |    1 +
 net/ipv6/af_inet6.c              |    1 +
 net/ipv6/tcp_ipv6.c              |    1 +
 net/socket.c                     |   28 +++++++++++++++++++++++-----
 net/sunrpc/svcsock.c             |    6 +++---
 14 files changed, 94 insertions(+), 19 deletions(-)

diff --git a/drivers/staging/pohmelfs/trans.c b/drivers/staging/pohmelfs/trans.c
index 36a2535..b5d8411 100644
--- a/drivers/staging/pohmelfs/trans.c
+++ b/drivers/staging/pohmelfs/trans.c
@@ -104,7 +104,7 @@ static int netfs_trans_send_pages(struct netfs_trans *t, struct netfs_state *st)
 		msg.msg_flags = MSG_WAITALL | (attached_pages == 1 ? 0 :
 				MSG_MORE);
 
-		err = kernel_sendpage(st->socket, page, 0, size, msg.msg_flags);
+		err = kernel_sendpage(st->socket, page, NULL, 0, size, msg.msg_flags);
 		if (err <= 0) {
 			printk("%s: %d/%d failed to send transaction page: t: %p, gen: %u, size: %u, err: %d.\n",
 					__func__, i, t->page_num, t, t->gen, size, err);
diff --git a/fs/dlm/lowcomms.c b/fs/dlm/lowcomms.c
index 5e2c71f..64933ff 100644
--- a/fs/dlm/lowcomms.c
+++ b/fs/dlm/lowcomms.c
@@ -1341,7 +1341,7 @@ static void send_to_sock(struct connection *con)
 
 		ret = 0;
 		if (len) {
-			ret = kernel_sendpage(con->sock, e->page, offset, len,
+			ret = kernel_sendpage(con->sock, e->page, NULL, offset, len,
 					      msg_flags);
 			if (ret == -EAGAIN || ret == 0) {
 				if (ret == -EAGAIN &&
diff --git a/include/linux/net.h b/include/linux/net.h
index b299230..dfedc46 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -157,6 +157,7 @@ struct kiocb;
 struct sockaddr;
 struct msghdr;
 struct module;
+struct skb_frag_destructor;
 
 struct proto_ops {
 	int		family;
@@ -204,6 +205,10 @@ struct proto_ops {
 				      struct vm_area_struct * vma);
 	ssize_t		(*sendpage)  (struct socket *sock, struct page *page,
 				      int offset, size_t size, int flags);
+	ssize_t		(*sendpage_destructor)
+				     (struct socket *sock, struct page *page,
+				      struct skb_frag_destructor *destroy,
+				      int offset, size_t size, int flags);
 	ssize_t 	(*splice_read)(struct socket *sock,  loff_t *ppos,
 				       struct pipe_inode_info *pipe, size_t len, unsigned int flags);
 };
@@ -273,7 +278,9 @@ extern int kernel_getsockopt(struct socket *sock, int level, int optname,
 			     char *optval, int *optlen);
 extern int kernel_setsockopt(struct socket *sock, int level, int optname,
 			     char *optval, unsigned int optlen);
-extern int kernel_sendpage(struct socket *sock, struct page *page, int offset,
+extern int kernel_sendpage(struct socket *sock, struct page *page,
+			   struct skb_frag_destructor *destroy,
+			   int offset,
 			   size_t size, int flags);
 extern int kernel_sock_ioctl(struct socket *sock, int cmd, unsigned long arg);
 extern int kernel_sock_shutdown(struct socket *sock,
diff --git a/include/net/inet_common.h b/include/net/inet_common.h
index 22fac98..0c39b4b 100644
--- a/include/net/inet_common.h
+++ b/include/net/inet_common.h
@@ -23,6 +23,10 @@ extern int inet_sendmsg(struct kiocb *iocb, struct socket *sock,
 			struct msghdr *msg, size_t size);
 extern ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 			     size_t size, int flags);
+extern ssize_t inet_sendpage_destructor(struct socket *sock, struct page *page,
+			     struct skb_frag_destructor *frag,
+			     int offset,
+			     size_t size, int flags);
 extern int inet_recvmsg(struct kiocb *iocb, struct socket *sock,
 			struct msghdr *msg, size_t size, int flags);
 extern int inet_shutdown(struct socket *sock, int how);
diff --git a/include/net/sock.h b/include/net/sock.h
index c0b938c..7836acf 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -764,6 +764,10 @@ struct proto {
 					int *addr_len);
 	int			(*sendpage)(struct sock *sk, struct page *page,
 					int offset, size_t size, int flags);
+	int			(*sendpage_destructor)
+				       (struct sock *sk, struct page *page,
+					struct skb_frag_destructor *destroy,
+					int offset, size_t size, int flags);
 	int			(*bind)(struct sock *sk, 
 					struct sockaddr *uaddr, int addr_len);
 
diff --git a/include/net/tcp.h b/include/net/tcp.h
index cda30ea..2b42320 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -319,6 +319,10 @@ extern int tcp_sendmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
 		       size_t size);
 extern int tcp_sendpage(struct sock *sk, struct page *page, int offset,
 			size_t size, int flags);
+extern int tcp_sendpage_destructor(struct sock *sk, struct page *page,
+			struct skb_frag_destructor *destroy,
+			int offset,
+			size_t size, int flags);
 extern int tcp_ioctl(struct sock *sk, int cmd, unsigned long arg);
 extern int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb,
 				 struct tcphdr *th, unsigned len);
diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c
index 78b55f4..ec7955b 100644
--- a/net/ceph/messenger.c
+++ b/net/ceph/messenger.c
@@ -852,7 +852,7 @@ static int write_partial_msg_pages(struct ceph_connection *con)
 				cpu_to_le32(crc32c(tmpcrc, base, len));
 			con->out_msg_pos.did_page_crc = 1;
 		}
-		ret = kernel_sendpage(con->sock, page,
+		ret = kernel_sendpage(con->sock, page, NULL,
 				      con->out_msg_pos.page_pos + page_shift,
 				      len,
 				      MSG_DONTWAIT | MSG_NOSIGNAL |
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index eae1f67..7954809 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -738,7 +738,9 @@ int inet_sendmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
 }
 EXPORT_SYMBOL(inet_sendmsg);
 
-ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
+ssize_t inet_sendpage_destructor(struct socket *sock, struct page *page,
+		      struct skb_frag_destructor *destroy,
+		      int offset,
 		      size_t size, int flags)
 {
 	struct sock *sk = sock->sk;
@@ -750,10 +752,21 @@ ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
 	    inet_autobind(sk))
 		return -EAGAIN;
 
-	if (sk->sk_prot->sendpage)
+	if (destroy) {
+		if (sk->sk_prot->sendpage_destructor)
+			return sk->sk_prot->sendpage_destructor
+				(sk, page, destroy, offset, size, flags);
+	} else if (sk->sk_prot->sendpage)
 		return sk->sk_prot->sendpage(sk, page, offset, size, flags);
 	return sock_no_sendpage(sock, page, offset, size, flags);
 }
+EXPORT_SYMBOL(inet_sendpage_destructor);
+
+ssize_t inet_sendpage(struct socket *sock, struct page *page, int offset,
+		      size_t size, int flags)
+{
+	return inet_sendpage_destructor(sock, page, NULL, offset, size, flags);
+}
 EXPORT_SYMBOL(inet_sendpage);
 
 int inet_recvmsg(struct kiocb *iocb, struct socket *sock, struct msghdr *msg,
@@ -917,6 +930,7 @@ const struct proto_ops inet_stream_ops = {
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
+	.sendpage_destructor = inet_sendpage_destructor,
 	.splice_read	   = tcp_splice_read,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_sock_common_setsockopt,
@@ -945,6 +959,7 @@ const struct proto_ops inet_dgram_ops = {
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
+	.sendpage_destructor = inet_sendpage_destructor,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_sock_common_setsockopt,
 	.compat_getsockopt = compat_sock_common_getsockopt,
@@ -976,6 +991,7 @@ static const struct proto_ops inet_sockraw_ops = {
 	.recvmsg	   = inet_recvmsg,
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
+	.sendpage_destructor = inet_sendpage_destructor,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_sock_common_setsockopt,
 	.compat_getsockopt = compat_sock_common_getsockopt,
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3a3703c..bfc778e 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -757,7 +757,10 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
 	return mss_now;
 }
 
-static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffset,
+static ssize_t do_tcp_sendpages(struct sock *sk,
+				struct page **pages,
+				struct skb_frag_destructor **destructors,
+				int poffset,
 			 size_t psize, int flags)
 {
 	struct tcp_sock *tp = tcp_sk(sk);
@@ -783,6 +786,7 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page **pages, int poffse
 	while (psize > 0) {
 		struct sk_buff *skb = tcp_write_queue_tail(sk);
 		struct page *page = pages[poffset / PAGE_SIZE];
+		struct skb_frag_destructor *destructor = destructors ? destructors[poffset / PAGE_SIZE] : NULL;
 		int copy, i, can_coalesce;
 		int offset = poffset % PAGE_SIZE;
 		int size = min_t(size_t, psize, PAGE_SIZE - offset);
@@ -815,8 +819,9 @@ new_segment:
 		if (can_coalesce) {
 			skb_shinfo(skb)->frags[i - 1].size += copy;
 		} else {
-			get_page(page);
 			skb_fill_page_desc(skb, i, page, offset, copy);
+			skb_shinfo(skb)->frags[i].page.destructor = destructor;
+			skb_frag_ref(skb, i);
 		}
 
 		skb->len += copy;
@@ -871,8 +876,11 @@ out_err:
 	return sk_stream_error(sk, flags, err);
 }
 
-int tcp_sendpage(struct sock *sk, struct page *page, int offset,
-		 size_t size, int flags)
+int tcp_sendpage_destructor(struct sock *sk,
+			    struct page *page,
+			    struct skb_frag_destructor *destructor,
+			    int offset,
+			    size_t size, int flags)
 {
 	ssize_t res;
 
@@ -882,10 +890,21 @@ int tcp_sendpage(struct sock *sk, struct page *page, int offset,
 					flags);
 
 	lock_sock(sk);
-	res = do_tcp_sendpages(sk, &page, offset, size, flags);
+	res = do_tcp_sendpages(sk, &page,
+			       destructor ? &destructor : NULL,
+			       offset, size, flags);
 	release_sock(sk);
 	return res;
 }
+EXPORT_SYMBOL(tcp_sendpage_destructor);
+
+int tcp_sendpage(struct sock *sk,
+		 struct page *page,
+		 int offset,
+		 size_t size, int flags)
+{
+	return tcp_sendpage_destructor(sk, page, NULL, offset, size, flags);
+}
 EXPORT_SYMBOL(tcp_sendpage);
 
 #define TCP_PAGE(sk)	(sk->sk_sndmsg_page)
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 708dc20..9baa996 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2587,6 +2587,7 @@ struct proto tcp_prot = {
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
 	.sendpage		= tcp_sendpage,
+	.sendpage_destructor	= tcp_sendpage_destructor,
 	.backlog_rcv		= tcp_v4_do_rcv,
 	.hash			= inet_hash,
 	.unhash			= inet_unhash,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index d450a2f..58d2520 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -531,6 +531,7 @@ const struct proto_ops inet6_stream_ops = {
 	.recvmsg	   = inet_recvmsg,		/* ok		*/
 	.mmap		   = sock_no_mmap,
 	.sendpage	   = inet_sendpage,
+	.sendpage_destructor = inet_sendpage_destructor,
 	.splice_read	   = tcp_splice_read,
 #ifdef CONFIG_COMPAT
 	.compat_setsockopt = compat_sock_common_setsockopt,
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 87551ca..98a2576 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -2210,6 +2210,7 @@ struct proto tcpv6_prot = {
 	.recvmsg		= tcp_recvmsg,
 	.sendmsg		= tcp_sendmsg,
 	.sendpage		= tcp_sendpage,
+	.sendpage_destructor	= tcp_sendpage_destructor,
 	.backlog_rcv		= tcp_v6_do_rcv,
 	.hash			= tcp_v6_hash,
 	.unhash			= inet_unhash,
diff --git a/net/socket.c b/net/socket.c
index 02dc82d..f1c39a4 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -795,7 +795,7 @@ static ssize_t sock_sendpage(struct file *file, struct page *page,
 	if (more)
 		flags |= MSG_MORE;
 
-	return kernel_sendpage(sock, page, offset, size, flags);
+	return kernel_sendpage(sock, page, NULL, offset, size, flags);
 }
 
 static ssize_t sock_splice_read(struct file *file, loff_t *ppos,
@@ -3343,15 +3343,33 @@ int kernel_setsockopt(struct socket *sock, int level, int optname,
 }
 EXPORT_SYMBOL(kernel_setsockopt);
 
-int kernel_sendpage(struct socket *sock, struct page *page, int offset,
+int kernel_sendpage(struct socket *sock, struct page *page,
+		    struct skb_frag_destructor *destroy,
+		    int offset,
 		    size_t size, int flags)
 {
+	int ret;
 	sock_update_classid(sock->sk);
 
-	if (sock->ops->sendpage)
-		return sock->ops->sendpage(sock, page, offset, size, flags);
+	/*
+         * If we have a destructor but the socket does not support
+         * sendpage_destructor then fallback to sock_no_sendpage which
+         * is copying...
+         */
+	if (destroy) {
+		if (sock->ops->sendpage_destructor)
+			return sock->ops->sendpage_destructor(sock, page, destroy,
+							      offset, size, flags);
+	} else {
+		if (sock->ops->sendpage)
+			return sock->ops->sendpage(sock, page,
+						   offset, size, flags);
+	}
 
-	return sock_no_sendpage(sock, page, offset, size, flags);
+	ret = sock_no_sendpage(sock, page, offset, size, flags);
+	/* sock_no_sendpage copies so we can destroy immediately */
+	skb_frag_destructor_unref(destroy);
+	return ret;
 }
 EXPORT_SYMBOL(kernel_sendpage);
 
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index af04f77..a80b1d3 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -181,7 +181,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 	/* send head */
 	if (slen == xdr->head[0].iov_len)
 		flags = 0;
-	len = kernel_sendpage(sock, headpage, headoffset,
+	len = kernel_sendpage(sock, headpage, NULL, headoffset,
 				  xdr->head[0].iov_len, flags);
 	if (len != xdr->head[0].iov_len)
 		goto out;
@@ -194,7 +194,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 	while (pglen > 0) {
 		if (slen == size)
 			flags = 0;
-		result = kernel_sendpage(sock, *ppage, base, size, flags);
+		result = kernel_sendpage(sock, *ppage, NULL, base, size, flags);
 		if (result > 0)
 			len += result;
 		if (result != size)
@@ -208,7 +208,7 @@ int svc_send_common(struct socket *sock, struct xdr_buf *xdr,
 
 	/* send tail */
 	if (xdr->tail[0].iov_len) {
-		result = kernel_sendpage(sock, tailpage, tailoffset,
+		result = kernel_sendpage(sock, tailpage, NULL, tailoffset,
 				   xdr->tail[0].iov_len, 0);
 		if (result > 0)
 			len += result;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 06/10] net: convert drivers to paged frag API.
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Coccinelle was quite useful in the initial stages of this conversion but a) my
spatch was ugly as sin and b) I've done several rounds of updates since then so
they no longer actually represent the resultant changes anyway.

NB: should be split into individual patches to be acked by relevant driver
maintainers.

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 drivers/atm/eni.c                       |    4 +-
 drivers/infiniband/hw/amso1100/c2.c     |    7 +----
 drivers/infiniband/hw/nes/nes_nic.c     |   10 +++-----
 drivers/infiniband/ulp/ipoib/ipoib_cm.c |    6 +++-
 drivers/infiniband/ulp/ipoib/ipoib_ib.c |    6 +++-
 drivers/net/3c59x.c                     |    4 +-
 drivers/net/8139cp.c                    |    3 +-
 drivers/net/acenic.c                    |    6 ++--
 drivers/net/atl1c/atl1c_main.c          |    3 +-
 drivers/net/atl1e/atl1e_main.c          |    9 +++----
 drivers/net/atlx/atl1.c                 |    7 ++---
 drivers/net/benet/be_main.c             |   10 ++++----
 drivers/net/bna/bnad.c                  |    4 +-
 drivers/net/bnx2.c                      |    8 +++---
 drivers/net/bnx2x/bnx2x_cmn.c           |    3 +-
 drivers/net/cassini.c                   |   15 ++++++-------
 drivers/net/chelsio/sge.c               |    5 +--
 drivers/net/cxgb3/sge.c                 |    6 ++--
 drivers/net/cxgb4/sge.c                 |   14 +++++++-----
 drivers/net/cxgb4vf/sge.c               |   13 ++++++-----
 drivers/net/e1000/e1000_main.c          |    9 +++----
 drivers/net/e1000e/netdev.c             |    7 ++---
 drivers/net/enic/enic_main.c            |   12 ++++------
 drivers/net/forcedeth.c                 |   10 +++++---
 drivers/net/gianfar.c                   |   10 ++++----
 drivers/net/greth.c                     |    9 ++-----
 drivers/net/ibmveth.c                   |    5 +--
 drivers/net/igb/igb_main.c              |    5 +---
 drivers/net/igbvf/netdev.c              |    5 +---
 drivers/net/ixgb/ixgb_main.c            |    6 ++--
 drivers/net/ixgbe/ixgbe_main.c          |    9 +++----
 drivers/net/ixgbevf/ixgbevf_main.c      |   10 +++-----
 drivers/net/jme.c                       |    5 ++-
 drivers/net/ksz884x.c                   |    3 +-
 drivers/net/mlx4/en_rx.c                |   22 +++++++++-----------
 drivers/net/mlx4/en_tx.c                |   17 +-------------
 drivers/net/mv643xx_eth.c               |    8 +++---
 drivers/net/myri10ge/myri10ge.c         |   12 +++++-----
 drivers/net/netxen/netxen_nic_main.c    |    2 +-
 drivers/net/niu.c                       |    4 +-
 drivers/net/ns83820.c                   |    5 +--
 drivers/net/pasemi_mac.c                |    5 +--
 drivers/net/qla3xxx.c                   |    5 +--
 drivers/net/qlcnic/qlcnic_main.c        |    2 +-
 drivers/net/qlge/qlge_main.c            |    8 ++----
 drivers/net/r8169.c                     |    2 +-
 drivers/net/s2io.c                      |    7 ++---
 drivers/net/sfc/rx.c                    |    2 +-
 drivers/net/sfc/tx.c                    |    9 +------
 drivers/net/skge.c                      |    4 +-
 drivers/net/sky2.c                      |   13 +++++------
 drivers/net/starfire.c                  |    2 +-
 drivers/net/stmmac/stmmac_main.c        |    5 +--
 drivers/net/sungem.c                    |    7 ++---
 drivers/net/sunhme.c                    |    5 +--
 drivers/net/tehuti.c                    |    4 +-
 drivers/net/tg3.c                       |    6 +---
 drivers/net/tsi108_eth.c                |    7 +++--
 drivers/net/typhoon.c                   |    3 +-
 drivers/net/via-velocity.c              |    6 ++--
 drivers/net/virtio_net.c                |    2 +-
 drivers/net/vmxnet3/vmxnet3_drv.c       |    7 ++---
 drivers/net/vxge/vxge-main.c            |    6 ++--
 drivers/net/xen-netback/netback.c       |   34 +++++++++++++++++++++----------
 drivers/net/xen-netfront.c              |   28 +++++++++++++++----------
 drivers/scsi/bnx2fc/bnx2fc_fcoe.c       |    2 +-
 drivers/scsi/cxgbi/libcxgbi.c           |    6 ++--
 drivers/scsi/fcoe/fcoe.c                |    2 +-
 drivers/scsi/fcoe/fcoe_transport.c      |    5 ++-
 drivers/staging/et131x/et1310_tx.c      |   11 ++++-----
 drivers/staging/hv/netvsc_drv.c         |    2 +-
 71 files changed, 245 insertions(+), 280 deletions(-)

diff --git a/drivers/atm/eni.c b/drivers/atm/eni.c
index 3230ea0..60bc9c5 100644
--- a/drivers/atm/eni.c
+++ b/drivers/atm/eni.c
@@ -1133,8 +1133,8 @@ DPRINTK("doing direct send\n"); /* @@@ well, this doesn't work anyway */
 				    skb->data,
 				    skb_headlen(skb));
 			else
-				put_dma(tx->index,eni_dev->dma,&j,(unsigned long)
-				    skb_shinfo(skb)->frags[i].page + skb_shinfo(skb)->frags[i].page_offset,
+				put_dma(tx->index,eni_dev->dma,&j,
+				    (unsigned long)skb_frag_address(&skb_shinfo(skb)->frags[i]),
 				    skb_shinfo(skb)->frags[i].size);
 	}
 	if (skb->len & 3)
diff --git a/drivers/infiniband/hw/amso1100/c2.c b/drivers/infiniband/hw/amso1100/c2.c
index 0cfc455..0c411b4 100644
--- a/drivers/infiniband/hw/amso1100/c2.c
+++ b/drivers/infiniband/hw/amso1100/c2.c
@@ -801,11 +801,8 @@ static int c2_xmit_frame(struct sk_buff *skb, struct net_device *netdev)
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 			maplen = frag->size;
-			mapaddr =
-			    pci_map_page(c2dev->pcidev, frag->page,
-					 frag->page_offset, maplen,
-					 PCI_DMA_TODEVICE);
-
+			mapaddr = skb_frag_pci_map(c2dev->pcidev, frag, 0,
+						   maplen, PCI_DMA_TODEVICE);
 			elem = elem->next;
 			elem->skb = NULL;
 			elem->mapaddr = mapaddr;
diff --git a/drivers/infiniband/hw/nes/nes_nic.c b/drivers/infiniband/hw/nes/nes_nic.c
index d3a1c41..171a1ff 100644
--- a/drivers/infiniband/hw/nes/nes_nic.c
+++ b/drivers/infiniband/hw/nes/nes_nic.c
@@ -441,9 +441,8 @@ static int nes_nic_send(struct sk_buff *skb, struct net_device *netdev)
 		nesnic->tx_skb[nesnic->sq_head] = skb;
 		for (skb_fragment_index = 0; skb_fragment_index < skb_shinfo(skb)->nr_frags;
 				skb_fragment_index++) {
-			bus_address = pci_map_page( nesdev->pcidev,
-					skb_shinfo(skb)->frags[skb_fragment_index].page,
-					skb_shinfo(skb)->frags[skb_fragment_index].page_offset,
+			bus_address = skb_frag_pci_map( nesdev->pcidev,
+					&skb_shinfo(skb)->frags[skb_fragment_index], 0,
 					skb_shinfo(skb)->frags[skb_fragment_index].size,
 					PCI_DMA_TODEVICE);
 			wqe_fragment_length[wqe_fragment_index] =
@@ -561,9 +560,8 @@ tso_sq_no_longer_full:
 			/* Map all the buffers */
 			for (tso_frag_count=0; tso_frag_count < skb_shinfo(skb)->nr_frags;
 					tso_frag_count++) {
-				tso_bus_address[tso_frag_count] = pci_map_page( nesdev->pcidev,
-						skb_shinfo(skb)->frags[tso_frag_count].page,
-						skb_shinfo(skb)->frags[tso_frag_count].page_offset,
+				tso_bus_address[tso_frag_count] = skb_frag_pci_map( nesdev->pcidev,
+						&skb_shinfo(skb)->frags[tso_frag_count], 0,
 						skb_shinfo(skb)->frags[tso_frag_count].size,
 						PCI_DMA_TODEVICE);
 			}
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 39913a0..1f20e40 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -169,7 +169,8 @@ static struct sk_buff *ipoib_cm_alloc_rx_skb(struct net_device *dev,
 			goto partial_error;
 		skb_fill_page_desc(skb, i, page, 0, PAGE_SIZE);
 
-		mapping[i + 1] = ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[i].page,
+		mapping[i + 1] = ib_dma_map_page(priv->ca,
+						 __skb_frag_page(&skb_shinfo(skb)->frags[i]),
 						 0, PAGE_SIZE, DMA_FROM_DEVICE);
 		if (unlikely(ib_dma_mapping_error(priv->ca, mapping[i + 1])))
 			goto partial_error;
@@ -537,7 +538,8 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space,
 
 		if (length == 0) {
 			/* don't need this page */
-			skb_fill_page_desc(toskb, i, frag->page, 0, PAGE_SIZE);
+			skb_fill_page_desc(toskb, i, __skb_frag_page(frag),
+					   0, PAGE_SIZE);/* XXX */
 			--skb_shinfo(skb)->nr_frags;
 		} else {
 			size = min(length, (unsigned) PAGE_SIZE);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_ib.c b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
index 81ae61d..f6ef6c2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_ib.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_ib.c
@@ -182,7 +182,8 @@ static struct sk_buff *ipoib_alloc_rx_skb(struct net_device *dev, int id)
 			goto partial_error;
 		skb_fill_page_desc(skb, 0, page, 0, PAGE_SIZE);
 		mapping[1] =
-			ib_dma_map_page(priv->ca, skb_shinfo(skb)->frags[0].page,
+			ib_dma_map_page(priv->ca,
+					__skb_frag_page(&skb_shinfo(skb)->frags[0]),
 					0, PAGE_SIZE, DMA_FROM_DEVICE);
 		if (unlikely(ib_dma_mapping_error(priv->ca, mapping[1])))
 			goto partial_error;
@@ -323,7 +324,8 @@ static int ipoib_dma_map_tx(struct ib_device *ca,
 
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; ++i) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-		mapping[i + off] = ib_dma_map_page(ca, frag->page,
+		mapping[i + off] = ib_dma_map_page(ca,
+						 __skb_frag_page(frag),
 						 frag->page_offset, frag->size,
 						 DMA_TO_DEVICE);
 		if (unlikely(ib_dma_mapping_error(ca, mapping[i + off])))
diff --git a/drivers/net/3c59x.c b/drivers/net/3c59x.c
index 8cc2256..fcd3820 100644
--- a/drivers/net/3c59x.c
+++ b/drivers/net/3c59x.c
@@ -2180,8 +2180,8 @@ boomerang_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 			vp->tx_ring[entry].frag[i+1].addr =
 					cpu_to_le32(pci_map_single(VORTEX_PCI(vp),
-											   (void*)page_address(frag->page) + frag->page_offset,
-											   frag->size, PCI_DMA_TODEVICE));
+								   (void*)skb_frag_address(frag),
+								   frag->size, PCI_DMA_TODEVICE));
 
 			if (i == skb_shinfo(skb)->nr_frags-1)
 					vp->tx_ring[entry].frag[i+1].length = cpu_to_le32(frag->size|LAST_FRAG);
diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
index 10c4505..c418aa3 100644
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -815,8 +815,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 
 			len = this_frag->size;
 			mapping = dma_map_single(&cp->pdev->dev,
-						 ((void *) page_address(this_frag->page) +
-						  this_frag->page_offset),
+						 skb_frag_address(this_frag),
 						 len, PCI_DMA_TODEVICE);
 			eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
 
diff --git a/drivers/net/acenic.c b/drivers/net/acenic.c
index d7c1bfe4..f848691 100644
--- a/drivers/net/acenic.c
+++ b/drivers/net/acenic.c
@@ -2528,9 +2528,9 @@ restart:
 			info = ap->skb->tx_skbuff + idx;
 			desc = ap->tx_ring + idx;
 
-			mapping = pci_map_page(ap->pdev, frag->page,
-					       frag->page_offset, frag->size,
-					       PCI_DMA_TODEVICE);
+			mapping = skb_frag_pci_map(ap->pdev, frag, 0,
+						   frag->size,
+						   PCI_DMA_TODEVICE);
 
 			flagsize = (frag->size << 16);
 			if (skb->ip_summed == CHECKSUM_PARTIAL)
diff --git a/drivers/net/atl1c/atl1c_main.c b/drivers/net/atl1c/atl1c_main.c
index 1269ba5..162a38a 100644
--- a/drivers/net/atl1c/atl1c_main.c
+++ b/drivers/net/atl1c/atl1c_main.c
@@ -2162,8 +2162,7 @@ static void atl1c_tx_map(struct atl1c_adapter *adapter,
 		buffer_info = atl1c_get_tx_buffer(adapter, use_tpd);
 		buffer_info->length = frag->size;
 		buffer_info->dma =
-			pci_map_page(adapter->pdev, frag->page,
-					frag->page_offset,
+			skb_frag_pci_map(adapter->pdev, frag, 0,
 					buffer_info->length,
 					PCI_DMA_TODEVICE);
 		ATL1C_SET_BUFFER_STATE(buffer_info, ATL1C_BUFFER_BUSY);
diff --git a/drivers/net/atl1e/atl1e_main.c b/drivers/net/atl1e/atl1e_main.c
index 86a9122..f48fa37 100644
--- a/drivers/net/atl1e/atl1e_main.c
+++ b/drivers/net/atl1e/atl1e_main.c
@@ -1746,11 +1746,10 @@ static void atl1e_tx_map(struct atl1e_adapter *adapter,
 			buf_len -= tx_buffer->length;
 
 			tx_buffer->dma =
-				pci_map_page(adapter->pdev, frag->page,
-						frag->page_offset +
-						(i * MAX_TX_BUF_LEN),
-						tx_buffer->length,
-						PCI_DMA_TODEVICE);
+				skb_frag_pci_map(adapter->pdev, frag,
+						 (i * MAX_TX_BUF_LEN),
+						 tx_buffer->length,
+						 PCI_DMA_TODEVICE);
 			ATL1E_SET_PCIMAP_TYPE(tx_buffer, ATL1E_TX_PCIMAP_PAGE);
 			use_tpd->buffer_addr = cpu_to_le64(tx_buffer->dma);
 			use_tpd->word2 = (use_tpd->word2 & (~TPD_BUFLEN_MASK)) |
diff --git a/drivers/net/atlx/atl1.c b/drivers/net/atlx/atl1.c
index cd5789f..96caf1a 100644
--- a/drivers/net/atlx/atl1.c
+++ b/drivers/net/atlx/atl1.c
@@ -2283,11 +2283,10 @@ static void atl1_tx_map(struct atl1_adapter *adapter, struct sk_buff *skb,
 			buffer_info->length = (buf_len > ATL1_MAX_TX_BUF_LEN) ?
 				ATL1_MAX_TX_BUF_LEN : buf_len;
 			buf_len -= buffer_info->length;
-			buffer_info->dma = pci_map_page(adapter->pdev,
-				frag->page,
-				frag->page_offset + (i * ATL1_MAX_TX_BUF_LEN),
+			buffer_info->dma = skb_frag_pci_map(
+				adapter->pdev, frag,
+				(i * ATL1_MAX_TX_BUF_LEN),
 				buffer_info->length, PCI_DMA_TODEVICE);
-
 			if (++next_to_use == tpd_ring->count)
 				next_to_use = 0;
 		}
diff --git a/drivers/net/benet/be_main.c b/drivers/net/benet/be_main.c
index a485f7f..3e4f643 100644
--- a/drivers/net/benet/be_main.c
+++ b/drivers/net/benet/be_main.c
@@ -715,8 +715,8 @@ static int make_tx_wrbs(struct be_adapter *adapter,
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		struct skb_frag_struct *frag =
 			&skb_shinfo(skb)->frags[i];
-		busaddr = dma_map_page(dev, frag->page, frag->page_offset,
-				       frag->size, DMA_TO_DEVICE);
+		busaddr = skb_frag_dma_map(dev, frag, 0,
+					   frag->size, DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, busaddr))
 			goto dma_err;
 		wrb = queue_head_node(txq);
@@ -1122,7 +1122,7 @@ static void skb_fill_rx_data(struct be_adapter *adapter, struct be_rx_obj *rxo,
 		skb->tail += curr_frag_len;
 	} else {
 		skb_shinfo(skb)->nr_frags = 1;
-		skb_shinfo(skb)->frags[0].page = page_info->page;
+		skb_frag_set_page(skb, 0, page_info->page);
 		skb_shinfo(skb)->frags[0].page_offset =
 					page_info->page_offset + hdr_len;
 		skb_shinfo(skb)->frags[0].size = curr_frag_len - hdr_len;
@@ -1147,7 +1147,7 @@ static void skb_fill_rx_data(struct be_adapter *adapter, struct be_rx_obj *rxo,
 		if (page_info->page_offset == 0) {
 			/* Fresh page */
 			j++;
-			skb_shinfo(skb)->frags[j].page = page_info->page;
+			skb_frag_set_page(skb, j, page_info->page);
 			skb_shinfo(skb)->frags[j].page_offset =
 							page_info->page_offset;
 			skb_shinfo(skb)->frags[j].size = 0;
@@ -1236,7 +1236,7 @@ static void be_rx_compl_process_gro(struct be_adapter *adapter,
 		if (i == 0 || page_info->page_offset == 0) {
 			/* First frag or Fresh page */
 			j++;
-			skb_shinfo(skb)->frags[j].page = page_info->page;
+			skb_frag_set_page(skb, j, page_info->page);
 			skb_shinfo(skb)->frags[j].page_offset =
 							page_info->page_offset;
 			skb_shinfo(skb)->frags[j].size = 0;
diff --git a/drivers/net/bna/bnad.c b/drivers/net/bna/bnad.c
index 7d25a97..a79f69b 100644
--- a/drivers/net/bna/bnad.c
+++ b/drivers/net/bna/bnad.c
@@ -2636,8 +2636,8 @@ bnad_start_xmit(struct sk_buff *skb, struct net_device *netdev)
 
 		BUG_ON(!(size <= BFI_TX_MAX_DATA_PER_VECTOR));
 		txqent->vector[vect_id].length = htons(size);
-		dma_addr = dma_map_page(&bnad->pcidev->dev, frag->page,
-					frag->page_offset, size, DMA_TO_DEVICE);
+		dma_addr = skb_frag_dma_map(&bnad->pcidev->dev, frag,
+					    0, size, DMA_TO_DEVICE);
 		dma_unmap_addr_set(&unmap_q->unmap_array[unmap_prod], dma_addr,
 				   dma_addr);
 		BNA_SET_DMA_ADDR(dma_addr, &txqent->vector[vect_id].host_addr);
diff --git a/drivers/net/bnx2.c b/drivers/net/bnx2.c
index 57d3293..ff90b13 100644
--- a/drivers/net/bnx2.c
+++ b/drivers/net/bnx2.c
@@ -2880,8 +2880,8 @@ bnx2_reuse_rx_skb_pages(struct bnx2 *bp, struct bnx2_rx_ring_info *rxr,
 
 		shinfo = skb_shinfo(skb);
 		shinfo->nr_frags--;
-		page = shinfo->frags[shinfo->nr_frags].page;
-		shinfo->frags[shinfo->nr_frags].page = NULL;
+		page = __skb_frag_page(&shinfo->frags[shinfo->nr_frags]);
+		__skb_frag_set_page(&shinfo->frags[shinfo->nr_frags], NULL);
 
 		cons_rx_pg->page = page;
 		dev_kfree_skb(skb);
@@ -6461,8 +6461,8 @@ bnx2_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		txbd = &txr->tx_desc_ring[ring_prod];
 
 		len = frag->size;
-		mapping = dma_map_page(&bp->pdev->dev, frag->page, frag->page_offset,
-				       len, PCI_DMA_TODEVICE);
+		mapping = skb_frag_dma_map(&bp->pdev->dev, frag, 0, len,
+					   PCI_DMA_TODEVICE);
 		if (dma_mapping_error(&bp->pdev->dev, mapping))
 			goto dma_error;
 		dma_unmap_addr_set(&txr->tx_buf_ring[ring_prod], mapping,
diff --git a/drivers/net/bnx2x/bnx2x_cmn.c b/drivers/net/bnx2x/bnx2x_cmn.c
index 28904433..dee09d7 100644
--- a/drivers/net/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/bnx2x/bnx2x_cmn.c
@@ -2406,8 +2406,7 @@ netdev_tx_t bnx2x_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (total_pkt_bd == NULL)
 			total_pkt_bd = &fp->tx_desc_ring[bd_prod].reg_bd;
 
-		mapping = dma_map_page(&bp->pdev->dev, frag->page,
-				       frag->page_offset,
+		mapping = skb_frag_dma_map(&bp->pdev->dev, frag, 0,
 				       frag->size, DMA_TO_DEVICE);
 
 		tx_data_bd->addr_hi = cpu_to_le32(U64_HI(mapping));
diff --git a/drivers/net/cassini.c b/drivers/net/cassini.c
index 22ce03e..5261673 100644
--- a/drivers/net/cassini.c
+++ b/drivers/net/cassini.c
@@ -2047,8 +2047,8 @@ static int cas_rx_process_pkt(struct cas *cp, struct cas_rx_comp *rxc,
 		skb->truesize += hlen - swivel;
 		skb->len      += hlen - swivel;
 
-		get_page(page->buffer);
-		frag->page = page->buffer;
+		__skb_frag_set_page(frag, page->buffer);
+		__skb_frag_ref(frag);
 		frag->page_offset = off;
 		frag->size = hlen - swivel;
 
@@ -2071,8 +2071,8 @@ static int cas_rx_process_pkt(struct cas *cp, struct cas_rx_comp *rxc,
 			skb->len      += hlen;
 			frag++;
 
-			get_page(page->buffer);
-			frag->page = page->buffer;
+			__skb_frag_set_page(frag, page->buffer);
+			__skb_frag_ref(frag);
 			frag->page_offset = 0;
 			frag->size = hlen;
 			RX_USED_ADD(page, hlen + cp->crc_size);
@@ -2829,9 +2829,8 @@ static inline int cas_xmit_tx_ringN(struct cas *cp, int ring,
 		skb_frag_t *fragp = &skb_shinfo(skb)->frags[frag];
 
 		len = fragp->size;
-		mapping = pci_map_page(cp->pdev, fragp->page,
-				       fragp->page_offset, len,
-				       PCI_DMA_TODEVICE);
+		mapping = skb_frag_pci_map(cp->pdev, fragp, 0, len,
+					   PCI_DMA_TODEVICE);
 
 		tabort = cas_calc_tabort(cp, fragp->page_offset, len);
 		if (unlikely(tabort)) {
@@ -2842,7 +2841,7 @@ static inline int cas_xmit_tx_ringN(struct cas *cp, int ring,
 				      ctrl, 0);
 			entry = TX_DESC_NEXT(ring, entry);
 
-			addr = cas_page_map(fragp->page);
+			addr = cas_page_map(__skb_frag_page(fragp));
 			memcpy(tx_tiny_buf(cp, ring, entry),
 			       addr + fragp->page_offset + len - tabort,
 			       tabort);
diff --git a/drivers/net/chelsio/sge.c b/drivers/net/chelsio/sge.c
index 58380d2..7215fe5 100644
--- a/drivers/net/chelsio/sge.c
+++ b/drivers/net/chelsio/sge.c
@@ -1276,9 +1276,8 @@ static inline void write_tx_descs(struct adapter *adapter, struct sk_buff *skb,
 			ce = q->centries;
 		}
 
-		mapping = pci_map_page(adapter->pdev, frag->page,
-				       frag->page_offset, frag->size,
-				       PCI_DMA_TODEVICE);
+		mapping = skb_frag_pci_map(adapter->pdev, frag, 0,
+					   frag->size, PCI_DMA_TODEVICE);
 		desc_mapping = mapping;
 		desc_len = frag->size;
 
diff --git a/drivers/net/cxgb3/sge.c b/drivers/net/cxgb3/sge.c
index 3f562ba..d8a0d81 100644
--- a/drivers/net/cxgb3/sge.c
+++ b/drivers/net/cxgb3/sge.c
@@ -979,8 +979,8 @@ static inline unsigned int make_sgl(const struct sk_buff *skb,
 	for (i = 0; i < nfrags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		mapping = pci_map_page(pdev, frag->page, frag->page_offset,
-				       frag->size, PCI_DMA_TODEVICE);
+		mapping = skb_frag_pci_map(pdev, frag, 0,
+					   frag->size, PCI_DMA_TODEVICE);
 		sgp->len[j] = cpu_to_be32(frag->size);
 		sgp->addr[j] = cpu_to_be64(mapping);
 		j ^= 1;
@@ -2133,7 +2133,7 @@ static void lro_add_page(struct adapter *adap, struct sge_qset *qs,
 	len -= offset;
 
 	rx_frag += nr_frags;
-	rx_frag->page = sd->pg_chunk.page;
+	__skb_frag_set_page(rx_frag, sd->pg_chunk.page);
 	rx_frag->page_offset = sd->pg_chunk.offset + offset;
 	rx_frag->size = len;
 
diff --git a/drivers/net/cxgb4/sge.c b/drivers/net/cxgb4/sge.c
index 56adf44..f1813b5 100644
--- a/drivers/net/cxgb4/sge.c
+++ b/drivers/net/cxgb4/sge.c
@@ -215,8 +215,8 @@ static int map_skb(struct device *dev, const struct sk_buff *skb,
 	end = &si->frags[si->nr_frags];
 
 	for (fp = si->frags; fp < end; fp++) {
-		*++addr = dma_map_page(dev, fp->page, fp->page_offset, fp->size,
-				       DMA_TO_DEVICE);
+		*++addr = skb_frag_dma_map(dev, fp, 0, fp->size,
+					   DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, *addr))
 			goto unwind;
 	}
@@ -1409,13 +1409,14 @@ int cxgb4_ofld_send(struct net_device *dev, struct sk_buff *skb)
 }
 EXPORT_SYMBOL(cxgb4_ofld_send);
 
-static inline void copy_frags(struct skb_shared_info *ssi,
+static inline void copy_frags(struct sk_buff *skb,
 			      const struct pkt_gl *gl, unsigned int offset)
 {
+	struct skb_shared_info *ssi = skb_shinfo(skb);
 	unsigned int n;
 
 	/* usually there's just one frag */
-	ssi->frags[0].page = gl->frags[0].page;
+	skb_frag_set_page(skb, 0, gl->frags[0].page);
 	ssi->frags[0].page_offset = gl->frags[0].page_offset + offset;
 	ssi->frags[0].size = gl->frags[0].size - offset;
 	ssi->nr_frags = gl->nfrags;
@@ -1459,7 +1460,7 @@ struct sk_buff *cxgb4_pktgl_to_skb(const struct pkt_gl *gl,
 		__skb_put(skb, pull_len);
 		skb_copy_to_linear_data(skb, gl->va, pull_len);
 
-		copy_frags(skb_shinfo(skb), gl, pull_len);
+		copy_frags(skb, gl, pull_len);
 		skb->len = gl->tot_len;
 		skb->data_len = skb->len - pull_len;
 		skb->truesize += skb->data_len;
@@ -1522,7 +1523,7 @@ static void do_gro(struct sge_eth_rxq *rxq, const struct pkt_gl *gl,
 		return;
 	}
 
-	copy_frags(skb_shinfo(skb), gl, RX_PKT_PAD);
+	copy_frags(skb, gl, RX_PKT_PAD);
 	skb->len = gl->tot_len - RX_PKT_PAD;
 	skb->data_len = skb->len;
 	skb->truesize += skb->data_len;
@@ -1735,6 +1736,7 @@ static int process_responses(struct sge_rspq *q, int budget)
 
 			si.va = page_address(si.frags[0].page) +
 				si.frags[0].page_offset;
+
 			prefetch(si.va);
 
 			si.nfrags = frags + 1;
diff --git a/drivers/net/cxgb4vf/sge.c b/drivers/net/cxgb4vf/sge.c
index 5fd75fd..f4c4480 100644
--- a/drivers/net/cxgb4vf/sge.c
+++ b/drivers/net/cxgb4vf/sge.c
@@ -296,8 +296,8 @@ static int map_skb(struct device *dev, const struct sk_buff *skb,
 	si = skb_shinfo(skb);
 	end = &si->frags[si->nr_frags];
 	for (fp = si->frags; fp < end; fp++) {
-		*++addr = dma_map_page(dev, fp->page, fp->page_offset, fp->size,
-				       DMA_TO_DEVICE);
+		*++addr = skb_frag_dma_map(dev, fp, 0, fp->size,
+					   DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, *addr))
 			goto unwind;
 	}
@@ -1397,7 +1397,7 @@ struct sk_buff *t4vf_pktgl_to_skb(const struct pkt_gl *gl,
 		skb_copy_to_linear_data(skb, gl->va, pull_len);
 
 		ssi = skb_shinfo(skb);
-		ssi->frags[0].page = gl->frags[0].page;
+		skb_frag_set_page(skb, 0, gl->frags[0].page);
 		ssi->frags[0].page_offset = gl->frags[0].page_offset + pull_len;
 		ssi->frags[0].size = gl->frags[0].size - pull_len;
 		if (gl->nfrags > 1)
@@ -1442,14 +1442,15 @@ void t4vf_pktgl_free(const struct pkt_gl *gl)
  *	Copy an internal packet gather list into a Linux skb_shared_info
  *	structure.
  */
-static inline void copy_frags(struct skb_shared_info *si,
+static inline void copy_frags(struct sk_buff *skb,
 			      const struct pkt_gl *gl,
 			      unsigned int offset)
 {
+	struct skb_shared_info *si = skb_shinfo(skb);
 	unsigned int n;
 
 	/* usually there's just one frag */
-	si->frags[0].page = gl->frags[0].page;
+	skb_frag_set_page(skb, 0, gl->frags[0].page);
 	si->frags[0].page_offset = gl->frags[0].page_offset + offset;
 	si->frags[0].size = gl->frags[0].size - offset;
 	si->nr_frags = gl->nfrags;
@@ -1484,7 +1485,7 @@ static void do_gro(struct sge_eth_rxq *rxq, const struct pkt_gl *gl,
 		return;
 	}
 
-	copy_frags(skb_shinfo(skb), gl, PKTSHIFT);
+	copy_frags(skb, gl, PKTSHIFT);
 	skb->len = gl->tot_len - PKTSHIFT;
 	skb->data_len = skb->len;
 	skb->truesize += skb->data_len;
diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 76e8af0..e902cd0 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -2861,7 +2861,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 
 		frag = &skb_shinfo(skb)->frags[f];
 		len = frag->size;
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (len) {
 			i++;
@@ -2878,7 +2878,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 			 * Avoid terminating buffers within evenly-aligned
 			 * dwords. */
 			if (unlikely(adapter->pcix_82544 &&
-			    !((unsigned long)(page_to_phys(frag->page) + offset
+			    !((unsigned long)(page_to_phys(__skb_frag_page(frag)) + offset
 			                      + size - 1) & 4) &&
 			    size > 4))
 				size -= 4;
@@ -2886,9 +2886,8 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 			buffer_info->length = size;
 			buffer_info->time_stamp = jiffies;
 			buffer_info->mapped_as_page = true;
-			buffer_info->dma = dma_map_page(&pdev->dev, frag->page,
-							offset,	size,
-							DMA_TO_DEVICE);
+			buffer_info->dma = skb_frag_dma_map(&pdev->dev, frag,
+						offset, size, DMA_TO_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma))
 				goto dma_error;
 			buffer_info->next_to_watch = i;
diff --git a/drivers/net/e1000e/netdev.c b/drivers/net/e1000e/netdev.c
index 3310c3d..30f8a5c 100644
--- a/drivers/net/e1000e/netdev.c
+++ b/drivers/net/e1000e/netdev.c
@@ -4599,7 +4599,7 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 
 		frag = &skb_shinfo(skb)->frags[f];
 		len = frag->size;
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (len) {
 			i++;
@@ -4612,9 +4612,8 @@ static int e1000_tx_map(struct e1000_adapter *adapter,
 			buffer_info->length = size;
 			buffer_info->time_stamp = jiffies;
 			buffer_info->next_to_watch = i;
-			buffer_info->dma = dma_map_page(&pdev->dev, frag->page,
-							offset, size,
-							DMA_TO_DEVICE);
+			buffer_info->dma = skb_frag_dma_map(&pdev->dev, frag,
+						offset, size, DMA_TO_DEVICE);
 			buffer_info->mapped_as_page = true;
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma))
 				goto dma_error;
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index 2f433fb..bbab7cd 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -584,9 +584,8 @@ static inline void enic_queue_wq_skb_cont(struct enic *enic,
 	for (frag = skb_shinfo(skb)->frags; len_left; frag++) {
 		len_left -= frag->size;
 		enic_queue_wq_desc_cont(wq, skb,
-			pci_map_page(enic->pdev, frag->page,
-				frag->page_offset, frag->size,
-				PCI_DMA_TODEVICE),
+			skb_frag_pci_map(enic->pdev, frag, 0, frag->size,
+					 PCI_DMA_TODEVICE),
 			frag->size,
 			(len_left == 0),	/* EOP? */
 			loopback);
@@ -698,14 +697,13 @@ static inline void enic_queue_wq_skb_tso(struct enic *enic,
 	for (frag = skb_shinfo(skb)->frags; len_left; frag++) {
 		len_left -= frag->size;
 		frag_len_left = frag->size;
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (frag_len_left) {
 			len = min(frag_len_left,
 				(unsigned int)WQ_ENET_MAX_DESC_LEN);
-			dma_addr = pci_map_page(enic->pdev, frag->page,
-				offset, len,
-				PCI_DMA_TODEVICE);
+			dma_addr = skb_frag_pci_map(enic->pdev, frag,
+				offset, len, PCI_DMA_TODEVICE);
 			enic_queue_wq_desc_cont(wq, skb,
 				dma_addr,
 				len,
diff --git a/drivers/net/forcedeth.c b/drivers/net/forcedeth.c
index 537b695..062eccd 100644
--- a/drivers/net/forcedeth.c
+++ b/drivers/net/forcedeth.c
@@ -2149,8 +2149,9 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			prev_tx = put_tx;
 			prev_tx_ctx = np->put_tx_ctx;
 			bcnt = (size > NV_TX2_TSO_MAX_SIZE) ? NV_TX2_TSO_MAX_SIZE : size;
-			np->put_tx_ctx->dma = pci_map_page(np->pci_dev, frag->page, frag->page_offset+offset, bcnt,
-							   PCI_DMA_TODEVICE);
+			np->put_tx_ctx->dma =
+				skb_frag_pci_map(np->pci_dev, frag, offset, bcnt,
+						 PCI_DMA_TODEVICE);
 			np->put_tx_ctx->dma_len = bcnt;
 			np->put_tx_ctx->dma_single = 0;
 			put_tx->buf = cpu_to_le32(np->put_tx_ctx->dma);
@@ -2260,8 +2261,9 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 			prev_tx = put_tx;
 			prev_tx_ctx = np->put_tx_ctx;
 			bcnt = (size > NV_TX2_TSO_MAX_SIZE) ? NV_TX2_TSO_MAX_SIZE : size;
-			np->put_tx_ctx->dma = pci_map_page(np->pci_dev, frag->page, frag->page_offset+offset, bcnt,
-							   PCI_DMA_TODEVICE);
+			np->put_tx_ctx->dma =
+				skb_frag_pci_map(np->pci_dev, frag, offset, bcnt,
+						 PCI_DMA_TODEVICE);
 			np->put_tx_ctx->dma_len = bcnt;
 			np->put_tx_ctx->dma_single = 0;
 			put_tx->bufhigh = cpu_to_le32(dma_high(np->put_tx_ctx->dma));
diff --git a/drivers/net/gianfar.c b/drivers/net/gianfar.c
index 2dfcc80..766e037 100644
--- a/drivers/net/gianfar.c
+++ b/drivers/net/gianfar.c
@@ -2141,11 +2141,11 @@ static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			if (i == nr_frags - 1)
 				lstatus |= BD_LFLAG(TXBD_LAST | TXBD_INTERRUPT);
 
-			bufaddr = dma_map_page(&priv->ofdev->dev,
-					skb_shinfo(skb)->frags[i].page,
-					skb_shinfo(skb)->frags[i].page_offset,
-					length,
-					DMA_TO_DEVICE);
+			bufaddr = skb_frag_dma_map(&priv->ofdev->dev,
+						   &skb_shinfo(skb)->frags[i],
+						   0,
+						   length,
+						   DMA_TO_DEVICE);
 
 			/* set the TxBD length and buffer pointer */
 			txbdp->bufPtr = bufaddr;
diff --git a/drivers/net/greth.c b/drivers/net/greth.c
index f181304..37c1eff 100644
--- a/drivers/net/greth.c
+++ b/drivers/net/greth.c
@@ -111,7 +111,7 @@ static void greth_print_tx_packet(struct sk_buff *skb)
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 
 		print_hex_dump(KERN_DEBUG, "TX: ", DUMP_PREFIX_OFFSET, 16, 1,
-			       phys_to_virt(page_to_phys(skb_shinfo(skb)->frags[i].page)) +
+			       phys_to_virt(page_to_phys(skb_frag_page(&skb_shinfo(skb)->frags[i]))) +
 			       skb_shinfo(skb)->frags[i].page_offset,
 			       length, true);
 	}
@@ -526,11 +526,8 @@ greth_start_xmit_gbit(struct sk_buff *skb, struct net_device *dev)
 
 		greth_write_bd(&bdp->stat, status);
 
-		dma_addr = dma_map_page(greth->dev,
-					frag->page,
-					frag->page_offset,
-					frag->size,
-					DMA_TO_DEVICE);
+		dma_addr = skb_frag_dma_map(greth->dev, frag, 0, frag->size,
+					    DMA_TO_DEVICE);
 
 		if (unlikely(dma_mapping_error(greth->dev, dma_addr)))
 			goto frag_map_error;
diff --git a/drivers/net/ibmveth.c b/drivers/net/ibmveth.c
index b388d78..65e5874 100644
--- a/drivers/net/ibmveth.c
+++ b/drivers/net/ibmveth.c
@@ -1001,9 +1001,8 @@ retry_bounce:
 		unsigned long dma_addr;
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		dma_addr = dma_map_page(&adapter->vdev->dev, frag->page,
-					frag->page_offset, frag->size,
-					DMA_TO_DEVICE);
+		dma_addr = skb_frag_dma_map(&adapter->vdev->dev, frag, 0,
+					    frag->size, DMA_TO_DEVICE);
 
 		if (dma_mapping_error(&adapter->vdev->dev, dma_addr))
 			goto map_failed_frags;
diff --git a/drivers/net/igb/igb_main.c b/drivers/net/igb/igb_main.c
index 2c28621..17f94f4 100644
--- a/drivers/net/igb/igb_main.c
+++ b/drivers/net/igb/igb_main.c
@@ -4132,10 +4132,7 @@ static inline int igb_tx_map_adv(struct igb_ring *tx_ring, struct sk_buff *skb,
 		buffer_info->time_stamp = jiffies;
 		buffer_info->next_to_watch = i;
 		buffer_info->mapped_as_page = true;
-		buffer_info->dma = dma_map_page(dev,
-						frag->page,
-						frag->page_offset,
-						len,
+		buffer_info->dma = skb_frag_dma_map(dev, frag, 0, len,
 						DMA_TO_DEVICE);
 		if (dma_mapping_error(dev, buffer_info->dma))
 			goto dma_error;
diff --git a/drivers/net/igbvf/netdev.c b/drivers/net/igbvf/netdev.c
index 1c77fb3..3f6655f 100644
--- a/drivers/net/igbvf/netdev.c
+++ b/drivers/net/igbvf/netdev.c
@@ -2074,10 +2074,7 @@ static inline int igbvf_tx_map_adv(struct igbvf_adapter *adapter,
 		buffer_info->time_stamp = jiffies;
 		buffer_info->next_to_watch = i;
 		buffer_info->mapped_as_page = true;
-		buffer_info->dma = dma_map_page(&pdev->dev,
-						frag->page,
-						frag->page_offset,
-						len,
+		buffer_info->dma = skb_frag_dma_map(&pdev->dev, frag, 0, len,
 						DMA_TO_DEVICE);
 		if (dma_mapping_error(&pdev->dev, buffer_info->dma))
 			goto dma_error;
diff --git a/drivers/net/ixgb/ixgb_main.c b/drivers/net/ixgb/ixgb_main.c
index 6a130eb..45c4e90 100644
--- a/drivers/net/ixgb/ixgb_main.c
+++ b/drivers/net/ixgb/ixgb_main.c
@@ -1341,7 +1341,7 @@ ixgb_tx_map(struct ixgb_adapter *adapter, struct sk_buff *skb,
 
 		frag = &skb_shinfo(skb)->frags[f];
 		len = frag->size;
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (len) {
 			i++;
@@ -1361,8 +1361,8 @@ ixgb_tx_map(struct ixgb_adapter *adapter, struct sk_buff *skb,
 			buffer_info->time_stamp = jiffies;
 			buffer_info->mapped_as_page = true;
 			buffer_info->dma =
-				dma_map_page(&pdev->dev, frag->page,
-					     offset, size, DMA_TO_DEVICE);
+				skb_frag_dma_map(&pdev->dev, frag, offset, size,
+						 DMA_TO_DEVICE);
 			if (dma_mapping_error(&pdev->dev, buffer_info->dma))
 				goto dma_error;
 			buffer_info->next_to_watch = 0;
diff --git a/drivers/net/ixgbe/ixgbe_main.c b/drivers/net/ixgbe/ixgbe_main.c
index 08e8e25..307cf06 100644
--- a/drivers/net/ixgbe/ixgbe_main.c
+++ b/drivers/net/ixgbe/ixgbe_main.c
@@ -6632,7 +6632,7 @@ static int ixgbe_tx_map(struct ixgbe_adapter *adapter,
 
 		frag = &skb_shinfo(skb)->frags[f];
 		len = min((unsigned int)frag->size, total);
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (len) {
 			i++;
@@ -6643,10 +6643,9 @@ static int ixgbe_tx_map(struct ixgbe_adapter *adapter,
 			size = min(len, (uint)IXGBE_MAX_DATA_PER_TXD);
 
 			tx_buffer_info->length = size;
-			tx_buffer_info->dma = dma_map_page(dev,
-							   frag->page,
-							   offset, size,
-							   DMA_TO_DEVICE);
+			tx_buffer_info->dma =
+				skb_frag_dma_map(dev, frag, offset, size,
+					     DMA_TO_DEVICE);
 			tx_buffer_info->mapped_as_page = true;
 			if (dma_mapping_error(dev, tx_buffer_info->dma))
 				goto dma_error;
diff --git a/drivers/net/ixgbevf/ixgbevf_main.c b/drivers/net/ixgbevf/ixgbevf_main.c
index 28d3cb2..ad05ad9 100644
--- a/drivers/net/ixgbevf/ixgbevf_main.c
+++ b/drivers/net/ixgbevf/ixgbevf_main.c
@@ -2951,18 +2951,16 @@ static int ixgbevf_tx_map(struct ixgbevf_adapter *adapter,
 
 		frag = &skb_shinfo(skb)->frags[f];
 		len = min((unsigned int)frag->size, total);
-		offset = frag->page_offset;
+		offset = 0;
 
 		while (len) {
 			tx_buffer_info = &tx_ring->tx_buffer_info[i];
 			size = min(len, (unsigned int)IXGBE_MAX_DATA_PER_TXD);
 
 			tx_buffer_info->length = size;
-			tx_buffer_info->dma = dma_map_page(&adapter->pdev->dev,
-							   frag->page,
-							   offset,
-							   size,
-							   DMA_TO_DEVICE);
+			tx_buffer_info->dma =
+				skb_frag_dma_map(&adapter->pdev->dev, frag,
+						 offset, size, DMA_TO_DEVICE);
 			tx_buffer_info->mapped_as_page = true;
 			if (dma_mapping_error(&pdev->dev, tx_buffer_info->dma))
 				goto dma_error;
diff --git a/drivers/net/jme.c b/drivers/net/jme.c
index b5b174a..a73e895 100644
--- a/drivers/net/jme.c
+++ b/drivers/net/jme.c
@@ -1928,8 +1928,9 @@ jme_map_tx_skb(struct jme_adapter *jme, struct sk_buff *skb, int idx)
 		ctxdesc = txdesc + ((idx + i + 2) & (mask));
 		ctxbi = txbi + ((idx + i + 2) & (mask));
 
-		jme_fill_tx_map(jme->pdev, ctxdesc, ctxbi, frag->page,
-				 frag->page_offset, frag->size, hidma);
+		jme_fill_tx_map(jme->pdev, ctxdesc, ctxbi,
+				__skb_frag_page(frag),
+				frag->page_offset, frag->size, hidma);
 	}
 
 	len = skb_is_nonlinear(skb) ? skb_headlen(skb) : skb->len;
diff --git a/drivers/net/ksz884x.c b/drivers/net/ksz884x.c
index 41ea592..e610d88 100644
--- a/drivers/net/ksz884x.c
+++ b/drivers/net/ksz884x.c
@@ -4703,8 +4703,7 @@ static void send_packet(struct sk_buff *skb, struct net_device *dev)
 
 			dma_buf->dma = pci_map_single(
 				hw_priv->pdev,
-				page_address(this_frag->page) +
-				this_frag->page_offset,
+				skb_frag_address(this_frag),
 				dma_buf->len,
 				PCI_DMA_TODEVICE);
 			set_tx_buf(desc, dma_buf->dma);
diff --git a/drivers/net/mlx4/en_rx.c b/drivers/net/mlx4/en_rx.c
index 277215f..21a89e0 100644
--- a/drivers/net/mlx4/en_rx.c
+++ b/drivers/net/mlx4/en_rx.c
@@ -60,20 +60,18 @@ static int mlx4_en_alloc_frag(struct mlx4_en_priv *priv,
 		if (!page)
 			return -ENOMEM;
 
-		skb_frags[i].page = page_alloc->page;
+		__skb_frag_set_page(&skb_frags[i], page_alloc->page);
 		skb_frags[i].page_offset = page_alloc->offset;
 		page_alloc->page = page;
 		page_alloc->offset = frag_info->frag_align;
 	} else {
-		page = page_alloc->page;
-		get_page(page);
-
-		skb_frags[i].page = page;
+		__skb_frag_set_page(&skb_frags[i], page_alloc->page);
+		__skb_frag_ref(&skb_frags[i]);
 		skb_frags[i].page_offset = page_alloc->offset;
 		page_alloc->offset += frag_info->frag_stride;
 	}
-	dma = pci_map_single(mdev->pdev, page_address(skb_frags[i].page) +
-			     skb_frags[i].page_offset, frag_info->frag_size,
+	dma = pci_map_single(mdev->pdev, skb_frag_address(&skb_frags[i]),
+			     frag_info->frag_size,
 			     PCI_DMA_FROMDEVICE);
 	rx_desc->data[i].addr = cpu_to_be64(dma);
 	return 0;
@@ -169,7 +167,7 @@ static int mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv,
 
 err:
 	while (i--)
-		put_page(skb_frags[i].page);
+		__skb_frag_unref(&skb_frags[i]);
 	return -ENOMEM;
 }
 
@@ -196,7 +194,7 @@ static void mlx4_en_free_rx_desc(struct mlx4_en_priv *priv,
 		en_dbg(DRV, priv, "Unmapping buffer at dma:0x%llx\n", (u64) dma);
 		pci_unmap_single(mdev->pdev, dma, skb_frags[nr].size,
 				 PCI_DMA_FROMDEVICE);
-		put_page(skb_frags[nr].page);
+		__skb_frag_unref(&skb_frags[nr]);
 	}
 }
 
@@ -420,7 +418,7 @@ static int mlx4_en_complete_rx_desc(struct mlx4_en_priv *priv,
 			break;
 
 		/* Save page reference in skb */
-		skb_frags_rx[nr].page = skb_frags[nr].page;
+		__skb_frag_set_page(&skb_frags_rx[nr], skb_frags[nr].page);
 		skb_frags_rx[nr].size = skb_frags[nr].size;
 		skb_frags_rx[nr].page_offset = skb_frags[nr].page_offset;
 		dma = be64_to_cpu(rx_desc->data[nr].addr);
@@ -444,7 +442,7 @@ fail:
 	 * the descriptor) of this packet; remaining fragments are reused... */
 	while (nr > 0) {
 		nr--;
-		put_page(skb_frags_rx[nr].page);
+		__skb_frag_unref(&skb_frags_rx[nr]);
 	}
 	return 0;
 }
@@ -474,7 +472,7 @@ static struct sk_buff *mlx4_en_rx_skb(struct mlx4_en_priv *priv,
 
 	/* Get pointer to first fragment so we could copy the headers into the
 	 * (linear part of the) skb */
-	va = page_address(skb_frags[0].page) + skb_frags[0].page_offset;
+	va = skb_frag_address(&skb_frags[0]);
 
 	if (length <= SMALL_PACKET_SIZE) {
 		/* We are copying all relevant data to the skb - temporarily
diff --git a/drivers/net/mlx4/en_tx.c b/drivers/net/mlx4/en_tx.c
index b229acf..10049b1 100644
--- a/drivers/net/mlx4/en_tx.c
+++ b/drivers/net/mlx4/en_tx.c
@@ -461,26 +461,13 @@ static inline void mlx4_en_xmit_poll(struct mlx4_en_priv *priv, int tx_ind)
 		}
 }
 
-static void *get_frag_ptr(struct sk_buff *skb)
-{
-	struct skb_frag_struct *frag =  &skb_shinfo(skb)->frags[0];
-	struct page *page = frag->page;
-	void *ptr;
-
-	ptr = page_address(page);
-	if (unlikely(!ptr))
-		return NULL;
-
-	return ptr + frag->page_offset;
-}
-
 static int is_inline(struct sk_buff *skb, void **pfrag)
 {
 	void *ptr;
 
 	if (inline_thold && !skb_is_gso(skb) && skb->len <= inline_thold) {
 		if (skb_shinfo(skb)->nr_frags == 1) {
-			ptr = get_frag_ptr(skb);
+			ptr = skb_frag_address_safe(&skb_shinfo(skb)->frags[0]);
 			if (unlikely(!ptr))
 				return 0;
 
@@ -757,7 +744,7 @@ netdev_tx_t mlx4_en_xmit(struct sk_buff *skb, struct net_device *dev)
 		/* Map fragments */
 		for (i = skb_shinfo(skb)->nr_frags - 1; i >= 0; i--) {
 			frag = &skb_shinfo(skb)->frags[i];
-			dma = pci_map_page(mdev->dev->pdev, frag->page, frag->page_offset,
+			dma = skb_frag_pci_map(mdev->dev->pdev, frag, 0,
 					   frag->size, PCI_DMA_TODEVICE);
 			data->addr = cpu_to_be64(dma);
 			data->lkey = cpu_to_be32(mdev->mr.key);
diff --git a/drivers/net/mv643xx_eth.c b/drivers/net/mv643xx_eth.c
index a5d9b1c..d02a034 100644
--- a/drivers/net/mv643xx_eth.c
+++ b/drivers/net/mv643xx_eth.c
@@ -752,10 +752,10 @@ static void txq_submit_frag_skb(struct tx_queue *txq, struct sk_buff *skb)
 
 		desc->l4i_chk = 0;
 		desc->byte_cnt = this_frag->size;
-		desc->buf_ptr = dma_map_page(mp->dev->dev.parent,
-					     this_frag->page,
-					     this_frag->page_offset,
-					     this_frag->size, DMA_TO_DEVICE);
+		desc->buf_ptr = skb_frag_dma_map(mp->dev->dev.parent,
+						 this_frag, 0,
+						 this_frag->size,
+						 DMA_TO_DEVICE);
 	}
 }
 
diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
index bf84849..a3459eb 100644
--- a/drivers/net/myri10ge/myri10ge.c
+++ b/drivers/net/myri10ge/myri10ge.c
@@ -1339,7 +1339,7 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
 	/* Fill skb_frag_struct(s) with data from our receive */
 	for (i = 0, remainder = len; remainder > 0; i++) {
 		myri10ge_unmap_rx_page(pdev, &rx->info[idx], bytes);
-		rx_frags[i].page = rx->info[idx].page;
+		__skb_frag_set_page(&rx_frags[i], rx->info[idx].page); /* XXX */
 		rx_frags[i].page_offset = rx->info[idx].page_offset;
 		if (remainder < MYRI10GE_ALLOC_SIZE)
 			rx_frags[i].size = remainder;
@@ -1372,7 +1372,7 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
 		ss->stats.rx_dropped++;
 		do {
 			i--;
-			put_page(rx_frags[i].page);
+			__skb_frag_unref(&rx_frags[i]); /* XXX */
 		} while (i != 0);
 		return 0;
 	}
@@ -1380,7 +1380,7 @@ myri10ge_rx_done(struct myri10ge_slice_state *ss, int len, __wsum csum,
 	/* Attach the pages to the skb, and trim off any padding */
 	myri10ge_rx_skb_build(skb, va, rx_frags, len, hlen);
 	if (skb_shinfo(skb)->frags[0].size <= 0) {
-		put_page(skb_shinfo(skb)->frags[0].page);
+		skb_frag_unref(skb, 0);
 		skb_shinfo(skb)->nr_frags = 0;
 	}
 	skb->protocol = eth_type_trans(skb, dev);
@@ -2220,7 +2220,7 @@ myri10ge_get_frag_header(struct skb_frag_struct *frag, void **mac_hdr,
 	struct ethhdr *eh;
 	struct vlan_ethhdr *veh;
 	struct iphdr *iph;
-	u8 *va = page_address(frag->page) + frag->page_offset;
+	u8 *va = skb_frag_address(frag);
 	unsigned long ll_hlen;
 	/* passed opaque through lro_receive_frags() */
 	__wsum csum = (__force __wsum) (unsigned long)priv;
@@ -2863,8 +2863,8 @@ again:
 		frag = &skb_shinfo(skb)->frags[frag_idx];
 		frag_idx++;
 		len = frag->size;
-		bus = pci_map_page(mgp->pdev, frag->page, frag->page_offset,
-				   len, PCI_DMA_TODEVICE);
+		bus = skb_frag_pci_map(mgp->pdev, frag, 0, len,
+				       PCI_DMA_TODEVICE);
 		dma_unmap_addr_set(&tx->info[idx], bus, bus);
 		dma_unmap_len_set(&tx->info[idx], len, len);
 	}
diff --git a/drivers/net/netxen/netxen_nic_main.c b/drivers/net/netxen/netxen_nic_main.c
index c0788a3..f962aa4 100644
--- a/drivers/net/netxen/netxen_nic_main.c
+++ b/drivers/net/netxen/netxen_nic_main.c
@@ -1836,7 +1836,7 @@ netxen_map_tx_skb(struct pci_dev *pdev,
 		frag = &skb_shinfo(skb)->frags[i];
 		nf = &pbuf->frag_array[i+1];
 
-		map = pci_map_page(pdev, frag->page, frag->page_offset,
+		map = skb_frag_pci_map(pdev, frag, 0,
 				frag->size, PCI_DMA_TODEVICE);
 		if (pci_dma_mapping_error(pdev, map))
 			goto unwind;
diff --git a/drivers/net/niu.c b/drivers/net/niu.c
index cc25bff..a901193 100644
--- a/drivers/net/niu.c
+++ b/drivers/net/niu.c
@@ -3290,7 +3290,7 @@ static void niu_rx_skb_append(struct sk_buff *skb, struct page *page,
 	int i = skb_shinfo(skb)->nr_frags;
 	skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-	frag->page = page;
+	__skb_frag_set_page(frag, page);
 	frag->page_offset = offset;
 	frag->size = size;
 
@@ -6731,7 +6731,7 @@ static netdev_tx_t niu_start_xmit(struct sk_buff *skb,
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
 		len = frag->size;
-		mapping = np->ops->map_page(np->device, frag->page,
+		mapping = np->ops->map_page(np->device, __skb_frag_page(frag),
 					    frag->page_offset, len,
 					    DMA_TO_DEVICE);
 
diff --git a/drivers/net/ns83820.c b/drivers/net/ns83820.c
index 3e4040f..39c5e21 100644
--- a/drivers/net/ns83820.c
+++ b/drivers/net/ns83820.c
@@ -1181,9 +1181,8 @@ again:
 		if (!nr_frags)
 			break;
 
-		buf = pci_map_page(dev->pci_dev, frag->page,
-				   frag->page_offset,
-				   frag->size, PCI_DMA_TODEVICE);
+		buf = skb_frag_pci_map(dev->pci_dev, frag, 0, frag->size,
+				       PCI_DMA_TODEVICE);
 		dprintk("frag: buf=%08Lx  page=%08lx offset=%08lx\n",
 			(long long)buf, (long) page_to_pfn(frag->page),
 			frag->page_offset);
diff --git a/drivers/net/pasemi_mac.c b/drivers/net/pasemi_mac.c
index 9ec112c..2d71ed2 100644
--- a/drivers/net/pasemi_mac.c
+++ b/drivers/net/pasemi_mac.c
@@ -1505,9 +1505,8 @@ static int pasemi_mac_start_tx(struct sk_buff *skb, struct net_device *dev)
 	for (i = 0; i < nfrags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		map[i+1] = pci_map_page(mac->dma_pdev, frag->page,
-					frag->page_offset, frag->size,
-					PCI_DMA_TODEVICE);
+		map[i + 1] = skb_frag_pci_map(mac->dma_pdev, frag, 0,
+					      frag->size, PCI_DMA_TODEVICE);
 		map_size[i+1] = frag->size;
 		if (pci_dma_mapping_error(mac->dma_pdev, map[i+1])) {
 			nfrags = i;
diff --git a/drivers/net/qla3xxx.c b/drivers/net/qla3xxx.c
index 771bb61..79ff9c1 100644
--- a/drivers/net/qla3xxx.c
+++ b/drivers/net/qla3xxx.c
@@ -2388,9 +2388,8 @@ static int ql_send_map(struct ql3_adapter *qdev,
 			seg++;
 		}
 
-		map = pci_map_page(qdev->pdev, frag->page,
-				   frag->page_offset, frag->size,
-				   PCI_DMA_TODEVICE);
+		map = skb_frag_pci_map(qdev->pdev, frag, 0, frag->size,
+				       PCI_DMA_TODEVICE);
 
 		err = pci_dma_mapping_error(qdev->pdev, map);
 		if (err) {
diff --git a/drivers/net/qlcnic/qlcnic_main.c b/drivers/net/qlcnic/qlcnic_main.c
index 0f6af5c..e8170e1 100644
--- a/drivers/net/qlcnic/qlcnic_main.c
+++ b/drivers/net/qlcnic/qlcnic_main.c
@@ -2120,7 +2120,7 @@ qlcnic_map_tx_skb(struct pci_dev *pdev,
 		frag = &skb_shinfo(skb)->frags[i];
 		nf = &pbuf->frag_array[i+1];
 
-		map = pci_map_page(pdev, frag->page, frag->page_offset,
+		map = skb_frag_pci_map(pdev, frag, 0,
 				frag->size, PCI_DMA_TODEVICE);
 		if (pci_dma_mapping_error(pdev, map))
 			goto unwind;
diff --git a/drivers/net/qlge/qlge_main.c b/drivers/net/qlge/qlge_main.c
index 930ae45..4503e16 100644
--- a/drivers/net/qlge/qlge_main.c
+++ b/drivers/net/qlge/qlge_main.c
@@ -1430,10 +1430,8 @@ static int ql_map_send(struct ql_adapter *qdev,
 			map_idx++;
 		}
 
-		map =
-		    pci_map_page(qdev->pdev, frag->page,
-				 frag->page_offset, frag->size,
-				 PCI_DMA_TODEVICE);
+		map = skb_frag_pci_map(qdev->pdev, frag, 0, frag->size,
+				       PCI_DMA_TODEVICE);
 
 		err = pci_dma_mapping_error(qdev->pdev, map);
 		if (err) {
@@ -1494,7 +1492,7 @@ static void ql_process_mac_rx_gro_page(struct ql_adapter *qdev,
 	rx_frag = skb_shinfo(skb)->frags;
 	nr_frags = skb_shinfo(skb)->nr_frags;
 	rx_frag += nr_frags;
-	rx_frag->page = lbq_desc->p.pg_chunk.page;
+	__skb_frag_set_page(rx_frag, lbq_desc->p.pg_chunk.page);
 	rx_frag->page_offset = lbq_desc->p.pg_chunk.offset;
 	rx_frag->size = length;
 
diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 05d8178..916603d 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4630,7 +4630,7 @@ static int rtl8169_xmit_frags(struct rtl8169_private *tp, struct sk_buff *skb,
 
 		txd = tp->TxDescArray + entry;
 		len = frag->size;
-		addr = ((void *) page_address(frag->page)) + frag->page_offset;
+		addr = skb_frag_address(frag);
 		mapping = dma_map_single(d, addr, len, DMA_TO_DEVICE);
 		if (unlikely(dma_mapping_error(d, mapping))) {
 			if (net_ratelimit())
diff --git a/drivers/net/s2io.c b/drivers/net/s2io.c
index df0d2c8..1901b26 100644
--- a/drivers/net/s2io.c
+++ b/drivers/net/s2io.c
@@ -4242,10 +4242,9 @@ static netdev_tx_t s2io_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (!frag->size)
 			continue;
 		txdp++;
-		txdp->Buffer_Pointer = (u64)pci_map_page(sp->pdev, frag->page,
-							 frag->page_offset,
-							 frag->size,
-							 PCI_DMA_TODEVICE);
+		txdp->Buffer_Pointer = (u64)skb_frag_pci_map(sp->pdev, frag,
+							     0, frag->size,
+							     PCI_DMA_TODEVICE);
 		txdp->Control_1 = TXD_BUFFER0_SIZE(frag->size);
 		if (offload_type == SKB_GSO_UDP)
 			txdp->Control_1 |= TXD_UFO_EN;
diff --git a/drivers/net/sfc/rx.c b/drivers/net/sfc/rx.c
index 62e4364..91a6b71 100644
--- a/drivers/net/sfc/rx.c
+++ b/drivers/net/sfc/rx.c
@@ -478,7 +478,7 @@ static void efx_rx_packet_gro(struct efx_channel *channel,
 		if (efx->net_dev->features & NETIF_F_RXHASH)
 			skb->rxhash = efx_rx_buf_hash(eh);
 
-		skb_shinfo(skb)->frags[0].page = page;
+		skb_frag_set_page(skb, 0, page);
 		skb_shinfo(skb)->frags[0].page_offset =
 			efx_rx_buf_offset(efx, rx_buf);
 		skb_shinfo(skb)->frags[0].size = rx_buf->len;
diff --git a/drivers/net/sfc/tx.c b/drivers/net/sfc/tx.c
index 84eb99e..237adc4 100644
--- a/drivers/net/sfc/tx.c
+++ b/drivers/net/sfc/tx.c
@@ -137,8 +137,6 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 	struct pci_dev *pci_dev = efx->pci_dev;
 	struct efx_tx_buffer *buffer;
 	skb_frag_t *fragment;
-	struct page *page;
-	int page_offset;
 	unsigned int len, unmap_len = 0, fill_level, insert_ptr;
 	dma_addr_t dma_addr, unmap_addr = 0;
 	unsigned int dma_len;
@@ -241,12 +239,10 @@ netdev_tx_t efx_enqueue_skb(struct efx_tx_queue *tx_queue, struct sk_buff *skb)
 			break;
 		fragment = &skb_shinfo(skb)->frags[i];
 		len = fragment->size;
-		page = fragment->page;
-		page_offset = fragment->page_offset;
 		i++;
 		/* Map for DMA */
 		unmap_single = false;
-		dma_addr = pci_map_page(pci_dev, page, page_offset, len,
+		dma_addr = skb_frag_pci_map(pci_dev, fragment, 0, len,
 					PCI_DMA_TODEVICE);
 	}
 
@@ -929,8 +925,7 @@ static void tso_start(struct tso_state *st, const struct sk_buff *skb)
 static int tso_get_fragment(struct tso_state *st, struct efx_nic *efx,
 			    skb_frag_t *frag)
 {
-	st->unmap_addr = pci_map_page(efx->pci_dev, frag->page,
-				      frag->page_offset, frag->size,
+	st->unmap_addr = skb_frag_pci_map(efx->pci_dev, frag, 0, frag->size,
 				      PCI_DMA_TODEVICE);
 	if (likely(!pci_dma_mapping_error(efx->pci_dev, st->unmap_addr))) {
 		st->unmap_single = false;
diff --git a/drivers/net/skge.c b/drivers/net/skge.c
index f4be5c7..8b80db5 100644
--- a/drivers/net/skge.c
+++ b/drivers/net/skge.c
@@ -2747,8 +2747,8 @@ static netdev_tx_t skge_xmit_frame(struct sk_buff *skb,
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-			map = pci_map_page(hw->pdev, frag->page, frag->page_offset,
-					   frag->size, PCI_DMA_TODEVICE);
+			map = skb_frag_pci_map(hw->pdev, frag, 0, frag->size,
+					       PCI_DMA_TODEVICE);
 
 			e = e->next;
 			e->skb = skb;
diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 3ee41da..1f63f9f 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1143,10 +1143,9 @@ static int sky2_rx_map_skb(struct pci_dev *pdev, struct rx_ring_info *re,
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		re->frag_addr[i] = pci_map_page(pdev, frag->page,
-						frag->page_offset,
-						frag->size,
-						PCI_DMA_FROMDEVICE);
+		re->frag_addr[i] = skb_frag_pci_map(pdev, frag, 0,
+						    frag->size,
+						    PCI_DMA_FROMDEVICE);
 
 		if (pci_dma_mapping_error(pdev, re->frag_addr[i]))
 			goto map_page_error;
@@ -1826,8 +1825,8 @@ static netdev_tx_t sky2_xmit_frame(struct sk_buff *skb,
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		const skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		mapping = pci_map_page(hw->pdev, frag->page, frag->page_offset,
-				       frag->size, PCI_DMA_TODEVICE);
+		mapping = skb_frag_pci_map(hw->pdev, frag, 0, frag->size,
+					   PCI_DMA_TODEVICE);
 
 		if (pci_dma_mapping_error(hw->pdev, mapping))
 			goto mapping_unwind;
@@ -2360,7 +2359,7 @@ static void skb_put_frags(struct sk_buff *skb, unsigned int hdr_space,
 
 		if (length == 0) {
 			/* don't need this page */
-			__free_page(frag->page);
+			__skb_frag_unref(frag);
 			--skb_shinfo(skb)->nr_frags;
 		} else {
 			size = min(length, (unsigned) PAGE_SIZE);
diff --git a/drivers/net/starfire.c b/drivers/net/starfire.c
index 36045f3..a0c8f34 100644
--- a/drivers/net/starfire.c
+++ b/drivers/net/starfire.c
@@ -1270,7 +1270,7 @@ static netdev_tx_t start_tx(struct sk_buff *skb, struct net_device *dev)
 			skb_frag_t *this_frag = &skb_shinfo(skb)->frags[i - 1];
 			status |= this_frag->size;
 			np->tx_info[entry].mapping =
-				pci_map_single(np->pci_dev, page_address(this_frag->page) + this_frag->page_offset, this_frag->size, PCI_DMA_TODEVICE);
+				pci_map_single(np->pci_dev, skb_frag_address(this_frag), this_frag->size, PCI_DMA_TODEVICE);
 		}
 
 		np->tx_ring[entry].addr = cpu_to_dma(np->tx_info[entry].mapping);
diff --git a/drivers/net/stmmac/stmmac_main.c b/drivers/net/stmmac/stmmac_main.c
index e25e44a..5157624 100644
--- a/drivers/net/stmmac/stmmac_main.c
+++ b/drivers/net/stmmac/stmmac_main.c
@@ -1040,9 +1040,8 @@ static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 		desc = priv->dma_tx + entry;
 
 		TX_DBG("\t[entry %d] segment len: %d\n", entry, len);
-		desc->des2 = dma_map_page(priv->device, frag->page,
-					  frag->page_offset,
-					  len, DMA_TO_DEVICE);
+		desc->des2 = skb_frag_dma_map(priv->device, frag, 0, len,
+					      DMA_TO_DEVICE);
 		priv->tx_skbuff[entry] = NULL;
 		priv->hw->desc->prepare_tx_desc(desc, 0, len, csum_insertion);
 		priv->hw->desc->set_tx_owner(desc);
diff --git a/drivers/net/sungem.c b/drivers/net/sungem.c
index ab59300..e3c1306 100644
--- a/drivers/net/sungem.c
+++ b/drivers/net/sungem.c
@@ -1078,10 +1078,9 @@ static netdev_tx_t gem_start_xmit(struct sk_buff *skb,
 			u64 this_ctrl;
 
 			len = this_frag->size;
-			mapping = pci_map_page(gp->pdev,
-					       this_frag->page,
-					       this_frag->page_offset,
-					       len, PCI_DMA_TODEVICE);
+			mapping = skb_frag_pci_map(gp->pdev,
+						   this_frag, 0,
+						   len, PCI_DMA_TODEVICE);
 			this_ctrl = ctrl;
 			if (frag == skb_shinfo(skb)->nr_frags - 1)
 				this_ctrl |= TXDCTRL_EOF;
diff --git a/drivers/net/sunhme.c b/drivers/net/sunhme.c
index 30aad54..3baef5e 100644
--- a/drivers/net/sunhme.c
+++ b/drivers/net/sunhme.c
@@ -2315,9 +2315,8 @@ static netdev_tx_t happy_meal_start_xmit(struct sk_buff *skb,
 			u32 len, mapping, this_txflags;
 
 			len = this_frag->size;
-			mapping = dma_map_page(hp->dma_dev, this_frag->page,
-					       this_frag->page_offset, len,
-					       DMA_TO_DEVICE);
+			mapping = skb_frag_dma_map(hp->dma_dev, this_frag,
+						   0, len, DMA_TO_DEVICE);
 			this_txflags = tx_flags;
 			if (frag == skb_shinfo(skb)->nr_frags - 1)
 				this_txflags |= TXFLAG_EOP;
diff --git a/drivers/net/tehuti.c b/drivers/net/tehuti.c
index 80fbee0..b817616 100644
--- a/drivers/net/tehuti.c
+++ b/drivers/net/tehuti.c
@@ -1520,8 +1520,8 @@ bdx_tx_map_skb(struct bdx_priv *priv, struct sk_buff *skb,
 		frag = &skb_shinfo(skb)->frags[i];
 		db->wptr->len = frag->size;
 		db->wptr->addr.dma =
-		    pci_map_page(priv->pdev, frag->page, frag->page_offset,
-				 frag->size, PCI_DMA_TODEVICE);
+		    skb_frag_pci_map(priv->pdev, frag, 0, frag->size,
+				     PCI_DMA_TODEVICE);
 
 		pbl++;
 		pbl->len = CPU_CHIP_SWAP32(db->wptr->len);
diff --git a/drivers/net/tg3.c b/drivers/net/tg3.c
index a1f9f9e..935f4eb 100644
--- a/drivers/net/tg3.c
+++ b/drivers/net/tg3.c
@@ -6040,10 +6040,8 @@ static netdev_tx_t tg3_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
 			len = frag->size;
-			mapping = pci_map_page(tp->pdev,
-					       frag->page,
-					       frag->page_offset,
-					       len, PCI_DMA_TODEVICE);
+			mapping = skb_frag_pci_map(tp->pdev, frag, 0, len,
+						   PCI_DMA_TODEVICE);
 
 			tnapi->tx_buffers[entry].skb = NULL;
 			dma_unmap_addr_set(&tnapi->tx_buffers[entry], mapping,
diff --git a/drivers/net/tsi108_eth.c b/drivers/net/tsi108_eth.c
index 5c633a3..52f89a5 100644
--- a/drivers/net/tsi108_eth.c
+++ b/drivers/net/tsi108_eth.c
@@ -710,9 +710,10 @@ static int tsi108_send_packet(struct sk_buff * skb, struct net_device *dev)
 		} else {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i - 1];
 
-			data->txring[tx].buf0 =
-			    dma_map_page(NULL, frag->page, frag->page_offset,
-					    frag->size, DMA_TO_DEVICE);
+			data->txring[tx].buf0 = skb_frag_dma_map(NULL, frag,
+								 0,
+								 frag->size,
+								 DMA_TO_DEVICE);
 			data->txring[tx].len = frag->size;
 		}
 
diff --git a/drivers/net/typhoon.c b/drivers/net/typhoon.c
index 3de4283..eb32147 100644
--- a/drivers/net/typhoon.c
+++ b/drivers/net/typhoon.c
@@ -819,8 +819,7 @@ typhoon_start_tx(struct sk_buff *skb, struct net_device *dev)
 			typhoon_inc_tx_index(&txRing->lastWrite, 1);
 
 			len = frag->size;
-			frag_addr = (void *) page_address(frag->page) +
-						frag->page_offset;
+			frag_addr = skb_frag_address(frag);
 			skb_dma = pci_map_single(tp->tx_pdev, frag_addr, len,
 					 PCI_DMA_TODEVICE);
 			txd->flags = TYPHOON_FRAG_DESC | TYPHOON_DESC_VALID;
diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index 06daa9d..9c57836 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -2580,9 +2580,9 @@ static netdev_tx_t velocity_xmit(struct sk_buff *skb,
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
 
-		tdinfo->skb_dma[i + 1] = pci_map_page(vptr->pdev, frag->page,
-				frag->page_offset, frag->size,
-				PCI_DMA_TODEVICE);
+		tdinfo->skb_dma[i + 1] = skb_frag_pci_map(vptr->pdev, frag,
+							  0, frag->size,
+							  PCI_DMA_TODEVICE);
 
 		td_ptr->td_buf[i + 1].pa_low = cpu_to_le32(tdinfo->skb_dma[i + 1]);
 		td_ptr->td_buf[i + 1].pa_high = 0;
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f685324..c35ae8f 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -137,7 +137,7 @@ static void set_skb_frag(struct sk_buff *skb, struct page *page,
 	f = &skb_shinfo(skb)->frags[i];
 	f->size = min((unsigned)PAGE_SIZE - offset, *len);
 	f->page_offset = offset;
-	f->page = page;
+	__skb_frag_set_page(f, page);
 
 	skb->data_len += f->size;
 	skb->len += f->size;
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index fa6e2ac..3564169 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -650,7 +650,7 @@ vmxnet3_append_frag(struct sk_buff *skb, struct Vmxnet3_RxCompDesc *rcd,
 
 	BUG_ON(skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS);
 
-	frag->page = rbi->page;
+	__skb_frag_set_page(frag, rbi->page);
 	frag->page_offset = 0;
 	frag->size = rcd->len;
 	skb->data_len += frag->size;
@@ -744,9 +744,8 @@ vmxnet3_map_pkt(struct sk_buff *skb, struct vmxnet3_tx_ctx *ctx,
 
 		tbi = tq->buf_info + tq->tx_ring.next2fill;
 		tbi->map_type = VMXNET3_MAP_PAGE;
-		tbi->dma_addr = pci_map_page(adapter->pdev, frag->page,
-					     frag->page_offset, frag->size,
-					     PCI_DMA_TODEVICE);
+		tbi->dma_addr = skb_frag_pci_map(adapter->pdev, frag, 0,
+						 frag->size, PCI_DMA_TODEVICE);
 
 		tbi->len = frag->size;
 
diff --git a/drivers/net/vxge/vxge-main.c b/drivers/net/vxge/vxge-main.c
index 8ab870a..a802036 100644
--- a/drivers/net/vxge/vxge-main.c
+++ b/drivers/net/vxge/vxge-main.c
@@ -921,9 +921,9 @@ vxge_xmit(struct sk_buff *skb, struct net_device *dev)
 		if (!frag->size)
 			continue;
 
-		dma_pointer = (u64) pci_map_page(fifo->pdev, frag->page,
-				frag->page_offset, frag->size,
-				PCI_DMA_TODEVICE);
+		dma_pointer = (u64) skb_frag_pci_map(fifo->pdev, frag, 0,
+						     frag->size,
+						     PCI_DMA_TODEVICE);
 
 		if (unlikely(pci_dma_mapping_error(fifo->pdev, dma_pointer)))
 			goto _exit2;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 0e4851b..5c79483 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -215,6 +215,16 @@ static int get_page_ext(struct page *pg,
 			 sizeof(struct iphdr) + MAX_IPOPTLEN + \
 			 sizeof(struct tcphdr) + MAX_TCP_OPTION_SPACE)
 
+static unsigned long frag_get_pending_idx(skb_frag_t *frag)
+{
+	return (unsigned long)skb_frag_page(frag);
+}
+
+static void frag_set_pending_idx(skb_frag_t *frag, unsigned long pending_idx)
+{
+	__skb_frag_set_page(frag, (void *)pending_idx);
+}
+
 static inline pending_ring_idx_t pending_index(unsigned i)
 {
 	return i & (MAX_PENDING_REQS-1);
@@ -512,7 +522,7 @@ static int netbk_gop_skb(struct sk_buff *skb,
 
 	for (i = 0; i < nr_frags; i++) {
 		netbk_gop_frag_copy(vif, skb, npo,
-				    skb_shinfo(skb)->frags[i].page,
+				    __skb_frag_page(&skb_shinfo(skb)->frags[i]),
 				    skb_shinfo(skb)->frags[i].size,
 				    skb_shinfo(skb)->frags[i].page_offset,
 				    &head);
@@ -913,7 +923,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 	int i, start;
 
 	/* Skip first skb fragment if it is on same page as header fragment. */
-	start = ((unsigned long)shinfo->frags[0].page == pending_idx);
+	start = (frag_get_pending_idx(&shinfo->frags[0]) == pending_idx);
 
 	for (i = start; i < shinfo->nr_frags; i++, txp++) {
 		struct page *page;
@@ -945,7 +955,7 @@ static struct gnttab_copy *xen_netbk_get_requests(struct xen_netbk *netbk,
 		memcpy(&pending_tx_info[pending_idx].req, txp, sizeof(*txp));
 		xenvif_get(vif);
 		pending_tx_info[pending_idx].vif = vif;
-		frags[i].page = (void *)pending_idx;
+		frag_set_pending_idx(&frags[i], pending_idx);
 	}
 
 	return gop;
@@ -976,13 +986,13 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk,
 	}
 
 	/* Skip first skb fragment if it is on same page as header fragment. */
-	start = ((unsigned long)shinfo->frags[0].page == pending_idx);
+	start = (frag_get_pending_idx(&shinfo->frags[0]) == pending_idx);
 
 	for (i = start; i < nr_frags; i++) {
 		int j, newerr;
 		pending_ring_idx_t index;
 
-		pending_idx = (unsigned long)shinfo->frags[i].page;
+		pending_idx = frag_get_pending_idx(&shinfo->frags[i]);
 
 		/* Check error status: if okay then remember grant handle. */
 		newerr = (++gop)->status;
@@ -1008,7 +1018,7 @@ static int xen_netbk_tx_check_gop(struct xen_netbk *netbk,
 		pending_idx = *((u16 *)skb->data);
 		xen_netbk_idx_release(netbk, pending_idx);
 		for (j = start; j < i; j++) {
-			pending_idx = (unsigned long)shinfo->frags[i].page;
+			pending_idx = frag_get_pending_idx(&shinfo->frags[i]);
 			xen_netbk_idx_release(netbk, pending_idx);
 		}
 
@@ -1029,12 +1039,14 @@ static void xen_netbk_fill_frags(struct xen_netbk *netbk, struct sk_buff *skb)
 	for (i = 0; i < nr_frags; i++) {
 		skb_frag_t *frag = shinfo->frags + i;
 		struct xen_netif_tx_request *txp;
+		struct page *page;
 		unsigned long pending_idx;
 
-		pending_idx = (unsigned long)frag->page;
+		pending_idx = frag_get_pending_idx(frag);
 
 		txp = &netbk->pending_tx_info[pending_idx].req;
-		frag->page = virt_to_page(idx_to_kaddr(netbk, pending_idx));
+		page = virt_to_page(idx_to_kaddr(netbk, pending_idx));
+		__skb_frag_set_page(frag, page);
 		frag->size = txp->size;
 		frag->page_offset = txp->offset;
 
@@ -1349,11 +1361,11 @@ static unsigned xen_netbk_tx_build_gops(struct xen_netbk *netbk)
 		skb_shinfo(skb)->nr_frags = ret;
 		if (data_len < txreq.size) {
 			skb_shinfo(skb)->nr_frags++;
-			skb_shinfo(skb)->frags[0].page =
-				(void *)(unsigned long)pending_idx;
+			frag_set_pending_idx(&skb_shinfo(skb)->frags[0],
+					     pending_idx);
 		} else {
 			/* Discriminate from any valid pending_idx value. */
-			skb_shinfo(skb)->frags[0].page = (void *)~0UL;
+			frag_set_pending_idx(&skb_shinfo(skb)->frags[0], ~0UL);
 		}
 
 		__skb_queue_tail(&netbk->tx_queue, skb);
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index d29365a..ecc4b4b 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -265,7 +265,7 @@ no_skb:
 			break;
 		}
 
-		skb_shinfo(skb)->frags[0].page = page;
+		skb_frag_set_page(skb, 0, page);
 		skb_shinfo(skb)->nr_frags = 1;
 		__skb_queue_tail(&np->rx_batch, skb);
 	}
@@ -299,8 +299,8 @@ no_skb:
 		BUG_ON((signed short)ref < 0);
 		np->grant_rx_ref[id] = ref;
 
-		pfn = page_to_pfn(skb_shinfo(skb)->frags[0].page);
-		vaddr = page_address(skb_shinfo(skb)->frags[0].page);
+		pfn = page_to_pfn(skb_frag_page(&skb_shinfo(skb)->frags[0]));
+		vaddr = page_address(skb_frag_page(&skb_shinfo(skb)->frags[0]));
 
 		req = RING_GET_REQUEST(&np->rx, req_prod + i);
 		gnttab_grant_foreign_access_ref(ref,
@@ -451,7 +451,7 @@ static void xennet_make_frags(struct sk_buff *skb, struct net_device *dev,
 		ref = gnttab_claim_grant_reference(&np->gref_tx_head);
 		BUG_ON((signed short)ref < 0);
 
-		mfn = pfn_to_mfn(page_to_pfn(frag->page));
+		mfn = pfn_to_mfn(page_to_pfn(skb_frag_page(frag)));
 		gnttab_grant_foreign_access_ref(ref, np->xbdev->otherend_id,
 						mfn, GNTMAP_readonly);
 
@@ -755,8 +755,9 @@ static RING_IDX xennet_fill_frags(struct netfront_info *np,
 	while ((nskb = __skb_dequeue(list))) {
 		struct xen_netif_rx_response *rx =
 			RING_GET_RESPONSE(&np->rx, ++cons);
+		skb_frag_t *nfrag = &skb_shinfo(nskb)->frags[0];
 
-		frag->page = skb_shinfo(nskb)->frags[0].page;
+		__skb_frag_set_page(frag, __skb_frag_page(nfrag));
 		frag->page_offset = rx->offset;
 		frag->size = rx->status;
 
@@ -858,7 +859,7 @@ static int handle_incoming_queue(struct net_device *dev,
 		memcpy(skb->data, vaddr + offset,
 		       skb_headlen(skb));
 
-		if (page != skb_shinfo(skb)->frags[0].page)
+		if (page != skb_frag_page(&skb_shinfo(skb)->frags[0]))
 			__free_page(page);
 
 		/* Ethernet work: Delayed to here as it peeks the header. */
@@ -937,7 +938,8 @@ err:
 			}
 		}
 
-		NETFRONT_SKB_CB(skb)->page = skb_shinfo(skb)->frags[0].page;
+		NETFRONT_SKB_CB(skb)->page =
+			__skb_frag_page(&skb_shinfo(skb)->frags[0]);
 		NETFRONT_SKB_CB(skb)->offset = rx->offset;
 
 		len = rx->status;
@@ -951,7 +953,7 @@ err:
 			skb_shinfo(skb)->frags[0].size = rx->status - len;
 			skb->data_len = rx->status - len;
 		} else {
-			skb_shinfo(skb)->frags[0].page = NULL;
+			skb_frag_set_page(skb, 0, NULL);
 			skb_shinfo(skb)->nr_frags = 0;
 		}
 
@@ -1094,7 +1096,8 @@ static void xennet_release_rx_bufs(struct netfront_info *np)
 
 		if (!xen_feature(XENFEAT_auto_translated_physmap)) {
 			/* Remap the page. */
-			struct page *page = skb_shinfo(skb)->frags[0].page;
+			const struct page *page =
+				skb_frag_page(&skb_shinfo(skb)->frags[0]);
 			unsigned long pfn = page_to_pfn(page);
 			void *vaddr = page_address(page);
 
@@ -1593,6 +1596,8 @@ static int xennet_connect(struct net_device *dev)
 
 	/* Step 2: Rebuild the RX buffer freelist and the RX ring itself. */
 	for (requeue_idx = 0, i = 0; i < NET_RX_RING_SIZE; i++) {
+		skb_frag_t *frag;
+		const struct page *page;
 		if (!np->rx_skbs[i])
 			continue;
 
@@ -1600,10 +1605,11 @@ static int xennet_connect(struct net_device *dev)
 		ref = np->grant_rx_ref[requeue_idx] = xennet_get_rx_ref(np, i);
 		req = RING_GET_REQUEST(&np->rx, requeue_idx);
 
+		frag = &skb_shinfo(skb)->frags[0];
+		page = skb_frag_page(frag);
 		gnttab_grant_foreign_access_ref(
 			ref, np->xbdev->otherend_id,
-			pfn_to_mfn(page_to_pfn(skb_shinfo(skb)->
-					       frags->page)),
+			pfn_to_mfn(page_to_pfn(page)),
 			0);
 		req->gref = ref;
 		req->id   = requeue_idx;
diff --git a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
index ab255fb..f7a3517 100644
--- a/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
+++ b/drivers/scsi/bnx2fc/bnx2fc_fcoe.c
@@ -296,7 +296,7 @@ static int bnx2fc_xmit(struct fc_lport *lport, struct fc_frame *fp)
 			return -ENOMEM;
 		}
 		frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
-		cp = kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ)
+		cp = kmap_atomic(__skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
 				+ frag->page_offset;
 	} else {
 		cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
diff --git a/drivers/scsi/cxgbi/libcxgbi.c b/drivers/scsi/cxgbi/libcxgbi.c
index a2a9c7c..949ee48 100644
--- a/drivers/scsi/cxgbi/libcxgbi.c
+++ b/drivers/scsi/cxgbi/libcxgbi.c
@@ -1812,7 +1812,7 @@ static int sgl_read_to_frags(struct scatterlist *sg, unsigned int sgoffset,
 
 		}
 		copy = min(datalen, sglen);
-		if (i && page == frags[i - 1].page &&
+		if (i && page == skb_frag_page(&frags[i - 1]) &&
 		    sgoffset + sg->offset ==
 			frags[i - 1].page_offset + frags[i - 1].size) {
 			frags[i - 1].size += copy;
@@ -1948,7 +1948,7 @@ int cxgbi_conn_init_pdu(struct iscsi_task *task, unsigned int offset,
 
 			/* data fits in the skb's headroom */
 			for (i = 0; i < tdata->nr_frags; i++, frag++) {
-				char *src = kmap_atomic(frag->page,
+				char *src = kmap_atomic(__skb_frag_page(frag),
 							KM_SOFTIRQ0);
 
 				memcpy(dst, src+frag->page_offset, frag->size);
@@ -1963,7 +1963,7 @@ int cxgbi_conn_init_pdu(struct iscsi_task *task, unsigned int offset,
 		} else {
 			/* data fit into frag_list */
 			for (i = 0; i < tdata->nr_frags; i++)
-				get_page(tdata->frags[i].page);
+				__skb_frag_ref(&tdata->frags[i]);
 
 			memcpy(skb_shinfo(skb)->frags, tdata->frags,
 				sizeof(skb_frag_t) * tdata->nr_frags);
diff --git a/drivers/scsi/fcoe/fcoe.c b/drivers/scsi/fcoe/fcoe.c
index 155d7b9..deee71a 100644
--- a/drivers/scsi/fcoe/fcoe.c
+++ b/drivers/scsi/fcoe/fcoe.c
@@ -1425,7 +1425,7 @@ int fcoe_xmit(struct fc_lport *lport, struct fc_frame *fp)
 			return -ENOMEM;
 		}
 		frag = &skb_shinfo(skb)->frags[skb_shinfo(skb)->nr_frags - 1];
-		cp = kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ)
+		cp = kmap_atomic(__skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ)
 			+ frag->page_offset;
 	} else {
 		cp = (struct fcoe_crc_eof *)skb_put(skb, tlen);
diff --git a/drivers/scsi/fcoe/fcoe_transport.c b/drivers/scsi/fcoe/fcoe_transport.c
index 41068e8..40243ce 100644
--- a/drivers/scsi/fcoe/fcoe_transport.c
+++ b/drivers/scsi/fcoe/fcoe_transport.c
@@ -108,8 +108,9 @@ u32 fcoe_fc_crc(struct fc_frame *fp)
 		len = frag->size;
 		while (len > 0) {
 			clen = min(len, PAGE_SIZE - (off & ~PAGE_MASK));
-			data = kmap_atomic(frag->page + (off >> PAGE_SHIFT),
-					   KM_SKB_DATA_SOFTIRQ);
+			data = kmap_atomic(
+				__skb_frag_page(frag) + (off >> PAGE_SHIFT),
+				KM_SKB_DATA_SOFTIRQ);
 			crc = crc32(crc, data + (off & ~PAGE_MASK), clen);
 			kunmap_atomic(data, KM_SKB_DATA_SOFTIRQ);
 			off += clen;
diff --git a/drivers/staging/et131x/et1310_tx.c b/drivers/staging/et131x/et1310_tx.c
index 4241d2a..b2ab240 100644
--- a/drivers/staging/et131x/et1310_tx.c
+++ b/drivers/staging/et131x/et1310_tx.c
@@ -519,12 +519,11 @@ static int nic_send_packet(struct et131x_adapter *etdev, struct tcb *tcb)
 			 * returned by pci_map_page() is always 32-bit
 			 * addressable (as defined by the pci/dma subsystem)
 			 */
-			desc[frag++].addr_lo =
-			    pci_map_page(etdev->pdev,
-					 frags[i - 1].page,
-					 frags[i - 1].page_offset,
-					 frags[i - 1].size,
-					 PCI_DMA_TODEVICE);
+			desc[frag++].addr_lo = skb_frag_pci_map(etdev->pdev,
+								&frags[i - 1],
+								0,
+								frags[i - 1].size,
+								PCI_DMA_TODEVICE);
 		}
 	}
 
diff --git a/drivers/staging/hv/netvsc_drv.c b/drivers/staging/hv/netvsc_drv.c
index 7b9c229..80d1f1f 100644
--- a/drivers/staging/hv/netvsc_drv.c
+++ b/drivers/staging/hv/netvsc_drv.c
@@ -172,7 +172,7 @@ static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		skb_frag_t *f = &skb_shinfo(skb)->frags[i];
 
-		packet->page_buf[i+2].pfn = page_to_pfn(f->page);
+		packet->page_buf[i+2].pfn = page_to_pfn(skb_frag_page(f));
 		packet->page_buf[i+2].offset = f->page_offset;
 		packet->page_buf[i+2].len = f->size;
 	}
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 05/10] net: convert protocols to SKB frag APIs
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

NB: should likely be split into per-protocol patches.

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 net/ipv4/inet_lro.c    |    2 +-
 net/ipv4/ip_output.c   |    8 +++++---
 net/ipv4/tcp.c         |    3 ++-
 net/ipv4/tcp_output.c  |    2 +-
 net/ipv6/ip6_output.c  |    8 +++++---
 net/xfrm/xfrm_ipcomp.c |   11 +++++++----
 6 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/net/ipv4/inet_lro.c b/net/ipv4/inet_lro.c
index 85a0f75..63f3def 100644
--- a/net/ipv4/inet_lro.c
+++ b/net/ipv4/inet_lro.c
@@ -449,7 +449,7 @@ static struct sk_buff *__lro_proc_segment(struct net_lro_mgr *lro_mgr,
 	if (!lro_mgr->get_frag_header ||
 	    lro_mgr->get_frag_header(frags, (void *)&mac_hdr, (void *)&iph,
 				     (void *)&tcph, &flags, priv)) {
-		mac_hdr = page_address(frags->page) + frags->page_offset;
+		mac_hdr = skb_frag_address(frags);
 		goto out1;
 	}
 
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index a8024ea..29941ec 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -985,14 +985,14 @@ alloc_new_skb:
 			if (page && (left = PAGE_SIZE - off) > 0) {
 				if (copy >= left)
 					copy = left;
-				if (page != frag->page) {
+				if (page != skb_frag_page(frag)) {
 					if (i == MAX_SKB_FRAGS) {
 						err = -EMSGSIZE;
 						goto error;
 					}
-					get_page(page);
 					skb_fill_page_desc(skb, i, page, off, 0);
 					frag = &skb_shinfo(skb)->frags[i];
+					__skb_frag_ref(frag);
 				}
 			} else if (i < MAX_SKB_FRAGS) {
 				if (copy > PAGE_SIZE)
@@ -1005,13 +1005,15 @@ alloc_new_skb:
 				cork->page = page;
 				cork->off = 0;
 
+				/* XXX no ref ? */
 				skb_fill_page_desc(skb, i, page, 0, 0);
 				frag = &skb_shinfo(skb)->frags[i];
 			} else {
 				err = -EMSGSIZE;
 				goto error;
 			}
-			if (getfrag(from, page_address(frag->page)+frag->page_offset+frag->size, offset, copy, skb->len, skb) < 0) {
+			if (getfrag(from, skb_frag_address(frag)+frag->size,
+				    offset, copy, skb->len, skb) < 0) {
 				err = -EFAULT;
 				goto error;
 			}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 054a59d..3a3703c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3035,7 +3035,8 @@ int tcp_md5_hash_skb_data(struct tcp_md5sig_pool *hp,
 
 	for (i = 0; i < shi->nr_frags; ++i) {
 		const struct skb_frag_struct *f = &shi->frags[i];
-		sg_set_page(&sg, f->page, f->size, f->page_offset);
+		struct page *page = __skb_frag_page(f); /* XXX */
+		sg_set_page(&sg, page, f->size, f->page_offset);
 		if (crypto_hash_update(desc, &sg, f->size))
 			return 1;
 	}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 882e0b0..0377c06 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -1095,7 +1095,7 @@ static void __pskb_trim_head(struct sk_buff *skb, int len)
 	k = 0;
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		if (skb_shinfo(skb)->frags[i].size <= eat) {
-			put_page(skb_shinfo(skb)->frags[i].page);
+			skb_frag_unref(skb, i);
 			eat -= skb_shinfo(skb)->frags[i].size;
 		} else {
 			skb_shinfo(skb)->frags[k] = skb_shinfo(skb)->frags[i];
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 9d4b165..fdd4f61 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1441,14 +1441,14 @@ alloc_new_skb:
 			if (page && (left = PAGE_SIZE - off) > 0) {
 				if (copy >= left)
 					copy = left;
-				if (page != frag->page) {
+				if (page != skb_frag_page(frag)) {
 					if (i == MAX_SKB_FRAGS) {
 						err = -EMSGSIZE;
 						goto error;
 					}
-					get_page(page);
 					skb_fill_page_desc(skb, i, page, sk->sk_sndmsg_off, 0);
 					frag = &skb_shinfo(skb)->frags[i];
+					__skb_frag_ref(frag);
 				}
 			} else if(i < MAX_SKB_FRAGS) {
 				if (copy > PAGE_SIZE)
@@ -1461,13 +1461,15 @@ alloc_new_skb:
 				sk->sk_sndmsg_page = page;
 				sk->sk_sndmsg_off = 0;
 
+				/* XXX no ref ? */
 				skb_fill_page_desc(skb, i, page, 0, 0);
 				frag = &skb_shinfo(skb)->frags[i];
 			} else {
 				err = -EMSGSIZE;
 				goto error;
 			}
-			if (getfrag(from, page_address(frag->page)+frag->page_offset+frag->size, offset, copy, skb->len, skb) < 0) {
+			if (getfrag(from, skb_frag_address(frag)+frag->size,
+				    offset, copy, skb->len, skb) < 0) {
 				err = -EFAULT;
 				goto error;
 			}
diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index fc91ad7..f781b9a 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -70,26 +70,29 @@ static int ipcomp_decompress(struct xfrm_state *x, struct sk_buff *skb)
 
 	while ((scratch += len, dlen -= len) > 0) {
 		skb_frag_t *frag;
+		struct page *page;
 
 		err = -EMSGSIZE;
 		if (WARN_ON(skb_shinfo(skb)->nr_frags >= MAX_SKB_FRAGS))
 			goto out;
 
 		frag = skb_shinfo(skb)->frags + skb_shinfo(skb)->nr_frags;
-		frag->page = alloc_page(GFP_ATOMIC);
+		page = alloc_page(GFP_ATOMIC);
 
 		err = -ENOMEM;
-		if (!frag->page)
+		if (!page)
 			goto out;
 
+		__skb_frag_set_page(frag, page);
+
 		len = PAGE_SIZE;
 		if (dlen < len)
 			len = dlen;
 
-		memcpy(page_address(frag->page), scratch, len);
-
 		frag->page_offset = 0;
 		frag->size = len;
+		memcpy(skb_frag_address(frag), scratch, len);
+
 		skb->truesize += len;
 		skb->data_len += len;
 		skb->len += len;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 04/10] net: convert core to skb paged frag APIs
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 include/linux/skbuff.h |    4 ++--
 net/core/datagram.c    |   20 ++++++++------------
 net/core/dev.c         |    7 +++----
 net/core/kmap_skb.h    |    2 +-
 net/core/pktgen.c      |    3 +--
 net/core/skbuff.c      |   31 +++++++++++++++++--------------
 net/core/sock.c        |   12 +++++-------
 net/core/user_dma.c    |    2 +-
 8 files changed, 38 insertions(+), 43 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index c061257..982c6a3 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1927,12 +1927,12 @@ static inline int skb_add_data(struct sk_buff *skb,
 }
 
 static inline int skb_can_coalesce(struct sk_buff *skb, int i,
-				   struct page *page, int off)
+				   const struct page *page, int off)
 {
 	if (i) {
 		struct skb_frag_struct *frag = &skb_shinfo(skb)->frags[i - 1];
 
-		return page == frag->page &&
+		return page == skb_frag_page(frag) &&
 		       off == frag->page_offset + frag->size;
 	}
 	return 0;
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 18ac112..f0dcaa2 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -332,14 +332,13 @@ int skb_copy_datagram_iovec(const struct sk_buff *skb, int offset,
 			int err;
 			u8  *vaddr;
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-			struct page *page = frag->page;
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
+			vaddr = skb_frag_kmap(frag);
 			err = memcpy_toiovec(to, vaddr + frag->page_offset +
 					     offset - start, copy);
-			kunmap(page);
+			skb_frag_kunmap(frag);
 			if (err)
 				goto fault;
 			if (!(len -= copy))
@@ -418,14 +417,13 @@ int skb_copy_datagram_const_iovec(const struct sk_buff *skb, int offset,
 			int err;
 			u8  *vaddr;
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-			struct page *page = frag->page;
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
+			vaddr = skb_frag_kmap(frag);
 			err = memcpy_toiovecend(to, vaddr + frag->page_offset +
 						offset - start, to_offset, copy);
-			kunmap(page);
+			skb_frag_kunmap(frag);
 			if (err)
 				goto fault;
 			if (!(len -= copy))
@@ -508,15 +506,14 @@ int skb_copy_datagram_from_iovec(struct sk_buff *skb, int offset,
 			int err;
 			u8  *vaddr;
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-			struct page *page = frag->page;
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
+			vaddr = skb_frag_kmap(frag);
 			err = memcpy_fromiovecend(vaddr + frag->page_offset +
 						  offset - start,
 						  from, from_offset, copy);
-			kunmap(page);
+			skb_frag_kunmap(frag);
 			if (err)
 				goto fault;
 
@@ -594,16 +591,15 @@ static int skb_copy_and_csum_datagram(const struct sk_buff *skb, int offset,
 			int err = 0;
 			u8  *vaddr;
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-			struct page *page = frag->page;
 
 			if (copy > len)
 				copy = len;
-			vaddr = kmap(page);
+			vaddr = skb_frag_kmap(frag);
 			csum2 = csum_and_copy_to_user(vaddr +
 							frag->page_offset +
 							offset - start,
 						      to, copy, 0, &err);
-			kunmap(page);
+			skb_frag_kunmap(frag);
 			if (err)
 				goto fault;
 			*csump = csum_block_add(*csump, csum2, pos);
diff --git a/net/core/dev.c b/net/core/dev.c
index 9c58c1e..9ab39c0 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3414,7 +3414,7 @@ pull:
 		skb_shinfo(skb)->frags[0].size -= grow;
 
 		if (unlikely(!skb_shinfo(skb)->frags[0].size)) {
-			put_page(skb_shinfo(skb)->frags[0].page);
+			skb_frag_unref(skb, 0);
 			memmove(skb_shinfo(skb)->frags,
 				skb_shinfo(skb)->frags + 1,
 				--skb_shinfo(skb)->nr_frags * sizeof(skb_frag_t));
@@ -3478,10 +3478,9 @@ void skb_gro_reset_offset(struct sk_buff *skb)
 	NAPI_GRO_CB(skb)->frag0_len = 0;
 
 	if (skb->mac_header == skb->tail &&
-	    !PageHighMem(skb_shinfo(skb)->frags[0].page)) {
+	    !PageHighMem(skb_frag_page(&skb_shinfo(skb)->frags[0]))) {
 		NAPI_GRO_CB(skb)->frag0 =
-			page_address(skb_shinfo(skb)->frags[0].page) +
-			skb_shinfo(skb)->frags[0].page_offset;
+			skb_frag_address(&skb_shinfo(skb)->frags[0]);
 		NAPI_GRO_CB(skb)->frag0_len = skb_shinfo(skb)->frags[0].size;
 	}
 }
diff --git a/net/core/kmap_skb.h b/net/core/kmap_skb.h
index 283c2b9..b1e9711 100644
--- a/net/core/kmap_skb.h
+++ b/net/core/kmap_skb.h
@@ -7,7 +7,7 @@ static inline void *kmap_skb_frag(const skb_frag_t *frag)
 
 	local_bh_disable();
 #endif
-	return kmap_atomic(frag->page, KM_SKB_DATA_SOFTIRQ);
+	return kmap_atomic(__skb_frag_page(frag), KM_SKB_DATA_SOFTIRQ);
 }
 
 static inline void kunmap_skb_frag(void *vaddr)
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index f76079c..989b2b6 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2600,8 +2600,7 @@ static void pktgen_finalize_skb(struct pktgen_dev *pkt_dev, struct sk_buff *skb,
 				if (!pkt_dev->page)
 					break;
 			}
-			skb_shinfo(skb)->frags[i].page = pkt_dev->page;
-			get_page(pkt_dev->page);
+			skb_frag_set_page(skb, i, pkt_dev->page);
 			skb_shinfo(skb)->frags[i].page_offset = 0;
 			/*last fragment, fill rest of data*/
 			if (i == (frags - 1))
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 46cbd28..2133600 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -326,7 +326,7 @@ static void skb_release_data(struct sk_buff *skb)
 		if (skb_shinfo(skb)->nr_frags) {
 			int i;
 			for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
-				put_page(skb_shinfo(skb)->frags[i].page);
+				skb_frag_unref(skb, i);
 		}
 
 		if (skb_has_frag_list(skb))
@@ -733,7 +733,7 @@ struct sk_buff *pskb_copy(struct sk_buff *skb, gfp_t gfp_mask)
 
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 			skb_shinfo(n)->frags[i] = skb_shinfo(skb)->frags[i];
-			get_page(skb_shinfo(n)->frags[i].page);
+			skb_frag_ref(skb, i);
 		}
 		skb_shinfo(n)->nr_frags = i;
 	}
@@ -820,7 +820,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
 		kfree(skb->head);
 	} else {
 		for (i = 0; i < skb_shinfo(skb)->nr_frags; i++)
-			get_page(skb_shinfo(skb)->frags[i].page);
+			skb_frag_ref(skb, i);
 
 		if (skb_has_frag_list(skb))
 			skb_clone_fraglist(skb);
@@ -1098,7 +1098,7 @@ drop_pages:
 		skb_shinfo(skb)->nr_frags = i;
 
 		for (; i < nfrags; i++)
-			put_page(skb_shinfo(skb)->frags[i].page);
+			skb_frag_unref(skb, i);
 
 		if (skb_has_frag_list(skb))
 			skb_drop_fraglist(skb);
@@ -1267,7 +1267,7 @@ pull_pages:
 	k = 0;
 	for (i = 0; i < skb_shinfo(skb)->nr_frags; i++) {
 		if (skb_shinfo(skb)->frags[i].size <= eat) {
-			put_page(skb_shinfo(skb)->frags[i].page);
+			skb_frag_unref(skb, i);
 			eat -= skb_shinfo(skb)->frags[i].size;
 		} else {
 			skb_shinfo(skb)->frags[k] = skb_shinfo(skb)->frags[i];
@@ -1512,7 +1512,9 @@ static int __skb_splice_bits(struct sk_buff *skb, struct pipe_inode_info *pipe,
 	for (seg = 0; seg < skb_shinfo(skb)->nr_frags; seg++) {
 		const skb_frag_t *f = &skb_shinfo(skb)->frags[seg];
 
-		if (__splice_segment(f->page, f->page_offset, f->size,
+		/* XXX */
+		if (__splice_segment(__skb_frag_page(f),
+				     f->page_offset, f->size,
 				     offset, len, skb, spd, 0, sk, pipe))
 			return 1;
 	}
@@ -2057,7 +2059,7 @@ static inline void skb_split_no_header(struct sk_buff *skb,
 				 *    where splitting is expensive.
 				 * 2. Split is accurately. We make this.
 				 */
-				get_page(skb_shinfo(skb)->frags[i].page);
+				skb_frag_ref(skb, i);
 				skb_shinfo(skb1)->frags[0].page_offset += len - pos;
 				skb_shinfo(skb1)->frags[0].size -= len - pos;
 				skb_shinfo(skb)->frags[i].size	= len - pos;
@@ -2132,7 +2134,8 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
 	 * commit all, so that we don't have to undo partial changes
 	 */
 	if (!to ||
-	    !skb_can_coalesce(tgt, to, fragfrom->page, fragfrom->page_offset)) {
+	    !skb_can_coalesce(tgt, to, skb_frag_page(fragfrom),
+			      fragfrom->page_offset)) {
 		merge = -1;
 	} else {
 		merge = to - 1;
@@ -2179,7 +2182,7 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
 			to++;
 
 		} else {
-			get_page(fragfrom->page);
+			__skb_frag_ref(fragfrom);
 			fragto->page = fragfrom->page;
 			fragto->page_offset = fragfrom->page_offset;
 			fragto->size = todo;
@@ -2201,7 +2204,7 @@ int skb_shift(struct sk_buff *tgt, struct sk_buff *skb, int shiftlen)
 		fragto = &skb_shinfo(tgt)->frags[merge];
 
 		fragto->size += fragfrom->size;
-		put_page(fragfrom->page);
+		__skb_frag_unref(fragfrom);
 	}
 
 	/* Reposition in the original skb */
@@ -2446,8 +2449,7 @@ int skb_append_datato_frags(struct sock *sk, struct sk_buff *skb,
 		left = PAGE_SIZE - frag->page_offset;
 		copy = (length > left)? left : length;
 
-		ret = getfrag(from, (page_address(frag->page) +
-			    frag->page_offset + frag->size),
+		ret = getfrag(from, skb_frag_address(frag) + frag->size,
 			    offset, copy, 0, skb);
 		if (ret < 0)
 			return -EFAULT;
@@ -2599,7 +2601,7 @@ struct sk_buff *skb_segment(struct sk_buff *skb, u32 features)
 
 		while (pos < offset + len && i < nfrags) {
 			*frag = skb_shinfo(skb)->frags[i];
-			get_page(frag->page);
+			__skb_frag_ref(frag);
 			size = frag->size;
 
 			if (pos < offset) {
@@ -2822,7 +2824,8 @@ __skb_to_sgvec(struct sk_buff *skb, struct scatterlist *sg, int offset, int len)
 
 			if (copy > len)
 				copy = len;
-			sg_set_page(&sg[elt], frag->page, copy,
+			/* XXX */
+			sg_set_page(&sg[elt], __skb_frag_page(frag), copy,
 					frag->page_offset+offset-start);
 			elt++;
 			if (!(len -= copy))
diff --git a/net/core/sock.c b/net/core/sock.c
index 6e81978..0fb2160 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1530,7 +1530,6 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 				skb_shinfo(skb)->nr_frags = npages;
 				for (i = 0; i < npages; i++) {
 					struct page *page;
-					skb_frag_t *frag;
 
 					page = alloc_pages(sk->sk_allocation, 0);
 					if (!page) {
@@ -1540,12 +1539,11 @@ struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
 						goto failure;
 					}
 
-					frag = &skb_shinfo(skb)->frags[i];
-					frag->page = page;
-					frag->page_offset = 0;
-					frag->size = (data_len >= PAGE_SIZE ?
-						      PAGE_SIZE :
-						      data_len);
+					__skb_fill_page_desc(skb, i,
+							page, 0,
+							(data_len >= PAGE_SIZE ?
+							 PAGE_SIZE :
+							 data_len));
 					data_len -= PAGE_SIZE;
 				}
 
diff --git a/net/core/user_dma.c b/net/core/user_dma.c
index 25d717e..d22ec3e 100644
--- a/net/core/user_dma.c
+++ b/net/core/user_dma.c
@@ -78,7 +78,7 @@ int dma_skb_copy_datagram_iovec(struct dma_chan *chan,
 		copy = end - offset;
 		if (copy > 0) {
 			skb_frag_t *frag = &skb_shinfo(skb)->frags[i];
-			struct page *page = frag->page;
+			struct page *page = __skb_frag_page(frag); /* XXX */
 
 			if (copy > len)
 				copy = len;
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH 02/10] mm: use const struct page for r/o page-flag accessor methods
From: Ian Campbell @ 2011-07-15 11:07 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA, Ian Campbell
In-Reply-To: <1310728006.20648.3.camel-o4Be2W7LfRlXesXXhkcM7miJhflN2719@public.gmane.org>

Signed-off-by: Ian Campbell <ian.campbell-Sxgqhf6Nn4DQT0dZR+AlfA@public.gmane.org>
---
 include/linux/page-flags.h |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 6081493..7d632cc 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -135,7 +135,7 @@ enum pageflags {
  * Macros to create function definitions for page flags
  */
 #define TESTPAGEFLAG(uname, lname)					\
-static inline int Page##uname(struct page *page) 			\
+static inline int Page##uname(const struct page *page) 			\
 			{ return test_bit(PG_##lname, &page->flags); }
 
 #define SETPAGEFLAG(uname, lname)					\
@@ -173,7 +173,7 @@ static inline int __TestClearPage##uname(struct page *page)		\
 	__SETPAGEFLAG(uname, lname)  __CLEARPAGEFLAG(uname, lname)
 
 #define PAGEFLAG_FALSE(uname) 						\
-static inline int Page##uname(struct page *page) 			\
+static inline int Page##uname(const struct page *page) 			\
 			{ return 0; }
 
 #define TESTSCFLAG(uname, lname)					\
-- 
1.7.2.5

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* [PATCH/RFC 0/10] enable SKB paged fragment lifetime visibility
From: Ian Campbell @ 2011-07-15 11:06 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: linux-nfs-u79uwXL29TY76Z2rM5mHXA

Hi,

The following is my attempt to allow entities which inject pages into
the networking stack to receive a notification when the stack has really
finished with those pages (i.e. including retransmissions, clones,
pull-ups etc) and not just when the original skb is finished with. It
implements something broadly along the lines of what was described in
[0].

The series is a proof-of-concept but I have used it to implement a fix
for the NFS issue which I described in [1] (for O_DIRECT writes only, I
presume non O_DIRECT writes would benefit from the same treatment), by
delaying completion of the write() until the pages are no longer
referenced by the network stack (which can happen due to retransmissions
or cloning). I expect that other block and filesystem users of the
network subsystem (e.g. iSCSI) would also benefit from this
functionality since they will suffer from the same class of issue.

Although I've not rebased onto it yet (this series is on 3.0-rc5) I also
expect it would be possible to remove the need to copy on clone which
was recently added to support the SKBTX_DEV_ZEROCOPY stuff by Shirley
Ma. I also expect that this functionality will be useful in my attempts
to add foreign page mapping to Xen's netback (per [2]).

Lastly I think the AF_PACKET mmap'd TX ring completion could also
benefit, although I wasn't able to cause an actual failure in that case,
it seems like cloning of skb's would cause pages which are still
referenced by the stack to be released back to userspace.

In order to do this I have introduced an API to manipulate an SKBs paged
fragments (which unfortunately necessitated changing each driver),
including an explicit fragment ref and unref API to replace the direct
use of get/put_page. Using those I was then able to add an optional
extra layer of reference counting to the paged fragments which can be
used by the creator of the fragment to receive a callback at the time
the page would normally be freed.

What is the general feeling regarding this approach?

The series has been built allmodconfig on x86_64 so I have likely missed
some arch-specific drivers etc. I'll take care of that in future
postings, as well as addressing the issues mentioned in some of the
commit messages.

Ian.

[0] http://marc.info/?l=linux-netdev&m=130925719513084&w=2
[1] http://marc.info/?l=linux-nfs&m=122424132729720&w=2
[2] http://marc.info/?l=linux-netdev&m=130893020922848&w=2



--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] 8139cp: convert to new VLAN model.
From: Francois Romieu @ 2011-07-15 10:21 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: Stephen Hemminger, netdev

The registers and descriptors bits are identical to the pre-8168
8169 chipsets : {RxDesc / TxDesc}.opts2 can only contain VLAN information.

Signed-off-by: Francois Romieu <romieu@fr.zoreil.com>
---
 drivers/net/8139cp.c |   83 ++++++++++++++++---------------------------------
 1 files changed, 27 insertions(+), 56 deletions(-)

diff --git a/drivers/net/8139cp.c b/drivers/net/8139cp.c
index 73b10b0..cc4c210 100644
--- a/drivers/net/8139cp.c
+++ b/drivers/net/8139cp.c
@@ -78,17 +78,6 @@
 #include <asm/irq.h>
 #include <asm/uaccess.h>
 
-/* VLAN tagging feature enable/disable */
-#if defined(CONFIG_VLAN_8021Q) || defined(CONFIG_VLAN_8021Q_MODULE)
-#define CP_VLAN_TAG_USED 1
-#define CP_VLAN_TX_TAG(tx_desc,vlan_tag_value) \
-	do { (tx_desc)->opts2 = cpu_to_le32(vlan_tag_value); } while (0)
-#else
-#define CP_VLAN_TAG_USED 0
-#define CP_VLAN_TX_TAG(tx_desc,vlan_tag_value) \
-	do { (tx_desc)->opts2 = 0; } while (0)
-#endif
-
 /* These identify the driver base version and may not be removed. */
 static char version[] =
 DRV_NAME ": 10/100 PCI Ethernet driver v" DRV_VERSION " (" DRV_RELDATE ")\n";
@@ -356,9 +345,6 @@ struct cp_private {
 	unsigned		rx_buf_sz;
 	unsigned		wol_enabled : 1; /* Is Wake-on-LAN enabled? */
 
-#if CP_VLAN_TAG_USED
-	struct vlan_group	*vlgrp;
-#endif
 	dma_addr_t		ring_dma;
 
 	struct mii_if_info	mii_if;
@@ -423,24 +409,6 @@ static struct {
 };
 
 
-#if CP_VLAN_TAG_USED
-static void cp_vlan_rx_register(struct net_device *dev, struct vlan_group *grp)
-{
-	struct cp_private *cp = netdev_priv(dev);
-	unsigned long flags;
-
-	spin_lock_irqsave(&cp->lock, flags);
-	cp->vlgrp = grp;
-	if (grp)
-		cp->cpcmd |= RxVlanOn;
-	else
-		cp->cpcmd &= ~RxVlanOn;
-
-	cpw16(CpCmd, cp->cpcmd);
-	spin_unlock_irqrestore(&cp->lock, flags);
-}
-#endif /* CP_VLAN_TAG_USED */
-
 static inline void cp_set_rxbufsize (struct cp_private *cp)
 {
 	unsigned int mtu = cp->dev->mtu;
@@ -455,18 +423,17 @@ static inline void cp_set_rxbufsize (struct cp_private *cp)
 static inline void cp_rx_skb (struct cp_private *cp, struct sk_buff *skb,
 			      struct cp_desc *desc)
 {
+	u32 opts2 = le32_to_cpu(desc->opts2);
+
 	skb->protocol = eth_type_trans (skb, cp->dev);
 
 	cp->dev->stats.rx_packets++;
 	cp->dev->stats.rx_bytes += skb->len;
 
-#if CP_VLAN_TAG_USED
-	if (cp->vlgrp && (desc->opts2 & cpu_to_le32(RxVlanTagged))) {
-		vlan_hwaccel_receive_skb(skb, cp->vlgrp,
-					 swab16(le32_to_cpu(desc->opts2) & 0xffff));
-	} else
-#endif
-		netif_receive_skb(skb);
+	if (opts2 & RxVlanTagged)
+		__vlan_hwaccel_put_tag(skb, swab16(opts2 & 0xffff));
+
+	napi_gro_receive(&cp->napi, skb);
 }
 
 static void cp_rx_err_acct (struct cp_private *cp, unsigned rx_tail,
@@ -730,6 +697,12 @@ static void cp_tx (struct cp_private *cp)
 		netif_wake_queue(cp->dev);
 }
 
+static inline u32 cp_tx_vlan_tag(struct sk_buff *skb)
+{
+	return vlan_tx_tag_present(skb) ?
+		TxVlanTag | swab16(vlan_tx_tag_get(skb)) : 0x00;
+}
+
 static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 					struct net_device *dev)
 {
@@ -737,9 +710,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 	unsigned entry;
 	u32 eor, flags;
 	unsigned long intr_flags;
-#if CP_VLAN_TAG_USED
-	u32 vlan_tag = 0;
-#endif
+	__le32 opts2;
 	int mss = 0;
 
 	spin_lock_irqsave(&cp->lock, intr_flags);
@@ -752,15 +723,12 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 		return NETDEV_TX_BUSY;
 	}
 
-#if CP_VLAN_TAG_USED
-	if (vlan_tx_tag_present(skb))
-		vlan_tag = TxVlanTag | swab16(vlan_tx_tag_get(skb));
-#endif
-
 	entry = cp->tx_head;
 	eor = (entry == (CP_TX_RING_SIZE - 1)) ? RingEnd : 0;
 	mss = skb_shinfo(skb)->gso_size;
 
+	opts2 = cpu_to_le32(cp_tx_vlan_tag(skb));
+
 	if (skb_shinfo(skb)->nr_frags == 0) {
 		struct cp_desc *txd = &cp->tx_ring[entry];
 		u32 len;
@@ -768,7 +736,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 
 		len = skb->len;
 		mapping = dma_map_single(&cp->pdev->dev, skb->data, len, PCI_DMA_TODEVICE);
-		CP_VLAN_TX_TAG(txd, vlan_tag);
+		txd->opts2 = opts2;
 		txd->addr = cpu_to_le64(mapping);
 		wmb();
 
@@ -839,7 +807,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 				ctrl |= LastFrag;
 
 			txd = &cp->tx_ring[entry];
-			CP_VLAN_TX_TAG(txd, vlan_tag);
+			txd->opts2 = opts2;
 			txd->addr = cpu_to_le64(mapping);
 			wmb();
 
@@ -851,7 +819,7 @@ static netdev_tx_t cp_start_xmit (struct sk_buff *skb,
 		}
 
 		txd = &cp->tx_ring[first_entry];
-		CP_VLAN_TX_TAG(txd, vlan_tag);
+		txd->opts2 = opts2;
 		txd->addr = cpu_to_le64(first_mapping);
 		wmb();
 
@@ -1431,6 +1399,11 @@ static int cp_set_features(struct net_device *dev, u32 features)
 	else
 		cp->cpcmd &= ~RxChkSum;
 
+	if (features & NETIF_F_HW_VLAN_RX)
+		cp->cpcmd |= RxVlanOn;
+	else
+		cp->cpcmd &= ~RxVlanOn;
+
 	cpw16_f(CpCmd, cp->cpcmd);
 	spin_unlock_irqrestore(&cp->lock, flags);
 
@@ -1818,9 +1791,6 @@ static const struct net_device_ops cp_netdev_ops = {
 	.ndo_start_xmit		= cp_start_xmit,
 	.ndo_tx_timeout		= cp_tx_timeout,
 	.ndo_set_features	= cp_set_features,
-#if CP_VLAN_TAG_USED
-	.ndo_vlan_rx_register	= cp_vlan_rx_register,
-#endif
 #ifdef BROKEN
 	.ndo_change_mtu		= cp_change_mtu,
 #endif
@@ -1949,15 +1919,16 @@ static int cp_init_one (struct pci_dev *pdev, const struct pci_device_id *ent)
 	dev->ethtool_ops = &cp_ethtool_ops;
 	dev->watchdog_timeo = TX_TIMEOUT;
 
-#if CP_VLAN_TAG_USED
 	dev->features |= NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
-#endif
 
 	if (pci_using_dac)
 		dev->features |= NETIF_F_HIGHDMA;
 
 	/* disabled by default until verified */
-	dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
+	dev->hw_features |= NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
+		NETIF_F_HW_VLAN_TX | NETIF_F_HW_VLAN_RX;
+	dev->vlan_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO |
+		NETIF_F_HIGHDMA;
 
 	dev->irq = pdev->irq;
 
-- 
1.7.4.4


^ permalink raw reply related

* Re: [RFC 46/72] ixp2000: Move the Radisys driver
From: Lennert Buytenhek @ 2011-07-15 10:32 UTC (permalink / raw)
  To: Jeff Kirsher; +Cc: davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <1310692758.2628.30.camel@jtkirshe-mobl>

On Thu, Jul 14, 2011 at 06:19:17PM -0700, Jeff Kirsher wrote:

> > > > > Move the Radisys driver into drivers/net/ethernet/radisys/ and
> > > > > make the necessary Kconfig and Makefile changes
> > > > > 
> > > > > CC: Lennert Buytenhek <kernel@wantstofly.org>
> > > > > Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
> > > > > ---
> > > > >  MAINTAINERS                                        |    2 +-
> > > > >  drivers/net/Kconfig                                |    2 --
> > > > >  drivers/net/Makefile                               |    1 -
> > > > >  drivers/net/ethernet/Kconfig                       |    1 +
> > > > >  drivers/net/ethernet/Makefile                      |    1 +
> > > > >  drivers/net/ethernet/radisys/Kconfig               |   16 ++++++++++++++++
> > > > >  drivers/net/ethernet/radisys/Makefile              |    5 +++++
> > > > >  drivers/net/{ => ethernet/radisys}/ixp2000/Kconfig |    2 +-
> > > > >  .../net/{ => ethernet/radisys}/ixp2000/Makefile    |    0
> > > > 
> > > > The ixp2000 is a series of Intel ARM SoCs, and the ENP2611 is a Radisys
> > > > PCI board based on the ixp2000 series (ixp2400), so it doesn't make
> > > > sense to put everything in the radisys/ directory.
> > > 
> > > It was suggested earlier that it probably should be in either ixp/ or
> > > radisys/, and someone suggested that ixp/ might be confusing.  If you
> > > would like to keep it under drivers/net/ethernet/ixp2000, I am fine with
> > > that as well.
> > > 
> > > > If you insist on moving all drivers into vendor directories (I don't
> > > > like that idea at all -- are we going to rename directories and shuffle
> > > > stuff around every time vendor A buys vendor B or takes over one of
> > > > vendor B's products?), at least the core ixp2000 code should be under
> > > > intel/.
> > > 
> > > It was not my preference to place drivers into vendor specific
> > > directories because of that very reason, as I did the work, it just
> > > tended to work out that way.
> > > 
> > > The reason (as stated before) that it was not moved into intel/
> > > directory is because the driver was not written (and supported) by
> > > Intel.
> > 
> > I didn't follow the discussion, but that seems like a silly argument.
> > Does that mean that any non-vendor supported driver can't go into
> > drivers/net/$vendor/?  What if the vendor starts contributing to the
> > driver later on, will the driver be moved then?
> > 
> > It seems that what you're saying is that you don't want the ixp2000
> > driver in "your" subdirectory.
> 
> It is more that all the drivers that in intel/ are all very similar and
> have "common" code which potentially code be shared.  We have been
> looking at what possibilities there are for the common code between the
> Intel supported drivers.  So it really did not make sense to throw a
> driver into the intel/ directory just because it used an Intel chip, but
> has nothing in common with the other drivers in that directory.  Sorry
> if my previous explanation sounded like it I just did not want the
> drivers in the intel/ because we did not write or support it.
> 
> After looking at ixp2000/ixp4xx drivers and the 82586/82596 drivers a
> bit more, I came up with the solution of putting the IXP drivers in
> xscale/ and the 82586/82596 drivers in i825xx/ directory.

That sounds fine with me.

^ permalink raw reply

* [PATCH] Fix panic in virtnet_remove
From: Krishna Kumar @ 2011-07-15  9:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, shemminger, Krishna Kumar

modprobe -r virtio_net panics in free_netdev() as the
dev is already freed in the newly introduced virtnet_free
(commit 3fa2a1df9094). Since virtnet_remove doesn't require
dev after unregister, I am removing the free_netdev call
in virtnet_remove instead of in virtnet_free (which seems
to be the right place to free the dev). Confirmed that
the panic is fixed with this patch.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/virtio_net.c |    2 --
 1 file changed, 2 deletions(-)

diff -ruNp org/drivers/net/virtio_net.c new/drivers/net/virtio_net.c
--- org/drivers/net/virtio_net.c	2011-07-04 10:38:33.000000000 +0530
+++ new/drivers/net/virtio_net.c	2011-07-15 14:27:48.000000000 +0530
@@ -1121,8 +1121,6 @@ static void __devexit virtnet_remove(str
 
 	while (vi->pages)
 		__free_pages(get_a_page(vi, GFP_KERNEL), 0);
-
-	free_netdev(vi->dev);
 }
 
 static struct virtio_device_id id_table[] = {

^ permalink raw reply

* [PATCH] [RFC] Batch statistics update in free_old_xmit_skbs
From: Krishna Kumar @ 2011-07-15  9:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, shemminger, Krishna Kumar

Improve performance for update of stats counters for 32-bit
guests. The following table shows the average number of skbs
that were processed in free_old_xmit_skbs under various
cases:

-----------------------------------------------------------
#Procs      #packets (512 I/O)      #packets (16K I/O)
-----------------------------------------------------------
1           2.81                    1.23
4           4.52                    18.63
16          4.23                    17.58
32          9.81                    18.07
64          15.86                   17.69
96          21.30                   16.72
-----------------------------------------------------------

Batching u64_stats_update_begin/ends seems to be useful
for 32-bit guests, as free_old_xmit_skbs is called at
every xmit.

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 drivers/net/virtio_net.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff -ruNp org/drivers/net/virtio_net.c new/drivers/net/virtio_net.c
--- org/drivers/net/virtio_net.c	2011-07-04 10:38:33.000000000 +0530
+++ new/drivers/net/virtio_net.c	2011-07-15 12:27:41.000000000 +0530
@@ -531,19 +531,27 @@ static unsigned int free_old_xmit_skbs(s
 {
 	struct sk_buff *skb;
 	unsigned int len, tot_sgs = 0;
-	struct virtnet_stats __percpu *stats = this_cpu_ptr(vi->stats);
+	u64 tx_bytes = 0, tx_packets = 0;
 
 	while ((skb = virtqueue_get_buf(vi->svq, &len)) != NULL) {
 		pr_debug("Sent skb %p\n", skb);
 
-		u64_stats_update_begin(&stats->syncp);
-		stats->tx_bytes += skb->len;
-		stats->tx_packets++;
-		u64_stats_update_end(&stats->syncp);
+		tx_bytes += skb->len;
+		tx_packets++;
 
 		tot_sgs += skb_vnet_hdr(skb)->num_sg;
 		dev_kfree_skb_any(skb);
 	}
+
+	if (tx_packets) {
+		struct virtnet_stats __percpu *stats = this_cpu_ptr(vi->stats);
+
+		u64_stats_update_begin(&stats->syncp);
+		stats->tx_bytes += tx_bytes;
+		stats->tx_packets += tx_packets;
+		u64_stats_update_end(&stats->syncp);
+	}
+
 	return tot_sgs;
 }
 

^ permalink raw reply

* [PATCH] Remove redundant variable/code in __qdisc_run
From: Krishna Kumar @ 2011-07-15  9:16 UTC (permalink / raw)
  To: davem; +Cc: netdev, Krishna Kumar

Remove redundant variable "work".

Signed-off-by: Krishna Kumar <krkumar2@in.ibm.com>
---
 net/sched/sch_generic.c |    4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff -ruNp org/net/sched/sch_generic.c new/net/sched/sch_generic.c
--- org/net/sched/sch_generic.c	2011-06-27 16:16:07.000000000 +0530
+++ new/net/sched/sch_generic.c	2011-07-13 10:58:39.000000000 +0530
@@ -190,16 +190,14 @@ static inline int qdisc_restart(struct Q
 void __qdisc_run(struct Qdisc *q)
 {
 	int quota = weight_p;
-	int work = 0;
 
 	while (qdisc_restart(q)) {
-		work++;
 		/*
 		 * Ordered by possible occurrence: Postpone processing if
 		 * 1. we've exceeded packet quota
 		 * 2. another process needs the CPU;
 		 */
-		if (work >= quota || need_resched()) {
+		if (--quota <= 0 || need_resched()) {
 			__netif_schedule(q);
 			break;
 		}

^ permalink raw reply

* Re: gianfar vlan
From: Jiri Pirko @ 2011-07-15  9:16 UTC (permalink / raw)
  To: Sebastian Pöhn; +Cc: sandeep.kumar, netdev, Sebastian Pöhn
In-Reply-To: <1310716743.1789.9.camel@DENEC1DT0191>

Fri, Jul 15, 2011 at 09:59:03AM CEST, sebastian.belden@googlemail.com wrote:
>Hi Jirka!
>
>The rx queue filer table also needs the RCTRL_PRSDEP_INIT to be set. So I would suggest to add FILREN to RCTRL_REQ_PARSER.
>
>Further I guess the parser should only be enabled if it is necessary (because of VLAN, IP, TCP or FILREN acceleration modes).
>
>What about putting the
>
>if (tempval & RCTRL_REQ_PARSER)
>	tempval |= RCTRL_PRSDEP_INIT;
>else
>	tempval &= ~RCTRL_PRSDEP_INIT;
>
>in a dedicated routine and call it every time one of the four features is altered?

Ok why not. Would you please cook up the patch and post it? Thanks.

Jirka

>
>Greetings
>Sebastian
>
>> Hi Sandeep.
>> 
>> I'm trying to remove gfar_vlan_rx_register() and move the vlan accel
>> enabling/disabling to gfar_set_features() -> startup_gfar() ->
>> gfar_init_mac(). I'm not sure how to handle following lines in
>> gfar_vlan_rx_register():
>> 
>> /* If parse is no longer required, then disable parser */
>> if (tempval & RCTRL_REQ_PARSER)
>> tempval |= RCTRL_PRSDEP_INIT;
>> else
>> tempval &= ~RCTRL_PRSDEP_INIT;
>> 
>> Should this be done only on disabling vlan accel (both rx an tx)?
>> Or can I just do that always in startup_gfar() ?
>> 
>> Thanks.
>> 
>> Jirka
>

^ permalink raw reply

* Re: iwlagn: Random "Time out reading EEPROM".
From: Nicolas de Pesloüan @ 2011-07-15  8:22 UTC (permalink / raw)
  To: wwguy, dhalperi; +Cc: netdev@vger.kernel.org, wireless
In-Reply-To: <1307284276.6918.17.camel@wwguy-ubuntu>

Le 05/06/2011 16:31, wwguy a écrit :
> On Sat, 2011-06-04 at 06:14 -0700, Nicolas de Pesloüan wrote:
>> Hi,
>>
>>   From time to time, my Intel Wifi adapter fail on cold boot, with the following in dmesg:
>>
>> [    7.442634] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
>> [    7.442636] iwlagn: Copyright(c) 2003-2009 Intel Corporation
>> [    7.442701] iwlagn 0000:05:00.0: PCI INT A ->  GSI 18 (level, low) ->  IRQ 18
>> [    7.442708] iwlagn 0000:05:00.0: setting latency timer to 64
>> [    7.442788] iwlagn 0000:05:00.0: Detected Intel Wireless WiFi Link 5100AGN REV=0x54
>> [    7.455547] iwlagn 0000:05:00.0: Time out reading EEPROM[0]
>> [    7.455611] iwlagn 0000:05:00.0: Unable to init EEPROM
>> [    7.455678] iwlagn 0000:05:00.0: PCI INT A disabled
>> [    7.455685] iwlagn: probe of 0000:05:00.0 failed with error -110
>>
>> After "modprobe -r iwlagn ; modprobe iwlagn" or after a reboot, everything work fine:
>>
>> [  346.332166] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
>> [  346.332169] iwlagn: Copyright(c) 2003-2009 Intel Corporation
>> [  346.332225] iwlagn 0000:05:00.0: PCI INT A ->  GSI 18 (level, low) ->  IRQ 18
>> [  346.332234] iwlagn 0000:05:00.0: setting latency timer to 64
>> [  346.332277] iwlagn 0000:05:00.0: Detected Intel Wireless WiFi Link 5100AGN REV=0x54
>> [  346.365739] iwlagn 0000:05:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
>> [  346.365787]   alloc irq_desc for 33 on node -1
>> [  346.365788]   alloc kstat_irqs on node -1
>> [  346.365804] iwlagn 0000:05:00.0: irq 33 for MSI/MSI-X
>> [  346.449989] phy0: Selected rate control algorithm 'iwl-agn-rs'
>>
>> This happens using 2.6.32-5 from Debian. I didn't had the opportunity to reproduce it using recent
>> kernel, but it occurs too rarely to conclude anything.
>>
>> I had a look at the git log, without finding something that seems related to this problem. Does this
>> ring a bell for someone?
>>
>> 	Nicolas.
>>
>> 05:00.0 Network controller: Intel Corporation WiFi Link 5100
>>           Subsystem: Intel Corporation WiFi Link 5100 AGN
>>           Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR-
>> FastB2B- DisINTx-
>>           Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast>TAbort-<TAbort-<MAbort->SERR-
>> <PERR- INTx-
>>           Latency: 0, Cache Line Size: 64 bytes
>>           Interrupt: pin A routed to IRQ 33
>>           Region 0: Memory at d6d00000 (64-bit, non-prefetchable) [size=8K]
>>           Capabilities: [c8] Power Management version 3
>>                   Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
>>                   Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>>           Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>>                   Address: 00000000fee0300c  Data: 41c9
>>           Capabilities: [e0] Express (v1) Endpoint, MSI 00
>>                   DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s<512ns, L1 unlimited
>>                           ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
>>                   DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
>>                           RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
>>                           MaxPayload 128 bytes, MaxReadReq 128 bytes
>>                   DevSta: CorrErr+ UncorrErr- FatalErr- UnsuppReq+ AuxPwr+ TransPend-
>>                   LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0<128ns, L1<32us
>>                           ClockPM+ Surprise- LLActRep- BwNot-
>>                   LnkCtl: ASPM L1 Enabled; RCB 64 bytes Disabled- Retrain- CommClk+
>>                           ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>>                   LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
>>           Capabilities: [100 v1] Advanced Error Reporting
>>                   UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC-
>> UnsupReq- ACSViol-
>>                   UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC-
>> UnsupReq- ACSViol-
>>                   UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC-
>> UnsupReq- ACSViol-
>>                   CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>                   CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
>>                   AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
>>           Capabilities: [140 v1] Device Serial Number 00-22-fb-ff-ff-28-46-c8
>>           Kernel driver in use: iwlagn
>>
>
> no sure, you are the ONLY one report this issue, since the device works
> after you reload the module, the HW/SW is ok. maybe the timing/sequence
> have problem on the kernel you are using on your system. The bus is not
> ready when you boot the system and access the EEPROM.
>
> What model of the laptop you have? could you try to upgrade to newer
> kernel?

This problem also happens using 2.6.39-2-amd64 from Debian.

[    8.507426] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:
[    8.507429] iwlagn: Copyright(c) 2003-2010 Intel Corporation
[    8.507521] iwlagn 0000:05:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[    8.507529] iwlagn 0000:05:00.0: setting latency timer to 64
[    8.507556] iwlagn 0000:05:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[    8.516687] iwlagn 0000:05:00.0: Time out reading EEPROM[2]
[    8.516724] iwlagn 0000:05:00.0: Unable to init EEPROM
[    8.516739] iwlagn 0000:05:00.0: PCI INT A disabled
[    8.516745] iwlagn: probe of 0000:05:00.0 failed with error -110

modprobe -r iwlagn ; modprobe iwlagn

[  208.948664] iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, in-tree:
[  208.948666] iwlagn: Copyright(c) 2003-2010 Intel Corporation
[  208.948743] iwlagn 0000:05:00.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[  208.948753] iwlagn 0000:05:00.0: setting latency timer to 64
[  208.948817] iwlagn 0000:05:00.0: Detected Intel(R) WiFi Link 5100 AGN, REV=0x54
[  208.969659] iwlagn 0000:05:00.0: device EEPROM VER=0x11f, CALIB=0x4
[  208.969661] iwlagn 0000:05:00.0: Device SKU: 0Xb
[  208.969701] iwlagn 0000:05:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
[  208.969766] iwlagn 0000:05:00.0: irq 50 for MSI/MSI-X
[  209.048366] iwlagn 0000:05:00.0: loaded firmware version 8.83.5.1 build 33692

Because it happens not very often, I still didn't had the opportunity to reproduce it with up to 
date kernel.

The laptop is a two years old Sony Vaio VGN-AW21M.

	Nicolas.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox