Netdev List
 help / color / mirror / Atom feed
* Re: issues with vxlan RX checksum offload under OVS datapath
From: Or Gerlitz @ 2014-01-21 20:55 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Or Gerlitz, Joseph Gasparakis, netdev
In-Reply-To: <CALnjE+rUTejbU_d39G7WSn=k0aY2xELCE3UxjDr_Zdq3iwHbNA@mail.gmail.com>

On Tue, Jan 21, 2014 at 7:30 PM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:

>> While testing the gro udp patches over a setup with openvswitch I noted that
>> the RX checksum offload support introduced by Joseph's commit 0afb166
>> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
>> when you use a setup made of
>> NIC --> IP stack --> vxlan device --> bridge --> tap
>> but not when its
>> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
>> remains true also after the decap is done. Basically, this is the original hunk
[...]
>>> +       skb->encapsulation = 0;
[...]

>> Moving this to shared code (while removing the check for
>> vxlan->dev->features) made things to work on my setup, but this misses one
>> of the original conditions, ideas?

> I kept csum check in vxlan-device recv path for same reason. As of now
> there is no efficient way to get ovs-dev features.
> May be we can cache device features in struct datapath from datapath-netdev.

To be a bit more precise/concrete here, do we agree that the both paths must do

   skb->encapsulation = 0;

which is done now only by the non-ovs path

Or.

^ permalink raw reply

* Re: [PATCH] ixgbevf: delete unneeded call to pci_set_power_state
From: Brown, Aaron F @ 2014-01-21 20:47 UTC (permalink / raw)
  To: Julia.Lawall@lip6.fr
  Cc: Kirsher, Jeffrey T, Brandeburg, Jesse, Allan, Bruce W,
	Wyborny, Carolyn, Skidmore, Donald C, Rose, Gregory V,
	Duyck, Alexander H, Ronciak, John,
	kernel-janitors@vger.kernel.org,
	e1000-devel@lists.sourceforge.net, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <1389114026-27854-1-git-send-email-Julia.Lawall@lip6.fr>

On Tue, 2014-01-07 at 18:00 +0100, Julia Lawall wrote:
> From: Julia Lawall <Julia.Lawall@lip6.fr>
> 
> This driver does not need to adjust the power state on suspend, so the
> call to pci_set_power_state in the resume function is a no-op.  Drop it, to
> make the code more understandable.
> 
> Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Tested-by: Phil Schmitt <phillip.j.schmitt@intel.com>
Signed-off-by: Aaron Brown <aaron.f.brown@intel.com>

> ---
>  drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |    1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> index a5d3167..5709fb0 100644
> --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c
> @@ -3226,7 +3226,6 @@ static int ixgbevf_resume(struct pci_dev *pdev)
>  	struct net_device *netdev = adapter->netdev;
>  	u32 err;
>  
> -	pci_set_power_state(pdev, PCI_D0);
>  	pci_restore_state(pdev);
>  	/*
>  	 * pci_restore_state clears dev->state_saved so call
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply

* Kernel fails to drop packet with martian source address (subnet broadcast)
From: gregory hoggarth @ 2014-01-21 20:45 UTC (permalink / raw)
  To: netdev; +Cc: davem

Hi,

This is my second mail to the list about this issue, since I didn't get much of
a response for my first one.

Earlier in the year we upgraded the linux kernel on our L3 switch firmware to 
v3.6.2 which includes the "ipv4: Elidge fib_validate_source() completely when 
possible" patch. Our testers have picked up a change in behaviour which I have 
traced back to this patch and I'd like to get comments on whether this is a bug
or not. Here's the patch: 
https://github.com/torvalds/linux/commit/7a9bc9b81a5bc6e44ebc80ef781332e4385083f2

We found that in a configuration with a device under test (DUT) that has an IP 
address 192.168.0.1 / 24, when it is sent an ICMP echo request with a source IP 
address of 192.168.0.255 will now be accepted by the DUT, which will send an 
ICMP echo reply back to the 192.168.0.255 address with a broadcast MAC address 
of FFFF.FFFF.FFFF, meaning any other devices in the L2 domain will pick up and 
process the ICMP reply.

Prior to the upgrade, our device was running on linux kernel 2.6.38.2, and in 
that version the DUT would reject the packet with a martian source address 
error:
11:20:54 awplus kernel: RxPkt(00001) Q:1 dev:8 lport:00000a00 cpuC:154 encap:0 
len:98 bufs:1
11:20:54 awplus kernel: RxPkt(00001) vid:1 port2.0.23
11:20:54 awplus kernel: <7>   0000cd29 980d0000 cd27c1eb 08004500 
11:20:54 awplus kernel: <7>   00540000 40004001 b858c0a8 00ffc0a8 
11:20:54 awplus kernel: <7>   00010800 a4cf588c 000552dd c9990000 
11:20:54 awplus kernel: <7>   f3240809 0a0b0c0d 0e0f1011 12131415 
11:20:54 awplus kernel: <7>   (...)
11:20:54 awplus kernel: martian source 192.168.0.1 from 192.168.0.255, on dev 
vlan1
11:20:54 awplus kernel: ll header: 
00:00:cd:29:98:0d:00:00:cd:27:c1:eb:08:00:45:00


Initially we were thinking that a device configured with 192.168.0.255 / 23 
sending a ping to 192.168.0.1 / 24 should expect to get a reply, even though 
this is somewhat of a misconfiguration in that the subnets aren't matching and 
that some things may not work properly. For example ping from a device with IP 
address 192.168.0.254 would behave correctly, so it seemed correct that a ping 
from 192.168.0.255 should also work properly. However it appears that some 
other part of the kernel is detecting that the echo reply packet sent by the 
DUT has a destination IP address of the subnet broadcast, so instead of sending 
it with a unicast MAC address it sends it with the broadcast MAC address, which 
means all other devices on the subnet would process the ICMP echo reply. This 
therefore makes it seem like the ICMP echo request should have been dropped in 
the first place.

The patch added the "if (!r && !fib_num_tclassid_users)" check to 
fib_validate_source, which for our devices evaluates to true, hence returning 
early and the source address check inside __fib_validate_source is not carried 
out.


Does anyone have comments as to what the correct behaviour should be in this 
case, and whether this is an unexpected side-effect from the patch or not?

Thanks,
Greg

^ permalink raw reply

* [PATCH v4] xen/grant-table: Avoid m2p_override during mapping
From: Zoltan Kiss @ 2014-01-21 20:22 UTC (permalink / raw)
  To: ian.campbell, wei.liu2, xen-devel, netdev, linux-kernel,
	jonathan.davies
  Cc: Zoltan Kiss

The grant mapping API does m2p_override unnecessarily: only gntdev needs it,
for blkback and future netback patches it just cause a lock contention, as
those pages never go to userspace. Therefore this series does the following:
- the original functions were renamed to __gnttab_[un]map_refs, with a new
  parameter m2p_override
- based on m2p_override either they follow the original behaviour, or just set
  the private flag and call set_phys_to_machine
- gnttab_[un]map_refs are now a wrapper to call __gnttab_[un]map_refs with
  m2p_override false
- a new function gnttab_[un]map_refs_userspace provides the old behaviour

v2:
- move the storing of the old mfn in page->index to gnttab_map_refs
- move the function header update to a separate patch

v3:
- a new approach to retain old behaviour where it needed
- squash the patches into one

v4:
- move out the common bits from m2p* functions, and pass pfn/mfn as parameter
- clear page->private before doing anything with the page, so m2p_find_override
  won't race with this

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
Suggested-by: David Vrabel <david.vrabel@citrix.com>
---
 arch/x86/include/asm/xen/page.h     |   12 +++--
 arch/x86/xen/p2m.c                  |   25 ++--------
 drivers/block/xen-blkback/blkback.c |   15 +++---
 drivers/xen/gntdev.c                |   13 +++--
 drivers/xen/grant-table.c           |   90 +++++++++++++++++++++++++++++------
 include/xen/grant_table.h           |    8 +++-
 6 files changed, 107 insertions(+), 56 deletions(-)

diff --git a/arch/x86/include/asm/xen/page.h b/arch/x86/include/asm/xen/page.h
index b913915..68a1438 100644
--- a/arch/x86/include/asm/xen/page.h
+++ b/arch/x86/include/asm/xen/page.h
@@ -49,10 +49,14 @@ extern bool __set_phys_to_machine(unsigned long pfn, unsigned long mfn);
 extern unsigned long set_phys_range_identity(unsigned long pfn_s,
 					     unsigned long pfn_e);
 
-extern int m2p_add_override(unsigned long mfn, struct page *page,
-			    struct gnttab_map_grant_ref *kmap_op);
+extern int m2p_add_override(unsigned long mfn,
+			    struct page *page,
+			    struct gnttab_map_grant_ref *kmap_op,
+			    unsigned long pfn);
 extern int m2p_remove_override(struct page *page,
-				struct gnttab_map_grant_ref *kmap_op);
+			       struct gnttab_map_grant_ref *kmap_op,
+			       unsigned long pfn,
+			       unsigned long mfn);
 extern struct page *m2p_find_override(unsigned long mfn);
 extern unsigned long m2p_find_override_pfn(unsigned long mfn, unsigned long pfn);
 
@@ -121,7 +125,7 @@ static inline unsigned long mfn_to_pfn(unsigned long mfn)
 		pfn = m2p_find_override_pfn(mfn, ~0);
 	}
 
-	/* 
+	/*
 	 * pfn is ~0 if there are no entries in the m2p for mfn or if the
 	 * entry doesn't map back to the mfn and m2p_override doesn't have a
 	 * valid entry for it.
diff --git a/arch/x86/xen/p2m.c b/arch/x86/xen/p2m.c
index 2ae8699..0060178 100644
--- a/arch/x86/xen/p2m.c
+++ b/arch/x86/xen/p2m.c
@@ -872,15 +872,13 @@ static unsigned long mfn_hash(unsigned long mfn)
 
 /* Add an MFN override for a particular page */
 int m2p_add_override(unsigned long mfn, struct page *page,
-		struct gnttab_map_grant_ref *kmap_op)
+		struct gnttab_map_grant_ref *kmap_op, unsigned long pfn)
 {
 	unsigned long flags;
-	unsigned long pfn;
 	unsigned long uninitialized_var(address);
 	unsigned level;
 	pte_t *ptep = NULL;
 
-	pfn = page_to_pfn(page);
 	if (!PageHighMem(page)) {
 		address = (unsigned long)__va(pfn << PAGE_SHIFT);
 		ptep = lookup_address(address, &level);
@@ -888,13 +886,6 @@ int m2p_add_override(unsigned long mfn, struct page *page,
 					"m2p_add_override: pfn %lx not mapped", pfn))
 			return -EINVAL;
 	}
-	WARN_ON(PagePrivate(page));
-	SetPagePrivate(page);
-	set_page_private(page, mfn);
-	page->index = pfn_to_mfn(pfn);
-
-	if (unlikely(!set_phys_to_machine(pfn, FOREIGN_FRAME(mfn))))
-		return -ENOMEM;
 
 	if (kmap_op != NULL) {
 		if (!PageHighMem(page)) {
@@ -933,20 +924,15 @@ int m2p_add_override(unsigned long mfn, struct page *page,
 }
 EXPORT_SYMBOL_GPL(m2p_add_override);
 int m2p_remove_override(struct page *page,
-		struct gnttab_map_grant_ref *kmap_op)
+			struct gnttab_map_grant_ref *kmap_op,
+			unsigned long pfn,
+			unsigned long mfn)
 {
 	unsigned long flags;
-	unsigned long mfn;
-	unsigned long pfn;
 	unsigned long uninitialized_var(address);
 	unsigned level;
 	pte_t *ptep = NULL;
 
-	pfn = page_to_pfn(page);
-	mfn = get_phys_to_machine(pfn);
-	if (mfn == INVALID_P2M_ENTRY || !(mfn & FOREIGN_FRAME_BIT))
-		return -EINVAL;
-
 	if (!PageHighMem(page)) {
 		address = (unsigned long)__va(pfn << PAGE_SHIFT);
 		ptep = lookup_address(address, &level);
@@ -959,10 +945,7 @@ int m2p_remove_override(struct page *page,
 	spin_lock_irqsave(&m2p_override_lock, flags);
 	list_del(&page->lru);
 	spin_unlock_irqrestore(&m2p_override_lock, flags);
-	WARN_ON(!PagePrivate(page));
-	ClearPagePrivate(page);
 
-	set_phys_to_machine(pfn, page->index);
 	if (kmap_op != NULL) {
 		if (!PageHighMem(page)) {
 			struct multicall_space mcs;
diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c
index 6620b73..875025f 100644
--- a/drivers/block/xen-blkback/blkback.c
+++ b/drivers/block/xen-blkback/blkback.c
@@ -285,8 +285,7 @@ static void free_persistent_gnts(struct xen_blkif *blkif, struct rb_root *root,
 
 		if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST ||
 			!rb_next(&persistent_gnt->node)) {
-			ret = gnttab_unmap_refs(unmap, NULL, pages,
-				segs_to_unmap);
+			ret = gnttab_unmap_refs(unmap, pages, segs_to_unmap);
 			BUG_ON(ret);
 			put_free_pages(blkif, pages, segs_to_unmap);
 			segs_to_unmap = 0;
@@ -321,8 +320,7 @@ static void unmap_purged_grants(struct work_struct *work)
 		pages[segs_to_unmap] = persistent_gnt->page;
 
 		if (++segs_to_unmap == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
-			ret = gnttab_unmap_refs(unmap, NULL, pages,
-				segs_to_unmap);
+			ret = gnttab_unmap_refs(unmap, pages, segs_to_unmap);
 			BUG_ON(ret);
 			put_free_pages(blkif, pages, segs_to_unmap);
 			segs_to_unmap = 0;
@@ -330,7 +328,7 @@ static void unmap_purged_grants(struct work_struct *work)
 		kfree(persistent_gnt);
 	}
 	if (segs_to_unmap > 0) {
-		ret = gnttab_unmap_refs(unmap, NULL, pages, segs_to_unmap);
+		ret = gnttab_unmap_refs(unmap, pages, segs_to_unmap);
 		BUG_ON(ret);
 		put_free_pages(blkif, pages, segs_to_unmap);
 	}
@@ -670,15 +668,14 @@ static void xen_blkbk_unmap(struct xen_blkif *blkif,
 				    GNTMAP_host_map, pages[i]->handle);
 		pages[i]->handle = BLKBACK_INVALID_HANDLE;
 		if (++invcount == BLKIF_MAX_SEGMENTS_PER_REQUEST) {
-			ret = gnttab_unmap_refs(unmap, NULL, unmap_pages,
-			                        invcount);
+			ret = gnttab_unmap_refs(unmap, unmap_pages, invcount);
 			BUG_ON(ret);
 			put_free_pages(blkif, unmap_pages, invcount);
 			invcount = 0;
 		}
 	}
 	if (invcount) {
-		ret = gnttab_unmap_refs(unmap, NULL, unmap_pages, invcount);
+		ret = gnttab_unmap_refs(unmap, unmap_pages, invcount);
 		BUG_ON(ret);
 		put_free_pages(blkif, unmap_pages, invcount);
 	}
@@ -740,7 +737,7 @@ again:
 	}
 
 	if (segs_to_map) {
-		ret = gnttab_map_refs(map, NULL, pages_to_gnt, segs_to_map);
+		ret = gnttab_map_refs(map, pages_to_gnt, segs_to_map);
 		BUG_ON(ret);
 	}
 
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index e41c79c..e652c0e 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -284,8 +284,10 @@ static int map_grant_pages(struct grant_map *map)
 	}
 
 	pr_debug("map %d+%d\n", map->index, map->count);
-	err = gnttab_map_refs(map->map_ops, use_ptemod ? map->kmap_ops : NULL,
-			map->pages, map->count);
+	err = gnttab_map_refs_userspace(map->map_ops,
+					use_ptemod ? map->kmap_ops : NULL,
+					map->pages,
+					map->count);
 	if (err)
 		return err;
 
@@ -315,9 +317,10 @@ static int __unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		}
 	}
 
-	err = gnttab_unmap_refs(map->unmap_ops + offset,
-			use_ptemod ? map->kmap_ops + offset : NULL, map->pages + offset,
-			pages);
+	err = gnttab_unmap_refs_userspace(map->unmap_ops + offset,
+					  use_ptemod ? map->kmap_ops + offset : NULL,
+					  map->pages + offset,
+					  pages);
 	if (err)
 		return err;
 
diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c
index aa846a4..1f97fa0 100644
--- a/drivers/xen/grant-table.c
+++ b/drivers/xen/grant-table.c
@@ -880,15 +880,17 @@ void gnttab_batch_copy(struct gnttab_copy *batch, unsigned count)
 }
 EXPORT_SYMBOL_GPL(gnttab_batch_copy);
 
-int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
+int __gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 		    struct gnttab_map_grant_ref *kmap_ops,
-		    struct page **pages, unsigned int count)
+		    struct page **pages, unsigned int count,
+		    bool m2p_override)
 {
 	int i, ret;
 	bool lazy = false;
 	pte_t *pte;
-	unsigned long mfn;
+	unsigned long mfn, pfn;
 
+	BUG_ON(kmap_ops && !m2p_override);
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_map_grant_ref, map_ops, count);
 	if (ret)
 		return ret;
@@ -907,10 +909,12 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 			set_phys_to_machine(map_ops[i].host_addr >> PAGE_SHIFT,
 					map_ops[i].dev_bus_addr >> PAGE_SHIFT);
 		}
-		return ret;
+		return 0;
 	}
 
-	if (!in_interrupt() && paravirt_get_lazy_mode() == PARAVIRT_LAZY_NONE) {
+	if (m2p_override &&
+	    !in_interrupt() &&
+	    paravirt_get_lazy_mode() == PARAVIRT_LAZY_NONE) {
 		arch_enter_lazy_mmu_mode();
 		lazy = true;
 	}
@@ -927,8 +931,18 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 		} else {
 			mfn = PFN_DOWN(map_ops[i].dev_bus_addr);
 		}
-		ret = m2p_add_override(mfn, pages[i], kmap_ops ?
-				       &kmap_ops[i] : NULL);
+		pfn = page_to_pfn(pages[i]);
+
+		WARN_ON(PagePrivate(pages[i]));
+		SetPagePrivate(pages[i]);
+		set_page_private(pages[i], mfn);
+
+		pages[i]->index = pfn_to_mfn(pfn);
+		if (unlikely(!set_phys_to_machine(pfn, FOREIGN_FRAME(mfn))))
+			return -ENOMEM;
+		if (m2p_override)
+			ret = m2p_add_override(mfn, pages[i], kmap_ops ?
+					       &kmap_ops[i] : NULL, pfn);
 		if (ret)
 			goto out;
 	}
@@ -937,17 +951,34 @@ int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
 	if (lazy)
 		arch_leave_lazy_mmu_mode();
 
-	return ret;
+	return 0;
+}
+
+int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
+		    struct page **pages, unsigned int count)
+{
+	return __gnttab_map_refs(map_ops, NULL, pages, count, false);
 }
 EXPORT_SYMBOL_GPL(gnttab_map_refs);
 
-int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
+int gnttab_map_refs_userspace(struct gnttab_map_grant_ref *map_ops,
+			      struct gnttab_map_grant_ref *kmap_ops,
+			      struct page **pages, unsigned int count)
+{
+	return __gnttab_map_refs(map_ops, kmap_ops, pages, count, true);
+}
+EXPORT_SYMBOL_GPL(gnttab_map_refs_userspace);
+
+int __gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
 		      struct gnttab_map_grant_ref *kmap_ops,
-		      struct page **pages, unsigned int count)
+		      struct page **pages, unsigned int count,
+		      bool m2p_override)
 {
 	int i, ret;
 	bool lazy = false;
+	unsigned long pfn, mfn;
 
+	BUG_ON(kmap_ops && !m2p_override);
 	ret = HYPERVISOR_grant_table_op(GNTTABOP_unmap_grant_ref, unmap_ops, count);
 	if (ret)
 		return ret;
@@ -958,17 +989,32 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
 			set_phys_to_machine(unmap_ops[i].host_addr >> PAGE_SHIFT,
 					INVALID_P2M_ENTRY);
 		}
-		return ret;
+		return 0;
 	}
 
-	if (!in_interrupt() && paravirt_get_lazy_mode() == PARAVIRT_LAZY_NONE) {
+	if (m2p_override &&
+	    !in_interrupt() &&
+	    paravirt_get_lazy_mode() == PARAVIRT_LAZY_NONE) {
 		arch_enter_lazy_mmu_mode();
 		lazy = true;
 	}
 
 	for (i = 0; i < count; i++) {
-		ret = m2p_remove_override(pages[i], kmap_ops ?
-				       &kmap_ops[i] : NULL);
+		pfn = page_to_pfn(pages[i]);
+		mfn = get_phys_to_machine(pfn);
+		if (mfn == INVALID_P2M_ENTRY || !(mfn & FOREIGN_FRAME_BIT))
+			return -EINVAL;
+
+		set_page_private(pages[i], INVALID_P2M_ENTRY);
+		WARN_ON(!PagePrivate(pages[i]));
+		ClearPagePrivate(pages[i]);
+		set_phys_to_machine(pfn, pages[i]->index);
+		if (m2p_override)
+			ret = m2p_remove_override(pages[i],
+						  kmap_ops ?
+						   &kmap_ops[i] : NULL,
+						  pfn,
+						  mfn);
 		if (ret)
 			goto out;
 	}
@@ -977,10 +1023,24 @@ int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
 	if (lazy)
 		arch_leave_lazy_mmu_mode();
 
-	return ret;
+	return 0;
+}
+
+int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *map_ops,
+		    struct page **pages, unsigned int count)
+{
+	return __gnttab_unmap_refs(map_ops, NULL, pages, count, false);
 }
 EXPORT_SYMBOL_GPL(gnttab_unmap_refs);
 
+int gnttab_unmap_refs_userspace(struct gnttab_unmap_grant_ref *map_ops,
+				struct gnttab_map_grant_ref *kmap_ops,
+				struct page **pages, unsigned int count)
+{
+	return __gnttab_unmap_refs(map_ops, kmap_ops, pages, count, true);
+}
+EXPORT_SYMBOL_GPL(gnttab_unmap_refs_userspace);
+
 static unsigned nr_status_frames(unsigned nr_grant_frames)
 {
 	BUG_ON(grefs_per_grant_frame == 0);
diff --git a/include/xen/grant_table.h b/include/xen/grant_table.h
index 694dcaf..9a919b1 100644
--- a/include/xen/grant_table.h
+++ b/include/xen/grant_table.h
@@ -184,11 +184,15 @@ unsigned int gnttab_max_grant_frames(void);
 #define gnttab_map_vaddr(map) ((void *)(map.host_virt_addr))
 
 int gnttab_map_refs(struct gnttab_map_grant_ref *map_ops,
-		    struct gnttab_map_grant_ref *kmap_ops,
 		    struct page **pages, unsigned int count);
+int gnttab_map_refs_userspace(struct gnttab_map_grant_ref *map_ops,
+			      struct gnttab_map_grant_ref *kmap_ops,
+			      struct page **pages, unsigned int count);
 int gnttab_unmap_refs(struct gnttab_unmap_grant_ref *unmap_ops,
-		      struct gnttab_map_grant_ref *kunmap_ops,
 		      struct page **pages, unsigned int count);
+int gnttab_unmap_refs_userspace(struct gnttab_unmap_grant_ref *unmap_ops,
+				struct gnttab_map_grant_ref *kunmap_ops,
+				struct page **pages, unsigned int count);
 
 /* Perform a batch of grant map/copy operations. Retry every batch slot
  * for which the hypervisor returns GNTST_eagain. This is typically due

^ permalink raw reply related

* Re: [PATCH] DT: net: document Ethernet bindings in one place
From: Sergei Shtylyov @ 2014-01-21 21:19 UTC (permalink / raw)
  To: Florian Fainelli, Rob Herring
  Cc: netdev, Rob Herring, Pawel Moll, Mark Rutland, Ian Campbell,
	Kumar Gala, devicetree@vger.kernel.org, Rob Landley,
	linux-doc@vger.kernel.org, Grant Likely
In-Reply-To: <CAGVrzcb0dfw1orZzUt5-YkShOg-HNbUYvvo2vfmsZUZXy1Aqfg@mail.gmail.com>

Hello.

On 01/21/2014 11:05 PM, Florian Fainelli wrote:

>>>>     I'm afraid that's too late, it has spread very far, so that
>>>> of_get_phy_mode() handles that property, not "phy-connection-type".

>>> Uggg, I guess this is a case of a defacto standard then if the kernel
>>> doesn't even support it.

>> Maybe I forgot to CC you on patch sent to Grant only, I sent a patch a
>> while ago for of_get_phy_mode() to look for both "phy-mode" and
>> "phy-connection-type" since the former has been a Linux invention, but
>> the latter is ePAPR specified.

> Here is a link to the actual patch in question, not sure which tree
> Grant applied it to though:

> http://lkml.indiana.edu/hypermail/linux/kernel/1311.2/00048.html

    It's not the patch mail, it's Grant's "applied" reply, patch is mangled in 
this reply, and I couldn't follow the thread. Here's the actual patch mail:

http://marc.info/?l=devicetree&m=138449662807254

WBR, Sergei

^ permalink raw reply

* Re: [PATCH] DT: net: document Ethernet bindings in one place
From: Florian Fainelli @ 2014-01-21 20:05 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sergei Shtylyov, netdev, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, devicetree@vger.kernel.org, Rob Landley,
	linux-doc@vger.kernel.org, Grant Likely
In-Reply-To: <CAGVrzcb3X3soJNJCE5+YSpQrr+EdycCRFkptPvBCgFg4CbGJ4Q@mail.gmail.com>

2014/1/21 Florian Fainelli <f.fainelli@gmail.com>:
>>>    I'm afraid that's too late, it has spread very far, so that
>>> of_get_phy_mode() handles that property, not "phy-connection-type".
>>
>> Uggg, I guess this is a case of a defacto standard then if the kernel
>> doesn't even support it.
>
> Maybe I forgot to CC you on patch sent to Grant only, I sent a patch a
> while ago for of_get_phy_mode() to look for both "phy-mode" and
> "phy-connection-type" since the former has been a Linux invention, but
> the latter is ePAPR specified.

Here is a link to the actual patch in question, not sure which tree
Grant applied it to though:

http://lkml.indiana.edu/hypermail/linux/kernel/1311.2/00048.html

>
>>
>>>>> +- phy-connection-type: the same as "phy-mode" property (but described in
>>>>> ePAPR);
>>>>> +- phy-handle: phandle, specifies a reference to a node representing a
>>>>> PHY
>>>>> +  device (this property is described in ePAPR);
>>>>> +- phy: the same as "phy-handle" property (but actually ad-hoc one).
>>>
>>>
>>>> Mark this as deprecated in favor of phy-handle.
>>>
>>>
>>>    Here situation is more optimistic. Quite many drivers still use
>>> "phy-handle", though some use even more exotic props I didn't document here.
>>
>> Perhaps flagging as "Not recommended for new bindings" would be nicer wording...
>
> Ok, so what's the deal here, can't we use phy-handle? There is
> currently no infrastructure in drivers/net/ or drivers/of/ to look for
> the "phy-handle" nor "phy" properties as phandles to fetch an Ethernet
> PHY device tree node. If we are to provide one common place where we
> want to do this, we will have to look for both properties, why can't
> we mandate the use of "phy-handle" since this is the ePAPR compliant
> one?
>
>>
>> Rob
>> --
>> To unsubscribe from this list: send the line "unsubscribe devicetree" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
>
>
> --
> Florian



-- 
Florian

^ permalink raw reply

* Re: [PATCH] DT: net: document Ethernet bindings in one place
From: Florian Fainelli @ 2014-01-21 19:56 UTC (permalink / raw)
  To: Rob Herring
  Cc: Sergei Shtylyov, netdev, Rob Herring, Pawel Moll, Mark Rutland,
	Ian Campbell, Kumar Gala, devicetree@vger.kernel.org, Rob Landley,
	linux-doc@vger.kernel.org
In-Reply-To: <CAL_JsqJXbF1-PcPHR2VP+Vi9A1aWizdsG_r3kDvRt3itXDhCGQ@mail.gmail.com>

2014/1/20 Rob Herring <robherring2@gmail.com>:
> On Mon, Jan 20, 2014 at 3:45 PM, Sergei Shtylyov
> <sergei.shtylyov@cogentembedded.com> wrote:
>> On 01/20/2014 05:06 PM, Rob Herring wrote:
>>
>>>> This patch is an attempt to gather the Ethernet related bindings in one
>>>> file,
>>>> like it's done in the MMC and some other subsystems. It should save the
>>>> trouble
>>>> of documenting several properties over and over in each binding document.
>>
>>
>>>> I have used the Embedded Power Architecture(TM) Platform Requirements
>>>> (ePAPR)
>>>> standard as a base for the properties description, also documenting some
>>>> ad-hoc
>>>> properties that have been introduced over time despite having direct
>>>> analogs in
>>>> ePAPR.
>>
>>
>>>> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>>
>>
>>>> ---
>>>> The patch is against DaveM's 'net-next.git' repo and the DaVinci EMAC
>>>> bindings
>>>> fix I've posted yesterday:
>>
>>
>>>> http://patchwork.ozlabs.org/patch/311854/
>>
>>
>> [...]
>
>>>> Index: net-next/Documentation/devicetree/bindings/net/ethernet.txt
>>>> ===================================================================
>>>> --- /dev/null
>>>> +++ net-next/Documentation/devicetree/bindings/net/ethernet.txt
>>>> @@ -0,0 +1,20 @@
>>>> +The following properties are common to the Ethernet controllers:
>>>> +
>>>> +- local-mac-address: array of 6 bytes, specifies the MAC address that
>>>> was
>>>> +  assigned to the network device;
>>>> +- mac-address: array of 6 bytes, specifies the MAC address that was last
>>>> used by
>>>> +  the boot program; should be used in cases where the MAC address
>>>> assigned to
>>>> +  the device by the boot program is different from the
>>>> "local-mac-address"
>>>> +  property;
>>>> +- max-speed: number, specifies maximum speed in Mbit/s supported by the
>>>> device;
>>>> +- phy-mode: string, operation mode of the PHY interface; supported
>>>> values are
>>>> +  "mii", "gmii", "sgmii", "tbi", "rev-mii", "rmii", "rgmii", "rgmii-id",
>>>> +  "rgmii-rxid", "rgmii-txid", "rtbi", "smii", "xgmii";
>>
>>
>>> Mark this as deprecated
>>
>>
>>    That's kind of wishful thinking at this point, as that's what the
>> majority of drivers use. I'm unsure of the reasons why that was done,
>> probably people just didn't read the proper specs...
>
> Or the spec was defined after those bindings. Deprecating does not
> matter for existing bindings. It's only defining new ones that are
> affected. I was more concerned with giving people guidance on which
> one to use for new bindings.
>
>>> in favor of phy-connection-type
>>
>>
>>    That one is only used by the oldish PowerPC 'gianfar' driver.
>>
>>
>>> so it's use does not spread.
>>
>>
>>    I'm afraid that's too late, it has spread very far, so that
>> of_get_phy_mode() handles that property, not "phy-connection-type".
>
> Uggg, I guess this is a case of a defacto standard then if the kernel
> doesn't even support it.

Maybe I forgot to CC you on patch sent to Grant only, I sent a patch a
while ago for of_get_phy_mode() to look for both "phy-mode" and
"phy-connection-type" since the former has been a Linux invention, but
the latter is ePAPR specified.

>
>>>> +- phy-connection-type: the same as "phy-mode" property (but described in
>>>> ePAPR);
>>>> +- phy-handle: phandle, specifies a reference to a node representing a
>>>> PHY
>>>> +  device (this property is described in ePAPR);
>>>> +- phy: the same as "phy-handle" property (but actually ad-hoc one).
>>
>>
>>> Mark this as deprecated in favor of phy-handle.
>>
>>
>>    Here situation is more optimistic. Quite many drivers still use
>> "phy-handle", though some use even more exotic props I didn't document here.
>
> Perhaps flagging as "Not recommended for new bindings" would be nicer wording...

Ok, so what's the deal here, can't we use phy-handle? There is
currently no infrastructure in drivers/net/ or drivers/of/ to look for
the "phy-handle" nor "phy" properties as phandles to fetch an Ethernet
PHY device tree node. If we are to provide one common place where we
want to do this, we will have to look for both properties, why can't
we mandate the use of "phy-handle" since this is the ePAPR compliant
one?

>
> Rob
> --
> To unsubscribe from this list: send the line "unsubscribe devicetree" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Florian

^ permalink raw reply

* I Need Your Assistance.
From: Malik Wasim @ 2014-01-21 19:43 UTC (permalink / raw)


Dear Friend,

I am Dr.Malik Wasim, Manager Auditing and Accountancy Department, Bank of Africa in (B.O.A) Burkina Faso. I got your email account while searching for a business oriented personality and you happen to be the one.

I bring forth a business proposal in the tune of $12.5 million U.S. dollars to be transferred to an offshore account with your assistance acting as beneficiary and next-of-kin to the funds.
If you know you are capable of involving and partaking in this transaction, please send down the following personal details to me via dr.malik_wasim@yahoo.com

(1)Full names:........................ .....
(2)Occupation:................ ..............
(3)Age and Sex:..........................
(4)Marital………….
Status:................
(5)Private phone number:..................
(6)Current residential address:.................

Noted; that every discussion is in the detail.
Dr.Malik Wasim.

^ permalink raw reply

* Re: [Xen-devel] [PATCH v3] xen/grant-table: Avoid m2p_override during mapping
From: Zoltan Kiss @ 2014-01-21 19:43 UTC (permalink / raw)
  To: Stefano Stabellini, David Vrabel
  Cc: jonathan.davies, wei.liu2, ian.campbell, netdev, linux-kernel,
	xen-devel
In-Reply-To: <alpine.DEB.2.02.1401211416420.21510@kaball.uk.xensource.com>

On 21/01/14 14:22, Stefano Stabellini wrote:
> On Tue, 21 Jan 2014, David Vrabel wrote:
>> On 21/01/14 12:26, Stefano Stabellini wrote:
>>> On Mon, 20 Jan 2014, Zoltan Kiss wrote:
>>>
>>>> -		ret = m2p_add_override(mfn, pages[i], kmap_ops ?
>>>> -				       &kmap_ops[i] : NULL);
>>>> +		if (m2p_override)
>>>> +			ret = m2p_add_override(mfn, pages[i], kmap_ops ?
>>>> +					       &kmap_ops[i] : NULL);
>>>> +		else {
>>>> +			unsigned long pfn = page_to_pfn(pages[i]);
>>>> +			WARN_ON(PagePrivate(pages[i]));
>>>> +			SetPagePrivate(pages[i]);
>>>> +			set_page_private(pages[i], mfn);
>>>> +			pages[i]->index = pfn_to_mfn(pfn);
>>>> +			if (unlikely(!set_phys_to_machine(pfn, FOREIGN_FRAME(mfn))))
>>>> +				return -ENOMEM;
>>>
>>> What happens if the page is PageHighMem?
>>>
>>> This looks like a subset of m2p_add_override, but it is missing some
>>> relevant bits, like the PageHighMem check, or the p2m(m2p(mfn)) == mfn
>>> check.  Maybe we can find a way to avoid duplicating the code.
>>> We could split m2p_add_override in two functions or add yet another
>>> parameter to it.
>>
>> The PageHighMem() check isn't relevant as we're not mapping anything
>> here.  Also, a page for a kernel grant mapping only cannot be highmem.
>>
>> The check for a local mfn and the additional set_phys_to_machine() is
>> only necessary if something tries an mfn_to_pfn() on the local mfn.  We
>> can only omit adding an m2p override if we know thing will try
>> mfn_to_pfn(), therefore the check and set_phys_to_machine() is unnecessary.
>
> OK, you convinced me that the two checks are superfluous for this case.
>
> Can we still avoid the code duplication by removing the corresponding
> code from m2p_add_override and m2p_remove_override and doing the
> set_page_private thing uniquely in grant-table.c?
Yes, I moved these parts out from the m2p* funcions to the gntmap 
functions. One change is that now we pass pfn/mfn to m2p* functions as 
they are changing right before the call. Also, to avoid racing I clear 
the page->private value before calling m2p_remove_override. I'll send in 
the patch soon.

Zoli

^ permalink raw reply

* Re: [PATCH RFC 4/6] net: rfkill: gpio: add device tree support
From: Arnd Bergmann @ 2014-01-21 18:50 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
	Linus Walleij, devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
	Heikki Krogerus, netdev, linux-wireless, linux-sunxi,
	linux-kernel, Maxime Ripard, Chen-Yu Tsai, Johannes Berg,
	Mika Westerberg, David S. Miller,
	linux-gpio-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
In-Reply-To: <CAAVeFuKFfAvSPYLmvWV5jjT-peZFJ8sJ2bbh4F=JAYoWLhjZpA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>

On Tuesday 21 January 2014, Alexandre Courbot wrote:
> >> As discussed earlier in this thread I'm not sure the con_id is
> >> suitable for labelling GPIOs. It'd be better to have a proper name
> >> specified in DT/ACPI instead.
> >
> > +1
> 
> I wonder why you guys prefer to have the name defined in the GPIO
> mapping. Having the driver decide the label makes it easier to look up
> which GPIO does what in debugfs, whereas nothing prevents people to
> name GPIOs whatever inadequate name they want in the device DT node.
> What am I overlooking here?

I should have another look at the debugfs representation, but isn't
there a global namespace that gets used for all gpios?  Neither the
con_id nor the name that the driver picks would be globally unique
and stable across kernel versions, so they don't make a good user
interface.

I think what we want here is a system-wide unique identifier for
each gpio line that may get represented in debugfs, and use a new
DT mechanism to communicate that.

	Arnd

^ permalink raw reply

* Re: [PATCH] ip_tunnel: clear IPCB in ip_tunnel_xmit() in case dst_link_failure() is called
From: Pravin Shelar @ 2014-01-21 18:33 UTC (permalink / raw)
  To: Duan Jiong; +Cc: David Miller, Daniel Petre, Eric Dumazet, netdev
In-Reply-To: <52DE1F60.2060109@cn.fujitsu.com>

On Mon, Jan 20, 2014 at 11:18 PM, Duan Jiong <duanj.fnst@cn.fujitsu.com> wrote:
>
> commit a622260254ee48("ip_tunnel: fix kernel panic with icmp_dest_unreach")
> clear IPCB in ip_tunnel_xmit()  , or else skb->cb[] may contain garbage from
> GSO segmentation layer.
>
> But commit 0e6fbc5b6c621("ip_tunnels: extend iptunnel_xmit()") refactor codes,
> and it clear IPCB behind the dst_link_failure().
>
ok.

> So clear IPCB in ip_tunnel_xmit() just like commti a622260254ee48("ip_tunnel:
> fix kernel panic with icmp_dest_unreach").
>
> Signed-off-by: Duan Jiong <duanj.fnst@cn.fujitsu.com>
> ---
>  net/ipv4/ip_tunnel.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> index 90ff957..dda1e9a 100644
> --- a/net/ipv4/ip_tunnel.c
> +++ b/net/ipv4/ip_tunnel.c
> @@ -535,6 +535,7 @@ void ip_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
>
>         inner_iph = (const struct iphdr *)skb_inner_network_header(skb);
>
> +       memset(IPCB(skb), 0, sizeof(*IPCB(skb)));
>         dst = tnl_params->daddr;
>         if (dst == 0) {
>                 /* NBMA tunnel */

Can you clear IPCB in error path for dst_link_failure(). So that we do
not clear it twice in fast path.

^ permalink raw reply

* Re: [PATCH] net: gre: don't pull skb if dealing with icmp message
From: Pravin Shelar @ 2014-01-21 18:27 UTC (permalink / raw)
  To: Duan Jiong; +Cc: David Miller, Daniel Borkmann, netdev
In-Reply-To: <52DC7DE4.1050108@cn.fujitsu.com>

On Sun, Jan 19, 2014 at 5:37 PM, Duan Jiong <duanj.fnst@cn.fujitsu.com> wrote:
> 于 2014年01月20日 05:08, Pravin Shelar 写道:
>> On Sun, Jan 19, 2014 at 12:35 AM, Duan Jiong <duanj.fnst@cn.fujitsu.com> wrote:
>>>
>>> When dealing with icmp messages, the skb->data points the
>>> ip header that triggered the sending of the icmp message.
>>>
>>> In gre_cisco_err(), the parse_gre_header() is called, and the
>>> iptunnel_pull_header() is called to pull the skb at the end of
>>> the parse_gre_header(). Unfortunately, the ipgre_err still needs
>>> the skb->data points the ip header following the icmp layer,
>>> and those ip addresses in ip header will be used to look up
>>> tunnel by ip_tunnel_lookup().
>>>
>> This looks like bug.
>> Can you use ip_hdr() rather than skb->data in ipgre_err().
>> Same is done in ipgre_rcv().
>
> That's maybe not a good idea. The ip_hdr() will the return the outmost
> ip header, but we need inner ip header following the icmp layer.
>
Why do you want to use inner ip header?
If you want to really look at original packet header, you can
calculate it from outer iph header rather than changing fast path
parse_gre_header() function.

^ permalink raw reply

* Re: [PATCH net-next 0/2] Bug Fixes for SFC driver
From: Ben Hutchings @ 2014-01-21 18:02 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-net-drivers, Shradha Shah
In-Reply-To: <52D98860.4020402@solarflare.com>

On Fri, 2014-01-17 at 19:45 +0000, Shradha Shah wrote:
> I am taking over the upstream patch submission work for the 
> sfc driver from Ben Hutchings.
> 
> These patches are bug fixes to the sfc driver involving 
> replacement of the PORT RESET MC command and fixing transposed 
> ptp_{under,over}size_sync_window_statistics
> 
> The PORT_RESET bug fix is needed for all versions supporting EF10
> i.e all versions including and after 3.12.
> 
> Ben Hutchings (1):
>   sfc: Fix transposed ptp_{under,over}size_sync_windows statistics
> 
> Jon Cooper (1):
>   sfc: Change efx_mcdi_reset_port to use ENTITY_RESET MC command.
> 
>  drivers/net/ethernet/sfc/ef10.c |   31 ++++++++++++++++++++++++++-----
>  drivers/net/ethernet/sfc/mcdi.c |   14 +++++++++++---
>  drivers/net/ethernet/sfc/ptp.c  |    4 ++--
>  3 files changed, 39 insertions(+), 10 deletions(-)

David, I don't know if you're waiting for my ack on these, but in case
you are:

Acked-by: Ben Hutchings <bhutchings@solarflare.com>

Shradha will send a MAINTAINERS update in the next patch series.

Ben.

-- 
Ben Hutchings, Staff Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.

^ permalink raw reply

* Re: FEC performance degradation with certain packet sizes
From: Marek Vasut @ 2014-01-21 17:49 UTC (permalink / raw)
  To: fugang.duan@freescale.com
  Cc: Hector Palacios, netdev@vger.kernel.org,
	Fabio.Estevam@freescale.com, shawn.guo@linaro.org,
	l.stach@pengutronix.de, Frank.Li@freescale.com,
	bhutchings@solarflare.com, davem@davemloft.net
In-Reply-To: <45804b55ce704164bc054d4275801153@BLUPR03MB373.namprd03.prod.outlook.com>

On Monday, December 23, 2013 at 03:52:20 AM, fugang.duan@freescale.com wrote:
> From: Hector Palacios <hector.palacios@digi.com>
> Sent: Friday, December 20, 2013 11:02 PM
> 
> >To: Duan Fugang-B38611; Marek Vasut; netdev@vger.kernel.org
> >Cc: Estevam Fabio-R49496; shawn.guo@linaro.org; l.stach@pengutronix.de; Li
> >Frank-B20596; bhutchings@solarflare.com; davem@davemloft.net
> >Subject: Re: FEC performance degradation with certain packet sizes
> >
> >Dear Andy,
> >
> >On 12/20/2013 04:35 AM, fugang.duan@freescale.com wrote:
> >> [...]
> >> 
> >> I can reproduce the issue on imx6q/dl platform with freescale internal
> >> kernel
> >
> >tree.
> >
> >> This issue must be related to cpufreq, when set scaling_governor to
> >
> >performance:
> >> echo performance >
> >> /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
> >> 
> >> And then do NPtcp test, the result as below:
> >>   24:      99 bytes      5 times -->      9.89 Mbps in      76.40 usec
> >>   25:     125 bytes      5 times -->     12.10 Mbps in      78.80 usec
> >>   26:     128 bytes      5 times -->     12.27 Mbps in      79.60 usec
> >>   27:     131 bytes      5 times -->     12.80 Mbps in      78.10 usec
> >>   28:     189 bytes      5 times -->     18.00 Mbps in      80.10 usec
> >>   29:     192 bytes      5 times -->     18.31 Mbps in      80.00 usec
> >>   30:     195 bytes      5 times -->     18.41 Mbps in      80.80 usec
> >>   31:     253 bytes      5 times -->     23.34 Mbps in      82.70 usec
> >>   32:     256 bytes      5 times -->     23.91 Mbps in      81.70 usec
> >>   33:     259 bytes      5 times -->     24.19 Mbps in      81.70 usec
> >>   34:     381 bytes      5 times -->     33.18 Mbps in      87.60 usec
> >>   35:     384 bytes      5 times -->     33.87 Mbps in      86.50 usec
> >>   36:     387 bytes      5 times -->     34.41 Mbps in      85.80 usec
> >>   37:     509 bytes      5 times -->     42.72 Mbps in      90.90 usec
> >>   38:     512 bytes      5 times -->     42.60 Mbps in      91.70 usec
> >>   39:     515 bytes      5 times -->     42.80 Mbps in      91.80 usec
> >>   40:     765 bytes      5 times -->     56.45 Mbps in     103.40 usec
> >>   41:     768 bytes      5 times -->     57.11 Mbps in     102.60 usec
> >>   42:     771 bytes      5 times -->     57.22 Mbps in     102.80 usec
> >>   43:    1021 bytes      5 times -->     70.69 Mbps in     110.20 usec
> >>   44:    1024 bytes      5 times -->     70.70 Mbps in     110.50 usec
> >>   45:    1027 bytes      5 times -->     69.59 Mbps in     112.60 usec
> >>   46:    1533 bytes      5 times -->     73.56 Mbps in     159.00 usec
> >>   47:    1536 bytes      5 times -->     72.92 Mbps in     160.70 usec
> >>   48:    1539 bytes      5 times -->     73.80 Mbps in     159.10 usec
> >>   49:    2045 bytes      5 times -->     93.59 Mbps in     166.70 usec
> >>   50:    2048 bytes      5 times -->     94.07 Mbps in     166.10 usec
> >>   51:    2051 bytes      5 times -->     92.92 Mbps in     168.40 usec
> >>   52:    3069 bytes      5 times -->    123.43 Mbps in     189.70 usec
> >>   53:    3072 bytes      5 times -->    123.68 Mbps in     189.50 usec
> >
> >You are right. Unfortunately, this does not work on i.MX28 (at least for
> >me). Couldn't it be that the cpufreq is masking the problem on the i.MX6?
> >
> >Best regards,
> >--
> >Hector Palacios
> 
> I reproduce the issue on imx28 evk platform, imx28 has no specific cpufreq
> driver. In kernel 3.13, the ethernet driver is almost the same for imx28
> and imx6 since they use the Same enet IP, but imx6 enet IP have some
> evolution.
> 
> Now I don't know the cause. When I am free, I will dig out it.

Hi! Is there any progress on this issue ? Did you manage to find anything out 
please?

Best regards,
Marek Vasut

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Jesse Gross @ 2014-01-21 17:37 UTC (permalink / raw)
  To: Pravin Shelar; +Cc: Or Gerlitz, Joseph Gasparakis, netdev
In-Reply-To: <CALnjE+rUTejbU_d39G7WSn=k0aY2xELCE3UxjDr_Zdq3iwHbNA@mail.gmail.com>

On Tue, Jan 21, 2014 at 9:30 AM, Pravin Shelar <pshelar@nicira.com> wrote:
> On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
>> Hi Pravin,
>>
>> While testing the gro udp patches over a setup with openvswitch I noted that
>> the RX checksum offload support introduced by Joseph's commit 0afb166
>> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
>> when you use a setup made of
>>
>> NIC --> IP stack --> vxlan device --> bridge --> tap
>>
>> but not when its
>>
>> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>>
>> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
>> remains true also after the decap is done. Basically, this is the original
>> hunk
>>
>>
>>> @@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk,
>>> struct sk_buff *skb)
>>>
>>>         __skb_tunnel_rx(skb, vxlan->dev);
>>>         skb_reset_network_header(skb);
>>> -       skb->ip_summed = CHECKSUM_NONE;
>>> +
>>> +       /* If the NIC driver gave us an encapsulated packet with
>>> +        * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
>>> +        * leave the CHECKSUM_UNNECESSARY, the device checksummed it
>>> +        * for us. Otherwise force the upper layers to verify it.
>>> +        */
>>> +       if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation
>>> ||
>>> +           !(vxlan->dev->features & NETIF_F_RXCSUM))
>>> +               skb->ip_summed = CHECKSUM_NONE;
>>> +
>>> +       skb->encapsulation = 0;
>>
>>
>> which was moved by your commits
>>
>> 5cfccc5 vxlan: Add vxlan recv demux.
>> 7ce0475 vxlan: Restructure vxlan receive.
>>
>> to a code which isn't shared by the vxlan driver and ovs vxlan datapath
>> code, and I was thinking lets just move it back to vxlan_udp_encap_recv.
>> However, we can't really make the "vxlan->dev->features & NETIF_F_RXCSUM"
>> check - since there's no vxlan device for ovs datapath, any idea how to
>> overcome this.
>>
>> Moving this to shared code (while removing the check for
>> vxlan->dev->features) made things to work on my setup, but this misses one
>> of the original conditions, ideas?
>>
> I kept csum check in vxlan-device recv path for same reason. As of now
> there is no efficient way to get ovs-dev features.
> May be we can cache device features in struct datapath from datapath-netdev.\

I don't think that getting the features from the internal device makes
sense - a packet could be destined to a VM instead.

I think that controlling RX checksum on the device that physically
received it should be enough to be able to check for broken devices.
It's not clear what the benefit to turning it off at the VXLAN layer
is - if you really don't want it for some reason, you can always throw
away the checksum later.

^ permalink raw reply

* Re: issues with vxlan RX checksum offload under OVS datapath
From: Pravin Shelar @ 2014-01-21 17:30 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: Joseph Gasparakis, netdev
In-Reply-To: <52DC4C1B.5000309@mellanox.com>

On Sun, Jan 19, 2014 at 2:05 PM, Or Gerlitz <ogerlitz@mellanox.com> wrote:
> Hi Pravin,
>
> While testing the gro udp patches over a setup with openvswitch I noted that
> the RX checksum offload support introduced by Joseph's commit 0afb166
> "vxlan: Add capability of Rx checksum offload for inner packet" works fine
> when you use a setup made of
>
> NIC --> IP stack --> vxlan device --> bridge --> tap
>
> but not when its
>
> NIC --> IP stack --> ovs vxlan port --> OVS DP --> tap
>
> I narrowed it down to the fact the when going the OVS pathskb->encapsulation
> remains true also after the decap is done. Basically, this is the original
> hunk
>
>
>> @@ -607,7 +607,17 @@ static int vxlan_udp_encap_recv(struct sock *sk,
>> struct sk_buff *skb)
>>
>>         __skb_tunnel_rx(skb, vxlan->dev);
>>         skb_reset_network_header(skb);
>> -       skb->ip_summed = CHECKSUM_NONE;
>> +
>> +       /* If the NIC driver gave us an encapsulated packet with
>> +        * CHECKSUM_UNNECESSARY and Rx checksum feature is enabled,
>> +        * leave the CHECKSUM_UNNECESSARY, the device checksummed it
>> +        * for us. Otherwise force the upper layers to verify it.
>> +        */
>> +       if (skb->ip_summed != CHECKSUM_UNNECESSARY || !skb->encapsulation
>> ||
>> +           !(vxlan->dev->features & NETIF_F_RXCSUM))
>> +               skb->ip_summed = CHECKSUM_NONE;
>> +
>> +       skb->encapsulation = 0;
>
>
> which was moved by your commits
>
> 5cfccc5 vxlan: Add vxlan recv demux.
> 7ce0475 vxlan: Restructure vxlan receive.
>
> to a code which isn't shared by the vxlan driver and ovs vxlan datapath
> code, and I was thinking lets just move it back to vxlan_udp_encap_recv.
> However, we can't really make the "vxlan->dev->features & NETIF_F_RXCSUM"
> check - since there's no vxlan device for ovs datapath, any idea how to
> overcome this.
>
> Moving this to shared code (while removing the check for
> vxlan->dev->features) made things to work on my setup, but this misses one
> of the original conditions, ideas?
>
I kept csum check in vxlan-device recv path for same reason. As of now
there is no efficient way to get ovs-dev features.
May be we can cache device features in struct datapath from datapath-netdev.

> I believe its too late for 3.13 unless there is going to be -rc9, but lets
> first see what's the right fix.
>
> Or.

^ permalink raw reply

* [PATCH net-next v2] ipv6: enable anycast addresses as source addresses for datagrams
From: Francois-Xavier Le Bail @ 2014-01-21 16:01 UTC (permalink / raw)
  To: netdev; +Cc: Hannes Frederic Sowa, David Stevens, David Miller

This change allows to consider an anycast address valid as source address
when given via an IPV6_PKTINFO or IPV6_2292PKTINFO ancillary data item.
So, when sending a datagram with ancillary data, the unicast and anycast
addresses are handled in the same way.

- Adds ipv6_chk_acast_addr_src() to check if an anycast address is link-local
  on given interface or is global.
- Uses it in ip6_datagram_send_ctl().

Signed-off-by: Francois-Xavier Le Bail <fx.lebail@yahoo.com>
---
v2: Consideration of Hannes's review:
    - Same behaviour for anycast and unicast.
    - No need for a new socket option.

Typical usage :
A server uses IPV6_RECVPKTINFO socket option to get ancillary data with
recvmsg() and can use sendmsg() to reply with anycast adress as source address
in the same way it does for unicast.

 include/net/addrconf.h |    5 +++--
 net/ipv6/anycast.c     |   11 +++++++++++
 net/ipv6/datagram.c    |    4 +++-
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/net/addrconf.h b/include/net/addrconf.h
index 66c4a44..50e39a8 100644
--- a/include/net/addrconf.h
+++ b/include/net/addrconf.h
@@ -205,8 +205,9 @@ void ipv6_sock_ac_close(struct sock *sk);
 int ipv6_dev_ac_inc(struct net_device *dev, const struct in6_addr *addr);
 int __ipv6_dev_ac_dec(struct inet6_dev *idev, const struct in6_addr *addr);
 bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
-				const struct in6_addr *addr);
-
+			 const struct in6_addr *addr);
+bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
+			     const struct in6_addr *addr);
 
 /* Device notifier */
 int register_inet6addr_notifier(struct notifier_block *nb);
diff --git a/net/ipv6/anycast.c b/net/ipv6/anycast.c
index 5a80f15..d3a6e2d 100644
--- a/net/ipv6/anycast.c
+++ b/net/ipv6/anycast.c
@@ -383,6 +383,17 @@ bool ipv6_chk_acast_addr(struct net *net, struct net_device *dev,
 	return found;
 }
 
+/*	check if this anycast address is link-local on given interface or
+ *	is global
+ */
+bool ipv6_chk_acast_addr_src(struct net *net, struct net_device *dev,
+			     const struct in6_addr *addr)
+{
+	if (ipv6_addr_type(addr) & IPV6_ADDR_LINKLOCAL)
+		return ipv6_chk_acast_dev(dev, addr);
+	else
+		return ipv6_chk_acast_addr(net, NULL, addr);
+}
 
 #ifdef CONFIG_PROC_FS
 struct ac6_iter_state {
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index cd8699b..2615197 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -689,7 +689,9 @@ int ip6_datagram_send_ctl(struct net *net, struct sock *sk,
 				int strict = __ipv6_addr_src_scope(addr_type) <= IPV6_ADDR_SCOPE_LINKLOCAL;
 				if (!(inet_sk(sk)->freebind || inet_sk(sk)->transparent) &&
 				    !ipv6_chk_addr(net, &src_info->ipi6_addr,
-						   strict ? dev : NULL, 0))
+						   strict ? dev : NULL, 0) &&
+				    !ipv6_chk_acast_addr_src(net, dev,
+							     &src_info->ipi6_addr))
 					err = -EINVAL;
 				else
 					fl6->saddr = src_info->ipi6_addr;

^ permalink raw reply related

* Re: [PATCH net-next 00/25] bonding: introduce new option API
From: Nikolay Aleksandrov @ 2014-01-21 16:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: netdev, Andy Gospodarek, Jay Vosburgh, Veaceslav Falico,
	Scott Feldman, David S. Miller
In-Reply-To: <1390319983.26902.4.camel@dcbw.local>

On 01/21/2014 04:59 PM, Dan Williams wrote:
> On Tue, 2014-01-21 at 15:54 +0100, Nikolay Aleksandrov wrote:
>> Hi,
>> This patchset's goal is to introduce a new option API which should be used
>> to properly describe the bonding options with their mode dependcies and
>> requirements. With this patchset applied we get centralized option
>> manipulation, automatic RTNL acquire per option setting, automatic option
>> range checking, mode dependcy checking and other various flags which are
>> described in detail in patch 01's commit message and comments.
>> Also the parameter passing is changed to use a specialized structure which
>> is initialized to a value depending on the needs.
>> The main exported functions are:
>>  __bond_opt_set() - set an option (RTNL should be acquired prior)
>>  bond_opt_init(val|str) - init a bond_opt_value struct for value or string
>>                           parameter passing
>>  bond_opt_tryset_rtnl() - function which tries to acquire rtnl, mainly used
>>                           for sysfs
>>  bond_opt_parse - used to parse or check for valid values
>>  bond_opt_get - retrieve a pointer to bond_option struct for some option
>>  bond_opt_get_val - retrieve a pointer to a bond_opt_value struct for
>>                     some value
>>
>> The same functions are used to set an option via sysfs and netlink, just
>> the parameter that's passed is usually initialized in a different way.
>> The converted options have multiple style fixes, there're some longer
>> lines but they looked either ugly or were strings/pr_warnings, if you
>> think some line would be better broken just let me know :-) there're
>> also a few sscanf false-positive warnings.
>> I decided to keep the "unsuppmodes" way of mode dep checking since it's
>> straight forward, if we make a more general way for checking dependencies
>> it'll be easy to change it.
>>
>> Future plans for this work include:
>>  - Automatic sysfs generation from the bond_opts[].
>>  - Use of the API in bond_check_params() and thus cleaning it up (this has
>>    actually started, I'll take care of the rest in a separate patch)
>>  - Clean up all option-unrelated files of option definitions and functions
>>
>> I've tried to leave as much documentation as possible, if there's anything
>> unclear please let me know. One more thing, I haven't moved all
>> option-related functions from bonding.h to the new bond_options.h, this
>> will be done in a separate patch, it's in my todo list.
>>
>> This patchset has been tested by setting each converted option via sysfs
>> and netlink to a couple of wrong values, a couple of correct values and
>> some random values, also for the opts that have flags they have been
>> tested as well.
> 
> Currently userspace has to encode a lot of the same logic the kernel has
> for option validation, for example when creating a user interface for
> this stuff, you have to know that miimon and arp are incompatible, and
> that certain options are only relevant with certain bond modes, and it's
> a mess.  And this also sometimes changes when new kernel options or
> capabilities are added.
> 
> So, is there any good way to describe the valid value ranges or
> capabilities for userspace?  One idea is to send a package of bond
> options to the kernel and see if they validate before actually applying
> them, though this only works when actually configuring the interface.
> An additional idea would be to somehow describe the available options
> (eg, value ranges for numeric values, list-of-strings for bond modes,
> etc) that the kernel supports before any bonds are even created (thus
> probably through netlink, not sysfs).
> 
> Thoughts?
> 
> Dan
> 
Hi Dan,
I noticed the same thing while looking at the new ip link code. With this
change though you can pass anything to the option and it'll get validated,
so even if you pass a wrong value - not a problem you'll get the
appropriate error, so I don't see a problem for ip link to accept any
input, maybe the problem comes from the dual-personality options (accepting
both string and integers), we can always switch to the way sysfs does it
with either passing the argument as a string so __bond_opt_set would parse
it or the other option would be to export the option value tables via some
interface so ip can do its own validation.

This is a nice task to add to the todo list, just have to agree on a
solution :-)
Thanks for bringing it up.

Cheers,
 Nik
>> Best regards,
>>  Nikolay Aleksandrov
>>
>> CC: Andy Gospodarek <andy@greyhouse.net>
>> CC: Jay Vosburgh <fubar@us.ibm.com>
>> CC: Veaceslav Falico <vfalico@redhat.com>
>> CC: Scott Feldman <sfeldma@cumulusnetworks.com>
>> CC: David S. Miller <davem@davemloft.net>
>>
>> Nikolay Aleksandrov (25):
>>   bonding: add infrastructure for an option API
>>   bonding: convert mode setting to use the new option API
>>   bonding: convert packets_per_slave to use the new option API
>>   bonding: convert xmit_hash_policy to use the new option API
>>   bonding: convert arp_validate to use the new option API
>>   bonding: convert arp_all_targets to use the new option API
>>   bonding: convert fail_over_mac to use the new option API
>>   bonding: convert arp_interval to use the new option API
>>   bonding: convert arp_ip_target to use the new option API
>>   bonding: convert downdelay to use the new option API
>>   bonding: convert updelay to use the new option API
>>   bonding: convert lacp_rate to use the new option API
>>   bonding: convert min_links to use the new option API
>>   bonding: convert ad_select to use the new option API
>>   bonding: convert num_peer_notif to use the new option API
>>   bonding: convert miimon to use the new option API
>>   bonding: convert primary to use the new option API
>>   bonding: convert primary_reselect to use the new option API
>>   bonding: convert use_carrier to use the new option API
>>   bonding: convert active_slave to use the new option API
>>   bonding: convert queue_id to use the new option API
>>   bonding: convert all_slaves_active to use the new option API
>>   bonding: convert resend_igmp to use the new option API
>>   bonding: convert lp_interval to use the new option API
>>   bonding: convert slaves to use the new option API
>>
>>  drivers/net/bonding/bond_main.c    |  179 +++---
>>  drivers/net/bonding/bond_netlink.c |   87 ++-
>>  drivers/net/bonding/bond_options.c | 1086 +++++++++++++++++++++++++++---------
>>  drivers/net/bonding/bond_options.h |  170 ++++++
>>  drivers/net/bonding/bond_procfs.c  |   25 +-
>>  drivers/net/bonding/bond_sysfs.c   |  519 +++--------------
>>  drivers/net/bonding/bonding.h      |   29 +-
>>  7 files changed, 1214 insertions(+), 881 deletions(-)
>>  create mode 100644 drivers/net/bonding/bond_options.h
>>
> 
> 

^ permalink raw reply

* Re: [PATCH net-next 00/25] bonding: introduce new option API
From: Dan Williams @ 2014-01-21 15:59 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: netdev, Andy Gospodarek, Jay Vosburgh, Veaceslav Falico,
	Scott Feldman, David S. Miller
In-Reply-To: <1390316114-17815-1-git-send-email-nikolay@redhat.com>

On Tue, 2014-01-21 at 15:54 +0100, Nikolay Aleksandrov wrote:
> Hi,
> This patchset's goal is to introduce a new option API which should be used
> to properly describe the bonding options with their mode dependcies and
> requirements. With this patchset applied we get centralized option
> manipulation, automatic RTNL acquire per option setting, automatic option
> range checking, mode dependcy checking and other various flags which are
> described in detail in patch 01's commit message and comments.
> Also the parameter passing is changed to use a specialized structure which
> is initialized to a value depending on the needs.
> The main exported functions are:
>  __bond_opt_set() - set an option (RTNL should be acquired prior)
>  bond_opt_init(val|str) - init a bond_opt_value struct for value or string
>                           parameter passing
>  bond_opt_tryset_rtnl() - function which tries to acquire rtnl, mainly used
>                           for sysfs
>  bond_opt_parse - used to parse or check for valid values
>  bond_opt_get - retrieve a pointer to bond_option struct for some option
>  bond_opt_get_val - retrieve a pointer to a bond_opt_value struct for
>                     some value
> 
> The same functions are used to set an option via sysfs and netlink, just
> the parameter that's passed is usually initialized in a different way.
> The converted options have multiple style fixes, there're some longer
> lines but they looked either ugly or were strings/pr_warnings, if you
> think some line would be better broken just let me know :-) there're
> also a few sscanf false-positive warnings.
> I decided to keep the "unsuppmodes" way of mode dep checking since it's
> straight forward, if we make a more general way for checking dependencies
> it'll be easy to change it.
> 
> Future plans for this work include:
>  - Automatic sysfs generation from the bond_opts[].
>  - Use of the API in bond_check_params() and thus cleaning it up (this has
>    actually started, I'll take care of the rest in a separate patch)
>  - Clean up all option-unrelated files of option definitions and functions
> 
> I've tried to leave as much documentation as possible, if there's anything
> unclear please let me know. One more thing, I haven't moved all
> option-related functions from bonding.h to the new bond_options.h, this
> will be done in a separate patch, it's in my todo list.
> 
> This patchset has been tested by setting each converted option via sysfs
> and netlink to a couple of wrong values, a couple of correct values and
> some random values, also for the opts that have flags they have been
> tested as well.

Currently userspace has to encode a lot of the same logic the kernel has
for option validation, for example when creating a user interface for
this stuff, you have to know that miimon and arp are incompatible, and
that certain options are only relevant with certain bond modes, and it's
a mess.  And this also sometimes changes when new kernel options or
capabilities are added.

So, is there any good way to describe the valid value ranges or
capabilities for userspace?  One idea is to send a package of bond
options to the kernel and see if they validate before actually applying
them, though this only works when actually configuring the interface.
An additional idea would be to somehow describe the available options
(eg, value ranges for numeric values, list-of-strings for bond modes,
etc) that the kernel supports before any bonds are even created (thus
probably through netlink, not sysfs).

Thoughts?

Dan

> Best regards,
>  Nikolay Aleksandrov
> 
> CC: Andy Gospodarek <andy@greyhouse.net>
> CC: Jay Vosburgh <fubar@us.ibm.com>
> CC: Veaceslav Falico <vfalico@redhat.com>
> CC: Scott Feldman <sfeldma@cumulusnetworks.com>
> CC: David S. Miller <davem@davemloft.net>
> 
> Nikolay Aleksandrov (25):
>   bonding: add infrastructure for an option API
>   bonding: convert mode setting to use the new option API
>   bonding: convert packets_per_slave to use the new option API
>   bonding: convert xmit_hash_policy to use the new option API
>   bonding: convert arp_validate to use the new option API
>   bonding: convert arp_all_targets to use the new option API
>   bonding: convert fail_over_mac to use the new option API
>   bonding: convert arp_interval to use the new option API
>   bonding: convert arp_ip_target to use the new option API
>   bonding: convert downdelay to use the new option API
>   bonding: convert updelay to use the new option API
>   bonding: convert lacp_rate to use the new option API
>   bonding: convert min_links to use the new option API
>   bonding: convert ad_select to use the new option API
>   bonding: convert num_peer_notif to use the new option API
>   bonding: convert miimon to use the new option API
>   bonding: convert primary to use the new option API
>   bonding: convert primary_reselect to use the new option API
>   bonding: convert use_carrier to use the new option API
>   bonding: convert active_slave to use the new option API
>   bonding: convert queue_id to use the new option API
>   bonding: convert all_slaves_active to use the new option API
>   bonding: convert resend_igmp to use the new option API
>   bonding: convert lp_interval to use the new option API
>   bonding: convert slaves to use the new option API
> 
>  drivers/net/bonding/bond_main.c    |  179 +++---
>  drivers/net/bonding/bond_netlink.c |   87 ++-
>  drivers/net/bonding/bond_options.c | 1086 +++++++++++++++++++++++++++---------
>  drivers/net/bonding/bond_options.h |  170 ++++++
>  drivers/net/bonding/bond_procfs.c  |   25 +-
>  drivers/net/bonding/bond_sysfs.c   |  519 +++--------------
>  drivers/net/bonding/bonding.h      |   29 +-
>  7 files changed, 1214 insertions(+), 881 deletions(-)
>  create mode 100644 drivers/net/bonding/bond_options.h
> 

^ permalink raw reply

* Re: [PATCH] SUNRPC: Allow one callback request to be received from two sk_buff
From: Trond Myklebust @ 2014-01-21 15:35 UTC (permalink / raw)
  To: shaobingqing
  Cc: Dr Fields James Bruce, Miller David S., linuxnfs,
	netdev-u79uwXL29TY76Z2rM5mHXA, Linux Kernel Mailing List
In-Reply-To: <CALrKORrZ3Kcuqc1RajQKkZcot0yiswh4VR_WuXHqfRTjn9oGQQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>


On Jan 21, 2014, at 3:08, shaobingqing <shaobingqing-Gb3srWounXyPt1CcHtbs0g@public.gmane.org> wrote:

> 2014/1/21 Trond Myklebust <trond.myklebust-7I+n7zu2hftEKMMhf/gKZA@public.gmane.org>:
>> On Mon, 2014-01-20 at 14:59 +0800, shaobingqing wrote:
>>> In current code, there only one struct rpc_rqst is prealloced. If one
>>> callback request is received from two sk_buff, the xprt_alloc_bc_request
>>> would be execute two times with the same transport->xid. The first time
>>> xprt_alloc_bc_request will alloc one struct rpc_rqst and the TCP_RCV_COPY_DATA
>>> bit of transport->tcp_flags will not be cleared. The second time
>>> xprt_alloc_bc_request could not alloc struct rpc_rqst any more and NULL
>>> pointer will be returned, then xprt_force_disconnect occur. I think one
>>> callback request can be allowed to be received from two sk_buff.
>>> 
>>> Signed-off-by: shaobingqing <shaobingqing-Gb3srWounXyPt1CcHtbs0g@public.gmane.org>
>>> ---
>>> net/sunrpc/xprtsock.c |   11 +++++++++--
>>> 1 files changed, 9 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
>>> index ee03d35..606950d 100644
>>> --- a/net/sunrpc/xprtsock.c
>>> +++ b/net/sunrpc/xprtsock.c
>>> @@ -1271,8 +1271,13 @@ static inline int xs_tcp_read_callback(struct rpc_xprt *xprt,
>>>      struct sock_xprt *transport =
>>>                              container_of(xprt, struct sock_xprt, xprt);
>>>      struct rpc_rqst *req;
>>> +     static struct rpc_rqst *req_partial;
>>> +
>>> +     if (req_partial == NULL)
>>> +             req = xprt_alloc_bc_request(xprt);
>>> +     else if (req_partial->rq_xid == transport->tcp_xid)
>>> +             req = req_partial;
>> 
>> What happens here if req_partial->rq_xid != transport->tcp_xid? AFAICS,
>> req will be undefined. Either way, you cannot use a static variable for
>> storage here: that isn't re-entrant.
> 
> Because metadata sever only have one slot for backchannel request,
> req_partial->rq_xid == transport->tcp_xid always happens, if the callback
> request just being splited in two sk_buffs. But req_partial->rq_xid !=
> transport->tcp_xid may also happens in some special cases, such as
> retransmission occurs?

If the server retransmits, then it is broken. The NFSv4.1 protocol does not allow it to retransmit unless the connection breaks.

> If one callback request is splited in two sk_buffs, xs_tcp_read_callback
> will be execute two times. The req_partial should be a static variable,
> because  the second execution of xs_tcp_read_callback should use
> the rpc_rqst allocated for the first execution, which saves information
> copies from the first sk_buff.

No! This is a multi-threaded/process environment which can support multiple connection. It is a bug to use a static variable.

--
Trond Myklebust
Linux NFS client maintainer

--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH RFC 4/6] net: rfkill: gpio: add device tree support
From: Mika Westerberg @ 2014-01-21 15:25 UTC (permalink / raw)
  To: Alexandre Courbot
  Cc: Arnd Bergmann, linux-arm-kernel@lists.infradead.org,
	Linus Walleij, devicetree@vger.kernel.org, Heikki Krogerus,
	netdev, linux-wireless, linux-sunxi, linux-kernel, Maxime Ripard,
	Chen-Yu Tsai, Johannes Berg, David S. Miller,
	linux-gpio@vger.kernel.org
In-Reply-To: <CAAVeFuKFfAvSPYLmvWV5jjT-peZFJ8sJ2bbh4F=JAYoWLhjZpA@mail.gmail.com>

On Tue, Jan 21, 2014 at 11:53:13PM +0900, Alexandre Courbot wrote:
> > I think a unified kernel API makes more sense for some subsystems than
> > others, and it depends a bit on the rate of adoption of APCI for drivers
> > that already have a DT binding (or vice versa, if that happens).
> >
> > GPIO might actually be in the first category since it's commonly used
> > for off-chip components that will get shared across ARM and x86 (as
> > well as everything else), while a common kernel API would be less
> > important for things that are internal to an SoC where Intel is the
> > only company needing ACPI support.
> 
> I am afraid I don't have a good enough view of the ACPI landscape to
> understand how often drivers might be reused on both ACPI and DT. But
> I suppose nothing speaks against that, technically speaking. Maybe
> Mika would have comments to make here?

Well, we try to reuse existing drivers whenever possible. As an example
Intel LPSS devices (that exists on Haswell and Baytrail) are mostly
existing drivers from ARM world.

I would say that GPIO is one of such things where we would like to have an
unified interface definitely.

^ permalink raw reply

* Re: [PATCH net-next 0/7] sctp: remove some macro locking wrappers
From: Vlad Yasevich @ 2014-01-21 15:16 UTC (permalink / raw)
  To: Wang Weidong, nhorman, davem; +Cc: dborkman, netdev, linux-sctp
In-Reply-To: <1390290252-16080-1-git-send-email-wangweidong1@huawei.com>

On 01/21/2014 02:44 AM, Wang Weidong wrote:
> In sctp.h we can find some macro locking wrappers. As Neil point out that:
> 
> "Its because in the origional implementation of the sctp protocol, there was a
> user space test harness which built the kernel module for userspace execution to
> cary our some unit testing on the code.  It did so by redefining some of those
> locking macros to user space friendly code.  IIRC we haven't use those unit
> tests in years, and so should be removing them, not adding them to other
> locations."
> 
> So I remove them.
> 
> Wang Weidong (7):
>   sctp: remove macro sctp_spin_[un]lock_irqrestore
>   sctp: remove macro sctp_local_bh_{disable|enable}
>   sctp: remove macro sctp_spin_[un]lock
>   sctp: remove macros sctp_write_[un]_lock
>   sctp: remove macros sctp_read_[un]lock
>   sctp: remove macros sctp_{lock|release}_sock
>   sctp: remove macros sctp_bh_[un]lock_sock
> 
>  include/net/sctp/sctp.h  | 27 ++-----------
>  net/sctp/endpointola.c   |  4 +-
>  net/sctp/input.c         | 54 +++++++++++++-------------
>  net/sctp/proc.c          | 12 +++---
>  net/sctp/protocol.c      |  4 +-
>  net/sctp/sm_sideeffect.c | 16 ++++----
>  net/sctp/socket.c        | 98 ++++++++++++++++++++++++------------------------
>  7 files changed, 98 insertions(+), 117 deletions(-)
> 

I like this series just the way it is.  Very simple to review.

Series
Acked-by: Vlad Yasevich <vyasevich@gmail.com>

Thanks
-vlad

^ permalink raw reply

* [PATCH net-next 25/25] bonding: convert slaves to use the new option API
From: Nikolay Aleksandrov @ 2014-01-21 14:55 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov
In-Reply-To: <1390316114-17815-1-git-send-email-nikolay@redhat.com>

This patch adds the necessary changes so slaves would use
the new bonding option API. Also move the option to its own set function
in bond_options.c and fix some style errors.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_options.c | 53 ++++++++++++++++++++++++++++++++++++++
 drivers/net/bonding/bond_options.h |  2 ++
 drivers/net/bonding/bond_sysfs.c   | 51 +++---------------------------------
 3 files changed, 59 insertions(+), 47 deletions(-)

diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index b9e9bbe..c241601 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -301,6 +301,13 @@ static struct bond_option bond_opts[] = {
 		.values = bond_lp_interval_tbl,
 		.set = bond_option_lp_interval_set
 	},
+	[BOND_OPT_SLAVES] = {
+		.id = BOND_OPT_SLAVES,
+		.name = "slaves",
+		.desc = "Slave membership management",
+		.flags = BOND_OPTFLAG_RAWVAL,
+		.set = bond_option_slaves_set
+	},
 	{ }
 };
 
@@ -1215,3 +1222,49 @@ err_no_cmd:
 	goto out;
 
 }
+
+int bond_option_slaves_set(struct bonding *bond, struct bond_opt_value *newval)
+{
+	char command[IFNAMSIZ + 1] = { 0, };
+	struct net_device *dev;
+	char *ifname;
+	int ret;
+
+	sscanf(newval->string, "%16s", command); /* IFNAMSIZ*/
+	ifname = command + 1;
+	if ((strlen(command) <= 1) ||
+	    !dev_valid_name(ifname))
+		goto err_no_cmd;
+
+	dev = __dev_get_by_name(dev_net(bond->dev), ifname);
+	if (!dev) {
+		pr_info("%s: Interface %s does not exist!\n",
+			bond->dev->name, ifname);
+		ret = -ENODEV;
+		goto out;
+	}
+
+	switch (command[0]) {
+	case '+':
+		pr_info("%s: Adding slave %s.\n", bond->dev->name, dev->name);
+		ret = bond_enslave(bond->dev, dev);
+		break;
+
+	case '-':
+		pr_info("%s: Removing slave %s.\n", bond->dev->name, dev->name);
+		ret = bond_release(bond->dev, dev);
+		break;
+
+	default:
+		goto err_no_cmd;
+	}
+
+out:
+	return ret;
+
+err_no_cmd:
+	pr_err("no command found in slaves file for bond %s. Use +ifname or -ifname.\n",
+	       bond->dev->name);
+	ret = -EPERM;
+	goto out;
+}
diff --git a/drivers/net/bonding/bond_options.h b/drivers/net/bonding/bond_options.h
index eb3a773..433d37f 100644
--- a/drivers/net/bonding/bond_options.h
+++ b/drivers/net/bonding/bond_options.h
@@ -61,6 +61,7 @@ enum {
 	BOND_OPT_ALL_SLAVES_ACTIVE,
 	BOND_OPT_RESEND_IGMP,
 	BOND_OPT_LP_INTERVAL,
+	BOND_OPT_SLAVES,
 	BOND_OPT_LAST
 };
 
@@ -165,4 +166,5 @@ int bond_option_resend_igmp_set(struct bonding *bond,
 				struct bond_opt_value *newval);
 int bond_option_lp_interval_set(struct bonding *bond,
 				struct bond_opt_value *newval);
+int bond_option_slaves_set(struct bonding *bond, struct bond_opt_value *newval);
 #endif /* _BOND_OPTIONS_H */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 8667e55..ceacb12 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -201,58 +201,15 @@ static ssize_t bonding_store_slaves(struct device *d,
 				    struct device_attribute *attr,
 				    const char *buffer, size_t count)
 {
-	char command[IFNAMSIZ + 1] = { 0, };
-	char *ifname;
-	int res, ret = count;
-	struct net_device *dev;
 	struct bonding *bond = to_bond(d);
+	int ret;
 
-	if (!rtnl_trylock())
-		return restart_syscall();
-
-	sscanf(buffer, "%16s", command); /* IFNAMSIZ*/
-	ifname = command + 1;
-	if ((strlen(command) <= 1) ||
-	    !dev_valid_name(ifname))
-		goto err_no_cmd;
-
-	dev = __dev_get_by_name(dev_net(bond->dev), ifname);
-	if (!dev) {
-		pr_info("%s: Interface %s does not exist!\n",
-			bond->dev->name, ifname);
-		ret = -ENODEV;
-		goto out;
-	}
-
-	switch (command[0]) {
-	case '+':
-		pr_info("%s: Adding slave %s.\n", bond->dev->name, dev->name);
-		res = bond_enslave(bond->dev, dev);
-		break;
-
-	case '-':
-		pr_info("%s: Removing slave %s.\n", bond->dev->name, dev->name);
-		res = bond_release(bond->dev, dev);
-		break;
-
-	default:
-		goto err_no_cmd;
-	}
-
-	if (res)
-		ret = res;
-	goto out;
-
-err_no_cmd:
-	pr_err("no command found in slaves file for bond %s. Use +ifname or -ifname.\n",
-	       bond->dev->name);
-	ret = -EPERM;
+	ret = bond_opt_tryset_rtnl(bond, BOND_OPT_SLAVES, (char *)buffer);
+	if (!ret)
+		ret = count;
 
-out:
-	rtnl_unlock();
 	return ret;
 }
-
 static DEVICE_ATTR(slaves, S_IRUGO | S_IWUSR, bonding_show_slaves,
 		   bonding_store_slaves);
 
-- 
1.8.4.2

^ permalink raw reply related

* [PATCH net-next 22/25] bonding: convert all_slaves_active to use the new option API
From: Nikolay Aleksandrov @ 2014-01-21 14:55 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov
In-Reply-To: <1390316114-17815-1-git-send-email-nikolay@redhat.com>

This patch adds the necessary changes so all_slaves_active would use
the new bonding option API.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_netlink.c |  4 ++--
 drivers/net/bonding/bond_options.c | 29 +++++++++++++++++------------
 drivers/net/bonding/bond_options.h |  3 +++
 drivers/net/bonding/bond_sysfs.c   | 15 +++------------
 drivers/net/bonding/bonding.h      |  2 --
 5 files changed, 25 insertions(+), 28 deletions(-)

diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index bd26d1c..d0f22ff 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -283,8 +283,8 @@ static int bond_changelink(struct net_device *bond_dev,
 		int all_slaves_active =
 			nla_get_u8(data[IFLA_BOND_ALL_SLAVES_ACTIVE]);
 
-		err = bond_option_all_slaves_active_set(bond,
-							all_slaves_active);
+		bond_opt_initval(&newval, all_slaves_active);
+		err = __bond_opt_set(bond, BOND_OPT_ALL_SLAVES_ACTIVE, &newval);
 		if (err)
 			return err;
 	}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 3fc2730..8309d51 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -106,6 +106,12 @@ static struct bond_opt_value bond_use_carrier_tbl[] = {
 	{ NULL,  -1, 0}
 };
 
+static struct bond_opt_value bond_all_slaves_active_tbl[] = {
+	{ "off", 0,  BOND_VALFLAG_DEFAULT},
+	{ "on",  1,  0},
+	{ NULL,  -1, 0}
+};
+
 static struct bond_option bond_opts[] = {
 	[BOND_OPT_MODE] = {
 		.id = BOND_OPT_MODE,
@@ -262,6 +268,13 @@ static struct bond_option bond_opts[] = {
 		.flags = BOND_OPTFLAG_RAWVAL,
 		.set = bond_option_queue_id_set
 	},
+	[BOND_OPT_ALL_SLAVES_ACTIVE] = {
+		.id = BOND_OPT_ALL_SLAVES_ACTIVE,
+		.name = "all_slaves_active",
+		.desc = "Keep all frames received on an interface by setting active flag for all slaves",
+		.values = bond_all_slaves_active_tbl,
+		.set = bond_option_all_slaves_active_set
+	},
 	{ }
 };
 
@@ -1050,25 +1063,17 @@ int bond_option_num_peer_notif_set(struct bonding *bond,
 }
 
 int bond_option_all_slaves_active_set(struct bonding *bond,
-				      int all_slaves_active)
+				      struct bond_opt_value *newval)
 {
 	struct list_head *iter;
 	struct slave *slave;
 
-	if (all_slaves_active == bond->params.all_slaves_active)
+	if (newval->value == bond->params.all_slaves_active)
 		return 0;
-
-	if ((all_slaves_active == 0) || (all_slaves_active == 1)) {
-		bond->params.all_slaves_active = all_slaves_active;
-	} else {
-		pr_info("%s: Ignoring invalid all_slaves_active value %d.\n",
-			bond->dev->name, all_slaves_active);
-		return -EINVAL;
-	}
-
+	bond->params.all_slaves_active = newval->value;
 	bond_for_each_slave(bond, slave, iter) {
 		if (!bond_is_active_slave(slave)) {
-			if (all_slaves_active)
+			if (newval->value)
 				slave->inactive = 0;
 			else
 				slave->inactive = 1;
diff --git a/drivers/net/bonding/bond_options.h b/drivers/net/bonding/bond_options.h
index 396d504..09ee8c8 100644
--- a/drivers/net/bonding/bond_options.h
+++ b/drivers/net/bonding/bond_options.h
@@ -58,6 +58,7 @@ enum {
 	BOND_OPT_USE_CARRIER,
 	BOND_OPT_ACTIVE_SLAVE,
 	BOND_OPT_QUEUE_ID,
+	BOND_OPT_ALL_SLAVES_ACTIVE,
 	BOND_OPT_LAST
 };
 
@@ -156,4 +157,6 @@ int bond_option_active_slave_set(struct bonding *bond,
 				 struct bond_opt_value *newval);
 int bond_option_queue_id_set(struct bonding *bond,
 			     struct bond_opt_value *newval);
+int bond_option_all_slaves_active_set(struct bonding *bond,
+				      struct bond_opt_value *newval);
 #endif /* _BOND_OPTIONS_H */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 2da37bd..a403345 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1016,22 +1016,13 @@ static ssize_t bonding_store_slaves_active(struct device *d,
 					   const char *buf, size_t count)
 {
 	struct bonding *bond = to_bond(d);
-	int new_value, ret;
-
-	if (sscanf(buf, "%d", &new_value) != 1) {
-		pr_err("%s: no all_slaves_active value specified.\n",
-		       bond->dev->name);
-		return -EINVAL;
-	}
-
-	if (!rtnl_trylock())
-		return restart_syscall();
+	int ret;
 
-	ret = bond_option_all_slaves_active_set(bond, new_value);
+	ret = bond_opt_tryset_rtnl(bond, BOND_OPT_ALL_SLAVES_ACTIVE,
+				   (char *)buf);
 	if (!ret)
 		ret = count;
 
-	rtnl_unlock();
 	return ret;
 }
 static DEVICE_ATTR(all_slaves_active, S_IRUGO | S_IWUSR,
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 014605d..c021d54 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -452,8 +452,6 @@ void bond_netlink_fini(void);
 int bond_option_arp_ip_target_add(struct bonding *bond, __be32 target);
 int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target);
 int bond_option_resend_igmp_set(struct bonding *bond, int resend_igmp);
-int bond_option_all_slaves_active_set(struct bonding *bond,
-				      int all_slaves_active);
 int bond_option_lp_interval_set(struct bonding *bond, int min_links);
 struct net_device *bond_option_active_slave_get_rcu(struct bonding *bond);
 struct net_device *bond_option_active_slave_get(struct bonding *bond);
-- 
1.8.4.2

^ permalink raw reply related

* [PATCH net-next 24/25] bonding: convert lp_interval to use the new option API
From: Nikolay Aleksandrov @ 2014-01-21 14:55 UTC (permalink / raw)
  To: netdev; +Cc: Nikolay Aleksandrov
In-Reply-To: <1390316114-17815-1-git-send-email-nikolay@redhat.com>

This patch adds the necessary changes so lp_interval would use
the new bonding option API.

Signed-off-by: Nikolay Aleksandrov <nikolay@redhat.com>
---
 drivers/net/bonding/bond_netlink.c |  3 ++-
 drivers/net/bonding/bond_options.c | 23 +++++++++++++++--------
 drivers/net/bonding/bond_options.h |  3 +++
 drivers/net/bonding/bond_sysfs.c   | 14 ++------------
 drivers/net/bonding/bonding.h      |  1 -
 5 files changed, 22 insertions(+), 22 deletions(-)

diff --git a/drivers/net/bonding/bond_netlink.c b/drivers/net/bonding/bond_netlink.c
index e183c66..50fbe3f 100644
--- a/drivers/net/bonding/bond_netlink.c
+++ b/drivers/net/bonding/bond_netlink.c
@@ -302,7 +302,8 @@ static int bond_changelink(struct net_device *bond_dev,
 		int lp_interval =
 			nla_get_u32(data[IFLA_BOND_LP_INTERVAL]);
 
-		err = bond_option_lp_interval_set(bond, lp_interval);
+		bond_opt_initval(&newval, lp_interval);
+		err = __bond_opt_set(bond, BOND_OPT_LP_INTERVAL, &newval);
 		if (err)
 			return err;
 	}
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 17fbaaf..b9e9bbe 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -119,6 +119,11 @@ static struct bond_opt_value bond_resend_igmp_tbl[] = {
 	{ NULL,      -1,  0}
 };
 
+static struct bond_opt_value bond_lp_interval_tbl[] = {
+	{ "minval",  1,       BOND_VALFLAG_MIN | BOND_VALFLAG_DEFAULT},
+	{ "maxval",  INT_MAX, BOND_VALFLAG_MAX},
+};
+
 static struct bond_option bond_opts[] = {
 	[BOND_OPT_MODE] = {
 		.id = BOND_OPT_MODE,
@@ -289,6 +294,13 @@ static struct bond_option bond_opts[] = {
 		.values = bond_resend_igmp_tbl,
 		.set = bond_option_resend_igmp_set
 	},
+	[BOND_OPT_LP_INTERVAL] = {
+		.id = BOND_OPT_LP_INTERVAL,
+		.name = "lp_interval",
+		.desc = "The number of seconds between instances where the bonding driver sends learning packets to each slave's peer switch",
+		.values = bond_lp_interval_tbl,
+		.set = bond_option_lp_interval_set
+	},
 	{ }
 };
 
@@ -1102,15 +1114,10 @@ int bond_option_min_links_set(struct bonding *bond,
 	return 0;
 }
 
-int bond_option_lp_interval_set(struct bonding *bond, int lp_interval)
+int bond_option_lp_interval_set(struct bonding *bond,
+				struct bond_opt_value *newval)
 {
-	if (lp_interval <= 0) {
-		pr_err("%s: lp_interval must be between 1 and %d\n",
-		       bond->dev->name, INT_MAX);
-		return -EINVAL;
-	}
-
-	bond->params.lp_interval = lp_interval;
+	bond->params.lp_interval = newval->value;
 
 	return 0;
 }
diff --git a/drivers/net/bonding/bond_options.h b/drivers/net/bonding/bond_options.h
index f0c2cbb..eb3a773 100644
--- a/drivers/net/bonding/bond_options.h
+++ b/drivers/net/bonding/bond_options.h
@@ -60,6 +60,7 @@ enum {
 	BOND_OPT_QUEUE_ID,
 	BOND_OPT_ALL_SLAVES_ACTIVE,
 	BOND_OPT_RESEND_IGMP,
+	BOND_OPT_LP_INTERVAL,
 	BOND_OPT_LAST
 };
 
@@ -162,4 +163,6 @@ int bond_option_all_slaves_active_set(struct bonding *bond,
 				      struct bond_opt_value *newval);
 int bond_option_resend_igmp_set(struct bonding *bond,
 				struct bond_opt_value *newval);
+int bond_option_lp_interval_set(struct bonding *bond,
+				struct bond_opt_value *newval);
 #endif /* _BOND_OPTIONS_H */
diff --git a/drivers/net/bonding/bond_sysfs.c b/drivers/net/bonding/bond_sysfs.c
index 4fad7cb..8667e55 100644
--- a/drivers/net/bonding/bond_sysfs.c
+++ b/drivers/net/bonding/bond_sysfs.c
@@ -1071,22 +1071,12 @@ static ssize_t bonding_store_lp_interval(struct device *d,
 					 const char *buf, size_t count)
 {
 	struct bonding *bond = to_bond(d);
-	int new_value, ret;
-
-	if (sscanf(buf, "%d", &new_value) != 1) {
-		pr_err("%s: no lp interval value specified.\n",
-			bond->dev->name);
-		return -EINVAL;
-	}
-
-	if (!rtnl_trylock())
-		return restart_syscall();
+	int ret;
 
-	ret = bond_option_lp_interval_set(bond, new_value);
+	ret = bond_opt_tryset_rtnl(bond, BOND_OPT_LP_INTERVAL, (char *)buf);
 	if (!ret)
 		ret = count;
 
-	rtnl_unlock();
 	return ret;
 }
 
diff --git a/drivers/net/bonding/bonding.h b/drivers/net/bonding/bonding.h
index 00b5e91..69fc177 100644
--- a/drivers/net/bonding/bonding.h
+++ b/drivers/net/bonding/bonding.h
@@ -451,7 +451,6 @@ int bond_netlink_init(void);
 void bond_netlink_fini(void);
 int bond_option_arp_ip_target_add(struct bonding *bond, __be32 target);
 int bond_option_arp_ip_target_rem(struct bonding *bond, __be32 target);
-int bond_option_lp_interval_set(struct bonding *bond, int min_links);
 struct net_device *bond_option_active_slave_get_rcu(struct bonding *bond);
 struct net_device *bond_option_active_slave_get(struct bonding *bond);
 const char *bond_slave_link_status(s8 link);
-- 
1.8.4.2

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox