Netdev List
* Re: [PATCH v2] xsk: share the mmap_sem for page pinning
From: Daniel Borkmann @ 2019-02-11 19:53 UTC (permalink / raw)
  To: akpm, linux-mm, linux-kernel, David S . Miller, Bjorn Topel,
	Magnus Karlsson, netdev, Davidlohr Bueso
In-Reply-To: <20190211161529.uskq5ca7y3j5522i@linux-r8p5>

On 02/11/2019 05:15 PM, Davidlohr Bueso wrote:
> Holding mmap_sem exclusively for a gup() is overkill.
> Let's share the lock and replace the gup call with gup_longterm(),
> as it is better suited for the lifetime of the pinning.
> 
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Bjorn Topel <bjorn.topel@intel.com>
> Cc: Magnus Karlsson <magnus.karlsson@intel.com>
> CC: netdev@vger.kernel.org
> Signed-off-by: Davidlohr Bueso <dbueso@suse.de>

Applied, thanks!


* Re: [PATCH mlx5-next 2/2] net/mlx5: Factor out HCA capabilities functions
From: Leon Romanovsky @ 2019-02-11 20:02 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, RDMA mailing list, Moni Shoua, Saeed Mahameed,
	linux-netdev
In-Reply-To: <20190211195050.GL24706@mellanox.com>


On Mon, Feb 11, 2019 at 07:50:55PM +0000, Jason Gunthorpe wrote:
> On Mon, Feb 11, 2019 at 01:56:08PM +0200, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > Combine all HCA capabilities setters under one function
> > and compile out the ODP related function in case kernel
> > was compiled without ODP support.
> >
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  .../net/ethernet/mellanox/mlx5/core/main.c    | 47 +++++++++++++------
> >  1 file changed, 33 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > index 6d45518edbdc..d7145ab6105d 100644
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > @@ -459,6 +459,7 @@ static int handle_hca_cap_atomic(struct mlx5_core_dev *dev)
> >  	return err;
> >  }
> >
> > +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> >  static int handle_hca_cap_odp(struct mlx5_core_dev *dev)
> >  {
> >  	void *set_hca_cap;
> > @@ -502,6 +503,7 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev)
> >  	kfree(set_ctx);
> >  	return err;
> >  }
> > +#endif
> >
> >  static int handle_hca_cap(struct mlx5_core_dev *dev)
> >  {
> > @@ -576,6 +578,35 @@ static int handle_hca_cap(struct mlx5_core_dev *dev)
> >  	return err;
> >  }
> >
> > +static int set_hca_cap(struct mlx5_core_dev *dev)
> > +{
> > +	struct pci_dev *pdev = dev->pdev;
> > +	int err;
> > +
> > +	err = handle_hca_cap(dev);
> > +	if (err) {
> > +		dev_err(&pdev->dev, "handle_hca_cap failed\n");
> > +		goto out;
> > +	}
> > +
> > +	err = handle_hca_cap_atomic(dev);
> > +	if (err) {
> > +		dev_err(&pdev->dev, "handle_hca_cap_atomic failed\n");
> > +		goto out;
> > +	}
> > +
> > +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > +	err = handle_hca_cap_odp(dev);
> > +	if (err) {
> > +		dev_err(&pdev->dev, "handle_hca_cap_odp failed\n");
> > +		goto out;
> > +	}
> > +#endif
>
> Adding
>   if (IS_ENABLED..)
>     return 0;
>
> To the top of handle_hca_cap_odp is a lot better.

Saeed commented that he prefers the code to be compiled out when the
config option is not set. With your suggestion, the code still gets
compiled, and it is dropped only if the compiler optimizes it away.
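
For readers comparing the two approaches discussed here, a minimal
userspace sketch follows. It is not the mlx5 code: DEMO_CONFIG_ODP,
DEMO_IS_ENABLED() and the demo_* functions are hypothetical stand-ins,
and the macro is a simplification that assumes the symbol is always
defined as 0 or 1 (the kernel's IS_ENABLED() also copes with undefined
symbols).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Kconfig symbol; 0 means "not set". */
#define DEMO_CONFIG_ODP 0

/* Simplified stand-in for IS_ENABLED(); assumes the symbol is 0 or 1. */
#define DEMO_IS_ENABLED(option) (option)

/*
 * Variant A: the body is always compiled and type-checked. The early
 * return turns the rest into dead code that the compiler may discard.
 */
static int demo_cap_odp_guarded(int *calls)
{
	if (!DEMO_IS_ENABLED(DEMO_CONFIG_ODP))
		return 0;

	(*calls)++;
	return 0;
}

/*
 * Variant B: the preprocessor removes the function and its call site,
 * so nothing reaches the compiler when the option is off.
 */
#if DEMO_CONFIG_ODP
static int demo_cap_odp_compiled_out(int *calls)
{
	(*calls)++;
	return 0;
}
#endif

/* Calls both variants and counts how many ODP bodies actually ran. */
static int demo_set_caps(int *calls)
{
	int err;

	assert(calls != NULL);

	err = demo_cap_odp_guarded(calls);
	if (err)
		return err;

#if DEMO_CONFIG_ODP
	err = demo_cap_odp_compiled_out(calls);
	if (err)
		return err;
#endif
	return 0;
}
```

With the option off, both variants do no work at run time. The
difference is only in what the compiler sees: Variant A keeps build
coverage of the disabled code, Variant B guarantees it is absent from
the object file regardless of optimization level.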

Thanks

>
> Jason



* Re: [PATCH 0/9] perf annotation of BPF programs
From: Song Liu @ 2019-02-11 20:10 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	ast@kernel.org, daniel@iogearbox.net, Kernel Team,
	peterz@infradead.org, Jiri Olsa, Namhyung Kim, Andi Kleen,
	Stephane Eranian, Arnaldo Carvalho de Melo
In-Reply-To: <20190211185400.GA2084@redhat.com>



> On Feb 11, 2019, at 10:54 AM, Arnaldo Carvalho de Melo <acme@redhat.com> wrote:
> 
> Em Fri, Feb 08, 2019 at 05:16:56PM -0800, Song Liu escreveu:
>> This series enables annotation of BPF programs in perf.
>> 
>> perf tool gathers information via sys_bpf and (optionally) stores them in
>> perf.data as headers.
> 
> Jiri, Stephane, this is the patchkit I mentioned in the context of doing
> away with perf.data headers and instead package everything as userspace
> PERF_RECORD_ metadata events.
> 
> Song, please add Jiri and Namhyung in future perf patchkits, they are
> listed as perf tools reviewers in MAINTAINERS and Jiri also is working
> on something directly related.
> 
> Thanks,
> 
> - Arnaldo

Thanks Arnaldo! I will keep Jiri and Namhyung in the loop. 

Song




>> Patch 1/9 fixes a minor issue in kernel;
>> Patch 2/9 to 4/9 introduce new helper functions and use them in perf and
>>     bpftool;
>> Patch 5/9 and 6/9 save bpf program information in perf_env;
>> Patch 7/9 adds --bpf-event options to perf-top;
>> Patch 8/9 enables annotation of bpf programs based on information gathered
>>     in 5/9 and 6/9;
>> Patch 9/9 handles information about short-lived BPF programs that are
>>     loaded during perf-record or perf-top.
>> 
>> Commands tested during development are perf-top, perf-record, perf-report,
>> and perf-annotate.
>> 
>> ===================== Note on patch dependency  ========================
>> This set depends on both the bpf-next tree and tip/perf/core. The current
>> version is developed on the bpf-next tree with the following commits
>> cherry-picked from tip/perf/core:
>> 
>> (from 1/10 to 10/10)
>> commit 76193a94522f ("perf, bpf: Introduce PERF_RECORD_KSYMBOL")
>> commit d764ac646491 ("tools headers uapi: Sync tools/include/uapi/linux/perf_event.h")
>> commit 6ee52e2a3fe4 ("perf, bpf: Introduce PERF_RECORD_BPF_EVENT")
>> commit df063c83aa2c ("tools headers uapi: Sync tools/include/uapi/linux/perf_event.h")
>> commit 9aa0bfa370b2 ("perf tools: Handle PERF_RECORD_KSYMBOL")
>> commit 45178a928a4b ("perf tools: Handle PERF_RECORD_BPF_EVENT")
>> commit 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
>> commit a40b95bcd30c ("perf top: Synthesize BPF events for pre-existing loaded BPF programs")
>> commit 6934058d9fb6 ("bpf: Add module name [bpf] to ksymbols for bpf programs")
>> commit 811184fb6977 ("perf bpf: Fix synthesized PERF_RECORD_KSYMBOL/BPF_EVENT")
>> ========================================================================
>> 
>> Song Liu (9):
>>  perf, bpf: consider events with attr.bpf_event as side-band events
>>  bpf: libbpf: introduce bpf_program__get_prog_info_linear()
>>  bpf: bpftool: use bpf_program__get_prog_info_linear() in
>>    prog.c:do_dump()
>>  perf, bpf: synthesize bpf events with
>>    bpf_program__get_prog_info_linear()
>>  perf, bpf: save bpf_prog_info in a rbtree in perf_env
>>  perf, bpf: save btf in a rbtree in perf_env
>>  perf-top: add option --bpf-event
>>  perf, bpf: enable annotation of bpf program
>>  perf, bpf: save information about short living bpf programs
>> 
>> kernel/events/core.c        |   3 +-
>> tools/bpf/bpftool/prog.c    | 266 ++++++---------------------
>> tools/lib/bpf/libbpf.c      | 251 ++++++++++++++++++++++++++
>> tools/lib/bpf/libbpf.h      |  63 +++++++
>> tools/lib/bpf/libbpf.map    |   3 +
>> tools/perf/Makefile.config  |   2 +-
>> tools/perf/builtin-record.c |  15 +-
>> tools/perf/builtin-top.c    |  15 +-
>> tools/perf/util/annotate.c  | 149 ++++++++++++++-
>> tools/perf/util/bpf-event.c | 351 +++++++++++++++++++++++++++---------
>> tools/perf/util/bpf-event.h |  48 ++++-
>> tools/perf/util/dso.c       |   1 +
>> tools/perf/util/dso.h       |  33 ++--
>> tools/perf/util/env.c       | 148 +++++++++++++++
>> tools/perf/util/env.h       |  12 ++
>> tools/perf/util/evlist.c    |  20 ++
>> tools/perf/util/evlist.h    |   2 +
>> tools/perf/util/header.c    | 231 +++++++++++++++++++++++-
>> tools/perf/util/header.h    |   2 +
>> tools/perf/util/symbol.c    |   1 +
>> 20 files changed, 1304 insertions(+), 312 deletions(-)
>> 
>> --
>> 2.17.1



* Re: [PATCH bpf] bpf: only adjust gso_size on bytestream protocols
From: Willem de Bruijn @ 2019-02-11 20:11 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Network Development, Alexei Starovoitov,
	Peter Oskolkov, Daniel Axtens, Willem de Bruijn
In-Reply-To: <25e33f72-b04a-3229-ce8e-2320d29b8ee6@iogearbox.net>

On Mon, Feb 11, 2019 at 9:58 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hi Willem,
>
> On 02/11/2019 05:00 AM, Alexei Starovoitov wrote:
> > On Thu, Feb 07, 2019 at 02:54:16PM -0500, Willem de Bruijn wrote:
> >> From: Willem de Bruijn <willemb@google.com>
> >>
> >> bpf_skb_change_proto and bpf_skb_adjust_room change skb header length.
> >> For GSO packets they adjust gso_size to maintain the same MTU.
> >>
> >> The gso size can only be safely adjusted on bytestream protocols.
> >> Commit d02f51cbcf12 ("bpf: fix bpf_skb_adjust_net/bpf_skb_proto_xlat
> >> to deal with gso sctp skbs") excluded SKB_GSO_SCTP.
> >>
> >> Since then type SKB_GSO_UDP_L4 has been added, whose contents are one
> >> gso_size unit per datagram. Also exclude these.
> >>
> >> Move from a blacklist to a whitelist check to future proof against
> >> additional such new GSO types, e.g., for fraglist based GRO.
> >>
> >> Fixes: bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")
> >> Signed-off-by: Willem de Bruijn <willemb@google.com>
> >
> > Applied to bpf tree.
> > I agree that whitelist approach is the most appropriate.
>
> What would be needed to get UDP GSO working with nat64 work above? I don't
> really mind about SCTP, but sucks that this doesn't guarantee full support
> for TCP *and* UDP at least. :/

The easy part is shrinking headers in bpf_skb_net_shrink and
bpf_skb_proto_6_to_4. Those are safe if they adjust gso_size only if
skb_is_gso_tcp(skb).

Growing headers, whether with nat64 or in-review BPF_LWT_ENCAP_IP, is
fine if the original gso_size was chosen sufficiently below MSS to
account for the possible transformation.

Though this is not so cheap to verify here. But the same MTU concern
exists for non-GSO packets. Those are also adjusted unconditionally,
as far as I can tell. We do not need to add an MTU check solely for GSO.

For both GSO and non-GSO, for egress transformation, an admin
inserting such BPF programs can at least add an explicit route mtu to
force processes to limit the size they generate. Analogous to how tunnel
devices derive their mtu from their destination device minus encap headers.
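
To make the "adjust gso_size only if skb_is_gso_tcp(skb)" rule concrete,
here is a small userspace sketch. The struct, the flag values and the
demo_* helpers are hypothetical and only model the idea; they are not
the kernel's sk_buff or SKB_GSO_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the kernel's SKB_GSO_* constants differ. */
enum demo_gso_type {
	DEMO_GSO_TCPV4  = 1 << 0,
	DEMO_GSO_TCPV6  = 1 << 1,
	DEMO_GSO_SCTP   = 1 << 2,
	DEMO_GSO_UDP_L4 = 1 << 3,
};

/* Hypothetical, heavily reduced model of a GSO packet. */
struct demo_pkt {
	unsigned int gso_type;
	unsigned int gso_size;
};

/* Whitelist check: only bytestream (TCP) GSO types qualify. */
static bool demo_is_gso_tcp(const struct demo_pkt *pkt)
{
	return pkt->gso_type & (DEMO_GSO_TCPV4 | DEMO_GSO_TCPV6);
}

/*
 * The header shrinks by len_diff bytes. For a bytestream the segment
 * payload may grow by the same amount; for datagram-per-segment types
 * gso_size describes the datagram itself and is left untouched.
 */
static void demo_shrink_header(struct demo_pkt *pkt, unsigned int len_diff)
{
	assert(pkt != NULL);

	if (demo_is_gso_tcp(pkt))
		pkt->gso_size += len_diff;
}
```

Because the check is a whitelist, any GSO type added later falls into
the "do not touch gso_size" branch by default, which is the
future-proofing the patch description asks for.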


* Re: [net-next PATCH 1/2] mm: add dma_addr_t to struct page
From: Andrew Morton @ 2019-02-11 20:16 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: netdev, linux-mm, Toke Høiland-Jørgensen,
	Ilias Apalodimas, willy, Saeed Mahameed, mgorman, David S. Miller,
	Tariq Toukan
In-Reply-To: <154990120685.24530.15350136329514629029.stgit@firesoul>

On Mon, 11 Feb 2019 17:06:46 +0100 Jesper Dangaard Brouer <brouer@redhat.com> wrote:

> The page_pool API is using page->private to store DMA addresses.
> As pointed out by David Miller we can't use that on 32-bit architectures
> with 64-bit DMA.
> 
> This patch adds a new dma_addr_t struct to allow storing DMA addresses
> 
> ..
>
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -95,6 +95,14 @@ struct page {
>  			 */
>  			unsigned long private;
>  		};
> +		struct {	/* page_pool used by netstack */
> +			/**
> +			 * @dma_addr: Page_pool need to store DMA-addr, and
> +			 * cannot use @private, as DMA-mappings can be 64-bit
> +			 * even on 32-bit Architectures.
> +			 */

This comment is a bit awkward.  The discussion about why it doesn't use
->private is uninteresting going forward and is more material for a
changelog.

How about

			/**
			 * @dma_addr: page_pool requires a 64-bit value even on
			 * 32-bit architectures.
			 */

Otherwise,

Acked-by: Andrew Morton <akpm@linux-foundation.org>
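
The truncation problem behind this patch can be shown with a small
userspace sketch. The typedefs below are assumptions that model a 32-bit
architecture with 64-bit DMA (unsigned long as 32 bits, dma_addr_t as 64
bits); the demo_* names are hypothetical and not taken from mm_types.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t demo_ulong32;  /* models unsigned long on a 32-bit arch */
typedef uint64_t demo_dma_addr; /* models dma_addr_t with 64-bit DMA */

/* Models storing the address in page->private. */
struct demo_page_private {
	demo_ulong32 private;
};

/* Models the dedicated dma_addr field added by the patch. */
struct demo_page_dma {
	demo_dma_addr dma_addr;
};

/* The cast makes the loss explicit: the upper 32 bits are dropped. */
static void demo_store_private(struct demo_page_private *p, demo_dma_addr addr)
{
	assert(p != NULL);
	p->private = (demo_ulong32)addr;
}

/* A field of the right type keeps the full address. */
static void demo_store_dma(struct demo_page_dma *p, demo_dma_addr addr)
{
	assert(p != NULL);
	p->dma_addr = addr;
}
```

Storing an address above 4 GiB through the first helper silently yields
a different address on read-back, which is why a field typed as
dma_addr_t is needed rather than reuse of an unsigned long.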


* [PATCH 0/3] Add gup fast + longterm and use it in HFI1
From: ira.weiny @ 2019-02-11 20:16 UTC (permalink / raw)
  To: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev
  Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Andrew Morton, Kirill A. Shutemov, Dan Williams,
	Ira Weiny

From: Ira Weiny <ira.weiny@intel.com>

NOTE: This series depends on my clean up patch to remove the write parameter
from gup_fast_permitted()[1]

HFI1 uses get_user_pages_fast() due to its performance advantages.  Like RDMA,
HFI1 pages can be held for a significant time.  But get_user_pages_fast() does
not protect against mapping of FS DAX pages.

Introduce a get_user_pages_fast_longterm() which retains the performance while
also adding the FS DAX checks.  XDP has also shown interest in using this
functionality.[2]

[1] https://lkml.org/lkml/2019/2/11/237
[2] https://lkml.org/lkml/2019/2/11/1789
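
The control flow described above (a lockless fast walk that stops at
device-mapped pages, then a slower long-term path that checks for FS DAX)
can be sketched in userspace as follows. This is a simplified model, not
the code in mm/gup.c: struct demo_page, the demo_* functions, the
DEMO_FOLL_WRITE value and the -1 error value are assumptions made for
illustration, and no locking or real pinning takes place.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_FOLL_WRITE    0x01u    /* assumed value for the sketch */
#define DEMO_FOLL_LONGTERM 0x10000u /* value used for FOLL_LONGTERM in patch 2/3 */

/* Hypothetical page descriptor: only the two properties that matter here. */
struct demo_page {
	int devmap; /* device-mapped page */
	int fs_dax; /* page backed by FS DAX */
};

/* Fast walk: stops at the first devmap page when a long-term pin is requested. */
static int demo_fast_walk(const struct demo_page *pg, int n, unsigned int flags)
{
	int nr = 0;

	while (nr < n) {
		if (pg[nr].devmap && (flags & DEMO_FOLL_LONGTERM))
			break;
		nr++;
	}
	return nr;
}

/* Slow path: refuses the remaining range if any page in it is FS DAX. */
static int demo_slow_longterm(const struct demo_page *pg, int n)
{
	for (int i = 0; i < n; i++)
		if (pg[i].fs_dax)
			return -1; /* stand-in for an -errno value */
	return n;
}

/* Returns the number of pages "pinned", or a negative value if none were. */
static int demo_pin_fast_longterm(const struct demo_page *pg, int n, int write)
{
	unsigned int flags = DEMO_FOLL_LONGTERM | (write ? DEMO_FOLL_WRITE : 0);
	int nr, ret;

	assert(pg != NULL);

	nr = demo_fast_walk(pg, n, flags);
	if (nr == n)
		return nr;

	ret = demo_slow_longterm(pg + nr, n - nr);
	if (ret < 0)
		return nr > 0 ? nr : ret; /* report partial progress if any */
	return nr + ret;
}
```

A range with a device-mapped page that is not FS DAX still succeeds in
full through the fallback, while a range that reaches an FS DAX page
returns only the pages pinned before it, or an error if there were none.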

Ira Weiny (3):
  mm/gup: Change "write" parameter to flags
  mm/gup: Introduce get_user_pages_fast_longterm()
  IB/HFI1: Use new get_user_pages_fast_longterm()

 drivers/infiniband/hw/hfi1/user_pages.c |   2 +-
 include/linux/mm.h                      |   8 ++
 mm/gup.c                                | 152 ++++++++++++++++--------
 3 files changed, 114 insertions(+), 48 deletions(-)

-- 
2.20.1



* [PATCH 3/3] IB/HFI1: Use new get_user_pages_fast_longterm()
From: ira.weiny @ 2019-02-11 20:16 UTC (permalink / raw)
  To: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev
  Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Andrew Morton, Kirill A. Shutemov, Dan Williams,
	Ira Weiny
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

Use the new get_user_pages_fast_longterm() call to protect against
FS DAX pages being mapped.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 drivers/infiniband/hw/hfi1/user_pages.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index 24b592c6522e..b94ab5385a09 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -105,7 +105,7 @@ int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t np
 {
 	int ret;
 
-	ret = get_user_pages_fast(vaddr, npages, writable, pages);
+	ret = get_user_pages_fast_longterm(vaddr, npages, writable, pages);
 	if (ret < 0)
 		return ret;
 
-- 
2.20.1



* [PATCH 2/3] mm/gup: Introduce get_user_pages_fast_longterm()
From: ira.weiny @ 2019-02-11 20:16 UTC (permalink / raw)
  To: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev
  Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Andrew Morton, Kirill A. Shutemov, Dan Williams,
	Ira Weiny
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

Users of get_user_pages_fast are not protected against mapping
pages within FS DAX.  Introduce a call which protects them.

We do this by checking for DEVMAP pages during the fast walk and
falling back to the longterm gup call to check for FS DAX if needed.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 include/linux/mm.h |   8 ++++
 mm/gup.c           | 102 +++++++++++++++++++++++++++++++++++----------
 2 files changed, 88 insertions(+), 22 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 80bb6408fe73..8f831c823630 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1540,6 +1540,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
 long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
 			    unsigned int gup_flags, struct page **pages,
 			    struct vm_area_struct **vmas);
+int get_user_pages_fast_longterm(unsigned long start, int nr_pages, bool write,
+				 struct page **pages);
 #else
 static inline long get_user_pages_longterm(unsigned long start,
 		unsigned long nr_pages, unsigned int gup_flags,
@@ -1547,6 +1549,11 @@ static inline long get_user_pages_longterm(unsigned long start,
 {
 	return get_user_pages(start, nr_pages, gup_flags, pages, vmas);
 }
+static inline int get_user_pages_fast_longterm(unsigned long start, int nr_pages,
+					       bool write, struct page **pages)
+{
+	return get_user_pages_fast(start, nr_pages, write, pages);
+}
 #endif /* CONFIG_FS_DAX */
 
 int get_user_pages_fast(unsigned long start, int nr_pages, int write,
@@ -2615,6 +2622,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 #define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
 #define FOLL_COW	0x4000	/* internal GUP flag */
 #define FOLL_ANON	0x8000	/* don't do file mappings */
+#define FOLL_LONGTERM	0x10000	/* mapping is intended for a long term pin */
 
 static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags)
 {
diff --git a/mm/gup.c b/mm/gup.c
index 894ab014bd1e..f7d86a304405 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1190,6 +1190,21 @@ long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
 EXPORT_SYMBOL(get_user_pages_longterm);
 #endif /* CONFIG_FS_DAX */
 
+static long get_user_pages_longterm_unlocked(unsigned long start,
+					     unsigned long nr_pages,
+					     struct page **pages,
+					     unsigned int gup_flags)
+{
+	struct mm_struct *mm = current->mm;
+	long ret;
+
+	down_read(&mm->mmap_sem);
+	ret = get_user_pages_longterm(start, nr_pages, gup_flags, pages, NULL);
+	up_read(&mm->mmap_sem);
+
+	return ret;
+}
+
 /**
  * populate_vma_page_range() -  populate a range of pages in the vma.
  * @vma:   target vma
@@ -1417,6 +1432,9 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 			goto pte_unmap;
 
 		if (pte_devmap(pte)) {
+			if (flags & FOLL_LONGTERM)
+				goto pte_unmap;
+
 			pgmap = get_dev_pagemap(pte_pfn(pte), pgmap);
 			if (unlikely(!pgmap)) {
 				undo_dev_pagemap(nr, nr_start, pages);
@@ -1556,8 +1574,12 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
-	if (pmd_devmap(orig))
+	if (pmd_devmap(orig)) {
+		if (flags & FOLL_LONGTERM)
+			return 0;
+
 		return __gup_device_huge_pmd(orig, pmdp, addr, end, pages, nr);
+	}
 
 	refs = 0;
 	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
@@ -1837,24 +1859,9 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	return nr;
 }
 
-/**
- * get_user_pages_fast() - pin user pages in memory
- * @start:	starting user address
- * @nr_pages:	number of pages from start to pin
- * @write:	whether pages will be written to
- * @pages:	array that receives pointers to the pages pinned.
- *		Should be at least nr_pages long.
- *
- * Attempt to pin user pages in memory without taking mm->mmap_sem.
- * If not successful, it will fall back to taking the lock and
- * calling get_user_pages().
- *
- * Returns number of pages pinned. This may be fewer than the number
- * requested. If nr_pages is 0 or negative, returns 0. If no pages
- * were pinned, returns -errno.
- */
-int get_user_pages_fast(unsigned long start, int nr_pages, int write,
-			struct page **pages)
+static int __get_user_pages_fast_flags(unsigned long start, int nr_pages,
+				       unsigned int gup_flags,
+				       struct page **pages)
 {
 	unsigned long addr, len, end;
 	int nr = 0, ret = 0;
@@ -1872,7 +1879,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 
 	if (gup_fast_permitted(start, nr_pages)) {
 		local_irq_disable();
-		gup_pgd_range(addr, end, write ? FOLL_WRITE : 0, pages, &nr);
+		gup_pgd_range(addr, end, gup_flags, pages, &nr);
 		local_irq_enable();
 		ret = nr;
 	}
@@ -1882,8 +1889,14 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 		start += nr << PAGE_SHIFT;
 		pages += nr;
 
-		ret = get_user_pages_unlocked(start, nr_pages - nr, pages,
-				write ? FOLL_WRITE : 0);
+		if (gup_flags & FOLL_LONGTERM)
+			ret = get_user_pages_longterm_unlocked(start,
+							       nr_pages - nr,
+							       pages,
+							       gup_flags);
+		else
+			ret = get_user_pages_unlocked(start, nr_pages - nr,
+						      pages, gup_flags);
 
 		/* Have to be a bit careful with return values */
 		if (nr > 0) {
@@ -1897,4 +1910,49 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 	return ret;
 }
 
+/**
+ * get_user_pages_fast() - pin user pages in memory
+ * @start:	starting user address
+ * @nr_pages:	number of pages from start to pin
+ * @write:	whether pages will be written to
+ * @pages:	array that receives pointers to the pages pinned.
+ *		Should be at least nr_pages long.
+ *
+ * Attempt to pin user pages in memory without taking mm->mmap_sem.
+ * If not successful, it will fall back to taking the lock and
+ * calling get_user_pages().
+ *
+ * Returns number of pages pinned. This may be fewer than the number
+ * requested. If nr_pages is 0 or negative, returns 0. If no pages
+ * were pinned, returns -errno.
+ */
+int get_user_pages_fast(unsigned long start, int nr_pages, int write,
+			struct page **pages)
+{
+	return __get_user_pages_fast_flags(start, nr_pages,
+					   write ? FOLL_WRITE : 0,
+					   pages);
+}
+
+#ifdef CONFIG_FS_DAX
+/**
+ * get_user_pages_fast_longterm() - pin user pages in memory
+ *
+ * Exactly the same semantics as get_user_pages_fast(), except that the fast
+ * walk fails on device mapped pages (such as DAX pages), which then fall back
+ * to checking for FS DAX pages with get_user_pages_longterm().
+ */
+int get_user_pages_fast_longterm(unsigned long start, int nr_pages, bool write,
+				 struct page **pages)
+{
+	unsigned int gup_flags = FOLL_LONGTERM;
+
+	if (write)
+		gup_flags |= FOLL_WRITE;
+
+	return __get_user_pages_fast_flags(start, nr_pages, gup_flags, pages);
+}
+EXPORT_SYMBOL(get_user_pages_fast_longterm);
+#endif /* CONFIG_FS_DAX */
+
 #endif /* CONFIG_HAVE_GENERIC_GUP */
-- 
2.20.1



* [PATCH 1/3] mm/gup: Change "write" parameter to flags
From: ira.weiny @ 2019-02-11 20:16 UTC (permalink / raw)
  To: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev
  Cc: Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Andrew Morton, Kirill A. Shutemov, Dan Williams,
	Ira Weiny
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>

From: Ira Weiny <ira.weiny@intel.com>

In order to support more options in the GUP fast walk, change the
write parameter to flags throughout the call stack.

This patch does not change functionality and passes FOLL_WRITE
where write was previously used.

Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
 mm/gup.c | 52 ++++++++++++++++++++++++++--------------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index b63e88eca31b..894ab014bd1e 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1395,7 +1395,7 @@ static void undo_dev_pagemap(int *nr, int nr_start, struct page **pages)
 
 #ifdef CONFIG_ARCH_HAS_PTE_SPECIAL
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-			 int write, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr)
 {
 	struct dev_pagemap *pgmap = NULL;
 	int nr_start = *nr, ret = 0;
@@ -1413,7 +1413,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
 		if (pte_protnone(pte))
 			goto pte_unmap;
 
-		if (!pte_access_permitted(pte, write))
+		if (!pte_access_permitted(pte, flags & FOLL_WRITE))
 			goto pte_unmap;
 
 		if (pte_devmap(pte)) {
@@ -1465,7 +1465,7 @@ static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
  * useful to have gup_huge_pmd even if we can't operate on ptes.
  */
 static int gup_pte_range(pmd_t pmd, unsigned long addr, unsigned long end,
-			 int write, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr)
 {
 	return 0;
 }
@@ -1548,12 +1548,12 @@ static int __gup_device_huge_pud(pud_t pud, pud_t *pudp, unsigned long addr,
 #endif
 
 static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
-		unsigned long end, int write, struct page **pages, int *nr)
+		unsigned long end, unsigned int flags, struct page **pages, int *nr)
 {
 	struct page *head, *page;
 	int refs;
 
-	if (!pmd_access_permitted(orig, write))
+	if (!pmd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
 	if (pmd_devmap(orig))
@@ -1586,12 +1586,12 @@ static int gup_huge_pmd(pmd_t orig, pmd_t *pmdp, unsigned long addr,
 }
 
 static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
-		unsigned long end, int write, struct page **pages, int *nr)
+		unsigned long end, unsigned int flags, struct page **pages, int *nr)
 {
 	struct page *head, *page;
 	int refs;
 
-	if (!pud_access_permitted(orig, write))
+	if (!pud_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
 	if (pud_devmap(orig))
@@ -1624,13 +1624,13 @@ static int gup_huge_pud(pud_t orig, pud_t *pudp, unsigned long addr,
 }
 
 static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
-			unsigned long end, int write,
+			unsigned long end, unsigned int flags,
 			struct page **pages, int *nr)
 {
 	int refs;
 	struct page *head, *page;
 
-	if (!pgd_access_permitted(orig, write))
+	if (!pgd_access_permitted(orig, flags & FOLL_WRITE))
 		return 0;
 
 	BUILD_BUG_ON(pgd_devmap(orig));
@@ -1661,7 +1661,7 @@ static int gup_huge_pgd(pgd_t orig, pgd_t *pgdp, unsigned long addr,
 }
 
 static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
-		int write, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr)
 {
 	unsigned long next;
 	pmd_t *pmdp;
@@ -1683,7 +1683,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 			if (pmd_protnone(pmd))
 				return 0;
 
-			if (!gup_huge_pmd(pmd, pmdp, addr, next, write,
+			if (!gup_huge_pmd(pmd, pmdp, addr, next, flags,
 				pages, nr))
 				return 0;
 
@@ -1693,9 +1693,9 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 			 * pmd format and THP pmd format
 			 */
 			if (!gup_huge_pd(__hugepd(pmd_val(pmd)), addr,
-					 PMD_SHIFT, next, write, pages, nr))
+					 PMD_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pte_range(pmd, addr, next, write, pages, nr))
+		} else if (!gup_pte_range(pmd, addr, next, flags, pages, nr))
 			return 0;
 	} while (pmdp++, addr = next, addr != end);
 
@@ -1703,7 +1703,7 @@ static int gup_pmd_range(pud_t pud, unsigned long addr, unsigned long end,
 }
 
 static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
-			 int write, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr)
 {
 	unsigned long next;
 	pud_t *pudp;
@@ -1716,14 +1716,14 @@ static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
 		if (pud_none(pud))
 			return 0;
 		if (unlikely(pud_huge(pud))) {
-			if (!gup_huge_pud(pud, pudp, addr, next, write,
+			if (!gup_huge_pud(pud, pudp, addr, next, flags,
 					  pages, nr))
 				return 0;
 		} else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) {
 			if (!gup_huge_pd(__hugepd(pud_val(pud)), addr,
-					 PUD_SHIFT, next, write, pages, nr))
+					 PUD_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pmd_range(pud, addr, next, write, pages, nr))
+		} else if (!gup_pmd_range(pud, addr, next, flags, pages, nr))
 			return 0;
 	} while (pudp++, addr = next, addr != end);
 
@@ -1731,7 +1731,7 @@ static int gup_pud_range(p4d_t p4d, unsigned long addr, unsigned long end,
 }
 
 static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
-			 int write, struct page **pages, int *nr)
+			 unsigned int flags, struct page **pages, int *nr)
 {
 	unsigned long next;
 	p4d_t *p4dp;
@@ -1746,9 +1746,9 @@ static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
 		BUILD_BUG_ON(p4d_huge(p4d));
 		if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) {
 			if (!gup_huge_pd(__hugepd(p4d_val(p4d)), addr,
-					 P4D_SHIFT, next, write, pages, nr))
+					 P4D_SHIFT, next, flags, pages, nr))
 				return 0;
-		} else if (!gup_pud_range(p4d, addr, next, write, pages, nr))
+		} else if (!gup_pud_range(p4d, addr, next, flags, pages, nr))
 			return 0;
 	} while (p4dp++, addr = next, addr != end);
 
@@ -1756,7 +1756,7 @@ static int gup_p4d_range(pgd_t pgd, unsigned long addr, unsigned long end,
 }
 
 static void gup_pgd_range(unsigned long addr, unsigned long end,
-		int write, struct page **pages, int *nr)
+		unsigned int flags, struct page **pages, int *nr)
 {
 	unsigned long next;
 	pgd_t *pgdp;
@@ -1769,14 +1769,14 @@ static void gup_pgd_range(unsigned long addr, unsigned long end,
 		if (pgd_none(pgd))
 			return;
 		if (unlikely(pgd_huge(pgd))) {
-			if (!gup_huge_pgd(pgd, pgdp, addr, next, write,
+			if (!gup_huge_pgd(pgd, pgdp, addr, next, flags,
 					  pages, nr))
 				return;
 		} else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) {
 			if (!gup_huge_pd(__hugepd(pgd_val(pgd)), addr,
-					 PGDIR_SHIFT, next, write, pages, nr))
+					 PGDIR_SHIFT, next, flags, pages, nr))
 				return;
-		} else if (!gup_p4d_range(pgd, addr, next, write, pages, nr))
+		} else if (!gup_p4d_range(pgd, addr, next, flags, pages, nr))
 			return;
 	} while (pgdp++, addr = next, addr != end);
 }
@@ -1830,7 +1830,7 @@ int __get_user_pages_fast(unsigned long start, int nr_pages, int write,
 
 	if (gup_fast_permitted(start, nr_pages)) {
 		local_irq_save(flags);
-		gup_pgd_range(start, end, write, pages, &nr);
+		gup_pgd_range(start, end, write ? FOLL_WRITE : 0, pages, &nr);
 		local_irq_restore(flags);
 	}
 
@@ -1872,7 +1872,7 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write,
 
 	if (gup_fast_permitted(start, nr_pages)) {
 		local_irq_disable();
-		gup_pgd_range(addr, end, write, pages, &nr);
+		gup_pgd_range(addr, end, write ? FOLL_WRITE : 0, pages, &nr);
 		local_irq_enable();
 		ret = nr;
 	}
-- 
2.20.1



* Re: [PATCH net-next v4 0/9] net: Remove switchdev_ops
From: David Miller @ 2019-02-11 20:16 UTC (permalink / raw)
  To: f.fainelli
  Cc: netdev, idosch, linux-kernel, devel, bridge, jiri, andrew,
	vivien.didelot
In-Reply-To: <20190211191001.8623-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 11 Feb 2019 11:09:52 -0800

> David, I would like to get Ido's feedback on this to make sure I did not
> miss something, thank you!

Ok, Ido please look at this when you can.


* [PATCH] Revert: "p54: Use skb_peek_tail() instead of direct head pointer accesses"
From: Matthew Whitehead @ 2019-02-11 20:20 UTC (permalink / raw)
  To: whiteheadm, netdev, davem; +Cc: Matthew Whitehead

Commit e3554197fc8fbb9656f62c18f9c9edd396394e16 causes a NULL pointer dereference.

kernel: p54pci 0000:07:00.0: enabling device (0000 -> 0002)
kernel: ieee80211 phy1: p54 detected a LM86 firmware
kernel: p54: rx_mtu reduced from 3240 to 2376
kernel: ieee80211 phy1: FW rev 2.13.1.0 - Softmac protocol 5.5
kernel: ieee80211 phy1: cryptographic accelerator WEP:YES, TKIP:YES, CCMP:YES
kernel: BUG: unable to handle kernel NULL pointer dereference at 00000000
kernel: *pde = 00000000
kernel: Oops: 0000 [#1] PREEMPT SMP
kernel: CPU: 0 PID: 5 Comm: kworker/0:0 Not tainted 4.19.0.bisect-14.#871
kernel: Hardware name: IBM 2378RVU/2378RVU, BIOS 1RETDKWW (3.16 ) 04/19/2005
kernel: Workqueue: events request_firmware_work_func
kernel: EIP: p54_tx_pending+0xff/0x128 [p54common]
kernel: Code: 8b 4d dc 89 7e 30 89 56 34 0f b6 53 56 01 d7 89 79 04 8b 96 a0 00 00 00 f6 42 01 80 75 0c 80 7a 28 00 75 06 89 bb d4 01 00 00 <8b> 10 89 46 04 89 16 89 30 8b 45 ec 89 72 04 8b 55 e8 ff 43 2c e8
kernel: EAX: 00000000 EBX: ec6a2d60 ECX: ed4de568 EDX: ed4de568
kernel: ESI: ec4e0980 EDI: 00020264 EBP: c0071eb8 ESP: c0071e94
kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010082
kernel: CR0: 80050033 CR2: 00000000 CR3: 2f715000 CR4: 00000690
kernel: Call Trace:
kernel:  p54_tx+0x1a/0x1d [p54common]
kernel:  p54_download_eeprom+0xa6/0xfb [p54common]
kernel:  p54_read_eeprom+0x5c/0x99 [p54common]
kernel:  p54p_firmware_step2+0x50/0xcd [p54pci]
kernel:  request_firmware_work_func+0x2a/0x51
kernel:  process_one_work+0x16b/0x28e
kernel:  worker_thread+0x180/0x222
kernel:  kthread+0xce/0xd0
kernel:  ? cancel_delayed_work+0x5e/0x5e
kernel:  ? kthread_create_worker_on_cpu+0x1c/0x1c
kernel:  ret_from_fork+0x19/0x24
kernel: Modules linked in: p54pci p54common crc_ccitt mac80211 ipw2200 libipw lib80211 cfg80211 uhci_hcd pcmcia ehci_pci yenta_socket ehci_hcd rfkill i2c_i801 pcmcia_rsrc e1000 usbcore i2c_core pcmcia_core lpc_ich usb_common mfd_core floppy autofs4
kernel: CR2: 0000000000000000
kernel: ---[ end trace ddc1a265fd4f4bc6 ]---
kernel: EIP: p54_tx_pending+0xff/0x128 [p54common]
kernel: Code: 8b 4d dc 89 7e 30 89 56 34 0f b6 53 56 01 d7 89 79 04 8b 96 a0 00 00 00 f6 42 01 80 75 0c 80 7a 28 00 75 06 89 bb d4 01 00 00 <8b> 10 89 46 04 89 16 89 30 8b 45 ec 89 72 04 8b 55 e8 ff 43 2c e8
kernel: EAX: 00000000 EBX: ec6a2d60 ECX: ed4de568 EDX: ed4de568
kernel: ESI: ec4e0980 EDI: 00020264 EBP: c0071eb8 ESP: c16252e8
kernel: DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010082
kernel: CR0: 80050033 CR2: 00000000 CR3: 2f715000 CR4: 00000690
kernel: note: kworker/0:0[5] exited with preempt_count 1

Reverting the patch fixes the problem.

Signed-off-by: Matthew Whitehead <tedheadster@gmail.com>
---
 drivers/net/wireless/intersil/p54/txrx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/intersil/p54/txrx.c b/drivers/net/wireless/intersil/p54/txrx.c
index 79078456..3a4214d 100644
--- a/drivers/net/wireless/intersil/p54/txrx.c
+++ b/drivers/net/wireless/intersil/p54/txrx.c
@@ -121,8 +121,8 @@ static int p54_assign_address(struct p54_common *priv, struct sk_buff *skb)
 	}
 	if (unlikely(!target_skb)) {
 		if (priv->rx_end - last_addr >= len) {
-			target_skb = skb_peek_tail(&priv->tx_queue);
-			if (target_skb) {
+			target_skb = priv->tx_queue.prev;
+			if (!skb_queue_empty(&priv->tx_queue)) {
 				info = IEEE80211_SKB_CB(target_skb);
 				range = (void *)info->rate_driver_data;
 				target_addr = range->end_addr;
-- 
1.8.3.1
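
The report above does not say why the reverted commit crashes. One
difference between the two versions that may matter: skb_peek_tail()
returns NULL for an empty queue, while priv->tx_queue.prev on an empty
queue points back at the queue head itself. The following is a minimal
userspace sketch of that difference, not the driver code; every demo_*
name is hypothetical and only stands in for the sk_buff list helpers.

```c
#include <stddef.h>

/* Hypothetical miniature of a circular list with a sentinel head. */
struct demo_node {
	struct demo_node *next;
	struct demo_node *prev;
};

struct demo_queue {
	struct demo_node head;	/* sentinel, never carries data */
};

static void demo_queue_init(struct demo_queue *q)
{
	q->head.next = &q->head;
	q->head.prev = &q->head;
}

static int demo_queue_empty(const struct demo_queue *q)
{
	return q->head.next == &q->head;
}

/* Like skb_peek_tail(): returns NULL when the queue is empty. */
static struct demo_node *demo_peek_tail(struct demo_queue *q)
{
	struct demo_node *tail = q->head.prev;

	return tail == &q->head ? NULL : tail;
}

/* Like __skb_queue_after(): dereferences prev, so prev must not be NULL. */
static void demo_insert_after(struct demo_node *prev, struct demo_node *newn)
{
	newn->next = prev->next;
	newn->prev = prev;
	prev->next->prev = newn;
	prev->next = newn;
}
```

On an empty queue, q.head.prev is the sentinel and can be passed to
demo_insert_after() safely, whereas the value returned by
demo_peek_tail() is NULL and cannot. Whether that is the exact path
taken in p54_tx_pending() is an assumption here.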


^ permalink raw reply related

* Re: [PATCH mlx5-next 2/2] net/mlx5: Factor out HCA capabilities functions
From: Jason Gunthorpe @ 2019-02-11 20:32 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, RDMA mailing list, Moni Shoua, Saeed Mahameed,
	linux-netdev
In-Reply-To: <20190211200207.GD21447@mtr-leonro.mtl.com>

On Mon, Feb 11, 2019 at 10:02:07PM +0200, Leon Romanovsky wrote:
> On Mon, Feb 11, 2019 at 07:50:55PM +0000, Jason Gunthorpe wrote:
> > On Mon, Feb 11, 2019 at 01:56:08PM +0200, Leon Romanovsky wrote:
> > > From: Leon Romanovsky <leonro@mellanox.com>
> > >
> > > Combine all HCA capabilities setters under one function
> > > and compile out the ODP related function in case kernel
> > > was compiled without ODP support.
> > >
> > > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> > >  .../net/ethernet/mellanox/mlx5/core/main.c    | 47 +++++++++++++------
> > >  1 file changed, 33 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > > index 6d45518edbdc..d7145ab6105d 100644
> > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
> > > @@ -459,6 +459,7 @@ static int handle_hca_cap_atomic(struct mlx5_core_dev *dev)
> > >  	return err;
> > >  }
> > >
> > > +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > >  static int handle_hca_cap_odp(struct mlx5_core_dev *dev)
> > >  {
> > >  	void *set_hca_cap;
> > > @@ -502,6 +503,7 @@ static int handle_hca_cap_odp(struct mlx5_core_dev *dev)
> > >  	kfree(set_ctx);
> > >  	return err;
> > >  }
> > > +#endif
> > >
> > >  static int handle_hca_cap(struct mlx5_core_dev *dev)
> > >  {
> > > @@ -576,6 +578,35 @@ static int handle_hca_cap(struct mlx5_core_dev *dev)
> > >  	return err;
> > >  }
> > >
> > > +static int set_hca_cap(struct mlx5_core_dev *dev)
> > > +{
> > > +	struct pci_dev *pdev = dev->pdev;
> > > +	int err;
> > > +
> > > +	err = handle_hca_cap(dev);
> > > +	if (err) {
> > > +		dev_err(&pdev->dev, "handle_hca_cap failed\n");
> > > +		goto out;
> > > +	}
> > > +
> > > +	err = handle_hca_cap_atomic(dev);
> > > +	if (err) {
> > > +		dev_err(&pdev->dev, "handle_hca_cap_atomic failed\n");
> > > +		goto out;
> > > +	}
> > > +
> > > +#ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
> > > +	err = handle_hca_cap_odp(dev);
> > > +	if (err) {
> > > +		dev_err(&pdev->dev, "handle_hca_cap_odp failed\n");
> > > +		goto out;
> > > +	}
> > > +#endif
> >
> > Adding
> >   if (IS_ENABLED..)
> >     return 0;
> >
> > To the top of handle_hca_cap_odp is a lot better.
> 
> Saeed commented that he prefers the code to be compiled out when the
> config is not set. In your suggestion, the code will exist and will be
> thrown away only when some optimizations are enabled.

The kernel is always compiled with optimizations that throw this code
away; it is a widely used pattern.

#ifdef creates a compilation test matrix, which is very undesirable.
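
For readers following along, the pattern can be sketched in userspace
as below. This is only an illustration: DEMO_IS_ENABLED() and
DEMO_CONFIG_ODP are hypothetical stand-ins for the kernel's
IS_ENABLED() and the Kconfig symbol, and the sketch assumes the
abbreviated "if (IS_ENABLED..)" above was meant as a negated test.

```c
#include <stdio.h>

/*
 * Hypothetical stand-ins: the real IS_ENABLED() comes from
 * <linux/kconfig.h>. This macro only mimics its constant 0/1 result.
 */
#define DEMO_CONFIG_ODP 0
#define DEMO_IS_ENABLED(option) (option)

static int demo_odp_calls;

/* Always compiled and type-checked; the body is dead code when the option is 0. */
static int demo_handle_cap_odp(void)
{
	if (!DEMO_IS_ENABLED(DEMO_CONFIG_ODP))
		return 0;

	demo_odp_calls++;	/* stands in for programming the ODP caps */
	return 0;
}

/* The caller needs no #ifdef around the call. */
static int demo_set_hca_cap(void)
{
	int err = demo_handle_cap_odp();

	if (err)
		fprintf(stderr, "demo_handle_cap_odp failed\n");
	return err;
}
```

Because the condition is a compile-time constant, the compiler can drop
the unreachable body, yet the code is still built and checked in every
configuration.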

Jason

^ permalink raw reply

* Re: [PATCH 0/3] Add gup fast + longterm and use it in HFI1
From: Davidlohr Bueso @ 2019-02-11 20:34 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann, netdev,
	Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Jason Gunthorpe, Andrew Morton, Kirill A. Shutemov, Dan Williams
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>

On Mon, 11 Feb 2019, ira.weiny@intel.com wrote:
>Ira Weiny (3):
>  mm/gup: Change "write" parameter to flags
>  mm/gup: Introduce get_user_pages_fast_longterm()
>  IB/HFI1: Use new get_user_pages_fast_longterm()

Out of curiosity, are you planning on having all rdma drivers
use get_user_pages_fast_longterm()? Ie:

hw/mthca/mthca_memfree.c:       ret = get_user_pages_fast(uaddr & PAGE_MASK, 1, FOLL_WRITE, pages);
hw/qib/qib_user_sdma.c:         ret = get_user_pages_fast(addr, j, 0, pages);

Thanks,
Davidlohr

^ permalink raw reply

* Re: Kernel memory corruption in CIPSO labeled TCP packets processing.
From: Paul Moore @ 2019-02-11 20:37 UTC (permalink / raw)
  To: Nazarov Sergey
  Cc: linux-security-module@vger.kernel.org, selinux@vger.kernel.org,
	netdev@vger.kernel.org, Casey Schaufler
In-Reply-To: <11242361548940840@iva8-8d7a47df0521.qloud-c.yandex.net>

On Thu, Jan 31, 2019 at 8:20 AM Nazarov Sergey <s-nazarov@yandex.ru> wrote:
> 31.01.2019, 05:10, "Paul Moore" <paul@paul-moore.com>:
> > This isn't how the rest of the stack works, look at
> > ip_local_deliver_finish() for one example. Perhaps the behavior you
> > are proposing is correct, but please show me where in the various RFC
> > specs it is defined so that I can better understand why it should work
> > this way.
> > --
> > paul moore
> > www.paul-moore.com
>
> Sorry, I was inattentive. ip_options_compile() modifies the SRR option data only if
> skb is NULL. My last message can be ignored.

Hi Nazarov,

Do you plan on submitting these patches as a proper patchset for
review and merging?

-- 
paul moore
www.paul-moore.com

^ permalink raw reply

* Re: [PATCH] ipv6: fix icmp6_send() route lookup
From: David Miller @ 2019-02-11 20:38 UTC (permalink / raw)
  To: alin.nastac; +Cc: netdev
In-Reply-To: <1549551931-11909-1-git-send-email-alin.nastac@gmail.com>

From: Alin Nastac <alin.nastac@gmail.com>
Date: Thu,  7 Feb 2019 16:05:31 +0100

> Original packet destination address must be used as saddr for the
> route lookup performed by icmp6_send() even when this address is
> not local. This fixes the IPv6 router ability to send back
> destination unreachable ICMPv6 errors for forwarded packets when
> the route toward the saddr of the original packet is source
> filtered (e.g. a default route with a "from PD" attribute, where
> PD is the delegated prefix).
> 
> Signed-off-by: Alin Nastac <alin.nastac@gmail.com>

Yes, but this will change behavior in a lot of situations,
not just the one you are interested in.

The base ipv6_chk_addr() test has been there for more than a decade
and I'm not comfortable with changing this logic until I see you
write up a full audit of all of the use cases of icmp6_send() and
how they are impacted by your changes.

Thanks.

^ permalink raw reply

* Re: [PATCH] Documentation: bring operstate documentation up-to-date
From: David Miller @ 2019-02-11 20:39 UTC (permalink / raw)
  To: j.witteveen; +Cc: netdev
In-Reply-To: <20190207161432.GA6060@Mindship-05.localdomain>

From: Jouke Witteveen <j.witteveen@gmail.com>
Date: Thu, 7 Feb 2019 17:14:32 +0100

> Netlink has moved from bitmasks to group numbers long ago.
> 
> Signed-off-by: Jouke Witteveen <j.witteveen@gmail.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH 2/3] mm/gup: Introduce get_user_pages_fast_longterm()
From: Jason Gunthorpe @ 2019-02-11 20:39 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev, Mike Marciniszyn, Dennis Dalessandro,
	Doug Ledford, Andrew Morton, Kirill A. Shutemov, Dan Williams
In-Reply-To: <20190211201643.7599-3-ira.weiny@intel.com>

On Mon, Feb 11, 2019 at 12:16:42PM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> Users of get_user_pages_fast are not protected against mapping
> pages within FS DAX.  Introduce a call which protects them.
> 
> We do this by checking for DEVMAP pages during the fast walk and
> falling back to the longterm gup call to check for FS DAX if needed.
> 
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
>  include/linux/mm.h |   8 ++++
>  mm/gup.c           | 102 +++++++++++++++++++++++++++++++++++----------
>  2 files changed, 88 insertions(+), 22 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 80bb6408fe73..8f831c823630 100644
> +++ b/include/linux/mm.h
> @@ -1540,6 +1540,8 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
>  long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
>  			    unsigned int gup_flags, struct page **pages,
>  			    struct vm_area_struct **vmas);
> +int get_user_pages_fast_longterm(unsigned long start, int nr_pages, bool write,
> +				 struct page **pages);
>  #else
>  static inline long get_user_pages_longterm(unsigned long start,
>  		unsigned long nr_pages, unsigned int gup_flags,
> @@ -1547,6 +1549,11 @@ static inline long get_user_pages_longterm(unsigned long start,
>  {
>  	return get_user_pages(start, nr_pages, gup_flags, pages, vmas);
>  }
> +static inline int get_user_pages_fast_longterm(unsigned long start, int nr_pages,
> +					       bool write, struct page **pages)
> +{
> +	return get_user_pages_fast(start, nr_pages, write, pages);
> +}
>  #endif /* CONFIG_FS_DAX */
>  
>  int get_user_pages_fast(unsigned long start, int nr_pages, int write,
> @@ -2615,6 +2622,7 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
>  #define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
>  #define FOLL_COW	0x4000	/* internal GUP flag */
>  #define FOLL_ANON	0x8000	/* don't do file mappings */
> +#define FOLL_LONGTERM	0x10000	/* mapping is intended for a long term pin */

If we are adding a new flag, maybe we should get rid of the 'longterm'
entry points and just rely on the callers to pass the flag?
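
A rough sketch of what a single flag-driven entry point could look
like, following the fallback described in the commit message above.
Nothing here is the proposed kernel code: the DEMO_* flags, the enum
and demo_gup_fast() are hypothetical names used only to show the
dispatch.

```c
/* Hypothetical flag values; they merely mirror FOLL_WRITE and FOLL_LONGTERM. */
#define DEMO_FOLL_WRITE		0x01u
#define DEMO_FOLL_LONGTERM	0x10000u

enum demo_path {
	DEMO_PATH_FAST,			/* lockless fast walk is enough */
	DEMO_PATH_LONGTERM_FALLBACK,	/* redo via the slower longterm check */
};

/*
 * One entry point: callers that want a long term pin pass
 * DEMO_FOLL_LONGTERM instead of calling a separate *_longterm() function.
 * saw_devmap_page models "the fast walk met a DEVMAP page".
 */
static enum demo_path demo_gup_fast(unsigned int gup_flags, int saw_devmap_page)
{
	if ((gup_flags & DEMO_FOLL_LONGTERM) && saw_devmap_page)
		return DEMO_PATH_LONGTERM_FALLBACK;

	return DEMO_PATH_FAST;
}
```

With this shape, a short term user passes only the write flag and never
takes the fallback, while a long term user adds the extra flag at the
call site.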

Jason

^ permalink raw reply

* Re: [PATCH 0/3] Add gup fast + longterm and use it in HFI1
From: Jason Gunthorpe @ 2019-02-11 20:40 UTC (permalink / raw)
  To: ira.weiny
  Cc: linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	Davidlohr Bueso, netdev, Mike Marciniszyn, Dennis Dalessandro,
	Doug Ledford, Andrew Morton, Kirill A. Shutemov, Dan Williams
In-Reply-To: <20190211201643.7599-1-ira.weiny@intel.com>

On Mon, Feb 11, 2019 at 12:16:40PM -0800, ira.weiny@intel.com wrote:
> From: Ira Weiny <ira.weiny@intel.com>
> 
> NOTE: This series depends on my clean up patch to remove the write parameter
> from gup_fast_permitted()[1]
> 
> HFI1 uses get_user_pages_fast() due to its performance advantages.  Like RDMA,
> HFI1 pages can be held for a significant time.  But get_user_pages_fast() does
> not protect against mapping of FS DAX pages.

If HFI1 can use the _fast variant, can't all the general RDMA stuff
use it too?

What is the guidance on when fast vs not fast should be used?

Jason

^ permalink raw reply

* RE: [PATCH v3] arm64: dts: lx2160aqds: Add mdio mux nodes
From: Leo Li @ 2019-02-11 20:43 UTC (permalink / raw)
  To: Shawn Guo, Pankaj Bansal
  Cc: Andrew Lunn, Florian Fainelli, netdev@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org
In-Reply-To: <20190211030010.GI22487@dragon>



> -----Original Message-----
> From: Shawn Guo <shawnguo@kernel.org>
> Sent: Sunday, February 10, 2019 9:00 PM
> To: Pankaj Bansal <pankaj.bansal@nxp.com>
> Cc: Leo Li <leoyang.li@nxp.com>; Andrew Lunn <andrew@lunn.ch>; Florian
> Fainelli <f.fainelli@gmail.com>; netdev@vger.kernel.org;
> linux-arm-kernel@lists.infradead.org
> Subject: Re: [PATCH v3] arm64: dts: lx2160aqds: Add mdio mux nodes
> 
> On Wed, Feb 06, 2019 at 09:40:33AM +0000, Pankaj Bansal wrote:
> > The two external MDIO buses used to communicate with phy devices that
> > are external to SOC are muxed in LX2160AQDS board.
> >
> > These buses can be routed to any one of the eight IO slots on
> > LX2160AQDS board depending on value in fpga register 0x54.
> >
> > Additionally the external MDIO1 is used to communicate to the onboard
> > RGMII phy devices.
> >
> > The mdio1 is controlled by bits 4-7 of fpga register and mdio2 is
> > controlled by bits 0-3 of fpga register.
> >
> > Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
> > ---
> >
> > Notes:
> >     V3:
> >     - Add status = disabled in soc file and status = okay in board file
> >       for external MDIO nodes
> >     - Add interrupts property in external mdio nodes in soc file
> >     V2:
> >     - removed unnecessary TODO statements
> >     - removed device_type from mdio nodes
> >     - change the case of hex number to lowercase
> >     - removed board specific comments from soc file
> >
> >  .../boot/dts/freescale/fsl-lx2160a-qds.dts   | 123 +++++++++++++++++
> >  .../boot/dts/freescale/fsl-lx2160a.dtsi      |  22 +++
> >  2 files changed, 145 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> > b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> > index 99a22abbe725..079264b391a2 100644
> > --- a/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> > +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a-qds.dts
> > @@ -35,6 +35,14 @@
> >  	status = "okay";
> >  };
> >
> > +&emdio1 {
> > +	status = "okay";
> > +};
> > +
> > +&emdio2 {
> > +	status = "okay";
> > +};
> > +
> >  &esdhc0 {
> >  	status = "okay";
> >  };
> > @@ -46,6 +54,121 @@
> >  &i2c0 {
> >  	status = "okay";
> >
> > +	fpga@66 {
> > +		compatible = "fsl,lx2160aqds-fpga", "fsl,fpga-qixis-i2c";
> > +		reg = <0x66>;
> > +		#address-cells = <1>;
> > +		#size-cells = <0>;
> > +
> > +		mdio-mux-1@54 {
> > +			mdio-parent-bus = <&emdio1>;
> > +			reg = <0x54>;		 /* BRDCFG4 */
> > +			mux-mask = <0xf8>;      /* EMI1_MDIO */
> > +			#address-cells=<1>;
> > +			#size-cells = <0>;
> > +
> > +			mdio@0 {
> > +				reg = <0x00>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> 
> Please have a newline between nodes.  It doesn't deserve a respin though.  I
> can fix them up when applying if Leo is fine with this version.

I think a compatible string should be defined for the binding of the parent
mdio-mux node, probably "mdio-mux-regmap", and used here in the device tree.

> 
> Shawn
> 
> > +			mdio@40 {
> > +				reg = <0x40>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@c0 {
> > +				reg = <0xc0>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@c8 {
> > +				reg = <0xc8>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@d0 {
> > +				reg = <0xd0>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@d8 {
> > +				reg = <0xd8>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@e0 {
> > +				reg = <0xe0>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@e8 {
> > +				reg = <0xe8>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@f0 {
> > +				reg = <0xf0>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@f8 {
> > +				reg = <0xf8>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +		};
> > +
> > +		mdio-mux-2@54 {
> > +			mdio-parent-bus = <&emdio2>;
> > +			reg = <0x54>;		 /* BRDCFG4 */
> > +			mux-mask = <0x07>;      /* EMI2_MDIO */
> > +			#address-cells=<1>;
> > +			#size-cells = <0>;
> > +
> > +			mdio@0 {
> > +				reg = <0x00>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@1 {
> > +				reg = <0x01>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@2 {
> > +				reg = <0x02>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@3 {
> > +				reg = <0x03>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@4 {
> > +				reg = <0x04>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@5 {
> > +				reg = <0x05>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@6 {
> > +				reg = <0x06>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +			mdio@7 {
> > +				reg = <0x07>;
> > +				#address-cells = <1>;
> > +				#size-cells = <0>;
> > +			};
> > +		};
> > +	};
> > +
> >  	i2c-mux@77 {
> >  		compatible = "nxp,pca9547";
> >  		reg = <0x77>;
> > diff --git a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> > b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> > index a79f5c1ea56d..7def5252ac1a 100644
> > --- a/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> > +++ b/arch/arm64/boot/dts/freescale/fsl-lx2160a.dtsi
> > @@ -762,5 +762,27 @@
> >  				     <GIC_SPI 209 IRQ_TYPE_LEVEL_HIGH>;
> >  			dma-coherent;
> >  		};
> > +
> > +		/* WRIOP0: 0x8b8_0000, E-MDIO1: 0x1_6000 */
> > +		emdio1: mdio@8b96000 {
> > +			compatible = "fsl,fman-memac-mdio";
> > +			reg = <0x0 0x8b96000 0x0 0x1000>;
> > +			interrupts = <GIC_SPI 90 IRQ_TYPE_LEVEL_HIGH>;
> > +			#address-cells = <1>;
> > +			#size-cells = <0>;
> > +			little-endian;	/* force the driver in LE mode */
> > +			status = "disabled";
> > +		};
> > +
> > +		/* WRIOP0: 0x8b8_0000, E-MDIO2: 0x1_7000 */
> > +		emdio2: mdio@8b97000 {
> > +			compatible = "fsl,fman-memac-mdio";
> > +			reg = <0x0 0x8b97000 0x0 0x1000>;
> > +			interrupts = <GIC_SPI 91 IRQ_TYPE_LEVEL_HIGH>;
> > +			#address-cells = <1>;
> > +			#size-cells = <0>;
> > +			little-endian;	/* force the driver in LE mode */
> > +			status = "disabled";
> > +		};
> >  	};
> >  };
> > --
> > 2.17.1
> >

^ permalink raw reply

* Re: [PATCH net] vxlan: test dev->flags & IFF_UP before calling netif_rx()
From: David Miller @ 2019-02-11 20:44 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet, petrm, idosch, roopa, sbrivio
In-Reply-To: <20190207202738.155940-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Thu,  7 Feb 2019 12:27:38 -0800

> netif_rx() must be called under a strict contract.
> 
> At device dismantle phase, core networking clears IFF_UP
> and flush_all_backlogs() is called after rcu grace period
> to make sure no incoming packet might be in a cpu backlog
> and still referencing the device.
> 
> Most drivers call netif_rx() from their interrupt handler,
> and since the interrupts are disabled at device dismantle,
> netif_rx() does not have to check dev->flags & IFF_UP
> 
> Virtual drivers do not have this guarantee, and must
> therefore make the check themselves.
> 
> Otherwise we risk use-after-free and/or crashes.
> 
> Note this patch also fixes a small issue that came
> with commit ce6502a8f957 ("vxlan: fix a use after free
> in vxlan_encap_bypass"), since the dev->stats.rx_dropped
> change was done on the wrong device.
> 
> Fixes: d342894c5d2f ("vxlan: virtual extensible lan")
> Fixes: ce6502a8f957 ("vxlan: fix a use after free in vxlan_encap_bypass")
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Applied and queued up for -stable.
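
The check described in the commit message can be sketched in userspace
as follows. This is a simplified model under stated assumptions: struct
demo_dev, DEMO_IFF_UP and demo_virtual_rx() are hypothetical, the
rx_packets increment stands in for the netif_rx() call, and RCU and skb
freeing are left out entirely.

```c
#include <stdbool.h>

#define DEMO_IFF_UP 0x1u	/* hypothetical mirror of IFF_UP */

struct demo_dev {
	unsigned int flags;
	unsigned long rx_packets;
	unsigned long rx_dropped;
};

/*
 * A virtual driver has no interrupt that gets disabled at dismantle,
 * so it tests the flag itself before handing the packet to the stack.
 * Returns true if the packet was delivered.
 */
static bool demo_virtual_rx(struct demo_dev *dev)
{
	if (!(dev->flags & DEMO_IFF_UP)) {
		dev->rx_dropped++;	/* account the drop on the receiving device */
		return false;
	}

	dev->rx_packets++;		/* stands in for netif_rx(skb) */
	return true;
}
```

A device whose flags no longer contain the "up" bit drops the packet
and counts it, rather than queueing it to a backlog that may outlive
the device.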

^ permalink raw reply

* Re: [PATCH net] af_key: unconditionally clone on broadcast
From: David Miller @ 2019-02-11 20:45 UTC (permalink / raw)
  To: stranche; +Cc: eric.dumazet, netdev, steffen.klassert
In-Reply-To: <1549571601-395-1-git-send-email-stranche@codeaurora.org>

From: Sean Tranchetti <stranche@codeaurora.org>
Date: Thu,  7 Feb 2019 13:33:21 -0700

> Attempting to avoid cloning the skb when broadcasting by inflating
> the refcount with sock_hold/sock_put while under RCU lock is dangerous
> and violates RCU principles. It leads to subtle race conditions when
> attempting to free the SKB, as we may reference sockets that have
> already been freed by the stack.
 ...
> Suggested-by: Eric Dumazet <eric.dumazet@gmail.com>
> Signed-off-by: Sean Tranchetti <stranche@codeaurora.org>
> ---
> Realized I never actually sent this patch out after testing the changes
> Eric recommended. Whoops. Better late than never, I suppose...

Steffen, I assume you will review and pick this up.

Thanks.

^ permalink raw reply

* Re: [PATCH 0/3] Add gup fast + longterm and use it in HFI1
From: Jason Gunthorpe @ 2019-02-11 20:47 UTC (permalink / raw)
  To: ira.weiny, linux-rdma, linux-kernel, linux-mm, Daniel Borkmann,
	netdev, Mike Marciniszyn, Dennis Dalessandro, Doug Ledford,
	Andrew Morton, Kirill A. Shutemov, Dan Williams
In-Reply-To: <20190211203417.a2c2kbmjai43flyz@linux-r8p5>

On Mon, Feb 11, 2019 at 12:34:17PM -0800, Davidlohr Bueso wrote:
> On Mon, 11 Feb 2019, ira.weiny@intel.com wrote:
> > Ira Weiny (3):
> >  mm/gup: Change "write" parameter to flags
> >  mm/gup: Introduce get_user_pages_fast_longterm()
> >  IB/HFI1: Use new get_user_pages_fast_longterm()
> 
> Out of curiosity, are you planning on having all rdma drivers
> use get_user_pages_fast_longterm()? Ie:
> 
> hw/mthca/mthca_memfree.c:       ret = get_user_pages_fast(uaddr & PAGE_MASK, 1, FOLL_WRITE, pages);

This one is certainly a mistake - this should be done with a umem.

Jason

^ permalink raw reply

* Re: [PATCH net] net: dsa: microchip: add switch offload forwarding support
From: David Miller @ 2019-02-11 20:49 UTC (permalink / raw)
  To: Tristram.Ha
  Cc: sergio.paracuellos, andrew, f.fainelli, pavel, UNGLinuxDriver,
	netdev
In-Reply-To: <1549598758-25870-1-git-send-email-Tristram.Ha@microchip.com>

From: <Tristram.Ha@microchip.com>
Date: Thu, 7 Feb 2019 20:05:58 -0800

> From: Tristram Ha <Tristram.Ha@microchip.com>
> 
> The flag offload_fwd_mark is set as the switch can forward frames by
> itself.
> 
> This can be considered a fix to a problem introduced in commit
> c2e866911e254067 where the port memberships are not set in sync.  The flag
> offload_fwd_mark just needs to be set in tag_ksz.c to prevent the software
> bridge from forwarding duplicate multicast frames.
> 
> Fixes: c2e866911e254067 ("microchip: break KSZ9477 DSA driver into two files")
> Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com>

Applied, thank you.

^ permalink raw reply

* [PATCH net-next] net/skbuff: fix up kernel-doc placement
From: Brian Norris @ 2019-02-11 21:02 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel, David S. Miller, Brian Norris

There are several skb_* functions where the locked and unlocked
functions are confusingly documented. For several of them, the
kernel-doc for the unlocked version is placed above the locked version,
which to the casual reader makes it seem like the locked version "takes
no locks and you must therefore hold required locks before calling it."

One can see, for example, that this link claims to document
skb_queue_head(), while instead describing __skb_queue_head().

https://www.kernel.org/doc/html/latest/networking/kapi.html#c.skb_queue_head

The correct documentation for skb_queue_head() is also included further
down the page.

This diff tested via:

  $ scripts/kernel-doc -rst include/linux/skbuff.h net/core/skbuff.c

No new warnings were seen, and the output makes a little more sense.

Signed-off-by: Brian Norris <briannorris@chromium.org>
---
 include/linux/skbuff.h | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 831846617d07..a41e84f7730c 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1889,12 +1889,12 @@ static inline void __skb_queue_before(struct sk_buff_head *list,
  *
  *	A buffer cannot be placed on two lists at the same time.
  */
-void skb_queue_head(struct sk_buff_head *list, struct sk_buff *newsk);
 static inline void __skb_queue_head(struct sk_buff_head *list,
 				    struct sk_buff *newsk)
 {
 	__skb_queue_after(list, (struct sk_buff *)list, newsk);
 }
+void skb_queue_head(struct sk_buff_head *list, struct sk_buff *newsk);
 
 /**
  *	__skb_queue_tail - queue a buffer at the list tail
@@ -1906,12 +1906,12 @@ static inline void __skb_queue_head(struct sk_buff_head *list,
  *
  *	A buffer cannot be placed on two lists at the same time.
  */
-void skb_queue_tail(struct sk_buff_head *list, struct sk_buff *newsk);
 static inline void __skb_queue_tail(struct sk_buff_head *list,
 				   struct sk_buff *newsk)
 {
 	__skb_queue_before(list, (struct sk_buff *)list, newsk);
 }
+void skb_queue_tail(struct sk_buff_head *list, struct sk_buff *newsk);
 
 /*
  * remove sk_buff from list. _Must_ be called atomically, and with
@@ -1938,7 +1938,6 @@ static inline void __skb_unlink(struct sk_buff *skb, struct sk_buff_head *list)
  *	so must be used with appropriate locks held only. The head item is
  *	returned or %NULL if the list is empty.
  */
-struct sk_buff *skb_dequeue(struct sk_buff_head *list);
 static inline struct sk_buff *__skb_dequeue(struct sk_buff_head *list)
 {
 	struct sk_buff *skb = skb_peek(list);
@@ -1946,6 +1945,7 @@ static inline struct sk_buff *__skb_dequeue(struct sk_buff_head *list)
 		__skb_unlink(skb, list);
 	return skb;
 }
+struct sk_buff *skb_dequeue(struct sk_buff_head *list);
 
 /**
  *	__skb_dequeue_tail - remove from the tail of the queue
@@ -1955,7 +1955,6 @@ static inline struct sk_buff *__skb_dequeue(struct sk_buff_head *list)
  *	so must be used with appropriate locks held only. The tail item is
  *	returned or %NULL if the list is empty.
  */
-struct sk_buff *skb_dequeue_tail(struct sk_buff_head *list);
 static inline struct sk_buff *__skb_dequeue_tail(struct sk_buff_head *list)
 {
 	struct sk_buff *skb = skb_peek_tail(list);
@@ -1963,6 +1962,7 @@ static inline struct sk_buff *__skb_dequeue_tail(struct sk_buff_head *list)
 		__skb_unlink(skb, list);
 	return skb;
 }
+struct sk_buff *skb_dequeue_tail(struct sk_buff_head *list);
 
 
 static inline bool skb_is_nonlinear(const struct sk_buff *skb)
@@ -2653,13 +2653,13 @@ static inline int skb_orphan_frags_rx(struct sk_buff *skb, gfp_t gfp_mask)
  *	the list and one reference dropped. This function does not take the
  *	list lock and the caller must hold the relevant locks to use it.
  */
-void skb_queue_purge(struct sk_buff_head *list);
 static inline void __skb_queue_purge(struct sk_buff_head *list)
 {
 	struct sk_buff *skb;
 	while ((skb = __skb_dequeue(list)) != NULL)
 		kfree_skb(skb);
 }
+void skb_queue_purge(struct sk_buff_head *list);
 
 unsigned int skb_rbtree_purge(struct rb_root *root);
 
@@ -3028,7 +3028,7 @@ static inline int skb_padto(struct sk_buff *skb, unsigned int len)
 }
 
 /**
- *	skb_put_padto - increase size and pad an skbuff up to a minimal size
+ *	__skb_put_padto - increase size and pad an skbuff up to a minimal size
  *	@skb: buffer to pad
  *	@len: minimal length
  *	@free_on_error: free buffer on error
-- 
2.20.1.791
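
For context, the locked/unlocked pairing that the kernel-doc comments
describe can be sketched in userspace like this. It is an illustration
only: the demo_* names are hypothetical, a C11 atomic_flag spinlock
stands in for the list spinlock, and the _nolock suffix plays the role
of the kernel's "__" prefix.

```c
#include <stdatomic.h>
#include <stddef.h>

struct demo_item {
	struct demo_item *next;
};

struct demo_list {
	atomic_flag lock;	/* stand-in for the list spinlock */
	struct demo_item *head;
	unsigned int qlen;
};

static void demo_list_init(struct demo_list *list)
{
	atomic_flag_clear(&list->lock);
	list->head = NULL;
	list->qlen = 0;
}

/*
 * Unlocked variant (the "__" flavour): takes no locks, so the caller
 * must already hold list->lock.
 */
static void demo_queue_head_nolock(struct demo_list *list, struct demo_item *item)
{
	item->next = list->head;
	list->head = item;
	list->qlen++;
}

/* Locked variant: takes the list lock around the unlocked helper. */
static void demo_queue_head(struct demo_list *list, struct demo_item *item)
{
	while (atomic_flag_test_and_set_explicit(&list->lock, memory_order_acquire))
		;	/* spin until the lock is free */
	demo_queue_head_nolock(list, item);
	atomic_flag_clear_explicit(&list->lock, memory_order_release);
}
```

The comment "takes no locks" belongs to the first function only, which
is why a declaration of the locked variant placed between that comment
and the unlocked function is misleading.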


^ permalink raw reply related

* [Patch net v2 0/3] net_sched: some fixes for cls_tcindex
From: Cong Wang @ 2019-02-11 21:06 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang

This patchset contains 3 bug fixes for tcindex filter. Please check
each patch for details.

v2: fix a compile error in patch 2
    drop netns refcnt in patch 1

Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Cong Wang (3):
  net_sched: fix a race condition in tcindex_destroy()
  net_sched: fix a memory leak in cls_tcindex
  net_sched: fix two more memory leaks in cls_tcindex

 net/sched/cls_tcindex.c | 80 ++++++++++++++++++++++++-----------------
 1 file changed, 48 insertions(+), 32 deletions(-)

-- 
2.20.1


^ permalink raw reply

