Netdev List


* Re: [PATCH bpf] xdp: Fix handling of devmap in generic XDP
From: kbuild test robot @ 2018-06-13  9:27 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: kbuild-all, Alexei Starovoitov, Daniel Borkmann, Toshiaki Makita,
	netdev, Jesper Dangaard Brouer
In-Reply-To: <1528877178-2521-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>

[-- Attachment #1: Type: text/plain, Size: 1264 bytes --]

Hi Toshiaki,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf/master]

url:    https://github.com/0day-ci/linux/commits/Toshiaki-Makita/xdp-Fix-handling-of-devmap-in-generic-XDP/20180613-161204
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
config: i386-randconfig-a1-201823 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

   In file included from net//bpf/test_run.c:7:0:
>> include/linux/bpf.h:594:16: warning: 'struct sk_buff' declared inside parameter list
            struct bpf_prog *xdp_prog)
                   ^
>> include/linux/bpf.h:594:16: warning: its scope is only this definition or declaration, which is probably not what you want

vim +594 include/linux/bpf.h

   591	
   592	static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst,
   593						   struct sk_buff *skb,
 > 594						   struct bpf_prog *xdp_prog)
   595	{
   596		return 0;
   597	}
   598	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 28525 bytes --]

^ permalink raw reply
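
The warning above comes from a C scoping rule: a struct tag first mentioned inside a parameter list is visible only within that one declaration, so a file-scope forward declaration has to be seen before the inline stub. The report suggests that the `struct sk_buff;` line added by the patch is not visible where the stub is compiled in this config. A minimal stand-alone sketch of the rule follows; `stub_redirect`, `call_stub` and the one-field `struct sk_buff` are illustrative assumptions, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope forward declarations. Without the `struct sk_buff;` line the
 * first mention of the tag would be inside the parameter list below, where
 * it would name a type that is local to that single declaration. */
struct bpf_dtab_netdev;
struct sk_buff;
struct bpf_prog;

/* Hypothetical stand-in for the inline stub flagged by the build robot. */
static inline int stub_redirect(struct bpf_dtab_netdev *dst,
				struct sk_buff *skb,
				struct bpf_prog *xdp_prog)
{
	(void)dst;
	(void)skb;
	(void)xdp_prog;
	return 0;
}

/* The later full definition completes the same type the stub refers to. */
struct sk_buff {
	unsigned int len;
};

int call_stub(void)
{
	struct sk_buff skb = { .len = 64 };
	int ret = stub_redirect(NULL, &skb, NULL);

	assert(ret == 0);	/* the stub always reports success */
	return ret;
}
```

Dropping the `struct sk_buff;` line from this sketch should reproduce the gcc warning quoted in the report.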

* Re: [PATCH 2/2] WAN: LMC: ensure lmc_trace is reporting return from lmc_proto_type
From: Dan Carpenter @ 2018-06-13  9:22 UTC (permalink / raw)
  To: Colin King; +Cc: David S . Miller, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20180613060410.735-2-colin.king@canonical.com>

On Wed, Jun 13, 2018 at 07:04:10AM +0100, Colin King wrote:
> From: Colin Ian King <colin.king@canonical.com>
> 
> Currently the lmc tracing is not reporting the return from function
> lmc_proto_type and this tracing statement is never executed. Fix
> this by returning through the end of the function.  Also fix a typo
> in the function name lmc_proto_type in the trace message.
> 
> Detected by CoverityScan, CID#710539 ("Structurally dead code")
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  drivers/net/wan/lmc/lmc_proto.c | 14 +++++++++-----
>  1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/wan/lmc/lmc_proto.c b/drivers/net/wan/lmc/lmc_proto.c
> index 5a6c87bce1bf..b98c1ee860de 100644
> --- a/drivers/net/wan/lmc/lmc_proto.c
> +++ b/drivers/net/wan/lmc/lmc_proto.c
> @@ -99,23 +99,27 @@ void lmc_proto_close(lmc_softc_t *sc)
>  
>  __be16 lmc_proto_type(lmc_softc_t *sc, struct sk_buff *skb) /*FOLD00*/
>  {
> +	__be16 ret;
> +
>  	lmc_trace(sc->lmc_device, "lmc_proto_type in");

Did you take a look at lmc_trace()?  It's total garbage.  It's better
to just delete it.

regards,
dan carpenter

^ permalink raw reply

* Re: [PATCH] net: thunderx: prevent concurrent data re-writing by nicvf_set_rx_mode
From: Vadim Lomovtsev @ 2018-06-13  9:15 UTC (permalink / raw)
  To: David Miller
  Cc: dnelson, rric, sgoutham, linux-arm-kernel, netdev, linux-kernel,
	Vadim.Lomovtsev
In-Reply-To: <20180612.152540.1304714747425091865.davem@davemloft.net>

Sorry for the delay.

On Tue, Jun 12, 2018 at 03:25:40PM -0700, David Miller wrote:
> From: Dean Nelson <dnelson@redhat.com>
> Date: Mon, 11 Jun 2018 06:22:14 -0500
> 
> > On 06/10/2018 02:35 PM, David Miller wrote:
> >> From: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com>
> >> Date: Fri,  8 Jun 2018 02:27:59 -0700
> >> 
> >>> +	/* Save message data locally to prevent them from
> >>> +	 * being overwritten by next ndo_set_rx_mode call().
> >>> +	 */
> >>> +	spin_lock(&nic->rx_mode_wq_lock);
> >>> +	mode = vf_work->mode;
> >>> +	mc = vf_work->mc;
> >>> +	vf_work->mc = NULL;
> > 
> > If I'm reading this code correctly, I believe nic->rx_mode_work.mc will
> > have been set to NULL before the lock is dropped by
> > nicvf_set_rx_mode_task() and acquired by nicvf_set_rx_mode().
> > 
> > 
> >>> +	spin_unlock(&nic->rx_mode_wq_lock);
> >> At the moment you drop this lock, the memory behind 'mc' can be
> >> freed up by:
> >> 
> >>> +	spin_lock(&nic->rx_mode_wq_lock);
> >>> +	kfree(nic->rx_mode_work.mc);
> > 
> > So the kfree() will be called with a NULL pointer and quickly return.
> > 
> > 
> >> And you'll crash when you dereference it above via
> >> __nicvf_set_rx_mode_task().
> >> 
> > 
> > I believe the call to kfree() in nicvf_set_rx_mode() is there to free
> > up a mc_list that has been allocated by nicvf_set_rx_mode() during a
> > previous callback to the function, one that has not yet been processed
> > by nicvf_set_rx_mode_task().
> > 
> > In this way only the last 'unprocessed' callback to nicvf_set_rx_mode()
> > gets processed should there be multiple callbacks occurring between the
> > times the nicvf_set_rx_mode_task() runs.
> > 
> > In my testing with this patch, this is what I see happening.
> 
> You're right, my bad.
> 
> Patch applied.

Thank you for your time.

WBR,
Vadim

^ permalink raw reply
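
The ownership handoff discussed in this thread can be sketched in user space. This is a minimal model under stated assumptions, not the driver code: `struct mc_list`, `struct rx_mode_work`, `set_rx_mode()` and `take_pending()` are hypothetical stand-ins, a C11 `atomic_flag` plays the role of `nic->rx_mode_wq_lock`, and `free()` stands in for `kfree()`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the driver structures. */
struct mc_list {
	int count;
};

struct rx_mode_work {
	atomic_flag lock;	/* stands in for nic->rx_mode_wq_lock */
	int mode;
	struct mc_list *mc;	/* pending list, not yet taken by the worker */
};

static void work_lock(struct rx_mode_work *w)
{
	while (atomic_flag_test_and_set_explicit(&w->lock, memory_order_acquire))
		;
}

static void work_unlock(struct rx_mode_work *w)
{
	atomic_flag_clear_explicit(&w->lock, memory_order_release);
}

/* Producer (models nicvf_set_rx_mode): drop any list the worker has not
 * taken yet and queue the new one. free(NULL) is a no-op, like kfree(NULL). */
static void set_rx_mode(struct rx_mode_work *w, int mode, struct mc_list *mc)
{
	work_lock(w);
	free(w->mc);
	w->mc = mc;
	w->mode = mode;
	work_unlock(w);
}

/* Consumer (models nicvf_set_rx_mode_task): copy the data out and clear the
 * shared pointer while holding the lock, so the producer can no longer free
 * the list the worker is about to use. */
static struct mc_list *take_pending(struct rx_mode_work *w, int *mode)
{
	struct mc_list *mc;

	work_lock(w);
	*mode = w->mode;
	mc = w->mc;
	w->mc = NULL;
	work_unlock(w);
	return mc;	/* caller owns mc from here on */
}

int demo_last_wins(void)
{
	struct rx_mode_work w = { .lock = ATOMIC_FLAG_INIT, .mode = 0, .mc = NULL };
	struct mc_list *a = malloc(sizeof(*a));
	struct mc_list *b = malloc(sizeof(*b));
	struct mc_list *got;
	int mode, count;

	if (!a || !b) {
		free(a);
		free(b);
		return -1;
	}
	a->count = 1;
	b->count = 2;

	set_rx_mode(&w, 1, a);		/* queued, not yet processed */
	set_rx_mode(&w, 2, b);		/* frees a, queues b */

	got = take_pending(&w, &mode);	/* worker now owns b */
	assert(w.mc == NULL);
	set_rx_mode(&w, 3, NULL);	/* free(NULL): b is untouched */

	count = got->count;
	free(got);
	return count * 10 + mode;
}
```

Only the last unprocessed request survives, and a later `set_rx_mode()` call cannot free the list the worker has already taken, which matches the behaviour Dean describes.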

* Re: [PATCH net-queue] i40e: Fix incorrect skb reserved size on rx
From: Daniel Borkmann @ 2018-06-13  9:06 UTC (permalink / raw)
  To: Toshiaki Makita, Jeff Kirsher; +Cc: John Fastabend, intel-wired-lan, netdev
In-Reply-To: <1528877310-2574-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>

On 06/13/2018 10:08 AM, Toshiaki Makita wrote:
> i40e_build_skb() reserves I40E_SKB_PAD + (xdp->data -
> xdp->data_hard_start) but obviously I40E_SKB_PAD is unnecessary here
> and mac_header/data fields in skb become incorrect, which breaks normal
> skb receive path as well as XDP receive path.
> 
> Fixes: cc5b114dcf98 ("bpf, i40e: add meta data support")
> Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>

Thanks Toshiaki, I sent a complete fix yesterday here:

https://lkml.org/lkml/2018/6/12/843

Cheers,
Daniel

^ permalink raw reply

* [PATCH net/jkirsher] bpf, xdp, i40e: fix i40e_build_skb skb reserve and truesize
From: Daniel Borkmann @ 2018-06-13  9:04 UTC (permalink / raw)
  To: jeffrey.t.kirsher
  Cc: intel-wired-lan, keith.busch, makita.toshiaki, bjorn.topel,
	john.fastabend, netdev, Daniel Borkmann

Using skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start))
is clearly wrong: xdp->data_hard_start is defined as
xdp->data - i40e_rx_offset(rx_ring), and the latter offsets to I40E_SKB_PAD
when build skb is used, so I40E_SKB_PAD already points to the offset where
the original xdp->data was sitting.

However, this already seems broken before cc5b114dcf98 ("bpf, i40e: add meta
data support"): the bpf_xdp_adjust_head() helper could have been used to
alter the headroom and enlarge / shrink the frame, in which case the
assumption that xdp->data remains unchanged does not hold and a bogus
packet would be pushed to the upper stack.

ixgbe got this right in 924708081629 ("ixgbe: add XDP support for pass and
drop actions"). In any case, fix it by removing the I40E_SKB_PAD from both
skb_reserve() and truesize calculation.

Fixes: cc5b114dcf98 ("bpf, i40e: add meta data support")
Fixes: 0c8493d90b6b ("i40e: add XDP support for pass and drop actions")
Reported-by: Keith Busch <keith.busch@linux.intel.com>
Reported-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Cc: Björn Töpel <bjorn.topel@intel.com>
Cc: John Fastabend <john.fastabend@gmail.com>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 8ffb745..ed6dbcf 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2103,9 +2103,8 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 	unsigned int truesize = i40e_rx_pg_size(rx_ring) / 2;
 #else
 	unsigned int truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) +
-				SKB_DATA_ALIGN(I40E_SKB_PAD +
-					       (xdp->data_end -
-						xdp->data_hard_start));
+				SKB_DATA_ALIGN(xdp->data_end -
+					       xdp->data_hard_start);
 #endif
 	struct sk_buff *skb;

@@ -2124,7 +2123,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 		return NULL;

 	/* update pointers within the skb to store the data */
-	skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start));
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	__skb_put(skb, xdp->data_end - xdp->data);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.9.5

^ permalink raw reply related
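
The offset arithmetic in the commit message can be checked with a small stand-alone sketch. It assumes the build skb layout described above, where data_hard_start sits one pad below the original data pointer. `SKB_PAD` is an illustrative value, and `struct xdp_ptrs`, `headroom_old()`, `headroom_new()` and `demo_headroom()` are hypothetical names, not driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative value only; the real I40E_SKB_PAD depends on the build. */
#define SKB_PAD 192

/* Minimal stand-in for the three xdp_buff pointers used by the patch. */
struct xdp_ptrs {
	unsigned char *data_hard_start;
	unsigned char *data;
	unsigned char *data_end;
};

/* Headroom as computed before the fix: the pad is counted twice, because
 * (data - data_hard_start) already includes it. */
static ptrdiff_t headroom_old(const struct xdp_ptrs *x)
{
	return SKB_PAD + (x->data - x->data_hard_start);
}

/* Headroom after the fix: the distance from buffer start to packet data. */
static ptrdiff_t headroom_new(const struct xdp_ptrs *x)
{
	return x->data - x->data_hard_start;
}

/* Compute the reserved headroom after moving data by `adjust` bytes
 * (negative moves it toward the buffer start, as a head adjustment by an
 * XDP program could). */
ptrdiff_t demo_headroom(int adjust, int use_old)
{
	static unsigned char buf[2048];
	struct xdp_ptrs x;

	x.data_hard_start = buf;
	x.data = buf + SKB_PAD + adjust;
	x.data_end = x.data + 64;
	assert(x.data >= x.data_hard_start);

	return use_old ? headroom_old(&x) : headroom_new(&x);
}
```

With an unmodified packet the old formula reserves twice the pad, while the fixed formula tracks the actual data pointer, including after a head adjustment.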

* Re: [PATCH bpf] xdp: Fix handling of devmap in generic XDP
From: kbuild test robot @ 2018-06-13  8:51 UTC (permalink / raw)
  To: Toshiaki Makita
  Cc: kbuild-all, Alexei Starovoitov, Daniel Borkmann, Toshiaki Makita,
	netdev, Jesper Dangaard Brouer
In-Reply-To: <1528877178-2521-1-git-send-email-makita.toshiaki@lab.ntt.co.jp>

[-- Attachment #1: Type: text/plain, Size: 1374 bytes --]

Hi Toshiaki,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on bpf/master]

url:    https://github.com/0day-ci/linux/commits/Toshiaki-Makita/xdp-Fix-handling-of-devmap-in-generic-XDP/20180613-161204
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git master
config: parisc-c3000_defconfig (attached as .config)
compiler: hppa-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.2.0 make.cross ARCH=parisc 

All warnings (new ones prefixed by >>):

   In file included from net/bpf/test_run.c:7:0:
>> include/linux/bpf.h:593:16: warning: 'struct sk_buff' declared inside parameter list will not be visible outside of this definition or declaration
            struct sk_buff *skb,
                   ^~~~~~~

vim +593 include/linux/bpf.h

   591	
   592	static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst,
 > 593						   struct sk_buff *skb,
   594						   struct bpf_prog *xdp_prog)
   595	{
   596		return 0;
   597	}
   598	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 14461 bytes --]

^ permalink raw reply

* Re: FW: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
From: Kalle Valo @ 2018-06-13  8:47 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Govind Singh, bjorn.andersson, davem, netdev, linux-wireless,
	linux-kernel, ath10k
In-Reply-To: <20180612124403.GA26986@centauri.lan>

Niklas Cassel <niklas.cassel@linaro.org> writes:

> On Tue, Jun 12, 2018 at 06:02:48PM +0530, Govind Singh wrote:
>> On 2018-06-12 17:45, Govind Singh wrote:
>> > -----Original Message-----
>> > From: ath10k <ath10k-bounces@lists.infradead.org> On Behalf Of Niklas
>> > Cassel
>> > Sent: Tuesday, June 12, 2018 5:09 PM
>> > To: Kalle Valo <kvalo@codeaurora.org>; David S. Miller
>> > <davem@davemloft.net>
>> > Cc: Niklas Cassel <niklas.cassel@linaro.org>; netdev@vger.kernel.org;
>> > linux-wireless@vger.kernel.org; linux-kernel@vger.kernel.org;
>> > ath10k@lists.infradead.org
>> > Subject: [PATCH 2/2] ath10k: allow ATH10K_SNOC with COMPILE_TEST
>> > 
>> > ATH10K_SNOC builds just fine with COMPILE_TEST, so make that possible.
>> > 
>> > Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
>> > ---
>> >  drivers/net/wireless/ath/ath10k/Kconfig | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> > 
>> > diff --git a/drivers/net/wireless/ath/ath10k/Kconfig
>> > b/drivers/net/wireless/ath/ath10k/Kconfig
>> > index 54ff5930126c..6572a43590a8 100644
>> > --- a/drivers/net/wireless/ath/ath10k/Kconfig
>> > +++ b/drivers/net/wireless/ath/ath10k/Kconfig
>> > @@ -42,7 +42,8 @@ config ATH10K_USB
>> > 
>> >  config ATH10K_SNOC
>> >  	tristate "Qualcomm ath10k SNOC support (EXPERIMENTAL)"
>> > -	depends on ATH10K && ARCH_QCOM
>> > +	depends on ATH10K
>> > +	depends on ARCH_QCOM || COMPILE_TEST
>> >  	---help---
>> >  	  This module adds support for integrated WCN3990 chip connected
>> >  	  to system NOC(SNOC). Currently work in progress and will not
>> 
>> Thanks Niklas for enabling COMPILE_TEST. With the QMI set of changes
>> (https://patchwork.kernel.org/patch/10448183/), we need to enable
>> COMPILE_TEST for QCOM_SCM/QMI_HELPERS, which seems broken today. Are you
>> planning to fix the same?
>
>
> Argh..
>
> qcom_scm seems fine, it is just missing a single definition in the
> #else clause of include/linux/qcom_scm.h.
>
> +++ b/include/linux/qcom_scm.h
> @@ -89,6 +89,10 @@ static inline int qcom_scm_pas_mem_setup(u32 peripheral, phys_addr_t addr,
>  static inline int
>  qcom_scm_pas_auth_and_reset(u32 peripheral) { return -ENODEV; }
>  static inline int qcom_scm_pas_shutdown(u32 peripheral) { return -ENODEV; }
> +static inline int qcom_scm_assign_mem(phys_addr_t mem_addr, size_t mem_sz,
> +                                     unsigned int *src,
> +                                     struct qcom_scm_vmperm *newvm,
> +                                     int dest_cnt) { return -ENODEV; }
>  static inline void qcom_scm_cpu_power_down(u32 flags) {}
>  static inline u32 qcom_scm_get_version(void) { return 0; }
>
>
>
> include/linux/soc/qcom/qmi.h on the other hand doesn't have any
> dummy definitions at all.
> I think that it makes sense to be able to compile test
> the QMI helpers also on other archs..
>
> Bjorn, any opinion?

Please don't drop ath10k list, adding it back.

-- 
Kalle Valo

^ permalink raw reply
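
The dummy-definition pattern Niklas refers to can be sketched as follows. When the real implementation is configured out, a header supplies a static inline stub with the same signature, so callers still compile (for example under COMPILE_TEST) and get an error at run time. `CONFIG_DEMO_SCM`, `demo_scm_assign_mem()`, `struct demo_vmperm` and `demo_probe()` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical config switch; in the kernel this would come from Kconfig.
 * It is left undefined here, so the stub branch below is compiled. */
/* #define CONFIG_DEMO_SCM 1 */

struct demo_vmperm {
	int vmid;
	int perm;
};

#ifdef CONFIG_DEMO_SCM
/* Real implementation, provided elsewhere when the option is enabled. */
int demo_scm_assign_mem(unsigned long addr, size_t size, unsigned int *src,
			struct demo_vmperm *newvm, int dest_cnt);
#else
/* Stub: same signature, no dependencies, always fails with -ENODEV. */
static inline int demo_scm_assign_mem(unsigned long addr, size_t size,
				      unsigned int *src,
				      struct demo_vmperm *newvm, int dest_cnt)
{
	(void)addr;
	(void)size;
	(void)src;
	(void)newvm;
	(void)dest_cnt;
	return -ENODEV;
}
#endif

/* Caller code is identical in both configurations. */
int demo_probe(void)
{
	unsigned int src = 0;
	struct demo_vmperm vm = { .vmid = 1, .perm = 3 };
	int ret = demo_scm_assign_mem(0x1000, 4096, &src, &vm, 1);

	assert(ret <= 0);	/* 0 on success or a negative errno */
	return ret;
}
```

A header with no such stubs, as described for include/linux/soc/qcom/qmi.h, leaves callers with undefined references whenever the option is off.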

* Re: [RFC PATCH 06/12] xen-blkfront: add callbacks for PM suspend and hibernation
From: Roger Pau Monné @ 2018-06-13  8:24 UTC (permalink / raw)
  To: Anchal Agarwal
  Cc: tglx, mingo, hpa, x86, boris.ostrovsky, konrad.wilk, netdev,
	jgross, xen-devel, linux-kernel, kamatam, fllinden, vallish,
	guruanb, eduval, rjw, pavel, len.brown, linux-pm, cyberax
In-Reply-To: <20180612205619.28156-7-anchalag@amazon.com>

On Tue, Jun 12, 2018 at 08:56:13PM +0000, Anchal Agarwal wrote:
> From: Munehisa Kamata <kamatam@amazon.com>
> 
> Add freeze and restore callbacks for PM suspend and hibernation support.
> The freeze handler stops a block-layer queue and disconnects the frontend
> from the backend while freeing ring_info and associated resources. The
> restore handler re-allocates ring_info and re-connects to the backend,
> so the rest of the kernel can continue to use the block device
> transparently. Also, the handlers are used for both PM
> suspend and hibernation so that we can keep the existing suspend/resume
> callbacks for Xen suspend without modification.
> If a backend doesn't have commit 12ea729645ac ("xen/blkback: unmap all
> persistent grants when frontend gets disconnected"), the frontend may see
> a massive amount of grant table warnings when freeing resources.
> 
>  [   36.852659] deferring g.e. 0xf9 (pfn 0xffffffffffffffff)
>  [   36.855089] xen:grant_table: WARNING: g.e. 0x112 still in use!
> 
> In this case, persistent grants would need to be disabled.
> 
> Ensure no reqs/rsps in rings before disconnecting. When disconnecting
> the frontend from the backend in blkfront_freeze(), there still may be
> unconsumed requests or responses in the rings, especially when the
> backend is backed by a network-based device. If the frontend gets
> disconnected with such reqs/rsps remaining there, it can cause
> grant warnings and/or lose reqs/rsps when pages are freed afterward.

I'm not sure why having pending requests can cause grant warnings or
loss of requests. If handled properly this shouldn't be an issue.
Linux blkfront already does live migration (which also involves a
reconnection of the frontend) with pending requests and that doesn't
seem to be an issue.

> This can lead the resumed kernel into an unrecoverable state, such as
> unexpected freeing of a grant page and/or a hung task due to the lost
> reqs or rsps. Therefore we have to ensure that there are no unconsumed
> requests or responses before disconnecting.

Given that we have multiqueue, plus multipage rings, I'm not sure
waiting for the requests on the rings to complete is a good idea.

Why can't you just disconnect the frontend and requeue all the
requests in flight? When the frontend connects on resume those
requests will be queued again.

Thanks, Roger.

^ permalink raw reply
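
Roger's alternative, requeueing in-flight requests on disconnect instead of waiting for the rings to drain, can be illustrated with a toy model. This is a sketch under loud assumptions and not blkfront code: `struct demo_req`, `struct demo_front`, `requeue_inflight()` and the fixed `RING_SLOTS` are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 4	/* hypothetical ring size */

struct demo_req {
	int id;
	struct demo_req *next;
};

struct demo_front {
	struct demo_req *queue_head;		/* waiting to be issued */
	struct demo_req *inflight[RING_SLOTS];	/* issued, no response yet */
};

/* On disconnect, move every in-flight request back to the front of the
 * pending queue. Walking the slots backwards keeps slot order in the queue. */
static int requeue_inflight(struct demo_front *f)
{
	int moved = 0;

	for (int i = RING_SLOTS - 1; i >= 0; i--) {
		struct demo_req *r = f->inflight[i];

		if (!r)
			continue;
		f->inflight[i] = NULL;
		r->next = f->queue_head;
		f->queue_head = r;
		moved++;
	}
	return moved;
}

int demo_freeze(void)
{
	struct demo_req a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };
	struct demo_front f = {
		.queue_head = &c,
		.inflight = { &a, NULL, &b, NULL },
	};
	int moved = requeue_inflight(&f);
	int order = 0;

	assert(moved == 2);
	(void)moved;

	/* Encode the resulting queue order as decimal digits. */
	for (struct demo_req *r = f.queue_head; r; r = r->next)
		order = order * 10 + r->id;
	return order;
}
```

In this model nothing is lost on disconnect: the requests that were in flight are reissued first once the frontend reconnects, ahead of requests that were never sent.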

* [PATCH net-queue] i40e: Fix incorrect skb reserved size on rx
From: Toshiaki Makita @ 2018-06-13  8:08 UTC (permalink / raw)
  To: Jeff Kirsher
  Cc: Toshiaki Makita, Daniel Borkmann, John Fastabend, intel-wired-lan,
	netdev

i40e_build_skb() reserves I40E_SKB_PAD + (xdp->data -
xdp->data_hard_start) but obviously I40E_SKB_PAD is unnecessary here
and mac_header/data fields in skb become incorrect, which breaks normal
skb receive path as well as XDP receive path.

Fixes: cc5b114dcf98 ("bpf, i40e: add meta data support")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 8ffb7454e67c..6d59f51f1730 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2124,7 +2124,7 @@ static struct sk_buff *i40e_build_skb(struct i40e_ring *rx_ring,
 		return NULL;
 
 	/* update pointers within the skb to store the data */
-	skb_reserve(skb, I40E_SKB_PAD + (xdp->data - xdp->data_hard_start));
+	skb_reserve(skb, xdp->data - xdp->data_hard_start);
 	__skb_put(skb, xdp->data_end - xdp->data);
 	if (metasize)
 		skb_metadata_set(skb, metasize);
-- 
2.14.2

^ permalink raw reply related

* [PATCH bpf] xdp: Fix handling of devmap in generic XDP
From: Toshiaki Makita @ 2018-06-13  8:06 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann
  Cc: Toshiaki Makita, netdev, Jesper Dangaard Brouer

Commit 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue") changed
the return value type of __dev_map_lookup_elem() from struct net_device *
to struct bpf_dtab_netdev * but forgot to modify generic XDP code
accordingly.
Thus generic XDP incorrectly used struct bpf_dtab_netdev where struct
net_device is expected, and skb->dev was set to an invalid value.

Fixes: 67f29e07e131 ("bpf: devmap introduce dev_map_enqueue")
Signed-off-by: Toshiaki Makita <makita.toshiaki@lab.ntt.co.jp>
---
 include/linux/bpf.h    | 10 ++++++++++
 include/linux/filter.h | 16 ++++++++++++++++
 kernel/bpf/devmap.c    | 14 ++++++++++++++
 net/core/filter.c      | 21 ++++-----------------
 4 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 995c3b1..2fe3aa1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -487,6 +487,7 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
 void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
 
 /* Map specifics */
+struct sk_buff;
 struct xdp_buff;
 
 struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key);
@@ -494,6 +495,8 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
 void __dev_map_flush(struct bpf_map *map);
 int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 		    struct net_device *dev_rx);
+int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
+			     struct bpf_prog *xdp_prog);
 
 struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key);
 void __cpu_map_insert_ctx(struct bpf_map *map, u32 index);
@@ -586,6 +589,13 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 	return 0;
 }
 
+static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst,
+					   struct sk_buff *skb,
+					   struct bpf_prog *xdp_prog)
+{
+	return 0;
+}
+
 static inline
 struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
 {
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 45fc0f5..8ddff1f 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -19,6 +19,7 @@
 #include <linux/cryptohash.h>
 #include <linux/set_memory.h>
 #include <linux/kallsyms.h>
+#include <linux/if_vlan.h>
 
 #include <net/sch_generic.h>
 
@@ -786,6 +787,21 @@ static inline bool bpf_dump_raw_ok(void)
 struct bpf_prog *bpf_patch_insn_single(struct bpf_prog *prog, u32 off,
 				       const struct bpf_insn *patch, u32 len);
 
+static inline int __xdp_generic_ok_fwd_dev(struct sk_buff *skb,
+					   struct net_device *fwd)
+{
+	unsigned int len;
+
+	if (unlikely(!(fwd->flags & IFF_UP)))
+		return -ENETDOWN;
+
+	len = fwd->mtu + fwd->hard_header_len + VLAN_HLEN;
+	if (skb->len > len)
+		return -EMSGSIZE;
+
+	return 0;
+}
+
 /* The pair of xdp_do_redirect and xdp_do_flush_map MUST be called in the
  * same cpu context. Further for best results no more than a single map
  * for the do_redirect/do_flush pair should be used. This limitation is
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index a7cc7b3..642c97f 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -345,6 +345,20 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 	return bq_enqueue(dst, xdpf, dev_rx);
 }
 
+int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
+			     struct bpf_prog *xdp_prog)
+{
+	int err;
+
+	err = __xdp_generic_ok_fwd_dev(skb, dst->dev);
+	if (unlikely(err))
+		return err;
+	skb->dev = dst->dev;
+	generic_xdp_tx(skb, xdp_prog);
+
+	return 0;
+}
+
 static void *dev_map_lookup_elem(struct bpf_map *map, void *key)
 {
 	struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key);
diff --git a/net/core/filter.c b/net/core/filter.c
index 3d9ba7e..e7f12e9 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3214,20 +3214,6 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 }
 EXPORT_SYMBOL_GPL(xdp_do_redirect);
 
-static int __xdp_generic_ok_fwd_dev(struct sk_buff *skb, struct net_device *fwd)
-{
-	unsigned int len;
-
-	if (unlikely(!(fwd->flags & IFF_UP)))
-		return -ENETDOWN;
-
-	len = fwd->mtu + fwd->hard_header_len + VLAN_HLEN;
-	if (skb->len > len)
-		return -EMSGSIZE;
-
-	return 0;
-}
-
 static int xdp_do_generic_redirect_map(struct net_device *dev,
 				       struct sk_buff *skb,
 				       struct xdp_buff *xdp,
@@ -3256,10 +3242,11 @@ static int xdp_do_generic_redirect_map(struct net_device *dev,
 	}
 
 	if (map->map_type == BPF_MAP_TYPE_DEVMAP) {
-		if (unlikely((err = __xdp_generic_ok_fwd_dev(skb, fwd))))
+		struct bpf_dtab_netdev *dst = fwd;
+
+		err = dev_map_generic_redirect(dst, skb, xdp_prog);
+		if (unlikely(err))
 			goto err;
-		skb->dev = fwd;
-		generic_xdp_tx(skb, xdp_prog);
 	} else if (map->map_type == BPF_MAP_TYPE_XSKMAP) {
 		struct xdp_sock *xs = fwd;
 
-- 
1.8.3.1

^ permalink raw reply related
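
The bug class fixed by this patch can be sketched in user space: the map lookup hands back an untyped pointer to a wrapper entry, and the device has to be reached through the wrapper's dev member rather than by treating the wrapper itself as a device. The structs below are simplified stand-ins with the same names as the kernel types, `DEMO_IFF_UP` and `DEMO_VLAN_HLEN` are illustrative constants, and `demo_redirect()` and `demo_run()` are hypothetical helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_IFF_UP	0x1
#define DEMO_VLAN_HLEN	4

/* Simplified stand-ins; the real kernel structures have many more fields. */
struct net_device {
	unsigned int flags;
	unsigned int mtu;
	unsigned int hard_header_len;
};

struct bpf_dtab_netdev {
	struct net_device *dev;
	unsigned int bit;
};

struct sk_buff {
	unsigned int len;
	struct net_device *dev;
};

/* Same checks as __xdp_generic_ok_fwd_dev() in the patch: the device must
 * be up and the packet must fit MTU + link header + one VLAN tag. */
static int ok_fwd_dev(const struct sk_buff *skb, const struct net_device *fwd)
{
	unsigned int len;

	if (!(fwd->flags & DEMO_IFF_UP))
		return -ENETDOWN;

	len = fwd->mtu + fwd->hard_header_len + DEMO_VLAN_HLEN;
	if (skb->len > len)
		return -EMSGSIZE;

	return 0;
}

/* `fwd` is what the map lookup returned: a wrapper entry, not a device. */
int demo_redirect(void *fwd, struct sk_buff *skb)
{
	struct bpf_dtab_netdev *dst = fwd;
	int err = ok_fwd_dev(skb, dst->dev);

	if (err)
		return err;
	skb->dev = dst->dev;	/* not `fwd` itself */
	return 0;
}

int demo_run(unsigned int pkt_len, unsigned int flags)
{
	struct net_device dev = { .flags = flags, .mtu = 1500, .hard_header_len = 14 };
	struct bpf_dtab_netdev entry = { .dev = &dev, .bit = 0 };
	struct sk_buff skb = { .len = pkt_len, .dev = NULL };
	int err = demo_redirect(&entry, &skb);

	assert(err != 0 || skb.dev == &dev);
	return err;
}
```

With a 1500-byte MTU and a 14-byte link header the limit is 1518 bytes, so a 1519-byte packet is rejected with -EMSGSIZE and a device that is down yields -ENETDOWN.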

* Re: KASAN: use-after-free Read in rds_cong_queue_updates
From: santosh.shilimkar @ 2018-06-13  7:54 UTC (permalink / raw)
  To: Dmitry Vyukov, syzbot
  Cc: Sowmini Varadhan, David Miller, LKML, linux-rdma, netdev,
	rds-devel, syzkaller-bugs
In-Reply-To: <CACT4Y+b+DFpa34E3UWzSYn6qy-BMmxN80ikdmU_xMe9x9JF2+Q@mail.gmail.com>



On 6/13/18 12:51 AM, Dmitry Vyukov wrote:
> On Wed, Jun 13, 2018 at 4:51 AM, syzbot
> <syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com> wrote:
>> syzbot has found a reproducer for the following crash on:
> 
> Woohoo! Please take a look, this is a top crasher.
> 
Will have a look Dmitry !!

^ permalink raw reply

* Re: KASAN: out-of-bounds Read in rds_cong_queue_updates (2)
From: Dmitry Vyukov @ 2018-06-13  7:52 UTC (permalink / raw)
  To: syzbot, Sowmini Varadhan
  Cc: David Miller, LKML, linux-rdma, netdev, rds-devel,
	Santosh Shilimkar, syzkaller-bugs
In-Reply-To: <00000000000081bd9d056e813e48@google.com>

On Wed, Jun 13, 2018 at 9:51 AM, syzbot
<syzbot+287843ad8a4d2870e538@syzkaller.appspotmail.com> wrote:
> Hello,
>
> syzbot found the following crash on:
>
> HEAD commit:    0adb32858b0b Linux 4.16
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=138f2d0b800000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=df0c336cc3b55d45
> dashboard link: https://syzkaller.appspot.com/bug?extid=287843ad8a4d2870e538
> compiler:       gcc (GCC) 7.1.1 20170620
>
> Unfortunately, I don't have any reproducer for this crash yet.
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+287843ad8a4d2870e538@syzkaller.appspotmail.com

I think this is:

#syz dup: KASAN: use-after-free Read in rds_cong_queue_updates

> ==================================================================
> BUG: KASAN: out-of-bounds in __read_once_size include/linux/compiler.h:188 [inline]
> BUG: KASAN: out-of-bounds in atomic_read arch/x86/include/asm/atomic.h:27 [inline]
> BUG: KASAN: out-of-bounds in refcount_read include/linux/refcount.h:42 [inline]
> BUG: KASAN: out-of-bounds in check_net include/net/net_namespace.h:228 [inline]
> BUG: KASAN: out-of-bounds in rds_destroy_pending net/rds/rds.h:868 [inline]
> BUG: KASAN: out-of-bounds in rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
> Read of size 4 at addr ffff88018d7f2204 by task kworker/u4:6/10561
>
> CPU: 1 PID: 10561 Comm: kworker/u4:6 Not tainted 4.16.0+ #10
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: krdsd rds_send_worker
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
> kernel msg: ebtables bug: please report to author: Wrong len argument
>  print_address_description+0x73/0x250 mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report+0x23c/0x360 mm/kasan/report.c:412
>  __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
>  __read_once_size include/linux/compiler.h:188 [inline]
>  atomic_read arch/x86/include/asm/atomic.h:27 [inline]
>  refcount_read include/linux/refcount.h:42 [inline]
>  check_net include/net/net_namespace.h:228 [inline]
>  rds_destroy_pending net/rds/rds.h:868 [inline]
>  rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
>  rds_recv_rcvbuf_delta.part.2+0x289/0x320 net/rds/recv.c:118
>  rds_recv_rcvbuf_delta net/rds/recv.c:377 [inline]
>  rds_recv_incoming+0xeb4/0x11d0 net/rds/recv.c:377
>  rds_loop_xmit+0x149/0x320 net/rds/loop.c:82
>  rds_send_xmit+0xbcd/0x26b0 net/rds/send.c:355
>  rds_send_worker+0x115/0x2a0 net/rds/threads.c:199
>  process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
>  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>  kthread+0x33c/0x400 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
>
> The buggy address belongs to the page:
> page:ffffea000635fc80 count:3 mapcount:2 mapping:0000000000000000 index:0x0
> flags: 0x2fffc0000000000()
> raw: 02fffc0000000000 0000000000000000 0000000000000000 0000000300000001
> raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
> page dumped because: kasan: bad access detected
>
> Memory state around the buggy address:
>  ffff88018d7f2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff88018d7f2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>                       ^
>  ffff88018d7f2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>  ffff88018d7f2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ==================================================================
> Kernel panic - not syncing: panic_on_warn set ...
>
> CPU: 1 PID: 10561 Comm: kworker/u4:6 Tainted: G    B            4.16.0+ #10
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Workqueue: krdsd rds_send_worker
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>  panic+0x1e4/0x41c kernel/panic.c:183
>  kasan_end_report+0x50/0x50 mm/kasan/report.c:180
>  kasan_report_error mm/kasan/report.c:359 [inline]
>  kasan_report+0x149/0x360 mm/kasan/report.c:412
>  __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
>  __read_once_size include/linux/compiler.h:188 [inline]
>  atomic_read arch/x86/include/asm/atomic.h:27 [inline]
>  refcount_read include/linux/refcount.h:42 [inline]
>  check_net include/net/net_namespace.h:228 [inline]
>  rds_destroy_pending net/rds/rds.h:868 [inline]
>  rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
>  rds_recv_rcvbuf_delta.part.2+0x289/0x320 net/rds/recv.c:118
>  rds_recv_rcvbuf_delta net/rds/recv.c:377 [inline]
>  rds_recv_incoming+0xeb4/0x11d0 net/rds/recv.c:377
>  rds_loop_xmit+0x149/0x320 net/rds/loop.c:82
>  rds_send_xmit+0xbcd/0x26b0 net/rds/send.c:355
>  rds_send_worker+0x115/0x2a0 net/rds/threads.c:199
>  process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
>  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
>  kthread+0x33c/0x400 kernel/kthread.c:238
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
> Dumping ftrace buffer:
>    (ftrace buffer empty)
> Kernel Offset: disabled
> Rebooting in 86400 seconds..
>
>
> ---
> This bug is generated by a bot. It may contain errors.
> See https://goo.gl/tpsmEJ for more information about syzbot.
> syzbot engineers can be reached at syzkaller@googlegroups.com.
>
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> syzbot.
>

^ permalink raw reply

* Re: KASAN: use-after-free Read in rds_cong_queue_updates
From: Dmitry Vyukov @ 2018-06-13  7:51 UTC (permalink / raw)
  To: syzbot, Sowmini Varadhan
  Cc: David Miller, LKML, linux-rdma, netdev, rds-devel,
	Santosh Shilimkar, syzkaller-bugs
In-Reply-To: <000000000000a643e4056e7d0db4@google.com>

On Wed, Jun 13, 2018 at 4:51 AM, syzbot
<syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com> wrote:
> syzbot has found a reproducer for the following crash on:

Woohoo! Please take a look, this is a top crasher.

> HEAD commit:    f0dc7f9c6dd9 Merge git://git.kernel.org/pub/scm/linux/kern..
> git tree:       net-next
> console output: https://syzkaller.appspot.com/x/log.txt?x=1461f03f800000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=fa9c20c48788d1c1
> dashboard link: https://syzkaller.appspot.com/bug?extid=4c20b3866171ce8441d2
> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
> syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=16cbfeaf800000
> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=165227f7800000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+4c20b3866171ce8441d2@syzkaller.appspotmail.com
>
> IPv6: ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
> 8021q: adding VLAN 0 to HW filter on device team0
> IPVS: ftp: loaded support on port[0] = 21
> IPVS: ftp: loaded support on port[0] = 21
> ==================================================================
> BUG: KASAN: use-after-free in atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
> BUG: KASAN: use-after-free in refcount_read include/linux/refcount.h:42 [inline]
> BUG: KASAN: use-after-free in check_net include/net/net_namespace.h:236 [inline]
> BUG: KASAN: use-after-free in rds_destroy_pending net/rds/rds.h:897 [inline]
> BUG: KASAN: use-after-free in rds_cong_queue_updates+0x255/0x590 net/rds/cong.c:226
> Read of size 4 at addr ffff8801ab180044 by task syz-executor199/4800
>
> CPU: 1 PID: 4800 Comm: syz-executor199 Not tainted 4.17.0+ #84
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>  print_address_description+0x6c/0x20b mm/kasan/report.c:256
>  kasan_report_error mm/kasan/report.c:354 [inline]
>  kasan_report.cold.7+0x242/0x2fe mm/kasan/report.c:412
>  check_memory_region_inline mm/kasan/kasan.c:260 [inline]
>  check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
>  kasan_check_read+0x11/0x20 mm/kasan/kasan.c:272
>  atomic_read include/asm-generic/atomic-instrumented.h:21 [inline]
>  refcount_read include/linux/refcount.h:42 [inline]
>  check_net include/net/net_namespace.h:236 [inline]
>  rds_destroy_pending net/rds/rds.h:897 [inline]
>  rds_cong_queue_updates+0x255/0x590 net/rds/cong.c:226
>  rds_recv_rcvbuf_delta.part.3+0x211/0x350 net/rds/recv.c:126
>  rds_recv_rcvbuf_delta net/rds/recv.c:735 [inline]
>  rds_clear_recv_queue+0x2f0/0x4c0 net/rds/recv.c:735
>  rds_release+0x15c/0x550 net/rds/af_rds.c:72
>  __sock_release+0xd7/0x260 net/socket.c:603
>  sock_close+0x19/0x20 net/socket.c:1186
>  __fput+0x353/0x890 fs/file_table.c:209
>  ____fput+0x15/0x20 fs/file_table.c:243
>  task_work_run+0x1e4/0x290 kernel/task_work.c:113
>  exit_task_work include/linux/task_work.h:22 [inline]
>  do_exit+0x1aee/0x2730 kernel/exit.c:865
>  do_group_exit+0x16f/0x430 kernel/exit.c:968
>  get_signal+0x886/0x1960 kernel/signal.c:2468
>  do_signal+0x9c/0x21c0 arch/x86/kernel/signal.c:816
>  exit_to_usermode_loop+0x2cf/0x360 arch/x86/entry/common.c:162
>  prepare_exit_to_usermode arch/x86/entry/common.c:197 [inline]
>  syscall_return_slowpath arch/x86/entry/common.c:268 [inline]
>  do_syscall_64+0x6ac/0x800 arch/x86/entry/common.c:293
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x44f439
> Code: e8 ac be 02 00 48 83 c4 18 c3 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7
> 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff
> 0f 83 5b ff fb ff c3 66 2e 0f 1f 84 00 00 00 00
> RSP: 002b:00007fc65567dcf8 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
> RAX: fffffffffffffe00 RBX: 00000000006edadc RCX: 000000000044f439
> RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000006edadc
> RBP: 00000000006edad8 R08: 0000000000000000 R09: 0000000000000000
> R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
> R13: 00007fff3df31b1f R14: 00007fc65567e9c0 R15: 0000000000000061
>
> Allocated by task 4800:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  kasan_kmalloc+0xc4/0xe0 mm/kasan/kasan.c:553
>  kasan_slab_alloc+0x12/0x20 mm/kasan/kasan.c:490
>  kmem_cache_alloc+0x12e/0x760 mm/slab.c:3554
>  kmem_cache_zalloc include/linux/slab.h:696 [inline]
>  net_alloc net/core/net_namespace.c:383 [inline]
>  copy_net_ns+0x159/0x4c0 net/core/net_namespace.c:423
>  create_new_namespaces+0x69d/0x8f0 kernel/nsproxy.c:107
>  unshare_nsproxy_namespaces+0xc3/0x1f0 kernel/nsproxy.c:206
>  ksys_unshare+0x708/0xf90 kernel/fork.c:2411
>  __do_sys_unshare kernel/fork.c:2479 [inline]
>  __se_sys_unshare kernel/fork.c:2477 [inline]
>  __x64_sys_unshare+0x31/0x40 kernel/fork.c:2477
>  do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:290
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> Freed by task 746:
>  save_stack+0x43/0xd0 mm/kasan/kasan.c:448
>  set_track mm/kasan/kasan.c:460 [inline]
>  __kasan_slab_free+0x11a/0x170 mm/kasan/kasan.c:521
>  kasan_slab_free+0xe/0x10 mm/kasan/kasan.c:528
>  __cache_free mm/slab.c:3498 [inline]
>  kmem_cache_free+0x86/0x2d0 mm/slab.c:3756
>  net_free net/core/net_namespace.c:399 [inline]
>  net_drop_ns.part.14+0x11a/0x130 net/core/net_namespace.c:406
>  net_drop_ns net/core/net_namespace.c:405 [inline]
>  cleanup_net+0x6a1/0xb20 net/core/net_namespace.c:541
>  process_one_work+0xc64/0x1b70 kernel/workqueue.c:2153
>  worker_thread+0x181/0x13a0 kernel/workqueue.c:2296
>  kthread+0x345/0x410 kernel/kthread.c:240
>  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
>
> The buggy address belongs to the object at ffff8801ab180040
>  which belongs to the cache net_namespace(17:syz0) of size 8896
> The buggy address is located 4 bytes inside of
>  8896-byte region [ffff8801ab180040, ffff8801ab182300)
> The buggy address belongs to the page:
> page:ffffea0006ac6000 count:1 mapcount:0 mapping:ffff8801aeaa0080 index:0x0
> compound_mapcount: 0
> flags: 0x2fffc0000008100(slab|head)
> raw: 02fffc0000008100 ffff8801d3827048 ffff8801d3827048 ffff8801aeaa0080
> raw: 0000000000000000 ffff8801ab180040 0000000100000001 ffff8801ab7cae40
> page dumped because: kasan: bad access detected
> page->mem_cgroup:ffff8801ab7cae40
>
> Memory state around the buggy address:
>  ffff8801ab17ff00: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>  ffff8801ab17ff80: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
>                                            ^
>  ffff8801ab180080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>  ffff8801ab180100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> ==================================================================
>
> --
> You received this message because you are subscribed to the Google Groups
> "syzkaller-bugs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to syzkaller-bugs+unsubscribe@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/syzkaller-bugs/000000000000a643e4056e7d0db4%40google.com.
>
> For more options, visit https://groups.google.com/d/optout.

^ permalink raw reply

* KASAN: out-of-bounds Read in rds_cong_queue_updates (2)
From: syzbot @ 2018-06-13  7:51 UTC (permalink / raw)
  To: davem, linux-kernel, linux-rdma, netdev, rds-devel,
	santosh.shilimkar, syzkaller-bugs

Hello,

syzbot found the following crash on:

HEAD commit:    0adb32858b0b Linux 4.16
git tree:       upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=138f2d0b800000
kernel config:  https://syzkaller.appspot.com/x/.config?x=df0c336cc3b55d45
dashboard link: https://syzkaller.appspot.com/bug?extid=287843ad8a4d2870e538
compiler:       gcc (GCC) 7.1.1 20170620

Unfortunately, I don't have any reproducer for this crash yet.

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+287843ad8a4d2870e538@syzkaller.appspotmail.com

==================================================================
BUG: KASAN: out-of-bounds in __read_once_size include/linux/compiler.h:188 [inline]
BUG: KASAN: out-of-bounds in atomic_read arch/x86/include/asm/atomic.h:27 [inline]
BUG: KASAN: out-of-bounds in refcount_read include/linux/refcount.h:42 [inline]
BUG: KASAN: out-of-bounds in check_net include/net/net_namespace.h:228 [inline]
BUG: KASAN: out-of-bounds in rds_destroy_pending net/rds/rds.h:868 [inline]
BUG: KASAN: out-of-bounds in rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
Read of size 4 at addr ffff88018d7f2204 by task kworker/u4:6/10561

CPU: 1 PID: 10561 Comm: kworker/u4:6 Not tainted 4.16.0+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: krdsd rds_send_worker
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x24d lib/dump_stack.c:53
kernel msg: ebtables bug: please report to author: Wrong len argument
  print_address_description+0x73/0x250 mm/kasan/report.c:256
  kasan_report_error mm/kasan/report.c:354 [inline]
  kasan_report+0x23c/0x360 mm/kasan/report.c:412
  __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
  __read_once_size include/linux/compiler.h:188 [inline]
  atomic_read arch/x86/include/asm/atomic.h:27 [inline]
  refcount_read include/linux/refcount.h:42 [inline]
  check_net include/net/net_namespace.h:228 [inline]
  rds_destroy_pending net/rds/rds.h:868 [inline]
  rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
  rds_recv_rcvbuf_delta.part.2+0x289/0x320 net/rds/recv.c:118
  rds_recv_rcvbuf_delta net/rds/recv.c:377 [inline]
  rds_recv_incoming+0xeb4/0x11d0 net/rds/recv.c:377
  rds_loop_xmit+0x149/0x320 net/rds/loop.c:82
  rds_send_xmit+0xbcd/0x26b0 net/rds/send.c:355
  rds_send_worker+0x115/0x2a0 net/rds/threads.c:199
  process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406

The buggy address belongs to the page:
page:ffffea000635fc80 count:3 mapcount:2 mapping:0000000000000000 index:0x0
flags: 0x2fffc0000000000()
raw: 02fffc0000000000 0000000000000000 0000000000000000 0000000300000001
raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
  ffff88018d7f2100: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff88018d7f2180: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> ffff88018d7f2200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
                       ^
  ffff88018d7f2280: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  ffff88018d7f2300: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
==================================================================
Kernel panic - not syncing: panic_on_warn set ...

CPU: 1 PID: 10561 Comm: kworker/u4:6 Tainted: G    B            4.16.0+ #10
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Workqueue: krdsd rds_send_worker
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x24d lib/dump_stack.c:53
  panic+0x1e4/0x41c kernel/panic.c:183
  kasan_end_report+0x50/0x50 mm/kasan/report.c:180
  kasan_report_error mm/kasan/report.c:359 [inline]
  kasan_report+0x149/0x360 mm/kasan/report.c:412
  __asan_report_load4_noabort+0x14/0x20 mm/kasan/report.c:432
  __read_once_size include/linux/compiler.h:188 [inline]
  atomic_read arch/x86/include/asm/atomic.h:27 [inline]
  refcount_read include/linux/refcount.h:42 [inline]
  check_net include/net/net_namespace.h:228 [inline]
  rds_destroy_pending net/rds/rds.h:868 [inline]
  rds_cong_queue_updates+0x4d3/0x4f0 net/rds/cong.c:226
  rds_recv_rcvbuf_delta.part.2+0x289/0x320 net/rds/recv.c:118
  rds_recv_rcvbuf_delta net/rds/recv.c:377 [inline]
  rds_recv_incoming+0xeb4/0x11d0 net/rds/recv.c:377
  rds_loop_xmit+0x149/0x320 net/rds/loop.c:82
  rds_send_xmit+0xbcd/0x26b0 net/rds/send.c:355
  rds_send_worker+0x115/0x2a0 net/rds/threads.c:199
  process_one_work+0xc47/0x1bb0 kernel/workqueue.c:2113
  worker_thread+0x223/0x1990 kernel/workqueue.c:2247
  kthread+0x33c/0x400 kernel/kthread.c:238
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
Dumping ftrace buffer:
    (ftrace buffer empty)
Kernel Offset: disabled
Rebooting in 86400 seconds..


---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.

syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with  
syzbot.

^ permalink raw reply

* Re: [PATCH 4/4] net: emaclite: Remove xemaclite_mdio_setup return check
From: Andrew Lunn @ 2018-06-13  7:29 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: davem, michal.simek, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-5-git-send-email-radhey.shyam.pandey@xilinx.com>

On Wed, Jun 13, 2018 at 12:05:19PM +0530, Radhey Shyam Pandey wrote:
> Errors are already reported in xemaclite_mdio_setup, so avoid
> reporting them again.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> Signed-off-by: Michal Simek <michal.simek@xilinx.com>
> ---
>  drivers/net/ethernet/xilinx/xilinx_emaclite.c |    4 +---
>  1 files changed, 1 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> index ec4608e..2a0c06e 100644
> --- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> +++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
> @@ -1143,9 +1143,7 @@ static int xemaclite_of_probe(struct platform_device *ofdev)
>  	xemaclite_update_address(lp, ndev->dev_addr);
>  
>  	lp->phy_node = of_parse_phandle(ofdev->dev.of_node, "phy-handle", 0);
> -	rc = xemaclite_mdio_setup(lp, &ofdev->dev);
> -	if (rc)
> -		dev_warn(&ofdev->dev, "error registering MDIO bus\n");
> +	xemaclite_mdio_setup(lp, &ofdev->dev);
>  
>  	dev_info(dev, "MAC address is now %pM\n", ndev->dev_addr);

The patch itself is O.K. 

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

However, do you want to keep going if MDIO bus registration fails? Maybe
you should fail the probe?

    Andrew
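
For illustration only, a user-space sketch of what failing the probe on an
MDIO setup error could look like. fake_dev, fake_mdio_setup() and
fake_probe() are hypothetical stand-ins, not the driver's code, and whether
this is the right behaviour for emaclite is exactly the open question above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for driver state; not the real struct net_local. */
struct fake_dev {
	bool mdio_ok;     /* whether the simulated MDIO registration succeeds */
	bool resources;   /* true while probe-time resources are held */
	bool registered;  /* true once the simulated netdev is registered */
};

/* Simulated MDIO setup: returns 0 on success or a negative errno. */
static int fake_mdio_setup(struct fake_dev *d)
{
	return d->mdio_ok ? 0 : -ENODEV;
}

/* Simulated probe that propagates the MDIO error instead of ignoring it. */
static int fake_probe(struct fake_dev *d)
{
	int rc;

	assert(d != NULL);
	d->resources = true;

	rc = fake_mdio_setup(d);
	if (rc)
		goto err_release;

	d->registered = true;
	return 0;

err_release:
	/* Undo everything acquired before the failing step. */
	d->resources = false;
	return rc;
}
```

The point of the sketch is only the control flow: the error code from the
setup step is returned to the caller after unwinding earlier steps.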

^ permalink raw reply

* Re: [PATCH 3/4] net: emaclite: Remove unused 'has_mdio' flag.
From: Andrew Lunn @ 2018-06-13  7:23 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: davem, michal.simek, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-4-git-send-email-radhey.shyam.pandey@xilinx.com>

On Wed, Jun 13, 2018 at 12:05:18PM +0530, Radhey Shyam Pandey wrote:
> Remove unused 'has_mdio' flag.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> Signed-off-by: Michal Simek <michal.simek@xilinx.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 2/4] net: emaclite: Fix MDIO bus unregister bug
From: Andrew Lunn @ 2018-06-13  7:23 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: davem, michal.simek, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-3-git-send-email-radhey.shyam.pandey@xilinx.com>

On Wed, Jun 13, 2018 at 12:05:17PM +0530, Radhey Shyam Pandey wrote:
> Since the 'has_mdio' flag is not used, the sequence insmod -> rmmod -> insmod
> fails because the MDIO bus is not unregistered in .remove().
> Fix it by checking the MII bus pointer instead.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> Signed-off-by: Michal Simek <michal.simek@xilinx.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 1/4] net: emaclite: Fix position of lp->mii_bus assignment
From: Andrew Lunn @ 2018-06-13  7:21 UTC (permalink / raw)
  To: Radhey Shyam Pandey
  Cc: davem, michal.simek, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-2-git-send-email-radhey.shyam.pandey@xilinx.com>

On Wed, Jun 13, 2018 at 12:05:16PM +0530, Radhey Shyam Pandey wrote:
> To ensure the MDIO bus is not double freed in the remove() path,
> assign lp->mii_bus after MDIO bus registration.
> 
> Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
> Signed-off-by: Michal Simek <michal.simek@xilinx.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* [PATCH 4/4] net: emaclite: Remove xemaclite_mdio_setup return check
From: Radhey Shyam Pandey @ 2018-06-13  6:35 UTC (permalink / raw)
  To: davem, michal.simek, radhey.shyam.pandey
  Cc: netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-1-git-send-email-radhey.shyam.pandey@xilinx.com>

Errors are already reported in xemaclite_mdio_setup, so avoid
reporting them again.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |    4 +---
 1 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index ec4608e..2a0c06e 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1143,9 +1143,7 @@ static int xemaclite_of_probe(struct platform_device *ofdev)
 	xemaclite_update_address(lp, ndev->dev_addr);
 
 	lp->phy_node = of_parse_phandle(ofdev->dev.of_node, "phy-handle", 0);
-	rc = xemaclite_mdio_setup(lp, &ofdev->dev);
-	if (rc)
-		dev_warn(&ofdev->dev, "error registering MDIO bus\n");
+	xemaclite_mdio_setup(lp, &ofdev->dev);
 
 	dev_info(dev, "MAC address is now %pM\n", ndev->dev_addr);
 
-- 
1.7.1

^ permalink raw reply related

* [PATCH 3/4] net: emaclite: Remove unused 'has_mdio' flag.
From: Radhey Shyam Pandey @ 2018-06-13  6:35 UTC (permalink / raw)
  To: davem, michal.simek, radhey.shyam.pandey
  Cc: netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-1-git-send-email-radhey.shyam.pandey@xilinx.com>

Remove unused 'has_mdio' flag.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |    2 --
 1 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 06eb6c8..ec4608e 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -123,7 +123,6 @@
  * @phy_node:		pointer to the PHY device node
  * @mii_bus:		pointer to the MII bus
  * @last_link:		last link status
- * @has_mdio:		indicates whether MDIO is included in the HW
  */
 struct net_local {
 
@@ -144,7 +143,6 @@ struct net_local {
 	struct mii_bus *mii_bus;
 
 	int last_link;
-	bool has_mdio;
 };
 
 
-- 
1.7.1

^ permalink raw reply related

* [PATCH 2/4] net: emaclite: Fix MDIO bus unregister bug
From: Radhey Shyam Pandey @ 2018-06-13  6:35 UTC (permalink / raw)
  To: davem, michal.simek, radhey.shyam.pandey
  Cc: netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-1-git-send-email-radhey.shyam.pandey@xilinx.com>

Since the 'has_mdio' flag is not used, the sequence insmod -> rmmod -> insmod
fails because the MDIO bus is not unregistered in .remove().
Fix it by checking the MII bus pointer instead.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 37989ce..06eb6c8 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -1191,7 +1191,7 @@ static int xemaclite_of_remove(struct platform_device *of_dev)
 	struct net_local *lp = netdev_priv(ndev);
 
 	/* Un-register the mii_bus, if configured */
-	if (lp->has_mdio) {
+	if (lp->mii_bus) {
 		mdiobus_unregister(lp->mii_bus);
 		mdiobus_free(lp->mii_bus);
 		lp->mii_bus = NULL;
-- 
1.7.1

^ permalink raw reply related

* [PATCH 1/4] net: emaclite: Fix position of lp->mii_bus assignment
From: Radhey Shyam Pandey @ 2018-06-13  6:35 UTC (permalink / raw)
  To: davem, michal.simek, radhey.shyam.pandey
  Cc: netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <1528871719-1681-1-git-send-email-radhey.shyam.pandey@xilinx.com>

To ensure the MDIO bus is not double freed in the remove() path,
assign lp->mii_bus after MDIO bus registration.

Signed-off-by: Radhey Shyam Pandey <radhey.shyam.pandey@xilinx.com>
Signed-off-by: Michal Simek <michal.simek@xilinx.com>
---
 drivers/net/ethernet/xilinx/xilinx_emaclite.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/xilinx/xilinx_emaclite.c b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
index 69e31ce..37989ce 100644
--- a/drivers/net/ethernet/xilinx/xilinx_emaclite.c
+++ b/drivers/net/ethernet/xilinx/xilinx_emaclite.c
@@ -863,14 +863,14 @@ static int xemaclite_mdio_setup(struct net_local *lp, struct device *dev)
 	bus->write = xemaclite_mdio_write;
 	bus->parent = dev;
 
-	lp->mii_bus = bus;
-
 	rc = of_mdiobus_register(bus, np);
 	if (rc) {
 		dev_err(dev, "Failed to register mdio bus.\n");
 		goto err_register;
 	}
 
+	lp->mii_bus = bus;
+
 	return 0;
 
 err_register:
-- 
1.7.1
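
A user-space sketch of the ordering this patch is about, for illustration
only. It assumes, as the commit message implies, that the setup error path
frees the bus. All fake_* names and the live_buses counter are hypothetical
and exist only to make a double free observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct fake_bus { int id; };
struct fake_priv { struct fake_bus *mii_bus; };

/* Allocations not yet freed; a negative value would mean a double free. */
static int live_buses;

static struct fake_bus *fake_bus_alloc(void)
{
	struct fake_bus *bus = calloc(1, sizeof(*bus));

	if (bus)
		live_buses++;
	return bus;
}

static void fake_bus_free(struct fake_bus *bus)
{
	live_buses--;
	free(bus);
}

/* Publish the pointer in lp only after the simulated registration succeeds. */
static int fake_mdio_setup(struct fake_priv *lp, bool register_ok)
{
	struct fake_bus *bus = fake_bus_alloc();

	if (!bus)
		return -1;
	if (!register_ok) {
		fake_bus_free(bus);	/* error path owns and frees the bus */
		return -1;
	}
	lp->mii_bus = bus;
	return 0;
}

/* Remove path: frees the bus only if setup published one. */
static void fake_remove(struct fake_priv *lp)
{
	assert(lp != NULL);
	if (lp->mii_bus) {
		fake_bus_free(lp->mii_bus);
		lp->mii_bus = NULL;
	}
}
```

If the pointer were stored before the failing registration, fake_remove()
would free the same bus a second time and live_buses would go negative.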

^ permalink raw reply related

* [PATCH 0/4] emaclite bug fixes and code cleanup
From: Radhey Shyam Pandey @ 2018-06-13  6:35 UTC (permalink / raw)
  To: davem, michal.simek, radhey.shyam.pandey
  Cc: netdev, linux-arm-kernel, linux-kernel

This patch series fixes bugs in the emaclite remove and mdio_setup routines.
It also does some minor code cleanup.

Radhey Shyam Pandey (4):
  net: emaclite: Fix position of lp->mii_bus assignment
  net: emaclite: Fix MDIO bus unregister bug
  net: emaclite: Remove unused 'has_mdio' flag.
  net: emaclite: Remove xemaclite_mdio_setup return check

 drivers/net/ethernet/xilinx/xilinx_emaclite.c |   12 ++++--------
 1 files changed, 4 insertions(+), 8 deletions(-)

^ permalink raw reply

* [PATCH net] net/multicast: clean change record if add new INCLUDE group
From: Hangbin Liu @ 2018-06-13  6:32 UTC (permalink / raw)
  To: netdev
  Cc: David S. Miller, Paolo Abeni, Stefano Brivio, Daniel Borkmann,
	WANG Cong, hideaki.yoshifuji, Hangbin Liu

Based on RFC3376 5.1 and RFC3810 6.1:
   If no interface
   state existed for that multicast address before the change (i.e., the
   change consisted of creating a new per-interface record), or if no
   state exists after the change (i.e., the change consisted of deleting
   a per-interface record), then the "non-existent" state is considered
   to have a filter mode of INCLUDE and an empty source list.

This means a new multicast group should start in state IN(). That is
exactly what we do in ip_mc_join_group()/ipv6_sock_mc_join(), which
add a group with state EX() and initialize crcount to mc_qrv. The kernel
then sends a TO_EX() report after adding the group. This is what IGMPv3/MLDv2
ASM (Any-Source Multicast) mode should look like.

But for IGMPv3/MLDv2 SSM JOIN_SOURCE_GROUP mode, we split the group
join into two steps. In the first step we join the group as in ASM, i.e. via
ip_mc_join_group()/ipv6_sock_mc_join(), so the state changes from IN() to EX().

Then we add the source-specific address in INCLUDE mode, so the state
changes from EX() to IN(A).

The second step finishes before the first step has sent its group change
record, so we only send the second change record, i.e. TO_IN(A).

According to the RFCs, we should actually send an ALLOW(A) message for
SSM JOIN_SOURCE_GROUP, as the state change should mimic the 'IN() to IN(A)'
transition.

The issue was exposed by commit a052517a8ff65 ("net/multicast: should not send
source list records when have filter mode change"). Before this commit we
sent both ALLOW(A) and TO_IN(A). After this commit we only send TO_IN(A).

Fix it by adding an is_new parameter and clearing crcount when we add a new
INCLUDE SSM group.

Fixes: a052517a8ff65 ("net/multicast: should not send source list records when have filter mode change")
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 include/linux/igmp.h     |  2 +-
 include/net/ipv6.h       |  2 +-
 net/ipv4/igmp.c          | 27 ++++++++++++++++++++++++++-
 net/ipv4/ip_sockglue.c   |  8 ++++++--
 net/ipv6/ipv6_sockglue.c |  4 +++-
 net/ipv6/mcast.c         | 25 ++++++++++++++++++++++++-
 6 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index f823185..32cb02b 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -112,7 +112,7 @@ extern int ip_mc_join_group(struct sock *sk, struct ip_mreqn *imr);
 extern int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr);
 extern void ip_mc_drop_socket(struct sock *sk);
 extern int ip_mc_source(int add, int omode, struct sock *sk,
-		struct ip_mreq_source *mreqs, int ifindex);
+		struct ip_mreq_source *mreqs, int ifindex, bool is_new);
 extern int ip_mc_msfilter(struct sock *sk, struct ip_msfilter *msf,int ifindex);
 extern int ip_mc_msfget(struct sock *sk, struct ip_msfilter *msf,
 		struct ip_msfilter __user *optval, int __user *optlen);
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 836f31a..754c5cb 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -1065,7 +1065,7 @@ struct group_source_req;
 struct group_filter;
 
 int ip6_mc_source(int add, int omode, struct sock *sk,
-		  struct group_source_req *pgsr);
+		  struct group_source_req *pgsr, bool is_new);
 int ip6_mc_msfilter(struct sock *sk, struct group_filter *gsf);
 int ip6_mc_msfget(struct sock *sk, struct group_filter *gsf,
 		  struct group_filter __user *optval, int __user *optlen);
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index b26a81a..8d6ecc3 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -2249,8 +2249,27 @@ int ip_mc_leave_group(struct sock *sk, struct ip_mreqn *imr)
 }
 EXPORT_SYMBOL(ip_mc_leave_group);
 
+static void ip_mc_clear_cr(struct in_device *in_dev, __be32 pmca)
+{
+#ifdef CONFIG_IP_MULTICAST
+	struct ip_mc_list *pmc;
+
+	rcu_read_lock();
+	for_each_pmc_rcu(in_dev, pmc) {
+		if (pmca == pmc->multiaddr)
+			break;
+	}
+	if (pmc) {
+		spin_lock_bh(&pmc->lock);
+		pmc->crcount = 0;
+		spin_unlock_bh(&pmc->lock);
+	}
+	rcu_read_unlock();
+#endif
+}
+
 int ip_mc_source(int add, int omode, struct sock *sk, struct
-	ip_mreq_source *mreqs, int ifindex)
+	ip_mreq_source *mreqs, int ifindex, bool is_new)
 {
 	int err;
 	struct ip_mreqn imr;
@@ -2301,6 +2320,12 @@ int ip_mc_source(int add, int omode, struct sock *sk, struct
 		ip_mc_del_src(in_dev, &mreqs->imr_multiaddr, pmc->sfmode, 0,
 			NULL, 0);
 		pmc->sfmode = omode;
+		/* Based on RFC3376 5.1, for newly added INCLUDE SSM, we should
+		 * not send filter-mode change record as the mode should be
+		 * from IN() to IN(A).
+		 */
+		if (is_new)
+			ip_mc_clear_cr(in_dev, mreqs->imr_multiaddr);
 	}
 
 	psl = rtnl_dereference(pmc->sflist);
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 57bbb06..8d8c0cd 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -962,6 +962,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 	case IP_DROP_SOURCE_MEMBERSHIP:
 	{
 		struct ip_mreq_source mreqs;
+		bool is_new = false;
 		int omode, add;
 
 		if (optlen != sizeof(struct ip_mreq_source))
@@ -987,11 +988,12 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 				break;
 			omode = MCAST_INCLUDE;
 			add = 1;
+			is_new = true;
 		} else /* IP_DROP_SOURCE_MEMBERSHIP */ {
 			omode = MCAST_INCLUDE;
 			add = 0;
 		}
-		err = ip_mc_source(add, omode, sk, &mreqs, 0);
+		err = ip_mc_source(add, omode, sk, &mreqs, 0, is_new);
 		break;
 	}
 	case MCAST_JOIN_GROUP:
@@ -1027,6 +1029,7 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 		struct group_source_req greqs;
 		struct ip_mreq_source mreqs;
 		struct sockaddr_in *psin;
+		bool is_new = false;
 		int omode, add;
 
 		if (optlen != sizeof(struct group_source_req))
@@ -1065,12 +1068,13 @@ static int do_ip_setsockopt(struct sock *sk, int level,
 			greqs.gsr_interface = mreq.imr_ifindex;
 			omode = MCAST_INCLUDE;
 			add = 1;
+			is_new = true;
 		} else /* MCAST_LEAVE_SOURCE_GROUP */ {
 			omode = MCAST_INCLUDE;
 			add = 0;
 		}
 		err = ip_mc_source(add, omode, sk, &mreqs,
-				   greqs.gsr_interface);
+				   greqs.gsr_interface, is_new);
 		break;
 	}
 	case MCAST_MSFILTER:
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 4d780c7..36e7c40 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -695,6 +695,7 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 	case MCAST_UNBLOCK_SOURCE:
 	{
 		struct group_source_req greqs;
+		bool is_new = false;
 		int omode, add;
 
 		if (optlen < sizeof(struct group_source_req))
@@ -725,11 +726,12 @@ static int do_ipv6_setsockopt(struct sock *sk, int level, int optname,
 				break;
 			omode = MCAST_INCLUDE;
 			add = 1;
+			is_new = true;
 		} else /* MCAST_LEAVE_SOURCE_GROUP */ {
 			omode = MCAST_INCLUDE;
 			add = 0;
 		}
-		retv = ip6_mc_source(add, omode, sk, &greqs);
+		retv = ip6_mc_source(add, omode, sk, &greqs, is_new);
 		break;
 	}
 	case MCAST_MSFILTER:
diff --git a/net/ipv6/mcast.c b/net/ipv6/mcast.c
index 793159d..f508a1c 100644
--- a/net/ipv6/mcast.c
+++ b/net/ipv6/mcast.c
@@ -315,8 +315,25 @@ void ipv6_sock_mc_close(struct sock *sk)
 	rtnl_unlock();
 }
 
+static void ip6_mc_clear_cr(struct inet6_dev *idev, const struct in6_addr *pmca)
+{
+	struct ifmcaddr6 *pmc;
+
+	read_lock_bh(&idev->lock);
+	for (pmc = idev->mc_list; pmc; pmc = pmc->next) {
+		if (ipv6_addr_equal(pmca, &pmc->mca_addr))
+			break;
+	}
+	if (pmc) {
+		spin_lock_bh(&pmc->mca_lock);
+		pmc->mca_crcount = 0;
+		spin_unlock_bh(&pmc->mca_lock);
+	}
+	read_unlock_bh(&idev->lock);
+}
+
 int ip6_mc_source(int add, int omode, struct sock *sk,
-	struct group_source_req *pgsr)
+	struct group_source_req *pgsr, bool is_new)
 {
 	struct in6_addr *source, *group;
 	struct ipv6_mc_socklist *pmc;
@@ -365,6 +382,12 @@ int ip6_mc_source(int add, int omode, struct sock *sk,
 		ip6_mc_add_src(idev, group, omode, 0, NULL, 0);
 		ip6_mc_del_src(idev, group, pmc->sfmode, 0, NULL, 0);
 		pmc->sfmode = omode;
+		/* Based on RFC3810 6.1, for newly added INCLUDE SSM, we
+		 * should not send filter-mode change record as the mode
+		 * should be from IN() to IN(A).
+		 */
+		if (is_new)
+			ip6_mc_clear_cr(idev, group);
 	}
 
 	write_lock(&pmc->sflock);
-- 
2.5.5
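
The state transitions described in the commit message can be sketched with a
small user-space model. This is a deliberate simplification, not the kernel
code: group_state, join_group(), add_first_include_source() and next_record()
are hypothetical names, and the model only captures the rule that a pending
filter-mode change (crcount > 0) produces a TO_IN/TO_EX record while a
cleared crcount leaves a source-list ALLOW record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum fmode { MODE_INCLUDE, MODE_EXCLUDE };
enum record { REC_ALLOW, REC_TO_IN, REC_TO_EX };

/* Hypothetical per-group state; crcount counts pending mode-change reports. */
struct group_state {
	enum fmode sfmode;
	unsigned int crcount;
	unsigned int qrv;	/* robustness variable, reload value for crcount */
};

/* ASM-style join: IN() -> EX(), a filter-mode change becomes pending. */
static void join_group(struct group_state *g)
{
	assert(g != NULL);
	g->sfmode = MODE_EXCLUDE;
	g->crcount = g->qrv;
}

/* Second SSM step: EX() -> IN(A); is_new drops the pending mode change. */
static void add_first_include_source(struct group_state *g, bool is_new)
{
	assert(g != NULL);
	g->sfmode = MODE_INCLUDE;
	g->crcount = is_new ? 0 : g->qrv;
}

/* Which kind of change record the next report would carry in this model. */
static enum record next_record(const struct group_state *g)
{
	assert(g != NULL);
	if (g->crcount > 0)
		return g->sfmode == MODE_EXCLUDE ? REC_TO_EX : REC_TO_IN;
	return REC_ALLOW;
}
```

In this model, join_group() followed by add_first_include_source(g, false)
yields REC_TO_IN, while passing is_new = true yields REC_ALLOW, which is the
'IN() to IN(A)' behaviour the patch aims for.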

^ permalink raw reply related

* Re: [PATCH 03/18] rhashtable: remove nulls_base and related code.
From: Herbert Xu @ 2018-06-13  6:25 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <87vaavmhuk.fsf@notabene.neil.brown.name>

On Thu, Jun 07, 2018 at 12:49:07PM +1000, NeilBrown wrote:
> On Fri, Jun 01 2018, NeilBrown wrote:
> 
> > This "feature" is unused, undocumented, and untested and so
> > doesn't really belong.  Next patch will introduce support
> > to detect when a search gets diverted down a different chain,
> > which is the common purpose of nulls markers.
> >
> > This patch actually fixes a bug too.  The table resizing allows a
> > table to grow to 2^31 buckets, but the hash is truncated to 27 bits -
> > any growth beyond 2^27 is wasteful and ineffective.
> >
> > This patch results in NULLS_MARKER(0) being used for all chains,
> > and leaves the use of rht_is_a_null() to test for it.
> >
> > Signed-off-by: NeilBrown <neilb@suse.com>
> 
> Hi Herbert,
>  You've acked a few patches that depend on this one, but not this
>  patch itself.  If you could ack this one, I could submit a collection
>  of patches for inclusion (after the merge window closes I guess)
>  and then have fewer outstanding.
>  This assumes you are in-principle happy with the alternative approach I
>  took to handling list-nulls.  I got the impression that it was only
>  some small details holding that back.

You can add my ack to this patch:

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
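
The arithmetic behind the "growth beyond 2^27" remark in the quoted patch
description can be checked with a short sketch. It assumes the bucket index
is taken from the low bits of the truncated hash; reachable_buckets() and
unreachable_buckets() are hypothetical helpers, not rhashtable functions.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Number of distinct bucket indices that can be hit when only hash_bits
 * bits of hash survive and the table has 2^table_shift buckets.
 */
static uint64_t reachable_buckets(unsigned int hash_bits,
				  unsigned int table_shift)
{
	unsigned int usable = hash_bits < table_shift ? hash_bits : table_shift;

	assert(usable < 64);
	return UINT64_C(1) << usable;
}

/* Buckets that no key can ever land in under the same assumption. */
static uint64_t unreachable_buckets(unsigned int hash_bits,
				    unsigned int table_shift)
{
	assert(table_shift < 64);
	return (UINT64_C(1) << table_shift) -
	       reachable_buckets(hash_bits, table_shift);
}
```

With 27 hash bits and a 2^31-bucket table, reachable_buckets(27, 31) is 2^27,
so 2^31 - 2^27 buckets would stay empty no matter how many entries are added.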

^ permalink raw reply
