Netdev List
 help / color / mirror / Atom feed
* Re: [RFC PATCH v2 08/10] selftests: conditionally enable XDP support in udpgso_bench_rx
From: Willem de Bruijn @ 2018-10-21 20:09 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <76b1b20399a7d5c9d4249f97383bc711e9416f1e.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> XDP support will be used by a later patch to test the GRO path
> in a net namespace, leveraging the veth XDP implementation.
> To avoid breaking existing setup, XDP support is conditionally
> enabled and build only if llc is locally available.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> index 256d82d5fa87..176459b7c4d6 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -16,8 +16,77 @@ TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
>  TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
>
>  KSFT_KHDR_INSTALL := 1
> +
> +# Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
> +#  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
> +LLC ?= llc
> +CLANG ?= clang
> +LLVM_OBJCOPY ?= llvm-objcopy
> +BTF_PAHOLE ?= pahole
> +HAS_LLC := $(shell which $(LLC) 2>/dev/null)
> +
> +# conditional enable testes requiring llc
> +ifneq (, $(HAS_LLC))
> +TEST_GEN_FILES += xdp_dummy.o
> +endif
> +
>  include ../lib.mk
>
> +ifneq (, $(HAS_LLC))
> +
> +# Detect that we're cross compiling and use the cross compiler
> +ifdef CROSS_COMPILE
> +CLANG_ARCH_ARGS = -target $(ARCH)
> +endif
> +
> +PROBE := $(shell $(LLC) -march=bpf -mcpu=probe -filetype=null /dev/null 2>&1)
> +
> +# Let newer LLVM versions transparently probe the kernel for availability
> +# of full BPF instruction set.
> +ifeq ($(PROBE),)
> +  CPU ?= probe
> +else
> +  CPU ?= generic
> +endif
> +
> +SRC_PATH := $(abspath ../../../..)
> +LIB_PATH := $(SRC_PATH)/tools/lib
> +XDP_CFLAGS := -D SUPPORT_XDP=1 -I$(LIB_PATH)
> +LIBBPF = $(LIB_PATH)/bpf/libbpf.a
> +BTF_LLC_PROBE := $(shell $(LLC) -march=bpf -mattr=help 2>&1 | grep dwarfris)
> +BTF_PAHOLE_PROBE := $(shell $(BTF_PAHOLE) --help 2>&1 | grep BTF)
> +BTF_OBJCOPY_PROBE := $(shell $(LLVM_OBJCOPY) --help 2>&1 | grep -i 'usage.*llvm')
> +CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
> +        | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }')
> +CLANG_FLAGS = -I. -I$(SRC_PATH)/include -I../bpf/ \
> +             $(CLANG_SYS_INCLUDES) -Wno-compare-distinct-pointer-types
> +
> +ifneq ($(and $(BTF_LLC_PROBE),$(BTF_PAHOLE_PROBE),$(BTF_OBJCOPY_PROBE)),)
> +       CLANG_CFLAGS += -g
> +       LLC_FLAGS += -mattr=dwarfris
> +       DWARF2BTF = y
> +endif
> +
> +$(LIBBPF): FORCE
> +# Fix up variables inherited from Kbuild that tools/ build system won't like
> +       $(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(SRC_PATH) O= $(nodir $@)
> +

This is a lot of XDP specific code. Not for this patchset, per se, but
would be nice if we can reuse the logic in selftests/bpf for all this.

> --- a/tools/testing/selftests/net/udpgso_bench_rx.c
> @@ -227,6 +234,13 @@ static void parse_opts(int argc, char **argv)
>                         cfg_verify = true;
>                         cfg_read_all = true;
>                         break;
> +#ifdef SUPPORT_XDP
> +               case 'x':
> +                       cfg_xdp_iface = if_nametoindex(optarg);
> +                       if (!cfg_xdp_iface)
> +                               error(1, errno, "unknown interface %s", optarg);
> +                       break;
> +#endif

nit: needs to be added to getopt string in this patch.

^ permalink raw reply

* Re: [RFC PATCH v2 10/10] selftests: add functionals test for UDP GRO
From: Willem de Bruijn @ 2018-10-21 20:09 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <846157138dd61db611c5df7942c0e9f76bbc5a0e.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Extends the existing udp programs to allow checking for proper
> GRO aggregation/GSO size, and run the tests via a shell script, using
> a veth pair with XDP program attached to trigger the GRO code path.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  tools/testing/selftests/net/Makefile          |   2 +-
>  tools/testing/selftests/net/udpgro.sh         | 144 ++++++++++++++++++
>  tools/testing/selftests/net/udpgro_bench.sh   |   8 +-
>  tools/testing/selftests/net/udpgso_bench.sh   |   2 +-
>  tools/testing/selftests/net/udpgso_bench_rx.c | 125 +++++++++++++--
>  tools/testing/selftests/net/udpgso_bench_tx.c |  22 ++-
>  6 files changed, 281 insertions(+), 22 deletions(-)
>  create mode 100755 tools/testing/selftests/net/udpgro.sh
>

> diff --git a/tools/testing/selftests/net/udpgro.sh b/tools/testing/selftests/net/udpgro.sh

> +       run_test "no GRO chk cmsg" "${ipv4_args} -M 10 -s 1400" "-4 -n 10 -l 1400 -S -1"
> +       run_test "no GRO chk cmsg" "${ipv6_args} -M 10 -s 1400" "-n 10 -l 1400 -S -1"

why expected segment size -1 in these two?

> diff --git a/tools/testing/selftests/net/udpgso_bench_tx.c b/tools/testing/selftests/net/udpgso_bench_tx.c
>  static void usage(const char *filepath)
>  {
> -       error(1, 0, "Usage: %s [-46cmStuz] [-C cpu] [-D dst ip] [-l secs] [-p port] [-s sendsize]",
> +       error(1, 0, "Usage: %s [-46cmtuz] [-C cpu] [-D dst ip] [-l secs] [-m messagenr] [-p port] [-s sendsize] [-S gsosize]",
>                     filepath);

missing -M

^ permalink raw reply

* Re: [PATCH bpf-next] selftests/bpf: enable (uncomment) all tests in test_libbpf.sh
From: Jesper Dangaard Brouer @ 2018-10-21 21:04 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, netdev, oss-drivers, brouer
In-Reply-To: <1540136228-11360-1-git-send-email-quentin.monnet@netronome.com>

On Sun, 21 Oct 2018 16:37:08 +0100
Quentin Monnet <quentin.monnet@netronome.com> wrote:

> 2018-10-21 11:57 UTC+0200 ~ Jesper Dangaard Brouer <brouer@redhat.com>
> > On Sat, 20 Oct 2018 23:00:24 +0100
> > Quentin Monnet <quentin.monnet@netronome.com> wrote:
> >   
> 
> [...]
> 
> >> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> >> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> >> @@ -33,17 +33,11 @@ trap exit_handler 0 2 3 6 9
> >>   
> >>   libbpf_open_file test_l4lb.o
> >>   
> >> -# TODO: fix libbpf to load noinline functions
> >> -# [warning] libbpf: incorrect bpf_call opcode
> >> -#libbpf_open_file test_l4lb_noinline.o
> >> +libbpf_open_file test_l4lb_noinline.o
> >>   
> >> -# TODO: fix test_xdp_meta.c to load with libbpf
> >> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> >> -#libbpf_open_file test_xdp_meta.o
> >> +libbpf_open_file test_xdp_meta.o
> >>   
> >> -# TODO: fix libbpf to handle .eh_frame
> >> -# [warning] libbpf: relocation failed: no section(10)
> >> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> >> +libbpf_open_file ../../../../samples/bpf/tracex3_kern.o  
> > 
> > I don't like the ../../../../samples/bpf/ reference (even-through I
> > added this TODO), as the kselftests AFAIK support installing the
> > selftests and then this tests will fail.
> > Maybe we can find another example kern.o file?
> > (which isn't compiled with -target bpf)  
> 
> Hi Jesper, yeah maybe making the test rely on something from samples/bpf
> instead of just the selftests/bpf directory is not a good idea. But
> there is no program compiled without the "-target-bpf" in that
> directory. What we could do is explicitly compile one without the flag
> in the Makefile, as in the patch below, but I am not sure that doing so
> is acceptable?

I think it makes sense to have a test program compiled without the
"-target-bpf", as that will happen for users.  And I guess we can add
some more specific test that are related to "-target-bpf".

> Or should tests for libbpf have a directory of their own,
> with another Makefile?

Hmm, I'm not sure about that idea.

I did plan by naming the test "libbpf_open_file", what we add more
libbpf_ prefixed tests to the test_libbpf.sh script, which should
cover more aspects of the _base_ libbpf functionality.

> Another question regarding the test with test_xdp_meta.o: does the fix I
> suggested (setting a version in the .C file) makes sense, or did you
> leave this test for testing someday that libbpf would be able to open
> even programs that do not set a version (in which case this is still not
> the case if program type is not provided, and in fact my fix ruins
> everything? :s).

Well, yes.  I was hinting if we should relax the version requirement
for e.g. XDP BPF progs.

> ---
>  tools/testing/selftests/bpf/Makefile        | 10 ++++++++++
>  tools/testing/selftests/bpf/test_libbpf.sh  | 14 +++++---------
>  tools/testing/selftests/bpf/test_xdp_meta.c |  2 ++
>  3 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index e39dfb4e7970..ecd79b7fb107 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -135,6 +135,16 @@ endif
>  endif
>  endif
>  
> +# Have one program compiled without "-target bpf" to test whether libbpf loads
> +# it successfully
> +$(OUTPUT)/test_xdp.o: test_xdp.c
> +	$(CLANG) $(CLANG_FLAGS) \
> +		-O2 -emit-llvm -c $< -o - | \
> +	$(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
> +ifeq ($(DWARF2BTF),y)
> +	$(BTF_PAHOLE) -J $@
> +endif
> +
>  $(OUTPUT)/%.o: %.c
>  	$(CLANG) $(CLANG_FLAGS) \
>  		 -O2 -target bpf -emit-llvm -c $< -o - |      \
> diff --git a/tools/testing/selftests/bpf/test_libbpf.sh b/tools/testing/selftests/bpf/test_libbpf.sh
> index 156d89f1edcc..b45962a44243 100755
> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> @@ -33,17 +33,13 @@ trap exit_handler 0 2 3 6 9
>  
>  libbpf_open_file test_l4lb.o
>  
> -# TODO: fix libbpf to load noinline functions
> -# [warning] libbpf: incorrect bpf_call opcode
> -#libbpf_open_file test_l4lb_noinline.o
> +# Load a program with BPF-to-BPF calls
> +libbpf_open_file test_l4lb_noinline.o
>  
> -# TODO: fix test_xdp_meta.c to load with libbpf
> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> -#libbpf_open_file test_xdp_meta.o
> +libbpf_open_file test_xdp_meta.o
>  
> -# TODO: fix libbpf to handle .eh_frame
> -# [warning] libbpf: relocation failed: no section(10)
> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> +# Load a program compiled without the "-target bpf" flag
> +libbpf_open_file test_xdp.o
>  
>  # Success
>  exit 0
> diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c b/tools/testing/selftests/bpf/test_xdp_meta.c
> index 8d0182650653..2f42de66e2bb 100644
> --- a/tools/testing/selftests/bpf/test_xdp_meta.c
> +++ b/tools/testing/selftests/bpf/test_xdp_meta.c
> @@ -8,6 +8,8 @@
>  #define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1)
>  #define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
>  
> +int _version SEC("version") = 1;
> +
>  SEC("t")
>  int ing_cls(struct __sk_buff *ctx)
>  {



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: CRC errors between mvneta and macb
From: Richard Genoud @ 2018-10-22  6:51 UTC (permalink / raw)
  To: linux-kernel, Richard Genoud
  Cc: Thomas Petazzoni, Antoine Tenart, Gregory CLEMENT,
	Yelena Krivosheev, Maxime Chevallier, Nicolas Ferre, netdev
In-Reply-To: <20181019154439.GC24045@1wt.eu>

Le 19/10/2018 à 17:44, Willy Tarreau a écrit :
> On Fri, Oct 19, 2018 at 05:15:03PM +0200, Richard Genoud wrote:
>> When there's a CRC error, the TXCLK has its polarity inverted...
>> That's a clue !
>>
>> But this TXCLK (25MHz) is not used on the g35-ek.
>> Only the REFCLK/XT2 (50MHz) is used to synchronise the PHY and the macb.
>> So I guess that the TXCLK has a role to play to generate TX+/TX-
> 
> Well, just a stupid idea, maybe when this signal is inverted, the TX+/TX-
> are desynchronized by half a clock and are not always properly interpreted
> on the other side ?
> 
> Willy
> 

I must admit that I'm not familiar with the PHY internals, I'll have to
dig into that.

Richard.

^ permalink raw reply

* RE: SMSC9730 Autosuspend/Resume Questions
From: Nisar.Sayed @ 2018-10-22  7:03 UTC (permalink / raw)
  To: frieder.schrempf, steve.glendinning, UNGLinuxDriver, davem,
	netdev
  Cc: oneukum, linux-usb, linux-kernel
In-Reply-To: <69e14429-a4d9-dd5c-5e70-c22086490cc0@exceet.de>

Hi Frieder,

> Hi,
> 
> I recently tested a board with SMSC9730 connected via USB HSIC to an
> i.MX6S SOC. I used these patches on top of v4.14-rc8 for the USB HSIC
> support: [1].
> 
> When I turned on autosuspend, the smsc95xx stopped in the middle of the
> suspending routine and /sys/bus/usb/devices/usb3/3-
> 1/power/runtime_status reported "suspending"
> forever.
> 
> With some debug logs I found out, that the last line of code that was
> executed is [2], after that I didn't get any further messages.
> 
> Then I applied the following diff and the suspend/resume started to work
> reliably:
> 
> @@ -1382,10 +1385,11 @@ static int smsc95xx_link_ok_nopm(struct usbnet
> *dev)
>          ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>          if (ret < 0)
>                  return ret;
> -
> +       /*
>          ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>          if (ret < 0)
>                  return ret;
> +       */
> 
>          return !!(ret & BMSR_LSTATUS);
>   }
> 
> So it seems like the dummy read, that is commented with "first, a dummy
> read, needed to latch some MII phys" causes problems in my setup. Do you
> have an idea what could be the reason? Is this a proper fix?
>
 
Link status bit is latch low register and needs to be read twice. You may get wrong Link status with this fix.

- Nisar

^ permalink raw reply

* Re: SMSC9730 Autosuspend/Resume Questions
From: Frieder Schrempf @ 2018-10-22  7:14 UTC (permalink / raw)
  To: Nisar.Sayed, steve.glendinning, UNGLinuxDriver, davem, netdev
  Cc: oneukum, linux-usb, linux-kernel
In-Reply-To: <CE371C1263339941885964188A0225FA44571A@CHN-SV-EXMX03.mchp-main.com>

Hi Nisar,

On 22.10.18 09:03, Nisar.Sayed@microchip.com wrote:
> Hi Frieder,
> 
>> Hi,
>>
>> I recently tested a board with SMSC9730 connected via USB HSIC to an
>> i.MX6S SOC. I used these patches on top of v4.14-rc8 for the USB HSIC
>> support: [1].
>>
>> When I turned on autosuspend, the smsc95xx stopped in the middle of the
>> suspending routine and /sys/bus/usb/devices/usb3/3-
>> 1/power/runtime_status reported "suspending"
>> forever.
>>
>> With some debug logs I found out, that the last line of code that was
>> executed is [2], after that I didn't get any further messages.
>>
>> Then I applied the following diff and the suspend/resume started to work
>> reliably:
>>
>> @@ -1382,10 +1385,11 @@ static int smsc95xx_link_ok_nopm(struct usbnet
>> *dev)
>>           ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>>           if (ret < 0)
>>                   return ret;
>> -
>> +       /*
>>           ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>>           if (ret < 0)
>>                   return ret;
>> +       */
>>
>>           return !!(ret & BMSR_LSTATUS);
>>    }
>>
>> So it seems like the dummy read, that is commented with "first, a dummy
>> read, needed to latch some MII phys" causes problems in my setup. Do you
>> have an idea what could be the reason? Is this a proper fix?
>>
>   
> Link status bit is latch low register and needs to be read twice. You may get wrong Link status with this fix.

Ok, thanks for the information. I guess I'll have to do some more 
debugging to find out why the second read fails in my case.

Frieder

^ permalink raw reply

* Re: [PATCH net] r8169: fix NAPI handling under high load
From: Jonathan Woithe @ 2018-10-21 23:07 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Francois Romieu, Holger Hoffstätte, David Miller,
	Realtek linux nic maintainers, netdev@vger.kernel.org
In-Reply-To: <20181018061554.GF2487@marvin.atrad.com.au>

On Fri, 19 Oct 2018 17:59:21 +1030, Jonathan Woithe wrote:
> On 10/18/18 08:15, Jonathan Woithe wrote:
> > On Thu, Oct 18, 2018 at 08:03:32AM +0200, Heiner Kallweit wrote:
> > > Proposed fix is here:
> > > https://patchwork.ozlabs.org/patch/985014/
> > > Would be good if you could test it. Thanks!
> > 
> > I should be able to do so tomorrow.  Which kernel would you like me to apply
> > the patch to?
>
> It turns out I couldn't compile 4.18.15 conveniently ...
> I could compile 4.14 though - I had previously confirmed that the problem
> was still seen with that kernel.  I therefore applied the fix to 4.14 (which
> was trivial) and started a test with that.  The system has been running
> without exhibiting the effects of the bug for over five hours.  With an
> unpatched 4.14, symptoms were typically seen within 30 minutes.
> 
> I will leave the system to operate over the weekend to be sure, but at this
> stage it seems likely that the patch will resolve this long-standing
> difficulty that we've experienced.  I will report back on Monday.

Our test system has now been running for just under 3 days (69 hours) and
there has not been any incidence of the problem with this patch applied to
the 4.14 r8169 driver.  Without the patch, multiple problems are logged
within 30 minutes.

Given this, I would conclude that this patch fixes the problem for us. 
Although belated (since the patch has already been accepted):

  Tested-by: Jonathan Woithe <jwoithe@atrad.com.au>

Thank you very much for your work which lead to the patch!  It means we can
finally provide an upgrade path for systems in the field which are equipped
with the r8169 hardware.

For reference, the problem being tested on our systems is the one discussed
in the "r8169 regression: UDP packets dropped intermittantly" thread on this
mailing list.

Regards
  jonathan

^ permalink raw reply

* RE: [PATCH] bonding: avoid repeated display of same link status change
From: Manish Kumar Singh @ 2018-10-22  7:29 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel
In-Reply-To: <dd819430-a67e-4507-b03f-d9d40528e74d@default>



> -----Original Message-----
> From: Manish Kumar Singh
> Sent: 24 सितम्बर 2018 12:36
> To: Eric Dumazet; netdev@vger.kernel.org
> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: RE: [PATCH] bonding: avoid repeated display of same link status
> change
> 
> 
> 
> > -----Original Message-----
> > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > Sent: 18 सितम्बर 2018 19:30
> > To: Manish Kumar Singh; Eric Dumazet; netdev@vger.kernel.org
> > Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> > kernel@vger.kernel.org
> > Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > change
> >
> >
> >
> > On 09/17/2018 10:05 PM, Manish Kumar Singh wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > >> Sent: 17 सितम्बर 2018 20:08
> > >> To: Manish Kumar Singh; netdev@vger.kernel.org
> > >> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller;
> > linux-
> > >> kernel@vger.kernel.org
> > >> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > >> change
> > >>
> > >>
> > >>
> > >> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> > >>> From: Manish Kumar Singh <mk.singh@oracle.com>
> > >>>
> > >>> When link status change needs to be committed and rtnl lock couldn't
> be
> > >>> taken, avoid redisplay of same link status change message.
> > >>>
> > >>> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> > >>> ---
> > >>>  drivers/net/bonding/bond_main.c | 6 ++++--
> > >>>  include/net/bonding.h           | 1 +
> > >>>  2 files changed, 5 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/bonding/bond_main.c
> > >> b/drivers/net/bonding/bond_main.c
> > >>> index 217b790d22ed..fb4e3aff1677 100644
> > >>> --- a/drivers/net/bonding/bond_main.c
> > >>> +++ b/drivers/net/bonding/bond_main.c
> > >>> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.downdelay;
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status down
> > for
> > >> %sinterface %s, disabling it in %d ms\n",
> > >>>  					    (BOND_MODE(bond) ==
> > >>>  					     BOND_MODE_ACTIVEBACKUP) ?
> > >>> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.updelay;
> > >>>
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status up for
> > >> interface %s, enabling it in %d ms\n",
> > >>>  					    slave->dev->name,
> > >>>  					    ignore_updelay ? 0 :
> > >>> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
> > >> work_struct *work)
> > >>>  		if (!rtnl_trylock()) {
> > >>>  			delay = 1;
> > >>>  			should_notify_peers = false;
> > >>> +			bond->rtnl_needed = true;
> > >>
> > >> How can you set a shared variable with no synchronization ?
> > > Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable,
> it
> > is part of bonding structure, that is one per bonding driver instance. There
> > can't be two parallel instances of bond_miimon_inspect for a
> single  bonding
> > driver instance at any given point of time. and only bond_miimon_inspect
> > updates it. That’s why I think there is no need of any synchronization here.
> > >
> > >
> >
> > If rtnl_trylock() can not grab RTNL,
> > there is no way the current thread can set the  variable without a race, if
> the
> > word including rtnl_needed is shared by other fields in the structure.
> >
> > Your patch adds a subtle possibility of future bugs, even if it runs fine
> today.
> >
> > Do not pave the way for future bugs, make your code robust, please.
> 
> Thankyou Eric, we are making the changes and will repost the patch after
> testing it.
> -Manish

Please review the updated patch, I have changed the rtnl_needed to an atomic variable, to avoid any race condition:

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 217b790d22ed..fac5350bf19c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h index 808f1d167349..ffc1219f7a07 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -234,6 +234,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
--
2.14.1



Thanks,
Manish

> >
> >
> >
> >
> >
> >
> >

^ permalink raw reply

* Re: [PATCH 1/1] net: dsa: legacy: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:13 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Vivien Didelot, Florian Fainelli,
	David S. Miller, netdev
In-Reply-To: <20181021200105.2414-2-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:01:05PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 9/9] net: phy: mdio-mux-bcm-iproc: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:14 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Florian Fainelli,
	David S. Miller, Ray Jui, Scott Branden, Jon Mason,
	bcm-kernel-feedback-list, netdev, linux-arm-kernel
In-Reply-To: <20181021200021.1693-10-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:00:20PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 2/9] net: dsa: qca8k: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:15 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Vivien Didelot, Florian Fainelli,
	David S. Miller, netdev
In-Reply-To: <20181021200021.1693-3-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:00:13PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: r8169 regression: UDP packets dropped intermittantly
From: Jonathan Woithe @ 2018-10-22  0:01 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel
In-Reply-To: <20180115065658.GI16419@marvin.atrad.com.au>

On Mon, Jan 15, 2018 at 05:26:59PM +1030, Jonathan Woithe wrote:
> Is there any more information that can be provided (or tests done) to assist
> in tracking this problem down?  Based on the tests done in December it seems
> that the problem only affects specific RTL-8169 variants, with most being
> ok.  Is it a case that we simply need to accept that for the greater good
> commit da78dbff2e05630921c551dbbc70a4b7981a8fff has permanently broken
> Netgear GA311 [1] network cards with respect to these UDP packets and that
> nothing can be done?

For future reference, commit 6b839b6cf9eada30b086effb51e5d6076bafc761
("r8169: fix NAPI handling under high load") appears to have fixed the
regression documented by this thread.  Thanks to Heiner Kallweit for the
work which lead to this solution.

Regards
  jonathan

> [1] Or perhaps any using the RTL8110s variant.

^ permalink raw reply

* RE: [PATCH] net: ethernet:fec: Consistently use SPEED_ prefix
From: Andy Duan @ 2018-10-22  2:41 UTC (permalink / raw)
  To: Andrew Lunn, David Miller
  Cc: clabbe@baylibre.com, netdev, Florian Fainelli,
	hkallweit1@gmail.com
In-Reply-To: <1540068491-14760-1-git-send-email-andrew@lunn.ch>

From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org>
> All other calls to phy_set_max_speed() use the SPEED_ prefix. Make the
> FEC driver follow this common pattern. This makes no different to
> generated code since SPEED_1000 is 1000, and SPEED_100 is 100.
> 

Please also add more information that was introduced by commit: 58056c1e1b0e ("net: ethernet: Use phy_set_max_speed() to limit advertised speed ")

Andy
> Reported-by: Corentin Labbe <clabbe@baylibre.com>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
>  drivers/net/ethernet/freescale/fec_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec_main.c
> b/drivers/net/ethernet/freescale/fec_main.c
> index 6db69ba30dcd..b067eaf8b792 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -1948,7 +1948,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)
> 
>  	/* mask with MAC supported features */
>  	if (fep->quirks & FEC_QUIRK_HAS_GBIT) {
> -		phy_set_max_speed(phy_dev, 1000);
> +		phy_set_max_speed(phy_dev, SPEED_1000);
>  		phy_remove_link_mode(phy_dev,
>  				     ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
>  #if !defined(CONFIG_M5272)
> @@ -1956,7 +1956,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)  #endif
>  	}
>  	else
> -		phy_set_max_speed(phy_dev, 100);
> +		phy_set_max_speed(phy_dev, SPEED_100);
> 
>  	fep->link = 0;
>  	fep->full_duplex = 0;
> --
> 2.19.0

^ permalink raw reply

* [PATCH 0/2] Mark expected switch fall-throughs and fix missing break
From: Gustavo A. R. Silva @ 2018-10-22 11:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jes Sorensen, Kalle Valo, linux-wireless, David S. Miller, netdev,
	Gustavo A. R. Silva

In preparation to enabling -Wimplicit-fallthrough, this patchset aims
to mark multiple switch cases where we are expecting to fall through.

Also, the second patch in this series fixes a missing break in switch.

Thanks

Gustavo A. R. Silva (2):
  rtl8xxxu: Mark expected switch fall-throughs
  rtl8xxxu: Fix missing break in switch

 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 5 +++++
 1 file changed, 5 insertions(+)

-- 
2.7.4

^ permalink raw reply

* [PATCH 1/2] rtl8xxxu: Mark expected switch fall-throughs
From: Gustavo A. R. Silva @ 2018-10-22 11:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jes Sorensen, Kalle Valo, linux-wireless, David S. Miller, netdev,
	Gustavo A. R. Silva
In-Reply-To: <cover.1540208577.git.gustavo@embeddedor.com>

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 1357355 ("Missing break in switch")
Addresses-Coverity-ID: 1357378 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 56040b1..c6b0686 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -1153,6 +1153,7 @@ void rtl8xxxu_gen1_config_channel(struct ieee80211_hw *hw)
 	switch (hw->conf.chandef.width) {
 	case NL80211_CHAN_WIDTH_20_NOHT:
 		ht = false;
+		/* fall through */
 	case NL80211_CHAN_WIDTH_20:
 		opmode |= BW_OPMODE_20MHZ;
 		rtl8xxxu_write8(priv, REG_BW_OPMODE, opmode);
@@ -1280,6 +1281,7 @@ void rtl8xxxu_gen2_config_channel(struct ieee80211_hw *hw)
 	switch (hw->conf.chandef.width) {
 	case NL80211_CHAN_WIDTH_20_NOHT:
 		ht = false;
+		/* fall through */
 	case NL80211_CHAN_WIDTH_20:
 		rf_mode_bw |= WMAC_TRXPTCL_CTL_BW_20;
 		subchannel = 0;
@@ -1748,9 +1750,11 @@ static int rtl8xxxu_identify_chip(struct rtl8xxxu_priv *priv)
 		case 3:
 			priv->ep_tx_low_queue = 1;
 			priv->ep_tx_count++;
+			/* fall through */
 		case 2:
 			priv->ep_tx_normal_queue = 1;
 			priv->ep_tx_count++;
+			/* fall through */
 		case 1:
 			priv->ep_tx_high_queue = 1;
 			priv->ep_tx_count++;
-- 
2.7.4

^ permalink raw reply related

* Re: pull-request: bpf-next 2018-10-21
From: David Miller @ 2018-10-22  4:12 UTC (permalink / raw)
  To: daniel; +Cc: ast, netdev
In-Reply-To: <20181021192426.22776-1-daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Sun, 21 Oct 2018 21:24:26 +0200

> The following pull-request contains BPF updates for your *net-next* tree.
> 
> The main changes are:
 ...
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Pulled, thanks Daniel.

^ permalink raw reply

* RE: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer driven by a regmap device
From: Pankaj Bansal @ 2018-10-22  4:22 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn; +Cc: netdev@vger.kernel.org
In-Reply-To: <HE1PR0402MB33233F0E876E2091DFD9C9D0F1F80@HE1PR0402MB3323.eurprd04.prod.outlook.com>


> -----Original Message-----
> From: Pankaj Bansal
> Sent: Thursday, October 18, 2018 10:00 AM
> To: Florian Fainelli <f.fainelli@gmail.com>; Andrew Lunn <andrew@lunn.ch>
> Cc: netdev@vger.kernel.org
> Subject: RE: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer driven by
> a regmap device
> 
> Hi Florian
> 
> > -----Original Message-----
> > From: Florian Fainelli [mailto:f.fainelli@gmail.com]
> > Sent: Sunday, October 7, 2018 11:32 PM
> > To: Pankaj Bansal <pankaj.bansal@nxp.com>; Andrew Lunn
> > <andrew@lunn.ch>
> > Cc: netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 1/2] dt-bindings: net: add MDIO bus multiplexer
> > driven by a regmap device
> >
> >
> >
> > On 10/07/18 11:24, Pankaj Bansal wrote:
> > > Add support for an MDIO bus multiplexer controlled by a regmap
> > > device, like an FPGA.
> > >
> > > Tested on a NXP LX2160AQDS board which uses the "QIXIS" FPGA
> > > attached to the i2c bus.
> > >
> > > Signed-off-by: Pankaj Bansal <pankaj.bansal@nxp.com>
> > > ---
> > >
> > > Notes:
> > >     V2:
> > >      - Fixed formatting error caused by using space instead of tab
> > >      - Add newline between property list and subnode
> > >      - Add newline between two subnodes
> > >
> > >  .../bindings/net/mdio-mux-regmap.txt         | 95 ++++++++++++++++++
> > >  1 file changed, 95 insertions(+)
> > >
> > > diff --git
> > > a/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > b/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > new file mode 100644
> > > index 000000000000..88ebac26c1c5
> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/net/mdio-mux-regmap.txt
> > > @@ -0,0 +1,95 @@
> > > +Properties for an MDIO bus multiplexer controlled by a regmap
> > > +
> > > +This is a special case of a MDIO bus multiplexer.  A regmap device,
> > > +like an FPGA, is used to control which child bus is connected.  The
> > > +mdio-mux node must be a child of the device that is controlled by a
> regmap.
> > > +The driver currently only supports devices with upto 32-bit registers.
> >
> > I would omit any sort of details about Linux constructs designed to
> > support specific needs (e.g: regmap) as well as putting driver
> > limitations into a binding document because it really ought to be restricted to
> describing hardware.
> >
> 
> Actually the platform driver mdio-mux-regmap.c, is generalization of mdio-mux-
> mmioreg.c As such, it doesn't describe any new hardware, so no new properties
> are described by it.
> The only new property is compatible field.
> I don't know how to describe this driver otherwise.  Can you please help?

I further thought about it. mdio-mux-regmap.c is not meant to control a specific device.
It is meant to control some registers of parent device. Therefore, IMO this should not be a
Platform driver and there should not be any "compatible" property to which this driver is associated.

Rather this driver should expose the APIs, which should be called by parent devices' driver.

What is your thought on this ?

> 
> > > +
> > > +Required properties in addition to the generic multiplexer properties:
> > > +
> > > +- compatible : string, must contain "mdio-mux-regmap"
> > > +
> > > +- reg : integer, contains the offset of the register that controls the bus
> > > +	multiplexer. it can be 32 bit number.
> >
> > Technically it could be any "reg" property size, the only requirement
> > is that it must be of the appropriate size with respect to what the
> > parent node of this "mdio-mux-regmap" node has, determined by #address-
> cells/#size-cells width.
> 
> We are reading only single cell of this property using "of_propert_read_u32".
> That is why I put the size in this.
> 
> >
> > > +
> > > +- mux-mask : integer, contains an 32 bit mask that specifies which
> > > +	bits in the register control the actual bus multiplexer.  The
> > > +	'reg' property of each child mdio-mux node must be constrained by
> > > +	this mask.
> >
> > Same thing here.
> 
> We are reading only single cell of this property using "of_propert_read_u32".
> That is why I put the size in this.
> 
> >
> > Since this is a MDIO mux, I would invite you to specify what the
> > correct #address-cells/#size-cells values should be (1, and 0
> > respectively as your example correctly shows).
> >
> 
> I will mention #address-cells/#size-cells values
> 
> > > +
> > > +Example:
> > > +
> > > +The FPGA node defines a i2c connected FPGA with a register space of
> > > +0x30
> > bytes.
> > > +For the "EMI2" MDIO bus, register 0x54 (BRDCFG4) controls the mux
> > > +on that
> > bus.
> > > +A bitmask of 0x07 means that bits 0, 1 and 2 (bit 0 is lsb) are the
> > > +bits on
> > > +BRDCFG4 that control the actual mux.
> > > +
> > > +i2c@2000000 {
> > > +	compatible = "fsl,vf610-i2c";
> > > +	#address-cells = <1>;
> > > +	#size-cells = <0>;
> > > +	reg = <0x0 0x2000000 0x0 0x10000>;
> > > +	interrupts = <0 34 0x4>; // Level high type
> > > +	clock-names = "i2c";
> > > +	clocks = <&clockgen 4 7>;
> > > +	fsl-scl-gpio = <&gpio2 15 0>;
> > > +	status = "okay";
> > > +
> > > +	/* The FPGA node */
> > > +	fpga@66 {
> > > +		compatible = "fsl,lx2160aqds-fpga", "fsl,fpga-qixis-i2c";
> > > +		reg = <0x66>;
> > > +		#address-cells = <1>;
> > > +		#size-cells = <0>;
> > > +
> > > +		mdio1_mux@54 {
> > > +			compatible = "mdio-mux-regmap", "mdio-mux";
> > > +			mdio-parent-bus = <&emdio2>; /* MDIO bus */
> > > +			reg = <0x54>;		 /* BRDCFG4 */
> > > +			mux-mask = <0x07>;      /* EMI2_MDIO */
> > > +			#address-cells=<1>;
> > > +			#size-cells = <0>;
> > > +
> > > +			mdio1_ioslot1@0 {   // Slot 1
> > > +				reg = <0x00>;
> > > +				#address-cells = <1>;
> > > +				#size-cells = <0>;
> > > +
> > > +				phy1@1 {
> > > +					reg = <1>;
> > > +					compatible = "ethernet-phy-
> > id0210.7441";
> > > +				};
> > > +
> > > +				phy1@0 {
> > > +					reg = <0>;
> > > +					compatible = "ethernet-phy-
> > id0210.7441";
> > > +				};
> > > +			};
> > > +
> > > +			mdio1_ioslot2@1 {   // Slot 2
> > > +				reg = <0x01>;
> > > +				#address-cells = <1>;
> > > +				#size-cells = <0>;
> > > +
> > > +			};
> > > +
> > > +			mdio1_ioslot3@2 {   // Slot 3
> > > +				reg = <0x02>;
> > > +				#address-cells = <1>;
> > > +				#size-cells = <0>;
> > > +
> > > +			};
> > > +		};
> > > +	};
> > > +};
> > > +
> > > +	/* The parent MDIO bus. */
> > > +	emdio2: mdio@0x8B97000 {
> > > +		compatible = "fsl,fman-memac-mdio";
> > > +		reg = <0x0 0x8B97000 0x0 0x1000>;
> > > +		device_type = "mdio";
> > > +		little-endian;
> > > +
> > > +		#address-cells = <1>;
> > > +		#size-cells = <0>;
> > > +	};
> > >
> >
> > --
> > Florian

^ permalink raw reply

* Re: [rtnetlink] Potential bug in Linux (rt)netlink code
From: Henning Rogge @ 2018-10-22  5:22 UTC (permalink / raw)
  To: netdev; +Cc: Stephen Hemminger
In-Reply-To: <cae45e94-72f5-3aac-8c91-8e71facc3482@fkie.fraunhofer.de>

Does anyone else have an idea how to debug this problem?

Henning Rogge

Am 15.10.2018 um 07:25 schrieb Henning Rogge:
> Am 12.10.2018 um 20:51 schrieb Stephen Hemminger:
>> On Fri, 12 Oct 2018 09:30:40 +0200
>> Henning Rogge <henning.rogge@fkie.fraunhofer.de> wrote:
>>
>>> Hi,
>>>
>>> I am working on a self-written routing agent
>>> (https://github.com/OLSR/OONF) and am stuck on a problem with netlink
>>> that I cannot explain with an userspace error.
>>>
>>> I am using a netlink socket for setting routes
>>> (RTM_NEWROUTE/RTM_DELROUTE), querying the kernel for the current routes
>>> in the database (via a RTM_GETROUTE dump) and for getting multicast
>>> messages for ongoing routing changes.
>>>
>>> After a few netlink messages I get to the point where the kernel just
>>> does not responst to a RTM_NEWROUTE. No error, no answer, despite the
>>> NLM_F_ACK flag set)... but sometime when (during shutdown of the routing
>>> agent) the program sends another route command (most times a
>>> RTM_DELROUTE) I get a single netlink packet with a "successful" response
>>> for both the "missing" RTM_NEWROUTE and one for the new RTM DELROUTE
>>> sequence number.
>>>
>>> I am testing two routing agents, each of them in a systemd-nspawn based
>>> container connected over a bridge on the host system on a current Debian
>>> Testing (kernel 4.18.0-1-amd64).
>>>
>>> I am directly using the netlink sockets, without any other userspace
>>> library in between.
>>>
>>> I have checked the hexdumps of a couple of netlink messages (including
>>> the ones just before the bug happens) by hand and they seem to be okay.
>>>
>>> When I tried to add a "netlink listener" socket for futher debugging (ip
>>> link add nlmon0 type nlmon) the problem vanished until I removed the
>>> listener socket again.
>>>
>>> Any ideas how to debug this problem? Unfortunately I have no short
>>> example program to trigger the bug... I have rarely seen the problem for
>>> years (once every couple of months), but until a few days ago I never
>>> managed to reproduce it.
>>>
>>> Henning Rogge
>>
>> Are you reading the responses to your requests?  If you don't read
>> the response, the socket will get flow blocked.
> 
> Yes, I do...
> 
> all netlink sockets the program uses are constantly watched for traffic 
> coming from the kernel (with an epoll()-based event loop, no edge-trigger).
> 
> I even have a rate limitation towards the kernel, only sending a 
> "pagesize" full of netlink data towards the kernel, then waiting for the 
> reply before sending more (I had the blocking problem a few years ago 
> when experimenting with LOTS of routes).
> 
> Henning Rogge

Henning Rogge
-- 
Diplom-Informatiker Henning Rogge , Fraunhofer-Institut für
Kommunikation, Informationsverarbeitung und Ergonomie FKIE
Kommunikationssysteme (KOM)
Zanderstrasse 5, 53177 Bonn, Germany
Telefon +49 228 50212-469
mailto:henning.rogge@fkie.fraunhofer.de http://www.fkie.fraunhofer.de

^ permalink raw reply

* Re: [PATCH] net/mlx5: allocate enough space in
From: Dan Carpenter @ 2018-10-22  5:23 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Saeed Mahameed, Paul Blakey, kernel-janitors, Linux Netdev List,
	Leon Romanovsky
In-Reply-To: <CAJ3xEMjZ8WQu2cj_QgU9RoZvUZu7X2Szh6-Si6KZU-qMBFqwHQ@mail.gmail.com>

On Sun, Oct 21, 2018 at 01:56:26PM +0300, Or Gerlitz wrote:
> I will re-post your patch, this time to netdev since the original
> commit is there
> and so should be the fix, thanks for reporting/fixing!

I didn't realize it had been posted to netdev already so I deliberately
left that off the CC.  If Dave hasn't applied the original (he probably
has now because he is so quick) then it's fine by me if you fold them
together.

regards,
dan carpenter

^ permalink raw reply

* Re: [PATCH] net/mlx5: allocate enough space in
From: Or Gerlitz @ 2018-10-22  6:18 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: Saeed Mahameed, Paul Blakey, kernel-janitors, Linux Netdev List,
	Leon Romanovsky
In-Reply-To: <20181022052336.fr4kvd7idba5d2om@mwanda>

On Mon, Oct 22, 2018 at 8:23 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Sun, Oct 21, 2018 at 01:56:26PM +0300, Or Gerlitz wrote:
> > I will re-post your patch, this time to netdev since the original
> > commit is there
> > and so should be the fix, thanks for reporting/fixing!
>
> I didn't realize it had been posted to netdev already so I deliberately
> left that off the CC.  If Dave hasn't applied the original (he probably

you didn't post it to netdev, I don't see how he could have got it


> has now because he is so quick) then it's fine by me if you fold them
> together.
>
> regards,
> dan carpenter
>

^ permalink raw reply

* Re: [PATCH 0/2] Mark expected switch fall-throughs and fix missing break
From: Jes Sorensen @ 2018-10-22 14:36 UTC (permalink / raw)
  To: Gustavo A. R. Silva, linux-kernel
  Cc: Kalle Valo, linux-wireless, David S. Miller, netdev
In-Reply-To: <cover.1540208577.git.gustavo@embeddedor.com>

On 10/22/18 7:49 AM, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, this patchset aims
> to mark multiple switch cases where we are expecting to fall through.
> 
> Also, the second patch in this series fixes a missing break in switch.

Enabling that sounds like a great way to inflict pain and suffering.

Not a big deal to put that in the code, the second patch looks fine though.

Jes

^ permalink raw reply

* Re: [PATCH] net/mlx5: Allocate enough space for the FDB sub-namespaces
From: Or Gerlitz @ 2018-10-22  6:19 UTC (permalink / raw)
  To: Or Gerlitz; +Cc: David Miller, Saeed Mahameed, Linux Netdev List, Dan Carpenter
In-Reply-To: <1540119949-1748-1-git-send-email-ogerlitz@mellanox.com>

On Sun, Oct 21, 2018 at 2:47 PM Or Gerlitz <ogerlitz@mellanox.com> wrote:
>
> From: Dan Carpenter <dan.carpenter@oracle.com>
>
> FDB_MAX_CHAIN is three.  We wanted to allocate enough memory to hold four
> structs but there are missing parentheses so we only allocate enough
> memory for three structs and the first byte of the fourth one.
>
> Fixes: 328edb499f99 ("net/mlx5: Split FDB fast path prio to multiple namespaces")
> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com>

Dave, I forgot to mark it as net-next, so here I am saying that..


> ---
>  drivers/net/ethernet/mellanox/mlx5/core/fs_core.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> index 67ba4c9..9d73eb9 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.c
> @@ -2470,7 +2470,7 @@ static int init_fdb_root_ns(struct mlx5_flow_steering *steering)
>                 return -ENOMEM;
>
>         steering->fdb_sub_ns = kzalloc(sizeof(steering->fdb_sub_ns) *
> -                                      FDB_MAX_CHAIN + 1, GFP_KERNEL);
> +                                      (FDB_MAX_CHAIN + 1), GFP_KERNEL);
>         if (!steering->fdb_sub_ns)
>                 return -ENOMEM;
>
> --
> 2.3.7
>

^ permalink raw reply

* Re: [PATCH] net/mlx5: allocate enough space in
From: Dan Carpenter @ 2018-10-22  6:41 UTC (permalink / raw)
  To: Or Gerlitz
  Cc: Saeed Mahameed, Paul Blakey, kernel-janitors, Linux Netdev List,
	Leon Romanovsky
In-Reply-To: <CAJ3xEMj5NPo-BJ2t_xno+R5O6R0_i3CU0Dig3_Wakop0xLa8CQ@mail.gmail.com>

On Mon, Oct 22, 2018 at 09:18:43AM +0300, Or Gerlitz wrote:
> On Mon, Oct 22, 2018 at 8:23 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> >
> > On Sun, Oct 21, 2018 at 01:56:26PM +0300, Or Gerlitz wrote:
> > > I will re-post your patch, this time to netdev since the original
> > > commit is there
> > > and so should be the fix, thanks for reporting/fixing!
> >
> > I didn't realize it had been posted to netdev already so I deliberately
> > left that off the CC.  If Dave hasn't applied the original (he probably
> 
> you didn't post it to netdev, I don't see how he could have got it
> 

I meant commit 328edb499f99 ("net/mlx5: Split FDB fast path prio to
multiple namespaces").  It turns out that Dave has applied that, but I
was expecting to see his S-o-B on it.

It's hard for me to know which tree patches are applied to.

Also I see that I screwed up the subject.  Thanks for fixing that.

regards,
dan carpenter

^ permalink raw reply

* Re: [PATCH] bpf: btf: Fix a missing-check bug
From: Martin Lau @ 2018-10-22 15:40 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Kangjie Lu, Alexei Starovoitov, Daniel Borkmann,
	open list:BPF (Safe dynamic programs and tools),
	open list:BPF (Safe dynamic programs and tools)
In-Reply-To: <1539988191-13973-1-git-send-email-wang6495@umn.edu>

On Fri, Oct 19, 2018 at 05:29:51PM -0500, Wenwen Wang wrote:
> In btf_parse(), the header of the user-space btf data 'btf_data' is firstly
> parsed and verified through btf_parse_hdr(). In btf_parse_hdr(), the header
> is copied from user-space 'btf_data' to kernel-space 'btf->hdr' and then
> verified. If no error happens during the verification process, the whole
> data of 'btf_data', including the header, is then copied to 'data' in
> btf_parse(). It is obvious that the header is copied twice here. More
> importantly, no check is enforced after the second copy to make sure the
> headers obtained in these two copies are same. Given that 'btf_data'
> resides in the user space, a malicious user can race to modify the header
> between these two copies. By doing so, the user can inject inconsistent
> data, which can cause undefined behavior of the kernel and introduce
> potential security risk.
> 
> To avoid the above issue, this patch rewrites the header after the second
> copy, using 'btf->hdr', which is obtained in the first copy.
Acked-by: Martin KaFai Lau <kafai@fb.com>

^ permalink raw reply

* Re: [PATCH] bpf: btf: Fix a missing-check bug
From: Y Song @ 2018-10-22 15:57 UTC (permalink / raw)
  To: wang6495; +Cc: kjlu, Alexei Starovoitov, Daniel Borkmann, netdev, LKML
In-Reply-To: <1539988191-13973-1-git-send-email-wang6495@umn.edu>

On Fri, Oct 19, 2018 at 3:30 PM Wenwen Wang <wang6495@umn.edu> wrote:
>
> In btf_parse(), the header of the user-space btf data 'btf_data' is firstly
> parsed and verified through btf_parse_hdr(). In btf_parse_hdr(), the header
> is copied from user-space 'btf_data' to kernel-space 'btf->hdr' and then
> verified. If no error happens during the verification process, the whole
> data of 'btf_data', including the header, is then copied to 'data' in
> btf_parse(). It is obvious that the header is copied twice here. More
> importantly, no check is enforced after the second copy to make sure the
> headers obtained in these two copies are same. Given that 'btf_data'
> resides in the user space, a malicious user can race to modify the header
> between these two copies. By doing so, the user can inject inconsistent
> data, which can cause undefined behavior of the kernel and introduce
> potential security risk.
>
> To avoid the above issue, this patch rewrites the header after the second
> copy, using 'btf->hdr', which is obtained in the first copy.
>
> Signed-off-by: Wenwen Wang <wang6495@umn.edu>
> ---
>  kernel/bpf/btf.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 138f030..2a85f91 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -2202,6 +2202,9 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
>                 goto errout;
>         }
>
> +       memcpy(data, &btf->hdr,
> +               min_t(u32, btf->hdr.hdr_len, sizeof(btf->hdr)));

Could you restructure the code to memcpy the header followed by copying
the rest of btf_data with copy_from_user? This way, each byte is only
copied once.
Could you add some comments right before memcpy so later people will know
why we implement this way?

> +
>         err = btf_parse_str_sec(env);
>         if (err)
>                 goto errout;
> --
> 2.7.4
>

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox