Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v5 1/4] gpiolib: Pass bitmaps, not integer arrays, to get/set array
From: Miguel Ojeda @ 2018-08-30 11:10 UTC (permalink / raw)
  To: Janusz Krzysztofik
  Cc: Linus Walleij, Jonathan Corbet, Peter Korsgaard, Peter Rosin,
	Ulf Hansson, Andrew Lunn, Florian Fainelli, David S. Miller,
	Dominik Brodowski, Greg Kroah-Hartman, Kishon Vijay Abraham I,
	Lars-Peter Clausen, Michael Hennerich, Jonathan Cameron,
	Hartmut Knaack, Peter Meerwald-Stadler, Jiri Slaby, Willy Tarreau,
	Geert Uytterhoeven, Linux 
In-Reply-To: <20180829204900.19390-2-jmkrzyszt@gmail.com>

Hi Janusz,

On Wed, Aug 29, 2018 at 10:48 PM, Janusz Krzysztofik
<jmkrzyszt@gmail.com> wrote:
> Most users of get/set array functions iterate consecutive bits of data,
> usually a single integer, while processing array of results obtained
> from, or building an array of values to be passed to those functions.
> Save time wasted on those iterations by changing the functions' API to
> accept bitmaps.
>
> All current users are updated as well.
>
> More benefits from the change are expected as soon as planned support
> for accepting/passing those bitmaps directly from/to respective GPIO
> chip callbacks if applicable is implemented.
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Miguel Ojeda Sandonis <miguel.ojeda.sandonis@gmail.com>
> Cc: Peter Korsgaard <peter.korsgaard@barco.com>
> Cc: Peter Rosin <peda@axentia.se>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Kishon Vijay Abraham I <kishon@ti.com>
> Cc: Lars-Peter Clausen <lars@metafoo.de>
> Cc: Michael Hennerich <Michael.Hennerich@analog.com>
> Cc: Jonathan Cameron <jic23@kernel.org>
> Cc: Hartmut Knaack <knaack.h@gmx.de>
> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jiri Slaby <jslaby@suse.com>
> Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
> Acked-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  Documentation/driver-api/gpio/consumer.rst  | 22 ++++----
>  drivers/auxdisplay/hd44780.c                | 52 +++++++++--------
>  drivers/bus/ts-nbus.c                       | 19 ++-----
>  drivers/gpio/gpio-max3191x.c                | 17 +++---
>  drivers/gpio/gpiolib.c                      | 86 +++++++++++++++--------------
>  drivers/gpio/gpiolib.h                      |  4 +-
>  drivers/i2c/muxes/i2c-mux-gpio.c            |  8 +--
>  drivers/mmc/core/pwrseq_simple.c            | 13 ++---
>  drivers/mux/gpio.c                          |  9 +--
>  drivers/net/phy/mdio-mux-gpio.c             |  3 +-
>  drivers/pcmcia/soc_common.c                 | 11 ++--
>  drivers/phy/motorola/phy-mapphone-mdm6600.c | 17 +++---
>  drivers/staging/iio/adc/ad7606.c            |  9 +--
>  drivers/tty/serial/serial_mctrl_gpio.c      |  7 ++-
>  include/linux/gpio/consumer.h               | 18 +++---
>  15 files changed, 140 insertions(+), 155 deletions(-)
>
> diff --git a/Documentation/driver-api/gpio/consumer.rst b/Documentation/driver-api/gpio/consumer.rst
> index aa03f389d41d..ed68042ddccf 100644
> --- a/Documentation/driver-api/gpio/consumer.rst
> +++ b/Documentation/driver-api/gpio/consumer.rst
> @@ -323,29 +323,29 @@ The following functions get or set the values of an array of GPIOs::
>
>         int gpiod_get_array_value(unsigned int array_size,
>                                   struct gpio_desc **desc_array,
> -                                 int *value_array);
> +                                 unsigned long *value_bitmap);
>         int gpiod_get_raw_array_value(unsigned int array_size,
>                                       struct gpio_desc **desc_array,
> -                                     int *value_array);
> +                                     unsigned long *value_bitmap);
>         int gpiod_get_array_value_cansleep(unsigned int array_size,
>                                            struct gpio_desc **desc_array,
> -                                          int *value_array);
> +                                          unsigned long *value_bitmap);
>         int gpiod_get_raw_array_value_cansleep(unsigned int array_size,
>                                            struct gpio_desc **desc_array,
> -                                          int *value_array);
> +                                          unsigned long *value_bitmap);
>
>         void gpiod_set_array_value(unsigned int array_size,
>                                    struct gpio_desc **desc_array,
> -                                  int *value_array)
> +                                  unsigned long *value_bitmap)
>         void gpiod_set_raw_array_value(unsigned int array_size,
>                                        struct gpio_desc **desc_array,
> -                                      int *value_array)
> +                                      unsigned long *value_bitmap)
>         void gpiod_set_array_value_cansleep(unsigned int array_size,
>                                             struct gpio_desc **desc_array,
> -                                           int *value_array)
> +                                           unsigned long *value_bitmap)
>         void gpiod_set_raw_array_value_cansleep(unsigned int array_size,
>                                                 struct gpio_desc **desc_array,
> -                                               int *value_array)
> +                                               unsigned long *value_bitmap)
>
>  The array can be an arbitrary set of GPIOs. The functions will try to access
>  GPIOs belonging to the same bank or chip simultaneously if supported by the
> @@ -356,8 +356,8 @@ accessed sequentially.
>  The functions take three arguments:
>         * array_size    - the number of array elements
>         * desc_array    - an array of GPIO descriptors
> -       * value_array   - an array to store the GPIOs' values (get) or
> -                         an array of values to assign to the GPIOs (set)
> +       * value_bitmap  - a bitmap to store the GPIOs' values (get) or
> +                         a bitmap of values to assign to the GPIOs (set)
>
>  The descriptor array can be obtained using the gpiod_get_array() function
>  or one of its variants. If the group of descriptors returned by that function
> @@ -366,7 +366,7 @@ the struct gpio_descs returned by gpiod_get_array()::
>
>         struct gpio_descs *my_gpio_descs = gpiod_get_array(...);
>         gpiod_set_array_value(my_gpio_descs->ndescs, my_gpio_descs->desc,
> -                             my_gpio_values);
> +                             my_gpio_value_bitmap);
>
>  It is also possible to access a completely arbitrary array of descriptors. The
>  descriptors may be obtained using any combination of gpiod_get() and
> diff --git a/drivers/auxdisplay/hd44780.c b/drivers/auxdisplay/hd44780.c
> index f1a42f0f1ded..bbbd6a29bf01 100644
> --- a/drivers/auxdisplay/hd44780.c
> +++ b/drivers/auxdisplay/hd44780.c
> @@ -62,20 +62,19 @@ static void hd44780_strobe_gpio(struct hd44780 *hd)
>  /* write to an LCD panel register in 8 bit GPIO mode */
>  static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW */
> -       unsigned int i, n;
> +       unsigned long value_bitmap[1];  /* for DATA[0-7], RS, RW */

(I read your comments in the other email)

I still find this odd, but if everyone is going to have this change
done like this, consistency is better.

> +       unsigned int n;
>
> -       for (i = 0; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);
>         n = 9;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);
>                 n++;
>         }
>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], values);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }
> @@ -83,32 +82,31 @@ static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  /* write to an LCD panel register in 4 bit GPIO mode */
>  static void hd44780_write_gpio4(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> -       unsigned int i, n;
> +       /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> +       unsigned long value_bitmap[1];

Ditto.

> +       unsigned int n;
>
>         /* High nibble + RS, RW */
> -       for (i = 4; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);
>         n = 5;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);
>                 n++;
>         }
> +       value_bitmap[0] >>= PIN_DATA4;
>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>
>         /* Low nibble */
> -       for (i = 0; i < 4; i++)
> -               values[PIN_DATA4 + i] = !!(val & BIT(i));
> +       value_bitmap[0] &= ~((1 << PIN_DATA4) - 1);
> +       value_bitmap[0] |= val & ~((1 << PIN_DATA4) - 1);

This is still wrong! What I originally meant in my v4 review is that
there is an extra ~ in the second line.

Also, a couple of general comments:

 - Please review the list of CCs (I was not CC'd originally, so maybe
there are other maintainers that aren't, either)
 - In general, the new code seems harder to read than the original one
(but that is my impression).

Thanks for your effort! :-)

Cheers,
Miguel

^ permalink raw reply

* Re: [PATCH net] ebpf: fix bpf_msg_pull_data
From: Daniel Borkmann @ 2018-08-30  7:20 UTC (permalink / raw)
  To: Tushar Dave, john.fastabend, ast, davem, netdev; +Cc: sowmini.varadhan
In-Reply-To: <6e4e767a-3e83-c6e2-b5bf-b82ec1770e62@oracle.com>

On 08/30/2018 02:21 AM, Tushar Dave wrote:
> On 08/29/2018 05:07 PM, Tushar Dave wrote:
>> While doing some preliminary testing it is found that bpf helper
>> bpf_msg_pull_data does not calculate the data and data_end offset
>> correctly. Fix it!
>>
>> Fixes: 015632bb30da ("bpf: sk_msg program helper bpf_sk_msg_pull_data")
>> Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com>
>> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
>> ---
>>   net/core/filter.c | 38 +++++++++++++++++++++++++-------------
>>   1 file changed, 25 insertions(+), 13 deletions(-)
>>
>> diff --git a/net/core/filter.c b/net/core/filter.c
>> index c25eb36..3eeb3d6 100644
>> --- a/net/core/filter.c
>> +++ b/net/core/filter.c
>> @@ -2285,7 +2285,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>   BPF_CALL_4(bpf_msg_pull_data,
>>          struct sk_msg_buff *, msg, u32, start, u32, end, u64, flags)
>>   {
>> -    unsigned int len = 0, offset = 0, copy = 0;
>> +    unsigned int len = 0, offset = 0, copy = 0, off = 0;
>>       struct scatterlist *sg = msg->sg_data;
>>       int first_sg, last_sg, i, shift;
>>       unsigned char *p, *to, *from;
>> @@ -2299,22 +2299,30 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>       i = msg->sg_start;
>>       do {
>>           len = sg[i].length;
>> -        offset += len;
>>           if (start < offset + len)
>>               break;
>> +        offset += len;
>>           i++;
>>           if (i == MAX_SKB_FRAGS)
>>               i = 0;
>> -    } while (i != msg->sg_end);
>> +    } while (i <= msg->sg_end);

I don't think this condition is correct, msg operates as a scatterlist ring,
so sg_end may very well be < current i when there's a wrap-around in the
traversal ... more below.

>>   +    /* return error if start is out of range */
>>       if (unlikely(start >= offset + len))
>>           return -EINVAL;
>>   -    if (!msg->sg_copy[i] && bytes <= len)
>> -        goto out;
>> +    /* return error if i is last entry in sglist and end is out of range */
>> +    if (msg->sg_copy[i] && end > offset + len)
>> +        return -EINVAL;
>>         first_sg = i;
>>   +    /* if i is not last entry in sg list and end (i.e start + bytes) is
>> +     * within this sg[i] then goto out and calculate data and data_end
>> +     */
>> +    if (!msg->sg_copy[i] && end <= offset + len)
>> +        goto out;
>> +
>>       /* At this point we need to linearize multiple scatterlist
>>        * elements or a single shared page. Either way we need to
>>        * copy into a linear buffer exclusively owned by BPF. Then
>> @@ -2330,9 +2338,14 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>           i++;
>>           if (i == MAX_SKB_FRAGS)
>>               i = 0;
>> -        if (bytes < copy)
>> +        if (end < copy)
>>               break;
>> -    } while (i != msg->sg_end);
>> +    } while (i <= msg->sg_end);
>> +
>> +    /* return error if i is last entry in sglist and end is out of range */
>> +    if (i > msg->sg_end && end > offset + copy)
>> +        return -EINVAL;
>> +
>>       last_sg = i;
>>         if (unlikely(copy < end - start))
>> @@ -2342,23 +2355,22 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>       if (unlikely(!page))
>>           return -ENOMEM;
>>       p = page_address(page);
>> -    offset = 0;
>>         i = first_sg;
>>       do {
>>           from = sg_virt(&sg[i]);
>>           len = sg[i].length;
>> -        to = p + offset;
>> +        to = p + off;
>>             memcpy(to, from, len);
>> -        offset += len;
>> +        off += len;
>>           sg[i].length = 0;
>>           put_page(sg_page(&sg[i]));
>>             i++;
>>           if (i == MAX_SKB_FRAGS)
>>               i = 0;
>> -    } while (i != last_sg);
>> +    } while (i < last_sg);
>>         sg[first_sg].length = copy;
>>       sg_set_page(&sg[first_sg], page, copy, 0);
>> @@ -2380,7 +2392,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>           else
>>               move_from = i + shift;
>>   -        if (move_from == msg->sg_end)
>> +        if (move_from > msg->sg_end)
>>               break;
>>             sg[i] = sg[move_from];
>> @@ -2396,7 +2408,7 @@ struct sock *do_msg_redirect_map(struct sk_msg_buff *msg)
>>       if (msg->sg_end < 0)
>>           msg->sg_end += MAX_SKB_FRAGS;
>>   out:
>> -    msg->data = sg_virt(&sg[i]) + start - offset;
>> +    msg->data = sg_virt(&sg[first_sg]) + start - offset;
>>       msg->data_end = msg->data + bytes;
>>         return 0;
>>
> 
> Please discard this patch. I just noticed that Daniel Borkmann sent some similar fixes for bpf_msg_pull_data.

Yeah I've been looking at these recently as well, please make sure you test
with the below fixes included to see if there's anything left:

https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=5b24109b0563d45094c470684c1f8cea1af269f8
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=0e06b227c5221dd51b5569de93f3b9f532be4a32
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=2e43f95dd8ee62bc8bf57f2afac37fbd70c8d565
https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git/commit/net/core/filter.c?id=a8cf76a9023bc6709b1361d06bb2fae5227b9d68

Thanks,
Daniel

^ permalink raw reply

* [PATCHv2] net/rds: RDS is not Radio Data System
From: Pavel Machek @ 2018-08-30 11:30 UTC (permalink / raw)
  To: Trivial patch monkey, Netdev list, sowmini.varadhan,
	santosh.shilimkar, ka-cheong.poon, linux-kernel
  Cc: David S. Miller
In-Reply-To: <20180830102336.GA21063@amd>

[-- Attachment #1: Type: text/plain, Size: 810 bytes --]

Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is
easily confused with Radio Data System (which we may want to support
in kernel, too).

Signed-off-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>

diff --git a/net/rds/Kconfig b/net/rds/Kconfig
index 41f7556..2738f14 100644
--- a/net/rds/Kconfig
+++ b/net/rds/Kconfig
@@ -1,6 +1,6 @@
 
 config RDS
-	tristate "The RDS Protocol"
+	tristate "The Reliable Datagram Sockets Protocol"
 	depends on INET
 	---help---
 	  The RDS (Reliable Datagram Sockets) protocol provides reliable,

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 181 bytes --]

^ permalink raw reply related

* Re: [PATCHv2] net/rds: RDS is not Radio Data System
From: Sowmini Varadhan @ 2018-08-30 11:32 UTC (permalink / raw)
  To: Pavel Machek
  Cc: Trivial patch monkey, Netdev list, santosh.shilimkar,
	ka-cheong.poon, linux-kernel, David S. Miller
In-Reply-To: <20180830113003.GA29794@amd>

On (08/30/18 13:30), Pavel Machek wrote:
> -	tristate "The RDS Protocol"
> +	tristate "The Reliable Datagram Sockets Protocol"


Looks good to me.

Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>

^ permalink raw reply

* Re: [PATCH v5 1/4] gpiolib: Pass bitmaps, not integer arrays, to get/set array
From: Geert Uytterhoeven @ 2018-08-30  7:40 UTC (permalink / raw)
  To: jmkrzyszt
  Cc: Linus Walleij, Jonathan Corbet, Miguel Ojeda Sandonis,
	peter.korsgaard, Peter Rosin, Ulf Hansson, Andrew Lunn,
	Florian Fainelli, David S. Miller, Dominik Brodowski, Greg KH,
	Kishon Vijay Abraham I, Lars-Peter Clausen, Michael Hennerich,
	Jonathan Cameron, Hartmut Knaack, Peter Meerwald, Jiri Slaby,
	Willy Tarreau, open list:DOCUMENTATION
In-Reply-To: <20180829204900.19390-2-jmkrzyszt@gmail.com>

Hi Janusz,

On Wed, Aug 29, 2018 at 10:48 PM Janusz Krzysztofik <jmkrzyszt@gmail.com> wrote:
> Most users of get/set array functions iterate consecutive bits of data,
> usually a single integer, while processing array of results obtained
> from, or building an array of values to be passed to those functions.
> Save time wasted on those iterations by changing the functions' API to
> accept bitmaps.
>
> All current users are updated as well.
>
> More benefits from the change are expected as soon as planned support
> for accepting/passing those bitmaps directly from/to respective GPIO
> chip callbacks if applicable is implemented.
>
> Cc: Jonathan Corbet <corbet@lwn.net>
> Cc: Miguel Ojeda Sandonis <miguel.ojeda.sandonis@gmail.com>
> Cc: Peter Korsgaard <peter.korsgaard@barco.com>
> Cc: Peter Rosin <peda@axentia.se>
> Cc: Andrew Lunn <andrew@lunn.ch>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Dominik Brodowski <linux@dominikbrodowski.net>
> Cc: Kishon Vijay Abraham I <kishon@ti.com>
> Cc: Lars-Peter Clausen <lars@metafoo.de>
> Cc: Michael Hennerich <Michael.Hennerich@analog.com>
> Cc: Jonathan Cameron <jic23@kernel.org>
> Cc: Hartmut Knaack <knaack.h@gmx.de>
> Cc: Peter Meerwald-Stadler <pmeerw@pmeerw.net>
> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Cc: Jiri Slaby <jslaby@suse.com>
> Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
> Acked-by: Ulf Hansson <ulf.hansson@linaro.org>

Thanks for your patch!

> --- a/drivers/auxdisplay/hd44780.c
> +++ b/drivers/auxdisplay/hd44780.c
> @@ -62,20 +62,19 @@ static void hd44780_strobe_gpio(struct hd44780 *hd)
>  /* write to an LCD panel register in 8 bit GPIO mode */
>  static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW */
> -       unsigned int i, n;
> +       unsigned long value_bitmap[1];  /* for DATA[0-7], RS, RW */
> +       unsigned int n;
>
> -       for (i = 0; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);


>         n = 9;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);

The clearing is not needed, as this has been done by 'value_bitmap[0] = val;'

>                 n++;
>         }

So the above block can be simplified to:

        n = hd->pins[PIN_CTRL_RW] ? 10 : 9;

>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], values);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }
> @@ -83,32 +82,31 @@ static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  /* write to an LCD panel register in 4 bit GPIO mode */
>  static void hd44780_write_gpio4(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> -       unsigned int i, n;
> +       /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */

This comment is not correct, as the low bits will be used.

        /* DATA[4-7], RS, RW */

> +       unsigned long value_bitmap[1];
> +       unsigned int n;
>
>         /* High nibble + RS, RW */
> -       for (i = 4; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);
>         n = 5;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);

Not needed.

>                 n++;
>         }

Hence:

        n = hd->pins[PIN_CTRL_RW] ? 6: 5;

> +       value_bitmap[0] >>= PIN_DATA4;

Yuck?!?

Isn't it more readable to just do:

        /* High nibble + RS, RW */
        value_bitmap[0] = val >> 4;
        __assign_bit(4, value_bitmap, rs);
        n = hd->pins[PIN_CTRL_RW] ? 6: 5;

>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>
>         /* Low nibble */
> -       for (i = 0; i < 4; i++)
> -               values[PIN_DATA4 + i] = !!(val & BIT(i));
> +       value_bitmap[0] &= ~((1 << PIN_DATA4) - 1);
> +       value_bitmap[0] |= val & ~((1 << PIN_DATA4) - 1);

... and:

         /* Low nibble */
        value_bitmap[0] &= ~0x0f;
        value_bitmap[0] |= val & 0x0f;

>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }
> @@ -155,23 +153,23 @@ static void hd44780_write_cmd_gpio4(struct charlcd *lcd, int cmd)
>  /* Send 4-bits of a command to the LCD panel in raw 4 bit GPIO mode */
>  static void hd44780_write_cmd_raw_gpio4(struct charlcd *lcd, int cmd)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> +       /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */

This comment is not correct, as the low bits will be used.

        /* DATA[4-7], RS, RW */

> +       unsigned long value_bitmap[1];
>         struct hd44780 *hd = lcd->drvdata;
> -       unsigned int i, n;
> +       unsigned int n;
>
>         /* Command nibble + RS, RW */
> -       for (i = 0; i < 4; i++)
> -               values[PIN_DATA4 + i] = !!(cmd & BIT(i));
> -       values[PIN_CTRL_RS] = 0;
> +       value_bitmap[0] = cmd << PIN_DATA4;
> +       __clear_bit(PIN_CTRL_RS, value_bitmap);

Implied by the assignment above.

>         n = 5;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);
>                 n++;
>         }
> +       value_bitmap[0] = value_bitmap[0] >> PIN_DATA4;

Hence:

        /* Command nibble + RS, RW */
        value_bitmap[0] = cmd;
        n = hd->pins[PIN_CTRL_RW] ? 6: 5;


>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* Re: [PATCH net-next 5/5] ppp: handle PPPIOCGIDLE for 64-bit time_t
From: Arnd Bergmann @ 2018-08-30 11:47 UTC (permalink / raw)
  To: g.nault
  Cc: Paul Mackerras, linux-ppp, Networking, mitch, mostrows, jchapman,
	xeb, David Miller, Al Viro, y2038 Mailman List,
	Linux Kernel Mailing List, Karsten Keil, open list:DOCUMENTATION
In-Reply-To: <20180830110601.GA19534@alphalink.fr>

On Thu, Aug 30, 2018 at 1:06 PM Guillaume Nault <g.nault@alphalink.fr> wrote:
> On Wed, Aug 29, 2018 at 04:03:30PM +0200, Arnd Bergmann wrote:

> > @@ -743,10 +744,17 @@ static long ppp_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
> >               err = 0;
> >               break;
> >
> > -     case PPPIOCGIDLE:
> > -             idle.xmit_idle = (jiffies - ppp->last_xmit) / HZ;
> > -             idle.recv_idle = (jiffies - ppp->last_recv) / HZ;
> > -             if (copy_to_user(argp, &idle, sizeof(idle)))
> > +     case PPPIOCGIDLE32:
> > +                idle32.xmit_idle = (jiffies - ppp->last_xmit) / HZ;
> > +                idle32.recv_idle = (jiffies - ppp->last_recv) / HZ;
> > +                if (copy_to_user(argp, &idle32, sizeof(idle32)))
> >
> Missing 'break;'
>
> > +             err = 0;
> > +             break;
> > +
> > +     case PPPIOCGIDLE64:
> > +             idle64.xmit_idle = (jiffies - ppp->last_xmit) / HZ;
> > +             idle64.recv_idle = (jiffies - ppp->last_recv) / HZ;
> > +             if (copy_to_user(argp, &idle32, sizeof(idle32)))
> >
> I guess you meant 'idle64' instead of 'idle32'.

Good catch, fixing up both now.

Thanks for the review!

     Arnd

^ permalink raw reply

* Re: [PATCH net-next 1/5] pppoe: fix PPPOEIOCSFWD compat handling
From: Arnd Bergmann @ 2018-08-30 11:54 UTC (permalink / raw)
  To: g.nault
  Cc: Paul Mackerras, linux-ppp, Networking, mitch, mostrows, jchapman,
	xeb, David Miller, Al Viro, y2038 Mailman List,
	Linux Kernel Mailing List
In-Reply-To: <20180830110453.GC1473@alphalink.fr>

On Thu, Aug 30, 2018 at 1:04 PM Guillaume Nault <g.nault@alphalink.fr> wrote:
>
> On Wed, Aug 29, 2018 at 04:03:26PM +0200, Arnd Bergmann wrote:
> > Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
> > linux-2.5.69 along with hundreds of other commands, but was always broken
> > sincen only the structure is compatible, but the command number is not,
> > due to the size being sizeof(size_t), or at first sizeof(sizeof((struct
> > sockaddr_pppox)), which is different on 64-bit architectures.
> >
> And the implementation was broken until 2016 (see 29e73269aa4d ("pppoe:
> fix reference counting in PPPoE proxy")), and nobody ever noticed. I
> should probably have removed this ioctl entirely instead of fixing it.
> Clearly, it has never been used.
>
> If you think it's worth fixing (as opposed to dropping this ioctl or
> its compat mode), then,
> Acked-by: Guillaume Nault <g.nault@alphalink.fr>

I don't care much, but fixing it seems seems easier than coming
up with a convincing rationale for dropping.

I'll update the changelog text to include your additional background
information though, unless someone else prefers to have it dropped.

       Arnd

^ permalink raw reply

* Re: [PATCH] ieee802154: mcr20a: read out of bounds in mcr20a_set_channel()
From: Dan Carpenter @ 2018-08-30  7:54 UTC (permalink / raw)
  To: Xue Liu
  Cc: alex. aring, Stefan Schmidt, David S. Miller, linux-wpan - ML,
	netdev, kernel-janitors
In-Reply-To: <CAJuUDwtjRi0HnR95uaFp4zviSC7Yrs68o8jsK1nY6jWu6909bg@mail.gmail.com>

On Wed, Aug 29, 2018 at 07:50:51PM +0200, Xue Liu wrote:
> Hello Dan,
> 
> 
> On Wed, 29 Aug 2018 at 16:49, Dan Carpenter <dan.carpenter@oracle.com>
> wrote:
> 
> > The "channel" variable can be any u8 value.  We need to make sure we
> > don't read outside of the PLL_INT[] or PLL_FRAC[] arrays.
> >
> I think the “channel” variable can not be any u8 value. This values is
> already checked before set_channel function is called.
> https://github.com/torvalds/linux/blob/master/net/ieee802154/nl802154.c#L978

Oh...  Hm...  I should have reviewed more carefully.  This patch isn't
correct.  But there may still be a bug.  What Smatch is worried about is
the other call tree:

ieee802154_start_req() calls (struct ieee802154_mlme_ops)->start_req
  -> mac802154_mlme_start_req()
     -> mac802154_dev_set_page_channel()
        -> drv_set_channel() calls local->ops->set_channel(&local->hw, page, channel);
           -> mcr20a_set_channel()

So maybe we could move the check from nl802154_set_channel() to
drv_set_channel() so channel is checked on both call trees.

regards,
dan carpenter

^ permalink raw reply

* [PATCH bpf-next] xsk: include XDP meta data in AF_XDP frames
From: Björn Töpel @ 2018-08-30  8:09 UTC (permalink / raw)
  To: bjorn.topel, magnus.karlsson, magnus.karlsson, ast, daniel,
	netdev
  Cc: Björn Töpel, brouer

From: Björn Töpel <bjorn.topel@intel.com>

Previously, the AF_XDP (XDP_DRV/XDP_SKB copy-mode) ingress logic did
not include XDP meta data in the data buffers copied out to the user
application.

In this commit, we check if meta data is available, and if so, it is
prepended to the frame.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk.c | 37 ++++++++++++++++++++++++++++---------
 1 file changed, 28 insertions(+), 9 deletions(-)

diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 4e937cd7c17d..817e4cee1540 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -55,20 +55,30 @@ EXPORT_SYMBOL(xsk_umem_discard_addr);
 
 static int __xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, u32 len)
 {
-	void *buffer;
+	void *to_buf, *from_buf;
+	u32 metalen;
 	u64 addr;
 	int err;
 
 	if (!xskq_peek_addr(xs->umem->fq, &addr) ||
-	    len > xs->umem->chunk_size_nohr) {
+	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
 		xs->rx_dropped++;
 		return -ENOSPC;
 	}
 
 	addr += xs->umem->headroom;
 
-	buffer = xdp_umem_get_data(xs->umem, addr);
-	memcpy(buffer, xdp->data, len);
+	if (xdp_data_meta_unsupported(xdp)) {
+		from_buf = xdp->data;
+		metalen = 0;
+	} else {
+		from_buf = xdp->data_meta;
+		metalen = xdp->data - xdp->data_meta;
+	}
+
+	to_buf = xdp_umem_get_data(xs->umem, addr);
+	memcpy(to_buf, from_buf, len + metalen);
+	addr += metalen;
 	err = xskq_produce_batch_desc(xs->rx, addr, len);
 	if (!err) {
 		xskq_discard_addr(xs->umem->fq);
@@ -111,8 +121,8 @@ void xsk_flush(struct xdp_sock *xs)
 
 int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 {
-	u32 len = xdp->data_end - xdp->data;
-	void *buffer;
+	u32 metalen, len = xdp->data_end - xdp->data;
+	void *to_buf, *from_buf;
 	u64 addr;
 	int err;
 
@@ -120,15 +130,24 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 		return -EINVAL;
 
 	if (!xskq_peek_addr(xs->umem->fq, &addr) ||
-	    len > xs->umem->chunk_size_nohr) {
+	    len > xs->umem->chunk_size_nohr - XDP_PACKET_HEADROOM) {
 		xs->rx_dropped++;
 		return -ENOSPC;
 	}
 
 	addr += xs->umem->headroom;
 
-	buffer = xdp_umem_get_data(xs->umem, addr);
-	memcpy(buffer, xdp->data, len);
+	if (xdp_data_meta_unsupported(xdp)) {
+		from_buf = xdp->data;
+		metalen = 0;
+	} else {
+		from_buf = xdp->data_meta;
+		metalen = xdp->data - xdp->data_meta;
+	}
+
+	to_buf = xdp_umem_get_data(xs->umem, addr);
+	memcpy(to_buf, from_buf, len + metalen);
+	addr += metalen;
 	err = xskq_produce_batch_desc(xs->rx, addr, len);
 	if (!err) {
 		xskq_discard_addr(xs->umem->fq);
-- 
2.17.1

^ permalink raw reply related

* Re: mlx5 driver loading failing on v4.19 / net-next / bpf-next
From: Tariq Toukan @ 2018-08-30  8:35 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Saeed Mahameed, netdev@vger.kernel.org
  Cc: Eran Ben Elisha
In-Reply-To: <20180829170358.5d822db8@redhat.com>



On 29/08/2018 6:05 PM, Jesper Dangaard Brouer wrote:
> Hi Saeed,
> 
> I'm having issues loading mlx5 driver on v4.19 kernels (tested both
> net-next and bpf-next), while kernel v4.18 seems to work.  It happens
> with a Mellanox ConnectX-5 NIC (and also a CX4-Lx but I removed that
> from the system now).
> 

Hi Jesper,

Thanks for your report!

We are working to analyze and debug the issue.

> One pain point is very long boot-time, caused by some timeout code in
> the driver. The kernel console log (dmesg) says:
> 
> [    5.763330] mlx5_core 0000:03:00.0: firmware version: 16.22.1002
> [    5.769367] mlx5_core 0000:03:00.0: 126.016 Gb/s available PCIe bandwidth, limited by 8 GT/s x16 link at 0000:00:02.0 (capable of 252.048 Gb/s with 16 GT/s x16 link)
> 
> (...) other drivers loading
> 
> [   66.816635] mlx5_core 0000:03:00.0: wait_func:964:(pid 112): ENABLE_HCA(0x104) timeout. Will cause a leak of a command resource
> [   66.828123] mlx5_core 0000:03:00.0: enable hca failed
> [   66.845516] mlx5_core 0000:03:00.0: mlx5_load_one failed with error code -110
> [   66.852802] mlx5_core: probe of 0000:03:00.0 failed with error -110
> 
> [   66.859347] mlx5_core 0000:03:00.1: firmware version: 16.22.1002
> [   66.865388] mlx5_core 0000:03:00.1: 126.016 Gb/s available PCIe bandwidth, limited by 8 GT/s x16 link at 0000:00:02.0 (capable of 252.048 Gb/s with 16 GT/s x16 link)
> 
> [  125.787395] XFS (sda3): Mounting V5 Filesystem
> [  125.848509] XFS (sda3): Ending clean mount
> [  127.984784] mlx5_core 0000:03:00.1: wait_func:964:(pid 5): ENABLE_HCA(0x104) timeout. Will cause a leak of a command resource
> [  127.996090] mlx5_core 0000:03:00.1: enable hca failed
> [  128.013819] mlx5_core 0000:03:00.1: mlx5_load_one failed with error code -110
> [  128.021076] mlx5_core: probe of 0000:03:00.1 failed with error -110
> 
> 
> Do you have any idea what could be causing this?
> 

We'll update regarding any progress.

Thanks!

^ permalink raw reply

* Re: [PATCH 0/3] IB/ipoib: Use dev_port to disambiguate port numbers
From: Arseny Maslennikov @ 2018-08-30  8:42 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: linux-rdma, Doug Ledford, Jason Gunthorpe, netdev
In-Reply-To: <20180830054330.GC2796@mtr-leonro.mtl.com>

[-- Attachment #1: Type: text/plain, Size: 1805 bytes --]

On Thu, Aug 30, 2018 at 08:43:30AM +0300, Leon Romanovsky wrote:
> On Wed, Aug 29, 2018 at 12:01:14AM +0300, Arseny Maslennikov wrote:
> > Pre-3.15 userspace had trouble distinguishing different ports
> > of a NIC on a single PCI bus/device/function. To solve this,
> > a sysfs field `dev_port' was introduced quite a while ago
> > (commit v3.14-rc3-739-g3f85944fe207), and some relevant device
> > drivers were fixed to use it, but not in case of IPoIB.
> >
> > The convention for some reason never got documented in the kernel, but
> > was immediately adopted by userspace (notably udev[1][2], biosdevname[3])
> >
> > 3/3 documents the sysfs field — that's why I'm CC-ing netdev.
> >
> > This series was tested on and applies to 4.19-rc1.
> >
> > [1] https://lists.freedesktop.org/archives/systemd-devel/2014-June/020788.html
> > [2] https://lists.freedesktop.org/archives/systemd-devel/2014-July/020804.html
> > [3] https://github.com/CloudAutomationNTools/biosdevname/blob/c795d51dd93a5309652f0d635f12a3ecfabfaa72/src/eths.c#L38
> >
> > Arseny Maslennikov (3):
> >   IB/ipoib: Use dev_port to expose network interface port numbers
> >   IB/ipoib: Stop using dev_id to expose port numbers
> 
> I completely agree with previous Yuval's comment, it makes no sense to
> start separate commits for every line.
> 
> Please decide what is best and right behavior and do it, instead of
> pushing it up to be the maintainer's problem.
> 

No problem; will squash those two in v2, then.

> Thanks
> 
> >   Documentation/ABI: document /sys/class/net/*/dev_port
> >
> >  Documentation/ABI/testing/sysfs-class-net | 10 ++++++++++
> >  drivers/infiniband/ulp/ipoib/ipoib_main.c |  2 +-
> >  2 files changed, 11 insertions(+), 1 deletion(-)
> >
> > --
> > 2.18.0
> >



[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH 0/4] clk-pmc-atom + r8169: Add ether_clk handling to fix suspend issues
From: Andy Shevchenko @ 2018-08-30  8:43 UTC (permalink / raw)
  To: Hans de Goede
  Cc: David S . Miller, Heiner Kallweit, Michael Turquette,
	Stephen Boyd, Irina Tirdea, netdev, Johannes Stezenbach,
	Carlo Caione, linux-clk
In-Reply-To: <e3be83dc-67c7-fdaa-1dd9-0a4aa3d10584@redhat.com>

On Wed, Aug 29, 2018 at 07:06:09PM +0200, Hans de Goede wrote:

> > What a nice stuff, thanks!
> > 
> > Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
> 
> Thank you (and also thank you for the other reviews)
> 
> > Btw, you probably better to refer to
> > https://bugzilla.kernel.org/show_bug.cgi?id=196861 which is dedicated for S0ix
> > issue.
> 
> Ok, I've added a link to that. I've also added:
> 
> Reported-by: Johannes Stezenbach <js@sig21.net>
> 
> To honor Johannes for figuring out that leaving the clocks enabled
> was a problem in the first place.
> 
> This will all be included in v2 of the series when I send it out.

Forgot to mention, instead of Irina, which is not longer working for Intel for
more than a year, put Pierre's address there.


-- 
With Best Regards,
Andy Shevchenko

^ permalink raw reply

* Re: [PATCH bpf-next 00/11] AF_XDP zero-copy support for i40e
From: Björn Töpel @ 2018-08-30  9:05 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Karlsson, Magnus, Magnus Karlsson, Duyck, Alexander H,
	Alexander Duyck, ast, Jesper Dangaard Brouer, Netdev,
	Brandeburg, Jesse, Singhai, Anjali, peter.waskiewicz.jr,
	Björn Töpel, michael.lundkvist, Willem de Bruijn,
	John Fastabend, Jakub Kicinski, neerav.parikh, MykytaI Iziumtsev,
	Francois Ozog
In-Reply-To: <1c26d5c6-cd37-b022-fe34-1ca48ada598f@iogearbox.net>

Den ons 29 aug. 2018 kl 18:12 skrev Daniel Borkmann <daniel@iogearbox.net>:
>
> On 08/28/2018 02:44 PM, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > This patch set introduces zero-copy AF_XDP support for Intel's i40e
> > driver. In the first preparatory patch we also add support for
> > XDP_REDIRECT for zero-copy allocated frames so that XDP programs can
> > redirect them. This was a ToDo from the first AF_XDP zero-copy patch
> > set from early June. Special thanks to Alex Duyck and Jesper Dangaard
> > Brouer for reviewing earlier versions of this patch set.
> >
> > The i40e zero-copy code is located in its own file i40e_xsk.[ch]. Note
> > that in the interest of time, to get an AF_XDP zero-copy implementation
> > out there for people to try, some code paths have been copied from the
> > XDP path to the zero-copy path. It is out goal to merge the two paths
> > in later patch sets.
> >
> > In contrast to the implementation from beginning of June, this patch
> > set does not require any extra HW queues for AF_XDP zero-copy
> > TX. Instead, the XDP TX HW queue is used for both XDP_REDIRECT and
> > AF_XDP zero-copy TX.
> >
> > Jeff, given that most of changes are in i40e, it is up to you how you
> > would like to route these patches. The set is tagged bpf-next, but
> > if taking it via the Intel driver tree is easier, let us know.
> >
> > We have run some benchmarks on a dual socket system with two Broadwell
> > E5 2660 @ 2.0 GHz with hyperthreading turned off. Each socket has 14
> > cores which gives a total of 28, but only two cores are used in these
> > experiments. One for TR/RX and one for the user space application. The
> > memory is DDR4 @ 2133 MT/s (1067 MHz) and the size of each DIMM is
> > 8192MB and with 8 of those DIMMs in the system we have 64 GB of total
> > memory. The compiler used is gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0. The
> > NIC is Intel I40E 40Gbit/s using the i40e driver.
> >
> > Below are the results in Mpps of the I40E NIC benchmark runs for 64
> > and 1500 byte packets, generated by a commercial packet generator HW
> > outputing packets at full 40 Gbit/s line rate. The results are with
> > retpoline and all other spectre and meltdown fixes, so these results
> > are not comparable to the ones from the zero-copy patch set in June.
> >
> > AF_XDP performance 64 byte packets.
> > Benchmark   XDP_SKB    XDP_DRV    XDP_DRV with zerocopy
> > rxdrop       2.6        8.2         15.0
> > txpush       2.2        -           21.9
> > l2fwd        1.7        2.3         11.3
> >
> > AF_XDP performance 1500 byte packets:
> > Benchmark   XDP_SKB   XDP_DRV     XDP_DRV with zerocopy
> > rxdrop       2.0        3.3         3.3
> > l2fwd        1.3        1.7         3.1
> >
> > XDP performance on our system as a base line:
> >
> > 64 byte packets:
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      16      18.4M  0
> >
> > 1500 byte packets:
> > XDP stats       CPU     pps         issue-pps
> > XDP-RX CPU      16      3.3M    0
> >
> > The structure of the patch set is as follows:
> >
> > Patch 1: Add support for XDP_REDIRECT of zero-copy allocated frames
> > Patches 2-4: Preparatory patches to common xsk and net code
> > Patches 5-7: Preparatory patches to i40e driver code for RX
> > Patch 8: i40e zero-copy support for RX
> > Patch 9: Preparatory patch to i40e driver code for TX
> > Patch 10: i40e zero-copy support for TX
> > Patch 11: Add flags to sample application to force zero-copy/copy mode
> >
> > We based this patch set on bpf-next commit 050cdc6c9501 ("Merge
> > git://git.kernel.org/pub/scm/linux/kernel/git/davem/net")
> >
> >
> > Magnus & Björn
> >
> > Björn Töpel (8):
> >   xdp: implement convert_to_xdp_frame for MEM_TYPE_ZERO_COPY
> >   xdp: export xdp_rxq_info_unreg_mem_model
> >   xsk: expose xdp_umem_get_{data,dma} to drivers
> >   i40e: added queue pair disable/enable functions
> >   i40e: refactor Rx path for re-use
> >   i40e: move common Rx functions to i40e_txrx_common.h
> >   i40e: add AF_XDP zero-copy Rx support
> >   samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsock
> >
> > Magnus Karlsson (3):
> >   net: add napi_if_scheduled_mark_missed
> >   i40e: move common Tx functions to i40e_txrx_common.h
> >   i40e: add AF_XDP zero-copy Tx support
>
> Thanks for working on this, LGTM! Are you also planning to get ixgbe
> out after that?
>

Yes, the plan is to get ixgbe out as the next driver, but we'll focus
on the existing i40e issues first.


Thanks,
Björn

> For the series:
>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>
> Thanks,
> Daniel

^ permalink raw reply

* Re: [PATCH net-next 1/5] pppoe: fix PPPOEIOCSFWD compat handling
From: Guillaume Nault @ 2018-08-30 13:09 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Paul Mackerras, linux-ppp, Networking, mitch, mostrows, jchapman,
	xeb, David Miller, Al Viro, y2038 Mailman List,
	Linux Kernel Mailing List
In-Reply-To: <CAK8P3a1WSa5yPWXYGieAc0+Xk=CAy_G-RhNNcQBsGQGOU-8WfA@mail.gmail.com>

On Thu, Aug 30, 2018 at 01:54:48PM +0200, Arnd Bergmann wrote:
> On Thu, Aug 30, 2018 at 1:04 PM Guillaume Nault <g.nault@alphalink.fr> wrote:
> >
> > On Wed, Aug 29, 2018 at 04:03:26PM +0200, Arnd Bergmann wrote:
> > > Support for handling the PPPOEIOCSFWD ioctl in compat mode was added in
> > > linux-2.5.69 along with hundreds of other commands, but was always broken
> > > sincen only the structure is compatible, but the command number is not,
> > > due to the size being sizeof(size_t), or at first sizeof(sizeof((struct
> > > sockaddr_pppox)), which is different on 64-bit architectures.
> > >
> > And the implementation was broken until 2016 (see 29e73269aa4d ("pppoe:
> > fix reference counting in PPPoE proxy")), and nobody ever noticed. I
> > should probably have removed this ioctl entirely instead of fixing it.
> > Clearly, it has never been used.
> >
> > If you think it's worth fixing (as opposed to dropping this ioctl or
> > its compat mode), then,
> > Acked-by: Guillaume Nault <g.nault@alphalink.fr>
> 
> I don't care much, but fixing it seems seems easier than coming
> up with a convincing rationale for dropping.
> 
> I'll update the changelog text to include your additional background
> information though, unless someone else prefers to have it dropped.
> 
Sounds good. Thanks.

^ permalink raw reply

* [PATCH] staging: fsl-dpaa2/ethsw: Fix uninitialized variables
From: Ioana Radulescu @ 2018-08-30 13:17 UTC (permalink / raw)
  To: gregkh, netdev; +Cc: devel, linux-kernel, dan.carpenter, ioana.ciornei

Functions port_vlans_add() and port_vlans_del() could,
in theory, return an uninitialized variable. Fix this
by initializing the variable in question at declaration.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index ecdd3d8..c1616c3 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -717,7 +717,7 @@ static int port_vlans_add(struct net_device *netdev,
 			  struct switchdev_trans *trans)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-	int vid, err;
+	int vid, err = 0;
 
 	if (netif_is_bridge_master(vlan->obj.orig_dev))
 		return -EOPNOTSUPP;
@@ -872,7 +872,7 @@ static int port_vlans_del(struct net_device *netdev,
 			  const struct switchdev_obj_port_vlan *vlan)
 {
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-	int vid, err;
+	int vid, err = 0;
 
 	if (netif_is_bridge_master(vlan->obj.orig_dev))
 		return -EOPNOTSUPP;
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH 00/15] soc: octeontx2: Add RVU admin function driver
From: Andrew Lunn @ 2018-08-30 13:26 UTC (permalink / raw)
  To: Sunil Kovvuri
  Cc: Arnd Bergmann, LKML, olof, LAKML, linux-soc, Sunil Goutham,
	Linux Netdev List, David S. Miller
In-Reply-To: <CA+sq2CfbpQKas7TbQfrcw=Drpb7yO_gqH31iXabjNEQEG4xYrQ@mail.gmail.com>

> > > My feeling overall is that we need a review from the network driver
> > > folks more than the arm-soc team etc, and that maybe the driver
> > > as a whole should go into drivers/net/ethernet.
> >
> > This driver doesn't handle any network IO and moreever this driver has to handle
> > configuration requests from crypto driver as well. There will be
> > separate network and
> > crypto drivers which will be upstreamed into drivers/net/ethernet and
> > drivers/crypto.
> > And in future silicons there will be different types of functional
> > blocks which will be
> > added into this resource virtualization unit (RVU). Hence i thought
> > this driver is not a
> > right fit in drivers/net/ethernet.

Hi Sunil

Do you have a git branch for everything? I would like to look at the
actual Ethernet driver, and the full API this driver exports to other
drivers.

I think there real question here is, do you have split between this
driver and the actual device drivers in the right place? For me, link
up/down detection should be in the Ethernet driver, since it is not
shared with the crypto driver.

       Thanks
	Andrew

^ permalink raw reply

* Re: [PATCH 04/15] soc: octeontx2: Add mailbox support infra
From: Arnd Bergmann @ 2018-08-30 13:56 UTC (permalink / raw)
  To: Sunil Kovvuri
  Cc: Linux Kernel Mailing List, Olof Johansson, Linux ARM, linux-soc,
	amakarov, sgoutham, lbartosik, Networking, David Miller
In-Reply-To: <CA+sq2Cdn6ut3kdTWoOgCayO2evQrD9TjBaUpVUR3_rWZFxYeig@mail.gmail.com>

On Tue, Aug 28, 2018 at 3:23 PM Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
>
> On Tue, Aug 28, 2018 at 6:22 PM Arnd Bergmann <arnd@arndb.de> wrote:
> >
> > On Tue, Aug 28, 2018 at 2:48 PM Sunil Kovvuri <sunil.kovvuri@gmail.com> wrote:
> > >
> > > On Tue, Aug 28, 2018 at 5:33 PM Arnd Bergmann <arnd@arndb.de> wrote:
> > > >
> > > > On Tue, Aug 28, 2018 at 12:57 PM <sunil.kovvuri@gmail.com> wrote:
> > > > >
> > > > > From: Aleksey Makarov <amakarov@marvell.com>
> > > > >
> > > > > This patch adds mailbox support infrastructure APIs.
> > > > > Each RVU device has a dedicated 64KB mailbox region
> > > > > shared with it's peer for communication. RVU AF has
> > > > > a separate mailbox region shared with each of RVU PFs
> > > > > and a RVU PF has a separate region shared with each of
> > > > > it's VF.
> > > > >
> > > > > These set of APIs are used by this driver (RVU AF) and
> > > > > other RVU PF/VF drivers eg netdev, crypto e.t.c.
> > > > >
> > > > > Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
> > > > > Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
> > > > > Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
> > > >
> > > > Why does this driver not use the drivers/mailbox/ infrastructure?
> > > >
> > > This is a common administrative software driver which will be handling requests
> > > from kernel drivers and as well as drivers in userspace applications.
> > > We had to keep mailbox communication infrastructure same across all usages.
> >
> > Can you explain more about the usage of userspace applications
> > and what interface you plan to use into the kernel?
>
> Any PCI device here irrespective in what domain (kernel or userspace)
> they are in
> use common mailbox communication. Which is
> # Write a mailbox msg (format is agreed between all parties) into
> shared (between AF and other PF/VFs)
>    memory region and trigger a interrupt to admin function.
> # Admin function processes the msg and puts reply in the same memory
> region and trigger
>    IRQ to the requesting device. If the device has a driver instance
> in kernel then it uses
>    IRQ and userspace applications does polling on the IRQ status bit.

Ok, so the mailbox here is a communication mechanism between
two device drivers that may run on the same kernel, or in different
instances (user space, virtual machine, ...), but each driver
only talks to the mailbox visible in its own device, right?

What is the purpose of the exported interface then? Is this
just an abstraction so each of the drivers can talk to its own
mailbox using a set of common helper functions?

      Arnd

^ permalink raw reply

* [PATCH] rtl8xxxu: Add rtl8188ctv support
From: Aleksei Mamlin @ 2018-08-30 14:05 UTC (permalink / raw)
  To: Jes Sorensen
  Cc: Kalle Valo, David S . Miller, linux-wireless, netdev,
	linux-kernel, Aleksei Mamlin

The Realtek rtl8188ctv (0x0bda:0x018a) is a highly integrated single-chip
WLAN USB2.0 network interface controller.

Currently rtl8188ctv is supported by rtlwifi driver.
It is similar to the rtl8188cus(0x0bda:0x818a) and uses the same config.

Signed-off-by: Aleksei Mamlin <mamlinav@gmail.com>
---
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 505ab1b055ff..73f6fc0d4a01 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -6231,6 +6231,8 @@ static const struct usb_device_id dev_table[] = {
 {USB_DEVICE_AND_INTERFACE_INFO(0x2001, 0x3308, 0xff, 0xff, 0xff),
 	.driver_info = (unsigned long)&rtl8192cu_fops},
 /* Currently untested 8188 series devices */
+{USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_REALTEK, 0x018a, 0xff, 0xff, 0xff),
+	.driver_info = (unsigned long)&rtl8192cu_fops},
 {USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_REALTEK, 0x8191, 0xff, 0xff, 0xff),
 	.driver_info = (unsigned long)&rtl8192cu_fops},
 {USB_DEVICE_AND_INTERFACE_INFO(USB_VENDOR_ID_REALTEK, 0x8170, 0xff, 0xff, 0xff),
-- 
2.16.4

^ permalink raw reply related

* [PATCH net-next] packet: add sockopt to ignore outgoing packets
From: Vincent Whitchurch @ 2018-08-30 10:01 UTC (permalink / raw)
  To: davem; +Cc: netdev, willemb, Vincent Whitchurch

Currently, the only way to ignore outgoing packets on a packet socket is
via the BPF filter.  With MSG_ZEROCOPY, packets that are looped into
AF_PACKET are copied in dev_queue_xmit_nit(), and this copy happens even
if the filter run from packet_rcv() would reject them.  So the presence
of a packet socket on the interface takes away the benefits of
MSG_ZEROCOPY, even if the packet socket is not interested in outgoing
packets.  (Even when MSG_ZEROCOPY is not used, the skb is unnecessarily
cloned, but the cost for that is much lower.)

Add a socket option to allow AF_PACKET sockets to ignore outgoing
packets to solve this.  Note that the *BSDs already have something
similar: BIOCSSEESENT/BIOCSDIRECTION and BIOCSDIRFILT.

The first intended user is lldpd.

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
 include/linux/netdevice.h      |  1 +
 include/uapi/linux/if_packet.h |  1 +
 net/core/dev.c                 |  3 +++
 net/packet/af_packet.c         | 15 +++++++++++++++
 4 files changed, 20 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ca5ab98053c8..8ef14d9edc58 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2317,6 +2317,7 @@ static inline struct sk_buff *call_gro_receive_sk(gro_receive_sk_t cb,
 
 struct packet_type {
 	__be16			type;	/* This is really htons(ether_type). */
+	bool			ignore_outgoing;
 	struct net_device	*dev;	/* NULL is wildcarded here	     */
 	int			(*func) (struct sk_buff *,
 					 struct net_device *,
diff --git a/include/uapi/linux/if_packet.h b/include/uapi/linux/if_packet.h
index 67b61d91d89b..467b654bd4c7 100644
--- a/include/uapi/linux/if_packet.h
+++ b/include/uapi/linux/if_packet.h
@@ -57,6 +57,7 @@ struct sockaddr_ll {
 #define PACKET_QDISC_BYPASS		20
 #define PACKET_ROLLOVER_STATS		21
 #define PACKET_FANOUT_DATA		22
+#define PACKET_IGNORE_OUTGOING		23
 
 #define PACKET_FANOUT_HASH		0
 #define PACKET_FANOUT_LB		1
diff --git a/net/core/dev.c b/net/core/dev.c
index 325fc5088370..0addb4f0abfe 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1947,6 +1947,9 @@ static inline bool skb_loop_sk(struct packet_type *ptype, struct sk_buff *skb)
 	if (!ptype->af_packet_priv || !skb->sk)
 		return false;
 
+	if (ptype->ignore_outgoing)
+		return true;
+
 	if (ptype->id_match)
 		return ptype->id_match(ptype, skb->sk);
 	else if ((struct sock *)ptype->af_packet_priv == skb->sk)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 5610061e7f2e..37bbafad724a 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3805,6 +3805,18 @@ packet_setsockopt(struct socket *sock, int level, int optname, char __user *optv
 
 		return fanout_set_data(po, optval, optlen);
 	}
+	case PACKET_IGNORE_OUTGOING:
+	{
+		int val;
+
+		if (optlen != sizeof(val))
+			return -EINVAL;
+		if (copy_from_user(&val, optval, sizeof(val)))
+			return -EFAULT;
+
+		po->prot_hook.ignore_outgoing = !!val;
+		return 0;
+	}
 	case PACKET_TX_HAS_OFF:
 	{
 		unsigned int val;
@@ -3928,6 +3940,9 @@ static int packet_getsockopt(struct socket *sock, int level, int optname,
 			((u32)po->fanout->flags << 24)) :
 		       0);
 		break;
+	case PACKET_IGNORE_OUTGOING:
+		val = po->prot_hook.ignore_outgoing;
+		break;
 	case PACKET_ROLLOVER_STATS:
 		if (!po->rollover)
 			return -EINVAL;
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH] rtnetlink: expose value from SET_NETDEV_DEVTYPE via IFLA_DEVTYPE attribute
From: Marcel Holtmann @ 2018-08-30 10:13 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, David S. Miller
In-Reply-To: <20180830051230.GE2181@nanopsycho>

Hi Jiri,

>>>> The name value from SET_NETDEV_DEVTYPE only ended up in the uevent sysfs
>>>> file as DEVTYPE= information. To avoid any kind of race conditions
>>>> between netlink messages and reading from sysfs, it is useful to add the
>>>> same string as new IFLA_DEVTYPE attribute included in the RTM_NEWLINK
>>>> messages.
>>>> 
>>>> For network managing daemons that have to classify ARPHRD_ETHER network
>>>> devices into different types (like Wireless LAN, Bluetooth etc.), this
>>>> avoids the extra round trip to sysfs and parsing of the uevent file.
>>>> 
>>>> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
>>>> ---
>>>> include/uapi/linux/if_link.h |  2 ++
>>>> net/core/rtnetlink.c         | 12 ++++++++++++
>>>> 2 files changed, 14 insertions(+)
>>>> 
>>>> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>>>> index 43391e2d1153..781294972bb4 100644
>>>> --- a/include/uapi/linux/if_link.h
>>>> +++ b/include/uapi/linux/if_link.h
>>>> @@ -166,6 +166,8 @@ enum {
>>>> 	IFLA_NEW_IFINDEX,
>>>> 	IFLA_MIN_MTU,
>>>> 	IFLA_MAX_MTU,
>>>> +	IFLA_DEVTYPE,		/* Name value from SET_NETDEV_DEVTYPE */
>>> 
>>> This is not something netdev-related. dev->dev.type is struct device_type.
>>> This is a generic "device" thing. Incorrect to expose over
>>> netdev-specific API. Please use "device" API for this.
>> 
>> it is not just "device" related since this is a sub-classification of ARPHRD_ETHER type as a wrote above. Don't get hang up that this information is part of struct device. The dev->dev.type contains strings like "wlan", "bluetooth", "wimax", "gadget" etc. so that you can tell what kind of Ethernet card you have.
> 
> I understand. But using dev->dev.type for that purpose seems like an
> abuse. If this is subtype of netdev, it should not be stored in parent
> device. I believe that you need to re-introduce this as a part of struct
> net_device - preferably an enum - and that you can expose over rtnl (and
> net-sysfs).

it is not stored in the parent device. It is stored in the device of the netdev.

As Stephen said, there is no device API. And why would I add the same info again and bloat netdev?

drivers/net/bonding/bond_main.c:        SET_NETDEV_DEVTYPE(bond_dev, &bond_type);
drivers/net/geneve.c:   SET_NETDEV_DEVTYPE(dev, &geneve_type);
drivers/net/macsec.c:   SET_NETDEV_DEVTYPE(dev, &macsec_type);
drivers/net/ppp/ppp_generic.c:  SET_NETDEV_DEVTYPE(dev, &ppp_type);
drivers/net/usb/hso.c:  SET_NETDEV_DEVTYPE(net, &hso_type);
drivers/net/usb/usbnet.c:               SET_NETDEV_DEVTYPE(net, &wlan_type);
drivers/net/usb/usbnet.c:               SET_NETDEV_DEVTYPE(net, &wwan_type);
drivers/net/vxlan.c:    SET_NETDEV_DEVTYPE(dev, &vxlan_type);
drivers/net/wimax/i2400m/usb.c: SET_NETDEV_DEVTYPE(net_dev, &i2400mu_type);
drivers/net/wireless/intersil/prism54/islpci_dev.c:     SET_NETDEV_DEVTYPE(ndev, &wlan_type);
drivers/staging/gdm724x/gdm_lte.c:              SET_NETDEV_DEVTYPE(net, &wwan_type);
drivers/usb/gadget/function/u_ether.c:  SET_NETDEV_DEVTYPE(net, &gadget_type);
drivers/usb/gadget/function/u_ether.c:  SET_NETDEV_DEVTYPE(net, &gadget_type);
net/8021q/vlan_dev.c:   SET_NETDEV_DEVTYPE(dev, &vlan_type);
net/bluetooth/6lowpan.c:        SET_NETDEV_DEVTYPE(netdev, &bt_type);
net/bluetooth/bnep/core.c:      SET_NETDEV_DEVTYPE(dev, &bnep_type);
net/bridge/br_device.c: SET_NETDEV_DEVTYPE(dev, &br_type);
net/dsa/slave.c:        SET_NETDEV_DEVTYPE(slave_dev, &dsa_type);
net/hsr/hsr_device.c:   SET_NETDEV_DEVTYPE(dev, &hsr_type);
net/l2tp/l2tp_eth.c:    SET_NETDEV_DEVTYPE(dev, &l2tpeth_type);
net/wireless/core.c:            SET_NETDEV_DEVTYPE(dev, &wiphy_type);

So for all these devices we have to add duplicate information now?

I actually don't like that idea. I can rename it to IFLA_FOO or any other proposal, but adding the same information twice is pointless.

>> We can revive the patches for adding this information as IFLA_INFO_KIND, but last time they where reverted since NetworkManager did all the wrong decisions based on that. That part is fixed now and we can put that back and declare the sub-type classification of Ethernet device in two places. Or we just take the information that we added a long time ago for exactly this sub-classification and provide them to userspace via RTNL.
> 
> IFLA_INFO_KIND serves for different purpose. It is for drivers that
> register rtnl_link_ops. I don't think it would be wise to mix it with
> this.

We could register rtnl_link_ops for these Ethernet drivers, but then we are duplicating the information as well.

Regards

Marcel

^ permalink raw reply

* Re: KASAN: stack-out-of-bounds Read in __schedule
From: Dmitry Vyukov @ 2018-08-30 14:19 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexander Potapenko, Alexei Starovoitov, netdev, Jan Kara,
	syzbot+45a34334c61a8ecf661d, Jan Kara, linux-ext4, LKML,
	syzkaller-bugs, Theodore Ts'o
In-Reply-To: <279041ab-d449-1bfb-a05d-2d8b0d5c601b@iogearbox.net>

On Thu, Aug 30, 2018 at 2:52 AM, Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>> Hello,
>>>>>
>>>>> syzbot found the following crash on:
>>>>>
>>>>> HEAD commit:    5b394b2ddf03 Linux 4.19-rc1
>>>>> git tree:       upstream
>>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=14f4d8e1400000
>>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=49927b422dcf0b29
>>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=45a34334c61a8ecf661d
>>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13127e5a400000
>>>>>
>>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>>> Reported-by: syzbot+45a34334c61a8ecf661d@syzkaller.appspotmail.com
>>>>>
>>>>> IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
>>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
>>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
>>>>> 8021q: adding VLAN 0 to HW filter on device team0
>>>>> ==================================================================
>>>>> BUG: KASAN: stack-out-of-bounds in schedule_debug kernel/sched/core.c:3285
>>>>> [inline]
>>>>> BUG: KASAN: stack-out-of-bounds in __schedule+0x1977/0x1df0
>>>>> kernel/sched/core.c:3395
>>>>> Read of size 8 at addr ffff8801ad090000 by task syz-executor0/4718
>>>>
>>>> Weird, can you please help me decipher this? So here KASAN complains about
>>>> wrong memory access in the scheduler.
>>
>> This looks like a result of a previous bad silent memory corruption.
>>
>> The KASAN report says there is a stack out-of-bounds in scheduler. And
>> that if followed by slab corruption report in another task.
>>
>> fs/jbd2/transaction.c happens to be the first meaningful file in this
>> crash, and so that's where it is attributed to.
>>
>> Rerunning the reproducer several times can maybe give some better
>> glues, or maybe not, maybe they all will look equally puzzling.
>>
>> This part of the repro looks familiar:
>>
>> r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0,
>> 0x1}, 0x68)
>> bpf$MAP_UPDATE_ELEM(0x2, &(0x7f0000000180)={r1, &(0x7f0000000000),
>> &(0x7f0000000140)}, 0x20)
>>
>> We had exactly such consequences of a bug in bpf map very recently,
>> but that was claimed to be fixed. Maybe not completely?
>> +bpf maintainers
>
> Looks like syzbot found this in Linus tree with HEAD commit 5b394b2ddf03 ("Linux 4.19-rc1")
> one day later net PR got merged via 050cdc6c9501 ("Merge git://git.kernel.org/pub/...").
>
> This PR contained a couple of fixes I did on sockmap code during audit such as:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b845c898b2f1ea458d5453f0fa1da6e2dfce3bb4
>
> Looking at the reproducer syzkaller found it contains:
>
>   r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0, 0x1}, 0x68)
>                                                     ^^^
>
> So it found the crash with map type of sock hash and key size of 0x0 (which is invalid),
> where subsequent map update triggered the corruption. I just did a 'syz test' and it
> wasn't able to trigger the crash anymore.
>
> #syz fix: bpf, sockmap: fix sock_hash_alloc and reject zero-sized keys


Thanks.

I am again trying to figure out how/why this causes such bad failure modes.
Looking at sock_hash_ctx_update_elem it seems that all of
htab_map_hash/lookup_elem_raw/alloc_sock_hash_elem should handle
key_size=0 fine hashing/comparing/updating 0 bytes. Do you have any
ideas as to what could have gone wrong?

^ permalink raw reply

* Re: [PATCH bpf-next 11/11] samples/bpf: add -c/--copy -z/--zero-copy flags to xdpsock
From: Björn Töpel @ 2018-08-30 10:21 UTC (permalink / raw)
  To: Jesper Dangaard Brouer
  Cc: Karlsson, Magnus, Magnus Karlsson, Duyck, Alexander H,
	Alexander Duyck, ast, Daniel Borkmann, Netdev, Brandeburg, Jesse,
	Singhai, Anjali, peter.waskiewicz.jr, Björn Töpel,
	michael.lundkvist, Willem de Bruijn, John Fastabend,
	Jakub Kicinski, neerav.parikh, MykytaI Iziumtsev, Francois Ozog
In-Reply-To: <20180829144446.72509a96@redhat.com>

Den ons 29 aug. 2018 kl 14:44 skrev Jesper Dangaard Brouer <brouer@redhat.com>:
>
> On Tue, 28 Aug 2018 14:44:35 +0200
> Björn Töpel <bjorn.topel@gmail.com> wrote:
>
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > The -c/--copy -z/--zero-copy flags enforces either copy or zero-copy
> > mode.
>
> Nice, thanks for adding this.  It allows me to quickly test the
> difference between normal-copy vs zero-copy modes.
> (Kernel bpf-next without RETPOLINE).
>
> AF_XDP RX-drop:
>  Normal-copy mode: rx 13,070,318 pps - 76.5 ns
>  Zero-copy   mode: rx 26,132,328 pps - 38.3 ns
>
> Compare to XDP_DROP:  34,251,464 pps - 29.2 ns
>    XDP_DROP + read :  30,756,664 pps - 32.5 ns
>
> The normal-copy mode is surprisingly fast (and it works for every
> driver implemeting the regular XDP_REDIRECT action).  It is still
> faster to do in-kernel XDP_DROP than AF_XDP zero-copy mode dropping,
> which was expected given frames travel to a remote CPU before returned
> (don't think remote CPU reads payload?).  The gap in nanosec is
> actually quite small, thus I'm impressed by the SPSC-queue
> implementation working across these CPUs.
>
>
> AF_XDP layer2-fwd:
>  Normal-copy mode: rx  3,200,885   tx  3,200,892
>  Zero-copy   mode: rx 17,026,300   tx 17,026,269
>
> Compare to XDP_TX: rx 14,529,079   tx 14,529,850  - 68.82 ns
>      XDP_REDIRECT: rx 13,235,785   tx 13,235,784  - 75.55 ns
>
> The copy-mode is slow because it allocates SKBs internally (I do
> wonder if we could speed it up by using ndo_xdp_xmit + disable-BH).
> More intersting is that the zero-copy is faster than XDP_TX and
> XDP_REDIRECT. I think the speedup comes from avoiding some DMA mapping
> calls with ZC.
>
> Side-note: XDP_TX vs. REDIRECT: 75.55 - 68.82 = 6.73 ns.  The cost of
> going through the xdp_do_redirect_map core is actually quite small :-)
> (I have some micro optimizations that should help ~2ns).
>
>
> AF_XDP TX-only:
>  Normal-copy mode: tx  2,853,461 pps
>  Zero-copy   mode: tx 22,255,311 pps
>
> (There is not XDP mode that does TX to compare against)
>

Kudos for doing the in-depth benchmarking!


Thanks!
Björn

> --
> Best regards,
>   Jesper Dangaard Brouer
>   MSc.CS, Principal Kernel Engineer at Red Hat
>   LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: [PATCH] rtnetlink: expose value from SET_NETDEV_DEVTYPE via IFLA_DEVTYPE attribute
From: Jiri Pirko @ 2018-08-30 10:20 UTC (permalink / raw)
  To: Marcel Holtmann; +Cc: netdev, David S. Miller
In-Reply-To: <0F27B37D-0D2E-4BCB-B325-FD0527A2E1F3@holtmann.org>

Thu, Aug 30, 2018 at 12:13:40PM CEST, marcel@holtmann.org wrote:
>Hi Jiri,
>
>>>>> The name value from SET_NETDEV_DEVTYPE only ended up in the uevent sysfs
>>>>> file as DEVTYPE= information. To avoid any kind of race conditions
>>>>> between netlink messages and reading from sysfs, it is useful to add the
>>>>> same string as new IFLA_DEVTYPE attribute included in the RTM_NEWLINK
>>>>> messages.
>>>>> 
>>>>> For network managing daemons that have to classify ARPHRD_ETHER network
>>>>> devices into different types (like Wireless LAN, Bluetooth etc.), this
>>>>> avoids the extra round trip to sysfs and parsing of the uevent file.
>>>>> 
>>>>> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
>>>>> ---
>>>>> include/uapi/linux/if_link.h |  2 ++
>>>>> net/core/rtnetlink.c         | 12 ++++++++++++
>>>>> 2 files changed, 14 insertions(+)
>>>>> 
>>>>> diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
>>>>> index 43391e2d1153..781294972bb4 100644
>>>>> --- a/include/uapi/linux/if_link.h
>>>>> +++ b/include/uapi/linux/if_link.h
>>>>> @@ -166,6 +166,8 @@ enum {
>>>>> 	IFLA_NEW_IFINDEX,
>>>>> 	IFLA_MIN_MTU,
>>>>> 	IFLA_MAX_MTU,
>>>>> +	IFLA_DEVTYPE,		/* Name value from SET_NETDEV_DEVTYPE */
>>>> 
>>>> This is not something netdev-related. dev->dev.type is struct device_type.
>>>> This is a generic "device" thing. Incorrect to expose over
>>>> netdev-specific API. Please use "device" API for this.
>>> 
>>> it is not just "device" related since this is a sub-classification of ARPHRD_ETHER type as a wrote above. Don't get hang up that this information is part of struct device. The dev->dev.type contains strings like "wlan", "bluetooth", "wimax", "gadget" etc. so that you can tell what kind of Ethernet card you have.
>> 
>> I understand. But using dev->dev.type for that purpose seems like an
>> abuse. If this is subtype of netdev, it should not be stored in parent
>> device. I believe that you need to re-introduce this as a part of struct
>> net_device - preferably an enum - and that you can expose over rtnl (and
>> net-sysfs).
>
>it is not stored in the parent device. It is stored in the device of the netdev.

Yeah. That is what I ment. "Device struct" associated with the
netdevice.

>
>As Stephen said, there is no device API. And why would I add the same info again and bloat netdev?
>
>drivers/net/bonding/bond_main.c:        SET_NETDEV_DEVTYPE(bond_dev, &bond_type);
>drivers/net/geneve.c:   SET_NETDEV_DEVTYPE(dev, &geneve_type);
>drivers/net/macsec.c:   SET_NETDEV_DEVTYPE(dev, &macsec_type);
>drivers/net/ppp/ppp_generic.c:  SET_NETDEV_DEVTYPE(dev, &ppp_type);
>drivers/net/usb/hso.c:  SET_NETDEV_DEVTYPE(net, &hso_type);
>drivers/net/usb/usbnet.c:               SET_NETDEV_DEVTYPE(net, &wlan_type);
>drivers/net/usb/usbnet.c:               SET_NETDEV_DEVTYPE(net, &wwan_type);
>drivers/net/vxlan.c:    SET_NETDEV_DEVTYPE(dev, &vxlan_type);
>drivers/net/wimax/i2400m/usb.c: SET_NETDEV_DEVTYPE(net_dev, &i2400mu_type);
>drivers/net/wireless/intersil/prism54/islpci_dev.c:     SET_NETDEV_DEVTYPE(ndev, &wlan_type);
>drivers/staging/gdm724x/gdm_lte.c:              SET_NETDEV_DEVTYPE(net, &wwan_type);
>drivers/usb/gadget/function/u_ether.c:  SET_NETDEV_DEVTYPE(net, &gadget_type);
>drivers/usb/gadget/function/u_ether.c:  SET_NETDEV_DEVTYPE(net, &gadget_type);
>net/8021q/vlan_dev.c:   SET_NETDEV_DEVTYPE(dev, &vlan_type);
>net/bluetooth/6lowpan.c:        SET_NETDEV_DEVTYPE(netdev, &bt_type);
>net/bluetooth/bnep/core.c:      SET_NETDEV_DEVTYPE(dev, &bnep_type);
>net/bridge/br_device.c: SET_NETDEV_DEVTYPE(dev, &br_type);
>net/dsa/slave.c:        SET_NETDEV_DEVTYPE(slave_dev, &dsa_type);
>net/hsr/hsr_device.c:   SET_NETDEV_DEVTYPE(dev, &hsr_type);
>net/l2tp/l2tp_eth.c:    SET_NETDEV_DEVTYPE(dev, &l2tpeth_type);
>net/wireless/core.c:            SET_NETDEV_DEVTYPE(dev, &wiphy_type);
>
>So for all these devices we have to add duplicate information now?

The fact the things are done now it a wrong way does not justify
extending UAPI in the same wrong way. So yes, change them all if you
need it. The "ethernet-subtype" should not an attribute of "struct
device" but rather "struct net_device".

>
>I actually don't like that idea. I can rename it to IFLA_FOO or any other proposal, but adding the same information twice is pointless.

So don't expose the "struct device" attribute over rtnetlink. Use
"struct device"-specific interface for that.

>
>>> We can revive the patches for adding this information as IFLA_INFO_KIND, but last time they where reverted since NetworkManager did all the wrong decisions based on that. That part is fixed now and we can put that back and declare the sub-type classification of Ethernet device in two places. Or we just take the information that we added a long time ago for exactly this sub-classification and provide them to userspace via RTNL.
>> 
>> IFLA_INFO_KIND serves for different purpose. It is for drivers that
>> register rtnl_link_ops. I don't think it would be wise to mix it with
>> this.
>
>We could register rtnl_link_ops for these Ethernet drivers, but then we are duplicating the information as well.

That is not the purpose of "rtnl_link_ops".


>
>Regards
>
>Marcel
>

^ permalink raw reply

* [PATCH][net-next] xdp: remove redundant variable 'headroom'
From: Colin King @ 2018-08-30 14:27 UTC (permalink / raw)
  To: David S . Miller, Björn Töpel, Jesper Dangaard Brouer,
	netdev
  Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

Variable 'headroom' is being assigned but is never used hence it is
redundant and can be removed.

Cleans up clang warning:
variable ‘headroom’ set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 net/core/xdp.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 654dbb19707e..4b2b194f4f1f 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -412,7 +412,7 @@ EXPORT_SYMBOL_GPL(xdp_attachment_setup);
 
 struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
 {
-	unsigned int metasize, headroom, totsize;
+	unsigned int metasize, totsize;
 	void *addr, *data_to_copy;
 	struct xdp_frame *xdpf;
 	struct page *page;
@@ -420,7 +420,6 @@ struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp)
 	/* Clone into a MEM_TYPE_PAGE_ORDER0 xdp_frame. */
 	metasize = xdp_data_meta_unsupported(xdp) ? 0 :
 		   xdp->data - xdp->data_meta;
-	headroom = xdp->data - xdp->data_hard_start;
 	totsize = xdp->data_end - xdp->data + metasize;
 
 	if (sizeof(*xdpf) + totsize > PAGE_SIZE)
-- 
2.17.1

^ permalink raw reply related

* [PATCH v1] bridge: Switch to bitmap_zalloc()
From: Andy Shevchenko @ 2018-08-30 10:33 UTC (permalink / raw)
  To: Stephen Hemminger, bridge, David S. Miller, netdev; +Cc: Andy Shevchenko

Switch to bitmap_zalloc() to show clearly what we are allocating.
Besides that it returns pointer of bitmap type instead of opaque void *.

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
---
 net/bridge/br_if.c   | 5 ++---
 net/bridge/br_vlan.c | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 0363f1bdc401..3bb66508f07d 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -394,8 +394,7 @@ static int find_portno(struct net_bridge *br)
 	struct net_bridge_port *p;
 	unsigned long *inuse;
 
-	inuse = kcalloc(BITS_TO_LONGS(BR_MAX_PORTS), sizeof(unsigned long),
-			GFP_KERNEL);
+	inuse = bitmap_zalloc(BR_MAX_PORTS, GFP_KERNEL);
 	if (!inuse)
 		return -ENOMEM;
 
@@ -404,7 +403,7 @@ static int find_portno(struct net_bridge *br)
 		set_bit(p->port_no, inuse);
 	}
 	index = find_first_zero_bit(inuse, BR_MAX_PORTS);
-	kfree(inuse);
+	bitmap_free(inuse);
 
 	return (index >= BR_MAX_PORTS) ? -EXFULL : index;
 }
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 7df269092103..bb6ba794864f 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -877,8 +877,7 @@ int __br_vlan_set_default_pvid(struct net_bridge *br, u16 pvid)
 		return 0;
 	}
 
-	changed = kcalloc(BITS_TO_LONGS(BR_MAX_PORTS), sizeof(unsigned long),
-			  GFP_KERNEL);
+	changed = bitmap_zalloc(BR_MAX_PORTS, GFP_KERNEL);
 	if (!changed)
 		return -ENOMEM;
 
@@ -925,7 +924,7 @@ int __br_vlan_set_default_pvid(struct net_bridge *br, u16 pvid)
 	br->default_pvid = pvid;
 
 out:
-	kfree(changed);
+	bitmap_free(changed);
 	return err;
 
 err_port:
-- 
2.18.0

^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox