Netdev List
* Re: [PATCH net-next] virtio_net: force_napi_tx module param.
From: Jason Wang @ 2018-08-29  7:56 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Jon Olson (Google Drive), Michael S. Tsirkin, caleb.raitto,
	David Miller, Network Development, Caleb Raitto
In-Reply-To: <CAF=yD-K6Uy8TvPA4sUDN9_MWD=QjHaWioRbQ2F4PGdcY_barvw@mail.gmail.com>



On 2018-08-29 03:57, Willem de Bruijn wrote:
> On Mon, Jul 30, 2018 at 2:06 AM Jason Wang <jasowang@redhat.com> wrote:
>>
>>
>> On 2018-07-25 08:17, Jon Olson wrote:
>>> On Tue, Jul 24, 2018 at 3:46 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>> On Tue, Jul 24, 2018 at 06:31:54PM -0400, Willem de Bruijn wrote:
>>>>> On Tue, Jul 24, 2018 at 6:23 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Tue, Jul 24, 2018 at 04:52:53PM -0400, Willem de Bruijn wrote:
>>>>>>> From the above linked patch, I understand that there are yet
>>>>>>> other special cases in production, such as a hard cap on #tx queues to
>>>>>>> 32 regardless of number of vcpus.
>>>>>> I don't think upstream kernels have this limit - we can
>>>>>> now use vmalloc for higher number of queues.
>>>>> Yes, that patch* mentioned it as a Google Compute Engine imposed
>>>>> limit. It is exactly such cloud provider imposed rules that I'm
>>>>> concerned about working around in upstream drivers.
>>>>>
>>>>> * for reference, I mean https://patchwork.ozlabs.org/patch/725249/
>>>> Yea. Why does GCE do it btw?
>>> There are a few reasons for the limit, some historical, some current.
>>>
>>> Historically we did this because of a kernel limit on the number of
>>> TAP queues (in Montreal I thought this limit was 32). To my chagrin,
>>> the limit upstream at the time we did it was actually eight. We had
>>> increased the limit from eight to 32 internally, and it appears that
>>> it has subsequently increased upstream to 256. We no longer
>>> use TAP for networking, so that constraint no longer applies for us,
>>> but when looking at removing/raising the limit we discovered no
>>> workloads that clearly benefited from lifting it, and it also placed
>>> more pressure on our virtual networking stack particularly on the Tx
>>> side. We left it as-is.
>>>
>>> In terms of current reasons there are really two. One is memory usage.
>>> As you know, virtio-net uses rx/tx pairs, so there's an expectation
>>> that the guest will have an Rx queue for every Tx queue. We run our
>>> individual virtqueues fairly deep (4096 entries) to give guests a wide
>>> time window for re-posting Rx buffers and avoiding starvation on
>>> packet delivery. Filling an Rx vring with max-sized mergeable buffers
>>> (4096 bytes) is 16MB of GFP_ATOMIC allocations. At 32 queues this can
>>> be up to 512MB of memory posted for network buffers. Scaling this to
>>> the largest VM GCE offers today (160 VCPUs -- n1-ultramem-160) keeping
>>> all of the Rx rings full would (in the large average Rx packet size
>>> case) consume up to 2.5 GB(!) of guest RAM. Now, those VMs have 3.8T
>>> of RAM available, but I don't believe we've observed a situation where
>>> they would have benefited from having 2.5 gigs of buffers posted for
>>> incoming network traffic :)
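
For reference, the buffer arithmetic quoted above can be reproduced with a small sketch. It assumes one max-sized buffer per ring entry and one Rx ring per queue; the helper names are made up for illustration and do not exist in virtio_net.

```c
#include <stdint.h>

/* Bytes needed to keep one Rx ring full: one buffer per ring entry. */
static uint64_t rx_ring_bytes(uint64_t entries, uint64_t buf_bytes)
{
	return entries * buf_bytes;
}

/* Bytes posted across all Rx queues when every ring is kept full. */
static uint64_t rx_total_bytes(uint64_t queues, uint64_t entries,
			       uint64_t buf_bytes)
{
	return queues * rx_ring_bytes(entries, buf_bytes);
}
```

With 4096 entries of 4096 bytes each, one ring comes to 16 MiB, 32 queues to 512 MiB, and 160 queues to 2560 MiB, which matches the 16MB, 512MB and 2.5 GB figures in the quoted text.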
>> We can work to have async txq and rxq instead of pairs if there's a
>> strong requirement.
>>
>>> The second reason is interrupt related -- as I mentioned above, we
>>> have found no workloads that clearly benefit from so many queues, but
>>> we have found workloads that degrade. In particular workloads that do
>>> a lot of small packet processing but which aren't extremely latency
>>> sensitive can achieve higher PPS by taking fewer interrupts across
>>> fewer VCPUs due to better batching (this also incurs higher latency,
>>> but at the limit the "busy" cores end up suppressing most interrupts
>>> and spending most of their cycles farming out work). Memcache is a
>>> good example here, particularly if the latency targets for request
>>> completion are in the ~milliseconds range (rather than the
>>> microseconds we typically strive for with TCP_RR-style workloads).
>>>
>>> All of that said, we haven't been forthcoming with data (and
>>> unfortunately I don't have it handy in a useful form, otherwise I'd
>>> simply post it here), so I understand the hesitation to simply run
>>> with napi_tx across the board. As Willem said, this patch seemed like
>>> the least disruptive way to allow us to continue down the road of
>>> "universal" NAPI Tx and to hopefully get data across enough workloads
>>> (with VMs small, large, and absurdly large :) to present a compelling
>>> argument in one direction or another. As far as I know there aren't
>>> currently any NAPI related ethtool commands (based on a quick perusal
>>> of ethtool.h)
>> As I suggested before, maybe we can (ab)use tx-frames-irq.
> I forgot to respond to this originally, but I agree.
>
> How about something like the snippet below. It would be simpler to
> reason about if we only allow switching while the device is down, but
> napi does not strictly require that.
>
> +static int virtnet_set_coalesce(struct net_device *dev,
> +                               struct ethtool_coalesce *ec)
> +{
> +       const u32 tx_coalesce_napi_mask = (1 << 16);
> +       const struct ethtool_coalesce ec_default = {
> +               .cmd = ETHTOOL_SCOALESCE,
> +               .rx_max_coalesced_frames = 1,
> +               .tx_max_coalesced_frames = 1,
> +       };
> +       struct virtnet_info *vi = netdev_priv(dev);
> +       int napi_weight = 0;
> +       bool running;
> +       int i;
> +
> +       if (ec->tx_max_coalesced_frames & tx_coalesce_napi_mask) {
> +               ec->tx_max_coalesced_frames &= ~tx_coalesce_napi_mask;
> +               napi_weight = NAPI_POLL_WEIGHT;
> +       }
> +
> +       /* disallow changes to fields not explicitly tested above */
> +       if (memcmp(ec, &ec_default, sizeof(ec_default)))
> +               return -EINVAL;
> +
> +       if (napi_weight ^ vi->sq[0].napi.weight) {
> +               running = netif_running(vi->dev);
> +
> +               for (i = 0; i < vi->max_queue_pairs; i++) {
> +                       vi->sq[i].napi.weight = napi_weight;
> +
> +                       if (!running)
> +                               continue;
> +
> +                       if (napi_weight)
> +                               virtnet_napi_tx_enable(vi, vi->sq[i].vq,
> +                                                      &vi->sq[i].napi);
> +                       else
> +                               napi_disable(&vi->sq[i].napi);
> +               }
> +       }
> +
> +       return 0;
> +}
> +
> +static int virtnet_get_coalesce(struct net_device *dev,
> +                               struct ethtool_coalesce *ec)
> +{
> +       const u32 tx_coalesce_napi_mask = (1 << 16);
> +       const struct ethtool_coalesce ec_default = {
> +               .cmd = ETHTOOL_GCOALESCE,
> +               .rx_max_coalesced_frames = 1,
> +               .tx_max_coalesced_frames = 1,
> +       };
> +       struct virtnet_info *vi = netdev_priv(dev);
> +
> +       memcpy(ec, &ec_default, sizeof(ec_default));
> +
> +       if (vi->sq[0].napi.weight)
> +               ec->tx_max_coalesced_frames |= tx_coalesce_napi_mask;
> +
> +       return 0;
> +}
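
The encoding used in the quoted snippet, where bit 16 of tx_max_coalesced_frames doubles as a napi-tx flag, can be looked at in isolation. This is only a userspace sketch: the two helper names are hypothetical, and only the mask value is taken from the snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit as tx_coalesce_napi_mask in the quoted snippet. */
#define TX_COALESCE_NAPI_MASK (UINT32_C(1) << 16)

/*
 * Hypothetical helper for the "set" path: report whether the flag bit
 * was set and strip it, leaving the plain frame count behind.
 */
static bool napi_tx_requested(uint32_t *tx_max_coalesced_frames)
{
	bool napi = (*tx_max_coalesced_frames & TX_COALESCE_NAPI_MASK) != 0;

	*tx_max_coalesced_frames &= ~TX_COALESCE_NAPI_MASK;
	return napi;
}

/* Hypothetical helper for the "get" path: fold the flag back in. */
static uint32_t napi_tx_report(uint32_t frames, bool napi_enabled)
{
	return napi_enabled ? (frames | TX_COALESCE_NAPI_MASK) : frames;
}
```

Under this scheme a value of 65537, i.e. (1 << 16) | 1, asks for napi-tx with the default frame count of 1, and a plain 1 asks for napi-tx to be off.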

Looks good. Just one nit, maybe it's better to simply check against zero?

Thanks

^ permalink raw reply

* pull-request: mac80211-next 2018-08-29
From: Johannes Berg @ 2018-08-29 11:53 UTC (permalink / raw)
  To: David Miller
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA

Hi Dave,

Here's a small update for net-next, there's not much but since we
still don't have a shared tree, the ieee80211_send_layer2_update()
should go to Kalle through your tree :)

Please pull and let me know if there's any problem.

Thanks,
johannes



The following changes since commit 050cdc6c9501abcd64720b8cc3e7941efee9547d:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2018-08-27 11:59:39 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next.git tags/mac80211-next-for-davem-2018-08-29

for you to fetch changes up to 9c06602b1b920ed6b546632bdbbc1f400eea5242:

  cfg80211: clarify frames covered by average ACK signal report (2018-08-29 11:01:51 +0200)

----------------------------------------------------------------
Only a few changes at this point:
 * new channels in 60 GHz
 * clarify (average) ACK signal reporting API
 * expose ieee80211_send_layer2_update() for all drivers
 * start/stop mac80211's TXQs properly when required
 * avoid regulatory restore with IE ignoring
 * spelling: contidion -> condition
 * fully implement WFA Multi-AP backhaul

----------------------------------------------------------------
Alexei Avshalom Lazar (1):
      cfg80211: Add support for 60GHz band channels 5 and 6

Balaji Pothunoori (1):
      cfg80211: clarify frames covered by average ACK signal report

Dedy Lansky (1):
      cfg80211/mac80211: make ieee80211_send_layer2_update a public function

Manikanta Pubbisetty (1):
      mac80211: add stop/start logic for software TXQs

Rajeev Kumar Sirasanagandla (1):
      cfg80211: Avoid regulatory restore when COUNTRY_IE_IGNORE is set

Richard Guy Briggs (1):
      rfkill: fix spelling mistake contidion to condition

Sathishkumar Muruganandam (1):
      mac80211: add missing WFA Multi-AP backhaul STA Rx requirement

 drivers/net/wireless/ath/wil6210/debugfs.c |   2 +-
 include/net/cfg80211.h                     |  13 +++-
 include/net/mac80211.h                     |   4 ++
 include/uapi/linux/nl80211.h               |  20 +++---
 net/mac80211/cfg.c                         |  48 +------------
 net/mac80211/ieee80211_i.h                 |   3 +
 net/mac80211/main.c                        |   4 ++
 net/mac80211/rx.c                          |   2 +-
 net/mac80211/sta_info.c                    |   6 +-
 net/mac80211/tx.c                          |  11 ++-
 net/mac80211/util.c                        | 110 +++++++++++++++++++++++++++--
 net/rfkill/core.c                          |   4 +-
 net/wireless/nl80211.c                     |   7 +-
 net/wireless/reg.c                         |  48 ++++++++++++-
 net/wireless/trace.h                       |   2 +-
 net/wireless/util.c                        |  51 ++++++++++++-
 16 files changed, 258 insertions(+), 77 deletions(-)

^ permalink raw reply

* Re: [RFC RFT PATCH v4 1/4] gpiolib: Pass bitmaps, not integer arrays, to get/set array
From: Miguel Ojeda @ 2018-08-29 12:03 UTC (permalink / raw)
  To: Janusz Krzysztofik
  Cc: Andrew Lunn, Ulf Hansson, Linux Doc Mailing List, linux-iio,
	Linus Walleij, Dominik Brodowski, Network Development, linux-i2c,
	Peter Meerwald-Stadler, devel, Florian Fainelli, Jonathan Corbet,
	Kishon Vijay Abraham I, Willy Tarreau, Geert Uytterhoeven,
	linux-serial, Jiri Slaby, Michael Hennerich, linux-gpio,
	Lars-Peter Clausen, Greg Kroah-Hartman, linux-mmc, linux-kernel,
	Peter Rosin
In-Reply-To: <20180820234341.5271-2-jmkrzyszt@gmail.com>

Hi Janusz,

On Tue, Aug 21, 2018 at 1:43 AM, Janusz Krzysztofik <jmkrzyszt@gmail.com> wrote:
> Most users of get/set array functions iterate over consecutive bits of data,
> usually a single integer, while processing an array of results obtained
> from, or building an array of values to be passed to, those functions.
> Save time wasted on those iterations by changing the functions' API to
> accept bitmaps.
>
> All current users are updated as well.
>
> More benefits from the change are expected as soon as planned support
> for accepting/passing those bitmaps directly from/to respective GPIO
> chip callbacks if applicable is implemented.
>
> Signed-off-by: Janusz Krzysztofik <jmkrzyszt@gmail.com>
> ---
>  Documentation/driver-api/gpio/consumer.rst  | 22 ++++----
>  drivers/auxdisplay/hd44780.c                | 52 +++++++++--------

[CC'ing Willy and Geert for hd44780]

> diff --git a/drivers/auxdisplay/hd44780.c b/drivers/auxdisplay/hd44780.c
> index f1a42f0f1ded..d340473aa142 100644
> --- a/drivers/auxdisplay/hd44780.c
> +++ b/drivers/auxdisplay/hd44780.c
> @@ -62,20 +62,19 @@ static void hd44780_strobe_gpio(struct hd44780 *hd)
>  /* write to an LCD panel register in 8 bit GPIO mode */
>  static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW */
> -       unsigned int i, n;
> +       unsigned long value_bitmap[1];  /* for DATA[0-7], RS, RW */

Why [1]? I understand it is because in other cases it may be more than
one, but...

> +       unsigned int n;
>
> -       for (i = 0; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);
>         n = 9;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);
>                 n++;
>         }
>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], values);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA0], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }
> @@ -83,32 +82,31 @@ static void hd44780_write_gpio8(struct hd44780 *hd, u8 val, unsigned int rs)
>  /* write to an LCD panel register in 4 bit GPIO mode */
>  static void hd44780_write_gpio4(struct hd44780 *hd, u8 val, unsigned int rs)
>  {
> -       int values[10]; /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> -       unsigned int i, n;
> +       /* for DATA[0-7], RS, RW, but DATA[0-3] is unused */
> +       unsigned long value_bitmap[0];

This one is even more strange... :-)

> +       unsigned int n;
>
>         /* High nibble + RS, RW */
> -       for (i = 4; i < 8; i++)
> -               values[PIN_DATA0 + i] = !!(val & BIT(i));
> -       values[PIN_CTRL_RS] = rs;
> +       value_bitmap[0] = val;
> +       __assign_bit(PIN_CTRL_RS, value_bitmap, rs);
>         n = 5;
>         if (hd->pins[PIN_CTRL_RW]) {
> -               values[PIN_CTRL_RW] = 0;
> +               __clear_bit(PIN_CTRL_RW, value_bitmap);
>                 n++;
>         }
> +       value_bitmap[0] = value_bitmap[0] >> PIN_DATA4;

Maybe >>=?

>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>
>         /* Low nibble */
> -       for (i = 0; i < 4; i++)
> -               values[PIN_DATA4 + i] = !!(val & BIT(i));
> +       value_bitmap[0] &= ~((1 << PIN_DATA4) - 1);
> +       value_bitmap[0] |= val & ~((1 << PIN_DATA4) - 1);

Are you sure this is correct? You are basically doing an or of
value_bitmap and val and clearing the low-nibble.
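
To make the concern concrete, here is a sketch of what the two writes would need to carry, going by the loops the patch removes. It assumes the pin order DATA0..DATA7 followed by RS, so DATA4 is bit 4 and RS is bit 8; the function names are made up for illustration and are not proposed driver code.

```c
#include <stdint.h>

/* Assumed pin order, DATA0..DATA7 then RS: DATA4 is bit 4, RS is bit 8. */
#define PIN_DATA4   4
#define PIN_CTRL_RS 8

/*
 * Bitmap for the first write, indexed relative to DATA4:
 * bits 0-3 carry the high nibble of val, bit 4 carries RS.
 */
static unsigned long gpio4_high_nibble(uint8_t val, unsigned int rs)
{
	unsigned long bitmap = val;

	if (rs)
		bitmap |= 1UL << PIN_CTRL_RS;
	return bitmap >> PIN_DATA4;
}

/*
 * Bitmap for the second write: keep RS (and RW) as they were and
 * replace bits 0-3 with the low nibble of val.
 */
static unsigned long gpio4_low_nibble(unsigned long bitmap, uint8_t val)
{
	const unsigned long nibble_mask = (1UL << PIN_DATA4) - 1;

	return (bitmap & ~nibble_mask) | (val & nibble_mask);
}
```

For val = 0xA5 and rs = 1 this yields 0x1A for the first write and 0x15 for the second. The patch as posted masks val with ~((1 << PIN_DATA4) - 1) instead, which drops the low nibble of val rather than keeping it.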

>
>         /* Present the data to the port */
> -       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4],
> -                                      &values[PIN_DATA4]);
> +       gpiod_set_array_value_cansleep(n, &hd->pins[PIN_DATA4], value_bitmap);
>
>         hd44780_strobe_gpio(hd);
>  }

Cheers,
Miguel

^ permalink raw reply

* RE: [PATCH net-next v2 0/2] dpaa2-eth: Move DPAA2 Ethernet driver
From: Ioana Ciocoi Radulescu @ 2018-08-29 12:06 UTC (permalink / raw)
  To: Dan Carpenter
  Cc: netdev@vger.kernel.org, davem@davemloft.net,
	devel@driverdev.osuosl.org, andrew@lunn.ch, Horia Geanta,
	Madalin-cristian Bucur, gregkh@linuxfoundation.org,
	linux-kernel@vger.kernel.org, Ioana Ciornei, Laurentiu Tudor
In-Reply-To: <20180829110739.3xzbdtelwjlhun2v@mwanda>

> -----Original Message-----
> From: Dan Carpenter <dan.carpenter@oracle.com>
> Sent: Wednesday, August 29, 2018 2:08 PM
> To: Ioana Ciocoi Radulescu <ruxandra.radulescu@nxp.com>
> Cc: netdev@vger.kernel.org; davem@davemloft.net;
> devel@driverdev.osuosl.org; andrew@lunn.ch; Horia Geanta
> <horia.geanta@nxp.com>; Madalin-cristian Bucur
> <madalin.bucur@nxp.com>; gregkh@linuxfoundation.org; linux-
> kernel@vger.kernel.org; Ioana Ciornei <ioana.ciornei@nxp.com>; Laurentiu
> Tudor <laurentiu.tudor@nxp.com>
> Subject: Re: [PATCH net-next v2 0/2] dpaa2-eth: Move DPAA2 Ethernet
> driver
> 
> There are a few static checker warnings for the driver but only the
> first one looks like a possibly real bug.

Thanks for the report, what tool did you use to get these warnings?

> 
> drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c:1116 dpaa2_eth_stop()
> error: uninitialized symbol 'dpni_enabled'.

Will initialize in v3. Not actually a bug, since it's used only as an out param.

> drivers/staging/fsl-dpaa2/ethernet/dpaa2-eth.c:1671 set_fq_affinity() warn:
> 'xps_mask' puts 1024 bytes on stack
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c:742 port_vlans_add() error:
> uninitialized symbol 'err'.
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c:886 port_vlans_del() error:
> uninitialized symbol 'err'.

The dpaa2 switch driver is not targeted by the current patchset, so I'll fix these
in a separate patch on Greg's staging tree.

Thanks,
Ioana

^ permalink raw reply

* Re: [PATCH net-next v2 0/2] dpaa2-eth: Move DPAA2 Ethernet driver
From: Dan Carpenter @ 2018-08-29 12:15 UTC (permalink / raw)
  To: Ioana Ciocoi Radulescu
  Cc: devel@driverdev.osuosl.org, andrew@lunn.ch, Horia Geanta,
	Madalin-cristian Bucur, netdev@vger.kernel.org, Ioana Ciornei,
	linux-kernel@vger.kernel.org, gregkh@linuxfoundation.org,
	davem@davemloft.net, Laurentiu Tudor
In-Reply-To: <AM3PR04MB338A141EDC007FF47B510C094090@AM3PR04MB338.eurprd04.prod.outlook.com>

On Wed, Aug 29, 2018 at 12:06:44PM +0000, Ioana Ciocoi Radulescu wrote:
> > 
> > There are a few static checker warnings for the driver but only the
> > first one looks like a possibly real bug.
> 
> Thanks for the report, what tool did you use to get these warnings?
> 

These are Smatch warnings, but you need to build the cross function DB
to see the warnings.

regards,
dan carpenter

^ permalink raw reply

* [PATCH net v2 0/2] net_sched: reject unknown tcfa_action values
From: Paolo Abeni @ 2018-08-29  8:22 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller,
	Davide Caratti, Lucas Bates

As agreed some time ago, this changeset rejects unknown tcfa_action values,
instead of changing such values under the hood.

A tdc test is included to verify the new behavior.

v1 -> v2:
 - helper is now static and renamed according to act_* convention
 - updated extack message, according to the new behavior

Paolo Abeni (2):
  net_sched: reject unknown tcfa_action values
  tc-testing: add test-cases for numeric and invalid control action

 net/sched/act_api.c                           | 16 +++++--
 .../tc-testing/tc-tests/actions/police.json   | 48 +++++++++++++++++++
 2 files changed, 59 insertions(+), 5 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [PATCH net v2 1/2] net_sched: reject unknown tcfa_action values
From: Paolo Abeni @ 2018-08-29  8:22 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller,
	Davide Caratti, Lucas Bates
In-Reply-To: <cover.1535527298.git.pabeni@redhat.com>

After the commit 802bfb19152c ("net/sched: user-space can't set
unknown tcfa_action values"), unknown tcfa_action values are
converted to TC_ACT_UNSPEC, but the common agreement is instead
rejecting such configurations.

This change also introduces a helper to simplify the destruction
of a single action, avoiding code duplication.

v1 -> v2:
 - helper is now static and renamed according to act_* convention
 - updated extack message, according to the new behavior

Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 net/sched/act_api.c | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index db83dac1e7f4..316c98bb87e4 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -662,6 +662,13 @@ int tcf_action_destroy(struct tc_action *actions[], int bind)
 	return ret;
 }
 
+static int tcf_action_destroy_1(struct tc_action *a, int bind)
+{
+	struct tc_action *actions[] = { a, NULL };
+
+	return tcf_action_destroy(actions, bind);
+}
+
 static int tcf_action_put(struct tc_action *p)
 {
 	return __tcf_action_put(p, false);
@@ -881,17 +888,16 @@ struct tc_action *tcf_action_init_1(struct net *net, struct tcf_proto *tp,
 	if (TC_ACT_EXT_CMP(a->tcfa_action, TC_ACT_GOTO_CHAIN)) {
 		err = tcf_action_goto_chain_init(a, tp);
 		if (err) {
-			struct tc_action *actions[] = { a, NULL };
-
-			tcf_action_destroy(actions, bind);
+			tcf_action_destroy_1(a, bind);
 			NL_SET_ERR_MSG(extack, "Failed to init TC action chain");
 			return ERR_PTR(err);
 		}
 	}
 
 	if (!tcf_action_valid(a->tcfa_action)) {
-		NL_SET_ERR_MSG(extack, "invalid action value, using TC_ACT_UNSPEC instead");
-		a->tcfa_action = TC_ACT_UNSPEC;
+		tcf_action_destroy_1(a, bind);
+		NL_SET_ERR_MSG(extack, "Invalid control action value");
+		return ERR_PTR(-EINVAL);
 	}
 
 	return a;
-- 
2.17.1

^ permalink raw reply related

* [PATCH net v2 2/2] tc-testing: add test-cases for numeric and invalid control action
From: Paolo Abeni @ 2018-08-29  8:22 UTC (permalink / raw)
  To: netdev
  Cc: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller,
	Davide Caratti, Lucas Bates
In-Reply-To: <cover.1535527298.git.pabeni@redhat.com>

Only the police action allows us to specify an arbitrary numeric value
for the control action. This change introduces an explicit test case
for the above feature and then leverages it for testing the kernel behavior
for invalid control actions (reject).

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 .../tc-testing/tc-tests/actions/police.json   | 48 +++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/tools/testing/selftests/tc-testing/tc-tests/actions/police.json b/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
index f03763d81617..30f9b54bd666 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/actions/police.json
@@ -312,6 +312,54 @@
             "$TC actions flush action police"
         ]
     },
+    {
+        "id": "6aaf",
+        "name": "Add police actions with conform-exceed control pass/pipe [with numeric values]",
+        "category": [
+            "actions",
+            "police"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action police",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action police rate 3mbit burst 250k conform-exceed 0/3 index 1",
+        "expExitCode": "0",
+        "verifyCmd": "$TC actions get action police index 1",
+        "matchPattern": "action order [0-9]*:  police 0x1 rate 3Mbit burst 250Kb mtu 2Kb action pass/pipe",
+        "matchCount": "1",
+        "teardown": [
+            "$TC actions flush action police"
+        ]
+    },
+    {
+        "id": "29b1",
+        "name": "Add police actions with conform-exceed control <invalid>/drop",
+        "category": [
+            "actions",
+            "police"
+        ],
+        "setup": [
+            [
+                "$TC actions flush action police",
+                0,
+                1,
+                255
+            ]
+        ],
+        "cmdUnderTest": "$TC actions add action police rate 3mbit burst 250k conform-exceed 10/drop index 1",
+        "expExitCode": "255",
+        "verifyCmd": "$TC actions ls action police",
+        "matchPattern": "action order [0-9]*:  police 0x1 rate 3Mbit burst 250Kb mtu 2Kb action ",
+        "matchCount": "0",
+        "teardown": [
+            "$TC actions flush action police"
+        ]
+    },
     {
         "id": "c26f",
         "name": "Add police action with invalid peakrate value",
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH 2/3] xen-netback: validate queue numbers in xenvif_set_hash_mapping()
From: Wei Liu @ 2018-08-29  8:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Wei Liu, davem, xen-devel, netdev
In-Reply-To: <5B85636102000078001E2A4D@prv1-mh.provo.novell.com>

On Tue, Aug 28, 2018 at 08:59:45AM -0600, Jan Beulich wrote:
> Checking them before the grant copy means nothing as to the validity of
> the incoming request. As we shouldn't make the new data live before
> having validated it, introduce a second instance of the mapping array.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> ---
>  drivers/net/xen-netback/common.h    |    3 ++-
>  drivers/net/xen-netback/hash.c      |   20 ++++++++++++++------
>  drivers/net/xen-netback/interface.c |    3 ++-
>  3 files changed, 18 insertions(+), 8 deletions(-)
> 
> --- 4.19-rc1-xen-netback-set-hash-mapping.orig/drivers/net/xen-netback/common.h
> +++ 4.19-rc1-xen-netback-set-hash-mapping/drivers/net/xen-netback/common.h
> @@ -241,8 +241,9 @@ struct xenvif_hash_cache {
>  struct xenvif_hash {
>  	unsigned int alg;
>  	u32 flags;
> +	bool mapping_sel;
>  	u8 key[XEN_NETBK_MAX_HASH_KEY_SIZE];
> -	u32 mapping[XEN_NETBK_MAX_HASH_MAPPING_SIZE];
> +	u32 mapping[2][XEN_NETBK_MAX_HASH_MAPPING_SIZE];
>  	unsigned int size;
>  	struct xenvif_hash_cache cache;
>  };
> --- 4.19-rc1-xen-netback-set-hash-mapping.orig/drivers/net/xen-netback/hash.c
> +++ 4.19-rc1-xen-netback-set-hash-mapping/drivers/net/xen-netback/hash.c
> @@ -324,7 +324,8 @@ u32 xenvif_set_hash_mapping_size(struct
>  		return XEN_NETIF_CTRL_STATUS_INVALID_PARAMETER;
>  
>  	vif->hash.size = size;
> -	memset(vif->hash.mapping, 0, sizeof(u32) * size);
> +	memset(vif->hash.mapping[vif->hash.mapping_sel], 0,
> +	       sizeof(u32) * size);
>  
>  	return XEN_NETIF_CTRL_STATUS_SUCCESS;
>  }
> @@ -332,7 +333,7 @@ u32 xenvif_set_hash_mapping_size(struct
>  u32 xenvif_set_hash_mapping(struct xenvif *vif, u32 gref, u32 len,
>  			    u32 off)
>  {
> -	u32 *mapping = vif->hash.mapping;
> +	u32 *mapping = vif->hash.mapping[!vif->hash.mapping_sel];

Can you rename this to inactive_mapping so the code can be followed more
easily?

The code looks correct to me, but I would like Paul to have a look
before it can go in.
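
For readers following along, the idea of staging the update in an inactive copy and switching only after validation can be sketched in userspace as below. The struct, the table size and the function name are hypothetical, and carrying the live contents over before applying the update is a choice of this sketch, not something taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE 8	/* hypothetical; the real table is larger */

struct hash_map {
	bool sel;			/* index of the live copy */
	uint32_t mapping[2][MAP_SIZE];
};

/*
 * Stage len entries at offset off in the inactive copy, validate them
 * against num_queues, and only then make that copy live.
 * Returns 0 on success, -1 if the request is rejected.
 */
static int hash_map_update(struct hash_map *h, const uint32_t *src,
			   size_t off, size_t len, uint32_t num_queues)
{
	uint32_t *inactive = h->mapping[!h->sel];
	size_t i;

	if (off > MAP_SIZE || len > MAP_SIZE - off)
		return -1;

	/* Start from the live contents so untouched entries carry over. */
	memcpy(inactive, h->mapping[h->sel], sizeof(h->mapping[0]));
	memcpy(inactive + off, src, len * sizeof(*src));

	for (i = off; i < off + len; i++)
		if (inactive[i] >= num_queues)
			return -1;	/* live copy is left untouched */

	h->sel = !h->sel;		/* publish the validated copy */
	return 0;
}
```

A rejected update leaves both the selector and the live array as they were, so readers of the live copy never see unvalidated queue numbers.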

Wei.

^ permalink raw reply

* Re: [PATCH 3/3] xen-netback: handle page straddling in xenvif_set_hash_mapping()
From: Wei Liu @ 2018-08-29  8:26 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Paul Durrant, Wei Liu, davem, xen-devel, netdev
In-Reply-To: <5B85637E02000078001E2A50@prv1-mh.provo.novell.com>

On Tue, Aug 28, 2018 at 09:00:14AM -0600, Jan Beulich wrote:
> There's no guarantee that the mapping array doesn't cross a page
> boundary. Use a second grant copy operation if necessary.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Wei Liu <wei.liu2@citrix.com>

^ permalink raw reply

* [PATCH 2/5] net: mvneta: fix the wrong function to unmap rx buf
From: Jisheng Zhang @ 2018-08-29  8:27 UTC (permalink / raw)
  To: thomas.petazzoni, David S. Miller
  Cc: netdev, linux-kernel, Andrew Lunn, Gregory CLEMENT,
	linux-arm-kernel
In-Reply-To: <20180829162456.2bd69796@xhacker.debian>

Commit 7e47fd84b56b ("net: mvneta: Allocate page for the descriptor")
always allocates one page for each rx descriptor, so the rx buffer is mapped
with dma_map_page() now, but the unmap routine wasn't updated at the
same time.

Fix this by using dma_unmap_page() in corresponding places.

Fixes: 7e47fd84b56b ("net: mvneta: Allocate page for the descriptor")
Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
---
 drivers/net/ethernet/marvell/mvneta.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0ce94f6587a5..d9206094fce3 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -1890,8 +1890,9 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
 		if (!data || !(rx_desc->buf_phys_addr))
 			continue;
 
-		dma_unmap_single(pp->dev->dev.parent, rx_desc->buf_phys_addr,
-				 MVNETA_RX_BUF_SIZE(pp->pkt_size), DMA_FROM_DEVICE);
+		dma_unmap_page(pp->dev->dev.parent, rx_desc->buf_phys_addr,
+			       MVNETA_RX_BUF_SIZE(pp->pkt_size),
+			       DMA_FROM_DEVICE);
 		__free_page(data);
 	}
 }
@@ -2008,8 +2009,8 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 				skb_add_rx_frag(rxq->skb, frag_num, page,
 						frag_offset, frag_size,
 						PAGE_SIZE);
-				dma_unmap_single(dev->dev.parent, phys_addr,
-						 PAGE_SIZE, DMA_FROM_DEVICE);
+				dma_unmap_page(dev->dev.parent, phys_addr,
+					       PAGE_SIZE, DMA_FROM_DEVICE);
 				rxq->left_size -= frag_size;
 			}
 		} else {
@@ -2039,9 +2040,8 @@ static int mvneta_rx_swbm(struct napi_struct *napi,
 						frag_offset, frag_size,
 						PAGE_SIZE);
 
-				dma_unmap_single(dev->dev.parent, phys_addr,
-						 PAGE_SIZE,
-						 DMA_FROM_DEVICE);
+				dma_unmap_page(dev->dev.parent, phys_addr,
+					       PAGE_SIZE, DMA_FROM_DEVICE);
 
 				rxq->left_size -= frag_size;
 			}
-- 
2.18.0

^ permalink raw reply related

* Re: [PATCH net] vti6: remove !skb->ignore_df check from vti6_xmit()
From: Steffen Klassert @ 2018-08-29  8:39 UTC (permalink / raw)
  To: Alexey Kodanev; +Cc: netdev, David Miller
In-Reply-To: <1535042994-27225-1-git-send-email-alexey.kodanev@oracle.com>

On Thu, Aug 23, 2018 at 07:49:54PM +0300, Alexey Kodanev wrote:
> Before the commit d6990976af7c ("vti6: fix PMTU caching and reporting
> on xmit") '!skb->ignore_df' check was always true because the function
> skb_scrub_packet() was called before it, resetting ignore_df to zero.
> 
> In the commit, skb_scrub_packet() was moved below, and now this check
> can be false for the packet, e.g. when sending it in the two fragments,
> this prevents successful PMTU updates in such case. The next attempts
> to send the packet lead to the same tx error. Moreover, vti6 initial
> MTU value relies on PMTU adjustments.
> 
> This issue can be reproduced with the following LTP test script:
>     udp_ipsec_vti.sh -6 -p ah -m tunnel -s 2000
> 
> Fixes: ccd740cbc6e0 ("vti6: Add pmtu handling to vti6_xmit.")
> Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com>
> ---
> Not sure about xfrmi_xmit2(), it has a similar check for ignore_df...
> 
>  net/ipv6/ip6_vti.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
> index 38dec9d..f48d196 100644
> --- a/net/ipv6/ip6_vti.c
> +++ b/net/ipv6/ip6_vti.c
> @@ -481,7 +481,7 @@ static bool vti6_state_check(const struct xfrm_state *x,
>  	}
>  
>  	mtu = dst_mtu(dst);
> -	if (!skb->ignore_df && skb->len > mtu) {
> +	if (skb->len > mtu) {
>  		skb_dst_update_pmtu(skb, mtu);

This looks OK to me. If I remember correctly, the !skb->ignore_df
check was taken from the native xfrm6 PMTU handling. There this
check makes sense because the packet can still be fragmented
along the way through the stack. In this case here it is too late
as we are about to TX the packet through the vti device. So
we should update to the new IPsec PMTU and notify the sender
about this.

Acked-by: Steffen Klassert <steffen.klassert@secunet.com>

^ permalink raw reply

* Re: [PATCH ipsec-next] xfrm: allow driver to quietly refuse offload
From: Steffen Klassert @ 2018-08-29  8:42 UTC (permalink / raw)
  To: Shannon Nelson; +Cc: netdev
In-Reply-To: <1534973890-23111-1-git-send-email-shannon.nelson@oracle.com>

On Wed, Aug 22, 2018 at 02:38:10PM -0700, Shannon Nelson wrote:
> If the "offload" attribute is used to create an IPsec SA
> and the .xdo_dev_state_add() fails, the SA creation fails.
> However, if the "offload" attribute is used on a device that
> doesn't offer it, the attribute is quietly ignored and the SA
> is created without an offload.
> 
> Along the same line of that second case, it would be good to
> have a way for the device to refuse to offload an SA without
> failing the whole SA creation.  This patch adds that feature
> by allowing the driver to return -EOPNOTSUPP as a signal that
> the SA may be fine, it just can't be offloaded.
> 
> This allows the user a little more flexibility in requesting
> offloads and not needing to know every detail at all times about
> each specific NIC when trying to create SAs.
> 
> Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>

Applied to ipsec-next, thanks Shannon!
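
For reference, the three outcomes described in the commit message can
be modelled as below. This is a sketch under my own assumptions: the
enum and the function are hypothetical and only mirror the behaviour
described in the text, not the actual xfrm code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcomes of creating an SA with the "offload" attribute. */
enum model_sa_result {
	MODEL_SA_FAILED,	/* SA creation fails */
	MODEL_SA_OFFLOADED,	/* SA created and offloaded */
	MODEL_SA_SOFTWARE,	/* SA created without an offload */
};

/* driver_err models the return value of .xdo_dev_state_add(). */
static enum model_sa_result model_sa_create(bool dev_has_offload,
					    int driver_err)
{
	assert(driver_err <= 0);	/* kernel-style error codes */

	if (!dev_has_offload)
		return MODEL_SA_SOFTWARE;	/* attribute quietly ignored */
	if (driver_err == 0)
		return MODEL_SA_OFFLOADED;
	if (driver_err == -EOPNOTSUPP)
		return MODEL_SA_SOFTWARE;	/* new: quiet refusal */
	return MODEL_SA_FAILED;
}
```

Any other error from the driver still fails the SA creation as before.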

* Re: [PATCH net v2 1/2] net_sched: reject unknown tcfa_action values
From: Jiri Pirko @ 2018-08-29  8:52 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: netdev, Jamal Hadi Salim, Cong Wang, David S . Miller,
	Davide Caratti, Lucas Bates
In-Reply-To: <0aca759d71e70f359e51d517f9b1c087b4868fa1.1535527298.git.pabeni@redhat.com>

Wed, Aug 29, 2018 at 10:22:33AM CEST, pabeni@redhat.com wrote:
>After the commit 802bfb19152c ("net/sched: user-space can't set
>unknown tcfa_action values"), unknown tcfa_action values are
>converted to TC_ACT_UNSPEC, but the common agreement is instead
>rejecting such configurations.
>
>This change also introduces a helper to simplify the destruction
>of a single action, avoiding code duplication.
>
>v1 -> v2:
> - helper is now static and renamed according to act_* convention
> - updated extack message, according to the new behavior
>
>Fixes: 802bfb19152c ("net/sched: user-space can't set unknown tcfa_action values")
>Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Acked-by: Jiri Pirko <jiri@mellanox.com>
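
A small sketch of the behaviour change, for readers skimming the
thread. The names are hypothetical, the upper bound on known opcodes
and the -EINVAL return are my assumptions (the commit message only
says such configurations are rejected), and extended opcodes are
ignored in this model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define MODEL_ACT_UNSPEC	(-1)	/* same value as TC_ACT_UNSPEC */
#define MODEL_ACT_MAX		8	/* assumed highest known opcode */

static bool model_act_known(int action)
{
	return action >= MODEL_ACT_UNSPEC && action <= MODEL_ACT_MAX;
}

/* Behaviour after 802bfb19152c: an unknown value is silently
 * converted to UNSPEC. Returns the action that ends up stored. */
static int model_old_setup(int action)
{
	if (!model_act_known(action))
		action = MODEL_ACT_UNSPEC;
	assert(model_act_known(action));
	return action;
}

/* Behaviour with this patch: an unknown value is refused.
 * Returns 0 on success or a negative error code. */
static int model_new_setup(int action)
{
	return model_act_known(action) ? 0 : -EINVAL;
}
```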

* [PATCH net-next 1/3] net: rework SIOCGSTAMP ioctl handling
From: Arnd Bergmann @ 2018-08-29 12:59 UTC (permalink / raw)
  To: netdev, David S . Miller
  Cc: linux-arch, y2038, Arnd Bergmann, Eric Dumazet, Willem de Bruijn,
	linux-kernel, linux-hams, linux-bluetooth, linux-can, dccp,
	linux-wpan, linux-sctp, linux-x25

The SIOCGSTAMP/SIOCGSTAMPNS ioctl commands are implemented by many
socket protocol handlers, and all of those end up calling the same
sock_get_timestamp()/sock_get_timestampns() helper functions, which
results in a lot of duplicate code.

With the introduction of 64-bit time_t on 32-bit architectures, this
gets worse, as we then need four different ioctl commands in each
socket protocol implementation.

To simplify that, let's add a new .gettstamp() operation in
struct proto_ops, and move ioctl implementation into the common
sock_ioctl()/compat_sock_ioctl_trans() functions that these all go
through.
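
The dispatch this describes can be modelled in user space roughly as
follows. This is only a sketch: the model_* names are hypothetical,
the callback drops the socket and user pointer arguments, and it
returns its flags so a caller can observe them. The ioctl numbers are
the ones from asm-generic/sockios.h, and 515 is the kernel-internal
ENOIOCTLCMD value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_SIOCGSTAMP	0x8906	/* Get stamp (timeval) */
#define MODEL_SIOCGSTAMPNS	0x8907	/* Get stamp (timespec) */
#define MODEL_ENOIOCTLCMD	515	/* kernel-internal errno */

/* Reduced stand-in for struct proto_ops. */
struct model_ops {
	int (*gettstamp)(bool timeval, bool time32);
};

/* Encodes the two flags in the return value: bit 0 = timeval,
 * bit 1 = time32. A real handler would copy a timestamp out. */
static int model_gettstamp(bool timeval, bool time32)
{
	return (timeval ? 1 : 0) | (time32 ? 2 : 0);
}

static const struct model_ops model_with_stamp = {
	.gettstamp = model_gettstamp,
};
static const struct model_ops model_without_stamp = {
	.gettstamp = NULL,
};

/* Common handler: protocols no longer see the two commands at all. */
static int model_sock_ioctl(const struct model_ops *ops, unsigned int cmd,
			    bool time32)
{
	assert(cmd == MODEL_SIOCGSTAMP || cmd == MODEL_SIOCGSTAMPNS);

	if (!ops->gettstamp)
		return -MODEL_ENOIOCTLCMD;
	return ops->gettstamp(cmd == MODEL_SIOCGSTAMP, time32);
}
```

A protocol opts in by setting the callback; one that leaves it unset
gets -ENOIOCTLCMD for both commands.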

We can reuse the sock_get_timestamp() implementation, but generalize
it so it can deal with both native and compat mode, as well as
timeval and timespec structures.
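
As a rough illustration of the timeval/timespec part, the conversion
boils down to the following. Again a user-space sketch with
hypothetical names; it takes a plain nanosecond count instead of a
ktime_t and leaves out the copy to user space and the 32-bit layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result: seconds plus a sub-second part that is in
 * nanoseconds for timespec and microseconds for timeval. */
struct model_stamp {
	int64_t sec;
	int64_t frac;
};

static struct model_stamp model_convert(int64_t ns, bool timeval)
{
	struct model_stamp ts;

	assert(ns >= 0);
	ts.sec = ns / 1000000000;
	ts.frac = ns % 1000000000;
	if (timeval)
		ts.frac /= 1000;	/* nanoseconds to microseconds */
	return ts;
}
```

One code path then serves both commands, with the flag selecting the
unit of the sub-second field.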

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/net.h          |  2 ++
 include/net/compat.h         |  3 --
 include/net/sock.h           |  4 +--
 net/appletalk/ddp.c          |  7 +----
 net/atm/ioctl.c              | 16 -----------
 net/atm/pvc.c                |  1 +
 net/atm/svc.c                |  1 +
 net/ax25/af_ax25.c           |  9 +-----
 net/bluetooth/af_bluetooth.c |  8 ------
 net/bluetooth/l2cap_sock.c   |  1 +
 net/bluetooth/rfcomm/sock.c  |  1 +
 net/bluetooth/sco.c          |  1 +
 net/can/af_can.c             |  6 ----
 net/can/bcm.c                |  1 +
 net/can/raw.c                |  1 +
 net/compat.c                 | 54 ------------------------------------
 net/core/sock.c              | 38 +++++++++++--------------
 net/dccp/ipv4.c              |  1 +
 net/dccp/ipv6.c              |  1 +
 net/ieee802154/socket.c      |  6 ++--
 net/ipv4/af_inet.c           |  9 ++----
 net/ipv6/af_inet6.c          |  8 ++----
 net/ipv6/raw.c               |  1 +
 net/l2tp/l2tp_ip.c           |  1 +
 net/l2tp/l2tp_ip6.c          |  1 +
 net/netrom/af_netrom.c       | 14 +---------
 net/packet/af_packet.c       |  7 ++---
 net/qrtr/qrtr.c              |  4 +--
 net/rose/af_rose.c           |  7 +----
 net/sctp/ipv6.c              |  1 +
 net/sctp/protocol.c          |  1 +
 net/socket.c                 | 48 ++++++++++----------------------
 net/x25/af_x25.c             | 27 +-----------------
 33 files changed, 63 insertions(+), 228 deletions(-)

diff --git a/include/linux/net.h b/include/linux/net.h
index e0930678c8bf..2be3e9c772fe 100644
--- a/include/linux/net.h
+++ b/include/linux/net.h
@@ -155,6 +155,8 @@ struct proto_ops {
 	int	 	(*compat_ioctl) (struct socket *sock, unsigned int cmd,
 				      unsigned long arg);
 #endif
+	int		(*gettstamp) (struct socket *sock, void __user *userstamp,
+				      bool timeval, bool time32);
 	int		(*listen)    (struct socket *sock, int len);
 	int		(*shutdown)  (struct socket *sock, int flags);
 	int		(*setsockopt)(struct socket *sock, int level,
diff --git a/include/net/compat.h b/include/net/compat.h
index 4c6d75612b6c..f277653c7e17 100644
--- a/include/net/compat.h
+++ b/include/net/compat.h
@@ -30,9 +30,6 @@ struct compat_cmsghdr {
 	compat_int_t	cmsg_type;
 };
 
-int compat_sock_get_timestamp(struct sock *, struct timeval __user *);
-int compat_sock_get_timestampns(struct sock *, struct timespec __user *);
-
 #else /* defined(CONFIG_COMPAT) */
 /*
  * To avoid compiler warnings:
diff --git a/include/net/sock.h b/include/net/sock.h
index 433f45fc2d68..ef6c9409dc75 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1584,6 +1584,8 @@ int sock_setsockopt(struct socket *sock, int level, int op,
 
 int sock_getsockopt(struct socket *sock, int level, int op,
 		    char __user *optval, int __user *optlen);
+int sock_gettstamp(struct socket *sock, void __user *userstamp,
+		   bool timeval, bool time32);
 struct sk_buff *sock_alloc_send_skb(struct sock *sk, unsigned long size,
 				    int noblock, int *errcode);
 struct sk_buff *sock_alloc_send_pskb(struct sock *sk, unsigned long header_len,
@@ -2423,8 +2425,6 @@ static inline bool sk_listener(const struct sock *sk)
 }
 
 void sock_enable_timestamp(struct sock *sk, int flag);
-int sock_get_timestamp(struct sock *, struct timeval __user *);
-int sock_get_timestampns(struct sock *, struct timespec __user *);
 int sock_recv_errqueue(struct sock *sk, struct msghdr *msg, int len, int level,
 		       int type);
 
diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 9b6bc5abe946..a21a643997aa 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1806,12 +1806,6 @@ static int atalk_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		rc = put_user(amount, (int __user *)argp);
 		break;
 	}
-	case SIOCGSTAMP:
-		rc = sock_get_timestamp(sk, argp);
-		break;
-	case SIOCGSTAMPNS:
-		rc = sock_get_timestampns(sk, argp);
-		break;
 	/* Routing */
 	case SIOCADDRT:
 	case SIOCDELRT:
@@ -1871,6 +1865,7 @@ static const struct proto_ops atalk_dgram_ops = {
 	.getname	= atalk_getname,
 	.poll		= datagram_poll,
 	.ioctl		= atalk_ioctl,
+	.gettstamp	= sock_gettstamp,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl	= atalk_compat_ioctl,
 #endif
diff --git a/net/atm/ioctl.c b/net/atm/ioctl.c
index 2ff0e5e470e3..d955b683aa7c 100644
--- a/net/atm/ioctl.c
+++ b/net/atm/ioctl.c
@@ -81,22 +81,6 @@ static int do_vcc_ioctl(struct socket *sock, unsigned int cmd,
 				 (int __user *)argp) ? -EFAULT : 0;
 		goto done;
 	}
-	case SIOCGSTAMP: /* borrowed from IP */
-#ifdef CONFIG_COMPAT
-		if (compat)
-			error = compat_sock_get_timestamp(sk, argp);
-		else
-#endif
-			error = sock_get_timestamp(sk, argp);
-		goto done;
-	case SIOCGSTAMPNS: /* borrowed from IP */
-#ifdef CONFIG_COMPAT
-		if (compat)
-			error = compat_sock_get_timestampns(sk, argp);
-		else
-#endif
-			error = sock_get_timestampns(sk, argp);
-		goto done;
 	case ATM_SETSC:
 		net_warn_ratelimited("ATM_SETSC is obsolete; used by %s:%d\n",
 				     current->comm, task_pid_nr(current));
diff --git a/net/atm/pvc.c b/net/atm/pvc.c
index 2cb10af16afc..02bd2a436bdf 100644
--- a/net/atm/pvc.c
+++ b/net/atm/pvc.c
@@ -118,6 +118,7 @@ static const struct proto_ops pvc_proto_ops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl = vcc_compat_ioctl,
 #endif
+	.gettstamp =	sock_gettstamp,
 	.listen =	sock_no_listen,
 	.shutdown =	pvc_shutdown,
 	.setsockopt =	pvc_setsockopt,
diff --git a/net/atm/svc.c b/net/atm/svc.c
index 2f91b766ac42..908cbb8654f5 100644
--- a/net/atm/svc.c
+++ b/net/atm/svc.c
@@ -641,6 +641,7 @@ static const struct proto_ops svc_proto_ops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl =	svc_compat_ioctl,
 #endif
+	.gettstamp =	sock_gettstamp,
 	.listen =	svc_listen,
 	.shutdown =	svc_shutdown,
 	.setsockopt =	svc_setsockopt,
diff --git a/net/ax25/af_ax25.c b/net/ax25/af_ax25.c
index c603d33d5410..121f7b877df9 100644
--- a/net/ax25/af_ax25.c
+++ b/net/ax25/af_ax25.c
@@ -1707,14 +1707,6 @@ static int ax25_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		break;
 	}
 
-	case SIOCGSTAMP:
-		res = sock_get_timestamp(sk, argp);
-		break;
-
-	case SIOCGSTAMPNS:
-		res = sock_get_timestampns(sk, argp);
-		break;
-
 	case SIOCAX25ADDUID:	/* Add a uid to the uid/call map table */
 	case SIOCAX25DELUID:	/* Delete a uid from the uid/call map table */
 	case SIOCAX25GETUID: {
@@ -1943,6 +1935,7 @@ static const struct proto_ops ax25_proto_ops = {
 	.getname	= ax25_getname,
 	.poll		= datagram_poll,
 	.ioctl		= ax25_ioctl,
+	.gettstamp	= sock_gettstamp,
 	.listen		= ax25_listen,
 	.shutdown	= ax25_shutdown,
 	.setsockopt	= ax25_setsockopt,
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index deacc52d7ff1..34e15ca66779 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -511,14 +511,6 @@ int bt_sock_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		err = put_user(amount, (int __user *) arg);
 		break;
 
-	case SIOCGSTAMP:
-		err = sock_get_timestamp(sk, (struct timeval __user *) arg);
-		break;
-
-	case SIOCGSTAMPNS:
-		err = sock_get_timestampns(sk, (struct timespec __user *) arg);
-		break;
-
 	default:
 		err = -ENOIOCTLCMD;
 		break;
diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
index 686bdc6b35b0..e8fed07a49d5 100644
--- a/net/bluetooth/l2cap_sock.c
+++ b/net/bluetooth/l2cap_sock.c
@@ -1655,6 +1655,7 @@ static const struct proto_ops l2cap_sock_ops = {
 	.recvmsg	= l2cap_sock_recvmsg,
 	.poll		= bt_sock_poll,
 	.ioctl		= bt_sock_ioctl,
+	.gettstamp	= sock_gettstamp,
 	.mmap		= sock_no_mmap,
 	.socketpair	= sock_no_socketpair,
 	.shutdown	= l2cap_sock_shutdown,
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index d606e9212291..a1dde6fab323 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -1049,6 +1049,7 @@ static const struct proto_ops rfcomm_sock_ops = {
 	.setsockopt	= rfcomm_sock_setsockopt,
 	.getsockopt	= rfcomm_sock_getsockopt,
 	.ioctl		= rfcomm_sock_ioctl,
+	.gettstamp	= sock_gettstamp,
 	.poll		= bt_sock_poll,
 	.socketpair	= sock_no_socketpair,
 	.mmap		= sock_no_mmap
diff --git a/net/bluetooth/sco.c b/net/bluetooth/sco.c
index 8f0f9279eac9..e7e5b3ed0394 100644
--- a/net/bluetooth/sco.c
+++ b/net/bluetooth/sco.c
@@ -1200,6 +1200,7 @@ static const struct proto_ops sco_sock_ops = {
 	.recvmsg	= sco_sock_recvmsg,
 	.poll		= bt_sock_poll,
 	.ioctl		= bt_sock_ioctl,
+	.gettstamp	= sock_gettstamp,
 	.mmap		= sock_no_mmap,
 	.socketpair	= sock_no_socketpair,
 	.shutdown	= sco_sock_shutdown,
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 1684ba5b51eb..e8fd5dc1780a 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -89,13 +89,7 @@ static atomic_t skbcounter = ATOMIC_INIT(0);
 
 int can_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 {
-	struct sock *sk = sock->sk;
-
 	switch (cmd) {
-
-	case SIOCGSTAMP:
-		return sock_get_timestamp(sk, (struct timeval __user *)arg);
-
 	default:
 		return -ENOIOCTLCMD;
 	}
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 0af8f0db892a..db3e521b9f47 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -1662,6 +1662,7 @@ static const struct proto_ops bcm_ops = {
 	.getname       = sock_no_getname,
 	.poll          = datagram_poll,
 	.ioctl         = can_ioctl,	/* use can_ioctl() from af_can.c */
+	.gettstamp     = sock_gettstamp,
 	.listen        = sock_no_listen,
 	.shutdown      = sock_no_shutdown,
 	.setsockopt    = sock_no_setsockopt,
diff --git a/net/can/raw.c b/net/can/raw.c
index 1051eee82581..968f6f8082a1 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -845,6 +845,7 @@ static const struct proto_ops raw_ops = {
 	.getname       = raw_getname,
 	.poll          = datagram_poll,
 	.ioctl         = can_ioctl,	/* use can_ioctl() from af_can.c */
+	.gettstamp     = sock_gettstamp,
 	.listen        = sock_no_listen,
 	.shutdown      = sock_no_shutdown,
 	.setsockopt    = raw_setsockopt,
diff --git a/net/compat.c b/net/compat.c
index 47a614b370cd..e5456dd4c7a5 100644
--- a/net/compat.c
+++ b/net/compat.c
@@ -455,60 +455,6 @@ static int compat_sock_getsockopt(struct socket *sock, int level, int optname,
 	return sock_getsockopt(sock, level, optname, optval, optlen);
 }
 
-int compat_sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
-{
-	struct compat_timeval __user *ctv;
-	int err;
-	struct timeval tv;
-
-	if (COMPAT_USE_64BIT_TIME)
-		return sock_get_timestamp(sk, userstamp);
-
-	ctv = (struct compat_timeval __user *) userstamp;
-	err = -ENOENT;
-	sock_enable_timestamp(sk, SOCK_TIMESTAMP);
-	tv = ktime_to_timeval(sk->sk_stamp);
-	if (tv.tv_sec == -1)
-		return err;
-	if (tv.tv_sec == 0) {
-		sk->sk_stamp = ktime_get_real();
-		tv = ktime_to_timeval(sk->sk_stamp);
-	}
-	err = 0;
-	if (put_user(tv.tv_sec, &ctv->tv_sec) ||
-			put_user(tv.tv_usec, &ctv->tv_usec))
-		err = -EFAULT;
-	return err;
-}
-EXPORT_SYMBOL(compat_sock_get_timestamp);
-
-int compat_sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
-{
-	struct compat_timespec __user *ctv;
-	int err;
-	struct timespec ts;
-
-	if (COMPAT_USE_64BIT_TIME)
-		return sock_get_timestampns (sk, userstamp);
-
-	ctv = (struct compat_timespec __user *) userstamp;
-	err = -ENOENT;
-	sock_enable_timestamp(sk, SOCK_TIMESTAMP);
-	ts = ktime_to_timespec(sk->sk_stamp);
-	if (ts.tv_sec == -1)
-		return err;
-	if (ts.tv_sec == 0) {
-		sk->sk_stamp = ktime_get_real();
-		ts = ktime_to_timespec(sk->sk_stamp);
-	}
-	err = 0;
-	if (put_user(ts.tv_sec, &ctv->tv_sec) ||
-			put_user(ts.tv_nsec, &ctv->tv_nsec))
-		err = -EFAULT;
-	return err;
-}
-EXPORT_SYMBOL(compat_sock_get_timestampns);
-
 static int __compat_sys_getsockopt(int fd, int level, int optname,
 				   char __user *optval,
 				   int __user *optlen)
diff --git a/net/core/sock.c b/net/core/sock.c
index 3730eb855095..df17bbfaca27 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2897,37 +2897,31 @@ bool lock_sock_fast(struct sock *sk)
 }
 EXPORT_SYMBOL(lock_sock_fast);
 
-int sock_get_timestamp(struct sock *sk, struct timeval __user *userstamp)
+int sock_gettstamp(struct socket *sock, void __user *userstamp,
+		   bool timeval, bool time32)
 {
-	struct timeval tv;
-
-	sock_enable_timestamp(sk, SOCK_TIMESTAMP);
-	tv = ktime_to_timeval(sk->sk_stamp);
-	if (tv.tv_sec == -1)
-		return -ENOENT;
-	if (tv.tv_sec == 0) {
-		sk->sk_stamp = ktime_get_real();
-		tv = ktime_to_timeval(sk->sk_stamp);
-	}
-	return copy_to_user(userstamp, &tv, sizeof(tv)) ? -EFAULT : 0;
-}
-EXPORT_SYMBOL(sock_get_timestamp);
-
-int sock_get_timestampns(struct sock *sk, struct timespec __user *userstamp)
-{
-	struct timespec ts;
+	struct sock *sk = sock->sk;
+	struct timespec64 ts;
 
 	sock_enable_timestamp(sk, SOCK_TIMESTAMP);
-	ts = ktime_to_timespec(sk->sk_stamp);
+	ts = ktime_to_timespec64(sk->sk_stamp);
 	if (ts.tv_sec == -1)
 		return -ENOENT;
 	if (ts.tv_sec == 0) {
 		sk->sk_stamp = ktime_get_real();
-		ts = ktime_to_timespec(sk->sk_stamp);
+		ts = ktime_to_timespec64(sk->sk_stamp);
 	}
-	return copy_to_user(userstamp, &ts, sizeof(ts)) ? -EFAULT : 0;
+
+	if (timeval)
+		ts.tv_nsec /= 1000;
+#ifdef CONFIG_COMPAT_32BIT_TIME
+	if (time32)
+		return put_old_timespec32(&ts, userstamp);
+#endif
+
+	return put_timespec64(&ts, userstamp);
 }
-EXPORT_SYMBOL(sock_get_timestampns);
+EXPORT_SYMBOL(sock_gettstamp);
 
 void sock_enable_timestamp(struct sock *sk, int flag)
 {
diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
index b08feb219b44..8103f3525773 100644
--- a/net/dccp/ipv4.c
+++ b/net/dccp/ipv4.c
@@ -986,6 +986,7 @@ static const struct proto_ops inet_dccp_ops = {
 	/* FIXME: work on tcp_poll to rename it to inet_csk_poll */
 	.poll		   = dccp_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	/* FIXME: work on inet_listen to rename it to sock_common_listen */
 	.listen		   = inet_dccp_listen,
 	.shutdown	   = inet_shutdown,
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index 6344f1b18a6a..dacdb5b2638d 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -1072,6 +1072,7 @@ static const struct proto_ops inet6_dccp_ops = {
 	.getname	   = inet6_getname,
 	.poll		   = dccp_poll,
 	.ioctl		   = inet6_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = inet_dccp_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index bc6b912603f1..ce2dfb997537 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -164,10 +164,6 @@ static int ieee802154_sock_ioctl(struct socket *sock, unsigned int cmd,
 	struct sock *sk = sock->sk;
 
 	switch (cmd) {
-	case SIOCGSTAMP:
-		return sock_get_timestamp(sk, (struct timeval __user *)arg);
-	case SIOCGSTAMPNS:
-		return sock_get_timestampns(sk, (struct timespec __user *)arg);
 	case SIOCGIFADDR:
 	case SIOCSIFADDR:
 		return ieee802154_dev_ioctl(sk, (struct ifreq __user *)arg,
@@ -426,6 +422,7 @@ static const struct proto_ops ieee802154_raw_ops = {
 	.getname	   = sock_no_getname,
 	.poll		   = datagram_poll,
 	.ioctl		   = ieee802154_sock_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = sock_no_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
@@ -988,6 +985,7 @@ static const struct proto_ops ieee802154_dgram_ops = {
 	.getname	   = sock_no_getname,
 	.poll		   = datagram_poll,
 	.ioctl		   = ieee802154_sock_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = sock_no_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 20fda8fb8ffd..3490275bab50 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -911,12 +911,6 @@ int inet_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 	struct rtentry rt;
 
 	switch (cmd) {
-	case SIOCGSTAMP:
-		err = sock_get_timestamp(sk, (struct timeval __user *)arg);
-		break;
-	case SIOCGSTAMPNS:
-		err = sock_get_timestampns(sk, (struct timespec __user *)arg);
-		break;
 	case SIOCADDRT:
 	case SIOCDELRT:
 		if (copy_from_user(&rt, p, sizeof(struct rtentry)))
@@ -988,6 +982,7 @@ const struct proto_ops inet_stream_ops = {
 	.getname	   = inet_getname,
 	.poll		   = tcp_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = inet_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
@@ -1023,6 +1018,7 @@ const struct proto_ops inet_dgram_ops = {
 	.getname	   = inet_getname,
 	.poll		   = udp_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
@@ -1055,6 +1051,7 @@ static const struct proto_ops inet_sockraw_ops = {
 	.getname	   = inet_getname,
 	.poll		   = datagram_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index 673bba31eb18..77f9716958a7 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -532,12 +532,6 @@ int inet6_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 	struct net *net = sock_net(sk);
 
 	switch (cmd) {
-	case SIOCGSTAMP:
-		return sock_get_timestamp(sk, (struct timeval __user *)arg);
-
-	case SIOCGSTAMPNS:
-		return sock_get_timestampns(sk, (struct timespec __user *)arg);
-
 	case SIOCADDRT:
 	case SIOCDELRT:
 
@@ -570,6 +564,7 @@ const struct proto_ops inet6_stream_ops = {
 	.getname	   = inet6_getname,
 	.poll		   = tcp_poll,			/* ok		*/
 	.ioctl		   = inet6_ioctl,		/* must change  */
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = inet_listen,		/* ok		*/
 	.shutdown	   = inet_shutdown,		/* ok		*/
 	.setsockopt	   = sock_common_setsockopt,	/* ok		*/
@@ -603,6 +598,7 @@ const struct proto_ops inet6_dgram_ops = {
 	.getname	   = inet6_getname,
 	.poll		   = udp_poll,			/* ok		*/
 	.ioctl		   = inet6_ioctl,		/* must change  */
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,		/* ok		*/
 	.shutdown	   = inet_shutdown,		/* ok		*/
 	.setsockopt	   = sock_common_setsockopt,	/* ok		*/
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 413d98bf24f4..a913ccfff021 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -1344,6 +1344,7 @@ const struct proto_ops inet6_sockraw_ops = {
 	.getname	   = inet6_getname,
 	.poll		   = datagram_poll,		/* ok		*/
 	.ioctl		   = inet6_ioctl,		/* must change  */
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,		/* ok		*/
 	.shutdown	   = inet_shutdown,		/* ok		*/
 	.setsockopt	   = sock_common_setsockopt,	/* ok		*/
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 35f6f86d4dcc..b7b844d9edec 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -615,6 +615,7 @@ static const struct proto_ops l2tp_ip_ops = {
 	.getname	   = l2tp_ip_getname,
 	.poll		   = datagram_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 237f1a4a0b0c..c379ebfa4cb7 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -751,6 +751,7 @@ static const struct proto_ops l2tp_ip6_ops = {
 	.getname	   = l2tp_ip6_getname,
 	.poll		   = datagram_poll,
 	.ioctl		   = inet6_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sock_no_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/netrom/af_netrom.c b/net/netrom/af_netrom.c
index 03f37c4e64fe..687103e0a3c8 100644
--- a/net/netrom/af_netrom.c
+++ b/net/netrom/af_netrom.c
@@ -1194,7 +1194,6 @@ static int nr_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 {
 	struct sock *sk = sock->sk;
 	void __user *argp = (void __user *)arg;
-	int ret;
 
 	switch (cmd) {
 	case TIOCOUTQ: {
@@ -1220,18 +1219,6 @@ static int nr_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		return put_user(amount, (int __user *)argp);
 	}
 
-	case SIOCGSTAMP:
-		lock_sock(sk);
-		ret = sock_get_timestamp(sk, argp);
-		release_sock(sk);
-		return ret;
-
-	case SIOCGSTAMPNS:
-		lock_sock(sk);
-		ret = sock_get_timestampns(sk, argp);
-		release_sock(sk);
-		return ret;
-
 	case SIOCGIFADDR:
 	case SIOCSIFADDR:
 	case SIOCGIFDSTADDR:
@@ -1357,6 +1344,7 @@ static const struct proto_ops nr_proto_ops = {
 	.getname	=	nr_getname,
 	.poll		=	datagram_poll,
 	.ioctl		=	nr_ioctl,
+	.gettstamp	=	sock_gettstamp,
 	.listen		=	nr_listen,
 	.shutdown	=	sock_no_shutdown,
 	.setsockopt	=	nr_setsockopt,
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 5610061e7f2e..06097f1e060b 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4053,11 +4053,6 @@ static int packet_ioctl(struct socket *sock, unsigned int cmd,
 		spin_unlock_bh(&sk->sk_receive_queue.lock);
 		return put_user(amount, (int __user *)arg);
 	}
-	case SIOCGSTAMP:
-		return sock_get_timestamp(sk, (struct timeval __user *)arg);
-	case SIOCGSTAMPNS:
-		return sock_get_timestampns(sk, (struct timespec __user *)arg);
-
 #ifdef CONFIG_INET
 	case SIOCADDRT:
 	case SIOCDELRT:
@@ -4415,6 +4410,7 @@ static const struct proto_ops packet_ops_spkt = {
 	.getname =	packet_getname_spkt,
 	.poll =		datagram_poll,
 	.ioctl =	packet_ioctl,
+	.gettstamp =	sock_gettstamp,
 	.listen =	sock_no_listen,
 	.shutdown =	sock_no_shutdown,
 	.setsockopt =	sock_no_setsockopt,
@@ -4436,6 +4432,7 @@ static const struct proto_ops packet_ops = {
 	.getname =	packet_getname,
 	.poll =		packet_poll,
 	.ioctl =	packet_ioctl,
+	.gettstamp =	sock_gettstamp,
 	.listen =	sock_no_listen,
 	.shutdown =	sock_no_shutdown,
 	.setsockopt =	packet_setsockopt,
diff --git a/net/qrtr/qrtr.c b/net/qrtr/qrtr.c
index 86e1e37eb4e8..9da159f3fc2a 100644
--- a/net/qrtr/qrtr.c
+++ b/net/qrtr/qrtr.c
@@ -967,9 +967,6 @@ static int qrtr_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 			break;
 		}
 		break;
-	case SIOCGSTAMP:
-		rc = sock_get_timestamp(sk, argp);
-		break;
 	case SIOCADDRT:
 	case SIOCDELRT:
 	case SIOCSIFADDR:
@@ -1032,6 +1029,7 @@ static const struct proto_ops qrtr_proto_ops = {
 	.recvmsg	= qrtr_recvmsg,
 	.getname	= qrtr_getname,
 	.ioctl		= qrtr_ioctl,
+	.gettstamp	= sock_gettstamp,
 	.poll		= datagram_poll,
 	.shutdown	= sock_no_shutdown,
 	.setsockopt	= sock_no_setsockopt,
diff --git a/net/rose/af_rose.c b/net/rose/af_rose.c
index d00a0ef39a56..64bcbb22c8f7 100644
--- a/net/rose/af_rose.c
+++ b/net/rose/af_rose.c
@@ -1299,12 +1299,6 @@ static int rose_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		return put_user(amount, (unsigned int __user *) argp);
 	}
 
-	case SIOCGSTAMP:
-		return sock_get_timestamp(sk, (struct timeval __user *) argp);
-
-	case SIOCGSTAMPNS:
-		return sock_get_timestampns(sk, (struct timespec __user *) argp);
-
 	case SIOCGIFADDR:
 	case SIOCSIFADDR:
 	case SIOCGIFDSTADDR:
@@ -1472,6 +1466,7 @@ static const struct proto_ops rose_proto_ops = {
 	.getname	=	rose_getname,
 	.poll		=	datagram_poll,
 	.ioctl		=	rose_ioctl,
+	.gettstamp	=	sock_gettstamp,
 	.listen		=	rose_listen,
 	.shutdown	=	sock_no_shutdown,
 	.setsockopt	=	rose_setsockopt,
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index fc6c5e4bffa5..62da13b888e0 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -1028,6 +1028,7 @@ static const struct proto_ops inet6_seqpacket_ops = {
 	.getname	   = sctp_getname,
 	.poll		   = sctp_poll,
 	.ioctl		   = inet6_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sctp_inet_listen,
 	.shutdown	   = inet_shutdown,
 	.setsockopt	   = sock_common_setsockopt,
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index e948db29ab53..b640eeedc8b4 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1026,6 +1026,7 @@ static const struct proto_ops inet_seqpacket_ops = {
 	.getname	   = inet_getname,	/* Semantics are different.  */
 	.poll		   = sctp_poll,
 	.ioctl		   = inet_ioctl,
+	.gettstamp	   = sock_gettstamp,
 	.listen		   = sctp_inet_listen,
 	.shutdown	   = inet_shutdown,	/* Looks harmless.  */
 	.setsockopt	   = sock_common_setsockopt, /* IP_SOL IP_OPTION is a problem */
diff --git a/net/socket.c b/net/socket.c
index b9d71b503720..6814e8dc8af1 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -1069,6 +1069,15 @@ static long sock_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 
 			err = open_related_ns(&net->ns, get_net_ns);
 			break;
+		case SIOCGSTAMP:
+		case SIOCGSTAMPNS:
+			if (!sock->ops->gettstamp) {
+				err = -ENOIOCTLCMD;
+				break;
+			}
+			err = sock->ops->gettstamp(sock, argp,
+						   cmd == SIOCGSTAMP, false);
+			break;
 		default:
 			err = sock_do_ioctl(net, sock, cmd, arg);
 			break;
@@ -2740,38 +2749,6 @@ void socket_seq_show(struct seq_file *seq)
 #endif				/* CONFIG_PROC_FS */
 
 #ifdef CONFIG_COMPAT
-static int do_siocgstamp(struct net *net, struct socket *sock,
-			 unsigned int cmd, void __user *up)
-{
-	mm_segment_t old_fs = get_fs();
-	struct timeval ktv;
-	int err;
-
-	set_fs(KERNEL_DS);
-	err = sock_do_ioctl(net, sock, cmd, (unsigned long)&ktv);
-	set_fs(old_fs);
-	if (!err)
-		err = compat_put_timeval(&ktv, up);
-
-	return err;
-}
-
-static int do_siocgstampns(struct net *net, struct socket *sock,
-			   unsigned int cmd, void __user *up)
-{
-	mm_segment_t old_fs = get_fs();
-	struct timespec kts;
-	int err;
-
-	set_fs(KERNEL_DS);
-	err = sock_do_ioctl(net, sock, cmd, (unsigned long)&kts);
-	set_fs(old_fs);
-	if (!err)
-		err = compat_put_timespec(&kts, up);
-
-	return err;
-}
-
 static int compat_dev_ifconf(struct net *net, struct compat_ifconf __user *uifc32)
 {
 	struct compat_ifconf ifc32;
@@ -3119,9 +3096,12 @@ static int compat_sock_ioctl_trans(struct file *file, struct socket *sock,
 	case SIOCDELRT:
 		return routing_ioctl(net, sock, cmd, argp);
 	case SIOCGSTAMP:
-		return do_siocgstamp(net, sock, cmd, argp);
 	case SIOCGSTAMPNS:
-		return do_siocgstampns(net, sock, cmd, argp);
+		if (!sock->ops->gettstamp)
+			return -ENOIOCTLCMD;
+		return sock->ops->gettstamp(sock, argp, cmd == SIOCGSTAMP,
+					    !COMPAT_USE_64BIT_TIME);
+
 	case SIOCBONDSLAVEINFOQUERY:
 	case SIOCBONDINFOQUERY:
 	case SIOCSHWTSTAMP:
diff --git a/net/x25/af_x25.c b/net/x25/af_x25.c
index d49aa79b7997..fe2673d17009 100644
--- a/net/x25/af_x25.c
+++ b/net/x25/af_x25.c
@@ -1388,18 +1388,6 @@ static int x25_ioctl(struct socket *sock, unsigned int cmd, unsigned long arg)
 		break;
 	}
 
-	case SIOCGSTAMP:
-		rc = -EINVAL;
-		if (sk)
-			rc = sock_get_timestamp(sk,
-						(struct timeval __user *)argp);
-		break;
-	case SIOCGSTAMPNS:
-		rc = -EINVAL;
-		if (sk)
-			rc = sock_get_timestampns(sk,
-					(struct timespec __user *)argp);
-		break;
 	case SIOCGIFADDR:
 	case SIOCSIFADDR:
 	case SIOCGIFDSTADDR:
@@ -1671,8 +1659,6 @@ static int compat_x25_ioctl(struct socket *sock, unsigned int cmd,
 				unsigned long arg)
 {
 	void __user *argp = compat_ptr(arg);
-	struct sock *sk = sock->sk;
-
 	int rc = -ENOIOCTLCMD;
 
 	switch(cmd) {
@@ -1680,18 +1666,6 @@ static int compat_x25_ioctl(struct socket *sock, unsigned int cmd,
 	case TIOCINQ:
 		rc = x25_ioctl(sock, cmd, (unsigned long)argp);
 		break;
-	case SIOCGSTAMP:
-		rc = -EINVAL;
-		if (sk)
-			rc = compat_sock_get_timestamp(sk,
-					(struct timeval __user*)argp);
-		break;
-	case SIOCGSTAMPNS:
-		rc = -EINVAL;
-		if (sk)
-			rc = compat_sock_get_timestampns(sk,
-					(struct timespec __user*)argp);
-		break;
 	case SIOCGIFADDR:
 	case SIOCSIFADDR:
 	case SIOCGIFDSTADDR:
@@ -1755,6 +1729,7 @@ static const struct proto_ops x25_proto_ops = {
 #ifdef CONFIG_COMPAT
 	.compat_ioctl = compat_x25_ioctl,
 #endif
+	.gettstamp =	sock_gettstamp,
 	.listen =	x25_listen,
 	.shutdown =	sock_no_shutdown,
 	.setsockopt =	x25_setsockopt,
-- 
2.18.0

* [PATCH net-next 2/3] asm-generic: generalize asm/sockios.h
From: Arnd Bergmann @ 2018-08-29 12:59 UTC (permalink / raw)
  To: netdev, David S . Miller
  Cc: linux-arch, y2038, Arnd Bergmann, Tony Luck, Fenghua Yu,
	James E.J. Bottomley, Helge Deller, Thomas Gleixner, x86, Al Viro,
	linux-ia64, linux-kernel, linux-parisc, sparclinux
In-Reply-To: <20180829130308.3504560-1-arnd@arndb.de>

ia64, parisc and sparc just use a copy of the generic version
of asm/sockios.h, and x86 is a redirect to the same file, so we
can just let the header file be generated.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/ia64/include/uapi/asm/Kbuild      |  1 +
 arch/ia64/include/uapi/asm/sockios.h   | 21 ---------------------
 arch/parisc/include/uapi/asm/Kbuild    |  1 +
 arch/parisc/include/uapi/asm/sockios.h | 14 --------------
 arch/sparc/include/uapi/asm/Kbuild     |  1 +
 arch/sparc/include/uapi/asm/sockios.h  | 15 ---------------
 arch/x86/include/uapi/asm/Kbuild       |  1 +
 arch/x86/include/uapi/asm/sockios.h    |  1 -
 8 files changed, 4 insertions(+), 51 deletions(-)
 delete mode 100644 arch/ia64/include/uapi/asm/sockios.h
 delete mode 100644 arch/parisc/include/uapi/asm/sockios.h
 delete mode 100644 arch/sparc/include/uapi/asm/sockios.h
 delete mode 100644 arch/x86/include/uapi/asm/sockios.h

diff --git a/arch/ia64/include/uapi/asm/Kbuild b/arch/ia64/include/uapi/asm/Kbuild
index 3982e673e967..a6377ad3ba1c 100644
--- a/arch/ia64/include/uapi/asm/Kbuild
+++ b/arch/ia64/include/uapi/asm/Kbuild
@@ -8,3 +8,4 @@ generic-y += msgbuf.h
 generic-y += poll.h
 generic-y += sembuf.h
 generic-y += shmbuf.h
+generic-y += sockios.h
diff --git a/arch/ia64/include/uapi/asm/sockios.h b/arch/ia64/include/uapi/asm/sockios.h
deleted file mode 100644
index f27a12f95d20..000000000000
--- a/arch/ia64/include/uapi/asm/sockios.h
+++ /dev/null
@@ -1,21 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-#ifndef _ASM_IA64_SOCKIOS_H
-#define _ASM_IA64_SOCKIOS_H
-
-/*
- * Socket-level I/O control calls.
- *
- * Based on <asm-i386/sockios.h>.
- *
- * Modified 1998, 1999
- *	David Mosberger-Tang <davidm@hpl.hp.com>, Hewlett-Packard Co
- */
-#define FIOSETOWN 	0x8901
-#define SIOCSPGRP	0x8902
-#define FIOGETOWN	0x8903
-#define SIOCGPGRP	0x8904
-#define SIOCATMARK	0x8905
-#define SIOCGSTAMP	0x8906		/* Get stamp (timeval) */
-#define SIOCGSTAMPNS	0x8907		/* Get stamp (timespec) */
-
-#endif /* _ASM_IA64_SOCKIOS_H */
diff --git a/arch/parisc/include/uapi/asm/Kbuild b/arch/parisc/include/uapi/asm/Kbuild
index 286ef5a5904b..be6c171f57f7 100644
--- a/arch/parisc/include/uapi/asm/Kbuild
+++ b/arch/parisc/include/uapi/asm/Kbuild
@@ -7,3 +7,4 @@ generic-y += kvm_para.h
 generic-y += param.h
 generic-y += poll.h
 generic-y += resource.h
+generic-y += sockios.h
diff --git a/arch/parisc/include/uapi/asm/sockios.h b/arch/parisc/include/uapi/asm/sockios.h
deleted file mode 100644
index 66a3ba64d53f..000000000000
--- a/arch/parisc/include/uapi/asm/sockios.h
+++ /dev/null
@@ -1,14 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-#ifndef __ARCH_PARISC_SOCKIOS__
-#define __ARCH_PARISC_SOCKIOS__
-
-/* Socket-level I/O control calls. */
-#define FIOSETOWN 	0x8901
-#define SIOCSPGRP	0x8902
-#define FIOGETOWN	0x8903
-#define SIOCGPGRP	0x8904
-#define SIOCATMARK	0x8905
-#define SIOCGSTAMP	0x8906		/* Get stamp (timeval) */
-#define SIOCGSTAMPNS	0x8907		/* Get stamp (timespec) */
-
-#endif
diff --git a/arch/sparc/include/uapi/asm/Kbuild b/arch/sparc/include/uapi/asm/Kbuild
index 4680ba246b55..8fdae51d0eae 100644
--- a/arch/sparc/include/uapi/asm/Kbuild
+++ b/arch/sparc/include/uapi/asm/Kbuild
@@ -2,4 +2,5 @@
 include include/uapi/asm-generic/Kbuild.asm
 
 generic-y += bpf_perf_event.h
+generic-y += sockios.h
 generic-y += types.h
diff --git a/arch/sparc/include/uapi/asm/sockios.h b/arch/sparc/include/uapi/asm/sockios.h
deleted file mode 100644
index 18a3ec14a847..000000000000
--- a/arch/sparc/include/uapi/asm/sockios.h
+++ /dev/null
@@ -1,15 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-#ifndef _ASM_SPARC_SOCKIOS_H
-#define _ASM_SPARC_SOCKIOS_H
-
-/* Socket-level I/O control calls. */
-#define FIOSETOWN 	0x8901
-#define SIOCSPGRP	0x8902
-#define FIOGETOWN	0x8903
-#define SIOCGPGRP	0x8904
-#define SIOCATMARK	0x8905
-#define SIOCGSTAMP	0x8906		/* Get stamp (timeval) */
-#define SIOCGSTAMPNS	0x8907		/* Get stamp (timespec) */
-
-#endif /* !(_ASM_SPARC_SOCKIOS_H) */
-
diff --git a/arch/x86/include/uapi/asm/Kbuild b/arch/x86/include/uapi/asm/Kbuild
index 322681622d1e..1d489e6b237e 100644
--- a/arch/x86/include/uapi/asm/Kbuild
+++ b/arch/x86/include/uapi/asm/Kbuild
@@ -6,3 +6,4 @@ generated-y += unistd_32.h
 generated-y += unistd_64.h
 generated-y += unistd_x32.h
 generic-y += poll.h
+generic-y += sockios.h
diff --git a/arch/x86/include/uapi/asm/sockios.h b/arch/x86/include/uapi/asm/sockios.h
deleted file mode 100644
index def6d4746ee7..000000000000
--- a/arch/x86/include/uapi/asm/sockios.h
+++ /dev/null
@@ -1 +0,0 @@
-#include <asm-generic/sockios.h>
-- 
2.18.0

* Re: [PATCH 4/5] net: mvneta: enable NETIF_F_RXCSUM by default
From: Andrew Lunn @ 2018-08-29 13:08 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: thomas.petazzoni, David S. Miller, netdev, linux-kernel,
	Gregory CLEMENT, linux-arm-kernel
In-Reply-To: <20180829162932.6015e89d@xhacker.debian>

On Wed, Aug 29, 2018 at 04:29:32PM +0800, Jisheng Zhang wrote:
> The code and HW supports NETIF_F_RXCSUM, so let's enable it by default.

Hi Jisheng

I've never studied what all these different flags mean. Does
NETIF_F_RXCSUM mean Ethernet FCS? Or does it also include IPv4, IPv6,
UDP, TCP... checksums?

I've seen network interfaces get checksum'ing wrong when used with an
Ethernet switch with DSA. The extra header DSA uses means the hardware
cannot parse the packet correctly, and so cannot find these headers.

If this is just for FCS, then it is not a problem.

   Thanks
	Andrew
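
To illustrate the parsing concern above, here is a minimal user-space sketch, not driver code. It assumes a hypothetical 4-byte switch tag placed between the source MAC address and the EtherType; real DSA tag formats differ per switch vendor, and the tag bytes and helper names below are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN      6
#define ETHERTYPE_IP  0x0800
#define TAG_LEN       4	/* hypothetical switch tag length */

/* Read the 16-bit big-endian field where an untagged frame keeps its EtherType. */
static uint16_t naive_ethertype(const uint8_t *frame)
{
	return (uint16_t)((frame[2 * ETH_ALEN] << 8) | frame[2 * ETH_ALEN + 1]);
}

/* Same read, but skipping tag_len bytes that follow the two MAC addresses. */
static uint16_t tag_aware_ethertype(const uint8_t *frame, size_t tag_len)
{
	const uint8_t *p = frame + 2 * ETH_ALEN + tag_len;

	return (uint16_t)((p[0] << 8) | p[1]);
}

/* Build dst MAC, src MAC, optional tag, then EtherType 0x0800; returns the header length. */
static size_t build_header(uint8_t *buf, int tagged)
{
	/* Made-up tag bytes, for illustration only. */
	static const uint8_t tag[TAG_LEN] = { 0xc0, 0x08, 0x00, 0x01 };
	size_t off = 0;

	memset(buf + off, 0x11, ETH_ALEN);
	off += ETH_ALEN;
	memset(buf + off, 0x22, ETH_ALEN);
	off += ETH_ALEN;
	if (tagged) {
		memcpy(buf + off, tag, TAG_LEN);
		off += TAG_LEN;
	}
	buf[off++] = ETHERTYPE_IP >> 8;
	buf[off++] = ETHERTYPE_IP & 0xff;
	return off;
}
```

With an untagged header, naive_ethertype() returns 0x0800. With the tag present it returns the first two tag bytes instead, so a parser that is not tag-aware never finds the IPv4 header, and only tag_aware_ethertype(buf, TAG_LEN) recovers 0x0800.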

* Re: [PATCH 0/5] net: mvneta: some bug fix and trivial improvement
From: Andrew Lunn @ 2018-08-29 13:12 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: thomas.petazzoni, David S. Miller, netdev, linux-kernel,
	Gregory CLEMENT, linux-arm-kernel
In-Reply-To: <20180829162456.2bd69796@xhacker.debian>

Hi Jisheng

Please separate fixes from new features.

Fixes should be based on DaveM's net branch, and use the subject line
[PATCH net]...

New features should be based on DaveM's net-next branch, and use the
subject line [PATCH net-next]...

	Thanks
		Andrew

* Re: [PATCH net-next v2 2/2] dpaa2-eth: Move DPAA2 Ethernet driver from staging to drivers/net
From: Andrew Lunn @ 2018-08-29 13:21 UTC (permalink / raw)
  To: Joe Perches
  Cc: Ioana Radulescu, netdev, davem, gregkh, devel, linux-kernel,
	ioana.ciornei, laurentiu.tudor, madalin.bucur, horia.geanta
In-Reply-To: <7dbd5426426315977a8dc7e1745d3addce3d16b5.camel@perches.com>

On Wed, Aug 29, 2018 at 03:50:02AM -0700, Joe Perches wrote:
> On Wed, 2018-08-29 at 04:42 -0500, Ioana Radulescu wrote:
> > The DPAA2 Ethernet driver supports Freescale/NXP SoCs with DPAA2
> > (DataPath Acceleration Architecture v2). The driver manages
> > network objects discovered on the fsl-mc bus.
> 
> Please use git 'format-patch -M' to make the diff
> smaller and more readable.

Hi Joe

I asked for this.

This is a request to move the driver from staging into the main
tree. We want to review the code in order to see if it has reached
mainline quality.

If the patch just lists a rename, not the actual code, I cannot review
it, so I will just NACK it. We need to see the code.

Once the code has been reviewed and has all the needed Acked-by:, then
-M could be used. But this driver is not that far yet.

Thanks

    Andrew

* Re: [PATCH 0/2] net/sched: Add hardware specific counters to TC actions
From: Eelco Chaudron @ 2018-08-29  9:43 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, netdev, jhs, xiyou.wangcong, jiri, simon.horman,
	Marcelo Ricardo Leitner, louis.peens
In-Reply-To: <20180823201446.3802e84b@cakuba.netronome.com>



On 23 Aug 2018, at 20:14, Jakub Kicinski wrote:

> On Mon, 20 Aug 2018 16:03:40 +0200, Eelco Chaudron wrote:
>> On 17 Aug 2018, at 13:27, Jakub Kicinski wrote:
>>> On Thu, 16 Aug 2018 14:02:44 +0200, Eelco Chaudron wrote:
>>>> On 11 Aug 2018, at 21:06, David Miller wrote:
>>>>
>>>>> From: Jakub Kicinski <jakub.kicinski@netronome.com>
>>>>> Date: Thu, 9 Aug 2018 20:26:08 -0700
>>>>>
>>>>>> It is not immediately clear why this is needed.  The memory and
>>>>>> updating two sets of counters won't come for free, so perhaps a
>>>>>> stronger justification than troubleshooting is due? :S
>>>>>>
>>>>>> Netdev has counters for fallback vs forwarded traffic, so you'd know
>>>>>> that traffic hits the SW datapath, plus the rules which are in_hw will
>>>>>> most likely not match as of today for flower (assuming correctness).
>>>>
>>>> I strongly believe that these counters are a requirement for a mixed
>>>> software/hardware (flow) based forwarding environment. The global
>>>> counters will not help much here as you might have chosen to have
>>>> certain traffic forwarded by software.
>>>>
>>>> These counters are probably the only option you have to figure out why
>>>> forwarding is not as fast as expected, and you want to blame the TC
>>>> offload NIC.
>>>
>>> The suggested debugging flow would be:
>>>  (1) check that the global counters for fallback are incrementing;
>>>  (2) find a flow with high stats but no in_hw flag set.
>>>
>>> The in_hw indication should be sufficient in most cases (unless there
>>> are shared blocks between netdevs of different ASICs...).
>>
>> I guess the aim is to find misbehaving hardware, i.e. having the in_hw
>> flag set, but flows still coming to the kernel.
>
> For misbehaving hardware in_hw will not work indeed.  Whether we need
> these extra always-on stats for such use case could be debated :)
>
>>>>>> I'm slightly concerned about potential performance impact, would you
>>>>>> be able to share some numbers for non-trivial number of flows (100k
>>>>>> active?)?
>>>>>
>>>>> Agreed, features used for diagnostics cannot have a harmful penalty
>>>>> for fast path performance.
>>>>
>>>> Fast path performance is not affected as these counters are not
>>>> incremented there. They are only incremented by the nic driver when they
>>>> gather their statistics from hardware.
>>>
>>> Not by much, you are adding state to performance-critical structures,
>>> though, for what is effectively debugging purposes.
>>>
>>> I was mostly talking about the HW offload stat updates (sorry for not
>>> being clear).
>>>
>>> We can have some hundreds of thousands active offloaded flows, each of
>>> them can have multiple actions, and stats have to be updated multiple
>>> times per second and dumped probably around once a second, too.  On a
>>> busy system the stats will get evicted from cache between each round.
>>>
>>> But I'm speculating, let's see if I can get some numbers on it (if you
>>> could get some too, that would be great!).
>>
>> I’ll try to measure some of this later this week/early next week.
>
> I asked Louis to run some tests while I'm travelling, and he reports
> that my worry about reporting the extra stats was unfounded.  Update
> function does not show up in traces at all.  It seems under stress
> (generated with stress-ng) the thread dumping the stats in userspace
> (in OvS it would be the revalidator) actually consumes less CPU in
> __gnet_stats_copy_basic (0.4% less for ~2.0% total).
>
> Would this match with your results?  I'm not sure why dumping would be
> faster with your change..

Tested with OVS and https://github.com/chaudron/ovs_perf using 300K TC 
rules installed in HW.

For __gnet_stats_copy_basic() being faster I have (had) a theory: this
function is now called twice, and I assumed the first call would pull the
memory into cache so that the second call would be faster.

Sampling a lot of perf data, I get an average of 1115ns with the base
kernel and 954ns with the fix applied, so about 14% faster.
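
For reference, the percentage follows from the two averages quoted above. A minimal sketch of the arithmetic, where the helper name is made up for illustration:

```c
/* Relative improvement of `patched` over `base`, in percent. */
static double improvement_pct(double base, double patched)
{
	return (base - patched) / base * 100.0;
}
```

Calling improvement_pct(1115.0, 954.0) gives a value between 14.4 and 14.5, which matches the rounded figure above.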

I also profiled tcf_action_copy_stats(), as it is the place that updates
the additional counter. But even in this case, I see better performance
with the patch applied.

On average 13581ns with the fix, vs base kernel at 1391ns, so about
2.3%.

I guess the changes to the tc_action structure resulted in better cache
alignment.


>>>> However, the flow creation is affected, as this is where the extra
>>>> memory gets allocated. I had done some 40K flow tests before and did not
>>>> see any noticeable change in flow insertion performance. As requested by
>>>> Jakub I did it again for 100K (and threw a Netronome blade in the mix
>>>> ;). I used Marcelo’s test tool,
>>>> https://github.com/marceloleitner/perf-flower.git.
>>>>
>>>> Here are the numbers (time in seconds) for 10 runs in sorted order:
>>>>
>>>> +-------------+----------------+
>>>> | Base_kernel | Change_applied |
>>>> +-------------+----------------+
>>>> |    5.684019 |       5.656388 |
>>>> |    5.699658 |       5.674974 |
>>>> |    5.725220 |       5.722107 |
>>>> |    5.739285 |       5.839855 |
>>>> |    5.748088 |       5.865238 |
>>>> |    5.766231 |       5.873913 |
>>>> |    5.842264 |       5.909259 |
>>>> |    5.902202 |       5.912685 |
>>>> |    5.905391 |       5.947138 |
>>>> |    6.032997 |       5.997779 |
>>>> +-------------+----------------+
>>>>
>>>> I guess the deviation is in the userspace part, which is where in real
>>>> life flows get added anyway.
>>>>
>>>> Let me know if more is unclear.
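
One way to summarize the quoted 100K flow insertion table is to compare the per-column mean and median against the run-to-run spread. The sketch below hard-codes the ten timings from the table; the helper names are made up for illustration, and the median helper assumes the input is already sorted, as the table is.

```c
#include <stddef.h>

#define N_RUNS 10

/* Timings in seconds, copied from the table above (already sorted). */
static const double base_kernel[N_RUNS] = {
	5.684019, 5.699658, 5.725220, 5.739285, 5.748088,
	5.766231, 5.842264, 5.902202, 5.905391, 6.032997,
};

static const double change_applied[N_RUNS] = {
	5.656388, 5.674974, 5.722107, 5.839855, 5.865238,
	5.873913, 5.909259, 5.912685, 5.947138, 5.997779,
};

/* Arithmetic mean of n samples. */
static double mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Median of n samples; the array must already be sorted in ascending order. */
static double median_sorted(const double *v, size_t n)
{
	if (n % 2)
		return v[n / 2];
	return (v[n / 2 - 1] + v[n / 2]) / 2.0;
}
```

The means come out at roughly 5.80 s for the base kernel and 5.84 s with the change applied, about 0.6% apart, while the runs within each column are spread over roughly 0.35 s.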

* Re: [PATCH 2/3] IB/ipoib: Stop using dev_id to expose port numbers
From: Sergei Shtylyov @ 2018-08-29  9:44 UTC (permalink / raw)
  To: Arseny Maslennikov, linux-rdma; +Cc: Doug Ledford, Jason Gunthorpe, netdev
In-Reply-To: <20180828210117.6437-3-ar@cs.msu.ru>

Hello!

On 8/29/2018 12:01 AM, Arseny Maslennikov wrote:

> Some InfiniBand network devices have multiple ports on the same PCI
> function. Prior to this the kernel erroneously used the `dev_id' sysfs
> field of those network interfaces to convey the port number to userspace.
> 
> `dev_id' is currently reserved for distinguishing stacked ifaces
> (e.g: VLANs) with the same hardware address as their parent device.
> 
> Similar fixes to net/mlx4_en and many other drivers, which started
> exporting this information through `dev_id' before 3.15, were accepted
> into the kernel 4 years ago.
> See 76a066f2a2a0268b565459c417b59724b5a3197b, commit message:
> `net/mlx4_en: Expose port number through sysfs'.

    See commit 76a066f2a2a0 ("net/mlx4_en: Expose port number through sysfs").

> This commit is separated from the previous one since we may wish to
> preserve backwards compatibility with userspace being already dependent
> on `dev_id' being different.
> 
> Signed-off-by: Arseny Maslennikov <ar@cs.msu.ru>
[...]

MBR, Sergei

* Re: [PATCH 5/5] net: mvneta: reduce smp_processor_id() calling in mvneta_tx_done_gbe
From: Gregory CLEMENT @ 2018-08-29  9:44 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: thomas.petazzoni, David S. Miller, netdev, linux-kernel,
	Andrew Lunn, linux-arm-kernel
In-Reply-To: <20180829163021.70ce99ab@xhacker.debian>

Hi Jisheng,
 
On Wed, Aug 29 2018, Jisheng Zhang <Jisheng.Zhang@synaptics.com> wrote:

> In the loop of mvneta_tx_done_gbe(), we call smp_processor_id() on
> every iteration. Move the call out of the loop to optimize the code a bit.
>
> Before the patch, the loop looks like this (under arm64):
>
>         ldr     x1, [x29,#120]
>         ...
>         ldr     w24, [x1,#36]
>         ...
>         bl      0 <_raw_spin_lock>
>         str     w24, [x27,#132]
>         ...
>
> After the patch, the loop looks like this (under arm64):
>
>         ...
>         bl      0 <_raw_spin_lock>
>         str     w23, [x28,#132]
>         ...
> where w23 is loaded before the loop, so it is already available.
>
> In addition, mvneta_tx_done_gbe() is called from mvneta_poll(),
> which runs in non-preemptible context, so it's safe to call
> smp_processor_id() just once.

This improvement should go to net-next. Besides that, this patch looks nice:

Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>

Thanks,

Gregory


>
> Signed-off-by: Jisheng Zhang <Jisheng.Zhang@synaptics.com>
> ---
>  drivers/net/ethernet/marvell/mvneta.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
> index 7d98f7828a30..62e81e267e13 100644
> --- a/drivers/net/ethernet/marvell/mvneta.c
> +++ b/drivers/net/ethernet/marvell/mvneta.c
> @@ -2507,12 +2507,13 @@ static void mvneta_tx_done_gbe(struct mvneta_port *pp, u32 cause_tx_done)
>  {
>  	struct mvneta_tx_queue *txq;
>  	struct netdev_queue *nq;
> +	int cpu = smp_processor_id();
>  
>  	while (cause_tx_done) {
>  		txq = mvneta_tx_done_policy(pp, cause_tx_done);
>  
>  		nq = netdev_get_tx_queue(pp->dev, txq->id);
> -		__netif_tx_lock(nq, smp_processor_id());
> +		__netif_tx_lock(nq, cpu);
>  
>  		if (txq->count)
>  			mvneta_txq_done(pp, txq);
> -- 
> 2.18.0
>

-- 
Gregory Clement, Bootlin
Embedded Linux and Kernel engineering
http://bootlin.com
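
The change in the quoted patch is a loop-invariant hoist. Below is a minimal user-space sketch of the same idea, not the driver code: fake_processor_id() and fake_tx_lock() are made-up stand-ins for smp_processor_id() and __netif_tx_lock(), and the loop simply walks the set bits of a cause mask.

```c
static int cpu_id_calls;	/* counts calls to the stand-in below */
static int lock_owner = -1;	/* last "CPU" recorded by the stand-in lock */

/* Stand-in for smp_processor_id(); this is not the kernel function. */
static int fake_processor_id(void)
{
	cpu_id_calls++;
	return 0;
}

/* Stand-in for __netif_tx_lock(): only records the owner. */
static void fake_tx_lock(int cpu)
{
	lock_owner = cpu;
}

/* Before: the CPU id is looked up once per pending queue bit. */
static void tx_done_per_iteration(unsigned int cause)
{
	while (cause) {
		fake_tx_lock(fake_processor_id());
		cause &= cause - 1;	/* clear the lowest set bit */
	}
}

/* After: the CPU id is looked up once, before the loop. */
static void tx_done_hoisted(unsigned int cause)
{
	int cpu = fake_processor_id();

	while (cause) {
		fake_tx_lock(cpu);
		cause &= cause - 1;	/* clear the lowest set bit */
	}
}
```

For a mask with three bits set, the first variant calls the stand-in three times and the second calls it once. As the commit message notes, the hoist is only valid because the caller runs in non-preemptible context, so the CPU cannot change while the loop runs.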

* Re: [PATCH] Revert "net: stmmac: Do not keep rearming the coalesce timer in stmmac_xmit"
From: Jose Abreu @ 2018-08-29 13:56 UTC (permalink / raw)
  To: Jerome Brunet, Giuseppe Cavallaro, Alexandre Torgue, Jose Abreu,
	netdev
  Cc: linux-kernel, linux-amlogic, Joao Pinto, Vitor Soares,
	Corentin Labbe, Martin Blumenstingl
In-Reply-To: <d3858ef0-97c7-28b5-db4a-4ac71af52ba5@synopsys.com>

++ Martin

Hi Martin,

I just saw you have the same problem as Jerome. Can you please
share the information I mentioned below?

Thanks and Best Regards,
Jose Miguel Abreu

On 28-08-2018 09:12, Jose Abreu wrote:
> Hi Jerome,
>
> On 24-08-2018 10:04, Jerome Brunet wrote:
>> This reverts commit 4ae0169fd1b3c792b66be58995b7e6b629919ecf.
>>
>> This change in the handling of the coalesce timer is causing a regression
>> on (at least) Amlogic platforms.
>>
>> The network breaks down very quickly (within a few seconds) after starting
>> a download. This can easily be reproduced using iperf3, for example.
>>
>> The problem has been reported on the S805, S905, S912 and A113 SoCs
>> (Realtek and Micrel PHYs) and it is likely impacting all Amlogic
>> platforms using Gbit Ethernet.
>>
>> No problem was seen with the platform using 10/100 only PHYs (GXL internal).
>>
>> Reverting the change brings things back to normal and makes the network
>> usable again until we better understand the problem with the coalesce timer.
>>
>>
> Apologies for the delayed answer but I was on FTO.
>
> I'm not sure what can be causing this but I have some questions
> for you:
>     - What do you mean by "network will break down"? Do you see
> queue timeout?
>     - What do you see in ethtool/ifconfig stats? Can you send me
> the stats before and after the network breaks?
>     - Is your setup multi-queue/channel?
>     - Can you point me to the DT bindings of your setup?
>
> Thanks and Best Regards,
> Jose Miguel Abreu

* [PATCH net 0/2] igmp: fix two incorrect unsolicit report count issues
From: Hangbin Liu @ 2018-08-29 10:06 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Hangbin Liu

As the subject says, this series fixes two minor IGMP unsolicited report
count issues.

Hangbin Liu (2):
  igmp: fix incorrect unsolicit report count when join group
  igmp: fix incorrect unsolicit report count after link down and up

 net/ipv4/igmp.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

-- 
2.5.5

* [PATCH net 1/2] igmp: fix incorrect unsolicit report count when join group
From: Hangbin Liu @ 2018-08-29 10:06 UTC (permalink / raw)
  To: netdev; +Cc: David Miller, Hangbin Liu
In-Reply-To: <1535537171-24533-1-git-send-email-liuhangbin@gmail.com>

We should not restart the timer if im->unsolicit_count is 0 after the
decrement. Otherwise we send one more unsolicited report than intended,
i.e. 3 instead of 2 by default.

Fixes: 1da177e4c3f41 ("Linux-2.6.12-rc2")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
 net/ipv4/igmp.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index cf75f89..deb1f82 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -820,10 +820,9 @@ static void igmp_timer_expire(struct timer_list *t)
 	spin_lock(&im->lock);
 	im->tm_running = 0;
 
-	if (im->unsolicit_count) {
-		im->unsolicit_count--;
+	if (im->unsolicit_count && --im->unsolicit_count)
 		igmp_start_timer(im, unsolicited_report_interval(in_dev));
-	}
+
 	im->reporter = 1;
 	spin_unlock(&im->lock);
 
-- 
2.5.5
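
The off-by-one described in the commit message can be reproduced with a small user-space model of the timer handler. This is a simplified sketch, not the kernel code: it assumes the join arms the timer exactly once, that every timer expiry sends one report, and it ignores locking and real timing. The function name is made up for illustration.

```c
/*
 * Count the reports sent after a join, given the initial unsolicit_count.
 * patched == 0 models the old handler, patched != 0 the handler after the patch.
 */
static int reports_sent(int unsolicit_count, int patched)
{
	int sent = 0;
	int timer_pending = 1;	/* assumption: the join arms the timer once */

	while (timer_pending) {
		timer_pending = 0;	/* the timer fired */

		if (patched) {
			/* Re-arm only if reports remain after the decrement. */
			if (unsolicit_count && --unsolicit_count)
				timer_pending = 1;
		} else {
			/* Old logic: re-arm whenever the count was non-zero. */
			if (unsolicit_count) {
				unsolicit_count--;
				timer_pending = 1;
			}
		}
		sent++;	/* each expiry sends one report */
	}
	return sent;
}
```

With the default count of 2, the old logic yields 3 reports and the patched logic yields 2, which is the behaviour the commit message describes.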
