* RE: [v2, 09/10] dpaa_eth: add support for hardware timestamping
From: Y.b. Lu @ 2018-06-07 9:21 UTC (permalink / raw)
To: Madalin-cristian Bucur, netdev@vger.kernel.org, Richard Cochran,
Rob Herring, Shawn Guo, David S . Miller
Cc: devicetree@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
In-Reply-To: <AM6PR04MB40080865F823C7C57716818FEC640@AM6PR04MB4008.eurprd04.prod.outlook.com>
Hi Madalin,
> -----Original Message-----
> From: Madalin-cristian Bucur
> Sent: Thursday, June 7, 2018 4:24 PM
> To: Y.b. Lu <yangbo.lu@nxp.com>; netdev@vger.kernel.org; Richard Cochran
> <richardcochran@gmail.com>; Rob Herring <robh+dt@kernel.org>; Shawn
> Guo <shawnguo@kernel.org>; David S . Miller <davem@davemloft.net>
> Cc: devicetree@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-arm-kernel@lists.infradead.org; linux-kernel@vger.kernel.org; Y.b. Lu
> <yangbo.lu@nxp.com>
> Subject: RE: [v2, 09/10] dpaa_eth: add support for hardware timestamping
>
> > -----Original Message-----
> > From: Yangbo Lu [mailto:yangbo.lu@nxp.com]
> > Sent: Thursday, June 7, 2018 6:23 AM
> > Subject: [v2, 09/10] dpaa_eth: add support for hardware timestamping
> >
> > This patch is to add hardware timestamping support for dpaa_eth. On
> > Rx, timestamping is enabled for all frames. On Tx, we only instruct
> > the hardware to timestamp the frames marked accordingly by the stack.
> >
> > Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
> > ---
> > Changes for v2:
> > - Removed ifdef for timestamp code.
> > - Minor fixes for code style.
> > ---
> > drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 101
> > ++++++++++++++++++++++-
> > drivers/net/ethernet/freescale/dpaa/dpaa_eth.h | 3 +
> > 2 files changed, 99 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > index fd43f98..bd589ac 100644
> > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c
> > @@ -1168,7 +1168,7 @@ static int dpaa_eth_init_tx_port(struct
> > fman_port *port, struct dpaa_fq *errq,
> > buf_prefix_content.priv_data_size = buf_layout->priv_data_size;
> > buf_prefix_content.pass_prs_result = true;
> > buf_prefix_content.pass_hash_result = true;
> > - buf_prefix_content.pass_time_stamp = false;
> > + buf_prefix_content.pass_time_stamp = true;
> > buf_prefix_content.data_align = DPAA_FD_DATA_ALIGNMENT;
> >
> > params.specific_params.non_rx_params.err_fqid = errq->fqid; @@
> > -1210,7 +1210,7 @@ static int dpaa_eth_init_rx_port(struct fman_port
> > *port, struct dpaa_bp **bps,
> > buf_prefix_content.priv_data_size = buf_layout->priv_data_size;
> > buf_prefix_content.pass_prs_result = true;
> > buf_prefix_content.pass_hash_result = true;
> > - buf_prefix_content.pass_time_stamp = false;
> > + buf_prefix_content.pass_time_stamp = true;
> > buf_prefix_content.data_align = DPAA_FD_DATA_ALIGNMENT;
> >
> > rx_p = &params.specific_params.rx_params;
> > @@ -1592,6 +1592,16 @@ static int dpaa_eth_refill_bpools(struct
> > dpaa_priv
> > *priv)
> > return 0;
> > }
> >
> > +static int dpaa_get_tstamp_ns(struct net_device *net_dev, u64 *ns,
> > + struct fman_port *port, const void *data) {
> > + if (!fman_port_get_tstamp_field(port, data, ns)) {
> > + be64_to_cpus(ns);
>
> Please move this endianness conversion into the fman API.
[Y.b. Lu] Ok. Will move to fman API in next version.
>
> > + return 0;
> > + }
> > + return -EINVAL;
> > +}
> > +
> > /* Cleanup function for outgoing frame descriptors that were built on
> > Tx path,
> > * either contiguous frames or scatter/gather ones.
> > * Skb freeing is not handled here.
> > @@ -1607,14 +1617,29 @@ static int dpaa_eth_refill_bpools(struct
> > dpaa_priv
> > *priv)
> > {
> > const enum dma_data_direction dma_dir = DMA_TO_DEVICE;
> > struct device *dev = priv->net_dev->dev.parent;
> > + struct skb_shared_hwtstamps shhwtstamps;
> > dma_addr_t addr = qm_fd_addr(fd);
> > const struct qm_sg_entry *sgt;
> > struct sk_buff **skbh, *skb;
> > int nr_frags, i;
> > + u64 ns;
> >
> > skbh = (struct sk_buff **)phys_to_virt(addr);
> > skb = *skbh;
> >
> > + if (priv->tx_tstamp && skb_shinfo(skb)->tx_flags &
> > SKBTX_HW_TSTAMP) {
> > + memset(&shhwtstamps, 0, sizeof(shhwtstamps));
> > +
> > + if (!dpaa_get_tstamp_ns(priv->net_dev, &ns,
> > + priv->mac_dev->port[TX],
> > + (void *)skbh)) {
> > + shhwtstamps.hwtstamp = ns_to_ktime(ns);
> > + skb_tstamp_tx(skb, &shhwtstamps);
> > + } else {
> > + dev_warn(dev, "dpaa_get_tstamp_ns failed!\n");
> > + }
> > + }
> > +
> > if (unlikely(qm_fd_get_format(fd) == qm_fd_sg)) {
> > nr_frags = skb_shinfo(skb)->nr_frags;
> > dma_unmap_single(dev, addr, qm_fd_get_offset(fd) + @@ -2086,6
> > +2111,11 @@ static int dpaa_start_xmit(struct sk_buff *skb, struct
> > net_device *net_dev)
> > if (unlikely(err < 0))
> > goto skb_to_fd_failed;
> >
> > + if (priv->tx_tstamp && skb_shinfo(skb)->tx_flags &
> > SKBTX_HW_TSTAMP) {
> > + fd.cmd |= FM_FD_CMD_UPD;
>
> The fd.cmd field is big endian, please use this:
>
> + fd.cmd |= cpu_to_be32(FM_FD_CMD_UPD);
>
[Y.b. Lu] Thanks a lot for pointing out this issue. This fixes the TX timestamping issue on the ARM platform.
So far I have verified both the PowerPC and ARM platforms; the PTP clock driver and timestamping work fine on both.
I will send out a v3 patch set for review.
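To illustrate the endianness point above, here is a minimal userspace sketch, not driver code. `to_be32()` is a hypothetical stand-in for the kernel's `cpu_to_be32()`, and `first_byte()`, `fd_cmd_set_upd()` and `check_upd_layout()` are helper names invented for this example. It shows that converting the constant before OR-ing it in places the UPD bit in the first byte in memory on both little-endian and big-endian hosts, whereas a plain `|= FM_FD_CMD_UPD` would only do so on a big-endian host.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define FM_FD_CMD_UPD 0x20000000u /* Update Prepended Data, value from fman.h */

/* Hypothetical userspace stand-in for cpu_to_be32(): lay the value out
 * most-significant byte first, whatever the host byte order is. */
uint32_t to_be32(uint32_t v)
{
	uint8_t b[4] = {
		(uint8_t)(v >> 24), (uint8_t)(v >> 16),
		(uint8_t)(v >> 8), (uint8_t)v
	};
	uint32_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* First byte in memory of a 32-bit field, i.e. the byte a big-endian
 * consumer treats as the most significant one. */
uint8_t first_byte(uint32_t field)
{
	uint8_t b[4];

	memcpy(b, &field, sizeof(b));
	return b[0];
}

/* Set the UPD bit in a command word that is stored in big-endian layout. */
uint32_t fd_cmd_set_upd(uint32_t cmd_be)
{
	return cmd_be | to_be32(FM_FD_CMD_UPD);
}

/* Sanity check: 0x20 must end up in the first byte on any host. */
void check_upd_layout(void)
{
	assert(first_byte(fd_cmd_set_upd(0)) == 0x20);
}
```

On a little-endian host, OR-ing the unconverted constant would instead set 0x20 in the last byte of the field, which a big-endian reader interprets as a different bit.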
> > + skb_shinfo(skb)->tx_flags |= SKBTX_IN_PROGRESS;
> > + }
> > +
> > if (likely(dpaa_xmit(priv, percpu_stats, queue_mapping, &fd) == 0))
> > return NETDEV_TX_OK;
> >
> > @@ -2227,6 +2257,7 @@ static enum qman_cb_dqrr_result
> > rx_default_dqrr(struct qman_portal *portal,
> > struct qman_fq *fq,
> > const struct qm_dqrr_entry
> > *dq)
> > {
> > + struct skb_shared_hwtstamps *shhwtstamps;
> > struct rtnl_link_stats64 *percpu_stats;
> > struct dpaa_percpu_priv *percpu_priv;
> > const struct qm_fd *fd = &dq->fd;
> > @@ -2240,6 +2271,7 @@ static enum qman_cb_dqrr_result
> > rx_default_dqrr(struct qman_portal *portal,
> > struct sk_buff *skb;
> > int *count_ptr;
> > void *vaddr;
> > + u64 ns;
> >
> > fd_status = be32_to_cpu(fd->status);
> > fd_format = qm_fd_get_format(fd);
> > @@ -2304,6 +2336,18 @@ static enum qman_cb_dqrr_result
> > rx_default_dqrr(struct qman_portal *portal,
> > if (!skb)
> > return qman_cb_dqrr_consume;
> >
> > + if (priv->rx_tstamp) {
> > + shhwtstamps = skb_hwtstamps(skb);
> > + memset(shhwtstamps, 0, sizeof(*shhwtstamps));
> > +
> > + if (!dpaa_get_tstamp_ns(priv->net_dev, &ns,
> > + priv->mac_dev->port[RX],
> > + vaddr))
> > + shhwtstamps->hwtstamp = ns_to_ktime(ns);
> > + else
> > + dev_warn(net_dev->dev.parent,
> > "dpaa_get_tstamp_ns failed!\n");
> > + }
> > +
> > skb->protocol = eth_type_trans(skb, net_dev);
> >
> > if (net_dev->features & NETIF_F_RXHASH && priv->keygen_in_use &&
> @@
> > -2523,11 +2567,58 @@ static int dpaa_eth_stop(struct net_device
> > *net_dev)
> > return err;
> > }
> >
> > +static int dpaa_ts_ioctl(struct net_device *dev, struct ifreq *rq,
> > +int cmd) {
> > + struct dpaa_priv *priv = netdev_priv(dev);
> > + struct hwtstamp_config config;
> > +
> > + if (copy_from_user(&config, rq->ifr_data, sizeof(config)))
> > + return -EFAULT;
> > +
> > + switch (config.tx_type) {
> > + case HWTSTAMP_TX_OFF:
> > + /* Couldn't disable rx/tx timestamping separately.
> > + * Do nothing here.
> > + */
> > + priv->tx_tstamp = false;
> > + break;
> > + case HWTSTAMP_TX_ON:
> > + priv->mac_dev->set_tstamp(priv->mac_dev->fman_mac,
> > true);
> > + priv->tx_tstamp = true;
> > + break;
> > + default:
> > + return -ERANGE;
> > + }
> > +
> > + if (config.rx_filter == HWTSTAMP_FILTER_NONE) {
> > + /* Couldn't disable rx/tx timestamping separately.
> > + * Do nothing here.
> > + */
> > + priv->rx_tstamp = false;
> > + } else {
> > + priv->mac_dev->set_tstamp(priv->mac_dev->fman_mac,
> > true);
> > + priv->rx_tstamp = true;
> > + /* TS is set for all frame types, not only those requested */
> > + config.rx_filter = HWTSTAMP_FILTER_ALL;
> > + }
> > +
> > + return copy_to_user(rq->ifr_data, &config, sizeof(config)) ?
> > + -EFAULT : 0;
> > +}
> > +
> > static int dpaa_ioctl(struct net_device *net_dev, struct ifreq *rq,
> > int cmd) {
> > - if (!net_dev->phydev)
> > - return -EINVAL;
> > - return phy_mii_ioctl(net_dev->phydev, rq, cmd);
> > + int ret = -EINVAL;
> > +
> > + if (cmd == SIOCGMIIREG) {
> > + if (net_dev->phydev)
> > + return phy_mii_ioctl(net_dev->phydev, rq, cmd);
> > + }
> > +
> > + if (cmd == SIOCSHWTSTAMP)
> > + return dpaa_ts_ioctl(net_dev, rq, cmd);
> > +
> > + return ret;
> > }
> >
> > static const struct net_device_ops dpaa_ops = { diff --git
> > a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.h
> > b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.h
> > index bd94220..af320f8 100644
> > --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.h
> > +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.h
> > @@ -182,6 +182,9 @@ struct dpaa_priv {
> >
> > struct dpaa_buffer_layout buf_layout[2];
> > u16 rx_headroom;
> > +
> > + bool tx_tstamp; /* Tx timestamping enabled */
> > + bool rx_tstamp; /* Rx timestamping enabled */
> > };
> >
> > /* from dpaa_ethtool.c */
> > --
> > 1.7.1
* [v3, 08/10] fsl/fman: define frame description command UPD
From: Yangbo Lu @ 2018-06-07 9:20 UTC (permalink / raw)
To: netdev, madalin.bucur, Richard Cochran, Rob Herring, Shawn Guo,
David S . Miller
Cc: devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel,
Yangbo Lu
In-Reply-To: <20180607092050.46128-1-yangbo.lu@nxp.com>
Defined frame description command FM_FD_CMD_UPD for
prepended data updating.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
- None.
Changes for v3:
- None.
---
drivers/net/ethernet/freescale/fman/fman.h | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/freescale/fman/fman.h b/drivers/net/ethernet/freescale/fman/fman.h
index bfa02e0..935c317 100644
--- a/drivers/net/ethernet/freescale/fman/fman.h
+++ b/drivers/net/ethernet/freescale/fman/fman.h
@@ -41,6 +41,7 @@
/* Frame queue Context Override */
#define FM_FD_CMD_FCO 0x80000000
#define FM_FD_CMD_RPD 0x40000000 /* Read Prepended Data */
+#define FM_FD_CMD_UPD 0x20000000 /* Update Prepended Data */
#define FM_FD_CMD_DTC 0x10000000 /* Do L4 Checksum */
/* TX-Port: Unsupported Format */
--
1.7.1
* [v3, 10/10] dpaa_eth: add the get_ts_info interface for ethtool
From: Yangbo Lu @ 2018-06-07 9:20 UTC (permalink / raw)
To: netdev, madalin.bucur, Richard Cochran, Rob Herring, Shawn Guo,
David S . Miller
Cc: devicetree, linuxppc-dev, linux-arm-kernel, linux-kernel,
Yangbo Lu
In-Reply-To: <20180607092050.46128-1-yangbo.lu@nxp.com>
Added the get_ts_info interface for ethtool to check
the timestamping capability.
Signed-off-by: Yangbo Lu <yangbo.lu@nxp.com>
---
Changes for v2:
- Removed ifdef for hw timestamp.
Changes for v3:
- None.
---
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 39 ++++++++++++++++++++
1 files changed, 39 insertions(+), 0 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
index 2f933b6..3184c8f 100644
--- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
+++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c
@@ -32,6 +32,9 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/string.h>
+#include <linux/of_platform.h>
+#include <linux/net_tstamp.h>
+#include <linux/fsl/ptp_qoriq.h>
#include "dpaa_eth.h"
#include "mac.h"
@@ -515,6 +518,41 @@ static int dpaa_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
return ret;
}
+static int dpaa_get_ts_info(struct net_device *net_dev,
+ struct ethtool_ts_info *info)
+{
+ struct device *dev = net_dev->dev.parent;
+ struct device_node *mac_node = dev->of_node;
+ struct device_node *fman_node = NULL, *ptp_node = NULL;
+ struct platform_device *ptp_dev = NULL;
+ struct qoriq_ptp *ptp = NULL;
+
+ info->phc_index = -1;
+
+ fman_node = of_get_parent(mac_node);
+ if (fman_node)
+ ptp_node = of_parse_phandle(fman_node, "ptimer-handle", 0);
+
+ if (ptp_node)
+ ptp_dev = of_find_device_by_node(ptp_node);
+
+ if (ptp_dev)
+ ptp = platform_get_drvdata(ptp_dev);
+
+ if (ptp)
+ info->phc_index = ptp->phc_index;
+
+ info->so_timestamping = SOF_TIMESTAMPING_TX_HARDWARE |
+ SOF_TIMESTAMPING_RX_HARDWARE |
+ SOF_TIMESTAMPING_RAW_HARDWARE;
+ info->tx_types = (1 << HWTSTAMP_TX_OFF) |
+ (1 << HWTSTAMP_TX_ON);
+ info->rx_filters = (1 << HWTSTAMP_FILTER_NONE) |
+ (1 << HWTSTAMP_FILTER_ALL);
+
+ return 0;
+}
+
const struct ethtool_ops dpaa_ethtool_ops = {
.get_drvinfo = dpaa_get_drvinfo,
.get_msglevel = dpaa_get_msglevel,
@@ -530,4 +568,5 @@ static int dpaa_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
.set_link_ksettings = dpaa_set_link_ksettings,
.get_rxnfc = dpaa_get_rxnfc,
.set_rxnfc = dpaa_set_rxnfc,
+ .get_ts_info = dpaa_get_ts_info,
};
--
1.7.1
* Re: [PATCH 0/5] can: enable multi-queue for SocketCAN devices
From: Oliver Hartkopp @ 2018-06-07 9:49 UTC (permalink / raw)
To: Jonas Mark (BT-FIR/ENG1), Wolfgang Grandegger, Marc Kleine-Budde
Cc: linux-can@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org, hs@denx.de,
ZHU Yi (BT-FIR/ENG1-Zhu)
In-Reply-To: <e8c47bca2ea64b5ab900f4f1e98bb405@de.bosch.com>
On 06/07/2018 10:06 AM, Jonas Mark (BT-FIR/ENG1) wrote:
> Hi Oliver,
>
>>> The driver suite consists of three separate drivers. The following
>>> diagram illustrates the dependencies in layers.
>>>
>>> /dev/companion SocketCAN User Space
>>> -------------------------------------------------------------------
>>> +----------------+ +---------------+
>>> | companion-char | | companion-can |
>>> +----------------+ +---------------+
>>> +----------------------------------+
>>> | companion-spi |
>>> +----------------------------------+
>>> +----------------------------------+
>>> | standard SPI subsystem |
>>> +----------------------------------+ Linux Kernel
>>> -------------------------------------------------------------------
>>> | | | | | | Hardware
>>> CS-+ | | | | +-BUSY
>>> CLK--+ | | +---REQUEST
>>> MOSI---+ |
>>> MISO-----+
>>>
>>> companion-spi
>>> core.c: handles SPI, sysfs entry and interface to upper layer
>>> protocol-manager.c: handles protocol with the SPI HW
>>> queue-manager.c: handles buffering and packets scheduling
>>>
>>> companion-can
>>> makes use of multi-queue support and allows to use tc to configure
>>> the queuing discipline (e.g. mqprio). Together with the SO_PRIORITY
>>> socket option this allows to specify the FIFO a CAN frame shall be
>>> sent to.
>>>
>>> companion-char
>>> handles messages to other undisclosed functionality beyond CAN.
>
>>> .../devicetree/bindings/spi/bosch,companion.txt | 82 ++
>>> drivers/char/Kconfig | 7 +
>>> drivers/char/Makefile | 2 +
>>> drivers/char/companion-char.c | 367 ++++++
>>> drivers/net/can/Kconfig | 8 +
>>> drivers/net/can/Makefile | 1 +
>>> drivers/net/can/companion-can.c | 694 ++++++++++++
>>
>> Please place the companion driver in
>>
>> drivers/net/can/spi/companion.c
>>
>> It also makes more sense in the Kconfig structure.
>>
>> Probably this naming scheme also makes sense for
>>
>> linux/drivers/char/spi/companion.c
>>
>> then ...
>>
>> If not it should be named at least
>>
>> drivers/char/companion-spi.c
>>
>> or
>>
>> drivers/char/spi-companion.c
>
> We intentionally left out the spi in the driver path / name because
> only the drivers/spi/companion/* driver knows that it is connected
> to SPI. The others (drivers/net/can/companion-can.c and
> drivers/char/companion-char.c) only know the API. This could also be
> supplied by a driver which talks to the Companion via a different
> interface. Actually, we started with a UART connection but switched to
> SPI due to latency issues.
Ok, got it.
> Should we still change it?
At least I would then vote for
drivers/char/companion.c
drivers/net/can/companion.c
instead of
drivers/char/companion-char.c
drivers/net/can/companion-can.c
as you would have companion-users in different driver subsystems that
are already clearly referenced by their path.
The modules themselves should still be named companion-can of course
(as-is right now).
Btw.
+#define DRIVER_NAME "bosch,companion-can"
+static const struct can_bittiming_const companion_can_bittiming_const = {
+ .name = "bosch,companion",
Is there any reason why it's not only "companion-can" or "companion"?
The fact that the driver is provided by Bosch is visible in the source code.
Best regards,
Oliver
>
>>> drivers/net/can/dev.c | 8 +-
>>> drivers/spi/Kconfig | 2 +
>>> drivers/spi/Makefile | 2 +
>>> drivers/spi/companion/Kconfig | 5 +
>>> drivers/spi/companion/Makefile | 2 +
>>> drivers/spi/companion/core.c | 1189 ++++++++++++++++++++
>>> drivers/spi/companion/protocol-manager.c | 1035 +++++++++++++++++
>>> drivers/spi/companion/protocol-manager.h | 348 ++++++
>>> drivers/spi/companion/protocol.h | 273 +++++
>>> drivers/spi/companion/queue-manager.c | 146 +++
>>> drivers/spi/companion/queue-manager.h | 245 ++++
>>> include/linux/can/dev.h | 7 +-
>>> include/linux/companion.h | 258 +++++
>>> 20 files changed, 4677 insertions(+), 4 deletions(-)
>>> create mode 100644
>> Documentation/devicetree/bindings/spi/bosch,companion.txt
>>> create mode 100644 drivers/char/companion-char.c
>>> create mode 100644 drivers/net/can/companion-can.c
>>> create mode 100644 drivers/spi/companion/Kconfig
>>> create mode 100644 drivers/spi/companion/Makefile
>>> create mode 100644 drivers/spi/companion/core.c
>>> create mode 100644 drivers/spi/companion/protocol-manager.c
>>> create mode 100644 drivers/spi/companion/protocol-manager.h
>>> create mode 100644 drivers/spi/companion/protocol.h
>>> create mode 100644 drivers/spi/companion/queue-manager.c
>>> create mode 100644 drivers/spi/companion/queue-manager.h
>>> create mode 100644 include/linux/companion.h
>
> Greetings,
> Mark
>
> Building Technologies, Panel Software Fire (BT-FIR/ENG1)
> Bosch Sicherheitssysteme GmbH | Postfach 11 11 | 85626 Grasbrunn | GERMANY | www.boschsecurity.com
>
> Sitz: Stuttgart, Registergericht: Amtsgericht Stuttgart HRB 23118
> Aufsichtsratsvorsitzender: Stefan Hartung; Geschäftsführung: Gert van Iperen, Andreas Bartz, Thomas Quante, Bernhard Schuster
>
* Re: [RFC v6 4/5] virtio_ring: add event idx support in packed ring
From: Jason Wang @ 2018-06-07 9:50 UTC (permalink / raw)
To: Tiwei Bie, mst, virtualization, linux-kernel, netdev; +Cc: wexu, jfreimann
In-Reply-To: <20180605074046.20709-5-tiwei.bie@intel.com>
On 2018-06-05 15:40, Tiwei Bie wrote:
> static bool virtqueue_enable_cb_delayed_packed(struct virtqueue *_vq)
> {
> struct vring_virtqueue *vq = to_vvq(_vq);
> + u16 bufs, used_idx, wrap_counter;
>
> START_USE(vq);
>
> /* We optimistically turn back on interrupts, then check if there was
> * more to do. */
> + /* Depending on the VIRTIO_RING_F_EVENT_IDX feature, we need to
> + * either clear the flags bit or point the event index at the next
> + * entry. Always update the event index to keep code simple. */
> +
Maybe for packed ring, it's time to treat event index separately to
avoid a virtio_wmb() when event idx is off.
> + /* TODO: tune this threshold */
> + if (vq->next_avail_idx < vq->last_used_idx)
> + bufs = (vq->vring_packed.num + vq->next_avail_idx -
> + vq->last_used_idx) * 3 / 4;
> + else
> + bufs = (vq->next_avail_idx - vq->last_used_idx) * 3 / 4;
vq->next_avail_idx could be equal to vq->last_used_idx when the ring is
full. virtio-net is the only user now and it can guarantee this won't
happen, but since this is a core API we should make sure it works in
all cases.
It looks to me that bufs is just vq->vring_packed.num - vq->num_free?
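The full-ring case can be shown with a small standalone sketch. This is not the virtio_ring code: the function names are invented here, the 3/4 factor is carried over from the quoted hunk, and the second function assumes one descriptor per buffer so that `num - num_free` counts outstanding buffers.

```c
#include <assert.h>
#include <stdint.h>

/* Threshold as computed in the quoted hunk: 3/4 of the distance from
 * last_used_idx to next_avail_idx, allowing for wrap-around. */
uint16_t bufs_from_indices(uint16_t num, uint16_t next_avail_idx,
			   uint16_t last_used_idx)
{
	assert(num > 0 && next_avail_idx < num && last_used_idx < num);

	if (next_avail_idx < last_used_idx)
		return (uint16_t)((num + next_avail_idx - last_used_idx) * 3 / 4);
	return (uint16_t)((next_avail_idx - last_used_idx) * 3 / 4);
}

/* Alternative along the lines suggested in the review: derive the
 * threshold from the number of outstanding descriptors instead.
 * Assumption: the same 3/4 factor is kept. */
uint16_t bufs_from_free_count(uint16_t num, uint16_t num_free)
{
	assert(num_free <= num);

	return (uint16_t)((num - num_free) * 3 / 4);
}
```

With a 256-entry ring that is completely full, both indices are equal, so `bufs_from_indices(256, 10, 10)` yields 0 (the same result as for an empty ring), while `bufs_from_free_count(256, 0)` yields 192. For a partially filled ring the two agree.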
> +
> + wrap_counter = vq->used_wrap_counter;
> +
> + used_idx = vq->last_used_idx + bufs;
> + if (used_idx >= vq->vring_packed.num) {
> + used_idx -= vq->vring_packed.num;
> + wrap_counter ^= 1;
> + }
> +
> + vq->vring_packed.driver->off_wrap = cpu_to_virtio16(_vq->vdev,
> + used_idx | (wrap_counter << 15));
>
> if (vq->event_flags_shadow == VRING_EVENT_F_DISABLE) {
> - vq->event_flags_shadow = VRING_EVENT_F_ENABLE;
> + /* We need to update event offset and event wrap
> + * counter first before updating event flags. */
> + virtio_wmb(vq->weak_barriers);
> + vq->event_flags_shadow = vq->event ? VRING_EVENT_F_DESC :
> + VRING_EVENT_F_ENABLE;
> vq->vring_packed.driver->flags = cpu_to_virtio16(_vq->vdev,
> vq->event_flags_shadow);
> - /* We need to enable interrupts first before re-checking
> - * for more used buffers. */
> - virtio_mb(vq->weak_barriers);
> }
>
> + /* We need to update event suppression structure first
> + * before re-checking for more used buffers. */
> + virtio_mb(vq->weak_barriers);
> +
> if (more_used_packed(vq)) {
> END_USE(vq);
> return false;
I think what we need is to make sure the descriptor at used_idx is used?
Otherwise we may stop and restart qdisc too frequently?
Thanks
> --
* [RFC net-next] ipv4: Don't promote secondaries when flushing addresses
From: Jakub Sitnicki @ 2018-06-07 10:13 UTC (permalink / raw)
To: netdev
Promoting secondary addresses on address removal makes flushing all
addresses from a device with 1000's of them slow. This is because we
cannot take down the secondary addresses when we are removing the
primary one, which would make it faster.
However, the userspace, when performing a flush, will in the end remove
all the addresses regardless of secondary address promotion taking
place. Unfortunately the kernel currently cannot distinguish between a
single address removal and a flush of all addresses.
To help with this case introduce an IFA_F_FLUSH flag that can be used by
userspace to signal that a removal operation is being done because of a
flush. When the flag is set, don't bother with secondary address
promotion as we expect that secondary addresses will be removed soon as
well.
Signed-off-by: Jakub Sitnicki <jkbs@redhat.com>
---
A benchmark involving a flush of 40,000 addresses from a dummy device
shows a x4 speed-up of the 'flush' operation. 'ip' had to be modified to
set the IFA_F_FLUSH flag for RTM_DELADDR requests issued for the
'flush':
# time $IP -stats addr flush dev dum0
Before:
real 0m30.596s
user 0m0.000s
sys 0m30.567s
After:
real 0m7.601s
user 0m0.000s
sys 0m7.569s
It's also worth noting that promote_secondaries sysctl param is enabled by
default since systemd 216 thus making it the new "normal" on some distros.
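The gating described in the commit message can be sketched in isolation as follows. This is a userspace illustration only: `IFA_F_FLUSH` and its value are the ones proposed by this RFC (not an existing mainline flag), and the function names are invented for the example rather than taken from devinet.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define IFA_F_FLUSH 0x1000u /* value proposed in this RFC, not a mainline flag */

/* Mirrors the check added to inet_rtm_deladdr(): a request without an
 * IFA_FLAGS attribute is never treated as part of a flush. */
bool deladdr_is_flush(bool has_ifa_flags, uint32_t ifa_flags)
{
	if (!has_ifa_flags)
		return false;
	return (ifa_flags & IFA_F_FLUSH) != 0;
}

/* Mirrors the do_promote computation in __inet_del_ifa(): promotion
 * happens only if the sysctl enables it and this is not a flush. */
bool should_promote(bool promote_secondaries, bool flush)
{
	return promote_secondaries && !flush;
}

/* Sanity check of the combined behaviour. */
void check_flush_gating(void)
{
	assert(!should_promote(true, deladdr_is_flush(true, IFA_F_FLUSH)));
	assert(should_promote(true, deladdr_is_flush(false, 0)));
}
```

Requests that do not carry the flag keep the existing behaviour, so only a modified userspace tool sees the speed-up.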
include/uapi/linux/if_addr.h | 1 +
net/ipv4/devinet.c | 14 ++++++++++----
2 files changed, 11 insertions(+), 4 deletions(-)
diff --git a/include/uapi/linux/if_addr.h b/include/uapi/linux/if_addr.h
index ebaf5701c9db..19aab9a9cec5 100644
--- a/include/uapi/linux/if_addr.h
+++ b/include/uapi/linux/if_addr.h
@@ -54,6 +54,7 @@ enum {
#define IFA_F_NOPREFIXROUTE 0x200
#define IFA_F_MCAUTOJOIN 0x400
#define IFA_F_STABLE_PRIVACY 0x800
+#define IFA_F_FLUSH 0x1000
struct ifa_cacheinfo {
__u32 ifa_prefered;
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index d7585ab1a77a..1f436e1e5222 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -331,13 +331,14 @@ int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
}
static void __inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap,
- int destroy, struct nlmsghdr *nlh, u32 portid)
+ int destroy, struct nlmsghdr *nlh, u32 portid,
+ bool flush)
{
struct in_ifaddr *promote = NULL;
struct in_ifaddr *ifa, *ifa1 = *ifap;
struct in_ifaddr *last_prim = in_dev->ifa_list;
struct in_ifaddr *prev_prom = NULL;
- int do_promote = IN_DEV_PROMOTE_SECONDARIES(in_dev);
+ int do_promote = IN_DEV_PROMOTE_SECONDARIES(in_dev) && !flush;
ASSERT_RTNL();
@@ -437,7 +438,7 @@ static void __inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap,
static void inet_del_ifa(struct in_device *in_dev, struct in_ifaddr **ifap,
int destroy)
{
- __inet_del_ifa(in_dev, ifap, destroy, NULL, 0);
+ __inet_del_ifa(in_dev, ifap, destroy, NULL, 0, false);
}
static void check_lifetime(struct work_struct *work);
@@ -607,6 +608,7 @@ static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh,
struct in_device *in_dev;
struct ifaddrmsg *ifm;
struct in_ifaddr *ifa, **ifap;
+ bool flush = false;
int err = -EINVAL;
ASSERT_RTNL();
@@ -623,6 +625,9 @@ static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh,
goto errout;
}
+ if (tb[IFA_FLAGS])
+ flush = !!(nla_get_u32(tb[IFA_FLAGS]) & IFA_F_FLUSH);
+
for (ifap = &in_dev->ifa_list; (ifa = *ifap) != NULL;
ifap = &ifa->ifa_next) {
if (tb[IFA_LOCAL] &&
@@ -639,7 +644,8 @@ static int inet_rtm_deladdr(struct sk_buff *skb, struct nlmsghdr *nlh,
if (ipv4_is_multicast(ifa->ifa_address))
ip_mc_config(net->ipv4.mc_autojoin_sk, false, ifa);
- __inet_del_ifa(in_dev, ifap, 1, nlh, NETLINK_CB(skb).portid);
+ __inet_del_ifa(in_dev, ifap, 1, nlh, NETLINK_CB(skb).portid,
+ flush);
return 0;
}
* [PATCH] selftests: bpf: fix urandom_read build issue
From: Anders Roxell @ 2018-06-07 10:57 UTC (permalink / raw)
To: ast, daniel, shuah; +Cc: netdev, linux-kernel, linux-kselftest, Anders Roxell
gcc complains that urandom_read gets built twice.
gcc -o tools/testing/selftests/bpf/urandom_read
-static urandom_read.c -Wl,--build-id
gcc -Wall -O2 -I../../../include/uapi -I../../../lib -I../../../lib/bpf
-I../../../../include/generated -I../../../include urandom_read.c
urandom_read -lcap -lelf -lrt -lpthread -o
tools/testing/selftests/bpf/urandom_read
gcc: fatal error: input file
‘tools/testing/selftests/bpf/urandom_read’ is the
same as output file
compilation terminated.
../lib.mk:110: recipe for target
'tools/testing/selftests/bpf/urandom_read' failed
To fix this, remove the separate urandom_read target so that the
TEST_CUSTOM_PROGS target is used to build it.
Fixes: 81f77fd0deeb ("bpf: add selftest for stackmap with BPF_F_STACK_BUILD_ID")
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
---
tools/testing/selftests/bpf/Makefile | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 607ed8729c06..67285591ffd7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -16,10 +16,8 @@ LDLIBS += -lcap -lelf -lrt -lpthread
TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
all: $(TEST_CUSTOM_PROGS)
-$(TEST_CUSTOM_PROGS): urandom_read
-
-urandom_read: urandom_read.c
- $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
+$(TEST_CUSTOM_PROGS): $(OUTPUT)/%: %.c
+ $(CC) -o $@ -static $< -Wl,--build-id
# Order correspond to 'make run_tests' order
TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
--
2.17.1
* Re: [RFC net-next] ipv4: Don't promote secondaries when flushing addresses
From: Michal Kubecek @ 2018-06-07 11:00 UTC (permalink / raw)
To: netdev; +Cc: Jakub Sitnicki
In-Reply-To: <20180607101301.30439-1-jkbs@redhat.com>
On Thu, Jun 07, 2018 at 12:13:01PM +0200, Jakub Sitnicki wrote:
> Promoting secondary addresses on address removal makes flushing all
> addresses from a device with 1000's of them slow. This is because we
> cannot take down the secondary addresses when we are removing the
> primary one, which would make it faster.
>
> However, the userspace, when performing a flush, will in the end remove
> all the addresses regardless of secondary address promotion taking
> place. Unfortunately the kernel currently cannot distinguish between a
> single address removal and a flush of all addresses.
>
> To help with this case introduce a IFA_F_FLUSH flag that can be used by
> userspace to signal that a removal operation is being done because of a
> flush. When the flag is set, don't bother with secondary address
> promotion as we expect that secondary addresses will be removed soon as
> well.
Unless you intend to use the flag to allow deleting a specific address
with its secondaries (overriding promote_secondaries), maybe it would
be more practical to go even further and delete all addresses on the
interface if IFA_F_FLUSH is set so that userspace could delete all
addresses with one request.
Michal Kubecek
* Re: [PATCH v3] selftests: add headers_install to lib.mk
From: Anders Roxell @ 2018-06-07 11:07 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Masahiro Yamada, Michal Marek, Shuah Khan, Bamvor Zhang, brgl,
Paolo Bonzini, Andrew Morton, Mike Rapoport, aarcange,
linux-kbuild, Linux Kernel Mailing List, linux-kselftest,
Networking, alexei.starovoitov
In-Reply-To: <1a021bf3-cf93-aa12-c5a8-1ea6c7900fbb@iogearbox.net>
On 14 May 2018 at 21:20, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/14/2018 01:58 PM, Anders Roxell wrote:
>> If the kernel headers aren't installed we can't build all the tests.
>> Add a new make target rule 'khdr' in the file lib.mk to generate the
>> kernel headers; it gets included by every test-dir Makefile that
>> includes lib.mk. If the testdir in turn has its own sub-dirs, the
>> top_srcdir needs to be set to the linux-rootdir to be able to generate
>> the kernel headers.
>>
>> Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
>> Reviewed-by: Fathi Boudra <fathi.boudra@linaro.org>
>> ---
>> Makefile | 14 +-------------
>> scripts/subarch.include | 13 +++++++++++++
>> tools/testing/selftests/android/Makefile | 2 +-
>> tools/testing/selftests/android/ion/Makefile | 1 +
>> tools/testing/selftests/bpf/Makefile | 5 ++---
>> tools/testing/selftests/futex/functional/Makefile | 1 +
>> tools/testing/selftests/gpio/Makefile | 7 ++-----
>> tools/testing/selftests/kvm/Makefile | 7 ++-----
>> tools/testing/selftests/lib.mk | 10 ++++++++++
>> tools/testing/selftests/vm/Makefile | 4 ----
>> 10 files changed, 33 insertions(+), 31 deletions(-)
>> create mode 100644 scripts/subarch.include
> [...]
>> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
>> index 438d4f93875b..9741609a0eb1 100644
>> --- a/tools/testing/selftests/bpf/Makefile
>> +++ b/tools/testing/selftests/bpf/Makefile
>> @@ -16,9 +16,8 @@ LDLIBS += -lcap -lelf -lrt -lpthread
>> TEST_CUSTOM_PROGS = $(OUTPUT)/urandom_read
>> all: $(TEST_CUSTOM_PROGS)
>>
>> -$(TEST_CUSTOM_PROGS): urandom_read
>> -
>> -urandom_read: urandom_read.c
>> +$(TEST_CUSTOM_PROGS):| khdr
>> +$(TEST_CUSTOM_PROGS): urandom_read.c
>> $(CC) -o $(TEST_CUSTOM_PROGS) -static $<
>>
>> # Order correspond to 'make run_tests' order
>
> Can you elaborate on the error in BPF you're seeing that would force a
> headers_install for it?
BPF shouldn't be affected, a new revision of the patch does not touch
the bpf/Makefile.
I will send out a patch soon.
Cheers,
Anders
> Some people are running the tools/ infrastructure
> (incl. BPF kselftests) outside of kernel tree where this dependency would
> break their setup. Why BPF bits cannot be fixed otherwise?
>
> Thanks,
> Daniel
* [PATCH v4] selftests: add headers_install to lib.mk
From: Anders Roxell @ 2018-06-07 11:09 UTC (permalink / raw)
To: yamada.masahiro, michal.lkml, shuah, bamv2005, brgl, pbonzini,
akpm, rppt, aarcange
Cc: linux-kbuild, linux-kernel, linux-kselftest, netdev,
Anders Roxell
In-Reply-To: <20180413090351.25662-1-anders.roxell@linaro.org>
If the kernel headers aren't installed we can't build all the tests.
Add a new make target rule 'khdr' in the file lib.mk to generate the
kernel headers; it gets included by every test-dir Makefile that
includes lib.mk. If the testdir in turn has its own sub-dirs, the
top_srcdir needs to be set to the linux-rootdir to be able to generate
the kernel headers.
Signed-off-by: Anders Roxell <anders.roxell@linaro.org>
Reviewed-by: Fathi Boudra <fathi.boudra@linaro.org>
---
Makefile | 14 +-------------
scripts/subarch.include | 13 +++++++++++++
tools/testing/selftests/android/Makefile | 2 +-
tools/testing/selftests/android/ion/Makefile | 2 ++
tools/testing/selftests/futex/functional/Makefile | 1 +
tools/testing/selftests/gpio/Makefile | 7 ++-----
tools/testing/selftests/kvm/Makefile | 7 ++-----
tools/testing/selftests/lib.mk | 12 ++++++++++++
tools/testing/selftests/net/Makefile | 1 +
.../selftests/networking/timestamping/Makefile | 1 +
tools/testing/selftests/vm/Makefile | 4 ----
11 files changed, 36 insertions(+), 28 deletions(-)
create mode 100644 scripts/subarch.include
diff --git a/Makefile b/Makefile
index 6b9aea95ae3a..8050072300fa 100644
--- a/Makefile
+++ b/Makefile
@@ -286,19 +286,7 @@ KERNELRELEASE = $(shell cat include/config/kernel.release 2> /dev/null)
KERNELVERSION = $(VERSION)$(if $(PATCHLEVEL),.$(PATCHLEVEL)$(if $(SUBLEVEL),.$(SUBLEVEL)))$(EXTRAVERSION)
export VERSION PATCHLEVEL SUBLEVEL KERNELRELEASE KERNELVERSION
-# SUBARCH tells the usermode build what the underlying arch is. That is set
-# first, and if a usermode build is happening, the "ARCH=um" on the command
-# line overrides the setting of ARCH below. If a native build is happening,
-# then ARCH is assigned, getting whatever value it gets normally, and
-# SUBARCH is subsequently ignored.
-
-SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
- -e s/sun4u/sparc64/ \
- -e s/arm.*/arm/ -e s/sa110/arm/ \
- -e s/s390x/s390/ -e s/parisc64/parisc/ \
- -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
- -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
- -e s/riscv.*/riscv/)
+include scripts/subarch.include
# Cross compiling and selecting different set of gcc/bin-utils
# ---------------------------------------------------------------------------
diff --git a/scripts/subarch.include b/scripts/subarch.include
new file mode 100644
index 000000000000..650682821126
--- /dev/null
+++ b/scripts/subarch.include
@@ -0,0 +1,13 @@
+# SUBARCH tells the usermode build what the underlying arch is. That is set
+# first, and if a usermode build is happening, the "ARCH=um" on the command
+# line overrides the setting of ARCH below. If a native build is happening,
+# then ARCH is assigned, getting whatever value it gets normally, and
+# SUBARCH is subsequently ignored.
+
+SUBARCH := $(shell uname -m | sed -e s/i.86/x86/ -e s/x86_64/x86/ \
+ -e s/sun4u/sparc64/ \
+ -e s/arm.*/arm/ -e s/sa110/arm/ \
+ -e s/s390x/s390/ -e s/parisc64/parisc/ \
+ -e s/ppc.*/powerpc/ -e s/mips.*/mips/ \
+ -e s/sh[234].*/sh/ -e s/aarch64.*/arm64/ \
+ -e s/riscv.*/riscv/)
diff --git a/tools/testing/selftests/android/Makefile b/tools/testing/selftests/android/Makefile
index 72c25a3cb658..d9a725478375 100644
--- a/tools/testing/selftests/android/Makefile
+++ b/tools/testing/selftests/android/Makefile
@@ -6,7 +6,7 @@ TEST_PROGS := run.sh
include ../lib.mk
-all:
+all: khdr
@for DIR in $(SUBDIRS); do \
BUILD_TARGET=$(OUTPUT)/$$DIR; \
mkdir $$BUILD_TARGET -p; \
diff --git a/tools/testing/selftests/android/ion/Makefile b/tools/testing/selftests/android/ion/Makefile
index e03695287f76..88cfe88e466f 100644
--- a/tools/testing/selftests/android/ion/Makefile
+++ b/tools/testing/selftests/android/ion/Makefile
@@ -10,6 +10,8 @@ $(TEST_GEN_FILES): ipcsocket.c ionutils.c
TEST_PROGS := ion_test.sh
+KSFT_KHDR_INSTALL := 1
+top_srcdir = ../../../../..
include ../../lib.mk
$(OUTPUT)/ionapp_export: ionapp_export.c ipcsocket.c ionutils.c
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index ff8feca49746..ad1eeb14fda7 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -18,6 +18,7 @@ TEST_GEN_FILES := \
TEST_PROGS := run.sh
+top_srcdir = ../../../../..
include ../../lib.mk
$(TEST_GEN_FILES): $(HEADERS)
diff --git a/tools/testing/selftests/gpio/Makefile b/tools/testing/selftests/gpio/Makefile
index 1bbb47565c55..4665cdbf1a8d 100644
--- a/tools/testing/selftests/gpio/Makefile
+++ b/tools/testing/selftests/gpio/Makefile
@@ -21,11 +21,8 @@ endef
CFLAGS += -O2 -g -std=gnu99 -Wall -I../../../../usr/include/
LDLIBS += -lmount -I/usr/include/libmount
-$(BINARIES): ../../../gpio/gpio-utils.o ../../../../usr/include/linux/gpio.h
+$(BINARIES):| khdr
+$(BINARIES): ../../../gpio/gpio-utils.o
../../../gpio/gpio-utils.o:
make ARCH=$(ARCH) CROSS_COMPILE=$(CROSS_COMPILE) -C ../../../gpio
-
-../../../../usr/include/linux/gpio.h:
- make -C ../../../.. headers_install INSTALL_HDR_PATH=$(shell pwd)/../../../../usr/
-
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index d9d00319b07c..bcb69380bbab 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -32,9 +32,6 @@ $(LIBKVM_OBJ): $(OUTPUT)/%.o: %.c
$(OUTPUT)/libkvm.a: $(LIBKVM_OBJ)
$(AR) crs $@ $^
-$(LINUX_HDR_PATH):
- make -C $(top_srcdir) headers_install
-
-all: $(STATIC_LIBS) $(LINUX_HDR_PATH)
+all: $(STATIC_LIBS)
$(TEST_GEN_PROGS): $(STATIC_LIBS)
-$(TEST_GEN_PROGS) $(LIBKVM_OBJ): | $(LINUX_HDR_PATH)
+$(STATIC_LIBS):| khdr
diff --git a/tools/testing/selftests/lib.mk b/tools/testing/selftests/lib.mk
index 17ab36605a8e..0a8e75886224 100644
--- a/tools/testing/selftests/lib.mk
+++ b/tools/testing/selftests/lib.mk
@@ -16,8 +16,20 @@ TEST_GEN_PROGS := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS))
TEST_GEN_PROGS_EXTENDED := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_PROGS_EXTENDED))
TEST_GEN_FILES := $(patsubst %,$(OUTPUT)/%,$(TEST_GEN_FILES))
+top_srcdir ?= ../../../..
+include $(top_srcdir)/scripts/subarch.include
+ARCH ?= $(SUBARCH)
+
all: $(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES)
+.PHONY: khdr
+khdr:
+ make ARCH=$(ARCH) -C $(top_srcdir) headers_install
+
+ifdef KSFT_KHDR_INSTALL
+$(TEST_GEN_PROGS) $(TEST_GEN_PROGS_EXTENDED) $(TEST_GEN_FILES):| khdr
+endif
+
.ONESHELL:
define RUN_TEST_PRINT_RESULT
TEST_HDR_MSG="selftests: "`basename $$PWD`:" $$BASENAME_TEST"; \
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 663e11e85727..d515dabc6b0d 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -15,6 +15,7 @@ TEST_GEN_FILES += udpgso udpgso_bench_tx udpgso_bench_rx
TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict
+KSFT_KHDR_INSTALL := 1
include ../lib.mk
$(OUTPUT)/reuseport_bpf_numa: LDFLAGS += -lnuma
diff --git a/tools/testing/selftests/networking/timestamping/Makefile b/tools/testing/selftests/networking/timestamping/Makefile
index a728040edbe1..14cfcf006936 100644
--- a/tools/testing/selftests/networking/timestamping/Makefile
+++ b/tools/testing/selftests/networking/timestamping/Makefile
@@ -5,6 +5,7 @@ TEST_PROGS := hwtstamp_config rxtimestamp timestamping txtimestamp
all: $(TEST_PROGS)
+top_srcdir = ../../../../..
include ../../lib.mk
clean:
diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile
index fdefa2295ddc..58759454b1d0 100644
--- a/tools/testing/selftests/vm/Makefile
+++ b/tools/testing/selftests/vm/Makefile
@@ -25,10 +25,6 @@ TEST_PROGS := run_vmtests
include ../lib.mk
-$(OUTPUT)/userfaultfd: ../../../../usr/include/linux/kernel.h
$(OUTPUT)/userfaultfd: LDLIBS += -lpthread
$(OUTPUT)/mlock-random-test: LDLIBS += -lcap
-
-../../../../usr/include/linux/kernel.h:
- make -C ../../../.. headers_install
--
2.17.1
* [BUG BISECT] NFSv4 client fails on Flush Journal to Persistent Storage
From: Krzysztof Kozlowski @ 2018-06-07 11:19 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, J. Bruce Fields, Jeff Layton,
David S. Miller, linux-nfs, netdev, linux-kernel,
linux-samsung-soc@vger.kernel.org
[-- Attachment #1: Type: text/plain, Size: 1514 bytes --]
Hi,
When booting my boards under recent linux-next, I see failures of systemd:
[FAILED] Failed to start Flush Journal to Persistent Storage.
See 'systemctl status systemd-journal-flush.service' for details.
Starting Create Volatile Files and Directories...
[** ] A start job is running for Create V… [ 223.209289] nfs:
server 192.168.1.10 not responding, still trying
[ 223.209377] nfs: server 192.168.1.10 not responding, still trying
Effectively, the boards fail to boot. Example is here:
https://krzk.eu/#/builders/1/builds/2157
This was bisected to:
commit 37ac86c3a76c113619b7d9afe0251bbfc04cb80a
Author: Chuck Lever <chuck.lever@oracle.com>
Date: Fri May 4 15:34:53 2018 -0400
SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
alloc_slot is a transport-specific op, but initializing an rpc_rqst
is common to all transports. In addition, the only part of initial-
izing an rpc_rqst that needs serialization is getting a fresh XID.
Move rpc_rqst initialization to common code in preparation for
adding a transport-specific alloc_slot to xprtrdma.
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
Bisect log attached. Full configuration:
1. exynos_defconfig
2. ARMv7, octa-core, Exynos5422 and Exynos4412 (Odroid XU3, U3 and others)
3. NFSv4 client (from Raspberry Pi)
Let me know if you need any more information.
Best regards,
Krzysztof
[-- Attachment #2: log.txt --]
[-- Type: text/plain, Size: 5666 bytes --]
git bisect start
# good: [c64d4419a17cfb39a5b573f9016cd02ade4c9a64] mtd: cfi_cmdset_0002: Change erase one block to enable XIP once
git bisect good c64d4419a17cfb39a5b573f9016cd02ade4c9a64
# good: [56c6855c81c8a6828b5d65aa974cd50f4b67760c] mtd: spi-nor: Add Micron MT25QU02 support
git bisect good 56c6855c81c8a6828b5d65aa974cd50f4b67760c
# bad: [2b70a7bd4673b6dcf2763888d0b172a40dd49434] Merge remote-tracking branch 'block/for-next'
git bisect bad 2b70a7bd4673b6dcf2763888d0b172a40dd49434
# bad: [25350cbeef4a3ec943754c4aa4c8ac1aaa64c7a2] Merge remote-tracking branch 'nand/nand/next'
git bisect bad 25350cbeef4a3ec943754c4aa4c8ac1aaa64c7a2
# good: [311da4975894aab7a4bb94aa83f38f052d7ffda4] Merge branch 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm
git bisect good 311da4975894aab7a4bb94aa83f38f052d7ffda4
# bad: [7d470f95b4ccd3244759524f6cc9a76a2a34a9c6] Merge remote-tracking branch 'printk/for-next'
git bisect bad 7d470f95b4ccd3244759524f6cc9a76a2a34a9c6
# good: [8a3ab2f38f1669e3be6433a1f6b82a077b38c4c7] brcmfmac: trigger memory dump upon firmware halt signal
git bisect good 8a3ab2f38f1669e3be6433a1f6b82a077b38c4c7
# good: [7fa76d777ec53eeece1546b737a3b93b37639575] netdevsim: Add extack error message for devlink reload
git bisect good 7fa76d777ec53eeece1546b737a3b93b37639575
# bad: [1fe087da59fde2cce4cf8e0a1c78b3279fb7ea44] Merge remote-tracking branch 'fbdev/fbdev-for-next'
git bisect bad 1fe087da59fde2cce4cf8e0a1c78b3279fb7ea44
# bad: [9d85b2fee6616994759eb4599c886af510ded175] Merge remote-tracking branch 'pci/next'
git bisect bad 9d85b2fee6616994759eb4599c886af510ded175
# good: [13fbadcd512c225c907d6e8147fb48a88114bf03] Merge branch 'pci/sparc'
git bisect good 13fbadcd512c225c907d6e8147fb48a88114bf03
# good: [741f8e7ecc2c6414cff442ec8eb07dcfe4481533] Merge branch 'lorenzo/pci/hv'
git bisect good 741f8e7ecc2c6414cff442ec8eb07dcfe4481533
# good: [1c2bef0a3fd14287a66edd7ead57fd2e439485a2] Merge branch 'lorenzo/pci/rcar'
git bisect good 1c2bef0a3fd14287a66edd7ead57fd2e439485a2
# good: [e52d38f4abf49f8b63a6ad0ce21e5f495c15897f] Merge branch 'lorenzo/pci/rockchip'
git bisect good e52d38f4abf49f8b63a6ad0ce21e5f495c15897f
# good: [73144d77cb87d60b4bcab6992a62d6787b09dcf0] Merge branch 'lorenzo/pci/vmd'
git bisect good 73144d77cb87d60b4bcab6992a62d6787b09dcf0
# good: [82e1719c4cd65bd7f7847d6c02376cfca3d5e793] PCI: Clean up whitespace in quirks.c
git bisect good 82e1719c4cd65bd7f7847d6c02376cfca3d5e793
# good: [0ecda3a087462eb89c1d9227deea998d8cd014e8] Merge branch 'pci/kconfig'
git bisect good 0ecda3a087462eb89c1d9227deea998d8cd014e8
# good: [488ad6d3678beee65bcd74e6a9764bd7cee9d3d3] Merge branch 'pci/trivial'
git bisect good 488ad6d3678beee65bcd74e6a9764bd7cee9d3d3
# good: [885892fb378dc096693557ba4f2b875188619b36] mlx4_core: restore optimal ICM memory allocation
git bisect good 885892fb378dc096693557ba4f2b875188619b36
# good: [488ad6d3678beee65bcd74e6a9764bd7cee9d3d3] Merge branch 'pci/trivial'
git bisect good 488ad6d3678beee65bcd74e6a9764bd7cee9d3d3
# good: [0ecda3a087462eb89c1d9227deea998d8cd014e8] Merge branch 'pci/kconfig'
git bisect good 0ecda3a087462eb89c1d9227deea998d8cd014e8
# good: [82e1719c4cd65bd7f7847d6c02376cfca3d5e793] PCI: Clean up whitespace in quirks.c
git bisect good 82e1719c4cd65bd7f7847d6c02376cfca3d5e793
# good: [1c2bef0a3fd14287a66edd7ead57fd2e439485a2] Merge branch 'lorenzo/pci/rcar'
git bisect good 1c2bef0a3fd14287a66edd7ead57fd2e439485a2
# good: [87cb5ac9cece9f0f75d0532fa2afcbf871f6b72e] Merge remote-tracking branch 'arm-soc/for-next'
git bisect good 87cb5ac9cece9f0f75d0532fa2afcbf871f6b72e
# good: [64eec192d609531b0c173c5b4885832372fb2a4c] Merge remote-tracking branch 'powerpc/next'
git bisect good 64eec192d609531b0c173c5b4885832372fb2a4c
# good: [dd8c1fd2071de3a02ea60e3fa68be24c1e89945e] Merge remote-tracking branch 'jfs/jfs-next'
git bisect good dd8c1fd2071de3a02ea60e3fa68be24c1e89945e
# bad: [6125a3dab21f1939dae7e836105dea0c9c465db4] Merge remote-tracking branch 'orangefs/for-next'
git bisect bad 6125a3dab21f1939dae7e836105dea0c9c465db4
# good: [3f0b3cf46e0542ac4b4241c579b944b755d11b67] NFS: Filter cache invalidation when holding a delegation
git bisect good 3f0b3cf46e0542ac4b4241c579b944b755d11b67
# good: [28771950c592482ee86cb1c3b661688aec3c0d7d] svcrdma: Fix incorrect return value/type in svc_rdma_post_recvs
git bisect good 28771950c592482ee86cb1c3b661688aec3c0d7d
# bad: [8335640cf89faa0f4e39e73e314f3f3a22d776f3] xprtrdma: Add trace_xprtrdma_dma_map(mr)
git bisect bad 8335640cf89faa0f4e39e73e314f3f3a22d776f3
# bad: [0e0b854cfb3302b1907e9d3a927469b95710238f] xprtrdma: Clean up Receive trace points
git bisect bad 0e0b854cfb3302b1907e9d3a927469b95710238f
# bad: [0e0b854cfb3302b1907e9d3a927469b95710238f] xprtrdma: Clean up Receive trace points
git bisect bad 0e0b854cfb3302b1907e9d3a927469b95710238f
# good: [75bc37fefc4471e718ba8e651aa74673d4e0a9eb] Linux 4.17-rc4
git bisect good 75bc37fefc4471e718ba8e651aa74673d4e0a9eb
# bad: [0e0b854cfb3302b1907e9d3a927469b95710238f] xprtrdma: Clean up Receive trace points
git bisect bad 0e0b854cfb3302b1907e9d3a927469b95710238f
# good: [914fcad9873cbd46e3a4c3c31551b98b15a49079] xprtrdma: Fix max_send_wr computation
git bisect good 914fcad9873cbd46e3a4c3c31551b98b15a49079
# bad: [a9cde23ab7cdf5e4e93432dffd0e734267f2b745] SUNRPC: Add a ->free_slot transport callout
git bisect bad a9cde23ab7cdf5e4e93432dffd0e734267f2b745
# bad: [37ac86c3a76c113619b7d9afe0251bbfc04cb80a] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
git bisect bad 37ac86c3a76c113619b7d9afe0251bbfc04cb80a
# first bad commit: [37ac86c3a76c113619b7d9afe0251bbfc04cb80a] SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
* Re: [BUG BISECT] NFSv4 client fails on Flush Journal to Persistent Storage
From: Krzysztof Kozlowski @ 2018-06-07 11:22 UTC (permalink / raw)
To: Trond Myklebust, Anna Schumaker, J. Bruce Fields, Jeff Layton,
David S. Miller, linux-nfs, netdev, linux-kernel,
linux-samsung-soc@vger.kernel.org
In-Reply-To: <CAJKOXPd2rntNOpU1quR9Zm_J22+=pEaj4ZTC_tdZ0zcRYUciFg@mail.gmail.com>
On Thu, Jun 7, 2018 at 1:19 PM, Krzysztof Kozlowski <krzk@kernel.org> wrote:
> Hi,
>
> When booting my boards under recent linux-next, I see failures of systemd:
>
> [FAILED] Failed to start Flush Journal to Persistent Storage.
> See 'systemctl status systemd-journal-flush.service' for details.
> Starting Create Volatile Files and Directories...
> [** ] A start job is running for Create V… [ 223.209289] nfs:
> server 192.168.1.10 not responding, still trying
> [ 223.209377] nfs: server 192.168.1.10 not responding, still trying
>
> Effectively, the boards fail to boot. Example is here:
> https://krzk.eu/#/builders/1/builds/2157
>
> This was bisected to:
> commit 37ac86c3a76c113619b7d9afe0251bbfc04cb80a
> Author: Chuck Lever <chuck.lever@oracle.com>
> Date: Fri May 4 15:34:53 2018 -0400
>
> SUNRPC: Initialize rpc_rqst outside of xprt->reserve_lock
>
> alloc_slot is a transport-specific op, but initializing an rpc_rqst
> is common to all transports. In addition, the only part of initial-
> izing an rpc_rqst that needs serialization is getting a fresh XID.
>
> Move rpc_rqst initialization to common code in preparation for
> adding a transport-specific alloc_slot to xprtrdma.
>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
> Signed-off-by: Anna Schumaker <Anna.Schumaker@Netapp.com>
>
>
> Bisect log attached. Full configuration:
> 1. exynos_defconfig
> 2. ARMv7, octa-core, Exynos5422 and Exynos4412 (Odroid XU3, U3 and others)
> 3. NFSv4 client (from Raspberry Pi)
>
> Let me know if you need any more information.
Ah, I forgot what is perhaps the most important detail for reproducing
this: the client uses NFS root (NFSv4).
Best regards,
Krzysztof
* Re: [RFC net-next] ipv4: Don't promote secondaries when flushing addresses
From: Jakub Sitnicki @ 2018-06-07 12:17 UTC (permalink / raw)
To: Michal Kubecek; +Cc: netdev
In-Reply-To: <20180607110029.vt6tqwlqmuotwpxf@unicorn.suse.cz>
On Thu, 7 Jun 2018 13:00:29 +0200
Michal Kubecek <mkubecek@suse.cz> wrote:
> On Thu, Jun 07, 2018 at 12:13:01PM +0200, Jakub Sitnicki wrote:
> > Promoting secondary addresses on address removal makes flushing all
> > addresses from a device with 1000's of them slow. This is because we
> > cannot take down the secondary addresses when we are removing the
> > primary one, which would make it faster.
> >
> > However, the userspace, when performing a flush, will in the end remove
> > all the addresses regardless of secondary address promotion taking
> > place. Unfortunately the kernel currently cannot distinguish between a
> > single address removal and a flush of all addresses.
> >
> > To help with this case, introduce an IFA_F_FLUSH flag that can be used by
> > userspace to signal that a removal operation is being done because of a
> > flush. When the flag is set, don't bother with secondary address
> > promotion as we expect that secondary addresses will be removed soon as
> > well.
>
> Unless you intend to use the flag to allow deleting a specific address
> with its secondaries (overriding promote_secondaries), maybe it would
> be more practical to go even further and delete all addresses on the
> interface if IFA_F_FLUSH is set so that userspace could delete all
> addresses with one request.
Thanks for the input, Michal. The intent, as I understand it, is to make
flushing all the addresses fast(er). Let me see if I can rework it
according to your suggestion. Doing it that way makes more sense to me
too.
Thanks,
Jakub
* Re: [RFC net-next] ipv4: Don't promote secondaries when flushing addresses
From: Phil Sutter @ 2018-06-07 12:35 UTC (permalink / raw)
To: Jakub Sitnicki; +Cc: Michal Kubecek, netdev
In-Reply-To: <20180607141750.434f6201@beetle>
Hi Jakub,
On Thu, Jun 07, 2018 at 02:17:50PM +0200, Jakub Sitnicki wrote:
> On Thu, 7 Jun 2018 13:00:29 +0200
> Michal Kubecek <mkubecek@suse.cz> wrote:
>
> > On Thu, Jun 07, 2018 at 12:13:01PM +0200, Jakub Sitnicki wrote:
> > > Promoting secondary addresses on address removal makes flushing all
> > > addresses from a device with 1000's of them slow. This is because we
> > > cannot take down the secondary addresses when we are removing the
> > > primary one, which would make it faster.
> > >
> > > However, the userspace, when performing a flush, will in the end remove
> > > all the addresses regardless of secondary address promotion taking
> > > place. Unfortunately the kernel currently cannot distinguish between a
> > > single address removal and a flush of all addresses.
> > >
> > > To help with this case, introduce an IFA_F_FLUSH flag that can be used by
> > > userspace to signal that a removal operation is being done because of a
> > > flush. When the flag is set, don't bother with secondary address
> > > promotion as we expect that secondary addresses will be removed soon as
> > > well.
> >
> > Unless you intend to use the flag to allow deleting a specific address
> > with its secondaries (overriding promote_secondaries), maybe it would
> > be more practical to go even further and delete all addresses on the
> > interface if IFA_F_FLUSH is set so that userspace could delete all
> > addresses with one request.
>
> Thanks for the input, Michal. The intent, as I understand it, is to make
> flushing all the addresses fast(er). Let me see if I can rework it
> according to your suggestion. Doing it that way makes more sense to me
> too.
Yes, I agree with Michal. IIRC, flushing a specific primary along with
all its secondaries from an interface is not even supported by
iproute2, so no need to optimize for that I guess. OTOH, if your
solution allowed us to get rid of that nasty loop in ipaddr_flush(), I owe
you one extra beer at the next occasion. :)
Thanks for holding on to this old ticket!
Cheers, Phil
* [PATCH] wcn36xx: Remove Unicode Byte Order Mark from testcode
From: Geert Uytterhoeven @ 2018-06-07 12:45 UTC (permalink / raw)
To: Eyal Ilsar, Kalle Valo, David S . Miller
Cc: Arnd Bergmann, wcn36xx, linux-wireless, netdev, linux-kernel,
Geert Uytterhoeven
Older gcc (< 4.4) doesn't like files starting with a Unicode BOM:
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\357’ in program
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\273’ in program
drivers/net/wireless/ath/wcn36xx/testmode.c:1: error: stray ‘\277’ in program
Remove the BOM; the rest of the file is plain ASCII anyway.
Output of "file drivers/net/wireless/ath/wcn36xx/testmode.c" before:
drivers/net/wireless/ath/wcn36xx/testmode.c: C source, UTF-8 Unicode (with BOM) text
and after:
drivers/net/wireless/ath/wcn36xx/testmode.c: C source, ASCII text
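The three stray bytes in the error output (octal 357 273 277) are the
UTF-8 encoding of U+FEFF, i.e. 0xEF 0xBB 0xBF. As an illustration only,
a minimal user-space sketch of how such a prefix could be detected; the
helper names has_utf8_bom() and utf8_bom_skip() are made up for this
example and are not part of the patch or of any kernel API:

```c
#include <stddef.h>
#include <string.h>

/* UTF-8 encoding of U+FEFF: 0xEF 0xBB 0xBF (octal 357 273 277). */
static const unsigned char utf8_bom[3] = { 0xEF, 0xBB, 0xBF };

/* Hypothetical helper: return 1 if buf starts with a UTF-8 BOM, else 0. */
int has_utf8_bom(const unsigned char *buf, size_t len)
{
	return len >= sizeof(utf8_bom) &&
	       memcmp(buf, utf8_bom, sizeof(utf8_bom)) == 0;
}

/* Hypothetical helper: number of leading bytes to skip (3 or 0). */
size_t utf8_bom_skip(const unsigned char *buf, size_t len)
{
	return has_utf8_bom(buf, len) ? sizeof(utf8_bom) : 0;
}
```

A buffer holding the first bytes of a source file can be passed to
has_utf8_bom() to decide whether the first three bytes need stripping.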
Fixes: 87f825e6e246cee0 ("wcn36xx: Add support for Factory Test Mode (FTM)")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
drivers/net/wireless/ath/wcn36xx/testmode.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/wcn36xx/testmode.c b/drivers/net/wireless/ath/wcn36xx/testmode.c
index 1279064a3b716c2e..51a038022c8b8040 100644
--- a/drivers/net/wireless/ath/wcn36xx/testmode.c
+++ b/drivers/net/wireless/ath/wcn36xx/testmode.c
@@ -1,4 +1,4 @@
-/*
+/*
* Copyright (c) 2018, The Linux Foundation. All rights reserved.
*
* Permission to use, copy, modify, and/or distribute this software for any
--
2.7.4
* [PATCH] net: mscc: ocelot: Fix uninitialized error in ocelot_netdevice_event()
From: Geert Uytterhoeven @ 2018-06-07 13:10 UTC (permalink / raw)
To: Alexandre Belloni, David S . Miller, Andrew Lunn
Cc: Arnd Bergmann, netdev, linux-kernel, Geert Uytterhoeven
With gcc-4.1.2:
drivers/net/ethernet/mscc/ocelot.c: In function ‘ocelot_netdevice_event’:
drivers/net/ethernet/mscc/ocelot.c:1129: warning: ‘ret’ may be used uninitialized in this function
If the list iterated over by netdev_for_each_lower_dev() is empty, ret
is never initialized, yet it is still converted into a notifier return
value. Fix this by preinitializing ret to zero.
Fixes: a556c76adc052c97 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
This may be unlikely to happen, but given the notifier is called for any
(also non-ocelot) network device, better be safe than sorry.
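For illustration, a stand-alone sketch of the pattern, not the driver
code itself. struct lower_dev and handle_event() are hypothetical
stand-ins, and the early exit on a non-zero result is an assumption
about the loop body. With an empty list the loop body never runs, so
the returned value is whatever ret was initialized to:

```c
#include <stddef.h>

/* Hypothetical stand-in for a lower device; not the kernel's net_device. */
struct lower_dev {
	int event_result;		/* what a per-device handler would return */
	struct lower_dev *next;
};

/* Walk the list and return the first non-zero handler result. */
int handle_event(const struct lower_dev *head)
{
	const struct lower_dev *d;
	int ret = 0;	/* without "= 0", an empty list leaves ret indeterminate */

	for (d = head; d; d = d->next) {
		ret = d->event_result;
		if (ret)
			break;
	}
	return ret;
}
```

Calling handle_event(NULL) models the empty-list case and returns 0
only because of the initializer.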
---
drivers/net/ethernet/mscc/ocelot.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index c8c74aa548d96e00..fb2c8f8071e64d3b 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1126,7 +1126,7 @@ static int ocelot_netdevice_event(struct notifier_block *unused,
{
struct netdev_notifier_changeupper_info *info = ptr;
struct net_device *dev = netdev_notifier_info_to_dev(ptr);
- int ret;
+ int ret = 0;
if (netif_is_lag_master(dev)) {
struct net_device *slave;
--
2.7.4
* [PATCH] xsk: Fix umem fill/completion queue mmap on 32-bit
From: Geert Uytterhoeven @ 2018-06-07 13:37 UTC (permalink / raw)
To: David S . Miller, Björn Töpel, Magnus Karlsson,
Alexei Starovoitov
Cc: Arnd Bergmann, Andrew Morton, netdev, linux-mm, linux-kernel,
Geert Uytterhoeven
With gcc-4.1.2 on 32-bit:
net/xdp/xsk.c:663: warning: integer constant is too large for ‘long’ type
net/xdp/xsk.c:665: warning: integer constant is too large for ‘long’ type
Add the missing "ULL" suffixes to the large XDP_UMEM_PGOFF_*_RING values
to fix this.
net/xdp/xsk.c:663: warning: comparison is always false due to limited range of data type
net/xdp/xsk.c:665: warning: comparison is always false due to limited range of data type
"unsigned long" is 32-bit on 32-bit systems, hence the offset is
truncated, and can never be equal to any of the XDP_UMEM_PGOFF_*_RING
values. Use loff_t (and the required cast) to fix this.
Fixes: 423f38329d267969 ("xsk: add umem fill queue support and mmap")
Fixes: fe2308328cd2f26e ("xsk: add umem completion queue support and mmap")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
---
Compile-tested only.
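The truncation can be reproduced in user space. A minimal sketch,
assuming 4 KiB pages (PAGE_SHIFT of 12) and using uint32_t to model a
32-bit "unsigned long"; the function and macro names are invented for
this example:

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12		/* assumption: 4 KiB pages */
#define SKETCH_FILL_RING_OFF	0x100000000ULL	/* same value as XDP_UMEM_PGOFF_FILL_RING */

/* Models "unsigned long offset = pgoff << PAGE_SHIFT" on a 32-bit system:
 * the shift is done in 32 bits, so bits 32 and up are lost. */
uint32_t offset_32bit(uint32_t pgoff)
{
	return pgoff << SKETCH_PAGE_SHIFT;
}

/* Models "loff_t offset = (loff_t)pgoff << PAGE_SHIFT":
 * widening before the shift keeps the high bits. */
int64_t offset_64bit(uint32_t pgoff)
{
	return (int64_t)pgoff << SKETCH_PAGE_SHIFT;
}
```

With pgoff = 0x100000 (the fill ring offset shifted right by 12), the
32-bit variant yields 0 and can never match the ring offset, while the
widened variant yields 0x100000000.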
---
include/uapi/linux/if_xdp.h | 4 ++--
net/xdp/xsk.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h
index 1fa0e977ea8d0224..caed8b1614ffc0aa 100644
--- a/include/uapi/linux/if_xdp.h
+++ b/include/uapi/linux/if_xdp.h
@@ -63,8 +63,8 @@ struct xdp_statistics {
/* Pgoff for mmaping the rings */
#define XDP_PGOFF_RX_RING 0
#define XDP_PGOFF_TX_RING 0x80000000
-#define XDP_UMEM_PGOFF_FILL_RING 0x100000000
-#define XDP_UMEM_PGOFF_COMPLETION_RING 0x180000000
+#define XDP_UMEM_PGOFF_FILL_RING 0x100000000ULL
+#define XDP_UMEM_PGOFF_COMPLETION_RING 0x180000000ULL
/* Rx/Tx descriptor */
struct xdp_desc {
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index c6ed2454f7ce55e8..36919a254ba370c3 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -643,7 +643,7 @@ static int xsk_getsockopt(struct socket *sock, int level, int optname,
static int xsk_mmap(struct file *file, struct socket *sock,
struct vm_area_struct *vma)
{
- unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
+ loff_t offset = (loff_t)vma->vm_pgoff << PAGE_SHIFT;
unsigned long size = vma->vm_end - vma->vm_start;
struct xdp_sock *xs = xdp_sk(sock->sk);
struct xsk_queue *q = NULL;
--
2.7.4
* Re: [RFC net-next] ipv4: Don't promote secondaries when flushing addresses
From: Michal Kubecek @ 2018-06-07 13:44 UTC (permalink / raw)
To: netdev; +Cc: Phil Sutter, Jakub Sitnicki
In-Reply-To: <20180607123539.GH16785@orbyte.nwl.cc>
On Thu, Jun 07, 2018 at 02:35:39PM +0200, Phil Sutter wrote:
> Yes, I agree with Michal. IIRC, flushing a specific primary along with
> all its secondaries from an interface is not even supported by
> iproute2, so no need to optimize for that I guess. OTOH, if your
> solution allowed us to get rid of that nasty loop in ipaddr_flush(), I owe
> you one extra beer at the next occasion. :)
I'm afraid it will have to stay as a fallback for older kernels not
supporting flush requests. But on newer kernels there would be no need
to actually use it. If we know that an RTM_DELADDR request for a zero
address is guaranteed to fail with current and older kernels, we could do
- use RTM_DELADDR with IFA_F_FLUSH and zero address
- if it fails, get the list and run the loop
If not, it could still be
- use RTM_DELADDR with IFA_F_FLUSH and zero address
- get the list of addresses (empty if first step worked)
- run the loop
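The first variant above could be sketched roughly as follows. This is
only an outline under stated assumptions: IFA_F_FLUSH is the flag
proposed in this RFC and does not exist yet, and struct flush_ops,
flush_addresses() and the stub callbacks are hypothetical names, not
iproute2 or kernel interfaces:

```c
#include <stddef.h>

/* Hypothetical callbacks standing in for the two userspace strategies. */
struct flush_ops {
	/* one RTM_DELADDR with the proposed IFA_F_FLUSH and a zero address */
	int (*flush_request)(void *ctx);
	/* get the address list and delete entries one by one */
	int (*delete_loop)(void *ctx);
};

/* Returns 0 if the single request worked, 1 if the fallback loop was
 * needed, and a negative value if both failed. */
int flush_addresses(const struct flush_ops *ops, void *ctx)
{
	if (ops->flush_request(ctx) == 0)
		return 0;
	if (ops->delete_loop(ctx) == 0)
		return 1;
	return -1;
}

/* Stubs for experimenting: an "old kernel" rejects the flush request. */
int stub_reject(void *ctx) { (void)ctx; return -1; }
int stub_accept(void *ctx) { (void)ctx; return 0; }
```

Pairing stub_reject as flush_request with stub_accept as delete_loop
models an older kernel, where the loop still does the work.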
Michal Kubecek
* [PATCH net] net/sched: act_simple: fix parsing of TCA_DEFDATA
From: Davide Caratti @ 2018-06-07 13:46 UTC (permalink / raw)
To: Jamal Hadi Salim, Cong Wang, Jiri Pirko; +Cc: David S. Miller, netdev
Use nla_strlcpy() to avoid copying data beyond the length of the
TCA_DEF_DATA netlink attribute, in case it is shorter than SIMP_MAX_DATA
and does not end with a '\0' character.
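To show why a length-aware copy matters, here is a user-space model of
the idea. It is a sketch, not the kernel's nla_strlcpy(): struct
attr_payload and payload_strlcpy() are hypothetical names, and the
payload is assumed to be a byte buffer with an explicit length and no
guaranteed terminator:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an attribute payload: bytes plus a length,
 * with no guarantee of a trailing NUL. */
struct attr_payload {
	const char *data;
	size_t len;
};

/* Copy at most dstsize - 1 payload bytes and always NUL-terminate dst.
 * Unlike strlcpy(), this never reads past p->len looking for a NUL. */
size_t payload_strlcpy(char *dst, const struct attr_payload *p, size_t dstsize)
{
	size_t srclen = p->len;
	size_t n;

	/* Ignore one trailing NUL so it does not count as string data. */
	if (srclen > 0 && p->data[srclen - 1] == '\0')
		srclen--;

	if (dstsize > 0) {
		n = srclen >= dstsize ? dstsize - 1 : srclen;
		memset(dst, 0, dstsize);
		memcpy(dst, p->data, n);
	}
	return srclen;
}
```

A plain strlcpy() on the same payload would keep scanning the source
until it found a NUL, which may lie outside the attribute.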
Fixes: 0eff683f737b ("net/sched: potential data corruption")
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
net/sched/act_simple.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
diff --git a/net/sched/act_simple.c b/net/sched/act_simple.c
index 9618b4a83cee..98c4afe7c15b 100644
--- a/net/sched/act_simple.c
+++ b/net/sched/act_simple.c
@@ -53,22 +53,22 @@ static void tcf_simp_release(struct tc_action *a)
kfree(d->tcfd_defdata);
}
-static int alloc_defdata(struct tcf_defact *d, char *defdata)
+static int alloc_defdata(struct tcf_defact *d, const struct nlattr *defdata)
{
d->tcfd_defdata = kzalloc(SIMP_MAX_DATA, GFP_KERNEL);
if (unlikely(!d->tcfd_defdata))
return -ENOMEM;
- strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
+ nla_strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
return 0;
}
-static void reset_policy(struct tcf_defact *d, char *defdata,
+static void reset_policy(struct tcf_defact *d, const struct nlattr *defdata,
struct tc_defact *p)
{
spin_lock_bh(&d->tcf_lock);
d->tcf_action = p->action;
memset(d->tcfd_defdata, 0, SIMP_MAX_DATA);
- strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
+ nla_strlcpy(d->tcfd_defdata, defdata, SIMP_MAX_DATA);
spin_unlock_bh(&d->tcf_lock);
}
@@ -87,7 +87,6 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
struct tcf_defact *d;
bool exists = false;
int ret = 0, err;
- char *defdata;
if (nla == NULL)
return -EINVAL;
@@ -110,8 +109,6 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
return -EINVAL;
}
- defdata = nla_data(tb[TCA_DEF_DATA]);
-
if (!exists) {
ret = tcf_idr_create(tn, parm->index, est, a,
&act_simp_ops, bind, false);
@@ -119,7 +116,7 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
return ret;
d = to_defact(*a);
- ret = alloc_defdata(d, defdata);
+ ret = alloc_defdata(d, tb[TCA_DEF_DATA]);
if (ret < 0) {
tcf_idr_release(*a, bind);
return ret;
@@ -133,7 +130,7 @@ static int tcf_simp_init(struct net *net, struct nlattr *nla,
if (!ovr)
return -EEXIST;
- reset_policy(d, defdata, parm);
+ reset_policy(d, tb[TCA_DEF_DATA], parm);
}
if (ret == ACT_P_CREATED)
--
2.17.0
* Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
From: Arnaldo Carvalho de Melo @ 2018-06-07 13:54 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180605212548.lwtaw4svvydo2lhy@kafai-mbp.dhcp.thefacebook.com>
Em Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau escreveu:
> On Thu, Apr 19, 2018 at 04:40:34PM -0300, Arnaldo Carvalho de Melo wrote:
> > Em Wed, Apr 18, 2018 at 03:55:56PM -0700, Martin KaFai Lau escreveu:
> > > This patch introduces BPF Type Format (BTF).
> > >
> > > BTF (BPF Type Format) is the metadata format which describes
> > > the data types of a BPF program/map. Hence, it basically focuses
> > > on the C programming language, which modern BPF primarily
> > > uses. The first use case is to provide a generic pretty print
> > > capability for a BPF map.
> > >
> > > A modified pahole that can convert dwarf to BTF is here:
> > > https://github.com/iamkafai/pahole/tree/btf
> > > (Arnaldo, there is some BTF_KIND numbering changes on
> > > Apr 18th, d61426c1571)
> >
> > Thanks for letting me know, I'm starting to look at this,
> Hi Arnaldo,
>
> Do you have a chance to take a look and pull it? The kernel
> changes will be in 4.18, so it will be handy if it is available in
> the pahole repository.
>
> [ btw, the latest commit (1 commit) should be 94a11b59e592 ].
Yeah, the one I had before had:
It also raises the number of types (and functions) limit from 0x7fff to
0x7fffffff.
----
And on this last one I see that:
/* Max # of type identifier */
-#define BTF_MAX_TYPE 0x7fffffff
+#define BTF_MAX_TYPE 0x0000ffff
/* Max offset into the string section */
-#define BTF_MAX_NAME_OFFSET 0x7fffffff
+#define BTF_MAX_NAME_OFFSET 0x0000ffff
So somehow (still reading) you'll be able to get more space, if we find
it necessary, to have more types and names, ok.
Continuing...
- Arnaldo
> >
> > - Arnaldo
> >
> > > Please see individual patch for details.
> > >
> > > v5:
> > > - Remove BTF_KIND_FLOAT and BTF_KIND_FUNC which are not
> > > currently used. They can be added in the future.
> > > Some bpf_df_xxx() are removed together.
> > > - Add comment in patch 7 to clarify that the new bpffs_map_fops
> > > should not be extended further.
> > >
> > > v4:
> > > - Fix warning (remove unneeded semicolon)
> > > - Remove a redundant variable (nr_bytes) from btf_int_check_meta() in
> > > patch 1. Caught by W=1.
> > >
> > > v3:
> > > - Rebase to bpf-next
> > > - Fix sparse warning (by adding static)
> > > - Add BTF header logging: btf_verifier_log_hdr()
> > > - Fix the alignment test on btf->type_off
> > > - Add tests for the BTF header
> > > - Lower the max BTF size to 16MB. It should be enough
> > > for some time. We could raise it later if it would
> > > be needed.
> > >
> > > v2:
> > > - Use kvfree where needed in patch 1 and 2
> > > - Also consider BTF_INT_OFFSET() in the btf_int_check_meta()
> > > in patch 1
> > > - Fix an incorrect goto target in map_create() during
> > > the btf-error-path in patch 7
> > > - re-org some local vars to keep the rev xmas tree in btf.c
> > >
> > > Martin KaFai Lau (10):
> > > bpf: btf: Introduce BPF Type Format (BTF)
> > > bpf: btf: Validate type reference
> > > bpf: btf: Check members of struct/union
> > > bpf: btf: Add pretty print capability for data with BTF type info
> > > bpf: btf: Add BPF_BTF_LOAD command
> > > bpf: btf: Add BPF_OBJ_GET_INFO_BY_FD support to BTF fd
> > > bpf: btf: Add pretty print support to the basic arraymap
> > > bpf: btf: Sync bpf.h and btf.h to tools/
> > > bpf: btf: Add BTF support to libbpf
> > > bpf: btf: Add BTF tests
> > >
> > > include/linux/bpf.h | 20 +-
> > > include/linux/btf.h | 48 +
> > > include/uapi/linux/bpf.h | 12 +
> > > include/uapi/linux/btf.h | 130 ++
> > > kernel/bpf/Makefile | 1 +
> > > kernel/bpf/arraymap.c | 50 +
> > > kernel/bpf/btf.c | 2064 ++++++++++++++++++++++++++
> > > kernel/bpf/inode.c | 156 +-
> > > kernel/bpf/syscall.c | 51 +-
> > > tools/include/uapi/linux/bpf.h | 12 +
> > > tools/include/uapi/linux/btf.h | 130 ++
> > > tools/lib/bpf/Build | 2 +-
> > > tools/lib/bpf/bpf.c | 92 +-
> > > tools/lib/bpf/bpf.h | 16 +
> > > tools/lib/bpf/btf.c | 374 +++++
> > > tools/lib/bpf/btf.h | 22 +
> > > tools/lib/bpf/libbpf.c | 148 +-
> > > tools/lib/bpf/libbpf.h | 3 +
> > > tools/testing/selftests/bpf/Makefile | 26 +-
> > > tools/testing/selftests/bpf/test_btf.c | 1669 +++++++++++++++++++++
> > > tools/testing/selftests/bpf/test_btf_haskv.c | 48 +
> > > tools/testing/selftests/bpf/test_btf_nokv.c | 43 +
> > > 22 files changed, 5076 insertions(+), 41 deletions(-)
> > > create mode 100644 include/linux/btf.h
> > > create mode 100644 include/uapi/linux/btf.h
> > > create mode 100644 kernel/bpf/btf.c
> > > create mode 100644 tools/include/uapi/linux/btf.h
> > > create mode 100644 tools/lib/bpf/btf.c
> > > create mode 100644 tools/lib/bpf/btf.h
> > > create mode 100644 tools/testing/selftests/bpf/test_btf.c
> > > create mode 100644 tools/testing/selftests/bpf/test_btf_haskv.c
> > > create mode 100644 tools/testing/selftests/bpf/test_btf_nokv.c
> > >
> > > --
> > > 2.9.5
^ permalink raw reply
* Re: [PATCH bpf-next v5 00/10] BTF: BPF Type Format
From: Arnaldo Carvalho de Melo @ 2018-06-07 14:03 UTC (permalink / raw)
To: Martin KaFai Lau; +Cc: netdev, Alexei Starovoitov, Daniel Borkmann, kernel-team
In-Reply-To: <20180607135401.GE30317@kernel.org>
On Thu, Jun 07, 2018 at 10:54:01AM -0300, Arnaldo Carvalho de Melo wrote:
> On Tue, Jun 05, 2018 at 02:25:48PM -0700, Martin KaFai Lau wrote:
> > [ btw, the latest commit (1 commit) should be 94a11b59e592 ].
So, the commit log message for the pahole patch is non-existent:
https://github.com/iamkafai/pahole/commit/94a11b59e5920908085bfc8d24c92f95c8ffceaf
We should do better at describing what is done and how; I'm starting
with a message you sent for the kernel part:
--
This patch introduces BPF Type Format (BTF).
BTF (BPF Type Format) is the metadata format which describes
the data types of BPF programs/maps. Hence, it basically focuses
on the C programming language, which modern BPF is primarily
using. The first use case is to provide a generic pretty print
capability for a BPF map.
^ permalink raw reply
* [PATCH] ieee802154: add rx LQI from userspace
From: Clément Péron @ 2018-06-07 14:08 UTC (permalink / raw)
To: Romuald Cari, linux-wpan
Cc: Alexander Aring, Stefan Schmidt, David S . Miller, netdev,
linux-kernel, Clément Peron
From: Romuald CARI <romuald.cari@devialet.com>
The Link Quality Indication data exposed by drivers could not be accessed from
userspace. Since this data is per received datagram, it makes sense to make it
available to userspace applications through the ancillary data mechanism in
recvmsg rather than through ioctls. This can be activated using the socket
option WPAN_WANTLQI at the SOL_IEEE802154 level.
This LQI data is available in the ancillary data buffer under the SOL_IEEE802154
level as the type WPAN_WANTLQI. The value is an unsigned byte indicating the link
quality, with values ranging from 0 to 255.
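A minimal userspace sketch of reading that byte back out of the ancillary buffer after recvmsg() returns. It assumes the constants have to be defined locally, because include/net/af_ieee802154.h is a kernel-internal header; SOL_IEEE802154 being 0 is an assumption taken from that header, WPAN_WANTLQI is the value added by this patch, and extract_lqi() is a hypothetical helper name, not part of the patch.

```c
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Assumption: not exported to userspace, so defined locally. */
#ifndef SOL_IEEE802154
#define SOL_IEEE802154 0
#endif
#ifndef WPAN_WANTLQI
#define WPAN_WANTLQI 3	/* value introduced by this patch */
#endif

/*
 * Hypothetical helper: walk the control messages that recvmsg() stored
 * in msg and copy out the LQI byte.
 * Returns 0 if an LQI control message was found, -1 otherwise.
 */
static int extract_lqi(struct msghdr *msg, uint8_t *lqi)
{
	struct cmsghdr *cmsg;

	for (cmsg = CMSG_FIRSTHDR(msg); cmsg; cmsg = CMSG_NXTHDR(msg, cmsg)) {
		if (cmsg->cmsg_level == SOL_IEEE802154 &&
		    cmsg->cmsg_type == WPAN_WANTLQI &&
		    cmsg->cmsg_len >= CMSG_LEN(sizeof(uint8_t))) {
			memcpy(lqi, CMSG_DATA(cmsg), sizeof(uint8_t));
			return 0;
		}
	}
	return -1;
}
```

The caller would presumably enable the option first with setsockopt() on the datagram socket, passing an int set to 1 for WPAN_WANTLQI at the SOL_IEEE802154 level, and supply a control buffer of at least CMSG_SPACE(sizeof(uint8_t)) bytes in msg_control before calling recvmsg().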
Signed-off-by: Romuald Cari <romuald.cari@devialet.com>
Signed-off-by: Clément Peron <clement.peron@devialet.com>
---
include/net/af_ieee802154.h | 1 +
net/ieee802154/socket.c | 17 +++++++++++++++++
2 files changed, 18 insertions(+)
diff --git a/include/net/af_ieee802154.h b/include/net/af_ieee802154.h
index a5563d27a3eb..8003a9f6eb43 100644
--- a/include/net/af_ieee802154.h
+++ b/include/net/af_ieee802154.h
@@ -56,6 +56,7 @@ struct sockaddr_ieee802154 {
#define WPAN_WANTACK 0
#define WPAN_SECURITY 1
#define WPAN_SECURITY_LEVEL 2
+#define WPAN_WANTLQI 3
#define WPAN_SECURITY_DEFAULT 0
#define WPAN_SECURITY_OFF 1
diff --git a/net/ieee802154/socket.c b/net/ieee802154/socket.c
index a60658c85a9a..bc6b912603f1 100644
--- a/net/ieee802154/socket.c
+++ b/net/ieee802154/socket.c
@@ -25,6 +25,7 @@
#include <linux/termios.h> /* For TIOCOUTQ/INQ */
#include <linux/list.h>
#include <linux/slab.h>
+#include <linux/socket.h>
#include <net/datalink.h>
#include <net/psnap.h>
#include <net/sock.h>
@@ -452,6 +453,7 @@ struct dgram_sock {
unsigned int bound:1;
unsigned int connected:1;
unsigned int want_ack:1;
+ unsigned int want_lqi:1;
unsigned int secen:1;
unsigned int secen_override:1;
unsigned int seclevel:3;
@@ -486,6 +488,7 @@ static int dgram_init(struct sock *sk)
struct dgram_sock *ro = dgram_sk(sk);
ro->want_ack = 1;
+ ro->want_lqi = 0;
return 0;
}
@@ -713,6 +716,7 @@ static int dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
size_t copied = 0;
int err = -EOPNOTSUPP;
struct sk_buff *skb;
+ struct dgram_sock *ro = dgram_sk(sk);
DECLARE_SOCKADDR(struct sockaddr_ieee802154 *, saddr, msg->msg_name);
skb = skb_recv_datagram(sk, flags, noblock, &err);
@@ -744,6 +748,13 @@ static int dgram_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
*addr_len = sizeof(*saddr);
}
+ if (ro->want_lqi) {
+ err = put_cmsg(msg, SOL_IEEE802154, WPAN_WANTLQI,
+ sizeof(uint8_t), &(mac_cb(skb)->lqi));
+ if (err)
+ goto done;
+ }
+
if (flags & MSG_TRUNC)
copied = skb->len;
done:
@@ -847,6 +858,9 @@ static int dgram_getsockopt(struct sock *sk, int level, int optname,
case WPAN_WANTACK:
val = ro->want_ack;
break;
+ case WPAN_WANTLQI:
+ val = ro->want_lqi;
+ break;
case WPAN_SECURITY:
if (!ro->secen_override)
val = WPAN_SECURITY_DEFAULT;
@@ -892,6 +906,9 @@ static int dgram_setsockopt(struct sock *sk, int level, int optname,
case WPAN_WANTACK:
ro->want_ack = !!val;
break;
+ case WPAN_WANTLQI:
+ ro->want_lqi = !!val;
+ break;
case WPAN_SECURITY:
if (!ns_capable(net->user_ns, CAP_NET_ADMIN) &&
!ns_capable(net->user_ns, CAP_NET_RAW)) {
--
2.17.1
^ permalink raw reply related
* Re: [PATCH net] failover: eliminate callback hell
From: Alexander Duyck @ 2018-06-07 14:17 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Samudrala, Sridhar, Michael S. Tsirkin, Jiri Pirko, KY Srinivasan,
Haiyang Zhang, David Miller, Netdev, Stephen Hemminger
In-Reply-To: <20180606152516.5edd5893@xeon-e3>
On Wed, Jun 6, 2018 at 3:25 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Wed, 6 Jun 2018 14:54:04 -0700
> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
>
>> On 6/6/2018 2:24 PM, Stephen Hemminger wrote:
>> > On Wed, 6 Jun 2018 15:30:27 +0300
>> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> >
>> >> On Wed, Jun 06, 2018 at 09:25:12AM +0200, Jiri Pirko wrote:
>> >>> Tue, Jun 05, 2018 at 05:42:31AM CEST, stephen@networkplumber.org wrote:
>> >>>> The net failover should be a simple library, not a virtual
>> >>>> object with function callbacks (see callback hell).
>> >>> Why just a library? It should do common things. I think it should be a
>> >>> virtual object. Looks like your patch again splits the common
>> >>> functionality into multiple drivers. That is kind of a backwards attitude.
>> >>> I don't get it. We should rather focus on fixing the mess the
>> >>> introduction of netvsc-bonding caused and switch netvsc to 3-netdev
>> >>> model.
>> >> So it seems that at least one benefit for netvsc would be better
>> >> handling of renames.
>> >>
>> >> Question is how can this change to 3-netdev happen? Stephen is
>> >> concerned about risk of breaking some userspace.
>> >>
>> >> Stephen, this seems to be the usecase that IFF_HIDDEN was trying to
>> >> address, and you said then "why not use existing network namespaces
>> >> rather than inventing a new abstraction". So how about it then? Do you
>> >> want to find a way to use namespaces to hide the PV device for netvsc
>> >> compatibility?
>> >>
>> > Netvsc can't work with 3 dev model. MS has worked with enough distros and
>> > startups that all demand eth0 always be present. And VF may come and go.
>> > After this history, there is a strong motivation not to change how kernel
>> > behaves. Switching to 3 device model would be perceived as breaking
>> > existing userspace.
>>
>> I think it should be possible for netvsc to work with 3 dev model if the only
>> requirement is that eth0 will always be present. With net_failover, you will
>> see eth0 and eth0nsby OR with older distros eth0 and eth1. It may be an issue
>> if somehow there is userspace requirement that there can be only 2 netdevs, not 3
>> when VF is plugged.
>>
>> eth0 will be the net_failover device and eth0nsby/eth1 will be the netvsc device
>> and the IP address gets configured on eth0. Will this be an issue?
>
> DPDK drivers in 18.05 depend on the 2 device model. Yes, it is a bit of a mess
> but that is the way it is.
Why would DPDK care what we do in the kernel? Isn't it just slapping
vfio-pci on the netdevs it sees?
^ permalink raw reply
* Re: [PATCH v4 9/9] net-next: New ax88796 platform driver for Amiga X-Surf 100 Zorro board (m68k)
From: Geert Uytterhoeven @ 2018-06-07 14:36 UTC (permalink / raw)
To: Michael Schmitz
Cc: netdev, Andrew Lunn, Finn Thain, Florian Fainelli, Linux/m68k,
Michael Karcher, Michael Karcher
In-Reply-To: <1524103526-12240-10-git-send-email-schmitzmic@gmail.com>
Hi Michael,
On Thu, Apr 19, 2018 at 4:05 AM, Michael Schmitz <schmitzmic@gmail.com> wrote:
> From: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
>
> Add platform device driver to populate the ax88796 platform data from
> information provided by the XSurf100 zorro device driver. The ax88796
> module will be loaded through this module's probe function.
>
> Signed-off-by: Michael Karcher <kernel@mkarcher.dialup.fu-berlin.de>
> Signed-off-by: Michael Schmitz <schmitzmic@gmail.com>
This is now commit 861928f4e60e826c ("net-next: New ax88796 platform
driver for Amiga X-Surf 100 Zorro board (m68k)").
> --- /dev/null
> +++ b/drivers/net/ethernet/8390/xsurf100.c
> +#define __NS8390_init ax_NS8390_init
[...]
> +#include "lib8390.c"
drivers/net/ethernet/8390/lib8390.c:202: warning: ‘__ei_open’ defined
but not used
drivers/net/ethernet/8390/lib8390.c:231: warning: ‘__ei_close’ defined
but not used
drivers/net/ethernet/8390/lib8390.c:255: warning: ‘__ei_tx_timeout’
defined but not used
drivers/net/ethernet/8390/lib8390.c:302: warning: ‘__ei_start_xmit’
defined but not used
drivers/net/ethernet/8390/lib8390.c:510: warning: ‘__ei_poll’ defined
but not used
drivers/net/ethernet/8390/lib8390.c:851: warning: ‘__ei_get_stats’
defined but not used
drivers/net/ethernet/8390/lib8390.c:951: warning:
‘__ei_set_multicast_list’ defined but not used
drivers/net/ethernet/8390/lib8390.c:989: warning:
‘____alloc_ei_netdev’ defined but not used
So I was wondering: why is this file included, as XSURF100 selects AX88796,
while ax88796.c includes lib8390.c anyway?
Apparently lib8390.c is included for the register definitions (which are
provided by 8390.h, so that part can easily be fixed), and for the
__NS8390_init() implementation, called below.
> +static void xs100_block_output(struct net_device *dev, int count,
> + const unsigned char *buf, const int start_page)
> +{
[...]
> + while ((ei_inb(nic_base + EN0_ISR) & ENISR_RDC) == 0) {
> + if (jiffies - dma_start > 2 * HZ / 100) { /* 20ms */
> + netdev_warn(dev, "timeout waiting for Tx RDC.\n");
> + ei_local->reset_8390(dev);
> + ax_NS8390_init(dev, 1);
> + break;
> + }
> + }
> +
> + ei_outb(ENISR_RDC, nic_base + EN0_ISR); /* Ack intr. */
> + ei_local->dmaing &= ~0x01;
> +}
Can we get rid of the inclusion of lib8390.c, and the related warnings?
Perhaps ax88796.c can export its ax_NS8390_init(), iff the implementation
is identical? Or is there a better solution?
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
^ permalink raw reply
* Fw: [Bug 199963] New: UDP rx_queue incorrect calculation in /proc/net/udp
From: Stephen Hemminger @ 2018-06-07 14:39 UTC (permalink / raw)
To: netdev
Begin forwarded message:
Date: Thu, 07 Jun 2018 13:21:23 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 199963] New: UDP rx_queue incorrect calculation in /proc/net/udp
https://bugzilla.kernel.org/show_bug.cgi?id=199963
Bug ID: 199963
Summary: UDP rx_queue incorrect calculation in /proc/net/udp
Product: Networking
Version: 2.5
Kernel Version: Kernels >= 4.15
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: normal
Priority: P1
Component: IPV4
Assignee: stephen@networkplumber.org
Reporter: trevor.francis@46labs.com
Regression: No
Since upgrading to any kernel >= 4.15, the rx_queue in /proc/net/udp is now
reporting a queue, regardless of system load and regardless of what
applications are running on it. The tx_queue is always 0, but rx_queue has
seemingly random spikes of UDP queueing. This is observed across hundreds of
servers with either varying or no workload.
netstat -nl|grep ^udp
udp 4352 0 0.0.0.0:68 0.0.0.0:*
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode ref pointer drops
14645: 3500007F:0035 00000000:0000 07 00000000:0000C900 00:00000000 00000000 101 0 3367 2 ffff8da177fdcc00 0
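For reading the row above: the tx_queue and rx_queue columns are printed together as two hexadecimal numbers joined by a colon, so 00000000:0000C900 means a tx_queue of 0 and an rx_queue of 0xC900, which is 51456 in decimal. A small sketch of decoding that field follows; parse_queues() is a hypothetical helper name, not something from the report.

```c
#include <stdio.h>

/*
 * Hypothetical helper: decode the "tx_queue:rx_queue" column of a
 * /proc/net/udp row. Both halves are hexadecimal.
 * Returns 0 on success, -1 if the field is malformed.
 */
static int parse_queues(const char *field, unsigned long *tx,
			unsigned long *rx)
{
	return sscanf(field, "%lx:%lx", tx, rx) == 2 ? 0 : -1;
}
```

Decoding the field this way makes it easier to compare the /proc/net/udp value against the decimal Recv-Q column that netstat prints.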
--
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply