Netdev List

Netdev List
 help / color / mirror / Atom feed

* [RFC PATCH net-next 08/11] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

While the poll mode helps reduce the overall latency in transmitting a
SPI message in the EOQ and TCFQ modes, the transmission can still have
jitter due to the CPU needing to service interrupts.

The transmission latency does not matter except in situations where the
SPI transfer represents the readout of a POSIX clock. In that case,
even with the byte-level PTP system timestamping in place, a pending IRQ
might find its way to be processed on the local CPU exactly during the
window when the transfer is snapshotted.

Disabling interrupts ensures the above situation never happens. When it
does, it manifests itself as random delay spikes, which throw off the
servo loop of phc2sys and make it lose lock.

Short phc2sys summary after 58 minutes of running without this patch:

  offset: min -26251 max 16416 mean -21.8672 std dev 863.416
  delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
  lost servo lock 3 times

Summary of the same phc2sys service running for 120 minutes with the
patch:

  offset: min -378 max 381 mean -0.0083089 std dev 101.495
  delay: min 4720 max 5920 mean 5129.38 std dev 154.899
  lost servo lock 0 times

Disable interrupts unconditionally if running in poll mode.
Two aspects:
- If the DSPI driver is in IRQ mode, then disabling interrupts becomes a
  contradiction in terms. Poll mode is recommendable for predictable
  latency.
- In theory it should be possible to disable interrupts only for SPI
  transfers that represent an interaction with a POSIX clock. The driver
  can sense this by looking at transfer->ptp_sts. However enabling this
  unconditionally makes issues much more visible (and not just in fringe
  cases), were they to ever appear.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index f0838853392d..5404f1b45ad0 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -184,6 +184,8 @@ struct fsl_dspi {
 	int			irq;
 	struct clk		*clk;
 
+	/* Used to disable IRQs and preemption */
+	spinlock_t		lock;
 	struct ptp_system_timestamp *ptp_sts;
 	const void		*ptp_sts_word;
 	bool			take_snapshot;
@@ -757,6 +759,7 @@ static int dspi_transfer_one_message(struct spi_master *master,
 	struct spi_device *spi = message->spi;
 	enum dspi_trans_mode trans_mode;
 	struct spi_transfer *transfer;
+	unsigned long flags = 0;
 	int status = 0;
 
 	message->actual_length = 0;
@@ -813,6 +816,9 @@ static int dspi_transfer_one_message(struct spi_master *master,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		if (!dspi->irq)
+			spin_lock_irqsave(&dspi->lock, flags);
+
 		dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
 
 		if (dspi->take_snapshot) {
@@ -848,6 +854,9 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			do {
 				status = dspi_poll(dspi);
 			} while (status == -EAGAIN);
+
+			spin_unlock_irqrestore(&dspi->lock, flags);
+
 		} else if (trans_mode != DSPI_DMA_MODE) {
 			status = wait_event_interruptible(dspi->waitq,
 							  dspi->waitflags);
@@ -1178,6 +1187,7 @@ static int dspi_probe(struct platform_device *pdev)
 	}
 
 	init_waitqueue_head(&dspi->waitq);
+	spin_lock_init(&dspi->lock);
 
 poll_mode:
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 02/11] net: dsa: sja1105: Implement the .gettimex64 system call for PTP
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (i.e. phc2sys) to compensate for the delays incurred while
reading the PHC's time.

Implement this ioctl by passing the PTP system timestamp to the switch's
SPI controller driver.

The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.

As such, specify to the SPI controller that we want byte 3 to be the one
that gets software-timestamped.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/sja1105/sja1105.h      |  4 ++-
 drivers/net/dsa/sja1105/sja1105_main.c |  4 +--
 drivers/net/dsa/sja1105/sja1105_ptp.c  | 21 ++++++++++------
 drivers/net/dsa/sja1105/sja1105_spi.c  | 34 ++++++++++++++++++--------
 4 files changed, 42 insertions(+), 21 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index e6371b2c2df1..18c4f2f808e4 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -93,6 +93,7 @@ struct sja1105_private {
 	struct spi_device *spidev;
 	struct dsa_switch *ds;
 	struct sja1105_port ports[SJA1105_NUM_PORTS];
+	struct ptp_system_timestamp *ptp_sts;
 	struct ptp_clock_info ptp_caps;
 	struct ptp_clock *clock;
 	/* The cycle counter translates the PTP timestamps (based on
@@ -136,7 +137,8 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 				void *packed_buf, size_t size_bytes);
 int sja1105_spi_send_int(const struct sja1105_private *priv,
 			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
-			 u64 *value, u64 size_bytes);
+			 u64 *value, u64 size_bytes,
+			 struct ptp_system_timestamp *ptp_sts);
 int sja1105_spi_send_long_packed_buf(const struct sja1105_private *priv,
 				     sja1105_spi_rw_mode_t rw, u64 base_addr,
 				     void *packed_buf, u64 buf_len);
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 82bdc2da8f8f..c4df2cef89cd 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2100,8 +2100,8 @@ static int sja1105_check_device_id(struct sja1105_private *priv)
 	u64 part_no;
 	int rc;
 
-	rc = sja1105_spi_send_int(priv, SPI_READ, regs->device_id,
-				  &device_id, SJA1105_SIZE_DEVICE_ID);
+	rc = sja1105_spi_send_int(priv, SPI_READ, regs->device_id, &device_id,
+				  SJA1105_SIZE_DEVICE_ID, NULL);
 	if (rc < 0)
 		return rc;
 
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 51a0014369fc..ee7d1065d745 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
  */
+#include <linux/spi/spi.h>
 #include "sja1105.h"
 
 /* The adjfine API clamps ppb between [-32,768,000, 32,768,000], and
@@ -262,14 +263,17 @@ int sja1105_ptp_reset(struct sja1105_private *priv)
 	return rc;
 }
 
-static int sja1105_ptp_gettime(struct ptp_clock_info *ptp,
-			       struct timespec64 *ts)
+static int sja1105_ptp_gettimex(struct ptp_clock_info *ptp,
+				struct timespec64 *ts,
+				struct ptp_system_timestamp *sts)
 {
 	struct sja1105_private *priv = ptp_to_sja1105(ptp);
 	u64 ns;
 
 	mutex_lock(&priv->ptp_lock);
+	priv->ptp_sts = sts;
 	ns = timecounter_read(&priv->tstamp_tc);
+	priv->ptp_sts = NULL;
 	mutex_unlock(&priv->ptp_lock);
 
 	*ts = ns_to_timespec64(ns);
@@ -355,7 +359,7 @@ static u64 sja1105_ptptsclk_read(const struct cyclecounter *cc)
 
 	sja1105_debug_gpio(priv, 1);
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->ptptsclk,
-				  &ptptsclk, 8);
+				  &ptptsclk, 8, priv->ptp_sts);
 	sja1105_debug_gpio(priv, 0);
 	if (rc < 0)
 		dev_err_ratelimited(priv->ds->dev,
@@ -368,9 +372,10 @@ static void sja1105_ptp_overflow_check(struct work_struct *work)
 {
 	struct delayed_work *dw = to_delayed_work(work);
 	struct sja1105_private *priv = rw_to_sja1105(dw);
+	struct ptp_system_timestamp dummy;
 	struct timespec64 ts;
 
-	sja1105_ptp_gettime(&priv->ptp_caps, &ts);
+	sja1105_ptp_gettimex(&priv->ptp_caps, &ts, &dummy);
 
 	schedule_delayed_work(&priv->refresh_work, SJA1105_REFRESH_INTERVAL);
 }
@@ -387,7 +392,7 @@ static void sja1105_ptp_extts_work(struct work_struct *work)
 	mutex_lock(&priv->ptp_lock);
 
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->ptpsyncts,
-				  &ptpsyncts, 8);
+				  &ptpsyncts, 8, NULL);
 	if (rc < 0)
 		dev_err(priv->ds->dev, "Failed to read PTPSYNCTS: %d\n", rc);
 
@@ -433,12 +438,12 @@ static int sja1105_pps_enable(struct sja1105_private *priv, bool on)
 		ptp_pin_duration = SJA1105_HZ_TO_PIN_DURATION(1);
 
 		rc = sja1105_spi_send_int(priv, SPI_WRITE, regs->ptppinst,
-					  &ptp_pin_start, 8);
+					  &ptp_pin_start, 8, NULL);
 		if (rc < 0)
 			return rc;
 
 		rc = sja1105_spi_send_int(priv, SPI_WRITE, regs->ptppindur,
-					  &ptp_pin_duration, 4);
+					  &ptp_pin_duration, 4, NULL);
 		if (rc < 0)
 			return rc;
 	}
@@ -500,7 +505,7 @@ int sja1105_ptp_clock_register(struct sja1105_private *priv)
 		.name		= "SJA1105 PHC",
 		.adjfine	= sja1105_ptp_adjfine,
 		.adjtime	= sja1105_ptp_adjtime,
-		.gettime64	= sja1105_ptp_gettime,
+		.gettimex64	= sja1105_ptp_gettimex,
 		.settime64	= sja1105_ptp_settime,
 		.enable		= sja1105_ptp_enable,
 		.max_adj	= SJA1105_MAX_ADJ_PPB,
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
index e33c85569882..a3486a40e0fb 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -15,10 +15,13 @@
 	(SJA1105_SIZE_SPI_MSG_HEADER + SJA1105_SIZE_SPI_MSG_MAXLEN)
 
 static int sja1105_spi_transfer(const struct sja1105_private *priv,
-				const void *tx, void *rx, int size)
+				const void *tx, void *rx, int size,
+				struct ptp_system_timestamp *ptp_sts)
 {
 	struct spi_device *spi = priv->spidev;
 	struct spi_transfer transfer = {
+		.ptp_sts_word_offset = 3,
+		.ptp_sts = ptp_sts,
 		.tx_buf = tx,
 		.rx_buf = rx,
 		.len = size,
@@ -66,9 +69,11 @@ sja1105_spi_message_pack(void *buf, const struct sja1105_spi_message *msg)
  * @size_bytes is smaller than SIZE_SPI_MSG_MAXLEN. Larger packed buffers
  * are chunked in smaller pieces by sja1105_spi_send_long_packed_buf below.
  */
-int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
-				sja1105_spi_rw_mode_t rw, u64 reg_addr,
-				void *packed_buf, size_t size_bytes)
+static int
+__sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+			      sja1105_spi_rw_mode_t rw, u64 reg_addr,
+			      void *packed_buf, size_t size_bytes,
+			      struct ptp_system_timestamp *ptp_sts)
 {
 	u8 tx_buf[SJA1105_SIZE_SPI_TRANSFER_MAX] = {0};
 	u8 rx_buf[SJA1105_SIZE_SPI_TRANSFER_MAX] = {0};
@@ -90,7 +95,7 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 		memcpy(tx_buf + SJA1105_SIZE_SPI_MSG_HEADER,
 		       packed_buf, size_bytes);
 
-	rc = sja1105_spi_transfer(priv, tx_buf, rx_buf, msg_len);
+	rc = sja1105_spi_transfer(priv, tx_buf, rx_buf, msg_len, ptp_sts);
 	if (rc < 0)
 		return rc;
 
@@ -101,6 +106,14 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 	return 0;
 }
 
+int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+				sja1105_spi_rw_mode_t rw, u64 reg_addr,
+				void *packed_buf, size_t size_bytes)
+{
+	return __sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
+					     size_bytes, NULL);
+}
+
 /* If @rw is:
  * - SPI_WRITE: creates and sends an SPI write message at absolute
  *		address reg_addr, taking size_bytes from *packed_buf
@@ -114,7 +127,8 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
  */
 int sja1105_spi_send_int(const struct sja1105_private *priv,
 			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
-			 u64 *value, u64 size_bytes)
+			 u64 *value, u64 size_bytes,
+			 struct ptp_system_timestamp *ptp_sts)
 {
 	u8 packed_buf[SJA1105_SIZE_SPI_MSG_MAXLEN];
 	int rc;
@@ -126,8 +140,8 @@ int sja1105_spi_send_int(const struct sja1105_private *priv,
 		sja1105_pack(packed_buf, value, 8 * size_bytes - 1, 0,
 			     size_bytes);
 
-	rc = sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
-					 size_bytes);
+	rc = __sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
+					   size_bytes, ptp_sts);
 
 	if (rw == SPI_READ)
 		sja1105_unpack(packed_buf, value, 8 * size_bytes - 1, 0,
@@ -291,7 +305,7 @@ int sja1105_inhibit_tx(const struct sja1105_private *priv,
 	int rc;
 
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->port_control,
-				  &inhibit_cmd, SJA1105_SIZE_PORT_CTRL);
+				  &inhibit_cmd, SJA1105_SIZE_PORT_CTRL, NULL);
 	if (rc < 0)
 		return rc;
 
@@ -301,7 +315,7 @@ int sja1105_inhibit_tx(const struct sja1105_private *priv,
 		inhibit_cmd &= ~port_bitmap;
 
 	return sja1105_spi_send_int(priv, SPI_WRITE, regs->port_control,
-				    &inhibit_cmd, SJA1105_SIZE_PORT_CTRL);
+				    &inhibit_cmd, SJA1105_SIZE_PORT_CTRL, NULL);
 }
 
 struct sja1105_status {
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 06/11] spi: spi-fsl-dspi: Implement the PTP system timestamping
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

This adds a snapshotting software feature for TCFQ and EOQ modes of
operation. Due to my lack of proper understanding of the DMA mode,
the latter mode is left as an exercise for future developers.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 41c45ee2bb2d..3fc266d8263a 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -184,6 +184,9 @@ struct fsl_dspi {
 	int			irq;
 	struct clk		*clk;
 
+	struct ptp_system_timestamp *ptp_sts;
+	const void		*ptp_sts_word;
+	bool			take_snapshot;
 	struct spi_transfer	*cur_transfer;
 	struct spi_message	*cur_msg;
 	struct chip_data	*cur_chip;
@@ -658,6 +661,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 	u16 spi_tcnt;
 	u32 spi_tcr;
 
+	if (dspi->take_snapshot)
+		ptp_read_system_postts(dspi->ptp_sts);
+
 	/* Get transfer counter (in number of SPI transfers). It was
 	 * reset to 0 when transfer(s) were started.
 	 */
@@ -675,6 +681,11 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 		/* Success! */
 		return 0;
 
+	dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
+
+	if (dspi->take_snapshot)
+		ptp_read_system_prets(dspi->ptp_sts);
+
 	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
 		dspi_eoq_write(dspi);
 	else
@@ -764,6 +775,8 @@ static int dspi_transfer_one_message(struct spi_master *master,
 		dspi->rx = transfer->rx_buf;
 		dspi->rx_end = dspi->rx + transfer->len;
 		dspi->len = transfer->len;
+		dspi->ptp_sts = transfer->ptp_sts;
+		dspi->ptp_sts_word = dspi->tx + transfer->ptp_sts_word_offset;
 		/* Validated transfer specific frame size (defaults applied) */
 		dspi->bits_per_word = transfer->bits_per_word;
 		if (transfer->bits_per_word <= 8)
@@ -784,6 +797,11 @@ static int dspi_transfer_one_message(struct spi_master *master,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
+
+		if (dspi->take_snapshot)
+			ptp_read_system_prets(dspi->ptp_sts);
+
 		trans_mode = dspi->devtype_data->trans_mode;
 		switch (trans_mode) {
 		case DSPI_EOQ_MODE:
-- 
2.17.1


^ permalink raw reply related

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-16  0:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Kees Cook, Andy Lutomirski, Song Liu, Networking, bpf,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team, Lorenz Bauer,
	Jann Horn, Greg KH, Linux API, LSM List
In-Reply-To: <20190815234622.t65oxm5mtfzy6fhg@ast-mbp.dhcp.thefacebook.com>

> On Aug 15, 2019, at 4:46 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

>> 
>> I'm not sure why you draw the line for VMs -- they're just as buggy
>> as anything else. Regardless, I reject this line of thinking: yes,
>> all software is buggy, but that isn't a reason to give up.
> 
> hmm. are you saying you want kernel community to work towards
> making containers (namespaces) being able to run arbitrary code
> downloaded from the internet?

Yes.

As an example, Sandstorm uses a combination of namespaces (user, network, mount, ipc) and a moderately permissive seccomp policy to run arbitrary code. Not just little snippets, either — node.js, Mongo, MySQL, Meteor, and other fairly heavyweight stacks can all run under Sandstorm, with the whole stack (database engine binaries, etc) supplied by entirely untrusted customers.  During the time Sandstorm was under active development, I can recall *one* bug that would have allowed a sandbox escape. That’s a pretty good track record.  (Also, Meltdown and Spectre, sigh.)

To be clear, Sandstorm did not allow creation of a userns by the untrusted code, and Sandstorm would have heavily restricted bpf(), but that should only be necessary because of the possibility of kernel bugs, not because of the overall design.

Alexei, I’m trying to encourage you to aim for something even better than you have now. Right now, if you grant a user various very strong capabilities, that user’s systemd can use bpf network filters.  Your proposal would allow this with a different, but still very strong, set of capabilities. There’s nothing wrong with this per se, but I think you can aim much higher:

CAP_NET_ADMIN and your CAP_BPF both effectively allow the holder to take over the system, *by design*.  I’m suggesting that you engage the security community (Kees, myself, Aleksa, Jann, Serge, Christian, etc) to aim for something better: make it so that a normal Linux distro would be willing to relax its settings enough so that normal users can use bpf filtering in the systemd units and maybe eventually use even more bpf() capabilities. And let’s make is to that mainstream container managers (that use userns!) will be willing (as an option) to delegate bpf() to their containers. We’re happy to help design, review, and even write code, but we need you to be willing to work with us to make a design that seems like it will work and then to wait long enough to merge it for us to think about it, try to poke holes in it, and convince ourselves and each other that it has a good chance of being sound.

Obviously there will be many cases where an unprivileged program should *not* be able to use bpf() IP filtering, but let’s make it so that enabling these advanced features does not automatically give away the keys to the kingdom.

(Sandstorm still exists but is no longer as actively developed, sadly.)

^ permalink raw reply

* Re: [PATCH] ipvlan: set hw_enc_features like macvlan
From: Mahesh Bandewar (महेश बंडेवार) @ 2019-08-16  1:03 UTC (permalink / raw)
  To: Bill Sommerfeld
  Cc: David S. Miller, Petr Machata, Jiri Pirko, Ido Schimmel,
	Daniel Borkmann, YueHaibing, Thomas Gleixner, Miaohe Lin,
	Eric Dumazet, linux-netdev, linux-kernel
In-Reply-To: <20190815001043.153874-1-wsommerfeld@google.com>

On Wed, Aug 14, 2019 at 5:10 PM Bill Sommerfeld <wsommerfeld@google.com> wrote:
>
> Allow encapsulated packets sent to tunnels layered over ipvlan to use
> offloads rather than forcing SW fallbacks.
>
> Since commit f21e5077010acda73a60 ("macvlan: add offload features for
> encapsulation"), macvlan has set dev->hw_enc_features to include
> everything in dev->features; do likewise in ipvlan.
>
Thanks Bill

> Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
> ---
>  drivers/net/ipvlan/ipvlan_main.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
> index 1c96bed5a7c4..887bbba4631e 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c
> @@ -126,6 +126,7 @@ static int ipvlan_init(struct net_device *dev)
>                      (phy_dev->state & IPVLAN_STATE_MASK);
>         dev->features = phy_dev->features & IPVLAN_FEATURES;
>         dev->features |= NETIF_F_LLTX | NETIF_F_VLAN_CHALLENGED;
> +       dev->hw_enc_features |= dev->features;
>         dev->gso_max_size = phy_dev->gso_max_size;
>         dev->gso_max_segs = phy_dev->gso_max_segs;
>         dev->hard_header_len = phy_dev->hard_header_len;
> --
> 2.23.0.rc1.153.gdeed80330f-goog
>

^ permalink raw reply

* Re: [PATCH net-next,v4 08/12] drivers: net: use flow block API
From: Pablo Neira Ayuso @ 2019-08-16  1:04 UTC (permalink / raw)
  To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <b3232864-3800-e2a4-9ee3-2cfcf222a148@solarflare.com>

On Wed, Aug 14, 2019 at 05:17:20PM +0100, Edward Cree wrote:
> On 13/08/2019 20:51, Pablo Neira Ayuso wrote:
> > On Mon, Aug 12, 2019 at 06:50:09PM +0100, Edward Cree wrote:
> >> Pablo, can you explain (because this commit message doesn't) why these per-
> >>  driver lists are needed, and what the information/state is that has module
> >>  (rather than, say, netdevice) scope?
> >
> > The idea is to update drivers to support one flow_block per subsystem,
> > one for ethtool, one for tc, and so on. So far, existing drivers only
> > allow for binding one single flow_block to one of the existing
> > subsystems. So this limitation applies at driver level.
>
> That argues for per-driver _code_, not for per-driver _state_. For instance,
>  each driver could (more logically) store this information in the netdev
>  private data, rather than a static global.  Or even, since each driver
>  instance has a unique cb_ident = netdev_priv(net_dev), this doesn't need to
>  be local to the driver at all and could just belong to the device owning the
>  flow_block (which isn't necessarily the device doing the offload, per
>  indirect blocks).

This list could be placed in netdev_priv() area, yes.

> TBH I'm still not clear why you need a flow_block per subsystem, rather than
>  just having multiple subsystems feed their offload requests through the same
>  flow_block but with different enum tc_setup_type or enum tc_fl_command or
>  some other indication that this is "netfilter" rather than "tc" asking for a
>  tc_cls_flower_offload.

In tc, the flow_block is set up by when the ingress qdisc is
registered. The usual scenario for most drivers is to have one single
flow_block per registered ingress qdisc, this makes a 1:1 mapping
between ingress qdisc and flow_block.

Still, you can register two or more ingress qdiscs to make them share
the same policy via 'tc block'. In that case all those qdiscs use one
single flow_block. This makes a N:1 mapping between these qdisc
ingress and the flow_block. This policy applies to all ingress qdiscs
that are part of the same tc block. By 'tc block', I'm refering to the
tcf_block structure.

In netfilter, there are ingress basechains that are registered per
device. Each basechain gets a flow_block by when the basechain is
registered. Shared blocks as in tcf_block are not yet supported, but
it should not be hard to extend it to make this work.

To reuse the same flow_block as entry point for all subsystems as your
propose - assuming offloads for two or more subsystems are in place -
then all of them would need to have the same block sharing
configuration, which might not be the case, ie. tc ingress might have
a eth0 and eth1 use the same policy via flow_block, while netfilter
might have one basechain for eth0 and another for eth1 (no policy
sharing).

[...]
> This really needs a design document explaining what all the bits are, how
>  they fit together, and why they need to be like that.

I did not design this flow_block abstraction, this concept was already
in place under a different name and extend it so the ethtool/netfilter
subsystems to avoid driver code duplication for offloads. Having said
this, I agree documentation is good to have.

^ permalink raw reply

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:09 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, netdev, bpf, William Tu
In-Reply-To: <20190815152100.GN2820@mini-arch>

On 2019/08/16 0:21, Stanislav Fomichev wrote:
> On 08/15, Toshiaki Makita wrote:
>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>> On 08/13, Toshiaki Makita wrote:
>>>> * Implementation
>>>>
>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>> mainly because flow insertion is considerably frequent. If we generate
>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>> first packet of ping in above test will incease, which I want to avoid.
>>> Can this be instead implemented with a new hook that will be called
>>> for TC events? This hook can write to perf event buffer and control
>>> plane will insert/remove/modify flow tables in the BPF maps (contol
>>> plane will also install xdp program).
>>>
>>> Why do we need UMH? What am I missing?
>>
>> So you suggest doing everything in xdp_flow kmod?
> You probably don't even need xdp_flow kmod. Add new tc "offload" mode
> (bypass) that dumps every command via netlink (or calls the BPF hook
> where you can dump it into perf event buffer) and then read that info
> from userspace and install xdp programs and modify flow tables.
> I don't think you need any kernel changes besides that stream
> of data from the kernel about qdisc/tc flow creation/removal/etc.

My intention is to make more people who want high speed network easily use XDP,
so making transparent XDP offload with current TC interface.

What userspace program would monitor TC events with your suggestion?
ovs-vswitchd? If so, it even does not need to monitor TC. It can
implement XDP offload directly.
(However I prefer kernel solution. Please refer to "About alternative
userland (ovs-vswitchd etc.) implementation" section in the cover letter.)

Also such a TC monitoring solution easily can be out-of-sync with real TC
behavior as TC filter/flower is being heavily developed and changed,
e.g. introduction of TC block, support multiple masks with the same pref, etc.
I'm not sure such an unreliable solution have much value.

> But, I haven't looked at the series deeply, so I might be missing
> something :-)
> 
>> I also thought about that. There are two phases so let's think about them separately.
>>
>> 1) TC block (qdisc) creation / eBPF load
>>
>> I saw eBPF maintainers repeatedly saying eBPF program loading needs to be
>> done from userland, not from kernel, to run the verifier for safety.
>> However xdp_flow eBPF program is prebuilt and embedded in kernel so we may
>> allow such programs to be loaded from kernel? I currently don't have the will
>> to make such an API as loading can be done with current UMH mechanism.
>>
>> 2) flow insertion / eBPF map update
>>
>> Not sure if this needs to be done from userland. One concern is that eBPF maps can
>> be modified by unrelated processes and we need to handle all unexpected state of maps.
>> Such handling tends to be difficult and may cause unexpected kernel behavior.
>> OTOH updating maps from kmod may reduces the latency of flow insertion drastically.
> Latency from the moment I type 'tc filter add ...' to the moment the rule
> is installed into the maps? Does it really matter?

Yes it matters. Flow insertion is kind of data path in OVS.
Please see how ping latency is affected in the cover letter.

> Do I understand correctly that both of those events (qdisc creation and
> flow insertion) are triggered from tcf_block_offload_cmd (or similar)?

Both of eBPF load and map update are triggered from tcf_block_offload_cmd.
I think you understand it correctly.

Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next,v4 07/12] net: sched: use flow block API
From: Pablo Neira Ayuso @ 2019-08-16  1:10 UTC (permalink / raw)
  To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <b45709c7-38b5-2dcb-3db1-0c2fca1840be@solarflare.com>

On Wed, Aug 14, 2019 at 05:32:34PM +0100, Edward Cree wrote:
> On 09/07/2019 21:55, Pablo Neira Ayuso wrote:
> > This patch adds tcf_block_setup() which uses the flow block API.
> >
> > This infrastructure takes the flow block callbacks coming from the
> > driver and register/unregister to/from the cls_api core.
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> > <snip>
> > @@ -796,13 +804,20 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
> >  				 struct netlink_ext_ack *extack)
> >  {
> >  	struct tc_block_offload bo = {};
> > +	int err;
> >  
> >  	bo.net = dev_net(dev);
> >  	bo.command = command;
> >  	bo.binder_type = ei->binder_type;
> >  	bo.block = block;
> >  	bo.extack = extack;
> > -	return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > +	INIT_LIST_HEAD(&bo.cb_list);
> > +
> > +	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > +	if (err < 0)
> > +		return err;
> > +
> > +	return tcf_block_setup(block, &bo);
> >  }
> >  
> >  static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
> > @@ -1636,6 +1651,77 @@ void tcf_block_cb_unregister(struct tcf_block *block,
> >  }
> >  EXPORT_SYMBOL(tcf_block_cb_unregister);
> >  
> > +static int tcf_block_bind(struct tcf_block *block,
> > +			  struct flow_block_offload *bo)
> > +{
> > +	struct flow_block_cb *block_cb, *next;
> > +	int err, i = 0;
> > +
> > +	list_for_each_entry(block_cb, &bo->cb_list, list) {
> > +		err = tcf_block_playback_offloads(block, block_cb->cb,
> > +						  block_cb->cb_priv, true,
> > +						  tcf_block_offload_in_use(block),
> > +						  bo->extack);
> > +		if (err)
> > +			goto err_unroll;
> > +
> > +		i++;
> > +	}
> > +	list_splice(&bo->cb_list, &block->cb_list);
> > +
> > +	return 0;
> > +
> > +err_unroll:
> > +	list_for_each_entry_safe(block_cb, next, &bo->cb_list, list) {
> > +		if (i-- > 0) {
> > +			list_del(&block_cb->list);
> > +			tcf_block_playback_offloads(block, block_cb->cb,
> > +						    block_cb->cb_priv, false,
> > +						    tcf_block_offload_in_use(block),
> > +						    NULL);
> > +		}
> > +		flow_block_cb_free(block_cb);
> > +	}
> > +
> > +	return err;
> > +}
> >
> Why has the replay been moved from the function called by the driver
>  (__tcf_block_cb_register()) to work done by the driver's caller based on
>  what the driver has left on this flow_block_offload.cb_list?  This makes
>  it impossible for the driver to (say) unregister a block outside of an
>  explicit request from ndo_setup_tc().
> In my under-development driver, I have a teardown path called on PCI
>  remove, which calls tcf_block_cb_unregister() on all my block bindings
>  (of which the driver keeps track), to ensure that no flow rules are still
>  in place when unregister_netdev() is called;

It's the subsystem that has to release resources when
unregister_netdev() event happens. At least in netfilter, when the
device is going away, the filtering policy is removed, hence the
FLOW_BLOCK_UNBIND is called to release the blocks and, hence, the
offload resources. I remember tc ingress qdisc works like this too.

>  this is needed because some  of the driver's state for certain
>  rules involves taking a reference on  the netdevice (dev_hold()). 
>  Your structural changes here make that  impossible; is there any
>  reason why they're necessary?

May I have access to your driver code?

This would make it easier for me to understand your requirements, and
to discuss changes with you.

^ permalink raw reply

* Re: INFO: task hung in tls_sw_release_resources_tx
From: Jakub Kicinski @ 2019-08-16  1:11 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, ast, aviadye, borisp, bpf, daniel, davejwatson, davem,
	john.fastabend, kafai, linux-kernel, netdev, songliubraving,
	syzkaller-bugs, yhs
In-Reply-To: <20190815141419.15036-1-hdanton@sina.com>

On Thu, 15 Aug 2019 22:14:19 +0800, Hillf Danton wrote:
> On Thu, 15 Aug 2019 03:54:06 -0700
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > git tree:       net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16e5536a600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> > 
> > INFO: task syz-executor153:10198 blocked for more than 143 seconds.
> >        Not tainted 5.3.0-rc3+ #162
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor153 D27672 10198  10179 0x80000002
> > Call Trace:
> >   context_switch kernel/sched/core.c:3254 [inline]
> >   __schedule+0x755/0x1580 kernel/sched/core.c:3880
> >   schedule+0xa8/0x270 kernel/sched/core.c:3944
> >   schedule_timeout+0x717/0xc50 kernel/time/timer.c:1783
> >   do_wait_for_common kernel/sched/completion.c:83 [inline]
> >   __wait_for_common kernel/sched/completion.c:104 [inline]
> >   wait_for_common kernel/sched/completion.c:115 [inline]
> >   wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >   crypto_wait_req include/linux/crypto.h:685 [inline]
> >   crypto_wait_req include/linux/crypto.h:680 [inline]
> >   tls_sw_release_resources_tx+0x4ee/0x6b0 net/tls/tls_sw.c:2075
> >   tls_sk_proto_cleanup net/tls/tls_main.c:275 [inline]
> >   tls_sk_proto_close+0x686/0x970 net/tls/tls_main.c:305
> >   inet_release+0xed/0x200 net/ipv4/af_inet.c:427
> >   inet6_release+0x53/0x80 net/ipv6/af_inet6.c:470
> >   __sock_release+0xce/0x280 net/socket.c:590
> >   sock_close+0x1e/0x30 net/socket.c:1268
> >   __fput+0x2ff/0x890 fs/file_table.c:280
> >   ____fput+0x16/0x20 fs/file_table.c:313
> >   task_work_run+0x145/0x1c0 kernel/task_work.c:113
> >   exit_task_work include/linux/task_work.h:22 [inline]
> >   do_exit+0x92f/0x2e50 kernel/exit.c:879
> >   do_group_exit+0x135/0x360 kernel/exit.c:983
> >   __do_sys_exit_group kernel/exit.c:994 [inline]
> >   __se_sys_exit_group kernel/exit.c:992 [inline]
> >   __x64_sys_exit_group+0x44/0x50 kernel/exit.c:992
> >   do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
> >   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x43ff88
> > Code: 00 00 be 3c 00 00 00 eb 19 66 0f 1f 84 00 00 00 00 00 48 89 d7 89 f0  
> > 0f 05 48 3d 00 f0 ff ff 77 21 f4 48 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff  
> > ff 76 e0 f7 d8 64 41 89 01 eb d8 0f 1f 84 00 00 00
> > RSP: 002b:00007ffd1c2d0f78 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043ff88
> > RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> > RBP: 00000000004bf890 R08: 00000000000000e7 R09: ffffffffffffffd0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
> > INFO: lockdep is turned off.
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 1057 Comm: khungtaskd Not tainted 5.3.0-rc3+ #162
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> > Google 01/01/2011
> > Call Trace:
> >   __dump_stack lib/dump_stack.c:77 [inline]
> >   dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> >   nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >   nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >   arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >   trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >   check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
> >   watchdog+0x9d0/0xef0 kernel/hung_task.c:289
> >   kthread+0x361/0x430 kernel/kthread.c:255
> >   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x10  
> > arch/x86/include/asm/irqflags.h:60  
> 
> 1, diff -> commit f87e62d45e51 -> commit 1023121375c6
> 
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,13 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> -
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
> +	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask)) {
> +		if (!test_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	} else {
> +		if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	}
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);
> --
> 
> 2, a simpler one. And clear BIT_TX_SCHEDULED perhaps after releasing sock.
> 
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,9 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> +	if (!test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> +		clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>  
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);

Mmm.. too terse, I don't follow what you're trying to do here :(

I've been staring at this for a while and trying to repo but it's not
happening here.

The only thing I see is that EBUSY is not handled.

^ permalink raw reply

* [PATCH net,v5 0/2] flow_offload hardware priority fixes
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski

Hi,

This patchset contains two updates for the flow_offload users:

1) Pass the major tc priority to drivers so they do not have to
   lshift it. This is a preparation patch for the fix coming in
   patch #2.

2) Set the hardware priority from the netfilter basechain priority,
   some drivers break when using the existing hardware priority
   number that is set to zero.

v5: fix patch 2/2 to address a clang warning and to simplify
    the priority mapping.

Please, apply.

Pablo Neira Ayuso (2):
  net: sched: use major priority number as hardware priority
  netfilter: nf_tables: map basechain priority to hardware priority

 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c      |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c   |  2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c            | 12 +++---------
 drivers/net/ethernet/netronome/nfp/flower/qos_conf.c |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c      |  2 +-
 include/net/netfilter/nf_tables_offload.h            |  2 ++
 include/net/pkt_cls.h                                |  2 +-
 net/netfilter/nf_tables_api.c                        |  4 ++++
 net/netfilter/nf_tables_offload.c                    | 17 ++++++++++++++---
 9 files changed, 28 insertions(+), 17 deletions(-)

-- 
2.11.0


^ permalink raw reply

* [PATCH net,v5 2/2] netfilter: nf_tables: map basechain priority to hardware priority
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>

This patch adds initial support for offloading basechains using the
priority range from 1 to 65535. This is restricting the netfilter
priority range to 16-bit integer since this is what most drivers assume
so far from tc. It should be possible to extend this range of supported
priorities later on once drivers are updated to support for 32-bit
integer priorities.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v5: fix clang warning by simplifying the mapping of hardware priorities
    to basechain priority in the range of 1-65535. Zero is left behind
    since some drivers do not support this, no negative basechain
    priorities are used at this stage.

 include/net/netfilter/nf_tables_offload.h |  2 ++
 net/netfilter/nf_tables_api.c             |  4 ++++
 net/netfilter/nf_tables_offload.c         | 17 ++++++++++++++---
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 3196663a10e3..c8b9dec376f5 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -73,4 +73,6 @@ int nft_flow_rule_offload_commit(struct net *net);
 	(__reg)->key		= __key;				\
 	memset(&(__reg)->mask, 0xff, (__reg)->len);
 
+int nft_chain_offload_priority(struct nft_base_chain *basechain);
+
 #endif
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cfe7ca7..b65caf4974e1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1662,6 +1662,10 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 
 		chain->flags |= NFT_BASE_CHAIN | flags;
 		basechain->policy = NF_ACCEPT;
+		if (chain->flags & NFT_CHAIN_HW_OFFLOAD &&
+		    nft_chain_offload_priority(basechain) < 0)
+			return -EOPNOTSUPP;
+
 		flow_block_init(&basechain->flow_block);
 	} else {
 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5f240e..c0d18c1d77ac 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -103,10 +103,11 @@ void nft_offload_update_dependency(struct nft_offload_ctx *ctx,
 }
 
 static void nft_flow_offload_common_init(struct flow_cls_common_offload *common,
-					 __be16 proto,
-					struct netlink_ext_ack *extack)
+					 __be16 proto, int priority,
+					 struct netlink_ext_ack *extack)
 {
 	common->protocol = proto;
+	common->prio = priority;
 	common->extack = extack;
 }
 
@@ -124,6 +125,15 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
 	return 0;
 }
 
+int nft_chain_offload_priority(struct nft_base_chain *basechain)
+{
+	if (basechain->ops.priority <= 0 ||
+	    basechain->ops.priority > USHRT_MAX)
+		return -1;
+
+	return 0;
+}
+
 static int nft_flow_offload_rule(struct nft_trans *trans,
 				 enum flow_cls_command command)
 {
@@ -142,7 +152,8 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
 	if (flow)
 		proto = flow->proto;
 
-	nft_flow_offload_common_init(&cls_flow.common, proto, &extack);
+	nft_flow_offload_common_init(&cls_flow.common, proto,
+				     basechain->ops.priority, &extack);
 	cls_flow.command = command;
 	cls_flow.cookie = (unsigned long) rule;
 	if (flow)
-- 
2.11.0


^ permalink raw reply related

* [PATCH net,v5 1/2] net: sched: use major priority number as hardware priority
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>

tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
v5: no changes.

 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c      |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c   |  2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c            | 12 +++---------
 drivers/net/ethernet/netronome/nfp/flower/qos_conf.c |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c      |  2 +-
 include/net/pkt_cls.h                                |  2 +-
 6 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index deeb65da99f3..00b2d4a86159 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3167,7 +3167,7 @@ mlx5e_flow_esw_attr_init(struct mlx5_esw_flow_attr *esw_attr,
 
 	esw_attr->parse_attr = parse_attr;
 	esw_attr->chain = f->common.chain_index;
-	esw_attr->prio = TC_H_MAJ(f->common.prio) >> 16;
+	esw_attr->prio = f->common.prio;
 
 	esw_attr->in_rep = in_rep;
 	esw_attr->in_mdev = in_mdev;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index e8ac90564dbe..84a87d059333 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -471,7 +471,7 @@ int mlxsw_sp_acl_rulei_commit(struct mlxsw_sp_acl_rule_info *rulei)
 void mlxsw_sp_acl_rulei_priority(struct mlxsw_sp_acl_rule_info *rulei,
 				 unsigned int priority)
 {
-	rulei->priority = priority >> 16;
+	rulei->priority = priority;
 }
 
 void mlxsw_sp_acl_rulei_keymask_u32(struct mlxsw_sp_acl_rule_info *rulei,
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 59487d446a09..b894bc0c9c16 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -13,12 +13,6 @@ struct ocelot_port_block {
 	struct ocelot_port *port;
 };
 
-static u16 get_prio(u32 prio)
-{
-	/* prio starts from 0x1000 while the ids starts from 0 */
-	return prio >> 16;
-}
-
 static int ocelot_flower_parse_action(struct flow_cls_offload *f,
 				      struct ocelot_ace_rule *rule)
 {
@@ -168,7 +162,7 @@ static int ocelot_flower_parse(struct flow_cls_offload *f,
 	}
 
 finished_key_parsing:
-	ocelot_rule->prio = get_prio(f->common.prio);
+	ocelot_rule->prio = f->common.prio;
 	ocelot_rule->id = f->cookie;
 	return ocelot_flower_parse_action(f, ocelot_rule);
 }
@@ -218,7 +212,7 @@ static int ocelot_flower_destroy(struct flow_cls_offload *f,
 	struct ocelot_ace_rule rule;
 	int ret;
 
-	rule.prio = get_prio(f->common.prio);
+	rule.prio = f->common.prio;
 	rule.port = port_block->port;
 	rule.id = f->cookie;
 
@@ -236,7 +230,7 @@ static int ocelot_flower_stats_update(struct flow_cls_offload *f,
 	struct ocelot_ace_rule rule;
 	int ret;
 
-	rule.prio = get_prio(f->common.prio);
+	rule.prio = f->common.prio;
 	rule.port = port_block->port;
 	rule.id = f->cookie;
 	ret = ocelot_ace_rule_stats_update(&rule);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
index 86e968cd5ffd..124a43dc136a 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
@@ -93,7 +93,7 @@ nfp_flower_install_rate_limiter(struct nfp_app *app, struct net_device *netdev,
 		return -EOPNOTSUPP;
 	}
 
-	if (flow->common.prio != (1 << 16)) {
+	if (flow->common.prio != 1) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: qos rate limit offload requires highest priority");
 		return -EOPNOTSUPP;
 	}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 37c0bc699cd9..6c305b6ecad0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -94,7 +94,7 @@ static int tc_fill_entry(struct stmmac_priv *priv,
 	struct stmmac_tc_entry *entry, *frag = NULL;
 	struct tc_u32_sel *sel = cls->knode.sel;
 	u32 off, data, mask, real_off, rem;
-	u32 prio = cls->common.prio;
+	u32 prio = cls->common.prio << 16;
 	int ret;
 
 	/* Only 1 match per entry */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809ca90d..98be18ef1ed3 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -646,7 +646,7 @@ tc_cls_common_offload_init(struct flow_cls_common_offload *cls_common,
 {
 	cls_common->chain_index = tp->chain->index;
 	cls_common->protocol = tp->protocol;
-	cls_common->prio = tp->prio;
+	cls_common->prio = tp->prio >> 16;
 	if (tc_skip_sw(flags) || flags & TCA_CLS_FLAGS_VERBOSE)
 		cls_common->extack = extack;
 }
-- 
2.11.0


^ permalink raw reply related

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:28 UTC (permalink / raw)
  To: Jakub Kicinski, Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jesper Dangaard Brouer,
	John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, netdev,
	bpf, William Tu
In-Reply-To: <20190815122232.4b1fa01c@cakuba.netronome.com>

On 2019/08/16 4:22, Jakub Kicinski wrote:
> On Thu, 15 Aug 2019 08:21:00 -0700, Stanislav Fomichev wrote:
>> On 08/15, Toshiaki Makita wrote:
>>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>>> On 08/13, Toshiaki Makita wrote:
>>>>> * Implementation
>>>>>
>>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>>> mainly because flow insertion is considerably frequent. If we generate
>>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>>> first packet of ping in above test will incease, which I want to avoid.
>>>> Can this be instead implemented with a new hook that will be called
>>>> for TC events? This hook can write to perf event buffer and control
>>>> plane will insert/remove/modify flow tables in the BPF maps (contol
>>>> plane will also install xdp program).
>>>>
>>>> Why do we need UMH? What am I missing?
>>>
>>> So you suggest doing everything in xdp_flow kmod?
>> You probably don't even need xdp_flow kmod. Add new tc "offload" mode
>> (bypass) that dumps every command via netlink (or calls the BPF hook
>> where you can dump it into perf event buffer) and then read that info
>> from userspace and install xdp programs and modify flow tables.
>> I don't think you need any kernel changes besides that stream
>> of data from the kernel about qdisc/tc flow creation/removal/etc.
> 
> There's a certain allure in bringing the in-kernel BPF translation
> infrastructure forward. OTOH from system architecture perspective IMHO
> it does seem like a task best handed in user space. bpfilter can replace
> iptables completely, here we're looking at an acceleration relatively
> loosely coupled with flower.

I don't think it's loosely coupled. Emulating TC behavior in userspace
is not so easy.

Think about recent multi-mask support in flower. Previously userspace could
assume there is one mask and hash table for each preference in TC. After the
change TC accepts different masks with the same pref. Such a change tends to
break userspace emulation. It may ignore masks passed from flow insertion
and use the mask remembered when the first flow of the pref is inserted. It
may override the mask of all existing flows with the pref. It may fail to
insert such flows. Any of them would result in unexpected wrong datapath
handling which is critical.
I think such an emulation layer needs to be updated in sync with TC.

Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next v7 3/3] net: phy: broadcom: add 1000Base-X support for BCM54616S
From: Tao Ren @ 2019-08-16  1:33 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn, Heiner Kallweit, David S . Miller,
	Vladimir Oltean, Arun Parameswaran, Justin Chen,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	openbmc@lists.ozlabs.org
In-Reply-To: <d45d609b-60ea-760d-31f4-51afa379c55a@gmail.com>

On 8/15/19 4:36 PM, Florian Fainelli wrote:
> On 8/11/19 4:40 PM, Tao Ren wrote:
>> The BCM54616S PHY cannot work properly in RGMII->1000Base-X mode, mainly
>> because genphy functions are designed for copper links, and 1000Base-X
>> (clause 37) auto negotiation needs to be handled differently.
>>
>> This patch enables 1000Base-X support for BCM54616S by customizing 3
>> driver callbacks, and it's verified to be working on Facebook CMM BMC
>> platform (RGMII->1000Base-KX):
>>
>>   - probe: probe callback detects PHY's operation mode based on
>>     INTERF_SEL[1:0] pins and 1000X/100FX selection bit in SerDES 100-FX
>>     Control register.
>>
>>   - config_aneg: calls genphy_c37_config_aneg when the PHY is running in
>>     1000Base-X mode; otherwise, genphy_config_aneg will be called.
>>
>>   - read_status: calls genphy_c37_read_status when the PHY is running in
>>     1000Base-X mode; otherwise, genphy_read_status will be called.
>>
>> Note: BCM54616S PHY can also be configured in RGMII->100Base-FX mode, and
>> 100Base-FX support is not available as of now.
>>
>> Signed-off-by: Tao Ren <taoren@fb.com>
> 
>> -		reg = bcm_phy_read_shadow(phydev, BCM5482_SHD_MODE);
>> -		bcm_phy_write_shadow(phydev, BCM5482_SHD_MODE,
>> -				     reg | BCM5482_SHD_MODE_1000BX);
>> +		reg = bcm_phy_read_shadow(phydev, BCM54XX_SHD_MODE);
>> +		bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE,
>> +				     reg | BCM54XX_SHD_MODE_1000BX);
> 
> This could have been a separate patch, but this looks reasonable to me
> and this is correct with the datasheet, thanks Tao.
> 
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Thank you FLorian.

It is great experience working with you all to figure out root cause and work out the patches, and I really appreciate it.


Cheers,

Tao

^ permalink raw reply

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:38 UTC (permalink / raw)
  To: William Tu
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, Linux Kernel Network Developers, bpf
In-Reply-To: <CALDO+SYC4sPw-7iDkFMCD=kf2UnTW2qc0m6Kgz41zLmNNxQ+Ww@mail.gmail.com>

On 2019/08/16 0:46, William Tu wrote:
> On Tue, Aug 13, 2019 at 5:07 AM Toshiaki Makita
> <toshiaki.makita1@gmail.com> wrote:
>>
>> This is a rough PoC for an idea to offload TC flower to XDP.
>>
>>
>> * Motivation
>>
>> The purpose is to speed up software TC flower by using XDP.
>>
>> I chose TC flower because my current interest is in OVS. OVS uses TC to
>> offload flow tables to hardware, so if TC can offload flows to XDP, OVS
>> also can be offloaded to XDP.
>>
>> When TC flower filter is offloaded to XDP, the received packets are
>> handled by XDP first, and if their protocol or something is not
>> supported by the eBPF program, the program returns XDP_PASS and packets
>> are passed to upper layer TC.
>>
>> The packet processing flow will be like this when this mechanism,
>> xdp_flow, is used with OVS.
>>
>>   +-------------+
>>   | openvswitch |
>>   |    kmod     |
>>   +-------------+
>>          ^
>>          | if not match in filters (flow key or action not supported by TC)
>>   +-------------+
>>   |  TC flower  |
>>   +-------------+
>>          ^
>>          | if not match in flow tables (flow key or action not supported by XDP)
>>   +-------------+
>>   |  XDP prog   |
>>   +-------------+
>>          ^
>>          | incoming packets
>>
> I like this idea, some comments about the OVS AF_XDP work.
> 
> Another way when using OVS AF_XDP is to serve as slow path of TC flow
> HW offload.
> For example:
> 
>   Userspace OVS datapath (The one used by OVS-DPDK)
>       ^
>        |
>    +------------------------------+
>    |  OVS AF_XDP netdev |
>    +------------------------------+
>           ^
>           | if not supported or not match in flow tables
>    +---------------------+
>    |  TC HW flower  |
>    +---------------------+
>           ^
>           | incoming packets
> 
> So in this case it's either TC HW flower offload, or the userspace PMD OVS.
> Both cases should be pretty fast.
> 
> I think xdp_flow can also be used by OVS AF_XDP netdev, sitting between
> TC HW flower and OVS AF_XDP netdev.
> Before the XDP program sending packet to AF_XDP socket, the
> xdp_flow can execute first, and if not match, then send to AF_XDP.
> So in your patch set, implement s.t like
>    bpf_redirect_map(&xsks_map, index, 0);

Thanks, the concept sounds good but this is probably difficult as long as
this is a TC offload, which is emulating TC.
If I changed the direction and implement offload in ovs-vswitchd, it would
be possible. I'll remember this optimization.

> Another thing is that at each layer we are doing its own packet parsing.
>  From your graph, first parse at XDP program, then at TC flow, then at
> openvswitch kmod.
> I wonder if we can reuse some parsing result.

That would be nice if possible...
Currently I don't have any ideas to do that. Someday XDP may support more
metadata for this or HW-offload like checksum. Then we can store the information
and upper layers may be able to use that.

Toshiaki Makita

^ permalink raw reply

* Re: WARNING in is_bpf_text_address
From: Bart Van Assche @ 2019-08-16  1:39 UTC (permalink / raw)
  To: Will Deacon, syzbot
  Cc: akpm, ast, bpf, daniel, davem, dvyukov, hawk, hdanton,
	jakub.kicinski, johannes.berg, johannes, john.fastabend, kafai,
	linux-kernel, longman, mingo, netdev, paulmck, peterz,
	songliubraving, syzkaller-bugs, tglx, tj, torvalds, will.deacon,
	xdp-newbies, yhs
In-Reply-To: <20190815075142.vuza32plqtiuhixx@willie-the-truck>

On 8/15/19 12:51 AM, Will Deacon wrote:
> Hi Bart,
> 
> On Sat, Aug 10, 2019 at 05:24:06PM -0700, syzbot wrote:
>> syzbot has found a reproducer for the following crash on:
>>
>> HEAD commit:    451577f3 Merge tag 'kbuild-fixes-v5.3-3' of git://git.kern..
>> git tree:       upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=120850a6600000
>> kernel config:  https://syzkaller.appspot.com/x/.config?x=2031e7d221391b8a
>> dashboard link: https://syzkaller.appspot.com/bug?extid=bd3bba6ff3fcea7a6ec6
>> compiler:       clang version 9.0.0 (/home/glider/llvm/clang
>> 80fee25776c2fb61e74c1ecb1a523375c2500b69)
>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=130ffe4a600000
>> C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=17137d2c600000
>>
>> The bug was bisected to:
>>
>> commit a0b0fd53e1e67639b303b15939b9c653dbe7a8c4
>> Author: Bart Van Assche <bvanassche@acm.org>
>> Date:   Thu Feb 14 23:00:46 2019 +0000
>>
>>      locking/lockdep: Free lock classes that are no longer in use
>>
>> bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=152f6a9da00000
>> final crash:    https://syzkaller.appspot.com/x/report.txt?x=172f6a9da00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=132f6a9da00000
> 
> I know you don't think much to these reports, but please could you have a
> look (even if it's just to declare it a false positive)?

Hi Will,

Had you already noticed the following message?

https://lore.kernel.org/bpf/d76d7a63-7854-e92d-30cb-52546d333ffe@iogearbox.net/

 From that message: "Hey Bart, don't think it's related in any way to 
your commit. I'll allocate some time on working on this issue today, 
thanks!"

Bart.

^ permalink raw reply

* Re: [PATCH v5] perf machine: arm/arm64: Improve completeness for kernel address space
From: Leo Yan @ 2019-08-16  1:45 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, linux-kernel, netdev,
	bpf, clang-built-linux, Mathieu Poirier, Peter Zijlstra,
	Suzuki Poulouse, coresight, linux-arm-kernel
In-Reply-To: <e0919e39-7607-815b-3a12-96f098e45a5f@intel.com>

Hi Adrian,

On Thu, Aug 15, 2019 at 02:45:57PM +0300, Adrian Hunter wrote:

[...]

> >> How come you cannot use kallsyms to get the information?
> > 
> > Thanks for pointing out this.  Sorry I skipped your comment "I don't
> > know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed
> > the patch v3, I should use that chance to elaborate the detailed idea
> > and so can get more feedback/guidance before procceed.
> > 
> > Actually, I have considered to use kallsyms when worked on the previous
> > patch set.
> > 
> > As mentioned in patch set v4's cover letter, I tried to implement
> > machine__create_extra_kernel_maps() for arm/arm64, the purpose is to
> > parse kallsyms so can find more kernel maps and thus also can fixup
> > the kernel start address.  But I found the 'perf script' tool directly
> > calls machine__get_kernel_start() instead of running into the flow for
> > machine__create_extra_kernel_maps();
> 
> Doesn't it just need to loop through each kernel map to find the lowest
> start address?

Based on your suggestion, I worked out below change and verified it
can work well on arm64 for fixing up start address; please let me know
if the change works for you?

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index f6ee7fbad3e4..51d78313dca1 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -2671,9 +2671,26 @@ int machine__nr_cpus_avail(struct machine *machine)
 	return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
 }
 
+static int machine__fixup_kernel_start(void *arg,
+				       const char *name __maybe_unused,
+				       char type,
+				       u64 start)
+{
+	struct machine *machine = arg;
+
+	type = toupper(type);
+
+	/* Fixup for text, weak, data and bss sections. */
+	if (type == 'T' || type == 'W' || type == 'D' || type == 'B')
+		machine->kernel_start = min(machine->kernel_start, start);
+
+	return 0;
+}
+
 int machine__get_kernel_start(struct machine *machine)
 {
 	struct map *map = machine__kernel_map(machine);
+	char filename[PATH_MAX];
 	int err = 0;
 
 	/*
@@ -2687,6 +2704,7 @@ int machine__get_kernel_start(struct machine *machine)
 	machine->kernel_start = 1ULL << 63;
 	if (map) {
 		err = map__load(map);
 		/*
 		 * On x86_64, PTI entry trampolines are less than the
 		 * start of kernel text, but still above 2^63. So leave
@@ -2695,6 +2713,16 @@ int machine__get_kernel_start(struct machine *machine)
 		if (!err && !machine__is(machine, "x86_64"))
 			machine->kernel_start = map->start;
 	}
+
+	machine__get_kallsyms_filename(machine, filename, PATH_MAX);
+
+	if (symbol__restricted_filename(filename, "/proc/kallsyms"))
+		goto out;
+
+	if (kallsyms__parse(filename, machine, machine__fixup_kernel_start))
+		pr_warning("Fail to fixup kernel start address. skipping...\n");
+
+out:
 	return err;
 }

Thanks,
Leo Yan

^ permalink raw reply related

* [PATCH -next v2] btf: fix return value check in btf_vmlinux_init()
From: Wei Yongjun @ 2019-08-16  2:40 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, Andrii Nakryiko
  Cc: Wei Yongjun, netdev, bpf, kernel-janitors
In-Reply-To: <20190815142432.101401-1-weiyongjun1@huawei.com>

In case of error, the function kobject_create_and_add() returns NULL
pointer not ERR_PTR(). The IS_ERR() test in the return value check
should be replaced with NULL test.

Fixes: 341dfcf8d78e ("btf: expose BTF info through sysfs")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
---
 kernel/bpf/sysfs_btf.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/kernel/bpf/sysfs_btf.c b/kernel/bpf/sysfs_btf.c
index 4659349fc795..7ae5dddd1fe6 100644
--- a/kernel/bpf/sysfs_btf.c
+++ b/kernel/bpf/sysfs_btf.c
@@ -30,17 +30,12 @@ static struct kobject *btf_kobj;
 
 static int __init btf_vmlinux_init(void)
 {
-	int err;
-
 	if (!_binary__btf_vmlinux_bin_start)
 		return 0;
 
 	btf_kobj = kobject_create_and_add("btf", kernel_kobj);
-	if (IS_ERR(btf_kobj)) {
-		err = PTR_ERR(btf_kobj);
-		btf_kobj = NULL;
-		return err;
-	}
+	if (!btf_kobj)
+		return -ENOMEM;
 
 	bin_attr_btf_vmlinux.size = _binary__btf_vmlinux_bin_end -
 				    _binary__btf_vmlinux_bin_start;




^ permalink raw reply related

* linux-next: manual merge of the net-next tree with the kbuild tree
From: Stephen Rothwell @ 2019-08-16  2:41 UTC (permalink / raw)
  To: David Miller, Networking, Masahiro Yamada
  Cc: Linux Next Mailing List, Linux Kernel Mailing List, Kees Cook,
	Andrii Nakryiko, Daniel Borkmann

[-- Attachment #1: Type: text/plain, Size: 2690 bytes --]

Hi all,

Today's linux-next merge of the net-next tree got a conflict in:

  scripts/link-vmlinux.sh

between commit:

  e167191e4a8a ("kbuild: Parameterize kallsyms generation and correct reporting")

from the kbuild tree and commits:

  341dfcf8d78e ("btf: expose BTF info through sysfs")
  7fd785685e22 ("btf: rename /sys/kernel/btf/kernel into /sys/kernel/btf/vmlinux")

from the net-next tree.

I fixed it up (I think - see below) and can carry the fix as necessary.
This is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc scripts/link-vmlinux.sh
index 2438a9faf3f1,c31193340108..000000000000
--- a/scripts/link-vmlinux.sh
+++ b/scripts/link-vmlinux.sh
@@@ -56,11 -56,10 +56,11 @@@ modpost_link(
  }
  
  # Link of vmlinux
- # ${1} - optional extra .o files
- # ${2} - output file
+ # ${1} - output file
+ # ${@:2} - optional extra .o files
  vmlinux_link()
  {
 +	info LD ${2}
  	local lds="${objtree}/${KBUILD_LDS}"
  	local objects
  
@@@ -139,18 -149,6 +150,18 @@@ kallsyms(
  	${CC} ${aflags} -c -o ${2} ${afile}
  }
  
 +# Perform one step in kallsyms generation, including temporary linking of
 +# vmlinux.
 +kallsyms_step()
 +{
 +	kallsymso_prev=${kallsymso}
 +	kallsymso=.tmp_kallsyms${1}.o
 +	kallsyms_vmlinux=.tmp_vmlinux${1}
 +
- 	vmlinux_link "${kallsymso_prev}" ${kallsyms_vmlinux}
++	vmlinux_link ${kallsyms_vmlinux} "${kallsymso_prev}" ${btf_vmlinux_bin_o}
 +	kallsyms ${kallsyms_vmlinux} ${kallsymso}
 +}
 +
  # Create map file with all symbols from ${1}
  # See mksymap for additional details
  mksysmap()
@@@ -228,8 -227,14 +240,15 @@@ ${MAKE} -f "${srctree}/scripts/Makefile
  info MODINFO modules.builtin.modinfo
  ${OBJCOPY} -j .modinfo -O binary vmlinux.o modules.builtin.modinfo
  
+ btf_vmlinux_bin_o=""
+ if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
+ 	if gen_btf .tmp_vmlinux.btf .btf.vmlinux.bin.o ; then
+ 		btf_vmlinux_bin_o=.btf.vmlinux.bin.o
+ 	fi
+ fi
+ 
  kallsymso=""
 +kallsymso_prev=""
  kallsyms_vmlinux=""
  if [ -n "${CONFIG_KALLSYMS}" ]; then
  
@@@ -268,11 -285,8 +287,7 @@@
  	fi
  fi
  
- vmlinux_link "${kallsymso}" vmlinux
- 
- if [ -n "${CONFIG_DEBUG_INFO_BTF}" ]; then
- 	gen_btf vmlinux
- fi
 -info LD vmlinux
+ vmlinux_link vmlinux "${kallsymso}" "${btf_vmlinux_bin_o}"
  
  if [ -n "${CONFIG_BUILDTIME_EXTABLE_SORT}" ]; then
  	info SORTEX vmlinux

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* RE: [PATCH net-next] r8152: divide the tx and rx bottom functions
From: Hayes Wang @ 2019-08-16  2:59 UTC (permalink / raw)
  To: David Miller
  Cc: netdev@vger.kernel.org, nic_swsd, linux-kernel@vger.kernel.org
In-Reply-To: <20190815.135851.1942927063321516679.davem@davemloft.net>

David Miller [mailto:davem@davemloft.net]
> Sent: Friday, August 16, 2019 4:59 AM
[...]
> Theoretically, yes.
> 
> But do you have actual performance numbers showing this to be worth
> the change?
> 
> Always provide performance numbers with changes that are supposed to
> improve performance.

On x86, they are almost the same.
Tx/Rx: 943/943 Mbits/sec -> 945/944

For arm platform,
Tx/Rx: 917/917 Mbits/sec -> 933/933
Improve about 1.74%.

Best Regards,
Hayes



^ permalink raw reply

* Re: [PATCH net] tunnel: fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: Hangbin Liu @ 2019-08-16  3:24 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
	David S . Miller
In-Reply-To: <cb5b5d82-1239-34a9-23f5-1894a2ec92a2@gmail.com>

Hi Eric,

Thanks for the review.
On Thu, Aug 15, 2019 at 11:16:58AM +0200, Eric Dumazet wrote:
> > diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
> > index 38c02bb62e2c..c6713c7287df 100644
> > --- a/net/ipv4/ip_tunnel.c
> > +++ b/net/ipv4/ip_tunnel.c
> > @@ -597,6 +597,9 @@ void ip_md_tunnel_xmit(struct sk_buff *skb, struct net_device *dev,
> >  		goto tx_error;
> >  	}
> >  
> > +	if (skb_dst(skb) && !skb_dst(skb)->dev)
> > +		skb_dst(skb)->dev = rt->dst.dev;
> > +
> 
> 
> IMO this looks wrong.
> This dst seems shared. 

If the dst is shared, it may cause some problem. Could you point me where the
dst may be shared possibly?

> Once set, we will reuse the same dev ?

If yes, how about just set the skb dst to rt->dst, as the
iptunnel_xmit would do later.

skb_dst_drop(skb);
skb_dst_set(skb, &rt->dst);

or do you have any other idea?
> 
> If intended, why not doing this in __metadata_dst_init() instead of in the fast path ?

I'm afraid we couldn't do this, I didn't find a way to init dev in
__metadata_dst_init(). Do you?

Thanks
Hangbin

^ permalink raw reply

* Re: [PATCH v2] socket.7: Add description of SO_SELECT_ERR_QUEUE
From: Ricardo Biehl Pasquali @ 2019-08-16  3:43 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages), jacob.e.keller
  Cc: linux-man, netdev, stefan.puiu, corbet, davem
In-Reply-To: <f053fe2c-20e5-4754-8b13-89cddfbfb52d@gmail.com>

TL;DR: This email proposes a description of the socket
option SO_SELECT_ERR_QUEUE taking into account the change
in wake up behavior when errors are enqueued introduced by
the commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not waking
applications when errors are enqueued") in Linux 4.16.

On Mon, Jul 29, 2019 at 08:51:42PM +0200, Michael Kerrisk (man-pages) wrote:
> Sorry -- I've not had a lot of cycles to spare for man-pages of late.

Hi. No problem, I've just wondering whether you were
receiving the messages.

> Thanks for the patch. But your text doesn't quite capture the idea
> in this commit message:
> 
> commit 7d4c04fc170087119727119074e72445f2bb192b
> Author: Keller, Jacob E <jacob.e.keller@intel.com>
> Date:   Thu Mar 28 11:19:25 2013 +0000

It definitely does not.

Initially, despite the description of the commit and the
name of the option, I was investigating only the poll() case
as this was what I was working on.

Sorry.

Now I investigated the behavior of select() and poll(). I've
updated a test code that I wrote some time ago.

See <https://github.com/pasqualirb/poll_select_test>.

I've also written a Behavior section in README which I did
not include here.

> What would you think of something like this:
>        SO_SELECT_ERR_QUEUE (since Linux 3.10)
>               When this option is set on a socket, an error condition  on
>               a socket causes notification not only via the exceptfds set
>               of select(2).  Similarly, poll(2) also  returns  a  POLLPRI
>               whenever an POLLERR event is returned.
> 
>               Background:  this  option  was  added  when waking up on an
>               error condition occurred occured only via the  readfds  and
>               writefds  sets of select(2).  The option was added to allow
>               monitoring for error conditions via the exceptfds  argument
>               without simultaneously having to receive notifications (via
>               readfds) for regular data that can be read from the socket.
>               After changes in Linux 4.16, in Linux 4.16, the use of this
>               flag to achieve the desired notifications is no longer nec‐
>               essary.  This option is nevertheless retained for backwards
>               compatibility.
> 
> ?

I think the part "causes notification not only via the
exeptfds set" implies that the option causes notification
in other sets besides exceptfds. However, the option causes
notification in exceptfds (before Linux 4.16).

In "Background", before Linux 4.16, "waking up" happened
also in exeptfds (see 'Internal details' section), although
select() did not return.

A description covering poll() and select() cases plus wake
up behavior might be:

  When this option is set on a socket and an error condition
  triggers wake up (see Background below), an exeptional
  condition (POLLPRI of poll(2); exeptfds of select(2)) is
  returned if user requested it.

  Background:

  Before Linux 4.16, an error condition triggers wake up only
  if user requested POLLIN or POLLPRI (i.e. any of readfds,
  writefds or exeptfds of select(2)). However, for an error
  condition to be returned to the user instead of sleeping
  again in the kernel, POLLERR (i.e. readfds or writefds of
  select(2)) must also have been requested (implicit in
  poll(2)). The option eliminates this need in select(2) by
  returning POLLPRI (i.e. exeptfds) if user requested it.

  Since Linux 4.16, an error condition triggers wake up only
  if user requested POLLERR (i.e. readfds or writefds of
  select(2)). Wake up is not triggered when requesting only
  exeptfds, although returning on it occurs if the error
  condition was generated before calling select(2).

  // Linux 4.16 commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not
  // waking applications when errors are enqueued")

Another description, focusing on select(), might be:

  Before Linux 4.16, when this option is set on a socket and
  an error condition occurs, select(2) returns on exeptfds if
  user requested it. It is already returned on readfds and
  writefds. Since Linux 4.16, when the option is set, an error
  condition does not return via exeptfds anymore unless it
  occurred before calling select(2).

  For poll(2), regardless of the kernel version, the option
  causes POLLPRI to be added when POLLERR is returned.

  The option does not affect wake up, it affects only whether
  select(2) returns. The wake up behavior is affected in Linux
  4.16. Before this release, waking up on an error condition
  required requesting POLLIN or POLLPRI. However, for an error
  condition to be returned to the user instead of sleeping
  again in the kernel, POLLERR must also be requested. Since
  Linux 4.16, waking up requires requesting only POLLERR.

I have been rewriting this multiple times in the past two
weeks, and I still think it is not clear/simple enough.

What do you think? Please comment your understanding and
your ideas.

Internal details
================

The commit 6e5d58fdc9bedd0255a8 ("skbuff: Fix not waking
applications when errors are enqueued") introduced in Linux
4.16, changed the function that triggered the wake up. The
function sk_data_ready() (sock_def_readable()), which wakes
up the task if POLLIN or POLLPRI is requested, was replaced
by sk_error_report() (sock_queue_err_skb()), which wakes up
the task only if POLLERR is requested.

With the option (SO_SELECT_ERR_QUEUE) set, requesting only
exeptfds (POLLPRI) does not intersect the trigger events
anymore, so the task is not woken. However, if POLLERR is
triggered __before__ calling select(), select() __will__
return because availability of events is checked before
sleep.

In select(), POLLPRI is always requested [1]. POLLERR is
requested by readfds and writefds [2]. POLLIN and POLLHUP
by readfds [2]. POLLOUT by writefds [2].

In poll(), user freely requests events, but POLLERR and
POLLHUP are always requested [3].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
    linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n443

[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
    linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n435

[3] https://git.kernel.org/pub/scm/linux/kernel/git/stable/
    linux.git/tree/fs/select.c?id=6e5d58fdc9bedd0255a8#n820

	pasquali

^ permalink raw reply

* [PATCH] airo: fix memory leaks
From: Wenwen Wang @ 2019-08-16  3:50 UTC (permalink / raw)
  To: Wenwen Wang
  Cc: Kalle Valo, David S. Miller, Herbert Xu, Dan Carpenter,
	Eric Biggers, Ard Biesheuvel,
	open list:NETWORKING DRIVERS (WIRELESS),
	open list:NETWORKING DRIVERS, open list

In proc_BSSList_open(), 'file->private_data' is allocated through kzalloc()
and 'data->rbuffer' is allocated through kmalloc(). In the following
execution, if an error occurs, they are not deallocated, leading to memory
leaks. To fix this issue, free the allocated memory regions before
returning the error.

Signed-off-by: Wenwen Wang <wenwen@cs.uga.edu>
---
 drivers/net/wireless/cisco/airo.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/cisco/airo.c b/drivers/net/wireless/cisco/airo.c
index 9342ffb..f43c065 100644
--- a/drivers/net/wireless/cisco/airo.c
+++ b/drivers/net/wireless/cisco/airo.c
@@ -5441,11 +5441,18 @@ static int proc_BSSList_open( struct inode *inode, struct file *file ) {
 			Cmd cmd;
 			Resp rsp;
 
-			if (ai->flags & FLAG_RADIO_MASK) return -ENETDOWN;
+			if (ai->flags & FLAG_RADIO_MASK) {
+				kfree(data->rbuffer);
+				kfree(file->private_data);
+				return -ENETDOWN;
+			}
 			memset(&cmd, 0, sizeof(cmd));
 			cmd.cmd=CMD_LISTBSS;
-			if (down_interruptible(&ai->sem))
+			if (down_interruptible(&ai->sem)) {
+				kfree(data->rbuffer);
+				kfree(file->private_data);
 				return -ERESTARTSYS;
+			}
 			issuecommand(ai, &cmd, &rsp);
 			up(&ai->sem);
 			data->readlen = 0;
-- 
2.7.4


^ permalink raw reply related

* Re: [PATCH net] tunnel: fix dev null pointer dereference when send pkg larger than mtu in collect_md mode
From: Hangbin Liu @ 2019-08-16  4:01 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: netdev, Stefano Brivio, wenxu, Alexei Starovoitov,
	David S . Miller
In-Reply-To: <20190816032418.GX18865@dhcp-12-139.nay.redhat.com>

On Fri, Aug 16, 2019 at 11:24:18AM +0800, Hangbin Liu wrote:
> If yes, how about just set the skb dst to rt->dst, as the
> iptunnel_xmit would do later.
> 
> skb_dst_drop(skb);
> skb_dst_set(skb, &rt->dst);
> 

Tested and this donesn't work good....

^ permalink raw reply

* Reminder: 8 active syzbot reports in "net/bpf" subsystem
From: Eric Biggers @ 2019-08-16  4:17 UTC (permalink / raw)
  To: netdev, bpf, David S. Miller, Alexei Starovoitov, Daniel Borkmann
  Cc: Martin KaFai Lau, Song Liu, Yonghong Song, syzkaller-bugs

[This email was generated by a script.  Let me know if you have any suggestions
to make it better, or if you want it re-generated with the latest status.]

Of the distinct crashes that syzbot has seen in the last week, I've manually
marked 8 of them as possibly being bugs in the "net/bpf" subsystem.  I've listed
these bug reports below.

Of these 8 reports, 1 was bisected to a commit from the following person:

	Alexei Starovoitov <ast@kernel.org>

I've manually checked that this bisection result looks plausible.

If you believe a bug report is no longer valid, please close it by sending a
'#syz fix', '#syz dup', or '#syz invalid' command in reply to the original
thread, as explained at https://goo.gl/tpsmEJ#status

If you believe I misattributed a bug report to the "net/bpf" subsystem, please
let me know and (if possible) forward it to the correct place.

Note: in total, I've actually assigned 42 open syzbot reports to this subsystem.
But to help focus people's efforts, I've only listed the 8 that have
(re-)occurred in the last week.  Let me know if you want the full list.

Here are the bug reports:

--------------------------------------------------------------------------------
Title:              WARNING in bpf_jit_free
Last occurred:      0 days ago
Reported:           395 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=d04f9c2ec11ab2678f7427795ff5170cb9eb2220
Original thread:    https://lore.kernel.org/lkml/000000000000e92d1805711f5552@google.com/T/#u

This bug has a C reproducer.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 5 replies; the last was 65 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+2ff1e7cb738fd3c41113@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000e92d1805711f5552@google.com

--------------------------------------------------------------------------------
Title:              WARNING: kernel stack frame pointer has bad value (2)
Last occurred:      1 day ago
Reported:           395 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=02a32f98a4e3b5a2ed6929aabdd28dd1618b9c03
Original thread:    https://lore.kernel.org/lkml/0000000000000956640571197f98@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug received 1 reply, 395 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+903cdd6bce9a6eb832a4@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000000956640571197f98@google.com

--------------------------------------------------------------------------------
Title:              BUG: unable to handle kernel paging request in bpf_prog_kallsyms_add
Last occurred:      0 days ago
Reported:           339 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=97f89d84d528e4f5150dcfbdeb97347bc8471e96
Original thread:    https://lore.kernel.org/lkml/0000000000009417ef0575802d44@google.com/T/#u

This bug has a syzkaller reproducer only.

The original thread for this bug received 2 replies; the last was 164 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+c827a78260579449ad39@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/0000000000009417ef0575802d44@google.com

--------------------------------------------------------------------------------
Title:              WARNING in bpf_prog_kallsyms_find
Last occurred:      0 days ago
Reported:           100 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=40b0c218e639f1d882b86abff2549cfe11c5101e
Original thread:    https://lore.kernel.org/lkml/000000000000a8fa360588580820@google.com/T/#u

This bug has a C reproducer.

No one replied to the original thread for this bug.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+89d1ce6e80218a6192d8@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000a8fa360588580820@google.com

--------------------------------------------------------------------------------
Title:              KASAN: use-after-free Read in sk_psock_unlink
Last occurred:      5 days ago
Reported:           293 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=d691981726208716cc7aec231fb915e27763d662
Original thread:    https://lore.kernel.org/lkml/000000000000fd342e05791cc86f@google.com/T/#u

This bug has a syzkaller reproducer only.

syzbot has bisected this bug, but I think the bisection result is incorrect.

The original thread for this bug received 1 reply, 85 days ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+3acd9f67a6a15766686e@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000fd342e05791cc86f@google.com

--------------------------------------------------------------------------------
Title:              KASAN: slab-out-of-bounds Read in do_jit
Last occurred:      0 days ago
Reported:           23 days ago
Branches:           Mainline and others
Dashboard link:     https://syzkaller.appspot.com/bug?id=3aacade388873fa82bd6d2efb6aaa9ab85964020
Original thread:    https://lore.kernel.org/lkml/000000000000a6ab6b058e5b899b@google.com/T/#u

This bug has a C reproducer.

This bug was bisected to:

		commit 2589726d12a1b12eaaa93c7f1ea64287e383c7a5
		Author: Alexei Starovoitov <ast@kernel.org>
		Date:   Sat Jun 15 19:12:20 2019 +0000

		  bpf: introduce bounded loops

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+6b40f58c6d280fa23b40@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000a6ab6b058e5b899b@google.com

--------------------------------------------------------------------------------
Title:              WARNING in is_bpf_text_address
Last occurred:      0 days ago
Reported:           55 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=2386340f7a641010bb1e17228d1e9319592c01ba
Original thread:    https://lore.kernel.org/lkml/00000000000000ac4f058bd50039@google.com/T/#u

This bug has a C reproducer.

The original thread for this bug has received 5 replies; the last was 2 hours
ago.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+bd3bba6ff3fcea7a6ec6@syzkaller.appspotmail.com

If you send any email or patch for this bug, please reply to the original
thread, which had activity only 2 hours ago.  For the git send-email command to
use, or tips on how to reply if the thread isn't in your mailbox, see the "Reply
instructions" at https://lore.kernel.org/r/00000000000000ac4f058bd50039@google.com

--------------------------------------------------------------------------------
Title:              memory leak in sock_hash_update_common
Last occurred:      7 days ago
Reported:           85 days ago
Branches:           Mainline
Dashboard link:     https://syzkaller.appspot.com/bug?id=9992588b3bbe2617f62f41b1162af9fc8ea4829c
Original thread:    https://lore.kernel.org/lkml/000000000000fa662405897c0774@google.com/T/#u

This bug has a syzkaller reproducer only.

No one has replied to the original thread for this bug yet.

If you fix this bug, please add the following tag to the commit:
    Reported-by: syzbot+30c7a1fc662026545124@syzkaller.appspotmail.com

If you send any email or patch for this bug, please consider replying to the
original thread.  For the git send-email command to use, or tips on how to reply
if the thread isn't in your mailbox, see the "Reply instructions" at
https://lore.kernel.org/r/000000000000fa662405897c0774@google.com

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox