Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net] cls_flower: fix use after free in flower S/W path
From: Cong Wang @ 2018-06-20 18:06 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Linux Kernel Network Developers, Jamal Hadi Salim, Jiri Pirko,
	Marcelo Ricardo Leitner, Paul Blakey
In-Reply-To: <e3a9263c79766595ee701b3d66c885ea3d2febe6.1529515725.git.pabeni@redhat.com>

On Wed, Jun 20, 2018 at 10:34 AM, Paolo Abeni <pabeni@redhat.com> wrote:
>
> +static void fl_mask_free(struct fl_flow_mask *mask)
> +{
> +       rhashtable_destroy(&mask->ht);

I don't believe you can call rhashtable_destroy() in BH
context, it acquires a mutex...

^ permalink raw reply

* Re: [PATCH v4 net-next] net:sched: add action inheritdsfield to skbedit
From: Marcelo Ricardo Leitner @ 2018-06-20 18:40 UTC (permalink / raw)
  To: Davide Caratti
  Cc: Michel Machado, Fu, Qiaobin, davem@davemloft.net,
	netdev@vger.kernel.org, jhs@mojatatu.com,
	xiyou.wangcong@gmail.com
In-Reply-To: <a1716a3accbbf9a72bbedb011862ac57eaddf481.camel@redhat.com>

On Wed, Jun 20, 2018 at 07:02:41PM +0200, Davide Caratti wrote:
...
> >     I agree that we should update the drop counter, but given that 
> > you're already converting the stats to be per-cpu counters, whatever we 
> > add now will be just symbolic since you're going to change it anyway.

It wouldn't be symbolic. One thing is to convert a given increment
into something else, another is to start increasing it for some (new)
reason.

> 
> that's ok for me also, as I can use the current v4 code for the rebase
> (and not wait for another respin) _ but let's hear what reviewers think.
> 
> >  If 
> > reviewers think that Qiaobin's patch must add the update line, could you 
> > provide the exact line and location so we avoid going to v6 of this patch?
> 
> In case, I was thinking of something like: 
> 
> https://elixir.bootlin.com/linux/v4.18-rc1/source/net/sched/act_ipt.c#L249
> 
> so, between 'err:' and 'spin_unlock(&d->tcf_lock)', insert a line like:
> 
> d->tcf_qstats.drop++;

I prefer the more complete version. Then it will have a more complete
(git) history and help people when troubleshooting.

Thanks,
Marcelo

^ permalink raw reply

* [PATCH bpf 0/2] tools: bpftool: small fixes for error handling in prog load
From: Jakub Kicinski @ 2018-06-20 18:42 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jakub Kicinski

Hi!

Two small fixes for error handling in bpftool prog load, first patch
removes a duplicated message, second one frees resources correctly.
Multiple error messages break JSON:

# bpftool -jp prog load tracex1_kern.o /sys/fs/bpf/a
{
    "error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
    "error": "failed to pin program"
}	

Jakub Kicinski (2):
  tools: bpftool: remove duplicated error message on prog load
  tools: bpftool: remember to close the libbpf object on prog load error

 tools/bpf/bpftool/prog.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [PATCH bpf 1/2] tools: bpftool: remove duplicated error message on prog load
From: Jakub Kicinski @ 2018-06-20 18:42 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20180620184246.18672-1-jakub.kicinski@netronome.com>

do_pin_fd() will already print out an error message if something
goes wrong.  Printing another error is unnecessary and will break
JSON output, since error messages are full objects:

$ bpftool -jp prog load tracex1_kern.o /sys/fs/bpf/a
{
    "error": "can't pin the object (/sys/fs/bpf/a): File exists"
},{
    "error": "failed to pin program"
}

Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/bpf/bpftool/prog.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 05f42a46d6ed..12b694fe0404 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -694,10 +694,8 @@ static int do_load(int argc, char **argv)
 		return -1;
 	}
 
-	if (do_pin_fd(prog_fd, argv[1])) {
-		p_err("failed to pin program");
+	if (do_pin_fd(prog_fd, argv[1]))
 		return -1;
-	}
 
 	if (json_output)
 		jsonw_null(json_wtr);
-- 
2.17.1

^ permalink raw reply related

* [PATCH bpf 2/2] tools: bpftool: remember to close the libbpf object after prog load
From: Jakub Kicinski @ 2018-06-20 18:42 UTC (permalink / raw)
  To: alexei.starovoitov, daniel; +Cc: netdev, oss-drivers, Jakub Kicinski
In-Reply-To: <20180620184246.18672-1-jakub.kicinski@netronome.com>

Remembering to close all descriptors and free memory may not seem
important in a user space tool like bpftool, but if we were to run
in batch mode the consumed resources start to add up quickly.  Make
sure program load closes the libbpf object (which unloads and frees
it).

Fixes: 49a086c201a9 ("bpftool: implement prog load command")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
---
 tools/bpf/bpftool/prog.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 12b694fe0404..959aa53ab678 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -695,12 +695,18 @@ static int do_load(int argc, char **argv)
 	}
 
 	if (do_pin_fd(prog_fd, argv[1]))
-		return -1;
+		goto err_close_obj;
 
 	if (json_output)
 		jsonw_null(json_wtr);
 
+	bpf_object__close(obj);
+
 	return 0;
+
+err_close_obj:
+	bpf_object__close(obj);
+	return -1;
 }
 
 static int do_help(int argc, char **argv)
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH] r8169: Fix netpoll oops
From: Heiner Kallweit @ 2018-06-20 18:51 UTC (permalink / raw)
  To: Ville Syrjala, netdev; +Cc: Realtek linux nic maintainers, David S . Miller
In-Reply-To: <20180620120153.11676-1-ville.syrjala@linux.intel.com>

On 20.06.2018 14:01, Ville Syrjala wrote:
> From: Ville Syrjälä <ville.syrjala@linux.intel.com>
> 
> Pass the correct thing to rtl8169_interrupt() from netpoll.
> 
Indeed, this place I missed to change.

Only small remark: The patch should be annotated as "net" so
that it can go to 4.18.

> Cc: Realtek linux nic maintainers <nic_swsd@realtek.com>
> Cc: netdev@vger.kernel.org
> Cc: Heiner Kallweit <hkallweit1@gmail.com>
> Cc: David S. Miller <davem@davemloft.net>
> Fixes: ebcd5daa7ffd ("r8169: change interrupt handler argument type")
> Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
> ---
>  drivers/net/ethernet/realtek/r8169.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 75dfac0248f4..f4cae2be0fda 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -7148,7 +7148,7 @@ static void rtl8169_netpoll(struct net_device *dev)
>  {
>  	struct rtl8169_private *tp = netdev_priv(dev);
>  
> -	rtl8169_interrupt(pci_irq_vector(tp->pci_dev, 0), dev);
> +	rtl8169_interrupt(pci_irq_vector(tp->pci_dev, 0), tp);
>  }
>  #endif
>  
> 

^ permalink raw reply

* Re: Route fallback issue
From: Julian Anastasov @ 2018-06-20 19:00 UTC (permalink / raw)
  To: Akshat Kakkar; +Cc: netdev, cronolog+lartc, lartc, Erik Auerswald
In-Reply-To: <CAA5aLPiPBTEp4eMs9T6xA3dkOPry3H=oD2svo3Ybon3dEgh3gA@mail.gmail.com>


	Hello,

On Wed, 20 Jun 2018, Akshat Kakkar wrote:

> Hi netdev community,
> 
> I have 2 interfaces
> eno1 : 192.168.1.10/24
> eno2 : 192.168.2.10/24
> 
> I added routes as
> 172.16.0.0/12 via 192.168.1.254 metric 1
> 172.16.0.0/12 via 192.168.2.254 metric 2
> 
> My intention : All traffic to 172.16.0.0/12 should go thru eno1. If
> 192.168.1.254 is not reachable (no arp entry or link down), then it
> should fall back to eno2.

	You can also try alternative routes. But as the
kernel supports only default alternative routes, you can
put them in their own table:

# Alternative routes use same metric!!!
ip route append default via 192.168.1.254 dev eno1 table 100
ip route append default via 192.168.2.254 dev eno2 table 100
ip rule add prio 100 to 172.16.0.0/12 table 100

	Of course, you will get better results if an user space
tool puts only alive routes in service after doing health
checks of all near gateways.

Regards

^ permalink raw reply

* Re: [PATCH net-next 0/2] fixes for ipsec selftests
From: Anders Roxell @ 2018-06-20 19:09 UTC (permalink / raw)
  To: shannon.nelson; +Cc: Networking, David Miller
In-Reply-To: <1529473363-4036-1-git-send-email-shannon.nelson@oracle.com>

On Wed, 20 Jun 2018 at 07:42, Shannon Nelson <shannon.nelson@oracle.com> wrote:
>
> A couple of bad behaviors in the ipsec selftest were pointed out
> by Anders Roxell <anders.roxell@linaro.org> and are addressed here.
>
> Shannon Nelson (2):
>   selftests: rtnetlink: hide complaint from terminated monitor
>   selftests: rtnetlink: use a local IP address for IPsec tests
>
>  tools/testing/selftests/net/rtnetlink.sh | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
>
> --
> 2.7.4
>

Hi Shannon,

With this patches applied and my config patch.

I still get this error when I run the ipsec test:

FAIL: can't add fou port 7777, skipping test
RTNETLINK answers: Operation not supported
FAIL: can't add macsec interface, skipping test
RTNETLINK answers: Protocol not supported
RTNETLINK answers: No such process
RTNETLINK answers: No such process
FAIL: ipsec

Can you please cc the kselftest list when sending patches to
tools/testing/selftests/ ?

Cheers,
Anders

^ permalink raw reply

* net/usb: Use irqsave in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, tglx, David S. Miller

This is about using _irqsave() primitives in the completion callback in
order to get rid of local_irq_save() in __usb_hcd_giveback_urb().

Sebastian

^ permalink raw reply

* [PATCH 2/5] net: usb: hso: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, tglx, David S. Miller, Sebastian Andrzej Siewior
In-Reply-To: <20180620193121.24967-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/usb/hso.c | 35 +++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c
index e53883ad6107..de305ead32e6 100644
--- a/drivers/net/usb/hso.c
+++ b/drivers/net/usb/hso.c
@@ -999,6 +999,7 @@ static void read_bulk_callback(struct urb *urb)
 	struct hso_net *odev = urb->context;
 	struct net_device *net;
 	int result;
+	unsigned long flags;
 	int status = urb->status;
 
 	/* is al ok?  (Filip: Who's Al ?) */
@@ -1028,11 +1029,11 @@ static void read_bulk_callback(struct urb *urb)
 	if (urb->actual_length) {
 		/* Handle the IP stream, add header and push it onto network
 		 * stack if the packet is complete. */
-		spin_lock(&odev->net_lock);
+		spin_lock_irqsave(&odev->net_lock, flags);
 		packetizeRx(odev, urb->transfer_buffer, urb->actual_length,
 			    (urb->transfer_buffer_length >
 			     urb->actual_length) ? 1 : 0);
-		spin_unlock(&odev->net_lock);
+		spin_unlock_irqrestore(&odev->net_lock, flags);
 	}
 
 	/* We are done with this URB, resubmit it. Prep the USB to wait for
@@ -1193,6 +1194,7 @@ static void hso_std_serial_read_bulk_callback(struct urb *urb)
 {
 	struct hso_serial *serial = urb->context;
 	int status = urb->status;
+	unsigned long flags;
 
 	hso_dbg(0x8, "--- Got serial_read_bulk callback %02x ---\n", status);
 
@@ -1216,10 +1218,10 @@ static void hso_std_serial_read_bulk_callback(struct urb *urb)
 	if (serial->parent->port_spec & HSO_INFO_CRC_BUG)
 		fix_crc_bug(urb, serial->in_endp->wMaxPacketSize);
 	/* Valid data, handle RX data */
-	spin_lock(&serial->serial_lock);
+	spin_lock_irqsave(&serial->serial_lock, flags);
 	serial->rx_urb_filled[hso_urb_to_index(serial, urb)] = 1;
 	put_rxbuf_data_and_resubmit_bulk_urb(serial);
-	spin_unlock(&serial->serial_lock);
+	spin_unlock_irqrestore(&serial->serial_lock, flags);
 }
 
 /*
@@ -1502,12 +1504,13 @@ static void tiocmget_intr_callback(struct urb *urb)
 		DUMP(serial_state_notification,
 		     sizeof(struct hso_serial_state_notification));
 	} else {
+		unsigned long flags;
 
 		UART_state_bitmap = le16_to_cpu(serial_state_notification->
 						UART_state_bitmap);
 		prev_UART_state_bitmap = tiocmget->prev_UART_state_bitmap;
 		icount = &tiocmget->icount;
-		spin_lock(&serial->serial_lock);
+		spin_lock_irqsave(&serial->serial_lock, flags);
 		if ((UART_state_bitmap & B_OVERRUN) !=
 		   (prev_UART_state_bitmap & B_OVERRUN))
 			icount->parity++;
@@ -1530,7 +1533,7 @@ static void tiocmget_intr_callback(struct urb *urb)
 		   (prev_UART_state_bitmap & B_RX_CARRIER))
 			icount->dcd++;
 		tiocmget->prev_UART_state_bitmap = UART_state_bitmap;
-		spin_unlock(&serial->serial_lock);
+		spin_unlock_irqrestore(&serial->serial_lock, flags);
 		tiocmget->intr_completed = 1;
 		wake_up_interruptible(&tiocmget->waitq);
 	}
@@ -1852,6 +1855,7 @@ static void intr_callback(struct urb *urb)
 	struct hso_serial *serial;
 	unsigned char *port_req;
 	int status = urb->status;
+	unsigned long flags;
 	int i;
 
 	usb_mark_last_busy(urb->dev);
@@ -1879,7 +1883,7 @@ static void intr_callback(struct urb *urb)
 			if (serial != NULL) {
 				hso_dbg(0x1, "Pending read interrupt on port %d\n",
 					i);
-				spin_lock(&serial->serial_lock);
+				spin_lock_irqsave(&serial->serial_lock, flags);
 				if (serial->rx_state == RX_IDLE &&
 					serial->port.count > 0) {
 					/* Setup and send a ctrl req read on
@@ -1893,7 +1897,8 @@ static void intr_callback(struct urb *urb)
 					hso_dbg(0x1, "Already a read pending on port %d or port not open\n",
 						i);
 				}
-				spin_unlock(&serial->serial_lock);
+				spin_unlock_irqrestore(&serial->serial_lock,
+						       flags);
 			}
 		}
 	}
@@ -1920,6 +1925,7 @@ static void hso_std_serial_write_bulk_callback(struct urb *urb)
 {
 	struct hso_serial *serial = urb->context;
 	int status = urb->status;
+	unsigned long flags;
 
 	/* sanity check */
 	if (!serial) {
@@ -1927,9 +1933,9 @@ static void hso_std_serial_write_bulk_callback(struct urb *urb)
 		return;
 	}
 
-	spin_lock(&serial->serial_lock);
+	spin_lock_irqsave(&serial->serial_lock, flags);
 	serial->tx_urb_used = 0;
-	spin_unlock(&serial->serial_lock);
+	spin_unlock_irqrestore(&serial->serial_lock, flags);
 	if (status) {
 		handle_usb_error(status, __func__, serial->parent);
 		return;
@@ -1971,14 +1977,15 @@ static void ctrl_callback(struct urb *urb)
 	struct hso_serial *serial = urb->context;
 	struct usb_ctrlrequest *req;
 	int status = urb->status;
+	unsigned long flags;
 
 	/* sanity check */
 	if (!serial)
 		return;
 
-	spin_lock(&serial->serial_lock);
+	spin_lock_irqsave(&serial->serial_lock, flags);
 	serial->tx_urb_used = 0;
-	spin_unlock(&serial->serial_lock);
+	spin_unlock_irqrestore(&serial->serial_lock, flags);
 	if (status) {
 		handle_usb_error(status, __func__, serial->parent);
 		return;
@@ -1994,9 +2001,9 @@ static void ctrl_callback(struct urb *urb)
 	    (USB_DIR_IN | USB_TYPE_OPTION_VENDOR | USB_RECIP_INTERFACE)) {
 		/* response to a read command */
 		serial->rx_urb_filled[0] = 1;
-		spin_lock(&serial->serial_lock);
+		spin_lock_irqsave(&serial->serial_lock, flags);
 		put_rxbuf_data_and_resubmit_ctrl_urb(serial);
-		spin_unlock(&serial->serial_lock);
+		spin_unlock_irqrestore(&serial->serial_lock, flags);
 	} else {
 		hso_put_activity(serial->parent);
 		tty_port_tty_wakeup(&serial->port);
-- 
2.17.1

^ permalink raw reply related

* [PATCH 3/5] net: usb: kaweth: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, tglx, David S. Miller, Sebastian Andrzej Siewior
In-Reply-To: <20180620193121.24967-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/usb/kaweth.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/kaweth.c b/drivers/net/usb/kaweth.c
index f1605833c5cf..913e50bab0a2 100644
--- a/drivers/net/usb/kaweth.c
+++ b/drivers/net/usb/kaweth.c
@@ -587,7 +587,7 @@ static void kaweth_usb_receive(struct urb *urb)
 	struct kaweth_device *kaweth = urb->context;
 	struct net_device *net = kaweth->net;
 	int status = urb->status;
-
+	unsigned long flags;
 	int count = urb->actual_length;
 	int count2 = urb->transfer_buffer_length;
 
@@ -619,12 +619,12 @@ static void kaweth_usb_receive(struct urb *urb)
 		net->stats.rx_errors++;
 		dev_dbg(dev, "Status was -EOVERFLOW.\n");
 	}
-	spin_lock(&kaweth->device_lock);
+	spin_lock_irqsave(&kaweth->device_lock, flags);
 	if (IS_BLOCKED(kaweth->status)) {
-		spin_unlock(&kaweth->device_lock);
+		spin_unlock_irqrestore(&kaweth->device_lock, flags);
 		return;
 	}
-	spin_unlock(&kaweth->device_lock);
+	spin_unlock_irqrestore(&kaweth->device_lock, flags);
 
 	if(status && status != -EREMOTEIO && count != 1) {
 		dev_err(&kaweth->intf->dev,
-- 
2.17.1

^ permalink raw reply related

* [PATCH 4/5] net: usb: r8152: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, tglx, David S. Miller, Sebastian Andrzej Siewior
In-Reply-To: <20180620193121.24967-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/usb/r8152.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index 86f7196f9d91..c08c0d633407 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -1252,6 +1252,7 @@ static void read_bulk_callback(struct urb *urb)
 	int status = urb->status;
 	struct rx_agg *agg;
 	struct r8152 *tp;
+	unsigned long flags;
 
 	agg = urb->context;
 	if (!agg)
@@ -1281,9 +1282,9 @@ static void read_bulk_callback(struct urb *urb)
 		if (urb->actual_length < ETH_ZLEN)
 			break;
 
-		spin_lock(&tp->rx_lock);
+		spin_lock_irqsave(&tp->rx_lock, flags);
 		list_add_tail(&agg->list, &tp->rx_done);
-		spin_unlock(&tp->rx_lock);
+		spin_unlock_irqrestore(&tp->rx_lock, flags);
 		napi_schedule(&tp->napi);
 		return;
 	case -ESHUTDOWN:
@@ -1311,6 +1312,7 @@ static void write_bulk_callback(struct urb *urb)
 	struct net_device *netdev;
 	struct tx_agg *agg;
 	struct r8152 *tp;
+	unsigned long flags;
 	int status = urb->status;
 
 	agg = urb->context;
@@ -1332,9 +1334,9 @@ static void write_bulk_callback(struct urb *urb)
 		stats->tx_bytes += agg->skb_len;
 	}
 
-	spin_lock(&tp->tx_lock);
+	spin_lock_irqsave(&tp->tx_lock, flags);
 	list_add_tail(&agg->list, &tp->tx_free);
-	spin_unlock(&tp->tx_lock);
+	spin_unlock_irqrestore(&tp->tx_lock, flags);
 
 	usb_autopm_put_interface_async(tp->intf);
 
-- 
2.17.1

^ permalink raw reply related

* [PATCH 1/5] net: usb: cdc-phonet: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev; +Cc: linux-usb, tglx, David S. Miller, Sebastian Andrzej Siewior
In-Reply-To: <20180620193121.24967-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/usb/cdc-phonet.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/cdc-phonet.c b/drivers/net/usb/cdc-phonet.c
index 288ecd999171..3c40312fa453 100644
--- a/drivers/net/usb/cdc-phonet.c
+++ b/drivers/net/usb/cdc-phonet.c
@@ -99,6 +99,7 @@ static void tx_complete(struct urb *req)
 	struct net_device *dev = skb->dev;
 	struct usbpn_dev *pnd = netdev_priv(dev);
 	int status = req->status;
+	unsigned long flags;
 
 	switch (status) {
 	case 0:
@@ -115,10 +116,10 @@ static void tx_complete(struct urb *req)
 	}
 	dev->stats.tx_packets++;
 
-	spin_lock(&pnd->tx_lock);
+	spin_lock_irqsave(&pnd->tx_lock, flags);
 	pnd->tx_queue--;
 	netif_wake_queue(dev);
-	spin_unlock(&pnd->tx_lock);
+	spin_unlock_irqrestore(&pnd->tx_lock, flags);
 
 	dev_kfree_skb_any(skb);
 	usb_free_urb(req);
-- 
2.17.1

^ permalink raw reply related

* [PATCH 5/5] net: usb: rtl8150: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:31 UTC (permalink / raw)
  To: netdev
  Cc: linux-usb, tglx, David S. Miller, Sebastian Andrzej Siewior,
	Petko Manolov
In-Reply-To: <20180620193121.24967-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: Petko Manolov <petkan@nucleusys.com>
Cc: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/usb/rtl8150.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/net/usb/rtl8150.c b/drivers/net/usb/rtl8150.c
index 5f565bd574da..0e81d4c441d9 100644
--- a/drivers/net/usb/rtl8150.c
+++ b/drivers/net/usb/rtl8150.c
@@ -391,6 +391,7 @@ static void read_bulk_callback(struct urb *urb)
 	u16 rx_stat;
 	int status = urb->status;
 	int result;
+	unsigned long flags;
 
 	dev = urb->context;
 	if (!dev)
@@ -432,9 +433,9 @@ static void read_bulk_callback(struct urb *urb)
 	netdev->stats.rx_packets++;
 	netdev->stats.rx_bytes += pkt_len;
 
-	spin_lock(&dev->rx_pool_lock);
+	spin_lock_irqsave(&dev->rx_pool_lock, flags);
 	skb = pull_skb(dev);
-	spin_unlock(&dev->rx_pool_lock);
+	spin_unlock_irqrestore(&dev->rx_pool_lock, flags);
 	if (!skb)
 		goto resched;
 
-- 
2.17.1

^ permalink raw reply related

* [PATCH 0/4] wireless: Use irqsave in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:36 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, tglx-hfZtesqFncYOwBW4kG4KsQ,
	David S. Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	Kalle Valo

This is about using _irqsave() primitives in the completion callback in order
to get rid of local_irq_save() in __usb_hcd_giveback_urb().

Sebastian

^ permalink raw reply

* [PATCH 1/4] ath9k: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:36 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, tglx-hfZtesqFncYOwBW4kG4KsQ,
	David S. Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	Kalle Valo, Sebastian Andrzej Siewior, QCA ath9k Development
In-Reply-To: <20180620193648.25136-1-bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: QCA ath9k Development <ath9k-devel-A+ZNKFmMK5xy9aJCnZT0Uw@public.gmane.org>
Cc: Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
---
 drivers/net/wireless/ath/ath9k/hif_usb.c      |  7 ++++---
 drivers/net/wireless/ath/ath9k/htc_drv_txrx.c |  9 +++++----
 drivers/net/wireless/ath/ath9k/wmi.c          | 11 ++++++-----
 3 files changed, 15 insertions(+), 12 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
index cb0eef13af1c..fb649d85b8fc 100644
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -138,6 +138,7 @@ static void hif_usb_mgmt_cb(struct urb *urb)
 {
 	struct cmd_buf *cmd = (struct cmd_buf *)urb->context;
 	struct hif_device_usb *hif_dev;
+	unsigned long flags;
 	bool txok = true;
 
 	if (!cmd || !cmd->skb || !cmd->hif_dev)
@@ -158,14 +159,14 @@ static void hif_usb_mgmt_cb(struct urb *urb)
 		 * If the URBs are being flushed, no need to complete
 		 * this packet.
 		 */
-		spin_lock(&hif_dev->tx.tx_lock);
+		spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
 		if (hif_dev->tx.flags & HIF_USB_TX_FLUSH) {
-			spin_unlock(&hif_dev->tx.tx_lock);
+			spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
 			dev_kfree_skb_any(cmd->skb);
 			kfree(cmd);
 			return;
 		}
-		spin_unlock(&hif_dev->tx.tx_lock);
+		spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
 
 		break;
 	default:
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
index 585736a837ed..799010ed04e0 100644
--- a/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
+++ b/drivers/net/wireless/ath/ath9k/htc_drv_txrx.c
@@ -1107,25 +1107,26 @@ void ath9k_htc_rxep(void *drv_priv, struct sk_buff *skb,
 	struct ath_hw *ah = priv->ah;
 	struct ath_common *common = ath9k_hw_common(ah);
 	struct ath9k_htc_rxbuf *rxbuf = NULL, *tmp_buf = NULL;
+	unsigned long flags;
 
-	spin_lock(&priv->rx.rxbuflock);
+	spin_lock_irqsave(&priv->rx.rxbuflock, flags);
 	list_for_each_entry(tmp_buf, &priv->rx.rxbuf, list) {
 		if (!tmp_buf->in_process) {
 			rxbuf = tmp_buf;
 			break;
 		}
 	}
-	spin_unlock(&priv->rx.rxbuflock);
+	spin_unlock_irqrestore(&priv->rx.rxbuflock, flags);
 
 	if (rxbuf == NULL) {
 		ath_dbg(common, ANY, "No free RX buffer\n");
 		goto err;
 	}
 
-	spin_lock(&priv->rx.rxbuflock);
+	spin_lock_irqsave(&priv->rx.rxbuflock, flags);
 	rxbuf->skb = skb;
 	rxbuf->in_process = true;
-	spin_unlock(&priv->rx.rxbuflock);
+	spin_unlock_irqrestore(&priv->rx.rxbuflock, flags);
 
 	tasklet_schedule(&priv->rx_tasklet);
 	return;
diff --git a/drivers/net/wireless/ath/ath9k/wmi.c b/drivers/net/wireless/ath/ath9k/wmi.c
index b0b5579b7560..d1f6710ca63b 100644
--- a/drivers/net/wireless/ath/ath9k/wmi.c
+++ b/drivers/net/wireless/ath/ath9k/wmi.c
@@ -209,6 +209,7 @@ static void ath9k_wmi_ctrl_rx(void *priv, struct sk_buff *skb,
 {
 	struct wmi *wmi = priv;
 	struct wmi_cmd_hdr *hdr;
+	unsigned long flags;
 	u16 cmd_id;
 
 	if (unlikely(wmi->stopped))
@@ -218,20 +219,20 @@ static void ath9k_wmi_ctrl_rx(void *priv, struct sk_buff *skb,
 	cmd_id = be16_to_cpu(hdr->command_id);
 
 	if (cmd_id & 0x1000) {
-		spin_lock(&wmi->wmi_lock);
+		spin_lock_irqsave(&wmi->wmi_lock, flags);
 		__skb_queue_tail(&wmi->wmi_event_queue, skb);
-		spin_unlock(&wmi->wmi_lock);
+		spin_unlock_irqrestore(&wmi->wmi_lock, flags);
 		tasklet_schedule(&wmi->wmi_event_tasklet);
 		return;
 	}
 
 	/* Check if there has been a timeout. */
-	spin_lock(&wmi->wmi_lock);
+	spin_lock_irqsave(&wmi->wmi_lock, flags);
 	if (be16_to_cpu(hdr->seq_no) != wmi->last_seq_id) {
-		spin_unlock(&wmi->wmi_lock);
+		spin_unlock_irqrestore(&wmi->wmi_lock, flags);
 		goto free_skb;
 	}
-	spin_unlock(&wmi->wmi_lock);
+	spin_unlock_irqrestore(&wmi->wmi_lock, flags);
 
 	/* WMI command response */
 	ath9k_wmi_rsp_callback(wmi, skb);
-- 
2.17.1

^ permalink raw reply related

* [PATCH 3/4] libertas: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:36 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA
  Cc: linux-usb-u79uwXL29TY76Z2rM5mHXA, tglx-hfZtesqFncYOwBW4kG4KsQ,
	David S. Miller, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	Kalle Valo, Sebastian Andrzej Siewior,
	libertas-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <20180620193648.25136-1-bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

I am removing the
	BUG_ON(!in_interrupt());

check because it serves no purpose. Running the completion callback in
BH context makes in_interrupt() still return true but the interrupts
could be enabled. The important part is that ->driver_lock is acquired
with disabled interrupts which is the case now.

Cc: Kalle Valo <kvalo-sgV2jX0FEOL9JmXXK+q4OQ@public.gmane.org>
Cc: "David S. Miller" <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
Cc: libertas-dev-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org
Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy-hfZtesqFncYOwBW4kG4KsQ@public.gmane.org>
---
 drivers/net/wireless/marvell/libertas/if_usb.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/libertas/if_usb.c b/drivers/net/wireless/marvell/libertas/if_usb.c
index ffea610f67e2..c67a8e7be310 100644
--- a/drivers/net/wireless/marvell/libertas/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas/if_usb.c
@@ -614,6 +614,7 @@ static inline void process_cmdrequest(int recvlength, uint8_t *recvbuff,
 				      struct if_usb_card *cardp,
 				      struct lbs_private *priv)
 {
+	unsigned long flags;
 	u8 i;
 
 	if (recvlength > LBS_CMD_BUFFER_SIZE) {
@@ -623,9 +624,7 @@ static inline void process_cmdrequest(int recvlength, uint8_t *recvbuff,
 		return;
 	}
 
-	BUG_ON(!in_interrupt());

^ permalink raw reply related

* [PATCH 2/4] libertas_tf: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:36 UTC (permalink / raw)
  To: netdev
  Cc: linux-usb, tglx, David S. Miller, linux-wireless, Kalle Valo,
	Sebastian Andrzej Siewior
In-Reply-To: <20180620193648.25136-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

I am removing the
	BUG_ON(!in_interrupt());

check because it serves no purpose. Running the completion callback in
BH context makes in_interrupt() still return true but the interrupts
could be enabled. The important part is that ->driver_lock is acquired
with disabled interrupts which is the case now.

Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/wireless/marvell/libertas_tf/if_usb.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireless/marvell/libertas_tf/if_usb.c b/drivers/net/wireless/marvell/libertas_tf/if_usb.c
index 5153922e7ce1..e92fc5001171 100644
--- a/drivers/net/wireless/marvell/libertas_tf/if_usb.c
+++ b/drivers/net/wireless/marvell/libertas_tf/if_usb.c
@@ -603,6 +603,8 @@ static inline void process_cmdrequest(int recvlength, uint8_t *recvbuff,
 				      struct if_usb_card *cardp,
 				      struct lbtf_private *priv)
 {
+	unsigned long flags;
+
 	if (recvlength > LBS_CMD_BUFFER_SIZE) {
 		lbtf_deb_usbd(&cardp->udev->dev,
 			     "The receive buffer is too large\n");
@@ -610,14 +612,12 @@ static inline void process_cmdrequest(int recvlength, uint8_t *recvbuff,
 		return;
 	}
 
-	BUG_ON(!in_interrupt());
-
-	spin_lock(&priv->driver_lock);
+	spin_lock_irqsave(&priv->driver_lock, flags);
 	memcpy(priv->cmd_resp_buff, recvbuff + MESSAGE_HEADER_LEN,
 	       recvlength - MESSAGE_HEADER_LEN);
 	kfree_skb(skb);
 	lbtf_cmd_response_rx(priv);
-	spin_unlock(&priv->driver_lock);
+	spin_unlock_irqrestore(&priv->driver_lock, flags);
 }
 
 /**
-- 
2.17.1

^ permalink raw reply related

* [PATCH 4/4] zd1211rw: use irqsave() in USB's complete callback
From: Sebastian Andrzej Siewior @ 2018-06-20 19:36 UTC (permalink / raw)
  To: netdev
  Cc: linux-usb, tglx, David S. Miller, linux-wireless, Kalle Valo,
	Sebastian Andrzej Siewior, Daniel Drake, Ulrich Kunitz
In-Reply-To: <20180620193648.25136-1-bigeasy@linutronix.de>

The USB completion callback does not disable interrupts while acquiring
the lock. We want to remove the local_irq_disable() invocation from
__usb_hcd_giveback_urb() and therefore it is required for the callback
handler to disable the interrupts while acquiring the lock.
The callback may be invoked either in IRQ or BH context depending on the
USB host controller.
Use the _irqsave() variant of the locking primitives.

Cc: Daniel Drake <dsd@gentoo.org>
Cc: Ulrich Kunitz <kune@deine-taler.de>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-wireless@vger.kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 drivers/net/wireless/zydas/zd1211rw/zd_usb.c | 21 +++++++++++---------
 1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/net/wireless/zydas/zd1211rw/zd_usb.c b/drivers/net/wireless/zydas/zd1211rw/zd_usb.c
index c30bf118c67d..c2cda3acd4af 100644
--- a/drivers/net/wireless/zydas/zd1211rw/zd_usb.c
+++ b/drivers/net/wireless/zydas/zd1211rw/zd_usb.c
@@ -371,25 +371,27 @@ static inline void handle_regs_int_override(struct urb *urb)
 {
 	struct zd_usb *usb = urb->context;
 	struct zd_usb_interrupt *intr = &usb->intr;
+	unsigned long flags;
 
-	spin_lock(&intr->lock);
+	spin_lock_irqsave(&intr->lock, flags);
 	if (atomic_read(&intr->read_regs_enabled)) {
 		atomic_set(&intr->read_regs_enabled, 0);
 		intr->read_regs_int_overridden = 1;
 		complete(&intr->read_regs.completion);
 	}
-	spin_unlock(&intr->lock);
+	spin_unlock_irqrestore(&intr->lock, flags);
 }
 
 static inline void handle_regs_int(struct urb *urb)
 {
 	struct zd_usb *usb = urb->context;
 	struct zd_usb_interrupt *intr = &usb->intr;
+	unsigned long flags;
 	int len;
 	u16 int_num;
 
 	ZD_ASSERT(in_interrupt());
-	spin_lock(&intr->lock);
+	spin_lock_irqsave(&intr->lock, flags);
 
 	int_num = le16_to_cpu(*(__le16 *)(urb->transfer_buffer+2));
 	if (int_num == CR_INTERRUPT) {
@@ -425,7 +427,7 @@ static inline void handle_regs_int(struct urb *urb)
 	}
 
 out:
-	spin_unlock(&intr->lock);
+	spin_unlock_irqrestore(&intr->lock, flags);
 
 	/* CR_INTERRUPT might override read_reg too. */
 	if (int_num == CR_INTERRUPT && atomic_read(&intr->read_regs_enabled))
@@ -665,6 +667,7 @@ static void rx_urb_complete(struct urb *urb)
 	struct zd_usb_rx *rx;
 	const u8 *buffer;
 	unsigned int length;
+	unsigned long flags;
 
 	switch (urb->status) {
 	case 0:
@@ -693,14 +696,14 @@ static void rx_urb_complete(struct urb *urb)
 		/* If there is an old first fragment, we don't care. */
 		dev_dbg_f(urb_dev(urb), "*** first fragment ***\n");
 		ZD_ASSERT(length <= ARRAY_SIZE(rx->fragment));
-		spin_lock(&rx->lock);
+		spin_lock_irqsave(&rx->lock, flags);
 		memcpy(rx->fragment, buffer, length);
 		rx->fragment_length = length;
-		spin_unlock(&rx->lock);
+		spin_unlock_irqrestore(&rx->lock, flags);
 		goto resubmit;
 	}
 
-	spin_lock(&rx->lock);
+	spin_lock_irqsave(&rx->lock, flags);
 	if (rx->fragment_length > 0) {
 		/* We are on a second fragment, we believe */
 		ZD_ASSERT(length + rx->fragment_length <=
@@ -710,9 +713,9 @@ static void rx_urb_complete(struct urb *urb)
 		handle_rx_packet(usb, rx->fragment,
 			         rx->fragment_length + length);
 		rx->fragment_length = 0;
-		spin_unlock(&rx->lock);
+		spin_unlock_irqrestore(&rx->lock, flags);
 	} else {
-		spin_unlock(&rx->lock);
+		spin_unlock_irqrestore(&rx->lock, flags);
 		handle_rx_packet(usb, buffer, length);
 	}
 
-- 
2.17.1

^ permalink raw reply related

* Re: Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Michael S. Tsirkin @ 2018-06-20 19:48 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Siwei Liu, Samudrala, Sridhar, Alexander Duyck, virtio-dev,
	aaron.f.brown, Jiri Pirko, Jakub Kicinski, Netdev, qemu-devel,
	virtualization, konrad.wilk, boris.ostrovsky, Joao Martins,
	Venu Busireddy, vijay.balakrishna
In-Reply-To: <20180620180619.6b4ee52d.cohuck@redhat.com>

On Wed, Jun 20, 2018 at 06:06:19PM +0200, Cornelia Huck wrote:
> In any case, I'm not sure anymore why we'd want the extra uuid.

It's mostly so we can have e.g. multiple devices with same MAC
(which some people seem to want in order to then use
then with different containers).

But it is also handy for when you assign a PF, since then you
can't set the MAC.

-- 
MST

^ permalink raw reply

* Re: Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio_net
From: Siwei Liu @ 2018-06-20 19:59 UTC (permalink / raw)
  To: Cornelia Huck
  Cc: Samudrala, Sridhar, Alexander Duyck, virtio-dev, aaron.f.brown,
	Jiri Pirko, Michael S. Tsirkin, Jakub Kicinski, Netdev,
	qemu-devel, virtualization, konrad.wilk, boris.ostrovsky,
	Joao Martins, Venu Busireddy, vijay.balakrishna
In-Reply-To: <20180620163408.737af8c2.cohuck@redhat.com>

On Wed, Jun 20, 2018 at 7:34 AM, Cornelia Huck <cohuck@redhat.com> wrote:
> On Tue, 19 Jun 2018 13:09:14 -0700
> Siwei Liu <loseweigh@gmail.com> wrote:
>
>> On Tue, Jun 19, 2018 at 3:54 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> > On Fri, 15 Jun 2018 10:06:07 -0700
>> > Siwei Liu <loseweigh@gmail.com> wrote:
>> >
>> >> On Fri, Jun 15, 2018 at 4:48 AM, Cornelia Huck <cohuck@redhat.com> wrote:
>> >> > On Thu, 14 Jun 2018 18:57:11 -0700
>> >> > Siwei Liu <loseweigh@gmail.com> wrote:
>
>> >> > I'm a bit confused here. What, exactly, ties the two devices together?
>> >>
>> >> The group UUID. Since QEMU VFIO dvice does not have insight of MAC
>> >> address (which it doesn't have to), the association between VFIO
>> >> passthrough and standby must be specificed for QEMU to understand the
>> >> relationship with this model. Note, standby feature is no longer
>> >> required to be exposed under this model.
>> >
>> > Isn't that a bit limiting, though?
>> >
>> > With this model, you can probably tie a vfio-pci device and a
>> > virtio-net-pci device together. But this will fail if you have
>> > different transports: Consider tying together a vfio-pci device and a
>> > virtio-net-ccw device on s390, for example. The standby feature bit is
>> > on the virtio-net level and should not have any dependency on the
>> > transport used.
>>
>> Probably we'd limit the support for grouping to virtio-net-pci device
>> and vfio-pci device only. For virtio-net-pci, as you might see with
>> Venu's patch, we store the group UUID on the config space of
>> virtio-pci, which is only applicable to PCI transport.
>>
>> If virtio-net-ccw needs to support the same, I think similar grouping
>> interface should be defined on the VirtIO CCW transport. I think the
>> current implementation of the Linux failover driver assumes that it's
>> SR-IOV VF with same MAC address which the virtio-net-pci needs to pair
>> with, and that the PV path is on same PF without needing to update
>> network of the port-MAC association change. If we need to extend the
>> grouping mechanism to virtio-net-ccw, it has to pass such failover
>> mode to virtio driver specifically through some other option I guess.
>
> Hm, I've just spent some time reading the Linux failover code and I did
> not really find much pci-related magic in there (other than checking
> for a pci device in net_failover_slave_pre_register). We also seem to
> look for a matching device by MAC only. What magic am I missing?

The existing assumptions around SR-IOV VF and thus PCI is implicit. A
lot of simplications are built on the fact that the passthrough device
is a SR-IOV Virtual Function specifically than others: MAC addresses
for couple devices must be the same, changing MAC address is
prohibited, programming VLAN filter is challenged, the datapath of
virtio-net has to share the same physical function where VF belongs
to. There's no hankshake during datapath switching at all to support a
normal passthrough device at this point. I'd imagine some work around
that ahead, which might be a bit involved than just to support a
simplified model for VF migration.

>
> Is the look-for-uuid handling supposed to happen in the host only?

The look-for-MAC matching scheme is not ideal in many aspects. I don't
want to repeat those again, but once the group UUID is added to QEMU,
the failover driver is supposed to switch to the UUID based matching
scheme in the guest.

>
>> >> > If libvirt already has the knowledge that it should manage the two as a
>> >> > couple, why do we need the group id (or something else for other
>> >> > architectures)? (Maybe I'm simply missing something because I'm not
>> >> > that familiar with pci.)
>> >>
>> >> The idea is to have QEMU control the visibility and enumeration order
>> >> of the passthrough VFIO for the failover scenario. Hotplug can be one
>> >> way to achieve it, and perhaps there's other way around also. The
>> >> group ID is not just for QEMU to couple devices, it's also helpful to
>> >> guest too as grouping using MAC address is just not safe.
>> >
>> > Sorry about dragging mainframes into this, but this will only work for
>> > homogenous device coupling, not for heterogenous. Consider my vfio-pci
>> > + virtio-net-ccw example again: The guest cannot find out that the two
>> > belong together by checking some group ID, it has to either use the MAC
>> > or some needs-to-be-architectured property.
>> >
>> > Alternatively, we could propose that mechanism as pci-only, which means
>> > we can rely on mechanisms that won't necessarily work on non-pci
>> > transports. (FWIW, I don't see a use case for using vfio-ccw to pass
>> > through a network card anytime in the near future, due to the nature of
>> > network cards currently in use on s390.)
>>
>> Yes, let's do this just for PCI transport (homogenous) for now.
>
> But why? Using pci for passthrough to make things easier (and because
> there's not really a use case), sure. But I really don't want to
> restrict this to virtio-pci only.

Of course, technically it doesn't have to be virtio-pci only. The
group UUID can even extend it further to non-pci transport. However,
with the current focus of the driver support on SR-IOV VF and limited
use case on non-pci, I'd feel no immediate effort will be needed on
that front.

>
>> >> In the model of (b), I think it essentially turns hotplug to one of
>> >> mechanisms for QEMU to control the visibility. The libvirt can still
>> >> manage the hotplug of individual devices during live migration or in
>> >> normal situation to hot add/remove devices. Though the visibility of
>> >> the VFIO is under the controll of QEMU, and it's possible that the hot
>> >> add/remove request does not involve actual hot plug activity in guest
>> >> at all.
>> >
>> > That depends on how you model visibility, I guess. You'll probably want
>> > to stop traffic flowing through one or the other of the cards; would
>> > link down or similar be enough for the virtio device?
>>
>> I'm not sure if it is a good idea. The guest user will see two devices
>> with same MAC but one of them is down. Do you expect user to use it or
>> not? And since the guest is going to be migrated, we need to unplug a
>> broken VF from guest before migrating, why do we bother plugging in
>> this useless VF at the first place?
>
> I was thinking about using hotunplugging only over migration and doing
> the link up only after feature negotiation has finished, but that is
> probably too complicated. Let's stick to hotplug for simplicity's sake.

OK. Thanks for the discussion, it's really useful.

Regards,
-Siwei

^ permalink raw reply

* [PATCH net-next] tcp_bbr: fix bbr pacing rate for internal pacing
From: Kevin Yang @ 2018-06-20 20:07 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Eric Dumazet, Kevin Yang

From: Eric Dumazet <edumazet@google.com>

This commit makes BBR use only the MSS (without any headers) to
calculate pacing rates when internal TCP-layer pacing is used.

This is necessary to achieve the correct pacing behavior in this case,
since tcp_internal_pacing() uses only the payload length to calculate
pacing delays.

Signed-off-by: Kevin Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
---
 include/net/tcp.h     | 11 +++++++++++
 net/ipv4/tcp_bbr.c    |  6 +++++-
 net/ipv4/tcp_output.c | 14 --------------
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0448e7c5d2b4..822ee49ed0f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1184,6 +1184,17 @@ static inline bool tcp_is_cwnd_limited(const struct sock *sk)
 	return tp->is_cwnd_limited;
 }
 
+/* BBR congestion control needs pacing.
+ * Same remark for SO_MAX_PACING_RATE.
+ * sch_fq packet scheduler is efficiently handling pacing,
+ * but is not always installed/used.
+ * Return true if TCP stack should pace packets itself.
+ */
+static inline bool tcp_needs_internal_pacing(const struct sock *sk)
+{
+	return smp_load_acquire(&sk->sk_pacing_status) == SK_PACING_NEEDED;
+}
+
 /* Something is really bad, we could not queue an additional packet,
  * because qdisc is full or receiver sent a 0 window.
  * We do not want to add fuel to the fire, or abort too early,
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 58e2f479ffb4..3b5f45b9e81e 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -205,7 +205,11 @@ static u32 bbr_bw(const struct sock *sk)
  */
 static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
 {
-	rate *= tcp_mss_to_mtu(sk, tcp_sk(sk)->mss_cache);
+	unsigned int mss = tcp_sk(sk)->mss_cache;
+
+	if (!tcp_needs_internal_pacing(sk))
+		mss = tcp_mss_to_mtu(sk, mss);
+	rate *= mss;
 	rate *= gain;
 	rate >>= BBR_SCALE;
 	rate *= USEC_PER_SEC;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8e08b409c71e..f8f6129160dd 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -973,17 +973,6 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-/* BBR congestion control needs pacing.
- * Same remark for SO_MAX_PACING_RATE.
- * sch_fq packet scheduler is efficiently handling pacing,
- * but is not always installed/used.
- * Return true if TCP stack should pace packets itself.
- */
-static bool tcp_needs_internal_pacing(const struct sock *sk)
-{
-	return smp_load_acquire(&sk->sk_pacing_status) == SK_PACING_NEEDED;
-}
-
 static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 {
 	u64 len_ns;
@@ -995,9 +984,6 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 	if (!rate || rate == ~0U)
 		return;
 
-	/* Should account for header sizes as sch_fq does,
-	 * but lets make things simple.
-	 */
 	len_ns = (u64)skb->len * NSEC_PER_SEC;
 	do_div(len_ns, rate);
 	hrtimer_start(&tcp_sk(sk)->pacing_timer,
-- 
2.18.0.rc1.244.gcf134e6275-goog

^ permalink raw reply related

* [PATCH bpf-next 0/3] bpf: btf: json print btf info with bpftool map dump
From: Okash Khawaja @ 2018-06-20 20:30 UTC (permalink / raw)
  To: Daniel Borkmann, Martin KaFai Lau, Alexei Starovoitov,
	Yonghong Song, Quentin Monnet, Jakub Kicinski, David S. Miller
  Cc: netdev, kernel-team, linux-kernel

Hi,

These patches augment the bpftool's map dump command with BTF info. In
particular, when user runs `bpftool map dump [-j|-p] id <map-id>`, they will
see map data formatted and tagged based upon BTF information associated with
that map. Here is what each patch does:

Patch 1 exports BTF functions inside libbpf, to be used by patch 2.

Patch 2 adds btf_dumper which uses type info exported in patch 1 along
with json_writer to json print or pretty json print map values alongside
btf debug info.

Patch 3 uses btf_dumper to json or pretty print map values when -j or -p
flag is specified to `btf map dump`.

Further details are included in patch descriptions.

Thanks,
Okash

^ permalink raw reply

* [PATCH bpf-next 1/3] bpf: btf: export btf types and name by offset from lib
From: Okash Khawaja @ 2018-06-20 20:30 UTC (permalink / raw)
  To: Daniel Borkmann, Martin KaFai Lau, Alexei Starovoitov,
	Yonghong Song, Quentin Monnet, Jakub Kicinski, David S. Miller
  Cc: netdev, kernel-team, linux-kernel
In-Reply-To: <20180620203051.223156973@fb.com>

[-- Attachment #1: 01-export-from-libbtf.patch --]
[-- Type: text/plain, Size: 4271 bytes --]

This patch introduces btf__resolve_type() function and exports two
existing functions from libbpf. btf__resolve_type follows modifier
types like const and typedef until it hits a type which actually takes
up memory, and then returns it. This function follows similar pattern
to btf__resolve_size but instead of computing size, it just returns
the type.

These  functions will be used in the followig patch which parses
information inside array of `struct btf_type *`. btf_name_by_offset is
used for printing variable names.

Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>

---
 tools/lib/bpf/btf.c |   65 ++++++++++++++++++++++++++++++++++++----------------
 tools/lib/bpf/btf.h |    3 ++
 2 files changed, 48 insertions(+), 20 deletions(-)

--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -17,6 +17,11 @@
 
 #define BTF_MAX_NR_TYPES 65535
 
+#define IS_MODIFIER(k) (((k) == BTF_KIND_TYPEDEF) || \
+		((k) == BTF_KIND_VOLATILE) || \
+		((k) == BTF_KIND_CONST) || \
+		((k) == BTF_KIND_RESTRICT))
+
 static struct btf_type btf_void;
 
 struct btf {
@@ -33,14 +38,6 @@ struct btf {
 	int fd;
 };
 
-static const char *btf_name_by_offset(const struct btf *btf, uint32_t offset)
-{
-	if (offset < btf->hdr->str_len)
-		return &btf->strings[offset];
-	else
-		return NULL;
-}
-
 static int btf_add_type(struct btf *btf, struct btf_type *t)
 {
 	if (btf->types_size - btf->nr_types < 2) {
@@ -190,15 +187,6 @@ static int btf_parse_type_sec(struct btf
 	return 0;
 }
 
-static const struct btf_type *btf_type_by_id(const struct btf *btf,
-					     uint32_t type_id)
-{
-	if (type_id > btf->nr_types)
-		return NULL;
-
-	return btf->types[type_id];
-}
-
 static bool btf_type_is_void(const struct btf_type *t)
 {
 	return t == &btf_void || BTF_INFO_KIND(t->info) == BTF_KIND_FWD;
@@ -234,7 +222,7 @@ int64_t btf__resolve_size(const struct b
 	int64_t size = -1;
 	int i;
 
-	t = btf_type_by_id(btf, type_id);
+	t = btf__type_by_id(btf, type_id);
 	for (i = 0; i < MAX_RESOLVE_DEPTH && !btf_type_is_void_or_null(t);
 	     i++) {
 		size = btf_type_size(t);
@@ -259,7 +247,7 @@ int64_t btf__resolve_size(const struct b
 			return -EINVAL;
 		}
 
-		t = btf_type_by_id(btf, type_id);
+		t = btf__type_by_id(btf, type_id);
 	}
 
 	if (size < 0)
@@ -271,6 +259,26 @@ int64_t btf__resolve_size(const struct b
 	return nelems * size;
 }
 
+int32_t btf__resolve_type(const struct btf *btf, uint32_t type_id)
+{
+	const struct btf_type *t;
+	int depth = 0;
+
+	t = btf__type_by_id(btf, type_id);
+	while (depth < MAX_RESOLVE_DEPTH &&
+			!btf_type_is_void_or_null(t) &&
+			IS_MODIFIER(BTF_INFO_KIND(t->info))) {
+		type_id = t->type;
+		t = btf__type_by_id(btf, type_id);
+		depth++;
+	}
+
+	if (depth == MAX_RESOLVE_DEPTH || btf_type_is_void_or_null(t))
+		return -EINVAL;
+
+	return type_id;
+}
+
 int32_t btf__find_by_name(const struct btf *btf, const char *type_name)
 {
 	uint32_t i;
@@ -280,7 +288,7 @@ int32_t btf__find_by_name(const struct b
 
 	for (i = 1; i <= btf->nr_types; i++) {
 		const struct btf_type *t = btf->types[i];
-		const char *name = btf_name_by_offset(btf, t->name_off);
+		const char *name = btf__name_by_offset(btf, t->name_off);
 
 		if (name && !strcmp(type_name, name))
 			return i;
@@ -371,3 +379,20 @@ int btf__fd(const struct btf *btf)
 {
 	return btf->fd;
 }
+
+const char *btf__name_by_offset(const struct btf *btf, uint32_t offset)
+{
+	if (offset < btf->hdr->str_len)
+		return &btf->strings[offset];
+	else
+		return NULL;
+}
+
+const struct btf_type *btf__type_by_id(const struct btf *btf,
+					     uint32_t type_id)
+{
+	if (type_id > btf->nr_types)
+		return NULL;
+
+	return btf->types[type_id];
+}
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -17,6 +17,9 @@ void btf__free(struct btf *btf);
 struct btf *btf__new(uint8_t *data, uint32_t size, btf_print_fn_t err_log);
 int32_t btf__find_by_name(const struct btf *btf, const char *type_name);
 int64_t btf__resolve_size(const struct btf *btf, uint32_t type_id);
+int32_t btf__resolve_type(const struct btf *btf, uint32_t type_id);
 int btf__fd(const struct btf *btf);
+const char *btf__name_by_offset(const struct btf *btf, uint32_t offset);
+const struct btf_type *btf__type_by_id(const struct btf *btf, uint32_t type_id);
 
 #endif

^ permalink raw reply

* [PATCH bpf-next 2/3] bpf: btf: add btf json print functionality
From: Okash Khawaja @ 2018-06-20 20:30 UTC (permalink / raw)
  To: Daniel Borkmann, Martin KaFai Lau, Alexei Starovoitov,
	Yonghong Song, Quentin Monnet, Jakub Kicinski, David S. Miller
  Cc: netdev, kernel-team, linux-kernel
In-Reply-To: <20180620203051.223156973@fb.com>

[-- Attachment #1: 02-add-btf-dump-map.patch --]
[-- Type: text/plain, Size: 9192 bytes --]

This consumes functionality exported in the previous patch. It does the
main job of printing with BTF data. This is used in the following patch
to provide a more readable output of a map's dump. It relies on
json_writer to do json printing. Below is sample output where map keys
are ints and values are of type struct A:

typedef int int_type;
enum E {
        E0,
        E1,
};

struct B {
        int x;
        int y;
};

struct A {
        int m;
        unsigned long long n;
        char o;
        int p[8];
        int q[4][8];
        enum E r;
        void *s;
        struct B t;
        const int u;
        int_type v;
        unsigned int w1: 3;
        unsigned int w2: 3;
};

$ sudo bpftool map dump -p id 14
[{
        "key": 0
    },{
        "value": {
            "m": 1,
            "n": 2,
            "o": "c",
            "p": [15,16,17,18,15,16,17,18
            ],
            "q": [[25,26,27,28,25,26,27,28
                ],[35,36,37,38,35,36,37,38
                ],[45,46,47,48,45,46,47,48
                ],[55,56,57,58,55,56,57,58
                ]
            ],
            "r": 1,
            "s": 0x7ffff6f70568,
            "t": {
                "x": 5,
                "y": 10
            },
            "u": 100,
            "v": 20,
            "w1": 0x7,
            "w2": 0x3
        }
    }
]

This patch uses json's {} and [] to imply struct/union and array. More
explicit information can be added later. For example, a command line
option can be introduced to print whether a key or value is struct
or union, name of a struct etc. This will however come at the expense
of duplicating info when, for example, printing an array of structs.
enums are printed as ints without their names.

Signed-off-by: Okash Khawaja <osk@fb.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>

---
 tools/bpf/bpftool/btf_dumper.c |  247 +++++++++++++++++++++++++++++++++++++++++
 tools/bpf/bpftool/btf_dumper.h |   18 ++
 2 files changed, 265 insertions(+)

--- /dev/null
+++ b/tools/bpf/bpftool/btf_dumper.c
@@ -0,0 +1,247 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Facebook */
+
+#include <linux/btf.h>
+#include <linux/err.h>
+#include <stdio.h> /* for (FILE *) used by json_writer */
+#include <linux/bitops.h>
+#include <string.h>
+#include <ctype.h>
+
+#include "btf.h"
+#include "json_writer.h"
+
+#define BITS_PER_BYTE_MASK (BITS_PER_BYTE - 1)
+#define BITS_PER_BYTE_MASKED(bits) ((bits) & BITS_PER_BYTE_MASK)
+#define BITS_ROUNDDOWN_BYTES(bits) ((bits) >> 3)
+#define BITS_ROUNDUP_BYTES(bits) \
+	(BITS_ROUNDDOWN_BYTES(bits) + !!BITS_PER_BYTE_MASKED(bits))
+
+static int btf_dumper_do_type(const struct btf *btf, uint32_t type_id,
+		uint8_t bit_offset, const void *data, json_writer_t *jw);
+
+static void btf_dumper_ptr(const void *data, json_writer_t *jw)
+{
+	jsonw_printf(jw, "%p", *((uintptr_t *)data));
+}
+
+static int btf_dumper_modifier(const struct btf *btf, uint32_t type_id,
+		const void *data, json_writer_t *jw)
+{
+	int32_t actual_type_id = btf__resolve_type(btf, type_id);
+	int ret;
+
+	if (actual_type_id < 0)
+		return actual_type_id;
+
+	ret = btf_dumper_do_type(btf, actual_type_id, 0, data, jw);
+
+	return ret;
+}
+
+static void btf_dumper_enum(const void *data, json_writer_t *jw)
+{
+	jsonw_printf(jw, "%d", *((int32_t *)data));
+}
+
+static int btf_dumper_array(const struct btf *btf, uint32_t type_id,
+		const void *data, json_writer_t *jw)
+{
+	const struct btf_type *t = btf__type_by_id(btf, type_id);
+	struct btf_array *arr = (struct btf_array *)(t + 1);
+	int64_t elem_size;
+	uint32_t i;
+	int ret;
+
+	elem_size = btf__resolve_size(btf, arr->type);
+	if (elem_size < 0)
+		return elem_size;
+
+	jsonw_start_array(jw);
+	for (i = 0; i < arr->nelems; i++) {
+		ret = btf_dumper_do_type(btf, arr->type, 0,
+				data + (i * elem_size), jw);
+		if (ret)
+			return ret;
+	}
+
+	jsonw_end_array(jw);
+
+	return 0;
+}
+
+static void btf_dumper_int_bits(uint32_t int_type, uint8_t bit_offset,
+		const void *data, json_writer_t *jw)
+{
+	uint16_t total_bits_offset;
+	uint32_t bits = BTF_INT_BITS(int_type);
+	uint16_t bytes_to_copy;
+	uint16_t bits_to_copy;
+	uint8_t upper_bits;
+	union {
+		uint64_t u64_num;
+		uint8_t u8_nums[8];
+	} print_num;
+
+	total_bits_offset = bit_offset + BTF_INT_OFFSET(int_type);
+	data += BITS_ROUNDDOWN_BYTES(total_bits_offset);
+	bit_offset = BITS_PER_BYTE_MASKED(total_bits_offset);
+	bits_to_copy = bits + bit_offset;
+	bytes_to_copy = BITS_ROUNDUP_BYTES(bits_to_copy);
+
+	print_num.u64_num = 0;
+	memcpy(&print_num.u64_num, data, bytes_to_copy);
+
+	upper_bits = BITS_PER_BYTE_MASKED(bits_to_copy);
+	if (upper_bits) {
+		uint8_t mask = (1 << upper_bits) - 1;
+
+		print_num.u8_nums[bytes_to_copy - 1] &= mask;
+	}
+
+	print_num.u64_num >>= bit_offset;
+
+	jsonw_printf(jw, "0x%llx", print_num.u64_num);
+}
+
+static int btf_dumper_int(const struct btf_type *t, uint8_t bit_offset,
+		const void *data, json_writer_t *jw)
+{
+	uint32_t *int_type = (uint32_t *)(t + 1);
+	uint32_t bits = BTF_INT_BITS(*int_type);
+	int ret = 0;
+
+	/* if this is bit field */
+	if (bit_offset || BTF_INT_OFFSET(*int_type) ||
+			BITS_PER_BYTE_MASKED(bits)) {
+		btf_dumper_int_bits(*int_type, bit_offset, data, jw);
+		return ret;
+	}
+
+	switch (BTF_INT_ENCODING(*int_type)) {
+	case 0:
+		if (BTF_INT_BITS(*int_type) == 64)
+			jsonw_printf(jw, "%lu", *((uint64_t *)data));
+		else if (BTF_INT_BITS(*int_type) == 32)
+			jsonw_printf(jw, "%u", *((uint32_t *)data));
+		else if (BTF_INT_BITS(*int_type) == 16)
+			jsonw_printf(jw, "%hu", *((uint16_t *)data));
+		else if (BTF_INT_BITS(*int_type) == 8)
+			jsonw_printf(jw, "%hhu", *((uint8_t *)data));
+		else
+			btf_dumper_int_bits(*int_type, bit_offset, data, jw);
+		break;
+	case BTF_INT_SIGNED:
+		if (BTF_INT_BITS(*int_type) == 64)
+			jsonw_printf(jw, "%ld", *((int64_t *)data));
+		else if (BTF_INT_BITS(*int_type) == 32)
+			jsonw_printf(jw, "%d", *((int32_t *)data));
+		else if (BTF_INT_BITS(*int_type) ==  16)
+			jsonw_printf(jw, "%hd", *((int16_t *)data));
+		else if (BTF_INT_BITS(*int_type) ==  8)
+			jsonw_printf(jw, "%hhd", *((int8_t *)data));
+		else
+			btf_dumper_int_bits(*int_type, bit_offset, data, jw);
+		break;
+	case BTF_INT_CHAR:
+		if (*((char *)data) == '\0')
+			jsonw_null(jw);
+		else if (isprint(*((char *)data)))
+			jsonw_printf(jw, "\"%c\"", *((char *)data));
+		else
+			jsonw_printf(jw, "%hhx", *((char *)data));
+		break;
+	case BTF_INT_BOOL:
+		jsonw_bool(jw, *((int *)data));
+		break;
+	default:
+		/* shouldn't happen */
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+static int btf_dumper_struct(const struct btf *btf, uint32_t type_id,
+		const void *data, json_writer_t *jw)
+{
+	const struct btf_type *t = btf__type_by_id(btf, type_id);
+	struct btf_member *m;
+	int ret = 0;
+	int i, vlen;
+
+	if (t == NULL)
+		return -EINVAL;
+
+	vlen = BTF_INFO_VLEN(t->info);
+	jsonw_start_object(jw);
+	m = (struct btf_member *)(t + 1);
+
+	for (i = 0; i < vlen; i++) {
+		jsonw_name(jw, btf__name_by_offset(btf, m[i].name_off));
+		ret = btf_dumper_do_type(btf, m[i].type,
+				BITS_PER_BYTE_MASKED(m[i].offset),
+				data + BITS_ROUNDDOWN_BYTES(m[i].offset), jw);
+		if (ret)
+			return ret;
+	}
+
+	jsonw_end_object(jw);
+
+	return 0;
+}
+
+static int btf_dumper_do_type(const struct btf *btf, uint32_t type_id,
+		uint8_t bit_offset, const void *data, json_writer_t *jw)
+{
+	const struct btf_type *t = btf__type_by_id(btf, type_id);
+	int ret = 0;
+
+	switch (BTF_INFO_KIND(t->info)) {
+	case BTF_KIND_INT:
+		ret = btf_dumper_int(t, bit_offset, data, jw);
+		break;
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+		ret = btf_dumper_struct(btf, type_id, data, jw);
+		break;
+	case BTF_KIND_ARRAY:
+		ret = btf_dumper_array(btf, type_id, data, jw);
+		break;
+	case BTF_KIND_ENUM:
+		btf_dumper_enum(data, jw);
+		break;
+	case BTF_KIND_PTR:
+		btf_dumper_ptr(data, jw);
+		break;
+	case BTF_KIND_UNKN:
+		jsonw_printf(jw, "(unknown)");
+		break;
+	case BTF_KIND_FWD:
+		/* map key or value can't be forward */
+		ret = -EINVAL;
+		break;
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_CONST:
+	case BTF_KIND_RESTRICT:
+		ret = btf_dumper_modifier(btf, type_id, data, jw);
+		break;
+	default:
+		jsonw_printf(jw, "(unsupported-kind");
+		ret = -EINVAL;
+		break;
+	}
+
+	return ret;
+}
+
+int32_t btf_dumper_type(const struct btf *btf, json_writer_t *jw,
+		uint32_t type_id, const void *data)
+{
+	if (!jw)
+		return -EINVAL;
+
+	return btf_dumper_do_type(btf, type_id, 0, data, jw);
+}
--- /dev/null
+++ b/tools/bpf/bpftool/btf_dumper.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2018 Facebook */
+
+#ifndef BTF_DUMPER_H
+#define BTF_DUMPER_H
+
+/* btf_dumper_type - json print data along with type information
+ * @btf: btf instance initialised via btf__new()
+ * @jw: json writer used for printing
+ * @type_id: index in btf->types array. this points to the type to be dumped
+ * @data: pointer the actual data, i.e. the values to be printed
+ *
+ * Returns zero on success and negative error code otherwise
+ */
+int32_t btf_dumper_type(const struct btf *btf, json_writer_t *jw,
+		uint32_t type_id, void *data);
+
+#endif

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox