Netdev List


* Re: [PATCH] rtlwifi: btcoex: remove set but not used variable 'ppsc'
From: Kalle Valo @ 2018-11-06 16:58 UTC (permalink / raw)
  To: YueHaibing
  Cc: Ping-Ke Shih, Larry Finger, Colin Ian King, Nathan Chancellor,
	YueHaibing, linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	kernel-janitors-u79uwXL29TY76Z2rM5mHXA
In-Reply-To: <1540194675-65562-1-git-send-email-yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>

YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org> wrote:

> Fixes gcc '-Wunused-but-set-variable' warning:
> 
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_leave_lps':
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:295:21: warning:
>  variable 'ppsc' set but not used [-Wunused-but-set-variable]
> 
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c: In function 'halbtc_enter_lps':
> drivers/net/wireless/realtek/rtlwifi/btcoexist/halbtcoutsrc.c:318:21: warning:
>  variable 'ppsc' set but not used [-Wunused-but-set-variable]
> 
> It has never been used since its introduction in
> commit aa45a673b291 ("rtlwifi: btcoexist: Add new mini driver")
> 
> Signed-off-by: YueHaibing <yuehaibing-hv44wF8Li93QT0dZR+AlfA@public.gmane.org>
> Acked-by: Ping-Ke Shih <pkshih-Rasf1IRRPZFBDgjK7y7TUQ@public.gmane.org>

Patch applied to wireless-drivers-next.git, thanks.

9198f460ec9d rtlwifi: btcoex: remove set but not used variable 'ppsc'

-- 
https://patchwork.kernel.org/patch/10651825/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: [PATCH 1/2] rtl8xxxu: Mark expected switch fall-throughs
From: Kalle Valo @ 2018-11-06 16:59 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: linux-kernel, Jes Sorensen, linux-wireless, David S. Miller,
	netdev, Gustavo A. R. Silva
In-Reply-To: <08817d137b32d5d091eacc6fee0a3eca68d49d94.1540208577.git.gustavo@embeddedor.com>

"Gustavo A. R. Silva" <gustavo@embeddedor.com> wrote:

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Addresses-Coverity-ID: 1357355 ("Missing break in switch")
> Addresses-Coverity-ID: 1357378 ("Missing break in switch")
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

2 patches applied to wireless-drivers-next.git, thanks.

e20c50cdca19 rtl8xxxu: Mark expected switch fall-throughs
307b00c5e695 rtl8xxxu: Fix missing break in switch

-- 
https://patchwork.kernel.org/patch/10651953/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: [PATCH 07/20] iwlegacy: 4965-mac: mark expected switch fall-through
From: Kalle Valo @ 2018-11-06 17:00 UTC (permalink / raw)
  To: Gustavo A. R. Silva
  Cc: Stanislaw Gruszka, linux-wireless, David S. Miller, netdev,
	linux-kernel, Gustavo A. R. Silva
In-Reply-To: <8d72b8a6f1529906672ac458e96d272bc55a1410.1540239684.git.gustavo@embeddedor.com>

"Gustavo A. R. Silva" <gustavo@embeddedor.com> wrote:

> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>

14 patches applied to wireless-drivers-next.git, thanks.

e9904084dd1b iwlegacy: 4965-mac: mark expected switch fall-through
af71f8fef45c iwlegacy: common: mark expected switch fall-throughs
d56b26801e1d orinoco_usb: mark expected switch fall-through
d22b8fadd08e prism54: isl_38xx: Mark expected switch fall-through
3d238b9d5048 prism54: isl_ioctl: mark expected switch fall-through
38a0792d08e9 prism54: islpci_dev: mark expected switch fall-through
63fdc952df36 mwifiex: Mark expected switch fall-through
6eba8fd22352 rt2x00: rt2400pci: mark expected switch fall-through
10bb92217747 rt2x00: rt2500pci: mark expected switch fall-through
916e6bbcfcff rt2x00: rt2800lib: mark expected switch fall-throughs
641dd8068ecb rt2x00: rt61pci: mark expected switch fall-through
d22d2492a35d ray_cs: mark expected switch fall-throughs
89e54fa4562e rtlwifi: rtl8821ae: phy: Mark expected switch fall-through
7cbbe1597e44 zd1201: mark expected switch fall-through

-- 
https://patchwork.kernel.org/patch/10652563/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: [PATCHv3] rtlwifi: rtl8723ae: Remove set but not used variables and #defines
From: Kalle Valo @ 2018-11-06 17:01 UTC (permalink / raw)
  To: zhong jiang; +Cc: pkshih, davem, linux-wireless, joe, netdev, linux-kernel
In-Reply-To: <1540360329-65013-1-git-send-email-zhongjiang@huawei.com>

zhong jiang <zhongjiang@huawei.com> wrote:

> 'radiob_array_table' and 'radiob_arraylen' are not used after being set.
> It is safe to remove the unused variables. Meanwhile, the radio B array should
> be removed as well, because it will no longer be referenced.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>
> Acked-by: Ping-Ke Shih <pkshih@realtek.com>

Patch applied to wireless-drivers-next.git, thanks.

90e3243d16ad rtlwifi: rtl8723ae: Remove set but not used variables and #defines

-- 
https://patchwork.kernel.org/patch/10654181/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: [PATCH] rtlwifi: Remove same duplicated includes
From: Kalle Valo @ 2018-11-06 17:02 UTC (permalink / raw)
  To: zhong jiang; +Cc: pkshih, davem, linux-wireless, netdev, linux-kernel
In-Reply-To: <1540361256-678-1-git-send-email-zhongjiang@huawei.com>

zhong jiang <zhongjiang@huawei.com> wrote:

> Just remove the duplicated includes.
> 
> Signed-off-by: zhong jiang <zhongjiang@huawei.com>

Patch applied to wireless-drivers-next.git, thanks.

963b307361bd rtlwifi: Remove same duplicated includes

-- 
https://patchwork.kernel.org/patch/10654183/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* Re: [PATCH] cw1200: fix small typo
From: Kalle Valo @ 2018-11-06 17:04 UTC (permalink / raw)
  To: Yangtao Li; +Cc: pizza, davem, linux-wireless, netdev, linux-kernel, Yangtao Li
In-Reply-To: <20181101153319.22830-1-tiny.windzz@gmail.com>

Yangtao Li <tiny.windzz@gmail.com> wrote:

> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>

Patch applied to wireless-drivers-next.git, thanks.

f4bd758f3f20 cw1200: fix small typo

-- 
https://patchwork.kernel.org/patch/10664149/

https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches


* RE: [PATCH net-next 5/6] net/ncsi: Reset channel state in ncsi_start_dev()
From: Justin.Lee1 @ 2018-11-06 17:27 UTC (permalink / raw)
  To: sam, netdev; +Cc: davem, linux-kernel, openbmc
In-Reply-To: <de9816e6c9cb31fdae1bb3d4a38da65b8f3a7694.camel@mendozajonas.com>


> On Mon, 2018-11-05 at 18:01 +0000, Justin.Lee1@Dell.com wrote:
> > > On Tue, 2018-10-30 at 21:26 +0000, Justin.Lee1@Dell.com wrote:
> > > > > +int ncsi_reset_dev(struct ncsi_dev *nd)
> > > > > +{
> > > > > +	struct ncsi_dev_priv *ndp = TO_NCSI_DEV_PRIV(nd);
> > > > > +	struct ncsi_channel *nc, *active;
> > > > > +	struct ncsi_package *np;
> > > > > +	unsigned long flags;
> > > > > +	bool enabled;
> > > > > +	int state;
> > > > > +
> > > > > +	active = NULL;
> > > > > +	NCSI_FOR_EACH_PACKAGE(ndp, np) {
> > > > > +		NCSI_FOR_EACH_CHANNEL(np, nc) {
> > > > > +			spin_lock_irqsave(&nc->lock, flags);
> > > > > +			enabled = nc->monitor.enabled;
> > > > > +			state = nc->state;
> > > > > +			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +
> > > > > +			if (enabled)
> > > > > +				ncsi_stop_channel_monitor(nc);
> > > > > +			if (state == NCSI_CHANNEL_ACTIVE) {
> > > > > +				active = nc;
> > > > > +				break;
> > > > 
> > > > Is the original intention to process the channel one by one?
> > > > If it is the case, there are two loops and we might need to use
> > > > "goto found" instead.
> > > 
> > > Yes we'll need to break out of the package loop here as well.
> > > 
> > > > > +			}
> > > > > +		}
> > > > > +	}
> > > > > +
> > > > 
> > > > found: ?
> > > > 
> > > > > +	if (!active) {
> > > > > +		/* Done */
> > > > > +		spin_lock_irqsave(&ndp->lock, flags);
> > > > > +		ndp->flags &= ~NCSI_DEV_RESET;
> > > > > +		spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +		return ncsi_choose_active_channel(ndp);
> > > > > +	}
> > > > > +
> > > > > +	spin_lock_irqsave(&ndp->lock, flags);
> > > > > +	ndp->flags |= NCSI_DEV_RESET;
> > > > > +	ndp->active_channel = active;
> > > > > +	ndp->active_package = active->package;
> > > > > +	spin_unlock_irqrestore(&ndp->lock, flags);
> > > > > +
> > > > > +	nd->state = ncsi_dev_state_suspend;
> > > > > +	schedule_work(&ndp->work);
> > > > > +	return 0;
> > > > > +}
> > > > 
> > > > Also similar issue in ncsi_choose_active_channel() function below.
> > > > 
> > > > > @@ -916,32 +1045,49 @@ static int ncsi_choose_active_channel(struct ncsi_dev_priv *ndp)
> > > > >  
> > > > >  			ncm = &nc->modes[NCSI_MODE_LINK];
> > > > >  			if (ncm->data[2] & 0x1) {
> > > > > -				spin_unlock_irqrestore(&nc->lock, flags);
> > > > >  				found = nc;
> > > > > -				goto out;
> > > > > +				with_link = true;
> > > > >  			}
> > > > >  
> > > > > -			spin_unlock_irqrestore(&nc->lock, flags);
> > > > > +			/* If multi_channel is enabled configure all valid
> > > > > +			 * channels whether or not they currently have link
> > > > > +			 * so they will have AENs enabled.
> > > > > +			 */
> > > > > +			if (with_link || np->multi_channel) {
> > > > 
> > > > > I notice that there is a case where we will misconfigure the interface.
> > > > > In the example below, multi-channel is not enabled for package 1.
> > > > > But we enable the channel for ncsi2 below (package 1 channel 0), as that interface is the first
> > > > > channel with link for that package.
> > > 
> > > I don't think I see the issue here; multi-channel is not set on package
> > > 1, but both channels are in the channel whitelist. Channel 0 is
> > > configured since it's the first found on package 1, and channel 1 is not
> > > since channel 0 is already found. Are you expecting something different?
> > >  
> > 
> > The setting is that multi-package is enabled for both package 0 and 1.
> > Multi-channel is only enabled for package 0.
> > 
> > > > cat /sys/kernel/debug/ncsi_protocol/ncsi_device_
> > > > IFIDX IFNAME NAME   PID CID RX TX MP MC WP WC PC CS PS LS RU CR NQ HA
> > > > =====================================================================
> > > >   2   eth2   ncsi0  000 000 1  1  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi1  000 001 1  0  1  1  1  1  0  2  1  1  1  1  0  1
> > > >   2   eth2   ncsi2  001 000 1  0  1  0  1  1  0  2  1  1  1  1  0  1
> > 
> > I was replying to the wrong old email and it might have caused a bit of confusion.
> > The first 1 means the channel is enabled for package 1 channel 0 (ncsi2).
> > For eth2, we already have ncsi0 as the active channel with TX enabled.
> > I would think that the package doesn't have multi-channel enabled and
> > we should not enable the channel for ncsi2. The problem is that package 1 doesn't
> > enable multi-channel and it believes it needs to enable one channel for its package,
> > but it isn't aware that the other package already has one active channel.
> 
> Ah, maybe the confusion here is that multi_channel is a per-package
> setting; it determines what a package does with its own channels.
> 
> So you have package 0 with multi-channel enabled so it enables channels 0
> & 1.
> Then you have package 1 without multi-channel so it enables only channel
> 0.
> There is still only one Tx channel (package 0, channel 0).
> 
> Does that sound right, or have I missed something?

Yes, you are right. There is only one TX enabled.
If we can hold off a few seconds before applying, then we will not see
these configuration changes in between the back-to-back netlink commands.

Thanks, 
Justin
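
[Editor's sketch, not part of the thread] The nested-loop point raised above (a plain "break" only leaves the inner channel loop, so the package loop keeps running and a later active channel would overwrite the first one) can be illustrated in user space. The types and the function name below are hypothetical stand-ins, not the NCSI structures or macros from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the NCSI package/channel structures. */
enum chan_state { CHAN_INACTIVE, CHAN_ACTIVE };

struct channel {
	enum chan_state state;
};

struct package {
	struct channel *channels;
	size_t nr_channels;
};

/*
 * Return the first active channel across all packages, or NULL.
 * "goto found" leaves both loops at once; a "break" here would only
 * leave the inner loop and the outer loop would keep scanning.
 */
static struct channel *find_first_active(struct package *pkgs, size_t nr_pkgs)
{
	struct channel *active = NULL;
	size_t p, c;

	assert(pkgs != NULL || nr_pkgs == 0);

	for (p = 0; p < nr_pkgs; p++) {
		for (c = 0; c < pkgs[p].nr_channels; c++) {
			if (pkgs[p].channels[c].state == CHAN_ACTIVE) {
				active = &pkgs[p].channels[c];
				goto found;
			}
		}
	}
found:
	return active;
}
```

An alternative with the same effect would be to test "if (active) break;" at the bottom of the outer loop; which form the final patch uses is not shown in this thread.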


* [PATCH net] igb: fix uninitialized variables
From: wangyunjian @ 2018-11-06  8:27 UTC (permalink / raw)
  To: netdev, intel-wired-lan; +Cc: stone.zhou, Yunjian Wang

From: Yunjian Wang <wangyunjian@huawei.com>

This patch fixes a case where the variable 'phy_word' may be used uninitialized.

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/ethernet/intel/igb/e1000_i210.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/igb/e1000_i210.c b/drivers/net/ethernet/intel/igb/e1000_i210.c
index c54ebed..c393cb2 100644
--- a/drivers/net/ethernet/intel/igb/e1000_i210.c
+++ b/drivers/net/ethernet/intel/igb/e1000_i210.c
@@ -842,6 +842,7 @@ s32 igb_pll_workaround_i210(struct e1000_hw *hw)
 		nvm_word = E1000_INVM_DEFAULT_AL;
 	tmp_nvm = nvm_word | E1000_INVM_PLL_WO_VAL;
 	igb_write_phy_reg_82580(hw, I347AT4_PAGE_SELECT, E1000_PHY_PLL_FREQ_PAGE);
+	phy_word = E1000_PHY_PLL_UNCONF;
 	for (i = 0; i < E1000_MAX_PLL_TRIES; i++) {
 		/* check current state directly from internal PHY */
 		igb_read_phy_reg_82580(hw, E1000_PHY_PLL_FREQ_REG, &phy_word);
-- 
1.8.3.1


* Re: [PATCH v2] ISDN: eicon: Remove driver
From: David Miller @ 2018-11-06 19:04 UTC (permalink / raw)
  To: olof; +Cc: isdn, netdev, linux-kernel, isdn4linux, mac
In-Reply-To: <20181102220026.6387-1-olof@lixom.net>

From: Olof Johansson <olof@lixom.net>
Date: Fri,  2 Nov 2018 15:00:26 -0700

> I started looking at the history of this driver, and last time the
> maintainer was active on the mailing list was when discussing how to
> remove it. This was in 2012:
> 
> https://lore.kernel.org/lkml/4F4DE175.30002@melware.de/
> 
> It looks to me like this has in practice been an orphan for quite a while.
> It's throwing warnings about stack size in a function that is in dire
> need of refactoring, and it's probably a case of "it's time to call it".
> 
> Cc: Armin Schindler <mac@melware.de>
> Cc: Karsten Keil <isdn@linux-pingi.de>
> Signed-off-by: Olof Johansson <olof@lixom.net>
> ---
> 
> v2:
> Missed a git add of drivers/isdn/hardware/Kconfig

Applied to net-next.


* Re: [PATCH] staging: net: ipv4: tcp_westwood: fixed warnings and checks
From: David Miller @ 2018-11-06 19:15 UTC (permalink / raw)
  To: suraj1998; +Cc: edumazet, kuznet, yoshfuji, netdev, linux-kernel
In-Reply-To: <1541425985-31869-1-git-send-email-suraj1998@gmail.com>

From: Suraj Singh <suraj1998@gmail.com>
Date: Mon,  5 Nov 2018 19:23:05 +0530

> Fixed warnings and checks for TCP Westwood
> 
> Signed-off-by: Suraj Singh <suraj1998@gmail.com>

I asked you yesterday why "staging: " appears in your subject line
and you have failed to respond and explain.

There are also functional issues with your patch:

> -		tp->snd_cwnd = tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);
> +		tp->snd_cwnd = tcp_westwood_bw_rttmin(sk);
> +		tp->snd_ssthresh = tcp_westwood_bw_rttmin(sk);

This is bogus, now tcp_westwood_bw_rttmin(sk) will potentially be called
two times instead of once.

The existing code is fine, please do not modify it.
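
[Editor's sketch, not part of the thread] The objection above can be reproduced in user space with a call counter. Everything here is hypothetical: estimate() merely stands in for tcp_westwood_bw_rttmin(), and struct win for the two fields being assigned.

```c
#include <assert.h>
#include <stddef.h>

static unsigned int calls;	/* counts how often estimate() runs */

/* Hypothetical stand-in for tcp_westwood_bw_rttmin(). */
static unsigned int estimate(void)
{
	calls++;
	return 42;
}

struct win {
	unsigned int cwnd;
	unsigned int ssthresh;
};

/* Existing style: one call, result assigned to both fields. */
static void set_chained(struct win *w)
{
	assert(w != NULL);
	w->cwnd = w->ssthresh = estimate();
}

/* Style from the rejected patch: the function is called twice. */
static void set_split(struct win *w)
{
	assert(w != NULL);
	w->cwnd = estimate();
	w->ssthresh = estimate();
}

/* If chained assignment must be avoided, a local keeps it to one call. */
static void set_local(struct win *w)
{
	unsigned int v = estimate();

	assert(w != NULL);
	w->cwnd = v;
	w->ssthresh = v;
}
```

Calling set_split() increments the counter twice, while set_chained() and set_local() increment it once each, which is the behavioural difference the review points at.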


* Re: [PATCH net] net: phy: Allow BCM54616S PHY to setup internal TX/RX clock delay
From: David Miller @ 2018-11-06 19:17 UTC (permalink / raw)
  To: taoren; +Cc: andrew, f.fainelli, netdev, linux-kernel, openbmc
In-Reply-To: <20181105223540.1897084-1-taoren@fb.com>

From: Tao Ren <taoren@fb.com>
Date: Mon, 5 Nov 2018 14:35:40 -0800

> This patch allows users to enable/disable internal TX and/or RX clock
> delay for BCM54616S PHYs so as to satisfy RGMII timing specifications.
> 
> On a particular platform, whether TX and/or RX clock delay is required
> depends on how the PHY is connected to the MAC IP. This requirement can be
> specified through "phy-mode" property in the platform device tree.
> 
> The patch is inspired by commit 733336262b28 ("net: phy: Allow BCM5481x
> PHYs to setup internal TX/RX clock delay").
> 
> Signed-off-by: Tao Ren <taoren@fb.com>

This is fine for 'net', applied, thanks.


* Re: [PATCH] net: skbuff.h: remove unnecessary unlikely()
From: David Miller @ 2018-11-06 19:22 UTC (permalink / raw)
  To: tiny.windzz
  Cc: edumazet, dja, willemb, ast, sbrivio, posk, pabeni, borisp,
	linux-kernel, netdev
In-Reply-To: <20181106154536.8789-1-tiny.windzz@gmail.com>

From: Yangtao Li <tiny.windzz@gmail.com>
Date: Tue,  6 Nov 2018 10:45:36 -0500

> WARN_ON() already contains an unlikely(), so it's not necessary to use
> unlikely.
> 
> Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>

Applied.
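
[Editor's sketch, not part of the thread] Why the extra unlikely() is redundant can be seen from a simplified user-space imitation of the two macros. This is an assumption-laden stand-in, not the kernel's definition of WARN_ON(); it relies on the GCC/Clang extensions __builtin_expect and statement expressions, and check_len() is a hypothetical caller.

```c
#include <stdio.h>

/* Branch hint: tell the compiler the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/*
 * Simplified stand-in for WARN_ON(): it reports the condition and
 * evaluates to it, already wrapped in unlikely(). The real kernel
 * macro does more (stack dump, taint) and is defined differently.
 */
#define WARN_ON(cond) ({						\
	int ret_warn_on_ = !!(cond);					\
	if (unlikely(ret_warn_on_))					\
		fprintf(stderr, "WARNING at %s:%d\n",			\
			__FILE__, __LINE__);				\
	unlikely(ret_warn_on_);						\
})

/* Hypothetical caller: no outer unlikely() is needed around WARN_ON(). */
static int check_len(int len)
{
	if (WARN_ON(len < 0))
		return -1;
	return 0;
}
```

Since the value of the macro already carries the hint, writing "if (unlikely(WARN_ON(...)))" adds nothing, which is the cleanup the applied patch makes.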


* Re: [PATCH net] net: phy: Allow BCM54616S PHY to setup internal TX/RX clock delay
From: Tao Ren @ 2018-11-06 19:42 UTC (permalink / raw)
  To: David Miller
  Cc: andrew@lunn.ch, f.fainelli@gmail.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, openbmc@lists.ozlabs.org
In-Reply-To: <20181106.111736.1149054212290410715.davem@davemloft.net>

On 11/6/18 11:17 AM, David Miller wrote:
> From: Tao Ren <taoren@fb.com>
> Date: Mon, 5 Nov 2018 14:35:40 -0800
> 
>> This patch allows users to enable/disable internal TX and/or RX clock
>> delay for BCM54616S PHYs so as to satisfy RGMII timing specifications.
>>
>> On a particular platform, whether TX and/or RX clock delay is required
>> depends on how the PHY is connected to the MAC IP. This requirement can be
>> specified through "phy-mode" property in the platform device tree.
>>
>> The patch is inspired by commit 733336262b28 ("net: phy: Allow BCM5481x
>> PHYs to setup internal TX/RX clock delay").
>>
>> Signed-off-by: Tao Ren <taoren@fb.com>
> 
> This is fine for 'net', applied, thanks.

Thanks David for the quick action.

- Tao Ren


* [RFC perf,bpf 1/5] perf, bpf: Introduce PERF_RECORD_BPF_EVENT
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

For better performance analysis of BPF programs, this patch introduces
PERF_RECORD_BPF_EVENT, a new perf_event_type that exposes BPF program
load/unload information to user space.

        /*
         * Record different types of bpf events:
         *   enum perf_bpf_event_type {
         *      PERF_BPF_EVENT_UNKNOWN          = 0,
         *      PERF_BPF_EVENT_PROG_LOAD        = 1,
         *      PERF_BPF_EVENT_PROG_UNLOAD      = 2,
         *   };
         *
         * struct {
         *      struct perf_event_header header;
         *      u16 type;
         *      u16 flags;
         *      u32 id;  // prog_id or map_id
         * };
         */
        PERF_RECORD_BPF_EVENT                   = 17,

PERF_RECORD_BPF_EVENT contains minimal information about the BPF program.
Perf utility (or other user space tools) should listen to this event and
fetch more details about the event via BPF syscalls
(BPF_PROG_GET_FD_BY_ID, BPF_OBJ_GET_INFO_BY_FD, etc.).

Currently, PERF_RECORD_BPF_EVENT only supports two events:
PERF_BPF_EVENT_PROG_LOAD and PERF_BPF_EVENT_PROG_UNLOAD. But it can be
easily extended to support more events.
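
[Editor's sketch, not part of the patch] A user-space consumer of the record described above might decode it roughly as follows. The struct and constant names are local mirrors of the layout quoted in this RFC (prefixed "sketch_" to mark them as hypothetical); they are not taken from a released header, and the follow-up BPF syscalls mentioned above are not shown.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct perf_event_header (type, misc, size). */
struct sketch_event_header {
	uint32_t type;
	uint16_t misc;
	uint16_t size;
};

/* Values as proposed in this RFC. */
enum { SKETCH_RECORD_BPF_EVENT = 17 };
enum { SKETCH_BPF_PROG_LOAD = 1, SKETCH_BPF_PROG_UNLOAD = 2 };

/* Local mirror of the proposed record body. */
struct sketch_bpf_event {
	struct sketch_event_header header;
	uint16_t type;
	uint16_t flags;
	uint32_t id;	/* prog_id or map_id */
};

/*
 * Return 1 and fill *out if buf holds a BPF load/unload record,
 * otherwise return 0. Trailing sample_id data, if any, is ignored.
 */
static int decode_bpf_event(const void *buf, size_t len,
			    struct sketch_bpf_event *out)
{
	struct sketch_bpf_event ev;

	if (len < sizeof(ev))
		return 0;
	memcpy(&ev, buf, sizeof(ev));	/* avoids unaligned access */
	if (ev.header.type != SKETCH_RECORD_BPF_EVENT ||
	    ev.header.size < sizeof(ev))
		return 0;
	if (ev.type != SKETCH_BPF_PROG_LOAD &&
	    ev.type != SKETCH_BPF_PROG_UNLOAD)
		return 0;
	*out = ev;
	return 1;
}
```

A tool would then use out.id to look up further details, for example via BPF_PROG_GET_FD_BY_ID as the description suggests.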

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/perf_event.h      |  5 ++
 include/uapi/linux/perf_event.h | 27 ++++++++++-
 kernel/bpf/syscall.c            |  4 ++
 kernel/events/core.c            | 82 ++++++++++++++++++++++++++++++++-
 4 files changed, 116 insertions(+), 2 deletions(-)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index 53c500f0ca79..a3126fd5b7f1 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1113,6 +1113,9 @@ static inline void perf_event_task_sched_out(struct task_struct *prev,
 }
 
 extern void perf_event_mmap(struct vm_area_struct *vma);
+extern void perf_event_bpf_event(enum perf_bpf_event_type type,
+				 u16 flags, u32 id);
+
 extern struct perf_guest_info_callbacks *perf_guest_cbs;
 extern int perf_register_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
 extern int perf_unregister_guest_info_callbacks(struct perf_guest_info_callbacks *callbacks);
@@ -1333,6 +1336,8 @@ static inline int perf_unregister_guest_info_callbacks
 (struct perf_guest_info_callbacks *callbacks)				{ return 0; }
 
 static inline void perf_event_mmap(struct vm_area_struct *vma)		{ }
+static inline void perf_event_bpf_event(enum perf_bpf_event_type type,
+					u16 flags, u32 id)		{ }
 static inline void perf_event_exec(void)				{ }
 static inline void perf_event_comm(struct task_struct *tsk, bool exec)	{ }
 static inline void perf_event_namespaces(struct task_struct *tsk)	{ }
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index f35eb72739c0..d51cacb3077a 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
 				context_switch :  1, /* context switch data */
 				write_backward :  1, /* Write ring buffer from end to beginning */
 				namespaces     :  1, /* include namespaces data */
-				__reserved_1   : 35;
+				bpf_event      :  1, /* include bpf events */
+				__reserved_1   : 34;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -963,9 +964,33 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_NAMESPACES			= 16,
 
+	/*
+	 * Record different types of bpf events:
+	 *  enum perf_bpf_event_type {
+	 *     PERF_BPF_EVENT_UNKNOWN		= 0,
+	 *     PERF_BPF_EVENT_PROG_LOAD	= 1,
+	 *     PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	 *  };
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u16 type;
+	 *	u16 flags;
+	 *	u32 id;  // prog_id or map_id
+	 * };
+	 */
+	PERF_RECORD_BPF_EVENT			= 17,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
+enum perf_bpf_event_type {
+	PERF_BPF_EVENT_UNKNOWN		= 0,
+	PERF_BPF_EVENT_PROG_LOAD	= 1,
+	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	PERF_BPF_EVENT_MAX,		/* non-ABI */
+};
+
 #define PERF_MAX_STACK_DEPTH		127
 #define PERF_MAX_CONTEXTS_PER_STACK	  8
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 18e3be193a05..b37051a13be6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1101,9 +1101,12 @@ static void __bpf_prog_put_rcu(struct rcu_head *rcu)
 static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 {
 	if (atomic_dec_and_test(&prog->aux->refcnt)) {
+		int prog_id = prog->aux->id;
+
 		/* bpf_prog_free_id() must be called first */
 		bpf_prog_free_id(prog, do_idr_lock);
 		bpf_prog_kallsyms_del_all(prog);
+		perf_event_bpf_event(PERF_BPF_EVENT_PROG_UNLOAD, 0, prog_id);
 
 		call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu);
 	}
@@ -1441,6 +1444,7 @@ static int bpf_prog_load(union bpf_attr *attr)
 	}
 
 	bpf_prog_kallsyms_add(prog);
+	perf_event_bpf_event(PERF_BPF_EVENT_PROG_LOAD, 0, prog->aux->id);
 	return err;
 
 free_used_maps:
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 5a97f34bc14c..54667be6669b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -385,6 +385,7 @@ static atomic_t nr_namespaces_events __read_mostly;
 static atomic_t nr_task_events __read_mostly;
 static atomic_t nr_freq_events __read_mostly;
 static atomic_t nr_switch_events __read_mostly;
+static atomic_t nr_bpf_events __read_mostly;
 
 static LIST_HEAD(pmus);
 static DEFINE_MUTEX(pmus_lock);
@@ -4235,7 +4236,7 @@ static bool is_sb_event(struct perf_event *event)
 
 	if (attr->mmap || attr->mmap_data || attr->mmap2 ||
 	    attr->comm || attr->comm_exec ||
-	    attr->task ||
+	    attr->task || attr->bpf_event ||
 	    attr->context_switch)
 		return true;
 	return false;
@@ -4305,6 +4306,8 @@ static void unaccount_event(struct perf_event *event)
 		dec = true;
 	if (has_branch_stack(event))
 		dec = true;
+	if (event->attr.bpf_event)
+		atomic_dec(&nr_bpf_events);
 
 	if (dec) {
 		if (!atomic_add_unless(&perf_sched_count, -1, 1))
@@ -7650,6 +7653,81 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 	perf_output_end(&handle);
 }
 
+/*
+ * bpf load/unload tracking
+ */
+
+struct perf_bpf_event {
+	struct {
+		struct perf_event_header        header;
+		u16 type;
+		u16 flags;
+		u32 id;
+	} event_id;
+};
+
+static int perf_event_bpf_match(struct perf_event *event)
+{
+	return event->attr.bpf_event;
+}
+
+static void perf_event_bpf_output(struct perf_event *event,
+				   void *data)
+{
+	struct perf_bpf_event *bpf_event = data;
+	struct perf_output_handle handle;
+	struct perf_sample_data sample;
+	int size = bpf_event->event_id.header.size;
+	int ret;
+
+	if (!perf_event_bpf_match(event))
+		return;
+
+	perf_event_header__init_id(&bpf_event->event_id.header, &sample, event);
+	ret = perf_output_begin(&handle, event,
+				bpf_event->event_id.header.size);
+	if (ret)
+		goto out;
+
+	perf_output_put(&handle, bpf_event->event_id);
+	perf_event__output_id_sample(event, &handle, &sample);
+
+	perf_output_end(&handle);
+out:
+	bpf_event->event_id.header.size = size;
+}
+
+static void perf_event_bpf(struct perf_bpf_event *bpf_event)
+{
+	perf_iterate_sb(perf_event_bpf_output,
+		       bpf_event,
+		       NULL);
+}
+
+void perf_event_bpf_event(enum perf_bpf_event_type type, u16 flags, u32 id)
+{
+	struct perf_bpf_event bpf_event;
+
+	if (!atomic_read(&nr_bpf_events))
+		return;
+
+	if (type <= PERF_BPF_EVENT_UNKNOWN || type >= PERF_BPF_EVENT_MAX)
+		return;
+
+	bpf_event = (struct perf_bpf_event){
+		.event_id = {
+			.header = {
+				.type = PERF_RECORD_BPF_EVENT,
+				.size = sizeof(bpf_event.event_id),
+			},
+			.type = type,
+			.flags = flags,
+			.id = id,
+		},
+	};
+	perf_event_bpf(&bpf_event);
+}
+
 void perf_event_itrace_started(struct perf_event *event)
 {
 	event->attach_state |= PERF_ATTACH_ITRACE;
@@ -9871,6 +9949,8 @@ static void account_event(struct perf_event *event)
 		inc = true;
 	if (is_cgroup_event(event))
 		inc = true;
+	if (event->attr.bpf_event)
+		atomic_inc(&nr_bpf_events);
 
 	if (inc) {
 		/*
-- 
2.17.1


* [RFC perf,bpf 2/5] perf: sync tools/include/uapi/linux/perf_event.h
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

Sync changes for PERF_RECORD_BPF_EVENT.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/include/uapi/linux/perf_event.h | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index f35eb72739c0..d51cacb3077a 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -372,7 +372,8 @@ struct perf_event_attr {
 				context_switch :  1, /* context switch data */
 				write_backward :  1, /* Write ring buffer from end to beginning */
 				namespaces     :  1, /* include namespaces data */
-				__reserved_1   : 35;
+				bpf_event      :  1, /* include bpf events */
+				__reserved_1   : 34;
 
 	union {
 		__u32		wakeup_events;	  /* wakeup every n events */
@@ -963,9 +964,33 @@ enum perf_event_type {
 	 */
 	PERF_RECORD_NAMESPACES			= 16,
 
+	/*
+	 * Record different types of bpf events:
+	 *  enum perf_bpf_event_type {
+	 *     PERF_BPF_EVENT_UNKNOWN		= 0,
+	 *     PERF_BPF_EVENT_PROG_LOAD	= 1,
+	 *     PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	 *  };
+	 *
+	 * struct {
+	 *	struct perf_event_header header;
+	 *	u16 type;
+	 *	u16 flags;
+	 *	u32 id;  // prog_id or map_id
+	 * };
+	 */
+	PERF_RECORD_BPF_EVENT			= 17,
+
 	PERF_RECORD_MAX,			/* non-ABI */
 };
 
+enum perf_bpf_event_type {
+	PERF_BPF_EVENT_UNKNOWN		= 0,
+	PERF_BPF_EVENT_PROG_LOAD	= 1,
+	PERF_BPF_EVENT_PROG_UNLOAD	= 2,
+	PERF_BPF_EVENT_MAX,		/* non-ABI */
+};
+
 #define PERF_MAX_STACK_DEPTH		127
 #define PERF_MAX_CONTEXTS_PER_STACK	  8
 
-- 
2.17.1


* [RFC perf,bpf 4/5] perf util: introduce bpf_prog_info_event
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

This patch introduces struct bpf_prog_info_event to union perf_event.

struct bpf_prog_info_event {
       struct perf_event_header        header;
       u32                             prog_info_len;
       u32                             ksym_table_len;
       u64                             ksym_table;
       struct bpf_prog_info            prog_info;
       char                            data[];
};

struct bpf_prog_info_event contains information about a bpf program.
These events are written to perf.data by perf-record, and processed by
perf-report.

struct bpf_prog_info_event uses arrays for some data (ksym_table, and
arrays in struct bpf_prog_info). To make these arrays easy to serialize,
we allocate contiguous memory (data). The array pointers are translated
to offsets within bpf_prog_info_event before the event is written to file,
and back to pointers when the event is read from file.
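
[Editor's sketch, not part of the patch] The pointer/offset translation described above can be pictured with a miniature event. struct sketch_event and both helpers are hypothetical and much smaller than struct bpf_prog_info_event; the offset base chosen here (the start of the data buffer) is an assumption, since the patch's own helpers are not visible in this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature event whose array lives in a trailing buffer. */
struct sketch_event {
	uint64_t map_ids;	/* pointer in memory, offset in the file */
	uint32_t nr_map_ids;
	uint32_t pad;
	char data[64];
};

/* Before writing: replace the pointer with its offset from ev->data. */
static void ptr_to_offset(struct sketch_event *ev)
{
	uintptr_t base = (uintptr_t)ev->data;
	uintptr_t p = (uintptr_t)ev->map_ids;

	assert(p >= base && p - base <= sizeof(ev->data));
	ev->map_ids = (uint64_t)(p - base);
}

/* After reading: rebuild the pointer from the stored offset. */
static void offset_to_ptr(struct sketch_event *ev)
{
	assert(ev->map_ids <= sizeof(ev->data));
	ev->map_ids = (uint64_t)(uintptr_t)(ev->data + ev->map_ids);
}
```

Because the stored value is relative to the event itself, a copy of the event read back at a different address still resolves to its own data buffer after offset_to_ptr().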

This patch enables synthesizing these events at the beginning of a
perf-record run. The next patch will process short-lived bpf programs that
are created during perf-record.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/builtin-record.c |   5 +
 tools/perf/builtin-report.c |   2 +
 tools/perf/util/Build       |   2 +
 tools/perf/util/bpf-info.c  | 287 ++++++++++++++++++++++++++++++++++++
 tools/perf/util/bpf-info.h  |  29 ++++
 tools/perf/util/event.c     |   1 +
 tools/perf/util/event.h     |  14 ++
 tools/perf/util/session.c   |   4 +
 tools/perf/util/tool.h      |   3 +-
 9 files changed, 346 insertions(+), 1 deletion(-)
 create mode 100644 tools/perf/util/bpf-info.c
 create mode 100644 tools/perf/util/bpf-info.h

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0980dfe3396b..73b02bde1ebc 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -41,6 +41,7 @@
 #include "util/perf-hooks.h"
 #include "util/time-utils.h"
 #include "util/units.h"
+#include "util/bpf-info.h"
 #include "asm/bug.h"
 
 #include <errno.h>
@@ -850,6 +851,9 @@ static int record__synthesize(struct record *rec, bool tail)
 	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
 					    process_synthesized_event, opts->sample_address,
 					    opts->proc_map_timeout, 1);
+
+	err = perf_event__synthesize_bpf_prog_info(
+		&rec->tool, process_synthesized_event, machine);
 out:
 	return err;
 }
@@ -1531,6 +1535,7 @@ static struct record record = {
 		.namespaces	= perf_event__process_namespaces,
 		.mmap		= perf_event__process_mmap,
 		.mmap2		= perf_event__process_mmap2,
+		.bpf_prog_info	= perf_event__process_bpf_prog_info,
 		.ordered_events	= true,
 	},
 };
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index c0703979c51d..4a9a3e8da4e0 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -41,6 +41,7 @@
 #include "util/auxtrace.h"
 #include "util/units.h"
 #include "util/branch.h"
+#include "util/bpf-info.h"
 
 #include <dlfcn.h>
 #include <errno.h>
@@ -981,6 +982,7 @@ int cmd_report(int argc, const char **argv)
 			.auxtrace_info	 = perf_event__process_auxtrace_info,
 			.auxtrace	 = perf_event__process_auxtrace,
 			.feature	 = process_feature_event,
+			.bpf_prog_info	 = perf_event__process_bpf_prog_info,
 			.ordered_events	 = true,
 			.ordering_requires_timestamps = true,
 		},
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index ecd9f9ceda77..624c7281217c 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -150,6 +150,8 @@ endif
 
 libperf-y += perf-hooks.o
 
+libperf-$(CONFIG_LIBBPF) += bpf-info.o
+
 libperf-$(CONFIG_CXX) += c++/
 
 CFLAGS_config.o   += -DETC_PERFCONFIG="BUILD_STR($(ETC_PERFCONFIG_SQ))"
diff --git a/tools/perf/util/bpf-info.c b/tools/perf/util/bpf-info.c
new file mode 100644
index 000000000000..fa598c4328be
--- /dev/null
+++ b/tools/perf/util/bpf-info.c
@@ -0,0 +1,287 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2018 Facebook
+ */
+#include <errno.h>
+#include <stdio.h>
+#include <bpf/bpf.h>
+#include "bpf-info.h"
+#include "debug.h"
+#include "session.h"
+
+#define KSYM_NAME_LEN 128
+#define BPF_PROG_INFO_MIN_SIZE 128  /* minimum size with jited_func_lens */
+
+static inline __u64 ptr_to_u64(const void *ptr)
+{
+	return (__u64) (unsigned long) ptr;
+}
+
+/* fetch information of the bpf program via bpf syscall. */
+struct bpf_prog_info_event *perf_bpf_info__get_bpf_prog_info_event(u32 prog_id)
+{
+	struct bpf_prog_info_event *prog_info_event = NULL;
+	struct bpf_prog_info info = {};
+	u32 info_len = sizeof(info);
+	u32 event_len, i;
+	int fd, err;
+	void *ptr;
+
+	fd = bpf_prog_get_fd_by_id(prog_id);
+	if (fd < 0) {
+		pr_debug("Failed to get fd for prog_id %u\n", prog_id);
+		return NULL;
+	}
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+	if (err) {
+		pr_debug("can't get prog info: %s\n", strerror(errno));
+		goto close_fd;
+	}
+	if (info_len < BPF_PROG_INFO_MIN_SIZE) {
+		pr_debug("kernel is too old to support proper prog info\n");
+		goto close_fd;
+	}
+
+	/* calculate size of bpf_prog_info_event */
+	event_len = sizeof(struct bpf_prog_info_event);
+	event_len += info_len;
+	event_len -= sizeof(info);
+	event_len += info.jited_prog_len;
+	event_len += info.xlated_prog_len;
+	event_len += info.nr_map_ids * sizeof(u32);
+	event_len += info.nr_jited_ksyms * sizeof(u64);
+	event_len += info.nr_jited_func_lens * sizeof(u32);
+	event_len += info.nr_jited_ksyms * KSYM_NAME_LEN;
+
+	prog_info_event = (struct bpf_prog_info_event *) malloc(event_len);
+	if (!prog_info_event)
+		goto close_fd;
+
+	/* assign pointers for map_ids, jited_prog_insns, etc. */
+	ptr = prog_info_event->data;
+	info.map_ids = ptr_to_u64(ptr);
+	ptr += info.nr_map_ids * sizeof(u32);
+	info.jited_prog_insns = ptr_to_u64(ptr);
+	ptr += info.jited_prog_len;
+	info.xlated_prog_insns = ptr_to_u64(ptr);
+	ptr += info.xlated_prog_len;
+	info.jited_ksyms = ptr_to_u64(ptr);
+	ptr += info.nr_jited_ksyms * sizeof(u64);
+	info.jited_func_lens = ptr_to_u64(ptr);
+	ptr += info.nr_jited_func_lens * sizeof(u32);
+
+	err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
+	if (err) {
+		pr_err("can't get prog info: %s\n", strerror(errno));
+		free(prog_info_event);
+		prog_info_event = NULL;
+		goto close_fd;
+	}
+
+	/* fill data in prog_info_event */
+	prog_info_event->header.type = PERF_RECORD_BPF_PROG_INFO;
+	prog_info_event->header.misc = 0;
+	prog_info_event->prog_info_len = info_len;
+
+	memcpy(&prog_info_event->prog_info, &info, info_len);
+
+	prog_info_event->ksym_table_len = 0;
+	prog_info_event->ksym_table = ptr_to_u64(ptr);
+
+	/* fill in fake symbol name for now, add real name after BTF */
+	if (info.nr_jited_func_lens == 1 && info.name) {  /* only main prog */
+		size_t l;
+
+		assert(info.nr_jited_ksyms == 1);
+		l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%s", info.name);
+		prog_info_event->ksym_table_len += l + 1;
+		ptr += l + 1;
+
+	} else {
+		assert(info.nr_jited_ksyms == info.nr_jited_func_lens);
+
+		for (i = 0; i < info.nr_jited_ksyms; i++) {
+			size_t l;
+
+			l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%d_%d",
+				     info.id, i);
+			prog_info_event->ksym_table_len += l + 1;
+			ptr += l + 1;
+		}
+	}
+
+	prog_info_event->header.size = ptr - (void *)prog_info_event;
+
+close_fd:
+	close(fd);
+	return prog_info_event;
+}
+
+static size_t fprintf_bpf_prog_info(
+	struct bpf_prog_info_event *prog_info_event, FILE *fp)
+{
+	struct bpf_prog_info *info = &prog_info_event->prog_info;
+	unsigned long *jited_ksyms = (unsigned long *)(info->jited_ksyms);
+	char *name_ptr = (char *) prog_info_event->ksym_table;
+	unsigned int i;
+	size_t ret;
+
+	ret = fprintf(fp, "bpf_prog: type: %u id: %u ", info->type, info->id);
+	ret += fprintf(fp, "nr_jited_ksyms: %u\n", info->nr_jited_ksyms);
+
+	for (i = 0; i < info->nr_jited_ksyms; i++) {
+		ret += fprintf(fp, "jited_ksyms[%u]: %lx %s\n",
+			       i, jited_ksyms[i], name_ptr);
+		name_ptr += strlen(name_ptr) + 1;
+	}
+	return ret;
+}
+
+size_t perf_event__fprintf_bpf_prog_info(union perf_event *event, FILE *fp)
+{
+	return fprintf_bpf_prog_info(&event->bpf_prog_info, fp);
+}
+
+/*
+ * translate all array ptr to offset from base address, called before
+ * writing the event to file
+ */
+void perf_bpf_info__ptr_to_offset(
+	struct bpf_prog_info_event *prog_info_event)
+{
+	u64 base = ptr_to_u64(prog_info_event);
+
+	prog_info_event->ksym_table -= base;
+	prog_info_event->prog_info.jited_prog_insns -= base;
+	prog_info_event->prog_info.xlated_prog_insns -= base;
+	prog_info_event->prog_info.map_ids -= base;
+	prog_info_event->prog_info.jited_ksyms -= base;
+	prog_info_event->prog_info.jited_func_lens -= base;
+}
+
+/*
+ * translate offset from base address to array pointer, called after
+ * reading the event from file
+ */
+void perf_bpf_info__offset_to_ptr(
+	struct bpf_prog_info_event *prog_info_event)
+{
+	u64 base = ptr_to_u64(prog_info_event);
+
+	prog_info_event->ksym_table += base;
+	prog_info_event->prog_info.jited_prog_insns += base;
+	prog_info_event->prog_info.xlated_prog_insns += base;
+	prog_info_event->prog_info.map_ids += base;
+	prog_info_event->prog_info.jited_ksyms += base;
+	prog_info_event->prog_info.jited_func_lens += base;
+}
+
+int perf_event__synthesize_one_bpf_prog_info(struct perf_tool *tool,
+					     perf_event__handler_t process,
+					     struct machine *machine,
+					     __u32 id)
+{
+	struct bpf_prog_info_event *prog_info_event;
+
+	prog_info_event = perf_bpf_info__get_bpf_prog_info_event(id);
+
+	if (!prog_info_event) {
+		pr_err("Failed to get prog_info_event\n");
+		return -1;
+	}
+	perf_bpf_info__ptr_to_offset(prog_info_event);
+
+	if (perf_tool__process_synth_event(
+		    tool, (union perf_event *)prog_info_event,
+		    machine, process) != 0) {
+		free(prog_info_event);
+		return -1;
+	}
+
+	free(prog_info_event);
+	return 0;
+}
+
+int perf_event__synthesize_bpf_prog_info(struct perf_tool *tool,
+					 perf_event__handler_t process,
+					 struct machine *machine)
+{
+	__u32 id = 0;
+	int err = 0;
+
+	while (true) {
+		err = bpf_prog_get_next_id(id, &id);
+		if (err) {
+			if (errno == ENOENT) {
+				err = 0;
+				break;
+			}
+			fprintf(stderr, "can't get next program: %s%s\n",
+				strerror(errno),
+				errno == EINVAL ? " -- kernel too old?" : "");
+			err = -1;
+			break;
+		}
+		err = perf_event__synthesize_one_bpf_prog_info(
+			tool, process, machine, id);
+	}
+	return err;
+}
+
+int perf_event__process_bpf_prog_info(struct perf_session *session,
+				      union perf_event *event)
+{
+	struct machine *machine = &session->machines.host;
+	struct bpf_prog_info_event *prog_info_event;
+	struct bpf_prog_info *info;
+	struct symbol *sym;
+	struct map *map;
+	char *name_ptr;
+	int ret = 0;
+	u64 *addrs;
+	u32 *lens;
+	u32 i;
+
+	prog_info_event = (struct bpf_prog_info_event *)
+		malloc(event->header.size);
+	if (!prog_info_event)
+		return -ENOMEM;
+
+	/* copy the data to rw memory so we can modify it */
+	memcpy(prog_info_event,  &event->bpf_prog_info, event->header.size);
+	info = &prog_info_event->prog_info;
+
+	perf_bpf_info__offset_to_ptr(prog_info_event);
+	name_ptr = (char *) prog_info_event->ksym_table;
+	addrs = (u64 *)info->jited_ksyms;
+	lens = (u32 *)info->jited_func_lens;
+	for (i = 0; i < info->nr_jited_ksyms; i++) {
+		u32 len = info->nr_jited_func_lens == 1 ?
+			info->jited_prog_len : lens[i];
+
+		map = map_groups__find(&machine->kmaps, addrs[i]);
+		if (!map) {
+			map = dso__new_map("bpf_prog");
+			if (!map) {
+				ret = -ENOMEM;
+				break;
+			}
+			map->start = addrs[i];
+			map->pgoff = map->start;
+			map->end = map->start + len;
+			map_groups__insert(&machine->kmaps, map);
+		}
+
+		sym = symbol__new(addrs[i], len, 0, 0, name_ptr);
+		if (!sym) {
+			ret = -ENOMEM;
+			break;
+		}
+		dso__insert_symbol(map->dso, sym);
+		name_ptr += strlen(name_ptr) + 1;
+	}
+
+	free(prog_info_event);
+	return ret;
+}
diff --git a/tools/perf/util/bpf-info.h b/tools/perf/util/bpf-info.h
new file mode 100644
index 000000000000..813cad07bacb
--- /dev/null
+++ b/tools/perf/util/bpf-info.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_BPF_INFO_H
+#define __PERF_BPF_INFO_H
+
+#include "event.h"
+#include "machine.h"
+#include "tool.h"
+#include "symbol.h"
+
+struct bpf_prog_info_event *perf_bpf_info__get_bpf_prog_info_event(u32 prog_id);
+
+size_t perf_event__fprintf_bpf_prog_info(union perf_event *event, FILE *fp);
+
+int perf_event__synthesize_one_bpf_prog_info(struct perf_tool *tool,
+					     perf_event__handler_t process,
+					     struct machine *machine,
+					     __u32 id);
+
+int perf_event__synthesize_bpf_prog_info(struct perf_tool *tool,
+					 perf_event__handler_t process,
+					 struct machine *machine);
+
+void perf_bpf_info__ptr_to_offset(struct bpf_prog_info_event *prog_info_event);
+void perf_bpf_info__offset_to_ptr(struct bpf_prog_info_event *prog_info_event);
+
+int perf_event__process_bpf_prog_info(struct perf_session *session,
+				      union perf_event *event);
+
+#endif /* __PERF_BPF_INFO_H */
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 601432afbfb2..33b1c168b83e 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -61,6 +61,7 @@ static const char *perf_event__names[] = {
 	[PERF_RECORD_EVENT_UPDATE]		= "EVENT_UPDATE",
 	[PERF_RECORD_TIME_CONV]			= "TIME_CONV",
 	[PERF_RECORD_HEADER_FEATURE]		= "FEATURE",
+	[PERF_RECORD_BPF_PROG_INFO]		= "BPF_PROG_INFO",
 };
 
 static const char *perf_ns__names[] = {
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 13a0c64dd0ed..dc64d800eaa6 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -5,6 +5,7 @@
 #include <limits.h>
 #include <stdio.h>
 #include <linux/kernel.h>
+#include <linux/bpf.h>
 
 #include "../perf.h"
 #include "build-id.h"
@@ -258,6 +259,7 @@ enum perf_user_event_type { /* above any possible kernel type */
 	PERF_RECORD_EVENT_UPDATE		= 78,
 	PERF_RECORD_TIME_CONV			= 79,
 	PERF_RECORD_HEADER_FEATURE		= 80,
+	PERF_RECORD_BPF_PROG_INFO		= 81,
 	PERF_RECORD_HEADER_MAX
 };
 
@@ -629,6 +631,17 @@ struct feature_event {
 	char				data[];
 };
 
+#define KSYM_NAME_LEN 128
+
+struct bpf_prog_info_event {
+	struct perf_event_header	header;
+	u32				prog_info_len;
+	u32				ksym_table_len;
+	u64				ksym_table;
+	struct bpf_prog_info		prog_info;
+	char				data[];
+};
+
 union perf_event {
 	struct perf_event_header	header;
 	struct mmap_event		mmap;
@@ -661,6 +674,7 @@ union perf_event {
 	struct time_conv_event		time_conv;
 	struct feature_event		feat;
 	struct bpf_event		bpf_event;
+	struct bpf_prog_info_event	bpf_prog_info;
 };
 
 void perf_event__print_totals(void);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index dffe5120d2d3..5365ee1dfbec 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -415,6 +415,8 @@ void perf_tool__fill_defaults(struct perf_tool *tool)
 		tool->time_conv = process_event_op2_stub;
 	if (tool->feature == NULL)
 		tool->feature = process_event_op2_stub;
+	if (tool->bpf_prog_info == NULL)
+		tool->bpf_prog_info = process_event_op2_stub;
 }
 
 static void swap_sample_id_all(union perf_event *event, void *data)
@@ -1397,6 +1399,8 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 		return tool->time_conv(session, event);
 	case PERF_RECORD_HEADER_FEATURE:
 		return tool->feature(session, event);
+	case PERF_RECORD_BPF_PROG_INFO:
+		return tool->bpf_prog_info(session, event);
 	default:
 		return -EINVAL;
 	}
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 69ae898ca024..739a4b1188f7 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -70,7 +70,8 @@ struct perf_tool {
 			stat_config,
 			stat,
 			stat_round,
-			feature;
+			feature,
+			bpf_prog_info;
 	event_op3	auxtrace;
 	bool		ordered_events;
 	bool		ordering_requires_timestamps;
-- 
2.17.1

^ permalink raw reply related

* [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Song Liu @ 2018-11-06 20:52 UTC (permalink / raw)
  To: netdev, linux-kernel; +Cc: kernel-team, Song Liu, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-1-songliubraving@fb.com>

This patch enables perf-record to listen to bpf_event and generate
bpf_prog_info_event for bpf programs loaded and unloaded during a
perf-record run.

To minimize latency between bpf_event and the following bpf calls, a
separate mmap with a watermark of 1 is created to process these VIP
events. A separate dummy event is then attached to the special mmap.

By default, perf-record will listen to bpf_event. The --no-bpf-event
option is added in case the user wants to opt out.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 tools/perf/builtin-record.c | 50 +++++++++++++++++++++++++++++++++++++
 tools/perf/util/evlist.c    | 42 ++++++++++++++++++++++++++++---
 tools/perf/util/evlist.h    |  4 +++
 tools/perf/util/evsel.c     |  8 ++++++
 tools/perf/util/evsel.h     |  3 +++
 5 files changed, 104 insertions(+), 3 deletions(-)

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 73b02bde1ebc..1036a64eb9f7 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -80,6 +80,7 @@ struct record {
 	bool			buildid_all;
 	bool			timestamp_filename;
 	bool			timestamp_boundary;
+	bool			no_bpf_event;
 	struct switch_output	switch_output;
 	unsigned long long	samples;
 };
@@ -381,6 +382,8 @@ static int record__open(struct record *rec)
 		pos->tracking = 1;
 		pos->attr.enable_on_exec = 1;
 	}
+	if (!rec->no_bpf_event)
+		perf_evlist__add_bpf_tracker(evlist);
 
 	perf_evlist__config(evlist, opts, &callchain_param);
 
@@ -562,10 +565,55 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	return rc;
 }
 
+static int record__mmap_process_vip_events(struct record *rec)
+{
+	int i;
+
+	for (i = 0; i < rec->evlist->nr_mmaps; i++) {
+		struct perf_mmap *map = &rec->evlist->vip_mmap[i];
+		union perf_event *event;
+
+		perf_mmap__read_init(map);
+		while ((event = perf_mmap__read_event(map)) != NULL) {
+			pr_debug("processing vip event of type %d\n",
+				 event->header.type);
+			switch (event->header.type) {
+			case PERF_RECORD_BPF_EVENT:
+				switch (event->bpf_event.type) {
+				case PERF_BPF_EVENT_PROG_LOAD:
+					perf_event__synthesize_one_bpf_prog_info(
+						&rec->tool,
+						process_synthesized_event,
+						&rec->session->machines.host,
+						event->bpf_event.id);
+					/* fall through */
+				case PERF_BPF_EVENT_PROG_UNLOAD:
+					record__write(rec, NULL, event,
+						      event->header.size);
+					break;
+				default:
+					break;
+				}
+				break;
+			default:
+				break;
+			}
+			perf_mmap__consume(map);
+		}
+		perf_mmap__read_done(map);
+	}
+
+	return 0;
+}
+
 static int record__mmap_read_all(struct record *rec)
 {
 	int err;
 
+	err = record__mmap_process_vip_events(rec);
+	if (err)
+		return err;
+
 	err = record__mmap_read_evlist(rec, rec->evlist, false);
 	if (err)
 		return err;
@@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
 			  "signal"),
 	OPT_BOOLEAN(0, "dry-run", &dry_run,
 		    "Parse options then exit"),
+	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
+		    "do not record event on bpf program load/unload"),
 	OPT_END()
 };
 
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index be440df29615..466a9f7b1e93 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -45,6 +45,7 @@ void perf_evlist__init(struct perf_evlist *evlist, struct cpu_map *cpus,
 	for (i = 0; i < PERF_EVLIST__HLIST_SIZE; ++i)
 		INIT_HLIST_HEAD(&evlist->heads[i]);
 	INIT_LIST_HEAD(&evlist->entries);
+	INIT_LIST_HEAD(&evlist->vip_entries);
 	perf_evlist__set_maps(evlist, cpus, threads);
 	fdarray__init(&evlist->pollfd, 64);
 	evlist->workload.pid = -1;
@@ -177,6 +178,8 @@ void perf_evlist__add(struct perf_evlist *evlist, struct perf_evsel *entry)
 {
 	entry->evlist = evlist;
 	list_add_tail(&entry->node, &evlist->entries);
+	if (entry->vip)
+		list_add_tail(&entry->vip_node, &evlist->vip_entries);
 	entry->idx = evlist->nr_entries;
 	entry->tracking = !entry->idx;
 
@@ -267,6 +270,27 @@ int perf_evlist__add_dummy(struct perf_evlist *evlist)
 	return 0;
 }
 
+int perf_evlist__add_bpf_tracker(struct perf_evlist *evlist)
+{
+	struct perf_event_attr attr = {
+		.type	          = PERF_TYPE_SOFTWARE,
+		.config           = PERF_COUNT_SW_DUMMY,
+		.watermark        = 1,
+		.bpf_event        = 1,
+		.wakeup_watermark = 1,
+		.size	   = sizeof(attr), /* to capture ABI version */
+	};
+	struct perf_evsel *evsel = perf_evsel__new_idx(&attr,
+						       evlist->nr_entries);
+
+	if (evsel == NULL)
+		return -ENOMEM;
+
+	evsel->vip = true;
+	perf_evlist__add(evlist, evsel);
+	return 0;
+}
+
 static int perf_evlist__add_attrs(struct perf_evlist *evlist,
 				  struct perf_event_attr *attrs, size_t nr_attrs)
 {
@@ -770,6 +794,7 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 	int evlist_cpu = cpu_map__cpu(evlist->cpus, cpu_idx);
 
 	evlist__for_each_entry(evlist, evsel) {
+		struct perf_mmap *vip_maps = evlist->vip_mmap;
 		struct perf_mmap *maps = evlist->mmap;
 		int *output = _output;
 		int fd;
@@ -800,7 +825,11 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 
 		fd = FD(evsel, cpu, thread);
 
-		if (*output == -1) {
+		if (evsel->vip) {
+			if (perf_mmap__mmap(&vip_maps[idx], mp,
+					    fd, evlist_cpu) < 0)
+				return -1;
+		} else if (*output == -1) {
 			*output = fd;
 
 			if (perf_mmap__mmap(&maps[idx], mp, *output, evlist_cpu) < 0)
@@ -822,8 +851,12 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		 * Therefore don't add it for polling.
 		 */
 		if (!evsel->system_wide &&
-		    __perf_evlist__add_pollfd(evlist, fd, &maps[idx], revent) < 0) {
-			perf_mmap__put(&maps[idx]);
+		    __perf_evlist__add_pollfd(
+			    evlist, fd,
+			    evsel->vip ? &vip_maps[idx] : &maps[idx],
+			    revent) < 0) {
+			perf_mmap__put(evsel->vip ?
+				       &vip_maps[idx] : &maps[idx]);
 			return -1;
 		}
 
@@ -1035,6 +1068,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (!evlist->mmap)
 		return -ENOMEM;
 
+	if (!evlist->vip_mmap)
+		evlist->vip_mmap = perf_evlist__alloc_mmap(evlist, false);
+
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index dc66436add98..6d99e8dab570 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -26,6 +26,7 @@ struct record_opts;
 
 struct perf_evlist {
 	struct list_head entries;
+	struct list_head vip_entries;
 	struct hlist_head heads[PERF_EVLIST__HLIST_SIZE];
 	int		 nr_entries;
 	int		 nr_groups;
@@ -43,6 +44,7 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
+	struct perf_mmap *vip_mmap;
 	struct perf_mmap *overwrite_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
@@ -84,6 +86,8 @@ int __perf_evlist__add_default_attrs(struct perf_evlist *evlist,
 
 int perf_evlist__add_dummy(struct perf_evlist *evlist);
 
+int perf_evlist__add_bpf_tracker(struct perf_evlist *evlist);
+
 int perf_evlist__add_newtp(struct perf_evlist *evlist,
 			   const char *sys, const char *name, void *handler);
 
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index af9d539e4b6a..94456a493607 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -235,6 +235,7 @@ void perf_evsel__init(struct perf_evsel *evsel,
 	evsel->evlist	   = NULL;
 	evsel->bpf_fd	   = -1;
 	INIT_LIST_HEAD(&evsel->node);
+	INIT_LIST_HEAD(&evsel->vip_node);
 	INIT_LIST_HEAD(&evsel->config_terms);
 	perf_evsel__object.init(evsel);
 	evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
@@ -1795,6 +1796,8 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 				     PERF_SAMPLE_BRANCH_NO_CYCLES);
 	if (perf_missing_features.group_read && evsel->attr.inherit)
 		evsel->attr.read_format &= ~(PERF_FORMAT_GROUP|PERF_FORMAT_ID);
+	if (perf_missing_features.bpf_event)
+		evsel->attr.bpf_event = 0;
 retry_sample_id:
 	if (perf_missing_features.sample_id_all)
 		evsel->attr.sample_id_all = 0;
@@ -1939,6 +1942,11 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 		perf_missing_features.exclude_guest = true;
 		pr_debug2("switching off exclude_guest, exclude_host\n");
 		goto fallback_missing_features;
+	} else if (!perf_missing_features.bpf_event &&
+		   evsel->attr.bpf_event) {
+		perf_missing_features.bpf_event = true;
+		pr_debug2("switching off bpf_event\n");
+		goto fallback_missing_features;
 	} else if (!perf_missing_features.sample_id_all) {
 		perf_missing_features.sample_id_all = true;
 		pr_debug2("switching off sample_id_all\n");
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 4107c39f4a54..82b1d3e42603 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -89,6 +89,7 @@ struct perf_stat_evsel;
  */
 struct perf_evsel {
 	struct list_head	node;
+	struct list_head	vip_node;
 	struct perf_evlist	*evlist;
 	struct perf_event_attr	attr;
 	char			*filter;
@@ -128,6 +129,7 @@ struct perf_evsel {
 	bool			ignore_missing_thread;
 	bool			forced_leader;
 	bool			use_uncore_alias;
+	bool			vip;  /* vip events have their own mmap */
 	/* parse modifier helper */
 	int			exclude_GH;
 	int			nr_members;
@@ -163,6 +165,7 @@ struct perf_missing_features {
 	bool lbr_flags;
 	bool write_backward;
 	bool group_read;
+	bool bpf_event;
 };
 
 extern struct perf_missing_features perf_missing_features;
-- 
2.17.1

^ permalink raw reply related

* Re: [PATCH 2/6] dt-bindings: phy: phy-of-simple: Document new binding
From: Rob Herring @ 2018-11-06 20:55 UTC (permalink / raw)
  To: Faiz Abbas
  Cc: linux-kernel@vger.kernel.org, devicetree, netdev, linux-can,
	Wolfgang Grandegger, Marc Kleine-Budde, Mark Rutland,
	Kishon Vijay Abraham I
In-Reply-To: <20181102192616.28291-3-faiz_abbas@ti.com>

On Fri, Nov 2, 2018 at 2:23 PM Faiz Abbas <faiz_abbas@ti.com> wrote:
>
> Add documentation for the generic simple phy implementation.

We don't do 'simple' or 'generic' bindings.

> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> ---
>  .../devicetree/bindings/phy/phy-of-simple.txt | 29 +++++++++++++++++++
>  1 file changed, 29 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/phy/phy-of-simple.txt
>
> diff --git a/Documentation/devicetree/bindings/phy/phy-of-simple.txt b/Documentation/devicetree/bindings/phy/phy-of-simple.txt
> new file mode 100644
> index 000000000000..696f2763395c
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/phy/phy-of-simple.txt
> @@ -0,0 +1,29 @@
> +Generic simple phy device tree binding
> +--------------------------------------
> +
> +A good number of phy implementations merely read dts properties,
> +enable clocks, regulators or do resets without having a dedicated register
> +map. This binding implements a generic phy driver which can be used for
> +such simple implementations and avoid boilerplate code duplication.

Sure, but then later some PHY needs certain timing/ordering of those
controls or some other DT additions. 'generic' or 'simple' never work
out for bindings. By all means though, write a simple/generic phy
driver. Just make it understand an explicit list of compatible
strings. Then when a phy turns out to be not so simple, we can write a
driver for it without changing the DT.
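
As a rough illustration of the "explicit list of compatible strings"
idea, here is a user-space sketch. It only mimics the shape of a match
table; an in-kernel driver would use a struct of_device_id table
instead. The compatible strings, the phy_match struct and
phy_find_match() are all hypothetical names made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-device data, selected by compatible string. */
struct phy_match {
	const char *compatible;
	unsigned int max_bitrate;	/* illustrative per-device default */
};

/* Explicit list: every supported device is named, nothing is "generic". */
static const struct phy_match phy_matches[] = {
	{ "vendor-a,can-transceiver-x", 5000000 },
	{ "vendor-b,can-transceiver-y", 8000000 },
	{ NULL, 0 }	/* sentinel */
};

/* Return the table entry for a compatible string, or NULL if unknown. */
static const struct phy_match *phy_find_match(const char *compatible)
{
	const struct phy_match *m;

	for (m = phy_matches; m->compatible; m++)
		if (strcmp(m->compatible, compatible) == 0)
			return m;
	return NULL;
}
```

With a table like this, a device that later needs special handling can
move to its own driver keyed on the same string, and the DT stays as is.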

> +Required Properties:
> +-  compatible  : must be "simple-phy"
> +-  phy-cells    : must be 0

#phy-cells

> +
> +Optional Properties:
> +-  bus-width   : generic bus-width. Must be positive.
> +-  max-bitrate : generic max-bitrate. Must be positive.
> +-  pwr         : phandle to phy pwr regulator node.

That's not the regulator binding.

> +
> +Example:
> +
> +The following example is a can transceiver implemented as a generic phy.
> +It has a max-bitrate property and a pwr regulator.
> +
> +
> +transceiver1: can-transceiver {
> +       compatible = "simple-phy";
> +       max-bitrate = <5000000>;
> +       pwr-supply = <&transceiver1_fixed>;
> +       #phy-cells = <0>;
> +};
> --
> 2.18.0
>

^ permalink raw reply

* Re: [RFC perf,bpf 4/5] perf util: introduce bpf_prog_info_event
From: Alexei Starovoitov @ 2018-11-06 21:11 UTC (permalink / raw)
  To: Song Liu, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
  Cc: Kernel Team, ast@kernel.org, daniel@iogearbox.net,
	peterz@infradead.org, acme@kernel.org
In-Reply-To: <20181106205246.567448-5-songliubraving@fb.com>

On 11/6/18 12:52 PM, Song Liu wrote:
> +	/* fill in fake symbol name for now, add real name after BTF */
> +	if (info.nr_jited_func_lens == 1 && info.name) {  /* only main prog */
> +		size_t l;
> +
> +		assert(info.nr_jited_ksyms == 1);
> +		l = snprintf(ptr, KSYM_NAME_LEN, "bpf_prog_%s", info.name);

please include the prog tag here. Just like kernel kallsyms do.
Other than this small nit the patch set looks great to me.
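
For reference, a minimal sketch of what such a name could look like,
assuming the kallsyms-style layout "bpf_prog_<tag in hex>_<name>" and
the 8-byte tag from struct bpf_prog_info. format_bpf_ksym() is a
hypothetical helper written for this example, not code from the patch.

```c
#include <stdio.h>
#include <string.h>

#define BPF_TAG_SIZE 8	/* same value as the uapi definition */

/*
 * Hypothetical helper: write "bpf_prog_<tag-hex>[_<name>]" into buf.
 * Returns the string length (without the NUL), or -1 if buf is too small.
 */
static int format_bpf_ksym(char *buf, size_t size,
			   const unsigned char tag[BPF_TAG_SIZE],
			   const char *name)
{
	size_t off;
	int n, i;

	n = snprintf(buf, size, "bpf_prog_");
	if (n < 0 || (size_t)n >= size)
		return -1;
	off = n;

	/* two hex digits per tag byte */
	for (i = 0; i < BPF_TAG_SIZE; i++) {
		n = snprintf(buf + off, size - off, "%02x", tag[i]);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += n;
	}

	/* the name is optional; skip the suffix when it is empty */
	if (name && name[0]) {
		n = snprintf(buf + off, size - off, "_%s", name);
		if (n < 0 || (size_t)n >= size - off)
			return -1;
		off += n;
	}
	return (int)off;
}
```

In the patch this would be called with info.tag and info.name and a
buffer of KSYM_NAME_LEN bytes.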

^ permalink raw reply

* Re: [PATCH V2] mlx5: Fix formats with line continuation whitespace
From: Doug Ledford @ 2018-11-06 21:34 UTC (permalink / raw)
  To: Leon Romanovsky, Joe Perches
  Cc: Saeed Mahameed, David S. Miller, netdev, linux-rdma, linux-kernel
In-Reply-To: <20181101073412.GQ3974@mtr-leonro.mtl.com>

On Thu, 2018-11-01 at 09:34 +0200, Leon Romanovsky wrote:
> On Thu, Nov 01, 2018 at 12:24:08AM -0700, Joe Perches wrote:
> > The line continuations unintentionally add whitespace so
> > instead use coalesced formats to remove the whitespace.
> > 
> > Signed-off-by: Joe Perches <joe@perches.com>
> > ---
> > 
> > v2: Remove excess space after %u
> > 
> >  drivers/net/ethernet/mellanox/mlx5/core/rl.c | 6 ++----
> >  1 file changed, 2 insertions(+), 4 deletions(-)
> > 
> 
> Thanks,
> Reviewed-by: Leon Romanovsky <leonro@mellanox.com>

Applied, thanks.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD


^ permalink raw reply

* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: David Ahern @ 2018-11-06 21:54 UTC (permalink / raw)
  To: Song Liu, netdev, linux-kernel; +Cc: kernel-team, ast, daniel, peterz, acme
In-Reply-To: <20181106205246.567448-6-songliubraving@fb.com>

On 11/6/18 1:52 PM, Song Liu wrote:
> +
>  static int record__mmap_read_all(struct record *rec)
>  {
>  	int err;
>  
> +	err = record__mmap_process_vip_events(rec);
> +	if (err)
> +		return err;
> +
>  	err = record__mmap_read_evlist(rec, rec->evlist, false);
>  	if (err)
>  		return err;

Seems to me that is going to increase the overhead of perf on any system
doing BPF updates. The BPF events cause a wakeup every load and unload,
and perf processes not only the VIP events but then walks all of the
other maps.

> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>  			  "signal"),
>  	OPT_BOOLEAN(0, "dry-run", &dry_run,
>  		    "Parse options then exit"),
> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
> +		    "do not record event on bpf program load/unload"),

Why should this default to on? If I am recording FIB events, I don't care
about BPF events.

^ permalink raw reply

* Re: [PATCH v3 1/2] kretprobe: produce sane stack traces
From: Steven Rostedt @ 2018-11-06 22:15 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Naveen N. Rao, Anil S Keshavamurthy, David S. Miller,
	Masami Hiramatsu, Jonathan Corbet, Peter Zijlstra, Ingo Molnar,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Shuah Khan, Alexei Starovoitov, Daniel Borkmann,
	Brendan Gregg, Christian Brauner, Aleksa Sarai, netdev, linux-doc,
	linux-kernel
In-Reply-To: <20181104115913.74l4yzecisvtt2j5@yavin>

On Sun, 4 Nov 2018 22:59:13 +1100
Aleksa Sarai <cyphar@cyphar.com> wrote:

> The same issue is present in __save_stack_trace
> (arch/x86/kernel/stacktrace.c). This is likely the only reason that --
> as Steven said -- stacktraces wouldn't work with ftrace-graph (and thus
> with the refactor both of you are discussing).

By the way, I was playing with the ORC unwinder and stack traces
from the function graph tracer return code, and got it working with the
below patch. Caution, that patch also has a stack trace hardcoded in
the return path of the function graph tracer, so you don't want to run
function graph tracing without filtering.

You can apply the patch and do:

 # cd /sys/kernel/debug/tracing
 # echo schedule > set_ftrace_filter
 # echo function_graph > current_tracer
 # cat trace

 3)               |  schedule() {
     rcu_preempt-10    [003] ....    91.160297: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => schedule_timeout
 => rcu_gp_kthread
 => kthread
 => ret_from_fork
 3) # 4009.085 us |  }
 3)               |  schedule() {
     kworker/1:0-17    [001] ....    91.163288: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => worker_thread
 => kthread
 => ret_from_fork
 1) # 7000.070 us |  }
 1)               |  schedule() {
     rcu_preempt-10    [003] ....    91.164311: <stack trace>
 => ftrace_return_to_handler
 => return_to_handler
 => schedule_timeout
 => rcu_gp_kthread
 => kthread
 => ret_from_fork
 3) # 4006.540 us |  }


With only the stack trace added, and none of the other changes, these
traces ended at "return_to_handler".

This patch is not for inclusion, it was just a test to see how to make
this work.

-- Steve

diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 91b2cff4b79a..4bcd646ae1f4 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -320,13 +320,15 @@ ENTRY(ftrace_graph_caller)
 ENDPROC(ftrace_graph_caller)
 
 ENTRY(return_to_handler)
-	UNWIND_HINT_EMPTY
-	subq  $24, %rsp
+	subq $8, %rsp
+	UNWIND_HINT_FUNC
+	subq  $16, %rsp
 
 	/* Save the return values */
 	movq %rax, (%rsp)
 	movq %rdx, 8(%rsp)
 	movq %rbp, %rdi
+	leaq 16(%rsp), %rsi
 
 	call ftrace_return_to_handler
 
diff --git a/kernel/trace/trace_functions_graph.c b/kernel/trace/trace_functions_graph.c
index 169b3c44ee97..aaeca73218cc 100644
--- a/kernel/trace/trace_functions_graph.c
+++ b/kernel/trace/trace_functions_graph.c
@@ -242,13 +242,16 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
 	trace->calltime = current->ret_stack[index].calltime;
 	trace->overrun = atomic_read(&current->trace_overrun);
 	trace->depth = index;
+
+	trace_dump_stack(0);
 }
 
 /*
  * Send the trace to the ring-buffer.
  * @return the original return address.
  */
-unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
+unsigned long ftrace_return_to_handler(unsigned long frame_pointer,
+	unsigned long *ptr)
 {
 	struct ftrace_graph_ret trace;
 	unsigned long ret;
@@ -257,6 +260,8 @@ unsigned long ftrace_return_to_handler(unsigned long frame_pointer)
 	trace.rettime = trace_clock_local();
 	barrier();
 	current->curr_ret_stack--;
+	*ptr = ret;
+
 	/*
 	 * The curr_ret_stack can be less than -1 only if it was
 	 * filtered out and it's about to return from the function.

^ permalink raw reply related

* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Song Liu @ 2018-11-06 23:17 UTC (permalink / raw)
  To: David Ahern
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Team,
	ast@kernel.org, daniel@iogearbox.net, peterz@infradead.org,
	acme@kernel.org
In-Reply-To: <b984776a-ce69-ab16-ed87-1cea89c9d79a@gmail.com>



> On Nov 6, 2018, at 1:54 PM, David Ahern <dsahern@gmail.com> wrote:
> 
> On 11/6/18 1:52 PM, Song Liu wrote:
>> +
>> static int record__mmap_read_all(struct record *rec)
>> {
>> 	int err;
>> 
>> +	err = record__mmap_process_vip_events(rec);
>> +	if (err)
>> +		return err;
>> +
>> 	err = record__mmap_read_evlist(rec, rec->evlist, false);
>> 	if (err)
>> 		return err;
> 
> Seems to me that is going to increase the overhead of perf on any system
> doing BPF updates. The BPF events cause a wakeup every load and unload,
> and perf processes not only the VIP events but then walks all of the
> other maps.

BPF prog load/unload events should be rare events in real world use cases. 
So I think the overhead is OK. Also, I don't see an easy way to improve 
this. 

> 
>> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>> 			  "signal"),
>> 	OPT_BOOLEAN(0, "dry-run", &dry_run,
>> 		    "Parse options then exit"),
>> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
>> +		    "do not record event on bpf program load/unload"),
> 
> Why should this default to on? If I am recording FIB events, I don't care
> about BPF events.
> 

I am OK with default off if that's the preferred way. 

Thanks,
Song

^ permalink raw reply
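
The overhead concern quoted in the message above (every BPF load/unload wakes the reader, which handles the VIP events and then walks all of the other maps) can be put into rough numbers. The following is a minimal sketch, not perf code: `struct read_cost`, `read_all_once()` and `simulate()` are hypothetical names invented for illustration, and the model assumes exactly one wakeup per BPF event and one visit per ring buffer per wakeup.

```c
#include <stddef.h>

/* Hypothetical model, not perf code: counts how many ring buffers are
 * visited when every BPF load/unload event wakes the reader. */
struct read_cost {
	unsigned long wakeups;      /* times the reader was woken */
	unsigned long buffers_read; /* ring buffers visited in total */
};

/* One pass of the read loop as described in the thread: handle the
 * VIP buffer first, then walk every regular buffer. */
static void read_all_once(struct read_cost *cost, size_t nr_regular_bufs)
{
	cost->wakeups++;
	cost->buffers_read += 1;               /* the VIP buffer */
	cost->buffers_read += nr_regular_bufs; /* all of the other maps */
}

/* Simulate nr_bpf_events load/unload events, assuming (for this
 * sketch only) that each one causes exactly one wakeup. */
static struct read_cost simulate(size_t nr_bpf_events, size_t nr_regular_bufs)
{
	struct read_cost cost = { 0, 0 };

	for (size_t i = 0; i < nr_bpf_events; i++)
		read_all_once(&cost, nr_regular_bufs);
	return cost;
}
```

Under these assumptions the work grows as the number of BPF events times the number of buffers plus one, so whether the cost matters depends on how rare load/unload events are on the monitored system, which is the point under dispute in this thread.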

* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: Alexei Starovoitov @ 2018-11-06 23:29 UTC (permalink / raw)
  To: Song Liu, David Ahern
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Kernel Team,
	ast@kernel.org, daniel@iogearbox.net, peterz@infradead.org,
	acme@kernel.org
In-Reply-To: <6C5A9FBD-F50D-444C-9038-E9557EC850D2@fb.com>

On 11/6/18 3:17 PM, Song Liu wrote:
> 
> 
>> On Nov 6, 2018, at 1:54 PM, David Ahern <dsahern@gmail.com> wrote:
>>
>> On 11/6/18 1:52 PM, Song Liu wrote:
>>> +
>>> static int record__mmap_read_all(struct record *rec)
>>> {
>>> 	int err;
>>>
>>> +	err = record__mmap_process_vip_events(rec);
>>> +	if (err)
>>> +		return err;
>>> +
>>> 	err = record__mmap_read_evlist(rec, rec->evlist, false);
>>> 	if (err)
>>> 		return err;
>>
>> Seems to me that is going to increase the overhead of perf on any system
>> doing BPF updates. The BPF events cause a wakeup every load and unload,
>> and perf processes not only the VIP events but then walks all of the
>> other maps.
> 
> BPF prog load/unload events should be rare events in real world use cases.
> So I think the overhead is OK. Also, I don't see an easy way to improve
> this.
> 
>>
>>> @@ -1686,6 +1734,8 @@ static struct option __record_options[] = {
>>> 			  "signal"),
>>> 	OPT_BOOLEAN(0, "dry-run", &dry_run,
>>> 		    "Parse options then exit"),
>>> +	OPT_BOOLEAN(0, "no-bpf-event", &record.no_bpf_event,
>>> +		    "do not record event on bpf program load/unload"),
>>
>> Why should this default to on? If I am recording FIB events, I don't care
>> about BPF events.
>>
> 
> I am OK with default off if that's the preferred way.

I think concerns with perf overhead from collecting bpf events
are unfounded.
I would prefer for this flag to be on by default.

^ permalink raw reply

* Re: [RFC perf,bpf 5/5] perf util: generate bpf_prog_info_event for short living bpf programs
From: David Miller @ 2018-11-06 23:36 UTC (permalink / raw)
  To: ast
  Cc: songliubraving, dsahern, netdev, linux-kernel, Kernel-team, ast,
	daniel, peterz, acme
In-Reply-To: <27fc8327-3390-ba5a-6063-89c9e7165e7b@fb.com>

From: Alexei Starovoitov <ast@fb.com>
Date: Tue, 6 Nov 2018 23:29:07 +0000

> I think concerns with perf overhead from collecting bpf events
> are unfounded.
> I would prefer for this flag to be on by default.

I will sit in userspace looping over bpf load/unload and see how the
person trying to monitor something else with perf feels about that.

Really, it is inappropriate to turn this on by default, I completely
agree with David Ahern.

It's hard enough, _AS IS_, for me to fight back all of the bloat that
is in perf right now and get it back to being able to handle simple
full workloads without dropping events.

Every new event type like this sets us back.

If people want to monitor new things, or have new functionality, fine.

But not by default, please.

^ permalink raw reply
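
The disagreement in this thread comes down to the polarity of one boolean option. The sketch below contrasts the two policies under stated assumptions: `--no-bpf-event` is the option from the quoted patch, while `--bpf-event`, `enum flag_policy` and `bpf_events_enabled()` are hypothetical names used only for illustration. It does not use perf's option parser.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical illustration, not perf code. */
enum flag_policy {
	OPT_OUT, /* recording on unless --no-bpf-event is given (as in the patch) */
	OPT_IN   /* recording off unless a hypothetical --bpf-event is given */
};

/* Returns whether BPF load/unload events would be recorded for the
 * given command line under the chosen policy. */
static bool bpf_events_enabled(enum flag_policy policy, int argc,
			       const char *const *argv)
{
	bool enabled = (policy == OPT_OUT); /* default when no flag is given */

	for (int i = 0; i < argc; i++) {
		if (policy == OPT_OUT && strcmp(argv[i], "--no-bpf-event") == 0)
			enabled = false;
		else if (policy == OPT_IN && strcmp(argv[i], "--bpf-event") == 0)
			enabled = true;
	}
	return enabled;
}
```

With no flag on the command line, `OPT_OUT` records BPF events and `OPT_IN` does not. The two policies offer the same functionality and differ only in who pays the cost without asking for it.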
