Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Andy Lutomirski @ 2019-08-27 23:34 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Andy Lutomirski, Alexei Starovoitov, Kees Cook, LSM List,
	James Morris, Jann Horn, Peter Zijlstra, Masami Hiramatsu,
	David S. Miller, Daniel Borkmann, Network Development, bpf,
	kernel-team, Linux API
In-Reply-To: <20190827192144.3b38b25a@gandalf.local.home>

On Tue, Aug 27, 2019 at 4:21 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 27 Aug 2019 16:01:08 -0700
> Andy Lutomirski <luto@kernel.org> wrote:
>
> > [adding some security and tracing folks to cc]
> >
> > On Tue, Aug 27, 2019 at 1:52 PM Alexei Starovoitov <ast@kernel.org> wrote:
> > >
> > > Introduce CAP_BPF that allows loading all types of BPF programs,
> > > create most map types, load BTF, iterate programs and maps.
> > > CAP_BPF alone is not enough to attach or run programs.
> > >
> > > Networking:
> > >
> > > CAP_BPF and CAP_NET_ADMIN are necessary to:
> > > - attach to cgroup-bpf hooks like INET_INGRESS, INET_SOCK_CREATE, INET4_CONNECT
> > > - run networking bpf programs (like xdp, skb, flow_dissector)
> > >
> > > Tracing:
> > >
> > > CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1)
> > > are necessary to:
> > > - attach bpf program to raw tracepoint
> > > - use bpf_trace_printk() in all program types (not only tracing programs)
> > > - create bpf stackmap
> > >
> > > To attach bpf to perf_events perf_event_open() needs to succeed as usual.
> > >
> > > CAP_BPF controls BPF side.
> > > CAP_NET_ADMIN controls intersection where BPF calls into networking.
> > > perf_paranoid_tracepoint_raw controls intersection where BPF calls into tracing.
> > >
> > > In the future CAP_TRACING could be introduced to control
> > > creation of kprobe/uprobe and attaching bpf to perf_events.
> > > In such case bpf_probe_read() thin wrapper would be controlled by CAP_BPF.
> > > Whereas probe_read() would be controlled by CAP_TRACING.
> > > CAP_TRACING would also control generic kprobe+probe_read.
> > > CAP_BPF and CAP_TRACING would be necessary for tracing bpf programs
> > > that want to use bpf_probe_read.
>
> No mention of the tracefs (/sys/kernel/tracing) file?

See below.  Also, I am embarrassed to admit that I just assumed that
/sys/kernel/debug/tracing was just like any other debugfs directory.

>
>
> >
> > Changing the capability that some existing operation requires could
> > break existing programs.  The old capability may need to be accepted
> > as well.
> >
> > I'm inclined to suggest that CAP_TRACING be figured out or rejected
> > before something like this gets applied.
> >
> >
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > > I would prefer to introduce CAP_TRACING soon, since it
> > > will make tracing and networking permission model symmetrical.
> > >
> >
> > Here's my proposal for CAP_TRACING, documentation-style:
> >
> > --- begin ---
> >
> > CAP_TRACING enables a task to use various kernel features to trace
> > running user programs and the kernel itself.  CAP_TRACING also enables
> > a task to bypass some speculation attack countermeasures.  A task in
> > the init user namespace with CAP_TRACING will be able to tell exactly
> > what kernel code is executed and when, and will be able to read kernel
> > registers and kernel memory.  It will, similarly, be able to read the
> > state of other user tasks.
> >
> > Specifically, CAP_TRACING allows the following operations.  It may
> > allow more operations in the future:
> >
> >  - Full use of perf_event_open(), similarly to the effect of
> > kernel.perf_event_paranoid == -1.
> >
> >  - Loading and attaching tracing BPF programs, including use of BPF
> > raw tracepoints.
> >
> >  - Use of BPF stack maps.
> >
> >  - Use of bpf_probe_read() and bpf_trace_printk().
> >
> >  - Use of unsafe pointer-to-integer conversions in BPF.
> >
> >  - Bypassing of BPF's speculation attack hardening measures and
> > constant blinding.  (Note: other mechanisms might also allow this.)
> >
> > CAP_TRACING does not override normal permissions on sysfs or debugfs.
> > This means that, unless a new interface for programming kprobes and
> > such is added, it does not directly allow use of kprobes.
>
> kprobes can be created in the tracefs filesystem (which is separate from
> debugfs, tracefs just gets automatically mounted
> in /sys/kernel/debug/tracing when debugfs is mounted) from the
> kprobe_events file. /sys/kernel/tracing is just the tracefs
> directory without debugfs, and was created specifically to allow
> tracing to be access without opening up the can of worms in debugfs.

I think that, in principle, CAP_TRACING should allow this, but I'm not
sure how to achieve that.  I suppose we could set up
inode_operations.permission on tracefs, but what exactly would it do?
Would it be just like generic_permission() except that it would look
at CAP_TRACING instead of CAP_DAC_OVERRIDE?  That is, you can use
tracefs if you have CAP_TRACING *or* acl access?  Or would it be:

int tracing_permission(struct inode *inode, int mask)
{
  if (!capable(CAP_TRACING))
    return -EPERM;

  return generic_permission(inode, mask);
}

Which would mean that you need ACL *and* CAP_TRACING, so
administrators would change the mode to 777.  That's a bit scary.

And this still doesn't let people even *find* tracefs, since it's
hidden in debugfs.

So maybe make CAP_TRACING override ACLs but also add /sys/fs/tracing
and mount tracefs there, too, so that regular users can at least find
the mountpoint.

>
> Should we allow CAP_TRACING access to /proc/kallsyms? as it is helpful
> to convert perf and trace-cmd's function pointers into names. Once you
> allow tracing of the kernel, hiding /proc/kallsyms is pretty useless.

I think we should.

^ permalink raw reply

* Re: [PATCH v5 net-next 04/18] ionic: Add basic lif support
From: Shannon Nelson @ 2019-08-27 23:29 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190826214238.07a0eee9@cakuba.netronome.com>

On 8/26/19 9:42 PM, Jakub Kicinski wrote:
> On Mon, 26 Aug 2019 14:33:25 -0700, Shannon Nelson wrote:
>> +static inline bool ionic_is_pf(struct ionic *ionic)
>> +{
>> +	return ionic->pdev &&
>> +	       ionic->pdev->device == PCI_DEVICE_ID_PENSANDO_IONIC_ETH_PF;
>> +}
>> +
>> +static inline bool ionic_is_vf(struct ionic *ionic)
>> +{
>> +	return ionic->pdev &&
>> +	       ionic->pdev->device == PCI_DEVICE_ID_PENSANDO_IONIC_ETH_VF;
>> +}
>> +
>> +static inline bool ionic_is_25g(struct ionic *ionic)
>> +{
>> +	return ionic_is_pf(ionic) &&
>> +	       ionic->pdev->subsystem_device == IONIC_SUBDEV_ID_NAPLES_25;
>> +}
>> +
>> +static inline bool ionic_is_100g(struct ionic *ionic)
>> +{
>> +	return ionic_is_pf(ionic) &&
>> +	       (ionic->pdev->subsystem_device == IONIC_SUBDEV_ID_NAPLES_100_4 ||
>> +		ionic->pdev->subsystem_device == IONIC_SUBDEV_ID_NAPLES_100_8);
>> +}
> Again, a bunch of unused stuff.

More "near future" support code that didn't get stripped.  I'll pull it 
out for now.

sln


^ permalink raw reply

* Re: [PATCH net-next 1/4] r8169: prepare for adding RTL8125 support
From: Andrew Lunn @ 2019-08-27 23:27 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Realtek linux nic maintainers, David Miller,
	netdev@vger.kernel.org, Chun-Hao Lin
In-Reply-To: <66ac2b09-ea87-a4ba-f6f3-1885e9587298@gmail.com>

On Tue, Aug 27, 2019 at 08:41:00PM +0200, Heiner Kallweit wrote:
> This patch prepares the driver for adding RTL8125 support:
> - change type of interrupt mask to u32
> - restrict rtl_is_8168evl_up to RTL8168 chip versions
> - factor out reading MAC address from registers
> - re-add function rtl_get_events
> - move disabling interrupt coalescing to RTL8169/RTL8168 init
> - read different register for PCI commit
> - don't use bit LastFrag in tx descriptor after send, RTL8125 clears it

Hi Heiner

That is a lot of changes in one patch. Although there is no planned
functional change, r8169 has a habit of breaking. Having lots of small
changes would help tracking down which change caused a breakage, via a
git bisect.

So you might want to consider splitting this up into a number of small
patches.

	Andrew

^ permalink raw reply

* Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Steven Rostedt @ 2019-08-27 23:21 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Alexei Starovoitov, Kees Cook, LSM List, James Morris, Jann Horn,
	Peter Zijlstra, Masami Hiramatsu, David S. Miller,
	Daniel Borkmann, Network Development, bpf, kernel-team, Linux API
In-Reply-To: <CALCETrV8iJv9+Ai11_1_r6MapPhhwt9hjxi=6EoixytabTScqg@mail.gmail.com>

On Tue, 27 Aug 2019 16:01:08 -0700
Andy Lutomirski <luto@kernel.org> wrote:

> [adding some security and tracing folks to cc]
> 
> On Tue, Aug 27, 2019 at 1:52 PM Alexei Starovoitov <ast@kernel.org> wrote:
> >
> > Introduce CAP_BPF that allows loading all types of BPF programs,
> > create most map types, load BTF, iterate programs and maps.
> > CAP_BPF alone is not enough to attach or run programs.
> >
> > Networking:
> >
> > CAP_BPF and CAP_NET_ADMIN are necessary to:
> > - attach to cgroup-bpf hooks like INET_INGRESS, INET_SOCK_CREATE, INET4_CONNECT
> > - run networking bpf programs (like xdp, skb, flow_dissector)
> >
> > Tracing:
> >
> > CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1)
> > are necessary to:
> > - attach bpf program to raw tracepoint
> > - use bpf_trace_printk() in all program types (not only tracing programs)
> > - create bpf stackmap
> >
> > To attach bpf to perf_events perf_event_open() needs to succeed as usual.
> >
> > CAP_BPF controls BPF side.
> > CAP_NET_ADMIN controls intersection where BPF calls into networking.
> > perf_paranoid_tracepoint_raw controls intersection where BPF calls into tracing.
> >
> > In the future CAP_TRACING could be introduced to control
> > creation of kprobe/uprobe and attaching bpf to perf_events.
> > In such case bpf_probe_read() thin wrapper would be controlled by CAP_BPF.
> > Whereas probe_read() would be controlled by CAP_TRACING.
> > CAP_TRACING would also control generic kprobe+probe_read.
> > CAP_BPF and CAP_TRACING would be necessary for tracing bpf programs
> > that want to use bpf_probe_read.

No mention of the tracefs (/sys/kernel/tracing) file?
  
> 
> First, some high-level review:
> 
> Can you write up some clear documentation aimed at administrators that
> says what CAP_BPF does?  For example, is it expected that CAP_BPF by
> itself permits reading all kernel memory?  Why might one grant it?
> 
> Can you give at least one fully described use case where CAP_BPF
> solves a real-world problem that is not solved by existing mechanisms?

At least for CAP_TRACING (if it were to allow read/write access
to /sys/kernel/tracing), that would be very useful. It would be useful
to those that basically own their machines, and want to trace their
applications all the way into the kernel without having to run as full
root.


> 
> Changing the capability that some existing operation requires could
> break existing programs.  The old capability may need to be accepted
> as well.
> 
> I'm inclined to suggest that CAP_TRACING be figured out or rejected
> before something like this gets applied.
> 
> 
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> > I would prefer to introduce CAP_TRACING soon, since it
> > will make tracing and networking permission model symmetrical.
> >  
> 
> Here's my proposal for CAP_TRACING, documentation-style:
> 
> --- begin ---
> 
> CAP_TRACING enables a task to use various kernel features to trace
> running user programs and the kernel itself.  CAP_TRACING also enables
> a task to bypass some speculation attack countermeasures.  A task in
> the init user namespace with CAP_TRACING will be able to tell exactly
> what kernel code is executed and when, and will be able to read kernel
> registers and kernel memory.  It will, similarly, be able to read the
> state of other user tasks.
> 
> Specifically, CAP_TRACING allows the following operations.  It may
> allow more operations in the future:
> 
>  - Full use of perf_event_open(), similarly to the effect of
> kernel.perf_event_paranoid == -1.
> 
>  - Loading and attaching tracing BPF programs, including use of BPF
> raw tracepoints.
> 
>  - Use of BPF stack maps.
> 
>  - Use of bpf_probe_read() and bpf_trace_printk().
> 
>  - Use of unsafe pointer-to-integer conversions in BPF.
> 
>  - Bypassing of BPF's speculation attack hardening measures and
> constant blinding.  (Note: other mechanisms might also allow this.)
> 
> CAP_TRACING does not override normal permissions on sysfs or debugfs.
> This means that, unless a new interface for programming kprobes and
> such is added, it does not directly allow use of kprobes.

kprobes can be created in the tracefs filesystem (which is separate from
debugfs, tracefs just gets automatically mounted
in /sys/kernel/debug/tracing when debugfs is mounted) from the
kprobe_events file. /sys/kernel/tracing is just the tracefs
directory without debugfs, and was created specifically to allow
tracing to be access without opening up the can of worms in debugfs.

Should we allow CAP_TRACING access to /proc/kallsyms? as it is helpful
to convert perf and trace-cmd's function pointers into names. Once you
allow tracing of the kernel, hiding /proc/kallsyms is pretty useless.

-- Steve

> 
> If CAP_TRACING, by itself, enables a task to crash or otherwise
> corrupt the kernel or other tasks, this will be considered a kernel
> bug.
> 
> CAP_TRACING in a non-init user namespace may, in the future, allow
> tracing of other tasks in that user namespace or its descendants.  It
> will not enable kernel tracing or tracing of tasks outside the user
> namespace in question.
> 
> --- end ---
> 

^ permalink raw reply

* Re: [PATCH net-next v2 3/3] dpaa2-eth: Add pause frame support
From: Andrew Lunn @ 2019-08-27 23:21 UTC (permalink / raw)
  To: Ioana Radulescu; +Cc: netdev, davem, ioana.ciornei
In-Reply-To: <1566915351-32075-3-git-send-email-ruxandra.radulescu@nxp.com>

On Tue, Aug 27, 2019 at 05:15:51PM +0300, Ioana Radulescu wrote:
> Starting with firmware version MC10.18.0, we have support for
> L2 flow control. Asymmetrical configuration (Rx or Tx only) is
> supported, but not pause frame autonegotioation.

> +static int set_pause(struct dpaa2_eth_priv *priv)
> +{
> +	struct device *dev = priv->net_dev->dev.parent;
> +	struct dpni_link_cfg link_cfg = {0};
> +	int err;
> +
> +	/* Get the default link options so we don't override other flags */
> +	err = dpni_get_link_cfg(priv->mc_io, 0, priv->mc_token, &link_cfg);
> +	if (err) {
> +		dev_err(dev, "dpni_get_link_cfg() failed\n");
> +		return err;
> +	}
> +
> +	link_cfg.options |= DPNI_LINK_OPT_PAUSE;
> +	link_cfg.options &= ~DPNI_LINK_OPT_ASYM_PAUSE;
> +	err = dpni_set_link_cfg(priv->mc_io, 0, priv->mc_token, &link_cfg);
> +	if (err) {
> +		dev_err(dev, "dpni_set_link_cfg() failed\n");
> +		return err;
> +	}
> +
> +	priv->link_state.options = link_cfg.options;
> +
> +	return 0;
> +}
> +
>  /* Configure the DPNI object this interface is associated with */
>  static int setup_dpni(struct fsl_mc_device *ls_dev)
>  {
> @@ -2500,6 +2562,13 @@ static int setup_dpni(struct fsl_mc_device *ls_dev)
>  
>  	set_enqueue_mode(priv);
>  
> +	/* Enable pause frame support */
> +	if (dpaa2_eth_has_pause_support(priv)) {
> +		err = set_pause(priv);
> +		if (err)
> +			goto close;

Hi Ioana

So by default you have the MAC do pause, not asym pause?  Generally,
any MAC that can do asym pause does asym pause.

But if this is what you want, it is not wrong.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH v5 net-next 02/18] ionic: Add hardware init and device commands
From: Shannon Nelson @ 2019-08-27 23:17 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: netdev, davem
In-Reply-To: <20190827195017.GR2168@lunn.ch>

On 8/27/19 12:50 PM, Andrew Lunn wrote:
> On Tue, Aug 27, 2019 at 10:39:20AM -0700, Shannon Nelson wrote:
>> On 8/26/19 7:26 PM, Andrew Lunn wrote:
>>> On Mon, Aug 26, 2019 at 02:33:23PM -0700, Shannon Nelson wrote:
>>>> +void ionic_debugfs_add_dev(struct ionic *ionic)
>>>> +{
>>>> +	struct dentry *dentry;
>>>> +
>>>> +	dentry = debugfs_create_dir(ionic_bus_info(ionic), ionic_dir);
>>>> +	if (IS_ERR_OR_NULL(dentry))
>>>> +		return;
>>>> +
>>>> +	ionic->dentry = dentry;
>>>> +}
>>> Hi Shannon
>>>
>>> There was recently a big patchset from GregKH which removed all error
>>> checking from drivers calling debugfs calls. I'm pretty sure you don't
>>> need this check here.
>> With this check I end up either with a valid dentry value or NULL in
>> ionic->dentry.  Without this check I possibly get an error value in
>> ionic->dentry, which can get used later as the parent dentry to try to make
>> a new debugfs node.
> Hi Shannon
>
> What you should find is that every debugfs function will have
> something like:
>
> 	if (IS_ERR(dentry))
> 	   return dentry;
> or
> 	if (IS_ERR(parent))
> 	   return parent;
>
> If you know of a API which is missing such protection, i'm sure GregKH
> would like to know. Especially since he just ripped all such
> protection in driver out. Meaning he just broken some drivers if such
> IS_ERR() calls are missing in the debugfs core.
>
Ah, here's the confusion: there's a start_creating() in the tracefs as 
well as in the debugfs, and the tracefs code doesn't have all of those 
checks.

sln



^ permalink raw reply

* Re: [PATCH net-next v2 2/3] dpaa2-eth: Use stored link settings
From: Andrew Lunn @ 2019-08-27 23:12 UTC (permalink / raw)
  To: Ioana Radulescu; +Cc: netdev, davem, ioana.ciornei
In-Reply-To: <1566915351-32075-2-git-send-email-ruxandra.radulescu@nxp.com>

On Tue, Aug 27, 2019 at 05:15:50PM +0300, Ioana Radulescu wrote:
> Whenever a link state change occurs, we get notified and save
> the new link settings in the device's private data. In ethtool
> get_link_ksettings, use the stored state instead of interrogating
> the firmware each time.
> 
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 1/3] dpaa2-eth: Remove support for changing link settings
From: Andrew Lunn @ 2019-08-27 23:11 UTC (permalink / raw)
  To: Ioana Radulescu; +Cc: netdev, davem, ioana.ciornei
In-Reply-To: <1566915351-32075-1-git-send-email-ruxandra.radulescu@nxp.com>

On Tue, Aug 27, 2019 at 05:15:49PM +0300, Ioana Radulescu wrote:
> We only support fixed-link for now, so there is no point in
> offering users the option to change link settings via ethtool.
> 
> Functionally there is no change, since firmware prevents us from
> changing link parameters anyway.
> 
> Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH v2 net] Add genphy_c45_config_aneg() function to phy-c45.c
From: Andrew Lunn @ 2019-08-27 23:10 UTC (permalink / raw)
  To: David Miller
  Cc: marco.hartmann, f.fainelli, hkallweit1, netdev, linux-kernel,
	christian.herber
In-Reply-To: <20190827.150103.723109968950216148.davem@davemloft.net>

On Tue, Aug 27, 2019 at 03:01:03PM -0700, David Miller wrote:
> From: Marco Hartmann <marco.hartmann@nxp.com>
> Date: Wed, 21 Aug 2019 11:00:46 +0000
> 
> > Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
> > genphy_config_aneg") introduced a check that aborts phy_config_aneg()
> > if the phy is a C45 phy.
> > This causes phy_state_machine() to call phy_error() so that the phy
> > ends up in PHY_HALTED state.
> > 
> > Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
> > (analogous to the C22 case) so that the state machine can run
> > correctly.
> > 
> > genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
> > in drivers/net/phy/marvell10g.c, excluding vendor specific
> > configurations for 1000BaseT.
> > 
> > Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
> > 
> > Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com>
> 
> Andrew, gentle ping to respond to Heiner who said:

It at least makes it consistent with phy_restart_aneg() and
phy_aneg_done().

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* [PATCH net-next] net/ncsi: add response handlers for PLDM over NC-SI
From: Ben Wei @ 2019-08-27 23:03 UTC (permalink / raw)
  To: David Miller, sam@mendozajonas.com, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, openbmc@lists.ozlabs.org
  Cc: Ben Wei

This patch adds handlers for PLDM over NC-SI command response.

This enables NC-SI driver recognizes the packet type so the responses don't get dropped as unknown packet type.

PLDM over NC-SI are not handled in kernel driver for now, but can be passed back to user space via Netlink for further handling.

Signed-off-by: Ben Wei <benwei@fb.com>
---
 net/ncsi/ncsi-pkt.h |  5 +++++
 net/ncsi/ncsi-rsp.c | 11 +++++++++--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/net/ncsi/ncsi-pkt.h b/net/ncsi/ncsi-pkt.h index a8e9def593f2..80938b338fee 100644
--- a/net/ncsi/ncsi-pkt.h
+++ b/net/ncsi/ncsi-pkt.h
@@ -387,6 +387,9 @@ struct ncsi_aen_hncdsc_pkt {
 #define NCSI_PKT_CMD_OEM	0x50 /* OEM                              */
 #define NCSI_PKT_CMD_PLDM	0x51 /* PLDM request over NCSI over RBT  */
 #define NCSI_PKT_CMD_GPUUID	0x52 /* Get package UUID                 */
+#define NCSI_PKT_CMD_QPNPR	0x56 /* Query Pending NC PLDM request */
+#define NCSI_PKT_CMD_SNPR	0x57 /* Send NC PLDM Reply  */
+
 
 /* NCSI packet responses */
 #define NCSI_PKT_RSP_CIS	(NCSI_PKT_CMD_CIS    + 0x80)
@@ -419,6 +422,8 @@ struct ncsi_aen_hncdsc_pkt {
 #define NCSI_PKT_RSP_OEM	(NCSI_PKT_CMD_OEM    + 0x80)
 #define NCSI_PKT_RSP_PLDM	(NCSI_PKT_CMD_PLDM   + 0x80)
 #define NCSI_PKT_RSP_GPUUID	(NCSI_PKT_CMD_GPUUID + 0x80)
+#define NCSI_PKT_RSP_QPNPR	(NCSI_PKT_CMD_QPNPR   + 0x80)
+#define NCSI_PKT_RSP_SNPR	(NCSI_PKT_CMD_SNPR   + 0x80)
 
 /* NCSI response code/reason */
 #define NCSI_PKT_RSP_C_COMPLETED	0x0000 /* Command Completed        */
diff --git a/net/ncsi/ncsi-rsp.c b/net/ncsi/ncsi-rsp.c index 5254004f2b42..524974af0db6 100644
--- a/net/ncsi/ncsi-rsp.c
+++ b/net/ncsi/ncsi-rsp.c
@@ -1035,6 +1035,11 @@ static int ncsi_rsp_handler_gpuuid(struct ncsi_request *nr)
 	return 0;
 }
 
+static int ncsi_rsp_handler_pldm(struct ncsi_request *nr) {
+	return 0;
+}
+
 static int ncsi_rsp_handler_netlink(struct ncsi_request *nr)  {
 	struct ncsi_dev_priv *ndp = nr->ndp;
@@ -1088,8 +1093,10 @@ static struct ncsi_rsp_handler {
 	{ NCSI_PKT_RSP_GNPTS,  48, ncsi_rsp_handler_gnpts   },
 	{ NCSI_PKT_RSP_GPS,     8, ncsi_rsp_handler_gps     },
 	{ NCSI_PKT_RSP_OEM,    -1, ncsi_rsp_handler_oem     },
-	{ NCSI_PKT_RSP_PLDM,    0, NULL                     },
-	{ NCSI_PKT_RSP_GPUUID, 20, ncsi_rsp_handler_gpuuid  }
+	{ NCSI_PKT_RSP_PLDM,   -1, ncsi_rsp_handler_pldm    },
+	{ NCSI_PKT_RSP_GPUUID, 20, ncsi_rsp_handler_gpuuid  },
+	{ NCSI_PKT_RSP_QPNPR,  -1, ncsi_rsp_handler_pldm    },
+	{ NCSI_PKT_RSP_SNPR,   -1, ncsi_rsp_handler_pldm    }
 };
 
 int ncsi_rcv_rsp(struct sk_buff *skb, struct net_device *dev,
--
2.17.1


^ permalink raw reply

* Re: [PATCH bpf-next] bpf, capabilities: introduce CAP_BPF
From: Andy Lutomirski @ 2019-08-27 23:01 UTC (permalink / raw)
  To: Alexei Starovoitov, Kees Cook, LSM List, James Morris, Jann Horn,
	Peter Zijlstra, Masami Hiramatsu, Steven Rostedt
  Cc: David S. Miller, Daniel Borkmann, Network Development, bpf,
	kernel-team, Linux API
In-Reply-To: <20190827205213.456318-1-ast@kernel.org>

[adding some security and tracing folks to cc]

On Tue, Aug 27, 2019 at 1:52 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> Introduce CAP_BPF that allows loading all types of BPF programs,
> create most map types, load BTF, iterate programs and maps.
> CAP_BPF alone is not enough to attach or run programs.
>
> Networking:
>
> CAP_BPF and CAP_NET_ADMIN are necessary to:
> - attach to cgroup-bpf hooks like INET_INGRESS, INET_SOCK_CREATE, INET4_CONNECT
> - run networking bpf programs (like xdp, skb, flow_dissector)
>
> Tracing:
>
> CAP_BPF and perf_paranoid_tracepoint_raw() (which is kernel.perf_event_paranoid == -1)
> are necessary to:
> - attach bpf program to raw tracepoint
> - use bpf_trace_printk() in all program types (not only tracing programs)
> - create bpf stackmap
>
> To attach bpf to perf_events perf_event_open() needs to succeed as usual.
>
> CAP_BPF controls BPF side.
> CAP_NET_ADMIN controls intersection where BPF calls into networking.
> perf_paranoid_tracepoint_raw controls intersection where BPF calls into tracing.
>
> In the future CAP_TRACING could be introduced to control
> creation of kprobe/uprobe and attaching bpf to perf_events.
> In such case bpf_probe_read() thin wrapper would be controlled by CAP_BPF.
> Whereas probe_read() would be controlled by CAP_TRACING.
> CAP_TRACING would also control generic kprobe+probe_read.
> CAP_BPF and CAP_TRACING would be necessary for tracing bpf programs
> that want to use bpf_probe_read.

First, some high-level review:

Can you write up some clear documentation aimed at administrators that
says what CAP_BPF does?  For example, is it expected that CAP_BPF by
itself permits reading all kernel memory?  Why might one grant it?

Can you give at least one fully described use case where CAP_BPF
solves a real-world problem that is not solved by existing mechanisms?

Changing the capability that some existing operation requires could
break existing programs.  The old capability may need to be accepted
as well.

I'm inclined to suggest that CAP_TRACING be figured out or rejected
before something like this gets applied.

>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
> I would prefer to introduce CAP_TRACING soon, since it
> will make tracing and networking permission model symmetrical.
>

Here's my proposal for CAP_TRACING, documentation-style:

--- begin ---

CAP_TRACING enables a task to use various kernel features to trace
running user programs and the kernel itself.  CAP_TRACING also enables
a task to bypass some speculation attack countermeasures.  A task in
the init user namespace with CAP_TRACING will be able to tell exactly
what kernel code is executed and when, and will be able to read kernel
registers and kernel memory.  It will, similarly, be able to read the
state of other user tasks.

Specifically, CAP_TRACING allows the following operations.  It may
allow more operations in the future:

 - Full use of perf_event_open(), similarly to the effect of
kernel.perf_event_paranoid == -1.

 - Loading and attaching tracing BPF programs, including use of BPF
raw tracepoints.

 - Use of BPF stack maps.

 - Use of bpf_probe_read() and bpf_trace_printk().

 - Use of unsafe pointer-to-integer conversions in BPF.

 - Bypassing of BPF's speculation attack hardening measures and
constant blinding.  (Note: other mechanisms might also allow this.)

CAP_TRACING does not override normal permissions on sysfs or debugfs.
This means that, unless a new interface for programming kprobes and
such is added, it does not directly allow use of kprobes.

If CAP_TRACING, by itself, enables a task to crash or otherwise
corrupt the kernel or other tasks, this will be considered a kernel
bug.

CAP_TRACING in a non-init user namespace may, in the future, allow
tracing of other tasks in that user namespace or its descendants.  It
will not enable kernel tracing or tracing of tasks outside the user
namespace in question.

--- end ---

Does this sound good?  The idea here is that CAP_TRACING should be
very useful even without CAP_BPF, which allows CAP_BPF to be less
powerful.

> +bool cap_bpf_tracing(void)
> +{
> +       return capable(CAP_SYS_ADMIN) ||
> +              (capable(CAP_BPF) && !perf_paranoid_tracepoint_raw());
> +}

If auditing is on, this will audit the wrong thing.  James, I think a
helper like:

bool ns_either_cap(struct user_ns *ns, int preferred_cap, int other_cap);

would help.  ns_either_cap returns true if either cap is held (i.e.
effective, as usual).  On success, it audits preferred_cap if held and
other_cap otherwise.  On failure, it audits preferred_cap.  Does this
sound right?

Also, for reference, perf_paranoid_tracepoint_raw() is this:

static inline bool perf_paranoid_tracepoint_raw(void)
{
        return sysctl_perf_event_paranoid > -1;
}

so the overall effect of cap_bpf_tracing() is rather odd, and it seems
to control a few things that don't obvious all have similar security
effects.

> @@ -2080,7 +2083,10 @@ static int bpf_prog_test_run(const union bpf_attr *attr,
>         struct bpf_prog *prog;
>         int ret = -ENOTSUPP;
>
> -       if (!capable(CAP_SYS_ADMIN))
> +       if (!capable(CAP_NET_ADMIN) || !capable(CAP_BPF))
> +               /* test_run callback is available for networking progs only.
> +                * Add cap_bpf_tracing() above when tracing progs become runable.
> +                */

I think test_run should probably be CAP_SYS_ADMIN forever.  test_run
is the only way that one can run a bpf program and call helper
functions via the program if one doesn't have permission to attach the
program.  Also, if there's a way to run a speculation attack via a bpf
program, test_run will make it much easier to do in a controlled
environment.  Finally, when debugging bpf programs, developers can use
their own computers or a VM.

^ permalink raw reply

* Re: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m: Invalid argument)
From: Jakub Jankowski @ 2019-08-27 23:01 UTC (permalink / raw)
  To: Fujinaka, Todd, e1000-devel@lists.sourceforge.net
  Cc: netdev@vger.kernel.org, mhemsley@open-systems.com
In-Reply-To: <9B4A1B1917080E46B64F07F2989DADD69B013DB0@ORSMSX115.amr.corp.intel.com>

Hi,

On 8/27/19 7:56 PM, Fujinaka, Todd wrote:
> The hints should be:
> # ethtool -m eth10
> Cannot get module EEPROM information: Invalid argument # dmesg | tail -n 1 [  445.971974] i40e 0000:3d:00.3 eth10: Module EEPROM memory read not supported. Please update the NVM image.
>
> # ethtool -i eth10
> driver: i40e
> version: 2.9.21
> firmware-version: 3.31 0x80000d31 1.1767.0
>
> And the working case:
> # ethtool -i eth8
> driver: i40e
> version: 2.9.21
> firmware-version: 6.01 0x800035cf 1.1876.0
>
> If you don't see it, 6.01 > 3.31.
The reason why firmware between the two is (that much) different is 
because the non-working case is from X722 NIC, while the working one is 
from X710.

> The NVM update tool should be available on downloadcenter.intel.com

Thanks for the pointer to NVM updater. I'd like to offer some additional 
comments about my experience with the newest one (v4.00):

a) running ./nvmupdate64e (from X722_NVMUpdate_Linux_x64 subdir) errors 
out without really saying what's wrong:

   # ./nvmupdate64e

   Intel(R) Ethernet NVM Update Tool
   NVMUpdate version 1.30.2.11
   Copyright (C) 2013 - 2017 Intel Corporation.


   WARNING: To avoid damage to your device, do not stop the update or 
reboot or power off the system during this update.
   Inventory in progress. Please wait [+.........]
   Tool execution completed with the following status: The configuration 
file could not be opened/read, or a syntax error was discovered in the file
   Press any key to exit.

after enabling logging (-l out.log) a bit more is revealed:

   # tail -n 2 out.log
   Error:   Config file line 2: Not supported config file version.
   Error:   Missing CONFIG VERSION parameter in configuration file.

but that's not entirely true, CONFIG VERSION is set in the default 
configuration file:

   # head -n 2 nvmupdate.cfg
   CURRENT FAMILY: 1.0.0
   CONFIG VERSION: 1.14.0

so why isn't this understood?
Manually editing nvmupdate.cfg and setting CONFIG VERSION: 1.11.0 seems 
to make this particular problem go away.

b) Re-doing this with downgraded config version exposes another problem:

   Config file read.
   Error:   Can't open NVM map file [Immediate_offset_2.txt]

and indeed, there is no Immediate_offset_2.txt in 
NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_Linux_x64/
There is one, however, in 
NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_EFIx64/ subdir. 
Copying it over to the _Linux_x64 resolves this particular problem

c) Re-doing this with Immediate_offset_2.txt in place exposes third problem:

   Error:   Can't open NVM image file 
[LBG_B2_Wolf_Pass_WFT_X557_P01_PHY_Auto_Detect_P23_NCSI_v3.31_800016DB.bin]

and once again - same story. It exists in 
NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_EFIx64/ but not 
NVMUpdatePackage_WFT_WFQ&WF0_v4.00/X722_NVMUpdate_Linux_x64/ - had to 
copy it over.


Once I managed to get all these out of the way, the tool finally ran:

   Num Description                               Ver. DevId S:B Status
   === ======================================== ===== ===== ====== 
===============
   01) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:024 
Update not
available
   02) Intel(R) Ethernet Connection X722 for     3.49  37D2 00:061 Update
       10GBASE-T available
   03) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:175 
Update not
available


The initial starting point was:

0) firmware-version: 3.31 0x80000d31 1.1767.0

After first update+reboot, this was bumped to:

1) firmware-version: 3.1d 0x800016db 1.1767.0    (but ethtool -m ethX 
still doesn't work)

So I ran the tool the second time, it said 'Update available' again, but 
this time:

   Num Description                               Ver. DevId S:B Status
   === ======================================== ===== ===== ====== 
===============
   01) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:024 
Update not
available
   02) Intel(R) Ethernet Connection X722 for     3.29  37D2 00:061 Update
       10GBASE-T available
   03) Intel(R) Ethernet Server Adapter I350-T4  1.99  1521 00:175 
Update not
available

   Options: Adapter Index List (comma-separated), [A]ll, e[X]it
   Enter selection:02
   Would you like to back up the NVM images? [Y]es/[N]o: Y
   Update in progress. This operation may take several minutes.
   [*******+..]
   Tool execution completed with the following status: <---------- why 
is there no status printed?
   Press any key to exit.


Checking output log:

   # cat out3.log
   Intel(R) Ethernet NVM Update Tool
   NVMUpdate version 1.30.2.11
   Copyright (C) 2013 - 2017 Intel Corporation.

   ./nvmupdate64e -c nvmupdate.cfg -l out3.log

   Config file read.
   Inventory
   [00:061:00:00]: Intel(R) Ethernet Connection X722 for 10GBASE-T
       Flash inventory started
       Shadow RAM inventory started
   Alternate MAC address is not set
       Shadow RAM inventory finished
       Flash inventory finished
       OROM inventory started
       OROM inventory finished
       PHY NVM inventory started
       PHY NVM inventory finished
   [00:061:00:01]: Intel(R) Ethernet Connection X722 for 10GBASE-T
       Device already inventoried.
   [00:061:00:02]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
       Device already inventoried.
       PHY NVM inventory started
       PHY NVM inventory finished
   [00:061:00:03]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
       Device already inventoried.
   Update
   [00:061:00:00]: Intel(R) Ethernet Connection X722 for 10GBASE-T
       Creating backup images in directory: A4BF0164884A
       Backup images created.
       Flash update started
       NVM image verification started
       Shadow RAM image verification started

   Image differences found at offset 0x3AE [Device=0xF, Buffer=0x0] - 
update required.
   Error:   Flash update failed
   [00:061:00:02]: Intel(R) Ethernet Connection X722 for 10GbE SFP+
   #

However, ethtool -i suggests that firmware was updated to:

2) firmware-version: 4.00 0x80001577 1.1580.0    <------- so it did 
_something_ after all?

At this point, every subsequent attempt to run the NVM updater yields 
the same results: an update is available, but attempting to apply it 
fails with the same message in log.

And my initial issue still persists - ethtool -m <iface> still returns 
"invalid argument" with "Module EEPROM memory read not supported. Please 
update the NVM image" logged in dmesg.

How can I resolve this?

Cheers,
  Jakub.

>
> Todd Fujinaka
> Software Application Engineer
> Datacenter Engineering Group
> Intel Corporation
> todd.fujinaka@intel.com
>
>
> -----Original Message-----
> From: Jakub Jankowski [mailto:shasta@toxcorp.com]
> Sent: Tuesday, August 27, 2019 4:03 AM
> To: e1000-devel@lists.sourceforge.net
> Cc: netdev@vger.kernel.org; shasta@toxcorp.com; mhemsley@open-systems.com
> Subject: [E1000-devel] SFP+ EEPROM readouts fail on X722 (ethtool -m: Invalid argument)
>
> Hi,
>
> We can't get SFP+ EEPROM readouts for X722 to work at all:
>
> # ethtool -m eth10
> Cannot get module EEPROM information: Invalid argument # dmesg | tail -n 1 [  445.971974] i40e 0000:3d:00.3 eth10: Module EEPROM memory read not supported. Please update the NVM image.
> # lspci | grep 3d:00.3
> 3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
>
>
> We're running 4.19.65 kernel at the moment, testing using the newest out-of-tree Intel module
>
> # modinfo -F version i40e
> 2.9.21
>
> We also tried:
> - 4.19.65 with in-tree i40e (2.3.2-k)
> - stock Arch Linux (kernel 5.2.5, driver 2.8.20-k) and the results are the same, as shown above.
>
> # ethtool -i eth10
> driver: i40e
> version: 2.9.21
> firmware-version: 3.31 0x80000d31 1.1767.0
> expansion-rom-version:
> bus-info: 0000:3d:00.3
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> # dmidecode -s baseboard-manufacturer
> Intel Corporation
> # dmidecode -s baseboard-product-name
> S2600WFT
> # dmidecode -s baseboard-version
> H48104-853
>
> # lspci -vvv
> (...)
> 3d:00.3 Ethernet controller: Intel Corporation Ethernet Connection X722 for 10GbE SFP+ (rev 09)
> 	DeviceName: Intel PCH Integrated 10 Gigabit Ethernet Controller
> 	Subsystem: Intel Corporation Ethernet Connection X722 for 10GbE SFP+
> 	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 32 bytes
> 	Interrupt: pin A routed to IRQ 112
> 	NUMA node: 0
> 	Region 0: Memory at ab000000 (64-bit, prefetchable) [size=16M]
> 	Region 3: Memory at b0000000 (64-bit, prefetchable) [size=32K]
> 	Expansion ROM at <ignored> [disabled]
> 	Capabilities: [40] Power Management version 3
> 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=1 PME-
> 	Capabilities: [50] MSI: Enable- Count=1/1 Maskable+ 64bit+
> 		Address: 0000000000000000  Data: 0000
> 		Masking: 00000000  Pending: 00000000
> 	Capabilities: [70] MSI-X: Enable+ Count=129 Masked-
> 		Vector table: BAR=3 offset=00000000
> 		PBA: BAR=3 offset=00001000
> 	Capabilities: [a0] Express (v2) Endpoint, MSI 00
> 		DevCap:	MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
> 		DevCtl:	CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
> 			MaxPayload 256 bytes, MaxReadReq 512 bytes
> 		DevSta:	CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> 			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
> 		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- CommClk+
> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s (ok), Width x1 (ok)
> 			TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
> 		DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported
> 			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
> 			 AtomicOpsCtl: ReqEn-
> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
> 			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
> 	Capabilities: [e0] Vital Product Data
> 		Product Name: Example VPD
> 		Read-only fields:
> 			[V0] Vendor specific:
> 			[RV] Reserved: checksum good, 0 byte(s) reserved
> 		End
> 	Capabilities: [100 v2] Advanced Error Reporting
> 		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
> 		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
> 		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
> 		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> 		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
> 		AERCap:	First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
> 			MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
> 		HeaderLog: 00000000 00000000 00000000 00000000
> 	Capabilities: [150 v1] Alternative Routing-ID Interpretation (ARI)
> 		ARICap:	MFVC- ACS-, Next Function: 0
> 		ARICtl:	MFVC- ACS-, Function Group: 0
> 	Capabilities: [160 v1] Single Root I/O Virtualization (SR-IOV)
> 		IOVCap:	Migration-, Interrupt Message Number: 000
> 		IOVCtl:	Enable- Migration- Interrupt- MSE- ARIHierarchy-
> 		IOVSta:	Migration-
> 		Initial VFs: 32, Total VFs: 32, Number of VFs: 0, Function Dependency Link: 03
> 		VF offset: 109, stride: 1, Device ID: 37cd
> 		Supported Page Size: 00000553, System Page Size: 00000001
> 		Region 0: Memory at 00000000af000000 (64-bit, prefetchable)
> 		Region 3: Memory at 00000000b0020000 (64-bit, prefetchable)
> 		VF Migration: offset: 00000000, BIR: 0
> 	Capabilities: [1a0 v1] Transaction Processing Hints
> 		Device specific mode supported
> 		No steering table available
> 	Capabilities: [1b0 v1] Access Control Services
> 		ACSCap:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid- TransBlk- ReqRedir- CmpltRedir- UpstreamFwd- EgressCtrl- DirectTrans-
> 	Kernel driver in use: i40e
> 	Kernel modules: i40e
>
>
> Same kernel+i40e, same SFP+ module - but on Intel X710, works like a treat:
>
> # lspci | grep X7
> 81:00.0 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01)
> 81:00.1 Ethernet controller: Intel Corporation Ethernet Controller X710 for 10GbE SFP+ (rev 01) # ethtool -m eth8
> 	Identifier                                : 0x03 (SFP)
> 	Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
> 	Connector                                 : 0x07 (LC)
> 	Transceiver codes                         : 0x10 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
> 	Transceiver type                          : 10G Ethernet: 10G Base-SR
> 	Transceiver type                          : Ethernet: 1000BASE-SX
> 	Encoding                                  : 0x06 (64B/66B)
> 	BR, Nominal                               : 10300MBd
>           (...)
> # ethtool -i eth8
> driver: i40e
> version: 2.9.21
> firmware-version: 6.01 0x800035cf 1.1876.0
> expansion-rom-version:
> bus-info: 0000:81:00.0
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> #
>
>
> Is this a known problem?
>
>
> Best regards,
>    Jakub
>
>
>
> _______________________________________________
> E1000-devel mailing list
> E1000-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel&#174; Ethernet, visit http://communities.intel.com/community/wired


^ permalink raw reply

* Re: [PATCH bpf-next] selftests/bpf: remove wrong nhoff in flow dissector test
From: Daniel Borkmann @ 2019-08-27 22:44 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf; +Cc: davem, ast
In-Reply-To: <20190826222712.171177-1-sdf@google.com>

On 8/27/19 12:27 AM, Stanislav Fomichev wrote:
> .nhoff = 0 is (correctly) reset to ETH_HLEN on the next line so let's
> drop it.
> 
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH bpf-next v3 0/4] selftests/bpf: test_progs: misc fixes
From: Daniel Borkmann @ 2019-08-27 22:44 UTC (permalink / raw)
  To: Stanislav Fomichev, netdev, bpf; +Cc: davem, ast, Andrii Nakryiko
In-Reply-To: <20190821234427.179886-1-sdf@google.com>

On 8/22/19 1:44 AM, Stanislav Fomichev wrote:
> * add test__skip to indicate skipped tests
> * remove global success/error counts (use environment)
> * remove asserts from the tests
> * remove unused ret from send_signal test
> 
> v3:
> * QCHECK -> CHECK_FAIL (Daniel Borkmann)
> 
> v2:
> * drop patch that changes output to keep consistent with test_verifier
>    (Alexei Starovoitov)
> * QCHECK instead of test__fail (Andrii Nakryiko)
> * test__skip count number of subtests (Andrii Nakryiko)
> 
> Cc: Andrii Nakryiko <andriin@fb.com>

Applied, thanks!

^ permalink raw reply

* Re: [PATCH bpf-next 0/4] bpf: precision tracking tests
From: Daniel Borkmann @ 2019-08-27 22:43 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: netdev, bpf, kernel-team
In-Reply-To: <20190823055215.2658669-1-ast@kernel.org>

On 8/23/19 7:52 AM, Alexei Starovoitov wrote:
> Add few additional tests for precision tracking in the verifier.
> 
> Alexei Starovoitov (4):
>    bpf: introduce verifier internal test flag
>    tools/bpf: sync bpf.h
>    selftests/bpf: verifier precise tests
>    selftests/bpf: add precision tracking test
> 
>   include/linux/bpf_verifier.h                  |   1 +
>   include/uapi/linux/bpf.h                      |   3 +
>   kernel/bpf/syscall.c                          |   1 +
>   kernel/bpf/verifier.c                         |   5 +-
>   tools/include/uapi/linux/bpf.h                |   3 +
>   tools/testing/selftests/bpf/test_verifier.c   |  68 +++++++--
>   .../testing/selftests/bpf/verifier/precise.c  | 142 ++++++++++++++++++
>   7 files changed, 211 insertions(+), 12 deletions(-)
>   create mode 100644 tools/testing/selftests/bpf/verifier/precise.c
> 

Applied, thanks!

^ permalink raw reply

* Re: libbpf distro packaging
From: Julia Kartseva @ 2019-08-27 22:30 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov
  Cc: Andrii Nakryiko, labbott@redhat.com, acme@kernel.org,
	debian-kernel@lists.debian.org, netdev@vger.kernel.org,
	Andrey Ignatov, Yonghong Song, jolsa@kernel.org, Daniel Borkmann
In-Reply-To: <20190826064235.GA17554@krava>

On 8/25/19, 11:42 PM, "Jiri Olsa" <jolsa@redhat.com> wrote:

> On Fri, Aug 23, 2019 at 04:00:01PM +0000, Alexei Starovoitov wrote:
> > 
> > Technically we can bump it at any time.
> > The goal was to bump it only when new kernel is released
> > to capture a collection of new APIs in a given 0.0.X release.
> > So that libbpf versions are synchronized with kernel versions
> > in some what loose way.
> > In this case we can make an exception and bump it now.
>
> I see, I dont think it's worth of the exception now,
> the patch is simple or we'll start with 0.0.3

PR introducing 0.0.5 ABI was merged:
https://github.com/libbpf/libbpf/commit/476e158
Jiri, you'd like to avoid patching, you can start w/ 0.0.5.
Also if you're planning to use *.spec from libbpf as a source of truth,
It may be enhanced by syncing spec and ABI versions, similar to
https://github.com/libbpf/libbpf/commit/d60f568

^ permalink raw reply

* Re: [PATCH net-next v2 2/3] dt-bindings: net: dsa: mt7530: Add support for port 5
From: Rob Herring @ 2019-08-27 22:22 UTC (permalink / raw)
  To: René van Dorst
  Cc: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger, netdev, linux-arm-kernel,
	linux-mediatek, John Crispin, linux-mips, Frank Wunderlich,
	devicetree
In-Reply-To: <20190821144547.15113-3-opensource@vdorst.com>

On Wed, Aug 21, 2019 at 04:45:46PM +0200, René van Dorst wrote:
> MT7530 port 5 has many modes/configurations.
> Update the documentation how to use port 5.
> 
> Signed-off-by: René van Dorst <opensource@vdorst.com>
> Cc: devicetree@vger.kernel.org
> Cc: Rob Herring <robh@kernel.org>

> v1->v2:
> * Adding extra note about RGMII2 and gpio use.
> rfc->v1:
> * No change

The changelog goes below the '---'

> ---
>  .../devicetree/bindings/net/dsa/mt7530.txt    | 218 ++++++++++++++++++
>  1 file changed, 218 insertions(+)
> 
> diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> index 47aa205ee0bd..43993aae3f9c 100644
> --- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> +++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
> @@ -35,6 +35,42 @@ Required properties for the child nodes within ports container:
>  - phy-mode: String, must be either "trgmii" or "rgmii" for port labeled
>  	 "cpu".
>  
> +Port 5 of the switch is muxed between:
> +1. GMAC5: GMAC5 can interface with another external MAC or PHY.
> +2. PHY of port 0 or port 4: PHY interfaces with an external MAC like 2nd GMAC
> +   of the SOC. Used in many setups where port 0/4 becomes the WAN port.
> +   Note: On a MT7621 SOC with integrated switch: 2nd GMAC can only connected to
> +	 GMAC5 when the gpios for RGMII2 (GPIO 22-33) are not used and not
> +	 connected to external component!
> +
> +Port 5 modes/configurations:
> +1. Port 5 is disabled and isolated: An external phy can interface to the 2nd
> +   GMAC of the SOC.
> +   In the case of a build-in MT7530 switch, port 5 shares the RGMII bus with 2nd
> +   GMAC and an optional external phy. Mind the GPIO/pinctl settings of the SOC!
> +2. Port 5 is muxed to PHY of port 0/4: Port 0/4 interfaces with 2nd GMAC.
> +   It is a simple MAC to PHY interface, port 5 needs to be setup for xMII mode
> +   and RGMII delay.
> +3. Port 5 is muxed to GMAC5 and can interface to an external phy.
> +   Port 5 becomes an extra switch port.
> +   Only works on platform where external phy TX<->RX lines are swapped.
> +   Like in the Ubiquiti ER-X-SFP.
> +4. Port 5 is muxed to GMAC5 and interfaces with the 2nd GAMC as 2nd CPU port.
> +   Currently a 2nd CPU port is not supported by DSA code.
> +
> +Depending on how the external PHY is wired:
> +1. normal: The PHY can only connect to 2nd GMAC but not to the switch
> +2. swapped: RGMII TX, RX are swapped; external phy interface with the switch as
> +   a ethernet port. But can't interface to the 2nd GMAC.
> +
> +Based on the DT the port 5 mode is configured.
> +
> +Driver tries to lookup the phy-handle of the 2nd GMAC of the master device.
> +When phy-handle matches PHY of port 0 or 4 then port 5 set-up as mode 2.
> +phy-mode must be set, see also example 2 below!
> + * mt7621: phy-mode = "rgmii-txid";
> + * mt7623: phy-mode = "rgmii";
> +
>  See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
>  required, optional properties and how the integrated switch subnodes must
>  be specified.
> @@ -94,3 +130,185 @@ Example:
>  			};
>  		};
>  	};
> +
> +Example 2: MT7621: Port 4 is WAN port: 2nd GMAC -> Port 5 -> PHY port 4.
> +
> +&eth {
> +	status = "okay";

Don't show status in examples.

This should show the complete node.

> +
> +	gmac0: mac@0 {
> +		compatible = "mediatek,eth-mac";
> +		reg = <0>;
> +		phy-mode = "rgmii";
> +
> +		fixed-link {
> +			speed = <1000>;
> +			full-duplex;
> +			pause;
> +		};
> +	};
> +
> +	gmac1: mac@1 {
> +		compatible = "mediatek,eth-mac";
> +		reg = <1>;
> +		phy-mode = "rgmii-txid";
> +		phy-handle = <&phy4>;
> +	};
> +
> +	mdio: mdio-bus {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +
> +		/* Internal phy */
> +		phy4: ethernet-phy@4 {
> +			reg = <4>;
> +		};
> +
> +		mt7530: switch@1f {
> +			compatible = "mediatek,mt7621";
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			reg = <0x1f>;
> +			pinctrl-names = "default";
> +			mediatek,mcm;
> +
> +			resets = <&rstctrl 2>;
> +			reset-names = "mcm";
> +
> +			ports {
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +
> +				port@0 {
> +					reg = <0>;
> +					label = "lan0";
> +				};
> +
> +				port@1 {
> +					reg = <1>;
> +					label = "lan1";
> +				};
> +
> +				port@2 {
> +					reg = <2>;
> +					label = "lan2";
> +				};
> +
> +				port@3 {
> +					reg = <3>;
> +					label = "lan3";
> +				};
> +
> +/* Commented out. Port 4 is handled by 2nd GMAC.
> +				port@4 {
> +					reg = <4>;
> +					label = "lan4";
> +				};
> +*/
> +
> +				cpu_port0: port@6 {
> +					reg = <6>;
> +					label = "cpu";
> +					ethernet = <&gmac0>;
> +					phy-mode = "rgmii";
> +
> +					fixed-link {
> +						speed = <1000>;
> +						full-duplex;
> +						pause;
> +					};
> +				};
> +			};
> +		};
> +	};
> +};
> +
> +Example 3: MT7621: Port 5 is connected to external PHY: Port 5 -> external PHY.
> +
> +&eth {
> +	status = "okay";
> +
> +	gmac0: mac@0 {
> +		compatible = "mediatek,eth-mac";
> +		reg = <0>;
> +		phy-mode = "rgmii";
> +
> +		fixed-link {
> +			speed = <1000>;
> +			full-duplex;
> +			pause;
> +		};
> +	};
> +
> +	mdio: mdio-bus {
> +		#address-cells = <1>;
> +		#size-cells = <0>;
> +
> +		/* External phy */
> +		ephy5: ethernet-phy@7 {
> +			reg = <7>;
> +		};
> +
> +		mt7530: switch@1f {
> +			compatible = "mediatek,mt7621";
> +			#address-cells = <1>;
> +			#size-cells = <0>;
> +			reg = <0x1f>;
> +			pinctrl-names = "default";
> +			mediatek,mcm;
> +
> +			resets = <&rstctrl 2>;
> +			reset-names = "mcm";
> +
> +			ports {
> +				#address-cells = <1>;
> +				#size-cells = <0>;
> +
> +				port@0 {
> +					reg = <0>;
> +					label = "lan0";
> +				};
> +
> +				port@1 {
> +					reg = <1>;
> +					label = "lan1";
> +				};
> +
> +				port@2 {
> +					reg = <2>;
> +					label = "lan2";
> +				};
> +
> +				port@3 {
> +					reg = <3>;
> +					label = "lan3";
> +				};
> +
> +				port@4 {
> +					reg = <4>;
> +					label = "lan4";
> +				};
> +
> +				port@5 {
> +					reg = <5>;
> +					label = "lan5";
> +					phy-mode = "rgmii";
> +					phy-handle = <&ephy5>;
> +				};
> +
> +				cpu_port0: port@6 {
> +					reg = <6>;
> +					label = "cpu";
> +					ethernet = <&gmac0>;
> +					phy-mode = "rgmii";
> +
> +					fixed-link {
> +						speed = <1000>;
> +						full-duplex;
> +						pause;
> +					};
> +				};
> +			};
> +		};
> +	};
> +};
> -- 
> 2.20.1
> 

^ permalink raw reply

* Re: [Patch net] net_sched: fix a NULL pointer deref in ipt action
From: David Miller @ 2019-08-27 22:06 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, itugrok, jhs, jiri
In-Reply-To: <20190825170132.31174-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Sun, 25 Aug 2019 10:01:32 -0700

> The net pointer in struct xt_tgdtor_param is not explicitly
> initialized therefore is still NULL when dereferencing it.
> So we have to find a way to pass the correct net pointer to
> ipt_destroy_target().
> 
> The best way I find is just saving the net pointer inside the per
> netns struct tcf_idrinfo, which could make this patch smaller.
> 
> Fixes: 0c66dc1ea3f0 ("netfilter: conntrack: register hooks in netns when needed by ruleset")
> Reported-and-tested-by: itugrok@yahoo.com
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH v2] bonding: force enable lacp port after link state recovery for 802.3ad
From: David Miller @ 2019-08-27 22:04 UTC (permalink / raw)
  To: zhangsha.zhang
  Cc: j.vosburgh, vfalico, andy, netdev, linux-kernel, yuehaibing,
	hunongda, alex.chen
In-Reply-To: <20190823034209.14596-1-zhangsha.zhang@huawei.com>

From: <zhangsha.zhang@huawei.com>
Date: Fri, 23 Aug 2019 11:42:09 +0800

> - If speed/duplex getting failed here, the link status
>   will be changed to BOND_LINK_FAIL;

How does it fail at this step?  I suspect this is a driver specific
problem.


^ permalink raw reply

* Re: [PATCH net-next v5] sched: Add dualpi2 qdisc
From: David Miller @ 2019-08-27 22:03 UTC (permalink / raw)
  To: olivier.tilmans
  Cc: eric.dumazet, stephen, olga, koen.de_schepper, research, henrist,
	jhs, xiyou.wangcong, jiri, linux-kernel, netdev
In-Reply-To: <20190822080045.27609-1-olivier.tilmans@nokia-bell-labs.com>

From: "Tilmans, Olivier (Nokia - BE/Antwerp)" <olivier.tilmans@nokia-bell-labs.com>
Date: Thu, 22 Aug 2019 08:10:48 +0000

> +static inline struct dualpi2_skb_cb *dualpi2_skb_cb(struct sk_buff *skb)

Please do not use the inline keyword in foo.c files, let the compiler decide.

> +static struct sk_buff *dualpi2_qdisc_dequeue(struct Qdisc *sch)
> +{
> +	struct dualpi2_sched_data *q = qdisc_priv(sch);
> +	struct sk_buff *skb;
> +	int qlen_c, credit_change;

Reverse christmas tree here, please.

> +static void dualpi2_timer(struct timer_list *timer)
> +{
> +	struct dualpi2_sched_data *q = from_timer(q, timer, pi2.timer);
> +	struct Qdisc *sch = q->sch;
> +	spinlock_t *root_lock; /* Lock to access the head of both queues. */

Likewise, and please remove this comment it makes the variable declarations
look odd.

^ permalink raw reply

* Re: [PATCH v2 net] Add genphy_c45_config_aneg() function to phy-c45.c
From: David Miller @ 2019-08-27 22:01 UTC (permalink / raw)
  To: marco.hartmann
  Cc: andrew, f.fainelli, hkallweit1, netdev, linux-kernel,
	christian.herber
In-Reply-To: <1566385208-23523-1-git-send-email-marco.hartmann@nxp.com>

From: Marco Hartmann <marco.hartmann@nxp.com>
Date: Wed, 21 Aug 2019 11:00:46 +0000

> Commit 34786005eca3 ("net: phy: prevent PHYs w/o Clause 22 regs from calling
> genphy_config_aneg") introduced a check that aborts phy_config_aneg()
> if the phy is a C45 phy.
> This causes phy_state_machine() to call phy_error() so that the phy
> ends up in PHY_HALTED state.
> 
> Instead of returning -EOPNOTSUPP, call genphy_c45_config_aneg()
> (analogous to the C22 case) so that the state machine can run
> correctly.
> 
> genphy_c45_config_aneg() closely resembles mv3310_config_aneg()
> in drivers/net/phy/marvell10g.c, excluding vendor specific
> configurations for 1000BaseT.
> 
> Fixes: 22b56e827093 ("net: phy: replace genphy_10g_driver with genphy_c45_driver")
> 
> Signed-off-by: Marco Hartmann <marco.hartmann@nxp.com>

Andrew, gentle ping to respond to Heiner who said:

> For me this patch would be ok, even though this generic config_aneg
> doesn't support 1000BaseT.
> 1. The whole genphy_c45 driver doesn't make sense w/o a config_aneg
>    callback implementation.
> 2. It can serve as a temporary fallback for new C45 PHY's that don't
>    have a dedicated driver yet.
> 3. We may have C45 PHYs not supporting 1000BaseT (e.g. T1).
> 
> Andrew?

Thanks.

^ permalink raw reply

* Re: [PATCH] net/mlx5: fix a -Wstringop-truncation warning
From: Saeed Mahameed @ 2019-08-27 21:51 UTC (permalink / raw)
  To: cai@lca.pw
  Cc: linux-rdma@vger.kernel.org, davem@davemloft.net, Moshe Shemesh,
	Feras Daoud, linux-kernel@vger.kernel.org, Eran Ben Elisha,
	netdev@vger.kernel.org, leon@kernel.org
In-Reply-To: <1566590183-9898-1-git-send-email-cai@lca.pw>

On Fri, 2019-08-23 at 15:56 -0400, Qian Cai wrote:
> In file included from ./arch/powerpc/include/asm/paca.h:15,
>                  from ./arch/powerpc/include/asm/current.h:13,
>                  from ./include/linux/thread_info.h:21,
>                  from ./include/asm-generic/preempt.h:5,
>                  from
> ./arch/powerpc/include/generated/asm/preempt.h:1,
>                  from ./include/linux/preempt.h:78,
>                  from ./include/linux/spinlock.h:51,
>                  from ./include/linux/wait.h:9,
>                  from ./include/linux/completion.h:12,
>                  from ./include/linux/mlx5/driver.h:37,
>                  from
> drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h:6,
>                  from
> drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:33:
> In function 'strncpy',
>     inlined from 'mlx5_fw_tracer_save_trace' at
> drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:549:2,
>     inlined from 'mlx5_tracer_print_trace' at
> drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:574:2:
> ./include/linux/string.h:305:9: warning: '__builtin_strncpy' output
> may
> be truncated copying 256 bytes from a string of length 511
> [-Wstringop-truncation]
>   return __builtin_strncpy(p, q, size);
>          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> 
> Fix it by using the new strscpy_pad() since the commit 458a3bf82df4
> ("lib/string: Add strscpy_pad() function") which will always
> NUL-terminate the string, and avoid possibly leak data through the
> ring
> buffer where non-admin account might enable these events through
> perf.
> 
> Fixes: fd1483fe1f9f ("net/mlx5: Add support for FW reporter dump")
> Signed-off-by: Qian Cai <cai@lca.pw>
> 

Applied to mlx5-next, Thanks !

^ permalink raw reply

* Re: [PATCH net-next v2 3/3] dt-bindings: net: ethernet: Update mt7622 docs and dts to reflect the new phylink API
From: Rob Herring @ 2019-08-27 21:51 UTC (permalink / raw)
  To: René van Dorst
  Cc: John Crispin, Sean Wang, Nelson Chang, David S . Miller,
	Matthias Brugger, netdev, linux-arm-kernel, linux-mediatek,
	linux-mips, Frank Wunderlich, Stefan Roese, devicetree
In-Reply-To: <20190821144336.9259-4-opensource@vdorst.com>

On Wed, Aug 21, 2019 at 04:43:36PM +0200, René van Dorst wrote:
> This patch the removes the recently added mediatek,physpeed property.
> Use the fixed-link property speed = <2500> to set the phy in 2.5Gbit.
> See mt7622-bananapi-bpi-r64.dts for a working example.
> 
> Signed-off-by: René van Dorst <opensource@vdorst.com>
> Cc: devicetree@vger.kernel.org
> Cc: Rob Herring <robh@kernel.org>
> --
> v1->v2:
> * SGMII port only support BASE-X at 2.5Gbit.
> ---
>  .../arm/mediatek/mediatek,sgmiisys.txt        |  2 --

Bindings and dts files should be separate patches.


>  .../dts/mediatek/mt7622-bananapi-bpi-r64.dts  | 28 +++++++++++++------
>  arch/arm64/boot/dts/mediatek/mt7622.dtsi      |  1 -
>  3 files changed, 19 insertions(+), 12 deletions(-)

In any case,

Acked-by: Rob Herring <robh@kernel.org>

^ permalink raw reply

* Re: [PATCH] net/mlx5: fix a -Wstringop-truncation warning
From: Saeed Mahameed @ 2019-08-27 21:46 UTC (permalink / raw)
  To: cai@lca.pw
  Cc: linux-rdma@vger.kernel.org, davem@davemloft.net, Moshe Shemesh,
	Feras Daoud, linux-kernel@vger.kernel.org, Eran Ben Elisha,
	netdev@vger.kernel.org, leon@kernel.org
In-Reply-To: <1566936733.5576.16.camel@lca.pw>

On Tue, 2019-08-27 at 16:12 -0400, Qian Cai wrote:
> On Mon, 2019-08-26 at 21:11 +0000, Saeed Mahameed wrote:
> > On Fri, 2019-08-23 at 15:56 -0400, Qian Cai wrote:
> > > In file included from ./arch/powerpc/include/asm/paca.h:15,
> > >                  from ./arch/powerpc/include/asm/current.h:13,
> > >                  from ./include/linux/thread_info.h:21,
> > >                  from ./include/asm-generic/preempt.h:5,
> > >                  from
> > > ./arch/powerpc/include/generated/asm/preempt.h:1,
> > >                  from ./include/linux/preempt.h:78,
> > >                  from ./include/linux/spinlock.h:51,
> > >                  from ./include/linux/wait.h:9,
> > >                  from ./include/linux/completion.h:12,
> > >                  from ./include/linux/mlx5/driver.h:37,
> > >                  from
> > > drivers/net/ethernet/mellanox/mlx5/core/lib/eq.h:6,
> > >                  from
> > > drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:33:
> > > In function 'strncpy',
> > >     inlined from 'mlx5_fw_tracer_save_trace' at
> > > drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:549:2,
> > >     inlined from 'mlx5_tracer_print_trace' at
> > > drivers/net/ethernet/mellanox/mlx5/core/diag/fw_tracer.c:574:2:
> > > ./include/linux/string.h:305:9: warning: '__builtin_strncpy'
> > > output
> > > may
> > > be truncated copying 256 bytes from a string of length 511
> > > [-Wstringop-truncation]
> > >   return __builtin_strncpy(p, q, size);
> > >          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > > 
> > > Fix it by using the new strscpy_pad() since the commit
> > > 458a3bf82df4
> > > ("lib/string: Add strscpy_pad() function") which will always
> > > NUL-terminate the string, and avoid possibly leak data through
> > > the
> > > ring
> > > buffer where non-admin account might enable these events through
> > > perf.
> > > 
> > > Fixes: fd1483fe1f9f ("net/mlx5: Add support for FW reporter
> > > dump")
> > > Signed-off-by: Qian Cai <cai@lca.pw>
> > 
> > Hi Qian and thanks for your patch,
> > 
> > We already have a patch that handles this issue, please check it
> > out:
> > https://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux.git/commit/?h=net-
> > next-mlx5
> > 
> 
> That commit will make "struct mlx5_fw_tracer" too large and trigger a
> warning in
> __alloc_pages_nodemask(),
> 

I see! thanks for the input, the patch is still under review and not
yet passed to regression queue.

I will take your patch.. and will fix our patch on top of yours.

>         /*
>          * There are several places where we assume that the order
> value is sane
>          * so bail out early if the request is out of bound.
>          */
>         if (unlikely(order >= MAX_ORDER)) {
>                 WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
>                 return NULL;
>         }
> 
> [   98.339576][  T914] WARNING: CPU: 0 PID: 914 at
> mm/page_alloc.c:4705
> __alloc_pages_nodemask+0x441/0x1bb0
> [   98.349174][  T914] Modules linked in: smartpqi(+)
> scsi_transport_sas tg3
> mlx5_core(+) libphy firmware_class dm_mirror dm_region_hash dm_log
> dm_mod
> efivarfs
> [   98.363495][  T914] CPU: 0 PID: 914 Comm: kworker/0:2 Not tainted
> 5.3.0-rc6-
> next-20190827+ #14
> [   98.372243][  T914] Hardware name: HPE ProLiant DL385
> Gen10/ProLiant DL385
> Gen10, BIOS A40 07/10/2019
> [   98.381627][  T914] Workqueue: events work_for_cpu_fn
> [   98.386720][  T914] RIP: 0010:__alloc_pages_nodemask+0x441/0x1bb0
> [   98.392917][  T914] Code: 17 00 00 48 8d 65 d8 5b 41 5c 41 5d 41
> 5e 41 5f 5d
> c3 89 85 3c fe ff ff bb 01 00 00 00 e9 96 fd ff ff 81 e7 00 20 00 00
> 75 02 <0f>
> 0b 48 c7 85 50 fe ff ff 00 00 00 00 eb 82 31 d2 be 36 12 00 00
> [   98.412740][  T914] RSP: 0018:ffff88853418f948 EFLAGS: 00010246
> [   98.418704][  T914] RAX: 0000000000000000 RBX: ffffffff9571a860
> RCX:
> 1ffff110a6831f3e
> [   98.426652][  T914] RDX: 0000000000000000 RSI: 000000000000000b
> RDI:
> 0000000000000000
> [   98.434661][  T914] RBP: ffff88853418fb58 R08: ffffed1108808465
> R09:
> ffffed1108808465
> [   98.442613][  T914] R10: ffffed1108808464 R11: ffff888844042323
> R12:
> 0000000000000000
> [   98.450548][  T914] R13: 000000000000000b R14: 0000000000000000
> R15:
> 0000000000000001
> [   98.458434][  T914] FS:  0000000000000000(0000)
> GS:ffff888844000000(0000)
> knlGS:0000000000000000
> [   98.467350][  T914] CS:  0010 DS: 0000 ES: 0000 CR0:
> 0000000080050033
> [   98.473911][  T914] CR2: 0000555c64680148 CR3: 0000000550412000
> CR4:
> 00000000003406b0
> [   98.481838][  T914] Call Trace:
> [   98.485011][  T914]  ? find_next_bit+0x2c/0xa0
> [   98.489490][  T914]  ? __kasan_check_write+0x14/0x20
> [   98.494506][  T914]  ? graph_lock+0xb8/0x120
> [   98.498811][  T914]  ? __free_zapped_classes+0x740/0x740
> [   98.504239][  T914]  ? gfp_pfmemalloc_allowed+0xc0/0xc0
> [   98.509504][  T914]  ? __kasan_check_read+0x11/0x20
> [   98.514443][  T914]  ? register_lock_class+0x5ef/0x960
> [   98.519624][  T914]  ? rcu_read_lock_sched_held+0xac/0xe0
> [   98.525152][  T914]  ? rcu_read_lock_any_held.part.5+0x20/0x20
> [   98.531130][  T914]  ? find_next_bit+0x2c/0xa0
> [   98.535610][  T914]  alloc_pages_current+0x9c/0x110
> [   98.540638][  T914]  kmalloc_order+0x22/0x70
> [   98.544943][  T914]  kmalloc_order_trace+0x23/0x100
> [   98.550072][  T914]  mlx5_fw_tracer_create+0x51/0x870 [mlx5_core]
> [   98.556213][  T914]  ? __mutex_init+0x94/0xa0
> [   98.560744][  T914]  ? mlx5_init_rl_table+0x144/0x210 [mlx5_core]
> [   98.566929][  T914]  mlx5_load_one+0x199/0x980 [mlx5_core]
> [   98.572637][  T914]  init_one+0x494/0x760 [mlx5_core]
> [   98.577771][  T914]  ? mlx5_pci_resume+0xd0/0xd0 [mlx5_core]
> [   98.583574][  T914]  local_pci_probe+0x7a/0xc0
> [   98.588054][  T914]  ? pci_dma_configure+0xa0/0xa0
> [   98.592938][  T914]  work_for_cpu_fn+0x2e/0x50
> [   98.597416][  T914]  process_one_work+0x53b/0xa70
> [   98.602220][  T914]  ? pwq_dec_nr_in_flight+0x170/0x170
> [   98.607485][  T914]  ? move_linked_works+0x113/0x150
> [   98.612497][  T914]  worker_thread+0x363/0x5b0
> [   98.616976][  T914]  kthread+0x1df/0x200
> [   98.620932][  T914]  ? process_one_work+0xa70/0xa70
> [   98.625847][  T914]  ? kthread_park+0xd0/0xd0
> [   98.630240][  T914]  ret_from_fork+0x22/0x40

^ permalink raw reply

* Re: [PATCH v5 net-next 03/18] ionic: Add port management commands
From: Shannon Nelson @ 2019-08-27 21:44 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: netdev, davem
In-Reply-To: <20190826213631.37b8f56d@cakuba.netronome.com>

On 8/26/19 9:36 PM, Jakub Kicinski wrote:
> On Mon, 26 Aug 2019 14:33:24 -0700, Shannon Nelson wrote:
>> The port management commands apply to the physical port
>> associated with the PCI device, which might be shared among
>> several logical interfaces.
>>
>> Signed-off-by: Shannon Nelson <snelson@pensando.io>
[...]

>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_dev.h b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> index 30a5206bba4e..5b83f21af18a 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_dev.h
>> @@ -122,6 +122,10 @@ struct ionic_dev {
>>   	struct ionic_intr __iomem *intr_ctrl;
>>   	u64 __iomem *intr_status;
>>   
>> +	u32 port_info_sz;
>> +	struct ionic_port_info *port_info;
>> +	dma_addr_t port_info_pa;
>> +
>>   	struct ionic_devinfo dev_info;
>>   };
>>   
>> @@ -140,4 +144,15 @@ void ionic_dev_cmd_identify(struct ionic_dev *idev, u8 ver);
>>   void ionic_dev_cmd_init(struct ionic_dev *idev);
>>   void ionic_dev_cmd_reset(struct ionic_dev *idev);
>>   
>> +void ionic_dev_cmd_port_identify(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_init(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_reset(struct ionic_dev *idev);
>> +void ionic_dev_cmd_port_state(struct ionic_dev *idev, u8 state);
>> +void ionic_dev_cmd_port_speed(struct ionic_dev *idev, u32 speed);
>> +void ionic_dev_cmd_port_mtu(struct ionic_dev *idev, u32 mtu);
>> +void ionic_dev_cmd_port_autoneg(struct ionic_dev *idev, u8 an_enable);
>> +void ionic_dev_cmd_port_fec(struct ionic_dev *idev, u8 fec_type);
>> +void ionic_dev_cmd_port_pause(struct ionic_dev *idev, u8 pause_type);
>> +void ionic_dev_cmd_port_loopback(struct ionic_dev *idev, u8 loopback_mode);
> I don't think you call most of these functions in this patch.

No, but most get used in the ethtool code added a few patches later.  
The port_mtu probably won't get used, so I can pull that out.  The 
port_loopback will get used when I add a loopback test, but I can pull 
that out for now until that test is added.

>
>>   #endif /* _IONIC_DEV_H_ */
>> diff --git a/drivers/net/ethernet/pensando/ionic/ionic_main.c b/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> index f52eb6c50358..47928f184230 100644
>> --- a/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> +++ b/drivers/net/ethernet/pensando/ionic/ionic_main.c
>> @@ -309,6 +309,92 @@ int ionic_reset(struct ionic *ionic)
>>   	return err;
>>   }
>>   
>> +int ionic_port_identify(struct ionic *ionic)
>> +{
>> +	struct ionic_identity *ident = &ionic->ident;
>> +	struct ionic_dev *idev = &ionic->idev;
>> +	size_t sz;
>> +	int err;
>> +
>> +	mutex_lock(&ionic->dev_cmd_lock);
>> +
>> +	ionic_dev_cmd_port_identify(idev);
>> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +	if (!err) {
>> +		sz = min(sizeof(ident->port), sizeof(idev->dev_cmd_regs->data));
>> +		memcpy_fromio(&ident->port, &idev->dev_cmd_regs->data, sz);
>> +	}
>> +
>> +	mutex_unlock(&ionic->dev_cmd_lock);
>> +
>> +	return err;
>> +}
>> +
>> +int ionic_port_init(struct ionic *ionic)
>> +{
>> +	struct ionic_identity *ident = &ionic->ident;
>> +	struct ionic_dev *idev = &ionic->idev;
>> +	size_t sz;
>> +	int err;
>> +
>> +	if (idev->port_info)
>> +		return 0;
>> +
>> +	idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE);
>> +	idev->port_info = dma_alloc_coherent(ionic->dev, idev->port_info_sz,
>> +					     &idev->port_info_pa,
>> +					     GFP_KERNEL);
>> +	if (!idev->port_info) {
>> +		dev_err(ionic->dev, "Failed to allocate port info, aborting\n");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
>> +
>> +	mutex_lock(&ionic->dev_cmd_lock);
>> +
>> +	memcpy_toio(&idev->dev_cmd_regs->data, &ident->port.config, sz);
>> +	ionic_dev_cmd_port_init(idev);
>> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +
>> +	ionic_dev_cmd_port_state(&ionic->idev, IONIC_PORT_ADMIN_STATE_UP);
>> +	(void)ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +
>> +	mutex_unlock(&ionic->dev_cmd_lock);
>> +	if (err) {
>> +		dev_err(ionic->dev, "Failed to init port\n");
> The lifetime of port_info seems a little strange. Why is it left in
> place even if the command failed? Doesn't this leak memory?
>
>> +		return err;
>> +	}
>> +
>> +	return 0;
> return err; work for both paths
>
>> +}
>> +
>> +int ionic_port_reset(struct ionic *ionic)
>> +{
>> +	struct ionic_dev *idev = &ionic->idev;
>> +	int err;
>> +
>> +	if (!idev->port_info)
>> +		return 0;
>> +
>> +	mutex_lock(&ionic->dev_cmd_lock);
>> +	ionic_dev_cmd_port_reset(idev);
>> +	err = ionic_dev_cmd_wait(ionic, devcmd_timeout);
>> +	mutex_unlock(&ionic->dev_cmd_lock);
>> +	if (err) {
>> +		dev_err(ionic->dev, "Failed to reset port\n");
>> +		return err;
> Again, memory leak if command fails? (nothing frees port_info)
>
>> +	}
>> +
>> +	dma_free_coherent(ionic->dev, idev->port_info_sz,
>> +			  idev->port_info, idev->port_info_pa);
>> +
>> +	idev->port_info = NULL;
>> +	idev->port_info_pa = 0;
>> +
>> +	return err;
> Well, with current code err can only be 0 at this point.

I'll revisit these bits.

sln



^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox