Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH net-next 0/2] Add configuration information to register dump and debug data
From: David Miller @ 2018-04-23 16:06 UTC (permalink / raw)
  To: denis.bolotin; +Cc: netdev, ariel.elior
In-Reply-To: <20180423115605.8531-1-denis.bolotin@cavium.com>

From: Denis Bolotin <denis.bolotin@cavium.com>
Date: Mon, 23 Apr 2018 14:56:03 +0300

> The purpose of this patchset is to add configuration information to the
> debug data collection, which already contains register dump.
> The first patch (removing the ptt) is essential because it prevents the
> unnecessary ptt acquirement when calling mcp APIs.

Series applied, thank you.

^ permalink raw reply

* Re: [PATCH v2 net-next] net: stmmac: Implement logic to automatically select HW Interface
From: David Miller @ 2018-04-23 16:04 UTC (permalink / raw)
  To: Jose.Abreu
  Cc: netdev, Joao.Pinto, Vitor.Soares, peppe.cavallaro,
	alexandre.torgue
In-Reply-To: <0f13e7542a0b701da6446411318a4e14023cddb7.1524470547.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Mon, 23 Apr 2018 09:05:15 +0100

> Move all the core version detection to a common place ("hwif.c") and
> implement a table which can be used to lookup the correct callbacks for
> each IP version.
> 
> This simplifies the initialization flow of each IP version and eases
> future implementation of new IP versions.
> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>

Applied, thanks Jose.

^ permalink raw reply

* Re: [PATCH net] ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
From: David Miller @ 2018-04-23 16:02 UTC (permalink / raw)
  To: edumazet; +Cc: netdev, eric.dumazet
In-Reply-To: <20180423012923.121147-1-edumazet@google.com>

From: Eric Dumazet <edumazet@google.com>
Date: Sun, 22 Apr 2018 18:29:23 -0700

> KMSAN reported use of uninit-value that I tracked to lack
> of proper size check on RTA_TABLE attribute.
> 
> I also believe RTA_PREFSRC lacks a similar check.
> 
> Fixes: 86872cb57925 ("[IPv6] route: FIB6 configuration using struct fib6_config")
> Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>

Applied and queued up for -stable, thanks Eric.

^ permalink raw reply

* Re: [PATCH] r8169: don't use netif_info et al before net_device has been registered
From: David Miller @ 2018-04-23 15:59 UTC (permalink / raw)
  To: hkallweit1; +Cc: nic_swsd, netdev
In-Reply-To: <94b7d3c8-9634-5b95-ff92-3682f7894fca@gmail.com>

From: Heiner Kallweit <hkallweit1@gmail.com>
Date: Sun, 22 Apr 2018 17:15:15 +0200

> There's no benefit in using netif_info et al before the net_device has
> been registered. We get messages like
> r8169 0000:03:00.0 (unnamed net_device) (uninitialized): [message]
> Therefore use dev_info/dev_err instead.
> 
> As a side effect we don't need parameter dev for function
> rtl8169_get_mac_version() any longer.
> 
> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>

Please properly tag your Subject lines with the tree you are targetting,
in this case it should have been "[PATCH net-next]"

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net-next] net: init sk_cookie for inet socket
From: David Miller @ 2018-04-23 15:58 UTC (permalink / raw)
  To: laoar.shao; +Cc: eric.dumazet, alexei.starovoitov, netdev, linux-kernel
In-Reply-To: <1524405004-10960-1-git-send-email-laoar.shao@gmail.com>

From: Yafang Shao <laoar.shao@gmail.com>
Date: Sun, 22 Apr 2018 21:50:04 +0800

> With sk_cookie we can identify a socket, that is very helpful for
> traceing and statistic, i.e. tcp tracepiont and ebpf.
> So we'd better init it by default for inet socket.
> When using it, we just need call atomic64_read(&sk->sk_cookie).
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Applied, thank you.

^ permalink raw reply

* Re: [PATCH net] bonding: do not set slave_dev npinfo before slave_enable_netpoll in bond_enslave
From: David Miller @ 2018-04-23 15:55 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, andy, jiri, xiyou.wangcong
In-Reply-To: <d1defa141d6daef661da14b42db108a0e913314d.1524395510.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Sun, 22 Apr 2018 19:11:50 +0800

> After Commit 8a8efa22f51b ("bonding: sync netpoll code with bridge"), it
> would set slave_dev npinfo in slave_enable_netpoll when enslaving a dev
> if bond->dev->npinfo was set.
> 
> However now slave_dev npinfo is set with bond->dev->npinfo before calling
> slave_enable_netpoll. With slave_dev npinfo set, __netpoll_setup called
> in slave_enable_netpoll will not call slave dev's .ndo_netpoll_setup().
> It causes that the lower dev of this slave dev can't set its npinfo.
> 
> One way to reproduce it:
> 
>   # modprobe bonding
>   # brctl addbr br0
>   # brctl addif br0 eth1
>   # ifconfig bond0 192.168.122.1/24 up
>   # ifenslave bond0 eth2
>   # systemctl restart netconsole
>   # ifenslave bond0 br0
>   # ifconfig eth2 down
>   # systemctl restart netconsole
> 
> The netpoll won't really work.
> 
> This patch is to remove that slave_dev npinfo setting in bond_enslave().
> 
> Fixes: 8a8efa22f51b ("bonding: sync netpoll code with bridge")
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net] ipv6: add RTA_TABLE and RTA_PREFSRC to rtm_ipv6_policy
From: David Ahern @ 2018-04-23 15:48 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller; +Cc: netdev, Eric Dumazet
In-Reply-To: <20180423012923.121147-1-edumazet@google.com>

On 4/22/18 7:29 PM, Eric Dumazet wrote:
> KMSAN reported use of uninit-value that I tracked to lack
> of proper size check on RTA_TABLE attribute.
> 
> I also believe RTA_PREFSRC lacks a similar check.
> 
> Fixes: 86872cb57925 ("[IPv6] route: FIB6 configuration using struct fib6_config")
> Fixes: c3968a857a6b ("ipv6: RTA_PREFSRC support for ipv6 route source address selection")
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Reported-by: syzbot <syzkaller@googlegroups.com>
> ---
>  net/ipv6/route.c | 2 ++
>  1 file changed, 2 insertions(+)
> 

Acked-by: David Ahern <dsahern@gmail.com>

^ permalink raw reply

* Re: [pci PATCH v8 0/4] Add support for unmanaged SR-IOV
From: Alexander Duyck @ 2018-04-23 15:47 UTC (permalink / raw)
  To: Don Dutile
  Cc: Bjorn Helgaas, Alexander Duyck, Bjorn Helgaas, linux-pci,
	virtio-dev, kvm, Netdev, Daly, Dan, LKML, linux-nvme, Keith Busch,
	netanel, Maximilian Heyne, Wang, Liang-min, Rustad, Mark D,
	David Woodhouse, Christoph Hellwig, dwmw
In-Reply-To: <84adb7c2-bcf7-5e17-a9d0-482416e2d26c@redhat.com>

On Mon, Apr 23, 2018 at 8:21 AM, Don Dutile <ddutile@redhat.com> wrote:
> On 04/21/2018 04:34 PM, Bjorn Helgaas wrote:
>>
>> On Fri, Apr 20, 2018 at 12:28:08PM -0400, Alexander Duyck wrote:
>>>
>>> This series is meant to add support for SR-IOV on devices when the VFs
>>> are
>>> not managed by the kernel. Examples of recent patches attempting to do
>>> this
>>> include:
>>> virto - https://patchwork.kernel.org/patch/10241225/
>>> pci-stub - https://patchwork.kernel.org/patch/10109935/
>>> vfio - https://patchwork.kernel.org/patch/10103353/
>>> uio - https://patchwork.kernel.org/patch/9974031/
>>>
>>> Since this is quickly blowing up into a multi-driver problem it is
>>> probably
>>> best to implement this solution as generically as possible.
>>>
>>> This series is an attempt to do that. What we do with this patch set is
>>> provide a generic framework to enable SR-IOV in the case that the PF
>>> driver
>>> doesn't support managing the VFs itself.
>>>
>>> I based my patch set originally on the patch by Mark Rustad but there
>>> isn't
>>> much left after going through and cleaning out the bits that were no
>>> longer
>>> needed, and after incorporating the feedback from David Miller. At this
>>> point
>>> the only items to be fully reused was his patch description which is now
>>> present in patch 3 of the set.
>>>
>>> This solution is limited in scope to just adding support for devices that
>>> provide no functionality for SR-IOV other than allocating the VFs by
>>> calling pci_enable_sriov. Previous sets had included patches for VFIO,
>>> but
>>> for now I am dropping that as the scope of that work is larger then I
>>> think I can take on at this time.
>>>
>>> v2: Reduced scope back to just virtio_pci and vfio-pci
>>>      Broke into 3 patch set from single patch
>>>      Changed autoprobe behavior to always set when num_vfs is set
>>> non-zero
>>> v3: Updated Documentation to clarify when sriov_unmanaged_autoprobe is
>>> used
>>>      Wrapped vfio_pci_sriov_configure to fix build errors w/o SR-IOV in
>>> kernel
>>> v4: Dropped vfio-pci patch
>>>      Added ena and nvme to drivers now using
>>> pci_sriov_configure_unmanaged
>>>      Dropped pci_disable_sriov call in virtio_pci to be consistent with
>>> ena
>>> v5: Dropped sriov_unmanaged_autoprobe and pci_sriov_conifgure_unmanaged
>>>      Added new patch that enables pci_sriov_configure_simple
>>>      Updated drivers to use pci_sriov_configure_simple
>>> v6: Defined pci_sriov_configure_simple as NULL when SR-IOV is not enabled
>>>      Updated drivers to drop "#ifdef" checks for IOV
>>>      Added pci-pf-stub as place for PF-only drivers to add support
>>> v7: Dropped pci_id table explanation from pci-pf-stub driver
>>>      Updated pci_sriov_configure_simple to drop need for err value
>>>      Fixed comment explaining why pci_sriov_configure_simple is NULL
>>> v8: Dropped virtio from the set, support to be added later after TC
>>> approval
>>>
>>> Cc: Mark Rustad <mark.d.rustad@intel.com>
>>> Cc: Maximilian Heyne <mheyne@amazon.de>
>>> Cc: Liang-Min Wang <liang-min.wang@intel.com>
>>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>>>
>>> ---
>>>
>>> Alexander Duyck (4):
>>>        pci: Add pci_sriov_configure_simple for PFs that don't manage VF
>>> resources
>>>        ena: Migrate over to unmanaged SR-IOV support
>>>        nvme: Migrate over to unmanaged SR-IOV support
>>>        pci-pf-stub: Add PF driver stub for PFs that function only to
>>> enable VFs
>>>
>>>
>>>   drivers/net/ethernet/amazon/ena/ena_netdev.c |   28 -------------
>>>   drivers/nvme/host/pci.c                      |   20 ----------
>>>   drivers/pci/Kconfig                          |   12 ++++++
>>>   drivers/pci/Makefile                         |    2 +
>>>   drivers/pci/iov.c                            |   31 +++++++++++++++
>>>   drivers/pci/pci-pf-stub.c                    |   54
>>> ++++++++++++++++++++++++++
>>>   include/linux/pci.h                          |    3 +
>>>   include/linux/pci_ids.h                      |    2 +
>>>   8 files changed, 106 insertions(+), 46 deletions(-)
>>>   create mode 100644 drivers/pci/pci-pf-stub.c
>>
>>
>> I tentatively applied these to pci/virtualization-review.
>>
>> The code changes look fine, but I want to flesh out the changelogs a
>> little bit before merging them.
>>
>> For example, I'm not sure what you mean by "devices where the PF is
>> not capable of managing VF resources."
>>
> I agree w/Bjorn's assessment of the changelog.
> The VF's are (minimally) assigned via the pf-stub driver, so they are
> 'managed by the kernel'.
> The security model is the same as the existing one, which was the issue we
> resolved in the previous set(s) of patches.
>
> I am hoping that something like vfio will be used to deal with the VF
> ownership
> and the reset mechanisms during assignement & de-assignment to 'guests'
> (qemu-kvm, DPDK, or whatever user-process),
> so the known, existing security model(s) is(are) maintained as well.
> If so, it'd be good to add such verbage somewhere (as 0/n is not kept in
> anything but possibly Bjorn's patchwork, or whatever patch mgmt tool he
> uses, and future reference would be good to have) say, an update to
> Documentation/PCI/pci-iov-howto.txt.
>
> So... the 'unmanaged SR-IOV' Subject, IMO, is not a valid Subject for the
> patch series any longer.
>
> No objections to the patch series, as Bjorn noted, just the commit
> log(s)/nomenclature of what is really being done.
> The expectation of VF enablement via the PF was born out of the fairly
> complicated, and unique PF vs VF drivers of the first implementations, which
> AlexD knows so well.  This "VFs act just like PFs without SRIOV
> capabilities" support is what this patch set enables with a much lighter
> configuration mechanism.
> So, maybe the patch set ought to be 'lightweight SRIOV enablement'.
>
> --dd

I'd be good with this being referred to as "lightweight SRIOV enablement".

The only reason why I was referring to it as "unmanaged" was because I
am used to drivers that use the PF MMIO registers to manage VF
resources and that doesn't exist in this model. Obviously this is all
still managed via the extended PCIe configuration though so there is
still some management taking place by the PCI subsystem in the kernel.

Thanks.

- Alex

^ permalink raw reply

* Re: [PATCH 1/3] ethtool: Support ETHTOOL_GSTATS2 command.
From: Ben Greear @ 2018-04-23 15:41 UTC (permalink / raw)
  To: Roopa Prabhu, David Miller; +Cc: netdev, Johannes Berg, linux-wireless, ath10k
In-Reply-To: <CAJieiUh8Pnu5GVSWz04WSZ5L0YVpVBmeY+H_MjhL0k6SWwbcqw@mail.gmail.com>

On 04/22/2018 02:15 PM, Roopa Prabhu wrote:
> On Sun, Apr 22, 2018 at 11:54 AM, David Miller <davem@davemloft.net> wrote:
>> From: Johannes Berg <johannes@sipsolutions.net>
>> Date: Thu, 19 Apr 2018 17:26:57 +0200
>>
>>> On Thu, 2018-04-19 at 08:25 -0700, Ben Greear wrote:
>>>>
>>>> Maybe this could be in followup patches?  It's going to touch a lot of files,
>>>> and might be hell to get merged all at once, and I've never used spatch, so
>>>> just maybe someone else will volunteer that part :)
>>>
>>> I guess you'll have to ask davem. :)
>>
>> Well, first of all, I really don't like this.
>>
>> The first reason is that every time I see interface foo become foo2,
>> foo3 is never far behind it.
>>
>> If foo was not extensible enough such that we needed foo2, we beter
>> design the new thing with explicitly better extensibility in mind.
>>
>> Furthermore, what you want here is a specific filter.  Someone else
>> will want to filter on another criteria, and the next person will
>> want yet another.
>>
>> This needs to be properly generalized.
>>
>> And frankly if we had moved to ethtool netlink/devlink by now, we
>> could just add a netlink attribute for filtering and not even be
>> having this conversation.
>
>
> +1.
>
> Also, the RTM_GETSTATS api was added to improve stats query efficiency
> (with filters).
>  we should look at it  to see if this fits there. Keeping all stats
> queries in one place will help.

I like the ethtool API, so I'll be sticking with that for now.

Thanks,
Ben



-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH 1/3] ethtool: Support ETHTOOL_GSTATS2 command.
From: Ben Greear @ 2018-04-23 15:38 UTC (permalink / raw)
  To: David Miller, johannes-cdvu00un1VgdHxzADdlk8Q
  Cc: netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	ath10k-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r
In-Reply-To: <20180422.145420.1197041027922699603.davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>

On 04/22/2018 11:54 AM, David Miller wrote:
> From: Johannes Berg <johannes-cdvu00un1VgdHxzADdlk8Q@public.gmane.org>
> Date: Thu, 19 Apr 2018 17:26:57 +0200
>
>> On Thu, 2018-04-19 at 08:25 -0700, Ben Greear wrote:
>>>
>>> Maybe this could be in followup patches?  It's going to touch a lot of files,
>>> and might be hell to get merged all at once, and I've never used spatch, so
>>> just maybe someone else will volunteer that part :)
>>
>> I guess you'll have to ask davem. :)
>
> Well, first of all, I really don't like this.
>
> The first reason is that every time I see interface foo become foo2,
> foo3 is never far behind it.
>
> If foo was not extensible enough such that we needed foo2, we beter
> design the new thing with explicitly better extensibility in mind.
>
> Furthermore, what you want here is a specific filter.  Someone else
> will want to filter on another criteria, and the next person will
> want yet another.
>
> This needs to be properly generalized.
>
> And frankly if we had moved to ethtool netlink/devlink by now, we
> could just add a netlink attribute for filtering and not even be
> having this conversation.

Well, since there are un-defined flags, it would be simple enough to
extend the API further in the future (flag (1<<31) could mean expect
more input members, etc.  And, adding up to 30 more flags to filter on different
things won't change the API and should be backwards compatible.

But, if you don't want it, that is OK by me, I agree it is a fairly
obscure feature.  It would have saved me time if you had said you didn't
want it at the first RFC patch though...

Thanks,
Ben

-- 
Ben Greear <greearb-my8/4N5VtI7c+919tysfdA@public.gmane.org>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply

* Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
From: Michael S. Tsirkin @ 2018-04-23 15:25 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Matthew Wilcox, David Miller, Andrew Morton, linux-mm,
	eric.dumazet, edumazet, netdev, linux-kernel, jasowang,
	virtualization, dm-devel, Vlastimil Babka
In-Reply-To: <alpine.LRH.2.02.1804200817230.22382@file01.intranet.prod.int.rdu2.redhat.com>

On Fri, Apr 20, 2018 at 08:20:23AM -0400, Mikulas Patocka wrote:
> 
> 
> On Fri, 20 Apr 2018, Matthew Wilcox wrote:
> 
> > On Thu, Apr 19, 2018 at 12:12:38PM -0400, Mikulas Patocka wrote:
> > > Unfortunatelly, some kernel code has bugs - it uses kvmalloc and then
> > > uses DMA-API on the returned memory or frees it with kfree. Such bugs were
> > > found in the virtio-net driver, dm-integrity or RHEL7 powerpc-specific
> > > code.
> > 
> > Maybe it's time to have the SG code handle vmalloced pages?  This is
> > becoming more and more common with vmapped stacks (and some of our
> > workarounds are hideous -- allocate 4 bytes with kmalloc because we can't
> > DMA onto the stack any more?).  We already have a few places which do
> > handle sgs of vmalloced addresses, such as the nx crypto driver:
> > 
> >         if (is_vmalloc_addr(start_addr))
> >                 sg_addr = page_to_phys(vmalloc_to_page(start_addr))
> >                           + offset_in_page(sg_addr);
> >         else
> >                 sg_addr = __pa(sg_addr);
> > 
> > and videobuf:
> > 
> >                 pg = vmalloc_to_page(virt);
> >                 if (NULL == pg)
> >                         goto err;
> >                 BUG_ON(page_to_pfn(pg) >= (1 << (32 - PAGE_SHIFT)));
> >                 sg_set_page(&sglist[i], pg, PAGE_SIZE, 0);
> > 
> > Yes, there's the potential that we have to produce two SG entries for a
> > virtually contiguous region if it crosses a page boundary, and our APIs
> > aren't set up right to make it happen.  But this is something we should
> > consider fixing ... otherwise we'll end up with dozens of driver hacks.
> > The videobuf implementation was already copy-and-pasted into the saa7146
> > driver, for example.
> 
> What if the device requires physically contiguous area and the vmalloc 
> area crosses a page? Will you use a bounce buffer? Where do you allocate 
> the bounce buffer from? What if you run out of bounce buffers?
> 
> Mikulkas

I agree with Matthew here.

4 byte variables are typically size aligned so won't cross a boundary.

That's enough for virtio at least. People using structs can force
alignment.

We could wrap access in a macro (sizeof(x) >= alignof(x)) to help
guarantee that.

-- 
MST

^ permalink raw reply

* Re: [pci PATCH v8 0/4] Add support for unmanaged SR-IOV
From: Don Dutile @ 2018-04-23 15:21 UTC (permalink / raw)
  To: Bjorn Helgaas, Alexander Duyck
  Cc: bhelgaas, linux-pci, virtio-dev, kvm, netdev, dan.daly,
	linux-kernel, linux-nvme, keith.busch, netanel, mheyne,
	liang-min.wang, mark.d.rustad, dwmw2, hch, dwmw
In-Reply-To: <20180421203437.GW28657@bhelgaas-glaptop.roam.corp.google.com>

On 04/21/2018 04:34 PM, Bjorn Helgaas wrote:
> On Fri, Apr 20, 2018 at 12:28:08PM -0400, Alexander Duyck wrote:
>> This series is meant to add support for SR-IOV on devices when the VFs are
>> not managed by the kernel. Examples of recent patches attempting to do this
>> include:
>> virto - https://patchwork.kernel.org/patch/10241225/
>> pci-stub - https://patchwork.kernel.org/patch/10109935/
>> vfio - https://patchwork.kernel.org/patch/10103353/
>> uio - https://patchwork.kernel.org/patch/9974031/
>>
>> Since this is quickly blowing up into a multi-driver problem it is probably
>> best to implement this solution as generically as possible.
>>
>> This series is an attempt to do that. What we do with this patch set is
>> provide a generic framework to enable SR-IOV in the case that the PF driver
>> doesn't support managing the VFs itself.
>>
>> I based my patch set originally on the patch by Mark Rustad but there isn't
>> much left after going through and cleaning out the bits that were no longer
>> needed, and after incorporating the feedback from David Miller. At this point
>> the only items to be fully reused was his patch description which is now
>> present in patch 3 of the set.
>>
>> This solution is limited in scope to just adding support for devices that
>> provide no functionality for SR-IOV other than allocating the VFs by
>> calling pci_enable_sriov. Previous sets had included patches for VFIO, but
>> for now I am dropping that as the scope of that work is larger then I
>> think I can take on at this time.
>>
>> v2: Reduced scope back to just virtio_pci and vfio-pci
>>      Broke into 3 patch set from single patch
>>      Changed autoprobe behavior to always set when num_vfs is set non-zero
>> v3: Updated Documentation to clarify when sriov_unmanaged_autoprobe is used
>>      Wrapped vfio_pci_sriov_configure to fix build errors w/o SR-IOV in kernel
>> v4: Dropped vfio-pci patch
>>      Added ena and nvme to drivers now using pci_sriov_configure_unmanaged
>>      Dropped pci_disable_sriov call in virtio_pci to be consistent with ena
>> v5: Dropped sriov_unmanaged_autoprobe and pci_sriov_conifgure_unmanaged
>>      Added new patch that enables pci_sriov_configure_simple
>>      Updated drivers to use pci_sriov_configure_simple
>> v6: Defined pci_sriov_configure_simple as NULL when SR-IOV is not enabled
>>      Updated drivers to drop "#ifdef" checks for IOV
>>      Added pci-pf-stub as place for PF-only drivers to add support
>> v7: Dropped pci_id table explanation from pci-pf-stub driver
>>      Updated pci_sriov_configure_simple to drop need for err value
>>      Fixed comment explaining why pci_sriov_configure_simple is NULL
>> v8: Dropped virtio from the set, support to be added later after TC approval
>>
>> Cc: Mark Rustad <mark.d.rustad@intel.com>
>> Cc: Maximilian Heyne <mheyne@amazon.de>
>> Cc: Liang-Min Wang <liang-min.wang@intel.com>
>> Cc: David Woodhouse <dwmw@amazon.co.uk>
>>
>> ---
>>
>> Alexander Duyck (4):
>>        pci: Add pci_sriov_configure_simple for PFs that don't manage VF resources
>>        ena: Migrate over to unmanaged SR-IOV support
>>        nvme: Migrate over to unmanaged SR-IOV support
>>        pci-pf-stub: Add PF driver stub for PFs that function only to enable VFs
>>
>>
>>   drivers/net/ethernet/amazon/ena/ena_netdev.c |   28 -------------
>>   drivers/nvme/host/pci.c                      |   20 ----------
>>   drivers/pci/Kconfig                          |   12 ++++++
>>   drivers/pci/Makefile                         |    2 +
>>   drivers/pci/iov.c                            |   31 +++++++++++++++
>>   drivers/pci/pci-pf-stub.c                    |   54 ++++++++++++++++++++++++++
>>   include/linux/pci.h                          |    3 +
>>   include/linux/pci_ids.h                      |    2 +
>>   8 files changed, 106 insertions(+), 46 deletions(-)
>>   create mode 100644 drivers/pci/pci-pf-stub.c
> 
> I tentatively applied these to pci/virtualization-review.
> 
> The code changes look fine, but I want to flesh out the changelogs a
> little bit before merging them.
> 
> For example, I'm not sure what you mean by "devices where the PF is
> not capable of managing VF resources."
> 
I agree w/Bjorn's assessment of the changelog.
The VF's are (minimally) assigned via the pf-stub driver, so they are 'managed by the kernel'.
The security model is the same as the existing one, which was the issue we resolved in the previous set(s) of patches.

I am hoping that something like vfio will be used to deal with the VF ownership
and the reset mechanisms during assignement & de-assignment to 'guests' (qemu-kvm, DPDK, or whatever user-process),
so the known, existing security model(s) is(are) maintained as well.
If so, it'd be good to add such verbage somewhere (as 0/n is not kept in anything but possibly Bjorn's patchwork, or whatever patch mgmt tool he uses, and future reference would be good to have) say, an update to Documentation/PCI/pci-iov-howto.txt.

So... the 'unmanaged SR-IOV' Subject, IMO, is not a valid Subject for the patch series any longer.

No objections to the patch series, as Bjorn noted, just the commit log(s)/nomenclature of what is really being done.
The expectation of VF enablement via the PF was born out of the fairly complicated, and unique PF vs VF drivers of the first implementations, which AlexD knows so well.  This "VFs act just like PFs without SRIOV capabilities" support is what this patch set enables with a much lighter configuration mechanism.
So, maybe the patch set ought to be 'lightweight SRIOV enablement'.

--dd

> It *sounds* like you're saying the hardware works differently on some
> devices, but I don't think that's what you mean.  I think you're
> saying something about which drivers are used for the PF and the VF.
> 
> I think a trivial example of how this will be used might help.  I
> assume this involves a virtualization scenario where the host uses the
> PF to enable several VFs, but the host doesn't use the PF for much
> else.  Then you assign the VFs to guests, and drivers in the guest
> OSes use the VFs.
> 
> Since .sriov_configure() is only used by sriov_numvfs_store(), I
> assume the usage model involves writing to the sysfs sriov_numvfs
> attribute to enable the VFs, then assigning them to guests?
> 
> Bjorn
> 

^ permalink raw reply

* Fw: [Bug 199469] New: Regression in 32-bit-compat dev_ioctl due to bf4405737f9f85a06db2b0ce5d76a818b61992e2
From: Stephen Hemminger @ 2018-04-23 15:19 UTC (permalink / raw)
  To: netdev

I think this has already been addressed. But forwarding for sure.

Begin forwarded message:

Date: Mon, 23 Apr 2018 01:26:46 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 199469] New: Regression in 32-bit-compat dev_ioctl due to bf4405737f9f85a06db2b0ce5d76a818b61992e2


https://bugzilla.kernel.org/show_bug.cgi?id=199469

            Bug ID: 199469
           Summary: Regression in 32-bit-compat dev_ioctl due to
                    bf4405737f9f85a06db2b0ce5d76a818b61992e2
           Product: Networking
           Version: 2.5
    Kernel Version: 4.16.0-0.rc4.git0.1.fc28.x86_64
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: stephen@networkplumber.org
          Reporter: robert@ocallahan.org
        Regression: No

Created attachment 275501
  --> https://bugzilla.kernel.org/attachment.cgi?id=275501&action=edit  
Test program

The commit says

    Once upon a time net/socket.c:dev_ifsioc() used to handle SIOCSHWTSTAMP and
    SIOCSIFMAP.  These have different native and compat layout, so the format
    conversion had been needed.  In 2009 these two cases had been taken out,
    turning the rest into a convoluted way to calling sock_do_ioctl().  We copy
    compat structure into native one, call sock_do_ioctl() on that and copy
    the result back for the in/out ioctls.  No layout transformation anywhere,
    so we might as well just call sock_do_ioctl() and skip all the headache
with
    copying.

However there is one problem: 32-bit 'struct ifreq' and 64-bit 'struct ifreq'
are not the same size. The former is 32 bytes and the latter is 40 bytes. Thus,
if you place a 32-bit 'struct ifreq' immediately before an unmapped page and
try to pass it to one of these ioctls, the syscall fails with EFAULT due to
this commit.

Steps to reproduce:
Copy attached file to /tmp/test.c, then:
[roc@localhost-live ~]$ gcc -o /tmp/test /tmp/test.c && /tmp/test
Index: 1
[roc@localhost-live ~]$ gcc -m32 -o /tmp/test /tmp/test.c && /tmp/test
Failed SIOCGIFINDEX: Bad address

-- 
You are receiving this mail because:
You are the assignee for the bug.

^ permalink raw reply

* Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
From: Michal Hocko @ 2018-04-23 15:15 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Matthew Wilcox, David Miller, Andrew Morton, linux-mm,
	eric.dumazet, edumazet, netdev, linux-kernel, mst, jasowang,
	virtualization, dm-devel, Vlastimil Babka
In-Reply-To: <alpine.LRH.2.02.1804221733520.7995@file01.intranet.prod.int.rdu2.redhat.com>

On Mon 23-04-18 10:06:08, Mikulas Patocka wrote:
> 
> 
> On Sat, 21 Apr 2018, Matthew Wilcox wrote:
> 
> > On Fri, Apr 20, 2018 at 05:21:26PM -0400, Mikulas Patocka wrote:
> > > On Fri, 20 Apr 2018, Matthew Wilcox wrote:
> > > > On Fri, Apr 20, 2018 at 04:54:53PM -0400, Mikulas Patocka wrote:
> > > > > On Fri, 20 Apr 2018, Michal Hocko wrote:
> > > > > > No way. This is just wrong! First of all, you will explode most likely
> > > > > > on many allocations of small sizes. Second, CONFIG_DEBUG_VM tends to be
> > > > > > enabled quite often.
> > > > > 
> > > > > You're an evil person who doesn't want to fix bugs.
> > > > 
> > > > Steady on.  There's no need for that.  Michal isn't evil.  Please
> > > > apologise.
> > > 
> > > I see this attitude from Michal again and again.
> > 
> > Fine; then *say that*.  I also see Michal saying "No" a lot.  Sometimes
> > I agree with him, sometimes I don't.  I think he genuinely wants the best
> > code in the kernel, and saying "No" is part of it.
> > 
> > > He didn't want to fix vmalloc(GFP_NOIO)
> > 
> > I don't remember that conversation, so I don't know whether I agree with
> > his reasoning or not.  But we are supposed to be moving away from GFP_NOIO
> > towards marking regions with memalloc_noio_save() / restore.  If you do
> > that, you won't need vmalloc(GFP_NOIO).
> 
> He said the same thing a year ago. And there was small progress. 6 out of 
> 27 __vmalloc calls were converted to memalloc_noio_save in a year - 5 in 
> infiniband and 1 in btrfs. (the whole discussion is here 
> http://lkml.iu.edu/hypermail/linux/kernel/1706.3/04681.html )

Well this is not that easy. It requires a cooperation from maintainers.
I can only do as much. I've posted patches in the past and actively
bringing up this topic at LSFMM last two years...

> He refuses 15-line patch to fix GFP_NOIO bug because he believes that in 4 
> years, the kernel will be refactored and GFP_NOIO will be eliminated. Why 
> does he have veto over this part of the code? I'd much rather argue with 
> people who have constructive comments about fixing bugs than with him.

I didn't NACK the patch AFAIR. I've said it is not a good idea longterm.
I would be much more willing to change my mind if you would back your
patch by a real bug report. Hacks are acceptable when we have a real
issue in hands. But if we want to fix potential issue then better make
it properly.

[...]

> I sent the CONFIG_DEBUG_SG patch before (I wonder why he didn't repond to 
> it). I'll send a third version of the patch that actually randomly chooses 
> between kmalloc and vmalloc, because some abuses can only be detected with 
> kmalloc and we should test both.
> 
> For bisecting, it is better to always fallback to vmalloc, but for general 
> testing, it is better to test both branches.

Agreed!

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* Re: [PATCH for-rc] uapi: Fix SPDX tags for files referring to the 'OpenIB.org' license
From: Doug Ledford @ 2018-04-23 15:11 UTC (permalink / raw)
  To: David Miller, jgg
  Cc: linux-rdma, kstewart, pombredanne, gregkh, tglx, swinslow,
	santosh.shilimkar, netdev, linux-kernel, davejwatson
In-Reply-To: <20180422.210823.1454088204642743739.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 942 bytes --]

On Sun, 2018-04-22 at 21:08 -0400, David Miller wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> Date: Fri, 20 Apr 2018 09:49:10 -0600
> 
> > Based on discussion with Kate Stewart this license is not a
> > BSD-2-Clause, but is now formally identified as Linux-OpenIB
> > by SPDX.
> > 
> > The key difference between the licenses is in the 'warranty'
> > paragraph.
> > 
> > if_infiniband.h refers to the 'OpenIB.org' license, but
> > does not include the text, instead it links to an obsolete
> > web site that contains a license that matches the BSD-2-Clause
> > SPX. There is no 'three clause' version of the OpenIB.org
> > license.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> 
> Acked-by: David S. Miller <davem@davemloft.net>
> 

Thanks, applied.

-- 
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
From: Michal Hocko @ 2018-04-23 15:10 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Matthew Wilcox, David Miller, Andrew Morton, linux-mm,
	eric.dumazet, edumazet, bhutchings, netdev, linux-kernel, mst,
	jasowang, virtualization, dm-devel, Vlastimil Babka
In-Reply-To: <alpine.LRH.2.02.1804231006440.22488@file01.intranet.prod.int.rdu2.redhat.com>

On Mon 23-04-18 10:24:02, Mikulas Patocka wrote:
[...]

I am not going to comment on your continuous accusations. We can discuss
patches but general rants make very limited sense.
 
> > > > I already said that we can change it from CONFIG_DEBUG_VM to 
> > > > CONFIG_DEBUG_SG - or to whatever other option you may want, just to make 
> > > > sure that it is enabled in distro debug kernels by default.
> > > 
> > > Yes, and I think that's the right idea.  So send a v2 and ignore the
> > > replies that are clearly relating to an earlier version of the patch.
> > > Not everybody reads every mail in the thread before responding to one they
> > > find interesting.  Yes, ideally, one would, but sometimes one doesn't.
> > 
> > And look here. This is yet another ad-hoc idea. We have many users of
> > kvmalloc which have no relation to SG, yet you are going to control
> > their behavior by CONFIG_DEBUG_SG? No way! (yeah evil again)
> 
> Why aren't you constructive and pick up pick up the CONFIG flag?

Because config doesn't make that much of a sense. You do not want a
permanent vmalloc fallback unconditionally. Debugging option which
changes the behavior completely is not useful IMNHO. Besides that who is
going to enable it?

> > Really, we have a fault injection framework and this sounds like
> > something to hook in there.
> 
> The testing people won't set it up. They install the "kernel-debug" 
> package and run the tests in it.
> 
> If you introduce a hidden option that no one knows about, no one will use 
> it.

then make sure people know about it. Fuzzers already do test fault
injections.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* Re: WARNING: suspicious RCU usage in rt6_check_expired
From: David Ahern @ 2018-04-23 15:10 UTC (permalink / raw)
  To: Eric Dumazet, syzbot, davem, kuznet, linux-kernel, netdev,
	syzkaller-bugs
In-Reply-To: <a0eb8d30-0f76-fad2-e9fa-5de92f59b2ff@gmail.com>

On 4/23/18 7:31 AM, Eric Dumazet wrote:
>> stack backtrace:
>> CPU: 1 PID: 25958 Comm: syz-executor7 Not tainted 4.16.0+ #11
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x1b9/0x294 lib/dump_stack.c:113
>>  lockdep_rcu_suspicious+0x14a/0x153 kernel/locking/lockdep.c:4592
>>  rt6_check_expired+0x38b/0x3e0 net/ipv6/route.c:410
>>  ip6_negative_advice+0x67/0xc0 net/ipv6/route.c:2204
>>  dst_negative_advice include/net/sock.h:1786 [inline]
>>  sock_setsockopt+0x138f/0x1fe0 net/core/sock.c:1051
>>  __sys_setsockopt+0x2df/0x390 net/socket.c:1899
>>  SYSC_setsockopt net/socket.c:1914 [inline]
>>  SyS_setsockopt+0x34/0x50 net/socket.c:1911
>>  do_syscall_64+0x29e/0x9d0 arch/x86/entry/common.c:287
>>  entry_SYSCALL_64_after_hwframe+0x42/0xb7

...

> 
> Added in commit a68886a691804d3f6d479ebf6825480fbafb6a00
> ("net/ipv6: Make from in rt6_info rcu protected")
> 

I missed the rcu_read_lock for this path. Will send a patch.

^ permalink raw reply

* Re: KMSAN: uninit-value in strnlen
From: Guillaume Nault @ 2018-04-23 15:01 UTC (permalink / raw)
  To: syzbot; +Cc: linux-kernel, mostrows, netdev, syzkaller-bugs
In-Reply-To: <00000000000003f2d5056a7fbf92@google.com>

On Mon, Apr 23, 2018 at 01:23:01AM -0700, syzbot wrote:
> Hello,
> 
> syzbot hit the following crash on https://github.com/google/kmsan.git/master
> commit
> a7f95e9c8a95e9fbb388c3999b61a17667cd3bbe (Sat Apr 21 13:50:22 2018 +0000)
> kmsan: disable assembly checksums
> syzbot dashboard link:
> https://syzkaller.appspot.com/bug?extid=cd06c321e7147d03a65e
> 
> So far this crash happened 5 times on
> https://github.com/google/kmsan.git/master.
> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5785171018121216
> syzkaller reproducer:
> https://syzkaller.appspot.com/x/repro.syz?id=5117671628603392
> Raw console output:
> https://syzkaller.appspot.com/x/log.txt?id=6310764688179200
> Kernel config: https://syzkaller.appspot.com/x/.config?id=328654897048964367
> compiler: clang version 7.0.0 (trunk 329391)
> 
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: syzbot+cd06c321e7147d03a65e@syzkaller.appspotmail.com
> It will help syzbot understand when the bug is fixed. See footer for
> details.
> If you forward the report, please keep this part and the footer.
> 
> ==================================================================
> BUG: KMSAN: uninit-value in strnlen+0xc4/0x110 lib/string.c:499
> CPU: 1 PID: 4507 Comm: syzkaller579712 Not tainted 4.16.0+ #85
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> Google 01/01/2011
> Call Trace:
>  __dump_stack lib/dump_stack.c:17 [inline]
>  dump_stack+0x185/0x1d0 lib/dump_stack.c:53
>  kmsan_report+0x142/0x240 mm/kmsan/kmsan.c:1067
>  __msan_warning_32+0x6c/0xb0 mm/kmsan/kmsan_instr.c:683
>  strnlen+0xc4/0x110 lib/string.c:499
>  dev_name_hash net/core/dev.c:209 [inline]
>  dev_get_by_name_rcu net/core/dev.c:764 [inline]
>  dev_get_by_name+0x6e/0x350 net/core/dev.c:791
>  pppoe_connect+0xcb7/0x2360 drivers/net/ppp/pppoe.c:665
>  SYSC_connect+0x41a/0x510 net/socket.c:1639
>  SyS_connect+0x54/0x80 net/socket.c:1620
>  do_syscall_64+0x309/0x430 arch/x86/entry/common.c:287
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> RIP: 0033:0x43fcf9
> RSP: 002b:00007ffca4bd4978 EFLAGS: 00000213 ORIG_RAX: 000000000000002a
> RAX: ffffffffffffffda RBX: 00000000004002c8 RCX: 000000000043fcf9
> RDX: 0000000000000007 RSI: 0000000020000040 RDI: 0000000000000003
> RBP: 00000000006ca018 R08: 00000000004002c8 R09: 00000000004002c8
> R10: 00000000004002c8 R11: 0000000000000213 R12: 0000000000401620
> R13: 00000000004016b0 R14: 0000000000000000 R15: 0000000000000000
> 
> Local variable description: ----address@SYSC_connect
> Variable was created at:
>  SYSC_connect+0x6f/0x510 net/socket.c:1622
>  SyS_connect+0x54/0x80 net/socket.c:1620
> ==================================================================
> 
That's a consequence of not validating sockaddr_len. The sockaddr_pppox
structure was incomplete.

#syz dup: KMSAN: uninit-value in pppoe_connect

^ permalink raw reply

* [PATCH net] pppoe: check sockaddr length in pppoe_connect()
From: Guillaume Nault @ 2018-04-23 14:38 UTC (permalink / raw)
  To: netdev; +Cc: Michal Ostrowski

We must validate sockaddr_len, otherwise userspace can pass fewer data
than we expect and we end up accessing invalid data.

Fixes: 224cf5ad14c0 ("ppp: Move the PPP drivers")
Reported-by: syzbot+4f03bdf92fdf9ef5ddab@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 drivers/net/ppp/pppoe.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 1483bc7b01e1..7df07337d69c 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -620,6 +620,10 @@ static int pppoe_connect(struct socket *sock, struct sockaddr *uservaddr,
 	lock_sock(sk);
 
 	error = -EINVAL;
+
+	if (sockaddr_len != sizeof(struct sockaddr_pppox))
+		goto end;
+
 	if (sp->sa_protocol != PX_PROTO_OE)
 		goto end;
 
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH net] net: ethtool: Add missing kernel doc for FEC parameters
From: David Miller @ 2018-04-23 14:28 UTC (permalink / raw)
  To: f.fainelli; +Cc: netdev, vidya.chowdary, dustin, roopa
In-Reply-To: <20180421231848.17450-1-f.fainelli@gmail.com>

From: Florian Fainelli <f.fainelli@gmail.com>
Date: Sat, 21 Apr 2018 16:18:48 -0700

> + * @sec_fecparam: Set the network device Forward Error Correction

As Sergei pointed out this should be "set_fecparam"

^ permalink raw reply

* Re: [PATCH] kvmalloc: always use vmalloc if CONFIG_DEBUG_VM
From: Mikulas Patocka @ 2018-04-23 14:24 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Matthew Wilcox, David Miller, Andrew Morton, linux-mm,
	eric.dumazet, edumazet, bhutchings, netdev, linux-kernel, mst,
	jasowang, virtualization, dm-devel, Vlastimil Babka
In-Reply-To: <20180422130356.GG17484@dhcp22.suse.cz>



On Sun, 22 Apr 2018, Michal Hocko wrote:

> On Sat 21-04-18 07:47:57, Matthew Wilcox wrote:
> 
> > > He didn't want to fix vmalloc(GFP_NOIO)
> > 
> > I don't remember that conversation, so I don't know whether I agree with
> > his reasoning or not.  But we are supposed to be moving away from GFP_NOIO
> > towards marking regions with memalloc_noio_save() / restore.  If you do
> > that, you won't need vmalloc(GFP_NOIO).
> 
> It was basically to detect GFP_NOIO context _inside_ vmalloc and use the
> scope API to enforce it there. Does it solve potential problems? Yes it
> does. Does it solve any existing report, no I am not aware of any. Is
> it a good fix longterm? Absolutely no, because the scope API should be
> used _at the place_ where the scope starts rather than a random utility
> function. If we are going the easier way now, we will never teach users
> to use the API properly. And I am willing to risk to keep a broken
> code which we have for years rather than allow a random hack that will
> seemingly fix it.
> 
> Btw. I was pretty much explicit with this reasoning when rejecting the
> patch. Do you still call that evil?

You are making nonsensical excuses.

That patch doesn't prevent you from refactoring the kernel and eliminating 
GFP_NOIO in the long term. You can apply the patch and then continue 
working on GFP_NOIO refactoring as well as before.

> > > he didn't want to fix alloc_pages sleeping when __GFP_NORETRY is used.
> > 
> > The GFP flags are a mess, still.
> 
> I do not remember that one but __GFP_NORETRY is _allowed_ to sleep. And
> yes I do _agree_ gfp flags are a mess which is really hard to get fixed
> because they are lacking a good design from the very beginning. Fixing
> some of those issues today is a completely PITA.

It may sleep, but if it sleeps regularly, it slows down swapping (because 
the swapping code does mempool_alloc and mempool_alloc does __GFP_NORETRY 
allocation). And there were two INTENTIONAL sleeps with schedule_timeout. 
You removed one and left the other, claiming that __GFP_NORETRY allocation 
should sleep.

> > > I already said that we can change it from CONFIG_DEBUG_VM to 
> > > CONFIG_DEBUG_SG - or to whatever other option you may want, just to make 
> > > sure that it is enabled in distro debug kernels by default.
> > 
> > Yes, and I think that's the right idea.  So send a v2 and ignore the
> > replies that are clearly relating to an earlier version of the patch.
> > Not everybody reads every mail in the thread before responding to one they
> > find interesting.  Yes, ideally, one would, but sometimes one doesn't.
> 
> And look here. This is yet another ad-hoc idea. We have many users of
> kvmalloc which have no relation to SG, yet you are going to control
> their behavior by CONFIG_DEBUG_SG? No way! (yeah evil again)

Why aren't you constructive and pick up pick up the CONFIG flag?

> Really, we have a fault injection framework and this sounds like
> something to hook in there.

The testing people won't set it up. They install the "kernel-debug" 
package and run the tests in it.

If you introduce a hidden option that no one knows about, no one will use 
it.

> -- 
> Michal Hocko
> SUSE Labs

Mikulas

^ permalink raw reply

* Re: [PATCH net-next v2 0/2] fib rules extack support
From: David Miller @ 2018-04-23 14:21 UTC (permalink / raw)
  To: roopa; +Cc: netdev, dsa, idosch
In-Reply-To: <1524328891-3647-1-git-send-email-roopa@cumulusnetworks.com>

From: Roopa Prabhu <roopa@cumulusnetworks.com>
Date: Sat, 21 Apr 2018 09:41:29 -0700

> From: Roopa Prabhu <roopa@cumulusnetworks.com>
> 
> First patch refactors code to move fib rule netlink handling
> into a common function. This became obvious when adding
> duplicate extack msgs in add and del paths. Second patch
> adds extack msgs.
> 
> v2 - Dropped the ip route get support and selftests from
>      the series to look at the input path some more (as pointed
>      out by ido). Will come back to that next week when i have
>      some time. resending just the extack part for now.

Series applied, but I was really looking forward to having those
nice test cases in the tree.

Thanks.

^ permalink raw reply

* [PATCH net] l2tp: check sockaddr length in pppol2tp_connect()
From: Guillaume Nault @ 2018-04-23 14:15 UTC (permalink / raw)
  To: netdev; +Cc: James Chapman

Check sockaddr_len before dereferencing sp->sa_protocol, to ensure that
it actually points to valid data.

Fixes: fd558d186df2 ("l2tp: Split pppol2tp patch into separate l2tp and ppp parts")
Reported-by: syzbot+a70ac890b23b1bf29f5c@syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault@alphalink.fr>
---
 net/l2tp/l2tp_ppp.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/net/l2tp/l2tp_ppp.c b/net/l2tp/l2tp_ppp.c
index 1404bc1c1bb7..1fd9e145076a 100644
--- a/net/l2tp/l2tp_ppp.c
+++ b/net/l2tp/l2tp_ppp.c
@@ -619,6 +619,13 @@ static int pppol2tp_connect(struct socket *sock, struct sockaddr *uservaddr,
 	lock_sock(sk);
 
 	error = -EINVAL;
+
+	if (sockaddr_len != sizeof(struct sockaddr_pppol2tp) &&
+	    sockaddr_len != sizeof(struct sockaddr_pppol2tpv3) &&
+	    sockaddr_len != sizeof(struct sockaddr_pppol2tpin6) &&
+	    sockaddr_len != sizeof(struct sockaddr_pppol2tpv3in6))
+		goto end;
+
 	if (sp->sa_protocol != PX_PROTO_OL2TP)
 		goto end;
 
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH net-next 1/1] tc-testing: updated ife test cases
From: David Miller @ 2018-04-23 14:13 UTC (permalink / raw)
  To: mrv; +Cc: netdev, kernel, jhs, xiyou.wangcong, jiri
In-Reply-To: <1524283011-13822-1-git-send-email-mrv@mojatatu.com>

From: Roman Mashak <mrv@mojatatu.com>
Date: Fri, 20 Apr 2018 23:56:51 -0400

> Signed-off-by: Roman Mashak <mrv@mojatatu.com>

Applied.

^ permalink raw reply

* Re: [bpf-next PATCH 3/3] bpf: add sample program to trace map events
From: Sebastiano Miano @ 2018-04-23 14:08 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, Alexei Starovoitov
  Cc: netdev, ast, daniel, mingo, rostedt, fulvio.risso,
	David S. Miller
In-Reply-To: <20180420114711.1a06fb26@redhat.com>

On 20/04/2018 11:47, Jesper Dangaard Brouer wrote:
> On Thu, 19 Apr 2018 17:27:37 -0700
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> 
>> On Wed, Apr 18, 2018 at 05:30:59PM +0200, Sebastiano Miano wrote:
>>> This patch adds a sample program, called trace_map_events,
>>> that shows how to capture map events and filter them based on
>>> the map id.
>> ...
>>> +struct bpf_map_keyval_ctx {
>>> +	u64 pad;		// First 8 bytes are not accessible by bpf code
>>> +	u32 type;		// offset:8;	size:4;	signed:0;
>>> +	u32 key_len;		// offset:12;	size:4;	signed:0;
>>> +	u32 key;		// offset:16;	size:4;	signed:0;
>>> +	bool key_trunc;		// offset:20;	size:1;	signed:0;
>>> +	u32 val_len;		// offset:24;	size:4;	signed:0;
>>> +	u32 val;		// offset:28;	size:4;	signed:0;
>>> +	bool val_trunc;		// offset:32;	size:1;	signed:0;
>>> +	int ufd;		// offset:36;	size:4;	signed:1;
>>> +	u32 id;			// offset:40;	size:4;	signed:0;
>>> +};
>>> +
>>> +SEC("tracepoint/bpf/bpf_map_lookup_elem")
>>> +int trace_bpf_map_lookup_elem(struct bpf_map_keyval_ctx *ctx)
>>> +{
>>> +	struct map_event_data data;
>>> +	int cpu = bpf_get_smp_processor_id();
>>> +	bool *filter;
>>> +	u32 key = 0, map_id = ctx->id;
>>> +
>>> +	filter = bpf_map_lookup_elem(&filter_events, &key);
>>> +	if (!filter)
>>> +		return 1;
>>> +
>>> +	if (!*filter)
>>> +		goto send_event;
>>> +
>>> +	/*
>>> +	 * If the map_id is not in the list of filtered
>>> +	 * ids we immediately return
>>> +	 */
>>> +	if (!bpf_map_lookup_elem(&filtered_ids, &map_id))
>>> +		return 0;
>>> +
>>> +send_event:
>>> +	data.map_id = map_id;
>>> +	data.evnt_type = MAP_LOOKUP;
>>> +	data.map_type = ctx->type;
>>> +
>>> +	bpf_perf_event_output(ctx, &map_event_trace, cpu, &data, sizeof(data));
>>> +	return 0;
>>> +}
>>
>> looks like the purpose of the series is to create map notify mechanism
>> so some external user space daemon can snoop all bpf map operations
>> that all other processes and bpf programs are doing.
>> I think it would be way better to create a proper mechanism for that
>> with permissions.
>>
>> tracepoints in the bpf core were introduced as introspection mechanism
>> for debugging. Right now we have better ways to do introspection
>> with ids, queries, etc that bpftool is using, so original purpose of
>> those tracepoints is gone and they actually rot.
>> Let's not repurpose them into this map notify logic.
>> I don't want tracepoints in the bpf core to become a stable interface
>> it will stiffen the development.
> 
> Well, I need it exactly for introspection and debugging, and just need
> the missing ID (as it was introduced later).
> 
> Can we just drop the sample program then? I would likely not use the
> sample program, because it is missing the PID+comm-name.

I agree with Jesper. We could drop this sample program but, do you think 
it is worth spending some time to create another sample program that 
shows the correct or suggested way to do it?

I believe it would be useful to have a program for this since I found it 
difficult to understand how to do it with the current 
documentation/samples and it seems I was not the only one hitting this 
problem.

> 
> For my use, I can simply use perf record to debug what programs are
> changing the map ID:
> 
>    perf record -e bpf:bpf_map_* -a --filter 'id == 2'
> 
> This should be a fairly common troubleshooting step.  I want to
> figure-out if anybody else (another userspace program) is changing my
> map. This can easily be caused by two userspace control programs
> running at the same time. Sysadm/scripts made a mistake and started two
> instances. Without the map ID, I cannot know what map perf is talking
> about...
> 

That's in fact the real use case for the first two patches. Since bpf 
tracepoints are still a rather common (and easy to use) troubleshooting 
and monitoring tool why shouldn't we "enhance" their support with the 
newly added map/prog IDs?

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox