Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH net-next v1 00/11] amd-xgbe: AMD XGBE driver updates 2016-11-10
From: David Miller @ 2016-11-13  5:56 UTC (permalink / raw)
  To: thomas.lendacky; +Cc: netdev, f.fainelli
In-Reply-To: <20161110230916.20704.34081.stgit@tlendack-t1.amdoffice.net>

From: Tom Lendacky <thomas.lendacky@amd.com>
Date: Thu, 10 Nov 2016 17:09:17 -0600

> This patch series is targeted at adding support for a new PCI version
> of the hardware. As part of the new PCI device, there is a new PCS/PHY
> interaction, ECC support, I2C sideband communication, SFP+ support and
> more.
> 
> The following updates and fixes are included in this driver update series:
> 
> - Hardware workaround for possible incorrectly generated interrupts
>   during software reset
> - Hardware workaround for Tx timestamp register access order
> - Add support for a PCI version of the device
> - Increase the Rx queue limit to take advantage of the increased number
>   of DMA channels that might be available
> - Add support for a new DMA channel interrupt mode
> - Add ECC support for the device memory
> - Add support for using the integrated I2C controller for sideband
>   communication
> - Expose the phylib phy_aneg_done() function so it can be called by the
>   driver
> - Add support for SFP+ modules
> - Add support for MDIO attached PHYs
> - Add support for KR re-driver between the PCS/SerDes and an external
>   PHY
> 
> This patch series is based on net-next.

Series applied, thanks Tom.

^ permalink raw reply

* Re: [PATCH] net: bpqether.h: remove if_ether.h guard
From: David Miller @ 2016-11-13  5:58 UTC (permalink / raw)
  To: baruch; +Cc: ralf, linux-hams, netdev
In-Reply-To: <da8a959ec7a821eb179e5e5b40eff669daf9a338.1478776902.git.baruch@tkos.co.il>

From: Baruch Siach <baruch@tkos.co.il>
Date: Thu, 10 Nov 2016 13:21:42 +0200

> __LINUX_IF_ETHER_H is not defined anywhere, and if_ether.h can keep itself from
> double inclusion, though it uses a single underscore prefix.
> 
> Signed-off-by: Baruch Siach <baruch@tkos.co.il>

Applied.

^ permalink raw reply

* Re: [PATCH iproute2 1/1] tc: print raw qdisc handle.
From: Stephen Hemminger @ 2016-11-13  6:53 UTC (permalink / raw)
  To: Roman Mashak; +Cc: netdev, jhs
In-Reply-To: <1478790392-13028-1-git-send-email-mrv@mojatatu.com>

On Thu, 10 Nov 2016 10:06:32 -0500
Roman Mashak <mrv@mojatatu.com> wrote:

> Resending patch after fixing indentation and spaces.
> 
> Signed-off-by: Roman Mashak <mrv@mojatatu.com>
> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com>

iproute2 uses kernel coding style. This patch fails basic checkpatch.


WARNING: suspect code indent for conditional statements (8, 11)
#31: FILE: tc/tc_qdisc.c:241:
+	if (!show_raw) {
+	   fprintf(fp, "qdisc %s %x: ", rta_getattr_str(tb[TCA_KIND]),

total: 0 errors, 1 warnings, 15 lines checked

^ permalink raw reply

* Re: [iproute PATCH v2 1/2] include: Add linux/sctp.h
From: Stephen Hemminger @ 2016-11-13  7:00 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netdev
In-Reply-To: <20161109111224.3946-2-phil@nwl.cc>

On Wed,  9 Nov 2016 12:12:23 +0100
Phil Sutter <phil@nwl.cc> wrote:

> Add sanitized UAPI linux/sctp.h header file.
> 
> Signed-off-by: Phil Sutter <phil@nwl.cc>

Applied both patches.

^ permalink raw reply

* Re: [net-next 1/2] net: ethernet: slicoss: add slicoss gigabit ethernet driver
From: kbuild test robot @ 2016-11-13  7:13 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: devel, gregkh, linux-kernel, liodot, Lino Sanfilippo, kbuild-all,
	charrer, netdev, davem
In-Reply-To: <1479012453-19410-2-git-send-email-LinoSanfilippo@gmx.de>

[-- Attachment #1: Type: text/plain, Size: 3669 bytes --]

Hi Lino,

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Lino-Sanfilippo/net-ethernet-slicoss-add-slicoss-gigabit-ethernet-driver/20161113-125131
config: alpha-allyesconfig (attached as .config)
compiler: alpha-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=alpha 

All warnings (new ones prefixed by >>):

   drivers/staging/slicoss/slicoss.c: In function 'slic_cmdq_addcmdpage':
>> drivers/staging/slicoss/slicoss.c:1258:2: warning: 'virt_to_bus' is deprecated [-Wdeprecated-declarations]
     phys_addr = virt_to_bus((void *)page);
     ^~~~~~~~~
   In file included from include/linux/io.h:25:0,
                    from include/linux/irq.h:24,
                    from include/asm-generic/hardirq.h:12,
                    from arch/alpha/include/asm/hardirq.h:7,
                    from include/linux/hardirq.h:8,
                    from include/linux/interrupt.h:12,
                    from drivers/staging/slicoss/slicoss.c:71:
   arch/alpha/include/asm/io.h:114:42: note: declared here
    static inline unsigned long __deprecated virt_to_bus(void *address)
                                             ^~~~~~~~~~~

vim +/virt_to_bus +1258 drivers/staging/slicoss/slicoss.c

4d6ea9c3 Denis Kirjanov     2010-07-10  1242  	struct slic_hostcmd *cmd;
4d6ea9c3 Denis Kirjanov     2010-07-10  1243  	struct slic_hostcmd *prev;
4d6ea9c3 Denis Kirjanov     2010-07-10  1244  	struct slic_hostcmd *tail;
4d6ea9c3 Denis Kirjanov     2010-07-10  1245  	struct slic_cmdqueue *cmdq;
4d6ea9c3 Denis Kirjanov     2010-07-10  1246  	int cmdcnt;
4d6ea9c3 Denis Kirjanov     2010-07-10  1247  	void *cmdaddr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1248  	ulong phys_addr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1249  	u32 phys_addrl;
4d6ea9c3 Denis Kirjanov     2010-07-10  1250  	u32 phys_addrh;
4d6ea9c3 Denis Kirjanov     2010-07-10  1251  	struct slic_handle *pslic_handle;
eafe6002 David Matlack      2015-05-11  1252  	unsigned long flags;
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1253  
4d6ea9c3 Denis Kirjanov     2010-07-10  1254  	cmdaddr = page;
dd146d21 Shraddha Barke     2015-10-15  1255  	cmd = cmdaddr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1256  	cmdcnt = 0;
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1257  
4d6ea9c3 Denis Kirjanov     2010-07-10 @1258  	phys_addr = virt_to_bus((void *)page);
4d6ea9c3 Denis Kirjanov     2010-07-10  1259  	phys_addrl = SLIC_GET_ADDR_LOW(phys_addr);
4d6ea9c3 Denis Kirjanov     2010-07-10  1260  	phys_addrh = SLIC_GET_ADDR_HIGH(phys_addr);
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1261  
4d6ea9c3 Denis Kirjanov     2010-07-10  1262  	prev = NULL;
4d6ea9c3 Denis Kirjanov     2010-07-10  1263  	tail = cmd;
4d6ea9c3 Denis Kirjanov     2010-07-10  1264  	while ((cmdcnt < SLIC_CMDQ_CMDSINPAGE) &&
4d6ea9c3 Denis Kirjanov     2010-07-10  1265  	       (adapter->slic_handle_ix < 256)) {
4d6ea9c3 Denis Kirjanov     2010-07-10  1266  		/* Allocate and initialize a SLIC_HANDLE for this command */

:::::: The code at line 1258 was first introduced by commit
:::::: 4d6ea9c3223da8d8dc91b369087fa40cc53edd36 Staging: slicoss: kill functions prototypes and reorder functions

:::::: TO: Denis Kirjanov <dkirjanov@hera.kernel.org>
:::::: CC: Greg Kroah-Hartman <gregkh@suse.de>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 47959 bytes --]

[-- Attachment #3: Type: text/plain, Size: 169 bytes --]

_______________________________________________
devel mailing list
devel@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel

^ permalink raw reply

* Re: [PATCH] genetlink: fix unsigned int comparison with less than zero
From: Johannes Berg @ 2016-11-13  7:31 UTC (permalink / raw)
  To: Cong Wang
  Cc: Colin King, David S . Miller, pravin shelar, Wei Yongjun,
	Florian Westphal, Tycho Andersen, Tom Herbert,
	Linux Kernel Network Developers, LKML
In-Reply-To: <CAM_iQpUw-PYmksL2StciJiM-8C+y3w9T_Vo4=pVKnjQzTwKvLg@mail.gmail.com>


> > I suppose it could be, since family IDs are allocated in a 16-bit
> > range
> > anyway. But family IDs can also never actually be negative, so
> > having
> > an unsigned int in the struct makes sense too.
> 
> All idr_* API's accept int, rather than unsigned int. This is my
> point.

Sure, but that's an internal implementation detail. The struct
genl_family is also an external API towards its users, and there
negative numbers make no sense whatsoever.

johannes

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
From: Neftin, Sasha @ 2016-11-13  8:34 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur
In-Reply-To: <dff136c7-cb6a-8a61-d311-01959bac16b9@codeaurora.org>

On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
> Hello Sasha,
> 
> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>> Move IRQ free code so that it will happen regardless of the
>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>> because the IRQ still has action since it was never freed. A
>>> secondary bus reset can cause this case to happen.
>>>
>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>> ---
>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> index 7017281..36cfcb0 100644
>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>           e1000e_down(adapter, true);
>>> -        e1000_free_irq(adapter);
>>>             /* Link status message must follow this format */
>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>       }               
>>>   +    e1000_free_irq(adapter);
>>> +
>>>       napi_disable(&adapter->napi);
>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>
>> I would like not recommend insert this change. This change related
>> driver state machine, we afraid from lot of synchronization problem and
>> issues.
>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
> 
> What do you mean here? There is no loop. If __E1000_DOWN is set then we
> will never free the IRQ.
> 
>> Another point, does before execute secondary bus reset your SW back up
>> pcie configuration space as properly?
> 
> After a secondary bus reset, the link needs to recover and go back to a
> working state after 1 second.
> 
> From the callstack, the issue is happening while removing the endpoint
> from the system, before applying the secondary bus reset.
> 
> The order of events is
> 1. remove the drivers
> 2. cause a secondary bus reset
> 3. wait 1 second
Actually, this is too much, usually link up in less than 100ms.You can
check Data Link Layer indication.
> 4. recover the link
> 
> callstack:
> free_msi_irqs+0x6c/0x1a8
> pci_disable_msi+0xb0/0x148
> e1000e_reset_interrupt_capability+0x60/0x78
> e1000_remove+0xc8/0x180
> pci_device_remove+0x48/0x118
> __device_release_driver+0x80/0x108
> device_release_driver+0x2c/0x40
> pci_stop_bus_device+0xa0/0xb0
> pci_stop_bus_device+0x3c/0xb0
> pci_stop_root_bus+0x54/0x80
> acpi_pci_root_remove+0x28/0x64
> acpi_bus_trim+0x6c/0xa4
> acpi_device_hotplug+0x19c/0x3f4
> acpi_hotplug_work_fn+0x28/0x3c
> process_one_work+0x150/0x460
> worker_thread+0x50/0x4b8
> kthread+0xd4/0xe8
> ret_from_fork+0x10/0x50
> 
> Thanks,
> Tyler
> 
Hello Tyler,
Okay, we need consult more about this suggestion.
May I ask what is setup you run? Is there NIC or on board LAN? I would
like try reproduce this issue in our lab's too.
Also, is same issue observed with same scenario and others NIC's too?
Sasha

^ permalink raw reply

* Re: [net-next 1/2] net: ethernet: slicoss: add slicoss gigabit ethernet driver
From: kbuild test robot @ 2016-11-13  8:38 UTC (permalink / raw)
  To: Lino Sanfilippo
  Cc: kbuild-all, davem, charrer, liodot, gregkh, devel, linux-kernel,
	netdev, Lino Sanfilippo
In-Reply-To: <1479012453-19410-2-git-send-email-LinoSanfilippo@gmx.de>

[-- Attachment #1: Type: text/plain, Size: 3393 bytes --]

Hi Lino,

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Lino-Sanfilippo/net-ethernet-slicoss-add-slicoss-gigabit-ethernet-driver/20161113-125131
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

Note: the linux-review/Lino-Sanfilippo/net-ethernet-slicoss-add-slicoss-gigabit-ethernet-driver/20161113-125131 HEAD 04d1527ec5ee782c78665a855a30a3f150ba78f4 builds fine.
      It only hurts bisectibility.

All errors (new ones prefixed by >>):

   drivers/staging/slicoss/slicoss.c: In function 'slic_cmdq_addcmdpage':
>> drivers/staging/slicoss/slicoss.c:1258:14: error: implicit declaration of function 'virt_to_bus' [-Werror=implicit-function-declaration]
     phys_addr = virt_to_bus((void *)page);
                 ^~~~~~~~~~~
   cc1: some warnings being treated as errors

vim +/virt_to_bus +1258 drivers/staging/slicoss/slicoss.c

4d6ea9c3 Denis Kirjanov     2010-07-10  1242  	struct slic_hostcmd *cmd;
4d6ea9c3 Denis Kirjanov     2010-07-10  1243  	struct slic_hostcmd *prev;
4d6ea9c3 Denis Kirjanov     2010-07-10  1244  	struct slic_hostcmd *tail;
4d6ea9c3 Denis Kirjanov     2010-07-10  1245  	struct slic_cmdqueue *cmdq;
4d6ea9c3 Denis Kirjanov     2010-07-10  1246  	int cmdcnt;
4d6ea9c3 Denis Kirjanov     2010-07-10  1247  	void *cmdaddr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1248  	ulong phys_addr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1249  	u32 phys_addrl;
4d6ea9c3 Denis Kirjanov     2010-07-10  1250  	u32 phys_addrh;
4d6ea9c3 Denis Kirjanov     2010-07-10  1251  	struct slic_handle *pslic_handle;
eafe6002 David Matlack      2015-05-11  1252  	unsigned long flags;
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1253  
4d6ea9c3 Denis Kirjanov     2010-07-10  1254  	cmdaddr = page;
dd146d21 Shraddha Barke     2015-10-15  1255  	cmd = cmdaddr;
4d6ea9c3 Denis Kirjanov     2010-07-10  1256  	cmdcnt = 0;
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1257  
4d6ea9c3 Denis Kirjanov     2010-07-10 @1258  	phys_addr = virt_to_bus((void *)page);
4d6ea9c3 Denis Kirjanov     2010-07-10  1259  	phys_addrl = SLIC_GET_ADDR_LOW(phys_addr);
4d6ea9c3 Denis Kirjanov     2010-07-10  1260  	phys_addrh = SLIC_GET_ADDR_HIGH(phys_addr);
4d6f6af8 Greg Kroah-Hartman 2008-03-19  1261  
4d6ea9c3 Denis Kirjanov     2010-07-10  1262  	prev = NULL;
4d6ea9c3 Denis Kirjanov     2010-07-10  1263  	tail = cmd;
4d6ea9c3 Denis Kirjanov     2010-07-10  1264  	while ((cmdcnt < SLIC_CMDQ_CMDSINPAGE) &&
4d6ea9c3 Denis Kirjanov     2010-07-10  1265  	       (adapter->slic_handle_ix < 256)) {
4d6ea9c3 Denis Kirjanov     2010-07-10  1266  		/* Allocate and initialize a SLIC_HANDLE for this command */

:::::: The code at line 1258 was first introduced by commit
:::::: 4d6ea9c3223da8d8dc91b369087fa40cc53edd36 Staging: slicoss: kill functions prototypes and reorder functions

:::::: TO: Denis Kirjanov <dkirjanov@hera.kernel.org>
:::::: CC: Greg Kroah-Hartman <gregkh@suse.de>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 51025 bytes --]

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH v2] e1000e: free IRQ regardless of __E1000_DOWN
From: Neftin, Sasha @ 2016-11-13  9:25 UTC (permalink / raw)
  To: Baicar, Tyler, jeffrey.t.kirsher, intel-wired-lan, netdev,
	linux-kernel, okaya, timur
In-Reply-To: <3914adae-91b2-7b74-02bc-579b3e32d143@intel.com>

On 11/13/2016 10:34 AM, Neftin, Sasha wrote:
> On 11/11/2016 12:35 AM, Baicar, Tyler wrote:
>> Hello Sasha,
>>
>> On 11/9/2016 11:19 PM, Neftin, Sasha wrote:
>>> On 11/9/2016 11:41 PM, Tyler Baicar wrote:
>>>> Move IRQ free code so that it will happen regardless of the
>>>> __E1000_DOWN bit. Currently the e1000e driver only releases its IRQ
>>>> if the __E1000_DOWN bit is cleared. This is not sufficient because
>>>> it is possible for __E1000_DOWN to be set without releasing the IRQ.
>>>> In such a situation, we will hit a kernel bug later in e1000_remove
>>>> because the IRQ still has action since it was never freed. A
>>>> secondary bus reset can cause this case to happen.
>>>>
>>>> Signed-off-by: Tyler Baicar <tbaicar@codeaurora.org>
>>>> ---
>>>>   drivers/net/ethernet/intel/e1000e/netdev.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> index 7017281..36cfcb0 100644
>>>> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
>>>> @@ -4679,12 +4679,13 @@ int e1000e_close(struct net_device *netdev)
>>>>         if (!test_bit(__E1000_DOWN, &adapter->state)) {
>>>>           e1000e_down(adapter, true);
>>>> -        e1000_free_irq(adapter);
>>>>             /* Link status message must follow this format */
>>>>           pr_info("%s NIC Link is Down\n", adapter->netdev->name);
>>>>       }               
>>>>   +    e1000_free_irq(adapter);
>>>> +
>>>>       napi_disable(&adapter->napi);
>>>>         e1000e_free_tx_resources(adapter->tx_ring);
>>>>
>>> I would like not recommend insert this change. This change related
>>> driver state machine, we afraid from lot of synchronization problem and
>>> issues.
>>> We need keep e1000_free_irq in loop and check for 'test_bit' ready.
>>
>> What do you mean here? There is no loop. If __E1000_DOWN is set then we
>> will never free the IRQ.
>>
>>> Another point, does before execute secondary bus reset your SW back up
>>> pcie configuration space as properly?
>>
>> After a secondary bus reset, the link needs to recover and go back to a
>> working state after 1 second.
>>
>> From the callstack, the issue is happening while removing the endpoint
>> from the system, before applying the secondary bus reset.
>>
>> The order of events is
>> 1. remove the drivers
>> 2. cause a secondary bus reset
>> 3. wait 1 second
> Actually, this is too much, usually link up in less than 100ms.You can
> check Data Link Layer indication.
>> 4. recover the link
>>
>> callstack:
>> free_msi_irqs+0x6c/0x1a8
>> pci_disable_msi+0xb0/0x148
>> e1000e_reset_interrupt_capability+0x60/0x78
>> e1000_remove+0xc8/0x180
>> pci_device_remove+0x48/0x118
>> __device_release_driver+0x80/0x108
>> device_release_driver+0x2c/0x40
>> pci_stop_bus_device+0xa0/0xb0
>> pci_stop_bus_device+0x3c/0xb0
>> pci_stop_root_bus+0x54/0x80
>> acpi_pci_root_remove+0x28/0x64
>> acpi_bus_trim+0x6c/0xa4
>> acpi_device_hotplug+0x19c/0x3f4
>> acpi_hotplug_work_fn+0x28/0x3c
>> process_one_work+0x150/0x460
>> worker_thread+0x50/0x4b8
>> kthread+0xd4/0xe8
>> ret_from_fork+0x10/0x50
>>
>> Thanks,
>> Tyler
>>
> Hello Tyler,
> Okay, we need consult more about this suggestion.
> May I ask what is setup you run? Is there NIC or on board LAN? I would
> like try reproduce this issue in our lab's too.
> Also, is same issue observed with same scenario and others NIC's too?
> Sasha
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@lists.osuosl.org
> http://lists.osuosl.org/mailman/listinfo/intel-wired-lan
> 
Please, specify what is device used.

^ permalink raw reply

* [PATCH 4.4 23/34] net: sctp, forbid negative length
From: Greg Kroah-Hartman @ 2016-11-13 11:24 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Slaby, Vlad Yasevich,
	Neil Horman, David S. Miller, linux-sctp, netdev
In-Reply-To: <20161113112400.008903838@linuxfoundation.org>

4.4-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <jslaby@suse.cz>


[ Upstream commit a4b8e71b05c27bae6bad3bdecddbc6b68a3ad8cf ]

Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
        if (len < sizeof(int))
                return -EINVAL;

On the first look, the check seems to be correct. But since len is int
and sizeof returns size_t, int gets promoted to unsigned size_t too. So
the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
false.

Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.

Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.

If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
 0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
 ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
 0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
Call Trace:
 [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
 [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
 [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
 [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
 [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
 [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
 [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
 [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/socket.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4371,7 +4371,7 @@ static int sctp_getsockopt_disable_fragm
 static int sctp_getsockopt_events(struct sock *sk, int len, char __user *optval,
 				  int __user *optlen)
 {
-	if (len <= 0)
+	if (len == 0)
 		return -EINVAL;
 	if (len > sizeof(struct sctp_event_subscribe))
 		len = sizeof(struct sctp_event_subscribe);
@@ -5972,6 +5972,9 @@ static int sctp_getsockopt(struct sock *
 	if (get_user(len, optlen))
 		return -EFAULT;
 
+	if (len < 0)
+		return -EINVAL;
+
 	lock_sock(sk);
 
 	switch (optname) {

^ permalink raw reply

* [PATCH 4.8 24/35] net: sctp, forbid negative length
From: Greg Kroah-Hartman @ 2016-11-13 11:27 UTC (permalink / raw)
  To: linux-kernel
  Cc: Greg Kroah-Hartman, stable, Jiri Slaby, Vlad Yasevich,
	Neil Horman, David S. Miller, linux-sctp, netdev
In-Reply-To: <20161113112420.863033770@linuxfoundation.org>

4.8-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jiri Slaby <jslaby@suse.cz>


[ Upstream commit a4b8e71b05c27bae6bad3bdecddbc6b68a3ad8cf ]

Most of getsockopt handlers in net/sctp/socket.c check len against
sizeof some structure like:
        if (len < sizeof(int))
                return -EINVAL;

On the first look, the check seems to be correct. But since len is int
and sizeof returns size_t, int gets promoted to unsigned size_t too. So
the test returns false for negative lengths. Yes, (-1 < sizeof(long)) is
false.

Fix this in sctp by explicitly checking len < 0 before any getsockopt
handler is called.

Note that sctp_getsockopt_events already handled the negative case.
Since we added the < 0 check elsewhere, this one can be removed.

If not checked, this is the result:
UBSAN: Undefined behaviour in ../mm/page_alloc.c:2722:19
shift exponent 52 is too large for 32-bit type 'int'
CPU: 1 PID: 24535 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.1-0-gb3ef39f-prebuilt.qemu-project.org 04/01/2014
 0000000000000000 ffff88006d99f2a8 ffffffffb2f7bdea 0000000041b58ab3
 ffffffffb4363c14 ffffffffb2f7bcde ffff88006d99f2d0 ffff88006d99f270
 0000000000000000 0000000000000000 0000000000000034 ffffffffb5096422
Call Trace:
 [<ffffffffb3051498>] ? __ubsan_handle_shift_out_of_bounds+0x29c/0x300
...
 [<ffffffffb273f0e4>] ? kmalloc_order+0x24/0x90
 [<ffffffffb27416a4>] ? kmalloc_order_trace+0x24/0x220
 [<ffffffffb2819a30>] ? __kmalloc+0x330/0x540
 [<ffffffffc18c25f4>] ? sctp_getsockopt_local_addrs+0x174/0xca0 [sctp]
 [<ffffffffc18d2bcd>] ? sctp_getsockopt+0x10d/0x1b0 [sctp]
 [<ffffffffb37c1219>] ? sock_common_getsockopt+0xb9/0x150
 [<ffffffffb37be2f5>] ? SyS_getsockopt+0x1a5/0x270

Signed-off-by: Jiri Slaby <jslaby@suse.cz>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: linux-sctp@vger.kernel.org
Cc: netdev@vger.kernel.org
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 net/sctp/socket.c |    5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4683,7 +4683,7 @@ static int sctp_getsockopt_disable_fragm
 static int sctp_getsockopt_events(struct sock *sk, int len, char __user *optval,
 				  int __user *optlen)
 {
-	if (len <= 0)
+	if (len == 0)
 		return -EINVAL;
 	if (len > sizeof(struct sctp_event_subscribe))
 		len = sizeof(struct sctp_event_subscribe);
@@ -6426,6 +6426,9 @@ static int sctp_getsockopt(struct sock *
 	if (get_user(len, optlen))
 		return -EFAULT;
 
+	if (len < 0)
+		return -EINVAL;
+
 	lock_sock(sk);
 
 	switch (optname) {

^ permalink raw reply

* Re: linux-next: manual merge of the net-next tree with the net tree
From: Or Gerlitz @ 2016-11-13 12:27 UTC (permalink / raw)
  To: Stephen Rothwell, David Miller
  Cc: Networking, linux-next, Linux Kernel, Or Gerlitz, Saeed Mahameed,
	Hadar Hen Zion
In-Reply-To: <20161110105030.3724fe28@canb.auug.org.au>

On Thu, Nov 10, 2016 at 1:50 AM, Stephen Rothwell <sfr@canb.auug.org.au> wrote:
> Hi all,
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
>   drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
>
> between commit:
>   ee39fbc4447d ("net/mlx5: E-Switch, Set the actions for offloaded rules properly")
> from the net tree and commit:
>   66958ed906b8 ("net/mlx5: Support encap id when setting new steering entry")
> from the net-next tree.

> I fixed it up (see below) and can carry the fix as necessary.

Thanks Stephen, the fix is correct. Dave will hit the conflict the
next time he rebases
net-next on net and will solve it there. Hence the conflict will not
show up in linux-next
once you re-peek net-next.

Or.

> diff --cc drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
> index d239f5d0ea36,50fe8e8861bb..000000000000
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c
> @@@ -57,14 -58,14 +58,15 @@@ mlx5_eswitch_add_offloaded_rule(struct
>         if (esw->mode != SRIOV_OFFLOADS)
>                 return ERR_PTR(-EOPNOTSUPP);
>
>  -      flow_act.action = attr->action;
>  +      /* per flow vlan pop/push is emulated, don't set that into the firmware */
> -       action = attr->action & ~(MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH | MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
> ++      flow_act.action = attr->action & ~(MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH | MLX5_FLOW_CONTEXT_ACTION_VLAN_POP);
>
> -       if (action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
> -               dest.type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
> -               dest.vport_num = attr->out_rep->vport;
> -               action = MLX5_FLOW_CONTEXT_ACTION_FWD_DEST;
> -       } else if (action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
> +       if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_FWD_DEST) {
> +               dest[i].type = MLX5_FLOW_DESTINATION_TYPE_VPORT;
> +               dest[i].vport_num = attr->out_rep->vport;
> +               i++;
> +       }
> +       if (flow_act.action & MLX5_FLOW_CONTEXT_ACTION_COUNT) {
>                 counter = mlx5_fc_create(esw->dev, true);
>                 if (IS_ERR(counter))
>                         return ERR_CAST(counter);

^ permalink raw reply

* [PATCH] ip6_output: ensure flow saddr actually belongs to device
From: Jason A. Donenfeld @ 2016-11-13 13:23 UTC (permalink / raw)
  To: David Ahern, Netdev, WireGuard mailing list, LKML,
	YOSHIFUJI Hideaki, Hannes Frederic Sowa
In-Reply-To: <CAHmME9qYYU-p5BhR0ReNSB+__wysuDK+eEk1SisCoyf4=2woxg@mail.gmail.com>

This puts the IPv6 routing functions in parity with the IPv4 routing
functions. Namely, we now check in v6 that if a flowi6 requests an
saddr, the returned dst actually corresponds to a net device that has
that saddr. This mirrors the v4 logic with __ip_dev_find in
__ip_route_output_key_hash. In the event that the returned dst is not
for a dst with a dev that has the saddr, we return -EINVAL, just like
v4; this makes it easy to use the same error handlers for both cases.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: David Ahern <dsa@cumulusnetworks.com>
---
 net/ipv6/ip6_output.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 6001e78..a834129 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1011,6 +1011,11 @@ static int ip6_dst_lookup_tail(struct net *net, const struct sock *sk,
 		}
 	}
 #endif
+	if (!ipv6_addr_any(&fl6->saddr) &&
+	    !ipv6_chk_addr(net, &fl6->saddr, (*dst)->dev, 1)) {
+		err = -EINVAL;
+		goto out_err_release;
+	}
 
 	return 0;
 
-- 
2.10.2

^ permalink raw reply related

* [PATCH] Fixup packets with incorrect ethertype sent by ZTE MF821D
From: Jussi Peltola @ 2016-11-13 13:47 UTC (permalink / raw)
  To: netdev; +Cc: Bjørn Mork

This brokenness appears reliably after running "rdisc6 wwan0" but I have
not debugged if this is related to timing or the format of the router
solicitation. Before receiving a router solicitation, v4 is received
correctly and v6 does not work. After sending the MF821D a router
solicitation with rdisc6, v4 is broken but v6 works. With this patch
(which I am using to write this message) a dual-stack context is usable.

commit 2c25237d19c0c9741c6ebec854def99b88618eac
Author: Jussi Peltola <plz@plz.fi>
Date:   Sun Nov 13 15:41:50 2016 +0200

    Fixup packets with incorrect ethertype sent by ZTE MF821D
    
    Signed-off-by: Jussi Peltola <plz@plz.fi>

diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 3ff76c6..edd8172 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -172,6 +172,12 @@ static const u8 buggy_fw_addr[ETH_ALEN] = {0x00, 0xa0, 0xc6, 0x00, 0x00, 0x00};
  * Another common firmware bug results in all packets being addressed
  * to 00:a0:c6:00:00:00 despite the host address being different.
  * This function will also fixup such packets.
+ *
+ * At least the ZTE MF821D sends IPv4 packets with a bogus ethertype
+ * of 0x63bc, 0xe3bc, 0x63bd or 0xe3bd, and bogus source and
+ * destination MACs after it has received an IPv6 router solicitation
+ * (IPv6 is transmitted correctly). This function will also fix up
+ * such packets.
  */
 static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
 {
@@ -195,12 +201,21 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
                        return 0;
                if (is_multicast_ether_addr(skb->data))
                        return 1;
+
                /* possibly bogus destination - rewrite just in case */
                skb_reset_mac_header(skb);
                goto fix_dest;
        default:
                if (rawip)
                        return 0;
+
+               /* Bogus ethertype and src/dst mac for v4 on ZTE MF821D */
+               if ((skb->data[12] & 0x7f) == 0x63
+                && (skb->data[13] & 0xfe) == 0xbc) {
+                       proto = htons(ETH_P_IP);
+                       goto reset_mac;
+               }
+
                /* pass along other packets without modifications */
                return 1;
        }
@@ -213,6 +228,7 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
        if (skb_headroom(skb) < ETH_HLEN)
                return 0;
        skb_push(skb, ETH_HLEN);
+reset_mac:
        skb_reset_mac_header(skb);
        eth_hdr(skb)->h_proto = proto;
        eth_zero_addr(eth_hdr(skb)->h_source);

^ permalink raw reply related

* [PATCH net] sctp: change sk state only when it has assocs in sctp_shutdown
From: Xin Long @ 2016-11-13 13:44 UTC (permalink / raw)
  To: network dev, linux-sctp
  Cc: davem, Marcelo Ricardo Leitner, Neil Horman, Vlad Yasevich,
	andreyknvl

Now when users shutdown a sock with SEND_SHUTDOWN in sctp, even if
this sock has no connection (assoc), sk state would be changed to
SCTP_SS_CLOSING, which is not as we expect.

Besides, after that if users try to listen on this sock, kernel
could even panic when it dereference sctp_sk(sk)->bind_hash in
sctp_inet_listen, as bind_hash is null when sock has no assoc.

This patch is to move sk state change after checking sk assocs
is not empty, and also merge these two if() conditions and reduce
indent level.

Fixes: d46e416c11c8 ("sctp: sctp should change socket state when shutdown is received")
Reported-by: Andrey Konovalov <andreyknvl@google.com>
Tested-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/socket.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index faa48ff..f23ad91 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -4285,19 +4285,18 @@ static void sctp_shutdown(struct sock *sk, int how)
 {
 	struct net *net = sock_net(sk);
 	struct sctp_endpoint *ep;
-	struct sctp_association *asoc;
 
 	if (!sctp_style(sk, TCP))
 		return;
 
-	if (how & SEND_SHUTDOWN) {
+	ep = sctp_sk(sk)->ep;
+	if (how & SEND_SHUTDOWN && !list_empty(&ep->asocs)) {
+		struct sctp_association *asoc;
+
 		sk->sk_state = SCTP_SS_CLOSING;
-		ep = sctp_sk(sk)->ep;
-		if (!list_empty(&ep->asocs)) {
-			asoc = list_entry(ep->asocs.next,
-					  struct sctp_association, asocs);
-			sctp_primitive_SHUTDOWN(net, asoc, NULL);
-		}
+		asoc = list_entry(ep->asocs.next,
+				  struct sctp_association, asocs);
+		sctp_primitive_SHUTDOWN(net, asoc, NULL);
 	}
 }
 
-- 
2.1.0

^ permalink raw reply related

* Re: [RFC v4 00/18] Landlock LSM: Unprivileged sandboxing
From: Mickaël Salaün @ 2016-11-13 14:23 UTC (permalink / raw)
  To: linux-kernel
  Cc: Alexei Starovoitov, Andy Lutomirski, Daniel Borkmann, Daniel Mack,
	David Drysdale, David S . Miller, Eric W . Biederman,
	James Morris, Jann Horn, Kees Cook, Paul Moore, Sargun Dhillon,
	Serge E . Hallyn, Tejun Heo, Thomas Graf, Will Drewry,
	kernel-hardening, linux-api, linux-security-module, netdev,
	cgroups
In-Reply-To: <20161026065654.19166-1-mic@digikod.net>


[-- Attachment #1.1: Type: text/plain, Size: 22993 bytes --]

Hi,

After the BoF at LPC last week, we came to a multi-step roadmap to
upstream Landlock.

A first patch series containing the basic properties needed for a
"minimum viable product", which means being able to test it, without
full features. The idea is to set in place the main components which
include the LSM part (some hooks with the manager logic) and the new
eBPF type. To have a minimum amount of code, the first userland entry
point will be the seccomp syscall. This doesn't imply non-upstream
patches and should be more simple. For the sake of simplicity and to
ease the review, this first series will only be dedicated to privileged
processes (i.e. with CAP_SYS_ADMIN). We may want to only allow one level
of rules at first, instead of dealing with more complex rule inheritance
(like seccomp-bpf can do).

The second series will focus on the cgroup manager. It will follow the
same rules of inheritance as the Daniel Mack's patches does.

The third series will try to bring a BPF map of handles for Landlock and
the dedicated BPF helpers.

Finally, the fourth series will bring back the unprivileged mode (with
no_new_privs), at least for process hierarchies (via seccomp). This also
imply to handle multi-level of rules.

Right now, an important point of attention is the userland ABI. We don't
want LSM hooks to be exposed "as is" to userland. This may have some
future implications if their semantic and/or enforcement point(s)
change. In the next series, I will propose a new abstraction over the
currently used LSM hooks. I'll also propose a new way to deal with
resource accountability. Finally, I plan to create a minimal (kernel)
developer documentation and a test suite.

Regards,
 Mickaël


On 26/10/2016 08:56, Mickaël Salaün wrote:
> Hi,
> 
> This fourth RFC brings some improvements over the previous one [1]. An important
> new point is the abstraction from the raw types of LSM hook arguments. It is
> now possible to call a Landlock function the same way for LSM hooks with
> different internal argument types. Some parts of the code are revamped with RCU
> to properly deal with concurrency. From a userland point of view, the only
> remaining link with seccomp-bpf is the ability to use the seccomp(2) syscall to
> load and enforce a Landlock rule. Seccomp filters cannot trigger Landlock rules
> anymore. For now, it is no more possible for an unprivileged user to enforce a
> Landlock rule on a cgroup through delegation.
> 
> As suggested, I plan to write documentation for userland and kernel developers
> with some kind of guiding principles. A remaining question is how to enforce
> limitations for the rule creation?
> 
> 
> # Landlock LSM
> 
> The goal of this new stackable Linux Security Module (LSM) called Landlock is
> to allow any process, including unprivileged ones, to create powerful security
> sandboxes comparable to the Seatbelt/XNU Sandbox or the OpenBSD Pledge. This
> kind of sandbox is expected to help mitigate the security impact of bugs or
> unexpected/malicious behaviors in userland applications.
> 
> eBPF programs are used to create a security rule. They are very limited (i.e.
> can only call a whitelist of functions) and cannot do a denial of service (i.e.
> no loop). A new dedicated eBPF map allows to collect and compare Landlock
> handles with system resources (e.g. files or network connections).
> 
> The approach taken is to add the minimum amount of code while still allowing
> the userland to create quite complex access rules. A dedicated security policy
> language as the one used by SELinux, AppArmor and other major LSMs involves a
> lot of code and is usually dedicated to a trusted user (i.e. root).
> 
> 
> # eBPF
> 
> To get an expressive language while still being safe and small, Landlock is
> based on eBPF. Landlock should be usable by untrusted processes and must then
> expose a minimal attack surface. The eBPF bytecode is minimal while powerful,
> widely used and designed to be used by not so trusted application. Reusing this
> code allows to not reproduce the same mistakes and minimize new code  while
> still taking a generic approach. Only a few additional features are added like
> a new kind of arraymap and some dedicated eBPF functions.
> 
> An eBPF program has access to an eBPF context which contains the LSM hook
> arguments (as does seccomp-bpf with syscall arguments). They can be used
> directly or passed to helper functions according to their types. It is then
> possible to do complex access checks without race conditions nor inconsistent
> evaluation (i.e. incorrect mirroring of the OS code and state [2]).
> 
> There is one eBPF program subtype per LSM hook. This allows to statically check
> which context access is performed by an eBPF program. This is needed to deny
> kernel address leak and ensure the right use of LSM hook arguments with eBPF
> functions. Moreover, this safe pointer handling removes the need for runtime
> check or abstract data, which improves performances. Any user can add multiple
> Landlock eBPF programs per LSM hook. They are stacked and evaluated one after
> the other (cf. seccomp-bpf).
> 
> 
> # LSM hooks
> 
> Unlike syscalls, LSM hooks are security checkpoints and are not architecture
> dependent. They are designed to match a security need associated with a
> security policy (e.g. access to a file). Exposing parts of some LSM hooks
> instead of using the syscall API for sandboxing should help to avoid bugs and
> hacks as encountered by the first RFC. Instead of redoing the work of the LSM
> hooks through syscalls, we should use and expose them as does policies of
> access control LSM.
> 
> Only a subset of the hooks are meaningful for an unprivileged sandbox mechanism
> (e.g. file system or network access control). Landlock uses an abstraction of
> raw LSM hooks, which allow to deal with possible future API changes of the LSM
> hook API. Moreover, thanks to the ePBF program typing (per LSM hook) used by
> Landlock, it should not be hard to make such evolutions backward compatible.
> 
> 
> # Use case scenario
> 
> First, a process needs to create a new dedicated eBPF map containing handles.
> This handles are references to system resources (e.g. file or directory) and
> grouped in one or multiple maps to be efficiently managed and checked in
> batches. This kind of map can be passed to Landlock eBPF functions to compare,
> for example, with a file access request. The handles are only accessible from
> the eBPF programs created by the same thread.
> 
> The loaded Landlock eBPF programs can be triggered by a seccomp filter
> returning RET_LANDLOCK. In addition, a cookie (16-bit value) can be passed from
> a seccomp filter to eBPF programs. This allow flexible security policies
> between seccomp and Landlock.
> 
> Another way to enforce a Landlock security policy is to attach Landlock
> programs to a dedicated cgroup. All the processes in this cgroup will then be
> subject to this policy. For unprivileged processes, this can be done thanks to
> cgroup delegation.
> 
> A triggered Landlock eBPF program can allow or deny an access, according to
> its subtype (i.e. LSM hook), thanks to errno return values.
> 
> 
> # Sandbox example with process hierarchy sandboxing (seccomp)
> 
>   $ ls /home
>   user1
>   $ LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
>       ./samples/landlock/sandbox /bin/sh -i
>   Launching a new sandboxed process.
>   $ ls /home
>   ls: cannot access '/home': No such file or directory
> 
> 
> # Sandbox example with conditional access control depending on a cgroup
> 
>   $ mkdir /sys/fs/cgroup/sandboxed
>   $ ls /home
>   user1
>   $ LANDLOCK_CGROUPS='/sys/fs/cgroup/sandboxed' \
>       LANDLOCK_ALLOWED='/bin:/lib:/usr:/tmp:/proc/self/fd/0' \
>       ./samples/landlock/sandbox
>   Ready to sandbox with cgroups.
>   $ ls /home
>   user1
>   $ echo $$ > /sys/fs/cgroup/sandboxed/cgroup.procs
>   $ ls /home
>   ls: cannot access '/home': No such file or directory
> 
> 
> # Current limitations and possible improvements
> 
> For now, eBPF programs can only return an errno code. It may be interesting to
> be able to do other actions like seccomp-bpf does (e.g. kill process). Such
> features can easily be implemented but the main advantage of the current
> approach is to be able to only execute eBPF programs until one returns an errno
> code instead of executing all programs like seccomp-bpf does.
> 
> It is quite easy to add new eBPF functions to extend Landlock. The main concern
> should be about the possibility to leak information from current process to
> another one (e.g. through maps) to not reproduce the same security sensitive
> behavior as ptrace.
> 
> This design does not seem too intrusive but is flexible enough to allow a
> powerful sandbox mechanism accessible by any process on Linux. The use of
> seccomp and Landlock is more suitable with the help of a userland library (e.g.
> libseccomp) that could help to specify a high-level language to express a
> security policy instead of raw eBPF programs. Moreover, thanks to LLVM, it is
> possible to express an eBPF program with a subset of C.
> 
> 
> # FAQ
> 
> ## Why does seccomp-bpf is not enough?
> 
> A seccomp filter can access to raw syscall arguments which means that it is not
> possible to filter according to pointed such as a file path. As the first
> version of this patch series demonstrated, filtering at the syscall level is
> complicated (e.g. need to take care of race conditions). This is mainly because
> the access control checkpoints of the kernel are not at this high-level but
> more underneath, at LSM hooks level. The LSM hooks are designed to handle this
> kind of checks. This series use this approach to leverage the ability of
> unprivileged users to limit themselves.
> 
> Cf. "What it isn't?" in Documentation/prctl/seccomp_filter.txt
> 
> 
> ## Why using the seccomp(2) syscall?
> 
> Landlock use the same semantic as seccomp to apply access rule restrictions. It
> add a new layer of security for the current process which is inherited by its
> childs. It makes sense to use an unique access-restricting syscall (that should
> be allowed by seccomp-bpf rules) which can only drop privileges. Moreover, a
> Landlock eBPF program could come from outside a process (e.g. passed through a
> UNIX socket). It is then useful to differentiate the creation/load of Landlock
> eBPF programs via bpf(2), from rule enforcing via seccomp(2).
> 
> 
> ## Why using cgroups?
> 
> cgroups are designed to handle groups of processes. One use case is to manage
> containers. Sandboxing based on process hierarchy (seccomp) is design to handle
> immutable security policies, which is a good security property but does not
> match all use cases. A user can attach Landlock rules to a cgroup. Doing so,
> all the processes in that cgroup will be subject to the security policy.
> However, if the user is allowed to manage this cgroup, it could dynamically
> move this group of processes to a cgroup with another security policy (or
> none). Landlock rules can be applied either on a process hierarchy (e.g.
> application with built-in sandboxing) or a group of processes (e.g. container
> sandboxing). Both approaches can be combined for the same process.
> 
> 
> ## Does Landlock can limit network access or other resources?
> 
> Limiting network access is obviously in the scope of Landlock but it is not yet
> implemented. The main goal now is to get feedback about the whole concept, the
> API and the file access control part. More access control types could be
> implemented in the future.
> 
> Sargun Dhillon sent a RFC (Checmate) [4] to deal with network manipulation.
> This could be implemented on top of the Landlock framework.
> 
> 
> ## Why a new LSM? Are SELinux, AppArmor, Smack or Tomoyo not good enough?
> 
> The current access control LSMs are fine for their purpose which is to give the
> *root* the ability to enforce a security policy for the *system*. What is
> missing is a way to enforce a security policy for any applications by its
> developer and *unprivileged user* as seccomp can do for raw syscall filtering.
> Moreover, Landlock handles stacked hook programs from different users. It must
> then ensure there is no possible malicious interactions between these programs.
> 
> Differences with other (access control) LSMs:
> * not only dedicated to administrators (i.e. no_new_priv);
> * limited kernel attack surface (e.g. policy parsing);
> * helpers to compare complex objects (path/FD), no access to internal kernel
>   data (do not leak addresses);
> * constrained policy rules/programs (no DoS: deterministic execution time);
> * do not leak more information than the loader process can legitimately have
>   access to (minimize metadata inference): must compare from an already allowed
>   file (through a handle).
> 
> 
> ## Why not use a policy language like used by SElinux or AppArmor?
> 
> This kind of LSMs are dedicated to administrators. They already manage the
> system and are not a threat to the system security. However, seccomp, and
> Landlock too, should be available to anyone, which potentially include
> untrusted users and processes. To reduce the attack surface, Landlock should
> expose the minimum amount of code, hence minimal complexity. Moreover, another
> threat is to make accessible to a malicious code a new way to gain more
> information. For example, Landlock features should not allow a program to get
> the file owner if the directory containing this file is not readable. This data
> could then be exfiltrated thanks to the access result. Thus, we should limit
> the expressiveness of the available checks. The current approach is to do the
> checks in such a way that only a comparison with an already accessed resource
> (e.g. file descriptor) is possible. This allow to have a reference to compare
> with, without exposing much information.
> 
> 
> ## As a developer, why do I need this feature?
> 
> Landlock's goal is to help userland to limit its attack surface.
> Security-conscious developers would like to protect users from a security bug
> in their applications and the third-party dependencies they are using. Such a
> bug can compromise all the user data and help an attacker to perform a
> privilege escalation. Using an *unprivileged sandbox* feature such as Landlock
> empowers the developer with the ability to properly compartmentalize its
> software and limit the impact of vulnerabilities.
> 
> 
> ## As a user, why do I need a this feature?
> 
> Any user can already use seccomp-bpf to whitelist a set of syscalls to
> reduce the kernel attack surface for a predefined set of processes. However an
> unprivileged user can't create a security policy like the root user can thanks to
> SELinux and other access control LSMs. Landlock allows any unprivileged user to
> protect their data from being accessed by any process they run but only an
> identified subset. User tools can be created to help create such a high-level
> access control policy. This policy may not be powerful enough to express the
> same policies as the current access control LSMs, because of the threat an
> unprivileged user can be to the system, but it should be enough for most
> use-cases (e.g. blacklist or whitelist a set of file hierarchies).
> 
> 
> # Changes since RFC v3
> 
> * use abstract LSM hook arguments with custom types (e.g. *_LANDLOCK_ARG_FS for
>   struct file, struct inode and struct path)
> * add more LSM hooks to support full file system access control
> * improve the sandbox example
> * fix races and RCU issues:
>   * eBPF program execution and eBPF helpers
>   * revamp the arraymap of handles to cleanly deal with update/delete
> * eBPF program subtype for Landlock:
>   * remove the "origin" field
>   * add an "option" field
> * rebase onto Daniel Mack's patches v7 [3]
> * remove merged commit 1955351da41c ("bpf: Set register type according to
>   is_valid_access()")
> * fix spelling mistakes
> * cleanup some type and variable names
> * split patches
> * for now, remove cgroup delegation handling for unprivileged user
> * remove extra access check for cgroup_get_from_fd()
> * remove unused example code dealing with skb
> * remove seccomp-bpf link:
>   * no more seccomp cookie
>   * for now, it is no more possible to check the current syscall properties
> 
> 
> # Changes since RFC v2
> 
> * revamp cgroup handling:
>   * use Daniel Mack's patches "Add eBPF hooks for cgroups" v5
>   * remove bpf_landlock_cmp_cgroup_beneath()
>   * make BPF_PROG_ATTACH usable with delegated cgroups
>   * add a new CGRP_NO_NEW_PRIVS flag for safe cgroups
>   * handle Landlock sandboxing for cgroups hierarchy
>   * allow unprivileged processes to attach Landlock eBPF program to cgroups
> * add subtype to eBPF programs:
>   * replace Landlock hook identification by custom eBPF program types with a
>     dedicated subtype field
>   * manage fine-grained privileged Landlock programs
>   * register Landlock programs for dedicated trigger origins (e.g. syscall,
>     return from seccomp filter and/or interruption)
> * performance and memory optimizations: use an array to access Landlock hooks
>   directly but do not duplicated it for each thread (seccomp-based)
> * allow running Landlock programs without seccomp filter
> * fix seccomp-related issues
> * remove extra errno bounding check for Landlock programs
> * add some examples for optional eBPF functions or context access (network
>   related) according to security checks to allow more features for privileged
>   programs (e.g. Checmate)
> 
> 
> # Changes since RFC v1
> 
> * focus on the LSM hooks, not the syscalls:
>   * much more simple implementation
>   * does not need audit cache tricks to avoid race conditions
>   * more simple to use and more generic because using the LSM hook abstraction
>     directly
>   * more efficient because only checking in LSM hooks
>   * architecture agnostic
> * switch from cBPF to eBPF:
>   * new eBPF program types dedicated to Landlock
>   * custom functions used by the eBPF program
>   * gain some new features (e.g. 10 registers, can load values of different
> 	size, LLVM translator) but only a few functions allowed and a dedicated map
>     type
>   * new context: LSM hook ID, cookie and LSM hook arguments
>   * need to set the sysctl kernel.unprivileged_bpf_disable to 0 (default value)
>     to be able to load hook filters as unprivileged users
> * smaller and simpler:
>   * no more checker groups but dedicated arraymap of handles
>   * simpler userland structs thanks to eBPF functions
> * distinctive name: Landlock
> 
> 
> This series can be applied on top of Daniel Mack's patches for BPF_PROG_ATTACH
> v7 [3] on Linux v4.9-rc2. This can be tested with CONFIG_SECURITY_LANDLOCK,
> CONFIG_SECCOMP_FILTER and CONFIG_CGROUP_BPF. I would really appreciate
> constructive comments on the usability, architecture, code and userland API of
> Landlock LSM.
> 
> [1] https://lkml.kernel.org/r/20160914072415.26021-1-mic@digikod.net
> [2] https://crypto.stanford.edu/cs155/papers/traps.pdf
> [3] https://lkml.kernel.org/r/1477390454-12553-1-git-send-email-daniel@zonque.org
> [4] https://lkml.kernel.org/r/20160829114542.GA20836@ircssh.c.rugged-nimbus-611.internal
> 
> Regards,
> 
> Mickaël Salaün (18):
>   landlock: Add Kconfig
>   bpf: Move u64_to_ptr() to BPF headers and inline it
>   bpf,landlock: Add a new arraymap type to deal with (Landlock) handles
>   bpf,landlock: Add eBPF program subtype and is_valid_subtype() verifier
>   bpf,landlock: Define an eBPF program type for Landlock
>   fs: Constify path_is_under()'s arguments
>   landlock: Add LSM hooks
>   landlock: Handle file comparisons
>   landlock: Add manager functions
>   seccomp: Split put_seccomp_filter() with put_seccomp()
>   seccomp,landlock: Handle Landlock hooks per process hierarchy
>   bpf: Cosmetic change for bpf_prog_attach()
>   bpf/cgroup: Replace struct bpf_prog with struct bpf_object
>   bpf/cgroup: Make cgroup_bpf_update() return an error code
>   bpf/cgroup: Move capability check
>   bpf/cgroup,landlock: Handle Landlock hooks per cgroup
>   landlock: Add update and debug access flags
>   samples/landlock: Add sandbox example
> 
>  fs/namespace.c                 |   2 +-
>  include/linux/bpf-cgroup.h     |  19 +-
>  include/linux/bpf.h            |  44 +++-
>  include/linux/cgroup-defs.h    |   2 +
>  include/linux/filter.h         |   1 +
>  include/linux/fs.h             |   2 +-
>  include/linux/landlock.h       |  95 +++++++++
>  include/linux/lsm_hooks.h      |   5 +
>  include/linux/seccomp.h        |  12 +-
>  include/uapi/linux/bpf.h       | 105 ++++++++++
>  include/uapi/linux/seccomp.h   |   1 +
>  kernel/bpf/arraymap.c          | 270 +++++++++++++++++++++++++
>  kernel/bpf/cgroup.c            | 139 ++++++++++---
>  kernel/bpf/syscall.c           |  71 ++++---
>  kernel/bpf/verifier.c          |  35 +++-
>  kernel/cgroup.c                |   6 +-
>  kernel/fork.c                  |  15 +-
>  kernel/seccomp.c               |  26 ++-
>  kernel/trace/bpf_trace.c       |  12 +-
>  net/core/filter.c              |  26 ++-
>  samples/Makefile               |   2 +-
>  samples/bpf/bpf_helpers.h      |   5 +
>  samples/landlock/.gitignore    |   1 +
>  samples/landlock/Makefile      |  16 ++
>  samples/landlock/sandbox.c     | 405 +++++++++++++++++++++++++++++++++++++
>  security/Kconfig               |   1 +
>  security/Makefile              |   2 +
>  security/landlock/Kconfig      |  23 +++
>  security/landlock/Makefile     |   3 +
>  security/landlock/checker_fs.c | 152 ++++++++++++++
>  security/landlock/checker_fs.h |  20 ++
>  security/landlock/common.h     |  58 ++++++
>  security/landlock/lsm.c        | 449 +++++++++++++++++++++++++++++++++++++++++
>  security/landlock/manager.c    | 379 ++++++++++++++++++++++++++++++++++
>  security/security.c            |   1 +
>  35 files changed, 2309 insertions(+), 96 deletions(-)
>  create mode 100644 include/linux/landlock.h
>  create mode 100644 samples/landlock/.gitignore
>  create mode 100644 samples/landlock/Makefile
>  create mode 100644 samples/landlock/sandbox.c
>  create mode 100644 security/landlock/Kconfig
>  create mode 100644 security/landlock/Makefile
>  create mode 100644 security/landlock/checker_fs.c
>  create mode 100644 security/landlock/checker_fs.h
>  create mode 100644 security/landlock/common.h
>  create mode 100644 security/landlock/lsm.c
>  create mode 100644 security/landlock/manager.c
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply

* Re: Kernel 4.8.7 crashing down cleanup_net()
From: Borislav Petkov @ 2016-11-13 16:00 UTC (permalink / raw)
  To: Ashton Holmes; +Cc: LKML, netdev
In-Reply-To: <CA+OSeu=TvXdd8VmxWOo2dh-jS_WrNibSWdCt0OEyPEkfxjO_yw@mail.gmail.com>

Hmm, there's cleanup_net() namespaces fun in there, adding netdev@.

On Sat, Nov 12, 2016 at 02:17:03PM -0800, Ashton Holmes wrote:
> I upgraded to 4.8.7 and the system boots and my root partition gets
> decrypted but right after that both of my monitors turn off and
> looking at syslog from 4.8.6 shows the following:
> 
> Nov 12 11:54:24 user-desktop kernel: [   19.853197] ------------[ cut
> here ]------------
> Nov 12 11:54:24 user-desktop kernel: [   19.853206] WARNING: CPU: 2
> PID: 177 at fs/proc/proc_sysctl.c:1607 ops_exit_list.isra.4+0x33/0x60
> Nov 12 11:54:24 user-desktop kernel: [   19.853210] Modules linked in:
> amdgpu eeepc_wmi i2c_algo_bit asus_wmi drm_kms_helper video rfkill
> snd_hda_codec_realtek ttm snd_hda_codec_generic snd_hda_codec_hdmi drm
> snd_hda_intel snd_usb_audio sparse_keymap snd_hda_codec mxm_wmi
> snd_usbmidi_lib snd_hda_core efi_pstore snd_hwdep snd_rawmidi evdev
> snd_seq_device joydev snd_pcm serio_raw pcspkr efivars fam15h_power
> k10temp snd_timer snd sp5100_tco i2c_piix4 sg soundcore i2c_core
> shpchp tpm_infineon tpm_tis tpm_tis_core tpm wmi button it87 hwmon_vid
> autofs4 ext4 crc16 jbd2 mbcache algif_skcipher af_alg dm_crypt dm_mod
> hid_generic usbhid hid sr_mod cdrom sd_mod ohci_pci crct10dif_pclmul
> crc32_pclmul crc32c_intel aesni_intel aes_x86_64 glue_helper lrw
> gf128mul ablk_helper cryptd psmouse ohci_hcd ahci libahci ehci_pci
> ehci_hcd libata xhci_pci xhci_hcd e1000e usbcore scsi_mod ptp
> usb_common pps_core
> Nov 12 11:54:24 user-desktop kernel: [   19.853264] CPU: 2 PID: 177
> Comm: kworker/u16:2 Not tainted 4.8.7 #1
> Nov 12 11:54:24 user-desktop kernel: [   19.853267] Hardware name: To
> be filled by O.E.M. To be filled by O.E.M./CROSSHAIR V FORMULA-Z, BIOS
> 2201 03/23/2015
> Nov 12 11:54:24 user-desktop kernel: [   19.853273] Workqueue: netns cleanup_net
> Nov 12 11:54:24 user-desktop kernel: [   19.853276]  0000000000000286
> 00000000b98071d2 ffffffff81302a8f 0000000000000000
> Nov 12 11:54:24 user-desktop kernel: [   19.853281]  0000000000000000
> ffffffff81076f54 ffff880814f88040 ffff8808140efdf0
> Nov 12 11:54:24 user-desktop kernel: [   19.853285]  ffffffff818f11d8
> ffffffff818f11e0 ffff8808140efde0 0000000000000200
> Nov 12 11:54:24 user-desktop kernel: [   19.853290] Call Trace:
> Nov 12 11:54:24 user-desktop kernel: [   19.853295]
> [<ffffffff81302a8f>] ? dump_stack+0x5c/0x7d
> Nov 12 11:54:24 user-desktop kernel: [   19.853299]
> [<ffffffff81076f54>] ? __warn+0xc4/0xe0
> Nov 12 11:54:24 user-desktop kernel: [   19.853302]
> [<ffffffff8149e243>] ? ops_exit_list.isra.4+0x33/0x60
> Nov 12 11:54:24 user-desktop kernel: [   19.853305]
> [<ffffffff8149f230>] ? cleanup_net+0x1b0/0x290
> Nov 12 11:54:24 user-desktop kernel: [   19.853309]
> [<ffffffff8108fefd>] ? process_one_work+0x14d/0x410
> Nov 12 11:54:24 user-desktop kernel: [   19.853312]
> [<ffffffff81090cc2>] ? worker_thread+0x62/0x490
> Nov 12 11:54:24 user-desktop kernel: [   19.853315]
> [<ffffffff81090c60>] ? rescuer_thread+0x340/0x340
> Nov 12 11:54:24 user-desktop kernel: [   19.853318]
> [<ffffffff81095f3f>] ? kthread+0xdf/0x100
> Nov 12 11:54:24 user-desktop kernel: [   19.853322]
> [<ffffffff8102b78b>] ? __switch_to+0x2bb/0x710
> Nov 12 11:54:24 user-desktop kernel: [   19.853325]
> [<ffffffff815a371f>] ? ret_from_fork+0x1f/0x40
> Nov 12 11:54:24 user-desktop kernel: [   19.853328]
> [<ffffffff81095e60>] ? kthread_park+0x50/0x50
> Nov 12 11:54:24 user-desktop kernel: [   19.853331] ---[ end trace
> c4840b46b58dbe12 ]---
> 

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply

* Re: [PATCH] Fixup packets with incorrect ethertype sent by ZTE MF821D
From: Bjørn Mork @ 2016-11-13 16:12 UTC (permalink / raw)
  To: Jussi Peltola; +Cc: netdev
In-Reply-To: <20161113134753.GR2745@pokute.pelzi.net>

Jussi Peltola <plz@plz.fi> writes:

> This brokenness appears reliably after running "rdisc6 wwan0" but I have
> not debugged if this is related to timing or the format of the router
> solicitation. Before receiving a router solicitation, v4 is received
> correctly and v6 does not work. After sending the MF821D a router
> solicitation with rdisc6, v4 is broken but v6 works. With this patch
> (which I am using to write this message) a dual-stack context is usable.
>
> commit 2c25237d19c0c9741c6ebec854def99b88618eac
> Author: Jussi Peltola <plz@plz.fi>
> Date:   Sun Nov 13 15:41:50 2016 +0200
>
>     Fixup packets with incorrect ethertype sent by ZTE MF821D
>     
>     Signed-off-by: Jussi Peltola <plz@plz.fi>
>
> diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
> index 3ff76c6..edd8172 100644
> --- a/drivers/net/usb/qmi_wwan.c
> +++ b/drivers/net/usb/qmi_wwan.c
> @@ -172,6 +172,12 @@ static const u8 buggy_fw_addr[ETH_ALEN] = {0x00, 0xa0, 0xc6, 0x00, 0x00, 0x00};
>   * Another common firmware bug results in all packets being addressed
>   * to 00:a0:c6:00:00:00 despite the host address being different.
>   * This function will also fixup such packets.
> + *
> + * At least the ZTE MF821D sends IPv4 packets with a bogus ethertype
> + * of 0x63bc, 0xe3bc, 0x63bd or 0xe3bd, and bogus source and
> + * destination MACs after it has received an IPv6 router solicitation
> + * (IPv6 is transmitted correctly). This function will also fix up
> + * such packets.
>   */
>  static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>  {
> @@ -195,12 +201,21 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>                         return 0;
>                 if (is_multicast_ether_addr(skb->data))
>                         return 1;
> +
>                 /* possibly bogus destination - rewrite just in case */
>                 skb_reset_mac_header(skb);
>                 goto fix_dest;
>         default:
>                 if (rawip)
>                         return 0;
> +
> +               /* Bogus ethertype and src/dst mac for v4 on ZTE MF821D */
> +               if ((skb->data[12] & 0x7f) == 0x63
> +                && (skb->data[13] & 0xfe) == 0xbc) {
> +                       proto = htons(ETH_P_IP);
> +                       goto reset_mac;
> +               }
> +
>                 /* pass along other packets without modifications */
>                 return 1;
>         }
> @@ -213,6 +228,7 @@ static int qmi_wwan_rx_fixup(struct usbnet *dev, struct sk_buff *skb)
>         if (skb_headroom(skb) < ETH_HLEN)
>                 return 0;
>         skb_push(skb, ETH_HLEN);
> +reset_mac:
>         skb_reset_mac_header(skb);
>         eth_hdr(skb)->h_proto = proto;
>         eth_zero_addr(eth_hdr(skb)->h_source);


I must admit that I don't like the additional magic stuff at all.  I
don't doubt a second that it fixes the problem for you.  But there is
also no doubt that the firmware is so broken that there is no way we can
know this fixes anything for anyone else.  Where did the firmware get
those magic numbers?  It's definitely garbage.  But we have no reason to
think the garbage is invariable. It could be part of the firmware code,
or a serial number, or your assigned IPv6 prefix, or something else that
changes between different modems.  The fact that it is stable for you is
unfortunately not proof that it is the same for everyone else.

We have rejected similar fixes when this issue has come up earlier (it
looks like a generic Qualcomm firmware issue shared with other vendors
using the same baseband firmware generation). Using raw-ip is the
recommended workaround for these modems.

In any case, if we're going to add a fix like this, then I want it way
more generic.  The only valid ethertypes expected from the modem is IP,
IPV6 or ARP. Testing against those three, resetting anything else to IP,
will at least catch *any* bogus value.

But I'm not convinced I want this additional processing of every packet
just to let Qualcomm go on hiring monkeys-on-crack to write their
firmware. At least not when we do have raw-ip as a workaround for the
issue.

Feel free to try to convince me, though.


Bjørn

^ permalink raw reply

* Re: [PATCH] ip6_output: ensure flow saddr actually belongs to device
From: David Ahern @ 2016-11-13 16:30 UTC (permalink / raw)
  To: Jason A. Donenfeld, Netdev, WireGuard mailing list, LKML,
	YOSHIFUJI Hideaki, Hannes Frederic Sowa
In-Reply-To: <20161113132347.17907-1-Jason@zx2c4.com>

On 11/13/16 6:23 AM, Jason A. Donenfeld wrote:
> This puts the IPv6 routing functions in parity with the IPv4 routing
> functions. Namely, we now check in v6 that if a flowi6 requests an
> saddr, the returned dst actually corresponds to a net device that has
> that saddr. This mirrors the v4 logic with __ip_dev_find in
> __ip_route_output_key_hash. In the event that the returned dst is not
> for a dst with a dev that has the saddr, we return -EINVAL, just like
> v4; this makes it easy to use the same error handlers for both cases.
> 
> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> Cc: David Ahern <dsa@cumulusnetworks.com>
> ---
>  net/ipv6/ip6_output.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index 6001e78..a834129 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -1011,6 +1011,11 @@ static int ip6_dst_lookup_tail(struct net *net, const struct sock *sk,
>  		}
>  	}
>  #endif
> +	if (!ipv6_addr_any(&fl6->saddr) &&
> +	    !ipv6_chk_addr(net, &fl6->saddr, (*dst)->dev, 1)) {
> +		err = -EINVAL;
> +		goto out_err_release;
> +	}
>  
>  	return 0;

You can't require the address to be on the dst device. e.g., it can be an address from the loopback/vrf device.

This block needs to be done at function entry, and pass dev as NULL to mean is the address assigned to any interface. That gets you the equivalency of the IPv4 check.

^ permalink raw reply

* Re: [PATCH] Fixup packets with incorrect ethertype sent by ZTE MF821D
From: David Miller @ 2016-11-13 17:00 UTC (permalink / raw)
  To: bjorn; +Cc: plz, netdev
In-Reply-To: <87zil3cq9i.fsf@miraculix.mork.no>

From: Bjørn Mork <bjorn@mork.no>
Date: Sun, 13 Nov 2016 17:12:57 +0100

> In any case, if we're going to add a fix like this, then I want it way
> more generic.  The only valid ethertypes expected from the modem is IP,
> IPV6 or ARP. Testing against those three, resetting anything else to IP,
> will at least catch *any* bogus value.
> 
> But I'm not convinced I want this additional processing of every packet
> just to let Qualcomm go on hiring monkeys-on-crack to write their
> firmware. At least not when we do have raw-ip as a workaround for the
> issue.
> 
> Feel free to try to convince me, though.

There are also several coding style problems with this patch.

^ permalink raw reply

* Re: [PATCH] r8152: Fix error path in open function
From: David Miller @ 2016-11-13 17:03 UTC (permalink / raw)
  To: linux; +Cc: linux-usb, netdev, linux-kernel
In-Reply-To: <1478749885-32212-1-git-send-email-linux@roeck-us.net>

From: Guenter Roeck <linux@roeck-us.net>
Date: Wed,  9 Nov 2016 19:51:25 -0800

> If usb_submit_urb() called from the open function fails, the following
> crash may be observed.
 ...
> Clean up error handling to avoid registering the notifier if the open
> function is going to fail.
> 
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>

Looks good, applied, thanks.

^ permalink raw reply

* Re: [PATCH] net: phy: marvell: optimize logic for page changing during init
From: David Miller @ 2016-11-13 17:05 UTC (permalink / raw)
  To: uwe; +Cc: f.fainelli, netdev
In-Reply-To: <20161110140301.13381-1-uwe@kleine-koenig.org>

From: Uwe Kleine-König <uwe@kleine-koenig.org>
Date: Thu, 10 Nov 2016 15:03:01 +0100

> Instead of remembering if the page was changed, just compare the current
> page to the saved one. This is easier and has the advantage to save a
> register write if the page was already restored.
> 
> Signed-off-by: Uwe Kleine-König <uwe@kleine-koenig.org>

Applied to net-next, thanks.

^ permalink raw reply

* Re: [PATCH] genetlink: fix unsigned int comparison with less than zero
From: David Miller @ 2016-11-13 17:15 UTC (permalink / raw)
  To: colin.king
  Cc: johannes.berg, pshelar, weiyongjun1, fw, tycho.andersen,
	xiyou.wangcong, tom, netdev, linux-kernel
In-Reply-To: <20161110155758.26996-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Thu, 10 Nov 2016 15:57:58 +0000

> From: Colin Ian King <colin.king@canonical.com>
> 
> family->id is unsigned, so the less than zero check for
> failure return from idr_alloc is never true and so the error exit
> is never handled.  Instead, assign err and check if this is less
> than zero since this is a signed integer.
> 
> Issue found with static analysis with CoverityScan, CID 1375916
> 
> Signed-off-by: Colin Ian King <colin.king@canonical.com>
> ---
>  net/netlink/genetlink.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/net/netlink/genetlink.c b/net/netlink/genetlink.c
> index f0b65fe..2ea61ba 100644
> --- a/net/netlink/genetlink.c
> +++ b/net/netlink/genetlink.c
> @@ -360,12 +360,10 @@ int genl_register_family(struct genl_family *family)
>  	} else
>  		family->attrbuf = NULL;
>  
> -	family->id = idr_alloc(&genl_fam_idr, family,
> +	family->id = err = idr_alloc(&genl_fam_idr, family,
>  			       start, end + 1, GFP_KERNEL);

First of all, if we make this change you must fixup the indentation of
the second l ine of this idr_alloc() call.  Arguments spanning
multiple lines of a call must be indented precisely to the column
following the openning parenthesis of the first line.

Next, the IDR helpers never give us values that fall within the
positive range of an integer that does not fall within the postive
range of an unsigned integer.

We are going to pass this value in later to release the ID, again
the interface will expect a signed rather than an unsigned int.

Therefore is makes sesne only to change the family->id type to
what it must be, which is a signed int.

I've commited the following to net-next:

====================
[PATCH] genetlink: Make family a signed integer.

The idr_alloc(), idr_remove(), et al. routines all expect IDs to be
signed integers.  Therefore make the genl_family member 'id' signed
too.

Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/genetlink.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/genetlink.h b/include/net/genetlink.h
index 3ec87ba..a34275b 100644
--- a/include/net/genetlink.h
+++ b/include/net/genetlink.h
@@ -48,7 +48,7 @@ struct genl_info;
  * @n_ops: number of operations supported by this family
  */
 struct genl_family {
-	unsigned int		id;		/* private */
+	int			id;		/* private */
 	unsigned int		hdrsize;
 	char			name[GENL_NAMSIZ];
 	unsigned int		version;
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH] Fixup packets with incorrect ethertype sent by ZTE MF821D
From: Jussi Peltola @ 2016-11-13 17:25 UTC (permalink / raw)
  To: Bjørn Mork; +Cc: netdev
In-Reply-To: <87zil3cq9i.fsf@miraculix.mork.no>

On Sun, Nov 13, 2016 at 05:12:57PM +0100, Bjørn Mork wrote:
> In any case, if we're going to add a fix like this, then I want it way
> more generic.  The only valid ethertypes expected from the modem is IP,
> IPV6 or ARP. Testing against those three, resetting anything else to IP,
> will at least catch *any* bogus value.
 
Yes, this is pretty obvious, and it's pretty easy to look for an initial
4 or 6 and assume anything else is ARP (and just pass it through since
no reports of wrong ARP ethertype have been seen as far as I know and
fixing ARP up is probably futile if the L2 addresses in the body or
header don't make sense.)

I didn't bother writing this before sending this one first to provoke
discussion. It should actually be pretty simple; just pop the L2 header
if it looks like it's too broken and later add a new one if (!rawip).

> But I'm not convinced I want this additional processing of every packet
> just to let Qualcomm go on hiring monkeys-on-crack to write their
> firmware. At least not when we do have raw-ip as a workaround for the
> issue.
> 
> Feel free to try to convince me, though.

This is a modem widely sold in Finland by one of the telcos that has
enabled IPv6 to all subscribers. So the population affected is
definitely not just one person - but I will have to see if rawip works
and then see if ModemManager can be made to use that by default. I
didn't initially even think of trying, because v6 only works on this
modem after a router solicit. But who knows...

I find bugs like this, where the general answer for users (or userspace
parts like ModemManager) is "just disable this ipv6 crap" pretty nasty.
Even if the required fix is not exactly elegant, brokenness like this
can greatly set back getting v6 enabled when available.

As more v6 is deployed around the world I would expect more broken
Qualcomm modems to show themselves. I'll have to try to test this modem
to see if this bug disappears when the telco does not have IPv6; if so,
people are in for annoying surprises as the modem suddenly stops working
when the telco deploys v6.

I can agree with the lack of elegance. But I don't see processing per
every packet as a significant issue when the devices connect over USB
and their transfer rates are limited by the real world performance of
mobile networks, and the other option is that the device just doesn't
work at all. A knob would definitely feel wrong, as there is no
indication this logic will ever break any functionality for anyone, it
just wastes a few CPU cycles if the modem is not broken.

^ permalink raw reply

* Re: [PATCH net,v2] ipv4: use new_gw for redirect neigh lookup
From: David Miller @ 2016-11-13 17:25 UTC (permalink / raw)
  To: stephen.suryaputra.lin; +Cc: netdev, ssurya
In-Reply-To: <1478794575-28748-1-git-send-email-ssurya@ieee.org>

From: Stephen Suryaputra Lin <stephen.suryaputra.lin@gmail.com>
Date: Thu, 10 Nov 2016 11:16:15 -0500

> In v2.6, ip_rt_redirect() calls arp_bind_neighbour() which returns 0
> and then the state of the neigh for the new_gw is checked. If the state
> isn't valid then the redirected route is deleted. This behavior is
> maintained up to v3.5.7 by check_peer_redirect() because rt->rt_gateway
> is assigned to peer->redirect_learned.a4 before calling
> ipv4_neigh_lookup().
> 
> After commit 5943634fc559 ("ipv4: Maintain redirect and PMTU info in
> struct rtable again."), ipv4_neigh_lookup() is performed without the
> rt_gateway assigned to the new_gw. In the case when rt_gateway (old_gw)
> isn't zero, the function uses it as the key. The neigh is most likely
> valid since the old_gw is the one that sends the ICMP redirect message.
> Then the new_gw is assigned to fib_nh_exception. The problem is: the
> new_gw ARP may never gets resolved and the traffic is blackholed.
> 
> So, use the new_gw for neigh lookup.
> 
> Changes from v1:
>  - use __ipv4_neigh_lookup instead (per Eric Dumazet).
> 
> Fixes: 5943634fc559 ("ipv4: Maintain redirect and PMTU info in struct rtable again.")
> Signed-off-by: Stephen Suryaputra Lin <ssurya@ieee.org>

Looks good, applied and queued up for -stable.

Thanks.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox