Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2] ppp: remove the PPPIOCDETACH ioctl
From: Walter Harms @ 2018-06-06  9:01 UTC (permalink / raw)
  To: linux-ppp, Eric Biggers, Paul Mackerras
  Cc: syzkaller-bugs, Guillaume Nault, Eric Biggers, netdev,
	linux-kernel, linux-fsdevel
In-Reply-To: <20180523213738.146911-1-ebiggers3@gmail.com>



> Eric Biggers <ebiggers3@gmail.com> hat am 23. Mai 2018 um 23:37 geschrieben:
> 
> 
> From: Eric Biggers <ebiggers@google.com>
> 
> The PPPIOCDETACH ioctl effectively tries to "close" the given ppp file
> before f_count has reached 0, which is fundamentally a bad idea.  It
> does check 'f_count < 2', which excludes concurrent operations on the
> file since they would only be possible with a shared fd table, in which
> case each fdget() would take a file reference.  However, it fails to
> account for the fact that even with 'f_count == 1' the file can still be
> linked into epoll instances.  As reported by syzbot, this can trivially
> be used to cause a use-after-free.
> 
> Yet, the only known user of PPPIOCDETACH is pppd versions older than
> ppp-2.4.2, which was released almost 15 years ago (November 2003).
> Also, PPPIOCDETACH apparently stopped working reliably at around the
> same time, when the f_count check was added to the kernel, e.g. see
> https://lkml.org/lkml/2002/12/31/83.  Also, the current 'f_count < 2'
> check makes PPPIOCDETACH only work in single-threaded applications; it
> always fails if called from a multithreaded application.
> 
> All pppd versions released in the last 15 years just close() the file
> descriptor instead.
> 
> Therefore, instead of hacking around this bug by exporting epoll
> internals to modules, and probably missing other related bugs, just
> remove the PPPIOCDETACH ioctl and see if anyone actually notices.  Leave
> a stub in place that prints a one-time warning and returns EINVAL.
> 
> Reported-by: syzbot+16363c99d4134717c05b@syzkaller.appspotmail.com
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Cc: stable@vger.kernel.org
> Signed-off-by: Eric Biggers <ebiggers@google.com>
> ---
> 
> v2: leave a stub in place, rather than removing the ioctl completely.
> 
>  Documentation/networking/ppp_generic.txt |  6 ------
>  drivers/net/ppp/ppp_generic.c            | 27 +++++-------------------
>  include/uapi/linux/ppp-ioctl.h           |  2 +-
>  3 files changed, 6 insertions(+), 29 deletions(-)
> 
> diff --git a/Documentation/networking/ppp_generic.txt
> b/Documentation/networking/ppp_generic.txt
> index 091d20273dcb..61daf4b39600 100644
> --- a/Documentation/networking/ppp_generic.txt
> +++ b/Documentation/networking/ppp_generic.txt
> @@ -300,12 +300,6 @@ unattached instance are:
>  The ioctl calls available on an instance of /dev/ppp attached to a
>  channel are:
>  
> -* PPPIOCDETACH detaches the instance from the channel.  This ioctl is
> -  deprecated since the same effect can be achieved by closing the
> -  instance.  In order to prevent possible races this ioctl will fail
> -  with an EINVAL error if more than one file descriptor refers to this
> -  instance (i.e. as a result of dup(), dup2() or fork()).
> -
>  * PPPIOCCONNECT connects this channel to a PPP interface.  The
>    argument should point to an int containing the interface unit
>    number.  It will return an EINVAL error if the channel is already
> diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
> index dc7c7ec43202..02ad03a2fab7 100644
> --- a/drivers/net/ppp/ppp_generic.c
> +++ b/drivers/net/ppp/ppp_generic.c
> @@ -605,30 +605,13 @@ static long ppp_ioctl(struct file *file, unsigned int
> cmd, unsigned long arg)
>  
>  	if (cmd == PPPIOCDETACH) {
>  		/*
> -		 * We have to be careful here... if the file descriptor
> -		 * has been dup'd, we could have another process in the
> -		 * middle of a poll using the same file *, so we had
> -		 * better not free the interface data structures -
> -		 * instead we fail the ioctl.  Even in this case, we
> -		 * shut down the interface if we are the owner of it.
> -		 * Actually, we should get rid of PPPIOCDETACH, userland
> -		 * (i.e. pppd) could achieve the same effect by closing
> -		 * this fd and reopening /dev/ppp.
> +		 * PPPIOCDETACH is no longer supported as it was heavily broken,
> +		 * and is only known to have been used by pppd older than
> +		 * ppp-2.4.2 (released November 2003).
>  		 */
> +		pr_warn_once("%s (%d) used obsolete PPPIOCDETACH ioctl\n",
> +			     current->comm, current->pid);
>  		err = -EINVAL;
> -		if (pf->kind == INTERFACE) {
> -			ppp = PF_TO_PPP(pf);
> -			rtnl_lock();
> -			if (file == ppp->owner)
> -				unregister_netdevice(ppp->dev);
> -			rtnl_unlock();
> -		}
> -		if (atomic_long_read(&file->f_count) < 2) {
> -			ppp_release(NULL, file);
> -			err = 0;
> -		} else
> -			pr_warn("PPPIOCDETACH file->f_count=%ld\n",
> -				atomic_long_read(&file->f_count));
>  		goto out;
>  	}
>  
> diff --git a/include/uapi/linux/ppp-ioctl.h b/include/uapi/linux/ppp-ioctl.h
> index b19a9c249b15..784c2e3e572e 100644
> --- a/include/uapi/linux/ppp-ioctl.h
> +++ b/include/uapi/linux/ppp-ioctl.h
> @@ -106,7 +106,7 @@ struct pppol2tp_ioc_stats {
>  #define PPPIOCGIDLE	_IOR('t', 63, struct ppp_idle) /* get idle time */
>  #define PPPIOCNEWUNIT	_IOWR('t', 62, int)	/* create new ppp unit */
>  #define PPPIOCATTACH	_IOW('t', 61, int)	/* attach to ppp unit */
> -#define PPPIOCDETACH	_IOW('t', 60, int)	/* detach from ppp unit/chan */
> +#define PPPIOCDETACH	_IOW('t', 60, int)	/* obsolete, do not use */

It's a bit late but ...
i would suggest a stronger wording here. like /* removed 2003 */ 
for me "obsolete" sounds like "maybe it will be removed in a distant future"

just my 2 cents,

re,
 wh

>  #define PPPIOCSMRRU	_IOW('t', 59, int)	/* set multilink MRU */
>  #define PPPIOCCONNECT	_IOW('t', 58, int)	/* connect channel to unit */
>  #define PPPIOCDISCONN	_IO('t', 57)		/* disconnect channel */
> -- 
> 2.17.0.441.gb46fe60e1d-goog
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ppp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [RFC PATCH 0/6] net: ethernet: ti: cpsw: add MQPRIO and CBS Qdisc offload
From: Ivan Khoronzhuk @ 2018-06-06  8:53 UTC (permalink / raw)
  To: Vinicius Costa Gomes
  Cc: grygorii.strashko, davem, corbet, akpm, netdev, linux-doc,
	linux-kernel, linux-omap, henrik, jesus.sanchez-palencia
In-Reply-To: <87602y1n4j.fsf@intel.com>

On Mon, Jun 04, 2018 at 02:23:56PM -0300, Vinicius Costa Gomes wrote:
>Hi Ivan,
>
>Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> writes:
>
>> This series adds MQPRIO and CBS Qdisc offload for TI cpsw driver.
>> It potentially can be used in audio video bridging (AVB) and time
>> sensitive networking (TSN).
>>
>> Patchset was tested on AM572x EVM and BBB boards. Last patch from this
>> series adds detailed description of configuration with examples. For
>> consistency reasons, in role of talker and listener, tools from
>> patchset "TSN: Add qdisc based config interface for CBS" were used and
>> can be seen here: https://www.spinics.net/lists/netdev/msg460869.html
>>
>> Based on net-next/master
>>
>
>I didn't test this, but it looks fine from my side.
>
>I agree with Grygorii, that if no comments, this should be re-sent as a
>patch series next.
Thanks, I'll do it on this week.

>
>
>> Ivan Khoronzhuk (6):
>>   net: ethernet: ti: cpsw: use cpdma channels in backward order for txq
>>   net: ethernet: ti: cpdma: fit rated channels in backward order
>>   net: ethernet: ti: cpsw: add MQPRIO Qdisc offload
>>   net: ethernet: ti: cpsw: add CBS Qdisc offload
>>   net: ethernet: ti: cpsw: restore shaper configuration while down/up
>>   Documentation: networking: cpsw: add MQPRIO & CBS offload examples
>>
>>  Documentation/networking/cpsw.txt       | 540 ++++++++++++++++++++++++
>>  drivers/net/ethernet/ti/cpsw.c          | 364 +++++++++++++++-
>>  drivers/net/ethernet/ti/davinci_cpdma.c |  31 +-
>>  3 files changed, 913 insertions(+), 22 deletions(-)
>>  create mode 100644 Documentation/networking/cpsw.txt
>>
>> --
>> 2.17.0
>
>
>Cheers,
>--
>Vinicius

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply

* Re: [PATCH net-next 2/3] net: hns3: Fix for VF mailbox receiving unknown message
From: kbuild test robot @ 2018-06-06  8:48 UTC (permalink / raw)
  To: Salil Mehta
  Cc: kbuild-all, davem, salil.mehta, yisen.zhuang, lipeng321,
	mehta.salil, netdev, linux-kernel, linuxarm, Xi Wang
In-Reply-To: <20180605114201.29900-3-salil.mehta@huawei.com>

[-- Attachment #1: Type: text/plain, Size: 9646 bytes --]

Hi Xi,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Salil-Mehta/Bug-fixes-optimization-for-HNS3-Driver/20180606-155040
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 8.1.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All errors (new ones prefixed by >>):

   In file included from include/linux/init.h:5,
                    from drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h:6,
                    from drivers/net/ethernet//hisilicon/hns3/hns3vf/hclgevf_mbx.c:4:
   drivers/net/ethernet//hisilicon/hns3/hns3vf/hclgevf_mbx.c: In function 'hclgevf_mbx_handler':
>> drivers/net/ethernet//hisilicon/hns3/hns3vf/hclgevf_mbx.c:155:17: error: implicit declaration of function 'hnae3_get_bit'; did you mean 'hnae_get_bit'? [-Werror=implicit-function-declaration]
      if (unlikely(!hnae3_get_bit(flag, HCLGEVF_CMDQ_RX_OUTVLD_B))) {
                    ^~~~~~~~~~~~~
   include/linux/compiler.h:77:42: note: in definition of macro 'unlikely'
    # define unlikely(x) __builtin_expect(!!(x), 0)
                                             ^
   cc1: some warnings being treated as errors

vim +155 drivers/net/ethernet//hisilicon/hns3/hns3vf/hclgevf_mbx.c

     3	
   > 4	#include "hclge_mbx.h"
     5	#include "hclgevf_main.h"
     6	#include "hnae3.h"
     7	
     8	static void hclgevf_reset_mbx_resp_status(struct hclgevf_dev *hdev)
     9	{
    10		/* this function should be called with mbx_resp.mbx_mutex held
    11		 * to prtect the received_response from race condition
    12		 */
    13		hdev->mbx_resp.received_resp  = false;
    14		hdev->mbx_resp.origin_mbx_msg = 0;
    15		hdev->mbx_resp.resp_status    = 0;
    16		memset(hdev->mbx_resp.additional_info, 0, HCLGE_MBX_MAX_RESP_DATA_SIZE);
    17	}
    18	
    19	/* hclgevf_get_mbx_resp: used to get a response from PF after VF sends a mailbox
    20	 * message to PF.
    21	 * @hdev: pointer to struct hclgevf_dev
    22	 * @resp_msg: pointer to store the original message type and response status
    23	 * @len: the resp_msg data array length.
    24	 */
    25	static int hclgevf_get_mbx_resp(struct hclgevf_dev *hdev, u16 code0, u16 code1,
    26					u8 *resp_data, u16 resp_len)
    27	{
    28	#define HCLGEVF_MAX_TRY_TIMES	500
    29	#define HCLGEVF_SLEEP_USCOEND	1000
    30		struct hclgevf_mbx_resp_status *mbx_resp;
    31		u16 r_code0, r_code1;
    32		int i = 0;
    33	
    34		if (resp_len > HCLGE_MBX_MAX_RESP_DATA_SIZE) {
    35			dev_err(&hdev->pdev->dev,
    36				"VF mbx response len(=%d) exceeds maximum(=%d)\n",
    37				resp_len,
    38				HCLGE_MBX_MAX_RESP_DATA_SIZE);
    39			return -EINVAL;
    40		}
    41	
    42		while ((!hdev->mbx_resp.received_resp) && (i < HCLGEVF_MAX_TRY_TIMES)) {
    43			udelay(HCLGEVF_SLEEP_USCOEND);
    44			i++;
    45		}
    46	
    47		if (i >= HCLGEVF_MAX_TRY_TIMES) {
    48			dev_err(&hdev->pdev->dev,
    49				"VF could not get mbx resp(=%d) from PF in %d tries\n",
    50				hdev->mbx_resp.received_resp, i);
    51			return -EIO;
    52		}
    53	
    54		mbx_resp = &hdev->mbx_resp;
    55		r_code0 = (u16)(mbx_resp->origin_mbx_msg >> 16);
    56		r_code1 = (u16)(mbx_resp->origin_mbx_msg & 0xff);
    57	
    58		if (mbx_resp->resp_status)
    59			return mbx_resp->resp_status;
    60	
    61		if (resp_data)
    62			memcpy(resp_data, &mbx_resp->additional_info[0], resp_len);
    63	
    64		hclgevf_reset_mbx_resp_status(hdev);
    65	
    66		if (!(r_code0 == code0 && r_code1 == code1 && !mbx_resp->resp_status)) {
    67			dev_err(&hdev->pdev->dev,
    68				"VF could not match resp code(code0=%d,code1=%d), %d",
    69				code0, code1, mbx_resp->resp_status);
    70			return -EIO;
    71		}
    72	
    73		return 0;
    74	}
    75	
    76	int hclgevf_send_mbx_msg(struct hclgevf_dev *hdev, u16 code, u16 subcode,
    77				 const u8 *msg_data, u8 msg_len, bool need_resp,
    78				 u8 *resp_data, u16 resp_len)
    79	{
    80		struct hclge_mbx_vf_to_pf_cmd *req;
    81		struct hclgevf_desc desc;
    82		int status;
    83	
    84		req = (struct hclge_mbx_vf_to_pf_cmd *)desc.data;
    85	
    86		/* first two bytes are reserved for code & subcode */
    87		if (msg_len > (HCLGE_MBX_MAX_MSG_SIZE - 2)) {
    88			dev_err(&hdev->pdev->dev,
    89				"VF send mbx msg fail, msg len %d exceeds max len %d\n",
    90				msg_len, HCLGE_MBX_MAX_MSG_SIZE);
    91			return -EINVAL;
    92		}
    93	
    94		hclgevf_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_MBX_VF_TO_PF, false);
    95		req->msg[0] = code;
    96		req->msg[1] = subcode;
    97		memcpy(&req->msg[2], msg_data, msg_len);
    98	
    99		/* synchronous send */
   100		if (need_resp) {
   101			mutex_lock(&hdev->mbx_resp.mbx_mutex);
   102			hclgevf_reset_mbx_resp_status(hdev);
   103			status = hclgevf_cmd_send(&hdev->hw, &desc, 1);
   104			if (status) {
   105				dev_err(&hdev->pdev->dev,
   106					"VF failed(=%d) to send mbx message to PF\n",
   107					status);
   108				mutex_unlock(&hdev->mbx_resp.mbx_mutex);
   109				return status;
   110			}
   111	
   112			status = hclgevf_get_mbx_resp(hdev, code, subcode, resp_data,
   113						      resp_len);
   114			mutex_unlock(&hdev->mbx_resp.mbx_mutex);
   115		} else {
   116			/* asynchronous send */
   117			status = hclgevf_cmd_send(&hdev->hw, &desc, 1);
   118			if (status) {
   119				dev_err(&hdev->pdev->dev,
   120					"VF failed(=%d) to send mbx message to PF\n",
   121					status);
   122				return status;
   123			}
   124		}
   125	
   126		return status;
   127	}
   128	
   129	static bool hclgevf_cmd_crq_empty(struct hclgevf_hw *hw)
   130	{
   131		u32 tail = hclgevf_read_dev(hw, HCLGEVF_NIC_CRQ_TAIL_REG);
   132	
   133		return tail == hw->cmq.crq.next_to_use;
   134	}
   135	
   136	void hclgevf_mbx_handler(struct hclgevf_dev *hdev)
   137	{
   138		struct hclgevf_mbx_resp_status *resp;
   139		struct hclge_mbx_pf_to_vf_cmd *req;
   140		struct hclgevf_cmq_ring *crq;
   141		struct hclgevf_desc *desc;
   142		u16 *msg_q;
   143		u16 flag;
   144		u8 *temp;
   145		int i;
   146	
   147		resp = &hdev->mbx_resp;
   148		crq = &hdev->hw.cmq.crq;
   149	
   150		while (!hclgevf_cmd_crq_empty(&hdev->hw)) {
   151			desc = &crq->desc[crq->next_to_use];
   152			req = (struct hclge_mbx_pf_to_vf_cmd *)desc->data;
   153	
   154			flag = le16_to_cpu(crq->desc[crq->next_to_use].flag);
 > 155			if (unlikely(!hnae3_get_bit(flag, HCLGEVF_CMDQ_RX_OUTVLD_B))) {
   156				dev_warn(&hdev->pdev->dev,
   157					 "dropped invalid mailbox message, code = %d\n",
   158					 req->msg[0]);
   159	
   160				/* dropping/not processing this invalid message */
   161				crq->desc[crq->next_to_use].flag = 0;
   162				hclge_mbx_ring_ptr_move_crq(crq);
   163				continue;
   164			}
   165	
   166			/* synchronous messages are time critical and need preferential
   167			 * treatment. Therefore, we need to acknowledge all the sync
   168			 * responses as quickly as possible so that waiting tasks do not
   169			 * timeout and simultaneously queue the async messages for later
   170			 * prcessing in context of mailbox task i.e. the slow path.
   171			 */
   172			switch (req->msg[0]) {
   173			case HCLGE_MBX_PF_VF_RESP:
   174				if (resp->received_resp)
   175					dev_warn(&hdev->pdev->dev,
   176						 "VF mbx resp flag not clear(%d)\n",
   177						 req->msg[1]);
   178				resp->received_resp = true;
   179	
   180				resp->origin_mbx_msg = (req->msg[1] << 16);
   181				resp->origin_mbx_msg |= req->msg[2];
   182				resp->resp_status = req->msg[3];
   183	
   184				temp = (u8 *)&req->msg[4];
   185				for (i = 0; i < HCLGE_MBX_MAX_RESP_DATA_SIZE; i++) {
   186					resp->additional_info[i] = *temp;
   187					temp++;
   188				}
   189				break;
   190			case HCLGE_MBX_LINK_STAT_CHANGE:
   191			case HCLGE_MBX_ASSERTING_RESET:
   192				/* set this mbx event as pending. This is required as we
   193				 * might loose interrupt event when mbx task is busy
   194				 * handling. This shall be cleared when mbx task just
   195				 * enters handling state.
   196				 */
   197				hdev->mbx_event_pending = true;
   198	
   199				/* we will drop the async msg if we find ARQ as full
   200				 * and continue with next message
   201				 */
   202				if (hdev->arq.count >= HCLGE_MBX_MAX_ARQ_MSG_NUM) {
   203					dev_warn(&hdev->pdev->dev,
   204						 "Async Q full, dropping msg(%d)\n",
   205						 req->msg[1]);
   206					break;
   207				}
   208	
   209				/* tail the async message in arq */
   210				msg_q = hdev->arq.msg_q[hdev->arq.tail];
   211				memcpy(&msg_q[0], req->msg, HCLGE_MBX_MAX_ARQ_MSG_SIZE);
   212				hclge_mbx_tail_ptr_move_arq(hdev->arq);
   213				hdev->arq.count++;
   214	
   215				hclgevf_mbx_task_schedule(hdev);
   216	
   217				break;
   218			default:
   219				dev_err(&hdev->pdev->dev,
   220					"VF received unsupported(%d) mbx msg from PF\n",
   221					req->msg[0]);
   222				break;
   223			}
   224			crq->desc[crq->next_to_use].flag = 0;
   225			hclge_mbx_ring_ptr_move_crq(crq);
   226		}
   227	
   228		/* Write back CMDQ_RQ header pointer, M7 need this pointer */
   229		hclgevf_write_dev(&hdev->hw, HCLGEVF_NIC_CRQ_HEAD_REG,
   230				  crq->next_to_use);
   231	}
   232	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 53124 bytes --]

^ permalink raw reply

* Re: [PATCH 1/1] net: dsa: b53: Fix for brcm tag issue in Cygnus SoC
From: Clément Péron @ 2018-06-06  8:32 UTC (permalink / raw)
  To: scott.branden
  Cc: arun.parameswaran, Florian Fainelli, vivien.didelot, andrew,
	davem, netdev, linux-kernel, BCM Kernel Feedback
In-Reply-To: <8bb9ab39-a38b-9492-975e-d76c864ff8ff@broadcom.com>

Hi Scott,

On Tue, 5 Jun 2018 at 22:55, Scott Branden <scott.branden@broadcom.com> wrote:
>
> Clement,
>
> Please test this works for your issue.

Tested on top of 4.17.0 with a BCM58305

Tested-by: Clément Péron <peron.clem@gmail.com>

>
>
> On 18-06-05 01:38 PM, Arun Parameswaran wrote:
> > In the Broadcom Cygnus SoC, the brcm tag needs to be inserted
> > in between the mac address and the ether type (should use
> > 'DSA_PROTO_TAG_BRCM') for the packets sent to the internal
> > b53 switch.
> >
> > Since the Cygnus was added with the BCM58XX device id and the
> > BCM58XX uses 'DSA_PROTO_TAG_BRCM_PREPEND', the data path is
> > broken, due to the incorrect brcm tag location.
> >
> > Add a new b53 device id (BCM583XX) for Cygnus family to fix the
> > issue. Add the new device id to the BCM58XX family as Cygnus
> > is similar to the BCM58XX in most other functionalities.
> >
> > Fixes: 11606039604c ("net: dsa: b53: Support prepended Broadcom tags")
> >
> > Signed-off-by: Arun Parameswaran <arun.parameswaran@broadcom.com>
> Acked-by: Scott Branden <scott.branden@broadcom.com>
> > ---
> >   drivers/net/dsa/b53/b53_common.c | 15 ++++++++++++++-
> >   drivers/net/dsa/b53/b53_priv.h   |  2 ++
> >   drivers/net/dsa/b53/b53_srab.c   |  4 ++--
> >   3 files changed, 18 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
> > index 3da5fca..bbc6cc6 100644
> > --- a/drivers/net/dsa/b53/b53_common.c
> > +++ b/drivers/net/dsa/b53/b53_common.c
> > @@ -684,7 +684,8 @@ static int b53_switch_reset(struct b53_device *dev)
> >        * still use this driver as a library and need to perform the reset
> >        * earlier.
> >        */
> > -     if (dev->chip_id == BCM58XX_DEVICE_ID) {
> > +     if (dev->chip_id == BCM58XX_DEVICE_ID ||
> > +         dev->chip_id == BCM583XX_DEVICE_ID) {
> >               b53_read8(dev, B53_CTRL_PAGE, B53_SOFTRESET, &reg);
> >               reg |= SW_RST | EN_SW_RST | EN_CH_RST;
> >               b53_write8(dev, B53_CTRL_PAGE, B53_SOFTRESET, reg);
> > @@ -1880,6 +1881,18 @@ struct b53_chip_data {
> >               .jumbo_size_reg = B53_JUMBO_MAX_SIZE,
> >       },
> >       {
> > +             .chip_id = BCM583XX_DEVICE_ID,
> > +             .dev_name = "BCM583xx/11360",
> > +             .vlans = 4096,
> > +             .enabled_ports = 0x103,
> > +             .arl_entries = 4,
> > +             .cpu_port = B53_CPU_PORT,
> > +             .vta_regs = B53_VTA_REGS,
> > +             .duplex_reg = B53_DUPLEX_STAT_GE,
> > +             .jumbo_pm_reg = B53_JUMBO_PORT_MASK,
> > +             .jumbo_size_reg = B53_JUMBO_MAX_SIZE,
> > +     },
> > +     {
> >               .chip_id = BCM7445_DEVICE_ID,
> >               .dev_name = "BCM7445",
> >               .vlans  = 4096,
> > diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
> > index 3b57f47..b232aaa 100644
> > --- a/drivers/net/dsa/b53/b53_priv.h
> > +++ b/drivers/net/dsa/b53/b53_priv.h
> > @@ -62,6 +62,7 @@ enum {
> >       BCM53018_DEVICE_ID = 0x53018,
> >       BCM53019_DEVICE_ID = 0x53019,
> >       BCM58XX_DEVICE_ID = 0x5800,
> > +     BCM583XX_DEVICE_ID = 0x58300,
> >       BCM7445_DEVICE_ID = 0x7445,
> >       BCM7278_DEVICE_ID = 0x7278,
> >   };
> > @@ -181,6 +182,7 @@ static inline int is5301x(struct b53_device *dev)
> >   static inline int is58xx(struct b53_device *dev)
> >   {
> >       return dev->chip_id == BCM58XX_DEVICE_ID ||
> > +             dev->chip_id == BCM583XX_DEVICE_ID ||
> >               dev->chip_id == BCM7445_DEVICE_ID ||
> >               dev->chip_id == BCM7278_DEVICE_ID;
> >   }
> > diff --git a/drivers/net/dsa/b53/b53_srab.c b/drivers/net/dsa/b53/b53_srab.c
> > index c37ffd1..8247481 100644
> > --- a/drivers/net/dsa/b53/b53_srab.c
> > +++ b/drivers/net/dsa/b53/b53_srab.c
> > @@ -364,7 +364,7 @@ static int b53_srab_write64(struct b53_device *dev, u8 page, u8 reg,
> >       { .compatible = "brcm,bcm53018-srab" },
> >       { .compatible = "brcm,bcm53019-srab" },
> >       { .compatible = "brcm,bcm5301x-srab" },
> > -     { .compatible = "brcm,bcm11360-srab", .data = (void *)BCM58XX_DEVICE_ID },
> > +     { .compatible = "brcm,bcm11360-srab", .data = (void *)BCM583XX_DEVICE_ID },
> >       { .compatible = "brcm,bcm58522-srab", .data = (void *)BCM58XX_DEVICE_ID },
> >       { .compatible = "brcm,bcm58525-srab", .data = (void *)BCM58XX_DEVICE_ID },
> >       { .compatible = "brcm,bcm58535-srab", .data = (void *)BCM58XX_DEVICE_ID },
> > @@ -372,7 +372,7 @@ static int b53_srab_write64(struct b53_device *dev, u8 page, u8 reg,
> >       { .compatible = "brcm,bcm58623-srab", .data = (void *)BCM58XX_DEVICE_ID },
> >       { .compatible = "brcm,bcm58625-srab", .data = (void *)BCM58XX_DEVICE_ID },
> >       { .compatible = "brcm,bcm88312-srab", .data = (void *)BCM58XX_DEVICE_ID },
> > -     { .compatible = "brcm,cygnus-srab", .data = (void *)BCM58XX_DEVICE_ID },
> > +     { .compatible = "brcm,cygnus-srab", .data = (void *)BCM583XX_DEVICE_ID },
> >       { .compatible = "brcm,nsp-srab", .data = (void *)BCM58XX_DEVICE_ID },
> >       { /* sentinel */ },
> >   };
>

^ permalink raw reply

* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Ivan Khoronzhuk @ 2018-06-06  8:23 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Andrew Lunn, Ilias Apalodimas, Ivan Vecera, Jiri Pirko, netdev,
	nsekhar, francois.ozog, yogeshs, spatton
In-Reply-To: <4908e5f8-7533-6fa5-866f-1fb32fd13857@ti.com>

On Tue, Jun 05, 2018 at 06:23:45PM -0500, Grygorii Strashko wrote:
>
>
>On 06/02/2018 07:26 PM, Andrew Lunn wrote:
>>> *After this patch set*: goal keep things working the same as max as
>>> possible and get rid of TI custom tool.
>>
>> We are happy to keep things the same, if they fit with the switchdev
>> model. Anything in your customer TI tool/model which does not fit the
>> switchdev model you won't be able to keep, except if we agree to
>> extend the model.
>
>Right. That's the main goal of RFC to identify those gaps.
>
>>
>> I can say now, sw0p0 is going to cause problems. I really do suggest
>> you drop it for the moment in order to get a minimal driver
>> accepted. sw0p0 does not fit the switchdev model.
>
>Honestly, this is not the first patchset and we started without sw0p0,
>but then.... (with current LKML)
>- default vlan offloading breaks traffic reception to P0
> (Ilias saying it's fixed in next - good)
>- adding vlan to P1/P2 works, but not for P0 (again as per Ilias -fixed)
>- mcast - no way to manually add static record and include or exclude P0.
>
>
>:( above are basic functionality required.
>
>>
>>> Below I've described some tested use cases (not include full static configuration),
>>> but regarding sw0p0 - there is work done by Ivan Khoronzhuk [1] which enables
>>> adds MQPRIO and CBS Qdisc and targets AVB network features. It required to
>>> offload MQPRIO and CBS parameters on all ports including P0. In case of P0,
>>> CPDMA TX channels shapers need to be configured, and in case
>>> of sw0p1/sw0p2 internal FIFOS.
>>> sw0p0 also expected to be used to configure CPDMA interface in general -
>>> number of tx/rx channels, rates, ring sizes.
>>
>> Can this be derives from the configuration on sw0p1 and sw0p2?
>> sw0p1 has 1 tx channel, sw0p2 has 2 tx channels, so give p0 3 tx
>> channels?
>
>This not exactly what is required, sry I probably will need just repeat what Ivan
>described in https://lkml.org/lkml/2018/5/18/1135. I'd try to provide this info tomorrow.
>
>Unfortunately, I'm not sure if all this reworking and switchdev conversation would make sense
>if we will not be able to fit Ivan's work in new CPSW driver model ;..(
>and do AVB bridge.

Not sure I tracked everything, but current link example only for dual-emac mode,
where i guess no switchdev and thus we have only 2 ports having phys and no need
in some common configuration. Only common resources tx queues that are easily
shared between two ports. In case of switchdev the same, no need in port 0, at
least for cpsw implementation (not sure about others), tx queues cpdma shapers
are configured separately and can be done with any netdev presenting ports,
thus p0 can be avoided in this case also. For instance, I've created short
description, based on current switchdev RFC, showing simple configuration
commands for shapers configuration from host side and CBS for every port,
w/o p0 usage:
https://git.linaro.org/people/ivan.khoronzhuk/tsn_conf.git/tree/README_switchdev

Will add it once switchdev decision is stabilized and applied. Basically nothing
changed with dual-emac configuration except different rates set for cpdma
shapers from host side.

Probably, p0 is needed for other configurations, or for compatibility reasons
with old versions, but definitely should be created list of all cases, and on my
opinion, any common resource for both ports can be configured with any netdev
presenting ports if it doesn't conflict with ports configuration itself.

>
>>
>>> In addition there is set of global CPSW parameters (not related to P1/P2, like
>>> MAC Authorization Mode, OUI Deny Mode, crc ) which I've
>>> thought can be added to sw0p0 (using ethtool -priv-flags).
>>
>> You should describe these features, and then we can figure out how
>> best to model them. devlink might be an option if they are switch
>> global.
>
>Ok. that can be postponed.
>
>-- 
>regards,
>-grygorii

-- 
Regards,
Ivan Khoronzhuk

^ permalink raw reply

* Re: Freeze when using ipheth+IPsec+IPv6
From: Yves-Alexis Perez @ 2018-06-06  8:21 UTC (permalink / raw)
  To: linux-kernel, David S. Miller, Hans Liljestrand, David Windsor,
	Kees Cook, Reshetova, Elena, Kirill Tkhai, Al Viro, Cong Wang,
	Mateusz Jurczyk, Denys Vlasenko, David Herrmann, netdev,
	Alexander Kappner, Johannes Berg, Gustavo A. R. Silva,
	Arvind Yadav, Steffen Klassert, Herbert Xu
In-Reply-To: <20180605085450.GA3506@scapa.corsac.net>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

On Tue, Jun 05, 2018 at 10:54:51AM +0200, Yves-Alexis Perez wrote:
> Hi,
> 
> since some kernels releases (I didn't test thorougly but at least 4.16
> and 4.17) I have regular freezes in certain situations on my laptop.
> 
> It seems to happen when I:
> 
> - tether using my iPhone (involving ipheth)
> - mount an IPsec tunnel over IPv4
> - run evolution to fetch my mail (IMAP traffic over IPv6 inside the IPv4
>   IPsec tunnel)
> 
> When I do that, the interface seems to freeze. Last time the mouse was
> still moving so the kernel didn't completely crash, but the UI was
> completely irresponsive. I managed to get the attached log from
> /sys/fs/pstore with refcount_t stuff pointing to an underflow.

Today I had a different behavior. Again same situation (ipheth, IPsec
tunnel, refresh of the LKML folder in Evolution). The kernel didn't
crash/freeze but I had multiple (33309 actually) "recvmsg bug:
copied..." traces like this one:


[ 1555.957599] ------------[ cut here ]------------
[ 1555.957619] recvmsg bug: copied ABEA08B2 seq 1 rcvnxt ABEA0DCE fl 0
[ 1555.957805] WARNING: CPU: 3 PID: 2177 at /home/corsac/projets/linux/linux/net/ipv4/tcp.c:1850 tcp_recvmsg+0x610/0xb40
[ 1555.957813] Modules linked in: esp4 xfrm6_mode_tunnel xfrm4_mode_tunnel bnep ipheth rtsx_pci_sdmmc snd_hda_codec_realtek iwlmvm snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel iwlwifi snd_hda_codec snd_hwdep rtsx_pci snd_hda_core snd_pcm thinkpad_acpi efivarfs input_leds
[ 1555.957895] CPU: 3 PID: 2177 Comm: pool Tainted: G                T 4.17.0 #22
[ 1555.957902] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS N10ET48W (1.27 ) 09/12/2017
[ 1555.957922] RIP: 0010:tcp_recvmsg+0x610/0xb40
[ 1555.957927] RSP: 0018:ffffb77e010f7cf8 EFLAGS: 00010282
[ 1555.957932] RAX: 0000000000000000 RBX: 00000000abea08b2 RCX: 0000000000000006
[ 1555.957935] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffa37a8dd95610
[ 1555.957939] RBP: ffffb77e010f7db8 R08: 00000000000003b4 R09: 0000000000000004
[ 1555.957942] R10: ffffa37a3b1180c8 R11: 0000000000000001 R12: ffffa37a81d40e00
[ 1555.957945] R13: ffffa37a3b118000 R14: ffffa37a3b118524 R15: 0000000000000000
[ 1555.957951] FS:  0000738f795c0700(0000) GS:ffffa37a8dd80000(0000) knlGS:0000000000000000
[ 1555.957954] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1555.957957] CR2: 0000738f0879a028 CR3: 000000024200c006 CR4: 00000000003606e0
[ 1555.957964] Call Trace:
[ 1555.957996]  inet_recvmsg+0x5c/0x110
[ 1555.958017]  __sys_recvfrom+0xf2/0x160
[ 1555.958030]  __x64_sys_recvfrom+0x1f/0x30
[ 1555.958039]  do_syscall_64+0x72/0x1c0
[ 1555.958048]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1555.958053] RIP: 0033:0x73901a71deae
[ 1555.958056] RSP: 002b:0000738f795bee50 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[ 1555.958060] RAX: ffffffffffffffda RBX: 0000000000000028 RCX: 000073901a71deae
[ 1555.958063] RDX: 0000000000000404 RSI: 0000738f087955a7 RDI: 0000000000000028
[ 1555.958066] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1555.958068] R10: 0000000000000000 R11: 0000000000000246 R12: 0000738f087955a7
[ 1555.958071] R13: 0000000000000404 R14: 0000000000000000 R15: ffffffffffffffff
[ 1555.958075] Code: e9 33 fd ff ff 4c 89 e0 41 8b 8d 20 05 00 00 89 de 48 c7 c7 10 47 05 ae 48 89 85 48 ff ff ff 44 8b 85 70 ff ff ff e8 80 0d 93 ff <0f> 0b 48 8b 85 48 ff ff ff e9 ed fd ff ff 41 8b 8d 20 05 00 00 
[ 1555.958180] ---[ end trace e7da03c87ec51f13 ]---

(complete log available but it seems that only R08 is changing between
these traces)

Followed by a "recvmsg bug 2:":

[ 1563.657991] ------------[ cut here ]------------
[ 1563.657992] recvmsg bug 2: copied ABEA08B2 seq 6A7E3970 rcvnxt ABECA5EE fl 0
[ 1563.658002] WARNING: CPU: 1 PID: 2177 at /home/corsac/projets/linux/linux/net/ipv4/tcp.c:1864 tcp_recvmsg+0x647/0xb40
[ 1563.658002] Modules linked in: esp4 xfrm6_mode_tunnel xfrm4_mode_tunnel bnep ipheth rtsx_pci_sdmmc snd_hda_codec_realtek iwlmvm snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel iwlwifi snd_hda_codec snd_hwdep rtsx_pci snd_hda_core snd_pcm thinkpad_acpi efivarfs input_leds
[ 1563.658016] CPU: 1 PID: 2177 Comm: pool Tainted: G        W       T 4.17.0 #22
[ 1563.658017] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS N10ET48W (1.27 ) 09/12/2017
[ 1563.658019] RIP: 0010:tcp_recvmsg+0x647/0xb40
[ 1563.658020] RSP: 0018:ffffb77e010f7cf8 EFLAGS: 00010282
[ 1563.658022] RAX: 0000000000000000 RBX: 00000000416bcf42 RCX: 0000000000000006
[ 1563.658023] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffa37a8dc95610
[ 1563.658024] RBP: ffffb77e010f7db8 R08: 000000000013fd88 R09: 0000000000000004
[ 1563.658026] R10: ffffa37a3b1180c8 R11: 0000000000000001 R12: ffffa37a81d40e00
[ 1563.658027] R13: ffffa37a3b118000 R14: ffffa37a3b118524 R15: 0000000000000000
[ 1563.658028] FS:  0000738f795c0700(0000) GS:ffffa37a8dc80000(0000) knlGS:0000000000000000
[ 1563.658030] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1563.658031] CR2: 00007f967818b048 CR3: 000000024200c003 CR4: 00000000003606e0
[ 1563.658032] Call Trace:
[ 1563.658040]  inet_recvmsg+0x5c/0x110
[ 1563.658046]  __sys_recvfrom+0xf2/0x160
[ 1563.658054]  __x64_sys_recvfrom+0x1f/0x30
[ 1563.658060]  do_syscall_64+0x72/0x1c0
[ 1563.658062]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1563.658065] RIP: 0033:0x73901a71deae
[ 1563.658070] RSP: 002b:0000738f795bee50 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[ 1563.658080] RAX: ffffffffffffffda RBX: 0000000000000028 RCX: 000073901a71deae
[ 1563.658085] RDX: 0000000000000404 RSI: 0000738f087955a7 RDI: 0000000000000028
[ 1563.658089] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1563.658092] R10: 0000000000000000 R11: 0000000000000246 R12: 0000738f087955a7
[ 1563.658097] R13: 0000000000000404 R14: 0000000000000000 R15: ffffffffffffffff
[ 1563.658102] Code: ff ff 41 8b 8d 20 05 00 00 48 c7 c7 40 47 05 ae 4c 89 95 48 ff ff ff 41 8b 54 24 28 44 8b 85 70 ff ff ff 41 8b 36 e8 49 0d 93 ff <0f> 0b 4c 8b 95 48 ff ff ff e9 89 fb ff ff 49 8b 55 60 83 e2 02 
[ 1563.658219] ---[ end trace e7da03c87ec5c408 ]---

and finally a NULL pointer dereference:

[ 1563.658223] BUG: unable to handle kernel NULL pointer dereference at 0000000000000028
[ 1563.658230] PGD 0 P4D 0 
[ 1563.658234] Oops: 0000 [#1] PREEMPT SMP PTI
[ 1563.658237] Modules linked in: esp4 xfrm6_mode_tunnel xfrm4_mode_tunnel bnep ipheth rtsx_pci_sdmmc snd_hda_codec_realtek iwlmvm snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel iwlwifi snd_hda_codec snd_hwdep rtsx_pci snd_hda_core snd_pcm thinkpad_acpi efivarfs input_leds
[ 1563.658253] CPU: 1 PID: 2177 Comm: pool Tainted: G        W       T 4.17.0 #22
[ 1563.658255] Hardware name: LENOVO 20CMCTO1WW/20CMCTO1WW, BIOS N10ET48W (1.27 ) 09/12/2017
[ 1563.658258] RIP: 0010:tcp_recvmsg+0x1eb/0xb40
[ 1563.658260] RSP: 0018:ffffb77e010f7cf8 EFLAGS: 00010282
[ 1563.658263] RAX: 0000000000000000 RBX: 00000000416bcf42 RCX: 0000000000000006
[ 1563.658265] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffffa37a8dc95610
[ 1563.658268] RBP: ffffb77e010f7db8 R08: 000000000013fd88 R09: 0000000000000004
[ 1563.658270] R10: ffffa37a3b1180c8 R11: 0000000000000001 R12: ffffa37a81d40e00
[ 1563.658272] R13: ffffa37a3b118000 R14: ffffa37a3b118524 R15: 0000000000000000
[ 1563.658275] FS:  0000738f795c0700(0000) GS:ffffa37a8dc80000(0000) knlGS:0000000000000000
[ 1563.658278] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1563.658280] CR2: 0000000000000028 CR3: 000000024200c003 CR4: 00000000003606e0
[ 1563.658282] Call Trace:
[ 1563.658287]  inet_recvmsg+0x5c/0x110
[ 1563.658291]  __sys_recvfrom+0xf2/0x160
[ 1563.658295]  __x64_sys_recvfrom+0x1f/0x30
[ 1563.658298]  do_syscall_64+0x72/0x1c0
[ 1563.658302]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1563.658304] RIP: 0033:0x73901a71deae
[ 1563.658306] RSP: 002b:0000738f795bee50 EFLAGS: 00000246 ORIG_RAX: 000000000000002d
[ 1563.658309] RAX: ffffffffffffffda RBX: 0000000000000028 RCX: 000073901a71deae
[ 1563.658311] RDX: 0000000000000404 RSI: 0000738f087955a7 RDI: 0000000000000028
[ 1563.658312] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[ 1563.658314] R10: 0000000000000000 R11: 0000000000000246 R12: 0000738f087955a7
[ 1563.658316] R13: 0000000000000404 R14: 0000000000000000 R15: ffffffffffffffff
[ 1563.658318] Code: 8b 44 24 78 41 39 d8 77 57 41 f6 44 24 34 01 0f 85 24 01 00 00 45 85 ff 0f 84 40 04 00 00 49 8b 04 24 49 39 c2 0f 84 1d 02 00 00 <8b> 50 28 41 8b 1e 39 d3 0f 88 f4 03 00 00 49 89 c4 29 d3 41 f6 
[ 1563.658365] RIP: tcp_recvmsg+0x1eb/0xb40 RSP: ffffb77e010f7cf8
[ 1563.658366] CR2: 0000000000000028
[ 1563.658369] ---[ end trace e7da03c87ec5c409 ]---

If you need more information, please ask.

Regards,
- -- 
Yves-Alexis
-----BEGIN PGP SIGNATURE-----

iQEzBAEBCgAdFiEE8vi34Qgfo83x35gF3rYcyPpXRFsFAlsXmYcACgkQ3rYcyPpX
RFtK6QgArIJyLOT8Lot0jdQehm9MfL6iNUWNSHbEckhK80zYQCLUodj8VQJsmeu1
1hZwvg/Kuw0vxLG3i744NxcbCncfoaBUkZHoUmCZxFzyUeQVviAf9EaLp6cU0JPk
ZBSKPeoPMF9WlBKecV9O/j6T6FRjbSmV/J7esj6vNFXm3iwOh1Yp0cugpU+j+/IA
BxWVkKWZqS/uxtXaakoYdYOvrcRRpxcGKNXHajGW2AKXqybfoPgx0tSWzQ8bpn/o
3NtU9AL5flo4CgmnSY+qXtwT1fnNEtSVbbRmWyrMRpzzLLzTE2v4Pn5043J1Q1C6
EmfVzeYke69MSSGG/fqrLeEV6PzLZQ==
=C7Mx
-----END PGP SIGNATURE-----

^ permalink raw reply

* [PATCH] net/phy: Micrel KSZ8061 PHY link failure after cable connect
From: Alexander Onnasch @ 2018-06-06  8:03 UTC (permalink / raw)
  Cc: alexander.onnasch, Andrew Lunn, Florian Fainelli, netdev,
	linux-kernel

With Micrel KSZ8061 PHY, the link may occasionally not come up after
Ethernet cable connect. The vendor's (Microchip, former Micrel) errata
sheet 80000688A.pdf describes the problem and possible workarounds in
detail, see below.
The patch implements workaround 1, which permanently fixes the issue.

DESCRIPTION
Link-up may not occur properly when the Ethernet cable is initially
connected. This issue occurs more commonly when the cable is connected
slowly, but it may occur any time a cable is connected. This issue occurs
in the auto-negotiation circuit, and will not occur if auto-negotiation
is disabled (which requires that the two link partners be set to the
same speed and duplex).

END USER IMPLICATIONS
When this issue occurs, link is not established. Subsequent cable
plug/unplug cycles will not correct the issue.

WORK AROUND
There are four approaches to work around this issue:
1.  This issue can be prevented by setting bit 15 in MMD device address 1,
    register 2, prior to connecting the  cable or prior to setting the
    Restart Auto-Negotiation bit in register 0h.The MMD registers are
    accessed via the indirect access registers Dh and Eh, or via the Micrel
    EthUtil utility as shown here:
    •  If using the EthUtil utility (usually with a Micrel KSZ8061
       Evaluation Board), type the following commands:
       > address 1
       > mmd 1
       > iw 2 b61a
    •  Alternatively, write the following registers to write to the
       indirect MMD register:
       Write register Dh, data 0001h
       Write register Eh, data 0002h
       Write register Dh, data 4001h
       Write register Eh, data B61Ah
2.  The issue can be avoided by disabling auto-negotiation in the KSZ8061,
    either by the strapping option, or by clearing bit 12 in register 0h.
    Care must be taken to ensure that the KSZ8061 and the link partner
    will link with the same speed and duplex. Note that the KSZ8061
    defaults to full-duplex when auto-negotiation is off, but other
    devices may default to half-duplex in the event of failed
    auto-negotiation.
3.  The issue can be avoided by connecting the cable prior to powering-up
    or resetting the KSZ8061, and leaving it plugged in thereafter.
4.  If the above measures are not taken and the problem occurs, link can
    be recovered by setting the Restart Auto-Negotiation bit in
    register 0h, or by resetting or power cycling the device. Reset may
    be either hardware reset or software reset (register 0h, bit 15).

PLAN
This errata will not be corrected in a future revision.

Signed-off-by: Alexander Onnasch <alexander.onnasch@landisgyr.com>
---
 drivers/net/phy/micrel.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 6c45ff6..7d80a00 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -339,6 +339,28 @@ static int ksz8041_config_aneg(struct phy_device *phydev)
 	return genphy_config_aneg(phydev);
 }
 
+#define MII_KSZ8061RN_MMD_CTRL_REG	    0x0d
+#define MII_KSZ8061RN_MMD_REGDATA_REG	0x0e
+
+static int ksz8061_extended_write(struct phy_device *phydev,
+				  u8 mode, u32 dev_addr, u32 regnum, u16 val)
+{
+	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, dev_addr);
+	phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, regnum);
+	phy_write(phydev, MII_KSZ8061RN_MMD_CTRL_REG, (mode << 14) | dev_addr);
+	return phy_write(phydev, MII_KSZ8061RN_MMD_REGDATA_REG, val);
+}
+
+static int ksz8061_config_init(struct phy_device *phydev)
+{
+	int ret;
+
+	ret = ksz8061_extended_write(phydev, 0x1, 0x1, 0x2, 0xB61A);
+	if (ret)
+		return ret;
+	return kszphy_config_init(phydev);
+}
+
 static int ksz9021_load_values_from_of(struct phy_device *phydev,
 				       const struct device_node *of_node,
 				       u16 reg,
@@ -938,7 +960,7 @@ static struct phy_driver ksphy_driver[] = {
 	.phy_id_mask	= MICREL_PHY_ID_MASK,
 	.features	= PHY_BASIC_FEATURES,
 	.flags		= PHY_HAS_INTERRUPT,
-	.config_init	= kszphy_config_init,
+	.config_init	= ksz8061_config_init,
 	.config_aneg	= genphy_config_aneg,
 	.read_status	= genphy_read_status,
 	.ack_interrupt	= kszphy_ack_interrupt,
-- 
2.7.4

^ permalink raw reply related

* RE: [PATCH] drivers/net/phy/micrel: Fix for PHY KSZ8061 errrata: Potential link-up failure when Ethernet cable is connected slowly
From: Onnasch, Alexander (EXT) @ 2018-06-06  8:02 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: Andrew Lunn, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <b560368a-e790-164a-7414-a88a21da64b5@gmail.com>

Hello 
thanks for the hints! I have changed it accordingly and will now post it again.
best regards,
Alexander Onnasch

-----Original Message-----
From: Florian Fainelli <f.fainelli@gmail.com> 
Sent: Freitag, 25. Mai 2018 17:17
To: Onnasch, Alexander (EXT) <Alexander.Onnasch@landisgyr.com>
Cc: Andrew Lunn <andrew@lunn.ch>; netdev@vger.kernel.org; linux-kernel@vger.kernel.org
Subject: Re: [PATCH] drivers/net/phy/micrel: Fix for PHY KSZ8061 errrata: Potential link-up failure when Ethernet cable is connected slowly



On 05/25/2018 05:37 AM, Alexander Onnasch wrote:
> Signed-off-by: Alexander Onnasch <alexander.onnasch@landisgyr.com>

You would want to make the commit subject shorter (ideally capped somewhere around 72 characters) and provide a commit message which explains the issue and why the workaround is effective.

Thank you!

[snip]

> 
> P PLEASE CONSIDER OUR ENVIRONMENT BEFORE PRINTING THIS EMAIL.
> 
> This e-mail (including any attachments) is confidential and may be legally privileged. If you are not an intended recipient or an authorized representative of an intended recipient, you are prohibited from using, copying or distributing the information in this e-mail or its attachments. If you have received this e-mail in error, please notify the sender immediately by return e-mail and delete all copies of this message and any attachments. Thank you.

You need to remove that footer otherwise we cannot be accepting your patch.
--
Florian

^ permalink raw reply

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: Paul Blakey @ 2018-06-06  7:59 UTC (permalink / raw)
  To: Jakub Kicinski, David Miller
  Cc: paulb, jiri, xiyou.wangcong, jhs, netdev, kliteyn, roid, shahark,
	markb, ogerlitz
In-Reply-To: <20180605142700.6033a7a0@cakuba.netronome.com>



On 06/06/2018 00:27, Jakub Kicinski wrote:
> On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
>> From: Jakub Kicinski <kubakici@wp.pl>
>> Date: Tue, 5 Jun 2018 11:57:47 -0700
>>
>>> Do we still care about correctness and not breaking backward
>>> compatibility?
>>
>> Jakub let me know if you want me to revert this change.
> 
> Yes, I think this patch introduces a regression when block is shared
> between offload capable and in-capable device, therefore it should be
> reverted.  Shared blocks went through a number of review cycles to
> ensure such cases are handled correctly.
> 
> 
> Longer story about the actual issue which is never explained in the
> commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
> supported on tunnels in cls_flower only:
> 
> static int fl_hw_replace_filter(struct tcf_proto *tp,
> [...]
> 	if (!tc_can_offload(dev)) {
> 		if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
> 		    (f->hw_dev && !tc_can_offload(f->hw_dev))) {
> 			f->hw_dev = dev;
> 			return tc_skip_sw(f->flags) ? -EINVAL : 0;
> 		}
> 		dev = f->hw_dev;
> 		cls_flower.egress_dev = true;
> 	} else {
> 		f->hw_dev = dev;
> 	}
> 
> 
> In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
> promoted to a generic TC thing supported on all filters but it no
> longer works with skip_sw.
> 
> I'd argue skip_sw is incorrect for tunnels, because the rule will only
> apply to traffic ingressing on ASIC ports, not all traffic which hits
> the tunnel netdev.  Therefore we should keep the 4.15 - 4.17 behaviour.
> 
> But that's a side note, I don't think we should be breaking offload on
> shared blocks whether we decide to support skip_sw on tunnels or not.
> 

Maybe we can apply my patch logic of still trying the egress dev if the 
block has a single device, and not shared. Is that ok with you?

You're patch seems good as an add on, but the egress hw device (sw1np0) 
would go over the tc actions and see if it can offload such rule (output 
to software lo device) and would fail anyway.

^ permalink raw reply

* Re: [PATCH] can: m_can: Fix runtime resume call
From: Faiz Abbas @ 2018-06-06  7:53 UTC (permalink / raw)
  To: linux-can, netdev, linux-kernel, linux-omap; +Cc: mkl, wg
In-Reply-To: <20180529132426.31216-1-faiz_abbas@ti.com>

Hi,

On Tuesday 29 May 2018 06:54 PM, Faiz Abbas wrote:
> pm_runtime_get_sync() returns a 1 if the state of the device is already
> 'active'. This is not a failure case and should return a success.
> 
> Therefore fix error handling for pm_runtime_get_sync() call such that
> it returns success when the value is 1.
> 
> Also cleanup the TODO for using runtime PM for sleep mode as that is
> implemented.
> 
> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> ---
>  drivers/net/can/m_can/m_can.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> index a9fbf81ac3d4..04c48371ab2a 100644
> --- a/drivers/net/can/m_can/m_can.c
> +++ b/drivers/net/can/m_can/m_can.c
> @@ -634,10 +634,12 @@ static int m_can_clk_start(struct m_can_priv *priv)
>  	int err;
>  
>  	err = pm_runtime_get_sync(priv->device);
> -	if (err)
> +	if (err < 0) {
>  		pm_runtime_put_noidle(priv->device);
> +		return err;
> +	}
>  
> -	return err;
> +	return 0;
>  }
>  
>  static void m_can_clk_stop(struct m_can_priv *priv)
> @@ -1687,8 +1689,6 @@ static int m_can_plat_probe(struct platform_device *pdev)
>  	return ret;
>  }
>  
> -/* TODO: runtime PM with power down or sleep mode  */
> -
>  static __maybe_unused int m_can_suspend(struct device *dev)
>  {
>  	struct net_device *ndev = dev_get_drvdata(dev);
> 

Gentle ping.

Thanks,
Faiz

^ permalink raw reply

* Re: [PATCH] can: m_can: Move acessing of message ram to after clocks are enabled
From: Faiz Abbas @ 2018-06-06  7:52 UTC (permalink / raw)
  To: linux-can, netdev, linux-kernel, linux-omap; +Cc: mkl, wg
In-Reply-To: <20180529141940.1682-1-faiz_abbas@ti.com>

Hi,

On Tuesday 29 May 2018 07:49 PM, Faiz Abbas wrote:
> MCAN message ram should only be accessed once clocks are enabled.
> Therefore, move the call to parse/init the message ram to after
> clocks are enabled.
> 
> Signed-off-by: Faiz Abbas <faiz_abbas@ti.com>
> ---
>  drivers/net/can/m_can/m_can.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c
> index b397a33f3d32..a9fbf81ac3d4 100644
> --- a/drivers/net/can/m_can/m_can.c
> +++ b/drivers/net/can/m_can/m_can.c
> @@ -1642,8 +1642,6 @@ static int m_can_plat_probe(struct platform_device *pdev)
>  	priv->can.clock.freq = clk_get_rate(cclk);
>  	priv->mram_base = mram_addr;
>  
> -	m_can_of_parse_mram(priv, mram_config_vals);
> -
>  	platform_set_drvdata(pdev, dev);
>  	SET_NETDEV_DEV(dev, &pdev->dev);
>  
> @@ -1666,6 +1664,8 @@ static int m_can_plat_probe(struct platform_device *pdev)
>  		goto clk_disable;
>  	}
>  
> +	m_can_of_parse_mram(priv, mram_config_vals);
> +
>  	devm_can_led_init(dev);
>  
>  	of_can_transceiver(dev);
> @@ -1715,8 +1715,6 @@ static __maybe_unused int m_can_resume(struct device *dev)
>  
>  	pinctrl_pm_select_default_state(dev);
>  
> -	m_can_init_ram(priv);
> -
>  	priv->can.state = CAN_STATE_ERROR_ACTIVE;
>  
>  	if (netif_running(ndev)) {
> @@ -1726,6 +1724,7 @@ static __maybe_unused int m_can_resume(struct device *dev)
>  		if (ret)
>  			return ret;
>  
> +		m_can_init_ram(priv);
>  		m_can_start(ndev);
>  		netif_device_attach(ndev);
>  		netif_start_queue(ndev);
> 

Gentle ping.

Thanks,
Faiz

^ permalink raw reply

* Re: [PATCH net] failover: eliminate callback hell
From: Jiri Pirko @ 2018-06-06  7:25 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: kys, haiyangz, davem, mst, sridhar.samudrala, netdev,
	Stephen Hemminger
In-Reply-To: <20180605034231.31610-1-sthemmin@microsoft.com>

Tue, Jun 05, 2018 at 05:42:31AM CEST, stephen@networkplumber.org wrote:
>The net failover should be a simple library, not a virtual
>object with function callbacks (see callback hell).

Why just a library? It should do a common things. I think it should be a
virtual object. Looks like your patch again splits the common
functionality into multiple drivers. That is kind of backwards attitude.
I don't get it. We should rather focus on fixing the mess the
introduction of netvsc-bonding caused and switch netvsc to 3-netdev
model.

^ permalink raw reply

* Re: [PATCH iproute2 v2 1/2] ip: display netns name instead of nsid
From: Nicolas Dichtel @ 2018-06-06  7:11 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev
In-Reply-To: <20180605095224.1921fee5@xeon-e3>

Le 05/06/2018 à 18:52, Stephen Hemminger a écrit :
> On Tue,  5 Jun 2018 15:08:30 +0200
> Nicolas Dichtel <nicolas.dichtel@6wind.com> wrote:
> 
>>  
>> +char *get_name_from_nsid(int nsid)
>> +{
>> +	struct nsid_cache *c;
>> +
>> +	netns_nsid_socket_init();
>> +	netns_map_init();
>> +
>> +	c = netns_map_get_by_nsid(nsid);
>> +	if (c)
>> +		return c->name;
>> +
>> +	return NULL;
>> +}
>> +
> 
> This is better, but now there is a different problem.
> When doing multiple interfaces, won't the initialization code be called twice?
> 
No, the init function is propected:

void netns_nsid_socket_init(void)
{
        if (rtnsh.fd > -1 || !ipnetns_have_nsid())
                return;
...

void netns_map_init(void)
{
...
        if (initialized || !ipnetns_have_nsid())
                return;
...

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: Kai-Heng Feng @ 2018-06-06  7:03 UTC (permalink / raw)
  To: Heiner Kallweit; +Cc: davem, hayeswang, romieu, netdev, linux-kernel, Ryankao
In-Reply-To: <dfd3a9b4-3f89-9849-8d71-6fc49374b1df@gmail.com>

at 03:13, Heiner Kallweit <hkallweit1@gmail.com> wrote:

> On 05.06.2018 06:58, Kai-Heng Feng wrote:
>> This patch reinstate ALDPS and ASPM support on r8169.
>>
>> On some Intel platforms, ASPM support on r8169 is the key factor to let
>> Package C-State achieve PC8. Without ASPM support, the deepest Package
>> C-State can hit is PC3. PC8 can save additional ~3W in comparison with
>> PC3.
>>
>> This patch is from Realtek.
>>
>> Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
>> Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request  
>> settings")
>>
>> Cc: Ryankao <ryankao@realtek.com>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>>  drivers/net/ethernet/realtek/r8169.c | 190 +++++++++++++++++++++------
>>  1 file changed, 151 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169.c  
>> b/drivers/net/ethernet/realtek/r8169.c
>> index 75dfac0248f4..a28ef20be221 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -319,6 +319,8 @@ static const struct pci_device_id rtl8169_pci_tbl[]  
>> = {
>>
>>  MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);
>>
>> +static int enable_aspm = 1;
>> +static int enable_aldps = 1;
>>  static int use_dac = -1;
>>  static struct {
>>  	u32 msg_enable;
>> @@ -817,6 +819,10 @@ struct rtl8169_private {
>>
>>  MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
>>  MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
>> +module_param(enable_aspm, int, 0);
>> +MODULE_PARM_DESC(enable_aspm, "Enable ASPM");
>> +module_param(enable_aldps, int, 0);
>> +MODULE_PARM_DESC(enable_aldps, "Enable ALDPS");
>>  module_param(use_dac, int, 0);
>>  MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot.");
>>  module_param_named(debug, debug.msg_enable, int, 0);
>> @@ -1567,25 +1573,6 @@ static void rtl_link_chg_patch(struct  
>> rtl8169_private *tp)
>>  	}
>>  }
>>
>> -static void rtl8169_check_link_status(struct net_device *dev,
>> -				      struct rtl8169_private *tp)
>> -{
>> -	struct device *d = tp_to_dev(tp);
>> -
>> -	if (tp->link_ok(tp)) {
>> -		rtl_link_chg_patch(tp);
>> -		/* This is to cancel a scheduled suspend if there's one. */
>> -		pm_request_resume(d);
>> -		netif_carrier_on(dev);
>> -		if (net_ratelimit())
>> -			netif_info(tp, ifup, dev, "link up\n");
>> -	} else {
>> -		netif_carrier_off(dev);
>> -		netif_info(tp, ifdown, dev, "link down\n");
>> -		pm_runtime_idle(d);
>> -	}
>> -}
>> -
>>  #define WAKE_ANY (WAKE_PHY | WAKE_MAGIC | WAKE_UCAST | WAKE_BCAST | WAKE_MCAST)
>>
>>  static u32 __rtl8169_get_wol(struct rtl8169_private *tp)
>> @@ -3520,6 +3507,15 @@ static void rtl8168e_1_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  	rtl_writephy(tp, 0x0d, 0x4007);
>>  	rtl_writephy(tp, 0x0e, 0x0000);
>>  	rtl_writephy(tp, 0x0d, 0x0000);
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x15, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x15) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x15, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>
> Few remarks:
> - The comment isn't applicable any longer.
> - The second rtl_writephy(tp, 0x1f, 0x0000) isn't needed because you don't
>   switch the page in between.
> - The code is a little hard to read, instead you could use the following
>   and create a helper, ideally with register and bit number as
>   parameters so that you can use it for all affected chip types.
>
> val = rtl_readphy(tp, 0x15);
> val &= ~BIT(12);
> if (enable_aldps)
> 	val |= BIT(12);
> rtl_writephy(tp, 0x15, val);

This is much cleaner. Thanks!

>
>> }
>>
>>  static void rtl_rar_exgmac_set(struct rtl8169_private *tp, u8 *addr)
>> @@ -3627,6 +3623,15 @@ static void rtl8168e_2_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Broken BIOS workaround: feed GigaMAC registers with MAC address. */
>>  	rtl_rar_exgmac_set(tp, tp->dev->dev_addr);
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x15, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x15) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x15, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl8168f_hw_phy_config(struct rtl8169_private *tp)
>> @@ -3649,6 +3654,15 @@ static void rtl8168f_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  	rtl_writephy(tp, 0x05, 0x8b86);
>>  	rtl_w0w1_phy(tp, 0x06, 0x0001, 0x0000);
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x15, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x15) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x15, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl8168f_1_hw_phy_config(struct rtl8169_private *tp)
>> @@ -3865,7 +3879,9 @@ static void rtl8168g_1_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Check ALDPS bit, disable it if enabled */
>>  	rtl_writephy(tp, 0x1f, 0x0a43);
>> -	if (rtl_readphy(tp, 0x10) & 0x0004)
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>>  		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>>
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> @@ -3874,6 +3890,14 @@ static void rtl8168g_1_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  static void rtl8168g_2_hw_phy_config(struct rtl8169_private *tp)
>>  {
>>  	rtl_apply_firmware(tp);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0a43);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl8168h_1_hw_phy_config(struct rtl8169_private *tp)
>> @@ -3980,7 +4004,9 @@ static void rtl8168h_1_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Check ALDPS bit, disable it if enabled */
>>  	rtl_writephy(tp, 0x1f, 0x0a43);
>> -	if (rtl_readphy(tp, 0x10) & 0x0004)
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>>  		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>>
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> @@ -4053,7 +4079,9 @@ static void rtl8168h_2_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Check ALDPS bit, disable it if enabled */
>>  	rtl_writephy(tp, 0x1f, 0x0a43);
>> -	if (rtl_readphy(tp, 0x10) & 0x0004)
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>>  		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>>
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> @@ -4095,7 +4123,9 @@ static void rtl8168ep_1_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Check ALDPS bit, disable it if enabled */
>>  	rtl_writephy(tp, 0x1f, 0x0a43);
>> -	if (rtl_readphy(tp, 0x10) & 0x0004)
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>>  		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>>
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> @@ -4186,7 +4216,9 @@ static void rtl8168ep_2_hw_phy_config(struct  
>> rtl8169_private *tp)
>>
>>  	/* Check ALDPS bit, disable it if enabled */
>>  	rtl_writephy(tp, 0x1f, 0x0a43);
>> -	if (rtl_readphy(tp, 0x10) & 0x0004)
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x10, 0x0004, 0x0000);
>> +	else if (rtl_readphy(tp, 0x10) & 0x0004)
>>  		rtl_w0w1_phy(tp, 0x10, 0x0000, 0x0004);
>>
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> @@ -4233,6 +4265,15 @@ static void rtl8105e_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  	rtl_apply_firmware(tp);
>>
>>  	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x18, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x18) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x18, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl8402_hw_phy_config(struct rtl8169_private *tp)
>> @@ -4250,6 +4291,15 @@ static void rtl8402_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  	rtl_writephy(tp, 0x10, 0x401f);
>>  	rtl_writephy(tp, 0x19, 0x7030);
>>  	rtl_writephy(tp, 0x1f, 0x0000);
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x18, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x18) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x18, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl8106e_hw_phy_config(struct rtl8169_private *tp)
>> @@ -4272,6 +4322,15 @@ static void rtl8106e_hw_phy_config(struct  
>> rtl8169_private *tp)
>>  	rtl_writephy_batch(tp, phy_reg_init, ARRAY_SIZE(phy_reg_init));
>>
>>  	rtl_eri_write(tp, 0x1d0, ERIAR_MASK_0011, 0x0000, ERIAR_EXGMAC);
>> +
>> +	/* Check ALDPS bit, disable it if enabled */
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>> +	if (enable_aldps)
>> +		rtl_w0w1_phy(tp, 0x18, 0x1000, 0x0000);
>> +	else if (rtl_readphy(tp, 0x18) & 0x1000)
>> +		rtl_w0w1_phy(tp, 0x18, 0x0000, 0x1000);
>> +
>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>  }
>>
>>  static void rtl_hw_phy_config(struct net_device *dev)
>> @@ -5290,6 +5349,18 @@ static void rtl_pcie_state_l2l3_enable(struct  
>> rtl8169_private *tp, bool enable)
>>  	RTL_W8(tp, Config3, data);
>>  }
>>
>> +static void rtl_hw_internal_aspm_clkreq_enable(struct rtl8169_private  
>> *tp,
>> +					       bool enable)
>> +{
>> +	if (enable) {
>> +		RTL_W8(tp, Config2, RTL_R8(tp, Config2) | ClkReqEn);
>> +		RTL_W8(tp, Config5, RTL_R8(tp, Config5) | ASPM_en);
>> +	} else {
>> +		RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> +		RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	}
>> +}
>> +
>>  static void rtl_hw_start_8168bb(struct rtl8169_private *tp)
>>  {
>>  	RTL_W8(tp, Config3, RTL_R8(tp, Config3) & ~Beacon_en);
>> @@ -5646,9 +5717,10 @@ static void rtl_hw_start_8168g_1(struct  
>> rtl8169_private *tp)
>>  	rtl_hw_start_8168g(tp);
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8168g_1, ARRAY_SIZE(e_info_8168g_1));
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168g_2(struct rtl8169_private *tp)
>> @@ -5681,9 +5753,10 @@ static void rtl_hw_start_8411_2(struct  
>> rtl8169_private *tp)
>>  	rtl_hw_start_8168g(tp);
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8411_2, ARRAY_SIZE(e_info_8411_2));
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168h_1(struct rtl8169_private *tp)
>> @@ -5700,8 +5773,7 @@ static void rtl_hw_start_8168h_1(struct  
>> rtl8169_private *tp)
>>  	};
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8168h_1, ARRAY_SIZE(e_info_8168h_1));
>>
>>  	RTL_W32(tp, TxConfig, RTL_R32(tp, TxConfig) | TXCFG_AUTO_FIFO);
>> @@ -5780,6 +5852,9 @@ static void rtl_hw_start_8168h_1(struct  
>> rtl8169_private *tp)
>>  	r8168_mac_ocp_write(tp, 0xe63e, 0x0000);
>>  	r8168_mac_ocp_write(tp, 0xc094, 0x0000);
>>  	r8168_mac_ocp_write(tp, 0xc09e, 0x0000);
>> +
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168ep(struct rtl8169_private *tp)
>> @@ -5831,11 +5906,13 @@ static void rtl_hw_start_8168ep_1(struct  
>> rtl8169_private *tp)
>>  	};
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8168ep_1, ARRAY_SIZE(e_info_8168ep_1));
>>
>>  	rtl_hw_start_8168ep(tp);
>> +
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168ep_2(struct rtl8169_private *tp)
>> @@ -5847,14 +5924,16 @@ static void rtl_hw_start_8168ep_2(struct  
>> rtl8169_private *tp)
>>  	};
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8168ep_2, ARRAY_SIZE(e_info_8168ep_2));
>>
>>  	rtl_hw_start_8168ep(tp);
>>
>>  	RTL_W8(tp, DLLPR, RTL_R8(tp, DLLPR) & ~PFM_EN);
>>  	RTL_W8(tp, MISC_1, RTL_R8(tp, MISC_1) & ~PFM_D3COLD_EN);
>> +
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168ep_3(struct rtl8169_private *tp)
>> @@ -5868,8 +5947,7 @@ static void rtl_hw_start_8168ep_3(struct  
>> rtl8169_private *tp)
>>  	};
>>
>>  	/* disable aspm and clock request before access ephy */
>> -	RTL_W8(tp, Config2, RTL_R8(tp, Config2) & ~ClkReqEn);
>> -	RTL_W8(tp, Config5, RTL_R8(tp, Config5) & ~ASPM_en);
>> +	rtl_hw_internal_aspm_clkreq_enable(tp, false);
>>  	rtl_ephy_init(tp, e_info_8168ep_3, ARRAY_SIZE(e_info_8168ep_3));
>>
>>  	rtl_hw_start_8168ep(tp);
>> @@ -5889,6 +5967,9 @@ static void rtl_hw_start_8168ep_3(struct  
>> rtl8169_private *tp)
>>  	data = r8168_mac_ocp_read(tp, 0xe860);
>>  	data |= 0x0080;
>>  	r8168_mac_ocp_write(tp, 0xe860, data);
>> +
>> +	if (enable_aspm)
>> +		rtl_hw_internal_aspm_clkreq_enable(tp, true);
>>  }
>>
>>  static void rtl_hw_start_8168(struct rtl8169_private *tp)
>> @@ -6364,7 +6445,7 @@ static void rtl8169_tx_clear(struct  
>> rtl8169_private *tp)
>>  	tp->cur_tx = tp->dirty_tx = 0;
>>  }
>>
>> -static void rtl_reset_work(struct rtl8169_private *tp)
>> +static void _rtl_reset_work(struct rtl8169_private *tp)
>>  {
>>  	struct net_device *dev = tp->dev;
>>  	int i;
>> @@ -6384,6 +6465,33 @@ static void rtl_reset_work(struct rtl8169_private  
>> *tp)
>>  	napi_enable(&tp->napi);
>>  	rtl_hw_start(tp);
>>  	netif_wake_queue(dev);
>> +}
>> +
>> +static void rtl8169_check_link_status(struct net_device *dev,
>> +				      struct rtl8169_private *tp)
>> +{
>> +	struct device *d = tp_to_dev(tp);
>> +
>> +	if (tp->link_ok(tp)) {
>> +		rtl_link_chg_patch(tp);
>> +		/* This is to cancel a scheduled suspend if there's one. */
>> +		if (pm_request_resume(d))
>> +			_rtl_reset_work(tp);
>
> This reset was added, what is it good for and how is it related to
> ASPM/ALDPS? It looks a little bogus, especially considering that
> pm_request_resume() can return also positive values.

Probably. I'll do some testing and discuss with Realtek.

>
>> +		netif_carrier_on(dev);
>> +		if (net_ratelimit())
>> +			netif_info(tp, ifup, dev, "link up\n");
>> +	} else {
>> +		netif_carrier_off(dev);
>> +		netif_info(tp, ifdown, dev, "link down\n");
>> +		pm_runtime_idle(d);
>> +	}
>> +}
>> +
>> +static void rtl_reset_work(struct rtl8169_private *tp)
>> +{
>> +	struct net_device *dev = tp->dev;
>> +
>> +	_rtl_reset_work(tp);
>>  	rtl8169_check_link_status(dev, tp);
>>  }
>>
>> @@ -7649,8 +7757,12 @@ static int rtl_init_one(struct pci_dev *pdev,  
>> const struct pci_device_id *ent)
>>
>>  	/* disable ASPM completely as that cause random device stop working
>>  	 * problems as well as full system hangs for some PCIe devices users */
>> -	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
>> -				     PCIE_LINK_STATE_CLKPM);
>> +	if (!enable_aspm) {
>> +		pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
>> +					     PCIE_LINK_STATE_L1 |
>> +					     PCIE_LINK_STATE_CLKPM);
>> +		netif_info(tp, probe, dev, "ASPM disabled\n");
>
> You should use dev_info() here because the net_device isn't registered yet.

Thanks. I will change that.

Kai-Heng

>
>> +	}
>>
>>  	/* enable device (incl. PCI PM wakeup and hotplug setup) */
>>  	rc = pcim_enable_device(pdev);

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: Kai-Heng Feng @ 2018-06-06  6:58 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Ryankao, jrg.otte@gmail.com, David Miller, Hayes Wang,
	hkallweit1@gmail.com, romieu@fr.zoreil.com, Linux Netdev List,
	Linux Kernel Mailing List, Hau
In-Reply-To: <20180605172805.GD30381@bhelgaas-glaptop.roam.corp.google.com>

Hi, Bjorn,

at 01:28, Bjorn Helgaas <helgaas@kernel.org> wrote:

> On Tue, Jun 05, 2018 at 06:34:09AM +0000, Ryankao wrote:
>> Add realtek folk Hau
>>
>> -----Original Message-----
>> From: Kai Heng Feng [mailto:kai.heng.feng@canonical.com]
>> Sent: Tuesday, June 05, 2018 1:02 PM
>> To: jrg.otte@gmail.com
>> Cc: David Miller <davem@davemloft.net>; Hayes Wang  
>> <hayeswang@realtek.com>; hkallweit1@gmail.com; romieu@fr.zoreil.com;  
>> Linux Netdev List <netdev@vger.kernel.org>; Linux Kernel Mailing List  
>> <linux-kernel@vger.kernel.org>; Ryankao <ryankao@realtek.com>
>> Subject: Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
>>
>> Hi Jörg Otte,
>>
>> Can you give this patch a try?
>>
>> Since you are the only one that reported ALDPS/ASPM regression,
>>
>> And I think this patch should solve the issue you had [1].
>>
>> Hopefully we don't need to go down the rabbit hole of  
>> blacklist/whitelist...
>>
>> Kai-Heng
>>
>> [1] https://lkml.org/lkml/2013/1/5/36
>
> I have no idea what ALDPS is.  It's not mentioned in the PCIe spec, so
> presumably it's some Realtek-specific thing.  ASPM is a generic PCIe
> thing.  Changes to these two things should be in separate patches so
> they don't get tangled up.

Sure, I'll split them in two. I'll also consult with Realtek to explain  
what ALDPS actually does.

>
>>> On Jun 5, 2018, at 12:58 PM, Kai-Heng Feng
>>> <kai.heng.feng@canonical.com>
>>> wrote:
>>>
>>> This patch reinstate ALDPS and ASPM support on r8169.
>>>
>>> On some Intel platforms, ASPM support on r8169 is the key factor to
>>> let Package C-State achieve PC8. Without ASPM support, the deepest
>>> Package C-State can hit is PC3. PC8 can save additional ~3W in
>>> comparison with PC3.
>>>
>>> This patch is from Realtek.
>>>
>>> Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
>>> Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request
>>> settings")
>
>>> +3507,15 @@ static void rtl8168e_1_hw_phy_config(struct
>>> rtl8169_private *tp)
>>>  	rtl_writephy(tp, 0x0d, 0x4007);
>>>  	rtl_writephy(tp, 0x0e, 0x0000);
>>>  	rtl_writephy(tp, 0x0d, 0x0000);
>>> +
>>> +	/* Check ALDPS bit, disable it if enabled */
>>> +	rtl_writephy(tp, 0x1f, 0x0000);
>>> +	if (enable_aldps)
>>> +		rtl_w0w1_phy(tp, 0x15, 0x1000, 0x0000);
>>> +	else if (rtl_readphy(tp, 0x15) & 0x1000)
>>> +		rtl_w0w1_phy(tp, 0x15, 0x0000, 0x1000);
>
> There's a lot of repetition of this code with minor variations.  You
> could probably factor it out and make it more concise and more
> readable.

You are right. Will do.

>
>>> +static void rtl8169_check_link_status(struct net_device *dev,
>>> +				      struct rtl8169_private *tp) {
>>> +	struct device *d = tp_to_dev(tp);
>>> +
>>> +	if (tp->link_ok(tp)) {
>>> +		rtl_link_chg_patch(tp);
>>> +		/* This is to cancel a scheduled suspend if there's one. */
>>> +		if (pm_request_resume(d))
>>> +			_rtl_reset_work(tp);
>>> +		netif_carrier_on(dev);
>>> +		if (net_ratelimit())
>>> +			netif_info(tp, ifup, dev, "link up\n");
>>> +	} else {
>>> +		netif_carrier_off(dev);
>>> +		netif_info(tp, ifdown, dev, "link down\n");
>>> +		pm_runtime_idle(d);
>>> +	}
>>> +}
>
> This function apparently just got moved around without changing
> anything.  That's fine, but the move should be in a separate patch to
> make the real changes easier to review.

It actually added a new condition to check the return value of  
pm_request_resume().

It's probably a bogus check though. I'll ask Realtek why they did this.

>
>>> @@ -7649,8 +7757,12 @@ static int rtl_init_one(struct pci_dev *pdev,
>>> const struct pci_device_id *ent)
>>>
>>>  	/* disable ASPM completely as that cause random device stop working
>>>  	 * problems as well as full system hangs for some PCIe devices users */
>>> -	pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S | PCIE_LINK_STATE_L1 |
>>> -				     PCIE_LINK_STATE_CLKPM);
>>> +	if (!enable_aspm) {
>>> +		pci_disable_link_state(pdev, PCIE_LINK_STATE_L0S |
>>> +					     PCIE_LINK_STATE_L1 |
>>> +					     PCIE_LINK_STATE_CLKPM);
>>> +		netif_info(tp, probe, dev, "ASPM disabled\n");
>>> +	}
>
> ASPM is a generic PCIe feature that should be configured by the PCI
> core without any help from the device driver.
>
> If code in the driver is needed, that means either the PCI core is
> doing it wrong and we should fix it there, or the device is broken and
> the driver is working around the erratum.
>
> If this is an erratum, you should include details about exactly what's
> broken and (ideally) a URL to the published erratum.  Otherwise this
> is just unmaintainable black magic and likely to be broken by future
> ASPM changes in the PCI core.
>
> ASPM configuration is done by the PCI core before drivers are bound to
> the device.  If you need device-specific workarounds, they should
> probably be in quirks so they're done before the core does that ASPM
> configuration.

You are right.

I think calling pci_disable_link_state() is unnecessary, I'll also consult  
with Realtek about this.

I'll also do some testing and send a separate patch to remove  
pci_disable_link_state() for -rc kernel to test if this is something really  
needed.

Kai-Heng

>
>>> /* enable device (incl. PCI PM wakeup and hotplug setup) */
>>>  	rc = pcim_enable_device(pdev);
>>> --
>>> 2.17.0
>>
>> ------Please consider the environment before printing this e-mail.

^ permalink raw reply

* Re: [RFC PATCH 2/2] net: macb: Disable TX checksum offloading on all Zynq
From: Michal Simek @ 2018-06-06  6:50 UTC (permalink / raw)
  To: Nicolas Ferre, Jennifer Dahm, netdev, David S . Miller,
	Harini Katakam
  Cc: Nathan Sullivan
In-Reply-To: <2e8da660-36d9-7aa0-bf66-86644eaf715b@microchip.com>

On 4.6.2018 17:06, Nicolas Ferre wrote:
> On 25/05/2018 at 23:44, Jennifer Dahm wrote:
>> The Zynq ethernet hardware has checksum offloading bugs that cause
>> small UDP packets (<= 2 bytes) to be sent with an incorrect checksum
>> (0xffff) and forwarded UDP packets to be re-checksummed, which is
>> illegal behavior. The best solution we have right now is to disable
>> hardware TX checksum offloading entirely.
>>
>> Signed-off-by: Jennifer Dahm <jennifer.dahm@ni.com>
> 
> Adding some xilinx people I know...

Harini: Please look at this.

Thanks,
Michal

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: Kai-Heng Feng @ 2018-06-06  6:47 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David Miller, Hayes Wang, hkallweit1, romieu, Linux Netdev List,
	Linux Kernel Mailing List, Ryankao
In-Reply-To: <20180605141114.GC14873@lunn.ch>

at 22:11, Andrew Lunn <andrew@lunn.ch> wrote:

> On Tue, Jun 05, 2018 at 12:58:12PM +0800, Kai-Heng Feng wrote:
>> This patch reinstate ALDPS and ASPM support on r8169.
>>
>> On some Intel platforms, ASPM support on r8169 is the key factor to let
>> Package C-State achieve PC8. Without ASPM support, the deepest Package
>> C-State can hit is PC3. PC8 can save additional ~3W in comparison with
>> PC3.
>>
>> This patch is from Realtek.
>>
>> Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
>> Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request  
>> settings")
>>
>> Cc: Ryankao <ryankao@realtek.com>
>> Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
>> ---
>>  drivers/net/ethernet/realtek/r8169.c | 190 +++++++++++++++++++++------
>>  1 file changed, 151 insertions(+), 39 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/realtek/r8169.c  
>> b/drivers/net/ethernet/realtek/r8169.c
>> index 75dfac0248f4..a28ef20be221 100644
>> --- a/drivers/net/ethernet/realtek/r8169.c
>> +++ b/drivers/net/ethernet/realtek/r8169.c
>> @@ -319,6 +319,8 @@ static const struct pci_device_id rtl8169_pci_tbl[]  
>> = {
>>
>>  MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);
>>
>> +static int enable_aspm = 1;
>> +static int enable_aldps = 1;
>>  static int use_dac = -1;
>>  static struct {
>>  	u32 msg_enable;
>> @@ -817,6 +819,10 @@ struct rtl8169_private {
>>
>>  MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
>>  MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
>> +module_param(enable_aspm, int, 0);
>> +MODULE_PARM_DESC(enable_aspm, "Enable ASPM");
>> +module_param(enable_aldps, int, 0);
>> +MODULE_PARM_DESC(enable_aldps, "Enable ALDPS");
>
> Hi Kai
>
> No module parameter please. Just turn it on by default. Assuming
> testing shows works.

Hi Andrew,

Sure.

Do you think we should also strip out the enable/disable logic?

Or just remove the parameter?

Kai-Heng

>
> 	Andrew

^ permalink raw reply

* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Ilias Apalodimas @ 2018-06-06  6:42 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Grygorii Strashko, netdev, ivan.khoronzhuk, nsekhar, jiri,
	ivecera, francois.ozog, yogeshs, spatton
In-Reply-To: <20180605235356.GC6237@lunn.ch>

Hi Andrew,

>On Wed, Jun 06, 2018 at 01:53:56AM +0200, Andrew Lunn wrote:
> > And I just have to look a little bit in the future as selected approach
> > expected to be extended on future SoC (and other parts of existing SoCs - ICSS-G SW switch)
> > where we going to have more features, like TSN, EST and packet Policing and Classification.
> 
> You should probably took a closer look at the Mellonex driver. It has
> a lot of TC offload, which is what policing and classification is.
> 
I did take a close look to both Mellanox and rocker drivers before issuing this
RFC and we seem to be close on the approach. What Grygorii is reffering to, is
that for CBS to work properly on CPSW there *must* be a way to configure the
CPU port individually.

> TSN is being worked on in general, and i think the i210 is taking the
> lead. So you probably want to keep an eye on that, and join the
> discussion.
> 
TSN is not just "one thing". It's a few IEEE standards bundled up to provide the
needed functionality. i210 is only implementing CBS at the moment and there's
an RFC out there to support what they call "Time based scheduling".
I am already having discussions with Jesus on their current work.

The idea behind using switchdev is that due to it's design, it's a very
convenient way of utilizing netlink and iproute2/tc to configure any kind of
future shapers. Since net_device_ops now has a callback to configure hardware
schedulers being able to expose netdevs as hardware ports and configure them
individually is great. As you can understand you'll end up with TSN capable
switches and NICs being configured with the same commands from a userspace point
of view. I am not sure this is the proper way to go, but at least for me, 
sounds like a viable solution.

Thanks
Ilias

^ permalink raw reply

* Re: [PATCH] r8169: Reinstate ALDPS and ASPM support
From: Simon Horman @ 2018-06-06  6:12 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Kai-Heng Feng, davem, hayeswang, romieu, netdev, linux-kernel,
	Ryankao
In-Reply-To: <dfd3a9b4-3f89-9849-8d71-6fc49374b1df@gmail.com>

On Tue, Jun 05, 2018 at 09:13:08PM +0200, Heiner Kallweit wrote:
> On 05.06.2018 06:58, Kai-Heng Feng wrote:
> > This patch reinstate ALDPS and ASPM support on r8169.
> > 
> > On some Intel platforms, ASPM support on r8169 is the key factor to let
> > Package C-State achieve PC8. Without ASPM support, the deepest Package
> > C-State can hit is PC3. PC8 can save additional ~3W in comparison with
> > PC3.
> > 
> > This patch is from Realtek.
> > 
> > Fixes: e0c075577965 ("r8169: enable ALDPS for power saving")
> > Fixes: d64ec841517a ("r8169: enable internal ASPM and clock request settings")
> > 
> > Cc: Ryankao <ryankao@realtek.com>
> > Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
> > ---
> >  drivers/net/ethernet/realtek/r8169.c | 190 +++++++++++++++++++++------
> >  1 file changed, 151 insertions(+), 39 deletions(-)
> > 
> > diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> > index 75dfac0248f4..a28ef20be221 100644
> > --- a/drivers/net/ethernet/realtek/r8169.c
> > +++ b/drivers/net/ethernet/realtek/r8169.c
> > @@ -319,6 +319,8 @@ static const struct pci_device_id rtl8169_pci_tbl[] = {
> >  
> >  MODULE_DEVICE_TABLE(pci, rtl8169_pci_tbl);
> >  
> > +static int enable_aspm = 1;
> > +static int enable_aldps = 1;
> >  static int use_dac = -1;
> >  static struct {
> >  	u32 msg_enable;
> > @@ -817,6 +819,10 @@ struct rtl8169_private {
> >  
> >  MODULE_AUTHOR("Realtek and the Linux r8169 crew <netdev@vger.kernel.org>");
> >  MODULE_DESCRIPTION("RealTek RTL-8169 Gigabit Ethernet driver");
> > +module_param(enable_aspm, int, 0);
> > +MODULE_PARM_DESC(enable_aspm, "Enable ASPM");
> > +module_param(enable_aldps, int, 0);
> > +MODULE_PARM_DESC(enable_aldps, "Enable ALDPS");
> >  module_param(use_dac, int, 0);
> >  MODULE_PARM_DESC(use_dac, "Enable PCI DAC. Unsafe on 32 bit PCI slot.");
> >  module_param_named(debug, debug.msg_enable, int, 0);

I'm a little surprised to see new module parameters being added.

^ permalink raw reply

* Re: [PATCH net] failover: eliminate callback hell
From: Samudrala, Sridhar @ 2018-06-06  6:11 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Michael S. Tsirkin, kys, haiyangz, davem, netdev,
	Stephen Hemminger
In-Reply-To: <20180605230038.521d5c18@xeon-e3>



On 6/5/2018 11:00 PM, Stephen Hemminger wrote:
> On Tue, 5 Jun 2018 22:39:12 -0700
> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
>
>> On 6/5/2018 8:51 PM, Stephen Hemminger wrote:
>>> On Tue, 5 Jun 2018 16:52:22 -0700
>>> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
>>>   
>>>> On 6/5/2018 2:52 PM, Stephen Hemminger wrote:
>>>>> On Tue, 5 Jun 2018 22:38:43 +0300
>>>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>>>      
>>>>>>> See:
>>>>>>>       https://patchwork.ozlabs.org/patch/851711/
>>>>>> Let me try to summarize that:
>>>>>>
>>>>>> 	You wanted to speed up the delayed link up.  You had an idea to
>>>>>> 	additionally take link up when userspace renames the interface (standby
>>>>>> 	one which is also the failover for netvsc).
>>>>>>
>>>>>> 	But userspace might not do any renames, in which case there will
>>>>>> 	still be the delay, and so this never got applied.
>>>>>>
>>>>>> 	Is this a good summary?
>>>>>>
>>>>>> Davem said delay should go away completely as it's not robust, and I
>>>>>> think I agree.  So I don't think we should make all failover users use
>>>>>> delay. IIUC failover kept a delay option especially for netvsc to
>>>>>> minimize the surprise factor. Hopefully we can come up with
>>>>>> something more robust and drop that option completely.
>>>>> The timeout was the original solution to how to complete setup after
>>>>> userspace has had a chance to rename the device. Unfortunately, the whole network
>>>>> device initialization (cooperation with udev and userspace) is a a mess because
>>>>> there is no well defined specification, and there are multiple ways userspace
>>>>> does this in old and new distributions.  The timeout has its own issues
>>>>> (how long, handling errors during that window, what if userspace modifies other
>>>>> device state); and open to finding a better solution.
>>>>>
>>>>> My point was that if name change can not be relied on (or used) by netvsc,
>>>>> then we can't allow it for net_failover either.
>>>> I think the push back was with the usage of the delay, not bringing up the primary/standby
>>>> device in the name change event handler.
>>>> Can't netvsc use this mechanism instead of depending on the delay?
>>>>
>>>>   
>>> The patch that was rejected for netvsc was about using name change.
>>> Also, you can't depend on name change; you still need a timer. Not all distributions
>>> change name of devices. Or user has blocked that by udev rules.
>> In the net_failover_slave_register() we do a dev_open() and ignore any failure due to
>> EBUSY and do another dev_open() in the name change event handler.
>> If the name is not expected to change, i would think the dev_open() at the time of
>> register will succeed.
> The problem is your first dev_open will bring device up and lockout
> udev from changing the network device name.

I have tried with/without udev and didn't see any issue with the naming of the primary/standby
devices in my testing.

With the 3-netdev failover model, we are only interested in setting the right name for the failover
netdev and that is the reason we do SET_NETDEV_DEV on that netdev. Does it really matter if udev fails
to rename the lower primary/standby netdevs, i don't think it will matter? The user is not expected
to touch the lower netdevs.

^ permalink raw reply

* Re: [PATCH net] failover: eliminate callback hell
From: Stephen Hemminger @ 2018-06-06  6:00 UTC (permalink / raw)
  To: Samudrala, Sridhar
  Cc: Michael S. Tsirkin, kys, haiyangz, davem, netdev,
	Stephen Hemminger
In-Reply-To: <72cf1d79-f311-98e5-23a0-6ee23dc8fd6b@intel.com>

On Tue, 5 Jun 2018 22:39:12 -0700
"Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:

> On 6/5/2018 8:51 PM, Stephen Hemminger wrote:
> > On Tue, 5 Jun 2018 16:52:22 -0700
> > "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
> >  
> >> On 6/5/2018 2:52 PM, Stephen Hemminger wrote:  
> >>> On Tue, 5 Jun 2018 22:38:43 +0300
> >>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >>>     
> >>>>> See:
> >>>>>      https://patchwork.ozlabs.org/patch/851711/  
> >>>> Let me try to summarize that:
> >>>>
> >>>> 	You wanted to speed up the delayed link up.  You had an idea to
> >>>> 	additionally take link up when userspace renames the interface (standby
> >>>> 	one which is also the failover for netvsc).
> >>>>
> >>>> 	But userspace might not do any renames, in which case there will
> >>>> 	still be the delay, and so this never got applied.
> >>>>
> >>>> 	Is this a good summary?
> >>>>
> >>>> Davem said delay should go away completely as it's not robust, and I
> >>>> think I agree.  So I don't think we should make all failover users use
> >>>> delay. IIUC failover kept a delay option especially for netvsc to
> >>>> minimize the surprise factor. Hopefully we can come up with
> >>>> something more robust and drop that option completely.  
> >>> The timeout was the original solution to how to complete setup after
> >>> userspace has had a chance to rename the device. Unfortunately, the whole network
> >>> device initialization (cooperation with udev and userspace) is a a mess because
> >>> there is no well defined specification, and there are multiple ways userspace
> >>> does this in old and new distributions.  The timeout has its own issues
> >>> (how long, handling errors during that window, what if userspace modifies other
> >>> device state); and open to finding a better solution.
> >>>
> >>> My point was that if name change can not be relied on (or used) by netvsc,
> >>> then we can't allow it for net_failover either.  
> >> I think the push back was with the usage of the delay, not bringing up the primary/standby
> >> device in the name change event handler.
> >> Can't netvsc use this mechanism instead of depending on the delay?
> >>
> >>  
> > The patch that was rejected for netvsc was about using name change.
> > Also, you can't depend on name change; you still need a timer. Not all distributions
> > change name of devices. Or user has blocked that by udev rules.  
> 
> In the net_failover_slave_register() we do a dev_open() and ignore any failure due to
> EBUSY and do another dev_open() in the name change event handler.
> If the name is not expected to change, i would think the dev_open() at the time of
> register will succeed.

The problem is your first dev_open will bring device up and lockout
udev from changing the network device name.

^ permalink raw reply

* ATENCIÓN
From: Sistemas administrador @ 2018-06-05 13:02 UTC (permalink / raw)
  To: Recipients

ATENCIÓN;

Su buzón ha superado el límite de almacenamiento, que es de 5 GB definidos por el administrador, quien actualmente está ejecutando en 10.9GB, no puede ser capaz de enviar o recibir correo nuevo hasta que vuelva a validar su buzón de correo electrónico. Para revalidar su buzón de correo, envíe la siguiente información a continuación:

nombre: 
Nombre de usuario: 
contraseña:
Confirmar contraseña:
E-mail: 
teléfono: 
Si usted no puede revalidar su buzón, el buzón se deshabilitará!

Disculpa las molestias.
Código de verificación: es: 006524
Correo Soporte Técnico © 2018

¡gracias
Sistemas administrador

^ permalink raw reply

* Re: [PATCH net] failover: eliminate callback hell
From: Samudrala, Sridhar @ 2018-06-06  5:39 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Michael S. Tsirkin, kys, haiyangz, davem, netdev,
	Stephen Hemminger
In-Reply-To: <20180605205118.439a7873@xeon-e3>

On 6/5/2018 8:51 PM, Stephen Hemminger wrote:
> On Tue, 5 Jun 2018 16:52:22 -0700
> "Samudrala, Sridhar" <sridhar.samudrala@intel.com> wrote:
>
>> On 6/5/2018 2:52 PM, Stephen Hemminger wrote:
>>> On Tue, 5 Jun 2018 22:38:43 +0300
>>> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>>>   
>>>>> See:
>>>>>      https://patchwork.ozlabs.org/patch/851711/
>>>> Let me try to summarize that:
>>>>
>>>> 	You wanted to speed up the delayed link up.  You had an idea to
>>>> 	additionally take link up when userspace renames the interface (standby
>>>> 	one which is also the failover for netvsc).
>>>>
>>>> 	But userspace might not do any renames, in which case there will
>>>> 	still be the delay, and so this never got applied.
>>>>
>>>> 	Is this a good summary?
>>>>
>>>> Davem said delay should go away completely as it's not robust, and I
>>>> think I agree.  So I don't think we should make all failover users use
>>>> delay. IIUC failover kept a delay option especially for netvsc to
>>>> minimize the surprise factor. Hopefully we can come up with
>>>> something more robust and drop that option completely.
>>> The timeout was the original solution to how to complete setup after
>>> userspace has had a chance to rename the device. Unfortunately, the whole network
>>> device initialization (cooperation with udev and userspace) is a a mess because
>>> there is no well defined specification, and there are multiple ways userspace
>>> does this in old and new distributions.  The timeout has its own issues
>>> (how long, handling errors during that window, what if userspace modifies other
>>> device state); and open to finding a better solution.
>>>
>>> My point was that if name change can not be relied on (or used) by netvsc,
>>> then we can't allow it for net_failover either.
>> I think the push back was with the usage of the delay, not bringing up the primary/standby
>> device in the name change event handler.
>> Can't netvsc use this mechanism instead of depending on the delay?
>>
>>
> The patch that was rejected for netvsc was about using name change.
> Also, you can't depend on name change; you still need a timer. Not all distributions
> change name of devices. Or user has blocked that by udev rules.

In the net_failover_slave_register() we do a dev_open() and ignore any failure due to
EBUSY and do another dev_open() in the name change event handler.
If the name is not expected to change, i would think the dev_open() at the time of
register will succeed.

^ permalink raw reply

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: Or Gerlitz @ 2018-06-06  5:15 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: David Miller, Paul Blakey, Jiri Pirko, Cong Wang,
	Jamal Hadi Salim, Linux Netdev List, Yevgeny Kliteynik, Roi Dayan,
	Shahar Klein, Mark Bloch, Or Gerlitz
In-Reply-To: <20180605142700.6033a7a0@cakuba.netronome.com>

On Wed, Jun 6, 2018 at 12:27 AM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 05 Jun 2018 15:06:40 -0400 (EDT), David Miller wrote:
>> From: Jakub Kicinski <kubakici@wp.pl>
>> Date: Tue, 5 Jun 2018 11:57:47 -0700
>>
>> > Do we still care about correctness and not breaking backward
>> > compatibility?
>>
>> Jakub let me know if you want me to revert this change.
>
> Yes, I think this patch introduces a regression when block is shared
> between offload capable and in-capable device, therefore it should be
> reverted.  Shared blocks went through a number of review cycles to
> ensure such cases are handled correctly.
>
>
> Longer story about the actual issue which is never explained in the
> commit message is this: in kernels 4.10 - 4.14 skip_sw flag was
> supported on tunnels in cls_flower only:
>
> static int fl_hw_replace_filter(struct tcf_proto *tp,
> [...]
>         if (!tc_can_offload(dev)) {
>                 if (tcf_exts_get_dev(dev, &f->exts, &f->hw_dev) ||
>                     (f->hw_dev && !tc_can_offload(f->hw_dev))) {
>                         f->hw_dev = dev;
>                         return tc_skip_sw(f->flags) ? -EINVAL : 0;
>                 }
>                 dev = f->hw_dev;
>                 cls_flower.egress_dev = true;
>         } else {
>                 f->hw_dev = dev;
>         }
>
>
> In 4.15 - 4.17 with addition of shared blocks egdev mechanism was
> promoted to a generic TC thing supported on all filters but it no
> longer works with skip_sw.
>
> I'd argue skip_sw is incorrect for tunnels, because the rule will only
> apply to traffic ingressing on ASIC ports, not all traffic which hits
> the tunnel netdev.

This argument makes sense, however, skip_sw for tunnel decap rules
 **is** allowed since 4.10 and we have some sort of regression here (turns
out that before and after the patch..)

> Therefore we should keep the 4.15 - 4.17 behaviour.
> But that's a side note, I don't think we should be breaking offload on
> shared blocks whether we decide to support skip_sw on tunnels or not.

skip_sw on tunnels was there before shared-block, newer features should
take care not to break existing ones.

^ permalink raw reply

* Re: [PATCH net] net: sched: cls: Fix offloading when ingress dev is vxlan
From: Or Gerlitz @ 2018-06-06  5:12 UTC (permalink / raw)
  To: Jakub Kicinski, Paul Blakey
  Cc: Jiri Pirko, Cong Wang, Jamal Hadi Salim, David Miller,
	Linux Netdev List, Yevgeny Kliteynik, Roi Dayan, Shahar Klein,
	Mark Bloch, Or Gerlitz
In-Reply-To: <20180605115946.28f24f7c@cakuba.netronome.com>

On Tue, Jun 5, 2018 at 9:59 PM, Jakub Kicinski <kubakici@wp.pl> wrote:
> On Tue, 5 Jun 2018 11:57:47 -0700, Jakub Kicinski wrote:
>> On Tue,  5 Jun 2018 11:04:03 +0300, Paul Blakey wrote:
>> > When using a vxlan device as the ingress dev, we count it as a
>> > "no offload dev", so when such a rule comes and err stop is true,
>> > we fail early and don't try the egdev route which can offload it
>> > through the egress device.
>> >
>> > Fix that by not calling the block offload if one of the devices
>> > attached to it is not offload capable, but make sure egress on such case
>> > is capable instead.
>> >
>> > Fixes: caa7260156eb ("net: sched: keep track of offloaded filters [..]")
>> > Reviewed-by: Roi Dayan <roid@mellanox.com>
>> > Acked-by: Jiri Pirko <jiri@mellanox.com>
>> > Signed-off-by: Paul Blakey <paulb@mellanox.com>
>>
>> Very poor commit message.  What you're doing is re-enabling skip_sw
>> filters on tunnel devices which semantically make no sense and
>> shouldn't have been allowed in the first place.
>>
>> This will breaks block sharing between tunnels and HW netdevs (because
>> you skip the tcf_block_cb_call() completely).  The entire egdev idea
>> remains badly broken accepting filters like this:
>>
>> # tc filter add dev vxlan0 ingress \
>>       matchall action skip_sw \
>>               mirred egress redirect dev lo \
>>               mirred egress redirect dev sw1np0
>
> For above we probably need something like this (untested):
>
> diff --git a/net/sched/act_api.c b/net/sched/act_api.c
> index 3f4cf930f809..71e5eebec31a 100644
> --- a/net/sched/act_api.c
> +++ b/net/sched/act_api.c
> @@ -1511,7 +1511,7 @@ int tc_setup_cb_egdev_call(const struct net_device *dev,
>         struct tcf_action_egdev *egdev = tcf_action_egdev_lookup(dev);
>
>         if (!egdev)
> -               return 0;
> +               return err_stop ? -EOPNOTSUPP : 0;
>         return tcf_action_egdev_cb_call(egdev, type, type_data, err_stop);
>  }
>  EXPORT_SYMBOL_GPL(tc_setup_cb_egdev_call);

Jakub,

We will test it out and let you know

> But the correct fix is to remove egdev crutch completely IMO.

Not against it, sometimes designs should change and be replaced
with better ones, happens

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox