Netdev List

Netdev List
 help / color / mirror / Atom feed

* [PATCH v1 net-next 01/15] net: dsa: sja1105: Change the PTP command access pattern
From: Vladimir Oltean @ 2019-09-02 16:25 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
	vedang.patel, richardcochran
  Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
	jhs, xiyou.wangcong, kurt.kanzenbach, netdev, Vladimir Oltean
In-Reply-To: <20190902162544.24613-1-olteanv@gmail.com>

The PTP command register contains enable bits for:
- Putting the 64-bit PTPCLKVAL register in add/subtract or write mode
- Taking timestamps off of the corrected vs free-running clock
- Starting/stopping the TTEthernet scheduling
- Starting/stopping PPS output
- Resetting the switch

When a command needs to be issued (e.g. "change the PTPCLKVAL from write
mode to add/subtract mode"), one cannot simply write to the command
register setting the PTPCLKADD bit to 1, because that would zeroize the
other settings. One also cannot do a read-modify-write (that would be
too easy for this hardware) because not all bits of the command register
are readable over SPI.

So this leaves us with the only option of keeping the value of the PTP
command register in the driver, and operating on that.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes since RFC:
- None.

 drivers/net/dsa/sja1105/sja1105.h     | 5 +++++
 drivers/net/dsa/sja1105/sja1105_ptp.c | 6 +-----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 78094db32622..d8a92646e80a 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -50,6 +50,10 @@ struct sja1105_regs {
 	u64 qlevel[SJA1105_NUM_PORTS];
 };
 
+struct sja1105_ptp_cmd {
+	u64 resptp;		/* reset */
+};
+
 struct sja1105_info {
 	u64 device_id;
 	/* Needed for distinction between P and R, and between Q and S
@@ -89,6 +93,7 @@ struct sja1105_private {
 	struct spi_device *spidev;
 	struct dsa_switch *ds;
 	struct sja1105_port ports[SJA1105_NUM_PORTS];
+	struct sja1105_ptp_cmd ptp_cmd;
 	struct ptp_clock_info ptp_caps;
 	struct ptp_clock *clock;
 	/* The cycle counter translates the PTP timestamps (based on
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index d8e8dd59f3d1..07374ba6b9be 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -54,10 +54,6 @@
 #define cc_to_sja1105(d) container_of((d), struct sja1105_private, tstamp_cc)
 #define dw_to_sja1105(d) container_of((d), struct sja1105_private, refresh_work)
 
-struct sja1105_ptp_cmd {
-	u64 resptp;       /* reset */
-};
-
 int sja1105_get_ts_info(struct dsa_switch *ds, int port,
 			struct ethtool_ts_info *info)
 {
@@ -218,8 +214,8 @@ int sja1105_ptpegr_ts_poll(struct sja1105_private *priv, int port, u64 *ts)
 
 int sja1105_ptp_reset(struct sja1105_private *priv)
 {
+	struct sja1105_ptp_cmd cmd = priv->ptp_cmd;
 	struct dsa_switch *ds = priv->ds;
-	struct sja1105_ptp_cmd cmd = {0};
 	int rc;
 
 	mutex_lock(&priv->ptp_lock);
-- 
2.17.1


^ permalink raw reply related

* [PATCH v1 net-next 02/15] net: dsa: sja1105: Get rid of global declaration of struct ptp_clock_info
From: Vladimir Oltean @ 2019-09-02 16:25 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
	vedang.patel, richardcochran
  Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
	jhs, xiyou.wangcong, kurt.kanzenbach, netdev, Vladimir Oltean
In-Reply-To: <20190902162544.24613-1-olteanv@gmail.com>

We need priv->ptp_caps to hold a structure and not just a pointer,
because we use container_of in the various PTP callbacks.

Therefore, the sja1105_ptp_caps structure declared in the global memory
of the driver serves no further purpose after copying it into
priv->ptp_caps.

So just populate priv->ptp_caps with the needed operations and remove
sja1105_ptp_caps.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
Changes since RFC:
- None.

 drivers/net/dsa/sja1105/sja1105_ptp.c | 29 +++++++++++++--------------
 1 file changed, 14 insertions(+), 15 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 07374ba6b9be..13f9f5799e46 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -343,29 +343,28 @@ static void sja1105_ptp_overflow_check(struct work_struct *work)
 	schedule_delayed_work(&priv->refresh_work, SJA1105_REFRESH_INTERVAL);
 }
 
-static const struct ptp_clock_info sja1105_ptp_caps = {
-	.owner		= THIS_MODULE,
-	.name		= "SJA1105 PHC",
-	.adjfine	= sja1105_ptp_adjfine,
-	.adjtime	= sja1105_ptp_adjtime,
-	.gettime64	= sja1105_ptp_gettime,
-	.settime64	= sja1105_ptp_settime,
-	.max_adj	= SJA1105_MAX_ADJ_PPB,
-};
-
 int sja1105_ptp_clock_register(struct sja1105_private *priv)
 {
 	struct dsa_switch *ds = priv->ds;
 
 	/* Set up the cycle counter */
 	priv->tstamp_cc = (struct cyclecounter) {
-		.read = sja1105_ptptsclk_read,
-		.mask = CYCLECOUNTER_MASK(64),
-		.shift = SJA1105_CC_SHIFT,
-		.mult = SJA1105_CC_MULT,
+		.read		= sja1105_ptptsclk_read,
+		.mask		= CYCLECOUNTER_MASK(64),
+		.shift		= SJA1105_CC_SHIFT,
+		.mult		= SJA1105_CC_MULT,
+	};
+	priv->ptp_caps = (struct ptp_clock_info) {
+		.owner		= THIS_MODULE,
+		.name		= "SJA1105 PHC",
+		.adjfine	= sja1105_ptp_adjfine,
+		.adjtime	= sja1105_ptp_adjtime,
+		.gettime64	= sja1105_ptp_gettime,
+		.settime64	= sja1105_ptp_settime,
+		.max_adj	= SJA1105_MAX_ADJ_PPB,
 	};
+
 	mutex_init(&priv->ptp_lock);
-	priv->ptp_caps = sja1105_ptp_caps;
 
 	priv->clock = ptp_clock_register(&priv->ptp_caps, ds->dev);
 	if (IS_ERR_OR_NULL(priv->clock))
-- 
2.17.1


^ permalink raw reply related

* [PATCH v1 net-next 00/15] tc-taprio offload for SJA1105 DSA
From: Vladimir Oltean @ 2019-09-02 16:25 UTC (permalink / raw)
  To: f.fainelli, vivien.didelot, andrew, davem, vinicius.gomes,
	vedang.patel, richardcochran
  Cc: weifeng.voon, jiri, m-karicheri2, Jose.Abreu, ilias.apalodimas,
	jhs, xiyou.wangcong, kurt.kanzenbach, netdev, Vladimir Oltean

This is the first attempt to submit the tc-taprio offload model for
inclusion in the net tree.

Changes in this version:
- Made "flags 1" and "flags 2" mutually exclusive in the taprio qdisc
- Moved taprio_enable_offload and taprio_disable_offload out of atomic
  context - spin_lock_bh(qdisc_lock(sch)). This allows drivers that
  implement the ndo_setup_tc to sleep and for taprio memory to be
  allocated with GFP_KERNEL. The only thing that was kept under the
  spinlock is the assignment of the q->dequeue and q->peek pointers.
- Finally making proper use of own API - added a taprio_alloc helper to
  avoid passing stack memory to drivers.

The first RFC from July can be seen at:
https://lists.openwall.net/netdev/2019/07/07/81

The second version of the RFC is at:
https://www.spinics.net/lists/netdev/msg596663.html

Changes in v2 of the RFC since v1:
- Adapted the taprio offload patch to work by specifying "flags 2" to
  the iproute2-next tc. At the moment I don't clearly understand whether
  the full offload and the txtime assist ("flags 1") are mutually
  exclusive or not (i.e. whether a "flags 3" mode should be rejected,
  which it currently isn't).
- Added reference counting to the taprio offload structure. Maybe the
  function names and placement could have been better though. As for the
  other complaint (cycle time calculation) it got fixed in the taprio
  parser in the meantime.
- Converted sja1105 to use the hardware PTP registers, and save/restore
  the PTP time across resets.
- Made the DSA callback for ndo_setup_tc a bit more generic, but I don't
  know whether it fulfills expectations. Drivers still can't do blocking
  operations in its execution context.
- Added a state machine for starting/stopping the scheduler based on the
  last command run on the PTP clock.

For those who want to follow along with the hardware implementation, the
manual is here:
https://www.nxp.com/docs/en/user-guide/UM10944.pdf

Original cover letter:

Using Vinicius Costa Gomes' configuration interface for 802.1Qbv (later
resent by Voon Weifeng for the stmmac driver), I am submitting for
review a draft implementation of this offload for a DSA switch.

I don't want to insist too much on the hardware specifics of SJA1105
which isn't otherwise very compliant to the IEEE spec.

In order to be able to test with Vedang Patel's iproute2 patch for
taprio offload (https://www.spinics.net/lists/netdev/msg573072.html)
I had to actually revert the txtime-assist branch as it had changed the
iproute2 interface.

In terms of impact for DSA drivers, I would like to point out that:

- Maybe somebody should pre-populate qopt->cycle_time in case the user
  does not provide one. Otherwise each driver needs to iterate over the
  GCL once, just to set the cycle time (right now stmmac does as well).

- Configuring the switch over SPI cannot apparently be done from this
  ndo_setup_tc callback because it runs in atomic context. I also have
  some downstream patches to offload tc clsact matchall with mirred
  action, but in that case it looks like the atomic context restriction
  does not apply.

- I had to copy the struct tc_taprio_qopt_offload to driver private
  memory because a static config needs to be constructed every time a
  change takes place, and there are up to 4 switch ports that may take a
  TAS configuration. I have created a private
  tc_taprio_qopt_offload_copy() helper for this - I don't know whether
  it's of any help in the general case.

There is more to be done however. The TAS needs to be integrated with
the PTP driver. This is because with a PTP clock source, the base time
is written dynamically to the PTPSCHTM (PTP schedule time) register and
must be a time in the future. Then the "real" base time of each port's
TAS config can be offset by at most ~50 ms (the DELTA field from the
Schedule Entry Points Table) relative to PTPSCHTM.
Because base times in the past are completely ignored by this hardware,
we need to decide if it's ok behaviorally for a driver to "roll" a past
base time into the immediate future by incrementally adding the cycle
time (so the phase doesn't change). If it is, then decide by how long in
the future it is ok to do so. Or alternatively, is it preferable if the
driver errors out if the user-supplied base time is in the past and the
hardware doesn't like it? But even then, there might be fringe cases
when the base time becomes a past PTP time right as the driver tries to
apply the config.
Also applying a tc-taprio offload to a second SJA1105 switch port will
inevitably need to roll the first port's (now past) base time into an
equivalent future time.
All of this is going to be complicated even further by the fact that
resetting the switch (to apply the tc-taprio offload) makes it reset its
PTP time.

Vinicius Costa Gomes (1):
  taprio: Add support for hardware offloading

Vladimir Oltean (14):
  net: dsa: sja1105: Change the PTP command access pattern
  net: dsa: sja1105: Get rid of global declaration of struct
    ptp_clock_info
  net: dsa: sja1105: Switch to hardware operations for PTP
  net: dsa: sja1105: Implement the .gettimex64 system call for PTP
  net: dsa: sja1105: Restore PTP time after switch reset
  net: dsa: sja1105: Disallow management xmit during switch reset
  net: dsa: sja1105: Move PTP data to its own private structure
  net: dsa: sja1105: Advertise the 8 TX queues
  net: dsa: Pass ndo_setup_tc slave callback to drivers
  net: dsa: sja1105: Add static config tables for scheduling
  net: dsa: sja1105: Configure the Time-Aware Scheduler via tc-taprio
    offload
  net: dsa: sja1105: Make HOSTPRIO a kernel config
  net: dsa: sja1105: Make the PTP command read-write
  net: dsa: sja1105: Implement state machine for TAS with PTP clock
    source

 drivers/net/dsa/sja1105/Kconfig               |  17 +
 drivers/net/dsa/sja1105/Makefile              |   4 +
 drivers/net/dsa/sja1105/sja1105.h             |  36 +-
 .../net/dsa/sja1105/sja1105_dynamic_config.c  |   8 +
 drivers/net/dsa/sja1105/sja1105_main.c        |  94 +-
 drivers/net/dsa/sja1105/sja1105_ptp.c         | 345 ++++----
 drivers/net/dsa/sja1105/sja1105_ptp.h         | 103 ++-
 drivers/net/dsa/sja1105/sja1105_spi.c         |  58 +-
 .../net/dsa/sja1105/sja1105_static_config.c   | 167 ++++
 .../net/dsa/sja1105/sja1105_static_config.h   |  48 +-
 drivers/net/dsa/sja1105/sja1105_tas.c         | 830 ++++++++++++++++++
 drivers/net/dsa/sja1105/sja1105_tas.h         |  69 ++
 include/linux/netdevice.h                     |   1 +
 include/net/dsa.h                             |   2 +
 include/net/pkt_sched.h                       |  33 +
 include/uapi/linux/pkt_sched.h                |   3 +-
 net/dsa/slave.c                               |  12 +-
 net/dsa/tag_sja1105.c                         |   3 +-
 net/sched/sch_taprio.c                        | 278 +++++-
 19 files changed, 1886 insertions(+), 225 deletions(-)
 create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.c
 create mode 100644 drivers/net/dsa/sja1105/sja1105_tas.h

-- 
2.17.1


^ permalink raw reply

* [PATCH v2] net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)
From: Maciej Żenczykowski @ 2019-09-02 16:23 UTC (permalink / raw)
  To: Maciej Żenczykowski, David S . Miller
  Cc: netdev, David Ahern, Lorenzo Colitti
In-Reply-To: <565e386f-e72a-73db-1f34-fedb5190658a@gmail.com>

From: Maciej Żenczykowski <maze@google.com>

There is a subtle change in behaviour introduced by:
  commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
  'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'

Before that patch /proc/net/ipv6_route includes:
00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo

Afterwards /proc/net/ipv6_route includes:
00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo

ie. the above commit causes the ::1/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).

AFAICT, this is incorrect since these routes are *not* coming from RA's.

As such, this patch restores the old behaviour.

Fixes: c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
Cc: David Ahern <dsahern@gmail.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
 net/ipv6/route.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index 558c6c68855f..516b2e568dae 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -4365,13 +4365,14 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
 	struct fib6_config cfg = {
 		.fc_table = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL,
 		.fc_ifindex = idev->dev->ifindex,
-		.fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,
+		.fc_flags = RTF_UP | RTF_NONEXTHOP,
 		.fc_dst = *addr,
 		.fc_dst_len = 128,
 		.fc_protocol = RTPROT_KERNEL,
 		.fc_nlinfo.nl_net = net,
 		.fc_ignore_dev_down = true,
 	};
+	struct fib6_info *f6i;
 
 	if (anycast) {
 		cfg.fc_type = RTN_ANYCAST;
@@ -4381,7 +4382,10 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
 		cfg.fc_flags |= RTF_LOCAL;
 	}
 
-	return ip6_route_info_create(&cfg, gfp_flags, NULL);
+	f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);
+	if (f6i)
+		f6i->dst_nocount = true;
+	return f6i;
 }
 
 /* remove deleted ip from prefsrc entries */
-- 
2.23.0.187.g17f5b7556c-goog


^ permalink raw reply related

* [PATCH net] sctp: use transport pf_retrans in sctp_do_8_2_transport_strike
From: Xin Long @ 2019-09-02 15:24 UTC (permalink / raw)
  To: network dev, linux-sctp; +Cc: davem, Marcelo Ricardo Leitner, Neil Horman

Transport should use its own pf_retrans to do the error_count
check, instead of asoc's. Otherwise, it's meaningless to make
pf_retrans per transport.

Fixes: 5aa93bcf66f4 ("sctp: Implement quick failover draft from tsvwg")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
---
 net/sctp/sm_sideeffect.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/sctp/sm_sideeffect.c b/net/sctp/sm_sideeffect.c
index 1cf5bb5..e52b212 100644
--- a/net/sctp/sm_sideeffect.c
+++ b/net/sctp/sm_sideeffect.c
@@ -547,7 +547,7 @@ static void sctp_do_8_2_transport_strike(struct sctp_cmd_seq *commands,
 	if (net->sctp.pf_enable &&
 	   (transport->state == SCTP_ACTIVE) &&
 	   (transport->error_count < transport->pathmaxrxt) &&
-	   (transport->error_count > asoc->pf_retrans)) {
+	   (transport->error_count > transport->pf_retrans)) {
 
 		sctp_assoc_control_transport(asoc, transport,
 					     SCTP_TRANSPORT_PF,
-- 
2.1.0


^ permalink raw reply related

* Re: [PATCH v4 1/5] vsock/virtio: limit the memory used per-socket
From: Michael S. Tsirkin @ 2019-09-02 15:23 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: netdev, linux-kernel, Stefan Hajnoczi, David S. Miller,
	virtualization, Jason Wang, kvm
In-Reply-To: <20190902095723.6vuvp73fdunmiogo@steredhat>

On Mon, Sep 02, 2019 at 11:57:23AM +0200, Stefano Garzarella wrote:
> On Sun, Sep 01, 2019 at 06:17:58AM -0400, Michael S. Tsirkin wrote:
> > On Sun, Sep 01, 2019 at 04:26:19AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Aug 01, 2019 at 03:36:16PM +0200, Stefano Garzarella wrote:
> > > > On Thu, Aug 01, 2019 at 09:21:15AM -0400, Michael S. Tsirkin wrote:
> > > > > On Thu, Aug 01, 2019 at 12:47:54PM +0200, Stefano Garzarella wrote:
> > > > > > On Tue, Jul 30, 2019 at 04:42:25PM -0400, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Jul 30, 2019 at 11:35:39AM +0200, Stefano Garzarella wrote:
> > > > > > 
> > > > > > (...)
> > > > > > 
> > > > > > > > 
> > > > > > > > The problem here is the compatibility. Before this series virtio-vsock
> > > > > > > > and vhost-vsock modules had the RX buffer size hard-coded
> > > > > > > > (VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE = 4K). So, if we send a buffer smaller
> > > > > > > > of 4K, there might be issues.
> > > > > > > 
> > > > > > > Shouldn't be if they are following the spec. If not let's fix
> > > > > > > the broken parts.
> > > > > > > 
> > > > > > > > 
> > > > > > > > Maybe it is the time to add add 'features' to virtio-vsock device.
> > > > > > > > 
> > > > > > > > Thanks,
> > > > > > > > Stefano
> > > > > > > 
> > > > > > > Why would a remote care about buffer sizes?
> > > > > > > 
> > > > > > > Let's first see what the issues are. If they exist
> > > > > > > we can either fix the bugs, or code the bug as a feature in spec.
> > > > > > > 
> > > > > > 
> > > > > > The vhost_transport '.stream_enqueue' callback
> > > > > > [virtio_transport_stream_enqueue()] calls the virtio_transport_send_pkt_info(),
> > > > > > passing the user message. This function allocates a new packet, copying
> > > > > > the user message, but (before this series) it limits the packet size to
> > > > > > the VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE (4K):
> > > > > > 
> > > > > > static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> > > > > > 					  struct virtio_vsock_pkt_info *info)
> > > > > > {
> > > > > >  ...
> > > > > > 	/* we can send less than pkt_len bytes */
> > > > > > 	if (pkt_len > VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE)
> > > > > > 		pkt_len = VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE;
> > > > > > 
> > > > > > 	/* virtio_transport_get_credit might return less than pkt_len credit */
> > > > > > 	pkt_len = virtio_transport_get_credit(vvs, pkt_len);
> > > > > > 
> > > > > > 	/* Do not send zero length OP_RW pkt */
> > > > > > 	if (pkt_len == 0 && info->op == VIRTIO_VSOCK_OP_RW)
> > > > > > 		return pkt_len;
> > > > > >  ...
> > > > > > }
> > > > > > 
> > > > > > then it queues the packet for the TX worker calling .send_pkt()
> > > > > > [vhost_transport_send_pkt() in the vhost_transport case]
> > > > > > 
> > > > > > The main function executed by the TX worker is
> > > > > > vhost_transport_do_send_pkt() that picks up a buffer from the virtqueue
> > > > > > and it tries to copy the packet (up to 4K) on it.  If the buffer
> > > > > > allocated from the guest will be smaller then 4K, I think here it will
> > > > > > be discarded with an error:
> > > > > > 
> > > > 
> > > > I'm adding more lines to explain better.
> > > > 
> > > > > > static void
> > > > > > vhost_transport_do_send_pkt(struct vhost_vsock *vsock,
> > > > > > 				struct vhost_virtqueue *vq)
> > > > > > {
> > > > 		...
> > > > 
> > > > 		head = vhost_get_vq_desc(vq, vq->iov, ARRAY_SIZE(vq->iov),
> > > > 					 &out, &in, NULL, NULL);
> > > > 
> > > > 		...
> > > > 
> > > > 		len = iov_length(&vq->iov[out], in);
> > > > 		iov_iter_init(&iov_iter, READ, &vq->iov[out], in, len);
> > > > 
> > > > 		nbytes = copy_to_iter(&pkt->hdr, sizeof(pkt->hdr), &iov_iter);
> > > > 		if (nbytes != sizeof(pkt->hdr)) {
> > > > 			virtio_transport_free_pkt(pkt);
> > > > 			vq_err(vq, "Faulted on copying pkt hdr\n");
> > > > 			break;
> > > > 		}
> > > > 
> > > > > >  ...
> > > > > > 		nbytes = copy_to_iter(pkt->buf, pkt->len, &iov_iter);
> > > > > 
> > > > > isn't pck len the actual length though?
> > > > > 
> > > > 
> > > > It is the length of the packet that we are copying in the guest RX
> > > > buffers pointed by the iov_iter. The guest allocates an iovec with 2
> > > > buffers, one for the header and one for the payload (4KB).
> > > 
> > > BTW at the moment that forces another kmalloc within virtio core. Maybe
> > > vsock needs a flag to skip allocation in this case.  Worth benchmarking.
> > > See virtqueue_use_indirect which just does total_sg > 1.
> 
> Okay, I'll take a look at virtqueue_use_indirect and I'll do some
> benchmarking.
> 
> > > 
> > > > 
> > > > > > 		if (nbytes != pkt->len) {
> > > > > > 			virtio_transport_free_pkt(pkt);
> > > > > > 			vq_err(vq, "Faulted on copying pkt buf\n");
> > > > > > 			break;
> > > > > > 		}
> > > > > >  ...
> > > > > > }
> > > > > > 
> > > > > > 
> > > > > > This series changes this behavior since now we will split the packet in
> > > > > > vhost_transport_do_send_pkt() depending on the buffer found in the
> > > > > > virtqueue.
> > > > > > 
> > > > > > We didn't change the buffer size in this series, so we still backward
> > > > > > compatible, but if we will use buffers smaller than 4K, we should
> > > > > > encounter the error described above.
> > > 
> > > So that's an implementation bug then? It made an assumption
> > > of a 4K sized buffer? Or even PAGE_SIZE sized buffer?
> 
> Yes, I think it made an assumption and it used this macro as a limit:
> 
> include/linux/virtio_vsock.h:13:
>     #define VIRTIO_VSOCK_DEFAULT_RX_BUF_SIZE        (1024 * 4)
> 
> > 
> > Assuming we miss nothing and buffers < 4K are broken,
> > I think we need to add this to the spec, possibly with
> > a feature bit to relax the requirement that all buffers
> > are at least 4k in size.
> > 
> 
> Okay, should I send a proposal to virtio-dev@lists.oasis-open.org?
> 
> Thanks,
> Stefano

virtio-comment is more appropriate for this.

-- 
MST

^ permalink raw reply

* Re: [PATCH v3] virtio-net: lower min ring num_free for efficiency
From: Michael S. Tsirkin @ 2019-09-02 15:01 UTC (permalink / raw)
  To: ? jiang
  Cc: jasowang@redhat.com, davem@davemloft.net, ast@kernel.org,
	daniel@iogearbox.net, jakub.kicinski@netronome.com,
	hawk@kernel.org, john.fastabend@gmail.com, kafai@fb.com,
	songliubraving@fb.com, yhs@fb.com,
	virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
	linux-kernel@vger.kernel.org, xdp-newbies@vger.kernel.org,
	bpf@vger.kernel.org, jiangran.jr@alibaba-inc.com
In-Reply-To: <BYAPR14MB32059DD9439280B66B532351A6AB0@BYAPR14MB3205.namprd14.prod.outlook.com>

On Tue, Aug 20, 2019 at 02:51:23AM +0000, ? jiang wrote:
> This change lowers ring buffer reclaim threshold from 1/2*queue to budget
> for better performance. According to our test with qemu + dpdk, packet
> dropping happens when the guest is not able to provide free buffer in
> avail ring timely with default 1/2*queue. The value in the patch has been
> tested and does show better performance.
> 
> Test setup: iperf3 to generate packets to guest (total 30mins, pps 400k, UDP)
> avg packets drop before: 2842
> avg packets drop after: 360(-87.3%)
> 
> Further, current code suffers from a starvation problem: the amount of
> work done by try_fill_recv is not bounded by the budget parameter, thus
> (with large queues) once in a while userspace gets blocked for a long
> time while queue is being refilled. Trigger refills earlier to make sure
> the amount of work to do is limited.
> 
> Signed-off-by: jiangkidd <jiangkidd@hotmail.com>
> Acked-by: Jason Wang <jasowang@redhat.com>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>


Dave, could you merge this please?

Either net or net-next at your discretion.

> ---
>  drivers/net/virtio_net.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index 0d4115c9e20b..bc08be7925eb 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -1331,7 +1331,7 @@ static int virtnet_receive(struct receive_queue *rq, int budget,
>  		}
>  	}
>  
> -	if (rq->vq->num_free > virtqueue_get_vring_size(rq->vq) / 2) {
> +	if (rq->vq->num_free > min((unsigned int)budget, virtqueue_get_vring_size(rq->vq)) / 2) {
>  		if (!try_fill_recv(vi, rq, GFP_ATOMIC))
>  			schedule_delayed_work(&vi->refill, 0);
>  	}
> -- 
> 2.11.0

^ permalink raw reply

* Re: [PATCH v2 1/6] mdev: Introduce sha1 based mdev alias
From: Cornelia Huck @ 2019-09-02 14:46 UTC (permalink / raw)
  To: Parav Pandit
  Cc: alex.williamson@redhat.com, Jiri Pirko, kwankhede@nvidia.com,
	davem@davemloft.net, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB48661F9608F284AB5C9BAEB5D1BD0@AM0PR05MB4866.eurprd05.prod.outlook.com>

On Fri, 30 Aug 2019 15:45:13 +0000
Parav Pandit <parav@mellanox.com> wrote:

> > > > > > This detour via the local variable looks weird to me. Can you
> > > > > > either create the alias directly in the mdev (would need to
> > > > > > happen later in the function, but I'm not sure why you generate
> > > > > > the alias before checking for duplicates anyway), or do an explicit copy?  
> > > > > Alias duplicate check is done after generating it, because
> > > > > duplicate alias are  
> > > > not allowed.  
> > > > > The probability of collision is rare.
> > > > > So it is speculatively generated without hold the lock, because
> > > > > there is no  
> > > > need to hold the lock.  
> > > > > It is compared along with guid while mutex lock is held in single loop.
> > > > > And if it is duplicate, there is no need to allocate mdev.
> > > > >
> > > > > It will be sub optimal to run through the mdev list 2nd time after
> > > > > mdev  
> > > > creation and after generating alias for duplicate check.
> > > >
> > > > Ok, but what about copying it? I find this "set local variable to
> > > > NULL after ownership is transferred" pattern a bit unintuitive.
> > > > Copying it to the mdev (and then unconditionally freeing it) looks more  
> > obvious to me.  
> > > Its not unconditionally freed.  
> > 
> > That's not what I have been saying :(
> >   
> Ah I see. You want to allocate alias memory twice; once inside mdev device and another one in _create() function.
> _create() one you want to free unconditionally.
> 
> Well, passing pointer is fine.

It's not that it doesn't work, but it feels fragile due to its
non-obviousness.

> mdev_register_device() has similar little tricky pattern that makes parent = NULL on __find_parent_device() finds duplicate one.

I don't think that the two are comparable.

> 
> Ownership transfer is more straight forward code.

I have to disagree here.

> 
> It is similar to device_initialize(), device init sequence code, where once device_initialize is done, freeing the device memory will be left to the put_device(), we don't call kfree() on mdev device.

This does not really look similar to me: devices are refcounted
structures, while strings aren't; you transfer a local pointer to a
refcounted structure and then discard the local reference.

^ permalink raw reply

* Re: [PATCH v2 5/6] mdev: Update sysfs documentation
From: Cornelia Huck @ 2019-09-02 14:36 UTC (permalink / raw)
  To: Parav Pandit
  Cc: alex.williamson@redhat.com, Jiri Pirko, kwankhede@nvidia.com,
	davem@davemloft.net, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, netdev@vger.kernel.org
In-Reply-To: <AM0PR05MB4866372C521F59491838C8E4D1BD0@AM0PR05MB4866.eurprd05.prod.outlook.com>

On Fri, 30 Aug 2019 13:10:17 +0000
Parav Pandit <parav@mellanox.com> wrote:

> > -----Original Message-----
> > From: Cornelia Huck <cohuck@redhat.com>
> > Sent: Friday, August 30, 2019 6:19 PM
> > To: Parav Pandit <parav@mellanox.com>
> > Cc: alex.williamson@redhat.com; Jiri Pirko <jiri@mellanox.com>;
> > kwankhede@nvidia.com; davem@davemloft.net; kvm@vger.kernel.org; linux-
> > kernel@vger.kernel.org; netdev@vger.kernel.org
> > Subject: Re: [PATCH v2 5/6] mdev: Update sysfs documentation
> > 
> > On Thu, 29 Aug 2019 06:19:03 -0500
> > Parav Pandit <parav@mellanox.com> wrote:
> >   
> > > Updated documentation for optional read only sysfs attribute.  
> > 
> > I'd probably merge this into the patch introducing the attribute.
> >   
> Ok. I will spin v3.
> 
> > >
> > > Signed-off-by: Parav Pandit <parav@mellanox.com>
> > > ---
> > >  Documentation/driver-api/vfio-mediated-device.rst | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > >
> > > diff --git a/Documentation/driver-api/vfio-mediated-device.rst
> > > b/Documentation/driver-api/vfio-mediated-device.rst
> > > index 25eb7d5b834b..0ab03d3f5629 100644
> > > --- a/Documentation/driver-api/vfio-mediated-device.rst
> > > +++ b/Documentation/driver-api/vfio-mediated-device.rst
> > > @@ -270,6 +270,7 @@ Directories and Files Under the sysfs for Each mdev  
> > Device  
> > >           |--- remove
> > >           |--- mdev_type {link to its type}
> > >           |--- vendor-specific-attributes [optional]
> > > +         |--- alias [optional]  
> > 
> > "optional" implies "not always present" to me, not "might return a read error if
> > not available". Don't know if there's a better way to tag this? Or make it really
> > optional? :)  
> 
> May be write it as,
> 
> alias [ optional when requested by parent ]

I'm not sure what 'optional when requested' is supposed to mean...
maybe something like 'content optional' or so?

> 
> >   
> > >
> > >  * remove (write only)
> > >
> > > @@ -281,6 +282,10 @@ Example::
> > >
> > >  	# echo 1 > /sys/bus/mdev/devices/$mdev_UUID/remove
> > >
> > > +* alias (read only)
> > > +Whenever a parent requested to generate an alias, each mdev is
> > > +assigned a unique alias by the mdev core. This file shows the alias of the  
> > mdev device.
> > 
> > It's not really the parent, but the vendor driver requesting this, right? Also,  
> At mdev level, it only knows parent->ops structure, whether parent is registered by vendor driver or something else.

Who else is supposed to create the mdev device?

> 
> > "each mdev" is a bit ambiguous,   
> It is in context of the parent. Sentence is not starting with "each mdev".
> But may be more verbosely written as,
> 
> Whenever a parent requested to generate an alias, Each mdev device of such parent is assigned 
> unique alias by the mdev core. This file shows the alias of the mdev device.

I'd really leave the parent out of this: this seems more like an
implementation detail. It's more that alias may either contain an
alias, or return a read error if no alias has been generated. Who
requested the alias to be generated is probably not really of interest
to the userspace reader.

> 
> > created via that driver. Lastly, if we stick with the "returns an error if not
> > implemented" approach, that should also be mentioned here.  
> Ok. Will spin v3 to describe it.
> 
> >   
> > > +
> > >  Mediated device Hot plug
> > >  ------------------------
> > >  
> 


^ permalink raw reply

* Re: [PATCH] net/skbuff: silence warnings under memory pressure
From: Vlastimil Babka @ 2019-09-02 14:24 UTC (permalink / raw)
  To: Qian Cai, Eric Dumazet, davem
  Cc: netdev, linux-mm, linux-kernel, Michal Hocko
In-Reply-To: <1567178728.5576.32.camel@lca.pw>

On 8/30/19 5:25 PM, Qian Cai wrote:
> On Fri, 2019-08-30 at 17:11 +0200, Eric Dumazet wrote:
>>
>> On 8/30/19 4:57 PM, Qian Cai wrote:
>>> When running heavy memory pressure workloads, the system is throwing
>>> endless warnings below due to the allocation could fail from
>>> __build_skb(), and the volume of this call could be huge which may
>>> generate a lot of serial console output and cosumes all CPUs as
>>> warn_alloc() could be expensive by calling dump_stack() and then
>>> show_mem().
>>>
>>> Fix it by silencing the warning in this call site. Also, it seems
>>> unnecessary to even print a warning at all if the allocation failed in
>>> __build_skb(), as it may just retransmit the packet and retry.
>>>

Well, __GFP_NOWARN would save me from explaining this warning to users
many times. OTOH usually it's an indication that min_free_kbytes should
be raised to better cope with network traffic.

>>
>> Same patches are showing up there and there from time to time.
>>
>> Why is this particular spot interesting, against all others not adding
>> __GFP_NOWARN ?

This one is interesting that it's a GFP_ATOMIC allocation triggered by
incoming packets, and has a fallback mechanism. I don't recall other so
notoric ones.

>> Are we going to have hundred of patches adding __GFP_NOWARN at various points,
>> or should we get something generic to not flood the syslog in case of memory
>> pressure ?
>>
> 
> From my testing which uses LTP oom* tests. There are only 3 places need to be
> patched. The other two are in IOMMU code for both Intel and AMD. The place is
> particular interesting because it could cause the system with floating serial
> console output for days without making progress in OOM. I suppose it ends up in
> a looping condition that warn_alloc() would end up generating more calls into
> __build_skb() via ksoftirqd.

Regardless of this particular allocation, if the reporting itself makes
the conditions so much worse, then at least some kind of general
ratelimit would make sense indeed.


^ permalink raw reply

* Re: [PATCH v5] perf machine: arm/arm64: Improve completeness for kernel address space
From: Leo Yan @ 2019-09-02 14:15 UTC (permalink / raw)
  To: Adrian Hunter
  Cc: Arnaldo Carvalho de Melo, Alexander Shishkin, Jiri Olsa,
	Namhyung Kim, Alexei Starovoitov, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, linux-kernel, netdev,
	bpf, clang-built-linux, Mathieu Poirier, Peter Zijlstra,
	Suzuki Poulouse, coresight, linux-arm-kernel
In-Reply-To: <20190826125105.GA3288@leoy-ThinkPad-X240s>

Hi Adrian,

On Mon, Aug 26, 2019 at 08:51:05PM +0800, Leo Yan wrote:
> Hi Adrian,
> 
> On Fri, Aug 16, 2019 at 04:00:02PM +0300, Adrian Hunter wrote:
> > On 16/08/19 4:45 AM, Leo Yan wrote:
> > > Hi Adrian,
> > > 
> > > On Thu, Aug 15, 2019 at 02:45:57PM +0300, Adrian Hunter wrote:
> > > 
> > > [...]
> > > 
> > >>>> How come you cannot use kallsyms to get the information?
> > >>>
> > >>> Thanks for pointing out this.  Sorry I skipped your comment "I don't
> > >>> know how you intend to calculate ARM_PRE_START_SIZE" when you reviewed
> > >>> the patch v3, I should use that chance to elaborate the detailed idea
> > >>> and so can get more feedback/guidance before procceed.
> > >>>
> > >>> Actually, I have considered to use kallsyms when worked on the previous
> > >>> patch set.
> > >>>
> > >>> As mentioned in patch set v4's cover letter, I tried to implement
> > >>> machine__create_extra_kernel_maps() for arm/arm64, the purpose is to
> > >>> parse kallsyms so can find more kernel maps and thus also can fixup
> > >>> the kernel start address.  But I found the 'perf script' tool directly
> > >>> calls machine__get_kernel_start() instead of running into the flow for
> > >>> machine__create_extra_kernel_maps();
> > >>
> > >> Doesn't it just need to loop through each kernel map to find the lowest
> > >> start address?
> > > 
> > > Based on your suggestion, I worked out below change and verified it
> > > can work well on arm64 for fixing up start address; please let me know
> > > if the change works for you?
> > 
> > How does that work if take a perf.data file to a machine with a different
> > architecture?
> 
> Sorry I delayed so long to respond to your question; I didn't have
> confidence to give out very reasonale answer and this is the main reason
> for delaying.

Could you take chance to review my below replying?  I'd like to get
your confirmation before I send out offical patch.

Thanks,
Leo Yan

> 
> For your question for taking a perf.data file to a machine with a
> different architecture, we can firstly use command 'perf buildid-list'
> to print out the buildid for kallsyms, based on the dumped buildid we
> can find out the location for the saved kallsyms file; then we can use
> option '--kallsyms' to specify the offline kallsyms file and use the
> offline kallsyms to fixup kernel start address.  The detailed commands
> are listed as below:
> 
> root@debian:~# perf buildid-list
> 7b36dfca8317ef74974ebd7ee5ec0a8b35c97640 [kernel.kallsyms]
> 56b84aa88a1bcfe222a97a53698b92723a3977ca /usr/lib/systemd/systemd
> 0956b952e9cd673d48ff2cfeb1a9dbd0c853e686 /usr/lib/aarch64-linux-gnu/libm-2.28.so
> [...]
> 
> root@debian:~# perf script --kallsyms ~/.debug/\[kernel.kallsyms\]/7b36dfca8317ef74974ebd7ee5ec0a8b35c97640/kallsyms
> 
> The amended patch is as below, please review and always welcome
> any suggestions or comments!
> 
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index 5734460fc89e..593f05cc453f 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -2672,9 +2672,26 @@ int machine__nr_cpus_avail(struct machine *machine)
>  	return machine ? perf_env__nr_cpus_avail(machine->env) : 0;
>  }
>  
> +static int machine__fixup_kernel_start(void *arg,
> +				       const char *name __maybe_unused,
> +				       char type,
> +				       u64 start)
> +{
> +	struct machine *machine = arg;
> +
> +	type = toupper(type);
> +
> +	/* Fixup for text, weak, data and bss sections. */
> +	if (type == 'T' || type == 'W' || type == 'D' || type == 'B')
> +		machine->kernel_start = min(machine->kernel_start, start);
> +
> +	return 0;
> +}
> +
>  int machine__get_kernel_start(struct machine *machine)
>  {
>  	struct map *map = machine__kernel_map(machine);
> +	char filename[PATH_MAX];
>  	int err = 0;
>  
>  	/*
> @@ -2696,6 +2713,22 @@ int machine__get_kernel_start(struct machine *machine)
>  		if (!err && !machine__is(machine, "x86_64"))
>  			machine->kernel_start = map->start;
>  	}
> +
> +	if (symbol_conf.kallsyms_name != NULL) {
> +		strncpy(filename, symbol_conf.kallsyms_name, PATH_MAX);
> +	} else {
> +		machine__get_kallsyms_filename(machine, filename, PATH_MAX);
> +
> +		if (symbol__restricted_filename(filename, "/proc/kallsyms"))
> +			goto out;
> +	}
> +
> +	if (kallsyms__parse(filename, machine, machine__fixup_kernel_start))
> +		pr_warning("Fail to fixup kernel start address. skipping...\n");
> +
> +out:
>  	return err;
>  }
>  
> 
> Thanks,
> Leo Yan

^ permalink raw reply

* Re: BUG_ON in skb_segment, after bpf_skb_change_proto was applied
From: Shmulik Ladkani @ 2019-09-02 13:44 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: Daniel Borkmann, Eric Dumazet, netdev, Alexander Duyck,
	Alexei Starovoitov, Yonghong Song, Steffen Klassert,
	Shmulik Ladkani, eyal
In-Reply-To: <CA+FuTSfVsgNDi7c=GUU8nMg2hWxF2SjCNLXetHeVPdnxAW5K-w@mail.gmail.com>

On Sun, 1 Sep 2019 16:05:48 -0400
Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:

> One quick fix is to disable sg and thus revert to copying in this
> case. Not ideal, but better than a kernel splat:
> 
> @@ -3714,6 +3714,9 @@ struct sk_buff *skb_segment(struct sk_buff *head_skb,
>         sg = !!(features & NETIF_F_SG);
>         csum = !!can_checksum_protocol(features, proto);
> 
> +       if (list_skb && skb_headlen(list_skb) && !list_skb->head_frag)
> +               sg = false;
> +

Many thanks Willem for looking into this.

Indeed the above suggestion avoids the splat, at least with the
reproducer.

I'll look deeper into this, to make sure the generated skbs are indeed
legit and have correct content.

Shmulik

^ permalink raw reply

* Re: [PATCH] net-ipv6: fix excessive RTF_ADDRCONF flag on ::1/128 local route (and others)
From: David Ahern @ 2019-09-02 13:37 UTC (permalink / raw)
  To: Maciej Żenczykowski, Maciej Żenczykowski,
	David S . Miller; +Cc: netdev
In-Reply-To: <20190901174759.257032-1-zenczykowski@gmail.com>

On 9/1/19 11:47 AM, Maciej Żenczykowski wrote:
> From: Maciej Żenczykowski <maze@google.com>
> 
> There is a subtle change in behaviour introduced by:
>   commit c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
>   'ipv6: Change addrconf_f6i_alloc to use ip6_route_info_create'
> 
> Before that patch /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000003 00000000 80200001 lo
> 
> Afterwards /proc/net/ipv6_route includes:
> 00000000000000000000000000000001 80 00000000000000000000000000000000 00 00000000000000000000000000000000 00000000 00000002 00000000 80240001 lo
> 
> ie. the above commit causes the ::/128 local (automatic) route to be flagged with RTF_ADDRCONF (0x040000).

The ADDRCONF flag is wrong.

> 
> AFAICT, this is incorrect since these routes are *not* coming from RA's.
> 
> As such, this patch restores the old behaviour.
> 
> Fixes: c7a1ce397adacaf5d4bb2eab0a738b5f80dc3e43
> Cc: David Ahern <dsahern@gmail.com>
> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> ---
>  net/ipv6/route.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/net/ipv6/route.c b/net/ipv6/route.c
> index 558c6c68855f..cee977e52d34 100644
> --- a/net/ipv6/route.c
> +++ b/net/ipv6/route.c
> @@ -4365,13 +4365,14 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  	struct fib6_config cfg = {
>  		.fc_table = l3mdev_fib_table(idev->dev) ? : RT6_TABLE_LOCAL,
>  		.fc_ifindex = idev->dev->ifindex,
> -		.fc_flags = RTF_UP | RTF_ADDRCONF | RTF_NONEXTHOP,
> +		.fc_flags = RTF_UP | RTF_NONEXTHOP,
>  		.fc_dst = *addr,
>  		.fc_dst_len = 128,
>  		.fc_protocol = RTPROT_KERNEL,
>  		.fc_nlinfo.nl_net = net,
>  		.fc_ignore_dev_down = true,
>  	};
> +	struct fib6_info *f6i;
>  
>  	if (anycast) {
>  		cfg.fc_type = RTN_ANYCAST;
> @@ -4381,7 +4382,9 @@ struct fib6_info *addrconf_f6i_alloc(struct net *net,
>  		cfg.fc_flags |= RTF_LOCAL;
>  	}
>  
> -	return ip6_route_info_create(&cfg, gfp_flags, NULL);
> +	f6i = ip6_route_info_create(&cfg, gfp_flags, NULL);

ip6_route_info_create can return NULL.



> +	f6i->dst_nocount = true;
> +	return f6i;
>  }
>  
>  /* remove deleted ip from prefsrc entries */
> 


^ permalink raw reply

* RE: [net-next 0/3] ravb: Remove use of undocumented registers
From: Chris Paterson @ 2019-09-02 13:16 UTC (permalink / raw)
  To: Geert Uytterhoeven, Simon Horman, Biju Das, Fabrizio Castro
  Cc: David Miller, Sergei Shtylyov, Magnus Damm, netdev, Linux-Renesas
In-Reply-To: <CAMuHMdXQypGJ_oqUndOcf02GCqxEGEOK15+jnS5ehqdUJ+A8aw@mail.gmail.com>

Hello Geert,

> From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org>
> On Behalf Of Geert Uytterhoeven
> Sent: 02 September 2019 09:16
> 
> Hi Simon, Biju, Fabrizio,
> 
> On Mon, Sep 2, 2019 at 10:06 AM Simon Horman
> <horms+renesas@verge.net.au> wrote:
> > this short series cleans up the RAVB driver a little.
> >
> > The first patch corrects the spelling of the FBP field of SFO register.
> > This register field is unused and should have no run-time effect.
> >
> > The remaining two patches remove the use of undocumented registers
> > after some consultation with the internal Renesas BSP team.
> >
> > All patches have been lightly tested on:
> > * E3 Ebisu
> > * H3 Salvator-XS (ES2.0)
> > * M3-W Salvator-XS
> > * M3-N Salvator-XS
> 
> It would be good if someone could test this on an R-Car Gen2 board
> that uses ravb (iwg22d or iwg23s).

I've tried this series + net-next on the iwg23s haven't seen any issues from a quick sanity test.

Kind regards, Chris

> 
> Thanks!
> 
> > Kazuya Mizuguchi (2):
> >   ravb: correct typo in FBP field of SFO register
> >   ravb: Remove undocumented processing
> >
> > Simon Horman (1):
> >   ravb: TROCR register is only present on R-Car Gen3
> 
> Gr{oetje,eeting}s,
> 
>                         Geert
> 
> --
> Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-
> m68k.org
> 
> In personal conversations with technical people, I call myself a hacker. But
> when I'm talking to journalists I just say "programmer" or something like that.
>                                 -- Linus Torvalds

^ permalink raw reply

* [PATCH net-next v3 1/3] net: dsa: mt7530: Convert to PHYLINK API
From: René van Dorst @ 2019-09-02 13:02 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, Russell King,
	John Crispin, linux-mips, Frank Wunderlich, René van Dorst,
	Russell King
In-Reply-To: <20190902130226.26845-1-opensource@vdorst.com>

Convert mt7530 to PHYLINK API

Signed-off-by: René van Dorst <opensource@vdorst.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
---
v2->v3:
* No change
* Add tags acked-by and tested-by
v1->v2:
* Refactor "unsupported" phy_interface part in
  mt7530_phylink_mac_validate() suggested by Russell King
* Report and return when phylink tries to use autoneg_inband in
  mt7530_phylink_mac_config() suggested by Russell King
* Refactor port 6 setup in mt7530_phylink_mac_config()
rfc->v1:
* Renamed P5_MODE_* to P5_INTF_SEL_*. fits the function more
* Convert if-statement for speed bits to a switch suggested by
  Daniel Santos
* Refactor flow_control pause bits and don't use state->link in
  mt7530_phylink_mac_config() suggested by Russell King
* Move MAC tx/rx en/disable to mt7530_phylink_mac_link_up/down()
  suggested by Russell King
* Always support PHY_INTERFACE_MODE_NA in mt7530_phylink_validate()
  suggested by Russell King
* Added phylink_set_port_modes() in mt7530_phylink_validate() suggested
  by Russell King
* Remove dev_err on the end of mt7530_phylink_mac_config() suggested by
  Russell King

 drivers/net/dsa/mt7530.c | 266 +++++++++++++++++++++++++++++----------
 drivers/net/dsa/mt7530.h |  32 +++--
 2 files changed, 211 insertions(+), 87 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index c48e29486b10..ecc13b57e619 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -13,7 +13,7 @@
 #include <linux/of_mdio.h>
 #include <linux/of_net.h>
 #include <linux/of_platform.h>
-#include <linux/phy.h>
+#include <linux/phylink.h>
 #include <linux/regmap.h>
 #include <linux/regulator/consumer.h>
 #include <linux/reset.h>
@@ -633,63 +633,6 @@ mt7530_get_sset_count(struct dsa_switch *ds, int port, int sset)
 	return ARRAY_SIZE(mt7530_mib);
 }
 
-static void mt7530_adjust_link(struct dsa_switch *ds, int port,
-			       struct phy_device *phydev)
-{
-	struct mt7530_priv *priv = ds->priv;
-
-	if (phy_is_pseudo_fixed_link(phydev)) {
-		dev_dbg(priv->dev, "phy-mode for master device = %x\n",
-			phydev->interface);
-
-		/* Setup TX circuit incluing relevant PAD and driving */
-		mt7530_pad_clk_setup(ds, phydev->interface);
-
-		if (priv->id == ID_MT7530) {
-			/* Setup RX circuit, relevant PAD and driving on the
-			 * host which must be placed after the setup on the
-			 * device side is all finished.
-			 */
-			mt7623_pad_clk_setup(ds);
-		}
-	} else {
-		u16 lcl_adv = 0, rmt_adv = 0;
-		u8 flowctrl;
-		u32 mcr = PMCR_USERP_LINK | PMCR_FORCE_MODE;
-
-		switch (phydev->speed) {
-		case SPEED_1000:
-			mcr |= PMCR_FORCE_SPEED_1000;
-			break;
-		case SPEED_100:
-			mcr |= PMCR_FORCE_SPEED_100;
-			break;
-		}
-
-		if (phydev->link)
-			mcr |= PMCR_FORCE_LNK;
-
-		if (phydev->duplex) {
-			mcr |= PMCR_FORCE_FDX;
-
-			if (phydev->pause)
-				rmt_adv = LPA_PAUSE_CAP;
-			if (phydev->asym_pause)
-				rmt_adv |= LPA_PAUSE_ASYM;
-
-			lcl_adv = linkmode_adv_to_lcl_adv_t(
-				phydev->advertising);
-			flowctrl = mii_resolve_flowctrl_fdx(lcl_adv, rmt_adv);
-
-			if (flowctrl & FLOW_CTRL_TX)
-				mcr |= PMCR_TX_FC_EN;
-			if (flowctrl & FLOW_CTRL_RX)
-				mcr |= PMCR_RX_FC_EN;
-		}
-		mt7530_write(priv, MT7530_PMCR_P(port), mcr);
-	}
-}
-
 static int
 mt7530_cpu_port_enable(struct mt7530_priv *priv,
 		       int port)
@@ -698,9 +641,6 @@ mt7530_cpu_port_enable(struct mt7530_priv *priv,
 	mt7530_write(priv, MT7530_PVC_P(port),
 		     PORT_SPEC_TAG);
 
-	/* Setup the MAC by default for the cpu port */
-	mt7530_write(priv, MT7530_PMCR_P(port), PMCR_CPUP_LINK);
-
 	/* Disable auto learning on the cpu port */
 	mt7530_set(priv, MT7530_PSC_P(port), SA_DIS);
 
@@ -731,9 +671,6 @@ mt7530_port_enable(struct dsa_switch *ds, int port,
 
 	mutex_lock(&priv->reg_mutex);
 
-	/* Setup the MAC for the user port */
-	mt7530_write(priv, MT7530_PMCR_P(port), PMCR_USERP_LINK);
-
 	/* Allow the user port gets connected to the cpu port and also
 	 * restore the port matrix if the port is the member of a certain
 	 * bridge.
@@ -742,7 +679,7 @@ mt7530_port_enable(struct dsa_switch *ds, int port,
 	priv->ports[port].enable = true;
 	mt7530_rmw(priv, MT7530_PCR_P(port), PCR_MATRIX_MASK,
 		   priv->ports[port].pm);
-	mt7530_port_set_status(priv, port, 1);
+	mt7530_port_set_status(priv, port, 0);
 
 	mutex_unlock(&priv->reg_mutex);
 
@@ -1232,10 +1169,10 @@ static int
 mt7530_setup(struct dsa_switch *ds)
 {
 	struct mt7530_priv *priv = ds->priv;
-	int ret, i;
-	u32 id, val;
-	struct device_node *dn;
 	struct mt7530_dummy_poll p;
+	struct device_node *dn;
+	u32 id, val;
+	int ret, i;
 
 	/* The parent node of master netdev which holds the common system
 	 * controller also is the container for two GMACs nodes representing
@@ -1305,6 +1242,8 @@ mt7530_setup(struct dsa_switch *ds)
 	val |= MHWTRAP_MANUAL;
 	mt7530_write(priv, MT7530_MHWTRAP, val);
 
+	priv->p6_interface = PHY_INTERFACE_MODE_NA;
+
 	/* Enable and reset MIB counters */
 	mt7530_mib_reset(ds);
 
@@ -1329,6 +1268,191 @@ mt7530_setup(struct dsa_switch *ds)
 	return 0;
 }
 
+static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
+				      unsigned int mode,
+				      const struct phylink_link_state *state)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u32 mcr_cur, mcr_new;
+
+	switch (port) {
+	case 0: /* Internal phy */
+	case 1:
+	case 2:
+	case 3:
+	case 4:
+		if (state->interface != PHY_INTERFACE_MODE_GMII)
+			return;
+		break;
+	/* case 5: Port 5 is not supported! */
+	case 6: /* 1st cpu port */
+		if (priv->p6_interface == state->interface)
+			break;
+
+		if (state->interface != PHY_INTERFACE_MODE_RGMII &&
+		    state->interface != PHY_INTERFACE_MODE_TRGMII)
+			return;
+
+		/* Setup TX circuit incluing relevant PAD and driving */
+		mt7530_pad_clk_setup(ds, state->interface);
+
+		if (priv->id == ID_MT7530) {
+			/* Setup RX circuit, relevant PAD and driving on the
+			 * host which must be placed after the setup on the
+			 * device side is all finished.
+			 */
+			mt7623_pad_clk_setup(ds);
+		}
+
+		priv->p6_interface = state->interface;
+		break;
+	default:
+		dev_err(ds->dev, "%s: unsupported port: %i\n", __func__, port);
+		return;
+	}
+
+	if (phylink_autoneg_inband(mode)) {
+		dev_err(ds->dev, "%s: in-band negotiation unsupported\n",
+			__func__);
+		return;
+	}
+
+	mcr_cur = mt7530_read(priv, MT7530_PMCR_P(port));
+	mcr_new = mcr_cur;
+	mcr_new &= ~(PMCR_FORCE_SPEED_1000 | PMCR_FORCE_SPEED_100 |
+		     PMCR_FORCE_FDX | PMCR_TX_FC_EN | PMCR_RX_FC_EN);
+	mcr_new |= PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | PMCR_BACKOFF_EN |
+		   PMCR_BACKPR_EN | PMCR_FORCE_MODE | PMCR_FORCE_LNK;
+
+	switch (state->speed) {
+	case SPEED_1000:
+		mcr_new |= PMCR_FORCE_SPEED_1000;
+		break;
+	case SPEED_100:
+		mcr_new |= PMCR_FORCE_SPEED_100;
+		break;
+	}
+	if (state->duplex == DUPLEX_FULL) {
+		mcr_new |= PMCR_FORCE_FDX;
+		if (state->pause & MLO_PAUSE_TX)
+			mcr_new |= PMCR_TX_FC_EN;
+		if (state->pause & MLO_PAUSE_RX)
+			mcr_new |= PMCR_RX_FC_EN;
+	}
+
+	if (mcr_new != mcr_cur)
+		mt7530_write(priv, MT7530_PMCR_P(port), mcr_new);
+}
+
+static void mt7530_phylink_mac_link_down(struct dsa_switch *ds, int port,
+					 unsigned int mode,
+					 phy_interface_t interface)
+{
+	struct mt7530_priv *priv = ds->priv;
+
+	mt7530_port_set_status(priv, port, 0);
+}
+
+static void mt7530_phylink_mac_link_up(struct dsa_switch *ds, int port,
+				       unsigned int mode,
+				       phy_interface_t interface,
+				       struct phy_device *phydev)
+{
+	struct mt7530_priv *priv = ds->priv;
+
+	mt7530_port_set_status(priv, port, 1);
+}
+
+static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
+				    unsigned long *supported,
+				    struct phylink_link_state *state)
+{
+	__ETHTOOL_DECLARE_LINK_MODE_MASK(mask) = { 0, };
+
+	switch (port) {
+	case 0: /* Internal phy */
+	case 1:
+	case 2:
+	case 3:
+	case 4:
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			goto unsupported;
+		break;
+	/* case 5: Port 5 not supported! */
+	case 6: /* 1st cpu port */
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    state->interface != PHY_INTERFACE_MODE_RGMII &&
+		    state->interface != PHY_INTERFACE_MODE_TRGMII)
+			goto unsupported;
+		break;
+	default:
+		dev_err(ds->dev, "%s: unsupported port: %i\n", __func__, port);
+unsupported:
+		linkmode_zero(supported);
+		return;
+	}
+
+	phylink_set_port_modes(mask);
+	phylink_set(mask, Autoneg);
+
+	if (state->interface != PHY_INTERFACE_MODE_TRGMII) {
+		phylink_set(mask, 10baseT_Half);
+		phylink_set(mask, 10baseT_Full);
+		phylink_set(mask, 100baseT_Half);
+		phylink_set(mask, 100baseT_Full);
+		phylink_set(mask, 1000baseT_Half);
+	}
+
+	phylink_set(mask, 1000baseT_Full);
+
+	phylink_set(mask, Pause);
+	phylink_set(mask, Asym_Pause);
+
+	linkmode_and(supported, supported, mask);
+	linkmode_and(state->advertising, state->advertising, mask);
+}
+
+static int
+mt7530_phylink_mac_link_state(struct dsa_switch *ds, int port,
+			      struct phylink_link_state *state)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u32 pmsr;
+
+	if (port < 0 || port >= MT7530_NUM_PORTS)
+		return -EINVAL;
+
+	pmsr = mt7530_read(priv, MT7530_PMSR_P(port));
+
+	state->link = (pmsr & PMSR_LINK);
+	state->an_complete = state->link;
+	state->duplex = !!(pmsr & PMSR_DPX);
+
+	switch (pmsr & PMSR_SPEED_MASK) {
+	case PMSR_SPEED_10:
+		state->speed = SPEED_10;
+		break;
+	case PMSR_SPEED_100:
+		state->speed = SPEED_100;
+		break;
+	case PMSR_SPEED_1000:
+		state->speed = SPEED_1000;
+		break;
+	default:
+		state->speed = SPEED_UNKNOWN;
+		break;
+	}
+
+	state->pause &= ~(MLO_PAUSE_RX | MLO_PAUSE_TX);
+	if (pmsr & PMSR_RX_FC)
+		state->pause |= MLO_PAUSE_RX;
+	if (pmsr & PMSR_TX_FC)
+		state->pause |= MLO_PAUSE_TX;
+
+	return 1;
+}
+
 static const struct dsa_switch_ops mt7530_switch_ops = {
 	.get_tag_protocol	= mtk_get_tag_protocol,
 	.setup			= mt7530_setup,
@@ -1337,7 +1461,6 @@ static const struct dsa_switch_ops mt7530_switch_ops = {
 	.phy_write		= mt7530_phy_write,
 	.get_ethtool_stats	= mt7530_get_ethtool_stats,
 	.get_sset_count		= mt7530_get_sset_count,
-	.adjust_link		= mt7530_adjust_link,
 	.port_enable		= mt7530_port_enable,
 	.port_disable		= mt7530_port_disable,
 	.port_stp_state_set	= mt7530_stp_state_set,
@@ -1350,6 +1473,11 @@ static const struct dsa_switch_ops mt7530_switch_ops = {
 	.port_vlan_prepare	= mt7530_port_vlan_prepare,
 	.port_vlan_add		= mt7530_port_vlan_add,
 	.port_vlan_del		= mt7530_port_vlan_del,
+	.phylink_validate	= mt7530_phylink_validate,
+	.phylink_mac_link_state = mt7530_phylink_mac_link_state,
+	.phylink_mac_config	= mt7530_phylink_mac_config,
+	.phylink_mac_link_down	= mt7530_phylink_mac_link_down,
+	.phylink_mac_link_up	= mt7530_phylink_mac_link_up,
 };
 
 static const struct of_device_id mt7530_of_match[] = {
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index bfac90f48102..107dd04acede 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -198,26 +198,20 @@ enum mt7530_vlan_port_attr {
 #define  PMCR_FORCE_SPEED_100		BIT(2)
 #define  PMCR_FORCE_FDX			BIT(1)
 #define  PMCR_FORCE_LNK			BIT(0)
-#define  PMCR_COMMON_LINK		(PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | \
-					 PMCR_BACKOFF_EN | PMCR_BACKPR_EN | \
-					 PMCR_TX_EN | PMCR_RX_EN | \
-					 PMCR_TX_FC_EN | PMCR_RX_FC_EN)
-#define  PMCR_CPUP_LINK			(PMCR_COMMON_LINK | PMCR_FORCE_MODE | \
-					 PMCR_FORCE_SPEED_1000 | \
-					 PMCR_FORCE_FDX | \
-					 PMCR_FORCE_LNK)
-#define  PMCR_USERP_LINK		PMCR_COMMON_LINK
-#define  PMCR_FIXED_LINK		(PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | \
-					 PMCR_FORCE_MODE | PMCR_TX_EN | \
-					 PMCR_RX_EN | PMCR_BACKPR_EN | \
-					 PMCR_BACKOFF_EN | \
-					 PMCR_FORCE_SPEED_1000 | \
-					 PMCR_FORCE_FDX | \
-					 PMCR_FORCE_LNK)
-#define PMCR_FIXED_LINK_FC		(PMCR_FIXED_LINK | \
-					 PMCR_TX_FC_EN | PMCR_RX_FC_EN)
+#define  PMCR_SPEED_MASK		(PMCR_FORCE_SPEED_100 | \
+					 PMCR_FORCE_SPEED_1000)
 
 #define MT7530_PMSR_P(x)		(0x3008 + (x) * 0x100)
+#define  PMSR_EEE1G			BIT(7)
+#define  PMSR_EEE100M			BIT(6)
+#define  PMSR_RX_FC			BIT(5)
+#define  PMSR_TX_FC			BIT(4)
+#define  PMSR_SPEED_1000		BIT(3)
+#define  PMSR_SPEED_100			BIT(2)
+#define  PMSR_SPEED_10			0x00
+#define  PMSR_SPEED_MASK		(PMSR_SPEED_100 | PMSR_SPEED_1000)
+#define  PMSR_DPX			BIT(1)
+#define  PMSR_LINK			BIT(0)
 
 /* Register for MIB */
 #define MT7530_PORT_MIB_COUNTER(x)	(0x4000 + (x) * 0x100)
@@ -423,6 +417,7 @@ struct mt7530_port {
  * @ports:		Holding the state among ports
  * @reg_mutex:		The lock for protecting among process accessing
  *			registers
+ * @p6_interface	Holding the current port 6 interface
  */
 struct mt7530_priv {
 	struct device		*dev;
@@ -435,6 +430,7 @@ struct mt7530_priv {
 	struct gpio_desc	*reset;
 	unsigned int		id;
 	bool			mcm;
+	phy_interface_t		p6_interface;
 
 	struct mt7530_port	ports[MT7530_NUM_PORTS];
 	/* protect among processes for registers access*/
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 3/3] net: dsa: mt7530: Add support for port 5
From: René van Dorst @ 2019-09-02 13:02 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, Russell King,
	John Crispin, linux-mips, Frank Wunderlich, René van Dorst,
	Russell King
In-Reply-To: <20190902130226.26845-1-opensource@vdorst.com>

Adding support for port 5.

Port 5 can muxed/interface to:
- internal 5th GMAC of the switch; can be used as 2nd CPU port or as
  extra port with an external phy for a 6th ethernet port.
- internal PHY of port 0 or 4; Used in most applications so that port 0
  or 4 is the WAN port and interfaces with the 2nd GMAC of the SOC.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Tested-by: Frank Wunderlich <frank-w@public-files.de>
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
---
v2->v3:
* Change in mt7530_setup_port5() the port 5 setup message in to a debug
  message. Suggested by David Miller
* Add tags acked-by and tested-by
v1->v2:
* Also report 1000base-x support for port 5 suggested by Russell King
* Reorder variable declaraiant in reverse christmas tree suggested by
  Daved Miller
* Refactor phy-handle lookup for 2nd GMAC.
* Use of_mdio_parse_addr() instead of do it manualy suggested by
  Florian Fainelli
* Refactor port 5 setup in mt7530_phylink_mac_config()
rfc->v1:
* Removed unnecessary info print suggested by Andrew Lunn
* Added support for MII mode for port 5

 drivers/net/dsa/mt7530.c | 145 +++++++++++++++++++++++++++++++++++++--
 drivers/net/dsa/mt7530.h |  29 ++++++++
 2 files changed, 168 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index ecc13b57e619..1d8d36de4d20 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -633,6 +633,77 @@ mt7530_get_sset_count(struct dsa_switch *ds, int port, int sset)
 	return ARRAY_SIZE(mt7530_mib);
 }
 
+static void mt7530_setup_port5(struct dsa_switch *ds, phy_interface_t interface)
+{
+	struct mt7530_priv *priv = ds->priv;
+	u8 tx_delay = 0;
+	int val;
+
+	mutex_lock(&priv->reg_mutex);
+
+	val = mt7530_read(priv, MT7530_MHWTRAP);
+
+	val |= MHWTRAP_MANUAL | MHWTRAP_P5_MAC_SEL | MHWTRAP_P5_DIS;
+	val &= ~MHWTRAP_P5_RGMII_MODE & ~MHWTRAP_PHY0_SEL;
+
+	switch (priv->p5_intf_sel) {
+	case P5_INTF_SEL_PHY_P0:
+		/* MT7530_P5_MODE_GPHY_P0: 2nd GMAC -> P5 -> P0 */
+		val |= MHWTRAP_PHY0_SEL;
+		/* fall through */
+	case P5_INTF_SEL_PHY_P4:
+		/* MT7530_P5_MODE_GPHY_P4: 2nd GMAC -> P5 -> P4 */
+		val &= ~MHWTRAP_P5_MAC_SEL & ~MHWTRAP_P5_DIS;
+
+		/* Setup the MAC by default for the cpu port */
+		mt7530_write(priv, MT7530_PMCR_P(5), 0x56300);
+		break;
+	case P5_INTF_SEL_GMAC5:
+		/* MT7530_P5_MODE_GMAC: P5 -> External phy or 2nd GMAC */
+		val &= ~MHWTRAP_P5_DIS;
+		break;
+	case P5_DISABLED:
+		interface = PHY_INTERFACE_MODE_NA;
+		break;
+	default:
+		dev_err(ds->dev, "Unsupported p5_intf_sel %d\n",
+			priv->p5_intf_sel);
+		goto unlock_exit;
+	}
+
+	/* Setup RGMII settings */
+	if (phy_interface_mode_is_rgmii(interface)) {
+		val |= MHWTRAP_P5_RGMII_MODE;
+
+		/* P5 RGMII RX Clock Control: delay setting for 1000M */
+		mt7530_write(priv, MT7530_P5RGMIIRXCR, CSR_RGMII_EDGE_ALIGN);
+
+		/* Don't set delay in DSA mode */
+		if (!dsa_is_dsa_port(priv->ds, 5) &&
+		    (interface == PHY_INTERFACE_MODE_RGMII_TXID ||
+		     interface == PHY_INTERFACE_MODE_RGMII_ID))
+			tx_delay = 4; /* n * 0.5 ns */
+
+		/* P5 RGMII TX Clock Control: delay x */
+		mt7530_write(priv, MT7530_P5RGMIITXCR,
+			     CSR_RGMII_TXC_CFG(0x10 + tx_delay));
+
+		/* reduce P5 RGMII Tx driving, 8mA */
+		mt7530_write(priv, MT7530_IO_DRV_CR,
+			     P5_IO_CLK_DRV(1) | P5_IO_DATA_DRV(1));
+	}
+
+	mt7530_write(priv, MT7530_MHWTRAP, val);
+
+	dev_dbg(ds->dev, "Setup P5, HWTRAP=0x%x, intf_sel=%s, phy-mode=%s\n",
+		val, p5_intf_modes(priv->p5_intf_sel), phy_modes(interface));
+
+	priv->p5_interface = interface;
+
+unlock_exit:
+	mutex_unlock(&priv->reg_mutex);
+}
+
 static int
 mt7530_cpu_port_enable(struct mt7530_priv *priv,
 		       int port)
@@ -1169,7 +1240,10 @@ static int
 mt7530_setup(struct dsa_switch *ds)
 {
 	struct mt7530_priv *priv = ds->priv;
+	struct device_node *phy_node;
+	struct device_node *mac_np;
 	struct mt7530_dummy_poll p;
+	phy_interface_t interface;
 	struct device_node *dn;
 	u32 id, val;
 	int ret, i;
@@ -1260,6 +1334,40 @@ mt7530_setup(struct dsa_switch *ds)
 			mt7530_port_disable(ds, i);
 	}
 
+	/* Setup port 5 */
+	priv->p5_intf_sel = P5_DISABLED;
+	interface = PHY_INTERFACE_MODE_NA;
+
+	if (!dsa_is_unused_port(ds, 5)) {
+		priv->p5_intf_sel = P5_INTF_SEL_GMAC5;
+		interface = of_get_phy_mode(ds->ports[5].dn);
+	} else {
+		/* Scan the ethernet nodes. look for GMAC1, lookup used phy */
+		for_each_child_of_node(dn, mac_np) {
+			if (!of_device_is_compatible(mac_np,
+						     "mediatek,eth-mac"))
+				continue;
+
+			ret = of_property_read_u32(mac_np, "reg", &id);
+			if (ret < 0 || id != 1)
+				continue;
+
+			phy_node = of_parse_phandle(mac_np, "phy-handle", 0);
+			if (phy_node->parent == priv->dev->of_node->parent) {
+				interface = of_get_phy_mode(mac_np);
+				id = of_mdio_parse_addr(ds->dev, phy_node);
+				if (id == 0)
+					priv->p5_intf_sel = P5_INTF_SEL_PHY_P0;
+				if (id == 4)
+					priv->p5_intf_sel = P5_INTF_SEL_PHY_P4;
+			}
+			of_node_put(phy_node);
+			break;
+		}
+	}
+
+	mt7530_setup_port5(ds, interface);
+
 	/* Flush the FDB table */
 	ret = mt7530_fdb_cmd(priv, MT7530_FDB_FLUSH, NULL);
 	if (ret < 0)
@@ -1284,7 +1392,16 @@ static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
 		if (state->interface != PHY_INTERFACE_MODE_GMII)
 			return;
 		break;
-	/* case 5: Port 5 is not supported! */
+	case 5: /* 2nd cpu port with phy of port 0 or 4 / external phy */
+		if (priv->p5_interface == state->interface)
+			break;
+		if (!phy_interface_mode_is_rgmii(state->interface) &&
+		    state->interface != PHY_INTERFACE_MODE_MII &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			return;
+
+		mt7530_setup_port5(ds, state->interface);
+		break;
 	case 6: /* 1st cpu port */
 		if (priv->p6_interface == state->interface)
 			break;
@@ -1324,6 +1441,10 @@ static void mt7530_phylink_mac_config(struct dsa_switch *ds, int port,
 	mcr_new |= PMCR_IFG_XMIT(1) | PMCR_MAC_MODE | PMCR_BACKOFF_EN |
 		   PMCR_BACKPR_EN | PMCR_FORCE_MODE | PMCR_FORCE_LNK;
 
+	/* Are we connected to external phy */
+	if (port == 5 && dsa_is_user_port(ds, 5))
+		mcr_new |= PMCR_EXT_PHY;
+
 	switch (state->speed) {
 	case SPEED_1000:
 		mcr_new |= PMCR_FORCE_SPEED_1000;
@@ -1379,7 +1500,13 @@ static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
 		    state->interface != PHY_INTERFACE_MODE_GMII)
 			goto unsupported;
 		break;
-	/* case 5: Port 5 not supported! */
+	case 5: /* 2nd cpu port with phy of port 0 or 4 / external phy */
+		if (state->interface != PHY_INTERFACE_MODE_NA &&
+		    !phy_interface_mode_is_rgmii(state->interface) &&
+		    state->interface != PHY_INTERFACE_MODE_MII &&
+		    state->interface != PHY_INTERFACE_MODE_GMII)
+			goto unsupported;
+		break;
 	case 6: /* 1st cpu port */
 		if (state->interface != PHY_INTERFACE_MODE_NA &&
 		    state->interface != PHY_INTERFACE_MODE_RGMII &&
@@ -1396,15 +1523,21 @@ static void mt7530_phylink_validate(struct dsa_switch *ds, int port,
 	phylink_set_port_modes(mask);
 	phylink_set(mask, Autoneg);
 
-	if (state->interface != PHY_INTERFACE_MODE_TRGMII) {
+	if (state->interface == PHY_INTERFACE_MODE_TRGMII) {
+		phylink_set(mask, 1000baseT_Full);
+	} else {
 		phylink_set(mask, 10baseT_Half);
 		phylink_set(mask, 10baseT_Full);
 		phylink_set(mask, 100baseT_Half);
 		phylink_set(mask, 100baseT_Full);
-		phylink_set(mask, 1000baseT_Half);
-	}
 
-	phylink_set(mask, 1000baseT_Full);
+		if (state->interface != PHY_INTERFACE_MODE_MII) {
+			phylink_set(mask, 1000baseT_Half);
+			phylink_set(mask, 1000baseT_Full);
+			if (port == 5)
+				phylink_set(mask, 1000baseX_Full);
+		}
+	}
 
 	phylink_set(mask, Pause);
 	phylink_set(mask, Asym_Pause);
diff --git a/drivers/net/dsa/mt7530.h b/drivers/net/dsa/mt7530.h
index 107dd04acede..ccb9da8cad0d 100644
--- a/drivers/net/dsa/mt7530.h
+++ b/drivers/net/dsa/mt7530.h
@@ -186,6 +186,7 @@ enum mt7530_vlan_port_attr {
 /* Register for port MAC control register */
 #define MT7530_PMCR_P(x)		(0x3000 + ((x) * 0x100))
 #define  PMCR_IFG_XMIT(x)		(((x) & 0x3) << 18)
+#define  PMCR_EXT_PHY			BIT(17)
 #define  PMCR_MAC_MODE			BIT(16)
 #define  PMCR_FORCE_MODE		BIT(15)
 #define  PMCR_TX_EN			BIT(14)
@@ -245,6 +246,7 @@ enum mt7530_vlan_port_attr {
 
 /* Register for hw trap modification */
 #define MT7530_MHWTRAP			0x7804
+#define  MHWTRAP_PHY0_SEL		BIT(20)
 #define  MHWTRAP_MANUAL			BIT(16)
 #define  MHWTRAP_P5_MAC_SEL		BIT(13)
 #define  MHWTRAP_P6_DIS			BIT(8)
@@ -402,6 +404,30 @@ struct mt7530_port {
 	u16 pvid;
 };
 
+/* Port 5 interface select definitions */
+enum p5_interface_select {
+	P5_DISABLED = 0,
+	P5_INTF_SEL_PHY_P0,
+	P5_INTF_SEL_PHY_P4,
+	P5_INTF_SEL_GMAC5,
+};
+
+static const char *p5_intf_modes(unsigned int p5_interface)
+{
+	switch (p5_interface) {
+	case P5_DISABLED:
+		return "DISABLED";
+	case P5_INTF_SEL_PHY_P0:
+		return "PHY P0";
+	case P5_INTF_SEL_PHY_P4:
+		return "PHY P4";
+	case P5_INTF_SEL_GMAC5:
+		return "GMAC5";
+	default:
+		return "unknown";
+	}
+}
+
 /* struct mt7530_priv -	This is the main data structure for holding the state
  *			of the driver
  * @dev:		The device pointer
@@ -418,6 +444,7 @@ struct mt7530_port {
  * @reg_mutex:		The lock for protecting among process accessing
  *			registers
  * @p6_interface	Holding the current port 6 interface
+ * @p5_intf_sel:	Holding the current port 5 interface select
  */
 struct mt7530_priv {
 	struct device		*dev;
@@ -431,6 +458,8 @@ struct mt7530_priv {
 	unsigned int		id;
 	bool			mcm;
 	phy_interface_t		p6_interface;
+	phy_interface_t		p5_interface;
+	unsigned int		p5_intf_sel;
 
 	struct mt7530_port	ports[MT7530_NUM_PORTS];
 	/* protect among processes for registers access*/
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 2/3] dt-bindings: net: dsa: mt7530: Add support for port 5
From: René van Dorst @ 2019-09-02 13:02 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, Russell King,
	John Crispin, linux-mips, Frank Wunderlich, René van Dorst,
	devicetree, Rob Herring
In-Reply-To: <20190902130226.26845-1-opensource@vdorst.com>

MT7530 port 5 has many modes/configurations.
Update the documentation how to use port 5.

Signed-off-by: René van Dorst <opensource@vdorst.com>
Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh@kernel.org>
---
v2->v3:
* Remove 'status = "okay";' lines, suggested by Rob Herring
v1->v2:
* Adding extra note about RGMII2 and gpio use.
rfc->v1:
* No change

 .../devicetree/bindings/net/dsa/mt7530.txt    | 214 ++++++++++++++++++
 1 file changed, 214 insertions(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/mt7530.txt b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
index 47aa205ee0bd..c5ed5d25f642 100644
--- a/Documentation/devicetree/bindings/net/dsa/mt7530.txt
+++ b/Documentation/devicetree/bindings/net/dsa/mt7530.txt
@@ -35,6 +35,42 @@ Required properties for the child nodes within ports container:
 - phy-mode: String, must be either "trgmii" or "rgmii" for port labeled
 	 "cpu".
 
+Port 5 of the switch is muxed between:
+1. GMAC5: GMAC5 can interface with another external MAC or PHY.
+2. PHY of port 0 or port 4: PHY interfaces with an external MAC like 2nd GMAC
+   of the SOC. Used in many setups where port 0/4 becomes the WAN port.
+   Note: On a MT7621 SOC with integrated switch: 2nd GMAC can only connected to
+	 GMAC5 when the gpios for RGMII2 (GPIO 22-33) are not used and not
+	 connected to external component!
+
+Port 5 modes/configurations:
+1. Port 5 is disabled and isolated: An external phy can interface to the 2nd
+   GMAC of the SOC.
+   In the case of a build-in MT7530 switch, port 5 shares the RGMII bus with 2nd
+   GMAC and an optional external phy. Mind the GPIO/pinctl settings of the SOC!
+2. Port 5 is muxed to PHY of port 0/4: Port 0/4 interfaces with 2nd GMAC.
+   It is a simple MAC to PHY interface, port 5 needs to be setup for xMII mode
+   and RGMII delay.
+3. Port 5 is muxed to GMAC5 and can interface to an external phy.
+   Port 5 becomes an extra switch port.
+   Only works on platform where external phy TX<->RX lines are swapped.
+   Like in the Ubiquiti ER-X-SFP.
+4. Port 5 is muxed to GMAC5 and interfaces with the 2nd GAMC as 2nd CPU port.
+   Currently a 2nd CPU port is not supported by DSA code.
+
+Depending on how the external PHY is wired:
+1. normal: The PHY can only connect to 2nd GMAC but not to the switch
+2. swapped: RGMII TX, RX are swapped; external phy interface with the switch as
+   a ethernet port. But can't interface to the 2nd GMAC.
+
+Based on the DT the port 5 mode is configured.
+
+Driver tries to lookup the phy-handle of the 2nd GMAC of the master device.
+When phy-handle matches PHY of port 0 or 4 then port 5 set-up as mode 2.
+phy-mode must be set, see also example 2 below!
+ * mt7621: phy-mode = "rgmii-txid";
+ * mt7623: phy-mode = "rgmii";
+
 See Documentation/devicetree/bindings/net/dsa/dsa.txt for a list of additional
 required, optional properties and how the integrated switch subnodes must
 be specified.
@@ -94,3 +130,181 @@ Example:
 			};
 		};
 	};
+
+Example 2: MT7621: Port 4 is WAN port: 2nd GMAC -> Port 5 -> PHY port 4.
+
+&eth {
+	gmac0: mac@0 {
+		compatible = "mediatek,eth-mac";
+		reg = <0>;
+		phy-mode = "rgmii";
+
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+			pause;
+		};
+	};
+
+	gmac1: mac@1 {
+		compatible = "mediatek,eth-mac";
+		reg = <1>;
+		phy-mode = "rgmii-txid";
+		phy-handle = <&phy4>;
+	};
+
+	mdio: mdio-bus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		/* Internal phy */
+		phy4: ethernet-phy@4 {
+			reg = <4>;
+		};
+
+		mt7530: switch@1f {
+			compatible = "mediatek,mt7621";
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0x1f>;
+			pinctrl-names = "default";
+			mediatek,mcm;
+
+			resets = <&rstctrl 2>;
+			reset-names = "mcm";
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				port@0 {
+					reg = <0>;
+					label = "lan0";
+				};
+
+				port@1 {
+					reg = <1>;
+					label = "lan1";
+				};
+
+				port@2 {
+					reg = <2>;
+					label = "lan2";
+				};
+
+				port@3 {
+					reg = <3>;
+					label = "lan3";
+				};
+
+/* Commented out. Port 4 is handled by 2nd GMAC.
+				port@4 {
+					reg = <4>;
+					label = "lan4";
+				};
+*/
+
+				cpu_port0: port@6 {
+					reg = <6>;
+					label = "cpu";
+					ethernet = <&gmac0>;
+					phy-mode = "rgmii";
+
+					fixed-link {
+						speed = <1000>;
+						full-duplex;
+						pause;
+					};
+				};
+			};
+		};
+	};
+};
+
+Example 3: MT7621: Port 5 is connected to external PHY: Port 5 -> external PHY.
+
+&eth {
+	gmac0: mac@0 {
+		compatible = "mediatek,eth-mac";
+		reg = <0>;
+		phy-mode = "rgmii";
+
+		fixed-link {
+			speed = <1000>;
+			full-duplex;
+			pause;
+		};
+	};
+
+	mdio: mdio-bus {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		/* External phy */
+		ephy5: ethernet-phy@7 {
+			reg = <7>;
+		};
+
+		mt7530: switch@1f {
+			compatible = "mediatek,mt7621";
+			#address-cells = <1>;
+			#size-cells = <0>;
+			reg = <0x1f>;
+			pinctrl-names = "default";
+			mediatek,mcm;
+
+			resets = <&rstctrl 2>;
+			reset-names = "mcm";
+
+			ports {
+				#address-cells = <1>;
+				#size-cells = <0>;
+
+				port@0 {
+					reg = <0>;
+					label = "lan0";
+				};
+
+				port@1 {
+					reg = <1>;
+					label = "lan1";
+				};
+
+				port@2 {
+					reg = <2>;
+					label = "lan2";
+				};
+
+				port@3 {
+					reg = <3>;
+					label = "lan3";
+				};
+
+				port@4 {
+					reg = <4>;
+					label = "lan4";
+				};
+
+				port@5 {
+					reg = <5>;
+					label = "lan5";
+					phy-mode = "rgmii";
+					phy-handle = <&ephy5>;
+				};
+
+				cpu_port0: port@6 {
+					reg = <6>;
+					label = "cpu";
+					ethernet = <&gmac0>;
+					phy-mode = "rgmii";
+
+					fixed-link {
+						speed = <1000>;
+						full-duplex;
+						pause;
+					};
+				};
+			};
+		};
+	};
+};
-- 
2.20.1


^ permalink raw reply related

* [PATCH net-next v3 0/3] net: dsa: mt7530: Convert to PHYLINK and add support for port 5
From: René van Dorst @ 2019-09-02 13:02 UTC (permalink / raw)
  To: Sean Wang, Andrew Lunn, Vivien Didelot, Florian Fainelli,
	David S . Miller, Matthias Brugger
  Cc: netdev, linux-arm-kernel, linux-mediatek, Russell King,
	John Crispin, linux-mips, Frank Wunderlich, René van Dorst

1. net: dsa: mt7530: Convert to PHYLINK API
   This patch converts mt7530 to PHYLINK API.
2. dt-bindings: net: dsa: mt7530: Add support for port 5
3. net: dsa: mt7530: Add support for port 5
   These 2 patches adding support for port 5 of the switch.

v2->v3:
 * Removed 'status = "okay"' lines in patch #2
 * Change a port 5 setup message in a debug message in patch #3
 * Added ack-by and tested-by tags
v1->v2:
 * Mostly phylink improvements after review.
rfc -> v1:
 * Mostly phylink improvements after review.
 * Drop phy isolation patches. Adds no value for now.

René van Dorst (3):
  net: dsa: mt7530: Convert to PHYLINK API
  dt-bindings: net: dsa: mt7530: Add support for port 5
  net: dsa: mt7530: Add support for port 5

 .../devicetree/bindings/net/dsa/mt7530.txt    | 214 ++++++++++
 drivers/net/dsa/mt7530.c                      | 371 +++++++++++++++---
 drivers/net/dsa/mt7530.h                      |  61 ++-
 3 files changed, 573 insertions(+), 73 deletions(-)

-- 
2.20.1

^ permalink raw reply

* Re: [PATCH v4] rtw88: pci: Move a mass of jobs in hw IRQ to soft IRQ
From: Kalle Valo @ 2019-09-02 12:18 UTC (permalink / raw)
  To: Tony Chuang
  Cc: Jian-Hong Pan, David S . Miller, linux-wireless@vger.kernel.org,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux@endlessm.com
In-Reply-To: <F7CD281DE3E379468C6D07993EA72F84D18AE2DA@RTITMBSVM04.realtek.com.tw>

Tony Chuang <yhchuang@realtek.com> writes:

>> From: Jian-Hong Pan 
>> Subject: [PATCH v4] rtw88: pci: Move a mass of jobs in hw IRQ to soft IRQ
>> 
>> There is a mass of jobs between spin lock and unlock in the hardware
>> IRQ which will occupy much time originally. To make system work more
>> efficiently, this patch moves the jobs to the soft IRQ (bottom half) to
>> reduce the time in hardware IRQ.
>> 
>> Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com>
>
> Now it works fine with MSI interrupt enabled.
>
> But this patch is conflicting with MSI interrupt patch.
> Is there a better way we can make Kalle apply them more smoothly?
> I can rebase them and submit both if you're OK.

Yeah, submitting all the MSI patches in the same patchset is the easiest
approach. That way they apply cleanly to wireless-drivers-next.

-- 
https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* [PATCH net-next] r8152: modify rtl8152_set_speed function
From: Hayes Wang @ 2019-09-02 11:52 UTC (permalink / raw)
  To: netdev; +Cc: nic_swsd, linux-kernel, Hayes Wang

First, for AUTONEG_DISABLE, we only need to modify MII_BMCR.

Second, add advertising parameter for rtl8152_set_speed(). Add
RTL_ADVERTISED_xxx for advertising parameter of rtl8152_set_speed().
Then, the advertising settings from ethtool could be saved.

Signed-off-by: Hayes Wang <hayeswang@realtek.com>
---
 drivers/net/usb/r8152.c | 196 +++++++++++++++++++++++++++-------------
 1 file changed, 132 insertions(+), 64 deletions(-)

diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index c6fa0c17c13d..5d49d8dd93e1 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -757,6 +757,7 @@ struct r8152 {
 	u32 msg_enable;
 	u32 tx_qlen;
 	u32 coalesce;
+	u32 advertising;
 	u32 rx_buf_sz;
 	u32 rx_copybreak;
 	u32 rx_pending;
@@ -790,6 +791,13 @@ enum tx_csum_stat {
 	TX_CSUM_NONE
 };
 
+#define RTL_ADVERTISED_10_HALF			BIT(0)
+#define RTL_ADVERTISED_10_FULL			BIT(1)
+#define RTL_ADVERTISED_100_HALF			BIT(2)
+#define RTL_ADVERTISED_100_FULL			BIT(3)
+#define RTL_ADVERTISED_1000_HALF		BIT(4)
+#define RTL_ADVERTISED_1000_FULL		BIT(5)
+
 /* Maximum number of multicast addresses to filter (vs. Rx-all-multicast).
  * The RTL chips use a 64 element hash table based on the Ethernet CRC.
  */
@@ -3801,90 +3809,117 @@ static void rtl8153b_disable(struct r8152 *tp)
 	r8153b_aldps_en(tp, true);
 }
 
-static int rtl8152_set_speed(struct r8152 *tp, u8 autoneg, u16 speed, u8 duplex)
+static int rtl8152_set_speed(struct r8152 *tp, u8 autoneg, u32 speed, u8 duplex,
+			     u32 advertising)
 {
-	u16 bmcr, anar, gbcr;
 	enum spd_duplex speed_duplex;
+	u16 bmcr;
 	int ret = 0;
 
-	anar = r8152_mdio_read(tp, MII_ADVERTISE);
-	anar &= ~(ADVERTISE_10HALF | ADVERTISE_10FULL |
-		  ADVERTISE_100HALF | ADVERTISE_100FULL);
-	if (tp->mii.supports_gmii) {
-		gbcr = r8152_mdio_read(tp, MII_CTRL1000);
-		gbcr &= ~(ADVERTISE_1000FULL | ADVERTISE_1000HALF);
-	} else {
-		gbcr = 0;
-	}
-
 	if (autoneg == AUTONEG_DISABLE) {
-		if (speed == SPEED_10) {
-			bmcr = 0;
-			anar |= ADVERTISE_10HALF | ADVERTISE_10FULL;
-			speed_duplex = FORCE_10M_HALF;
-		} else if (speed == SPEED_100) {
-			bmcr = BMCR_SPEED100;
-			anar |= ADVERTISE_100HALF | ADVERTISE_100FULL;
-			speed_duplex = FORCE_100M_HALF;
-		} else if (speed == SPEED_1000 && tp->mii.supports_gmii) {
-			bmcr = BMCR_SPEED1000;
-			gbcr |= ADVERTISE_1000FULL | ADVERTISE_1000HALF;
-			speed_duplex = NWAY_1000M_FULL;
-		} else {
-			ret = -EINVAL;
-			goto out;
-		}
+		if (duplex != DUPLEX_HALF && duplex != DUPLEX_FULL)
+			return -EINVAL;
 
-		if (duplex == DUPLEX_FULL) {
-			bmcr |= BMCR_FULLDPLX;
-			if (speed != SPEED_1000)
-				speed_duplex++;
-		}
-	} else {
-		if (speed == SPEED_10) {
+		switch (speed) {
+		case SPEED_10:
+			bmcr = BMCR_SPEED10;
 			if (duplex == DUPLEX_FULL) {
-				anar |= ADVERTISE_10HALF | ADVERTISE_10FULL;
-				speed_duplex = NWAY_10M_FULL;
+				bmcr |= BMCR_FULLDPLX;
+				speed_duplex = FORCE_10M_FULL;
 			} else {
-				anar |= ADVERTISE_10HALF;
-				speed_duplex = NWAY_10M_HALF;
+				speed_duplex = FORCE_10M_HALF;
 			}
-		} else if (speed == SPEED_100) {
+			break;
+		case SPEED_100:
+			bmcr = BMCR_SPEED100;
 			if (duplex == DUPLEX_FULL) {
-				anar |= ADVERTISE_10HALF | ADVERTISE_10FULL;
-				anar |= ADVERTISE_100HALF | ADVERTISE_100FULL;
-				speed_duplex = NWAY_100M_FULL;
+				bmcr |= BMCR_FULLDPLX;
+				speed_duplex = FORCE_100M_FULL;
 			} else {
-				anar |= ADVERTISE_10HALF;
-				anar |= ADVERTISE_100HALF;
-				speed_duplex = NWAY_100M_HALF;
+				speed_duplex = FORCE_100M_HALF;
 			}
-		} else if (speed == SPEED_1000 && tp->mii.supports_gmii) {
-			if (duplex == DUPLEX_FULL) {
-				anar |= ADVERTISE_10HALF | ADVERTISE_10FULL;
-				anar |= ADVERTISE_100HALF | ADVERTISE_100FULL;
-				gbcr |= ADVERTISE_1000FULL | ADVERTISE_1000HALF;
-			} else {
-				anar |= ADVERTISE_10HALF;
-				anar |= ADVERTISE_100HALF;
-				gbcr |= ADVERTISE_1000HALF;
+			break;
+		case SPEED_1000:
+			if (tp->mii.supports_gmii) {
+				bmcr = BMCR_SPEED1000 | BMCR_FULLDPLX;
+				speed_duplex = NWAY_1000M_FULL;
+				break;
 			}
-			speed_duplex = NWAY_1000M_FULL;
-		} else {
+			/* fall through */
+		default:
 			ret = -EINVAL;
 			goto out;
 		}
 
+		if (duplex == DUPLEX_FULL)
+			tp->mii.full_duplex = 1;
+		else
+			tp->mii.full_duplex = 0;
+
+		tp->mii.force_media = 1;
+	} else {
+		u16 anar, tmp1;
+		u32 support;
+
+		support = RTL_ADVERTISED_10_HALF | RTL_ADVERTISED_10_FULL |
+			  RTL_ADVERTISED_100_HALF | RTL_ADVERTISED_100_FULL;
+
+		if (tp->mii.supports_gmii)
+			support |= RTL_ADVERTISED_1000_FULL;
+
+		if (!(advertising & support))
+			return -EINVAL;
+
+		anar = r8152_mdio_read(tp, MII_ADVERTISE);
+		tmp1 = anar & ~(ADVERTISE_10HALF | ADVERTISE_10FULL |
+				ADVERTISE_100HALF | ADVERTISE_100FULL);
+		if (advertising & RTL_ADVERTISED_10_HALF) {
+			tmp1 |= ADVERTISE_10HALF;
+			speed_duplex = NWAY_10M_HALF;
+		}
+		if (advertising & RTL_ADVERTISED_10_FULL) {
+			tmp1 |= ADVERTISE_10FULL;
+			speed_duplex = NWAY_10M_FULL;
+		}
+
+		if (advertising & RTL_ADVERTISED_100_HALF) {
+			tmp1 |= ADVERTISE_100HALF;
+			speed_duplex = NWAY_100M_HALF;
+		}
+		if (advertising & RTL_ADVERTISED_100_FULL) {
+			tmp1 |= ADVERTISE_100FULL;
+			speed_duplex = NWAY_100M_FULL;
+		}
+
+		if (anar != tmp1) {
+			r8152_mdio_write(tp, MII_ADVERTISE, tmp1);
+			tp->mii.advertising = tmp1;
+		}
+
+		if (tp->mii.supports_gmii) {
+			u16 gbcr;
+
+			gbcr = r8152_mdio_read(tp, MII_CTRL1000);
+			tmp1 = gbcr & ~(ADVERTISE_1000FULL |
+					ADVERTISE_1000HALF);
+
+			if (advertising & RTL_ADVERTISED_1000_FULL) {
+				tmp1 |= ADVERTISE_1000FULL;
+				speed_duplex = NWAY_1000M_FULL;
+			}
+
+			if (gbcr != tmp1)
+				r8152_mdio_write(tp, MII_CTRL1000, tmp1);
+		}
+
 		bmcr = BMCR_ANENABLE | BMCR_ANRESTART;
+
+		tp->mii.force_media = 0;
 	}
 
 	if (test_and_clear_bit(PHY_RESET, &tp->flags))
 		bmcr |= BMCR_RESET;
 
-	if (tp->mii.supports_gmii)
-		r8152_mdio_write(tp, MII_CTRL1000, gbcr);
-
-	r8152_mdio_write(tp, MII_ADVERTISE, anar);
 	r8152_mdio_write(tp, MII_BMCR, bmcr);
 
 	switch (tp->version) {
@@ -4122,7 +4157,8 @@ static void rtl_hw_phy_work_func_t(struct work_struct *work)
 
 	tp->rtl_ops.hw_phy_cfg(tp);
 
-	rtl8152_set_speed(tp, tp->autoneg, tp->speed, tp->duplex);
+	rtl8152_set_speed(tp, tp->autoneg, tp->speed, tp->duplex,
+			  tp->advertising);
 
 	mutex_unlock(&tp->control);
 
@@ -4841,20 +4877,46 @@ static int rtl8152_set_link_ksettings(struct net_device *dev,
 				      const struct ethtool_link_ksettings *cmd)
 {
 	struct r8152 *tp = netdev_priv(dev);
+	u32 advertising = 0;
 	int ret;
 
 	ret = usb_autopm_get_interface(tp->intf);
 	if (ret < 0)
 		goto out;
 
+	if (test_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_10_HALF;
+
+	if (test_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_10_FULL;
+
+	if (test_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_100_HALF;
+
+	if (test_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_100_FULL;
+
+	if (test_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_1000_HALF;
+
+	if (test_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT,
+		     cmd->link_modes.advertising))
+		advertising |= RTL_ADVERTISED_1000_FULL;
+
 	mutex_lock(&tp->control);
 
 	ret = rtl8152_set_speed(tp, cmd->base.autoneg, cmd->base.speed,
-				cmd->base.duplex);
+				cmd->base.duplex, advertising);
 	if (!ret) {
 		tp->autoneg = cmd->base.autoneg;
 		tp->speed = cmd->base.speed;
 		tp->duplex = cmd->base.duplex;
+		tp->advertising = advertising;
 	}
 
 	mutex_unlock(&tp->control);
@@ -5569,7 +5631,13 @@ static int rtl8152_probe(struct usb_interface *intf,
 	tp->mii.phy_id = R8152_PHY_ID;
 
 	tp->autoneg = AUTONEG_ENABLE;
-	tp->speed = tp->mii.supports_gmii ? SPEED_1000 : SPEED_100;
+	tp->speed = SPEED_100;
+	tp->advertising = RTL_ADVERTISED_10_HALF | RTL_ADVERTISED_10_FULL |
+			  RTL_ADVERTISED_100_HALF | RTL_ADVERTISED_100_FULL;
+	if (tp->mii.supports_gmii) {
+		tp->speed = SPEED_1000;
+		tp->advertising |= RTL_ADVERTISED_1000_FULL;
+	}
 	tp->duplex = DUPLEX_FULL;
 
 	tp->rx_copybreak = RTL8152_RXFG_HEADSZ;
-- 
2.21.0


^ permalink raw reply related

* RE: [net-next 3/3] ravb: TROCR register is only present on R-Car Gen3
From: Yoshihiro Shimoda @ 2019-09-02 10:51 UTC (permalink / raw)
  To: Simon Horman, David Miller, Sergei Shtylyov
  Cc: Magnus Damm, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org
In-Reply-To: <20190902080603.5636-4-horms+renesas@verge.net.au>

Hi Simon-san,

> From: Simon Horman, Sent: Monday, September 2, 2019 5:06 PM
> 
> Only use the TROCR register on R-Car Gen3.
> It is not present on other SoCs.
> 
> Offsets used for the undocumented registers are also considered reserved
> and should not be written to.
> 
> After some internal investigation with Renesas it remains unclear why this
> driver accesses these fields on R-Car Gen2 but regardless of what the
> historical reasons are the current code is considered incorrect.
> 
> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Best regards,
Yoshihiro Shimoda


^ permalink raw reply

* RE: [net-next 2/3] ravb: Remove undocumented processing
From: Yoshihiro Shimoda @ 2019-09-02 10:49 UTC (permalink / raw)
  To: Simon Horman, David Miller, Sergei Shtylyov
  Cc: Magnus Damm, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org, Kazuya Mizuguchi
In-Reply-To: <20190902080603.5636-3-horms+renesas@verge.net.au>

Hi Simon-san,

> From: Simon Horman, Sent: Monday, September 2, 2019 5:06 PM
> 
> From: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> 
> This patch removes the use of the undocumented registers
> CDCR, LCCR, CERCR, CEECR and the undocumented BOC bit of the CCC register.
> 
> Current documentation for EtherAVB (ravb) describes the offset of
> what the driver uses as the BOC bit as reserved and that only a value of
> 0 should be written. Furthermore, the offsets used for the undocumented
> registers are also considered reserved nd should not be written to.
> 
> After some internal investigation with Renesas it remains unclear
> why this driver accesses these fields but regardless of what the historical
> reasons are the current code is considered incorrect.
> 
> Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>

Thank you for the patch!

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Best regards,
Yoshihiro Shimoda


^ permalink raw reply

* RE: [net-next 1/3] ravb: correct typo in FBP field of SFO register
From: Yoshihiro Shimoda @ 2019-09-02 10:47 UTC (permalink / raw)
  To: Simon Horman, David Miller, Sergei Shtylyov
  Cc: Magnus Damm, netdev@vger.kernel.org,
	linux-renesas-soc@vger.kernel.org, Kazuya Mizuguchi
In-Reply-To: <20190902080603.5636-2-horms+renesas@verge.net.au>

Hi Simon-san,

Thank you for the patch!

> From: Simon Horman, Sent: Monday, September 2, 2019 5:06 PM
> 
> From: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> 
> The field name is FBP rather than FPB.
> 
> This field is unused and could equally be removed from the driver entirely.
> But there seems no harm in leaving as documentation of the presence of the
> field.
> 
> Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
> ---
> v0 - Kazuya Mizuguchi
> 
> v1 - Simon Horman
> * Extracted from larger patch
> * Wrote changelog
> 
> diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
> index ac9195add811..bdb051f04b0c 100644
> --- a/drivers/net/ethernet/renesas/ravb.h
> +++ b/drivers/net/ethernet/renesas/ravb.h
> @@ -317,7 +312,7 @@ enum UFCD_BIT {
> 
>  /* SFO */
>  enum SFO_BIT {
> -	SFO_FPB		= 0x0000003F,
> +	SFO_FBP		= 0x0000003F,
>  };
> 
>  /* RTC */
> ---
>  drivers/net/ethernet/renesas/ravb.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h

This patch has two same diff. After removed either one of them,

Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>

Best regards,
Yoshihiro Shimoda


> index ac9195add811..2596a95a4300 100644
> --- a/drivers/net/ethernet/renesas/ravb.h
> +++ b/drivers/net/ethernet/renesas/ravb.h
> @@ -317,7 +317,7 @@ enum UFCD_BIT {
> 
>  /* SFO */
>  enum SFO_BIT {
> -	SFO_FPB		= 0x0000003F,
> +	SFO_FBP		= 0x0000003F,
>  };
> 
>  /* RTC */
> --
> 2.11.0


^ permalink raw reply

* Re: [PATCH net-next 05/13] net: stmmac: selftests: Add selftest for L3/L4 Filters
From: kbuild test robot @ 2019-09-02 10:35 UTC (permalink / raw)
  To: Jose Abreu
  Cc: kbuild-all, netdev, Joao Pinto, Jose Abreu, Giuseppe Cavallaro,
	Alexandre Torgue, David S. Miller, Maxime Coquelin, linux-stm32,
	linux-arm-kernel, linux-kernel
In-Reply-To: <f2b536e9370a448a40fc411712e717d4c605a3f5.1567410970.git.joabreu@synopsys.com>

[-- Attachment #1: Type: text/plain, Size: 4087 bytes --]

Hi Jose,

I love your patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Jose-Abreu/net-stmmac-Improvements-for-next/20190902-160927
config: mips-allmodconfig (attached as .config)
compiler: mips-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=mips 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c: In function '__stmmac_test_l3filt':
>> drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c:1249:1: warning: the frame size of 1280 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^
   drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c: In function '__stmmac_test_l4filt':
   drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c:1362:1: warning: the frame size of 1280 bytes is larger than 1024 bytes [-Wframe-larger-than=]
    }
    ^

vim +1249 drivers/net/ethernet/stmicro/stmmac/stmmac_selftests.c

  1170	
  1171	#ifdef CONFIG_NET_CLS_ACT
  1172	static int __stmmac_test_l3filt(struct stmmac_priv *priv, u32 dst, u32 src,
  1173					u32 dst_mask, u32 src_mask)
  1174	{
  1175		struct flow_dissector_key_ipv4_addrs key, mask;
  1176		unsigned long dummy_cookie = 0xdeadbeef;
  1177		struct flow_dissector dissector = { };
  1178		struct stmmac_packet_attrs attr = { };
  1179		struct flow_cls_offload cls = { };
  1180		struct flow_rule *rule;
  1181		int ret;
  1182	
  1183		if (!tc_can_offload(priv->dev))
  1184			return -EOPNOTSUPP;
  1185		if (!priv->dma_cap.l3l4fnum)
  1186			return -EOPNOTSUPP;
  1187		if (priv->rss.enable) {
  1188			struct stmmac_rss rss = { .enable = false, };
  1189	
  1190			stmmac_rss_configure(priv, priv->hw, &rss,
  1191					     priv->plat->rx_queues_to_use);
  1192		}
  1193	
  1194		dissector.used_keys |= (1 << FLOW_DISSECTOR_KEY_IPV4_ADDRS);
  1195		dissector.offset[FLOW_DISSECTOR_KEY_IPV4_ADDRS] = 0;
  1196	
  1197		cls.common.chain_index = 0;
  1198		cls.command = FLOW_CLS_REPLACE;
  1199		cls.cookie = dummy_cookie;
  1200	
  1201		rule = kzalloc(struct_size(rule, action.entries, 1), GFP_KERNEL);
  1202		if (!rule) {
  1203			ret = -ENOMEM;
  1204			goto cleanup_rss;
  1205		}
  1206	
  1207		rule->match.dissector = &dissector;
  1208		rule->match.key = (void *)&key;
  1209		rule->match.mask = (void *)&mask;
  1210	
  1211		key.src = htonl(src);
  1212		key.dst = htonl(dst);
  1213		mask.src = src_mask;
  1214		mask.dst = dst_mask;
  1215	
  1216		cls.rule = rule;
  1217	
  1218		rule->action.entries[0].id = FLOW_ACTION_DROP;
  1219		rule->action.num_entries = 1;
  1220	
  1221		attr.dst = priv->dev->dev_addr;
  1222		attr.ip_dst = dst;
  1223		attr.ip_src = src;
  1224	
  1225		/* Shall receive packet */
  1226		ret = __stmmac_test_loopback(priv, &attr);
  1227		if (ret)
  1228			goto cleanup_rule;
  1229	
  1230		ret = stmmac_tc_setup_cls(priv, priv, &cls);
  1231		if (ret)
  1232			goto cleanup_rule;
  1233	
  1234		/* Shall NOT receive packet */
  1235		ret = __stmmac_test_loopback(priv, &attr);
  1236		ret = ret ? 0 : -EINVAL;
  1237	
  1238		cls.command = FLOW_CLS_DESTROY;
  1239		stmmac_tc_setup_cls(priv, priv, &cls);
  1240	cleanup_rule:
  1241		kfree(rule);
  1242	cleanup_rss:
  1243		if (priv->rss.enable) {
  1244			stmmac_rss_configure(priv, priv->hw, &priv->rss,
  1245					     priv->plat->rx_queues_to_use);
  1246		}
  1247	
  1248		return ret;
> 1249	}
  1250	#else
  1251	static int __stmmac_test_l3filt(struct stmmac_priv *priv, u32 dst, u32 src,
  1252					u32 dst_mask, u32 src_mask)
  1253	{
  1254		return -EOPNOTSUPP;
  1255	}
  1256	#endif
  1257	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 61542 bytes --]

^ permalink raw reply

* [PATCH net] rxrpc: Fix misplaced traceline
From: David Howells @ 2019-09-02 10:34 UTC (permalink / raw)
  To: netdev; +Cc: Hillf Danton, dhowells, linux-afs, linux-kernel

There's a misplaced traceline in rxrpc_input_packet() which is looking at a
packet that just got released rather than the replacement packet.

Fix this by moving the traceline after the assignment that moves the new
packet pointer to the actual packet pointer.

Fixes: d0d5c0cd1e71 ("rxrpc: Use skb_unshare() rather than skb_cow_data()")
Reported-by: Hillf Danton <hdanton@sina.com>
Signed-off-by: David Howells <dhowells@redhat.com>
---

 net/rxrpc/input.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/input.c b/net/rxrpc/input.c
index d122c53c8697..157be1ff8697 100644
--- a/net/rxrpc/input.c
+++ b/net/rxrpc/input.c
@@ -1262,8 +1262,8 @@ int rxrpc_input_packet(struct sock *udp_sk, struct sk_buff *skb)
 
 			if (nskb != skb) {
 				rxrpc_eaten_skb(skb, rxrpc_skb_received);
-				rxrpc_new_skb(skb, rxrpc_skb_unshared);
 				skb = nskb;
+				rxrpc_new_skb(skb, rxrpc_skb_unshared);
 				sp = rxrpc_skb(skb);
 			}
 		}


^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox