* [PATCH v2 net-next 7/7] tcp: make tcp_sendmsg() aware of socket backlog
From: Eric Dumazet @ 2016-04-29 3:10 UTC
To: David S . Miller
Cc: netdev, Eric Dumazet, Soheil Hassas Yeganeh, Alexei Starovoitov,
Marcelo Ricardo Leitner, Eric Dumazet
In-Reply-To: <1461899449-8096-1-git-send-email-edumazet@google.com>
Large sendmsg()/write() hold socket lock for the duration of the call,
unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
are parked into socket backlog for a long time.
Critical decisions like fast retransmit might be delayed.
Receivers have to maintain a big out of order queue with additional cpu
overhead, and also possible stalls in TX once windows are full.
Bidirectional flows are particularly hurt since the backlog can become
quite big if the copy from user space triggers IO (page faults)
Some applications learnt to use sendmsg() (or sendmmsg()) with small
chunks to avoid this issue.
Kernel should know better, right ?
Add a generic sk_flush_backlog() helper and use it right
before a new skb is allocated. Typically we put 64KB of payload
per skb (unless MSG_EOR is requested) and checking socket backlog
every 64KB gives good results.
As a matter of fact, tests with TSO/GSO disabled give very nice
results, as we manage to keep a small write queue and smaller
perceived rtt.
Note that sk_flush_backlog() maintains socket ownership,
so is not equivalent to a {release_sock(sk); lock_sock(sk);},
to ensure implicit atomicity rules that sendmsg() was
giving to (possibly buggy) applications.
In this simple implementation, I chose to not call tcp_release_cb(),
but we might consider this later.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
---
include/net/sock.h | 11 +++++++++++
net/core/sock.c | 7 +++++++
net/ipv4/tcp.c | 8 ++++++--
3 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h
index 3df778ccaa82..1dbb1f9f7c1b 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -926,6 +926,17 @@ void sk_stream_kill_queues(struct sock *sk);
void sk_set_memalloc(struct sock *sk);
void sk_clear_memalloc(struct sock *sk);
+void __sk_flush_backlog(struct sock *sk);
+
+static inline bool sk_flush_backlog(struct sock *sk)
+{
+ if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
+ __sk_flush_backlog(sk);
+ return true;
+ }
+ return false;
+}
+
int sk_wait_data(struct sock *sk, long *timeo, const struct sk_buff *skb);
struct request_sock_ops;
diff --git a/net/core/sock.c b/net/core/sock.c
index 70744dbb6c3f..f615e9391170 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2048,6 +2048,13 @@ static void __release_sock(struct sock *sk)
sk->sk_backlog.len = 0;
}
+void __sk_flush_backlog(struct sock *sk)
+{
+ spin_lock_bh(&sk->sk_lock.slock);
+ __release_sock(sk);
+ spin_unlock_bh(&sk->sk_lock.slock);
+}
+
/**
* sk_wait_data - wait for data to arrive at sk_receive_queue
* @sk: sock to wait on
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 4787f86ae64c..b945c2b046c5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1136,11 +1136,12 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
/* This should be in poll */
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
- mss_now = tcp_send_mss(sk, &size_goal, flags);
-
/* Ok commence sending. */
copied = 0;
+restart:
+ mss_now = tcp_send_mss(sk, &size_goal, flags);
+
err = -EPIPE;
if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
goto out_err;
@@ -1166,6 +1167,9 @@ new_segment:
if (!sk_stream_memory_free(sk))
goto wait_for_sndbuf;
+ if (sk_flush_backlog(sk))
+ goto restart;
+
skb = sk_stream_alloc_skb(sk,
select_size(sk, sg),
sk->sk_allocation,
--
2.8.0.rc3.226.g39d4020
* Re: [PATCH v2 net-next 7/7] tcp: make tcp_sendmsg() aware of socket backlog
From: Alexei Starovoitov @ 2016-04-29 4:43 UTC
To: Eric Dumazet, David S . Miller
Cc: netdev, Soheil Hassas Yeganeh, Marcelo Ricardo Leitner,
Eric Dumazet
In-Reply-To: <1461899449-8096-8-git-send-email-edumazet@google.com>
On 4/28/16 8:10 PM, Eric Dumazet wrote:
> Large sendmsg()/write() hold socket lock for the duration of the call,
> unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
> are parked into socket backlog for a long time.
> Critical decisions like fast retransmit might be delayed.
> Receivers have to maintain a big out of order queue with additional cpu
> overhead, and also possible stalls in TX once windows are full.
>
> Bidirectional flows are particularly hurt since the backlog can become
> quite big if the copy from user space triggers IO (page faults)
>
> Some applications learnt to use sendmsg() (or sendmmsg()) with small
> chunks to avoid this issue.
>
> Kernel should know better, right ?
>
> Add a generic sk_flush_backlog() helper and use it right
> before a new skb is allocated. Typically we put 64KB of payload
> per skb (unless MSG_EOR is requested) and checking socket backlog
> every 64KB gives good results.
>
> As a matter of fact, tests with TSO/GSO disabled give very nice
> results, as we manage to keep a small write queue and smaller
> perceived rtt.
>
> Note that sk_flush_backlog() maintains socket ownership,
> so is not equivalent to a {release_sock(sk); lock_sock(sk);},
> to ensure implicit atomicity rules that sendmsg() was
> giving to (possibly buggy) applications.
>
> In this simple implementation, I chose to not call tcp_release_cb(),
> but we might consider this later.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Alexei Starovoitov <ast@fb.com>
> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
> ---
> include/net/sock.h | 11 +++++++++++
> net/core/sock.c | 7 +++++++
> net/ipv4/tcp.c | 8 ++++++--
> 3 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 3df778ccaa82..1dbb1f9f7c1b 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -926,6 +926,17 @@ void sk_stream_kill_queues(struct sock *sk);
> void sk_set_memalloc(struct sock *sk);
> void sk_clear_memalloc(struct sock *sk);
>
> +void __sk_flush_backlog(struct sock *sk);
> +
> +static inline bool sk_flush_backlog(struct sock *sk)
> +{
> + if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
> + __sk_flush_backlog(sk);
> + return true;
> + }
> + return false;
> +}
> +
> int sk_wait_data(struct sock *sk, long *timeo, const struct sk_buff *skb);
>
> struct request_sock_ops;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 70744dbb6c3f..f615e9391170 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2048,6 +2048,13 @@ static void __release_sock(struct sock *sk)
> sk->sk_backlog.len = 0;
> }
>
> +void __sk_flush_backlog(struct sock *sk)
> +{
> + spin_lock_bh(&sk->sk_lock.slock);
> + __release_sock(sk);
> + spin_unlock_bh(&sk->sk_lock.slock);
> +}
> +
> /**
> * sk_wait_data - wait for data to arrive at sk_receive_queue
> * @sk: sock to wait on
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 4787f86ae64c..b945c2b046c5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1136,11 +1136,12 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> /* This should be in poll */
> sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
>
> - mss_now = tcp_send_mss(sk, &size_goal, flags);
> -
> /* Ok commence sending. */
> copied = 0;
>
> +restart:
> + mss_now = tcp_send_mss(sk, &size_goal, flags);
> +
> err = -EPIPE;
> if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> goto out_err;
> @@ -1166,6 +1167,9 @@ new_segment:
> if (!sk_stream_memory_free(sk))
> goto wait_for_sndbuf;
>
> + if (sk_flush_backlog(sk))
> + goto restart;
I don't understand the logic completely, but isn't it
safer to do 'goto wait_for_memory;' here if we happened
to hit this in the middle of the loop?
Also does it make sense to rename __release_sock to
something like ___sk_flush_backlog, since that's
what it's doing and not doing any 'release' ?
Ack for patches 2 and 6. Great improvement!
* Re: [RFC PATCH V2 2/2] vhost: device IOTLB API
From: Jason Wang @ 2016-04-29 4:44 UTC
To: Michael S. Tsirkin
Cc: kvm, qemu-devel, netdev, linux-kernel, peterx, virtualization,
pbonzini
In-Reply-To: <5722B511.6060401@redhat.com>
On 04/29/2016 09:12 AM, Jason Wang wrote:
> On 04/28/2016 10:43 PM, Michael S. Tsirkin wrote:
>> > On Thu, Apr 28, 2016 at 02:37:16PM +0800, Jason Wang wrote:
>>> >>
>>> >> On 04/27/2016 07:45 PM, Michael S. Tsirkin wrote:
>>>> >>> On Fri, Mar 25, 2016 at 10:34:34AM +0800, Jason Wang wrote:
>>>>> >>>> This patch tries to implement an device IOTLB for vhost. This could be
>>>>> >>>> used with for co-operation with userspace(qemu) implementation of DMA
>>>>> >>>> remapping.
>>>>> >>>>
>>>>> >>>> The idea is simple. When vhost meets an IOTLB miss, it will request
>>>>> >>>> the assistance of userspace to do the translation, this is done
>>>>> >>>> through:
>>>>> >>>>
>>>>> >>>> - Fill the translation request in a preset userspace address (This
>>>>> >>>> address is set through ioctl VHOST_SET_IOTLB_REQUEST_ENTRY).
>>>>> >>>> - Notify userspace through eventfd (This eventfd was set through ioctl
>>>>> >>>> VHOST_SET_IOTLB_FD).
>>>> >>> Why use an eventfd for this?
>>> >> The aim is to implement the API all through ioctls.
>>> >>
>>>> >>> We use them for interrupts because
>>>> >>> that happens to be what kvm wants, but here - why don't we
>>>> >>> just add a generic support for reading out events
>>>> >>> on the vhost fd itself?
>>> >> I've considered this approach, but what's the advantages of this? I mean
>>> >> looks like all other ioctls could be done through vhost fd
>>> >> reading/writing too.
>> > read/write have a non-blocking flag.
>> >
>> > It's not useful for other ioctls but it's useful here.
>> >
> Ok, this looks better.
>
>>>>> >>>> - device IOTLB were started and stopped through VHOST_RUN_IOTLB ioctl
>>>>> >>>>
>>>>> >>>> When userspace finishes the translation, it will update the vhost
>>>>> >>>> IOTLB through VHOST_UPDATE_IOTLB ioctl. Userspace is also in charge of
>>>>> >>>> snooping the IOTLB invalidation of IOMMU IOTLB and use
>>>>> >>>> VHOST_UPDATE_IOTLB to invalidate the possible entry in vhost.
>>>> >>> There's one problem here, and that is that VQs still do not undergo
>>>> >>> translation. In theory VQ could be mapped in such a way
>>>> >>> that it's not contigious in userspace memory.
>>> >> I'm not sure I get the issue, current vhost API support setting
>>> >> desc_user_addr, used_user_addr and avail_user_addr independently. So
>>> >> looks ok? If not, looks not a problem to device IOTLB API itself.
>> > The problem is that addresses are all HVA.
>> >
>> > Without an iommu, we ask for them to be contigious and
>> > since bus address == GPA, this means contigious GPA =>
>> > contigious HVA. With an IOMMU you can map contigious
>> > bus address but non contigious GPA and non contigious HVA.
> Yes, so the issue is we should not reuse VHOST_SET_VRING_ADDR and invent
> a new ioctl to set bus addr (guest iova). The access the VQ through
> device IOTLB too.
Note that userspace already checks for this and falls back to userspace
if it detects non-contiguous GPA. Since this happens rarely, I'm not sure
we should handle it.
>
>> >
>> > Another concern: what if guest changes the GPA while keeping bus address
>> > constant? Normal devices will work because they only use
>> > bus addresses, but virtio will break.
> If we access VQ through device IOTLB too, this could be solved.
>
I don't see a reason why the guest would want to change the GPA during
DMA. Even if it can, that would need a lot of other synchronization.
* Re: [PATCH v2 net-next 7/7] tcp: make tcp_sendmsg() aware of socket backlog
From: Eric Dumazet @ 2016-04-29 5:05 UTC
To: Alexei Starovoitov
Cc: Eric Dumazet, David S . Miller, netdev, Soheil Hassas Yeganeh,
Marcelo Ricardo Leitner
In-Reply-To: <5722E663.8080304@fb.com>
On Thu, 2016-04-28 at 21:43 -0700, Alexei Starovoitov wrote:
>
> I don't understand the logic completely, but isn't it
> safer to do 'goto wait_for_memory;' here if we happened
> to hit this in the middle of the loop?
Well, the wait_for_memory pushes data, and could early return to user
space with short writes (non blocking IO). This would break things...
After processing backlog, tcp_send_mss() needs to be called again,
and we also need to check sk_err and sk_shutdown. A goto looks fine to
me.
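For readers following the thread, the restart logic can be modelled in plain userspace C. This is only a sketch under stated assumptions, not kernel code: struct model_sock, model_flush_backlog() and model_sendmsg() are hypothetical stand-ins, the backlog is reduced to a packet counter, and error handling is deliberately simplified. What it tries to show is that `copied` is initialised once before the `restart` label, so jumping back re-runs the MSS and error checks without re-sending data.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct sock; not the kernel type. */
struct model_sock {
	int backlog;		/* packets parked while the sender owns the socket */
	int processed;		/* packets handled by a backlog flush */
	int mss_recalcs;	/* number of times the restart label was reached */
	int err;		/* models sk->sk_err */
};

/* Models sk_flush_backlog(): drain the backlog, report whether work was done. */
static bool model_flush_backlog(struct model_sock *sk)
{
	if (sk->backlog == 0)
		return false;
	sk->processed += sk->backlog;
	sk->backlog = 0;
	return true;
}

/*
 * Models the send loop: one chunk per "skb". arrivals[i] packets land in the
 * backlog while chunk i is being copied. Returns bytes copied, or -1 on a
 * socket error (simplified: partial-write reporting is not modelled).
 */
static long model_sendmsg(struct model_sock *sk, size_t size, size_t chunk,
			  const int *arrivals, size_t n_arrivals)
{
	size_t copied = 0;	/* set once, survives every restart */
	size_t i = 0;

	assert(chunk > 0);

restart:
	sk->mss_recalcs++;	/* stands in for tcp_send_mss() */
	if (sk->err)
		return -1;

	while (copied < size) {
		/* new_segment: look at the backlog before starting a chunk */
		if (model_flush_backlog(sk))
			goto restart;

		size_t len = size - copied < chunk ? size - copied : chunk;

		copied += len;
		if (i < n_arrivals)
			sk->backlog += arrivals[i++];
	}
	return (long)copied;
}
```

With size 200, chunk 64 and arrivals {2, 0, 1}, the model returns 200, flushes three packets in total, and reaches the restart label three times (once on entry, twice after a flush).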
> Also does it make sense to rename __release_sock to
> something like ___sk_flush_backlog, since that's
> what it's doing and not doing any 'release' ?
Well, I guess it could be renamed, but it has been named like that for
decades. Why change it now, when this patch does not touch it?
* Re: [PATCH v2 net-next 7/7] tcp: make tcp_sendmsg() aware of socket backlog
From: Alexei Starovoitov @ 2016-04-29 5:19 UTC
To: Eric Dumazet
Cc: Eric Dumazet, David S . Miller, netdev, Soheil Hassas Yeganeh,
Marcelo Ricardo Leitner
In-Reply-To: <1461906355.5535.141.camel@edumazet-glaptop3.roam.corp.google.com>
On 4/28/16 10:05 PM, Eric Dumazet wrote:
> On Thu, 2016-04-28 at 21:43 -0700, Alexei Starovoitov wrote:
>
>>
>> I don't understand the logic completely, but isn't it
>> safer to do 'goto wait_for_memory;' here if we happened
>> to hit this in the middle of the loop?
>
> Well, the wait_for_memory pushes data, and could early return to user
> space with short writes (non blocking IO). This would break things...
I see. Right. My only concern was about restarting the loop
and msg_data_left(), since it's really hard to follow iov_iter logic.
* [PATCH] net/smscx5xx: use the device tree for mac address
From: Lubomir Rintel @ 2016-04-29 7:05 UTC
To: linux-kernel
Cc: linux-usb, netdev, Steve Glendinning, Arnd Bergmann,
Lubomir Rintel
From: Arnd Bergmann <arnd@arndb.de>
This takes the MAC address for smsc75xx/smsc95xx USB network devices
from the device tree. This is required to get a usable persistent
address on the popular beagleboard, whose hardware designers
accidentally forgot that an ethernet device really requires a
MAC address to be functional.
The Raspberry Pi also ships smsc9514 without a serial EEPROM and stores
the MAC address in ROM accessible via VC4 firmware.
The smsc75xx and smsc95xx drivers are just two copies of the
same code, so better fix both.
[lkundrak@v3.sk: updated to use of_get_property() as per suggestion from
Arnd, reworded the message and comments a bit]
Tested-by: Lubomir Rintel <lkundrak@v3.sk>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lubomir Rintel <lkundrak@v3.sk>
---
Changes since v2:
- Prefer DT address to EEPROM address. No practical difference since
the devices are not supposed to have both, but aligned with existing
practice (ixgbe, dm9000).
Changes since v1:
- Made use of of_get_property()
- Amended comments/commit message a bit
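
The lookup order this version ends up with (device tree first, then EEPROM, then a generated address) can be sketched in userspace C as follows. This is an illustration only: pick_mac(), mac_is_valid() and enum mac_source are hypothetical names, the validity rule (not all-zero, not multicast) is my assumption about what "invalid" EEPROM values means, and a fixed locally administered address stands in for the random one so the example is deterministic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical labels for where the chosen address came from. */
enum mac_source { MAC_FROM_DT, MAC_FROM_EEPROM, MAC_FALLBACK };

/* Assumed validity rule: not all-zero and not a multicast address. */
static bool mac_is_valid(const uint8_t *mac)
{
	static const uint8_t zero[MAC_LEN];

	return memcmp(mac, zero, MAC_LEN) != 0 && (mac[0] & 0x01) == 0;
}

/*
 * Prefer the device tree address, then a valid EEPROM address, then a
 * fallback. A NULL pointer means "this source is not available".
 */
static enum mac_source pick_mac(const uint8_t *dt_mac,
				const uint8_t *eeprom_mac,
				uint8_t out[MAC_LEN])
{
	/* Stand-in for a random address: locally administered, unicast. */
	static const uint8_t fallback[MAC_LEN] = {
		0x02, 0x00, 0x00, 0x00, 0x00, 0x01
	};

	assert(out != NULL);

	if (dt_mac) {
		memcpy(out, dt_mac, MAC_LEN);
		return MAC_FROM_DT;
	}
	if (eeprom_mac && mac_is_valid(eeprom_mac)) {
		memcpy(out, eeprom_mac, MAC_LEN);
		return MAC_FROM_EEPROM;
	}
	memcpy(out, fallback, MAC_LEN);
	return MAC_FALLBACK;
}
```

When both sources are present the device tree address wins, which is the only behavioural difference from v2.
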
drivers/net/usb/smsc75xx.c | 12 +++++++++++-
drivers/net/usb/smsc95xx.c | 12 +++++++++++-
2 files changed, 22 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/smsc75xx.c b/drivers/net/usb/smsc75xx.c
index 30033db..c369db9 100644
--- a/drivers/net/usb/smsc75xx.c
+++ b/drivers/net/usb/smsc75xx.c
@@ -29,6 +29,7 @@
#include <linux/crc32.h>
#include <linux/usb/usbnet.h>
#include <linux/slab.h>
+#include <linux/of_net.h>
#include "smsc75xx.h"
#define SMSC_CHIPNAME "smsc75xx"
@@ -761,6 +762,15 @@ static int smsc75xx_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)
static void smsc75xx_init_mac_address(struct usbnet *dev)
{
+ const u8 *mac_addr;
+
+ /* maybe the boot loader passed the MAC address in devicetree */
+ mac_addr = of_get_mac_address(dev->udev->dev.of_node);
+ if (mac_addr) {
+ memcpy(dev->net->dev_addr, mac_addr, ETH_ALEN);
+ return;
+ }
+
/* try reading mac address from EEPROM */
if (smsc75xx_read_eeprom(dev, EEPROM_MAC_OFFSET, ETH_ALEN,
dev->net->dev_addr) == 0) {
@@ -772,7 +782,7 @@ static void smsc75xx_init_mac_address(struct usbnet *dev)
}
}
- /* no eeprom, or eeprom values are invalid. generate random MAC */
+ /* no useful static MAC address found. generate a random one */
eth_hw_addr_random(dev->net);
netif_dbg(dev, ifup, dev->net, "MAC address set to eth_random_addr\n");
}
diff --git a/drivers/net/usb/smsc95xx.c b/drivers/net/usb/smsc95xx.c
index 66b3ab9..2edc2bc 100644
--- a/drivers/net/usb/smsc95xx.c
+++ b/drivers/net/usb/smsc95xx.c
@@ -29,6 +29,7 @@
#include <linux/crc32.h>
#include <linux/usb/usbnet.h>
#include <linux/slab.h>
+#include <linux/of_net.h>
#include "smsc95xx.h"
#define SMSC_CHIPNAME "smsc95xx"
@@ -765,6 +766,15 @@ static int smsc95xx_ioctl(struct net_device *netdev, struct ifreq *rq, int cmd)
static void smsc95xx_init_mac_address(struct usbnet *dev)
{
+ const u8 *mac_addr;
+
+ /* maybe the boot loader passed the MAC address in devicetree */
+ mac_addr = of_get_mac_address(dev->udev->dev.of_node);
+ if (mac_addr) {
+ memcpy(dev->net->dev_addr, mac_addr, ETH_ALEN);
+ return;
+ }
+
/* try reading mac address from EEPROM */
if (smsc95xx_read_eeprom(dev, EEPROM_MAC_OFFSET, ETH_ALEN,
dev->net->dev_addr) == 0) {
@@ -775,7 +785,7 @@ static void smsc95xx_init_mac_address(struct usbnet *dev)
}
}
- /* no eeprom, or eeprom values are invalid. generate random MAC */
+ /* no useful static MAC address found. generate a random one */
eth_hw_addr_random(dev->net);
netif_dbg(dev, ifup, dev->net, "MAC address set to eth_random_addr\n");
}
--
2.7.4
* Re: [PATCH net-next] ila: ipv6/ila: fix nlsize calculation for lwtunnel
From: Nicolas Dichtel @ 2016-04-29 7:40 UTC
To: Tom Herbert, davem, netdev; +Cc: kernel-team
In-Reply-To: <1461888749-4105343-1-git-send-email-tom@herbertland.com>
Le 29/04/2016 02:12, Tom Herbert a écrit :
> The handler 'ila_fill_encap_info' adds two attributes: ILA_ATTR_LOCATOR
> and ILA_ATTR_CSUM_MODE.
>
> Also, do nla_put_u8 instead of nla_put_u64 for ILA_ATTR_CSUM_MODE.
>
> Fixes: 65d7ab8de582 ("net: Identifier Locator Addressing module")
> Reported-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Signed-off-by: Tom Herbert <tom@herbertland.com>
> ---
> net/ipv6/ila/ila_lwt.c | 10 +++++++---
> 1 file changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/net/ipv6/ila/ila_lwt.c b/net/ipv6/ila/ila_lwt.c
> index 4985e1a..7788090 100644
> --- a/net/ipv6/ila/ila_lwt.c
> +++ b/net/ipv6/ila/ila_lwt.c
> @@ -133,7 +133,7 @@ static int ila_fill_encap_info(struct sk_buff *skb,
> if (nla_put_u64_64bit(skb, ILA_ATTR_LOCATOR, (__force u64)p->locator.v64,
> ILA_ATTR_PAD))
> goto nla_put_failure;
> - if (nla_put_u64(skb, ILA_ATTR_CSUM_MODE, (__force u8)p->csum_mode))
> + if (nla_put_u8(skb, ILA_ATTR_CSUM_MODE, (__force u8)p->csum_mode))
> goto nla_put_failure;
>
> return 0;
> @@ -144,8 +144,12 @@ nla_put_failure:
>
> static int ila_encap_nlsize(struct lwtunnel_state *lwtstate)
> {
> - /* No encapsulation overhead */
> - return 0;
> + return
> + /* ILA_ATTR_LOCATOR */
> + nla_total_size(sizeof(u64)) +
It should be nla_total_size_64bit() here.
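
To illustrate why the size differs, here is a small userspace model of the arithmetic. It is a sketch under assumptions, not the kernel helpers themselves: the model_* names are hypothetical, the constants (4-byte attribute header, 4-byte alignment) reflect the netlink definitions as I understand them, and needs_pad stands for architectures where an extra empty pad attribute may be emitted to align a 64-bit payload. The second attribute is assumed to be the u8 checksum mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed to mirror the netlink attribute layout: 4-byte header and alignment. */
#define MODEL_NLA_ALIGNTO 4
#define MODEL_NLA_ALIGN(len) \
	(((len) + MODEL_NLA_ALIGNTO - 1) & ~(MODEL_NLA_ALIGNTO - 1))
#define MODEL_NLA_HDRLEN 4

/* Header plus payload, rounded up to the alignment. */
static int model_nla_total_size(int payload)
{
	assert(payload >= 0);
	return MODEL_NLA_ALIGN(MODEL_NLA_HDRLEN + payload);
}

/* Same, plus room for an empty pad attribute when padding may be needed. */
static int model_nla_total_size_64bit(int payload, bool needs_pad)
{
	int size = model_nla_total_size(payload);

	if (needs_pad)
		size += model_nla_total_size(0);
	return size;
}

/* Worst-case space for a u64 locator plus a u8 checksum mode. */
static int model_ila_encap_nlsize(bool needs_pad)
{
	return model_nla_total_size_64bit(8, needs_pad) +
	       model_nla_total_size(1);
}
```

In this model the plain u64 estimate is 12 bytes while the 64-bit-aware one is 16 when padding applies, so sizing with the former can come up 4 bytes short.
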
Regards,
Nicolas
* [PATCH iproute2] tc: add bash-completion function
From: Quentin Monnet @ 2016-04-29 8:27 UTC
To: alexei.starovoitov; +Cc: stephen, hadi, netdev, vincent.jardin
In-Reply-To: <20160428165349.GB81443@ast-mbp.thefacebook.com>
Add function for command completion for tc in bash, and update Makefile
to install it:
- Under /usr/share/bash-completion/completions/ (default).
- Or under /etc/bash_completion.d/, which is the old directory for
bash-completion, if /usr/share/bash-completion/completions/ is not
found AND /etc/bash_completion.d/ already exists.
Inside iproute2 repository, the completion code is in a new
`bash-completion` toplevel directory.
Signed-off-by: Quentin Monnet <quentin.monnet@6wind.com>
---
Makefile | 9 +
bash-completion/tc | 723 +++++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 732 insertions(+)
create mode 100644 bash-completion/tc
diff --git a/Makefile b/Makefile
index 0190aa004a79..93ce4b430eb2 100644
--- a/Makefile
+++ b/Makefile
@@ -7,6 +7,8 @@ DOCDIR?=$(DATADIR)/doc/iproute2
MANDIR?=$(DATADIR)/man
ARPDDIR?=/var/lib/arpd
KERNEL_INCLUDE?=/usr/include
+BASHCOMPDIR?=/usr/share/bash-completion/completions
+OLDBASHCOMPDIR?=/etc/bash_completion.d
# Path to db_185.h include
DBM_INCLUDE:=$(DESTDIR)/usr/include
@@ -66,6 +68,13 @@ install: all
$(DESTDIR)$(DOCDIR)/examples/diffserv
@for i in $(SUBDIRS) doc; do $(MAKE) -C $$i install; done
install -m 0644 $(shell find etc/iproute2 -maxdepth 1 -type f) $(DESTDIR)$(CONFDIR)
+ if [ ! -d $(DESTDIR)$(BASHCOMPDIR) -a -d $(DESTDIR)$(OLDBASHCOMPDIR) ]; then \
+ install -m 0755 -d $(DESTDIR)$(OLDBASHCOMPDIR); \
+ install -m 0644 bash-completion/tc $(DESTDIR)$(OLDBASHCOMPDIR); \
+ else \
+ install -m 0755 -d $(DESTDIR)$(BASHCOMPDIR); \
+ install -m 0644 bash-completion/tc $(DESTDIR)$(BASHCOMPDIR); \
+ fi
snapshot:
echo "static const char SNAPSHOT[] = \""`date +%y%m%d`"\";" \
diff --git a/bash-completion/tc b/bash-completion/tc
new file mode 100644
index 000000000000..79dd5fcc172c
--- /dev/null
+++ b/bash-completion/tc
@@ -0,0 +1,723 @@
+# tc(8) completion -*- shell-script -*-
+# Copyright 2016 6WIND S.A.
+# Copyright 2016 Quentin Monnet <quentin.monnet@6wind.com>
+
+# Takes a list of words in argument; each one of them is added to COMPREPLY if
+# it is not already present on the command line. Returns no value.
+_tc_once_attr()
+{
+ local w subcword found
+ for w in $*; do
+ found=0
+ for (( subcword=3; subcword < ${#words[@]}-1; subcword++ )); do
+ if [[ $w == ${words[subcword]} ]]; then
+ found=1
+ break
+ fi
+ done
+ [[ $found -eq 0 ]] && \
+ COMPREPLY+=( $( compgen -W "$w" -- "$cur" ) )
+ done
+}
+
+# Takes a list of words in argument; adds them all to COMPREPLY if none of them
+# is already present on the command line. Returns no value.
+_tc_one_of_list()
+{
+ local w subcword
+ for w in $*; do
+ for (( subcword=3; subcword < ${#words[@]}-1; subcword++ )); do
+ [[ $w == ${words[subcword]} ]] && return 1
+ done
+ done
+ COMPREPLY+=( $( compgen -W "$*" -- "$cur" ) )
+}
+
+# Returns "$cur ${cur}arg1 ${cur}arg2 ..."
+_tc_expand_units()
+{
+ [[ $cur =~ ^[0-9]+ ]] || return 1
+ local value=${cur%%[^0-9]*}
+ [[ $cur == $value ]] && echo $cur
+ echo ${@/#/$value}
+}
+
+# Complete based on given word, usually $prev (or possibly the word before),
+# for when an argument or an option name has but a few possible arguments (so
+# tc does not take particular commands into account here).
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_direct_complete()
+{
+ case $1 in
+ # Command options
+ dev)
+ _available_interfaces
+ return 0
+ ;;
+ classid)
+ return 0
+ ;;
+ estimator)
+ local list=$( _tc_expand_units 'secs' 'msecs' 'usecs' )
+ COMPREPLY+=( $( compgen -W "$list" -- "$cur" ) )
+ return 0
+ ;;
+ handle)
+ return 0
+ ;;
+ parent|flowid)
+ local i iface ids cmd
+ for (( i=3; i < ${#words[@]}-2; i++ )); do
+ [[ ${words[i]} == dev ]] && iface=${words[i+1]} && \
+ break
+ done
+ for cmd in qdisc class; do
+ if [[ -n $iface ]]; then
+ ids+=$( tc $cmd show dev $iface 2>/dev/null | \
+ cut -d' ' -f 3 )" "
+ else
+ ids+=$( tc $cmd show 2>/dev/null | cut -d' ' -f 3 )" "
+ fi
+ done
+ [[ $ids != " " ]] && \
+ COMPREPLY+=( $( compgen -W "$ids" -- "$cur" ) )
+ return 0
+ ;;
+ protocol) # list comes from lib/ll_proto.c
+ COMPREPLY+=( $( compgen -W ' 802.1Q 802.1ad 802_2 802_3 LLDP aarp \
+ all aoe arp atalk atmfate atmmpoa ax25 bpq can control cust \
+ ddcmp dec diag dna_dl dna_rc dna_rt econet ieeepup ieeepupat \
+ ip ipv4 ipv6 ipx irda lat localtalk loop mobitex ppp_disc \
+ ppp_mp ppp_ses ppptalk pup pupat rarp sca snap tipc tr_802_2 \
+ wan_ppp x25' -- "$cur" ) )
+ return 0
+ ;;
+ prio)
+ return 0
+ ;;
+ stab)
+ COMPREPLY+=( $( compgen -W 'mtu tsize mpu overhead
+ linklayer' -- "$cur" ) )
+ ;;
+
+ # Qdiscs and classes options
+ alpha|bands|beta|buckets|corrupt|debug|decrement|default|\
+ default_index|depth|direct_qlen|divisor|duplicate|ewma|flow_limit|\
+ flows|hh_limit|increment|indices|linklayer|non_hh_weight|num_tc|\
+ penalty_burst|penalty_rate|prio|priomap|probability|queues|r2q|\
+ reorder|vq|vqs)
+ return 0
+ ;;
+ setup)
+ COMPREPLY+=( $( compgen -W 'vqs' -- "$cur" ) )
+ return 0
+ ;;
+ hw)
+ COMPREPLY+=( $( compgen -W '1 0' -- "$cur" ) )
+ return 0
+ ;;
+ distribution)
+ COMPREPLY+=( $( compgen -W 'uniform normal pareto
+ paretonormal' -- "$cur" ) )
+ return 0
+ ;;
+ loss)
+ COMPREPLY+=( $( compgen -W 'random state gmodel' -- "$cur" ) )
+ return 0
+ ;;
+
+ # Qdiscs and classes options options
+ gap|gmodel|state)
+ return 0
+ ;;
+
+ # Filters options
+ map)
+ COMPREPLY+=( $( compgen -W 'key' -- "$cur" ) )
+ return 0
+ ;;
+ hash)
+ COMPREPLY+=( $( compgen -W 'keys' -- "$cur" ) )
+ return 0
+ ;;
+ indev)
+ _available_interfaces
+ return 0
+ ;;
+ eth_type)
+ COMPREPLY+=( $( compgen -W 'ipv4 ipv6' -- "$cur" ) )
+ return 0
+ ;;
+ ip_proto)
+ COMPREPLY+=( $( compgen -W 'tcp udp' -- "$cur" ) )
+ return 0
+ ;;
+
+ # Filters options options
+ key|keys)
+ [[ ${words[@]} =~ graft ]] && return 1
+ COMPREPLY+=( $( compgen -W 'src dst proto proto-src proto-dst iif \
+ priority mark nfct nfct-src nfct-dst nfct-proto-src \
+ nfct-proto-dst rt-classid sk-uid sk-gid vlan-tag rxhash' -- \
+ "$cur" ) )
+ return 0
+ ;;
+
+ # BPF options - used for filters, actions, and exec
+ export|bytecode|bytecode-file|object-file)
+ _filedir
+ return 0
+ ;;
+ object-pinned|graft) # Pinned object is probably under /sys/fs/bpf/
+ [[ -n "$cur" ]] && _filedir && return 0
+ COMPREPLY=( $( compgen -G "/sys/fs/bpf/*" -- "$cur" ) ) || _filedir
+ compopt -o nospace
+ return 0
+ ;;
+ section)
+ if (type objdump > /dev/null 2>&1) ; then
+ local fword objfile section_list
+ for (( fword=3; fword < ${#words[@]}-3; fword++ )); do
+ if [[ ${words[fword]} == object-file ]]; then
+ objfile=${words[fword+1]}
+ break
+ fi
+ done
+ section_list=$( objdump -h $objfile 2>/dev/null | \
+ sed -n 's/^ *[0-9]\+ \([^ ]*\) *.*/\1/p' )
+ COMPREPLY+=( $( compgen -W "$section_list" -- "$cur" ) )
+ fi
+ return 0
+ ;;
+ import|run)
+ _filedir
+ return 0
+ ;;
+ type)
+ COMPREPLY+=( $( compgen -W 'cls act' -- "$cur" ) )
+ return 0
+ ;;
+
+ # Actions options
+ random)
+ _tc_one_of_list 'netrand determ'
+ return 0
+ ;;
+
+ # Units for option arguments
+ bandwidth|maxrate|peakrate|rate)
+ local list=$( _tc_expand_units 'bit' \
+ 'kbit' 'kibit' 'kbps' 'kibps' \
+ 'mbit' 'mibit' 'mbps' 'mibps' \
+ 'gbit' 'gibit' 'gbps' 'gibps' \
+ 'tbit' 'tibit' 'tbps' 'tibps' )
+ COMPREPLY+=( $( compgen -W "$list" -- "$cur" ) )
+ ;;
+ admit_bytes|avpkt|burst|cell|initial_quantum|limit|max|min|mtu|mpu|\
+ overhead|quantum|redflowlimit)
+ local list=$( _tc_expand_units \
+ 'b' 'kbit' 'k' 'mbit' 'm' 'gbit' 'g' )
+ COMPREPLY+=( $( compgen -W "$list" -- "$cur" ) )
+ ;;
+ db|delay|evict_timeout|interval|latency|perturb|rehash|reset_timeout|\
+ target|tupdate)
+ local list=$( _tc_expand_units 'secs' 'msecs' 'usecs' )
+ COMPREPLY+=( $( compgen -W "$list" -- "$cur" ) )
+ ;;
+ esac
+ return 1
+}
+
+# Complete with options names for qdiscs. Each qdisc has its own set of options
+# and it seems we cannot really parse it from anywhere, so we add it manually
+# in this function.
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_qdisc_options()
+{
+ case $1 in
+ choke)
+ _tc_once_attr 'limit bandwidth ecn min max burst'
+ return 0
+ ;;
+ codel)
+ _tc_once_attr 'limit target interval'
+ _tc_one_of_list 'ecn noecn'
+ return 0
+ ;;
+ bfifo|pfifo|pfifo_head_drop)
+ _tc_once_attr 'limit'
+ return 0
+ ;;
+ fq)
+ _tc_once_attr 'limit flow_limit quantum initial_quantum maxrate \
+ buckets'
+ _tc_one_of_list 'pacing nopacing'
+ return 0
+ ;;
+ fq_codel)
+ _tc_once_attr 'limit flows target interval quantum'
+ _tc_one_of_list 'ecn noecn'
+ return 0
+ ;;
+ gred)
+ _tc_once_attr 'setup vqs default grio vq prio limit min max avpkt \
+ burst probability bandwidth'
+ return 0
+ ;;
+ hhf)
+ _tc_once_attr 'limit quantum hh_limit reset_timeout admit_bytes \
+ evict_timeout non_hh_weight'
+ return 0
+ ;;
+ mqprio)
+ _tc_once_attr 'num_tc map queues hw'
+ return 0
+ ;;
+ netem)
+ _tc_once_attr 'delay distribution corrupt duplicate loss ecn \
+ reorder rate'
+ return 0
+ ;;
+ pie)
+ _tc_once_attr 'limit target tupdate alpha beta'
+ _tc_one_of_list 'bytemode nobytemode'
+ _tc_one_of_list 'ecn noecn'
+ return 0
+ ;;
+ red)
+ _tc_once_attr 'limit min max avpkt burst adaptive probability \
+ bandwidth ecn harddrop'
+ return 0
+ ;;
+ rr|prio)
+ _tc_once_attr 'bands priomap multiqueue'
+ return 0
+ ;;
+ sfb)
+ _tc_once_attr 'rehash db limit max target increment decrement \
+ penalty_rate penalty_burst'
+ return 0
+ ;;
+ sfq)
+ _tc_once_attr 'limit perturb quantum divisor flows depth headdrop \
+ redflowlimit min max avpkt burst probability ecn harddrop'
+ return 0
+ ;;
+ tbf)
+ _tc_once_attr 'limit burst rate mtu peakrate latency overhead \
+ linklayer'
+ return 0
+ ;;
+ cbq)
+ _tc_once_attr 'bandwidth avpkt mpu cell ewma'
+ return 0
+ ;;
+ dsmark)
+ _tc_once_attr 'indices default_index set_tc_index'
+ return 0
+ ;;
+ hfsc)
+ _tc_once_attr 'default'
+ return 0
+ ;;
+ htb)
+ _tc_once_attr 'default r2q direct_qlen debug'
+ return 0
+ ;;
+ multiq|pfifo_fast|atm|drr|qfq)
+ return 0
+ ;;
+ esac
+ return 1
+}
+
+# Complete with options names for BPF filters or actions.
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_bpf_options()
+{
+ [[ ${words[${#words[@]}-3]} == object-file ]] && \
+ _tc_once_attr 'section export'
+ [[ ${words[${#words[@]}-5]} == object-file ]] && \
+ [[ ${words[${#words[@]}-3]} =~ (section|export) ]] && \
+ _tc_once_attr 'section export'
+ _tc_one_of_list 'bytecode bytecode-file object-file object-pinned'
+ _tc_once_attr 'verbose index direct-action action classid'
+ return 0
+}
+
+# Complete with options names for filters.
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_filter_options()
+{
+ case $1 in
+ basic)
+ _tc_once_attr 'match action classid'
+ return 0
+ ;;
+ bpf)
+ _tc_bpf_options
+ return 0
+ ;;
+ cgroup)
+ _tc_once_attr 'match action'
+ return 0
+ ;;
+ flow)
+ local i
+ for (( i=5; i < ${#words[@]}-1; i++ )); do
+ if [[ ${words[i]} =~ ^keys?$ ]]; then
+ _tc_direct_complete 'key'
+ COMPREPLY+=( $( compgen -W 'or and xor rshift addend' -- \
+ "$cur" ) )
+ break
+ fi
+ done
+ _tc_once_attr 'map hash divisor baseclass match action'
+ return 0
+ ;;
+ flower)
+ _tc_once_attr 'action classid indev dst_mac src_mac eth_type \
+ ip_proto dst_ip src_ip dst_port src_port'
+ return 0
+ ;;
+ fw)
+ _tc_once_attr 'action classid'
+ return 0
+ ;;
+ route)
+ _tc_one_of_list 'from fromif'
+ _tc_once_attr 'to classid action'
+ return 0
+ ;;
+ rsvp)
+ _tc_once_attr 'ipproto session sender classid action tunnelid \
+ tunnel flowlabel spi/ah spi/esp u8 u16 u32'
+ [[ ${words[${#words[@]}-3]} == tunnel ]] && \
+ COMPREPLY+=( $( compgen -W 'skip' -- "$cur" ) )
+ [[ ${words[${#words[@]}-3]} =~ u(8|16|32) ]] && \
+ COMPREPLY+=( $( compgen -W 'mask' -- "$cur" ) )
+ [[ ${words[${#words[@]}-3]} == mask ]] && \
+ COMPREPLY+=( $( compgen -W 'at' -- "$cur" ) )
+ return 0
+ ;;
+ tcindex)
+ _tc_once_attr 'hash mask shift classid action'
+ _tc_one_of_list 'pass_on fall_through'
+ return 0
+ ;;
+ u32)
+ _tc_once_attr 'match link classid action offset ht hashkey sample'
+ COMPREPLY+=( $( compgen -W 'ip ip6 udp tcp icmp u8 u16 u32 mark \
+ divisor' -- "$cur" ) )
+ return 0
+ ;;
+ esac
+ return 1
+}
+
+# Complete with options names for actions.
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_action_options()
+{
+ case $1 in
+ bpf)
+ _tc_bpf_options
+ return 0
+ ;;
+ mirred)
+ _tc_one_of_list 'ingress egress'
+ _tc_one_of_list 'mirror redirect'
+ _tc_once_attr 'index dev'
+ return 0
+ ;;
+ gact)
+ _tc_one_of_list 'reclassify drop continue pass'
+ _tc_once_attr 'random'
+ return 0
+ ;;
+ esac
+ return 1
+}
+
+# Complete with options names for exec.
+# Returns 0 if completion should stop after running this function, 1 otherwise.
+_tc_exec_options()
+{
+ case $1 in
+ import)
+ [[ ${words[${#words[@]}-3]} == import ]] && \
+ _tc_once_attr 'run'
+ return 0
+ ;;
+ graft)
+ COMPREPLY+=( $( compgen -W 'key type' -- "$cur" ) )
+ [[ ${words[${#words[@]}-3]} == object-file ]] && \
+ _tc_once_attr 'type'
+ _tc_bpf_options
+ return 0
+ ;;
+ esac
+ return 1
+}
+
+# Main completion function
+# Logic is as follows:
+# 1. Check if previous word is a global option; if so, propose arguments.
+# 2. Check if current word is a global option; if so, propose completion.
+# 3. Check for the presence of a main command (qdisc|class|filter|...). If
+# there is one, first call _tc_direct_complete to see if previous word is
+# waiting for a particular completion. If so, propose completion and exit.
+# 4. Extract main command and -- if available -- its subcommand
+# (add|delete|show|...).
+# 5. Propose completion based on main and sub- command in use. Additional
+# functions may be called for qdiscs, classes or filter options.
+_tc()
+{
+ local cur prev words cword
+ _init_completion || return
+
+ case $prev in
+ -V|-Version)
+ return 0
+ ;;
+ -b|-batch|-cf|-conf)
+ _filedir
+ return 0
+ ;;
+ -force)
+ COMPREPLY=( $( compgen -W '-batch' -- "$cur" ) )
+ return 0
+ ;;
+ -nm|-name)
+ [[ -r /etc/iproute2/tc_cls ]] || \
+ COMPREPLY=( $( compgen -W '-conf' -- "$cur" ) )
+ return 0
+ ;;
+ -n|-net|-netns)
+ local nslist=$( ip netns list 2>/dev/null )
+ COMPREPLY+=( $( compgen -W "$nslist" -- "$cur" ) )
+ return 0
+ ;;
+ -tshort)
+ _tc_once_attr '-statistics'
+ COMPREPLY+=( $( compgen -W 'monitor' -- "$cur" ) )
+ return 0
+ ;;
+ -timestamp)
+ _tc_once_attr '-statistics -tshort'
+ COMPREPLY+=( $( compgen -W 'monitor' -- "$cur" ) )
+ return 0
+ ;;
+ esac
+
+ # Search for main commands
+ local subcword cmd subcmd
+ for (( subcword=1; subcword < ${#words[@]}-1; subcword++ )); do
+ [[ ${words[subcword]} == -b?(atch) ]] && return 0
+ [[ -n $cmd ]] && subcmd=${words[subcword]} && break
+ [[ ${words[subcword]} != -* && \
+ ${words[subcword-1]} != -@(n?(et?(ns))|c?(on)f) ]] && \
+ cmd=${words[subcword]}
+ done
+
+ if [[ -z $cmd ]]; then
+ case $cur in
+ -*)
+ local c='-Version -statistics -details -raw -pretty \
+ -iec -graphe -batch -name -netns -timestamp'
+ [[ $cword -eq 1 ]] && c+=' -force'
+ COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
+ return 0
+ ;;
+ *)
+ COMPREPLY=( $( compgen -W "help $( tc help 2>&1 | \
+ command sed \
+ -e '/OBJECT := /!d' \
+ -e 's/.*{//' \
+ -e 's/}.*//' \
+ -e 's/|//g' )" -- "$cur" ) )
+ return 0
+ ;;
+ esac
+ fi
+
+ [[ $subcmd == help ]] && return 0
+
+ # For this set of commands we may create COMPREPLY just by analysing the
+ # previous word, if it expects a specific list of options or values.
+ if [[ $cmd =~ (qdisc|class|filter|action|exec) ]]; then
+ _tc_direct_complete $prev && return 0
+ if [[ ${words[${#words[@]}-3]} == estimator ]]; then
+ local list=$( _tc_expand_units 'secs' 'msecs' 'usecs' )
+ COMPREPLY+=( $( compgen -W "$list" -- "$cur" ) ) && return 0
+ fi
+ fi
+
+ # Completion depends on main command and subcommand in use.
+ case $cmd in
+ qdisc)
+ case $subcmd in
+ add|change|replace|link|del|delete)
+ if [[ $(($cword-$subcword)) -eq 1 ]]; then
+ COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+ return 0
+ fi
+ local qdisc qdwd QDISC_KIND=' choke codel bfifo pfifo \
+ pfifo_head_drop fq fq_codel gred hhf mqprio multiq \
+ netem pfifo_fast pie red rr sfb sfq tbf atm cbq drr \
+ dsmark hfsc htb prio qfq '
+ for ((qdwd=$subcword; qdwd < ${#words[@]}-1; qdwd++)); do
+ if [[ $QDISC_KIND =~ ' '${words[qdwd]}' ' ]]; then
+ qdisc=${words[qdwd]}
+ _tc_qdisc_options $qdisc && return 0
+ fi
+ done
+ _tc_one_of_list $QDISC_KIND
+ _tc_one_of_list 'root ingress parent clsact'
+ _tc_once_attr 'handle estimator stab'
+ ;;
+ show)
+ _tc_once_attr 'dev'
+ _tc_one_of_list 'ingress clsact'
+ _tc_once_attr '-statistics -details -raw -pretty -iec \
+ -graph -name'
+ ;;
+ help)
+ return 0
+ ;;
+ *)
+ [[ $cword -eq $subcword ]] && \
+ COMPREPLY=( $( compgen -W 'help add delete change \
+ replace link show' -- "$cur" ) )
+ ;;
+ esac
+ ;;
+
+ class)
+ case $subcmd in
+ add|change|replace|del|delete)
+ if [[ $(($cword-$subcword)) -eq 1 ]]; then
+ COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+ return 0
+ fi
+ local qdisc qdwd QDISC_KIND=' choke codel bfifo pfifo \
+ pfifo_head_drop fq fq_codel gred hhf mqprio multiq \
+ netem pfifo_fast pie red rr sfb sfq tbf atm cbq drr \
+ dsmark hfsc htb prio qfq '
+ for ((qdwd=$subcword; qdwd < ${#words[@]}-1; qdwd++)); do
+ if [[ $QDISC_KIND =~ ' '${words[qdwd]}' ' ]]; then
+ qdisc=${words[qdwd]}
+ _tc_qdisc_options $qdisc && return 0
+ fi
+ done
+ _tc_one_of_list $QDISC_KIND
+ _tc_one_of_list 'root parent'
+ _tc_once_attr 'classid'
+ ;;
+ show)
+ _tc_once_attr 'dev'
+ _tc_one_of_list 'root parent'
+ _tc_once_attr '-statistics -details -raw -pretty -iec \
+ -graph -name'
+ ;;
+ help)
+ return 0
+ ;;
+ *)
+ [[ $cword -eq $subcword ]] && \
+ COMPREPLY=( $( compgen -W 'help add delete change \
+ replace show' -- "$cur" ) )
+ ;;
+ esac
+ ;;
+
+ filter)
+ case $subcmd in
+ add|change|replace|del|delete)
+ if [[ $(($cword-$subcword)) -eq 1 ]]; then
+ COMPREPLY=( $( compgen -W 'dev' -- "$cur" ) )
+ return 0
+ fi
+ local filter fltwd FILTER_KIND=' basic bpf cgroup flow \
+ flower fw route rsvp tcindex u32 '
+ for ((fltwd=$subcword; fltwd < ${#words[@]}-1; fltwd++));
+ do
+ if [[ $FILTER_KIND =~ ' '${words[fltwd]}' ' ]]; then
+ filter=${words[fltwd]}
+ _tc_filter_options $filter && return 0
+ fi
+ done
+ _tc_one_of_list $FILTER_KIND
+ _tc_one_of_list 'root ingress egress parent'
+ _tc_once_attr 'handle estimator pref protocol'
+ ;;
+ show)
+ _tc_once_attr 'dev'
+ _tc_one_of_list 'root ingress egress parent'
+ _tc_once_attr '-statistics -details -raw -pretty -iec \
+ -graph -name'
+ ;;
+ help)
+ return 0
+ ;;
+ *)
+ [[ $cword -eq $subcword ]] && \
+ COMPREPLY=( $( compgen -W 'help add delete change \
+ replace show' -- "$cur" ) )
+ ;;
+ esac
+ ;;
+
+ action)
+ case $subcmd in
+ add|change|replace)
+ local action acwd ACTION_KIND=' gact mirred bpf '
+ for ((acwd=$subcword; acwd < ${#words[@]}-1; acwd++)); do
+ if [[ $ACTION_KIND =~ ' '${words[acwd]}' ' ]]; then
+ action=${words[acwd]}
+ _tc_action_options $action && return 0
+ fi
+ done
+ _tc_one_of_list $ACTION_KIND
+ ;;
+ get|del|delete)
+ _tc_once_attr 'index'
+ ;;
+ lst|list|flush|show)
+ _tc_one_of_list $ACTION_KIND
+ ;;
+ *)
+ [[ $cword -eq $subcword ]] && \
+ COMPREPLY=( $( compgen -W 'help add delete change \
+ replace show list flush action' -- "$cur" ) )
+ ;;
+ esac
+ ;;
+
+ monitor)
+ COMPREPLY=( $( compgen -W 'help' -- "$cur" ) )
+ ;;
+
+ exec)
+ case $subcmd in
+ bpf)
+ local excmd exwd EXEC_KIND=' import debug graft '
+ for ((exwd=$subcword; exwd < ${#words[@]}-1; exwd++)); do
+ if [[ $EXEC_KIND =~ ' '${words[exwd]}' ' ]]; then
+ excmd=${words[exwd]}
+ _tc_exec_options $excmd && return 0
+ fi
+ done
+ _tc_one_of_list $EXEC_KIND
+ ;;
+ *)
+ [[ $cword -eq $subcword ]] && \
+ COMPREPLY=( $( compgen -W 'bpf' -- "$cur" ) )
+ ;;
+ esac
+ ;;
+ esac
+} &&
+complete -F _tc tc
+
+# ex: ts=4 sw=4 et filetype=sh
--
2.1.4
* Re: [PATCH net-next] drivers/net: add 6WIND SHULTI support
From: Nicolas Dichtel @ 2016-04-29 8:48 UTC (permalink / raw)
To: David Miller; +Cc: jiri, fw, netdev
In-Reply-To: <20160428.115419.1831238411921569629.davem@davemloft.net>
Le 28/04/2016 17:54, David Miller a écrit :
[snip]
> You can say whatever you want, but the facilities you are adding to
> this driver enables proprietary userland SDK components.
>
> And this is precisely what we are trying to avoid by having a clean,
> fully featured switch device model in the kernel.
>
> It is against your interests of upstreaming your driver to continue
> denying what your changes facilitate.
Ok, I will rework this patch to remove the controversial parts and custom APIs.
Thank you,
Nicolas
* [PATCH net] cxgb3: fix out of bounds read
From: Michal Schmidt @ 2016-04-29 9:06 UTC (permalink / raw)
To: netdev; +Cc: Santosh Raspatur, Jan Stancek
An out of bounds read of 2 bytes was discovered in cxgb3 with KASAN.
t3_config_rss() expects both arrays it gets as parameters to have
terminators. setup_rss(), the caller, forgets to add a terminator to
one of the arrays. Thankfully the iteration in t3_config_rss() stops
anyway, but in the last iteration the check for the terminator
is an out of bounds read.
Add the missing terminator to rspq_map[].
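The access pattern described above can be sketched in userspace as follows. This is a simplified model, not the driver code: `TABLE_SIZE`, `walk_map()` and `highest_index_read()` are hypothetical names, and the loop only assumes what the commit message states, namely that the consumer checks the next entry for a terminator after using the current one.

```c
#include <stddef.h>
#include <stdint.h>

#define TABLE_SIZE 8          /* hypothetical; stands in for RSS_TABLE_SIZE */
#define TERMINATOR 0xffff

/*
 * Consume TABLE_SIZE entries from a terminated map, wrapping to the start
 * whenever the next entry is the terminator. Returns the highest index
 * that was read, including the reads done by the terminator check.
 */
size_t walk_map(const uint16_t *map)
{
    size_t idx = 0, highest = 0;

    for (size_t i = 0; i < TABLE_SIZE; ++i) {
        uint16_t entry = map[idx++];   /* use the current entry */
        (void)entry;

        if (idx > highest)
            highest = idx;             /* the check below reads map[idx] */
        if (map[idx] == TERMINATOR)
            idx = 0;
    }
    return highest;
}

/* Build a map with room for the terminator and walk it. */
size_t highest_index_read(void)
{
    uint16_t map[TABLE_SIZE + 1];      /* one extra slot, as in the fix */

    for (size_t i = 0; i < TABLE_SIZE; ++i)
        map[i] = (uint16_t)(i % 4);
    map[TABLE_SIZE] = TERMINATOR;

    return walk_map(map);
}
```

In this model the last iteration inspects index `TABLE_SIZE`, so an array declared with only `TABLE_SIZE` elements is read one element past its end even though the loop itself terminates.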
Reported-by: Jan Stancek <jstancek@redhat.com>
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
---
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
index 60908eab3b..43da891fab 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c
@@ -576,7 +576,7 @@ static void setup_rss(struct adapter *adap)
unsigned int nq0 = adap2pinfo(adap, 0)->nqsets;
unsigned int nq1 = adap->port[1] ? adap2pinfo(adap, 1)->nqsets : 1;
u8 cpus[SGE_QSETS + 1];
- u16 rspq_map[RSS_TABLE_SIZE];
+ u16 rspq_map[RSS_TABLE_SIZE + 1];
for (i = 0; i < SGE_QSETS; ++i)
cpus[i] = i;
@@ -586,6 +586,7 @@ static void setup_rss(struct adapter *adap)
rspq_map[i] = i % nq0;
rspq_map[i + RSS_TABLE_SIZE / 2] = (i % nq1) + nq0;
}
+ rspq_map[RSS_TABLE_SIZE] = 0xffff; /* terminator */
t3_config_rss(adap, F_RQFEEDBACKENABLE | F_TNLLKPEN | F_TNLMAPEN |
F_TNLPRTEN | F_TNL2TUPEN | F_TNL4TUPEN |
--
2.7.4
* Re: [PATCH RFT v2 2/2] macb: kill PHY reset code
From: Nicolas Ferre @ 2016-04-29 9:36 UTC (permalink / raw)
To: Sergei Shtylyov, netdev; +Cc: linux-kernel
In-Reply-To: <2024144.Eda96jDyaz@wasted.cogentembedded.com>
Le 29/04/2016 00:15, Sergei Shtylyov a écrit :
> With the 'phylib' now being aware of the "reset-gpios" PHY node property,
> there should be no need to frob the PHY reset in this driver anymore...
>
> Signed-off-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com>
So I'll queue my DT patch through arm-soc.
Thanks Sergei, bye.
> ---
> drivers/net/ethernet/cadence/macb.c | 17 -----------------
> drivers/net/ethernet/cadence/macb.h | 1 -
> 2 files changed, 18 deletions(-)
>
> Index: net-next/drivers/net/ethernet/cadence/macb.c
> ===================================================================
> --- net-next.orig/drivers/net/ethernet/cadence/macb.c
> +++ net-next/drivers/net/ethernet/cadence/macb.c
> @@ -2884,7 +2884,6 @@ static int macb_probe(struct platform_de
> = macb_clk_init;
> int (*init)(struct platform_device *) = macb_init;
> struct device_node *np = pdev->dev.of_node;
> - struct device_node *phy_node;
> const struct macb_config *macb_config = NULL;
> struct clk *pclk, *hclk = NULL, *tx_clk = NULL;
> unsigned int queue_mask, num_queues;
> @@ -2977,18 +2976,6 @@ static int macb_probe(struct platform_de
> else
> macb_get_hwaddr(bp);
>
> - /* Power up the PHY if there is a GPIO reset */
> - phy_node = of_get_next_available_child(np, NULL);
> - if (phy_node) {
> - int gpio = of_get_named_gpio(phy_node, "reset-gpios", 0);
> -
> - if (gpio_is_valid(gpio)) {
> - bp->reset_gpio = gpio_to_desc(gpio);
> - gpiod_direction_output(bp->reset_gpio, 1);
> - }
> - }
> - of_node_put(phy_node);
> -
> err = of_get_phy_mode(np);
> if (err < 0) {
> pdata = dev_get_platdata(&pdev->dev);
> @@ -3054,10 +3041,6 @@ static int macb_remove(struct platform_d
> mdiobus_unregister(bp->mii_bus);
> mdiobus_free(bp->mii_bus);
>
> - /* Shutdown the PHY if there is a GPIO reset */
> - if (bp->reset_gpio)
> - gpiod_set_value(bp->reset_gpio, 0);
> -
> unregister_netdev(dev);
> clk_disable_unprepare(bp->tx_clk);
> clk_disable_unprepare(bp->hclk);
> Index: net-next/drivers/net/ethernet/cadence/macb.h
> ===================================================================
> --- net-next.orig/drivers/net/ethernet/cadence/macb.h
> +++ net-next/drivers/net/ethernet/cadence/macb.h
> @@ -832,7 +832,6 @@ struct macb {
> unsigned int dma_burst_length;
>
> phy_interface_t phy_interface;
> - struct gpio_desc *reset_gpio;
>
> /* AT91RM9200 transmit */
> struct sk_buff *skb; /* holds skb until xmit interrupt completes */
>
>
--
Nicolas Ferre
* Re: FWD: [PATCH v2] Marvell phy: add fiber status check for some components
From: Charles-Antoine Couret @ 2016-04-29 8:28 UTC (permalink / raw)
To: Florian Fainelli, Andrew Lunn; +Cc: netdev
In-Reply-To: <570BFF50.6060909@gmail.com>
Le 11/04/2016 à 21:47, Florian Fainelli a écrit :
>> Do we actually need to stay on page 1 if fibre is in use? How do we
>> initially change to page 1 when the fibre link is still down?
>
> I also do not feel very comfortable with reading the fiber status first,
> and then copper and then combine these two. At the very best, could we
> do something like:
>
> - identify if the PHY is configured for fiber in drv->probe or
> drv->config_init, retain that information
But how would the user's choice be configured at runtime? By adding a driver option?
Otherwise, a default choice would have to be set in the init/probe function, and the user would change it through ethtool or by recompiling the driver.
> - have two paths in drv->read_status which take care of reading one
> status or the other?
I worked on a solution along those lines.
But the Marvell's datasheet seems to agree with my previous method:
Extract:
"Notes on Determining which Media Linked Up
Since there are two sets of IEEE registers (one for copper and the other for fiber) the software needs
to be aware of register 22.7:0 so that the correct set of registers are selected. In general the
sequence is as follows.
1-Set the Auto-Negotiation registers of the copper medium. (This step may not be necessary if the
hardware configuration defaults are acceptable).
2-Set the Auto-Negotiation registers of the fiber medium. (This step may not be necessary if the
hardware configuration defaults are acceptable).
3-Poll for link status. Go to step 4 if there is link.
4-Once there is link determine whether the link is copper or fiber medium.
5-Look at the Auto-Negotiation results for the medium that established link.
6-Poll for link status. If link status goes down then go back to step 3."
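Steps 3 to 5 of the quoted sequence could be sketched like this. It is only an illustration: `struct page_status`, `resolve_medium()` and the field names are hypothetical, the register reads themselves are left out, and checking fiber before copper is an assumption of the sketch rather than something the datasheet extract prescribes.

```c
enum medium { MEDIUM_NONE, MEDIUM_COPPER, MEDIUM_FIBER };

/* Hypothetical snapshot of one register page's status after it was read. */
struct page_status {
    int link_up;      /* non-zero when that medium reports link */
    int speed_mbps;   /* auto-negotiation result for that medium */
};

/*
 * If either medium has link, report which one and hand back its
 * negotiation result. Fiber is checked first here; that ordering is an
 * assumption of this sketch.
 */
enum medium resolve_medium(const struct page_status *copper,
                           const struct page_status *fiber,
                           int *speed_mbps)
{
    if (fiber->link_up) {
        *speed_mbps = fiber->speed_mbps;
        return MEDIUM_FIBER;
    }
    if (copper->link_up) {
        *speed_mbps = copper->speed_mbps;
        return MEDIUM_COPPER;
    }
    *speed_mbps = 0;
    return MEDIUM_NONE;   /* no link yet: the caller keeps polling (step 3) */
}
```

A caller would fill one `page_status` per register page on each poll and go back to polling whenever `MEDIUM_NONE` is returned.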
By default, the PHY is configured to connect on either interface, without any preference: whichever medium establishes link first is selected.
A preference can be configured, but it is not a forced mode.
Extract:
"Preferred Media
The device can be programmed to give one media priority over the other. In other words if the
non-preferred media establishes link first and subsequently energy is detected on the preferred
media, the PHY will drop link on the non-preferred media for 4 seconds and give the preferred media
a chance to establish link."
There is no register that tells us directly whether copper or fiber is selected. We can only find out by checking the status registers, as my commit does.
>> Should we be using the old mechanism to swap between TP, BNC and AUI
>> to swap between copper and fibre?
>
> Did you mean using ethtool -s <iface> port fibre for instance?
So, to implement that, I would have to re-enable the port change in the phy_ethtool_sset() function.
But the datasheet seems to disagree with this method.
So, what should I do: continue with my previous approach, or follow your suggestion?
I would prefer to follow the datasheet, but if you would rather go the other way, I will implement that.
Thank you in advance and have a nice day.
Regards,
Charles-Antoine Couret
* Re: [PATCH] mdio_bus: Fix MDIO bus scanning in __mdiobus_register()
From: Sergei Shtylyov @ 2016-04-29 11:18 UTC (permalink / raw)
To: Marek Vasut, netdev
Cc: Arnd Bergmann, David S . Miller, Dinh Nguyen, Florian Fainelli
In-Reply-To: <1461892155-10524-1-git-send-email-marex@denx.de>
Hello.
First of all, thank you for the patch!
You beat me to it (and not only me). :-)
On 4/29/2016 4:09 AM, Marek Vasut wrote:
> Since commit b74766a0a0feeef5c779709cc5d109451c0d5b17 in linux-next,
> ( phylib: don't return NULL from get_phy_device() ), get_phy_device()
scripts/checkpatch.pl now enforces a certain commit citing style; yours
doesn't quite match it.
> will return ERR_PTR(-ENODEV) instead of NULL if the PHY device ID is
> all ones.
>
> This causes problem with stmmac driver and likely some other drivers
> which call mdiobus_register(). I triggered this bug on SoCFPGA MCVEVK
> board with linux-next 20160427 and 20160428. In case of the stmmac, if
> there is no PHY node specified in the DT for the stmmac block, the stmmac
> driver ( drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c function
> stmmac_mdio_register() ) will call mdiobus_register() , which will
> register the MDIO bus and probe for the PHY.
>
> The mdiobus_register() resp. __mdiobus_register() iterates over all of
> the addresses on the MDIO bus and calls mdiobus_scan() for each of them,
> which invokes get_phy_device(). Before the aforementioned patch, the
> mdiobus_scan() would return NULL if no PHY was found on a given address
> and mdiobus_register() would continue and try the next PHY address. Now,
> mdiobus_scan() returns ERR_PTR(-ENODEV), which is caught by the
> 'if (IS_ERR(phydev))' condition and the loop exits immediately if the
> PHY address does not contain a PHY.
>
> Repair this by explicitly checking for the ERR_PTR(-ENODEV) and if this
> error comes around, continue with the next PHY address.
>
> Signed-off-by: Marek Vasut <marex@denx.de>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: David S. Miller <davem@davemloft.net>
> Cc: Dinh Nguyen <dinguyen@opensource.altera.com>
> Cc: Florian Fainelli <f.fainelli@gmail.com>
> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
> ---
> drivers/net/phy/mdio_bus.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> NOTE: I don't quite like this explicit check, but I don't have a better idea now.
It's fine. I was going to do just the same :-)
> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
> index 499003ee..388f992 100644
> --- a/drivers/net/phy/mdio_bus.c
> +++ b/drivers/net/phy/mdio_bus.c
> @@ -333,7 +333,7 @@ int __mdiobus_register(struct mii_bus *bus, struct module *owner)
> struct phy_device *phydev;
>
> phydev = mdiobus_scan(bus, i);
> - if (IS_ERR(phydev)) {
> + if (IS_ERR(phydev) && (PTR_ERR(phydev) != -ENODEV)) {
Parens around the second operand of && are not really needed though...
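For readers outside the kernel tree, the scan loop under discussion can be modelled in userspace as below. This is a sketch under stated assumptions: `err_ptr()`, `is_err()` and `ptr_err()` are lowercase stand-ins that approximate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() helpers, and `fake_scan()`, `scan_bus()` and `struct fake_phy` are hypothetical. The condition is written without the extra parentheses, since `!=` binds tighter than `&&` in C.

```c
#include <errno.h>
#include <stdint.h>

#define MAX_ERRNO 4095
#define NUM_ADDRS 32          /* hypothetical; stands in for PHY_MAX_ADDR */

/* Userspace approximations of the kernel's error-pointer helpers. */
static inline void *err_ptr(long error) { return (void *)(intptr_t)error; }
static inline long ptr_err(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct fake_phy { int addr; };

static struct fake_phy phys[NUM_ADDRS];

/*
 * Hypothetical bus scan: only address 'present' holds a PHY, address
 * 'broken' fails with a real error, every other address is empty.
 */
struct fake_phy *fake_scan(int addr, int present, int broken)
{
    if (addr == broken)
        return err_ptr(-EIO);
    if (addr != present)
        return err_ptr(-ENODEV);
    phys[addr].addr = addr;
    return &phys[addr];
}

/* Returns the number of PHYs found, or a negative errno on a real error. */
int scan_bus(int present, int broken)
{
    int found = 0;

    for (int i = 0; i < NUM_ADDRS; i++) {
        struct fake_phy *phy = fake_scan(i, present, broken);

        /* An empty address (-ENODEV) is not fatal: keep scanning. */
        if (is_err(phy) && ptr_err(phy) != -ENODEV)
            return (int)ptr_err(phy);
        if (!is_err(phy))
            found++;
    }
    return found;
}
```

With a bare `is_err(phy)` check, the loop in this model would bail out at the first empty address, which is the regression the patch describes.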
[...]
MBR, Sergei
* Re: [PATCH v6 2/6] Documentation: Bindings: Add STM32 DWMAC glue
From: Maxime Coquelin @ 2016-04-29 11:22 UTC (permalink / raw)
To: Rob Herring, Arnd Bergmann
Cc: Giuseppe Cavallaro, netdev-u79uwXL29TY76Z2rM5mHXA,
devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Alexandre TORGUE,
linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r@public.gmane.org,
linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
Joachim Eastwood, Chen-Yu Tsai
In-Reply-To: <20160428205952.GA11405@rob-hp-laptop>
2016-04-28 22:59 GMT+02:00 Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>:
> On Mon, Apr 25, 2016 at 01:53:58PM +0200, Alexandre TORGUE wrote:
>> Signed-off-by: Alexandre TORGUE <alexandre.torgue-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
>
> Acked-by: Rob Herring <robh-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>
Thanks Rob!
Arnd, I only have patches 4, 5 and 6 of this series for stm32 (2 DT,
one defconfig) for v4.7.
Should I send a pull request, or can I send the patches directly to
arm-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org so that
you can apply them directly?
Thanks in advance,
Maxime
* RE: [PATCH] tipc: Only process unicast on intended node
From: Jon Maloy @ 2016-04-29 11:08 UTC (permalink / raw)
To: Hamish Martin, netdev@vger.kernel.org
In-Reply-To: <1461893718-17404-1-git-send-email-hamish.martin@alliedtelesis.co.nz>
Not acked. I will post an updated version of this later today to 'net'.
///jon
> -----Original Message-----
> From: Hamish Martin [mailto:hamish.martin@alliedtelesis.co.nz]
> Sent: Thursday, 28 April, 2016 21:35
> To: Jon Maloy; netdev@vger.kernel.org
> Cc: Hamish Martin
> Subject: [PATCH] tipc: Only process unicast on intended node
>
> We have observed complete lock up of broadcast-link transmission due to
> unacknowledged packets never being removed from the 'transmq' queue. This
> is traced to nodes having their ack field set beyond the sequence number
> of packets that have actually been transmitted to them.
> Consider an example where node 1 has sent 10 packets to node 2 on a
> link and node 3 has sent 20 packets to node 2 on another link. We
> see examples of an ack from node 2 destined for node 3 being treated as
> an ack from node 2 at node 1. This leads to the ack on the node 1 to node
> 2 link being increased to 20 even though we have only sent 10 packets.
> When node 1 does get around to sending further packets, none of the
> packets with sequence numbers less than 21 are actually removed from the
> transmq.
> To resolve this we reinstate some code lost in commit d999297c3dbb ("tipc:
> reduce locking scope during packet reception") which ensures that only
> messages destined for the receiving node are processed by that node. This
> prevents the sequence numbers from getting out of sync and resolves the
> packet leakage, thereby resolving the broadcast-link transmission
> lock-ups we observed.
>
> Signed-off-by: Hamish Martin <hamish.martin@alliedtelesis.co.nz>
> Reviewed-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
> Reviewed-by: John Thompson <john.thompson@alliedtelesis.co.nz>
> ---
> net/tipc/node.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/tipc/node.c b/net/tipc/node.c
> index ace178fd3850..e5dda495d4b6 100644
> --- a/net/tipc/node.c
> +++ b/net/tipc/node.c
> @@ -1460,6 +1460,11 @@ void tipc_rcv(struct net *net, struct sk_buff *skb,
> struct tipc_bearer *b)
> return tipc_node_bc_rcv(net, skb, bearer_id);
> }
>
> + /* Discard unicast link messages destined for another node */
> + if (unlikely(!msg_short(hdr) &&
> + (msg_destnode(hdr) != tipc_own_addr(net))))
> + goto discard;
> +
> /* Locate neighboring node that sent packet */
> n = tipc_node_find(net, msg_prevnode(hdr));
> if (unlikely(!n))
> --
> 2.8.1
* Re: [PATCH] mdio_bus: Fix MDIO bus scanning in __mdiobus_register()
From: Marek Vasut @ 2016-04-29 11:45 UTC (permalink / raw)
To: Sergei Shtylyov, netdev
Cc: Arnd Bergmann, David S . Miller, Dinh Nguyen, Florian Fainelli
In-Reply-To: <86212055-7d15-ab11-4998-72c04af2426d@cogentembedded.com>
On 04/29/2016 01:18 PM, Sergei Shtylyov wrote:
> Hello.
Hi!
> First of all, thank you for the patch!
> You beat me to it (and not only me). :-)
Heh, hacking at night has its perks :)
> On 4/29/2016 4:09 AM, Marek Vasut wrote:
>
>> Since commit b74766a0a0feeef5c779709cc5d109451c0d5b17 in linux-next,
>> ( phylib: don't return NULL from get_phy_device() ), get_phy_device()
>
> scripts/checkpatch.pl now enforces certain commit citing style, yours
> doesn't quite match it.
Ha, I didn't know that checkpatch can now warn about this too, nice. Is
that in next already? I just tried checkpatch and it doesn't warn about it.
Anyway, regarding this format, do you want a V2? Originally, I had the
full commit info in the message, but that was just taking up space, and
the commit is not the important part of the message, so I trimmed
it down.
>> will return ERR_PTR(-ENODEV) instead of NULL if the PHY device ID is
>> all ones.
>>
>> This causes problem with stmmac driver and likely some other drivers
>> which call mdiobus_register(). I triggered this bug on SoCFPGA MCVEVK
>> board with linux-next 20160427 and 20160428. In case of the stmmac, if
>> there is no PHY node specified in the DT for the stmmac block, the stmmac
>> driver ( drivers/net/ethernet/stmicro/stmmac/stmmac_mdio.c function
>> stmmac_mdio_register() ) will call mdiobus_register() , which will
>> register the MDIO bus and probe for the PHY.
>>
>> The mdiobus_register() resp. __mdiobus_register() iterates over all of
>> the addresses on the MDIO bus and calls mdiobus_scan() for each of them,
>> which invokes get_phy_device(). Before the aforementioned patch, the
>> mdiobus_scan() would return NULL if no PHY was found on a given address
>> and mdiobus_register() would continue and try the next PHY address. Now,
>> mdiobus_scan() returns ERR_PTR(-ENODEV), which is caught by the
>> 'if (IS_ERR(phydev))' condition and the loop exits immediately if the
>> PHY address does not contain a PHY.
>>
>> Repair this by explicitly checking for the ERR_PTR(-ENODEV) and if this
>> error comes around, continue with the next PHY address.
>>
>> Signed-off-by: Marek Vasut <marex@denx.de>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: David S. Miller <davem@davemloft.net>
>> Cc: Dinh Nguyen <dinguyen@opensource.altera.com>
>> Cc: Florian Fainelli <f.fainelli@gmail.com>
>> Cc: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>
> Reviewed-by: Sergei Shtylyov <sergei.shtylyov@cogentembedded.com>
>
>> ---
>> drivers/net/phy/mdio_bus.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> NOTE: I don't quite like this explicit check, but I don't have a better
>> idea now.
>
> It's fine. I was going to do just the same :-)
OK, I'm glad I'm not alone on this one :)
>> diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
>> index 499003ee..388f992 100644
>> --- a/drivers/net/phy/mdio_bus.c
>> +++ b/drivers/net/phy/mdio_bus.c
>> @@ -333,7 +333,7 @@ int __mdiobus_register(struct mii_bus *bus, struct
>> module *owner)
>> struct phy_device *phydev;
>>
>> phydev = mdiobus_scan(bus, i);
>> - if (IS_ERR(phydev)) {
>> + if (IS_ERR(phydev) && (PTR_ERR(phydev) != -ENODEV)) {
>
> Parens around the second operand of && are not really needed though...
While I agree, I also prefer to make things obvious when reading the
code by adding the parentheses. It's a matter of taste, I think. Just let
me know if I should spin a V2 without them :)
Thanks for the review!
> [...]
>
> MBR, Sergei
>
--
Best regards,
Marek Vasut
* Re: [PATCH v2] net: macb: do not scan PHYs manually
From: Josh Cartwright @ 2016-04-29 12:25 UTC (permalink / raw)
To: Andrew Lunn
Cc: Nathan Sullivan, Nicolas Ferre, netdev, linux-kernel,
Florian Fainelli, Alexandre Belloni
In-Reply-To: <20160429003459.GC30217@jcartwri.amer.corp.natinst.com>
On Thu, Apr 28, 2016 at 07:34:59PM -0500, Josh Cartwright wrote:
> On Thu, Apr 28, 2016 at 11:23:15PM +0200, Andrew Lunn wrote:
> > On Thu, Apr 28, 2016 at 04:03:57PM -0500, Josh Cartwright wrote:
> > > On Thu, Apr 28, 2016 at 08:59:32PM +0200, Andrew Lunn wrote:
> > > > On Thu, Apr 28, 2016 at 01:55:27PM -0500, Nathan Sullivan wrote:
> > > > > On Thu, Apr 28, 2016 at 08:43:03PM +0200, Andrew Lunn wrote:
> > > > > > > I agree that is a valid fix for AT91, however it won't solve our problem, since
> > > > > > > we have no children on the second ethernet MAC in our devices' device trees. I'm
> > > > > > > starting to feel like our second MAC shouldn't even really register the MDIO bus
> > > > > > > since it isn't being used - maybe adding a DT property to not have a bus is a
> > > > > > > better option?
> > > > > >
> > > > > > status = "disabled"
> > > > > >
> > > > > > would be the usual way.
> > > > > >
> > > > > > Andrew
> > > > >
> > > > > Oh, sorry, I meant we use both MACs on Zynq, however the PHYs are on the MDIO
> > > > > bus of the first MAC. So, the second MAC is used for ethernet but not for MDIO,
> > > > > and so it does not have any PHYs under its DT node. It would be nice if there
> > > > > were a way to tell macb not to bother with MDIO for the second MAC, since that's
> > > > > handled by the first MAC.
> > > >
> > > > Yes, exactly, add support for status = "disabled" in the mdio node.
> > >
> > > Unfortunately, the 'macb' doesn't have a "mdio node", or alternatively:
> > > the node representing the mdio bus is the same node which represents the
> > > macb instance itself. Setting 'status = "disabled"' on this node will
> > > just prevent the probing of the macb instance.
> >
> > :-(
> >
> > It is very common to have an mdio node within the MAC node, for example imx6sx-sdb.dtsi
>
> Okay, I think that makes sense. I think, then, perhaps the solution to
> our problem is to:
>
> 1. Modify the macb driver to support an 'mdio' node. (And adjust the
> binding document accordingly). If the node is found, it's used for
> of_mdiobus_register() w/o any of the manual scan madness.
> 2. For backwards compatibility, in the case where an 'mdio' node does
> not exist, leave the existing behavior the way it is now
> (of_mdiobus_register() followed by manual scan) [perhaps warn of
> deprecation as well?]
> 3. Update binding docs to reflect the above.
>
> In this way, for our usecase, the 'status = "disabled"' in the newly
> created 'mdio' node isn't necessary. It's sufficient for the node to
> exist and be empty.
Here's an (only build-tested) attempt at implementing part of this.
macb_mii_init() was getting complicated enough that I lifted out two
helper functions for the dt/no-dt cases. Sweeping the in-tree
devicetrees to update them to place phys under an 'mdio' node is still
to be done.
Josh
diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
index eec3200..d843bc9 100644
--- a/drivers/net/ethernet/cadence/macb.c
+++ b/drivers/net/ethernet/cadence/macb.c
@@ -419,11 +419,62 @@ static int macb_mii_probe(struct net_device *dev)
return 0;
}
+static int macb_mii_of_init(struct macb *bp, struct device_node *np)
+{
+ struct device_node *mdio;
+ int err, i;
+
+ mdio = of_get_child_by_name(np, "mdio");
+ if (mdio)
+ return of_mdiobus_register(bp->mii_bus, mdio);
+
+ dev_warn(&bp->pdev->dev,
+ "using deprecated PHY probing mechanism. Please update device tree.");
+
+ /* try dt phy registration */
+ err = of_mdiobus_register(bp->mii_bus, np);
+ if (err)
+ return err;
+
+ /* fallback to standard phy registration if no phy were
+ * found during dt phy registration
+ */
+ if (!phy_find_first(bp->mii_bus)) {
+ for (i = 0; i < PHY_MAX_ADDR; i++) {
+ struct phy_device *phydev;
+
+ phydev = mdiobus_scan(bp->mii_bus, i);
+ if (IS_ERR(phydev)) {
+ err = PTR_ERR(phydev);
+ break;
+ }
+ }
+
+ if (err)
+ goto err_out_unregister_bus;
+ }
+
+ return err;
+
+err_out_unregister_bus:
+ mdiobus_unregister(bp->mii_bus);
+ return err;
+}
+
+static int macb_mii_pdata_init(struct macb *bp,
+ struct macb_platform_data *pdata)
+{
+ if (pdata)
+ bp->mii_bus->phy_mask = pdata->phy_mask;
+
+ return mdiobus_register(bp->mii_bus);
+}
+
static int macb_mii_init(struct macb *bp)
{
struct macb_platform_data *pdata;
struct device_node *np;
- int err = -ENXIO, i;
+ int err = -ENXIO;
/* Enable management port */
macb_writel(bp, NCR, MACB_BIT(MPE));
@@ -446,33 +497,10 @@ static int macb_mii_init(struct macb *bp)
dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
np = bp->pdev->dev.of_node;
- if (np) {
- /* try dt phy registration */
- err = of_mdiobus_register(bp->mii_bus, np);
-
- /* fallback to standard phy registration if no phy were
- * found during dt phy registration
- */
- if (!err && !phy_find_first(bp->mii_bus)) {
- for (i = 0; i < PHY_MAX_ADDR; i++) {
- struct phy_device *phydev;
-
- phydev = mdiobus_scan(bp->mii_bus, i);
- if (IS_ERR(phydev)) {
- err = PTR_ERR(phydev);
- break;
- }
- }
-
- if (err)
- goto err_out_unregister_bus;
- }
- } else {
- if (pdata)
- bp->mii_bus->phy_mask = pdata->phy_mask;
-
- err = mdiobus_register(bp->mii_bus);
- }
+ if (np)
+ err = macb_mii_of_init(bp, np);
+ else
+ err = macb_mii_pdata_init(bp, pdata);
if (err)
goto err_out_free_mdiobus;
* Re: [PATCH v2] net: macb: do not scan PHYs manually
From: Nicolas Ferre @ 2016-04-29 12:40 UTC (permalink / raw)
To: Josh Cartwright, Andrew Lunn
Cc: Nathan Sullivan, netdev, linux-kernel, Florian Fainelli,
Alexandre Belloni
In-Reply-To: <20160429122501.GD30217@jcartwri.amer.corp.natinst.com>
Le 29/04/2016 14:25, Josh Cartwright a écrit :
> On Thu, Apr 28, 2016 at 07:34:59PM -0500, Josh Cartwright wrote:
>> On Thu, Apr 28, 2016 at 11:23:15PM +0200, Andrew Lunn wrote:
>>> On Thu, Apr 28, 2016 at 04:03:57PM -0500, Josh Cartwright wrote:
>>>> On Thu, Apr 28, 2016 at 08:59:32PM +0200, Andrew Lunn wrote:
>>>>> On Thu, Apr 28, 2016 at 01:55:27PM -0500, Nathan Sullivan wrote:
>>>>>> On Thu, Apr 28, 2016 at 08:43:03PM +0200, Andrew Lunn wrote:
>>>>>>>> I agree that is a valid fix for AT91, however it won't solve our problem, since
>>>>>>>> we have no children on the second ethernet MAC in our devices' device trees. I'm
>>>>>>>> starting to feel like our second MAC shouldn't even really register the MDIO bus
>>>>>>>> since it isn't being used - maybe adding a DT property to not have a bus is a
>>>>>>>> better option?
>>>>>>>
>>>>>>> status = "disabled"
>>>>>>>
>>>>>>> would be the usual way.
>>>>>>>
>>>>>>> Andrew
>>>>>>
>>>>>> Oh, sorry, I meant we use both MACs on Zynq, however the PHYs are on the MDIO
>>>>>> bus of the first MAC. So, the second MAC is used for ethernet but not for MDIO,
>>>>>> and so it does not have any PHYs under its DT node. It would be nice if there
>>>>>> were a way to tell macb not to bother with MDIO for the second MAC, since that's
>>>>>> handled by the first MAC.
>>>>>
>>>>> Yes, exactly, add support for status = "disabled" in the mdio node.
>>>>
>>>> Unfortunately, the 'macb' doesn't have a "mdio node", or alternatively:
>>>> the node representing the mdio bus is the same node which represents the
>>>> macb instance itself. Setting 'status = "disabled"' on this node will
>>>> just prevent the probing of the macb instance.
>>>
>>> :-(
>>>
>>> It is very common to have an mdio node within the MAC node, for example imx6sx-sdb.dtsi
>>
>> Okay, I think that makes sense. I think, then, perhaps the solution to
>> our problem is to:
>>
>> 1. Modify the macb driver to support an 'mdio' node. (And adjust the
>> binding document accordingly). If the node is found, it's used for
>> of_mdiobus_register() w/o any of the manual scan madness.
>> 2. For backwards compatibility, in the case where an 'mdio' node does
>> not exist, leave the existing behavior the way it is now
>> (of_mdiobus_register() followed by manual scan) [perhaps warn of
>> deprecation as well?]
>> 3. Update binding docs to reflect the above.
>>
>> In this way, for our usecase, the 'status = "disabled"' in the newly
>> created 'mdio' node isn't necessary. It's sufficient for the node to
>> exist and be empty.
>
> Here's an (only build tested) attempt at implementing a part of this.
> macb_mii_init() was getting complicated enough that I lifted out two
> helper functions for the dt/no-dt case. Sweeping the in-tree
> devicetrees to update them to place phys under an 'mdio' node is still
> to be done.
>
> Josh
>
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index eec3200..d843bc9 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -419,11 +419,62 @@ static int macb_mii_probe(struct net_device *dev)
> return 0;
> }
>
> +static int macb_mii_of_init(struct macb *bp, struct device_node *np)
> +{
> + struct device_node *mdio;
> + int err, i;
> +
> + mdio = of_get_child_by_name(np, "mdio");
> + if (mdio)
> + return of_mdiobus_register(bp->mii_bus, mdio);
> +
> + dev_warn(&bp->pdev->dev,
> + "using deprecated PHY probing mechanism. Please update device tree.");
Do we need to warn here?
Too bad I was not aware of that earlier, I even updated some of my DTs
recently with only a phy node, without the "mdio" one as parent :-\
> + /* try dt phy registration */
> + err = of_mdiobus_register(bp->mii_bus, np);
> + if (err)
> + return err;
> +
> + /* fallback to standard phy registration if no phy were
> + * found during dt phy registration
> + */
> + if (!phy_find_first(bp->mii_bus)) {
> + for (i = 0; i < PHY_MAX_ADDR; i++) {
> + struct phy_device *phydev;
> +
> + phydev = mdiobus_scan(bp->mii_bus, i);
> + if (IS_ERR(phydev)) {
> + err = PTR_ERR(phydev);
> + break;
> + }
> + }
> +
> + if (err)
> + goto err_out_unregister_bus;
> + }
> +
> + return err;
> +
> +err_out_unregister_bus:
> + mdiobus_unregister(bp->mii_bus);
> + return err;
> +}
> +
> +static int macb_mii_pdata_init(struct macb *bp,
> + struct macb_platform_data *pdata)
> +{
> + if (pdata)
> + bp->mii_bus->phy_mask = pdata->phy_mask;
> +
> + return mdiobus_register(bp->mii_bus);
> +}
> +
> static int macb_mii_init(struct macb *bp)
> {
> struct macb_platform_data *pdata;
> struct device_node *np;
> - int err = -ENXIO, i;
> + int err = -ENXIO;
>
> /* Enable management port */
> macb_writel(bp, NCR, MACB_BIT(MPE));
> @@ -446,33 +497,10 @@ static int macb_mii_init(struct macb *bp)
> dev_set_drvdata(&bp->dev->dev, bp->mii_bus);
>
> np = bp->pdev->dev.of_node;
> - if (np) {
> - /* try dt phy registration */
> - err = of_mdiobus_register(bp->mii_bus, np);
> -
> - /* fallback to standard phy registration if no phy were
> - * found during dt phy registration
> - */
> - if (!err && !phy_find_first(bp->mii_bus)) {
> - for (i = 0; i < PHY_MAX_ADDR; i++) {
> - struct phy_device *phydev;
> -
> - phydev = mdiobus_scan(bp->mii_bus, i);
> - if (IS_ERR(phydev)) {
> - err = PTR_ERR(phydev);
> - break;
> - }
> - }
> -
> - if (err)
> - goto err_out_unregister_bus;
> - }
> - } else {
> - if (pdata)
> - bp->mii_bus->phy_mask = pdata->phy_mask;
> -
> - err = mdiobus_register(bp->mii_bus);
> - }
> + if (np)
> + err = macb_mii_of_init(bp, np);
> + else
> + err = macb_mii_pdata_init(bp, pdata);
>
> if (err)
> goto err_out_free_mdiobus;
I'm okay with this. Thanks for having taken the initiative to implement it.
Bye,
--
Nicolas Ferre
* Re: [PATCH v2] net: macb: do not scan PHYs manually
From: Andrew Lunn @ 2016-04-29 12:49 UTC (permalink / raw)
To: Josh Cartwright
Cc: Nathan Sullivan, Nicolas Ferre, netdev, linux-kernel,
Florian Fainelli, Alexandre Belloni
In-Reply-To: <20160429122501.GD30217@jcartwri.amer.corp.natinst.com>
> diff --git a/drivers/net/ethernet/cadence/macb.c b/drivers/net/ethernet/cadence/macb.c
> index eec3200..d843bc9 100644
> --- a/drivers/net/ethernet/cadence/macb.c
> +++ b/drivers/net/ethernet/cadence/macb.c
> @@ -419,11 +419,62 @@ static int macb_mii_probe(struct net_device *dev)
> return 0;
> }
>
> +static int macb_mii_of_init(struct macb *bp, struct device_node *np)
> +{
> + struct device_node *mdio;
> + int err, i;
> +
> + mdio = of_get_child_by_name(np, "mdio");
> + if (mdio)
> + return of_mdiobus_register(bp->mii_bus, mdio);
We want to encourage driver writers to use an mdio subnode inside
their MAC node. So I wonder if this lookup of the child node, and the
use of it, should go into the core code?
Florian: What do you think?
> +
> + dev_warn(&bp->pdev->dev,
> + "using deprecated PHY probing mechanism. Please update device tree.");
> +
> + /* try dt phy registration */
> + err = of_mdiobus_register(bp->mii_bus, np);
> + if (err)
> + return err;
> +
> + /* fallback to standard phy registration if no phy were
> + * found during dt phy registration
> + */
> + if (!phy_find_first(bp->mii_bus)) {
I would also suggest putting a warning here, saying that PHYs should
be listed in the device tree.
> + for (i = 0; i < PHY_MAX_ADDR; i++) {
> + struct phy_device *phydev;
> +
> + phydev = mdiobus_scan(bp->mii_bus, i);
> + if (IS_ERR(phydev)) {
> + err = PTR_ERR(phydev);
FYI: There is a change making its way through which will mean
mdiobus_scan() will return -ENODEV when there is nothing on the bus
at that address, rather than the current NULL. You will need to adapt
to this here.
Andrew
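
A minimal userspace sketch of what that adaptation might look like, assuming
mdiobus_scan() does end up returning ERR_PTR(-ENODEV) for an empty address.
Nothing here is macb code: struct mock_bus, mock_mdiobus_scan() and
scan_all() are hypothetical names made up for illustration, and the ERR_PTR
helpers are simplified stand-ins for the kernel ones. The point is only that
-ENODEV would have to be skipped rather than treated as a failure, otherwise
the loop in the patch would stop at the first empty address.

```c
#include <errno.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long error) { return (void *)(intptr_t)error; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

#define PHY_MAX_ADDR 32

struct phy_device { int addr; };

/* Hypothetical mock bus: present[i] != 0 means a PHY answers at address i;
 * fail_addr, if >= 0, is an address where the scan hits a real I/O error.
 */
struct mock_bus {
	int present[PHY_MAX_ADDR];
	int fail_addr;
	struct phy_device phys[PHY_MAX_ADDR];
};

/* Mock of the announced behaviour (an assumption, not the current API):
 * ERR_PTR(-ENODEV) instead of NULL when nothing is at the address.
 */
static struct phy_device *mock_mdiobus_scan(struct mock_bus *bus, int addr)
{
	if (addr == bus->fail_addr)
		return ERR_PTR(-EIO);
	if (!bus->present[addr])
		return ERR_PTR(-ENODEV);
	bus->phys[addr].addr = addr;
	return &bus->phys[addr];
}

/* Scan every address; an empty address is skipped, any other error
 * aborts the scan and is returned to the caller.
 */
static int scan_all(struct mock_bus *bus, int *found)
{
	int i;

	*found = 0;
	for (i = 0; i < PHY_MAX_ADDR; i++) {
		struct phy_device *phydev = mock_mdiobus_scan(bus, i);

		if (IS_ERR(phydev)) {
			if (PTR_ERR(phydev) == -ENODEV)
				continue;
			return (int)PTR_ERR(phydev);
		}
		(*found)++;
	}
	return 0;
}
```

In the driver itself the error path would presumably still go through
err_out_unregister_bus; the sketch leaves that out.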
* Re: [PATCH v2] net: macb: do not scan PHYs manually
From: Andrew Lunn @ 2016-04-29 12:56 UTC (permalink / raw)
To: Nicolas Ferre
Cc: Josh Cartwright, Nathan Sullivan, netdev, linux-kernel,
Florian Fainelli, Alexandre Belloni
In-Reply-To: <57235655.3030104@atmel.com>
> > +static int macb_mii_of_init(struct macb *bp, struct device_node *np)
> > +{
> > + struct device_node *mdio;
> > + int err, i;
> > +
> > + mdio = of_get_child_by_name(np, "mdio");
> > + if (mdio)
> > + return of_mdiobus_register(bp->mii_bus, mdio);
> > +
> > + dev_warn(&bp->pdev->dev,
> > + "using deprecated PHY probing mechanism. Please update device tree.");
>
> Do we need to warn here?
>
> Too bad I was not aware of that earlier, I even updated some of my DTs
> recently with only a phy node without the "mdio" one as parents :-\
It is messy. Unfortunately, there is no binding documentation (yet)
suggesting the right way to do this. And as a result, we have
drivers/device trees doing different things, leading to workarounds
like manually scanning the bus, not listing PHYs in the device tree
and so on, or falling back to the old methods.
We need to document how we expect this to be done, and then add
warnings in various places to encourage developers to migrate their
device trees to what has been documented.
Andrew
* Re: [PATCH net-next] net: dsa: mv88e6xxx: replace ds with ps where possible
From: Andrew Lunn @ 2016-04-29 13:06 UTC (permalink / raw)
To: Vivien Didelot; +Cc: netdev, linux-kernel, kernel, David S. Miller
In-Reply-To: <1461893046-28200-1-git-send-email-vivien.didelot@savoirfairelinux.com>
On Thu, Apr 28, 2016 at 09:24:06PM -0400, Vivien Didelot wrote:
> From: Andrew Lunn <andrew@lunn.ch>
>
> The dsa_switch structure ds is actually needed in very few places,
> mostly during setup of the switch. The private structure ps is however
> needed nearly everywhere. Pass ps, not ds internally.
>
> [vd: rebased Andrew's patch.]
Hi Vivien
Thanks for picking up this patch and rebasing it.
I would generally put comments like that below the ---. They don't
need to be in the commit log.
Andrew
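
For readers skimming the diff, the shape of the change can be sketched in a
few lines of standalone C. This is a rough illustration only: struct
sw_handle, struct priv_state, handle_to_priv(), reg_write(), reg_read() and
sw_setup() are hypothetical stand-ins for dsa_switch, mv88e6xxx_priv_state,
ds_to_priv() and the driver functions, and the register array replaces real
SMI access. The framework-facing entry point converts ds to ps once, and
everything below it takes ps.

```c
/* Hypothetical stand-in for struct dsa_switch: the framework-facing handle. */
struct priv_state;

struct sw_handle {
	int index;
	struct priv_state *priv;
};

/* Hypothetical stand-in for mv88e6xxx_priv_state: the driver's own state. */
struct priv_state {
	struct sw_handle *ds;		/* back-pointer, set once during setup */
	unsigned short regs[32];	/* fake register file instead of SMI */
};

static struct priv_state *handle_to_priv(struct sw_handle *ds)
{
	return ds->priv;
}

/* Internal helpers take the private state directly, no conversion needed. */
static int reg_write(struct priv_state *ps, int reg, unsigned short val)
{
	if (reg < 0 || reg >= 32)
		return -1;
	ps->regs[reg] = val;
	return 0;
}

static int reg_read(struct priv_state *ps, int reg)
{
	if (reg < 0 || reg >= 32)
		return -1;
	return ps->regs[reg];
}

/* Framework entry point: still receives ds, converts once, passes ps down. */
static int sw_setup(struct sw_handle *ds)
{
	struct priv_state *ps = handle_to_priv(ds);

	ps->ds = ds;
	return reg_write(ps, 0x1c, ds->index & 0x1f);
}
```

The back-pointer is what lets the few internal places that still need ds
reach it from ps.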
>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
> ---
> drivers/net/dsa/mv88e6123.c | 14 +-
> drivers/net/dsa/mv88e6131.c | 22 +-
> drivers/net/dsa/mv88e6171.c | 14 +-
> drivers/net/dsa/mv88e6352.c | 24 +-
> drivers/net/dsa/mv88e6xxx.c | 917 ++++++++++++++++++++++----------------------
> drivers/net/dsa/mv88e6xxx.h | 14 +-
> 6 files changed, 511 insertions(+), 494 deletions(-)
>
> diff --git a/drivers/net/dsa/mv88e6123.c b/drivers/net/dsa/mv88e6123.c
> index 534ebc8..5535a42 100644
> --- a/drivers/net/dsa/mv88e6123.c
> +++ b/drivers/net/dsa/mv88e6123.c
> @@ -50,6 +50,7 @@ static const char *mv88e6123_drv_probe(struct device *dsa_dev,
>
> static int mv88e6123_setup_global(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u32 upstream_port = dsa_upstream_port(ds);
> int ret;
> u32 reg;
> @@ -62,7 +63,7 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
> * external PHYs to poll), don't discard packets with
> * excessive collisions, and mask all interrupt sources.
> */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL, 0x0000);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL, 0x0000);
> if (ret)
> return ret;
>
> @@ -73,26 +74,29 @@ static int mv88e6123_setup_global(struct dsa_switch *ds)
> reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> if (ret)
> return ret;
>
> /* Disable remote management for now, and set the switch's
> * DSA device number.
> */
> - return mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL_2,
> + return mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
> ds->index & 0x1f);
> }
>
> static int mv88e6123_setup(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_setup_common(ds);
> + ps->ds = ds;
> +
> + ret = mv88e6xxx_setup_common(ps);
> if (ret < 0)
> return ret;
>
> - ret = mv88e6xxx_switch_reset(ds, false);
> + ret = mv88e6xxx_switch_reset(ps, false);
> if (ret < 0)
> return ret;
>
> diff --git a/drivers/net/dsa/mv88e6131.c b/drivers/net/dsa/mv88e6131.c
> index c3eb9a8..357ab79 100644
> --- a/drivers/net/dsa/mv88e6131.c
> +++ b/drivers/net/dsa/mv88e6131.c
> @@ -56,6 +56,7 @@ static const char *mv88e6131_drv_probe(struct device *dsa_dev,
>
> static int mv88e6131_setup_global(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u32 upstream_port = dsa_upstream_port(ds);
> int ret;
> u32 reg;
> @@ -69,14 +70,14 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
> * to arbitrate between packet queues, set the maximum frame
> * size to 1632, and mask all interrupt sources.
> */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
> GLOBAL_CONTROL_PPU_ENABLE |
> GLOBAL_CONTROL_MAX_FRAME_1632);
> if (ret)
> return ret;
>
> /* Set the VLAN ethertype to 0x8100. */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CORE_TAG_TYPE, 0x8100);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CORE_TAG_TYPE, 0x8100);
> if (ret)
> return ret;
>
> @@ -87,7 +88,7 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
> reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
> GLOBAL_MONITOR_CONTROL_ARP_DISABLED;
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> if (ret)
> return ret;
>
> @@ -96,11 +97,11 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
> * DSA device number.
> */
> if (ds->dst->pd->nr_chips > 1)
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL_2,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
> GLOBAL_CONTROL_2_MULTIPLE_CASCADE |
> (ds->index & 0x1f));
> else
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL_2,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
> GLOBAL_CONTROL_2_NO_CASCADE |
> (ds->index & 0x1f));
> if (ret)
> @@ -109,7 +110,7 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
> /* Force the priority of IGMP/MLD snoop frames and ARP frames
> * to the highest setting.
> */
> - return mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_PRIO_OVERRIDE,
> + return mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_PRIO_OVERRIDE,
> GLOBAL2_PRIO_OVERRIDE_FORCE_SNOOP |
> 7 << GLOBAL2_PRIO_OVERRIDE_SNOOP_SHIFT |
> GLOBAL2_PRIO_OVERRIDE_FORCE_ARP |
> @@ -118,15 +119,18 @@ static int mv88e6131_setup_global(struct dsa_switch *ds)
>
> static int mv88e6131_setup(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_setup_common(ds);
> + ps->ds = ds;
> +
> + ret = mv88e6xxx_setup_common(ps);
> if (ret < 0)
> return ret;
>
> - mv88e6xxx_ppu_state_init(ds);
> + mv88e6xxx_ppu_state_init(ps);
>
> - ret = mv88e6xxx_switch_reset(ds, false);
> + ret = mv88e6xxx_switch_reset(ps, false);
> if (ret < 0)
> return ret;
>
> diff --git a/drivers/net/dsa/mv88e6171.c b/drivers/net/dsa/mv88e6171.c
> index 841ffe1..f75164d 100644
> --- a/drivers/net/dsa/mv88e6171.c
> +++ b/drivers/net/dsa/mv88e6171.c
> @@ -56,6 +56,7 @@ static const char *mv88e6171_drv_probe(struct device *dsa_dev,
>
> static int mv88e6171_setup_global(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u32 upstream_port = dsa_upstream_port(ds);
> int ret;
> u32 reg;
> @@ -67,7 +68,7 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
> /* Discard packets with excessive collisions, mask all
> * interrupt sources, enable PPU.
> */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
> GLOBAL_CONTROL_PPU_ENABLE |
> GLOBAL_CONTROL_DISCARD_EXCESS);
> if (ret)
> @@ -81,26 +82,29 @@ static int mv88e6171_setup_global(struct dsa_switch *ds)
> upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_MIRROR_SHIFT;
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> if (ret)
> return ret;
>
> /* Disable remote management for now, and set the switch's
> * DSA device number.
> */
> - return mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL_2,
> + return mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL_2,
> ds->index & 0x1f);
> }
>
> static int mv88e6171_setup(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_setup_common(ds);
> + ps->ds = ds;
> +
> + ret = mv88e6xxx_setup_common(ps);
> if (ret < 0)
> return ret;
>
> - ret = mv88e6xxx_switch_reset(ds, true);
> + ret = mv88e6xxx_switch_reset(ps, true);
> if (ret < 0)
> return ret;
>
> diff --git a/drivers/net/dsa/mv88e6352.c b/drivers/net/dsa/mv88e6352.c
> index 4afc24d..c622a1d 100644
> --- a/drivers/net/dsa/mv88e6352.c
> +++ b/drivers/net/dsa/mv88e6352.c
> @@ -73,6 +73,7 @@ static const char *mv88e6352_drv_probe(struct device *dsa_dev,
>
> static int mv88e6352_setup_global(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u32 upstream_port = dsa_upstream_port(ds);
> int ret;
> u32 reg;
> @@ -84,7 +85,7 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
> /* Discard packets with excessive collisions,
> * mask all interrupt sources, enable PPU (bit 14, undocumented).
> */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
> GLOBAL_CONTROL_PPU_ENABLE |
> GLOBAL_CONTROL_DISCARD_EXCESS);
> if (ret)
> @@ -97,14 +98,14 @@ static int mv88e6352_setup_global(struct dsa_switch *ds)
> reg = upstream_port << GLOBAL_MONITOR_CONTROL_INGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_EGRESS_SHIFT |
> upstream_port << GLOBAL_MONITOR_CONTROL_ARP_SHIFT;
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MONITOR_CONTROL, reg);
> if (ret)
> return ret;
>
> /* Disable remote management for now, and set the switch's
> * DSA device number.
> */
> - return mv88e6xxx_reg_write(ds, REG_GLOBAL, 0x1c, ds->index & 0x1f);
> + return mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x1c, ds->index & 0x1f);
> }
>
> static int mv88e6352_setup(struct dsa_switch *ds)
> @@ -112,13 +113,15 @@ static int mv88e6352_setup(struct dsa_switch *ds)
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_setup_common(ds);
> + ps->ds = ds;
> +
> + ret = mv88e6xxx_setup_common(ps);
> if (ret < 0)
> return ret;
>
> mutex_init(&ps->eeprom_mutex);
>
> - ret = mv88e6xxx_switch_reset(ds, true);
> + ret = mv88e6xxx_switch_reset(ps, true);
> if (ret < 0)
> return ret;
>
> @@ -136,7 +139,7 @@ static int mv88e6352_read_eeprom_word(struct dsa_switch *ds, int addr)
>
> mutex_lock(&ps->eeprom_mutex);
>
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> GLOBAL2_EEPROM_OP_READ |
> (addr & GLOBAL2_EEPROM_OP_ADDR_MASK));
> if (ret < 0)
> @@ -146,7 +149,7 @@ static int mv88e6352_read_eeprom_word(struct dsa_switch *ds, int addr)
> if (ret < 0)
> goto error;
>
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL2, GLOBAL2_EEPROM_DATA);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL2, GLOBAL2_EEPROM_DATA);
> error:
> mutex_unlock(&ps->eeprom_mutex);
> return ret;
> @@ -217,9 +220,10 @@ static int mv88e6352_get_eeprom(struct dsa_switch *ds,
>
> static int mv88e6352_eeprom_is_readonly(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP);
> if (ret < 0)
> return ret;
>
> @@ -237,11 +241,11 @@ static int mv88e6352_write_eeprom_word(struct dsa_switch *ds, int addr,
>
> mutex_lock(&ps->eeprom_mutex);
>
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_EEPROM_DATA, data);
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_EEPROM_DATA, data);
> if (ret < 0)
> goto error;
>
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> GLOBAL2_EEPROM_OP_WRITE |
> (addr & GLOBAL2_EEPROM_OP_ADDR_MASK));
> if (ret < 0)
> diff --git a/drivers/net/dsa/mv88e6xxx.c b/drivers/net/dsa/mv88e6xxx.c
> index 028f92f..61150af 100644
> --- a/drivers/net/dsa/mv88e6xxx.c
> +++ b/drivers/net/dsa/mv88e6xxx.c
> @@ -25,12 +25,10 @@
> #include <net/switchdev.h>
> #include "mv88e6xxx.h"
>
> -static void assert_smi_lock(struct dsa_switch *ds)
> +static void assert_smi_lock(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> if (unlikely(!mutex_is_locked(&ps->smi_mutex))) {
> - dev_err(ds->master_dev, "SMI lock not held!\n");
> + dev_err(ps->dev, "SMI lock not held!\n");
> dump_stack();
> }
> }
> @@ -92,30 +90,29 @@ static int __mv88e6xxx_reg_read(struct mii_bus *bus, int sw_addr, int addr,
> return ret & 0xffff;
> }
>
> -static int _mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg)
> +static int _mv88e6xxx_reg_read(struct mv88e6xxx_priv_state *ps,
> + int addr, int reg)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - assert_smi_lock(ds);
> + assert_smi_lock(ps);
>
> ret = __mv88e6xxx_reg_read(ps->bus, ps->sw_addr, addr, reg);
> if (ret < 0)
> return ret;
>
> - dev_dbg(ds->master_dev, "<- addr: 0x%.2x reg: 0x%.2x val: 0x%.4x\n",
> + dev_dbg(ps->dev, "<- addr: 0x%.2x reg: 0x%.2x val: 0x%.4x\n",
> addr, reg, ret);
>
> return ret;
> }
>
> -int mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg)
> +int mv88e6xxx_reg_read(struct mv88e6xxx_priv_state *ps, int addr, int reg)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_reg_read(ds, addr, reg);
> + ret = _mv88e6xxx_reg_read(ps, addr, reg);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> @@ -153,26 +150,24 @@ static int __mv88e6xxx_reg_write(struct mii_bus *bus, int sw_addr, int addr,
> return 0;
> }
>
> -static int _mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg,
> - u16 val)
> +static int _mv88e6xxx_reg_write(struct mv88e6xxx_priv_state *ps, int addr,
> + int reg, u16 val)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + assert_smi_lock(ps);
>
> - assert_smi_lock(ds);
> -
> - dev_dbg(ds->master_dev, "-> addr: 0x%.2x reg: 0x%.2x val: 0x%.4x\n",
> + dev_dbg(ps->dev, "-> addr: 0x%.2x reg: 0x%.2x val: 0x%.4x\n",
> addr, reg, val);
>
> return __mv88e6xxx_reg_write(ps->bus, ps->sw_addr, addr, reg, val);
> }
>
> -int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
> +int mv88e6xxx_reg_write(struct mv88e6xxx_priv_state *ps, int addr,
> + int reg, u16 val)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_reg_write(ds, addr, reg, val);
> + ret = _mv88e6xxx_reg_write(ps, addr, reg, val);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> @@ -180,24 +175,26 @@ int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val)
>
> int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int err;
>
> - err = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MAC_01,
> + err = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MAC_01,
> (addr[0] << 8) | addr[1]);
> if (err)
> return err;
>
> - err = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MAC_23,
> + err = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MAC_23,
> (addr[2] << 8) | addr[3]);
> if (err)
> return err;
>
> - return mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_MAC_45,
> + return mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_MAC_45,
> (addr[4] << 8) | addr[5]);
> }
>
> int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
> int i;
>
> @@ -205,7 +202,7 @@ int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
> int j;
>
> /* Write the MAC address byte. */
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_SWITCH_MAC,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_SWITCH_MAC,
> GLOBAL2_SWITCH_MAC_BUSY |
> (i << 8) | addr[i]);
> if (ret)
> @@ -213,7 +210,7 @@ int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
>
> /* Wait for the write to complete. */
> for (j = 0; j < 16; j++) {
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL2,
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL2,
> GLOBAL2_SWITCH_MAC);
> if (ret < 0)
> return ret;
> @@ -228,39 +225,40 @@ int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr)
> return 0;
> }
>
> -static int _mv88e6xxx_phy_read(struct dsa_switch *ds, int addr, int regnum)
> +static int _mv88e6xxx_phy_read(struct mv88e6xxx_priv_state *ps, int addr,
> + int regnum)
> {
> if (addr >= 0)
> - return _mv88e6xxx_reg_read(ds, addr, regnum);
> + return _mv88e6xxx_reg_read(ps, addr, regnum);
> return 0xffff;
> }
>
> -static int _mv88e6xxx_phy_write(struct dsa_switch *ds, int addr, int regnum,
> - u16 val)
> +static int _mv88e6xxx_phy_write(struct mv88e6xxx_priv_state *ps, int addr,
> + int regnum, u16 val)
> {
> if (addr >= 0)
> - return _mv88e6xxx_reg_write(ds, addr, regnum, val);
> + return _mv88e6xxx_reg_write(ps, addr, regnum, val);
> return 0;
> }
>
> #ifdef CONFIG_NET_DSA_MV88E6XXX_NEED_PPU
> -static int mv88e6xxx_ppu_disable(struct dsa_switch *ds)
> +static int mv88e6xxx_ppu_disable(struct mv88e6xxx_priv_state *ps)
> {
> int ret;
> unsigned long timeout;
>
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_CONTROL);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_CONTROL);
> if (ret < 0)
> return ret;
>
> - ret = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL,
> + ret = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
> ret & ~GLOBAL_CONTROL_PPU_ENABLE);
> if (ret)
> return ret;
>
> timeout = jiffies + 1 * HZ;
> while (time_before(jiffies, timeout)) {
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_STATUS);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATUS);
> if (ret < 0)
> return ret;
>
> @@ -273,23 +271,23 @@ static int mv88e6xxx_ppu_disable(struct dsa_switch *ds)
> return -ETIMEDOUT;
> }
>
> -static int mv88e6xxx_ppu_enable(struct dsa_switch *ds)
> +static int mv88e6xxx_ppu_enable(struct mv88e6xxx_priv_state *ps)
> {
> int ret, err;
> unsigned long timeout;
>
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_CONTROL);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_CONTROL);
> if (ret < 0)
> return ret;
>
> - err = mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_CONTROL,
> + err = mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_CONTROL,
> ret | GLOBAL_CONTROL_PPU_ENABLE);
> if (err)
> return err;
>
> timeout = jiffies + 1 * HZ;
> while (time_before(jiffies, timeout)) {
> - ret = mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_STATUS);
> + ret = mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATUS);
> if (ret < 0)
> return ret;
>
> @@ -308,9 +306,7 @@ static void mv88e6xxx_ppu_reenable_work(struct work_struct *ugly)
>
> ps = container_of(ugly, struct mv88e6xxx_priv_state, ppu_work);
> if (mutex_trylock(&ps->ppu_mutex)) {
> - struct dsa_switch *ds = ps->ds;
> -
> - if (mv88e6xxx_ppu_enable(ds) == 0)
> + if (mv88e6xxx_ppu_enable(ps) == 0)
> ps->ppu_disabled = 0;
> mutex_unlock(&ps->ppu_mutex);
> }
> @@ -323,9 +319,8 @@ static void mv88e6xxx_ppu_reenable_timer(unsigned long _ps)
> schedule_work(&ps->ppu_work);
> }
>
> -static int mv88e6xxx_ppu_access_get(struct dsa_switch *ds)
> +static int mv88e6xxx_ppu_access_get(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> mutex_lock(&ps->ppu_mutex);
> @@ -336,7 +331,7 @@ static int mv88e6xxx_ppu_access_get(struct dsa_switch *ds)
> * it.
> */
> if (!ps->ppu_disabled) {
> - ret = mv88e6xxx_ppu_disable(ds);
> + ret = mv88e6xxx_ppu_disable(ps);
> if (ret < 0) {
> mutex_unlock(&ps->ppu_mutex);
> return ret;
> @@ -350,19 +345,15 @@ static int mv88e6xxx_ppu_access_get(struct dsa_switch *ds)
> return ret;
> }
>
> -static void mv88e6xxx_ppu_access_put(struct dsa_switch *ds)
> +static void mv88e6xxx_ppu_access_put(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> /* Schedule a timer to re-enable the PHY polling unit. */
> mod_timer(&ps->ppu_timer, jiffies + msecs_to_jiffies(10));
> mutex_unlock(&ps->ppu_mutex);
> }
>
> -void mv88e6xxx_ppu_state_init(struct dsa_switch *ds)
> +void mv88e6xxx_ppu_state_init(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> mutex_init(&ps->ppu_mutex);
> INIT_WORK(&ps->ppu_work, mv88e6xxx_ppu_reenable_work);
> init_timer(&ps->ppu_timer);
> @@ -372,12 +363,13 @@ void mv88e6xxx_ppu_state_init(struct dsa_switch *ds)
>
> int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, int regnum)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_ppu_access_get(ds);
> + ret = mv88e6xxx_ppu_access_get(ps);
> if (ret >= 0) {
> - ret = mv88e6xxx_reg_read(ds, addr, regnum);
> - mv88e6xxx_ppu_access_put(ds);
> + ret = mv88e6xxx_reg_read(ps, addr, regnum);
> + mv88e6xxx_ppu_access_put(ps);
> }
>
> return ret;
> @@ -386,96 +378,79 @@ int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, int regnum)
> int mv88e6xxx_phy_write_ppu(struct dsa_switch *ds, int addr,
> int regnum, u16 val)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> - ret = mv88e6xxx_ppu_access_get(ds);
> + ret = mv88e6xxx_ppu_access_get(ps);
> if (ret >= 0) {
> - ret = mv88e6xxx_reg_write(ds, addr, regnum, val);
> - mv88e6xxx_ppu_access_put(ds);
> + ret = mv88e6xxx_reg_write(ps, addr, regnum, val);
> + mv88e6xxx_ppu_access_put(ps);
> }
>
> return ret;
> }
> #endif
>
> -static bool mv88e6xxx_6065_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6065_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6065;
> }
>
> -static bool mv88e6xxx_6095_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6095_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6095;
> }
>
> -static bool mv88e6xxx_6097_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6097_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6097;
> }
>
> -static bool mv88e6xxx_6165_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6165_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6165;
> }
>
> -static bool mv88e6xxx_6185_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6185_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6185;
> }
>
> -static bool mv88e6xxx_6320_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6320_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6320;
> }
>
> -static bool mv88e6xxx_6351_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6351_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6351;
> }
>
> -static bool mv88e6xxx_6352_family(struct dsa_switch *ds)
> +static bool mv88e6xxx_6352_family(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->family == MV88E6XXX_FAMILY_6352;
> }
>
> -static unsigned int mv88e6xxx_num_databases(struct dsa_switch *ds)
> +static unsigned int mv88e6xxx_num_databases(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> return ps->info->num_databases;
> }
>
> -static bool mv88e6xxx_has_fid_reg(struct dsa_switch *ds)
> +static bool mv88e6xxx_has_fid_reg(struct mv88e6xxx_priv_state *ps)
> {
> /* Does the device have dedicated FID registers for ATU and VTU ops? */
> - if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
> - mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds))
> + if (mv88e6xxx_6097_family(ps) || mv88e6xxx_6165_family(ps) ||
> + mv88e6xxx_6351_family(ps) || mv88e6xxx_6352_family(ps))
> return true;
>
> return false;
> }
>
> -static bool mv88e6xxx_has_stu(struct dsa_switch *ds)
> +static bool mv88e6xxx_has_stu(struct mv88e6xxx_priv_state *ps)
> {
> /* Does the device have STU and dedicated SID registers for VTU ops? */
> - if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
> - mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds))
> + if (mv88e6xxx_6097_family(ps) || mv88e6xxx_6165_family(ps) ||
> + mv88e6xxx_6351_family(ps) || mv88e6xxx_6352_family(ps))
> return true;
>
> return false;
> @@ -497,7 +472,7 @@ void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_PCS_CTRL);
> if (ret < 0)
> goto out;
>
> @@ -511,7 +486,7 @@ void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
> if (phydev->link)
> reg |= PORT_PCS_CTRL_LINK_UP;
>
> - if (mv88e6xxx_6065_family(ds) && phydev->speed > SPEED_100)
> + if (mv88e6xxx_6065_family(ps) && phydev->speed > SPEED_100)
> goto out;
>
> switch (phydev->speed) {
> @@ -533,7 +508,7 @@ void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
> if (phydev->duplex == DUPLEX_FULL)
> reg |= PORT_PCS_CTRL_DUPLEX_FULL;
>
> - if ((mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds)) &&
> + if ((mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps)) &&
> (port >= ps->info->num_ports - 2)) {
> if (phydev->interface == PHY_INTERFACE_MODE_RGMII_RXID)
> reg |= PORT_PCS_CTRL_RGMII_DELAY_RXCLK;
> @@ -543,19 +518,19 @@ void mv88e6xxx_adjust_link(struct dsa_switch *ds, int port,
> reg |= (PORT_PCS_CTRL_RGMII_DELAY_RXCLK |
> PORT_PCS_CTRL_RGMII_DELAY_TXCLK);
> }
> - _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_PCS_CTRL, reg);
> + _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_PCS_CTRL, reg);
>
> out:
> mutex_unlock(&ps->smi_mutex);
> }
>
> -static int _mv88e6xxx_stats_wait(struct dsa_switch *ds)
> +static int _mv88e6xxx_stats_wait(struct mv88e6xxx_priv_state *ps)
> {
> int ret;
> int i;
>
> for (i = 0; i < 10; i++) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_STATS_OP);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATS_OP);
> if ((ret & GLOBAL_STATS_OP_BUSY) == 0)
> return 0;
> }
> @@ -563,52 +538,54 @@ static int _mv88e6xxx_stats_wait(struct dsa_switch *ds)
> return -ETIMEDOUT;
> }
>
> -static int _mv88e6xxx_stats_snapshot(struct dsa_switch *ds, int port)
> +static int _mv88e6xxx_stats_snapshot(struct mv88e6xxx_priv_state *ps,
> + int port)
> {
> int ret;
>
> - if (mv88e6xxx_6320_family(ds) || mv88e6xxx_6352_family(ds))
> + if (mv88e6xxx_6320_family(ps) || mv88e6xxx_6352_family(ps))
> port = (port + 1) << 5;
>
> /* Snapshot the hardware statistics counters for this port. */
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_STATS_OP,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_STATS_OP,
> GLOBAL_STATS_OP_CAPTURE_PORT |
> GLOBAL_STATS_OP_HIST_RX_TX | port);
> if (ret < 0)
> return ret;
>
> /* Wait for the snapshotting to complete. */
> - ret = _mv88e6xxx_stats_wait(ds);
> + ret = _mv88e6xxx_stats_wait(ps);
> if (ret < 0)
> return ret;
>
> return 0;
> }
>
> -static void _mv88e6xxx_stats_read(struct dsa_switch *ds, int stat, u32 *val)
> +static void _mv88e6xxx_stats_read(struct mv88e6xxx_priv_state *ps,
> + int stat, u32 *val)
> {
> u32 _val;
> int ret;
>
> *val = 0;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_STATS_OP,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_STATS_OP,
> GLOBAL_STATS_OP_READ_CAPTURED |
> GLOBAL_STATS_OP_HIST_RX_TX | stat);
> if (ret < 0)
> return;
>
> - ret = _mv88e6xxx_stats_wait(ds);
> + ret = _mv88e6xxx_stats_wait(ps);
> if (ret < 0)
> return;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_STATS_COUNTER_32);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATS_COUNTER_32);
> if (ret < 0)
> return;
>
> _val = ret << 16;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_STATS_COUNTER_01);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_STATS_COUNTER_01);
> if (ret < 0)
> return;
>
> @@ -677,26 +654,26 @@ static struct mv88e6xxx_hw_stat mv88e6xxx_hw_stats[] = {
> { "out_management", 4, 0x1f | GLOBAL_STATS_OP_BANK_1, BANK1, },
> };
>
> -static bool mv88e6xxx_has_stat(struct dsa_switch *ds,
> +static bool mv88e6xxx_has_stat(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_hw_stat *stat)
> {
> switch (stat->type) {
> case BANK0:
> return true;
> case BANK1:
> - return mv88e6xxx_6320_family(ds);
> + return mv88e6xxx_6320_family(ps);
> case PORT:
> - return mv88e6xxx_6095_family(ds) ||
> - mv88e6xxx_6185_family(ds) ||
> - mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6165_family(ds) ||
> - mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6352_family(ds);
> + return mv88e6xxx_6095_family(ps) ||
> + mv88e6xxx_6185_family(ps) ||
> + mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6165_family(ps) ||
> + mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6352_family(ps);
> }
> return false;
> }
>
> -static uint64_t _mv88e6xxx_get_ethtool_stat(struct dsa_switch *ds,
> +static uint64_t _mv88e6xxx_get_ethtool_stat(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_hw_stat *s,
> int port)
> {
> @@ -707,13 +684,13 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct dsa_switch *ds,
>
> switch (s->type) {
> case PORT:
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), s->reg);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), s->reg);
> if (ret < 0)
> return UINT64_MAX;
>
> low = ret;
> if (s->sizeof_stat == 4) {
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port),
> s->reg + 1);
> if (ret < 0)
> return UINT64_MAX;
> @@ -722,9 +699,9 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct dsa_switch *ds,
> break;
> case BANK0:
> case BANK1:
> - _mv88e6xxx_stats_read(ds, s->reg, &low);
> + _mv88e6xxx_stats_read(ps, s->reg, &low);
> if (s->sizeof_stat == 8)
> - _mv88e6xxx_stats_read(ds, s->reg + 1, &high);
> + _mv88e6xxx_stats_read(ps, s->reg + 1, &high);
> }
> value = (((u64)high) << 16) | low;
> return value;
> @@ -732,12 +709,13 @@ static uint64_t _mv88e6xxx_get_ethtool_stat(struct dsa_switch *ds,
>
> void mv88e6xxx_get_strings(struct dsa_switch *ds, int port, uint8_t *data)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> struct mv88e6xxx_hw_stat *stat;
> int i, j;
>
> for (i = 0, j = 0; i < ARRAY_SIZE(mv88e6xxx_hw_stats); i++) {
> stat = &mv88e6xxx_hw_stats[i];
> - if (mv88e6xxx_has_stat(ds, stat)) {
> + if (mv88e6xxx_has_stat(ps, stat)) {
> memcpy(data + j * ETH_GSTRING_LEN, stat->string,
> ETH_GSTRING_LEN);
> j++;
> @@ -747,12 +725,13 @@ void mv88e6xxx_get_strings(struct dsa_switch *ds, int port, uint8_t *data)
>
> int mv88e6xxx_get_sset_count(struct dsa_switch *ds)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> struct mv88e6xxx_hw_stat *stat;
> int i, j;
>
> for (i = 0, j = 0; i < ARRAY_SIZE(mv88e6xxx_hw_stats); i++) {
> stat = &mv88e6xxx_hw_stats[i];
> - if (mv88e6xxx_has_stat(ds, stat))
> + if (mv88e6xxx_has_stat(ps, stat))
> j++;
> }
> return j;
> @@ -769,15 +748,15 @@ mv88e6xxx_get_ethtool_stats(struct dsa_switch *ds,
>
> mutex_lock(&ps->smi_mutex);
>
> - ret = _mv88e6xxx_stats_snapshot(ds, port);
> + ret = _mv88e6xxx_stats_snapshot(ps, port);
> if (ret < 0) {
> mutex_unlock(&ps->smi_mutex);
> return;
> }
> for (i = 0, j = 0; i < ARRAY_SIZE(mv88e6xxx_hw_stats); i++) {
> stat = &mv88e6xxx_hw_stats[i];
> - if (mv88e6xxx_has_stat(ds, stat)) {
> - data[j] = _mv88e6xxx_get_ethtool_stat(ds, stat, port);
> + if (mv88e6xxx_has_stat(ps, stat)) {
> + data[j] = _mv88e6xxx_get_ethtool_stat(ps, stat, port);
> j++;
> }
> }
> @@ -793,6 +772,7 @@ int mv88e6xxx_get_regs_len(struct dsa_switch *ds, int port)
> void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
> struct ethtool_regs *regs, void *_p)
> {
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u16 *p = _p;
> int i;
>
> @@ -803,13 +783,13 @@ void mv88e6xxx_get_regs(struct dsa_switch *ds, int port,
> for (i = 0; i < 32; i++) {
> int ret;
>
> - ret = mv88e6xxx_reg_read(ds, REG_PORT(port), i);
> + ret = mv88e6xxx_reg_read(ps, REG_PORT(port), i);
> if (ret >= 0)
> p[i] = ret;
> }
> }
>
> -static int _mv88e6xxx_wait(struct dsa_switch *ds, int reg, int offset,
> +static int _mv88e6xxx_wait(struct mv88e6xxx_priv_state *ps, int reg, int offset,
> u16 mask)
> {
> unsigned long timeout = jiffies + HZ / 10;
> @@ -817,7 +797,7 @@ static int _mv88e6xxx_wait(struct dsa_switch *ds, int reg, int offset,
> while (time_before(jiffies, timeout)) {
> int ret;
>
> - ret = _mv88e6xxx_reg_read(ds, reg, offset);
> + ret = _mv88e6xxx_reg_read(ps, reg, offset);
> if (ret < 0)
> return ret;
> if (!(ret & mask))
> @@ -828,74 +808,80 @@ static int _mv88e6xxx_wait(struct dsa_switch *ds, int reg, int offset,
> return -ETIMEDOUT;
> }
>
> -static int mv88e6xxx_wait(struct dsa_switch *ds, int reg, int offset, u16 mask)
> +static int mv88e6xxx_wait(struct mv88e6xxx_priv_state *ps, int reg,
> + int offset, u16 mask)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_wait(ds, reg, offset, mask);
> + ret = _mv88e6xxx_wait(ps, reg, offset, mask);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> }
>
> -static int _mv88e6xxx_phy_wait(struct dsa_switch *ds)
> +static int _mv88e6xxx_phy_wait(struct mv88e6xxx_priv_state *ps)
> {
> - return _mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_SMI_OP,
> + return _mv88e6xxx_wait(ps, REG_GLOBAL2, GLOBAL2_SMI_OP,
> GLOBAL2_SMI_OP_BUSY);
> }
>
> int mv88e6xxx_eeprom_load_wait(struct dsa_switch *ds)
> {
> - return mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> +
> + return mv88e6xxx_wait(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> GLOBAL2_EEPROM_OP_LOAD);
> }
>
> int mv88e6xxx_eeprom_busy_wait(struct dsa_switch *ds)
> {
> - return mv88e6xxx_wait(ds, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> +
> + return mv88e6xxx_wait(ps, REG_GLOBAL2, GLOBAL2_EEPROM_OP,
> GLOBAL2_EEPROM_OP_BUSY);
> }
>
> -static int _mv88e6xxx_atu_wait(struct dsa_switch *ds)
> +static int _mv88e6xxx_atu_wait(struct mv88e6xxx_priv_state *ps)
> {
> - return _mv88e6xxx_wait(ds, REG_GLOBAL, GLOBAL_ATU_OP,
> + return _mv88e6xxx_wait(ps, REG_GLOBAL, GLOBAL_ATU_OP,
> GLOBAL_ATU_OP_BUSY);
> }
>
> -static int _mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int addr,
> - int regnum)
> +static int _mv88e6xxx_phy_read_indirect(struct mv88e6xxx_priv_state *ps,
> + int addr, int regnum)
> {
> int ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_SMI_OP,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_SMI_OP,
> GLOBAL2_SMI_OP_22_READ | (addr << 5) |
> regnum);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_phy_wait(ds);
> + ret = _mv88e6xxx_phy_wait(ps);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_reg_read(ds, REG_GLOBAL2, GLOBAL2_SMI_DATA);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL2, GLOBAL2_SMI_DATA);
> +
> + return ret;
> }
>
> -static int _mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int addr,
> - int regnum, u16 val)
> +static int _mv88e6xxx_phy_write_indirect(struct mv88e6xxx_priv_state *ps,
> + int addr, int regnum, u16 val)
> {
> int ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_SMI_DATA, val);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_SMI_DATA, val);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_SMI_OP,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_SMI_OP,
> GLOBAL2_SMI_OP_22_WRITE | (addr << 5) |
> regnum);
>
> - return _mv88e6xxx_phy_wait(ds);
> + return _mv88e6xxx_phy_wait(ps);
> }
>
> int mv88e6xxx_get_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e)
> @@ -905,14 +891,14 @@ int mv88e6xxx_get_eee(struct dsa_switch *ds, int port, struct ethtool_eee *e)
>
> mutex_lock(&ps->smi_mutex);
>
> - reg = _mv88e6xxx_phy_read_indirect(ds, port, 16);
> + reg = _mv88e6xxx_phy_read_indirect(ps, port, 16);
> if (reg < 0)
> goto out;
>
> e->eee_enabled = !!(reg & 0x0200);
> e->tx_lpi_enabled = !!(reg & 0x0100);
>
> - reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_STATUS);
> + reg = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_STATUS);
> if (reg < 0)
> goto out;
>
> @@ -933,7 +919,7 @@ int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - ret = _mv88e6xxx_phy_read_indirect(ds, port, 16);
> + ret = _mv88e6xxx_phy_read_indirect(ps, port, 16);
> if (ret < 0)
> goto out;
>
> @@ -943,28 +929,28 @@ int mv88e6xxx_set_eee(struct dsa_switch *ds, int port,
> if (e->tx_lpi_enabled)
> reg |= 0x0100;
>
> - ret = _mv88e6xxx_phy_write_indirect(ds, port, 16, reg);
> + ret = _mv88e6xxx_phy_write_indirect(ps, port, 16, reg);
> out:
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> }
>
> -static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 fid, u16 cmd)
> +static int _mv88e6xxx_atu_cmd(struct mv88e6xxx_priv_state *ps, u16 fid, u16 cmd)
> {
> int ret;
>
> - if (mv88e6xxx_has_fid_reg(ds)) {
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_FID, fid);
> + if (mv88e6xxx_has_fid_reg(ps)) {
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_FID, fid);
> if (ret < 0)
> return ret;
> - } else if (mv88e6xxx_num_databases(ds) == 256) {
> + } else if (mv88e6xxx_num_databases(ps) == 256) {
> /* ATU DBNum[7:4] are located in ATU Control 15:12 */
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_ATU_CONTROL);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_ATU_CONTROL);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_CONTROL,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_CONTROL,
> (ret & 0xfff) |
> ((fid << 8) & 0xf000));
> if (ret < 0)
> @@ -974,14 +960,14 @@ static int _mv88e6xxx_atu_cmd(struct dsa_switch *ds, u16 fid, u16 cmd)
> cmd |= fid & 0xf;
> }
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_OP, cmd);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_OP, cmd);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_atu_wait(ds);
> + return _mv88e6xxx_atu_wait(ps);
> }
>
> -static int _mv88e6xxx_atu_data_write(struct dsa_switch *ds,
> +static int _mv88e6xxx_atu_data_write(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_atu_entry *entry)
> {
> u16 data = entry->state & GLOBAL_ATU_DATA_STATE_MASK;
> @@ -1001,21 +987,21 @@ static int _mv88e6xxx_atu_data_write(struct dsa_switch *ds,
> data |= (entry->portv_trunkid << shift) & mask;
> }
>
> - return _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_DATA, data);
> + return _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_DATA, data);
> }
>
> -static int _mv88e6xxx_atu_flush_move(struct dsa_switch *ds,
> +static int _mv88e6xxx_atu_flush_move(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_atu_entry *entry,
> bool static_too)
> {
> int op;
> int err;
>
> - err = _mv88e6xxx_atu_wait(ds);
> + err = _mv88e6xxx_atu_wait(ps);
> if (err)
> return err;
>
> - err = _mv88e6xxx_atu_data_write(ds, entry);
> + err = _mv88e6xxx_atu_data_write(ps, entry);
> if (err)
> return err;
>
> @@ -1027,21 +1013,22 @@ static int _mv88e6xxx_atu_flush_move(struct dsa_switch *ds,
> GLOBAL_ATU_OP_FLUSH_MOVE_NON_STATIC;
> }
>
> - return _mv88e6xxx_atu_cmd(ds, entry->fid, op);
> + return _mv88e6xxx_atu_cmd(ps, entry->fid, op);
> }
>
> -static int _mv88e6xxx_atu_flush(struct dsa_switch *ds, u16 fid, bool static_too)
> +static int _mv88e6xxx_atu_flush(struct mv88e6xxx_priv_state *ps,
> + u16 fid, bool static_too)
> {
> struct mv88e6xxx_atu_entry entry = {
> .fid = fid,
> .state = 0, /* EntryState bits must be 0 */
> };
>
> - return _mv88e6xxx_atu_flush_move(ds, &entry, static_too);
> + return _mv88e6xxx_atu_flush_move(ps, &entry, static_too);
> }
>
> -static int _mv88e6xxx_atu_move(struct dsa_switch *ds, u16 fid, int from_port,
> - int to_port, bool static_too)
> +static int _mv88e6xxx_atu_move(struct mv88e6xxx_priv_state *ps, u16 fid,
> + int from_port, int to_port, bool static_too)
> {
> struct mv88e6xxx_atu_entry entry = {
> .trunk = false,
> @@ -1055,14 +1042,14 @@ static int _mv88e6xxx_atu_move(struct dsa_switch *ds, u16 fid, int from_port,
> entry.portv_trunkid = (to_port & 0x0f) << 4;
> entry.portv_trunkid |= from_port & 0x0f;
>
> - return _mv88e6xxx_atu_flush_move(ds, &entry, static_too);
> + return _mv88e6xxx_atu_flush_move(ps, &entry, static_too);
> }
>
> -static int _mv88e6xxx_atu_remove(struct dsa_switch *ds, u16 fid, int port,
> - bool static_too)
> +static int _mv88e6xxx_atu_remove(struct mv88e6xxx_priv_state *ps, u16 fid,
> + int port, bool static_too)
> {
> /* Destination port 0xF means remove the entries */
> - return _mv88e6xxx_atu_move(ds, fid, port, 0x0f, static_too);
> + return _mv88e6xxx_atu_move(ps, fid, port, 0x0f, static_too);
> }
>
> static const char * const mv88e6xxx_port_state_names[] = {
> @@ -1072,12 +1059,14 @@ static const char * const mv88e6xxx_port_state_names[] = {
> [PORT_CONTROL_STATE_FORWARDING] = "Forwarding",
> };
>
> -static int _mv88e6xxx_port_state(struct dsa_switch *ds, int port, u8 state)
> +static int _mv88e6xxx_port_state(struct mv88e6xxx_priv_state *ps, int port,
> + u8 state)
> {
> + struct dsa_switch *ds = ps->ds;
> int reg, ret = 0;
> u8 oldstate;
>
> - reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_CONTROL);
> + reg = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_CONTROL);
> if (reg < 0)
> return reg;
>
> @@ -1092,13 +1081,13 @@ static int _mv88e6xxx_port_state(struct dsa_switch *ds, int port, u8 state)
> oldstate == PORT_CONTROL_STATE_FORWARDING)
> && (state == PORT_CONTROL_STATE_DISABLED ||
> state == PORT_CONTROL_STATE_BLOCKING)) {
> - ret = _mv88e6xxx_atu_remove(ds, 0, port, false);
> + ret = _mv88e6xxx_atu_remove(ps, 0, port, false);
> if (ret)
> return ret;
> }
>
> reg = (reg & ~PORT_CONTROL_STATE_MASK) | state;
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_CONTROL,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_CONTROL,
> reg);
> if (ret)
> return ret;
> @@ -1111,11 +1100,12 @@ static int _mv88e6xxx_port_state(struct dsa_switch *ds, int port, u8 state)
> return ret;
> }
>
> -static int _mv88e6xxx_port_based_vlan_map(struct dsa_switch *ds, int port)
> +static int _mv88e6xxx_port_based_vlan_map(struct mv88e6xxx_priv_state *ps,
> + int port)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> struct net_device *bridge = ps->ports[port].bridge_dev;
> const u16 mask = (1 << ps->info->num_ports) - 1;
> + struct dsa_switch *ds = ps->ds;
> u16 output_ports = 0;
> int reg;
> int i;
> @@ -1138,14 +1128,14 @@ static int _mv88e6xxx_port_based_vlan_map(struct dsa_switch *ds, int port)
> /* prevent frames from going back out of the port they came in on */
> output_ports &= ~BIT(port);
>
> - reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_BASE_VLAN);
> + reg = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_BASE_VLAN);
> if (reg < 0)
> return reg;
>
> reg &= ~mask;
> reg |= output_ports & mask;
>
> - return _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_BASE_VLAN, reg);
> + return _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_BASE_VLAN, reg);
> }
>
> void mv88e6xxx_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
> @@ -1178,13 +1168,14 @@ void mv88e6xxx_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
> schedule_work(&ps->bridge_work);
> }
>
> -static int _mv88e6xxx_port_pvid(struct dsa_switch *ds, int port, u16 *new,
> - u16 *old)
> +static int _mv88e6xxx_port_pvid(struct mv88e6xxx_priv_state *ps, int port,
> + u16 *new, u16 *old)
> {
> + struct dsa_switch *ds = ps->ds;
> u16 pvid;
> int ret;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_DEFAULT_VLAN);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_DEFAULT_VLAN);
> if (ret < 0)
> return ret;
>
> @@ -1194,7 +1185,7 @@ static int _mv88e6xxx_port_pvid(struct dsa_switch *ds, int port, u16 *new,
> ret &= ~PORT_DEFAULT_VLAN_MASK;
> ret |= *new & PORT_DEFAULT_VLAN_MASK;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_DEFAULT_VLAN, ret);
> if (ret < 0)
> return ret;
> @@ -1209,55 +1200,56 @@ static int _mv88e6xxx_port_pvid(struct dsa_switch *ds, int port, u16 *new,
> return 0;
> }
>
> -static int _mv88e6xxx_port_pvid_get(struct dsa_switch *ds, int port, u16 *pvid)
> +static int _mv88e6xxx_port_pvid_get(struct mv88e6xxx_priv_state *ps,
> + int port, u16 *pvid)
> {
> - return _mv88e6xxx_port_pvid(ds, port, NULL, pvid);
> + return _mv88e6xxx_port_pvid(ps, port, NULL, pvid);
> }
>
> -static int _mv88e6xxx_port_pvid_set(struct dsa_switch *ds, int port, u16 pvid)
> +static int _mv88e6xxx_port_pvid_set(struct mv88e6xxx_priv_state *ps,
> + int port, u16 pvid)
> {
> - return _mv88e6xxx_port_pvid(ds, port, &pvid, NULL);
> + return _mv88e6xxx_port_pvid(ps, port, &pvid, NULL);
> }
>
> -static int _mv88e6xxx_vtu_wait(struct dsa_switch *ds)
> +static int _mv88e6xxx_vtu_wait(struct mv88e6xxx_priv_state *ps)
> {
> - return _mv88e6xxx_wait(ds, REG_GLOBAL, GLOBAL_VTU_OP,
> + return _mv88e6xxx_wait(ps, REG_GLOBAL, GLOBAL_VTU_OP,
> GLOBAL_VTU_OP_BUSY);
> }
>
> -static int _mv88e6xxx_vtu_cmd(struct dsa_switch *ds, u16 op)
> +static int _mv88e6xxx_vtu_cmd(struct mv88e6xxx_priv_state *ps, u16 op)
> {
> int ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_OP, op);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_OP, op);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_vtu_wait(ds);
> + return _mv88e6xxx_vtu_wait(ps);
> }
>
> -static int _mv88e6xxx_vtu_stu_flush(struct dsa_switch *ds)
> +static int _mv88e6xxx_vtu_stu_flush(struct mv88e6xxx_priv_state *ps)
> {
> int ret;
>
> - ret = _mv88e6xxx_vtu_wait(ds);
> + ret = _mv88e6xxx_vtu_wait(ps);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_FLUSH_ALL);
> + return _mv88e6xxx_vtu_cmd(ps, GLOBAL_VTU_OP_FLUSH_ALL);
> }
>
> -static int _mv88e6xxx_vtu_stu_data_read(struct dsa_switch *ds,
> +static int _mv88e6xxx_vtu_stu_data_read(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_vtu_stu_entry *entry,
> unsigned int nibble_offset)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u16 regs[3];
> int i;
> int ret;
>
> for (i = 0; i < 3; ++i) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL,
> GLOBAL_VTU_DATA_0_3 + i);
> if (ret < 0)
> return ret;
> @@ -1275,11 +1267,10 @@ static int _mv88e6xxx_vtu_stu_data_read(struct dsa_switch *ds,
> return 0;
> }
>
> -static int _mv88e6xxx_vtu_stu_data_write(struct dsa_switch *ds,
> +static int _mv88e6xxx_vtu_stu_data_write(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_vtu_stu_entry *entry,
> unsigned int nibble_offset)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u16 regs[3] = { 0 };
> int i;
> int ret;
> @@ -1292,7 +1283,7 @@ static int _mv88e6xxx_vtu_stu_data_write(struct dsa_switch *ds,
> }
>
> for (i = 0; i < 3; ++i) {
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL,
> GLOBAL_VTU_DATA_0_3 + i, regs[i]);
> if (ret < 0)
> return ret;
> @@ -1301,27 +1292,27 @@ static int _mv88e6xxx_vtu_stu_data_write(struct dsa_switch *ds,
> return 0;
> }
>
> -static int _mv88e6xxx_vtu_vid_write(struct dsa_switch *ds, u16 vid)
> +static int _mv88e6xxx_vtu_vid_write(struct mv88e6xxx_priv_state *ps, u16 vid)
> {
> - return _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID,
> + return _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_VID,
> vid & GLOBAL_VTU_VID_MASK);
> }
>
> -static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
> +static int _mv88e6xxx_vtu_getnext(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_vtu_stu_entry *entry)
> {
> struct mv88e6xxx_vtu_stu_entry next = { 0 };
> int ret;
>
> - ret = _mv88e6xxx_vtu_wait(ds);
> + ret = _mv88e6xxx_vtu_wait(ps);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_VTU_GET_NEXT);
> + ret = _mv88e6xxx_vtu_cmd(ps, GLOBAL_VTU_OP_VTU_GET_NEXT);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_VTU_VID);
> if (ret < 0)
> return ret;
>
> @@ -1329,22 +1320,22 @@ static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
> next.valid = !!(ret & GLOBAL_VTU_VID_VALID);
>
> if (next.valid) {
> - ret = _mv88e6xxx_vtu_stu_data_read(ds, &next, 0);
> + ret = _mv88e6xxx_vtu_stu_data_read(ps, &next, 0);
> if (ret < 0)
> return ret;
>
> - if (mv88e6xxx_has_fid_reg(ds)) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
> + if (mv88e6xxx_has_fid_reg(ps)) {
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL,
> GLOBAL_VTU_FID);
> if (ret < 0)
> return ret;
>
> next.fid = ret & GLOBAL_VTU_FID_MASK;
> - } else if (mv88e6xxx_num_databases(ds) == 256) {
> + } else if (mv88e6xxx_num_databases(ps) == 256) {
> /* VTU DBNum[7:4] are located in VTU Operation 11:8, and
> * VTU DBNum[3:0] are located in VTU Operation 3:0
> */
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL,
> GLOBAL_VTU_OP);
> if (ret < 0)
> return ret;
> @@ -1353,8 +1344,8 @@ static int _mv88e6xxx_vtu_getnext(struct dsa_switch *ds,
> next.fid |= ret & 0xf;
> }
>
> - if (mv88e6xxx_has_stu(ds)) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
> + if (mv88e6xxx_has_stu(ps)) {
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL,
> GLOBAL_VTU_SID);
> if (ret < 0)
> return ret;
> @@ -1378,16 +1369,16 @@ int mv88e6xxx_port_vlan_dump(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - err = _mv88e6xxx_port_pvid_get(ds, port, &pvid);
> + err = _mv88e6xxx_port_pvid_get(ps, port, &pvid);
> if (err)
> goto unlock;
>
> - err = _mv88e6xxx_vtu_vid_write(ds, GLOBAL_VTU_VID_MASK);
> + err = _mv88e6xxx_vtu_vid_write(ps, GLOBAL_VTU_VID_MASK);
> if (err)
> goto unlock;
>
> do {
> - err = _mv88e6xxx_vtu_getnext(ds, &next);
> + err = _mv88e6xxx_vtu_getnext(ps, &next);
> if (err)
> break;
>
> @@ -1418,14 +1409,14 @@ unlock:
> return err;
> }
>
> -static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
> +static int _mv88e6xxx_vtu_loadpurge(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_vtu_stu_entry *entry)
> {
> u16 op = GLOBAL_VTU_OP_VTU_LOAD_PURGE;
> u16 reg = 0;
> int ret;
>
> - ret = _mv88e6xxx_vtu_wait(ds);
> + ret = _mv88e6xxx_vtu_wait(ps);
> if (ret < 0)
> return ret;
>
> @@ -1433,23 +1424,23 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
> goto loadpurge;
>
> /* Write port member tags */
> - ret = _mv88e6xxx_vtu_stu_data_write(ds, entry, 0);
> + ret = _mv88e6xxx_vtu_stu_data_write(ps, entry, 0);
> if (ret < 0)
> return ret;
>
> - if (mv88e6xxx_has_stu(ds)) {
> + if (mv88e6xxx_has_stu(ps)) {
> reg = entry->sid & GLOBAL_VTU_SID_MASK;
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_SID, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_SID, reg);
> if (ret < 0)
> return ret;
> }
>
> - if (mv88e6xxx_has_fid_reg(ds)) {
> + if (mv88e6xxx_has_fid_reg(ps)) {
> reg = entry->fid & GLOBAL_VTU_FID_MASK;
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_FID, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_FID, reg);
> if (ret < 0)
> return ret;
> - } else if (mv88e6xxx_num_databases(ds) == 256) {
> + } else if (mv88e6xxx_num_databases(ps) == 256) {
> /* VTU DBNum[7:4] are located in VTU Operation 11:8, and
> * VTU DBNum[3:0] are located in VTU Operation 3:0
> */
> @@ -1460,46 +1451,46 @@ static int _mv88e6xxx_vtu_loadpurge(struct dsa_switch *ds,
> reg = GLOBAL_VTU_VID_VALID;
> loadpurge:
> reg |= entry->vid & GLOBAL_VTU_VID_MASK;
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_VID, reg);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_vtu_cmd(ds, op);
> + return _mv88e6xxx_vtu_cmd(ps, op);
> }
>
> -static int _mv88e6xxx_stu_getnext(struct dsa_switch *ds, u8 sid,
> +static int _mv88e6xxx_stu_getnext(struct mv88e6xxx_priv_state *ps, u8 sid,
> struct mv88e6xxx_vtu_stu_entry *entry)
> {
> struct mv88e6xxx_vtu_stu_entry next = { 0 };
> int ret;
>
> - ret = _mv88e6xxx_vtu_wait(ds);
> + ret = _mv88e6xxx_vtu_wait(ps);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_SID,
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_SID,
> sid & GLOBAL_VTU_SID_MASK);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_STU_GET_NEXT);
> + ret = _mv88e6xxx_vtu_cmd(ps, GLOBAL_VTU_OP_STU_GET_NEXT);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_SID);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_VTU_SID);
> if (ret < 0)
> return ret;
>
> next.sid = ret & GLOBAL_VTU_SID_MASK;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_VTU_VID);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_VTU_VID);
> if (ret < 0)
> return ret;
>
> next.valid = !!(ret & GLOBAL_VTU_VID_VALID);
>
> if (next.valid) {
> - ret = _mv88e6xxx_vtu_stu_data_read(ds, &next, 2);
> + ret = _mv88e6xxx_vtu_stu_data_read(ps, &next, 2);
> if (ret < 0)
> return ret;
> }
> @@ -1508,13 +1499,13 @@ static int _mv88e6xxx_stu_getnext(struct dsa_switch *ds, u8 sid,
> return 0;
> }
>
> -static int _mv88e6xxx_stu_loadpurge(struct dsa_switch *ds,
> +static int _mv88e6xxx_stu_loadpurge(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_vtu_stu_entry *entry)
> {
> u16 reg = 0;
> int ret;
>
> - ret = _mv88e6xxx_vtu_wait(ds);
> + ret = _mv88e6xxx_vtu_wait(ps);
> if (ret < 0)
> return ret;
>
> @@ -1522,40 +1513,41 @@ static int _mv88e6xxx_stu_loadpurge(struct dsa_switch *ds,
> goto loadpurge;
>
> /* Write port states */
> - ret = _mv88e6xxx_vtu_stu_data_write(ds, entry, 2);
> + ret = _mv88e6xxx_vtu_stu_data_write(ps, entry, 2);
> if (ret < 0)
> return ret;
>
> reg = GLOBAL_VTU_VID_VALID;
> loadpurge:
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_VID, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_VID, reg);
> if (ret < 0)
> return ret;
>
> reg = entry->sid & GLOBAL_VTU_SID_MASK;
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_VTU_SID, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_VTU_SID, reg);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_vtu_cmd(ds, GLOBAL_VTU_OP_STU_LOAD_PURGE);
> + return _mv88e6xxx_vtu_cmd(ps, GLOBAL_VTU_OP_STU_LOAD_PURGE);
> }
>
> -static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
> - u16 *old)
> +static int _mv88e6xxx_port_fid(struct mv88e6xxx_priv_state *ps, int port,
> + u16 *new, u16 *old)
> {
> + struct dsa_switch *ds = ps->ds;
> u16 upper_mask;
> u16 fid;
> int ret;
>
> - if (mv88e6xxx_num_databases(ds) == 4096)
> + if (mv88e6xxx_num_databases(ps) == 4096)
> upper_mask = 0xff;
> - else if (mv88e6xxx_num_databases(ds) == 256)
> + else if (mv88e6xxx_num_databases(ps) == 256)
> upper_mask = 0xf;
> else
> return -EOPNOTSUPP;
>
> /* Port's default FID bits 3:0 are located in reg 0x06, offset 12 */
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_BASE_VLAN);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_BASE_VLAN);
> if (ret < 0)
> return ret;
>
> @@ -1565,14 +1557,14 @@ static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
> ret &= ~PORT_BASE_VLAN_FID_3_0_MASK;
> ret |= (*new << 12) & PORT_BASE_VLAN_FID_3_0_MASK;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_BASE_VLAN,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_BASE_VLAN,
> ret);
> if (ret < 0)
> return ret;
> }
>
> /* Port's default FID bits 11:4 are located in reg 0x05, offset 0 */
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_CONTROL_1);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_CONTROL_1);
> if (ret < 0)
> return ret;
>
> @@ -1582,7 +1574,7 @@ static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
> ret &= ~upper_mask;
> ret |= (*new >> 4) & upper_mask;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_CONTROL_1,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_CONTROL_1,
> ret);
> if (ret < 0)
> return ret;
> @@ -1596,19 +1588,20 @@ static int _mv88e6xxx_port_fid(struct dsa_switch *ds, int port, u16 *new,
> return 0;
> }
>
> -static int _mv88e6xxx_port_fid_get(struct dsa_switch *ds, int port, u16 *fid)
> +static int _mv88e6xxx_port_fid_get(struct mv88e6xxx_priv_state *ps,
> + int port, u16 *fid)
> {
> - return _mv88e6xxx_port_fid(ds, port, NULL, fid);
> + return _mv88e6xxx_port_fid(ps, port, NULL, fid);
> }
>
> -static int _mv88e6xxx_port_fid_set(struct dsa_switch *ds, int port, u16 fid)
> +static int _mv88e6xxx_port_fid_set(struct mv88e6xxx_priv_state *ps,
> + int port, u16 fid)
> {
> - return _mv88e6xxx_port_fid(ds, port, &fid, NULL);
> + return _mv88e6xxx_port_fid(ps, port, &fid, NULL);
> }
>
> -static int _mv88e6xxx_fid_new(struct dsa_switch *ds, u16 *fid)
> +static int _mv88e6xxx_fid_new(struct mv88e6xxx_priv_state *ps, u16 *fid)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> DECLARE_BITMAP(fid_bitmap, MV88E6XXX_N_FID);
> struct mv88e6xxx_vtu_stu_entry vlan;
> int i, err;
> @@ -1617,7 +1610,7 @@ static int _mv88e6xxx_fid_new(struct dsa_switch *ds, u16 *fid)
>
> /* Set every FID bit used by the (un)bridged ports */
> for (i = 0; i < ps->info->num_ports; ++i) {
> - err = _mv88e6xxx_port_fid_get(ds, i, fid);
> + err = _mv88e6xxx_port_fid_get(ps, i, fid);
> if (err)
> return err;
>
> @@ -1625,12 +1618,12 @@ static int _mv88e6xxx_fid_new(struct dsa_switch *ds, u16 *fid)
> }
>
> /* Set every FID bit used by the VLAN entries */
> - err = _mv88e6xxx_vtu_vid_write(ds, GLOBAL_VTU_VID_MASK);
> + err = _mv88e6xxx_vtu_vid_write(ps, GLOBAL_VTU_VID_MASK);
> if (err)
> return err;
>
> do {
> - err = _mv88e6xxx_vtu_getnext(ds, &vlan);
> + err = _mv88e6xxx_vtu_getnext(ps, &vlan);
> if (err)
> return err;
>
> @@ -1644,24 +1637,24 @@ static int _mv88e6xxx_fid_new(struct dsa_switch *ds, u16 *fid)
> * databases are not needed. Return the next positive available.
> */
> *fid = find_next_zero_bit(fid_bitmap, MV88E6XXX_N_FID, 1);
> - if (unlikely(*fid >= mv88e6xxx_num_databases(ds)))
> + if (unlikely(*fid >= mv88e6xxx_num_databases(ps)))
> return -ENOSPC;
>
> /* Clear the database */
> - return _mv88e6xxx_atu_flush(ds, *fid, true);
> + return _mv88e6xxx_atu_flush(ps, *fid, true);
> }
>
> -static int _mv88e6xxx_vtu_new(struct dsa_switch *ds, u16 vid,
> +static int _mv88e6xxx_vtu_new(struct mv88e6xxx_priv_state *ps, u16 vid,
> struct mv88e6xxx_vtu_stu_entry *entry)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + struct dsa_switch *ds = ps->ds;
> struct mv88e6xxx_vtu_stu_entry vlan = {
> .valid = true,
> .vid = vid,
> };
> int i, err;
>
> - err = _mv88e6xxx_fid_new(ds, &vlan.fid);
> + err = _mv88e6xxx_fid_new(ps, &vlan.fid);
> if (err)
> return err;
>
> @@ -1671,8 +1664,8 @@ static int _mv88e6xxx_vtu_new(struct dsa_switch *ds, u16 vid,
> ? GLOBAL_VTU_DATA_MEMBER_TAG_UNMODIFIED
> : GLOBAL_VTU_DATA_MEMBER_TAG_NON_MEMBER;
>
> - if (mv88e6xxx_6097_family(ds) || mv88e6xxx_6165_family(ds) ||
> - mv88e6xxx_6351_family(ds) || mv88e6xxx_6352_family(ds)) {
> + if (mv88e6xxx_6097_family(ps) || mv88e6xxx_6165_family(ps) ||
> + mv88e6xxx_6351_family(ps) || mv88e6xxx_6352_family(ps)) {
> struct mv88e6xxx_vtu_stu_entry vstp;
>
> /* Adding a VTU entry requires a valid STU entry. As VSTP is not
> @@ -1680,7 +1673,7 @@ static int _mv88e6xxx_vtu_new(struct dsa_switch *ds, u16 vid,
> * entries. Thus, validate the SID 0.
> */
> vlan.sid = 0;
> - err = _mv88e6xxx_stu_getnext(ds, GLOBAL_VTU_SID_MASK, &vstp);
> + err = _mv88e6xxx_stu_getnext(ps, GLOBAL_VTU_SID_MASK, &vstp);
> if (err)
> return err;
>
> @@ -1689,7 +1682,7 @@ static int _mv88e6xxx_vtu_new(struct dsa_switch *ds, u16 vid,
> vstp.valid = true;
> vstp.sid = vlan.sid;
>
> - err = _mv88e6xxx_stu_loadpurge(ds, &vstp);
> + err = _mv88e6xxx_stu_loadpurge(ps, &vstp);
> if (err)
> return err;
> }
> @@ -1699,7 +1692,7 @@ static int _mv88e6xxx_vtu_new(struct dsa_switch *ds, u16 vid,
> return 0;
> }
>
> -static int _mv88e6xxx_vtu_get(struct dsa_switch *ds, u16 vid,
> +static int _mv88e6xxx_vtu_get(struct mv88e6xxx_priv_state *ps, u16 vid,
> struct mv88e6xxx_vtu_stu_entry *entry, bool creat)
> {
> int err;
> @@ -1707,11 +1700,11 @@ static int _mv88e6xxx_vtu_get(struct dsa_switch *ds, u16 vid,
> if (!vid)
> return -EINVAL;
>
> - err = _mv88e6xxx_vtu_vid_write(ds, vid - 1);
> + err = _mv88e6xxx_vtu_vid_write(ps, vid - 1);
> if (err)
> return err;
>
> - err = _mv88e6xxx_vtu_getnext(ds, entry);
> + err = _mv88e6xxx_vtu_getnext(ps, entry);
> if (err)
> return err;
>
> @@ -1722,7 +1715,7 @@ static int _mv88e6xxx_vtu_get(struct dsa_switch *ds, u16 vid,
> * -EOPNOTSUPP to inform bridge about an eventual software VLAN.
> */
>
> - err = _mv88e6xxx_vtu_new(ds, vid, entry);
> + err = _mv88e6xxx_vtu_new(ps, vid, entry);
> }
>
> return err;
> @@ -1740,12 +1733,12 @@ static int mv88e6xxx_port_check_hw_vlan(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - err = _mv88e6xxx_vtu_vid_write(ds, vid_begin - 1);
> + err = _mv88e6xxx_vtu_vid_write(ps, vid_begin - 1);
> if (err)
> goto unlock;
>
> do {
> - err = _mv88e6xxx_vtu_getnext(ds, &vlan);
> + err = _mv88e6xxx_vtu_getnext(ps, &vlan);
> if (err)
> goto unlock;
>
> @@ -1799,7 +1792,7 @@ int mv88e6xxx_port_vlan_filtering(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_CONTROL_2);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_CONTROL_2);
> if (ret < 0)
> goto unlock;
>
> @@ -1809,7 +1802,7 @@ int mv88e6xxx_port_vlan_filtering(struct dsa_switch *ds, int port,
> ret &= ~PORT_CONTROL_2_8021Q_MASK;
> ret |= new & PORT_CONTROL_2_8021Q_MASK;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_CONTROL_2,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_CONTROL_2,
> ret);
> if (ret < 0)
> goto unlock;
> @@ -1846,13 +1839,13 @@ int mv88e6xxx_port_vlan_prepare(struct dsa_switch *ds, int port,
> return 0;
> }
>
> -static int _mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port, u16 vid,
> - bool untagged)
> +static int _mv88e6xxx_port_vlan_add(struct mv88e6xxx_priv_state *ps, int port,
> + u16 vid, bool untagged)
> {
> struct mv88e6xxx_vtu_stu_entry vlan;
> int err;
>
> - err = _mv88e6xxx_vtu_get(ds, vid, &vlan, true);
> + err = _mv88e6xxx_vtu_get(ps, vid, &vlan, true);
> if (err)
> return err;
>
> @@ -1860,7 +1853,7 @@ static int _mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port, u16 vid,
> GLOBAL_VTU_DATA_MEMBER_TAG_UNTAGGED :
> GLOBAL_VTU_DATA_MEMBER_TAG_TAGGED;
>
> - return _mv88e6xxx_vtu_loadpurge(ds, &vlan);
> + return _mv88e6xxx_vtu_loadpurge(ps, &vlan);
> }
>
> void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
> @@ -1875,24 +1868,25 @@ void mv88e6xxx_port_vlan_add(struct dsa_switch *ds, int port,
> mutex_lock(&ps->smi_mutex);
>
> for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid)
> - if (_mv88e6xxx_port_vlan_add(ds, port, vid, untagged))
> + if (_mv88e6xxx_port_vlan_add(ps, port, vid, untagged))
> netdev_err(ds->ports[port], "failed to add VLAN %d%c\n",
> vid, untagged ? 'u' : 't');
>
> - if (pvid && _mv88e6xxx_port_pvid_set(ds, port, vlan->vid_end))
> + if (pvid && _mv88e6xxx_port_pvid_set(ps, port, vlan->vid_end))
> netdev_err(ds->ports[port], "failed to set PVID %d\n",
> vlan->vid_end);
>
> mutex_unlock(&ps->smi_mutex);
> }
>
> -static int _mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port, u16 vid)
> +static int _mv88e6xxx_port_vlan_del(struct mv88e6xxx_priv_state *ps,
> + int port, u16 vid)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + struct dsa_switch *ds = ps->ds;
> struct mv88e6xxx_vtu_stu_entry vlan;
> int i, err;
>
> - err = _mv88e6xxx_vtu_get(ds, vid, &vlan, false);
> + err = _mv88e6xxx_vtu_get(ps, vid, &vlan, false);
> if (err)
> return err;
>
> @@ -1914,11 +1908,11 @@ static int _mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port, u16 vid)
> }
> }
>
> - err = _mv88e6xxx_vtu_loadpurge(ds, &vlan);
> + err = _mv88e6xxx_vtu_loadpurge(ps, &vlan);
> if (err)
> return err;
>
> - return _mv88e6xxx_atu_remove(ds, vlan.fid, port, false);
> + return _mv88e6xxx_atu_remove(ps, vlan.fid, port, false);
> }
>
> int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
> @@ -1930,17 +1924,17 @@ int mv88e6xxx_port_vlan_del(struct dsa_switch *ds, int port,
>
> mutex_lock(&ps->smi_mutex);
>
> - err = _mv88e6xxx_port_pvid_get(ds, port, &pvid);
> + err = _mv88e6xxx_port_pvid_get(ps, port, &pvid);
> if (err)
> goto unlock;
>
> for (vid = vlan->vid_begin; vid <= vlan->vid_end; ++vid) {
> - err = _mv88e6xxx_port_vlan_del(ds, port, vid);
> + err = _mv88e6xxx_port_vlan_del(ps, port, vid);
> if (err)
> goto unlock;
>
> if (vid == pvid) {
> - err = _mv88e6xxx_port_pvid_set(ds, port, 0);
> + err = _mv88e6xxx_port_pvid_set(ps, port, 0);
> if (err)
> goto unlock;
> }
> @@ -1952,14 +1946,14 @@ unlock:
> return err;
> }
>
> -static int _mv88e6xxx_atu_mac_write(struct dsa_switch *ds,
> +static int _mv88e6xxx_atu_mac_write(struct mv88e6xxx_priv_state *ps,
> const unsigned char *addr)
> {
> int i, ret;
>
> for (i = 0; i < 3; i++) {
> ret = _mv88e6xxx_reg_write(
> - ds, REG_GLOBAL, GLOBAL_ATU_MAC_01 + i,
> + ps, REG_GLOBAL, GLOBAL_ATU_MAC_01 + i,
> (addr[i * 2] << 8) | addr[i * 2 + 1]);
> if (ret < 0)
> return ret;
> @@ -1968,12 +1962,13 @@ static int _mv88e6xxx_atu_mac_write(struct dsa_switch *ds,
> return 0;
> }
>
> -static int _mv88e6xxx_atu_mac_read(struct dsa_switch *ds, unsigned char *addr)
> +static int _mv88e6xxx_atu_mac_read(struct mv88e6xxx_priv_state *ps,
> + unsigned char *addr)
> {
> int i, ret;
>
> for (i = 0; i < 3; i++) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL,
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL,
> GLOBAL_ATU_MAC_01 + i);
> if (ret < 0)
> return ret;
> @@ -1984,27 +1979,27 @@ static int _mv88e6xxx_atu_mac_read(struct dsa_switch *ds, unsigned char *addr)
> return 0;
> }
>
> -static int _mv88e6xxx_atu_load(struct dsa_switch *ds,
> +static int _mv88e6xxx_atu_load(struct mv88e6xxx_priv_state *ps,
> struct mv88e6xxx_atu_entry *entry)
> {
> int ret;
>
> - ret = _mv88e6xxx_atu_wait(ds);
> + ret = _mv88e6xxx_atu_wait(ps);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_atu_mac_write(ds, entry->mac);
> + ret = _mv88e6xxx_atu_mac_write(ps, entry->mac);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_atu_data_write(ds, entry);
> + ret = _mv88e6xxx_atu_data_write(ps, entry);
> if (ret < 0)
> return ret;
>
> - return _mv88e6xxx_atu_cmd(ds, entry->fid, GLOBAL_ATU_OP_LOAD_DB);
> + return _mv88e6xxx_atu_cmd(ps, entry->fid, GLOBAL_ATU_OP_LOAD_DB);
> }
>
> -static int _mv88e6xxx_port_fdb_load(struct dsa_switch *ds, int port,
> +static int _mv88e6xxx_port_fdb_load(struct mv88e6xxx_priv_state *ps, int port,
> const unsigned char *addr, u16 vid,
> u8 state)
> {
> @@ -2014,9 +2009,9 @@ static int _mv88e6xxx_port_fdb_load(struct dsa_switch *ds, int port,
>
> /* Null VLAN ID corresponds to the port private database */
> if (vid == 0)
> - err = _mv88e6xxx_port_fid_get(ds, port, &vlan.fid);
> + err = _mv88e6xxx_port_fid_get(ps, port, &vlan.fid);
> else
> - err = _mv88e6xxx_vtu_get(ds, vid, &vlan, false);
> + err = _mv88e6xxx_vtu_get(ps, vid, &vlan, false);
> if (err)
> return err;
>
> @@ -2028,7 +2023,7 @@ static int _mv88e6xxx_port_fdb_load(struct dsa_switch *ds, int port,
> entry.portv_trunkid = BIT(port);
> }
>
> - return _mv88e6xxx_atu_load(ds, &entry);
> + return _mv88e6xxx_atu_load(ps, &entry);
> }
>
> int mv88e6xxx_port_fdb_prepare(struct dsa_switch *ds, int port,
> @@ -2051,7 +2046,7 @@ void mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port,
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
>
> mutex_lock(&ps->smi_mutex);
> - if (_mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid, state))
> + if (_mv88e6xxx_port_fdb_load(ps, port, fdb->addr, fdb->vid, state))
> netdev_err(ds->ports[port], "failed to load MAC address\n");
> mutex_unlock(&ps->smi_mutex);
> }
> @@ -2063,14 +2058,14 @@ int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port,
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_port_fdb_load(ds, port, fdb->addr, fdb->vid,
> + ret = _mv88e6xxx_port_fdb_load(ps, port, fdb->addr, fdb->vid,
> GLOBAL_ATU_DATA_STATE_UNUSED);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> }
>
> -static int _mv88e6xxx_atu_getnext(struct dsa_switch *ds, u16 fid,
> +static int _mv88e6xxx_atu_getnext(struct mv88e6xxx_priv_state *ps, u16 fid,
> struct mv88e6xxx_atu_entry *entry)
> {
> struct mv88e6xxx_atu_entry next = { 0 };
> @@ -2078,19 +2073,19 @@ static int _mv88e6xxx_atu_getnext(struct dsa_switch *ds, u16 fid,
>
> next.fid = fid;
>
> - ret = _mv88e6xxx_atu_wait(ds);
> + ret = _mv88e6xxx_atu_wait(ps);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_atu_cmd(ds, fid, GLOBAL_ATU_OP_GET_NEXT_DB);
> + ret = _mv88e6xxx_atu_cmd(ps, fid, GLOBAL_ATU_OP_GET_NEXT_DB);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_atu_mac_read(ds, next.mac);
> + ret = _mv88e6xxx_atu_mac_read(ps, next.mac);
> if (ret < 0)
> return ret;
>
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, GLOBAL_ATU_DATA);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, GLOBAL_ATU_DATA);
> if (ret < 0)
> return ret;
>
> @@ -2115,8 +2110,8 @@ static int _mv88e6xxx_atu_getnext(struct dsa_switch *ds, u16 fid,
> return 0;
> }
>
> -static int _mv88e6xxx_port_fdb_dump_one(struct dsa_switch *ds, u16 fid, u16 vid,
> - int port,
> +static int _mv88e6xxx_port_fdb_dump_one(struct mv88e6xxx_priv_state *ps,
> + u16 fid, u16 vid, int port,
> struct switchdev_obj_port_fdb *fdb,
> int (*cb)(struct switchdev_obj *obj))
> {
> @@ -2125,12 +2120,12 @@ static int _mv88e6xxx_port_fdb_dump_one(struct dsa_switch *ds, u16 fid, u16 vid,
> };
> int err;
>
> - err = _mv88e6xxx_atu_mac_write(ds, addr.mac);
> + err = _mv88e6xxx_atu_mac_write(ps, addr.mac);
> if (err)
> return err;
>
> do {
> - err = _mv88e6xxx_atu_getnext(ds, fid, &addr);
> + err = _mv88e6xxx_atu_getnext(ps, fid, &addr);
> if (err)
> break;
>
> @@ -2170,28 +2165,28 @@ int mv88e6xxx_port_fdb_dump(struct dsa_switch *ds, int port,
> mutex_lock(&ps->smi_mutex);
>
> /* Dump port's default Filtering Information Database (VLAN ID 0) */
> - err = _mv88e6xxx_port_fid_get(ds, port, &fid);
> + err = _mv88e6xxx_port_fid_get(ps, port, &fid);
> if (err)
> goto unlock;
>
> - err = _mv88e6xxx_port_fdb_dump_one(ds, fid, 0, port, fdb, cb);
> + err = _mv88e6xxx_port_fdb_dump_one(ps, fid, 0, port, fdb, cb);
> if (err)
> goto unlock;
>
> /* Dump VLANs' Filtering Information Databases */
> - err = _mv88e6xxx_vtu_vid_write(ds, vlan.vid);
> + err = _mv88e6xxx_vtu_vid_write(ps, vlan.vid);
> if (err)
> goto unlock;
>
> do {
> - err = _mv88e6xxx_vtu_getnext(ds, &vlan);
> + err = _mv88e6xxx_vtu_getnext(ps, &vlan);
> if (err)
> break;
>
> if (!vlan.valid)
> break;
>
> - err = _mv88e6xxx_port_fdb_dump_one(ds, vlan.fid, vlan.vid, port,
> + err = _mv88e6xxx_port_fdb_dump_one(ps, vlan.fid, vlan.vid, port,
> fdb, cb);
> if (err)
> break;
> @@ -2216,7 +2211,7 @@ int mv88e6xxx_port_bridge_join(struct dsa_switch *ds, int port,
>
> for (i = 0; i < ps->info->num_ports; ++i) {
> if (ps->ports[i].bridge_dev == bridge) {
> - err = _mv88e6xxx_port_based_vlan_map(ds, i);
> + err = _mv88e6xxx_port_based_vlan_map(ps, i);
> if (err)
> break;
> }
> @@ -2240,7 +2235,7 @@ void mv88e6xxx_port_bridge_leave(struct dsa_switch *ds, int port)
>
> for (i = 0; i < ps->info->num_ports; ++i)
> if (i == port || ps->ports[i].bridge_dev == bridge)
> - if (_mv88e6xxx_port_based_vlan_map(ds, i))
> + if (_mv88e6xxx_port_based_vlan_map(ps, i))
> netdev_warn(ds->ports[i], "failed to remap\n");
>
> mutex_unlock(&ps->smi_mutex);
> @@ -2259,57 +2254,58 @@ static void mv88e6xxx_bridge_work(struct work_struct *work)
>
> for (port = 0; port < ps->info->num_ports; ++port)
> if (test_and_clear_bit(port, ps->port_state_update_mask) &&
> - _mv88e6xxx_port_state(ds, port, ps->ports[port].state))
> - netdev_warn(ds->ports[port], "failed to update state to %s\n",
> + _mv88e6xxx_port_state(ps, port, ps->ports[port].state))
> + netdev_warn(ds->ports[port],
> + "failed to update state to %s\n",
> mv88e6xxx_port_state_names[ps->ports[port].state]);
>
> mutex_unlock(&ps->smi_mutex);
> }
>
> -static int _mv88e6xxx_phy_page_write(struct dsa_switch *ds, int port, int page,
> - int reg, int val)
> +static int _mv88e6xxx_phy_page_write(struct mv88e6xxx_priv_state *ps,
> + int port, int page, int reg, int val)
> {
> int ret;
>
> - ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
> + ret = _mv88e6xxx_phy_write_indirect(ps, port, 0x16, page);
> if (ret < 0)
> goto restore_page_0;
>
> - ret = _mv88e6xxx_phy_write_indirect(ds, port, reg, val);
> + ret = _mv88e6xxx_phy_write_indirect(ps, port, reg, val);
> restore_page_0:
> - _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
> + _mv88e6xxx_phy_write_indirect(ps, port, 0x16, 0x0);
>
> return ret;
> }
>
> -static int _mv88e6xxx_phy_page_read(struct dsa_switch *ds, int port, int page,
> - int reg)
> +static int _mv88e6xxx_phy_page_read(struct mv88e6xxx_priv_state *ps,
> + int port, int page, int reg)
> {
> int ret;
>
> - ret = _mv88e6xxx_phy_write_indirect(ds, port, 0x16, page);
> + ret = _mv88e6xxx_phy_write_indirect(ps, port, 0x16, page);
> if (ret < 0)
> goto restore_page_0;
>
> - ret = _mv88e6xxx_phy_read_indirect(ds, port, reg);
> + ret = _mv88e6xxx_phy_read_indirect(ps, port, reg);
> restore_page_0:
> - _mv88e6xxx_phy_write_indirect(ds, port, 0x16, 0x0);
> + _mv88e6xxx_phy_write_indirect(ps, port, 0x16, 0x0);
>
> return ret;
> }
>
> -static int mv88e6xxx_power_on_serdes(struct dsa_switch *ds)
> +static int mv88e6xxx_power_on_serdes(struct mv88e6xxx_priv_state *ps)
> {
> int ret;
>
> - ret = _mv88e6xxx_phy_page_read(ds, REG_FIBER_SERDES, PAGE_FIBER_SERDES,
> + ret = _mv88e6xxx_phy_page_read(ps, REG_FIBER_SERDES, PAGE_FIBER_SERDES,
> MII_BMCR);
> if (ret < 0)
> return ret;
>
> if (ret & BMCR_PDOWN) {
> ret &= ~BMCR_PDOWN;
> - ret = _mv88e6xxx_phy_page_write(ds, REG_FIBER_SERDES,
> + ret = _mv88e6xxx_phy_page_write(ps, REG_FIBER_SERDES,
> PAGE_FIBER_SERDES, MII_BMCR,
> ret);
> }
> @@ -2325,24 +2321,24 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
>
> mutex_lock(&ps->smi_mutex);
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6185_family(ds) || mv88e6xxx_6095_family(ds) ||
> - mv88e6xxx_6065_family(ds) || mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6185_family(ps) || mv88e6xxx_6095_family(ps) ||
> + mv88e6xxx_6065_family(ps) || mv88e6xxx_6320_family(ps)) {
> /* MAC Forcing register: don't force link, speed,
> * duplex or flow control state to any particular
> * values on physical ports, but force the CPU port
> * and all DSA ports to their maximum bandwidth and
> * full duplex.
> */
> - reg = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_PCS_CTRL);
> + reg = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_PCS_CTRL);
> if (dsa_is_cpu_port(ds, port) || dsa_is_dsa_port(ds, port)) {
> reg &= ~PORT_PCS_CTRL_UNFORCED;
> reg |= PORT_PCS_CTRL_FORCE_LINK |
> PORT_PCS_CTRL_LINK_UP |
> PORT_PCS_CTRL_DUPLEX_FULL |
> PORT_PCS_CTRL_FORCE_DUPLEX;
> - if (mv88e6xxx_6065_family(ds))
> + if (mv88e6xxx_6065_family(ps))
> reg |= PORT_PCS_CTRL_100;
> else
> reg |= PORT_PCS_CTRL_1000;
> @@ -2350,7 +2346,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> reg |= PORT_PCS_CTRL_UNFORCED;
> }
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_PCS_CTRL, reg);
> if (ret)
> goto abort;
> @@ -2371,19 +2367,19 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> * forwarding of unknown unicasts and multicasts.
> */
> reg = 0;
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6095_family(ds) || mv88e6xxx_6065_family(ds) ||
> - mv88e6xxx_6185_family(ds) || mv88e6xxx_6320_family(ds))
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6095_family(ps) || mv88e6xxx_6065_family(ps) ||
> + mv88e6xxx_6185_family(ps) || mv88e6xxx_6320_family(ps))
> reg = PORT_CONTROL_IGMP_MLD_SNOOP |
> PORT_CONTROL_USE_TAG | PORT_CONTROL_USE_IP |
> PORT_CONTROL_STATE_FORWARDING;
> if (dsa_is_cpu_port(ds, port)) {
> - if (mv88e6xxx_6095_family(ds) || mv88e6xxx_6185_family(ds))
> + if (mv88e6xxx_6095_family(ps) || mv88e6xxx_6185_family(ps))
> reg |= PORT_CONTROL_DSA_TAG;
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
> reg |= PORT_CONTROL_FRAME_ETHER_TYPE_DSA;
> else
> @@ -2392,20 +2388,20 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> PORT_CONTROL_FORWARD_UNKNOWN_MC;
> }
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6095_family(ds) || mv88e6xxx_6065_family(ds) ||
> - mv88e6xxx_6185_family(ds) || mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6095_family(ps) || mv88e6xxx_6065_family(ps) ||
> + mv88e6xxx_6185_family(ps) || mv88e6xxx_6320_family(ps)) {
> if (ds->dst->tag_protocol == DSA_TAG_PROTO_EDSA)
> reg |= PORT_CONTROL_EGRESS_ADD_TAG;
> }
> }
> if (dsa_is_dsa_port(ds, port)) {
> - if (mv88e6xxx_6095_family(ds) || mv88e6xxx_6185_family(ds))
> + if (mv88e6xxx_6095_family(ps) || mv88e6xxx_6185_family(ps))
> reg |= PORT_CONTROL_DSA_TAG;
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> reg |= PORT_CONTROL_FRAME_MODE_DSA;
> }
>
> @@ -2414,7 +2410,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> PORT_CONTROL_FORWARD_UNKNOWN_MC;
> }
> if (reg) {
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_CONTROL, reg);
> if (ret)
> goto abort;
> @@ -2423,15 +2419,15 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> /* If this port is connected to a SerDes, make sure the SerDes is not
> * powered down.
> */
> - if (mv88e6xxx_6352_family(ds)) {
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(port), PORT_STATUS);
> + if (mv88e6xxx_6352_family(ps)) {
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(port), PORT_STATUS);
> if (ret < 0)
> goto abort;
> ret &= PORT_STATUS_CMODE_MASK;
> if ((ret == PORT_STATUS_CMODE_100BASE_X) ||
> (ret == PORT_STATUS_CMODE_1000BASE_X) ||
> (ret == PORT_STATUS_CMODE_SGMII)) {
> - ret = mv88e6xxx_power_on_serdes(ds);
> + ret = mv88e6xxx_power_on_serdes(ps);
> if (ret < 0)
> goto abort;
> }
> @@ -2444,17 +2440,17 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> * copy of all transmitted/received frames on this port to the CPU.
> */
> reg = 0;
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6095_family(ds) || mv88e6xxx_6320_family(ds) ||
> - mv88e6xxx_6185_family(ds))
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6095_family(ps) || mv88e6xxx_6320_family(ps) ||
> + mv88e6xxx_6185_family(ps))
> reg = PORT_CONTROL_2_MAP_DA;
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6320_family(ds))
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6320_family(ps))
> reg |= PORT_CONTROL_2_JUMBO_10240;
>
> - if (mv88e6xxx_6095_family(ds) || mv88e6xxx_6185_family(ds)) {
> + if (mv88e6xxx_6095_family(ps) || mv88e6xxx_6185_family(ps)) {
> /* Set the upstream port this port should use */
> reg |= dsa_upstream_port(ds);
> /* enable forwarding of unknown multicast addresses to
> @@ -2467,7 +2463,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> reg |= PORT_CONTROL_2_8021Q_DISABLED;
>
> if (reg) {
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_CONTROL_2, reg);
> if (ret)
> goto abort;
> @@ -2483,24 +2479,24 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> if (dsa_is_cpu_port(ds, port))
> reg = 0;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_ASSOC_VECTOR, reg);
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_ASSOC_VECTOR, reg);
> if (ret)
> goto abort;
>
> /* Egress rate control 2: disable egress rate control. */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_RATE_CONTROL_2,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_RATE_CONTROL_2,
> 0x0000);
> if (ret)
> goto abort;
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> /* Do not limit the period of time that this port can
> * be paused for by the remote end or the period of
> * time that this port can pause the remote end.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_PAUSE_CTRL, 0x0000);
> if (ret)
> goto abort;
> @@ -2509,12 +2505,12 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> * address database entries that this port is allowed
> * to use.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_ATU_CONTROL, 0x0000);
> /* Priority Override: disable DA, SA and VTU priority
> * override.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_PRI_OVERRIDE, 0x0000);
> if (ret)
> goto abort;
> @@ -2522,14 +2518,14 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> /* Port Ethertype: use the Ethertype DSA Ethertype
> * value.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_ETH_TYPE, ETH_P_EDSA);
> if (ret)
> goto abort;
> /* Tag Remap: use an identity 802.1p prio -> switch
> * prio mapping.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_TAG_REGMAP_0123, 0x3210);
> if (ret)
> goto abort;
> @@ -2537,18 +2533,18 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> /* Tag Remap 2: use an identity 802.1p prio -> switch
> * prio mapping.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_TAG_REGMAP_4567, 0x7654);
> if (ret)
> goto abort;
> }
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6185_family(ds) || mv88e6xxx_6095_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6185_family(ps) || mv88e6xxx_6095_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> /* Rate Control: disable ingress rate limiting. */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port),
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port),
> PORT_RATE_CONTROL, 0x0001);
> if (ret)
> goto abort;
> @@ -2557,7 +2553,7 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> /* Port Control 1: disable trunking, disable sending
> * learning messages to this port.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_CONTROL_1, 0x0000);
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_CONTROL_1, 0x0000);
> if (ret)
> goto abort;
>
> @@ -2565,18 +2561,18 @@ static int mv88e6xxx_setup_port(struct dsa_switch *ds, int port)
> * database, and allow bidirectional communication between the
> * CPU and DSA port(s), and the other ports.
> */
> - ret = _mv88e6xxx_port_fid_set(ds, port, 0);
> + ret = _mv88e6xxx_port_fid_set(ps, port, 0);
> if (ret)
> goto abort;
>
> - ret = _mv88e6xxx_port_based_vlan_map(ds, port);
> + ret = _mv88e6xxx_port_based_vlan_map(ps, port);
> if (ret)
> goto abort;
>
> /* Default VLAN ID and priority: don't set a default VLAN
> * ID, and set the default packet priority to zero.
> */
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(port), PORT_DEFAULT_VLAN,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(port), PORT_DEFAULT_VLAN,
> 0x0000);
> abort:
> mutex_unlock(&ps->smi_mutex);
> @@ -2597,11 +2593,8 @@ int mv88e6xxx_setup_ports(struct dsa_switch *ds)
> return 0;
> }
>
> -int mv88e6xxx_setup_common(struct dsa_switch *ds)
> +int mv88e6xxx_setup_common(struct mv88e6xxx_priv_state *ps)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> - ps->ds = ds;
> mutex_init(&ps->smi_mutex);
>
> INIT_WORK(&ps->bridge_work, mv88e6xxx_bridge_work);
> @@ -2620,46 +2613,46 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> * enable address learn messages to be sent to all message
> * ports.
> */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_ATU_CONTROL,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_ATU_CONTROL,
> 0x0140 | GLOBAL_ATU_CONTROL_LEARN2ALL);
> if (err)
> goto unlock;
>
> /* Configure the IP ToS mapping registers. */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_0, 0x0000);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_0, 0x0000);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_1, 0x0000);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_1, 0x0000);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_2, 0x5555);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_2, 0x5555);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_3, 0x5555);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_3, 0x5555);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_4, 0xaaaa);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_4, 0xaaaa);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_5, 0xaaaa);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_5, 0xaaaa);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_6, 0xffff);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_6, 0xffff);
> if (err)
> goto unlock;
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IP_PRI_7, 0xffff);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IP_PRI_7, 0xffff);
> if (err)
> goto unlock;
>
> /* Configure the IEEE 802.1p priority mapping register. */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_IEEE_PRI, 0xfa41);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_IEEE_PRI, 0xfa41);
> if (err)
> goto unlock;
>
> /* Send all frames with destination addresses matching
> * 01:80:c2:00:00:0x to the CPU port.
> */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_MGMT_EN_0X, 0xffff);
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_MGMT_EN_0X, 0xffff);
> if (err)
> goto unlock;
>
> @@ -2668,7 +2661,7 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> * highest, and send all special multicast frames to the CPU
> * port at the highest priority.
> */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_SWITCH_MGMT,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_SWITCH_MGMT,
> 0x7 | GLOBAL2_SWITCH_MGMT_RSVD2CPU | 0x70 |
> GLOBAL2_SWITCH_MGMT_FORCE_FLOW_CTRL_PRI);
> if (err)
> @@ -2683,7 +2676,7 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> nexthop = ds->pd->rtable[i] & 0x1f;
>
> err = _mv88e6xxx_reg_write(
> - ds, REG_GLOBAL2,
> + ps, REG_GLOBAL2,
> GLOBAL2_DEVICE_MAPPING,
> GLOBAL2_DEVICE_MAPPING_UPDATE |
> (i << GLOBAL2_DEVICE_MAPPING_TARGET_SHIFT) | nexthop);
> @@ -2693,7 +2686,7 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
>
> /* Clear all trunk masks. */
> for (i = 0; i < 8; i++) {
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2, GLOBAL2_TRUNK_MASK,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2, GLOBAL2_TRUNK_MASK,
> 0x8000 |
> (i << GLOBAL2_TRUNK_MASK_NUM_SHIFT) |
> ((1 << ps->info->num_ports) - 1));
> @@ -2704,7 +2697,7 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> /* Clear all trunk mappings. */
> for (i = 0; i < 16; i++) {
> err = _mv88e6xxx_reg_write(
> - ds, REG_GLOBAL2,
> + ps, REG_GLOBAL2,
> GLOBAL2_TRUNK_MAPPING,
> GLOBAL2_TRUNK_MAPPING_UPDATE |
> (i << GLOBAL2_TRUNK_MAPPING_ID_SHIFT));
> @@ -2712,13 +2705,13 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> goto unlock;
> }
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> /* Send all frames with destination addresses matching
> * 01:80:c2:00:00:2x to the CPU port.
> */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2,
> GLOBAL2_MGMT_EN_2X, 0xffff);
> if (err)
> goto unlock;
> @@ -2726,14 +2719,14 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> /* Initialise cross-chip port VLAN table to reset
> * defaults.
> */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2,
> GLOBAL2_PVT_ADDR, 0x9000);
> if (err)
> goto unlock;
>
> /* Clear the priority override table. */
> for (i = 0; i < 16; i++) {
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2,
> GLOBAL2_PRIO_OVERRIDE,
> 0x8000 | (i << 8));
> if (err)
> @@ -2741,16 +2734,16 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> }
> }
>
> - if (mv88e6xxx_6352_family(ds) || mv88e6xxx_6351_family(ds) ||
> - mv88e6xxx_6165_family(ds) || mv88e6xxx_6097_family(ds) ||
> - mv88e6xxx_6185_family(ds) || mv88e6xxx_6095_family(ds) ||
> - mv88e6xxx_6320_family(ds)) {
> + if (mv88e6xxx_6352_family(ps) || mv88e6xxx_6351_family(ps) ||
> + mv88e6xxx_6165_family(ps) || mv88e6xxx_6097_family(ps) ||
> + mv88e6xxx_6185_family(ps) || mv88e6xxx_6095_family(ps) ||
> + mv88e6xxx_6320_family(ps)) {
> /* Disable ingress rate limiting by resetting all
> * ingress rate limit registers to their initial
> * state.
> */
> for (i = 0; i < ps->info->num_ports; i++) {
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL2,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL2,
> GLOBAL2_INGRESS_OP,
> 0x9000 | (i << 8));
> if (err)
> @@ -2759,34 +2752,33 @@ int mv88e6xxx_setup_global(struct dsa_switch *ds)
> }
>
> /* Clear the statistics counters for all ports */
> - err = _mv88e6xxx_reg_write(ds, REG_GLOBAL, GLOBAL_STATS_OP,
> + err = _mv88e6xxx_reg_write(ps, REG_GLOBAL, GLOBAL_STATS_OP,
> GLOBAL_STATS_OP_FLUSH_ALL);
> if (err)
> goto unlock;
>
> /* Wait for the flush to complete. */
> - err = _mv88e6xxx_stats_wait(ds);
> + err = _mv88e6xxx_stats_wait(ps);
> if (err < 0)
> goto unlock;
>
> /* Clear all ATU entries */
> - err = _mv88e6xxx_atu_flush(ds, 0, true);
> + err = _mv88e6xxx_atu_flush(ps, 0, true);
> if (err < 0)
> goto unlock;
>
> /* Clear all the VTU and STU entries */
> - err = _mv88e6xxx_vtu_stu_flush(ds);
> + err = _mv88e6xxx_vtu_stu_flush(ps);
> unlock:
> mutex_unlock(&ps->smi_mutex);
>
> return err;
> }
>
> -int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool ppu_active)
> +int mv88e6xxx_switch_reset(struct mv88e6xxx_priv_state *ps, bool ppu_active)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> u16 is_reset = (ppu_active ? 0x8800 : 0xc800);
> - struct gpio_desc *gpiod = ds->pd->reset;
> + struct gpio_desc *gpiod = ps->ds->pd->reset;
> unsigned long timeout;
> int ret;
> int i;
> @@ -2795,11 +2787,11 @@ int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool ppu_active)
>
> /* Set all ports to the disabled state. */
> for (i = 0; i < ps->info->num_ports; i++) {
> - ret = _mv88e6xxx_reg_read(ds, REG_PORT(i), PORT_CONTROL);
> + ret = _mv88e6xxx_reg_read(ps, REG_PORT(i), PORT_CONTROL);
> if (ret < 0)
> goto unlock;
>
> - ret = _mv88e6xxx_reg_write(ds, REG_PORT(i), PORT_CONTROL,
> + ret = _mv88e6xxx_reg_write(ps, REG_PORT(i), PORT_CONTROL,
> ret & 0xfffc);
> if (ret)
> goto unlock;
> @@ -2821,16 +2813,16 @@ int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool ppu_active)
> * through global registers 0x18 and 0x19.
> */
> if (ppu_active)
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, 0x04, 0xc000);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x04, 0xc000);
> else
> - ret = _mv88e6xxx_reg_write(ds, REG_GLOBAL, 0x04, 0xc400);
> + ret = _mv88e6xxx_reg_write(ps, REG_GLOBAL, 0x04, 0xc400);
> if (ret)
> goto unlock;
>
> /* Wait up to one second for reset to complete. */
> timeout = jiffies + 1 * HZ;
> while (time_before(jiffies, timeout)) {
> - ret = _mv88e6xxx_reg_read(ds, REG_GLOBAL, 0x00);
> + ret = _mv88e6xxx_reg_read(ps, REG_GLOBAL, 0x00);
> if (ret < 0)
> goto unlock;
>
> @@ -2854,7 +2846,7 @@ int mv88e6xxx_phy_page_read(struct dsa_switch *ds, int port, int page, int reg)
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_page_read(ds, port, page, reg);
> + ret = _mv88e6xxx_phy_page_read(ps, port, page, reg);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> @@ -2867,16 +2859,15 @@ int mv88e6xxx_phy_page_write(struct dsa_switch *ds, int port, int page,
> int ret;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_page_write(ds, port, page, reg, val);
> + ret = _mv88e6xxx_phy_page_write(ps, port, page, reg, val);
> mutex_unlock(&ps->smi_mutex);
>
> return ret;
> }
>
> -static int mv88e6xxx_port_to_phy_addr(struct dsa_switch *ds, int port)
> +static int mv88e6xxx_port_to_phy_addr(struct mv88e6xxx_priv_state *ps,
> + int port)
> {
> - struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> -
> if (port >= 0 && port < ps->info->num_ports)
> return port;
> return -EINVAL;
> @@ -2886,14 +2877,14 @@ int
> mv88e6xxx_phy_read(struct dsa_switch *ds, int port, int regnum)
> {
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> - int addr = mv88e6xxx_port_to_phy_addr(ds, port);
> + int addr = mv88e6xxx_port_to_phy_addr(ps, port);
> int ret;
>
> if (addr < 0)
> - return addr;
> + return 0xffff;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_read(ds, addr, regnum);
> + ret = _mv88e6xxx_phy_read(ps, addr, regnum);
> mutex_unlock(&ps->smi_mutex);
> return ret;
> }
> @@ -2902,14 +2893,14 @@ int
> mv88e6xxx_phy_write(struct dsa_switch *ds, int port, int regnum, u16 val)
> {
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> - int addr = mv88e6xxx_port_to_phy_addr(ds, port);
> + int addr = mv88e6xxx_port_to_phy_addr(ps, port);
> int ret;
>
> if (addr < 0)
> - return addr;
> + return 0xffff;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_write(ds, addr, regnum, val);
> + ret = _mv88e6xxx_phy_write(ps, addr, regnum, val);
> mutex_unlock(&ps->smi_mutex);
> return ret;
> }
> @@ -2918,14 +2909,14 @@ int
> mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int port, int regnum)
> {
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> - int addr = mv88e6xxx_port_to_phy_addr(ds, port);
> + int addr = mv88e6xxx_port_to_phy_addr(ps, port);
> int ret;
>
> if (addr < 0)
> - return addr;
> + return 0xffff;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_read_indirect(ds, addr, regnum);
> + ret = _mv88e6xxx_phy_read_indirect(ps, addr, regnum);
> mutex_unlock(&ps->smi_mutex);
> return ret;
> }
> @@ -2935,14 +2926,14 @@ mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int port, int regnum,
> u16 val)
> {
> struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> - int addr = mv88e6xxx_port_to_phy_addr(ds, port);
> + int addr = mv88e6xxx_port_to_phy_addr(ps, port);
> int ret;
>
> if (addr < 0)
> return addr;
>
> mutex_lock(&ps->smi_mutex);
> - ret = _mv88e6xxx_phy_write_indirect(ds, addr, regnum, val);
> + ret = _mv88e6xxx_phy_write_indirect(ps, addr, regnum, val);
> mutex_unlock(&ps->smi_mutex);
> return ret;
> }
> @@ -2959,44 +2950,45 @@ static int mv88e61xx_get_temp(struct dsa_switch *ds, int *temp)
>
> mutex_lock(&ps->smi_mutex);
>
> - ret = _mv88e6xxx_phy_write(ds, 0x0, 0x16, 0x6);
> + ret = _mv88e6xxx_phy_write(ps, 0x0, 0x16, 0x6);
> if (ret < 0)
> goto error;
>
> /* Enable temperature sensor */
> - ret = _mv88e6xxx_phy_read(ds, 0x0, 0x1a);
> + ret = _mv88e6xxx_phy_read(ps, 0x0, 0x1a);
> if (ret < 0)
> goto error;
>
> - ret = _mv88e6xxx_phy_write(ds, 0x0, 0x1a, ret | (1 << 5));
> + ret = _mv88e6xxx_phy_write(ps, 0x0, 0x1a, ret | (1 << 5));
> if (ret < 0)
> goto error;
>
> /* Wait for temperature to stabilize */
> usleep_range(10000, 12000);
>
> - val = _mv88e6xxx_phy_read(ds, 0x0, 0x1a);
> + val = _mv88e6xxx_phy_read(ps, 0x0, 0x1a);
> if (val < 0) {
> ret = val;
> goto error;
> }
>
> /* Disable temperature sensor */
> - ret = _mv88e6xxx_phy_write(ds, 0x0, 0x1a, ret & ~(1 << 5));
> + ret = _mv88e6xxx_phy_write(ps, 0x0, 0x1a, ret & ~(1 << 5));
> if (ret < 0)
> goto error;
>
> *temp = ((val & 0x1f) - 5) * 5;
>
> error:
> - _mv88e6xxx_phy_write(ds, 0x0, 0x16, 0x0);
> + _mv88e6xxx_phy_write(ps, 0x0, 0x16, 0x0);
> mutex_unlock(&ps->smi_mutex);
> return ret;
> }
>
> static int mv88e63xx_get_temp(struct dsa_switch *ds, int *temp)
> {
> - int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
> int ret;
>
> *temp = 0;
> @@ -3012,7 +3004,9 @@ static int mv88e63xx_get_temp(struct dsa_switch *ds, int *temp)
>
> int mv88e6xxx_get_temp(struct dsa_switch *ds, int *temp)
> {
> - if (mv88e6xxx_6320_family(ds) || mv88e6xxx_6352_family(ds))
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> +
> + if (mv88e6xxx_6320_family(ps) || mv88e6xxx_6352_family(ps))
> return mv88e63xx_get_temp(ds, temp);
>
> return mv88e61xx_get_temp(ds, temp);
> @@ -3020,10 +3014,11 @@ int mv88e6xxx_get_temp(struct dsa_switch *ds, int *temp)
>
> int mv88e6xxx_get_temp_limit(struct dsa_switch *ds, int *temp)
> {
> - int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
> int ret;
>
> - if (!mv88e6xxx_6320_family(ds) && !mv88e6xxx_6352_family(ds))
> + if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
> return -EOPNOTSUPP;
>
> *temp = 0;
> @@ -3039,10 +3034,11 @@ int mv88e6xxx_get_temp_limit(struct dsa_switch *ds, int *temp)
>
> int mv88e6xxx_set_temp_limit(struct dsa_switch *ds, int temp)
> {
> - int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
> int ret;
>
> - if (!mv88e6xxx_6320_family(ds) && !mv88e6xxx_6352_family(ds))
> + if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
> return -EOPNOTSUPP;
>
> ret = mv88e6xxx_phy_page_read(ds, phy, 6, 26);
> @@ -3055,10 +3051,11 @@ int mv88e6xxx_set_temp_limit(struct dsa_switch *ds, int temp)
>
> int mv88e6xxx_get_temp_alarm(struct dsa_switch *ds, bool *alarm)
> {
> - int phy = mv88e6xxx_6320_family(ds) ? 3 : 0;
> + struct mv88e6xxx_priv_state *ps = ds_to_priv(ds);
> + int phy = mv88e6xxx_6320_family(ps) ? 3 : 0;
> int ret;
>
> - if (!mv88e6xxx_6320_family(ds) && !mv88e6xxx_6352_family(ds))
> + if (!mv88e6xxx_6320_family(ps) && !mv88e6xxx_6352_family(ps))
> return -EOPNOTSUPP;
>
> *alarm = false;
> diff --git a/drivers/net/dsa/mv88e6xxx.h b/drivers/net/dsa/mv88e6xxx.h
> index 0dbe2d1..4f455d2 100644
> --- a/drivers/net/dsa/mv88e6xxx.h
> +++ b/drivers/net/dsa/mv88e6xxx.h
> @@ -388,6 +388,9 @@ struct mv88e6xxx_priv_state {
> /* The dsa_switch this private structure is related to */
> struct dsa_switch *ds;
>
> + /* The device this structure is associated to */
> + struct device *dev;
> +
> /* When using multi-chip addressing, this mutex protects
> * access to the indirect access registers. (In single-chip
> * mode, this mutex is effectively useless.)
> @@ -446,17 +449,18 @@ struct mv88e6xxx_hw_stat {
> enum stat_type type;
> };
>
> -int mv88e6xxx_switch_reset(struct dsa_switch *ds, bool ppu_active);
> +int mv88e6xxx_switch_reset(struct mv88e6xxx_priv_state *ps, bool ppu_active);
> const char *mv88e6xxx_drv_probe(struct device *dsa_dev, struct device *host_dev,
> int sw_addr, void **priv,
> const struct mv88e6xxx_info *table,
> unsigned int num);
>
> int mv88e6xxx_setup_ports(struct dsa_switch *ds);
> -int mv88e6xxx_setup_common(struct dsa_switch *ds);
> +int mv88e6xxx_setup_common(struct mv88e6xxx_priv_state *ps);
> int mv88e6xxx_setup_global(struct dsa_switch *ds);
> -int mv88e6xxx_reg_read(struct dsa_switch *ds, int addr, int reg);
> -int mv88e6xxx_reg_write(struct dsa_switch *ds, int addr, int reg, u16 val);
> +int mv88e6xxx_reg_read(struct mv88e6xxx_priv_state *ps, int addr, int reg);
> +int mv88e6xxx_reg_write(struct mv88e6xxx_priv_state *ps, int addr,
> + int reg, u16 val);
> int mv88e6xxx_set_addr_direct(struct dsa_switch *ds, u8 *addr);
> int mv88e6xxx_set_addr_indirect(struct dsa_switch *ds, u8 *addr);
> int mv88e6xxx_phy_read(struct dsa_switch *ds, int port, int regnum);
> @@ -464,7 +468,7 @@ int mv88e6xxx_phy_write(struct dsa_switch *ds, int port, int regnum, u16 val);
> int mv88e6xxx_phy_read_indirect(struct dsa_switch *ds, int port, int regnum);
> int mv88e6xxx_phy_write_indirect(struct dsa_switch *ds, int port, int regnum,
> u16 val);
> -void mv88e6xxx_ppu_state_init(struct dsa_switch *ds);
> +void mv88e6xxx_ppu_state_init(struct mv88e6xxx_priv_state *ps);
> int mv88e6xxx_phy_read_ppu(struct dsa_switch *ds, int addr, int regnum);
> int mv88e6xxx_phy_write_ppu(struct dsa_switch *ds, int addr,
> int regnum, u16 val);
> --
> 2.8.0
>
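For readers skimming the diff above: it keeps the driver's split between public functions that take smi_mutex and underscore-prefixed helpers that expect the mutex to be held already, and it changes the helpers to take the private state pointer directly. A minimal userspace sketch of that locking pattern follows. Everything in it (struct toy_priv, the toy_* functions, the _nolock suffix, the fake register file) is an assumption made for illustration and is not part of the mv88e6xxx driver.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define TOY_NUM_REGS 32

/* Hypothetical stand-in for the driver's private state. */
struct toy_priv {
	pthread_mutex_t smi_mutex;    /* serialises all register access */
	uint16_t regs[TOY_NUM_REGS];  /* fake register file */
};

/* Unlocked helpers: the caller must already hold smi_mutex. */
static int toy_reg_read_nolock(struct toy_priv *ps, int reg)
{
	if (reg < 0 || reg >= TOY_NUM_REGS)
		return -EINVAL;
	return ps->regs[reg];
}

static int toy_reg_write_nolock(struct toy_priv *ps, int reg, uint16_t val)
{
	if (reg < 0 || reg >= TOY_NUM_REGS)
		return -EINVAL;
	ps->regs[reg] = val;
	return 0;
}

/* Locked wrappers: take the mutex, call the helper, release the mutex. */
static int toy_reg_read(struct toy_priv *ps, int reg)
{
	int ret;

	pthread_mutex_lock(&ps->smi_mutex);
	ret = toy_reg_read_nolock(ps, reg);
	pthread_mutex_unlock(&ps->smi_mutex);
	return ret;
}

static int toy_reg_write(struct toy_priv *ps, int reg, uint16_t val)
{
	int ret;

	pthread_mutex_lock(&ps->smi_mutex);
	ret = toy_reg_write_nolock(ps, reg, val);
	pthread_mutex_unlock(&ps->smi_mutex);
	return ret;
}

/* Read-modify-write under a single lock hold, chaining the unlocked helpers. */
static int toy_reg_set_bits(struct toy_priv *ps, int reg, uint16_t bits)
{
	int ret;

	pthread_mutex_lock(&ps->smi_mutex);
	ret = toy_reg_read_nolock(ps, reg);
	if (ret >= 0)
		ret = toy_reg_write_nolock(ps, reg, (uint16_t)(ret | bits));
	pthread_mutex_unlock(&ps->smi_mutex);
	return ret;
}
```

Passing the private state to the helpers means code that only needs the mutex and the register accessors no longer has to start from a struct dsa_switch, which appears to be the point of the conversion in the quoted patch.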
* Re: [PATCH v2 net-next 7/7] tcp: make tcp_sendmsg() aware of socket backlog
From: Soheil Hassas Yeganeh @ 2016-04-29 13:13 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, netdev, Alexei Starovoitov,
Marcelo Ricardo Leitner, Eric Dumazet
In-Reply-To: <1461899449-8096-8-git-send-email-edumazet@google.com>
On Thu, Apr 28, 2016 at 11:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> Large sendmsg()/write() hold socket lock for the duration of the call,
> unless sk->sk_sndbuf limit is hit. This is bad because incoming packets
> are parked into socket backlog for a long time.
> Critical decisions like fast retransmit might be delayed.
> Receivers have to maintain a big out of order queue with additional cpu
> overhead, and also possible stalls in TX once windows are full.
>
> Bidirectional flows are particularly hurt since the backlog can become
> quite big if the copy from user space triggers IO (page faults)
>
> Some applications learnt to use sendmsg() (or sendmmsg()) with small
> chunks to avoid this issue.
>
> Kernel should know better, right ?
>
> Add a generic sk_flush_backlog() helper and use it right
> before a new skb is allocated. Typically we put 64KB of payload
> per skb (unless MSG_EOR is requested) and checking socket backlog
> every 64KB gives good results.
>
> As a matter of fact, tests with TSO/GSO disabled give very nice
> results, as we manage to keep a small write queue and smaller
> perceived rtt.
>
> Note that sk_flush_backlog() maintains socket ownership,
> so is not equivalent to a {release_sock(sk); lock_sock(sk);},
> to ensure implicit atomicity rules that sendmsg() was
> giving to (possibly buggy) applications.
>
> In this simple implementation, I chose to not call tcp_release_cb(),
> but we might consider this later.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Cc: Soheil Hassas Yeganeh <soheil@google.com>
> Cc: Alexei Starovoitov <ast@fb.com>
> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
> include/net/sock.h | 11 +++++++++++
> net/core/sock.c | 7 +++++++
> net/ipv4/tcp.c | 8 ++++++--
> 3 files changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/include/net/sock.h b/include/net/sock.h
> index 3df778ccaa82..1dbb1f9f7c1b 100644
> --- a/include/net/sock.h
> +++ b/include/net/sock.h
> @@ -926,6 +926,17 @@ void sk_stream_kill_queues(struct sock *sk);
> void sk_set_memalloc(struct sock *sk);
> void sk_clear_memalloc(struct sock *sk);
>
> +void __sk_flush_backlog(struct sock *sk);
> +
> +static inline bool sk_flush_backlog(struct sock *sk)
> +{
> + if (unlikely(READ_ONCE(sk->sk_backlog.tail))) {
> + __sk_flush_backlog(sk);
> + return true;
> + }
> + return false;
> +}
> +
> int sk_wait_data(struct sock *sk, long *timeo, const struct sk_buff *skb);
>
> struct request_sock_ops;
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 70744dbb6c3f..f615e9391170 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -2048,6 +2048,13 @@ static void __release_sock(struct sock *sk)
> sk->sk_backlog.len = 0;
> }
>
> +void __sk_flush_backlog(struct sock *sk)
> +{
> + spin_lock_bh(&sk->sk_lock.slock);
> + __release_sock(sk);
> + spin_unlock_bh(&sk->sk_lock.slock);
> +}
> +
> /**
> * sk_wait_data - wait for data to arrive at sk_receive_queue
> * @sk: sock to wait on
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 4787f86ae64c..b945c2b046c5 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1136,11 +1136,12 @@ int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
> /* This should be in poll */
> sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
>
> - mss_now = tcp_send_mss(sk, &size_goal, flags);
> -
> /* Ok commence sending. */
> copied = 0;
>
> +restart:
> + mss_now = tcp_send_mss(sk, &size_goal, flags);
> +
> err = -EPIPE;
> if (sk->sk_err || (sk->sk_shutdown & SEND_SHUTDOWN))
> goto out_err;
> @@ -1166,6 +1167,9 @@ new_segment:
> if (!sk_stream_memory_free(sk))
> goto wait_for_sndbuf;
>
> + if (sk_flush_backlog(sk))
> + goto restart;
> +
> skb = sk_stream_alloc_skb(sk,
> select_size(sk, sg),
> sk->sk_allocation,
> --
> 2.8.0.rc3.226.g39d4020
>
This is superb Eric! Thanks.
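To make the effect described in the commit message concrete, here is a toy userspace model of checking the backlog once per chunk while a large send is in progress. It is only a sketch under stated assumptions: struct toy_sock, the toy_* functions and the "packets arriving per chunk" parameter are all invented for illustration, and nothing here is kernel API. The real patch also jumps back to the restart label after a successful flush so that mss_now is recomputed; the toy has no MSS, so that step is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in the kernel. */
struct toy_sock {
	size_t backlog_len;  /* packets parked while the sender owns the socket */
	size_t processed;    /* packets drained from the backlog so far */
	size_t max_backlog;  /* largest backlog observed at flush time */
};

/* Slow path: drain the backlog without giving up socket ownership. */
static void toy_flush_backlog_slow(struct toy_sock *sk)
{
	if (sk->backlog_len > sk->max_backlog)
		sk->max_backlog = sk->backlog_len;
	sk->processed += sk->backlog_len;
	sk->backlog_len = 0;
}

/* Fast path: a cheap emptiness test, then the slow path only if needed. */
static bool toy_flush_backlog(struct toy_sock *sk)
{
	if (sk->backlog_len) {
		toy_flush_backlog_slow(sk);
		return true;
	}
	return false;
}

/*
 * Copy `size` bytes in chunks of `chunk` bytes. While each chunk is copied,
 * `arrivals_per_chunk` packets arrive and are parked in the backlog.
 * Returns the number of bytes copied.
 */
static size_t toy_sendmsg(struct toy_sock *sk, size_t size, size_t chunk,
			  size_t arrivals_per_chunk, bool flush_between_chunks)
{
	size_t copied = 0;

	assert(chunk > 0);
	while (copied < size) {
		size_t left = size - copied;
		size_t n = left < chunk ? left : chunk;

		/* Before starting a new chunk, service anything parked. */
		if (flush_between_chunks)
			toy_flush_backlog(sk);

		copied += n;                           /* stands in for the user copy */
		sk->backlog_len += arrivals_per_chunk; /* packets arrive meanwhile */
	}
	return copied;
}
```

With a 256KB send in 64KB chunks and three packets arriving per chunk, the flushing variant never lets the backlog grow past three packets, while the non-flushing variant ends the call with all twelve still parked. That is the bounded-backlog behaviour the commit message is after, in miniature.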
* Re: [PATCH v2 net-next 1/7] tcp: do not assume TCP code is non preemptible
From: Soheil Hassas Yeganeh @ 2016-04-29 13:18 UTC (permalink / raw)
To: Eric Dumazet
Cc: David S . Miller, netdev, Alexei Starovoitov,
Marcelo Ricardo Leitner, Eric Dumazet
In-Reply-To: <1461899449-8096-2-git-send-email-edumazet@google.com>
On Thu, Apr 28, 2016 at 11:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> We want to make the TCP stack preemptible, as draining the prequeue
> and backlog queues can take a lot of time.
>
> Many SNMP updates were assuming that BH (and preemption) was disabled.
>
> Need to convert some __NET_INC_STATS() calls to NET_INC_STATS()
> and some __TCP_INC_STATS() to TCP_INC_STATS()
>
> Before using this_cpu_ptr(net->ipv4.tcp_sk) in tcp_v4_send_reset()
> and tcp_v4_send_ack(), we add an explicit preempt disabled section.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> ---
> net/ipv4/tcp.c | 2 +-
> net/ipv4/tcp_cdg.c | 20 +++++-----
> net/ipv4/tcp_cubic.c | 20 +++++-----
> net/ipv4/tcp_fastopen.c | 12 +++---
> net/ipv4/tcp_input.c | 96 ++++++++++++++++++++++++------------------------
> net/ipv4/tcp_ipv4.c | 14 ++++---
> net/ipv4/tcp_minisocks.c | 2 +-
> net/ipv4/tcp_output.c | 11 +++---
> net/ipv4/tcp_recovery.c | 4 +-
> net/ipv4/tcp_timer.c | 10 +++--
> net/ipv6/tcp_ipv6.c | 12 +++---
> 11 files changed, 104 insertions(+), 99 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index cb4d1cabb42c..b24c6ed4a04f 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -3095,7 +3095,7 @@ void tcp_done(struct sock *sk)
> struct request_sock *req = tcp_sk(sk)->fastopen_rsk;
>
> if (sk->sk_state == TCP_SYN_SENT || sk->sk_state == TCP_SYN_RECV)
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_ATTEMPTFAILS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_ATTEMPTFAILS);
>
> tcp_set_state(sk, TCP_CLOSE);
> tcp_clear_xmit_timers(sk);
> diff --git a/net/ipv4/tcp_cdg.c b/net/ipv4/tcp_cdg.c
> index 3c00208c37f4..4e3007845888 100644
> --- a/net/ipv4/tcp_cdg.c
> +++ b/net/ipv4/tcp_cdg.c
> @@ -155,11 +155,11 @@ static void tcp_cdg_hystart_update(struct sock *sk)
>
> ca->last_ack = now_us;
> if (after(now_us, ca->round_start + base_owd)) {
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTTRAINDETECT);
> - __NET_ADD_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTTRAINCWND,
> - tp->snd_cwnd);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTTRAINDETECT);
> + NET_ADD_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTTRAINCWND,
> + pp>>sn__cwdd);
nit: shouldn't this be tp->snd_cwnd?
> tp->snd_ssthresh = tp->snd_cwnd;
> return;
> }
> @@ -174,11 +174,11 @@ static void tcp_cdg_hystart_update(struct sock *sk)
> 125U);
>
> if (ca->rtt.min > thresh) {
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTDELAYDETECT);
> - __NET_ADD_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTDELAYCWND,
> - tp->snd_cwnd);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTDELAYDETECT);
> + NET_ADD_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTDELAYCWND,
> + tp->snd_cwnd);
> tp->snd_ssthresh = tp->snd_cwnd;
> }
> }
> diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
> index 59155af9de5d..0ce946e395e1 100644
> --- a/net/ipv4/tcp_cubic.c
> +++ b/net/ipv4/tcp_cubic.c
> @@ -402,11 +402,11 @@ static void hystart_update(struct sock *sk, u32 delay)
> ca->last_ack = now;
> if ((s32)(now - ca->round_start) > ca->delay_min >> 4) {
> ca->found |= HYSTART_ACK_TRAIN;
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTTRAINDETECT);
> - __NET_ADD_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTTRAINCWND,
> - tp->snd_cwnd);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTTRAINDETECT);
> + NET_ADD_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTTRAINCWND,
> + tp->snd_cwnd);
> tp->snd_ssthresh = tp->snd_cwnd;
> }
> }
> @@ -423,11 +423,11 @@ static void hystart_update(struct sock *sk, u32 delay)
> if (ca->curr_rtt > ca->delay_min +
> HYSTART_DELAY_THRESH(ca->delay_min >> 3)) {
> ca->found |= HYSTART_DELAY;
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTDELAYDETECT);
> - __NET_ADD_STATS(sock_net(sk),
> - LINUX_MIB_TCPHYSTARTDELAYCWND,
> - tp->snd_cwnd);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTDELAYDETECT);
> + NET_ADD_STATS(sock_net(sk),
> + LINUX_MIB_TCPHYSTARTDELAYCWND,
> + tp->snd_cwnd);
> tp->snd_ssthresh = tp->snd_cwnd;
> }
> }
> diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
> index a1498d507e42..54d9f9b0120f 100644
> --- a/net/ipv4/tcp_fastopen.c
> +++ b/net/ipv4/tcp_fastopen.c
> @@ -255,9 +255,9 @@ static bool tcp_fastopen_queue_check(struct sock *sk)
> spin_lock(&fastopenq->lock);
> req1 = fastopenq->rskq_rst_head;
> if (!req1 || time_after(req1->rsk_timer.expires, jiffies)) {
> - spin_unlock(&fastopenq->lock);
> __NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPFASTOPENLISTENOVERFLOW);
> + spin_unlock(&fastopenq->lock);
> return false;
> }
> fastopenq->rskq_rst_head = req1->dl_next;
> @@ -282,7 +282,7 @@ struct sock *tcp_try_fastopen(struct sock *sk, struct sk_buff *skb,
> struct sock *child;
>
> if (foc->len == 0) /* Client requests a cookie */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENCOOKIEREQD);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENCOOKIEREQD);
>
> if (!((sysctl_tcp_fastopen & TFO_SERVER_ENABLE) &&
> (syn_data || foc->len >= 0) &&
> @@ -311,13 +311,13 @@ fastopen:
> child = tcp_fastopen_create_child(sk, skb, dst, req);
> if (child) {
> foc->len = -1;
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPFASTOPENPASSIVE);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPFASTOPENPASSIVE);
> return child;
> }
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENPASSIVEFAIL);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENPASSIVEFAIL);
> } else if (foc->len > 0) /* Client presents an invalid cookie */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENPASSIVEFAIL);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPFASTOPENPASSIVEFAIL);
>
> valid_foc.exp = foc->exp;
> *foc = valid_foc;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 1fb19c91e091..ac85fb42a5a2 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -869,7 +869,7 @@ static void tcp_update_reordering(struct sock *sk, const int metric,
> else
> mib_idx = LINUX_MIB_TCPSACKREORDER;
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
> #if FASTRETRANS_DEBUG > 1
> pr_debug("Disorder%d %d %u f%u s%u rr%d\n",
> tp->rx_opt.sack_ok, inet_csk(sk)->icsk_ca_state,
> @@ -1062,7 +1062,7 @@ static bool tcp_check_dsack(struct sock *sk, const struct sk_buff *ack_skb,
> if (before(start_seq_0, TCP_SKB_CB(ack_skb)->ack_seq)) {
> dup_sack = true;
> tcp_dsack_seen(tp);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDSACKRECV);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDSACKRECV);
> } else if (num_sacks > 1) {
> u32 end_seq_1 = get_unaligned_be32(&sp[1].end_seq);
> u32 start_seq_1 = get_unaligned_be32(&sp[1].start_seq);
> @@ -1071,7 +1071,7 @@ static bool tcp_check_dsack(struct sock *sk, const struct sk_buff *ack_skb,
> !before(start_seq_0, start_seq_1)) {
> dup_sack = true;
> tcp_dsack_seen(tp);
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPDSACKOFORECV);
> }
> }
> @@ -1289,7 +1289,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
>
> if (skb->len > 0) {
> BUG_ON(!tcp_skb_pcount(skb));
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKSHIFTED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKSHIFTED);
> return false;
> }
>
> @@ -1314,7 +1314,7 @@ static bool tcp_shifted_skb(struct sock *sk, struct sk_buff *skb,
> tcp_unlink_write_queue(skb, sk);
> sk_wmem_free_skb(sk, skb);
>
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKMERGED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKMERGED);
>
> return true;
> }
> @@ -1473,7 +1473,7 @@ noop:
> return skb;
>
> fallback:
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKSHIFTFALLBACK);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_SACKSHIFTFALLBACK);
> return NULL;
> }
>
> @@ -1661,7 +1661,7 @@ tcp_sacktag_write_queue(struct sock *sk, const struct sk_buff *ack_skb,
> mib_idx = LINUX_MIB_TCPSACKDISCARD;
> }
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
> if (i == 0)
> first_sack_index = -1;
> continue;
> @@ -1913,7 +1913,7 @@ void tcp_enter_loss(struct sock *sk)
> skb = tcp_write_queue_head(sk);
> is_reneg = skb && (TCP_SKB_CB(skb)->sacked & TCPCB_SACKED_ACKED);
> if (is_reneg) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSACKRENEGING);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSACKRENEGING);
> tp->sacked_out = 0;
> tp->fackets_out = 0;
> }
> @@ -2399,7 +2399,7 @@ static bool tcp_try_undo_recovery(struct sock *sk)
> else
> mib_idx = LINUX_MIB_TCPFULLUNDO;
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
> }
> if (tp->snd_una == tp->high_seq && tcp_is_reno(tp)) {
> /* Hold old state until something *above* high_seq
> @@ -2421,7 +2421,7 @@ static bool tcp_try_undo_dsack(struct sock *sk)
> if (tp->undo_marker && !tp->undo_retrans) {
> DBGUNDO(sk, "D-SACK");
> tcp_undo_cwnd_reduction(sk, false);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDSACKUNDO);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDSACKUNDO);
> return true;
> }
> return false;
> @@ -2436,9 +2436,9 @@ static bool tcp_try_undo_loss(struct sock *sk, bool frto_undo)
> tcp_undo_cwnd_reduction(sk, true);
>
> DBGUNDO(sk, "partial loss");
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPLOSSUNDO);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPLOSSUNDO);
> if (frto_undo)
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPSPURIOUSRTOS);
> inet_csk(sk)->icsk_retransmits = 0;
> if (frto_undo || tcp_is_sack(tp))
> @@ -2563,7 +2563,7 @@ static void tcp_mtup_probe_failed(struct sock *sk)
>
> icsk->icsk_mtup.search_high = icsk->icsk_mtup.probe_size - 1;
> icsk->icsk_mtup.probe_size = 0;
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMTUPFAIL);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMTUPFAIL);
> }
>
> static void tcp_mtup_probe_success(struct sock *sk)
> @@ -2583,7 +2583,7 @@ static void tcp_mtup_probe_success(struct sock *sk)
> icsk->icsk_mtup.search_low = icsk->icsk_mtup.probe_size;
> icsk->icsk_mtup.probe_size = 0;
> tcp_sync_mss(sk, icsk->icsk_pmtu_cookie);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMTUPSUCCESS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMTUPSUCCESS);
> }
>
> /* Do a simple retransmit without using the backoff mechanisms in
> @@ -2647,7 +2647,7 @@ static void tcp_enter_recovery(struct sock *sk, bool ece_ack)
> else
> mib_idx = LINUX_MIB_TCPSACKRECOVERY;
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
>
> tp->prior_ssthresh = 0;
> tcp_init_undo(tp);
> @@ -2740,7 +2740,7 @@ static bool tcp_try_undo_partial(struct sock *sk, const int acked)
>
> DBGUNDO(sk, "partial recovery");
> tcp_undo_cwnd_reduction(sk, true);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPARTIALUNDO);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPARTIALUNDO);
> tcp_try_keep_open(sk);
> return true;
> }
> @@ -3434,7 +3434,7 @@ bool tcp_oow_rate_limited(struct net *net, const struct sk_buff *skb,
> s32 elapsed = (s32)(tcp_time_stamp - *last_oow_ack_time);
>
> if (0 <= elapsed && elapsed < sysctl_tcp_invalid_ratelimit) {
> - __NET_INC_STATS(net, mib_idx);
> + NET_INC_STATS(net, mib_idx);
> return true; /* rate-limited: don't send yet! */
> }
> }
> @@ -3467,7 +3467,7 @@ static void tcp_send_challenge_ack(struct sock *sk, const struct sk_buff *skb)
> challenge_count = 0;
> }
> if (++challenge_count <= sysctl_tcp_challenge_ack_limit) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPCHALLENGEACK);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPCHALLENGEACK);
> tcp_send_ack(sk);
> }
> }
> @@ -3516,7 +3516,7 @@ static void tcp_process_tlp_ack(struct sock *sk, u32 ack, int flag)
> tcp_set_ca_state(sk, TCP_CA_CWR);
> tcp_end_cwnd_reduction(sk);
> tcp_try_keep_open(sk);
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPLOSSPROBERECOVERY);
> } else if (!(flag & (FLAG_SND_UNA_ADVANCED |
> FLAG_NOT_DUP | FLAG_DATA_SACKED))) {
> @@ -3621,14 +3621,14 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
>
> tcp_in_ack_event(sk, CA_ACK_WIN_UPDATE);
>
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPACKS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPACKS);
> } else {
> u32 ack_ev_flags = CA_ACK_SLOWPATH;
>
> if (ack_seq != TCP_SKB_CB(skb)->end_seq)
> flag |= FLAG_DATA;
> else
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPUREACKS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPUREACKS);
>
> flag |= tcp_ack_update_window(sk, skb, ack, ack_seq);
>
> @@ -4131,7 +4131,7 @@ static void tcp_dsack_set(struct sock *sk, u32 seq, u32 end_seq)
> else
> mib_idx = LINUX_MIB_TCPDSACKOFOSENT;
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
>
> tp->rx_opt.dsack = 1;
> tp->duplicate_sack[0].start_seq = seq;
> @@ -4155,7 +4155,7 @@ static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
>
> if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
> before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> tcp_enter_quickack_mode(sk);
>
> if (tcp_is_sack(tp) && sysctl_tcp_dsack) {
> @@ -4305,7 +4305,7 @@ static bool tcp_try_coalesce(struct sock *sk,
>
> atomic_add(delta, &sk->sk_rmem_alloc);
> sk_mem_charge(sk, delta);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVCOALESCE);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVCOALESCE);
> TCP_SKB_CB(to)->end_seq = TCP_SKB_CB(from)->end_seq;
> TCP_SKB_CB(to)->ack_seq = TCP_SKB_CB(from)->ack_seq;
> TCP_SKB_CB(to)->tcp_flags |= TCP_SKB_CB(from)->tcp_flags;
> @@ -4393,7 +4393,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
> tcp_ecn_check_ce(tp, skb);
>
> if (unlikely(tcp_try_rmem_schedule(sk, skb, skb->truesize))) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFODROP);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFODROP);
> tcp_drop(sk, skb);
> return;
> }
> @@ -4402,7 +4402,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
> tp->pred_flags = 0;
> inet_csk_schedule_ack(sk);
>
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOQUEUE);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOQUEUE);
> SOCK_DEBUG(sk, "out of order segment: rcv_next %X seq %X - %X\n",
> tp->rcv_nxt, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
>
> @@ -4457,7 +4457,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
> if (skb1 && before(seq, TCP_SKB_CB(skb1)->end_seq)) {
> if (!after(end_seq, TCP_SKB_CB(skb1)->end_seq)) {
> /* All the bits are present. Drop. */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOMERGE);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOMERGE);
> tcp_drop(sk, skb);
> skb = NULL;
> tcp_dsack_set(sk, seq, end_seq);
> @@ -4496,7 +4496,7 @@ static void tcp_data_queue_ofo(struct sock *sk, struct sk_buff *skb)
> __skb_unlink(skb1, &tp->out_of_order_queue);
> tcp_dsack_extend(sk, TCP_SKB_CB(skb1)->seq,
> TCP_SKB_CB(skb1)->end_seq);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOMERGE);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPOFOMERGE);
> tcp_drop(sk, skb1);
> }
>
> @@ -4661,7 +4661,7 @@ queue_and_out:
>
> if (!after(TCP_SKB_CB(skb)->end_seq, tp->rcv_nxt)) {
> /* A retransmit, 2nd most common case. Force an immediate ack. */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_DELAYEDACKLOST);
> tcp_dsack_set(sk, TCP_SKB_CB(skb)->seq, TCP_SKB_CB(skb)->end_seq);
>
> out_of_window:
> @@ -4707,7 +4707,7 @@ static struct sk_buff *tcp_collapse_one(struct sock *sk, struct sk_buff *skb,
>
> __skb_unlink(skb, list);
> __kfree_skb(skb);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVCOLLAPSED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRCVCOLLAPSED);
>
> return next;
> }
> @@ -4866,7 +4866,7 @@ static bool tcp_prune_ofo_queue(struct sock *sk)
> bool res = false;
>
> if (!skb_queue_empty(&tp->out_of_order_queue)) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_OFOPRUNED);
> __skb_queue_purge(&tp->out_of_order_queue);
>
> /* Reset SACK state. A conforming SACK implementation will
> @@ -4895,7 +4895,7 @@ static int tcp_prune_queue(struct sock *sk)
>
> SOCK_DEBUG(sk, "prune_queue: c=%x\n", tp->copied_seq);
>
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_PRUNECALLED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_PRUNECALLED);
>
> if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf)
> tcp_clamp_window(sk);
> @@ -4925,7 +4925,7 @@ static int tcp_prune_queue(struct sock *sk)
> * drop receive data on the floor. It will get retransmitted
> * and hopefully then we'll have sufficient space.
> */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_RCVPRUNED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_RCVPRUNED);
>
> /* Massive buffer overcommit. */
> tp->pred_flags = 0;
> @@ -5184,7 +5184,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
> if (tcp_fast_parse_options(skb, th, tp) && tp->rx_opt.saw_tstamp &&
> tcp_paws_discard(sk, skb)) {
> if (!th->rst) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSESTABREJECTED);
> if (!tcp_oow_rate_limited(sock_net(sk), skb,
> LINUX_MIB_TCPACKSKIPPEDPAWS,
> &tp->last_oow_ack_time))
> @@ -5236,8 +5236,8 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb,
> if (th->syn) {
> syn_challenge:
> if (syn_inerr)
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNCHALLENGE);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPSYNCHALLENGE);
> tcp_send_challenge_ack(sk, skb);
> goto discard;
> }
> @@ -5352,7 +5352,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
> tcp_data_snd_check(sk);
> return;
> } else { /* Header too small */
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> goto discard;
> }
> } else {
> @@ -5380,7 +5380,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
>
> __skb_pull(skb, tcp_header_len);
> tcp_rcv_nxt_update(tp, TCP_SKB_CB(skb)->end_seq);
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPHPHITSTOUSER);
> eaten = 1;
> }
> @@ -5403,7 +5403,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
>
> tcp_rcv_rtt_measure_ts(sk, skb);
>
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPHPHITS);
>
> /* Bulk data transfer: receiver */
> eaten = tcp_queue_rcv(sk, skb, tcp_header_len,
> @@ -5460,8 +5460,8 @@ step5:
> return;
>
> csum_error:
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
>
> discard:
> tcp_drop(sk, skb);
> @@ -5553,13 +5553,13 @@ static bool tcp_rcv_fastopen_synack(struct sock *sk, struct sk_buff *synack,
> break;
> }
> tcp_rearm_rto(sk);
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPFASTOPENACTIVEFAIL);
> return true;
> }
> tp->syn_data_acked = tp->syn_data;
> if (tp->syn_data_acked)
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_TCPFASTOPENACTIVE);
>
> tcp_fastopen_add_skb(sk, synack);
> @@ -5595,7 +5595,7 @@ static int tcp_rcv_synsent_state_process(struct sock *sk, struct sk_buff *skb,
> if (tp->rx_opt.saw_tstamp && tp->rx_opt.rcv_tsecr &&
> !between(tp->rx_opt.rcv_tsecr, tp->retrans_stamp,
> tcp_time_stamp)) {
> - __NET_INC_STATS(sock_net(sk),
> + NET_INC_STATS(sock_net(sk),
> LINUX_MIB_PAWSACTIVEREJECTED);
> goto reset_and_undo;
> }
> @@ -5965,7 +5965,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
> (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
> after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt))) {
> tcp_done(sk);
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
> return 1;
> }
>
> @@ -6022,7 +6022,7 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb)
> if (sk->sk_shutdown & RCV_SHUTDOWN) {
> if (TCP_SKB_CB(skb)->end_seq != TCP_SKB_CB(skb)->seq &&
> after(TCP_SKB_CB(skb)->end_seq - th->fin, tp->rcv_nxt)) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONDATA);
> tcp_reset(sk);
> return 1;
> }
> @@ -6224,7 +6224,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> * timeout.
> */
> if (sk_acceptq_is_full(sk) && inet_csk_reqsk_queue_young(sk) > 1) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> goto drop;
> }
>
> @@ -6271,7 +6271,7 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops,
> if (dst && strict &&
> !tcp_peer_is_proven(req, dst, true,
> tmp_opt.saw_tstamp)) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_PAWSPASSIVEREJECTED);
> goto drop_and_release;
> }
> }
> diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
> index 87b173b563b0..761bc492c5e3 100644
> --- a/net/ipv4/tcp_ipv4.c
> +++ b/net/ipv4/tcp_ipv4.c
> @@ -692,6 +692,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
> offsetof(struct inet_timewait_sock, tw_bound_dev_if));
>
> arg.tos = ip_hdr(skb)->tos;
> + preempt_disable();
> ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
> skb, &TCP_SKB_CB(skb)->header.h4.opt,
> ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
> @@ -699,6 +700,7 @@ static void tcp_v4_send_reset(const struct sock *sk, struct sk_buff *skb)
>
> __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> __TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
> + preempt_enable();
>
> #ifdef CONFIG_TCP_MD5SIG
> out:
> @@ -774,12 +776,14 @@ static void tcp_v4_send_ack(struct net *net,
> if (oif)
> arg.bound_dev_if = oif;
> arg.tos = tos;
> + preempt_disable();
> ip_send_unicast_reply(*this_cpu_ptr(net->ipv4.tcp_sk),
> skb, &TCP_SKB_CB(skb)->header.h4.opt,
> ip_hdr(skb)->saddr, ip_hdr(skb)->daddr,
> &arg, arg.iov[0].iov_len);
>
> __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> + preempt_enable();
> }
>
> static void tcp_v4_timewait_ack(struct sock *sk, struct sk_buff *skb)
> @@ -1151,12 +1155,12 @@ static bool tcp_v4_inbound_md5_hash(const struct sock *sk,
> return false;
>
> if (hash_expected && !hash_location) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
> return true;
> }
>
> if (!hash_expected && hash_location) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
> return true;
> }
>
> @@ -1342,7 +1346,7 @@ struct sock *tcp_v4_syn_recv_sock(const struct sock *sk, struct sk_buff *skb,
> return newsk;
>
> exit_overflow:
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_LISTENOVERFLOWS);
> exit_nonewsk:
> dst_release(dst);
> exit:
> @@ -1432,8 +1436,8 @@ discard:
> return 0;
>
> csum_err:
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> goto discard;
> }
> EXPORT_SYMBOL(tcp_v4_do_rcv);
> diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
> index ffbfecdae471..4b95ec4ed2c8 100644
> --- a/net/ipv4/tcp_minisocks.c
> +++ b/net/ipv4/tcp_minisocks.c
> @@ -337,7 +337,7 @@ void tcp_time_wait(struct sock *sk, int state, int timeo)
> * socket up. We've got bigger problems than
> * non-graceful socket closings.
> */
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEWAITOVERFLOW);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPTIMEWAITOVERFLOW);
> }
>
> tcp_update_metrics(sk);
> diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
> index 1a487ff95d4c..25d527922b18 100644
> --- a/net/ipv4/tcp_output.c
> +++ b/net/ipv4/tcp_output.c
> @@ -2221,14 +2221,13 @@ bool tcp_schedule_loss_probe(struct sock *sk)
> /* Thanks to skb fast clones, we can detect if a prior transmit of
> * a packet is still in a qdisc or driver queue.
> * In this case, there is very little point doing a retransmit !
> - * Note: This is called from BH context only.
> */
> static bool skb_still_in_host_queue(const struct sock *sk,
> const struct sk_buff *skb)
> {
> if (unlikely(skb_fclone_busy(sk, skb))) {
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPSPURIOUS_RTX_HOSTQUEUES);
> return true;
> }
> return false;
> @@ -2290,7 +2289,7 @@ void tcp_send_loss_probe(struct sock *sk)
> tp->tlp_high_seq = tp->snd_nxt;
>
> probe_sent:
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPLOSSPROBES);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPLOSSPROBES);
> /* Reset s.t. tcp_rearm_rto will restart timer from now */
> inet_csk(sk)->icsk_pending = 0;
> rearm_timer:
> @@ -2699,7 +2698,7 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
> tp->retrans_stamp = tcp_skb_timestamp(skb);
>
> } else if (err != -EBUSY) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL);
> }
>
> if (tp->undo_retrans < 0)
> @@ -2823,7 +2822,7 @@ begin_fwd:
> if (tcp_retransmit_skb(sk, skb, segs))
> return;
>
> - __NET_INC_STATS(sock_net(sk), mib_idx);
> + NET_INC_STATS(sock_net(sk), mib_idx);
>
> if (tcp_in_cwnd_reduction(sk))
> tp->prr_out += tcp_skb_pcount(skb);
> diff --git a/net/ipv4/tcp_recovery.c b/net/ipv4/tcp_recovery.c
> index e0d0afaf15be..e36df4fcfeba 100644
> --- a/net/ipv4/tcp_recovery.c
> +++ b/net/ipv4/tcp_recovery.c
> @@ -65,8 +65,8 @@ int tcp_rack_mark_lost(struct sock *sk)
> if (scb->sacked & TCPCB_SACKED_RETRANS) {
> scb->sacked &= ~TCPCB_SACKED_RETRANS;
> tp->retrans_out -= tcp_skb_pcount(skb);
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPLOSTRETRANSMIT);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPLOSTRETRANSMIT);
> }
> } else if (!(scb->sacked & TCPCB_RETRANS)) {
> /* Original data are sent sequentially so stop early
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 35f643d8ffbb..debdd8b33e69 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -162,8 +162,8 @@ static int tcp_write_timeout(struct sock *sk)
> if (tp->syn_fastopen || tp->syn_data)
> tcp_fastopen_cache_set(sk, 0, NULL, true, 0);
> if (tp->syn_data && icsk->icsk_retransmits == 1)
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPFASTOPENACTIVEFAIL);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPFASTOPENACTIVEFAIL);
> }
> retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
> syn_set = true;
> @@ -178,8 +178,8 @@ static int tcp_write_timeout(struct sock *sk)
> tp->bytes_acked <= tp->rx_opt.mss_clamp) {
> tcp_fastopen_cache_set(sk, 0, NULL, true, 0);
> if (icsk->icsk_retransmits == net->ipv4.sysctl_tcp_retries1)
> - __NET_INC_STATS(sock_net(sk),
> - LINUX_MIB_TCPFASTOPENACTIVEFAIL);
> + NET_INC_STATS(sock_net(sk),
> + LINUX_MIB_TCPFASTOPENACTIVEFAIL);
> }
> /* Black hole detection */
> tcp_mtu_probing(icsk, sk);
> @@ -209,6 +209,7 @@ static int tcp_write_timeout(struct sock *sk)
> return 0;
> }
>
> +/* Called with BH disabled */
> void tcp_delack_timer_handler(struct sock *sk)
> {
> struct tcp_sock *tp = tcp_sk(sk);
> @@ -493,6 +494,7 @@ out_reset_timer:
> out:;
> }
>
> +/* Called with BH disabled */
> void tcp_write_timer_handler(struct sock *sk)
> {
> struct inet_connection_sock *icsk = inet_csk(sk);
> diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
> index 52914714b923..7bdc9c9c231b 100644
> --- a/net/ipv6/tcp_ipv6.c
> +++ b/net/ipv6/tcp_ipv6.c
> @@ -649,12 +649,12 @@ static bool tcp_v6_inbound_md5_hash(const struct sock *sk,
> return false;
>
> if (hash_expected && !hash_location) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5NOTFOUND);
> return true;
> }
>
> if (!hash_expected && hash_location) {
> - __NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
> + NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPMD5UNEXPECTED);
> return true;
> }
>
> @@ -825,9 +825,9 @@ static void tcp_v6_send_response(const struct sock *sk, struct sk_buff *skb, u32
> if (!IS_ERR(dst)) {
> skb_dst_set(buff, dst);
> ip6_xmit(ctl_sk, buff, &fl6, NULL, tclass);
> - __TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> + TCP_INC_STATS(net, TCP_MIB_OUTSEGS);
> if (rst)
> - __TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
> + TCP_INC_STATS(net, TCP_MIB_OUTRSTS);
> return;
> }
>
> @@ -1276,8 +1276,8 @@ discard:
> kfree_skb(skb);
> return 0;
> csum_err:
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> - __TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_CSUMERRORS);
> + TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS);
> goto discard;
>
>
> --
> 2.8.0.rc3.226.g39d4020
>
* Re: [PATCH v2 net-next 2/7] tcp: do not block bh during prequeue processing
From: Soheil Hassas Yeganeh @ 2016-04-29 13:20 UTC
To: Eric Dumazet
Cc: David S . Miller, netdev, Alexei Starovoitov,
Marcelo Ricardo Leitner, Eric Dumazet
In-Reply-To: <1461899449-8096-3-git-send-email-edumazet@google.com>
On Thu, Apr 28, 2016 at 11:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> AFAIK, nothing in current TCP stack absolutely wants BH
> being disabled once socket is owned by a thread running in
> process context.
>
> As mentioned in my prior patch ("tcp: give prequeue mode some care"),
> processing a batch of packets might take time, better not block BH
> at all.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
> net/ipv4/tcp.c | 4 ----
> net/ipv4/tcp_input.c | 30 ++----------------------------
> 2 files changed, 2 insertions(+), 32 deletions(-)
>
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index b24c6ed4a04f..4787f86ae64c 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1449,12 +1449,8 @@ static void tcp_prequeue_process(struct sock *sk)
>
> NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPPREQUEUED);
>
> - /* RX process wants to run with disabled BHs, though it is not
> - * necessary */
> - local_bh_disable();
> while ((skb = __skb_dequeue(&tp->ucopy.prequeue)) != NULL)
> sk_backlog_rcv(sk, skb);
> - local_bh_enable();
>
> /* Clear memory counter. */
> tp->ucopy.memory = 0;
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index ac85fb42a5a2..6171f92be090 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -4611,14 +4611,12 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb)
>
> __set_current_state(TASK_RUNNING);
>
> - local_bh_enable();
> if (!skb_copy_datagram_msg(skb, 0, tp->ucopy.msg, chunk)) {
> tp->ucopy.len -= chunk;
> tp->copied_seq += chunk;
> eaten = (chunk == skb->len);
> tcp_rcv_space_adjust(sk);
> }
> - local_bh_disable();
> }
>
> if (eaten <= 0) {
> @@ -5134,7 +5132,6 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen)
> int chunk = skb->len - hlen;
> int err;
>
> - local_bh_enable();
> if (skb_csum_unnecessary(skb))
> err = skb_copy_datagram_msg(skb, hlen, tp->ucopy.msg, chunk);
> else
> @@ -5146,32 +5143,9 @@ static int tcp_copy_to_iovec(struct sock *sk, struct sk_buff *skb, int hlen)
> tcp_rcv_space_adjust(sk);
> }
>
> - local_bh_disable();
> return err;
> }
>
> -static __sum16 __tcp_checksum_complete_user(struct sock *sk,
> - struct sk_buff *skb)
> -{
> - __sum16 result;
> -
> - if (sock_owned_by_user(sk)) {
> - local_bh_enable();
> - result = __tcp_checksum_complete(skb);
> - local_bh_disable();
> - } else {
> - result = __tcp_checksum_complete(skb);
> - }
> - return result;
> -}
> -
> -static inline bool tcp_checksum_complete_user(struct sock *sk,
> - struct sk_buff *skb)
> -{
> - return !skb_csum_unnecessary(skb) &&
> - __tcp_checksum_complete_user(sk, skb);
> -}
> -
> /* Does PAWS and seqno based validation of an incoming segment, flags will
> * play significant role here.
> */
> @@ -5386,7 +5360,7 @@ void tcp_rcv_established(struct sock *sk, struct sk_buff *skb,
> }
> }
> if (!eaten) {
> - if (tcp_checksum_complete_user(sk, skb))
> + if (tcp_checksum_complete(skb))
> goto csum_error;
>
> if ((int)skb->truesize > sk->sk_forward_alloc)
> @@ -5430,7 +5404,7 @@ no_ack:
> }
>
> slow_path:
> - if (len < (th->doff << 2) || tcp_checksum_complete_user(sk, skb))
> + if (len < (th->doff << 2) || tcp_checksum_complete(skb))
> goto csum_error;
>
> if (!th->ack && !th->rst && !th->syn)
> --
> 2.8.0.rc3.226.g39d4020
>
Very nice!
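
Editorial sketch, not part of the posted patch: the quoted commit message argues that draining a batch of prequeued packets can take a while, so BH should not be blocked for the whole loop. The user-space model below only illustrates that argument. Every name in it (pkt_queue, model_bh_disable(), drain_with_bh_blocked(), drain_without_bh_blocked(), handled_with_bh_off) is hypothetical and stands in for the kernel code shown in the diff; it assumes nothing about the real locking beyond what the diff removes.

```c
#include <stddef.h>

#define QLEN 8

/* Hypothetical stand-in for tp->ucopy.prequeue: a tiny FIFO of packet ids. */
struct pkt_queue {
	int pkts[QLEN];
	size_t head, tail;
};

/* Models the local_bh_disable()/local_bh_enable() nesting depth. */
static int bh_disable_depth;
/* Counts packets that were processed while BH was blocked. */
static size_t handled_with_bh_off;

static void model_bh_disable(void) { bh_disable_depth++; }
static void model_bh_enable(void)  { bh_disable_depth--; }

static int queue_push(struct pkt_queue *q, int pkt)
{
	if (q->tail == QLEN)
		return -1;		/* queue full */
	q->pkts[q->tail++] = pkt;
	return 0;
}

static int queue_pop(struct pkt_queue *q, int *pkt)
{
	if (q->head == q->tail)
		return 0;		/* queue empty */
	*pkt = q->pkts[q->head++];
	return 1;
}

/* Stand-in for sk_backlog_rcv(): records whether BH was blocked. */
static void model_backlog_rcv(int pkt)
{
	(void)pkt;
	if (bh_disable_depth > 0)
		handled_with_bh_off++;
}

/* Shape of the loop before the patch: BH blocked for the whole batch. */
static size_t drain_with_bh_blocked(struct pkt_queue *q)
{
	size_t n = 0;
	int pkt;

	model_bh_disable();
	while (queue_pop(q, &pkt)) {
		model_backlog_rcv(pkt);
		n++;
	}
	model_bh_enable();
	return n;
}

/* Shape of the loop after the patch: same work, BH never blocked. */
static size_t drain_without_bh_blocked(struct pkt_queue *q)
{
	size_t n = 0;
	int pkt;

	while (queue_pop(q, &pkt)) {
		model_backlog_rcv(pkt);
		n++;
	}
	return n;
}
```

In this model both loops process the same packets; the only difference is how many of them are handled while the BH-disable depth is non-zero, which is the cost the commit message wants to avoid.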
* Re: [PATCH v2 net-next 3/7] dccp: do not assume DCCP code is non preemptible
From: Soheil Hassas Yeganeh @ 2016-04-29 13:21 UTC
To: Eric Dumazet
Cc: David S . Miller, netdev, Alexei Starovoitov,
Marcelo Ricardo Leitner, Eric Dumazet
In-Reply-To: <1461899449-8096-4-git-send-email-edumazet@google.com>
On Thu, Apr 28, 2016 at 11:10 PM, Eric Dumazet <edumazet@google.com> wrote:
> DCCP uses the generic backlog code, and this will soon
> be changed to not disable BH when protocol is called back.
>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
> ---
> net/dccp/input.c | 2 +-
> net/dccp/ipv4.c | 4 ++--
> net/dccp/ipv6.c | 4 ++--
> net/dccp/options.c | 2 +-
> 4 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/net/dccp/input.c b/net/dccp/input.c
> index 2437ecc13b82..ba347184bda9 100644
> --- a/net/dccp/input.c
> +++ b/net/dccp/input.c
> @@ -359,7 +359,7 @@ send_sync:
> goto discard;
> }
>
> - __DCCP_INC_STATS(DCCP_MIB_INERRS);
> + DCCP_INC_STATS(DCCP_MIB_INERRS);
> discard:
> __kfree_skb(skb);
> return 0;
> diff --git a/net/dccp/ipv4.c b/net/dccp/ipv4.c
> index a8164272e0f4..5c7e413a3ae4 100644
> --- a/net/dccp/ipv4.c
> +++ b/net/dccp/ipv4.c
> @@ -533,8 +533,8 @@ static void dccp_v4_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb)
> bh_unlock_sock(ctl_sk);
>
> if (net_xmit_eval(err) == 0) {
> - __DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
> - __DCCP_INC_STATS(DCCP_MIB_OUTRSTS);
> + DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
> + DCCP_INC_STATS(DCCP_MIB_OUTRSTS);
> }
> out:
> dst_release(dst);
> diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
> index 0f4eb4ea57a5..d176f4e66369 100644
> --- a/net/dccp/ipv6.c
> +++ b/net/dccp/ipv6.c
> @@ -277,8 +277,8 @@ static void dccp_v6_ctl_send_reset(const struct sock *sk, struct sk_buff *rxskb)
> if (!IS_ERR(dst)) {
> skb_dst_set(skb, dst);
> ip6_xmit(ctl_sk, skb, &fl6, NULL, 0);
> - __DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
> - __DCCP_INC_STATS(DCCP_MIB_OUTRSTS);
> + DCCP_INC_STATS(DCCP_MIB_OUTSEGS);
> + DCCP_INC_STATS(DCCP_MIB_OUTRSTS);
> return;
> }
>
> diff --git a/net/dccp/options.c b/net/dccp/options.c
> index b82b7ee9a1d2..74d29c56c367 100644
> --- a/net/dccp/options.c
> +++ b/net/dccp/options.c
> @@ -253,7 +253,7 @@ out_nonsensical_length:
> return 0;
>
> out_invalid_option:
> - __DCCP_INC_STATS(DCCP_MIB_INVALIDOPT);
> + DCCP_INC_STATS(DCCP_MIB_INVALIDOPT);
> rc = DCCP_RESET_CODE_OPTION_ERROR;
> out_featneg_failed:
> DCCP_WARN("DCCP(%p): Option %d (len=%d) error=%u\n", sk, opt, len, rc);
> --
> 2.8.0.rc3.226.g39d4020
>
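
Editorial sketch, not part of the posted patch: the diffs above replace the double-underscore counter macros (__DCCP_INC_STATS, __NET_INC_STATS, __TCP_INC_STATS) with the plain variants wherever the code may soon run preemptible, and add preempt_disable()/preempt_enable() where a double-underscore variant is kept. As far as I understand it, the double-underscore form assumes the caller already cannot be preempted, while the plain form is safe from any context. The user-space model below illustrates that contract only; all names (mib_model, model_raw_inc(), model_safe_inc(), mib_fold(), current_cpu, preempt_count_model) are hypothetical, and the real macros are implemented with per-CPU operations rather than the explicit counter shown here.

```c
#include <stddef.h>

#define MODEL_NR_CPUS 4

enum { MIB_INERRS, MIB_OUTSEGS, MIB_MAX };

/* One row of counters per CPU, folded together when read. */
struct mib_model {
	unsigned long cnt[MODEL_NR_CPUS][MIB_MAX];
};

/* Hypothetical model state: which CPU we run on, and the preempt depth. */
static int current_cpu;
static int preempt_count_model;

static void model_preempt_disable(void) { preempt_count_model++; }
static void model_preempt_enable(void)  { preempt_count_model--; }

/*
 * Models the double-underscore variant: the caller must already have
 * preemption (or BH) disabled. The model refuses the update otherwise,
 * where a debug kernel would be expected to complain instead.
 */
static int model_raw_inc(struct mib_model *m, int field)
{
	if (preempt_count_model == 0)
		return -1;
	m->cnt[current_cpu][field]++;
	return 0;
}

/* Models the plain variant: protects itself, usable from any context. */
static void model_safe_inc(struct mib_model *m, int field)
{
	model_preempt_disable();
	m->cnt[current_cpu][field]++;
	model_preempt_enable();
}

/* Reading a counter sums the per-CPU slots. */
static unsigned long mib_fold(const struct mib_model *m, int field)
{
	unsigned long sum = 0;

	for (size_t cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		sum += m->cnt[cpu][field];
	return sum;
}
```

Wrapping model_raw_inc() in model_preempt_disable()/model_preempt_enable() mirrors the tcp_v4_send_reset() and tcp_v4_send_ack() hunks quoted earlier in the thread, where the double-underscore macros are kept but bracketed by preempt_disable()/preempt_enable().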