Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:38 UTC (permalink / raw)
  To: William Tu
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, Linux Kernel Network Developers, bpf
In-Reply-To: <CALDO+SYC4sPw-7iDkFMCD=kf2UnTW2qc0m6Kgz41zLmNNxQ+Ww@mail.gmail.com>

On 2019/08/16 0:46, William Tu wrote:
> On Tue, Aug 13, 2019 at 5:07 AM Toshiaki Makita
> <toshiaki.makita1@gmail.com> wrote:
>>
>> This is a rough PoC for an idea to offload TC flower to XDP.
>>
>>
>> * Motivation
>>
>> The purpose is to speed up software TC flower by using XDP.
>>
>> I chose TC flower because my current interest is in OVS. OVS uses TC to
>> offload flow tables to hardware, so if TC can offload flows to XDP, OVS
>> also can be offloaded to XDP.
>>
>> When TC flower filter is offloaded to XDP, the received packets are
>> handled by XDP first, and if their protocol or something is not
>> supported by the eBPF program, the program returns XDP_PASS and packets
>> are passed to upper layer TC.
>>
>> The packet processing flow will be like this when this mechanism,
>> xdp_flow, is used with OVS.
>>
>>   +-------------+
>>   | openvswitch |
>>   |    kmod     |
>>   +-------------+
>>          ^
>>          | if not match in filters (flow key or action not supported by TC)
>>   +-------------+
>>   |  TC flower  |
>>   +-------------+
>>          ^
>>          | if not match in flow tables (flow key or action not supported by XDP)
>>   +-------------+
>>   |  XDP prog   |
>>   +-------------+
>>          ^
>>          | incoming packets
>>
> I like this idea, some comments about the OVS AF_XDP work.
> 
> Another way when using OVS AF_XDP is to serve as slow path of TC flow
> HW offload.
> For example:
> 
>   Userspace OVS datapath (The one used by OVS-DPDK)
>       ^
>        |
>    +------------------------------+
>    |  OVS AF_XDP netdev |
>    +------------------------------+
>           ^
>           | if not supported or not match in flow tables
>    +---------------------+
>    |  TC HW flower  |
>    +---------------------+
>           ^
>           | incoming packets
> 
> So in this case it's either TC HW flower offload, or the userspace PMD OVS.
> Both cases should be pretty fast.
> 
> I think xdp_flow can also be used by OVS AF_XDP netdev, sitting between
> TC HW flower and OVS AF_XDP netdev.
> Before the XDP program sending packet to AF_XDP socket, the
> xdp_flow can execute first, and if not match, then send to AF_XDP.
> So in your patch set, implement s.t like
>    bpf_redirect_map(&xsks_map, index, 0);

Thanks, the concept sounds good but this is probably difficult as long as
this is a TC offload, which is emulating TC.
If I changed the direction and implement offload in ovs-vswitchd, it would
be possible. I'll remember this optimization.

> Another thing is that at each layer we are doing its own packet parsing.
>  From your graph, first parse at XDP program, then at TC flow, then at
> openvswitch kmod.
> I wonder if we can reuse some parsing result.

That would be nice if possible...
Currently I don't have any ideas to do that. Someday XDP may support more
metadata for this or HW-offload like checksum. Then we can store the information
and upper layers may be able to use that.

Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next v7 3/3] net: phy: broadcom: add 1000Base-X support for BCM54616S
From: Tao Ren @ 2019-08-16  1:33 UTC (permalink / raw)
  To: Florian Fainelli, Andrew Lunn, Heiner Kallweit, David S . Miller,
	Vladimir Oltean, Arun Parameswaran, Justin Chen,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	openbmc@lists.ozlabs.org
In-Reply-To: <d45d609b-60ea-760d-31f4-51afa379c55a@gmail.com>

On 8/15/19 4:36 PM, Florian Fainelli wrote:
> On 8/11/19 4:40 PM, Tao Ren wrote:
>> The BCM54616S PHY cannot work properly in RGMII->1000Base-X mode, mainly
>> because genphy functions are designed for copper links, and 1000Base-X
>> (clause 37) auto negotiation needs to be handled differently.
>>
>> This patch enables 1000Base-X support for BCM54616S by customizing 3
>> driver callbacks, and it's verified to be working on Facebook CMM BMC
>> platform (RGMII->1000Base-KX):
>>
>>   - probe: probe callback detects PHY's operation mode based on
>>     INTERF_SEL[1:0] pins and 1000X/100FX selection bit in SerDES 100-FX
>>     Control register.
>>
>>   - config_aneg: calls genphy_c37_config_aneg when the PHY is running in
>>     1000Base-X mode; otherwise, genphy_config_aneg will be called.
>>
>>   - read_status: calls genphy_c37_read_status when the PHY is running in
>>     1000Base-X mode; otherwise, genphy_read_status will be called.
>>
>> Note: BCM54616S PHY can also be configured in RGMII->100Base-FX mode, and
>> 100Base-FX support is not available as of now.
>>
>> Signed-off-by: Tao Ren <taoren@fb.com>
> 
>> -		reg = bcm_phy_read_shadow(phydev, BCM5482_SHD_MODE);
>> -		bcm_phy_write_shadow(phydev, BCM5482_SHD_MODE,
>> -				     reg | BCM5482_SHD_MODE_1000BX);
>> +		reg = bcm_phy_read_shadow(phydev, BCM54XX_SHD_MODE);
>> +		bcm_phy_write_shadow(phydev, BCM54XX_SHD_MODE,
>> +				     reg | BCM54XX_SHD_MODE_1000BX);
> 
> This could have been a separate patch, but this looks reasonable to me
> and this is correct with the datasheet, thanks Tao.
> 
> Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>

Thank you FLorian.

It is great experience working with you all to figure out root cause and work out the patches, and I really appreciate it.


Cheers,

Tao

^ permalink raw reply

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:28 UTC (permalink / raw)
  To: Jakub Kicinski, Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jesper Dangaard Brouer,
	John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, netdev,
	bpf, William Tu
In-Reply-To: <20190815122232.4b1fa01c@cakuba.netronome.com>

On 2019/08/16 4:22, Jakub Kicinski wrote:
> On Thu, 15 Aug 2019 08:21:00 -0700, Stanislav Fomichev wrote:
>> On 08/15, Toshiaki Makita wrote:
>>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>>> On 08/13, Toshiaki Makita wrote:
>>>>> * Implementation
>>>>>
>>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>>> mainly because flow insertion is considerably frequent. If we generate
>>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>>> first packet of ping in above test will incease, which I want to avoid.
>>>> Can this be instead implemented with a new hook that will be called
>>>> for TC events? This hook can write to perf event buffer and control
>>>> plane will insert/remove/modify flow tables in the BPF maps (contol
>>>> plane will also install xdp program).
>>>>
>>>> Why do we need UMH? What am I missing?
>>>
>>> So you suggest doing everything in xdp_flow kmod?
>> You probably don't even need xdp_flow kmod. Add new tc "offload" mode
>> (bypass) that dumps every command via netlink (or calls the BPF hook
>> where you can dump it into perf event buffer) and then read that info
>> from userspace and install xdp programs and modify flow tables.
>> I don't think you need any kernel changes besides that stream
>> of data from the kernel about qdisc/tc flow creation/removal/etc.
> 
> There's a certain allure in bringing the in-kernel BPF translation
> infrastructure forward. OTOH from system architecture perspective IMHO
> it does seem like a task best handed in user space. bpfilter can replace
> iptables completely, here we're looking at an acceleration relatively
> loosely coupled with flower.

I don't think it's loosely coupled. Emulating TC behavior in userspace
is not so easy.

Think about recent multi-mask support in flower. Previously userspace could
assume there is one mask and hash table for each preference in TC. After the
change TC accepts different masks with the same pref. Such a change tends to
break userspace emulation. It may ignore masks passed from flow insertion
and use the mask remembered when the first flow of the pref is inserted. It
may override the mask of all existing flows with the pref. It may fail to
insert such flows. Any of them would result in unexpected wrong datapath
handling which is critical.
I think such an emulation layer needs to be updated in sync with TC.

Toshiaki Makita

^ permalink raw reply

* [PATCH net,v5 1/2] net: sched: use major priority number as hardware priority
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>

tc transparently maps the software priority number to hardware. Update
it to pass the major priority which is what most drivers expect. Update
drivers too so they do not need to lshift the priority field of the
flow_cls_common_offload object. The stmmac driver is an exception, since
this code assumes the tc software priority is fine, therefore, lshift it
just to be conservative.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Acked-by: Jiri Pirko <jiri@mellanox.com>
---
v5: no changes.

 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c      |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c   |  2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c            | 12 +++---------
 drivers/net/ethernet/netronome/nfp/flower/qos_conf.c |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c      |  2 +-
 include/net/pkt_cls.h                                |  2 +-
 6 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
index deeb65da99f3..00b2d4a86159 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c
@@ -3167,7 +3167,7 @@ mlx5e_flow_esw_attr_init(struct mlx5_esw_flow_attr *esw_attr,
 
 	esw_attr->parse_attr = parse_attr;
 	esw_attr->chain = f->common.chain_index;
-	esw_attr->prio = TC_H_MAJ(f->common.prio) >> 16;
+	esw_attr->prio = f->common.prio;
 
 	esw_attr->in_rep = in_rep;
 	esw_attr->in_mdev = in_mdev;
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
index e8ac90564dbe..84a87d059333 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c
@@ -471,7 +471,7 @@ int mlxsw_sp_acl_rulei_commit(struct mlxsw_sp_acl_rule_info *rulei)
 void mlxsw_sp_acl_rulei_priority(struct mlxsw_sp_acl_rule_info *rulei,
 				 unsigned int priority)
 {
-	rulei->priority = priority >> 16;
+	rulei->priority = priority;
 }
 
 void mlxsw_sp_acl_rulei_keymask_u32(struct mlxsw_sp_acl_rule_info *rulei,
diff --git a/drivers/net/ethernet/mscc/ocelot_flower.c b/drivers/net/ethernet/mscc/ocelot_flower.c
index 59487d446a09..b894bc0c9c16 100644
--- a/drivers/net/ethernet/mscc/ocelot_flower.c
+++ b/drivers/net/ethernet/mscc/ocelot_flower.c
@@ -13,12 +13,6 @@ struct ocelot_port_block {
 	struct ocelot_port *port;
 };
 
-static u16 get_prio(u32 prio)
-{
-	/* prio starts from 0x1000 while the ids starts from 0 */
-	return prio >> 16;
-}
-
 static int ocelot_flower_parse_action(struct flow_cls_offload *f,
 				      struct ocelot_ace_rule *rule)
 {
@@ -168,7 +162,7 @@ static int ocelot_flower_parse(struct flow_cls_offload *f,
 	}
 
 finished_key_parsing:
-	ocelot_rule->prio = get_prio(f->common.prio);
+	ocelot_rule->prio = f->common.prio;
 	ocelot_rule->id = f->cookie;
 	return ocelot_flower_parse_action(f, ocelot_rule);
 }
@@ -218,7 +212,7 @@ static int ocelot_flower_destroy(struct flow_cls_offload *f,
 	struct ocelot_ace_rule rule;
 	int ret;
 
-	rule.prio = get_prio(f->common.prio);
+	rule.prio = f->common.prio;
 	rule.port = port_block->port;
 	rule.id = f->cookie;
 
@@ -236,7 +230,7 @@ static int ocelot_flower_stats_update(struct flow_cls_offload *f,
 	struct ocelot_ace_rule rule;
 	int ret;
 
-	rule.prio = get_prio(f->common.prio);
+	rule.prio = f->common.prio;
 	rule.port = port_block->port;
 	rule.id = f->cookie;
 	ret = ocelot_ace_rule_stats_update(&rule);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
index 86e968cd5ffd..124a43dc136a 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/qos_conf.c
@@ -93,7 +93,7 @@ nfp_flower_install_rate_limiter(struct nfp_app *app, struct net_device *netdev,
 		return -EOPNOTSUPP;
 	}
 
-	if (flow->common.prio != (1 << 16)) {
+	if (flow->common.prio != 1) {
 		NL_SET_ERR_MSG_MOD(extack, "unsupported offload: qos rate limit offload requires highest priority");
 		return -EOPNOTSUPP;
 	}
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
index 37c0bc699cd9..6c305b6ecad0 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c
@@ -94,7 +94,7 @@ static int tc_fill_entry(struct stmmac_priv *priv,
 	struct stmmac_tc_entry *entry, *frag = NULL;
 	struct tc_u32_sel *sel = cls->knode.sel;
 	u32 off, data, mask, real_off, rem;
-	u32 prio = cls->common.prio;
+	u32 prio = cls->common.prio << 16;
 	int ret;
 
 	/* Only 1 match per entry */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809ca90d..98be18ef1ed3 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -646,7 +646,7 @@ tc_cls_common_offload_init(struct flow_cls_common_offload *cls_common,
 {
 	cls_common->chain_index = tp->chain->index;
 	cls_common->protocol = tp->protocol;
-	cls_common->prio = tp->prio;
+	cls_common->prio = tp->prio >> 16;
 	if (tc_skip_sw(flags) || flags & TCA_CLS_FLAGS_VERBOSE)
 		cls_common->extack = extack;
 }
-- 
2.11.0


^ permalink raw reply related

* [PATCH net,v5 2/2] netfilter: nf_tables: map basechain priority to hardware priority
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski
In-Reply-To: <20190816012410.31844-1-pablo@netfilter.org>

This patch adds initial support for offloading basechains using the
priority range from 1 to 65535. This is restricting the netfilter
priority range to 16-bit integer since this is what most drivers assume
so far from tc. It should be possible to extend this range of supported
priorities later on once drivers are updated to support for 32-bit
integer priorities.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
v5: fix clang warning by simplifying the mapping of hardware priorities
    to basechain priority in the range of 1-65535. Zero is left behind
    since some drivers do not support this, no negative basechain
    priorities are used at this stage.

 include/net/netfilter/nf_tables_offload.h |  2 ++
 net/netfilter/nf_tables_api.c             |  4 ++++
 net/netfilter/nf_tables_offload.c         | 17 ++++++++++++++---
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_tables_offload.h b/include/net/netfilter/nf_tables_offload.h
index 3196663a10e3..c8b9dec376f5 100644
--- a/include/net/netfilter/nf_tables_offload.h
+++ b/include/net/netfilter/nf_tables_offload.h
@@ -73,4 +73,6 @@ int nft_flow_rule_offload_commit(struct net *net);
 	(__reg)->key		= __key;				\
 	memset(&(__reg)->mask, 0xff, (__reg)->len);
 
+int nft_chain_offload_priority(struct nft_base_chain *basechain);
+
 #endif
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 605a7cfe7ca7..b65caf4974e1 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -1662,6 +1662,10 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
 
 		chain->flags |= NFT_BASE_CHAIN | flags;
 		basechain->policy = NF_ACCEPT;
+		if (chain->flags & NFT_CHAIN_HW_OFFLOAD &&
+		    nft_chain_offload_priority(basechain) < 0)
+			return -EOPNOTSUPP;
+
 		flow_block_init(&basechain->flow_block);
 	} else {
 		chain = kzalloc(sizeof(*chain), GFP_KERNEL);
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5f240e..c0d18c1d77ac 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -103,10 +103,11 @@ void nft_offload_update_dependency(struct nft_offload_ctx *ctx,
 }
 
 static void nft_flow_offload_common_init(struct flow_cls_common_offload *common,
-					 __be16 proto,
-					struct netlink_ext_ack *extack)
+					 __be16 proto, int priority,
+					 struct netlink_ext_ack *extack)
 {
 	common->protocol = proto;
+	common->prio = priority;
 	common->extack = extack;
 }
 
@@ -124,6 +125,15 @@ static int nft_setup_cb_call(struct nft_base_chain *basechain,
 	return 0;
 }
 
+int nft_chain_offload_priority(struct nft_base_chain *basechain)
+{
+	if (basechain->ops.priority <= 0 ||
+	    basechain->ops.priority > USHRT_MAX)
+		return -1;
+
+	return 0;
+}
+
 static int nft_flow_offload_rule(struct nft_trans *trans,
 				 enum flow_cls_command command)
 {
@@ -142,7 +152,8 @@ static int nft_flow_offload_rule(struct nft_trans *trans,
 	if (flow)
 		proto = flow->proto;
 
-	nft_flow_offload_common_init(&cls_flow.common, proto, &extack);
+	nft_flow_offload_common_init(&cls_flow.common, proto,
+				     basechain->ops.priority, &extack);
 	cls_flow.command = command;
 	cls_flow.cookie = (unsigned long) rule;
 	if (flow)
-- 
2.11.0


^ permalink raw reply related

* [PATCH net,v5 0/2] flow_offload hardware priority fixes
From: Pablo Neira Ayuso @ 2019-08-16  1:24 UTC (permalink / raw)
  To: netfilter-devel
  Cc: davem, netdev, marcelo.leitner, jiri, wenxu, saeedm, paulb,
	gerlitz.or, jakub.kicinski

Hi,

This patchset contains two updates for the flow_offload users:

1) Pass the major tc priority to drivers so they do not have to
   lshift it. This is a preparation patch for the fix coming in
   patch #2.

2) Set the hardware priority from the netfilter basechain priority,
   some drivers break when using the existing hardware priority
   number that is set to zero.

v5: fix patch 2/2 to address a clang warning and to simplify
    the priority mapping.

Please, apply.

Pablo Neira Ayuso (2):
  net: sched: use major priority number as hardware priority
  netfilter: nf_tables: map basechain priority to hardware priority

 drivers/net/ethernet/mellanox/mlx5/core/en_tc.c      |  2 +-
 drivers/net/ethernet/mellanox/mlxsw/spectrum_acl.c   |  2 +-
 drivers/net/ethernet/mscc/ocelot_flower.c            | 12 +++---------
 drivers/net/ethernet/netronome/nfp/flower/qos_conf.c |  2 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_tc.c      |  2 +-
 include/net/netfilter/nf_tables_offload.h            |  2 ++
 include/net/pkt_cls.h                                |  2 +-
 net/netfilter/nf_tables_api.c                        |  4 ++++
 net/netfilter/nf_tables_offload.c                    | 17 ++++++++++++++---
 9 files changed, 28 insertions(+), 17 deletions(-)

-- 
2.11.0


^ permalink raw reply

* Re: INFO: task hung in tls_sw_release_resources_tx
From: Jakub Kicinski @ 2019-08-16  1:11 UTC (permalink / raw)
  To: Hillf Danton
  Cc: syzbot, ast, aviadye, borisp, bpf, daniel, davejwatson, davem,
	john.fastabend, kafai, linux-kernel, netdev, songliubraving,
	syzkaller-bugs, yhs
In-Reply-To: <20190815141419.15036-1-hdanton@sina.com>

On Thu, 15 Aug 2019 22:14:19 +0800, Hillf Danton wrote:
> On Thu, 15 Aug 2019 03:54:06 -0700
> > Hello,
> > 
> > syzbot found the following crash on:
> > 
> > HEAD commit:    6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > git tree:       net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16e5536a600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> > 
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> > 
> > INFO: task syz-executor153:10198 blocked for more than 143 seconds.
> >        Not tainted 5.3.0-rc3+ #162
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor153 D27672 10198  10179 0x80000002
> > Call Trace:
> >   context_switch kernel/sched/core.c:3254 [inline]
> >   __schedule+0x755/0x1580 kernel/sched/core.c:3880
> >   schedule+0xa8/0x270 kernel/sched/core.c:3944
> >   schedule_timeout+0x717/0xc50 kernel/time/timer.c:1783
> >   do_wait_for_common kernel/sched/completion.c:83 [inline]
> >   __wait_for_common kernel/sched/completion.c:104 [inline]
> >   wait_for_common kernel/sched/completion.c:115 [inline]
> >   wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >   crypto_wait_req include/linux/crypto.h:685 [inline]
> >   crypto_wait_req include/linux/crypto.h:680 [inline]
> >   tls_sw_release_resources_tx+0x4ee/0x6b0 net/tls/tls_sw.c:2075
> >   tls_sk_proto_cleanup net/tls/tls_main.c:275 [inline]
> >   tls_sk_proto_close+0x686/0x970 net/tls/tls_main.c:305
> >   inet_release+0xed/0x200 net/ipv4/af_inet.c:427
> >   inet6_release+0x53/0x80 net/ipv6/af_inet6.c:470
> >   __sock_release+0xce/0x280 net/socket.c:590
> >   sock_close+0x1e/0x30 net/socket.c:1268
> >   __fput+0x2ff/0x890 fs/file_table.c:280
> >   ____fput+0x16/0x20 fs/file_table.c:313
> >   task_work_run+0x145/0x1c0 kernel/task_work.c:113
> >   exit_task_work include/linux/task_work.h:22 [inline]
> >   do_exit+0x92f/0x2e50 kernel/exit.c:879
> >   do_group_exit+0x135/0x360 kernel/exit.c:983
> >   __do_sys_exit_group kernel/exit.c:994 [inline]
> >   __se_sys_exit_group kernel/exit.c:992 [inline]
> >   __x64_sys_exit_group+0x44/0x50 kernel/exit.c:992
> >   do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
> >   entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x43ff88
> > Code: 00 00 be 3c 00 00 00 eb 19 66 0f 1f 84 00 00 00 00 00 48 89 d7 89 f0  
> > 0f 05 48 3d 00 f0 ff ff 77 21 f4 48 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff  
> > ff 76 e0 f7 d8 64 41 89 01 eb d8 0f 1f 84 00 00 00
> > RSP: 002b:00007ffd1c2d0f78 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043ff88
> > RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> > RBP: 00000000004bf890 R08: 00000000000000e7 R09: ffffffffffffffd0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
> > INFO: lockdep is turned off.
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 1057 Comm: khungtaskd Not tainted 5.3.0-rc3+ #162
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
> > Google 01/01/2011
> > Call Trace:
> >   __dump_stack lib/dump_stack.c:77 [inline]
> >   dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> >   nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >   nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >   arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >   trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >   check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
> >   watchdog+0x9d0/0xef0 kernel/hung_task.c:289
> >   kthread+0x361/0x430 kernel/kthread.c:255
> >   ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x10  
> > arch/x86/include/asm/irqflags.h:60  
> 
> 1, diff -> commit f87e62d45e51 -> commit 1023121375c6
> 
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,13 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> -
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
> +	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask)) {
> +		if (!test_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	} else {
> +		if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	}
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);
> --
> 
> 2, a simpler one. And clear BIT_TX_SCHEDULED perhaps after releasing sock.
> 
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,9 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> +	if (!test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> +		clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>  
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);

Mmm.. too terse, I don't follow what you're trying to do here :(

I've been staring at this for a while and trying to repo but it's not
happening here.

The only thing I see is that EBUSY is not handled.

^ permalink raw reply

* Re: [PATCH net-next,v4 07/12] net: sched: use flow block API
From: Pablo Neira Ayuso @ 2019-08-16  1:10 UTC (permalink / raw)
  To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <b45709c7-38b5-2dcb-3db1-0c2fca1840be@solarflare.com>

On Wed, Aug 14, 2019 at 05:32:34PM +0100, Edward Cree wrote:
> On 09/07/2019 21:55, Pablo Neira Ayuso wrote:
> > This patch adds tcf_block_setup() which uses the flow block API.
> >
> > This infrastructure takes the flow block callbacks coming from the
> > driver and register/unregister to/from the cls_api core.
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> > <snip>
> > @@ -796,13 +804,20 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
> >  				 struct netlink_ext_ack *extack)
> >  {
> >  	struct tc_block_offload bo = {};
> > +	int err;
> >  
> >  	bo.net = dev_net(dev);
> >  	bo.command = command;
> >  	bo.binder_type = ei->binder_type;
> >  	bo.block = block;
> >  	bo.extack = extack;
> > -	return dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > +	INIT_LIST_HEAD(&bo.cb_list);
> > +
> > +	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > +	if (err < 0)
> > +		return err;
> > +
> > +	return tcf_block_setup(block, &bo);
> >  }
> >  
> >  static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
> > @@ -1636,6 +1651,77 @@ void tcf_block_cb_unregister(struct tcf_block *block,
> >  }
> >  EXPORT_SYMBOL(tcf_block_cb_unregister);
> >  
> > +static int tcf_block_bind(struct tcf_block *block,
> > +			  struct flow_block_offload *bo)
> > +{
> > +	struct flow_block_cb *block_cb, *next;
> > +	int err, i = 0;
> > +
> > +	list_for_each_entry(block_cb, &bo->cb_list, list) {
> > +		err = tcf_block_playback_offloads(block, block_cb->cb,
> > +						  block_cb->cb_priv, true,
> > +						  tcf_block_offload_in_use(block),
> > +						  bo->extack);
> > +		if (err)
> > +			goto err_unroll;
> > +
> > +		i++;
> > +	}
> > +	list_splice(&bo->cb_list, &block->cb_list);
> > +
> > +	return 0;
> > +
> > +err_unroll:
> > +	list_for_each_entry_safe(block_cb, next, &bo->cb_list, list) {
> > +		if (i-- > 0) {
> > +			list_del(&block_cb->list);
> > +			tcf_block_playback_offloads(block, block_cb->cb,
> > +						    block_cb->cb_priv, false,
> > +						    tcf_block_offload_in_use(block),
> > +						    NULL);
> > +		}
> > +		flow_block_cb_free(block_cb);
> > +	}
> > +
> > +	return err;
> > +}
> >
> Why has the replay been moved from the function called by the driver
>  (__tcf_block_cb_register()) to work done by the driver's caller based on
>  what the driver has left on this flow_block_offload.cb_list?  This makes
>  it impossible for the driver to (say) unregister a block outside of an
>  explicit request from ndo_setup_tc().
> In my under-development driver, I have a teardown path called on PCI
>  remove, which calls tcf_block_cb_unregister() on all my block bindings
>  (of which the driver keeps track), to ensure that no flow rules are still
>  in place when unregister_netdev() is called;

It's the subsystem that has to release resources when
unregister_netdev() event happens. At least in netfilter, when the
device is going away, the filtering policy is removed, hence the
FLOW_BLOCK_UNBIND is called to release the blocks and, hence, the
offload resources. I remember tc ingress qdisc works like this too.

>  this is needed because some  of the driver's state for certain
>  rules involves taking a reference on  the netdevice (dev_hold()). 
>  Your structural changes here make that  impossible; is there any
>  reason why they're necessary?

May I have access to your driver code?

This would make it easier for me to understand your requirements, and
to discuss changes with you.

^ permalink raw reply

* Re: [RFC PATCH bpf-next 00/14] xdp_flow: Flow offload to XDP
From: Toshiaki Makita @ 2019-08-16  1:09 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
	Yonghong Song, David S. Miller, Jakub Kicinski,
	Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
	Cong Wang, Jiri Pirko, netdev, bpf, William Tu
In-Reply-To: <20190815152100.GN2820@mini-arch>

On 2019/08/16 0:21, Stanislav Fomichev wrote:
> On 08/15, Toshiaki Makita wrote:
>> On 2019/08/15 2:07, Stanislav Fomichev wrote:
>>> On 08/13, Toshiaki Makita wrote:
>>>> * Implementation
>>>>
>>>> xdp_flow makes use of UMH to load an eBPF program for XDP, similar to
>>>> bpfilter. The difference is that xdp_flow does not generate the eBPF
>>>> program dynamically but a prebuilt program is embedded in UMH. This is
>>>> mainly because flow insertion is considerably frequent. If we generate
>>>> and load an eBPF program on each insertion of a flow, the latency of the
>>>> first packet of ping in above test will incease, which I want to avoid.
>>> Can this be instead implemented with a new hook that will be called
>>> for TC events? This hook can write to perf event buffer and control
>>> plane will insert/remove/modify flow tables in the BPF maps (contol
>>> plane will also install xdp program).
>>>
>>> Why do we need UMH? What am I missing?
>>
>> So you suggest doing everything in xdp_flow kmod?
> You probably don't even need xdp_flow kmod. Add new tc "offload" mode
> (bypass) that dumps every command via netlink (or calls the BPF hook
> where you can dump it into perf event buffer) and then read that info
> from userspace and install xdp programs and modify flow tables.
> I don't think you need any kernel changes besides that stream
> of data from the kernel about qdisc/tc flow creation/removal/etc.

My intention is to make more people who want high speed network easily use XDP,
so making transparent XDP offload with current TC interface.

What userspace program would monitor TC events with your suggestion?
ovs-vswitchd? If so, it even does not need to monitor TC. It can
implement XDP offload directly.
(However I prefer kernel solution. Please refer to "About alternative
userland (ovs-vswitchd etc.) implementation" section in the cover letter.)

Also such a TC monitoring solution easily can be out-of-sync with real TC
behavior as TC filter/flower is being heavily developed and changed,
e.g. introduction of TC block, support multiple masks with the same pref, etc.
I'm not sure such an unreliable solution have much value.

> But, I haven't looked at the series deeply, so I might be missing
> something :-)
> 
>> I also thought about that. There are two phases so let's think about them separately.
>>
>> 1) TC block (qdisc) creation / eBPF load
>>
>> I saw eBPF maintainers repeatedly saying eBPF program loading needs to be
>> done from userland, not from kernel, to run the verifier for safety.
>> However xdp_flow eBPF program is prebuilt and embedded in kernel so we may
>> allow such programs to be loaded from kernel? I currently don't have the will
>> to make such an API as loading can be done with current UMH mechanism.
>>
>> 2) flow insertion / eBPF map update
>>
>> Not sure if this needs to be done from userland. One concern is that eBPF maps can
>> be modified by unrelated processes and we need to handle all unexpected state of maps.
>> Such handling tends to be difficult and may cause unexpected kernel behavior.
>> OTOH updating maps from kmod may reduces the latency of flow insertion drastically.
> Latency from the moment I type 'tc filter add ...' to the moment the rule
> is installed into the maps? Does it really matter?

Yes it matters. Flow insertion is kind of data path in OVS.
Please see how ping latency is affected in the cover letter.

> Do I understand correctly that both of those events (qdisc creation and
> flow insertion) are triggered from tcf_block_offload_cmd (or similar)?

Both of eBPF load and map update are triggered from tcf_block_offload_cmd.
I think you understand it correctly.

Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next,v4 08/12] drivers: net: use flow block API
From: Pablo Neira Ayuso @ 2019-08-16  1:04 UTC (permalink / raw)
  To: Edward Cree; +Cc: netdev, netfilter-devel
In-Reply-To: <b3232864-3800-e2a4-9ee3-2cfcf222a148@solarflare.com>

On Wed, Aug 14, 2019 at 05:17:20PM +0100, Edward Cree wrote:
> On 13/08/2019 20:51, Pablo Neira Ayuso wrote:
> > On Mon, Aug 12, 2019 at 06:50:09PM +0100, Edward Cree wrote:
> >> Pablo, can you explain (because this commit message doesn't) why these per-
> >>  driver lists are needed, and what the information/state is that has module
> >>  (rather than, say, netdevice) scope?
> >
> > The idea is to update drivers to support one flow_block per subsystem,
> > one for ethtool, one for tc, and so on. So far, existing drivers only
> > allow for binding one single flow_block to one of the existing
> > subsystems. So this limitation applies at driver level.
>
> That argues for per-driver _code_, not for per-driver _state_. For instance,
>  each driver could (more logically) store this information in the netdev
>  private data, rather than a static global.  Or even, since each driver
>  instance has a unique cb_ident = netdev_priv(net_dev), this doesn't need to
>  be local to the driver at all and could just belong to the device owning the
>  flow_block (which isn't necessarily the device doing the offload, per
>  indirect blocks).

This list could be placed in netdev_priv() area, yes.

> TBH I'm still not clear why you need a flow_block per subsystem, rather than
>  just having multiple subsystems feed their offload requests through the same
>  flow_block but with different enum tc_setup_type or enum tc_fl_command or
>  some other indication that this is "netfilter" rather than "tc" asking for a
>  tc_cls_flower_offload.

In tc, the flow_block is set up by when the ingress qdisc is
registered. The usual scenario for most drivers is to have one single
flow_block per registered ingress qdisc, this makes a 1:1 mapping
between ingress qdisc and flow_block.

Still, you can register two or more ingress qdiscs to make them share
the same policy via 'tc block'. In that case all those qdiscs use one
single flow_block. This makes a N:1 mapping between these qdisc
ingress and the flow_block. This policy applies to all ingress qdiscs
that are part of the same tc block. By 'tc block', I'm refering to the
tcf_block structure.

In netfilter, there are ingress basechains that are registered per
device. Each basechain gets a flow_block by when the basechain is
registered. Shared blocks as in tcf_block are not yet supported, but
it should not be hard to extend it to make this work.

To reuse the same flow_block as entry point for all subsystems as your
propose - assuming offloads for two or more subsystems are in place -
then all of them would need to have the same block sharing
configuration, which might not be the case, ie. tc ingress might have
a eth0 and eth1 use the same policy via flow_block, while netfilter
might have one basechain for eth0 and another for eth1 (no policy
sharing).

[...]
> This really needs a design document explaining what all the bits are, how
>  they fit together, and why they need to be like that.

I did not design this flow_block abstraction, this concept was already
in place under a different name and extend it so the ethtool/netfilter
subsystems to avoid driver code duplication for offloads. Having said
this, I agree documentation is good to have.

^ permalink raw reply

* Re: [PATCH] ipvlan: set hw_enc_features like macvlan
From: Mahesh Bandewar (महेश बंडेवार) @ 2019-08-16  1:03 UTC (permalink / raw)
  To: Bill Sommerfeld
  Cc: David S. Miller, Petr Machata, Jiri Pirko, Ido Schimmel,
	Daniel Borkmann, YueHaibing, Thomas Gleixner, Miaohe Lin,
	Eric Dumazet, linux-netdev, linux-kernel
In-Reply-To: <20190815001043.153874-1-wsommerfeld@google.com>

On Wed, Aug 14, 2019 at 5:10 PM Bill Sommerfeld <wsommerfeld@google.com> wrote:
>
> Allow encapsulated packets sent to tunnels layered over ipvlan to use
> offloads rather than forcing SW fallbacks.
>
> Since commit f21e5077010acda73a60 ("macvlan: add offload features for
> encapsulation"), macvlan has set dev->hw_enc_features to include
> everything in dev->features; do likewise in ipvlan.
>
Thanks Bill

> Signed-off-by: Bill Sommerfeld <wsommerfeld@google.com>
Acked-by: Mahesh Bandewar <maheshb@google.com>
> ---
>  drivers/net/ipvlan/ipvlan_main.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ipvlan/ipvlan_main.c b/drivers/net/ipvlan/ipvlan_main.c
> index 1c96bed5a7c4..887bbba4631e 100644
> --- a/drivers/net/ipvlan/ipvlan_main.c
> +++ b/drivers/net/ipvlan/ipvlan_main.c
> @@ -126,6 +126,7 @@ static int ipvlan_init(struct net_device *dev)
>                      (phy_dev->state & IPVLAN_STATE_MASK);
>         dev->features = phy_dev->features & IPVLAN_FEATURES;
>         dev->features |= NETIF_F_LLTX | NETIF_F_VLAN_CHALLENGED;
> +       dev->hw_enc_features |= dev->features;
>         dev->gso_max_size = phy_dev->gso_max_size;
>         dev->gso_max_segs = phy_dev->gso_max_segs;
>         dev->hard_header_len = phy_dev->hard_header_len;
> --
> 2.23.0.rc1.153.gdeed80330f-goog
>

^ permalink raw reply

* Re: [PATCH v2 bpf-next 1/4] bpf: unprivileged BPF access via /dev/bpf
From: Andy Lutomirski @ 2019-08-16  0:54 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Kees Cook, Andy Lutomirski, Song Liu, Networking, bpf,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team, Lorenz Bauer,
	Jann Horn, Greg KH, Linux API, LSM List
In-Reply-To: <20190815234622.t65oxm5mtfzy6fhg@ast-mbp.dhcp.thefacebook.com>

> On Aug 15, 2019, at 4:46 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

>> 
>> I'm not sure why you draw the line for VMs -- they're just as buggy
>> as anything else. Regardless, I reject this line of thinking: yes,
>> all software is buggy, but that isn't a reason to give up.
> 
> hmm. are you saying you want kernel community to work towards
> making containers (namespaces) being able to run arbitrary code
> downloaded from the internet?

Yes.

As an example, Sandstorm uses a combination of namespaces (user, network, mount, ipc) and a moderately permissive seccomp policy to run arbitrary code. Not just little snippets, either — node.js, Mongo, MySQL, Meteor, and other fairly heavyweight stacks can all run under Sandstorm, with the whole stack (database engine binaries, etc) supplied by entirely untrusted customers.  During the time Sandstorm was under active development, I can recall *one* bug that would have allowed a sandbox escape. That’s a pretty good track record.  (Also, Meltdown and Spectre, sigh.)

To be clear, Sandstorm did not allow creation of a userns by the untrusted code, and Sandstorm would have heavily restricted bpf(), but that should only be necessary because of the possibility of kernel bugs, not because of the overall design.

Alexei, I’m trying to encourage you to aim for something even better than you have now. Right now, if you grant a user various very strong capabilities, that user’s systemd can use bpf network filters.  Your proposal would allow this with a different, but still very strong, set of capabilities. There’s nothing wrong with this per se, but I think you can aim much higher:

CAP_NET_ADMIN and your CAP_BPF both effectively allow the holder to take over the system, *by design*.  I’m suggesting that you engage the security community (Kees, myself, Aleksa, Jann, Serge, Christian, etc) to aim for something better: make it so that a normal Linux distro would be willing to relax its settings enough so that normal users can use bpf filtering in the systemd units and maybe eventually use even more bpf() capabilities. And let’s make is to that mainstream container managers (that use userns!) will be willing (as an option) to delegate bpf() to their containers. We’re happy to help design, review, and even write code, but we need you to be willing to work with us to make a design that seems like it will work and then to wait long enough to merge it for us to think about it, try to poke holes in it, and convince ourselves and each other that it has a good chance of being sound.

Obviously there will be many cases where an unprivileged program should *not* be able to use bpf() IP filtering, but let’s make it so that enabling these advanced features does not automatically give away the keys to the kingdom.

(Sandstorm still exists but is no longer as actively developed, sadly.)

^ permalink raw reply

* [RFC PATCH net-next 06/11] spi: spi-fsl-dspi: Implement the PTP system timestamping
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

This adds a snapshotting software feature for TCFQ and EOQ modes of
operation. Due to my lack of proper understanding of the DMA mode,
the latter mode is left as an exercise for future developers.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 41c45ee2bb2d..3fc266d8263a 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -184,6 +184,9 @@ struct fsl_dspi {
 	int			irq;
 	struct clk		*clk;
 
+	struct ptp_system_timestamp *ptp_sts;
+	const void		*ptp_sts_word;
+	bool			take_snapshot;
 	struct spi_transfer	*cur_transfer;
 	struct spi_message	*cur_msg;
 	struct chip_data	*cur_chip;
@@ -658,6 +661,9 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 	u16 spi_tcnt;
 	u32 spi_tcr;
 
+	if (dspi->take_snapshot)
+		ptp_read_system_postts(dspi->ptp_sts);
+
 	/* Get transfer counter (in number of SPI transfers). It was
 	 * reset to 0 when transfer(s) were started.
 	 */
@@ -675,6 +681,11 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 		/* Success! */
 		return 0;
 
+	dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
+
+	if (dspi->take_snapshot)
+		ptp_read_system_prets(dspi->ptp_sts);
+
 	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
 		dspi_eoq_write(dspi);
 	else
@@ -764,6 +775,8 @@ static int dspi_transfer_one_message(struct spi_master *master,
 		dspi->rx = transfer->rx_buf;
 		dspi->rx_end = dspi->rx + transfer->len;
 		dspi->len = transfer->len;
+		dspi->ptp_sts = transfer->ptp_sts;
+		dspi->ptp_sts_word = dspi->tx + transfer->ptp_sts_word_offset;
 		/* Validated transfer specific frame size (defaults applied) */
 		dspi->bits_per_word = transfer->bits_per_word;
 		if (transfer->bits_per_word <= 8)
@@ -784,6 +797,11 @@ static int dspi_transfer_one_message(struct spi_master *master,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
+
+		if (dspi->take_snapshot)
+			ptp_read_system_prets(dspi->ptp_sts);
+
 		trans_mode = dspi->devtype_data->trans_mode;
 		switch (trans_mode) {
 		case DSPI_EOQ_MODE:
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 02/11] net: dsa: sja1105: Implement the .gettimex64 system call for PTP
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

Through the PTP_SYS_OFFSET_EXTENDED ioctl, it is possible for userspace
applications (i.e. phc2sys) to compensate for the delays incurred while
reading the PHC's time.

Implement this ioctl by passing the PTP system timestamp to the switch's
SPI controller driver.

The 'read PTP time' message is a 12 byte structure, first 4 bytes of
which represent the SPI header, and the last 8 bytes represent the
64-bit PTP time. The switch itself starts processing the command
immediately after receiving the last bit of the address, i.e. at the
middle of byte 3 (last byte of header). The PTP time is shadowed to a
buffer register in the switch, and retrieved atomically during the
subsequent SPI frames.

As such, specify to the SPI controller that we want byte 3 to be the one
that gets software-timestamped.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/sja1105/sja1105.h      |  4 ++-
 drivers/net/dsa/sja1105/sja1105_main.c |  4 +--
 drivers/net/dsa/sja1105/sja1105_ptp.c  | 21 ++++++++++------
 drivers/net/dsa/sja1105/sja1105_spi.c  | 34 ++++++++++++++++++--------
 4 files changed, 42 insertions(+), 21 deletions(-)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index e6371b2c2df1..18c4f2f808e4 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -93,6 +93,7 @@ struct sja1105_private {
 	struct spi_device *spidev;
 	struct dsa_switch *ds;
 	struct sja1105_port ports[SJA1105_NUM_PORTS];
+	struct ptp_system_timestamp *ptp_sts;
 	struct ptp_clock_info ptp_caps;
 	struct ptp_clock *clock;
 	/* The cycle counter translates the PTP timestamps (based on
@@ -136,7 +137,8 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 				void *packed_buf, size_t size_bytes);
 int sja1105_spi_send_int(const struct sja1105_private *priv,
 			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
-			 u64 *value, u64 size_bytes);
+			 u64 *value, u64 size_bytes,
+			 struct ptp_system_timestamp *ptp_sts);
 int sja1105_spi_send_long_packed_buf(const struct sja1105_private *priv,
 				     sja1105_spi_rw_mode_t rw, u64 base_addr,
 				     void *packed_buf, u64 buf_len);
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 82bdc2da8f8f..c4df2cef89cd 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2100,8 +2100,8 @@ static int sja1105_check_device_id(struct sja1105_private *priv)
 	u64 part_no;
 	int rc;
 
-	rc = sja1105_spi_send_int(priv, SPI_READ, regs->device_id,
-				  &device_id, SJA1105_SIZE_DEVICE_ID);
+	rc = sja1105_spi_send_int(priv, SPI_READ, regs->device_id, &device_id,
+				  SJA1105_SIZE_DEVICE_ID, NULL);
 	if (rc < 0)
 		return rc;
 
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 51a0014369fc..ee7d1065d745 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019, Vladimir Oltean <olteanv@gmail.com>
  */
+#include <linux/spi/spi.h>
 #include "sja1105.h"
 
 /* The adjfine API clamps ppb between [-32,768,000, 32,768,000], and
@@ -262,14 +263,17 @@ int sja1105_ptp_reset(struct sja1105_private *priv)
 	return rc;
 }
 
-static int sja1105_ptp_gettime(struct ptp_clock_info *ptp,
-			       struct timespec64 *ts)
+static int sja1105_ptp_gettimex(struct ptp_clock_info *ptp,
+				struct timespec64 *ts,
+				struct ptp_system_timestamp *sts)
 {
 	struct sja1105_private *priv = ptp_to_sja1105(ptp);
 	u64 ns;
 
 	mutex_lock(&priv->ptp_lock);
+	priv->ptp_sts = sts;
 	ns = timecounter_read(&priv->tstamp_tc);
+	priv->ptp_sts = NULL;
 	mutex_unlock(&priv->ptp_lock);
 
 	*ts = ns_to_timespec64(ns);
@@ -355,7 +359,7 @@ static u64 sja1105_ptptsclk_read(const struct cyclecounter *cc)
 
 	sja1105_debug_gpio(priv, 1);
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->ptptsclk,
-				  &ptptsclk, 8);
+				  &ptptsclk, 8, priv->ptp_sts);
 	sja1105_debug_gpio(priv, 0);
 	if (rc < 0)
 		dev_err_ratelimited(priv->ds->dev,
@@ -368,9 +372,10 @@ static void sja1105_ptp_overflow_check(struct work_struct *work)
 {
 	struct delayed_work *dw = to_delayed_work(work);
 	struct sja1105_private *priv = rw_to_sja1105(dw);
+	struct ptp_system_timestamp dummy;
 	struct timespec64 ts;
 
-	sja1105_ptp_gettime(&priv->ptp_caps, &ts);
+	sja1105_ptp_gettimex(&priv->ptp_caps, &ts, &dummy);
 
 	schedule_delayed_work(&priv->refresh_work, SJA1105_REFRESH_INTERVAL);
 }
@@ -387,7 +392,7 @@ static void sja1105_ptp_extts_work(struct work_struct *work)
 	mutex_lock(&priv->ptp_lock);
 
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->ptpsyncts,
-				  &ptpsyncts, 8);
+				  &ptpsyncts, 8, NULL);
 	if (rc < 0)
 		dev_err(priv->ds->dev, "Failed to read PTPSYNCTS: %d\n", rc);
 
@@ -433,12 +438,12 @@ static int sja1105_pps_enable(struct sja1105_private *priv, bool on)
 		ptp_pin_duration = SJA1105_HZ_TO_PIN_DURATION(1);
 
 		rc = sja1105_spi_send_int(priv, SPI_WRITE, regs->ptppinst,
-					  &ptp_pin_start, 8);
+					  &ptp_pin_start, 8, NULL);
 		if (rc < 0)
 			return rc;
 
 		rc = sja1105_spi_send_int(priv, SPI_WRITE, regs->ptppindur,
-					  &ptp_pin_duration, 4);
+					  &ptp_pin_duration, 4, NULL);
 		if (rc < 0)
 			return rc;
 	}
@@ -500,7 +505,7 @@ int sja1105_ptp_clock_register(struct sja1105_private *priv)
 		.name		= "SJA1105 PHC",
 		.adjfine	= sja1105_ptp_adjfine,
 		.adjtime	= sja1105_ptp_adjtime,
-		.gettime64	= sja1105_ptp_gettime,
+		.gettimex64	= sja1105_ptp_gettimex,
 		.settime64	= sja1105_ptp_settime,
 		.enable		= sja1105_ptp_enable,
 		.max_adj	= SJA1105_MAX_ADJ_PPB,
diff --git a/drivers/net/dsa/sja1105/sja1105_spi.c b/drivers/net/dsa/sja1105/sja1105_spi.c
index e33c85569882..a3486a40e0fb 100644
--- a/drivers/net/dsa/sja1105/sja1105_spi.c
+++ b/drivers/net/dsa/sja1105/sja1105_spi.c
@@ -15,10 +15,13 @@
 	(SJA1105_SIZE_SPI_MSG_HEADER + SJA1105_SIZE_SPI_MSG_MAXLEN)
 
 static int sja1105_spi_transfer(const struct sja1105_private *priv,
-				const void *tx, void *rx, int size)
+				const void *tx, void *rx, int size,
+				struct ptp_system_timestamp *ptp_sts)
 {
 	struct spi_device *spi = priv->spidev;
 	struct spi_transfer transfer = {
+		.ptp_sts_word_offset = 3,
+		.ptp_sts = ptp_sts,
 		.tx_buf = tx,
 		.rx_buf = rx,
 		.len = size,
@@ -66,9 +69,11 @@ sja1105_spi_message_pack(void *buf, const struct sja1105_spi_message *msg)
  * @size_bytes is smaller than SIZE_SPI_MSG_MAXLEN. Larger packed buffers
  * are chunked in smaller pieces by sja1105_spi_send_long_packed_buf below.
  */
-int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
-				sja1105_spi_rw_mode_t rw, u64 reg_addr,
-				void *packed_buf, size_t size_bytes)
+static int
+__sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+			      sja1105_spi_rw_mode_t rw, u64 reg_addr,
+			      void *packed_buf, size_t size_bytes,
+			      struct ptp_system_timestamp *ptp_sts)
 {
 	u8 tx_buf[SJA1105_SIZE_SPI_TRANSFER_MAX] = {0};
 	u8 rx_buf[SJA1105_SIZE_SPI_TRANSFER_MAX] = {0};
@@ -90,7 +95,7 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 		memcpy(tx_buf + SJA1105_SIZE_SPI_MSG_HEADER,
 		       packed_buf, size_bytes);
 
-	rc = sja1105_spi_transfer(priv, tx_buf, rx_buf, msg_len);
+	rc = sja1105_spi_transfer(priv, tx_buf, rx_buf, msg_len, ptp_sts);
 	if (rc < 0)
 		return rc;
 
@@ -101,6 +106,14 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 	return 0;
 }
 
+int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
+				sja1105_spi_rw_mode_t rw, u64 reg_addr,
+				void *packed_buf, size_t size_bytes)
+{
+	return __sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
+					     size_bytes, NULL);
+}
+
 /* If @rw is:
  * - SPI_WRITE: creates and sends an SPI write message at absolute
  *		address reg_addr, taking size_bytes from *packed_buf
@@ -114,7 +127,8 @@ int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
  */
 int sja1105_spi_send_int(const struct sja1105_private *priv,
 			 sja1105_spi_rw_mode_t rw, u64 reg_addr,
-			 u64 *value, u64 size_bytes)
+			 u64 *value, u64 size_bytes,
+			 struct ptp_system_timestamp *ptp_sts)
 {
 	u8 packed_buf[SJA1105_SIZE_SPI_MSG_MAXLEN];
 	int rc;
@@ -126,8 +140,8 @@ int sja1105_spi_send_int(const struct sja1105_private *priv,
 		sja1105_pack(packed_buf, value, 8 * size_bytes - 1, 0,
 			     size_bytes);
 
-	rc = sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
-					 size_bytes);
+	rc = __sja1105_spi_send_packed_buf(priv, rw, reg_addr, packed_buf,
+					   size_bytes, ptp_sts);
 
 	if (rw == SPI_READ)
 		sja1105_unpack(packed_buf, value, 8 * size_bytes - 1, 0,
@@ -291,7 +305,7 @@ int sja1105_inhibit_tx(const struct sja1105_private *priv,
 	int rc;
 
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->port_control,
-				  &inhibit_cmd, SJA1105_SIZE_PORT_CTRL);
+				  &inhibit_cmd, SJA1105_SIZE_PORT_CTRL, NULL);
 	if (rc < 0)
 		return rc;
 
@@ -301,7 +315,7 @@ int sja1105_inhibit_tx(const struct sja1105_private *priv,
 		inhibit_cmd &= ~port_bitmap;
 
 	return sja1105_spi_send_int(priv, SPI_WRITE, regs->port_control,
-				    &inhibit_cmd, SJA1105_SIZE_PORT_CTRL);
+				    &inhibit_cmd, SJA1105_SIZE_PORT_CTRL, NULL);
 }
 
 struct sja1105_status {
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 08/11] spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode transfer
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

While the poll mode helps reduce the overall latency in transmitting a
SPI message in the EOQ and TCFQ modes, the transmission can still have
jitter due to the CPU needing to service interrupts.

The transmission latency does not matter except in situations where the
SPI transfer represents the readout of a POSIX clock. In that case,
even with the byte-level PTP system timestamping in place, a pending IRQ
might find its way to be processed on the local CPU exactly during the
window when the transfer is snapshotted.

Disabling interrupts ensures the above situation never happens. When it
does, it manifests itself as random delay spikes, which throw off the
servo loop of phc2sys and make it lose lock.

Short phc2sys summary after 58 minutes of running without this patch:

  offset: min -26251 max 16416 mean -21.8672 std dev 863.416
  delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
  lost servo lock 3 times

Summary of the same phc2sys service running for 120 minutes with the
patch:

  offset: min -378 max 381 mean -0.0083089 std dev 101.495
  delay: min 4720 max 5920 mean 5129.38 std dev 154.899
  lost servo lock 0 times

Disable interrupts unconditionally if running in poll mode.
Two aspects:
- If the DSPI driver is in IRQ mode, then disabling interrupts becomes a
  contradiction in terms. Poll mode is recommendable for predictable
  latency.
- In theory it should be possible to disable interrupts only for SPI
  transfers that represent an interaction with a POSIX clock. The driver
  can sense this by looking at transfer->ptp_sts. However enabling this
  unconditionally makes issues much more visible (and not just in fringe
  cases), were they to ever appear.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index f0838853392d..5404f1b45ad0 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -184,6 +184,8 @@ struct fsl_dspi {
 	int			irq;
 	struct clk		*clk;
 
+	/* Used to disable IRQs and preemption */
+	spinlock_t		lock;
 	struct ptp_system_timestamp *ptp_sts;
 	const void		*ptp_sts_word;
 	bool			take_snapshot;
@@ -757,6 +759,7 @@ static int dspi_transfer_one_message(struct spi_master *master,
 	struct spi_device *spi = message->spi;
 	enum dspi_trans_mode trans_mode;
 	struct spi_transfer *transfer;
+	unsigned long flags = 0;
 	int status = 0;
 
 	message->actual_length = 0;
@@ -813,6 +816,9 @@ static int dspi_transfer_one_message(struct spi_master *master,
 				     SPI_FRAME_EBITS(transfer->bits_per_word) |
 				     SPI_CTARE_DTCP(1));
 
+		if (!dspi->irq)
+			spin_lock_irqsave(&dspi->lock, flags);
+
 		dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
 
 		if (dspi->take_snapshot) {
@@ -848,6 +854,9 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			do {
 				status = dspi_poll(dspi);
 			} while (status == -EAGAIN);
+
+			spin_unlock_irqrestore(&dspi->lock, flags);
+
 		} else if (trans_mode != DSPI_DMA_MODE) {
 			status = wait_event_interruptible(dspi->waitq,
 							  dspi->waitflags);
@@ -1178,6 +1187,7 @@ static int dspi_probe(struct platform_device *pdev)
 	}
 
 	init_waitqueue_head(&dspi->waitq);
+	spin_lock_init(&dspi->lock);
 
 poll_mode:
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 09/11] ARM: dts: ls1021a-tsn: Add debugging GPIOs for the SJA1105 and DSPI drivers
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

These GPIOs are exported to the expansion pin header at the rear of the
board:

EXP1_GPIO7: row 1, pin 9 from left
EXP1_GPIO6: row 1, pin 8 from left

Experimentally I could only see EXP1_GPIO6 (the pin currently assigned
to the DSPI driver) actually toggle on an analyzer - I don't know why,
but on my board, EXP1_GPIO7 isn't.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 arch/arm/boot/dts/ls1021a-tsn.dts | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm/boot/dts/ls1021a-tsn.dts b/arch/arm/boot/dts/ls1021a-tsn.dts
index 9d4eee986f53..6cec454c484c 100644
--- a/arch/arm/boot/dts/ls1021a-tsn.dts
+++ b/arch/arm/boot/dts/ls1021a-tsn.dts
@@ -5,6 +5,7 @@
 
 /dts-v1/;
 #include "ls1021a.dtsi"
+#include <dt-bindings/gpio/gpio.h>
 
 / {
 	model = "NXP LS1021A-TSN Board";
@@ -34,6 +35,8 @@
 
 &dspi0 {
 	bus-num = <0>;
+	/* EXP1_GPIO6 is GPIO4_18 */
+	debug-gpios = <&gpio3 18 GPIO_ACTIVE_HIGH>;
 	status = "okay";
 
 	/* ADG704BRMZ 1:4 SPI mux/demux */
@@ -57,6 +60,8 @@
 		/* SPI controller settings for SJA1105 timing requirements */
 		fsl,spi-cs-sck-delay = <1000>;
 		fsl,spi-sck-cs-delay = <1000>;
+		/* EXP1_GPIO7 is GPIO4_19 */
+		debug-gpios = <&gpio3 19 GPIO_ACTIVE_HIGH>;
 
 		ports {
 			#address-cells = <1>;
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 05/11] spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

On platforms like LS1021A which use TCFQ mode, an interrupt needs to be
processed after each byte is TXed/RXed. I tried to make the DSPI
implementation on this SoC operate in other, more efficient modes (EOQ,
DMA) but it looks like it simply isn't possible.

Therefore allow the driver to operate in poll mode, to ease a bit of
this absurd amount of IRQ load generated in TCFQ mode. Doing so reduces
both the net time it takes to transmit a SPI message, as well as the
inter-frame jitter that occurs while doing so.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 156 +++++++++++++++++++++----------------
 1 file changed, 90 insertions(+), 66 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 99708b36ee4f..41c45ee2bb2d 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -652,6 +652,77 @@ static void dspi_eoq_read(struct fsl_dspi *dspi)
 		dspi_push_rx(dspi, fifo_read(dspi));
 }
 
+static int dspi_rxtx(struct fsl_dspi *dspi)
+{
+	struct spi_message *msg = dspi->cur_msg;
+	u16 spi_tcnt;
+	u32 spi_tcr;
+
+	/* Get transfer counter (in number of SPI transfers). It was
+	 * reset to 0 when transfer(s) were started.
+	 */
+	regmap_read(dspi->regmap, SPI_TCR, &spi_tcr);
+	spi_tcnt = SPI_TCR_GET_TCNT(spi_tcr);
+	/* Update total number of bytes that were transferred */
+	msg->actual_length += spi_tcnt * dspi->bytes_per_word;
+
+	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
+		dspi_eoq_read(dspi);
+	else
+		dspi_tcfq_read(dspi);
+
+	if (!dspi->len)
+		/* Success! */
+		return 0;
+
+	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
+		dspi_eoq_write(dspi);
+	else
+		dspi_tcfq_write(dspi);
+
+	return -EAGAIN;
+}
+
+static int dspi_poll(struct fsl_dspi *dspi)
+{
+	int tries = 1000;
+	u32 spi_sr;
+
+	do {
+		regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+		regmap_write(dspi->regmap, SPI_SR, spi_sr);
+
+		if (spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF))
+			break;
+	} while (--tries);
+
+	if (!tries)
+		return -ETIMEDOUT;
+
+	return dspi_rxtx(dspi);
+}
+
+static irqreturn_t dspi_interrupt(int irq, void *dev_id)
+{
+	struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
+	u32 spi_sr;
+
+	regmap_read(dspi->regmap, SPI_SR, &spi_sr);
+	regmap_write(dspi->regmap, SPI_SR, spi_sr);
+
+	if (!(spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF)))
+		return IRQ_HANDLED;
+
+	dspi_rxtx(dspi);
+
+	if (!dspi->len) {
+		dspi->waitflags = 1;
+		wake_up_interruptible(&dspi->waitq);
+	}
+
+	return IRQ_HANDLED;
+}
+
 static int dspi_transfer_one_message(struct spi_master *master,
 				     struct spi_message *message)
 {
@@ -736,13 +807,18 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			goto out;
 		}
 
-		if (trans_mode != DSPI_DMA_MODE) {
-			if (wait_event_interruptible(dspi->waitq,
-						     dspi->waitflags))
-				dev_err(&dspi->pdev->dev,
-					"wait transfer complete fail!\n");
+		if (!dspi->irq) {
+			do {
+				status = dspi_poll(dspi);
+			} while (status == -EAGAIN);
+		} else if (trans_mode != DSPI_DMA_MODE) {
+			status = wait_event_interruptible(dspi->waitq,
+							  dspi->waitflags);
 			dspi->waitflags = 0;
 		}
+		if (status)
+			dev_err(&dspi->pdev->dev,
+				"Waiting for transfer to complete failed!\n");
 
 		if (transfer->delay_usecs)
 			udelay(transfer->delay_usecs);
@@ -830,62 +906,6 @@ static void dspi_cleanup(struct spi_device *spi)
 	kfree(chip);
 }
 
-static irqreturn_t dspi_interrupt(int irq, void *dev_id)
-{
-	struct fsl_dspi *dspi = (struct fsl_dspi *)dev_id;
-	struct spi_message *msg = dspi->cur_msg;
-	enum dspi_trans_mode trans_mode;
-	u32 spi_sr, spi_tcr;
-	u16 spi_tcnt;
-
-	regmap_read(dspi->regmap, SPI_SR, &spi_sr);
-	regmap_write(dspi->regmap, SPI_SR, spi_sr);
-
-	if (spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF)) {
-		/* Get transfer counter (in number of SPI transfers). It was
-		 * reset to 0 when transfer(s) were started.
-		 */
-		regmap_read(dspi->regmap, SPI_TCR, &spi_tcr);
-		spi_tcnt = SPI_TCR_GET_TCNT(spi_tcr);
-		/* Update total number of bytes that were transferred */
-		msg->actual_length += spi_tcnt * dspi->bytes_per_word;
-
-		trans_mode = dspi->devtype_data->trans_mode;
-		switch (trans_mode) {
-		case DSPI_EOQ_MODE:
-			dspi_eoq_read(dspi);
-			break;
-		case DSPI_TCFQ_MODE:
-			dspi_tcfq_read(dspi);
-			break;
-		default:
-			dev_err(&dspi->pdev->dev, "unsupported trans_mode %u\n",
-				trans_mode);
-				return IRQ_HANDLED;
-		}
-
-		if (!dspi->len) {
-			dspi->waitflags = 1;
-			wake_up_interruptible(&dspi->waitq);
-		} else {
-			switch (trans_mode) {
-			case DSPI_EOQ_MODE:
-				dspi_eoq_write(dspi);
-				break;
-			case DSPI_TCFQ_MODE:
-				dspi_tcfq_write(dspi);
-				break;
-			default:
-				dev_err(&dspi->pdev->dev,
-					"unsupported trans_mode %u\n",
-					trans_mode);
-			}
-		}
-	}
-
-	return IRQ_HANDLED;
-}
-
 static const struct of_device_id fsl_dspi_dt_ids[] = {
 	{ .compatible = "fsl,vf610-dspi", .data = &vf610_data, },
 	{ .compatible = "fsl,ls1021a-v1.0-dspi", .data = &ls1021a_v1_data, },
@@ -1099,11 +1119,13 @@ static int dspi_probe(struct platform_device *pdev)
 		goto out_master_put;
 
 	dspi_init(dspi);
+
 	dspi->irq = platform_get_irq(pdev, 0);
-	if (dspi->irq < 0) {
-		dev_err(&pdev->dev, "can't get platform irq\n");
-		ret = dspi->irq;
-		goto out_clk_put;
+	if (dspi->irq <= 0) {
+		dev_info(&pdev->dev,
+			 "can't get platform irq, using poll mode\n");
+		dspi->irq = 0;
+		goto poll_mode;
 	}
 
 	ret = devm_request_irq(&pdev->dev, dspi->irq, dspi_interrupt,
@@ -1113,6 +1135,9 @@ static int dspi_probe(struct platform_device *pdev)
 		goto out_clk_put;
 	}
 
+	init_waitqueue_head(&dspi->waitq);
+
+poll_mode:
 	if (dspi->devtype_data->trans_mode == DSPI_DMA_MODE) {
 		ret = dspi_request_dma(dspi, res->start);
 		if (ret < 0) {
@@ -1124,7 +1149,6 @@ static int dspi_probe(struct platform_device *pdev)
 	master->max_speed_hz =
 		clk_get_rate(dspi->clk) / dspi->devtype_data->max_clock_factor;
 
-	init_waitqueue_head(&dspi->waitq);
 	platform_set_drvdata(pdev, master);
 
 	ret = spi_register_master(master);
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 04/11] spi: spi-fsl-dspi: Cosmetic cleanup
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

This patch addresses some cosmetic issues:
- Alignment
- Typos
- (Non-)use of BIT() and GENMASK() macros
- Unused definitions
- Unused includes
- Abuse of ternary operator in detriment of readability
- Reduce indentation level

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 312 ++++++++++++++++++-------------------
 1 file changed, 154 insertions(+), 158 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 53335ccc98f6..99708b36ee4f 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -9,26 +9,16 @@
 #include <linux/delay.h>
 #include <linux/dmaengine.h>
 #include <linux/dma-mapping.h>
-#include <linux/err.h>
-#include <linux/errno.h>
 #include <linux/interrupt.h>
-#include <linux/io.h>
 #include <linux/kernel.h>
-#include <linux/math64.h>
 #include <linux/module.h>
-#include <linux/of.h>
 #include <linux/of_device.h>
 #include <linux/pinctrl/consumer.h>
-#include <linux/platform_device.h>
-#include <linux/pm_runtime.h>
 #include <linux/regmap.h>
-#include <linux/sched.h>
 #include <linux/spi/spi.h>
 #include <linux/spi/spi-fsl-dspi.h>
-#include <linux/spi/spi_bitbang.h>
-#include <linux/time.h>
 
-#define DRIVER_NAME "fsl-dspi"
+#define DRIVER_NAME			"fsl-dspi"
 
 #ifdef CONFIG_M5441x
 #define DSPI_FIFO_SIZE			16
@@ -37,97 +27,98 @@
 #endif
 #define DSPI_DMA_BUFSIZE		(DSPI_FIFO_SIZE * 1024)
 
-#define SPI_MCR		0x00
-#define SPI_MCR_MASTER		(1 << 31)
-#define SPI_MCR_PCSIS		(0x3F << 16)
-#define SPI_MCR_CLR_TXF	(1 << 11)
-#define SPI_MCR_CLR_RXF	(1 << 10)
-#define SPI_MCR_XSPI		(1 << 3)
-
-#define SPI_TCR			0x08
-#define SPI_TCR_GET_TCNT(x)	(((x) & 0xffff0000) >> 16)
-
-#define SPI_CTAR(x)		(0x0c + (((x) & 0x3) * 4))
-#define SPI_CTAR_FMSZ(x)	(((x) & 0x0000000f) << 27)
-#define SPI_CTAR_CPOL(x)	((x) << 26)
-#define SPI_CTAR_CPHA(x)	((x) << 25)
-#define SPI_CTAR_LSBFE(x)	((x) << 24)
-#define SPI_CTAR_PCSSCK(x)	(((x) & 0x00000003) << 22)
-#define SPI_CTAR_PASC(x)	(((x) & 0x00000003) << 20)
-#define SPI_CTAR_PDT(x)	(((x) & 0x00000003) << 18)
-#define SPI_CTAR_PBR(x)	(((x) & 0x00000003) << 16)
-#define SPI_CTAR_CSSCK(x)	(((x) & 0x0000000f) << 12)
-#define SPI_CTAR_ASC(x)	(((x) & 0x0000000f) << 8)
-#define SPI_CTAR_DT(x)		(((x) & 0x0000000f) << 4)
-#define SPI_CTAR_BR(x)		((x) & 0x0000000f)
-#define SPI_CTAR_SCALE_BITS	0xf
-
-#define SPI_CTAR0_SLAVE	0x0c
-
-#define SPI_SR			0x2c
-#define SPI_SR_EOQF		0x10000000
-#define SPI_SR_TCFQF		0x80000000
-#define SPI_SR_CLEAR		0x9aaf0000
-
-#define SPI_RSER_TFFFE		BIT(25)
-#define SPI_RSER_TFFFD		BIT(24)
-#define SPI_RSER_RFDFE		BIT(17)
-#define SPI_RSER_RFDFD		BIT(16)
-
-#define SPI_RSER		0x30
-#define SPI_RSER_EOQFE		0x10000000
-#define SPI_RSER_TCFQE		0x80000000
-
-#define SPI_PUSHR		0x34
-#define SPI_PUSHR_CMD_CONT	(1 << 15)
-#define SPI_PUSHR_CONT		(SPI_PUSHR_CMD_CONT << 16)
-#define SPI_PUSHR_CMD_CTAS(x)	(((x) & 0x0003) << 12)
-#define SPI_PUSHR_CTAS(x)	(SPI_PUSHR_CMD_CTAS(x) << 16)
-#define SPI_PUSHR_CMD_EOQ	(1 << 11)
-#define SPI_PUSHR_EOQ		(SPI_PUSHR_CMD_EOQ << 16)
-#define SPI_PUSHR_CMD_CTCNT	(1 << 10)
-#define SPI_PUSHR_CTCNT		(SPI_PUSHR_CMD_CTCNT << 16)
-#define SPI_PUSHR_CMD_PCS(x)	((1 << x) & 0x003f)
-#define SPI_PUSHR_PCS(x)	(SPI_PUSHR_CMD_PCS(x) << 16)
-#define SPI_PUSHR_TXDATA(x)	((x) & 0x0000ffff)
-
-#define SPI_PUSHR_SLAVE	0x34
-
-#define SPI_POPR		0x38
-#define SPI_POPR_RXDATA(x)	((x) & 0x0000ffff)
-
-#define SPI_TXFR0		0x3c
-#define SPI_TXFR1		0x40
-#define SPI_TXFR2		0x44
-#define SPI_TXFR3		0x48
-#define SPI_RXFR0		0x7c
-#define SPI_RXFR1		0x80
-#define SPI_RXFR2		0x84
-#define SPI_RXFR3		0x88
-
-#define SPI_CTARE(x)		(0x11c + (((x) & 0x3) * 4))
-#define SPI_CTARE_FMSZE(x)	(((x) & 0x1) << 16)
-#define SPI_CTARE_DTCP(x)	((x) & 0x7ff)
-
-#define SPI_SREX		0x13c
-
-#define SPI_FRAME_BITS(bits)	SPI_CTAR_FMSZ((bits) - 1)
-#define SPI_FRAME_BITS_MASK	SPI_CTAR_FMSZ(0xf)
-#define SPI_FRAME_BITS_16	SPI_CTAR_FMSZ(0xf)
-#define SPI_FRAME_BITS_8	SPI_CTAR_FMSZ(0x7)
-
-#define SPI_FRAME_EBITS(bits)	SPI_CTARE_FMSZE(((bits) - 1) >> 4)
-#define SPI_FRAME_EBITS_MASK	SPI_CTARE_FMSZE(1)
+#define SPI_MCR				0x00
+#define SPI_MCR_MASTER			BIT(31)
+#define SPI_MCR_PCSIS			(0x3F << 16)
+#define SPI_MCR_CLR_TXF			BIT(11)
+#define SPI_MCR_CLR_RXF			BIT(10)
+#define SPI_MCR_XSPI			BIT(3)
+
+#define SPI_TCR				0x08
+#define SPI_TCR_GET_TCNT(x)		(((x) & GENMASK(31, 16)) >> 16)
+
+#define SPI_CTAR(x)			(0x0c + (((x) & GENMASK(1, 0)) * 4))
+#define SPI_CTAR_FMSZ(x)		(((x) << 27) & GENMASK(30, 27))
+#define SPI_CTAR_CPOL			BIT(26)
+#define SPI_CTAR_CPHA			BIT(25)
+#define SPI_CTAR_LSBFE			BIT(24)
+#define SPI_CTAR_PCSSCK(x)		(((x) << 22) & GENMASK(23, 22))
+#define SPI_CTAR_PASC(x)		(((x) << 20) & GENMASK(21, 20))
+#define SPI_CTAR_PDT(x)			(((x) << 18) & GENMASK(19, 18))
+#define SPI_CTAR_PBR(x)			(((x) << 16) & GENMASK(17, 16))
+#define SPI_CTAR_CSSCK(x)		(((x) << 12) & GENMASK(15, 12))
+#define SPI_CTAR_ASC(x)			(((x) << 8) & GENMASK(11, 8))
+#define SPI_CTAR_DT(x)			(((x) << 4) & GENMASK(7, 4))
+#define SPI_CTAR_BR(x)			((x) & GENMASK(3, 0))
+#define SPI_CTAR_SCALE_BITS		0xf
+
+#define SPI_CTAR0_SLAVE			0x0c
+
+#define SPI_SR				0x2c
+#define SPI_SR_TCFQF			BIT(31)
+#define SPI_SR_EOQF			BIT(28)
+#define SPI_SR_TFUF			BIT(27)
+#define SPI_SR_TFFF			BIT(25)
+#define SPI_SR_CMDTCF			BIT(23)
+#define SPI_SR_SPEF			BIT(21)
+#define SPI_SR_RFOF			BIT(19)
+#define SPI_SR_TFIWF			BIT(18)
+#define SPI_SR_RFDF			BIT(17)
+#define SPI_SR_CMDFFF			BIT(16)
+#define SPI_SR_CLEAR			(SPI_SR_TCFQF | SPI_SR_EOQF | \
+					SPI_SR_TFUF | SPI_SR_TFFF | \
+					SPI_SR_CMDTCF | SPI_SR_SPEF | \
+					SPI_SR_RFOF | SPI_SR_TFIWF | \
+					SPI_SR_RFDF | SPI_SR_CMDFFF)
+
+#define SPI_RSER_TFFFE			BIT(25)
+#define SPI_RSER_TFFFD			BIT(24)
+#define SPI_RSER_RFDFE			BIT(17)
+#define SPI_RSER_RFDFD			BIT(16)
+
+#define SPI_RSER			0x30
+#define SPI_RSER_TCFQE			BIT(31)
+#define SPI_RSER_EOQFE			BIT(28)
+
+#define SPI_PUSHR			0x34
+#define SPI_PUSHR_CMD_CONT		BIT(15)
+#define SPI_PUSHR_CMD_CTAS(x)		(((x) << 12 & GENMASK(14, 12)))
+#define SPI_PUSHR_CMD_EOQ		BIT(11)
+#define SPI_PUSHR_CMD_CTCNT		BIT(10)
+#define SPI_PUSHR_CMD_PCS(x)		(BIT(x) & GENMASK(5, 0))
+
+#define SPI_PUSHR_SLAVE			0x34
+
+#define SPI_POPR			0x38
+#define SPI_POPR_RXDATA(x)		((x) & GENMASK(15, 0))
+
+#define SPI_TXFR0			0x3c
+#define SPI_TXFR1			0x40
+#define SPI_TXFR2			0x44
+#define SPI_TXFR3			0x48
+#define SPI_RXFR0			0x7c
+#define SPI_RXFR1			0x80
+#define SPI_RXFR2			0x84
+#define SPI_RXFR3			0x88
+
+#define SPI_CTARE(x)			(0x11c + (((x) & GENMASK(1, 0)) * 4))
+#define SPI_CTARE_FMSZE(x)		(((x) & 0x1) << 16)
+#define SPI_CTARE_DTCP(x)		((x) & 0x7ff)
+
+#define SPI_SREX			0x13c
+
+#define SPI_FRAME_BITS(bits)		SPI_CTAR_FMSZ((bits) - 1)
+#define SPI_FRAME_EBITS(bits)		SPI_CTARE_FMSZE(((bits) - 1) >> 4)
 
 /* Register offsets for regmap_pushr */
-#define PUSHR_CMD		0x0
-#define PUSHR_TX		0x2
+#define PUSHR_CMD			0x0
+#define PUSHR_TX			0x2
 
-#define SPI_CS_INIT		0x01
-#define SPI_CS_ASSERT		0x02
-#define SPI_CS_DROP		0x04
+#define SPI_CS_INIT			0x01
+#define SPI_CS_ASSERT			0x02
+#define SPI_CS_DROP			0x04
 
-#define DMA_COMPLETION_TIMEOUT	msecs_to_jiffies(3000)
+#define DMA_COMPLETION_TIMEOUT		msecs_to_jiffies(3000)
 
 struct chip_data {
 	u32 ctar_val;
@@ -246,7 +237,7 @@ static void dspi_push_rx(struct fsl_dspi *dspi, u32 rxdata)
 	if (!dspi->rx)
 		return;
 
-	/* Mask of undefined bits */
+	/* Mask off undefined bits */
 	rxdata &= (1 << dspi->bits_per_word) - 1;
 
 	if (dspi->bytes_per_word == 1)
@@ -282,8 +273,8 @@ static void dspi_rx_dma_callback(void *arg)
 
 static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
 {
-	struct fsl_dspi_dma *dma = dspi->dma;
 	struct device *dev = &dspi->pdev->dev;
+	struct fsl_dspi_dma *dma = dspi->dma;
 	int time_left;
 	int i;
 
@@ -360,9 +351,9 @@ static int dspi_next_xfer_dma_submit(struct fsl_dspi *dspi)
 
 static int dspi_dma_xfer(struct fsl_dspi *dspi)
 {
-	struct fsl_dspi_dma *dma = dspi->dma;
-	struct device *dev = &dspi->pdev->dev;
 	struct spi_message *message = dspi->cur_msg;
+	struct device *dev = &dspi->pdev->dev;
+	struct fsl_dspi_dma *dma = dspi->dma;
 	int curr_remaining_bytes;
 	int bytes_per_buffer;
 	int ret = 0;
@@ -397,9 +388,9 @@ static int dspi_dma_xfer(struct fsl_dspi *dspi)
 
 static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
 {
-	struct fsl_dspi_dma *dma;
-	struct dma_slave_config cfg;
 	struct device *dev = &dspi->pdev->dev;
+	struct dma_slave_config cfg;
+	struct fsl_dspi_dma *dma;
 	int ret;
 
 	dma = devm_kzalloc(dev, sizeof(*dma), GFP_KERNEL);
@@ -421,14 +412,14 @@ static int dspi_request_dma(struct fsl_dspi *dspi, phys_addr_t phy_addr)
 	}
 
 	dma->tx_dma_buf = dma_alloc_coherent(dev, DSPI_DMA_BUFSIZE,
-					&dma->tx_dma_phys, GFP_KERNEL);
+					     &dma->tx_dma_phys, GFP_KERNEL);
 	if (!dma->tx_dma_buf) {
 		ret = -ENOMEM;
 		goto err_tx_dma_buf;
 	}
 
 	dma->rx_dma_buf = dma_alloc_coherent(dev, DSPI_DMA_BUFSIZE,
-					&dma->rx_dma_phys, GFP_KERNEL);
+					     &dma->rx_dma_phys, GFP_KERNEL);
 	if (!dma->rx_dma_buf) {
 		ret = -ENOMEM;
 		goto err_rx_dma_buf;
@@ -485,30 +476,31 @@ static void dspi_release_dma(struct fsl_dspi *dspi)
 	struct fsl_dspi_dma *dma = dspi->dma;
 	struct device *dev = &dspi->pdev->dev;
 
-	if (dma) {
-		if (dma->chan_tx) {
-			dma_unmap_single(dev, dma->tx_dma_phys,
-					DSPI_DMA_BUFSIZE, DMA_TO_DEVICE);
-			dma_release_channel(dma->chan_tx);
-		}
+	if (!dma)
+		return;
 
-		if (dma->chan_rx) {
-			dma_unmap_single(dev, dma->rx_dma_phys,
-					DSPI_DMA_BUFSIZE, DMA_FROM_DEVICE);
-			dma_release_channel(dma->chan_rx);
-		}
+	if (dma->chan_tx) {
+		dma_unmap_single(dev, dma->tx_dma_phys,
+				 DSPI_DMA_BUFSIZE, DMA_TO_DEVICE);
+		dma_release_channel(dma->chan_tx);
+	}
+
+	if (dma->chan_rx) {
+		dma_unmap_single(dev, dma->rx_dma_phys,
+				 DSPI_DMA_BUFSIZE, DMA_FROM_DEVICE);
+		dma_release_channel(dma->chan_rx);
 	}
 }
 
 static void hz_to_spi_baud(char *pbr, char *br, int speed_hz,
-		unsigned long clkrate)
+			   unsigned long clkrate)
 {
 	/* Valid baud rate pre-scaler values */
 	int pbr_tbl[4] = {2, 3, 5, 7};
 	int brs[16] = {	2,	4,	6,	8,
-		16,	32,	64,	128,
-		256,	512,	1024,	2048,
-		4096,	8192,	16384,	32768 };
+			16,	32,	64,	128,
+			256,	512,	1024,	2048,
+			4096,	8192,	16384,	32768 };
 	int scale_needed, scale, minscale = INT_MAX;
 	int i, j;
 
@@ -538,15 +530,15 @@ static void hz_to_spi_baud(char *pbr, char *br, int speed_hz,
 }
 
 static void ns_delay_scale(char *psc, char *sc, int delay_ns,
-		unsigned long clkrate)
+			   unsigned long clkrate)
 {
-	int pscale_tbl[4] = {1, 3, 5, 7};
 	int scale_needed, scale, minscale = INT_MAX;
-	int i, j;
+	int pscale_tbl[4] = {1, 3, 5, 7};
 	u32 remainder;
+	int i, j;
 
 	scale_needed = div_u64_rem((u64)delay_ns * clkrate, NSEC_PER_SEC,
-			&remainder);
+				   &remainder);
 	if (remainder)
 		scale_needed++;
 
@@ -601,7 +593,7 @@ static void dspi_tcfq_write(struct fsl_dspi *dspi)
 		 */
 		u32 data = dspi_pop_tx(dspi);
 
-		if (dspi->cur_chip->ctar_val & SPI_CTAR_LSBFE(1)) {
+		if (dspi->cur_chip->ctar_val & SPI_CTAR_LSBFE) {
 			/* LSB */
 			tx_fifo_write(dspi, data & 0xFFFF);
 			tx_fifo_write(dspi, data >> 16);
@@ -655,19 +647,19 @@ static void dspi_eoq_read(struct fsl_dspi *dspi)
 {
 	int fifo_size = DSPI_FIFO_SIZE;
 
-	/* Read one FIFO entry at and push to rx buffer */
+	/* Read one FIFO entry and push to rx buffer */
 	while ((dspi->rx < dspi->rx_end) && fifo_size--)
 		dspi_push_rx(dspi, fifo_read(dspi));
 }
 
 static int dspi_transfer_one_message(struct spi_master *master,
-		struct spi_message *message)
+				     struct spi_message *message)
 {
 	struct fsl_dspi *dspi = spi_master_get_devdata(master);
 	struct spi_device *spi = message->spi;
+	enum dspi_trans_mode trans_mode;
 	struct spi_transfer *transfer;
 	int status = 0;
-	enum dspi_trans_mode trans_mode;
 
 	message->actual_length = 0;
 
@@ -677,7 +669,7 @@ static int dspi_transfer_one_message(struct spi_master *master,
 		dspi->cur_chip = spi_get_ctldata(spi);
 		/* Prepare command word for CMD FIFO */
 		dspi->tx_cmd = SPI_PUSHR_CMD_CTAS(0) |
-			SPI_PUSHR_CMD_PCS(spi->chip_select);
+			       SPI_PUSHR_CMD_PCS(spi->chip_select);
 		if (list_is_last(&dspi->cur_transfer->transfer_list,
 				 &dspi->cur_msg->transfers)) {
 			/* Leave PCS activated after last transfer when
@@ -718,8 +710,8 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			     SPI_FRAME_BITS(transfer->bits_per_word));
 		if (dspi->devtype_data->xspi_mode)
 			regmap_write(dspi->regmap, SPI_CTARE(0),
-				     SPI_FRAME_EBITS(transfer->bits_per_word)
-				     | SPI_CTARE_DTCP(1));
+				     SPI_FRAME_EBITS(transfer->bits_per_word) |
+				     SPI_CTARE_DTCP(1));
 
 		trans_mode = dspi->devtype_data->trans_mode;
 		switch (trans_mode) {
@@ -733,8 +725,8 @@ static int dspi_transfer_one_message(struct spi_master *master,
 			break;
 		case DSPI_DMA_MODE:
 			regmap_write(dspi->regmap, SPI_RSER,
-				SPI_RSER_TFFFE | SPI_RSER_TFFFD |
-				SPI_RSER_RFDFE | SPI_RSER_RFDFD);
+				     SPI_RSER_TFFFE | SPI_RSER_TFFFD |
+				     SPI_RSER_RFDFE | SPI_RSER_RFDFD);
 			status = dspi_dma_xfer(dspi);
 			break;
 		default:
@@ -746,7 +738,7 @@ static int dspi_transfer_one_message(struct spi_master *master,
 
 		if (trans_mode != DSPI_DMA_MODE) {
 			if (wait_event_interruptible(dspi->waitq,
-						dspi->waitflags))
+						     dspi->waitflags))
 				dev_err(&dspi->pdev->dev,
 					"wait transfer complete fail!\n");
 			dspi->waitflags = 0;
@@ -765,12 +757,12 @@ static int dspi_transfer_one_message(struct spi_master *master,
 
 static int dspi_setup(struct spi_device *spi)
 {
-	struct chip_data *chip;
 	struct fsl_dspi *dspi = spi_master_get_devdata(spi->master);
-	struct fsl_dspi_platform_data *pdata;
-	u32 cs_sck_delay = 0, sck_cs_delay = 0;
 	unsigned char br = 0, pbr = 0, pcssck = 0, cssck = 0;
+	u32 cs_sck_delay = 0, sck_cs_delay = 0;
+	struct fsl_dspi_platform_data *pdata;
 	unsigned char pasc = 0, asc = 0;
+	struct chip_data *chip;
 	unsigned long clkrate;
 
 	/* Only alloc on first setup */
@@ -805,18 +797,22 @@ static int dspi_setup(struct spi_device *spi)
 	/* Set After SCK delay scale values */
 	ns_delay_scale(&pasc, &asc, sck_cs_delay, clkrate);
 
-	chip->ctar_val = SPI_CTAR_CPOL(spi->mode & SPI_CPOL ? 1 : 0)
-		| SPI_CTAR_CPHA(spi->mode & SPI_CPHA ? 1 : 0);
+	chip->ctar_val = 0;
+	if (spi->mode & SPI_CPOL)
+		chip->ctar_val |= SPI_CTAR_CPOL;
+	if (spi->mode & SPI_CPHA)
+		chip->ctar_val |= SPI_CTAR_CPHA;
 
 	if (!spi_controller_is_slave(dspi->master)) {
-		chip->ctar_val |= SPI_CTAR_LSBFE(spi->mode &
-						 SPI_LSB_FIRST ? 1 : 0)
-			| SPI_CTAR_PCSSCK(pcssck)
-			| SPI_CTAR_CSSCK(cssck)
-			| SPI_CTAR_PASC(pasc)
-			| SPI_CTAR_ASC(asc)
-			| SPI_CTAR_PBR(pbr)
-			| SPI_CTAR_BR(br);
+		chip->ctar_val |= SPI_CTAR_PCSSCK(pcssck) |
+				  SPI_CTAR_CSSCK(cssck) |
+				  SPI_CTAR_PASC(pasc) |
+				  SPI_CTAR_ASC(asc) |
+				  SPI_CTAR_PBR(pbr) |
+				  SPI_CTAR_BR(br);
+
+		if (spi->mode & SPI_LSB_FIRST)
+			chip->ctar_val |= SPI_CTAR_LSBFE;
 	}
 
 	spi_set_ctldata(spi, chip);
@@ -829,7 +825,7 @@ static void dspi_cleanup(struct spi_device *spi)
 	struct chip_data *chip = spi_get_ctldata((struct spi_device *)spi);
 
 	dev_dbg(&spi->dev, "spi_device %u.%u cleanup\n",
-			spi->master->bus_num, spi->chip_select);
+		spi->master->bus_num, spi->chip_select);
 
 	kfree(chip);
 }
@@ -845,7 +841,6 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
 	regmap_read(dspi->regmap, SPI_SR, &spi_sr);
 	regmap_write(dspi->regmap, SPI_SR, spi_sr);
 
-
 	if (spi_sr & (SPI_SR_EOQF | SPI_SR_TCFQF)) {
 		/* Get transfer counter (in number of SPI transfers). It was
 		 * reset to 0 when transfer(s) were started.
@@ -982,9 +977,10 @@ static const struct regmap_config dspi_xspi_regmap_config[] = {
 
 static void dspi_init(struct fsl_dspi *dspi)
 {
-	unsigned int mcr = SPI_MCR_PCSIS |
-		(dspi->devtype_data->xspi_mode ? SPI_MCR_XSPI : 0);
+	unsigned int mcr = SPI_MCR_PCSIS;
 
+	if (dspi->devtype_data->xspi_mode)
+		mcr |= SPI_MCR_XSPI;
 	if (!spi_controller_is_slave(dspi->master))
 		mcr |= SPI_MCR_MASTER;
 
@@ -1161,12 +1157,12 @@ static int dspi_remove(struct platform_device *pdev)
 }
 
 static struct platform_driver fsl_dspi_driver = {
-	.driver.name    = DRIVER_NAME,
-	.driver.of_match_table = fsl_dspi_dt_ids,
-	.driver.owner   = THIS_MODULE,
-	.driver.pm = &dspi_pm,
-	.probe          = dspi_probe,
-	.remove		= dspi_remove,
+	.driver.name		= DRIVER_NAME,
+	.driver.of_match_table	= fsl_dspi_dt_ids,
+	.driver.owner		= THIS_MODULE,
+	.driver.pm		= &dspi_pm,
+	.probe			= dspi_probe,
+	.remove			= dspi_remove,
 };
 module_platform_driver(fsl_dspi_driver);
 
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 11/11] ARM: dts: ls1021a-tsn: Reduce the SJA1105 SPI frequency for debug
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

I have a logic analyzer that cannot sample signals at a higher frequency
than this, and it's nice to actually see the captured data and not just
an amorphous mess.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 arch/arm/boot/dts/ls1021a-tsn.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/boot/dts/ls1021a-tsn.dts b/arch/arm/boot/dts/ls1021a-tsn.dts
index 3b35e6b5977f..8fdf4c3b24c7 100644
--- a/arch/arm/boot/dts/ls1021a-tsn.dts
+++ b/arch/arm/boot/dts/ls1021a-tsn.dts
@@ -55,7 +55,7 @@
 		#size-cells = <0>;
 		compatible = "nxp,sja1105t";
 		/* 12 MHz */
-		spi-max-frequency = <12000000>;
+		spi-max-frequency = <6000000>;
 		/* Sample data on trailing clock edge */
 		spi-cpha;
 		/* SPI controller settings for SJA1105 timing requirements */
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 10/11] ARM: dts: ls1021a-tsn: Use the DSPI controller in poll mode
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

Connected to the LS1021A DSPI is the SJA1105 DSA switch. This
constitutes 4 of the 6 Ethernet ports on this board. When using the
board as a PTP switch and bridging all 6 ports under one single L2
entity, it is good to also have the PTP clocks of the switch and of the
standalone Ethernet ports in sync.

This cannot be done with hardware timestamping, and is where phc2sys
comes into play. Using poll mode for SPI access helps ensure that all
transfers take a deterministic time to complete, which is an important
requirement for a TSN switch.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 arch/arm/boot/dts/ls1021a-tsn.dts | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/boot/dts/ls1021a-tsn.dts b/arch/arm/boot/dts/ls1021a-tsn.dts
index 6cec454c484c..3b35e6b5977f 100644
--- a/arch/arm/boot/dts/ls1021a-tsn.dts
+++ b/arch/arm/boot/dts/ls1021a-tsn.dts
@@ -37,6 +37,7 @@
 	bus-num = <0>;
 	/* EXP1_GPIO6 is GPIO4_18 */
 	debug-gpios = <&gpio3 18 GPIO_ACTIVE_HIGH>;
+	/delete-property/ interrupts;
 	status = "okay";

 	/* ADG704BRMZ 1:4 SPI mux/demux */
-- 
2.17.1

^ permalink raw reply related

* [RFC PATCH net-next 07/11] spi: spi-fsl-dspi: Add a debugging GPIO for monitoring latency
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

This is being used to monitor the time it takes to transmit individual
bytes over SPI to the slave device. It is used in conjunction with the
PTP system timestamp feature - only the byte that was requested to be
timestamped triggers a toggle of the GPIO pin.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/spi/spi-fsl-dspi.c | 30 +++++++++++++++++++++++++++---
 1 file changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index 3fc266d8263a..f0838853392d 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -203,6 +203,7 @@ struct fsl_dspi {
 	wait_queue_head_t	waitq;
 	u32			waitflags;
 
+	struct gpio_desc	*debug_gpio;
 	struct fsl_dspi_dma	*dma;
 };
 
@@ -223,6 +224,15 @@ static u32 dspi_pop_tx(struct fsl_dspi *dspi)
 	return txdata;
 }
 
+void dspi_debug_gpio(struct fsl_dspi *dspi, bool enabled)
+{
+	if (IS_ERR(dspi->debug_gpio)) {
+		dev_err(&dspi->pdev->dev, "Bad debug GPIO!\n");
+		return;
+	}
+	gpiod_set_value_cansleep(dspi->debug_gpio, enabled);
+}
+
 static u32 dspi_pop_tx_pushr(struct fsl_dspi *dspi)
 {
 	u16 cmd = dspi->tx_cmd, data = dspi_pop_tx(dspi);
@@ -661,8 +671,11 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 	u16 spi_tcnt;
 	u32 spi_tcr;
 
-	if (dspi->take_snapshot)
+	if (dspi->take_snapshot) {
 		ptp_read_system_postts(dspi->ptp_sts);
+		if (dspi->ptp_sts)
+			dspi_debug_gpio(dspi, 0);
+	}
 
 	/* Get transfer counter (in number of SPI transfers). It was
 	 * reset to 0 when transfer(s) were started.
@@ -683,8 +696,11 @@ static int dspi_rxtx(struct fsl_dspi *dspi)
 
 	dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
 
-	if (dspi->take_snapshot)
+	if (dspi->take_snapshot) {
+		if (dspi->ptp_sts)
+			dspi_debug_gpio(dspi, 1);
 		ptp_read_system_prets(dspi->ptp_sts);
+	}
 
 	if (dspi->devtype_data->trans_mode == DSPI_EOQ_MODE)
 		dspi_eoq_write(dspi);
@@ -799,8 +815,11 @@ static int dspi_transfer_one_message(struct spi_master *master,
 
 		dspi->take_snapshot = (dspi->tx == dspi->ptp_sts_word);
 
-		if (dspi->take_snapshot)
+		if (dspi->take_snapshot) {
+			if (dspi->ptp_sts)
+				dspi_debug_gpio(dspi, 1);
 			ptp_read_system_prets(dspi->ptp_sts);
+		}
 
 		trans_mode = dspi->devtype_data->trans_mode;
 		switch (trans_mode) {
@@ -1126,6 +1145,11 @@ static int dspi_probe(struct platform_device *pdev)
 		}
 	}
 
+	dspi->debug_gpio = devm_gpiod_get(&pdev->dev, "debug",
+					  GPIOD_OUT_HIGH);
+
+	dspi_debug_gpio(dspi, 0);
+
 	dspi->clk = devm_clk_get(&pdev->dev, "dspi");
 	if (IS_ERR(dspi->clk)) {
 		ret = PTR_ERR(dspi->clk);
-- 
2.17.1


^ permalink raw reply related

* [RFC PATCH net-next 03/11] spi: Add a PTP system timestamp to the transfer structure
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

SPI is one of the interfaces used to access devices which have a POSIX
clock driver (real time clocks, 1588 timers etc).

Since there are lots of sources of timing jitter when retrieving the
time from such a device (controller delays, locking contention etc),
introduce a PTP system timestamp structure in struct spi_transfer. This
is to be used by SPI device drivers when they need to know the exact
time at which the underlying device's time was snapshotted.

Because SPI controllers may have jitter even between frames, also
introduce a field which specifies to the controller driver specifically
which byte needs to be snapshotted.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 include/linux/spi/spi.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/spi/spi.h b/include/linux/spi/spi.h
index af4f265d0f67..5a1e4b24c617 100644
--- a/include/linux/spi/spi.h
+++ b/include/linux/spi/spi.h
@@ -13,6 +13,7 @@
 #include <linux/completion.h>
 #include <linux/scatterlist.h>
 #include <linux/gpio/consumer.h>
+#include <linux/ptp_clock_kernel.h>

 struct dma_chan;
 struct property_entry;
@@ -842,6 +843,9 @@ struct spi_transfer {

 	u32		effective_speed_hz;

+	struct ptp_system_timestamp *ptp_sts;
+	unsigned int	ptp_sts_word_offset;
+
 	struct list_head transfer_list;
 };

-- 
2.17.1

^ permalink raw reply related

* [RFC PATCH net-next 00/11] Deterministic SPI latency on NXP
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean

Continuing the discussion created by Hubert Feurstein around the
mv88e6xxx driver for MDIO-controlled switches
(https://lkml.org/lkml/2019/8/2/1364), this patchset takes a similar
approach for the NXP LS1021A-TSN board, which has a SPI-controlled DSA
switch (SJA1105).

The patchset is motivated by some experiments done with a logic
analyzer, trying to understand the source of latency (and especially of
the jitter). SJA1105 SPI messages for reading the PTP clock are 12 bytes
in length: 4 for the SPI header and 8 for the timestamp. When looking at
the messages with a scope, there's jitter basically everywhere: between
bits of a frame and between frames in a transfer. The inter-bit jitter
is hardware and impacts us to a lesser extend (is smaller and caused by
the PVT stability of the oscillators, PLLs, etc). We will focus on the
latency between consecutive SPI frames within a 12-byte transfer.

As a preface, revisions of the DSPI controller IP are integrated in many
Freescale/NXP devices. As a result, the driver has 3 modes of operation:
- TCFQ (Transfer Complete Flag mode): The controller signals software
  that data has been sent/received after each individual word.
- EOQ (End of Queue mode): The driver can implement batching by making
  use of the controller's 4-word deep FIFO.
- DMA (Direct Memory Access mode): The SPI controller's FIFO is no
  longer in direct interaction with the driver, but is used to trigger
  the RX and TX channels of the eDMA module on the SoC.

In LS1021A, the driver works in the least efficient mode of the 3
(TCFQ). There is a well-known errata that the DSPI controller is broken
in conjunction with the eDMA module. As for the EOQ mode, I have tried
unsuccessfully for a few days to make use of the 4 entry FIFO, and the
hardware simply fails to reliably acknowledge the transmission when the
FIFO gets full. So it looks like we're stuck with the TCFQ mode.

The problem with phc2sys on the LS1021A-TSN board is that in order for
the gettime64() call to complete on the sja1105, the system has to
service 12 IRQs. Intuitively that is excessive and is the main source of
jitter, but let's not get ahead of ourselves.

An outline of the experiments that were done (unless otherwise
mentioned, all of these ran for 120 seconds):

A. First I have measured the (poor) performance of phc2sys under current
   conditions. (DSPI driver in IRQ mode, no PTP system timestamping)

   offset: min -53310 max 16107 mean -1737.18 std dev 11444.3
   delay: min 163680 max 237360 mean 201149 std dev 22446.6
   lost servo lock 1 times

B. I switched the .gettime64 callback to .gettimex64, snapshotting the
   PTP system timestamp within the sja1105 driver.

   offset: min -48923 max 64217 mean -904.137 std dev 17358.1
   delay: min 149600 max 203840 mean 169045 std dev 17993.3
   lost servo lock 8 times

C. I patched "struct spi_transfer" to contain the PTP system timestamp,
   and from the sja1105 driver, I passed this structure to be
   snapshotted by the SPI controller's driver (spi-fsl-dspi). This is
   the "transfer-level" snapshot.

   offset: min -64979 max 38979 mean -416.197 std dev 15367.9
   delay: min 125120 max 168320 mean 150286 std dev 17675.3
   lost servo lock 10 times

D. I changed the placement of the transfer snapshotting within the DSPI
   driver, from "transfer-level" to "byte-level".

   offset: min -9021 max 7149 mean -0.418803 std dev 3529.81
   delay: min 7840 max 23920 mean 14493.7 std dev 5982.17
   lost servo lock 0 times

E. I moved the DSPI driver to poll mode. I went back to collecting the
   PTP system timestamps from the sja1105 driver (same as B).

   offset: min -4199 max 46643 mean 418.214 std dev 4554.01
   delay: min 84000 max 194000 mean 99463.2 std dev 12936.5
   lost servo lock 1 times

F. Transfer-level snapshotting in the DSPI driver (same as C), but in
   poll mode.

   offset: min -24244 max 1115 mean -230.478 std dev 2297.28
   delay: min 69440 max 119040 mean 70312.9 std dev 8065.34
   lost servo lock 1 times

G. Byte-level snapshotting (same as D) but in poll mode.

   offset: min -314 max 288 mean -2.48718 std dev 118.045
   delay: min 4880 max 6000 mean 5118.63 std dev 507.258
   lost servo lock 0 times

   This seemed suspiciously good to me, so I let it run for longer
   (58 minutes):

   offset: min -26251 max 16416 mean -21.8672 std dev 863.416
   delay: min 4720 max 57280 mean 5182.49 std dev 1607.19
   lost servo lock 3 times

H. Transfer-level snapshotting (same as F), but with IRQs disabled.
   This ran for 86 minutes.

   offset: min -1927 max 1843 mean -0.209203 std dev 529.398
   delay: min 85440 max 93680 mean 88245 std dev 1454.71
   lost servo lock 0 times

I. Byte-level snapshotting (same as G), but with IRQs disabled.
   This ran for 102 minutes.

   offset: min -378 max 381 mean -0.0083089 std dev 101.495
   delay: min 4720 max 5920 mean 5129.38 std dev 154.899
   lost servo lock 0 times

As a result, this patchset proposes the implementation of scenario I.
The others were done through temporary patches which are not presented
here due to the difficulty of presenting a coherent git history without
resorting to reverts etc. The gist of each experiment should be clear
though.

The raw data is available for dissection at
https://drive.google.com/open?id=1r9raU9ZeqOqkqts6Lb-ISf5ubLDLP3wk.
The logic analyzer captures can be opened with a free-as-in-beer program
provided by Saleae: https://www.saleae.com/downloads/.

In the capture data one can find the MOSI, SCK SPI signals, as well as a
debug GPIO which was toggled at the same time as the PTP system
timestamp was taken, to give the viewer an impression of what the
software is capturing compared to the actual timing of the SPI transfer.

Attached are also some close-up screenshots of transfers where there is
a clear and huge delay in-between frames of the same 12-byte SPI
transfer. As it turns out, these were all caused by the CPU getting
interrupted by some other IRQ. Approaches H and I are the only ones that
get rid of these glitches. In theory, the byte-level snapshotting should
be less vulnerable to an IRQ interrupting the SPI transfer (because the
time window is much smaller) but as the 58 minutes experiment shows, it
is not immune.

Vladimir Oltean (11):
  net: dsa: sja1105: Add a debugging GPIO for monitoring SPI latency
  net: dsa: sja1105: Implement the .gettimex64 system call for PTP
  spi: Add a PTP system timestamp to the transfer structure
  spi: spi-fsl-dspi: Cosmetic cleanup
  spi: spi-fsl-dspi: Use poll mode in case the platform IRQ is missing
  spi: spi-fsl-dspi: Implement the PTP system timestamping
  spi: spi-fsl-dspi: Add a debugging GPIO for monitoring latency
  spi: spi-fsl-dspi: Disable interrupts and preemption during poll mode
    transfer
  ARM: dts: ls1021a-tsn: Add debugging GPIOs for the SJA1105 and DSPI
    drivers
  ARM: dts: ls1021a-tsn: Use the DSPI controller in poll mode
  ARM: dts: ls1021a-tsn: Reduce the SJA1105 SPI frequency for debug

 arch/arm/boot/dts/ls1021a-tsn.dts      |   8 +-
 drivers/net/dsa/sja1105/sja1105.h      |   8 +-
 drivers/net/dsa/sja1105/sja1105_main.c |  15 +-
 drivers/net/dsa/sja1105/sja1105_ptp.c  |  23 +-
 drivers/net/dsa/sja1105/sja1105_spi.c  |  34 +-
 drivers/spi/spi-fsl-dspi.c             | 518 ++++++++++++++-----------
 include/linux/spi/spi.h                |   4 +
 7 files changed, 365 insertions(+), 245 deletions(-)

-- 
2.17.1

^ permalink raw reply

* [RFC PATCH net-next 01/11] net: dsa: sja1105: Add a debugging GPIO for monitoring SPI latency
From: Vladimir Oltean @ 2019-08-16  0:44 UTC (permalink / raw)
  To: h.feurstein, mlichvar, richardcochran, andrew, f.fainelli,
	broonie
  Cc: linux-spi, netdev, Vladimir Oltean
In-Reply-To: <20190816004449.10100-1-olteanv@gmail.com>

I am using this to monitor the gettimex64 callback jitter, i.e. the
time it takes for the SPI controller to retrieve the PTP time from the
switch, and therefore the uncertainty in the time that has just been
read.

Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 drivers/net/dsa/sja1105/sja1105.h      |  4 ++++
 drivers/net/dsa/sja1105/sja1105_main.c | 11 +++++++++++
 drivers/net/dsa/sja1105/sja1105_ptp.c  |  2 ++
 3 files changed, 17 insertions(+)

diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index 3dbfe41b370e..e6371b2c2df1 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -89,6 +89,7 @@ struct sja1105_private {
 	bool rgmii_tx_delay[SJA1105_NUM_PORTS];
 	const struct sja1105_info *info;
 	struct gpio_desc *reset_gpio;
+	struct gpio_desc *debug_gpio;
 	struct spi_device *spidev;
 	struct dsa_switch *ds;
 	struct sja1105_port ports[SJA1105_NUM_PORTS];
@@ -126,6 +127,9 @@ typedef enum {
 	SPI_WRITE = 1,
 } sja1105_spi_rw_mode_t;
 
+/* From sja1105_main.c */
+void sja1105_debug_gpio(struct sja1105_private *priv, bool enabled);
+
 /* From sja1105_spi.c */
 int sja1105_spi_send_packed_buf(const struct sja1105_private *priv,
 				sja1105_spi_rw_mode_t rw, u64 reg_addr,
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 17b917d1e6be..82bdc2da8f8f 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -23,6 +23,15 @@
 #include <linux/dsa/8021q.h>
 #include "sja1105.h"
 
+void sja1105_debug_gpio(struct sja1105_private *priv, bool enabled)
+{
+	if (IS_ERR(priv->debug_gpio)) {
+		dev_err(priv->ds->dev, "Bad debug GPIO!\n");
+		return;
+	}
+	gpiod_set_value_cansleep(priv->debug_gpio, enabled);
+}
+
 static void sja1105_hw_reset(struct gpio_desc *gpio, unsigned int pulse_len,
 			     unsigned int startup_delay)
 {
@@ -2142,6 +2151,8 @@ static int sja1105_probe(struct spi_device *spi)
 	else
 		sja1105_hw_reset(priv->reset_gpio, 1, 1);
 
+	priv->debug_gpio = devm_gpiod_get(dev, "debug", GPIOD_OUT_HIGH);
+
 	/* Populate our driver private structure (priv) based on
 	 * the device tree node that was probed (spi)
 	 */
diff --git a/drivers/net/dsa/sja1105/sja1105_ptp.c b/drivers/net/dsa/sja1105/sja1105_ptp.c
index 90a595cc596d..51a0014369fc 100644
--- a/drivers/net/dsa/sja1105/sja1105_ptp.c
+++ b/drivers/net/dsa/sja1105/sja1105_ptp.c
@@ -353,8 +353,10 @@ static u64 sja1105_ptptsclk_read(const struct cyclecounter *cc)
 	u64 ptptsclk = 0;
 	int rc;
 
+	sja1105_debug_gpio(priv, 1);
 	rc = sja1105_spi_send_int(priv, SPI_READ, regs->ptptsclk,
 				  &ptptsclk, 8);
+	sja1105_debug_gpio(priv, 0);
 	if (rc < 0)
 		dev_err_ratelimited(priv->ds->dev,
 				    "failed to read ptp cycle counter: %d\n",
-- 
2.17.1


^ permalink raw reply related

* Re: WARNING in tracepoint_probe_register_prio (3)
From: syzbot @ 2019-08-16  0:11 UTC (permalink / raw)
  To: ard.biesheuvel, gregkh, gustavo, jeyu, linux-kernel,
	mathieu.desnoyers, mingo, netdev, paulmck, paulmck, rostedt,
	syzkaller-bugs, tglx
In-Reply-To: <000000000000ab6f84056c786b93@google.com>

syzbot has found a reproducer for the following crash on:

HEAD commit:    ecb9f80d net/mvpp2: Replace tasklet with softirq hrtimer
git tree:       net-next
console output: https://syzkaller.appspot.com/x/log.txt?x=115730ac600000
kernel config:  https://syzkaller.appspot.com/x/.config?x=d4cf1ffb87d590d7
dashboard link: https://syzkaller.appspot.com/bug?extid=774fddf07b7ab29a1e55
compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=11b02a22600000

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+774fddf07b7ab29a1e55@syzkaller.appspotmail.com

WARNING: CPU: 0 PID: 8824 at kernel/tracepoint.c:243 tracepoint_add_func  
kernel/tracepoint.c:243 [inline]
WARNING: CPU: 0 PID: 8824 at kernel/tracepoint.c:243  
tracepoint_probe_register_prio+0x216/0x790 kernel/tracepoint.c:315
Kernel panic - not syncing: panic_on_warn set ...
CPU: 0 PID: 8824 Comm: syz-executor.4 Not tainted 5.3.0-rc3+ #133
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS  
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
  panic+0x2dc/0x755 kernel/panic.c:219
  __warn.cold+0x20/0x4c kernel/panic.c:576
  report_bug+0x263/0x2b0 lib/bug.c:186
  fixup_bug arch/x86/kernel/traps.c:179 [inline]
  fixup_bug arch/x86/kernel/traps.c:174 [inline]
  do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
  do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
  invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:tracepoint_add_func kernel/tracepoint.c:243 [inline]
RIP: 0010:tracepoint_probe_register_prio+0x216/0x790 kernel/tracepoint.c:315
Code: 48 89 f8 48 c1 e8 03 80 3c 08 00 0f 85 bf 04 00 00 48 8b 45 b8 49 3b  
45 08 0f 85 21 ff ff ff 41 bd ef ff ff ff e8 aa 8c fe ff <0f> 0b e8 a3 8c  
fe ff 48 c7 c7 20 44 de 88 e8 d7 1d ca 05 44 89 e8
RSP: 0018:ffff88807b5a7498 EFLAGS: 00010293
RAX: ffff8880a87a85c0 RBX: ffffffff89a1eb00 RCX: dffffc0000000000
RDX: 0000000000000000 RSI: ffffffff8173fcd6 RDI: ffff88808afc4698
RBP: ffff88807b5a74f0 R08: ffff8880a87a85c0 R09: fffffbfff11bc885
R10: ffff88807b5a7488 R11: ffffffff88de4427 R12: ffff88808afc4690
R13: 00000000ffffffef R14: 00000000ffffffff R15: ffffffff8177f710
  tracepoint_probe_register+0x2b/0x40 kernel/tracepoint.c:335
  register_trace_sched_wakeup include/trace/events/sched.h:96 [inline]
  tracing_sched_register kernel/trace/trace_sched_switch.c:54 [inline]
  tracing_start_sched_switch+0xa8/0x190 kernel/trace/trace_sched_switch.c:106
  tracing_start_cmdline_record+0x13/0x20  
kernel/trace/trace_sched_switch.c:131
  trace_printk_init_buffers kernel/trace/trace.c:3050 [inline]
  trace_printk_init_buffers.cold+0xdf/0xe9 kernel/trace/trace.c:3013
  bpf_get_trace_printk_proto+0xe/0x20 kernel/trace/bpf_trace.c:338
  cgroup_base_func_proto kernel/bpf/cgroup.c:799 [inline]
  cgroup_base_func_proto.isra.0+0x10e/0x120 kernel/bpf/cgroup.c:776
  cg_sockopt_func_proto+0x53/0x70 kernel/bpf/cgroup.c:1411
  check_helper_call+0x141/0x32f0 kernel/bpf/verifier.c:3950
  do_check+0x6213/0x89f0 kernel/bpf/verifier.c:7707
  bpf_check+0x6f99/0x9948 kernel/bpf/verifier.c:9294
  bpf_prog_load+0xe68/0x1670 kernel/bpf/syscall.c:1698
  __do_sys_bpf+0xc43/0x3460 kernel/bpf/syscall.c:2849
  __se_sys_bpf kernel/bpf/syscall.c:2808 [inline]
  __x64_sys_bpf+0x73/0xb0 kernel/bpf/syscall.c:2808
  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
  entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x459829
Code: fd b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7  
48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff  
ff 0f 83 cb b7 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fc4bf1dec78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000459829
RDX: 0000000000000070 RSI: 0000000020000180 RDI: 0000000000000005
RBP: 000000000075bf20 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc4bf1df6d4
R13: 00000000004bfc7c R14: 00000000004d1938 R15: 00000000ffffffff
Kernel Offset: disabled
Rebooting in 86400 seconds..


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox