Linux-ARM-Kernel Archive on lore.kernel.org
* [RFC PATCH 6/8] dts: coresight: Clean up the device tree graph bindings
From: Mathieu Poirier @ 2018-06-01 20:26 UTC
  To: linux-arm-kernel
In-Reply-To: <1527858967-16047-7-git-send-email-suzuki.poulose@arm.com>

On Fri, Jun 01, 2018 at 02:16:05PM +0100, Suzuki K Poulose wrote:
> The coresight drivers relied on default bindings for graph
> in DT, while reusing the "reg" field of the "ports" to indicate
> the actual hardware port number for the connections. However,
> with the rules getting stricter w.r.t to the address mismatch
> with the label, it is no longer possible to use the port address
> field for the hardware port number. Hence, we add an explicit
> property to denote the hardware port number, "coresight,hwid"
> which must be specified for each "endpoint".
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Cc: Sudeep Holla <sudeep.holla@arm.com>
> Cc: Rob Herring <robh@kernel.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  .../devicetree/bindings/arm/coresight.txt          | 26 +++++++++---
>  drivers/hwtracing/coresight/of_coresight.c         | 46 ++++++++++++++++------
>  2 files changed, 54 insertions(+), 18 deletions(-)
> 
> diff --git a/Documentation/devicetree/bindings/arm/coresight.txt b/Documentation/devicetree/bindings/arm/coresight.txt
> index bd36e40..385581a 100644
> --- a/Documentation/devicetree/bindings/arm/coresight.txt
> +++ b/Documentation/devicetree/bindings/arm/coresight.txt
> @@ -104,7 +104,11 @@ properties to uniquely identify the connection details.
>  	"slave-mode"
>  
>   * Hardware Port number at the component:
> -     -  The hardware port number is assumed to be the address of the "port" component.
> +   - (Obsolete) The hardware port number is assumed to be the address of the "port" component.
> +   - Each "endpoint" must define the hardware port of the local end of the
> +     connection using the following property:
> +	"coresight,hwid" - 32bit integer, hardware port number at the local end.
> +
>  
>  
>  Example:
> @@ -120,6 +124,7 @@ Example:
>  			etb_in_port: endpoint@0 {
>  				slave-mode;
>  				remote-endpoint = <&replicator_out_port0>;
> +				coresight,hwid = <0>;
>  			};
>  		};
>  	};
> @@ -134,6 +139,7 @@ Example:
>  			tpiu_in_port: endpoint@0 {
>  				slave-mode;
>  				remote-endpoint = <&replicator_out_port1>;
> +				coresight,hwid = <0>;
>  			};
>  		};
>  	};
> @@ -154,6 +160,7 @@ Example:
>  				reg = <0>;
>  				replicator_out_port0: endpoint {
>  					remote-endpoint = <&etb_in_port>;
> +					coresight,hwid = <0>;
>  				};
>  			};
>  
> @@ -161,15 +168,17 @@ Example:
>  				reg = <1>;
>  				replicator_out_port1: endpoint {
>  					remote-endpoint = <&tpiu_in_port>;
> +					coresight,hwid = <1>;
>  				};
>  			};
>  
>  			/* replicator input port */
>  			port@2 {
> -				reg = <0>;
> +				reg = <1>;
>  				replicator_in_port0: endpoint {
>  					slave-mode;
>  					remote-endpoint = <&funnel_out_port0>;
> +					coresight,hwid = <0>;
>  				};
>  			};
>  		};
> @@ -191,31 +200,35 @@ Example:
>  				funnel_out_port0: endpoint {
>  					remote-endpoint =
>  							<&replicator_in_port0>;
> +					coresight,hwid = <0>;
>  				};
>  			};
>  
>  			/* funnel input ports */
>  			port@1 {
> -				reg = <0>;
> +				reg = <1>;
>  				funnel_in_port0: endpoint {
>  					slave-mode;
>  					remote-endpoint = <&ptm0_out_port>;
> +					coresight,hwid = <0>;
>  				};
>  			};
>  
>  			port@2 {
> -				reg = <1>;
> +				reg = <2>;
>  				funnel_in_port1: endpoint {
>  					slave-mode;
>  					remote-endpoint = <&ptm1_out_port>;
> +					coresight,hwid = <1>;
>  				};
>  			};
>  
>  			port@3 {
> -				reg = <2>;
> +				reg = <3>;
>  				funnel_in_port2: endpoint {
>  					slave-mode;
>  					remote-endpoint = <&etm0_out_port>;
> +					coresight,hwid = <2>;
>  				};
>  			};
>  
> @@ -233,6 +246,7 @@ Example:
>  		port {
>  			ptm0_out_port: endpoint {
>  				remote-endpoint = <&funnel_in_port0>;
> +				coresight,hwid = <0>;
>  			};
>  		};
>  	};
> @@ -247,6 +261,7 @@ Example:
>  		port {
>  			ptm1_out_port: endpoint {
>  				remote-endpoint = <&funnel_in_port1>;
> +				coresight,hwid = <0>;
>  			};
>  		};
>  	};
> @@ -263,6 +278,7 @@ Example:
>  		port {
>  			stm_out_port: endpoint {
>  				remote-endpoint = <&main_funnel_in_port2>;
> +				coresight,hwid = <0>;
>  			};
>  		};
>  	};
> diff --git a/drivers/hwtracing/coresight/of_coresight.c b/drivers/hwtracing/coresight/of_coresight.c
> index a3f3416..99d7a9c 100644
> --- a/drivers/hwtracing/coresight/of_coresight.c
> +++ b/drivers/hwtracing/coresight/of_coresight.c
> @@ -105,14 +105,37 @@ int of_coresight_get_cpu(const struct device_node *node)
>  }
>  EXPORT_SYMBOL_GPL(of_coresight_get_cpu);
>  
> +/*
> + * of_graph_ep_coresight_get_port_id : Get the hardware port number for the
> + * given endpoint device node. Prefer the explicit "coresight,hwid" property
> + * over the endpoint register id (obsolete bindings).
> + */
> +static int of_graph_ep_coresight_get_port_id(struct device *dev,
> +					     struct device_node *ep_node)


static int of_coresight_endpoint_get_port_id(struct device *dev,
					     struct device_node *ep_node)

I think that makes more sense since this function is only visible in this file.

> +{
> +	struct of_endpoint ep;
> +	int rc, port_id;
> +
> +
> +	if (!of_property_read_u32(ep_node, "coresight,hwid", &port_id))
> +		return port_id;
> +
> +	rc = of_graph_parse_endpoint(ep_node, &ep);
> +	if (rc)
> +		return rc;
> +	dev_warn_once(dev,
> +		      "ep%d: Mandatory \"coresight,hwid\" property missing."
> +		      " DT uses obsolete coresight bindings\n", ep.port);
> +	return ep.port;
> +}
> +
>  struct coresight_platform_data *
>  of_get_coresight_platform_data(struct device *dev,
>  			       const struct device_node *node)
>  {
> -	int ret = 0;
> +	int ret = 0, outport, inport;
>  	struct coresight_platform_data *pdata;
>  	struct coresight_connection *conn;
> -	struct of_endpoint endpoint, rendpoint;
>  	struct device *rdev;
>  	struct device_node *ep = NULL;
>  	struct device_node *rparent = NULL;
> @@ -148,14 +171,10 @@ of_get_coresight_platform_data(struct device *dev,
>  			if (of_find_property(ep, "slave-mode", NULL))
>  				continue;
>  
> -			/* Get a handle on the local endpoint */
> -			ret = of_graph_parse_endpoint(ep, &endpoint);
> -
> -			if (ret)
> +			outport = of_graph_ep_coresight_get_port_id(dev, ep);
> +			if (outport < 0)
>  				continue;
> -
> -			/* The local out port number */
> -			conn->outport = endpoint.port;
> +			conn->outport = outport;
>  
>  			/*
>  			 * Get a handle on the remote endpoint and the device
> @@ -168,15 +187,16 @@ of_get_coresight_platform_data(struct device *dev,
>  			if (!rparent)
>  				continue;
>  
> -			if (of_graph_parse_endpoint(rep, &rendpoint))
> -				continue;
> -
>  			rdev = of_coresight_get_endpoint_device(rparent);
>  			if (!rdev)
>  				return ERR_PTR(-EPROBE_DEFER);
>  
> +			inport = of_graph_ep_coresight_get_port_id(rdev, rep);
> +			if (inport < 0)
> +				continue;
> +
>  			conn->child_name = dev_name(rdev);
> -			conn->child_port = rendpoint.port;
> +			conn->child_port = inport;
>  			conn++;
>  		} while (ep);
>  	}
> -- 
> 2.7.4
> 
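
The lookup order described by this patch (explicit "coresight,hwid" first,
legacy port address as a fallback) can be sketched in plain C outside the
kernel. This is an illustration only: struct ep_props and ep_get_port_id()
are hypothetical names standing in for the device-tree accessors, and the
hwid is held as an unsigned 32-bit value because of_property_read_u32()
writes through a u32 pointer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flattened view of one endpoint node (not a kernel type). */
struct ep_props {
	bool has_hwid;		/* is "coresight,hwid" present? */
	uint32_t hwid;		/* value of "coresight,hwid" */
	int legacy_port;	/* port address, or a negative error code */
};

/*
 * Return the hardware port number for an endpoint, or a negative error.
 * *used_legacy is set when the obsolete port-address fallback was taken.
 */
static int ep_get_port_id(const struct ep_props *ep, bool *used_legacy)
{
	*used_legacy = false;

	/* Preferred: the explicit property. */
	if (ep->has_hwid)
		return (int)ep->hwid;

	/* Fallback parse failed: propagate the error. */
	if (ep->legacy_port < 0)
		return ep->legacy_port;

	/* Obsolete binding: the caller would warn once here. */
	*used_legacy = true;
	return ep->legacy_port;
}
```

A caller would treat a negative return as "skip this endpoint" and emit the
one-time warning when used_legacy is set.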


* linux-next-20180601: build error in arch/arm64/kvm/hyp/hyp-entry.S
From: Stefan Wahren @ 2018-06-01 20:17 UTC
  To: linux-arm-kernel

Hi,

I can't build today's linux-next-20180601 and get the following error message:

arch/arm64/kvm/hyp/hyp-entry.S: Assembler messages:
arch/arm64/kvm/hyp/hyp-entry.S:128: Error: constant expression required at operand 3 -- `bfi x0,x1,#VCPU_WORKAROUND_2_FLAG_SHIFT,#1'

Related commit:
arm64: KVM: Handle guest's ARCH_WORKAROUND_2 requests

Toolchain: gcc-linaro-7.2.1-2017.11-x86_64_aarch64-linux-gnu
Kernel config: arm64/defconfig

Regards
Stefan


* [PATCH] arm64: alternative:flush cache with unpatched code
From: Rohit Khanna @ 2018-06-01 19:52 UTC
  To: linux-arm-kernel
In-Reply-To: <20180601090321.sy3t64rtps7qn2nx@salmiak>

[RK] - Thanks for the comments Mark. Reply inlined.

Thanks
Rohit
________________________________________
From: Mark Rutland <mark.rutland@arm.com>
Sent: Friday, June 1, 2018 2:03 AM
To: Rohit Khanna
Cc: catalin.marinas@arm.com; robin.murphy@arm.com; Suzuki.Poulose@arm.com; linux-arm-kernel@lists.infradead.org; Alexander Van Brunt; Bo Yan; will.deacon@arm.com
Subject: Re: [PATCH] arm64: alternative:flush cache with unpatched code

Hi,

As a general thing, could you please add a version number to patches in future?
i.e. this should be PATCHv4.

It really helps keeping track of patches, distinguishing versions, etc.

On Thu, May 31, 2018 at 01:37:34PM -0700, Rohit Khanna wrote:
> In the current implementation,  __apply_alternatives patches
> flush_icache_range and then executes it without invalidating the icache.
> Thus, icache can contain some of the old instructions for
> flush_icache_range. This can cause unpredictable behavior as during
> execution we can get a mix of old and new instructions for
> flush_icache_range.
>
> This patch :
>
> 1. Adds a new function clean_dcache_range_nopatch for flushing kernel
> memory range. This function uses non hot-patched code and can be
> safely used to flush cache during code patching.
>
> 2. Modifies __apply_alternatives so that it uses
> clean_dcache_range_nopatch to flush the cache range after patching code.
>
> Signed-off-by: Rohit Khanna <rokhanna@nvidia.com>
> ---
>  arch/arm64/include/asm/sysreg.h |  3 +++
>  arch/arm64/kernel/alternative.c | 37 ++++++++++++++++++++++++++++++++++++-
>  2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
> index 6171178075dc..9d1aee7c9aba 100644
> --- a/arch/arm64/include/asm/sysreg.h
> +++ b/arch/arm64/include/asm/sysreg.h
> @@ -617,6 +617,9 @@
>  #define MVFR1_FPDNAN_SHIFT           4
>  #define MVFR1_FPFTZ_SHIFT            0
>
> +/* SYS_CTR_EL0 */
> +#define SYS_CTR_ISIZE_SHIFT          0
> +#define SYS_CTR_DSIZE_SHIFT          16

We already have CTR_DMINLINE_SHIFT in <asm/cache.h>

Can we please add CTR_IMINLINE_SHIFT there too?

Maybe those should be moved into sysreg.h, but that can be a separate cleanup.

[RK] - <asm/cache.h> doesn't contain CTR_DMINLINE_SHIFT.

>  #define ID_AA64MMFR0_TGRAN4_SHIFT    28
>  #define ID_AA64MMFR0_TGRAN64_SHIFT   24
> diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
> index 5c4bce4ac381..6b8c5438b37b 100644
> --- a/arch/arm64/kernel/alternative.c
> +++ b/arch/arm64/kernel/alternative.c
> @@ -122,6 +122,41 @@ static void patch_alternative(struct alt_instr *alt,
>       }
>  }
>
> +/* This is used for flushing kernel memory range after
> + * __apply_alternatives has patched kernel code
> + */
> +static void clean_dcache_range_nopatch(void *start, void *end)
> +{
> +     u64 d_start, i_start, d_size, i_size, ctr_el0;

I don't think we need separate i_start and d_start variables. A 'start' or
'cur' variable could be used for both.
[RK] - ok.

> +
> +     /* use sanitised value of ctr_el0 rather than raw value from CPU */
> +     ctr_el0 = read_sanitised_ftr_reg(SYS_CTR_EL0);
> +     /* size in bytes */
> +     d_size = cpuid_feature_extract_unsigned_field(ctr_el0,
> +                     SYS_CTR_DSIZE_SHIFT);
> +     i_size = cpuid_feature_extract_unsigned_field(ctr_el0,
> +                     SYS_CTR_ISIZE_SHIFT);

This isn't the size in bytes. Each is log2 the number of (4-byte) words.

i.e. the size in bytes is (xMinLine << 2).
[RK] - This doesn't seem right. For example, if IMinLine = 4 (0b100),
       the formula above gives an I-cache line size of 4 << 2 = 16 bytes.
       The correct formula should be (4 << xMinLine), so for
       IMinLine = 4 (0b100) the I-cache line size is 4 << 4 = 64 bytes.
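
The arithmetic in this exchange can be checked with a few lines of plain C.
This is a minimal userspace sketch, not the kernel's code: ctr_field() and
ctr_line_bytes() are made-up helper names, and the shift macros are defined
locally. It assumes the architectural layout of CTR_EL0, where IminLine is
bits [3:0] and DminLine is bits [19:16], each holding log2 of the smallest
line length counted in 4-byte words.

```c
#include <stdint.h>

#define CTR_IMINLINE_SHIFT	0	/* CTR_EL0.IminLine, bits [3:0] */
#define CTR_DMINLINE_SHIFT	16	/* CTR_EL0.DminLine, bits [19:16] */

/* Hypothetical helper: extract a 4-bit unsigned field from a CTR_EL0 value. */
static inline uint64_t ctr_field(uint64_t ctr, unsigned int shift)
{
	return (ctr >> shift) & 0xf;
}

/* The field is log2(words per line); a word is 4 bytes, so bytes = 4 << field. */
static inline uint64_t ctr_line_bytes(uint64_t ctr, unsigned int shift)
{
	return 4ULL << ctr_field(ctr, shift);
}
```

With a field value of 4 this gives 4 << 4 = 64 bytes, matching the example
in the reply above.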

> +
> +     d_start = (u64)start & ~(d_size - 1);
> +     while (d_start <= (u64)end) {
> +             /* Use civac instead of cvau. This is required
> +              * due to ARM errata 826319, 827319, 824069,
> +              * 819472 on A53
> +              */
> +             asm volatile("dc civac, %0" : : "r" (d_start));

Either this needs a memory clobber, or we need barrier() first, to ensure that
the compiler doesn't re-order this against some of the patching code, however
unlikely that may be.
[RK] - So add barrier() before calling clean_dcache_range_nopatch() ?

> +             d_start += d_size;
> +     }

The loop can be simplified to:

        do {
                asm ( ... );
        } while (d_start += d_size, d_start < (u64)end);
[RK] - ok
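
The loop shape suggested here can be tried out in userspace. A minimal
sketch, assuming a hypothetical for_each_line() helper with a callback
standing in for the "dc civac" / "ic ivau" instructions; the line size must
be a power of two. In the kernel version the asm statement itself would
additionally need the "memory" clobber (or a preceding barrier()) discussed
above.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*line_op_t)(uint64_t addr, void *ctx);

/*
 * Call op() once for each cache line overlapping [start, end).
 * The do/while form always visits the first line, even if start == end.
 */
static void for_each_line(uint64_t start, uint64_t end, uint64_t line,
			  line_op_t op, void *ctx)
{
	uint64_t cur = start & ~(line - 1);	/* round down to a line boundary */

	do {
		op(cur, ctx);	/* stand-in for the cache maintenance instruction */
	} while (cur += line, cur < end);
}

/* Example callback: count how many lines were visited. */
static void count_line(uint64_t addr, void *ctx)
{
	(void)addr;
	(*(size_t *)ctx)++;
}
```

A single cursor variable serves both the D-cache and I-cache passes; only
the line size and the operation differ between the two calls.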

> +     dsb(ish);
> +
> +     i_start = (u64)start & ~(i_size - 1);
> +     while (i_start <= (u64)end) {
> +             asm volatile("ic ivau, %0" : : "r" (i_start));
> +             i_start += i_size;
> +     }

Likewise here.
[RK] - ok

Thanks,
Mark.

> +     dsb(ish);
> +     isb();
> +}
> +
>  static void __apply_alternatives(void *alt_region, bool use_linear_alias)
>  {
>       struct alt_instr *alt;
> @@ -155,7 +190,7 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
>
>               alt_cb(alt, origptr, updptr, nr_inst);
>
> -             flush_icache_range((uintptr_t)origptr,
> +             clean_dcache_range_nopatch((uintptr_t)origptr,
>                                  (uintptr_t)(origptr + nr_inst));
>       }
>  }
> --
> 2.1.4
>


* [RFC PATCH 2/8] coresight: Fix remote endpoint parsing
From: Mathieu Poirier @ 2018-06-01 19:46 UTC
  To: linux-arm-kernel
In-Reply-To: <20180601193837.GB9838@xps15>

On 1 June 2018 at 13:38, Mathieu Poirier <mathieu.poirier@linaro.org> wrote:
> On Fri, Jun 01, 2018 at 02:16:01PM +0100, Suzuki K Poulose wrote:
>> When parsing the remote endpoint of an output port, we do :
>>      rport = of_graph_get_remote_port(ep);
>>      rparent = of_graph_get_remote_port_parent(ep);
>>
>> and then parse the "remote_port" as if it was the remote endpoint,
>> which is wrong. The code worked fine because we used endpoint number
>> as the port number. Let us fix it and optimise a bit as:
>>
>>      remote_ep = of_graph_get_remote_endpoint(ep);
>>      if (remote_ep)
>>         remote_parent = of_graph_get_port_parent(remote_ep);
>>
>> and then, parse the remote_ep for the port/endpoint details.
>>
>> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
>> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
>> ---
>>  drivers/hwtracing/coresight/of_coresight.c | 19 ++++++++++---------
>>  1 file changed, 10 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/hwtracing/coresight/of_coresight.c b/drivers/hwtracing/coresight/of_coresight.c
>> index 7c37544..e0deab0 100644
>> --- a/drivers/hwtracing/coresight/of_coresight.c
>> +++ b/drivers/hwtracing/coresight/of_coresight.c
>> @@ -128,7 +128,7 @@ of_get_coresight_platform_data(struct device *dev,
>>       struct device *rdev;
>>       struct device_node *ep = NULL;
>>       struct device_node *rparent = NULL;
>> -     struct device_node *rport = NULL;
>> +     struct device_node *rep = NULL;
>>
>>       pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
>>       if (!pdata)
>> @@ -169,16 +169,17 @@ of_get_coresight_platform_data(struct device *dev,
>>                       pdata->outports[i] = endpoint.port;
>>
>>                       /*
>> -                      * Get a handle on the remote port and parent
>> -                      * attached to it.
>> +                      * Get a handle on the remote endpoint and the device
>> +                      * it is attached to.
>>                        */
>> -                     rparent = of_graph_get_remote_port_parent(ep);
>> -                     rport = of_graph_get_remote_port(ep);
>> -
>> -                     if (!rparent || !rport)
>> +                     rep = of_graph_get_remote_endpoint(ep);
>> +                     if (!rep)
>> +                             continue;
>> +                     rparent = of_graph_get_port_parent(rep);
>> +                     if (!rparent)
>>                               continue;
>>
>> -                     if (of_graph_parse_endpoint(rport, &rendpoint))
>> +                     if (of_graph_parse_endpoint(rep, &rendpoint))
>>                               continue;
>
> You are correct and I'm out to lunch.
>
>>
>>                       rdev = of_coresight_get_endpoint_device(rparent);
>> @@ -186,7 +187,7 @@ of_get_coresight_platform_data(struct device *dev,
>>                               return ERR_PTR(-EPROBE_DEFER);
>>
>>                       pdata->child_names[i] = dev_name(rdev);
>> -                     pdata->child_ports[i] = rendpoint.id;
>> +                     pdata->child_ports[i] = rendpoint.port;
>
> You need to do a of_node_put() here for both rep and rparent.

Same thing for the "continue" and error condition above.

>
>>
>>                       i++;
>>               } while (ep);
>> --
>> 2.7.4
>>
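
The reference-counting point raised in this review can be illustrated
without the kernel. This is a sketch under stated assumptions: struct node,
node_get() and node_put() are hypothetical stand-ins for struct device_node
and the of_node_get()/of_node_put() pair, and handle_endpoint() mimics one
loop iteration in which the remote endpoint and its parent are looked up
(each lookup taking a reference) and then released on every exit path.

```c
#include <stddef.h>

/* Stand-in for struct device_node, reduced to a reference count. */
struct node {
	int refcount;
	struct node *parent;
};

static struct node *node_get(struct node *n)
{
	if (n)
		n->refcount++;
	return n;
}

/* Like of_node_put(): passing NULL is a no-op. */
static void node_put(struct node *n)
{
	if (n)
		n->refcount--;
}

/*
 * One iteration: returns 0 on success, -1 where the original loop would
 * "continue". Both references are dropped on every path.
 */
static int handle_endpoint(struct node *remote_ep, int parse_fails)
{
	struct node *rep, *rparent = NULL;
	int ret = -1;

	rep = node_get(remote_ep);		/* remote endpoint lookup */
	if (!rep)
		goto out;
	rparent = node_get(rep->parent);	/* remote port parent lookup */
	if (!rparent)
		goto out;
	if (parse_fails)			/* endpoint parsing failed */
		goto out;

	ret = 0;	/* ... child name and port would be recorded here ... */
out:
	node_put(rparent);
	node_put(rep);
	return ret;
}
```

Funnelling every exit through a single label keeps the puts in one place,
which covers the success path as well as the "continue" and error paths.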


* [RFC PATCH 2/8] coresight: Fix remote endpoint parsing
From: Mathieu Poirier @ 2018-06-01 19:38 UTC
  To: linux-arm-kernel
In-Reply-To: <1527858967-16047-3-git-send-email-suzuki.poulose@arm.com>

On Fri, Jun 01, 2018 at 02:16:01PM +0100, Suzuki K Poulose wrote:
> When parsing the remote endpoint of an output port, we do :
>      rport = of_graph_get_remote_port(ep);
>      rparent = of_graph_get_remote_port_parent(ep);
> 
> and then parse the "remote_port" as if it was the remote endpoint,
> which is wrong. The code worked fine because we used endpoint number
> as the port number. Let us fix it and optimise a bit as:
> 
>      remote_ep = of_graph_get_remote_endpoint(ep);
>      if (remote_ep)
>         remote_parent = of_graph_get_port_parent(remote_ep);
> 
> and then, parse the remote_ep for the port/endpoint details.
> 
> Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
> Signed-off-by: Suzuki K Poulose <suzuki.poulose@arm.com>
> ---
>  drivers/hwtracing/coresight/of_coresight.c | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/hwtracing/coresight/of_coresight.c b/drivers/hwtracing/coresight/of_coresight.c
> index 7c37544..e0deab0 100644
> --- a/drivers/hwtracing/coresight/of_coresight.c
> +++ b/drivers/hwtracing/coresight/of_coresight.c
> @@ -128,7 +128,7 @@ of_get_coresight_platform_data(struct device *dev,
>  	struct device *rdev;
>  	struct device_node *ep = NULL;
>  	struct device_node *rparent = NULL;
> -	struct device_node *rport = NULL;
> +	struct device_node *rep = NULL;
>  
>  	pdata = devm_kzalloc(dev, sizeof(*pdata), GFP_KERNEL);
>  	if (!pdata)
> @@ -169,16 +169,17 @@ of_get_coresight_platform_data(struct device *dev,
>  			pdata->outports[i] = endpoint.port;
>  
>  			/*
> -			 * Get a handle on the remote port and parent
> -			 * attached to it.
> +			 * Get a handle on the remote endpoint and the device
> +			 * it is attached to.
>  			 */
> -			rparent = of_graph_get_remote_port_parent(ep);
> -			rport = of_graph_get_remote_port(ep);
> -
> -			if (!rparent || !rport)
> +			rep = of_graph_get_remote_endpoint(ep);
> +			if (!rep)
> +				continue;
> +			rparent = of_graph_get_port_parent(rep);
> +			if (!rparent)
>  				continue;
>  
> -			if (of_graph_parse_endpoint(rport, &rendpoint))
> +			if (of_graph_parse_endpoint(rep, &rendpoint))
>  				continue;

You are correct and I'm out to lunch.

>  
>  			rdev = of_coresight_get_endpoint_device(rparent);
> @@ -186,7 +187,7 @@ of_get_coresight_platform_data(struct device *dev,
>  				return ERR_PTR(-EPROBE_DEFER);
>  
>  			pdata->child_names[i] = dev_name(rdev);
> -			pdata->child_ports[i] = rendpoint.id;
> +			pdata->child_ports[i] = rendpoint.port;

You need to do a of_node_put() here for both rep and rparent.

>  
>  			i++;
>  		} while (ep);
> -- 
> 2.7.4
> 


* [PATCH v2 0/5] crypto: Speck support
From: Tomer Ashur @ 2018-06-01 19:23 UTC
  To: linux-arm-kernel
In-Reply-To: <8c9dc804-1f59-a245-57ba-51db3c234621@esat.kuleuven.be>

[Resending because the email bounced back from all 3 mailing lists.
Sorry if you get this email twice]
Hi Eric et al.,
I know that this thread is already stale, and I'm sorry I couldn't join
earlier but maybe late is better than never. Allow me to first introduce
myself: my name is Tomer Ashur and I'm a post-doctoral fellow at KU
Leuven. I am part of the symmetric-key group led by Vincent Rijmen,
where I'm mostly involved in cryptanalysis. I am also part of ISO/IEC
JTC 1/SC 27/WG 2, the group which decided to reject Simon and Speck from
ISO. If it's okay with you, I'd like to give my perspective on what
happened in ISO and what Speck's real standing with the academic
community is.

First, I'd like to say that the NSA has done quite extensive work in
muddying the waters, arguing that Simon & Speck are secure and that all
objections are political. This is not true, as I will now show with
examples. The bottom line is that there are still many open questions
about their security, questions that the NSA has, on multiple occasions,
refused to answer.

> It seems to me justified about as well as one would hope for a new cipher - 
>   "Notes on the design and analysis of Simon and Speck" seems to me to give ... detail on the reasoning
This is actually an optical illusion. First you need to understand the
context for this document. The NSA (in particular, the exact same person
who previously promoted DUAL_EC in ISO) proposed to include Simon &
Speck in ISO/IEC 29192-2 back in 2015. For obvious reasons they were met
with skepticism. A main concern was the lack of any design rationale and
internal cryptanalytic results. The NSA people fought tooth and nail for
a year and a half simultaneously arguing two almost mutually-exclusive
points: (i) they employ the most talented cryptographers and hence, we
should trust them when they say that an algorithm is secure; and (ii)
they are average cryptographers and hence they would not be able to
insert a backdoor into the algorithm.

More than once they argued in a meeting that the cryptanalysis for the
ciphers has been stabilized (i.e., that attacks will not improve) just
to be proved wrong in the next meeting (their answer: "well, _now_ it
has fully stabilized", which was again proven wrong in the next
meeting). One of them even had a bet with Tanja Lange that no attack on
either Simon or Speck would be extended by 3 rounds or more in the
upcoming year. He lost this bet. They were very uncooperative, and made
it a point to let us know that they will not be providing more
information about the algorithms.

So, in this climate, you can imagine how surprised we all were when in
one of the meetings (after not getting the votes they needed in order to
proceed to the next stage) they announced that they will provide a
design rationale. At first they distributed it to us in ISO, but per my
suggestion they then uploaded it to ePrint (see ePrint 2017/560).

But our joy was short-lived. Once you read this so-called design
rationale you can immediately notice two things. Firstly, they explain
at length all decisions affecting performance (in particular, rotation
amounts - which in one of the meetings they described as
"most-efficient; secure-enough"). The second thing is that when it comes
to cryptanalysis this document is merely a literature review. There is
literally nothing new there - all they do is cite published works by
academics, sometimes wrongly.

Now, there is no nice way to say this, but this document includes
omissions, falsehoods, half-truths and outright lies. I will not go into
the full analysis of the document, but here are some examples:

 1. Omissions - I already said that this document does not provide any
    new information. This becomes apparent when you try to find out how
    they chose the number of rounds. The document remains quite vague on
    this question. There is a lot of hand waving about "Matsui-like
    techniques", "multipath effect", etc., but nowhere can you find (in
    the old version; they recently uploaded a new version which I
    haven't had time to read yet) a place where they say: "this is how
    we set the number of rounds".

    Another omission is about the key schedule - you won't find any
    useful information about the design decisions leading to these
    particular key schedules. Simon uses 3 matrices U, V, and W which are
    not explained, nor is the constant c. Speck's key schedule is more
    straightforward but a discussion about the symmetries that may arise
    from using the round function for the key schedule would still be
    appropriate here. Not discussing the combined security of the cipher
    with its key schedule goes against the current trend in linear
    cryptanalysis (see e.g., [2] and many follow up papers).
 2. Half-truths - take a look at page 16 where they explain how they
    avoided rotation/slide attacks. They give the standard explanation
    that using round-constants would thwart these attacks. This could
    have been fine if the last sentence wasn't "/Also see [AL16]/". From
    the text it seems as if /AL16/ supports the claims made in this
    paragraph. However, /AL16/ is a paper I co-authored, which is how I
    know that not only does it not support the claim, it actually
    shows how to adapt rotational cryptanalysis to algorithms using
    round constants.

    As a side note, the goal of /AL16/ was to present a novel way to use
    rotational cryptanalysis in the presence of round constants. This
    paper was published in FSE'17 and we followed up on it with a paper
    in FSE'18 using this attack against Speck{32,48,64} [1]. The reason
    we focused on these versions and not the larger ones is not, as was
    suggested in this thread, that the larger ones are somehow more
    secure. The actual reason is much more prosaic: these are the
    resources we had at our disposal. This is also the reason the
    weak-key classes are so
    small. But the fact that my publicly funded university cannot afford
    a better number cruncher doesn't mean that someone with access to
    such won't be able to find better results. In fact, I am quite
    convinced that if you give our tool the resources it needs, it would
    penetrate way more than the currently best known distinguisher of 19
    rounds for Speck128 (translating to better key recovery attacks).

    What is important to understand here is that, in the same way you do
    "real-world crypto", academics often do "proofs of concept". After
    publishing the attack technique and the attack on (reduced-)Speck, I
    moved to my next project because the scientific marginal benefit is
    small. There is of course the personal gain of being known as the
    guy who broke Speck, but I'm not particularly interested in such
    fame. All of that being said, if anyone has the firepower to run
    this tool and to improve the existing attacks for Speck128, feel
    free to drop me an email.
 3. Falsehoods - with this word I refer to claims in the so-called
    design rationale that are wrong. We can argue whether they were
    included on purpose or if they are simply mistakes. But in either
    case, they exist and they are worrisome. I would only give one
    example: "/the design team's early analytic efforts led us to
    believe that the limiting cryptanalytic features for Simon and
    Speck-type block ciphers would be of the linear and differential
    sort/" (see Page 4). Believing that differential and linear attacks
    would be the most dangerous attacks is reasonable, but as we can see
    from [1], it is wrong.
 4. Lies - this is the most troubling part. The NSA lies to the public
    (including the American people) on official documents. I already
    wrote that the choice for the exact number of rounds is only
    motivated through some hand waving. This makes it hard to tell what
    the real security margin is. But even if you interpret the hand
    waving conservatively, the math results in much smaller security
    margins than what is claimed. I gave a rump session talk about this
    in Crypto 2017 which you can view here [3]. The talk focuses on
    Simon but the story for Speck is similar and results in security
    margins of 15.6%, 15.6%, and 14.7% for Speck128 with key sizes 128,
    192, and 256, respectively. According to the NSA, that is, and only
    if you accept the claim that attacks have stabilized.

    The choice of the number of rounds was heavily discussed in the ISO
    meeting in Berlin about 6 months ago. When confronted with this
    question, the NSA answered (again) that they will not be providing
    further information, added that anyone with a decent level of
    English would immediately understand what they meant, and called me
    an incompetent cryptographer. Nevertheless, a few months after the
    meeting they updated the so-called design rationale and added a
    footnote that reads:
>     "The original version of this paper said 50% here, but noted that
>     this was 'very conservative.' This led to confusion by some, who
>     interpreted 50% as an exact value, rather than the very
>     conservative upper bound we intended it to be. This is supported
>     by the literature (see, e.g., [CW15]) and by our internal
>     analysis. Indeed 50% is a significant overestimate; 25% appears to
>     be a more accurate estimate. We apologize for the lack of clarity
>     here, and note that even if future advances increased the 25% to
>     50% Simon would still be secure." (Page 11)
    This is a fine clarification except that it is an outrageous lie.
    For example, for Simon32 the so-called design rationale reports that
    the best linear trail can penetrate at most 12 rounds. As part of my
    research I found an 18-round linear hull which _was confirmed, in
    writing,_ by the NSA (I should have the email somewhere and can find
    it if anyone is interested). The difference between 12 and 18 rounds
    is indeed 50% and not 25% as they argue in the updated document.

These are only part of the problems I and others found with the
so-called design rationale. Having so many problems in a document meant
to convince people that you're not doing anything sinister is either an
indication of some serious incompetence, or an indication that
something sinister is actually happening. Either way, it is clear that
this document is meant for PR and has no scientific value. It surely
does not inspire confidence in the algorithms.

All of this was known to the people in the room when ISO made its
decision to reject Simon and Speck (after deliberating about this for
more than 3 years. Not because there were disagreements but because we
wanted to give the NSA a fair chance). These people also got a first
hand impression of how poorly the people the NSA sent fare with
_technical_ questions, basically refusing to answer all, and throwing
tantrums instead. And then, the ISO people also saw another thing.
During the discussions I asked the NSA two non-technical questions (from
a crypto point of view. These are technical questions from a
standardization point of view):
    - Q: You claim that third party analysis is indicative of the
algorithm's real security. Were you aware of all these results when you
published the algorithms, or are any of them better than what you knew of?
    - A: I refuse to answer that
    - Q: Are you aware of any cryptanalytic results better than those
already found by academia?
    - A: I refuse to answer that either.

Now, there seems to be some notion that the people in ISO are bureaucrats
with limited understanding of cryptography. The truth is that WG 2 (the
cryptography experts) includes people like Kan Yasuda, Shiho Moriai, Dan
Bernstein, Pascal Paillier, Tanja Lange, Orr Dunkelman and Jian Guo
(partial list). You can't say that they don't know what they're doing.
Which is why, having all this information, we decided that including
these algorithms in one of our standards would undermine the trust
people have in ISO and the work it is doing.

Note that in parallel to the Simon and Speck process, people from the
NSA (different from those involved in Simon and Speck) are successfully
promoting at least two other projects. So you can't say that there
really is a significant anti-NSA bias either. No, these algorithms seem
insecure, attacks against them keep improving, their designers either
refuse to answer basic questions about their security or lie... What
other conclusion could we have reached except that there might be a
security problem with these algorithms?

This of course brings us back to the question asked early in this thread:

> support for SM4 was just added too, which is a Chinese government standard. Are you going to send a patch to remove that
> too, or is it just NSA designed algorithms that are not okay?
This seems pretty obvious to me. If you don't feel comfortable with SM4,
don't add it either. There are at least as many reasons to distrust
the Chinese government as there are to distrust the NSA.

However, the answer to the question
> Could you say a little more about what it is that separates Speck from SM4
> for you?
is a bit different. There are two main things that separate Speck from
SM4. Firstly, it seems more secure. This is either because it actually
is more secure, or because the Chinese did a better job in hiding their
backdoors; but at least it doesn't scream "something strange is going on
here!!!". Second, SM4 is also being standardized in ISO these days and
the Chinese are very cooperative with the process. Whatever question you
have about this algorithm, I can get you an answer from the person
promoting SM4. This inspires confidence in the algorithm and the
process. Is this enough? I don't think so. But being a member of ISO I'm
bound by certain rules that don't allow me to reject algorithms based on
my intuition, so it seems that SM4 (as well as LEA and Kuznyechik) would
probably find their way into the respective standards.

That being said, if you ask for my opinion, just don't include SM4.

Which brings us to the million dollar question:
> So, what do you propose replacing it with?
Nothing. I am usually not one to argue for maintaining the status quo
and I sure am in favor of encryption-for-all, but this case is the
textbook example for employing the Precautionary Principle. You yourself are
not fully convinced that Speck is secure and does not contain any
backdoors. If it was really secure, it could have been used in all cases
and not only on low-end devices where AES is too slow. AES is slower
than Speck on most platforms.

Now, I'm a sort of a mathematician who doesn't know much about
processor generations and implementation efficiency. Things like 134833
KB/s are Chinese to me. But the way I understand it, these devices that
are too weak to support AES would not be around in 2-5 years which would
make the problem go away. In the foreseeable future, even if the
crypto-extension isn't added to low-end processors, they would still
improve to a degree they can run some of the efficient-but-not-enough
algorithms of today, no?

I would also like to point out that including an algorithm because "it's
better than nothing" results in something that is not
better-than-nothing, but stands in the way of good solutions. Since
there is no acute problem, why do we need to solve it? This is from the
cryptographers' point of view. From the end-user point of view when they
get something bundled into Android, they don't know that it was included
there as something that is "better than nothing". They think of it as
"good enough; endorsed by Android/Google/Linux". What you give them is a
false sense of security because they don't know of all the question
marks surrounding Speck (both technical and political).

So I think that as a first step, no-encryption is better than using
Speck. Then we can move for a longer term solution. Since this is an
important enough issue I asked around and people are happily willing to
help. For example, Dan Bernstein seems to believe that a solution can
be built using a generic construction along the lines of your discussion
with Samuel (with or without a variant of ChaCha). Even if a generic
construction cannot be used, Bernstein told me he's willing to help
design a solution. I also asked Vincent Rijmen and Orr Dunkelman and
they both told me they'd be willing to work in a team to find (or
design) a solution. This is already an impressive cadre and I'm sure it
would not be too much of a problem to solicit other notable
cryptographers because basically, no one in this community thinks it's a
good idea to use Speck.

Sorry for the long post and Shabbat Shalom,

Tomer Ashur, PhD
Senior Researcher
COSIC, KU Leuven

[1] https://eprint.iacr.org/2017/1036
[2] https://eprint.iacr.org/2012/303
[3] https://www.youtube.com/watch?v=3d-xruyR89g&t=2s




On 05/08/2018 01:20 AM, Eric Biggers wrote:
> Hi Samuel,
>
> On Thu, Apr 26, 2018 at 03:05:44AM +0100, Samuel Neves wrote:
>> On Wed, Apr 25, 2018 at 8:49 PM, Eric Biggers <ebiggers@google.com> wrote:
>>> I agree that my explanation should have been better, and should have considered
>>> more crypto algorithms.  The main difficulty is that we have extreme performance
>>> requirements -- it needs to be 50 MB/s at the very least on even low-end ARM
>>> devices like smartwatches.  And even with the NEON-accelerated Speck128-XTS
>>> performance exceeding that after much optimization, we've been getting a lot of
>>> pushback as people want closer to 100 MB/s.
>>>
>> I couldn't find any NEON-capable ARMv7 chip below 800 MHz, so this
>> would put the performance upper bound around 15 cycles per byte, with
>> the comfortable number being ~7. That's indeed tough, though not
>> impossible.
>>
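For reference, the cycles-per-byte budget quoted above is just the clock rate divided by the target throughput. A minimal sketch of that arithmetic follows; the function name is made up for illustration, and whether "MB" means 10^6 or 2^20 bytes is an assumption, so both are shown.

```python
def cycles_per_byte(clock_hz: float, throughput_bytes_per_s: float) -> float:
    """CPU cycles available per byte at a given clock and target throughput."""
    return clock_hz / throughput_bytes_per_s


CLOCK_HZ = 800e6  # slowest NEON-capable ARMv7 clock mentioned in the thread

# Assumption: the thread does not say which definition of "MB" is meant.
for label, mb in (("decimal MB (10^6 bytes)", 10**6), ("binary MB (2^20 bytes)", 2**20)):
    floor = cycles_per_byte(CLOCK_HZ, 50 * mb)    # 50 MB/s hard minimum
    target = cycles_per_byte(CLOCK_HZ, 100 * mb)  # ~100 MB/s that people ask for
    print(f"{label}: {floor:.1f} cycles/byte at 50 MB/s, {target:.1f} at 100 MB/s")
```

With binary megabytes this lands close to the "around 15" and "~7" figures; with decimal megabytes it is 16 and 8.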
>>> That's why I also included Speck64-XTS in the patches, since it was
>>> straightforward to include, and some devices may really need that last 20-30% of
>>> performance for encryption to be feasible at all.  (And when the choice is
>>> between unencrypted and a 64-bit block cipher, used in a context where the
>>> weakest points in the cryptosystem are actually elsewhere such as the user's
>>> low-entropy PIN and the flash storage doing wear-leveling, I'd certainly take
>>> the 64-bit block cipher.)  So far we haven't had to use Speck64 though, and if
>>> that continues to be the case I'd be fine with Speck64 being removed, leaving
>>> just Speck128.
>>>
>> I would very much prefer that to be the case. As many of us know,
>> "it's better than nothing" has been often used to justify other bad
>> choices, like RC4, that end up preventing better ones from being
>> adopted. At a time where we're trying to get rid of 64-bit ciphers in
>> TLS, where data volumes per session are comparatively low, it would be
>> unfortunate if the opposite starts happening on encryption at rest.
>>
>>> Note that in practice, to have any chance at meeting the performance requirement
>>> the cipher needed to be NEON accelerated.  That made benchmarking really hard
>>> and time-consuming, since to definitely know how an algorithm performs it can
>>> take upwards of a week to implement a NEON version.  It needs to be very well
>>> optimized too, to compare the algorithms fairly -- e.g. with Speck I got a 20%
>>> performance improvement on some CPUs just by changing the NEON instructions used
>>> to implement the 8-bit rotates, an optimization that is not possible with
>>> ciphers that don't use rotate amounts that are multiples of 8.  (This was an
>>> intentional design choice by the Speck designers; they do know what they're
>>> doing, actually.)
>>>
>>> Thus, we had to be pretty aggressive about dropping algorithms from
>>> consideration if there were preliminary indications that they wouldn't perform
>>> well, or had too little cryptanalysis, or had other issues such as an unclear
>>> patent situation.  Threefish for example I did test the C implementation at
>>> https://github.com/wernerd/Skein3Fish, but on ARM32 it was over 4 times slower
>>> than my NEON implementation of Speck128/256-XTS.  And I did not see a clear way
>>> that it could be improved over 4x with NEON, if at all, so I did not take the
>>> long time it would have taken to write an optimized NEON implementation to
>>> benchmark it properly.  Perhaps that was a mistake.  But, time is not unlimited.
>>>
>> In my limited experience with NEON and 64-bit ARX, there's usually a
>> ~2x speedup solely from NEON's native 64-bit operations on ARMv7-A.
>> The extra speedup from encrypting 2 blocks in parallel is then
>> somewhere between 1x and 2x, depending on various details. Getting
>> near 4x might be feasible, but it is indeed time-consuming to get
>> there.
>>
>>> As for the wide-block mode using ChaCha20 and Poly1305, you'd have to ask Paul
>>> Crowley to explain it properly, but briefly it's actually a pseudorandom
>>> permutation over an arbitrarily-sized message.  So with dm-crypt for example, it
>>> would operate on a whole 512-byte sector, and if any bit of the 512-byte
>>> plaintext is changed, then every bit in the 512-byte ciphertext would change
>>> with 50% probability.  To make this possible, the construction uses a polynomial
>>> evaluation in GF(2^130-5) as a universal hash function, similar to the Poly1305
>>> mode.
>>>
>> Oh, OK, that sounds like something resembling Naor-Reingold or its
>> relatives. That would work, but with 3 or 4 passes I guess it wouldn't
>> be very fast.
>>
>>> Using ChaCha20's underlying 512-bit permutation to build a tweakable block
>>> cipher is an interesting idea.  But maybe in my crypto-naivety, it is not
>>> obvious to me how to do so.  Do you have references to any relevant papers?
>>> Remember that we strongly prefer a published cipher to a custom one -- even if
>>> the core is reused, a mistake may be made in the way it is used.  Thus,
>>> similarly to Paul's wide-block mode, I'd be concerned that we'd have to
>>> self-publish a new construction, then use it with no outside crypto review.
>>> *Maybe* it would be straightforward enough to be okay, but to know I'd need to
>>> see the details of how it would actually work.
>>>
>> This would be the 'tweakable Even-Mansour' construction and its
>> variants. The variant I'm most familiar with would be MEM [1],
>> focusing on software friendliness, but there is other provable
>> security work in the same vein, including [3, 4, 5]. It's very similar
>> to how the XEX mode turns a block cipher into a tweakable block
>> cipher.
>>
>> In [1, 2] we used a 1024-bit permutation out of BLAKE2 instead of
>> ChaCha20's, but everything translates easily from one to the other. We
>> also included cheap masks for 512-bit permutations, just in case.
>>
>> [1] https://eprint.iacr.org/2015/999
>> [2] https://github.com/MEM-AEAD/mem-aead
>> [3] https://eprint.iacr.org/2015/539
>> [4] https://eprint.iacr.org/2015/476
>> [5] https://competitions.cr.yp.to/round2/minalpherv11.pdf
>>
>>> But in the end, Speck seemed like the clear choice because it had multiple NEON
>>> implementations available already which showed it could be implemented very
>>> efficiently in NEON; it has over 70 cryptanalysis papers (far more than most
>>> ciphers) yet the security margin is still similar to AES; it has no intellectual
>>> property concerns; there is a paper clearly explaining the design decisions; it
>>> is naturally resistant to timing attacks; it supports a 128-bit block size, so
>>> it can be easily used in XTS mode; it supports the same key sizes as AES; and it
>>> has a simple and understandable design with no "magic numbers" besides 8 and 3
>>> (compare to an actual backdoored algorithm like Dual_EC_DRBG, which basically
>>> had a public key embedded in the algorithm).  Also as Paul mentioned he is
>>> confident in the construction, and he has published cryptanalysis on Salsa20, so
>>> his opinion is probably more significant than mine :-)
>>>
>>> But I will definitely take a closer look at SPARX and some of the other ciphers
>>> you mentioned in case I missed something.  I really do appreciate the
>>> suggestions, by the way, and in any case we do need to be very well prepared to
>>> justify our choices.  I just hope that people can understand that we are
>>> implementing real-world crypto which must operate under *very* tight performance
>>> constraints on ARM processors, and it must be compatible with dm-crypt and
>>> fscrypt with no room for ciphertext expansion.  Thus, many algorithms which may
>>> at first seem reasonable choices had to (unfortunately) be excluded.
>>>
>> I understand it is a tough choice, and it's unfortunate that many of
>> the algorithms we have cater mostly to either the
>> high-hardware-accelerated-end or the extremely low-end, without a lot
>> of good options at the middle-end.
>>
> First, we're planning a publication which explains our choices in more detail,
> so please treat this as some more preliminary notes.
>
> To make sure we've exhausted as many alternatives as possible, I wrote NEON
> implementations of all the block ciphers you suggested with the exception of
> SKINNY (which looked very hardware-oriented and not efficient in software), as
> well as some that others have suggested.  (It was tough, but after doing a
> couple, it got much easier...)  The following shows the decryption performance
> I'm getting on an ARMv7 platform.  Encryption speeds were usually similar, but
> in our use case we care much more about decryption, as that affects the most
> critical metrics such as the time to launch applications.
>
> 	ChaCha8-MEM: 183256 KB/s
> 	ChaCha12-MEM: 134833 KB/s
> 	Chaskey-LTS-XTS: 99097 KB/s
> 	ChaCha20-MEM: 87875 KB/s
> 	Speck64/128-XTS: 85332 KB/s
> 	Speck128/128-XTS: 73404 KB/s
> 	RC5-128/12/256-XTS: 69887 KB/s
> 	Speck128/256-XTS: 69597 KB/s
> 	RC5-64/12/128-XTS: 69267 KB/s
> 	LEA-128-XTS: 67986 KB/s
> 	CHAM128/128-XTS: 52982 KB/s
> 	LEA-256-XTS: 50429 KB/s
> 	Threefish-256: 48349 KB/s
> 	RC6-XTS: 46855 KB/s
> 	RC5-128/20/256-XTS: 44291 KB/s
> 	RC5-64/20/128-XTS: 43924 KB/s
> 	NOEKEON-XTS: 40705 KB/s
> 	Sparx128/128-XTS: 39191 KB/s
> 	XTEA-XTS: 38239 KB/s
> 	AES-128-XTS: 25549 KB/s
> 	AES-256-XTS: 18640 KB/s
>
> Remember that for dm-crypt or fscrypt over flash storage and/or f2fs, a stream
> cipher is insecure.  Moreover, on these (low-end) devices the status quo is no
> encryption, and we need every bit of performance available.  Anything below
> 50 MB/s is definitely unacceptable.  But even at that speed we get many
> complaints, so in practice we need something faster.  That means that the
> algorithms close to 50 MB/s, such as Threefish, still aren't fast enough.
>
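To make the 50 MB/s cut-off easier to apply to the list above, here is a small sketch that filters the quoted numbers against it and reports each candidate's speedup over AES-128-XTS. The figures are copied from the list; the helper names are hypothetical, and treating 50 MB/s as 50,000 KB/s is an assumption.

```python
# Decryption throughput in KB/s, copied from the benchmark list in the email.
BENCH_KB_S = {
    "ChaCha8-MEM": 183256, "ChaCha12-MEM": 134833, "Chaskey-LTS-XTS": 99097,
    "ChaCha20-MEM": 87875, "Speck64/128-XTS": 85332, "Speck128/128-XTS": 73404,
    "RC5-128/12/256-XTS": 69887, "Speck128/256-XTS": 69597,
    "RC5-64/12/128-XTS": 69267, "LEA-128-XTS": 67986, "CHAM128/128-XTS": 52982,
    "LEA-256-XTS": 50429, "Threefish-256": 48349, "RC6-XTS": 46855,
    "RC5-128/20/256-XTS": 44291, "RC5-64/20/128-XTS": 43924,
    "NOEKEON-XTS": 40705, "Sparx128/128-XTS": 39191, "XTEA-XTS": 38239,
    "AES-128-XTS": 25549, "AES-256-XTS": 18640,
}

THRESHOLD_KB_S = 50 * 1000  # assumption: 50 MB/s read as 50,000 KB/s


def fast_enough(bench, threshold=THRESHOLD_KB_S):
    """Names at or above the threshold, fastest first."""
    ranked = sorted(bench.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, kb_s in ranked if kb_s >= threshold]


def speedup_over(bench, name, baseline="AES-128-XTS"):
    """Throughput of `name` relative to the baseline cipher."""
    return bench[name] / bench[baseline]


if __name__ == "__main__":
    for name in fast_enough(BENCH_KB_S):
        print(f"{name:20s} {BENCH_KB_S[name] / 1000:6.1f} MB/s "
              f"{speedup_over(BENCH_KB_S, name):5.2f}x AES-128-XTS")
```

Clearing the raw threshold is only the first filter: the email goes on to rule out several of the fastest entries on security grounds.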
> ChaCha-MEM (based roughly on your paper: https://eprint.iacr.org/2015/999), has
> the best performance, especially if we allow for the 12 or 8-round variants.  My
> code for it is based roughly on the existing
> arch/arm/crypto/chacha20-neon-core.S, but updated to support the inverse
> permutation (on 4 blocks at a time, using all 16 NEON registers) and do the
> masking required by MEM.  However, ChaCha-MEM would be a pretty bleeding-edge
> and customized construction, and Paul Crowley and I have concerns about its
> security.  The problem is that the MEM security proof assumes that the
> underlying permutation has no more detectable structural properties than a
> randomly selected permutation.  However, the ChaCha permutation is known to have
> certain symmetries, e.g. if the sixteen 32-bit words are (a, a, a, a, b, b, b,
> b, c, c, c, c, d, d, d, d), then they always map to some (e, e, e, e, f, f, f,
> f, g, g, g, g, h, h, h, h).
>
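The symmetry described above can be checked directly. The sketch below implements the bare ChaCha double-round permutation (no key setup and no feed-forward addition, using the quarter-round as specified for ChaCha20) and tests that a state whose four rows are each constant stays in that form. The function names are illustrative only and are not taken from the kernel code.

```python
MASK32 = 0xFFFFFFFF


def rotl32(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def quarter_round(s, a, b, c, d):
    """ChaCha quarter-round applied in place to words a, b, c, d of state s."""
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 16)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 12)
    s[a] = (s[a] + s[b]) & MASK32; s[d] = rotl32(s[d] ^ s[a], 8)
    s[c] = (s[c] + s[d]) & MASK32; s[b] = rotl32(s[b] ^ s[c], 7)


def chacha_permutation(state, rounds=20):
    """Bare ChaCha permutation on sixteen 32-bit words (no feed-forward)."""
    s = list(state)
    for _ in range(rounds // 2):
        # Column round.
        quarter_round(s, 0, 4, 8, 12)
        quarter_round(s, 1, 5, 9, 13)
        quarter_round(s, 2, 6, 10, 14)
        quarter_round(s, 3, 7, 11, 15)
        # Diagonal round.
        quarter_round(s, 0, 5, 10, 15)
        quarter_round(s, 1, 6, 11, 12)
        quarter_round(s, 2, 7, 8, 13)
        quarter_round(s, 3, 4, 9, 14)
    return s


def is_row_constant(s):
    """True if the state has the form (e,e,e,e, f,f,f,f, g,g,g,g, h,h,h,h)."""
    return all(len(set(s[r:r + 4])) == 1 for r in range(0, 16, 4))


if __name__ == "__main__":
    a, b, c, d = 0x11111111, 0x22222222, 0xDEADBEEF, 0x01234567
    symmetric = [a] * 4 + [b] * 4 + [c] * 4 + [d] * 4
    for rounds in (8, 12, 20):
        print(rounds, is_row_constant(chacha_permutation(symmetric, rounds)))
```

The reason is visible in the round structure: with constant rows, every column and every diagonal feeds the quarter-round the same four inputs, so all four quarter-rounds in a round produce identical outputs.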
> For the MEM mask generation, we can use the "expand 32-byte k" constant to break
> the symmetry, like is done in the ChaCha stream cipher.  However, that's not
> possible for the inner application of the permutation.  So, we'd be using the
> ChaCha permutation in a manner in which it wasn't intended, and the security of
> the ChaCha stream cipher wouldn't directly carry over.  Granted, it's not
> impossible that it would be secure, but at the present time it doesn't seem like
> a good choice to actually field.
>
> Chaskey-LTS is faster than Speck, but unfortunately it's not really a viable
> option because it has only a 64-bit security level, due to its use of the
> Even-Mansour construction with a 128-bit key.  Of course, it would still be
> better than nothing, but we prefer a cipher that has a security level in line
> with what is accepted for modern crypto.
>
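As background for the 64-bit figure: it matches the well-known generic time/data tradeoff for the single-key Even-Mansour construction. The notation below ($P$ for the public permutation, $n$ for the block and key size, $D$ for the number of known or chosen blocks, $T$ for offline work) is introduced here for illustration; this is the generic bound, not an analysis specific to Chaskey-LTS.

```latex
E_K(x) = P(x \oplus K) \oplus K, \qquad K \in \{0,1\}^{n}

D \cdot T \approx 2^{n}
\quad\Longrightarrow\quad
D = T = 2^{n/2} = 2^{64} \quad \text{for } n = 128
```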
> RC5 with the traditional 12 rounds is about as fast as Speck, but there is a
> known differential attack on that number of rounds.  So if we choose RC5 we'd
> almost certainly have to use the 20-round variant, which is much slower.
>
> That leaves LEA-128-XTS as the only other algorithm that might meet the
> performance requirement, as it is only slightly slower than Speck128-XTS.  It
> may be the most viable alternative, but beyond the slight performance loss it
> still has some disadvantages compared to Speck:
>
> - Importantly, the LEA authors forgot to include test vectors, so I'm not yet
>   100% sure I implemented it correctly.  (The Speck authors unfortunately didn't
>   make the endianness of their test vectors clear in their initial publication,
>   but at least they actually provided test vectors!)
> - LEA has received some cryptanalysis, but not nearly as much as Speck.
> - It took some very heavy optimization to get good LEA performance, much more
>   than I had to do for Speck.  My final LEA code has separate code paths for
>   128-bit and 256-bit keys, and has reordered and preprocessed the round keys,
>   and reordered the operations.  As a result, it's harder to see how it maps to
>   the original paper.  In contrast, my Speck code is more straightforward and
>   maintainable.
> - LEA-256 (256-bit key) is much slower than LEA-128 (128-bit key), as it has
>   33% more rounds.  LEA-256 would not be fast enough, so we would have to use
>   LEA-128.  In contrast, with Speck we can use Speck128/256 (256-bit key).
>   We're willing to accept a 128-bit security level, but 256-bit is preferable.
>   (I think the Speck designers took a more informed approach to setting
>   appropriate security margins for a lightweight cipher; it seems that other
>   designers often choose too few or too many rounds, especially as the key
>   length is varied.)
> - LEA encryption is also a bit slower than decryption, while with Speck
>   encryption and decryption are almost exactly the same speed.
>
> Note that like Speck, LEA doesn't appear to be approved by a standards
> organization either; it's just specified in a research paper.
>
> Thus, from a technical perspective, and given the current state of the art in
> lightweight cryptography, currently Speck128-XTS seems to be the best choice for
> the problem domain.  It's unfortunate that there are so few good options and
> that the field is so politicized, but it is what it is.
>
> Still, we don't want to abandon HPolyC (Paul's new ChaCha and Poly1305-based
> wide-block mode), and eventually we hope to offer it as an option as well.  But
> it's not yet published, and it's a more complex algorithm that is harder to
> implement so I haven't yet had a chance to implement and benchmark it.  And we
> don't want to continue to leave users unprotected while we spend a long time
> coming up with the perfect algorithm, or for hardware AES support to arrive to
> all low-end CPUs when it's unclear if/when that will happen.
>
> Again, we're planning a publication which will explain all this in more detail.
>
> Thanks!
>
> Eric



^ permalink raw reply

* [PATCH] clkdev: Remove duplicated negative index check from __of_clk_get()
From: Geert Uytterhoeven @ 2018-06-01 19:22 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <152788083877.225090.16757573144833344500@swboyd.mtv.corp.google.com>

Hi Stephen,

On Fri, Jun 1, 2018 at 9:20 PM, Stephen Boyd <sboyd@kernel.org> wrote:
> Quoting Geert Uytterhoeven (2018-05-18 03:58:40)
>> __of_clk_get() calls of_parse_phandle_with_args(), which rejects
>> negative indices since commit bd69f73f2c81eed9 ("of: Create function for
>> counting number of phandles in a property").
>>
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>> Commit bd69f73f2c81eed9 is in v3.9.
>
> Did you send this to Russell's patch tracker? Otherwise I can pick it up

Not yet. The patch tracker is for reviewed patches, AFAIK.

> to clk-next.

Thanks!

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert at linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply

* [PATCH] clkdev: Remove duplicated negative index check from __of_clk_get()
From: Stephen Boyd @ 2018-06-01 19:20 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526641120-1585-1-git-send-email-geert+renesas@glider.be>

Quoting Geert Uytterhoeven (2018-05-18 03:58:40)
> __of_clk_get() calls of_parse_phandle_with_args(), which rejects
> negative indices since commit bd69f73f2c81eed9 ("of: Create function for
> counting number of phandles in a property").
> 
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Commit bd69f73f2c81eed9 is in v3.9.

Did you send this to Russell's patch tracker? Otherwise I can pick it up
to clk-next.

^ permalink raw reply

* [PATCH] clk: aspeed: Add 24MHz fixed clock
From: Stephen Boyd @ 2018-06-01 19:19 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526633822-17138-1-git-send-email-mine260309@gmail.com>

Quoting Lei YU (2018-05-18 01:57:02)
> Add a 24MHz fixed clock.
> This clock will be used for certain devices, e.g. pwm.
> 
> Signed-off-by: Lei YU <mine260309@gmail.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH V2 3/3] ARM: dts: imx7: correct enet ipg clock
From: Stephen Boyd @ 2018-06-01 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526605266-18464-3-git-send-email-Anson.Huang@nxp.com>

Quoting Anson Huang (2018-05-17 18:01:06)
> ENET "ipg" clock should be IMX7D_ENETx_IPG_ROOT_CLK
> rather than IMX7D_ENET_AXI_ROOT_CLK which is for ENET bus
> clock.
> 
> Based on Andy Duan's patch from the NXP kernel tree.
> 
> Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH V2 2/3] clk: imx7d: correct enet clock CCGR registers
From: Stephen Boyd @ 2018-06-01 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526605266-18464-2-git-send-email-Anson.Huang@nxp.com>

Quoting Anson Huang (2018-05-17 18:01:05)
> Correct enet clock gates as below:
> 
> CCGR6: IMX7D_ENET_AXI_ROOT_CLK (enet1 and enet2 bus clocks)
> CCGR112: IMX7D_ENET1_TIME_ROOT_CLK, IMX7D_ENET1_IPG_ROOT_CLK
> CCGR113: IMX7D_ENET2_TIME_ROOT_CLK, IMX7D_ENET2_IPG_ROOT_CLK
> 
> Just rename unused IMX7D_ENETx_REF_ROOT_CLK for
> IMX7D_ENETx_IPG_ROOT_CLK instead of adding new clocks.
> 
> Based on Andy Duan's patch from the NXP kernel tree.
> 
> Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH V2 1/3] clk: imx7d: correct enet phy ref clock gates
From: Stephen Boyd @ 2018-06-01 19:18 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526605266-18464-1-git-send-email-Anson.Huang@nxp.com>

Quoting Anson Huang (2018-05-17 18:01:04)
> IMX7D_ENET_PHY_REF_ROOT_DIV supplies clock for PHY directly,
> there is no clock gate after it, rename it to
> IMX7D_ENET_PHY_REF_ROOT_CLK to avoid device tree change.
> 
> Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH] clk: imx6sl: correct ocram_podf clock type
From: Stephen Boyd @ 2018-06-01 19:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526533248-21990-2-git-send-email-Anson.Huang@nxp.com>

Quoting Anson Huang (2018-05-16 22:00:48)
> IMX6SL_CLK_OCRAM_PODF is a busy divider, its name in
> CCM_CDHIPR register of Reference Manual CCM chapter
> is axi_podf_busy, correct its clock type.
> 
> Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH] clk: imx6sx: disable unnecessary clocks during clock initialization
From: Stephen Boyd @ 2018-06-01 19:10 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1526533248-21990-1-git-send-email-Anson.Huang@nxp.com>

Quoting Anson Huang (2018-05-16 22:00:47)
> Disable those unnecessary clocks during kernel boot up to save power,
> those modules clock should be managed by modules driver in runtime.
> 
> Signed-off-by: Anson Huang <Anson.Huang@nxp.com>
> ---

Applied to clk-next

^ permalink raw reply

* [PATCH 2/3] clk: bcm: Update and add Stingray clock entries
From: Rob Herring @ 2018-06-01 19:02 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <a354f3d1-051d-8b68-e6b3-ce7a739c10c9@broadcom.com>

On Fri, Jun 1, 2018 at 12:56 PM, Ray Jui <ray.jui@broadcom.com> wrote:
> Hi Rob,
>
> On 5/31/2018 9:25 AM, Rob Herring wrote:
>>
>> On Fri, May 25, 2018 at 09:45:16AM -0700, Ray Jui wrote:
>>>
>>> Update and add Stingray clock definitions and tables so they match the
>>> binding document and the latest ASIC datasheet
>>>
>>> Signed-off-by: Pramod Kumar <pramod.kumar@broadcom.com>
>>> Signed-off-by: Ray Jui <ray.jui@broadcom.com>
>>> ---
>>>   drivers/clk/bcm/clk-sr.c           | 135 ++++++++++++++++++++++++++++++++-----
>>>   include/dt-bindings/clock/bcm-sr.h |  24 +++++--
>>
>>
>> This goes in the 1st patch.
>
>
> Please help to confirm. You want 1st patch and 2nd patch to be combined into
> a single patch?

No. include/dt-bindings/* is part of the DT binding, so it goes with
patch 1. The driver in patch 2.

Rob

^ permalink raw reply

* [PATCH V3] PCI: move early dump functionality from x86 arch into the common code
From: Sinan Kaya @ 2018-06-01 19:00 UTC (permalink / raw)
  To: linux-arm-kernel

Move early dump functionality into common code so that it is available for
all architectures. No need to carry arch specific reads around as the read
hooks are already initialized by the time pci_setup_device() is getting
called during scan.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  2 +-
 arch/x86/include/asm/pci-direct.h               |  5 ---
 arch/x86/kernel/setup.c                         |  5 ---
 arch/x86/pci/common.c                           |  4 ---
 arch/x86/pci/early.c                            | 44 -------------------------
 drivers/pci/pci.c                               |  5 +++
 drivers/pci/pci.h                               |  1 +
 drivers/pci/probe.c                             | 19 +++++++++++
 8 files changed, 26 insertions(+), 59 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e490902..e64f1d8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2995,7 +2995,7 @@
 			See also Documentation/blockdev/paride.txt.
 
 	pci=option[,option...]	[PCI] various PCI subsystem options:
-		earlydump	[X86] dump PCI config space before the kernel
+		earlydump	dump PCI config space before the kernel
 				changes anything
 		off		[X86] don't probe for the PCI bus
 		bios		[X86-32] force use of PCI BIOS, don't access
diff --git a/arch/x86/include/asm/pci-direct.h b/arch/x86/include/asm/pci-direct.h
index e1084f7..e5e2129 100644
--- a/arch/x86/include/asm/pci-direct.h
+++ b/arch/x86/include/asm/pci-direct.h
@@ -14,9 +14,4 @@ extern void write_pci_config(u8 bus, u8 slot, u8 func, u8 offset, u32 val);
 extern void write_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset, u8 val);
 extern void write_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset, u16 val);
 
-extern int early_pci_allowed(void);
-
-extern unsigned int pci_early_dump_regs;
-extern void early_dump_pci_device(u8 bus, u8 slot, u8 func);
-extern void early_dump_pci_devices(void);
 #endif /* _ASM_X86_PCI_DIRECT_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2f86d88..480f250 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -991,11 +991,6 @@ void __init setup_arch(char **cmdline_p)
 		setup_clear_cpu_cap(X86_FEATURE_APIC);
 	}
 
-#ifdef CONFIG_PCI
-	if (pci_early_dump_regs)
-		early_dump_pci_devices();
-#endif
-
 	e820__reserve_setup_data();
 	e820__finish_early_params();
 
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 563049c..d4ec117 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -22,7 +22,6 @@
 unsigned int pci_probe = PCI_PROBE_BIOS | PCI_PROBE_CONF1 | PCI_PROBE_CONF2 |
 				PCI_PROBE_MMCONF;
 
-unsigned int pci_early_dump_regs;
 static int pci_bf_sort;
 int pci_routeirq;
 int noioapicquirk;
@@ -599,9 +598,6 @@ char *__init pcibios_setup(char *str)
 		pci_probe |= PCI_BIG_ROOT_WINDOW;
 		return NULL;
 #endif
-	} else if (!strcmp(str, "earlydump")) {
-		pci_early_dump_regs = 1;
-		return NULL;
 	} else if (!strcmp(str, "routeirq")) {
 		pci_routeirq = 1;
 		return NULL;
diff --git a/arch/x86/pci/early.c b/arch/x86/pci/early.c
index e5f753c..f5fc953 100644
--- a/arch/x86/pci/early.c
+++ b/arch/x86/pci/early.c
@@ -57,47 +57,3 @@ int early_pci_allowed(void)
 			PCI_PROBE_CONF1;
 }
 
-void early_dump_pci_device(u8 bus, u8 slot, u8 func)
-{
-	u32 value[256 / 4];
-	int i;
-
-	pr_info("pci 0000:%02x:%02x.%d config space:\n", bus, slot, func);
-
-	for (i = 0; i < 256; i += 4)
-		value[i / 4] = read_pci_config(bus, slot, func, i);
-
-	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, value, 256, false);
-}
-
-void early_dump_pci_devices(void)
-{
-	unsigned bus, slot, func;
-
-	if (!early_pci_allowed())
-		return;
-
-	for (bus = 0; bus < 256; bus++) {
-		for (slot = 0; slot < 32; slot++) {
-			for (func = 0; func < 8; func++) {
-				u32 class;
-				u8 type;
-
-				class = read_pci_config(bus, slot, func,
-							PCI_CLASS_REVISION);
-				if (class == 0xffffffff)
-					continue;
-
-				early_dump_pci_device(bus, slot, func);
-
-				if (func == 0) {
-					type = read_pci_config_byte(bus, slot,
-								    func,
-							       PCI_HEADER_TYPE);
-					if (!(type & 0x80))
-						break;
-				}
-			}
-		}
-	}
-}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 97acba7..04052dc 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -115,6 +115,9 @@ static bool pcie_ari_disabled;
 /* If set, the PCIe ATS capability will not be used. */
 static bool pcie_ats_disabled;
 
+/* If set, the PCI config space of each device is printed during boot. */
+bool pci_early_dump;
+
 bool pci_ats_disabled(void)
 {
 	return pcie_ats_disabled;
@@ -5805,6 +5808,8 @@ static int __init pci_setup(char *str)
 				pcie_ats_disabled = true;
 			} else if (!strcmp(str, "noaer")) {
 				pci_no_aer();
+			} else if (!strcmp(str, "earlydump")) {
+				pci_early_dump = true;
 			} else if (!strncmp(str, "realloc=", 8)) {
 				pci_realloc_get_opt(str + 8);
 			} else if (!strncmp(str, "realloc", 7)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a0..c33265e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -7,6 +7,7 @@
 #define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
 
 extern const unsigned char pcie_link_speed[];
+extern bool pci_early_dump;
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 56771f3..3678f0a 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1545,6 +1545,23 @@ static int pci_intx_mask_broken(struct pci_dev *dev)
 	return 0;
 }
 
+static void early_dump_pci_device(struct pci_dev *pdev)
+{
+	u32 value[256 / 4];
+	int i;
+
+	if (!pci_early_dump)
+		return;
+
+	pci_info(pdev, "config space:\n");
+
+	for (i = 0; i < 256; i += 4)
+		pci_read_config_dword(pdev, i, &value[i / 4]);
+
+	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, value,
+		       256, false);
+}
+
 /**
  * pci_setup_device - Fill in class and map information of a device
  * @dev: the device structure to fill
@@ -1594,6 +1611,8 @@ int pci_setup_device(struct pci_dev *dev)
 	pci_printk(KERN_DEBUG, dev, "[%04x:%04x] type %02x class %#08x\n",
 		   dev->vendor, dev->device, dev->hdr_type, dev->class);
 
+	early_dump_pci_device(dev);
+
 	/* Need to have dev->class ready */
 	dev->cfg_size = pci_cfg_space_size(dev);
 
-- 
2.7.4

^ permalink raw reply related

* [PATCH V2] PCI: move early dump functionality from x86 arch into the common code
From: Sinan Kaya @ 2018-06-01 18:58 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180601185310.GA175802@bhelgaas-glaptop.roam.corp.google.com>

On 6/1/2018 2:53 PM, Bjorn Helgaas wrote:
>> +	pci_info(pdev, "pci 0000:%02x:%02x.%d config space:\n",
>> +		 pdev->bus->number, PCI_SLOT(pdev->devfn),
>> +		 PCI_FUNC(pdev->devfn));
> I'm still missing something -- why go to the trouble of pdev->bus->number,
> PCI_SLOT(), etc?  Isn't the output going to look like this?
> 
>   pci 0000:00:00.0: pci 0000:00:00.0 config space:
> 
> In other words, wouldn't the following be enough?
> 
>   pci_info(pdev, "config space:\n");
> 

You are right. The original print function was pr_info, so we don't need that
stuff anymore.

-- 
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.

^ permalink raw reply

* [PATCH V2] PCI: move early dump functionality from x86 arch into the common code
From: Bjorn Helgaas @ 2018-06-01 18:53 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <1527878854-21419-1-git-send-email-okaya@codeaurora.org>

On Fri, Jun 01, 2018 at 02:47:28PM -0400, Sinan Kaya wrote:
> Move early dump functionality into common code so that it is available for
> all architectures. No need to carry arch specific reads around as the read
> hooks are already initialized by the time pci_setup_device() is getting
> called during scan.

> +static void early_dump_pci_device(struct pci_dev *pdev)
> +{
> +	u32 value[256 / 4];
> +	int i;
> +
> +	if (!pci_early_dump)
> +		return;
> +
> +	pci_info(pdev, "pci 0000:%02x:%02x.%d config space:\n",
> +		 pdev->bus->number, PCI_SLOT(pdev->devfn),
> +		 PCI_FUNC(pdev->devfn));

I'm still missing something -- why go to the trouble of pdev->bus->number,
PCI_SLOT(), etc?  Isn't the output going to look like this?

  pci 0000:00:00.0: pci 0000:00:00.0 config space:

In other words, wouldn't the following be enough?

  pci_info(pdev, "config space:\n");

> +
> +	for (i = 0; i < 256; i += 4)
> +		pci_read_config_dword(pdev, i, &value[i / 4]);
> +
> +	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, value,
> +		       256, false);
> +}
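
The duplicated prefix discussed above can be reproduced outside the kernel.
The following is a minimal userspace sketch, not kernel code: fake_pci_info(),
header_v2() and header_suggested() are hypothetical names, and the assumption
that the helper prepends "pci <device name>: " is taken only from the sample
output quoted in this review.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical userspace stand-in for a device-prefixed print helper.
 * Assumption: the helper writes "pci <device name>: " ahead of the
 * caller's message, as the sample output in the review implies.
 */
static void fake_pci_info(char *out, size_t len, const char *devname,
			  const char *msg)
{
	assert(len > 0);
	snprintf(out, len, "pci %s: %s", devname, msg);
}

/* V2 wording: the message repeats the address the prefix already has. */
static void header_v2(char *out, size_t len, unsigned int bus,
		      unsigned int slot, unsigned int func)
{
	char devname[32], msg[64];

	snprintf(devname, sizeof(devname), "0000:%02x:%02x.%u",
		 bus, slot, func);
	snprintf(msg, sizeof(msg), "pci 0000:%02x:%02x.%u config space:",
		 bus, slot, func);
	fake_pci_info(out, len, devname, msg);
}

/* Suggested wording: let the prefix carry the address. */
static void header_suggested(char *out, size_t len, unsigned int bus,
			     unsigned int slot, unsigned int func)
{
	char devname[32];

	snprintf(devname, sizeof(devname), "0000:%02x:%02x.%u",
		 bus, slot, func);
	fake_pci_info(out, len, devname, "config space:");
}
```

Calling header_v2() for bus 0, slot 0, function 0 yields the doubled string
from the review, while header_suggested() yields the address only once.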

^ permalink raw reply

* [PATCH V2] PCI: move early dump functionality from x86 arch into the common code
From: Sinan Kaya @ 2018-06-01 18:47 UTC (permalink / raw)
  To: linux-arm-kernel

Move early dump functionality into common code so that it is available for
all architectures. No need to carry arch specific reads around as the read
hooks are already initialized by the time pci_setup_device() is getting
called during scan.

Signed-off-by: Sinan Kaya <okaya@codeaurora.org>
---
 Documentation/admin-guide/kernel-parameters.txt |  2 +-
 arch/x86/include/asm/pci-direct.h               |  5 ---
 arch/x86/kernel/setup.c                         |  5 ---
 arch/x86/pci/common.c                           |  4 ---
 arch/x86/pci/early.c                            | 44 -------------------------
 drivers/pci/pci.c                               |  5 +++
 drivers/pci/pci.h                               |  1 +
 drivers/pci/probe.c                             | 21 ++++++++++++
 8 files changed, 28 insertions(+), 59 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index e490902..e64f1d8 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2995,7 +2995,7 @@
 			See also Documentation/blockdev/paride.txt.
 
 	pci=option[,option...]	[PCI] various PCI subsystem options:
-		earlydump	[X86] dump PCI config space before the kernel
+		earlydump	dump PCI config space before the kernel
 				changes anything
 		off		[X86] don't probe for the PCI bus
 		bios		[X86-32] force use of PCI BIOS, don't access
diff --git a/arch/x86/include/asm/pci-direct.h b/arch/x86/include/asm/pci-direct.h
index e1084f7..e5e2129 100644
--- a/arch/x86/include/asm/pci-direct.h
+++ b/arch/x86/include/asm/pci-direct.h
@@ -14,9 +14,4 @@ extern void write_pci_config(u8 bus, u8 slot, u8 func, u8 offset, u32 val);
 extern void write_pci_config_byte(u8 bus, u8 slot, u8 func, u8 offset, u8 val);
 extern void write_pci_config_16(u8 bus, u8 slot, u8 func, u8 offset, u16 val);
 
-extern int early_pci_allowed(void);
-
-extern unsigned int pci_early_dump_regs;
-extern void early_dump_pci_device(u8 bus, u8 slot, u8 func);
-extern void early_dump_pci_devices(void);
 #endif /* _ASM_X86_PCI_DIRECT_H */
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 2f86d88..480f250 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -991,11 +991,6 @@ void __init setup_arch(char **cmdline_p)
 		setup_clear_cpu_cap(X86_FEATURE_APIC);
 	}
 
-#ifdef CONFIG_PCI
-	if (pci_early_dump_regs)
-		early_dump_pci_devices();
-#endif
-
 	e820__reserve_setup_data();
 	e820__finish_early_params();
 
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 563049c..d4ec117 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -22,7 +22,6 @@
 unsigned int pci_probe = PCI_PROBE_BIOS | PCI_PROBE_CONF1 | PCI_PROBE_CONF2 |
 				PCI_PROBE_MMCONF;
 
-unsigned int pci_early_dump_regs;
 static int pci_bf_sort;
 int pci_routeirq;
 int noioapicquirk;
@@ -599,9 +598,6 @@ char *__init pcibios_setup(char *str)
 		pci_probe |= PCI_BIG_ROOT_WINDOW;
 		return NULL;
 #endif
-	} else if (!strcmp(str, "earlydump")) {
-		pci_early_dump_regs = 1;
-		return NULL;
 	} else if (!strcmp(str, "routeirq")) {
 		pci_routeirq = 1;
 		return NULL;
diff --git a/arch/x86/pci/early.c b/arch/x86/pci/early.c
index e5f753c..f5fc953 100644
--- a/arch/x86/pci/early.c
+++ b/arch/x86/pci/early.c
@@ -57,47 +57,3 @@ int early_pci_allowed(void)
 			PCI_PROBE_CONF1;
 }
 
-void early_dump_pci_device(u8 bus, u8 slot, u8 func)
-{
-	u32 value[256 / 4];
-	int i;
-
-	pr_info("pci 0000:%02x:%02x.%d config space:\n", bus, slot, func);
-
-	for (i = 0; i < 256; i += 4)
-		value[i / 4] = read_pci_config(bus, slot, func, i);
-
-	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, value, 256, false);
-}
-
-void early_dump_pci_devices(void)
-{
-	unsigned bus, slot, func;
-
-	if (!early_pci_allowed())
-		return;
-
-	for (bus = 0; bus < 256; bus++) {
-		for (slot = 0; slot < 32; slot++) {
-			for (func = 0; func < 8; func++) {
-				u32 class;
-				u8 type;
-
-				class = read_pci_config(bus, slot, func,
-							PCI_CLASS_REVISION);
-				if (class == 0xffffffff)
-					continue;
-
-				early_dump_pci_device(bus, slot, func);
-
-				if (func == 0) {
-					type = read_pci_config_byte(bus, slot,
-								    func,
-							       PCI_HEADER_TYPE);
-					if (!(type & 0x80))
-						break;
-				}
-			}
-		}
-	}
-}
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 97acba7..04052dc 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -115,6 +115,9 @@ static bool pcie_ari_disabled;
 /* If set, the PCIe ATS capability will not be used. */
 static bool pcie_ats_disabled;
 
+/* If set, the PCI config space of each device is printed during boot. */
+bool pci_early_dump;
+
 bool pci_ats_disabled(void)
 {
 	return pcie_ats_disabled;
@@ -5805,6 +5808,8 @@ static int __init pci_setup(char *str)
 				pcie_ats_disabled = true;
 			} else if (!strcmp(str, "noaer")) {
 				pci_no_aer();
+			} else if (!strcmp(str, "earlydump")) {
+				pci_early_dump = true;
 			} else if (!strncmp(str, "realloc=", 8)) {
 				pci_realloc_get_opt(str + 8);
 			} else if (!strncmp(str, "realloc", 7)) {
diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
index c358e7a0..c33265e 100644
--- a/drivers/pci/pci.h
+++ b/drivers/pci/pci.h
@@ -7,6 +7,7 @@
 #define PCI_VSEC_ID_INTEL_TBT	0x1234	/* Thunderbolt */
 
 extern const unsigned char pcie_link_speed[];
+extern bool pci_early_dump;
 
 bool pcie_cap_has_lnkctl(const struct pci_dev *dev);
 
diff --git a/drivers/pci/probe.c b/drivers/pci/probe.c
index 56771f3..7490f78fa 100644
--- a/drivers/pci/probe.c
+++ b/drivers/pci/probe.c
@@ -1545,6 +1545,25 @@ static int pci_intx_mask_broken(struct pci_dev *dev)
 	return 0;
 }
 
+static void early_dump_pci_device(struct pci_dev *pdev)
+{
+	u32 value[256 / 4];
+	int i;
+
+	if (!pci_early_dump)
+		return;
+
+	pci_info(pdev, "pci 0000:%02x:%02x.%d config space:\n",
+		 pdev->bus->number, PCI_SLOT(pdev->devfn),
+		 PCI_FUNC(pdev->devfn));
+
+	for (i = 0; i < 256; i += 4)
+		pci_read_config_dword(pdev, i, &value[i / 4]);
+
+	print_hex_dump(KERN_INFO, "", DUMP_PREFIX_OFFSET, 16, 1, value,
+		       256, false);
+}
+
 /**
  * pci_setup_device - Fill in class and map information of a device
  * @dev: the device structure to fill
@@ -1594,6 +1613,8 @@ int pci_setup_device(struct pci_dev *dev)
 	pci_printk(KERN_DEBUG, dev, "[%04x:%04x] type %02x class %#08x\n",
 		   dev->vendor, dev->device, dev->hdr_type, dev->class);
 
+	early_dump_pci_device(dev);
+
 	/* Need to have dev->class ready */
 	dev->cfg_size = pci_cfg_space_size(dev);
 
-- 
2.7.4
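
For readers following the dump logic in this patch, here is a rough userspace
sketch of the same idea: fill a 64-entry dword array from the first 256 bytes
of config space, then render it 16 bytes per row with an offset prefix. The
accessor type cfg_read_fn, the helpers read_cfg_space() and format_row(), and
the pattern_read() test accessor are all hypothetical; the row layout only
approximates what print_hex_dump() with DUMP_PREFIX_OFFSET would print.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CFG_SPACE_SIZE 256

/* Hypothetical accessor standing in for pci_read_config_dword(). */
typedef uint32_t (*cfg_read_fn)(unsigned int offset);

/* Read the first 256 bytes of config space, one dword at a time. */
static void read_cfg_space(cfg_read_fn read_dword,
			   uint32_t value[CFG_SPACE_SIZE / 4])
{
	for (unsigned int i = 0; i < CFG_SPACE_SIZE; i += 4)
		value[i / 4] = read_dword(i);
}

/* Format one 16-byte row as "OFFSET: b0 b1 ... b15", in memory order. */
static void format_row(const uint32_t *value, unsigned int row,
		       char *out, size_t len)
{
	const uint8_t *bytes = (const uint8_t *)value + row * 16;
	int n;

	assert(row < CFG_SPACE_SIZE / 16);
	n = snprintf(out, len, "%08x:", row * 16);
	for (int i = 0; i < 16 && n > 0 && (size_t)n < len; i++)
		n += snprintf(out + n, len - n, " %02x", bytes[i]);
}

/*
 * Example accessor: each dword holds four copies of its own index, so
 * the byte dump looks the same on little- and big-endian hosts.
 */
static uint32_t pattern_read(unsigned int offset)
{
	return (offset / 4) * 0x01010101u;
}
```

Because the dump walks the array byte by byte, the printed order of real
register contents depends on host endianness; the test pattern sidesteps that.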

^ permalink raw reply related

* [PATCH linux-next v5 06/13] drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
From: Jae Hyun Yoo @ 2018-06-01 18:21 UTC (permalink / raw)
  To: linux-arm-kernel

This commit adds PECI adapter driver implementation for Aspeed
AST24xx/AST25xx SoCs.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Andy Shevchenko <andriy.shevchenko@intel.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ryan Chen <ryan_chen@aspeedtech.com>
---
 drivers/peci/Kconfig       |  27 ++
 drivers/peci/Makefile      |   3 +
 drivers/peci/peci-aspeed.c | 498 +++++++++++++++++++++++++++++++++++++
 3 files changed, 528 insertions(+)
 create mode 100644 drivers/peci/peci-aspeed.c

diff --git a/drivers/peci/Kconfig b/drivers/peci/Kconfig
index c39f7730d081..273a10eab1ce 100644
--- a/drivers/peci/Kconfig
+++ b/drivers/peci/Kconfig
@@ -11,3 +11,30 @@ config PECI
 	  interface that provides a communication channel between Intel
 	  processors and chipset components to external monitoring or control
 	  devices.
+
+	  If you want PECI support, you should say Y here and also to the
+	  specific driver for your bus adapter(s) below.
+
+if PECI
+
+#
+# PECI hardware bus configuration
+#
+
+menu "PECI Hardware Bus support"
+
+config PECI_ASPEED
+	tristate "ASPEED PECI support"
+	select REGMAP_MMIO
+	depends on OF
+	depends on ARCH_ASPEED || COMPILE_TEST
+	help
+	  Say Y here if you want support for the Platform Environment Control
+	  Interface (PECI) bus adapter driver on the ASPEED SoCs.
+
+	  This support is also available as a module.  If so, the module
+	  will be called peci-aspeed.
+
+endmenu
+
+endif # PECI
diff --git a/drivers/peci/Makefile b/drivers/peci/Makefile
index 9e8615e0d3ff..886285e69765 100644
--- a/drivers/peci/Makefile
+++ b/drivers/peci/Makefile
@@ -4,3 +4,6 @@
 
 # Core functionality
 obj-$(CONFIG_PECI)		+= peci-core.o
+
+# Hardware specific bus drivers
+obj-$(CONFIG_PECI_ASPEED)	+= peci-aspeed.o
diff --git a/drivers/peci/peci-aspeed.c b/drivers/peci/peci-aspeed.c
new file mode 100644
index 000000000000..8070ec18d484
--- /dev/null
+++ b/drivers/peci/peci-aspeed.c
@@ -0,0 +1,498 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2012-2017 ASPEED Technology Inc.
+// Copyright (c) 2018 Intel Corporation
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/interrupt.h>
+#include <linux/jiffies.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/peci.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+#include <linux/reset.h>
+
+/* ASPEED PECI Registers */
+#define ASPEED_PECI_CTRL     0x00
+#define ASPEED_PECI_TIMING   0x04
+#define ASPEED_PECI_CMD      0x08
+#define ASPEED_PECI_CMD_CTRL 0x0c
+#define ASPEED_PECI_EXP_FCS  0x10
+#define ASPEED_PECI_CAP_FCS  0x14
+#define ASPEED_PECI_INT_CTRL 0x18
+#define ASPEED_PECI_INT_STS  0x1c
+#define ASPEED_PECI_W_DATA0  0x20
+#define ASPEED_PECI_W_DATA1  0x24
+#define ASPEED_PECI_W_DATA2  0x28
+#define ASPEED_PECI_W_DATA3  0x2c
+#define ASPEED_PECI_R_DATA0  0x30
+#define ASPEED_PECI_R_DATA1  0x34
+#define ASPEED_PECI_R_DATA2  0x38
+#define ASPEED_PECI_R_DATA3  0x3c
+#define ASPEED_PECI_W_DATA4  0x40
+#define ASPEED_PECI_W_DATA5  0x44
+#define ASPEED_PECI_W_DATA6  0x48
+#define ASPEED_PECI_W_DATA7  0x4c
+#define ASPEED_PECI_R_DATA4  0x50
+#define ASPEED_PECI_R_DATA5  0x54
+#define ASPEED_PECI_R_DATA6  0x58
+#define ASPEED_PECI_R_DATA7  0x5c
+
+/* ASPEED_PECI_CTRL - 0x00 : Control Register */
+#define PECI_CTRL_SAMPLING_MASK      GENMASK(19, 16)
+#define PECI_CTRL_READ_MODE_MASK     GENMASK(13, 12)
+#define PECI_CTRL_READ_MODE_COUNT    BIT(12)
+#define PECI_CTRL_READ_MODE_DBG      BIT(13)
+#define PECI_CTRL_CLK_SOURCE_MASK    BIT(11)
+#define PECI_CTRL_CLK_DIV_MASK       GENMASK(10, 8)
+#define PECI_CTRL_INVERT_OUT         BIT(7)
+#define PECI_CTRL_INVERT_IN          BIT(6)
+#define PECI_CTRL_BUS_CONTENT_EN     BIT(5)
+#define PECI_CTRL_PECI_EN            BIT(4)
+#define PECI_CTRL_PECI_CLK_EN        BIT(0)
+
+/* ASPEED_PECI_TIMING - 0x04 : Timing Negotiation Register */
+#define PECI_TIMING_MESSAGE_MASK     GENMASK(15, 8)
+#define PECI_TIMING_ADDRESS_MASK     GENMASK(7, 0)
+
+/* ASPEED_PECI_CMD - 0x08 : Command Register */
+#define PECI_CMD_PIN_MON             BIT(31)
+#define PECI_CMD_STS_MASK            GENMASK(27, 24)
+#define PECI_CMD_IDLE_MASK           (PECI_CMD_STS_MASK | PECI_CMD_PIN_MON)
+#define PECI_CMD_FIRE                BIT(0)
+
+/* ASPEED_PECI_CMD_CTRL - 0x0c : Read/Write Length Register */
+#define PECI_AW_FCS_EN               BIT(31)
+#define PECI_READ_LEN_MASK           GENMASK(23, 16)
+#define PECI_WRITE_LEN_MASK          GENMASK(15, 8)
+#define PECI_TAGET_ADDR_MASK         GENMASK(7, 0)
+
+/* ASPEED_PECI_EXP_FCS - 0x10 : Expected FCS Data Register */
+#define PECI_EXPECT_READ_FCS_MASK    GENMASK(23, 16)
+#define PECI_EXPECT_AW_FCS_AUTO_MASK GENMASK(15, 8)
+#define PECI_EXPECT_WRITE_FCS_MASK   GENMASK(7, 0)
+
+/* ASPEED_PECI_CAP_FCS - 0x14 : Captured FCS Data Register */
+#define PECI_CAPTURE_READ_FCS_MASK   GENMASK(23, 16)
+#define PECI_CAPTURE_WRITE_FCS_MASK  GENMASK(7, 0)
+
+/* ASPEED_PECI_INT_CTRL/STS - 0x18/0x1c : Interrupt Register */
+#define PECI_INT_TIMING_RESULT_MASK  GENMASK(31, 30)
+#define PECI_INT_TIMEOUT             BIT(4)
+#define PECI_INT_CONNECT             BIT(3)
+#define PECI_INT_W_FCS_BAD           BIT(2)
+#define PECI_INT_W_FCS_ABORT         BIT(1)
+#define PECI_INT_CMD_DONE            BIT(0)
+
+#define PECI_INT_MASK  (PECI_INT_TIMEOUT | PECI_INT_CONNECT | \
+			PECI_INT_W_FCS_BAD | PECI_INT_W_FCS_ABORT | \
+			PECI_INT_CMD_DONE)
+
+#define PECI_IDLE_CHECK_TIMEOUT_USEC    50000
+#define PECI_IDLE_CHECK_INTERVAL_USEC   10000
+
+#define PECI_RD_SAMPLING_POINT_DEFAULT  8
+#define PECI_RD_SAMPLING_POINT_MAX      15
+#define PECI_CLK_DIV_DEFAULT            0
+#define PECI_CLK_DIV_MAX                7
+#define PECI_MSG_TIMING_DEFAULT         1
+#define PECI_MSG_TIMING_MAX             255
+#define PECI_ADDR_TIMING_DEFAULT        1
+#define PECI_ADDR_TIMING_MAX            255
+#define PECI_CMD_TIMEOUT_MS_DEFAULT     1000
+#define PECI_CMD_TIMEOUT_MS_MAX         60000
+
+struct aspeed_peci {
+	struct peci_adapter	*adapter;
+	struct device		*dev;
+	struct regmap		*regmap;
+	struct reset_control	*rst;
+	int			irq;
+	spinlock_t		lock; /* to sync completion status handling */
+	struct completion	xfer_complete;
+	u32			status;
+	u32			cmd_timeout_ms;
+};
+
+static int aspeed_peci_xfer_native(struct aspeed_peci *priv,
+				   struct peci_xfer_msg *msg)
+{
+	long err, timeout = msecs_to_jiffies(priv->cmd_timeout_ms);
+	u32 peci_head, peci_state, rx_data, cmd_sts;
+	unsigned long flags;
+	int i, rc;
+	uint reg;
+
+	/* Check command sts and bus idle state */
+	rc = regmap_read_poll_timeout(priv->regmap, ASPEED_PECI_CMD, cmd_sts,
+				      !(cmd_sts & PECI_CMD_IDLE_MASK),
+				      PECI_IDLE_CHECK_INTERVAL_USEC,
+				      PECI_IDLE_CHECK_TIMEOUT_USEC);
+	if (rc)
+		return rc; /* -ETIMEDOUT */
+
+	spin_lock_irqsave(&priv->lock, flags);
+	reinit_completion(&priv->xfer_complete);
+
+	peci_head = FIELD_PREP(PECI_TAGET_ADDR_MASK, msg->addr) |
+		    FIELD_PREP(PECI_WRITE_LEN_MASK, msg->tx_len) |
+		    FIELD_PREP(PECI_READ_LEN_MASK, msg->rx_len);
+
+	regmap_write(priv->regmap, ASPEED_PECI_CMD_CTRL, peci_head);
+
+	for (i = 0; i < msg->tx_len; i += 4) {
+		reg = i < 16 ? ASPEED_PECI_W_DATA0 + i % 16 :
+			       ASPEED_PECI_W_DATA4 + i % 16;
+		regmap_write(priv->regmap, reg,
+			     le32_to_cpup((__le32 *)&msg->tx_buf[i]));
+	}
+
+	dev_dbg(priv->dev, "HEAD : 0x%08x\n", peci_head);
+	print_hex_dump_debug("TX : ", DUMP_PREFIX_NONE, 16, 1,
+			     msg->tx_buf, msg->tx_len, true);
+
+	priv->status = 0;
+	regmap_write(priv->regmap, ASPEED_PECI_CMD, PECI_CMD_FIRE);
+	spin_unlock_irqrestore(&priv->lock, flags);
+
+	err = wait_for_completion_interruptible_timeout(&priv->xfer_complete,
+							timeout);
+
+	spin_lock_irqsave(&priv->lock, flags);
+	dev_dbg(priv->dev, "INT_STS : 0x%08x\n", priv->status);
+	regmap_read(priv->regmap, ASPEED_PECI_CMD, &peci_state);
+	dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+		FIELD_GET(PECI_CMD_STS_MASK, peci_state));
+
+	regmap_write(priv->regmap, ASPEED_PECI_CMD, 0);
+
+	if (err <= 0 || priv->status != PECI_INT_CMD_DONE) {
+		if (err < 0) { /* -ERESTARTSYS */
+			rc = (int)err;
+			goto err_irqrestore;
+		} else if (err == 0) {
+			dev_dbg(priv->dev, "Timeout waiting for a response!\n");
+			rc = -ETIMEDOUT;
+			goto err_irqrestore;
+		}
+
+		dev_dbg(priv->dev, "No valid response!\n");
+		rc = -EIO;
+		goto err_irqrestore;
+	}
+
+	/**
+	 * Note that rx_len and rx_buf size can be an odd number.
+	 * Byte handling is more efficient.
+	 */
+	for (i = 0; i < msg->rx_len; i++) {
+		u8 byte_offset = i % 4;
+
+		if (byte_offset == 0) {
+			reg = i < 16 ? ASPEED_PECI_R_DATA0 + i % 16 :
+				       ASPEED_PECI_R_DATA4 + i % 16;
+			regmap_read(priv->regmap, reg, &rx_data);
+		}
+
+		msg->rx_buf[i] = (u8)(rx_data >> (byte_offset << 3));
+	}
+
+	print_hex_dump_debug("RX : ", DUMP_PREFIX_NONE, 16, 1,
+			     msg->rx_buf, msg->rx_len, true);
+
+	regmap_read(priv->regmap, ASPEED_PECI_CMD, &peci_state);
+	dev_dbg(priv->dev, "PECI_STATE : 0x%lx\n",
+		FIELD_GET(PECI_CMD_STS_MASK, peci_state));
+	dev_dbg(priv->dev, "------------------------\n");
+
+err_irqrestore:
+	spin_unlock_irqrestore(&priv->lock, flags);
+	return rc;
+}
+
+static irqreturn_t aspeed_peci_irq_handler(int irq, void *arg)
+{
+	struct aspeed_peci *priv = arg;
+	u32 status_ack = 0;
+	u32 status;
+
+	spin_lock(&priv->lock);
+	regmap_read(priv->regmap, ASPEED_PECI_INT_STS, &status);
+	priv->status |= (status & PECI_INT_MASK);
+
+	/**
+	 * In most cases, interrupt bits will be set one by one but also note
+	 * that multiple interrupt bits could be set at the same time.
+	 */
+	if (status & PECI_INT_TIMEOUT) {
+		dev_dbg(priv->dev, "PECI_INT_TIMEOUT\n");
+		status_ack |= PECI_INT_TIMEOUT;
+	}
+
+	if (status & PECI_INT_CONNECT) {
+		dev_dbg(priv->dev, "PECI_INT_CONNECT\n");
+		status_ack |= PECI_INT_CONNECT;
+	}
+
+	if (status & PECI_INT_W_FCS_BAD) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_BAD\n");
+		status_ack |= PECI_INT_W_FCS_BAD;
+	}
+
+	if (status & PECI_INT_W_FCS_ABORT) {
+		dev_dbg(priv->dev, "PECI_INT_W_FCS_ABORT\n");
+		status_ack |= PECI_INT_W_FCS_ABORT;
+	}
+
+	/**
+	 * Every command should finish with the PECI_INT_CMD_DONE bit set,
+	 * even in an error case.
+	 */
+	if (status & PECI_INT_CMD_DONE) {
+		dev_dbg(priv->dev, "PECI_INT_CMD_DONE\n");
+		status_ack |= PECI_INT_CMD_DONE;
+		complete(&priv->xfer_complete);
+	}
+
+	regmap_write(priv->regmap, ASPEED_PECI_INT_STS, status_ack);
+	spin_unlock(&priv->lock);
+	return IRQ_HANDLED;
+}
+
+static int aspeed_peci_init_ctrl(struct aspeed_peci *priv)
+{
+	u32 msg_timing, addr_timing, rd_sampling_point;
+	u32 clk_freq, clk_divisor, clk_div_val = 0;
+	struct clk *clkin;
+	int ret;
+
+	clkin = devm_clk_get(priv->dev, NULL);
+	if (IS_ERR(clkin)) {
+		dev_err(priv->dev, "Failed to get clk source.\n");
+		return PTR_ERR(clkin);
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "clock-frequency",
+				   &clk_freq);
+	if (ret) {
+		dev_err(priv->dev,
+			"Could not read clock-frequency property.\n");
+		return ret;
+	}
+
+	clk_divisor = clk_get_rate(clkin) / clk_freq;
+	devm_clk_put(priv->dev, clkin);
+
+	while ((clk_divisor >>= 1) && (clk_div_val < PECI_CLK_DIV_MAX))
+		clk_div_val++;
+
+	ret = of_property_read_u32(priv->dev->of_node, "msg-timing",
+				   &msg_timing);
+	if (ret || msg_timing > PECI_MSG_TIMING_MAX) {
+		if (!ret)
+			dev_warn(priv->dev,
+				 "Invalid msg-timing : %u, Use default : %u\n",
+				 msg_timing, PECI_MSG_TIMING_DEFAULT);
+		msg_timing = PECI_MSG_TIMING_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "addr-timing",
+				   &addr_timing);
+	if (ret || addr_timing > PECI_ADDR_TIMING_MAX) {
+		if (!ret)
+			dev_warn(priv->dev,
+				 "Invalid addr-timing : %u, Use default : %u\n",
+				 addr_timing, PECI_ADDR_TIMING_DEFAULT);
+		addr_timing = PECI_ADDR_TIMING_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "rd-sampling-point",
+				   &rd_sampling_point);
+	if (ret || rd_sampling_point > PECI_RD_SAMPLING_POINT_MAX) {
+		if (!ret)
+			dev_warn(priv->dev,
+				 "Invalid rd-sampling-point : %u. Use default : %u\n",
+				 rd_sampling_point,
+				 PECI_RD_SAMPLING_POINT_DEFAULT);
+		rd_sampling_point = PECI_RD_SAMPLING_POINT_DEFAULT;
+	}
+
+	ret = of_property_read_u32(priv->dev->of_node, "cmd-timeout-ms",
+				   &priv->cmd_timeout_ms);
+	if (ret || priv->cmd_timeout_ms > PECI_CMD_TIMEOUT_MS_MAX ||
+	    priv->cmd_timeout_ms == 0) {
+		if (!ret)
+			dev_warn(priv->dev,
+				 "Invalid cmd-timeout-ms : %u. Use default : %u\n",
+				 priv->cmd_timeout_ms,
+				 PECI_CMD_TIMEOUT_MS_DEFAULT);
+		priv->cmd_timeout_ms = PECI_CMD_TIMEOUT_MS_DEFAULT;
+	}
+
+	regmap_write(priv->regmap, ASPEED_PECI_CTRL,
+		     FIELD_PREP(PECI_CTRL_CLK_DIV_MASK, PECI_CLK_DIV_DEFAULT) |
+		     PECI_CTRL_PECI_CLK_EN);
+
+	/**
+	 * Timing negotiation period setting.
+	 * The programmed value is in units of four PECI clock periods.
+	 */
+	regmap_write(priv->regmap, ASPEED_PECI_TIMING,
+		     FIELD_PREP(PECI_TIMING_MESSAGE_MASK, msg_timing) |
+		     FIELD_PREP(PECI_TIMING_ADDRESS_MASK, addr_timing));
+
+	/* Clear interrupts */
+	regmap_write(priv->regmap, ASPEED_PECI_INT_STS, PECI_INT_MASK);
+
+	/* Enable interrupts */
+	regmap_write(priv->regmap, ASPEED_PECI_INT_CTRL, PECI_INT_MASK);
+
+	/* Read sampling point and clock speed setting */
+	regmap_write(priv->regmap, ASPEED_PECI_CTRL,
+		     FIELD_PREP(PECI_CTRL_SAMPLING_MASK, rd_sampling_point) |
+		     FIELD_PREP(PECI_CTRL_CLK_DIV_MASK, clk_div_val) |
+		     PECI_CTRL_PECI_EN | PECI_CTRL_PECI_CLK_EN);
+
+	return 0;
+}
+
+static const struct regmap_config aspeed_peci_regmap_config = {
+	.reg_bits = 32,
+	.val_bits = 32,
+	.reg_stride = 4,
+	.max_register = ASPEED_PECI_R_DATA7,
+	.val_format_endian = REGMAP_ENDIAN_LITTLE,
+	.fast_io = true,
+};
+
+static int aspeed_peci_xfer(struct peci_adapter *adapter,
+			    struct peci_xfer_msg *msg)
+{
+	struct aspeed_peci *priv = peci_get_adapdata(adapter);
+
+	return aspeed_peci_xfer_native(priv, msg);
+}
+
+static int aspeed_peci_probe(struct platform_device *pdev)
+{
+	struct peci_adapter *adapter;
+	struct aspeed_peci *priv;
+	struct resource *res;
+	void __iomem *base;
+	u32 cmd_sts;
+	int ret;
+
+	adapter = peci_alloc_adapter(&pdev->dev, sizeof(*priv));
+	if (!adapter)
+		return -ENOMEM;
+
+	priv = peci_get_adapdata(adapter);
+	priv->adapter = adapter;
+	priv->dev = &pdev->dev;
+	dev_set_drvdata(&pdev->dev, priv);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(base)) {
+		ret = PTR_ERR(base);
+		goto err_put_adapter_dev;
+	}
+
+	priv->regmap = devm_regmap_init_mmio(&pdev->dev, base,
+					     &aspeed_peci_regmap_config);
+	if (IS_ERR(priv->regmap)) {
+		ret = PTR_ERR(priv->regmap);
+		goto err_put_adapter_dev;
+	}
+
+	/**
+	 * We check that the regmap works on this very first access,
+	 * but as this is an MMIO-backed regmap, subsequent regmap
+	 * access is not going to fail and we skip error checks from
+	 * this point.
+	 */
+	ret = regmap_read(priv->regmap, ASPEED_PECI_CMD, &cmd_sts);
+	if (ret) {
+		ret = -EIO;
+		goto err_put_adapter_dev;
+	}
+
+	priv->irq = platform_get_irq(pdev, 0);
+	if (!priv->irq) {
+		ret = -ENODEV;
+		goto err_put_adapter_dev;
+	}
+
+	ret = devm_request_irq(&pdev->dev, priv->irq, aspeed_peci_irq_handler,
+			       0, "peci-aspeed-irq", priv);
+	if (ret)
+		goto err_put_adapter_dev;
+
+	init_completion(&priv->xfer_complete);
+	spin_lock_init(&priv->lock);
+
+	priv->adapter->owner = THIS_MODULE;
+	priv->adapter->dev.of_node = of_node_get(dev_of_node(priv->dev));
+	strlcpy(priv->adapter->name, pdev->name, sizeof(priv->adapter->name));
+	priv->adapter->xfer = aspeed_peci_xfer;
+
+	priv->rst = devm_reset_control_get(&pdev->dev, NULL);
+	if (IS_ERR(priv->rst)) {
+		dev_err(&pdev->dev,
+			"missing or invalid reset controller entry\n");
+		ret = PTR_ERR(priv->rst);
+		goto err_put_adapter_dev;
+	}
+	reset_control_deassert(priv->rst);
+
+	ret = aspeed_peci_init_ctrl(priv);
+	if (ret)
+		goto err_put_adapter_dev;
+
+	ret = peci_add_adapter(priv->adapter);
+	if (ret)
+		goto err_put_adapter_dev;
+
+	dev_info(&pdev->dev, "peci bus %d registered, irq %d\n",
+		 priv->adapter->nr, priv->irq);
+
+	return 0;
+
+err_put_adapter_dev:
+	put_device(&adapter->dev);
+	return ret;
+}
+
+static int aspeed_peci_remove(struct platform_device *pdev)
+{
+	struct aspeed_peci *priv = dev_get_drvdata(&pdev->dev);
+
+	reset_control_assert(priv->rst);
+	peci_del_adapter(priv->adapter);
+	of_node_put(priv->adapter->dev.of_node);
+
+	return 0;
+}
+
+static const struct of_device_id aspeed_peci_of_table[] = {
+	{ .compatible = "aspeed,ast2400-peci", },
+	{ .compatible = "aspeed,ast2500-peci", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, aspeed_peci_of_table);
+
+static struct platform_driver aspeed_peci_driver = {
+	.probe  = aspeed_peci_probe,
+	.remove = aspeed_peci_remove,
+	.driver = {
+		.name           = "peci-aspeed",
+		.of_match_table = of_match_ptr(aspeed_peci_of_table),
+	},
+};
+module_platform_driver(aspeed_peci_driver);
+
+MODULE_AUTHOR("Ryan Chen <ryan_chen@aspeedtech.com>");
+MODULE_AUTHOR("Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>");
+MODULE_DESCRIPTION("ASPEED PECI driver");
+MODULE_LICENSE("GPL v2");
-- 
2.17.0
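
The read-back loop in aspeed_peci_xfer_native() pulls one 32-bit data register
per four bytes and peels the bytes off least-significant first, switching from
the R_DATA0..3 bank (0x30..0x3c) to the R_DATA4..7 bank (0x50..0x5c) at byte
16. A minimal userspace sketch of that mapping follows. The plain array used
as a register file and the helper names rx_reg_for_byte() and unpack_rx() are
assumptions for illustration, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define R_DATA0 0x30	/* first read-data bank, bytes 0..15 */
#define R_DATA4 0x50	/* second read-data bank, bytes 16..31 */

/* Map a response byte index (0..31) to the register offset holding it. */
static uint32_t rx_reg_for_byte(unsigned int i)
{
	assert(i < 32);
	return (i < 16 ? R_DATA0 : R_DATA4) + ((i % 16) & ~3u);
}

/*
 * Unpack len bytes from a fake register file indexed by offset / 4.
 * A new register is fetched on every fourth byte; the other three
 * bytes are shifted out of the word already read.
 */
static void unpack_rx(const uint32_t *regs, uint8_t *buf, size_t len)
{
	uint32_t word = 0;

	for (size_t i = 0; i < len; i++) {
		unsigned int shift = (i % 4) * 8;

		if (shift == 0)
			word = regs[rx_reg_for_byte((unsigned int)i) / 4];
		buf[i] = (uint8_t)(word >> shift);
	}
}
```

Handling the data byte by byte, as the driver comment notes, keeps odd-length
responses simple because no partial word ever has to be written to the buffer.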

^ permalink raw reply related

* [PATCH linux-next v5 05/13] ARM: dts: aspeed: peci: Add PECI node
From: Jae Hyun Yoo @ 2018-06-01 18:21 UTC (permalink / raw)
  To: linux-arm-kernel

This commit adds PECI bus/adapter node of AST24xx/AST25xx into
aspeed-g4 and aspeed-g5.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Ryan Chen <ryan_chen@aspeedtech.com>
---
 arch/arm/boot/dts/aspeed-g4.dtsi | 26 ++++++++++++++++++++++++++
 arch/arm/boot/dts/aspeed-g5.dtsi | 26 ++++++++++++++++++++++++++
 2 files changed, 52 insertions(+)

diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index 5e947ed496c2..5702f0cfd5f8 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -295,6 +296,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -338,6 +346,24 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2400-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		resets = <&syscon ASPEED_RESET_PECI>;
+		clock-frequency = <24000000>;
+		msg-timing = <1>;
+		addr-timing = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
diff --git a/arch/arm/boot/dts/aspeed-g5.dtsi b/arch/arm/boot/dts/aspeed-g5.dtsi
index 24eec00c4a95..5741b841fddb 100644
--- a/arch/arm/boot/dts/aspeed-g5.dtsi
+++ b/arch/arm/boot/dts/aspeed-g5.dtsi
@@ -29,6 +29,7 @@
 		serial3 = &uart4;
 		serial4 = &uart5;
 		serial5 = &vuart;
+		peci0 = &peci0;
 	};
 
 	cpus {
@@ -352,6 +353,13 @@
 				};
 			};
 
+			peci: peci@1e78b000 {
+				compatible = "simple-bus";
+				#address-cells = <1>;
+				#size-cells = <1>;
+				ranges = <0x0 0x1e78b000 0x60>;
+			};
+
 			uart2: serial@1e78d000 {
 				compatible = "ns16550a";
 				reg = <0x1e78d000 0x20>;
@@ -395,6 +403,24 @@
 	};
 };
 
+&peci {
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2500-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		resets = <&syscon ASPEED_RESET_PECI>;
+		clock-frequency = <24000000>;
+		msg-timing = <1>;
+		addr-timing = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+		status = "disabled";
+	};
+};
+
 &i2c {
 	i2c_ic: interrupt-controller@0 {
 		#interrupt-cells = <1>;
-- 
2.17.0
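
The clock-frequency property in these nodes feeds the divider field of the
PECI control register. One plausible reading, sketched below, is that the
field holds a power-of-two exponent: floor(log2(source / target)), capped at
7. That reading is an assumption drawn from the 3-bit field width and the
187500 ~ 24000000 range quoted in the binding (24000000 / 128 = 187500), not
from a datasheet. peci_clk_div_val() is a hypothetical helper name, and the
sketch shifts the divisor on every iteration so the loop terminates at the
exponent.

```c
#include <assert.h>
#include <stdint.h>

#define CLK_DIV_MAX 7	/* 3-bit divider field, values 0..7 */

/*
 * Return the divider exponent for a source clock and a requested PECI
 * clock, both in Hz: floor(log2(src_hz / target_hz)), capped at
 * CLK_DIV_MAX. A target above the source rate yields 0.
 */
static uint32_t peci_clk_div_val(uint32_t src_hz, uint32_t target_hz)
{
	uint32_t divisor, div_val = 0;

	assert(target_hz != 0);
	divisor = src_hz / target_hz;

	/* Count how many times the ratio can be halved before reaching 0. */
	while ((divisor >>= 1) && div_val < CLK_DIV_MAX)
		div_val++;

	return div_val;
}
```

With the 24 MHz reference used in both dtsi files, a clock-frequency of
24000000 gives exponent 0 and 187500 gives exponent 7 under this reading.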

^ permalink raw reply related

* [PATCH linux-next v5 04/13] dt-bindings: Add a document of PECI adapter driver for ASPEED AST24xx/25xx SoCs
From: Jae Hyun Yoo @ 2018-06-01 18:21 UTC (permalink / raw)
  To: linux-arm-kernel

This commit adds a dt-bindings document of PECI adapter driver for ASPEED
AST24xx/25xx SoCs.

Signed-off-by: Jae Hyun Yoo <jae.hyun.yoo@linux.intel.com>
Reviewed-by: Haiyue Wang <haiyue.wang@linux.intel.com>
Reviewed-by: James Feist <james.feist@linux.intel.com>
Reviewed-by: Vernon Mauery <vernon.mauery@linux.intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Jason M Bills <jason.m.bills@linux.intel.com>
Cc: Milton Miller II <miltonm@us.ibm.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ryan Chen <ryan_chen@aspeedtech.com>
---
 .../devicetree/bindings/peci/peci-aspeed.txt  | 57 +++++++++++++++++++
 1 file changed, 57 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt

diff --git a/Documentation/devicetree/bindings/peci/peci-aspeed.txt b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
new file mode 100644
index 000000000000..8c35f905589d
--- /dev/null
+++ b/Documentation/devicetree/bindings/peci/peci-aspeed.txt
@@ -0,0 +1,57 @@
+Device tree configuration for PECI buses on the AST24XX and AST25XX SoCs.
+
+Required properties:
+- compatible        : Should be "aspeed,ast2400-peci" or "aspeed,ast2500-peci"
+		      - aspeed,ast2400-peci: ASPEED AST2400 family PECI
+					     controller
+		      - aspeed,ast2500-peci: ASPEED AST2500 family PECI
+					     controller
+- reg               : Should contain PECI controller registers location and
+		      length.
+- #address-cells    : Should be <1>. Required to define a client address.
+- #size-cells       : Should be <0>. Required to define a client address.
+- interrupts        : Should contain PECI controller interrupt.
+- clocks            : Should contain clock source for PECI controller. Should
+		      reference the external oscillator clock in the second
+		      cell.
+- resets            : Should contain phandle to reset controller with the reset
+		      number in the second cell.
+- clock-frequency   : Should contain the operation frequency of PECI controller
+		      in units of Hz.
+		      187500 ~ 24000000
+
+Optional properties:
+- msg-timing        : Message timing negotiation period. This value will
+		      determine the period of message timing negotiation to be
+		      issued by PECI controller. The unit of the programmed
+		      value is four times the PECI clock period.
+		      0 ~ 255 (default: 1)
+- addr-timing       : Address timing negotiation period. This value will
+		      determine the period of address timing negotiation to be
+		      issued by PECI controller. The unit of the programmed
+		      value is four times the PECI clock period.
+		      0 ~ 255 (default: 1)
+- rd-sampling-point : Read sampling point selection. The whole period of a bit
+		      time will be divided into 16 time frames. This value will
+		      determine the time frame in which the controller will
+		      sample PECI signal for data read back. The middle of a
+		      bit time is usually the best.
+		      0 ~ 15 (default: 8)
+- cmd-timeout-ms    : Command timeout in units of ms.
+		      1 ~ 60000 (default: 1000)
+
+Example:
+	peci0: peci-bus@0 {
+		compatible = "aspeed,ast2500-peci";
+		reg = <0x0 0x60>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+		interrupts = <15>;
+		clocks = <&syscon ASPEED_CLK_GATE_REFCLK>;
+		resets = <&syscon ASPEED_RESET_PECI>;
+		clock-frequency = <24000000>;
+		msg-timing = <1>;
+		addr-timing = <1>;
+		rd-sampling-point = <8>;
+		cmd-timeout-ms = <1000>;
+	};
-- 
2.17.0
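
The value ranges and defaults listed in the binding above can be sketched as a small sanity check. This is only an illustration in plain C, not code from the patch: the struct, the macro names and both functions are hypothetical, and falling back to the default for an out-of-range optional property is an assumed policy (a driver could equally reject the node).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for values read from the device tree node. */
struct peci_dt_config {
	uint32_t clock_frequency;   /* Hz, required */
	uint32_t msg_timing;        /* optional */
	uint32_t addr_timing;       /* optional */
	uint32_t rd_sampling_point; /* optional */
	uint32_t cmd_timeout_ms;    /* optional */
};

/* Limits and defaults as stated in the binding text. */
#define PECI_CLK_HZ_MIN          187500u
#define PECI_CLK_HZ_MAX          24000000u
#define PECI_TIMING_MAX          255u
#define PECI_TIMING_DEFAULT      1u
#define PECI_RD_SAMPLING_MAX     15u
#define PECI_RD_SAMPLING_DEFAULT 8u
#define PECI_TIMEOUT_MS_MIN      1u
#define PECI_TIMEOUT_MS_MAX      60000u
#define PECI_TIMEOUT_MS_DEFAULT  1000u

/* Return val if it lies in [min, max], otherwise the default. */
static uint32_t in_range_or_default(uint32_t val, uint32_t min,
				    uint32_t max, uint32_t def)
{
	return (val >= min && val <= max) ? val : def;
}

/*
 * Returns false when the required clock-frequency is out of range.
 * Out-of-range optional properties are replaced by their defaults
 * (assumed policy, see the lead-in).
 */
static bool peci_dt_config_sanitize(struct peci_dt_config *cfg)
{
	assert(cfg != NULL);

	if (cfg->clock_frequency < PECI_CLK_HZ_MIN ||
	    cfg->clock_frequency > PECI_CLK_HZ_MAX)
		return false;

	cfg->msg_timing = in_range_or_default(cfg->msg_timing, 0,
					      PECI_TIMING_MAX,
					      PECI_TIMING_DEFAULT);
	cfg->addr_timing = in_range_or_default(cfg->addr_timing, 0,
					       PECI_TIMING_MAX,
					       PECI_TIMING_DEFAULT);
	cfg->rd_sampling_point = in_range_or_default(cfg->rd_sampling_point, 0,
						     PECI_RD_SAMPLING_MAX,
						     PECI_RD_SAMPLING_DEFAULT);
	cfg->cmd_timeout_ms = in_range_or_default(cfg->cmd_timeout_ms,
						  PECI_TIMEOUT_MS_MIN,
						  PECI_TIMEOUT_MS_MAX,
						  PECI_TIMEOUT_MS_DEFAULT);
	return true;
}
```

With the values from the example node (24000000 Hz, 1, 1, 8, 1000) the check passes and leaves every field unchanged.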

* [PATCH linux-next v5 00/13] PECI device driver introduction
From: Jae Hyun Yoo @ 2018-06-01 18:20 UTC (permalink / raw)
  To: linux-arm-kernel

Introduction of the Platform Environment Control Interface (PECI) bus
device driver. PECI is a one-wire bus interface that provides a
communication channel from an Intel processor and chipset components to
external monitoring or control devices. PECI is designed to support the
following sideband functions:

* Processor and DRAM thermal management
  - Processor fan speed control is managed by comparing Digital Thermal
    Sensor (DTS) thermal readings acquired via PECI against the
    processor-specific fan speed control reference point, or TCONTROL. Both
    TCONTROL and DTS thermal readings are accessible via the processor PECI
    client. These variables are referenced to a common temperature, the TCC
    activation point, and are both defined as negative offsets from that
    reference.
  - PECI based access to the processor package configuration space provides
    a means for Baseboard Management Controllers (BMC) or other platform
    management devices to actively manage the processor and memory power
    and thermal features.

* Platform Manageability
  - Platform manageability functions including thermal, power, and error
    monitoring. Note that platform 'power' management includes monitoring
    and control for both the processor and DRAM subsystem to assist with
    data center power limiting.
  - PECI allows read access to certain error registers in the processor MSR
    space and status monitoring registers in the PCI configuration space
    within the processor and downstream devices.
  - PECI permits writes to certain registers in the processor PCI
    configuration space.

* Processor Interface Tuning and Diagnostics
  - Processor interface tuning and diagnostics capabilities
    (Intel Interconnect BIST). The processor's Intel Interconnect Built-In
    Self Test (Intel IBIST) allows for in-field diagnostic capabilities in
    the Intel UPI and memory controller interfaces. PECI provides a port to
    execute these diagnostics via its PCI Configuration read and write
    capabilities.

* Failure Analysis
  - Output the state of the processor after a failure for analysis via
    Crashdump.
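
For the thermal management item above, the relationship between the TCC activation point, the DTS reading and TCONTROL can be sketched as plain arithmetic. This is a minimal illustration, not code from the series: the struct and function names are hypothetical, and millidegrees Celsius is an assumed unit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sample, all values in millidegrees Celsius (assumed unit).
 * The DTS reading and TCONTROL are offsets from the TCC activation point
 * and are zero or negative, as the description above states.
 */
struct peci_thermal_sample {
	int32_t tcc_activation_mdeg;  /* absolute reference temperature */
	int32_t dts_offset_mdeg;      /* DTS reading, <= 0 */
	int32_t tcontrol_offset_mdeg; /* TCONTROL, <= 0 */
};

/* Absolute temperature = reference + (negative) DTS offset. */
static int32_t peci_abs_temp_mdeg(const struct peci_thermal_sample *s)
{
	assert(s != NULL);
	return s->tcc_activation_mdeg + s->dts_offset_mdeg;
}

/*
 * Distance between the current reading and the fan speed control
 * reference point. A positive result means the processor is still
 * cooler than TCONTROL.
 */
static int32_t peci_margin_to_tcontrol_mdeg(const struct peci_thermal_sample *s)
{
	assert(s != NULL);
	return s->tcontrol_offset_mdeg - s->dts_offset_mdeg;
}
```

Because both variables share the same reference, the margin can be computed from the two offsets alone, without knowing the absolute TCC activation temperature.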

PECI uses a single wire for self-clocking and data transfer. The bus
requires no additional control lines. The physical layer is a self-clocked
one-wire bus that begins each bit with a driven, rising edge from an idle
level near zero volts. The duration of the signal driven high depends on
whether the bit value is a logic '0' or logic '1'. PECI also includes a
variable data transfer rate established with every message. In this way, it
is highly flexible even though the underlying logic is simple.
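
The pulse-width encoding described above can be sketched as a classification of the measured high time. The proportions are an assumption that is not stated in this cover letter: a logic '0' is taken to be high for roughly 1/4 of the bit time and a logic '1' for roughly 3/4, so the midpoint of the bit time serves as the decision threshold. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Classify one bit from how long the line stayed high within one bit
 * time. Both arguments use the same arbitrary tick unit.
 *
 * Assumption (not from the cover letter): '0' is high for about 1/4 of
 * the bit time and '1' for about 3/4, so anything longer than half the
 * bit time is read as '1'.
 */
static int peci_decode_bit(uint32_t high_ticks, uint32_t bit_time_ticks)
{
	assert(bit_time_ticks > 0);
	assert(high_ticks <= bit_time_ticks);

	/* Widen before doubling so large tick counts cannot overflow. */
	return ((uint64_t)high_ticks * 2 > bit_time_ticks) ? 1 : 0;
}
```

Because every bit starts with its own rising edge, the receiver can re-measure the bit time on each message, which is what allows the transfer rate to vary.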

The interface design was optimized for interfacing between an Intel
processor and chipset components in both single processor and multiple
processor environments. The single wire interface provides low board
routing overhead for the multiple load connections in the congested routing
area near the processor and chipset components. Bus speed, error checking,
and low protocol overhead provide adequate link bandwidth and reliability
to transfer critical device operating conditions and configuration
information.

This implementation provides the basic framework to add PECI extensions to
the Linux bus and device models. A hardware-specific 'Adapter' driver can
be attached to the PECI bus to provide the sideband functions described
above. It is also possible to access all devices on an adapter from
userspace through the /dev interface. A device-specific 'Client' driver can
also be attached to the PECI bus so that each processor client's features
can be supported by the 'Client' driver through an adapter connection in
the bus. This patch set includes the Aspeed 24xx/25xx PECI driver and the
PECI cputemp/dimmtemp drivers as the first implementations of adapter and
client drivers on the PECI bus framework.

Please review.

Thanks,

-Jae

Changes since v4:
* Fixed incorrect endianness handling in peci-aspeed.
* Added a comment to explain the asm/intel-family.h inclusion.
* Added an MFD module to support multi-function PECI client devices.

Changes since v3:
* Made the code simpler and more compact.
* Removed unused header file inclusion.
* Fixed incorrect error return values and messages.
* Removed DTS margin temperature from the peci-cputemp.
* Made some magic numbers use defines.
* Moved peci_get_cpu_id() into peci-core as a common function.
* Replaced the cancel_delayed_work() call with a cancel_delayed_work_sync().
* Replaced AST and Aspeed uses with ASPEED.
* Simplified peci command timeout checking logic using
  regmap_read_poll_timeout().
* Simplified endian swap code using endian handling macros.
* Dropped regmap read/write error checking except for the first access.
* Added a PECI reset setting in the device tree node.
* Removed unnecessary sleep from the probe context.
* Removed IRQF_SHARED flag from irq request code in the ASPEED PECI driver.
* Fixed typos in documents.
* Combined peci-bus.txt, peci-adapter.txt and peci-client.txt into peci.txt.
* Fixed and swept documents to drop some incorrect or unnecessary
  descriptions.
* Fixed device tree to make unit-address format use reg contents.
* Simplified bit manipulations using <linux/bitfield.h>.
* Made client CPU model checking use <asm/intel-family.h> if available.
* Modified the adapter heap allocation method to be based on the kobject
  reference count.
* Added the low-level PECI xfer IOCTL again to support the Redfish
  requirement.
* Added PM domain attach/detach code.
* Added logic for device instantiation through sysfs.
* Fixed a bug in the interrupt status checking code in peci-aspeed driver.

Changes since v2:
* Divided peci-hwmon driver into two drivers, peci-cputemp and
  peci-dimmtemp.
* Added generic dt binding documents for PECI bus, adapter and client.
* Removed in_atomic() call from the PECI core driver.
* Improved PECI commands masking logic.
* Added permission check logic for PECI ioctls.
* Removed unnecessary type casts.
* Fixed some invalid error return codes.
* Added the mark_updated() function to improve update interval checking
  logic.
* Fixed a bug in populated DIMM checking function.
* Fixed some typos, grammar and style issues in documents.
* Rewrote hwmon drivers to use devm_hwmon_device_register_with_info API.
* Made the peci_match_id() function static.
* Replaced a deprecated create_singlethread_workqueue() call with an
  alloc_ordered_workqueue() call.
* Reordered local variable definitions in reversed xmas tree notation.
* Listed the client CPUs that can be supported by peci-cputemp and
  peci-dimmtemp hwmon drivers.
* Added CPU generation detection logic which checks CPUID signature through
  PECI connection.
* Improved interrupt handling logic in the Aspeed PECI adapter driver.
* Fixed SPDX license identifier style in header files.
* Changed some macros in peci.h to static inline functions.
* Dropped sleepable context checking code in peci-core.
* Adjusted rt_mutex protection scope in peci-core.
* Moved adapter->xfer() checking code into peci_register_adapter().
* Improved PECI command retry checking logic.
* Changed ioctl base from 'P' to 0xb6 to avoid conflicts and updated
  ioctl-number.txt to reflect the ioctl number of PECI subsystem.
* Added a comment to describe PECI retry action.
* Simplified return code handling of peci_ioctl_ping().
* Changed type of peci_ioctl_fn[] to static const.
* Fixed range checking code for valid PECI commands.
* Fixed the error return code on invalid PECI commands.
* Fixed incorrect definitions of PECI ioctl and its handling logic.

Changes since v1:
* Additionally implemented a core driver to support the PECI Linux bus
  driver model.
* Modified the Aspeed PECI driver to make it an adapter driver on the PECI
  bus.
* Modified the PECI hwmon driver to make it a client driver on the PECI
  bus.
* Simplified hwmon driver attribute labels and removed redundant strings.
* Removed core_nums from device tree setting of hwmon driver and modified
  core number detection logic to check the resolved_core register in client
  CPU's local PCI configuration area.
* Removed dimm_nums from device tree setting of hwmon driver and added
  populated DIMM detection logic to support dynamic creation.
* Removed indexing gap on core temperature and DIMM temperature attributes.
* Improved hwmon registration and dynamic attribute creation logic.
* Fixed structure definitions in the PECI uapi header to use __u8, __u16,
  etc.
* Modified wait_for_completion_interruptible_timeout error handling logic
  in Aspeed PECI driver to deliver errors correctly.
* Removed low-level xfer command from ioctl and kept only high-level PECI
  command suite as ioctls.
* Fixed I/O timeout logic in Aspeed PECI driver using ktime.
* Added a function into hwmon driver to simplify update delay checking.
* Added a function into hwmon driver to convert 10.6 to millidegree.
* Dropped non-standard attributes in hwmon driver.
* Fixed the OF table for hwmon so that it identifies the device as a PECI
  client of an Intel CPU target.
* Added a maintainer of PECI subsystem into MAINTAINERS document.
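
The v1 change list above mentions a helper that converts 10.6 values to millidegrees. A minimal sketch of such a conversion follows, assuming "10.6" denotes a signed 16-bit fixed-point value with 6 fractional bits (so one LSB is 1/64 degree). The function name is hypothetical and this is not the implementation from the patch.

```c
#include <stdint.h>

/* Assumed layout: the low 6 bits of the 16-bit value are the fraction. */
#define PECI_TEMP_FRAC_BITS 6

/*
 * Convert a signed 10.6 fixed-point temperature to millidegrees Celsius.
 * One LSB equals 1000 / 64 = 15.625 millidegrees.
 */
static int32_t peci_10_6_to_mdeg(int16_t raw)
{
	/*
	 * Multiply before dividing so the fractional part is kept.
	 * C integer division truncates toward zero, so sub-millidegree
	 * remainders are dropped for both positive and negative inputs.
	 */
	return ((int32_t)raw * 1000) / (1 << PECI_TEMP_FRAC_BITS);
}
```

A division is used instead of a right shift so that negative readings do not depend on implementation-defined shift behaviour.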

Jae Hyun Yoo (13):
  dt-bindings: Add a document of PECI subsystem
  Documentation: ioctl: Add ioctl numbers for PECI subsystem
  drivers/peci: Add support for PECI bus driver core
  dt-bindings: Add a document of PECI adapter driver for ASPEED
    AST24xx/25xx SoCs
  ARM: dts: aspeed: peci: Add PECI node
  drivers/peci: Add a PECI adapter driver for Aspeed AST24xx/AST25xx
  dt-bindings: mfd: Add a document for PECI client mfd
  drivers/mfd: Add PECI client mfd driver
  dt-bindings: hwmon: Add documents for PECI hwmon client drivers
  Documentation: hwmon: Add documents for PECI hwmon client drivers
  drivers/hwmon: Add PECI cputemp driver
  drivers/hwmon: Add PECI dimmtemp driver
  Add maintainers for the PECI subsystem

 .../bindings/hwmon/peci-cputemp.txt           |   11 +
 .../bindings/hwmon/peci-dimmtemp.txt          |   13 +
 .../devicetree/bindings/mfd/peci-client.txt   |   23 +
 .../devicetree/bindings/peci/peci-aspeed.txt  |   57 +
 .../devicetree/bindings/peci/peci.txt         |   59 +
 Documentation/hwmon/peci-cputemp              |   78 +
 Documentation/hwmon/peci-dimmtemp             |   50 +
 Documentation/ioctl/ioctl-number.txt          |    2 +
 MAINTAINERS                                   |   12 +
 arch/arm/boot/dts/aspeed-g4.dtsi              |   26 +
 arch/arm/boot/dts/aspeed-g5.dtsi              |   26 +
 drivers/Kconfig                               |    2 +
 drivers/Makefile                              |    1 +
 drivers/hwmon/Kconfig                         |   28 +
 drivers/hwmon/Makefile                        |    2 +
 drivers/hwmon/peci-cputemp.c                  |  401 +++++
 drivers/hwmon/peci-dimmtemp.c                 |  295 ++++
 drivers/mfd/Kconfig                           |   11 +
 drivers/mfd/Makefile                          |    1 +
 drivers/mfd/peci-client.c                     |  205 +++
 drivers/peci/Kconfig                          |   40 +
 drivers/peci/Makefile                         |    9 +
 drivers/peci/peci-aspeed.c                    |  498 ++++++
 drivers/peci/peci-core.c                      | 1439 +++++++++++++++++
 include/linux/mfd/peci-client.h               |   60 +
 include/linux/peci.h                          |  104 ++
 include/uapi/linux/peci-ioctl.h               |  265 +++
 27 files changed, 3718 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-cputemp.txt
 create mode 100644 Documentation/devicetree/bindings/hwmon/peci-dimmtemp.txt
 create mode 100644 Documentation/devicetree/bindings/mfd/peci-client.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci-aspeed.txt
 create mode 100644 Documentation/devicetree/bindings/peci/peci.txt
 create mode 100644 Documentation/hwmon/peci-cputemp
 create mode 100644 Documentation/hwmon/peci-dimmtemp
 create mode 100644 drivers/hwmon/peci-cputemp.c
 create mode 100644 drivers/hwmon/peci-dimmtemp.c
 create mode 100644 drivers/mfd/peci-client.c
 create mode 100644 drivers/peci/Kconfig
 create mode 100644 drivers/peci/Makefile
 create mode 100644 drivers/peci/peci-aspeed.c
 create mode 100644 drivers/peci/peci-core.c
 create mode 100644 include/linux/mfd/peci-client.h
 create mode 100644 include/linux/peci.h
 create mode 100644 include/uapi/linux/peci-ioctl.h

-- 
2.17.0

* [linux-sunxi] [PATCH] arm: sun4i: Add support for Pengpod 1000 tablet
From: Jagan Teki @ 2018-06-01 18:16 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180601175538.16716-1-rah@settrans.net>

On 06/01/2018 11:25 PM, Bob Ham wrote:
> This is initial support for the Pengpod 1000 tablet.  The display is
> not currently working but the UART, SD card and USB all work fine.
> 
> Signed-off-by: Bob Ham <rah@settrans.net>
> ---
>   arch/arm/boot/dts/Makefile                   |   1 +
>   arch/arm/boot/dts/sun4i-a10-pengpod-1000.dts | 232 +++++++++++++++++++++++++++
>   2 files changed, 233 insertions(+)
>   create mode 100644 arch/arm/boot/dts/sun4i-a10-pengpod-1000.dts
> 
> diff --git a/arch/arm/boot/dts/Makefile b/arch/arm/boot/dts/Makefile
> index ade7a38543dc..e6e93e7ffc8b 100644
> --- a/arch/arm/boot/dts/Makefile
> +++ b/arch/arm/boot/dts/Makefile
> @@ -893,6 +893,7 @@ dtb-$(CONFIG_MACH_SUN4I) += \
>   	sun4i-a10-olinuxino-lime.dtb \
>   	sun4i-a10-pcduino.dtb \
>   	sun4i-a10-pcduino2.dtb \
> +	sun4i-a10-pengpod-1000.dtb \
>   	sun4i-a10-pov-protab2-ips9.dtb
>   dtb-$(CONFIG_MACH_SUN5I) += \
>   	sun5i-a10s-auxtek-t003.dtb \
> diff --git a/arch/arm/boot/dts/sun4i-a10-pengpod-1000.dts b/arch/arm/boot/dts/sun4i-a10-pengpod-1000.dts
> new file mode 100644
> index 000000000000..94560400114d
> --- /dev/null
> +++ b/arch/arm/boot/dts/sun4i-a10-pengpod-1000.dts
> @@ -0,0 +1,232 @@
> +/*
> + * Copyright 2015 Hans de Goede <hdegoede@redhat.com>
> + * Copyright 2017 Robert Ham <rah@settrans.net>
> + *
> + * This file is dual-licensed: you can use it either under the terms
> + * of the GPL or the X11 license, at your option. Note that this dual
> + * licensing only applies to this file, and not this project as a
> + * whole.
> + *
> + *  a) This file is free software; you can redistribute it and/or
> + *     modify it under the terms of the GNU General Public License as
> + *     published by the Free Software Foundation; either version 2 of the
> + *     License, or (at your option) any later version.
> + *
> + *     This file is distributed in the hope that it will be useful,
> + *     but WITHOUT ANY WARRANTY; without even the implied warranty of
> + *     MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + *     GNU General Public License for more details.
> + *
> + * Or, alternatively,
> + *
> + *  b) Permission is hereby granted, free of charge, to any person
> + *     obtaining a copy of this software and associated documentation
> + *     files (the "Software"), to deal in the Software without
> + *     restriction, including without limitation the rights to use,
> + *     copy, modify, merge, publish, distribute, sublicense, and/or
> + *     sell copies of the Software, and to permit persons to whom the
> + *     Software is furnished to do so, subject to the following
> + *     conditions:
> + *
> + *     The above copyright notice and this permission notice shall be
> + *     included in all copies or substantial portions of the Software.
> + *
> + *     THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
> + *     EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
> + *     OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
> + *     NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
> + *     HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
> + *     WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + *     FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + *     OTHER DEALINGS IN THE SOFTWARE.
> + */

New dts files need to use an SPDX-License-Identifier tag instead of the
full license text.

* [PATCH v2] clk: imx: Set CLK_SET_RATE_GATE for gate and divider clocks
From: Stephen Boyd @ 2018-06-01 18:03 UTC (permalink / raw)
  To: linux-arm-kernel
In-Reply-To: <20180418023214.GN25429@dragon>

Quoting Shawn Guo (2018-04-17 19:32:16)
> On Wed, Apr 11, 2018 at 05:03:29PM +0300, Abel Vesa wrote:
> > From: Shawn Guo <shawnguo@kernel.org>
> > 
> > Add flag CLK_SET_RATE_GATE for i.MX gate and divider clocks on which the
> > client drivers usually make clk_set_rate() call, so that the call will fail
> > when clock is still on instead of standing the risk of running into glitch
> > issue. Rate cannot be changed when the clock is enabled due to the glitchy
> > multiplexers.
> > 
> > Signed-off-by: Shawn Guo <shawnguo@kernel.org>
> > [initial patch from imx internal repo]
> > Signed-off-by: Abel Vesa <abel.vesa@nxp.com>
> > [carried over from 3.14 and also applied the flag to newer functions]
> > ---
> > 
> > Changes since v1:
> >  - changed ownership as per initial patch
> 
> IIRC, the patch was created on vendor kernel long time ago to work
> around a specific glitchy multiplexer issue seen on particular SoC.
> I'm not sure it's good for the upstream kernel today.
> 

I'm taking this as a Nak. Resend or restart this discussion if you want
me to apply this.
