All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH AUTOSEL 5.5 029/542] drm/amd/display: Map ODM memory correctly when doing ODM combine
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Nikola Cornij, Jun Lei, Rodrigo Siqueira, Alex Deucher,
	Sasha Levin, amd-gfx, dri-devel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Nikola Cornij <nikola.cornij@amd.com>

[ Upstream commit ec5b356c58941bb8930858155d9ce14ceb3d30a0 ]

[why]
Up to 4 ODM memory pieces are required per ODM combine and cannot
overlap, i.e. each ODM "session" has to use its own memory pieces.
The ODM-memory mapping is currently broken for generic case.

The maximum number of memory pieces is ASIC-dependent, but it's always
big enough to satisfy maximum number of ODM combines. Memory pieces
are mapped as a bit-map, i.e. one memory piece corresponds to one bit.
The OPTC doing ODM needs to select memory pieces by setting the
corresponding bits, making sure there's no overlap with other OPTC
instances that might be doing ODM.

The current mapping works only for OPTC instance indexes smaller than
3. For instance indexes 3 and up it practically maps no ODM memory,
causing black, gray or white screen in display configs that include
ODM on OPTC instance 3 or up.

[how]
Statically map two unique ODM memory pieces for each OPTC instance
and piece them together when programming ODM combine mode.

Signed-off-by: Nikola Cornij <nikola.cornij@amd.com>
Reviewed-by: Jun Lei <Jun.Lei@amd.com>
Acked-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../gpu/drm/amd/display/dc/dcn20/dcn20_optc.c    | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c
index 3b613fb93ef80..0162d3ffe268f 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_optc.c
@@ -233,12 +233,13 @@ void optc2_set_odm_combine(struct timing_generator *optc, int *opp_id, int opp_c
 		struct dc_crtc_timing *timing)
 {
 	struct optc *optc1 = DCN10TG_FROM_TG(optc);
-	/* 2 pieces of memory required for up to 5120 displays, 4 for up to 8192 */
 	int mpcc_hactive = (timing->h_addressable + timing->h_border_left + timing->h_border_right)
 			/ opp_cnt;
-	int memory_mask = mpcc_hactive <= 2560 ? 0x3 : 0xf;
+	uint32_t memory_mask;
 	uint32_t data_fmt = 0;
 
+	ASSERT(opp_cnt == 2);
+
 	/* TODO: In pseudocode but does not affect maximus, delete comment if we dont need on asic
 	 * REG_SET(OTG_GLOBAL_CONTROL2, 0, GLOBAL_UPDATE_LOCK_EN, 1);
 	 * Program OTG register MASTER_UPDATE_LOCK_DB_X/Y to the position before DP frame start
@@ -246,9 +247,17 @@ void optc2_set_odm_combine(struct timing_generator *optc, int *opp_id, int opp_c
 	 *		MASTER_UPDATE_LOCK_DB_X, 160,
 	 *		MASTER_UPDATE_LOCK_DB_Y, 240);
 	 */
+
+	/* 2 pieces of memory required for up to 5120 displays, 4 for up to 8192,
+	 * however, for ODM combine we can simplify by always using 4.
+	 * To make sure there's no overlap, each instance "reserves" 2 memories and
+	 * they are uniquely combined here.
+	 */
+	memory_mask = 0x3 << (opp_id[0] * 2) | 0x3 << (opp_id[1] * 2);
+
 	if (REG(OPTC_MEMORY_CONFIG))
 		REG_SET(OPTC_MEMORY_CONFIG, 0,
-			OPTC_MEM_SEL, memory_mask << (optc->inst * 4));
+			OPTC_MEM_SEL, memory_mask);
 
 	if (timing->pixel_encoding == PIXEL_ENCODING_YCBCR422)
 		data_fmt = 1;
@@ -257,7 +266,6 @@ void optc2_set_odm_combine(struct timing_generator *optc, int *opp_id, int opp_c
 
 	REG_UPDATE(OPTC_DATA_FORMAT_CONTROL, OPTC_DATA_FORMAT, data_fmt);
 
-	ASSERT(opp_cnt == 2);
 	REG_SET_3(OPTC_DATA_SOURCE_SELECT, 0,
 			OPTC_NUM_OF_INPUT_SEGMENT, 1,
 			OPTC_SEG0_SRC_SEL, opp_id[0],
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 030/542] pinctrl: sh-pfc: r8a77965: Fix DU_DOTCLKIN3 drive/bias control
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Geert Uytterhoeven, Sasha Levin, linux-renesas-soc, linux-gpio
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit a34cd9dfd03fa9ec380405969f1d638bc63b8d63 ]

R-Car Gen3 Hardware Manual Errata for Rev. 2.00 of October 24, 2019
changed the configuration bits for drive and bias control for the
DU_DOTCLKIN3 pin on R-Car M3-N, to match the same pin on R-Car H3.
Update the driver to reflect this.

After this, the handling of drive and bias control for the various
DU_DOTCLKINx pins is consistent across all of the R-Car H3, M3-W,
M3-W+, and M3-N SoCs.

Fixes: 86c045c2e4201e94 ("pinctrl: sh-pfc: r8a77965: Replace DU_DOTCLKIN2 by DU_DOTCLKIN3")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20191113101653.28428-1-geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pinctrl/sh-pfc/pfc-r8a77965.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/pinctrl/sh-pfc/pfc-r8a77965.c b/drivers/pinctrl/sh-pfc/pfc-r8a77965.c
index 8bdf33c807f6c..6616f5210b9d9 100644
--- a/drivers/pinctrl/sh-pfc/pfc-r8a77965.c
+++ b/drivers/pinctrl/sh-pfc/pfc-r8a77965.c
@@ -5998,7 +5998,7 @@ static const struct pinmux_drive_reg pinmux_drive_regs[] = {
 		{ PIN_DU_DOTCLKIN1,    0, 2 },	/* DU_DOTCLKIN1 */
 	} },
 	{ PINMUX_DRIVE_REG("DRVCTRL12", 0xe6060330) {
-		{ PIN_DU_DOTCLKIN3,   28, 2 },	/* DU_DOTCLKIN3 */
+		{ PIN_DU_DOTCLKIN3,   24, 2 },	/* DU_DOTCLKIN3 */
 		{ PIN_FSCLKST,        20, 2 },	/* FSCLKST */
 		{ PIN_TMS,             4, 2 },	/* TMS */
 	} },
@@ -6254,8 +6254,8 @@ static const struct pinmux_bias_reg pinmux_bias_regs[] = {
 		[31] = PIN_DU_DOTCLKIN1,	/* DU_DOTCLKIN1 */
 	} },
 	{ PINMUX_BIAS_REG("PUEN3", 0xe606040c, "PUD3", 0xe606044c) {
-		[ 0] = PIN_DU_DOTCLKIN3,	/* DU_DOTCLKIN3 */
-		[ 1] = SH_PFC_PIN_NONE,
+		[ 0] = SH_PFC_PIN_NONE,
+		[ 1] = PIN_DU_DOTCLKIN3,	/* DU_DOTCLKIN3 */
 		[ 2] = PIN_FSCLKST,		/* FSCLKST */
 		[ 3] = PIN_EXTALR,		/* EXTALR*/
 		[ 4] = PIN_TRST_N,		/* TRST# */
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH v4 1/6] MIPS: JZ4780: Introduce SMP support.
From: Paul Cercueil @ 2020-02-14 18:21 UTC (permalink / raw)
  To: 周琰杰 (Zhou Yanjie)
  Cc: linux-mips, linux-clk, linux-kernel, devicetree, mturquette,
	sboyd, robh+dt, mark.rutland, ralf, paulburton, jiaxun.yang,
	chenhc, allison, tglx, daniel.lezcano, geert+renesas, krzk,
	keescook, ebiederm, miquel.raynal, paul, hns, sernia.zhou,
	zhenwenjin, mips-creator-ci20-dev, 1326991897
In-Reply-To: <1581703360-112557-3-git-send-email-zhouyanjie@wanyeetech.com>

Hi Zhou,

The changes to drivers/clk/ingenic/jz4780-cgu.c look sound but they 
really don't belong in this patch.

Please split them to another patch.

-Paul


Le sam., févr. 15, 2020 at 02:02, 周琰杰 (Zhou Yanjie) 
<zhouyanjie@wanyeetech.com> a écrit :
> Forward port smp support from kernel 3.18.3 of CI20_linux
> to upstream kernel 5.5.
> 
> Tested-by: H. Nikolaus Schaller <hns@goldelico.com>
> Tested-by: Paul Boddie <paul@boddie.org.uk>
> Signed-off-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com>
> ---
> 
> Notes:
>     v1->v2:
>     1.Remove unnecessary "plat_irq_dispatch(void)" in irq-ingenic.c.
>     2.Add a timeout check for "jz4780_boot_secondary()" to avoid a 
> dead loop.
>     3.Replace hard code in smp.c with macro.
> 
>     v2->v3:
>     1.Remove unnecessary "extern void (*r4k_blast_dcache)(void)" in 
> smp.c.
>     2.Use "for_each_of_cpu_node" instead "for_each_compatible_node" 
> in smp.c.
>     3.Use "of_cpu_node_to_id" instead "of_property_read_u32_index" in 
> smp.c.
>     4.Move LCR related operations to jz4780-cgu.c.
> 
>     v3->v4:
>     Rebase on top of kernel 5.6-rc1.
> 
>  arch/mips/include/asm/mach-jz4740/jz4780-smp.h |  91 ++++++++
>  arch/mips/jz4740/Kconfig                       |   3 +
>  arch/mips/jz4740/Makefile                      |   5 +
>  arch/mips/jz4740/prom.c                        |   4 +
>  arch/mips/jz4740/smp-entry.S                   |  57 +++++
>  arch/mips/jz4740/smp.c                         | 286 
> +++++++++++++++++++++++++
>  arch/mips/kernel/idle.c                        |  14 +-
>  drivers/clk/ingenic/jz4780-cgu.c               |  58 ++++-
>  8 files changed, 512 insertions(+), 6 deletions(-)
>  create mode 100644 arch/mips/include/asm/mach-jz4740/jz4780-smp.h
>  create mode 100644 arch/mips/jz4740/smp-entry.S
>  create mode 100644 arch/mips/jz4740/smp.c
> 
> diff --git a/arch/mips/include/asm/mach-jz4740/jz4780-smp.h 
> b/arch/mips/include/asm/mach-jz4740/jz4780-smp.h
> new file mode 100644
> index 00000000..3f592ce
> --- /dev/null
> +++ b/arch/mips/include/asm/mach-jz4740/jz4780-smp.h
> @@ -0,0 +1,91 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + *  Copyright (C) 2013, Paul Burton <paul.burton@imgtec.com>
> + *  JZ4780 SMP definitions
> + */
> +
> +#ifndef __MIPS_ASM_MACH_JZ4740_JZ4780_SMP_H__
> +#define __MIPS_ASM_MACH_JZ4740_JZ4780_SMP_H__
> +
> +#define read_c0_corectrl()		__read_32bit_c0_register($12, 2)
> +#define write_c0_corectrl(val)		__write_32bit_c0_register($12, 2, 
> val)
> +
> +#define read_c0_corestatus()		__read_32bit_c0_register($12, 3)
> +#define write_c0_corestatus(val)	__write_32bit_c0_register($12, 3, 
> val)
> +
> +#define read_c0_reim()			__read_32bit_c0_register($12, 4)
> +#define write_c0_reim(val)		__write_32bit_c0_register($12, 4, val)
> +
> +#define read_c0_mailbox0()		__read_32bit_c0_register($20, 0)
> +#define write_c0_mailbox0(val)		__write_32bit_c0_register($20, 0, 
> val)
> +
> +#define read_c0_mailbox1()		__read_32bit_c0_register($20, 1)
> +#define write_c0_mailbox1(val)		__write_32bit_c0_register($20, 1, 
> val)
> +
> +#define smp_clr_pending(mask) do {		\
> +		unsigned int stat;		\
> +		stat = read_c0_corestatus();	\
> +		stat &= ~((mask) & 0xff);	\
> +		write_c0_corestatus(stat);	\
> +	} while (0)
> +
> +/*
> + * Core Control register
> + */
> +#define CORECTRL_SLEEP1M_SHIFT	17
> +#define CORECTRL_SLEEP1M	(_ULCAST_(0x1) << CORECTRL_SLEEP1M_SHIFT)
> +#define CORECTRL_SLEEP0M_SHIFT	16
> +#define CORECTRL_SLEEP0M	(_ULCAST_(0x1) << CORECTRL_SLEEP0M_SHIFT)
> +#define CORECTRL_RPC1_SHIFT	9
> +#define CORECTRL_RPC1		(_ULCAST_(0x1) << CORECTRL_RPC1_SHIFT)
> +#define CORECTRL_RPC0_SHIFT	8
> +#define CORECTRL_RPC0		(_ULCAST_(0x1) << CORECTRL_RPC0_SHIFT)
> +#define CORECTRL_SWRST1_SHIFT	1
> +#define CORECTRL_SWRST1		(_ULCAST_(0x1) << CORECTRL_SWRST1_SHIFT)
> +#define CORECTRL_SWRST0_SHIFT	0
> +#define CORECTRL_SWRST0		(_ULCAST_(0x1) << CORECTRL_SWRST0_SHIFT)
> +
> +/*
> + * Core Status register
> + */
> +#define CORESTATUS_SLEEP1_SHIFT	17
> +#define CORESTATUS_SLEEP1	(_ULCAST_(0x1) << CORESTATUS_SLEEP1_SHIFT)
> +#define CORESTATUS_SLEEP0_SHIFT	16
> +#define CORESTATUS_SLEEP0	(_ULCAST_(0x1) << CORESTATUS_SLEEP0_SHIFT)
> +#define CORESTATUS_IRQ1P_SHIFT	9
> +#define CORESTATUS_IRQ1P	(_ULCAST_(0x1) << CORESTATUS_IRQ1P_SHIFT)
> +#define CORESTATUS_IRQ0P_SHIFT	8
> +#define CORESTATUS_IRQ0P	(_ULCAST_(0x1) << CORESTATUS_IRQ8P_SHIFT)
> +#define CORESTATUS_MIRQ1P_SHIFT	1
> +#define CORESTATUS_MIRQ1P	(_ULCAST_(0x1) << CORESTATUS_MIRQ1P_SHIFT)
> +#define CORESTATUS_MIRQ0P_SHIFT	0
> +#define CORESTATUS_MIRQ0P	(_ULCAST_(0x1) << CORESTATUS_MIRQ0P_SHIFT)
> +
> +/*
> + * Reset Entry & IRQ Mask register
> + */
> +#define REIM_ENTRY_SHIFT	16
> +#define REIM_ENTRY		(_ULCAST_(0xffff) << REIM_ENTRY_SHIFT)
> +#define REIM_IRQ1M_SHIFT	9
> +#define REIM_IRQ1M		(_ULCAST_(0x1) << REIM_IRQ1M_SHIFT)
> +#define REIM_IRQ0M_SHIFT	8
> +#define REIM_IRQ0M		(_ULCAST_(0x1) << REIM_IRQ0M_SHIFT)
> +#define REIM_MBOXIRQ1M_SHIFT	1
> +#define REIM_MBOXIRQ1M		(_ULCAST_(0x1) << REIM_MBOXIRQ1M_SHIFT)
> +#define REIM_MBOXIRQ0M_SHIFT	0
> +#define REIM_MBOXIRQ0M		(_ULCAST_(0x1) << REIM_MBOXIRQ0M_SHIFT)
> +
> +#ifdef CONFIG_SMP
> +
> +extern void jz4780_smp_wait_irqoff(void);
> +
> +extern void jz4780_smp_init(void);
> +extern void jz4780_secondary_cpu_entry(void);
> +
> +#else /* !CONFIG_SMP */
> +
> +static inline void jz4780_smp_init(void) { }
> +
> +#endif /* !CONFIG_SMP */
> +
> +#endif /* __MIPS_ASM_MACH_JZ4740_JZ4780_SMP_H__ */
> diff --git a/arch/mips/jz4740/Kconfig b/arch/mips/jz4740/Kconfig
> index 412d2fa..0239597 100644
> --- a/arch/mips/jz4740/Kconfig
> +++ b/arch/mips/jz4740/Kconfig
> @@ -34,9 +34,12 @@ config MACH_JZ4770
> 
>  config MACH_JZ4780
>  	bool
> +	select GENERIC_CLOCKEVENTS_BROADCAST if SMP
>  	select MIPS_CPU_SCACHE
> +	select NR_CPUS_DEFAULT_2
>  	select SYS_HAS_CPU_MIPS32_R2
>  	select SYS_SUPPORTS_HIGHMEM
> +	select SYS_SUPPORTS_SMP
> 
>  config MACH_X1000
>  	bool
> diff --git a/arch/mips/jz4740/Makefile b/arch/mips/jz4740/Makefile
> index 6de14c0..0a0f024 100644
> --- a/arch/mips/jz4740/Makefile
> +++ b/arch/mips/jz4740/Makefile
> @@ -12,3 +12,8 @@ CFLAGS_setup.o = 
> -I$(src)/../../../scripts/dtc/libfdt
>  # PM support
> 
>  obj-$(CONFIG_PM) += pm.o
> +
> +# SMP support
> +
> +obj-$(CONFIG_SMP) += smp.o
> +obj-$(CONFIG_SMP) += smp-entry.o
> diff --git a/arch/mips/jz4740/prom.c b/arch/mips/jz4740/prom.c
> index ff4555c..a79159e 100644
> --- a/arch/mips/jz4740/prom.c
> +++ b/arch/mips/jz4740/prom.c
> @@ -8,10 +8,14 @@
> 
>  #include <asm/bootinfo.h>
>  #include <asm/fw/fw.h>
> +#include <asm/mach-jz4740/jz4780-smp.h>
> 
>  void __init prom_init(void)
>  {
>  	fw_init_cmdline();
> +#if defined(CONFIG_MACH_JZ4780) && defined(CONFIG_SMP)
> +	jz4780_smp_init();
> +#endif
>  }
> 
>  void __init prom_free_prom_memory(void)
> diff --git a/arch/mips/jz4740/smp-entry.S 
> b/arch/mips/jz4740/smp-entry.S
> new file mode 100644
> index 00000000..20049a3
> --- /dev/null
> +++ b/arch/mips/jz4740/smp-entry.S
> @@ -0,0 +1,57 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +/*
> + *  Copyright (C) 2013, Paul Burton <paul.burton@imgtec.com>
> + *  JZ4780 SMP entry point
> + */
> +
> +#include <asm/addrspace.h>
> +#include <asm/asm.h>
> +#include <asm/asmmacro.h>
> +#include <asm/cacheops.h>
> +#include <asm/mipsregs.h>
> +
> +#define CACHE_SIZE (32 * 1024)
> +#define CACHE_LINESIZE 32
> +
> +.extern jz4780_cpu_entry_sp
> +.extern jz4780_cpu_entry_gp
> +
> +.section .text.smp-entry
> +.balign 0x10000
> +.set noreorder
> +LEAF(jz4780_secondary_cpu_entry)
> +	mtc0	zero, CP0_CAUSE
> +
> +	li	t0, ST0_CU0
> +	mtc0	t0, CP0_STATUS
> +
> +	/* cache setup */
> +	li	t0, KSEG0
> +	ori	t1, t0, CACHE_SIZE
> +	mtc0	zero, CP0_TAGLO, 0
> +1:	cache	Index_Store_Tag_I, 0(t0)
> +	cache	Index_Store_Tag_D, 0(t0)
> +	bne	t0, t1, 1b
> +	 addiu	t0, t0, CACHE_LINESIZE
> +
> +	/* kseg0 cache attribute */
> +	mfc0	t0, CP0_CONFIG, 0
> +	ori	t0, t0, CONF_CM_CACHABLE_NONCOHERENT
> +	mtc0	t0, CP0_CONFIG, 0
> +
> +	/* pagemask */
> +	mtc0	zero, CP0_PAGEMASK, 0
> +
> +	/* retrieve sp */
> +	la	t0, jz4780_cpu_entry_sp
> +	lw	sp, 0(t0)
> +
> +	/* retrieve gp */
> +	la	t0, jz4780_cpu_entry_gp
> +	lw	gp, 0(t0)
> +
> +	/* jump to the kernel in kseg0 */
> +	la	t0, smp_bootstrap
> +	jr	t0
> +	 nop
> +	END(jz4780_secondary_cpu_entry)
> diff --git a/arch/mips/jz4740/smp.c b/arch/mips/jz4740/smp.c
> new file mode 100644
> index 00000000..19b75c2
> --- /dev/null
> +++ b/arch/mips/jz4740/smp.c
> @@ -0,0 +1,286 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + *  Copyright (C) 2013, Paul Burton <paul.burton@imgtec.com>
> + *  JZ4780 SMP
> + */
> +
> +#include <linux/clk.h>
> +#include <linux/delay.h>
> +#include <linux/interrupt.h>
> +#include <linux/of.h>
> +#include <linux/sched.h>
> +#include <linux/sched/task_stack.h>
> +#include <linux/smp.h>
> +#include <linux/tick.h>
> +#include <asm/mach-jz4740/jz4780-smp.h>
> +#include <asm/r4kcache.h>
> +#include <asm/smp-ops.h>
> +
> +static struct clk *cpu_clock_gates[CONFIG_NR_CPUS] = { NULL };
> +
> +u32 jz4780_cpu_entry_sp;
> +u32 jz4780_cpu_entry_gp;
> +
> +static struct cpumask cpu_running;
> +
> +static DEFINE_SPINLOCK(smp_lock);
> +
> +/*
> + * The Ingenic jz4780 SMP variant has to write back dirty cache 
> lines before
> + * executing wait. The CPU & cache clock will be gated until we 
> return from
> + * the wait, and if another core attempts to access data from our 
> data cache
> + * during this time then it will lock up.
> + */
> +void jz4780_smp_wait_irqoff(void)
> +{
> +	unsigned long pending = read_c0_cause() & read_c0_status() & 
> CAUSEF_IP;
> +
> +	/*
> +	 * Going to idle has a significant overhead due to the cache flush 
> so
> +	 * try to avoid it if we'll immediately be woken again due to an 
> IRQ.
> +	 */
> +	if (!need_resched() && !pending) {
> +		r4k_blast_dcache();
> +
> +		__asm__(
> +		"	.set push	\n"
> +		"	.set mips3	\n"
> +		"	sync		\n"
> +		"	wait		\n"
> +		"	.set pop	\n");
> +	}
> +
> +	local_irq_enable();
> +}
> +
> +static irqreturn_t mbox_handler(int irq, void *dev_id)
> +{
> +	int cpu = smp_processor_id();
> +	u32 action, status;
> +
> +	spin_lock(&smp_lock);
> +
> +	switch (cpu) {
> +	case 0:
> +		action = read_c0_mailbox0();
> +		write_c0_mailbox0(0);
> +		break;
> +	case 1:
> +		action = read_c0_mailbox1();
> +		write_c0_mailbox1(0);
> +		break;
> +	default:
> +		panic("unhandled cpu %d!", cpu);
> +	}
> +
> +	/* clear pending mailbox interrupt */
> +	status = read_c0_corestatus();
> +	status &= ~(CORESTATUS_MIRQ0P << cpu);
> +	write_c0_corestatus(status);
> +
> +	spin_unlock(&smp_lock);
> +
> +	if (action & SMP_RESCHEDULE_YOURSELF)
> +		scheduler_ipi();
> +	if (action & SMP_CALL_FUNCTION)
> +		generic_smp_call_function_interrupt();
> +
> +	return IRQ_HANDLED;
> +}
> +
> +static struct irqaction mbox_action = {
> +	.handler = mbox_handler,
> +	.name = "core mailbox",
> +	.flags = IRQF_PERCPU | IRQF_NO_THREAD,
> +};
> +
> +static void jz4780_smp_setup(void)
> +{
> +	u32 addr, reim;
> +	int cpu;
> +
> +	reim = read_c0_reim();
> +
> +	for (cpu = 0; cpu < NR_CPUS; cpu++) {
> +		__cpu_number_map[cpu] = cpu;
> +		__cpu_logical_map[cpu] = cpu;
> +		set_cpu_possible(cpu, true);
> +	}
> +
> +	/* mask mailbox interrupts for this core */
> +	reim &= ~REIM_MBOXIRQ0M;
> +	write_c0_reim(reim);
> +
> +	/* clear mailboxes & pending mailbox IRQs */
> +	write_c0_mailbox0(0);
> +	write_c0_mailbox1(0);
> +	write_c0_corestatus(0);
> +
> +	/* set reset entry point */
> +	addr = KSEG1ADDR((u32)&jz4780_secondary_cpu_entry);
> +	WARN_ON(addr & ~REIM_ENTRY);
> +	reim &= ~REIM_ENTRY;
> +	reim |= addr & REIM_ENTRY;
> +
> +	/* unmask mailbox interrupts for this core */
> +	reim |= REIM_MBOXIRQ0M;
> +	write_c0_reim(reim);
> +	set_c0_status(STATUSF_IP3);
> +	irq_enable_hazard();
> +
> +	cpumask_set_cpu(cpu, &cpu_running);
> +}
> +
> +static void jz4780_smp_prepare_cpus(unsigned int max_cpus)
> +{
> +	struct device_node *cpu_node;
> +	unsigned cpu, ctrl;
> +	int err;
> +
> +	/* setup the mailbox IRQ */
> +	setup_irq(MIPS_CPU_IRQ_BASE + 3, &mbox_action);
> +
> +	init_cpu_present(cpu_possible_mask);
> +
> +	ctrl = read_c0_corectrl();
> +
> +	for (cpu = 0; cpu < max_cpus; cpu++) {
> +		/* use reset entry point from REIM register */
> +		ctrl |= CORECTRL_RPC0 << cpu;
> +	}
> +
> +	for_each_of_cpu_node(cpu_node) {
> +		cpu = of_cpu_node_to_id(cpu_node);
> +		if (cpu < 0) {
> +			pr_err("Failed to read index of %s\n",
> +			       cpu_node->full_name);
> +			continue;
> +		}
> +
> +		cpu_clock_gates[cpu] = of_clk_get(cpu_node, 0);
> +		if (IS_ERR(cpu_clock_gates[cpu])) {
> +			cpu_clock_gates[cpu] = NULL;
> +			continue;
> +		}
> +
> +		err = clk_prepare(cpu_clock_gates[cpu]);
> +		if (err)
> +			pr_err("Failed to prepare CPU clock gate\n");
> +	}
> +
> +	write_c0_corectrl(ctrl);
> +}
> +
> +static int jz4780_boot_secondary(int cpu, struct task_struct *idle)
> +{
> +	unsigned long flags;
> +	u32 ctrl;
> +
> +	spin_lock_irqsave(&smp_lock, flags);
> +
> +	/* ensure the core is in reset */
> +	ctrl = read_c0_corectrl();
> +	ctrl |= CORECTRL_SWRST0 << cpu;
> +	write_c0_corectrl(ctrl);
> +
> +	/* ungate core clock */
> +	if (cpu_clock_gates[cpu])
> +		clk_enable(cpu_clock_gates[cpu]);
> +
> +	/* set entry sp/gp register values */
> +	jz4780_cpu_entry_sp = __KSTK_TOS(idle);
> +	jz4780_cpu_entry_gp = (u32)task_thread_info(idle);
> +	smp_wmb();
> +
> +	/* take the core out of reset */
> +	ctrl &= ~(CORECTRL_SWRST0 << cpu);
> +	write_c0_corectrl(ctrl);
> +
> +	cpumask_set_cpu(cpu, &cpu_running);
> +
> +	spin_unlock_irqrestore(&smp_lock, flags);
> +
> +	return 0;
> +}
> +
> +static void jz4780_init_secondary(void)
> +{
> +}
> +
> +static void jz4780_smp_finish(void)
> +{
> +	u32 reim;
> +
> +	spin_lock(&smp_lock);
> +
> +	/* unmask mailbox interrupts for this core */
> +	reim = read_c0_reim();
> +	reim |= REIM_MBOXIRQ0M << smp_processor_id();
> +	write_c0_reim(reim);
> +
> +	spin_unlock(&smp_lock);
> +
> +	/* unmask interrupts for this core */
> +	change_c0_status(ST0_IM, STATUSF_IP3 | STATUSF_IP2 |
> +			 STATUSF_IP1 | STATUSF_IP0);
> +	irq_enable_hazard();
> +
> +	/* force broadcast timer */
> +	tick_broadcast_force();
> +}
> +
> +static void jz4780_send_ipi_single_locked(int cpu, unsigned int 
> action)
> +{
> +	u32 mbox;
> +
> +	switch (cpu) {
> +	case 0:
> +		mbox = read_c0_mailbox0();
> +		write_c0_mailbox0(mbox | action);
> +		break;
> +	case 1:
> +		mbox = read_c0_mailbox1();
> +		write_c0_mailbox1(mbox | action);
> +		break;
> +	default:
> +		panic("unhandled cpu %d!", cpu);
> +	}
> +}
> +
> +static void jz4780_send_ipi_single(int cpu, unsigned int action)
> +{
> +	unsigned long flags;
> +
> +	spin_lock_irqsave(&smp_lock, flags);
> +	jz4780_send_ipi_single_locked(cpu, action);
> +	spin_unlock_irqrestore(&smp_lock, flags);
> +}
> +
> +static void jz4780_send_ipi_mask(const struct cpumask *mask,
> +				 unsigned int action)
> +{
> +	unsigned long flags;
> +	int cpu;
> +
> +	spin_lock_irqsave(&smp_lock, flags);
> +
> +	for_each_cpu(cpu, mask)
> +		jz4780_send_ipi_single_locked(cpu, action);
> +
> +	spin_unlock_irqrestore(&smp_lock, flags);
> +}
> +
> +static struct plat_smp_ops jz4780_smp_ops = {
> +	.send_ipi_single = jz4780_send_ipi_single,
> +	.send_ipi_mask = jz4780_send_ipi_mask,
> +	.init_secondary = jz4780_init_secondary,
> +	.smp_finish = jz4780_smp_finish,
> +	.boot_secondary = jz4780_boot_secondary,
> +	.smp_setup = jz4780_smp_setup,
> +	.prepare_cpus = jz4780_smp_prepare_cpus,
> +};
> +
> +void jz4780_smp_init(void)
> +{
> +	register_smp_ops(&jz4780_smp_ops);
> +}
> diff --git a/arch/mips/kernel/idle.c b/arch/mips/kernel/idle.c
> index 37f8e78..a406de3 100644
> --- a/arch/mips/kernel/idle.c
> +++ b/arch/mips/kernel/idle.c
> @@ -19,6 +19,10 @@
>  #include <asm/idle.h>
>  #include <asm/mipsregs.h>
> 
> +#ifdef CONFIG_MACH_JZ4780
> +# include <asm/mach-jz4740/jz4780-smp.h>
> +#endif
> +
>  /*
>   * Not all of the MIPS CPUs have the "wait" instruction available. 
> Moreover,
>   * the implementation of the "wait" feature differs between CPU 
> families. This
> @@ -172,7 +176,6 @@ void __init check_wait(void)
>  	case CPU_CAVIUM_OCTEON_PLUS:
>  	case CPU_CAVIUM_OCTEON2:
>  	case CPU_CAVIUM_OCTEON3:
> -	case CPU_XBURST:
>  	case CPU_LOONGSON32:
>  	case CPU_XLR:
>  	case CPU_XLP:
> @@ -246,6 +249,15 @@ void __init check_wait(void)
>  		   cpu_wait = r4k_wait;
>  		 */
>  		break;
> +	case CPU_XBURST:
> +#if defined(CONFIG_MACH_JZ4780) && defined(CONFIG_SMP)
> +		if (NR_CPUS > 1)
> +			cpu_wait = jz4780_smp_wait_irqoff;
> +		else
> +			cpu_wait = r4k_wait;
> +#else
> +		cpu_wait = r4k_wait;
> +#endif
>  	default:
>  		break;
>  	}
> diff --git a/drivers/clk/ingenic/jz4780-cgu.c 
> b/drivers/clk/ingenic/jz4780-cgu.c
> index d07fff1..4f81819 100644
> --- a/drivers/clk/ingenic/jz4780-cgu.c
> +++ b/drivers/clk/ingenic/jz4780-cgu.c
> @@ -16,7 +16,7 @@
> 
>  /* CGU register offsets */
>  #define CGU_REG_CLOCKCONTROL	0x00
> -#define CGU_REG_PLLCONTROL	0x0c
> +#define CGU_REG_LCR			0x04
>  #define CGU_REG_APLL		0x10
>  #define CGU_REG_MPLL		0x14
>  #define CGU_REG_EPLL		0x18
> @@ -46,8 +46,8 @@
>  #define CGU_REG_CLOCKSTATUS	0xd4
> 
>  /* bits within the OPCR register */
> -#define OPCR_SPENDN0		(1 << 7)
> -#define OPCR_SPENDN1		(1 << 6)
> +#define OPCR_SPENDN0		BIT(7)
> +#define OPCR_SPENDN1		BIT(6)
> 
>  /* bits within the USBPCR register */
>  #define USBPCR_USB_MODE		BIT(31)
> @@ -88,6 +88,13 @@
>  #define USBVBFIL_IDDIGFIL_MASK	(0xffff << USBVBFIL_IDDIGFIL_SHIFT)
>  #define USBVBFIL_USBVBFIL_MASK	(0xffff)
> 
> +/* bits within the LCR register */
> +#define LCR_PD_SCPU			BIT(31)
> +#define LCR_SCPUS			BIT(27)
> +
> +/* bits within the CLKGR1 register */
> +#define CLKGR1_CORE1		BIT(15)
> +
>  static struct ingenic_cgu *cgu;
> 
>  static u8 jz4780_otg_phy_get_parent(struct clk_hw *hw)
> @@ -205,6 +212,47 @@ static const struct clk_ops jz4780_otg_phy_ops = 
> {
>  	.set_rate = jz4780_otg_phy_set_rate,
>  };
> 
> +static int jz4780_core1_enable(struct clk_hw *hw)
> +{
> +	struct ingenic_clk *ingenic_clk = to_ingenic_clk(hw);
> +	struct ingenic_cgu *cgu = ingenic_clk->cgu;
> +	const unsigned int timeout = 100;
> +	unsigned long flags;
> +	unsigned int i;
> +	u32 lcr, clkgr1;
> +
> +	spin_lock_irqsave(&cgu->lock, flags);
> +
> +	lcr = readl(cgu->base + CGU_REG_LCR);
> +	lcr &= ~LCR_PD_SCPU;
> +	writel(lcr, cgu->base + CGU_REG_LCR);
> +
> +	clkgr1 = readl(cgu->base + CGU_REG_CLKGR1);
> +	clkgr1 &= ~CLKGR1_CORE1;
> +	writel(clkgr1, cgu->base + CGU_REG_CLKGR1);
> +
> +	spin_unlock_irqrestore(&cgu->lock, flags);
> +
> +	/* wait for the CPU to be powered up */
> +	for (i = 0; i < timeout; i++) {
> +		lcr = readl(cgu->base + CGU_REG_LCR);
> +		if (!(lcr & LCR_SCPUS))
> +			break;
> +		mdelay(1);
> +	}
> +
> +	if (i == timeout) {
> +		pr_err("%s: Wait for power up core1 timeout\n", __func__);
> +		return -EBUSY;
> +	}
> +
> +	return 0;
> +}
> +
> +static const struct clk_ops jz4780_core1_ops = {
> +	.enable = jz4780_core1_enable,
> +};
> +
>  static const s8 pll_od_encoding[16] = {
>  	0x0, 0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
>  	0x8, 0x9, 0xa, 0xb, 0xc, 0xd, 0xe, 0xf,
> @@ -701,9 +749,9 @@ static const struct ingenic_cgu_clk_info 
> jz4780_cgu_clocks[] = {
>  	},
> 
>  	[JZ4780_CLK_CORE1] = {
> -		"core1", CGU_CLK_GATE,
> +		"core1", CGU_CLK_CUSTOM,
>  		.parents = { JZ4780_CLK_CPU, -1, -1, -1 },
> -		.gate = { CGU_REG_CLKGR1, 15 },
> +		.custom = { &jz4780_core1_ops },
>  	},
> 
>  };
> --
> 2.7.4
> 



^ permalink raw reply

* [PATCH v2 0/15] combining object filters and bitmaps
From: Jeff King @ 2020-02-14 18:21 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano
In-Reply-To: <20200213021506.GA1124607@coredump.intra.peff.net>

Here's a v2 based on the early feedback from Junio, and a few things I
noticed while addressing it:

 - patch 4 now moves the check for revs.prune into the pack-bitmap code,
   rather than rearranging it within rev-list.

 - I noticed some more opportunities to reuse the "compare the results
   of a bitmap and non-bitmap traversal" code, so I factored that out
   into its own commit (patch 8)

 - I realized another long-standing annoyance is simple to fix: before
   we did not know how to do "rev-list --use-bitmap-index" without
   "--objects", but it was pretty easy to support. That's patch 9.

The range diff below is a little noisy because of the ripple effects of
some of those changes. If you just want to review the substantive
changes, look at patches 4, 8, and 9.

  [01/15]: pack-bitmap: factor out type iterator initialization
  [02/15]: pack-bitmap: fix leak of haves/wants object lists
  [03/15]: rev-list: fallback to non-bitmap traversal when filtering
  [04/15]: pack-bitmap: refuse to do a bitmap traversal with pathspecs
  [05/15]: rev-list: factor out bitmap-optimized routines
  [06/15]: rev-list: make --count work with --objects
  [07/15]: rev-list: allow bitmaps when counting objects
  [08/15]: t5310: factor out bitmap traversal comparison
  [09/15]: rev-list: allow commit-only bitmap traversals
  [10/15]: pack-bitmap: basic noop bitmap filter infrastructure
  [11/15]: rev-list: use bitmap filters for traversal
  [12/15]: bitmap: add bitmap_unset() function
  [13/15]: pack-bitmap: implement BLOB_NONE filtering
  [14/15]: pack-bitmap: implement BLOB_LIMIT filtering
  [15/15]: pack-objects: support filters with bitmaps

 builtin/pack-objects.c             |   6 +-
 builtin/rev-list.c                 | 114 +++++++++---
 ewah/bitmap.c                      |   8 +
 ewah/ewok.h                        |   1 +
 object.c                           |   9 +
 object.h                           |   2 +
 pack-bitmap.c                      | 272 +++++++++++++++++++++++++----
 pack-bitmap.h                      |   5 +-
 reachable.c                        |   4 +-
 t/perf/p5310-pack-bitmaps.sh       |  22 +++
 t/t5310-pack-bitmaps.sh            |  36 +++-
 t/t6000-rev-list-misc.sh           |  12 ++
 t/t6113-rev-list-bitmap-filters.sh |  56 ++++++
 t/test-lib-functions.sh            |  27 +++
 14 files changed, 504 insertions(+), 70 deletions(-)
 create mode 100755 t/t6113-rev-list-bitmap-filters.sh

Range-diff from v1:

 1:  283874f2ac =  1:  4424d745de pack-bitmap: factor out type iterator initialization
 2:  68e7bed41c =  2:  bc0576868f pack-bitmap: fix leak of haves/wants object lists
 3:  1182593fb2 =  3:  9d2ef09b9b rev-list: fallback to non-bitmap traversal when filtering
 4:  03b20e9b66 <  -:  ---------- rev-list: consolidate bitmap-disabling options
 -:  ---------- >  4:  ce7969221a pack-bitmap: refuse to do a bitmap traversal with pathspecs
 5:  91032b0d78 =  5:  a9a5513bb0 rev-list: factor out bitmap-optimized routines
 6:  c7c3a31cf3 !  6:  eab5dd34bc rev-list: make --count work with --objects
    @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *pr
     +	    (revs.left_right || revs.cherry_mark))
     +		die(_("marked counting is incompatible with --objects"));
     +
    - 	if (filter_options.choice || revs.prune)
    + 	if (filter_options.choice)
      		use_bitmap_index = 0;
      
     
 7:  0aabf6262b =  7:  04af6b9362 rev-list: allow bitmaps when counting objects
 -:  ---------- >  8:  4b93d064b1 t5310: factor out bitmap traversal comparison
 -:  ---------- >  9:  d89509fa8f rev-list: allow commit-only bitmap traversals
 8:  969238ec42 ! 10:  c64d4db8e8 pack-bitmap: basic noop bitmap filter infrastructure
    @@ builtin/rev-list.c: static int try_bitmap_count(struct rev_info *revs)
      		return -1;
      
     @@ builtin/rev-list.c: static int try_bitmap_traversal(struct rev_info *revs)
    - 	if (!revs->tag_objects || !revs->tree_objects || !revs->blob_objects)
    + 	if (revs->max_count >= 0)
      		return -1;
      
     -	bitmap_git = prepare_bitmap_walk(revs);
    @@ pack-bitmap.c: static int in_bitmapped_pack(struct bitmap_index *bitmap_git,
      	unsigned int i;
      
     @@ pack-bitmap.c: struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs)
    - 	struct bitmap *wants_bitmap = NULL;
    - 	struct bitmap *haves_bitmap = NULL;
    + 	if (revs->prune)
    + 		return NULL;
      
    --	struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git));
    -+	struct bitmap_index *bitmap_git;
    -+
     +	if (!can_filter_bitmap(filter))
     +		return NULL;
     +
      	/* try to open a bitmapped pack, but don't parse it yet
      	 * because we may not need to use it */
    -+	bitmap_git = xcalloc(1, sizeof(*bitmap_git));
    - 	if (open_pack_bitmap(revs->repo, bitmap_git) < 0)
    - 		goto cleanup;
    - 
    + 	bitmap_git = xcalloc(1, sizeof(*bitmap_git));
     @@ pack-bitmap.c: struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs)
      	if (haves_bitmap)
      		bitmap_and_not(wants_bitmap, haves_bitmap);
    @@ pack-bitmap.h
      
      static const char BITMAP_IDX_SIGNATURE[] = {'B', 'I', 'T', 'M'};
      
    -@@ pack-bitmap.h: void count_bitmap_commit_list(struct bitmap_index *, uint32_t *commits,
    - void traverse_bitmap_commit_list(struct bitmap_index *,
    +@@ pack-bitmap.h: void traverse_bitmap_commit_list(struct bitmap_index *,
    + 				 struct rev_info *revs,
      				 show_reachable_fn show_reachable);
      void test_bitmap_walk(struct rev_info *revs);
     -struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs);
    @@ reachable.c: void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
     -	bitmap_git = prepare_bitmap_walk(revs);
     +	bitmap_git = prepare_bitmap_walk(revs, NULL);
      	if (bitmap_git) {
    - 		traverse_bitmap_commit_list(bitmap_git, mark_object_seen);
    + 		traverse_bitmap_commit_list(bitmap_git, revs, mark_object_seen);
      		free_bitmap_index(bitmap_git);
 9:  b4eb8fcb9d ! 11:  caa69ac535 rev-list: use bitmap filters for traversal
    @@ builtin/rev-list.c: static int try_bitmap_count(struct rev_info *revs)
      	struct bitmap_index *bitmap_git;
      
     @@ builtin/rev-list.c: static int try_bitmap_traversal(struct rev_info *revs)
    - 	if (!revs->tag_objects || !revs->tree_objects || !revs->blob_objects)
    + 	if (revs->max_count >= 0)
      		return -1;
      
     -	bitmap_git = prepare_bitmap_walk(revs, NULL);
    @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *pr
      	    (revs.left_right || revs.cherry_mark))
      		die(_("marked counting is incompatible with --objects"));
      
    --	if (filter_options.choice || revs.prune)
    -+	if (revs.prune)
    - 		use_bitmap_index = 0;
    - 
    +-	if (filter_options.choice)
    +-		use_bitmap_index = 0;
    +-
      	save_commit_buffer = (revs.verbose_header ||
    + 			      revs.grep_filter.pattern_list ||
    + 			      revs.grep_filter.header_list);
     @@ builtin/rev-list.c: int cmd_rev_list(int argc, const char **argv, const char *prefix)
      		progress = start_delayed_progress(show_progress, 0);
      
10:  b518070c12 = 12:  d3e7462138 bitmap: add bitmap_unset() function
11:  438c21082e ! 13:  07ee699520 pack-bitmap: implement BLOB_NONE filtering
    @@ Commit message
             bitmaps is about the same as our loop here, but it might make the
             code a bit simpler).
     
    -    The regression tests just compare the bitmap output to the non-bitmap
    -    output. Since the code internally falls back to the non-bitmap path in
    -    some cases, this is at risk of being a trivial noop. To combat this,
    -    we'll check for small differences between the two outputs (see the
    -    comment in the test). This is a little fragile, as it's possible that we
    -    may teach the fallback path for --use-bitmap-index to produce
    -    bitmap-like output in the future. But the exact ordering of objects
    -    would likely be different in such a case, anyway.
    +    Here are perf results for the new test on git.git:
     
    -    Plus we'd catch an unexpected fallback when running the perf suite, as
    -    things would get slower. Here's what it looks like now (on git.git):
    -
    -    Test                                    HEAD^             HEAD
    -    --------------------------------------------------------------------------------
    -    5310.7: rev-list count with blob:none   1.67(1.62+0.05)   0.22(0.21+0.02) -86.8%
    +      Test                                    HEAD^             HEAD
    +      --------------------------------------------------------------------------------
    +      5310.9: rev-list count with blob:none   1.67(1.62+0.05)   0.22(0.21+0.02) -86.8%
     
      ## pack-bitmap.c ##
     @@ pack-bitmap.c: static int in_bitmapped_pack(struct bitmap_index *bitmap_git,
    @@ pack-bitmap.c: static int filter_bitmap(struct bitmap_index *bitmap_git,
      }
     
      ## t/perf/p5310-pack-bitmaps.sh ##
    -@@ t/perf/p5310-pack-bitmaps.sh: test_perf 'pack to file (bitmap)' '
    - 	git pack-objects --use-bitmap-index --all pack1b </dev/null >/dev/null
    +@@ t/perf/p5310-pack-bitmaps.sh: test_perf 'rev-list (objects)' '
    + 	git rev-list --all --use-bitmap-index --objects >/dev/null
      '
      
     +test_perf 'rev-list count with blob:none' '
    @@ t/t6113-rev-list-bitmap-filters.sh: test_expect_success 'filters fallback to non
      	test_cmp expect actual
      '
      
    -+# the bitmap and regular traversals produce subtly different output:
    -+#
    -+#   - regular output is in traversal order, whereas bitmap is split by type,
    -+#     with non-packed objects at the end
    -+#
    -+#   - regular output has a space and the pathname appended to non-commit
    -+#     objects; bitmap output omits this
    -+#
    -+# Normalize and compare the two. The second argument should always be the
    -+# bitmap output.
    -+cmp_bitmap_traversal () {
    -+	if cmp "$1" "$2"
    -+	then
    -+		echo >&2 "identical raw outputs; are you sure bitmaps were used?"
    -+		return 1
    -+	fi &&
    -+	cut -d' ' -f1 "$1" | sort >"$1.normalized" &&
    -+	sort "$2" >"$2.normalized" &&
    -+	test_cmp "$1.normalized" "$2.normalized"
    -+}
    -+
     +test_expect_success 'blob:none filter' '
     +	git rev-list --objects --filter=blob:none HEAD >expect &&
     +	git rev-list --use-bitmap-index \
     +		     --objects --filter=blob:none HEAD >actual &&
    -+	cmp_bitmap_traversal expect actual
    ++	test_bitmap_traversal expect actual
     +'
     +
     +test_expect_success 'blob:none filter with specified blob' '
     +	git rev-list --objects --filter=blob:none HEAD HEAD:two.t >expect &&
     +	git rev-list --use-bitmap-index \
     +		     --objects --filter=blob:none HEAD HEAD:two.t >actual &&
    -+	cmp_bitmap_traversal expect actual
    ++	test_bitmap_traversal expect actual
     +'
     +
      test_done
12:  05cd57f0ba ! 14:  f35fab09a0 pack-bitmap: implement BLOB_LIMIT filtering
    @@ Commit message
         than BLOB_NONE, but still produces a noticeable speedup (these results
         are on git.git):
     
    -      Test                                        HEAD~2            HEAD
    +      Test                                         HEAD~2            HEAD
           ------------------------------------------------------------------------------------
    -      5310.7: rev-list count with blob:none       1.80(1.77+0.02)   0.22(0.20+0.02) -87.8%
    -      5310.8: rev-list count with blob:limit=1k   1.99(1.96+0.03)   0.29(0.25+0.03) -85.4%
    +      5310.9:  rev-list count with blob:none       1.80(1.77+0.02)   0.22(0.20+0.02) -87.8%
    +      5310.10: rev-list count with blob:limit=1k   1.99(1.96+0.03)   0.29(0.25+0.03) -85.4%
     
         The implementation is similar to the BLOB_NONE one, with the exception
         that we have to go object-by-object while walking the blob-type bitmap
    @@ t/t6113-rev-list-bitmap-filters.sh: test_description='rev-list combining bitmaps
      
      test_expect_success 'filters fallback to non-bitmap traversal' '
     @@ t/t6113-rev-list-bitmap-filters.sh: test_expect_success 'blob:none filter with specified blob' '
    - 	cmp_bitmap_traversal expect actual
    + 	test_bitmap_traversal expect actual
      '
      
     +test_expect_success 'blob:limit filter' '
     +	git rev-list --objects --filter=blob:limit=5 HEAD >expect &&
     +	git rev-list --use-bitmap-index \
     +		     --objects --filter=blob:limit=5 HEAD >actual &&
    -+	cmp_bitmap_traversal expect actual
    ++	test_bitmap_traversal expect actual
     +'
     +
     +test_expect_success 'blob:limit filter with specified blob' '
    @@ t/t6113-rev-list-bitmap-filters.sh: test_expect_success 'blob:none filter with s
     +	git rev-list --use-bitmap-index \
     +		     --objects --filter=blob:limit=5 \
     +		     HEAD HEAD:much-larger-blob-two.t >actual &&
    -+	cmp_bitmap_traversal expect actual
    ++	test_bitmap_traversal expect actual
     +'
     +
      test_done
13:  1e5befb73a ! 15:  797dbfe28f pack-objects: support filters with bitmaps
    @@ Commit message
         This unsurprisingly makes things faster for partial clones of large
         repositories (here we're cloning linux.git):
     
    -      Test                              HEAD^               HEAD
    +      Test                               HEAD^               HEAD
           ------------------------------------------------------------------------------
    -      5310.9: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%
    +      5310.11: simulated partial clone   38.94(37.28+5.87)   11.06(11.27+4.07) -71.6%
     
      ## builtin/pack-objects.c ##
     @@ builtin/pack-objects.c: static int pack_options_allow_reuse(void)

^ permalink raw reply

* [PATCH AUTOSEL 5.5 031/542] leds: pca963x: Fix open-drain initialization
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: Zahari Petkov, Pavel Machek, Sasha Levin, linux-leds
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Zahari Petkov <zahari@balena.io>

[ Upstream commit 697529091ac7a0a90ca349b914bb30641c13c753 ]

Before commit bb29b9cccd95 ("leds: pca963x: Add bindings to invert
polarity") Mode register 2 was initialized directly with either 0x01
or 0x05 for open-drain or totem pole (push-pull) configuration.

Afterwards, MODE2 initialization started using bitwise operations on
top of the default MODE2 register value (0x05). Using bitwise OR for
setting OUTDRV with 0x01 and 0x05 does not produce correct results.
When open-drain is used, instead of setting OUTDRV to 0, the driver
keeps it as 1:

Open-drain: 0x05 | 0x01 -> 0x05 (0b101 - incorrect)
Totem pole: 0x05 | 0x05 -> 0x05 (0b101 - correct but still wrong)

Now OUTDRV setting uses correct bitwise operations for initialization:

Open-drain: 0x05 & ~0x04 -> 0x01 (0b001 - correct)
Totem pole: 0x05 | 0x04 -> 0x05 (0b101 - correct)

Additional MODE2 register definitions are introduced now as well.

Fixes: bb29b9cccd95 ("leds: pca963x: Add bindings to invert polarity")
Signed-off-by: Zahari Petkov <zahari@balena.io>
Signed-off-by: Pavel Machek <pavel@ucw.cz>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/leds/leds-pca963x.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/leds/leds-pca963x.c b/drivers/leds/leds-pca963x.c
index 4afc317901a89..66cdc003b8f42 100644
--- a/drivers/leds/leds-pca963x.c
+++ b/drivers/leds/leds-pca963x.c
@@ -40,6 +40,8 @@
 #define PCA963X_LED_PWM		0x2	/* Controlled through PWM */
 #define PCA963X_LED_GRP_PWM	0x3	/* Controlled through PWM/GRPPWM */
 
+#define PCA963X_MODE2_OUTDRV	0x04	/* Open-drain or totem pole */
+#define PCA963X_MODE2_INVRT	0x10	/* Normal or inverted direction */
 #define PCA963X_MODE2_DMBLNK	0x20	/* Enable blinking */
 
 #define PCA963X_MODE1		0x00
@@ -438,12 +440,12 @@ static int pca963x_probe(struct i2c_client *client,
 						    PCA963X_MODE2);
 		/* Configure output: open-drain or totem pole (push-pull) */
 		if (pdata->outdrv == PCA963X_OPEN_DRAIN)
-			mode2 |= 0x01;
+			mode2 &= ~PCA963X_MODE2_OUTDRV;
 		else
-			mode2 |= 0x05;
+			mode2 |= PCA963X_MODE2_OUTDRV;
 		/* Configure direction: normal or inverted */
 		if (pdata->dir == PCA963X_INVERTED)
-			mode2 |= 0x10;
+			mode2 |= PCA963X_MODE2_INVRT;
 		i2c_smbus_write_byte_data(pca963x->chip->client, PCA963X_MODE2,
 					  mode2);
 	}
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 032/542] ext4: fix ext4_dax_read/write inode locking sequence for IOCB_NOWAIT
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Ritesh Harjani, Jan Kara, Matthew Bobrowski, Joseph Qi,
	Theodore Ts'o, Sasha Levin, linux-ext4
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Ritesh Harjani <riteshh@linux.ibm.com>

[ Upstream commit f629afe3369e9885fd6e9cc7a4f514b6a65cf9e9 ]

Apparently our current rwsem code doesn't like doing the trylock, then
lock for real scheme.  So change our dax read/write methods to just do the
trylock for the RWF_NOWAIT case.
This seems to fix AIM7 regression in some scalable filesystems upto ~25%
in some cases. Claimed in commit 942491c9e6d6 ("xfs: fix AIM7 regression")

Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Matthew Bobrowski <mbobrowski@mbobrowski.org>
Tested-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com>
Link: https://lore.kernel.org/r/20191212055557.11151-2-riteshh@linux.ibm.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/ext4/file.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 6a7293a5cda2d..977ac58dc718d 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -88,9 +88,10 @@ static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	struct inode *inode = file_inode(iocb->ki_filp);
 	ssize_t ret;
 
-	if (!inode_trylock_shared(inode)) {
-		if (iocb->ki_flags & IOCB_NOWAIT)
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock_shared(inode))
 			return -EAGAIN;
+	} else {
 		inode_lock_shared(inode);
 	}
 	/*
@@ -487,9 +488,10 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from)
 	bool extend = false;
 	struct inode *inode = file_inode(iocb->ki_filp);
 
-	if (!inode_trylock(inode)) {
-		if (iocb->ki_flags & IOCB_NOWAIT)
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock(inode))
 			return -EAGAIN;
+	} else {
 		inode_lock(inode);
 	}
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 035/542] pinctrl: sh-pfc: sh7264: Fix CAN function GPIOs
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Geert Uytterhoeven, Sasha Levin, linux-renesas-soc, linux-gpio
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit 55b1cb1f03ad5eea39897d0c74035e02deddcff2 ]

pinmux_func_gpios[] contains a hole due to the missing function GPIO
definition for the "CTX0&CTX1" signal, which is the logical "AND" of the
two CAN outputs.

Fix this by:
  - Renaming CRX0_CRX1_MARK to CTX0_CTX1_MARK, as PJ2MD[2:0]=010
    configures the combined "CTX0&CTX1" output signal,
  - Renaming CRX0X1_MARK to CRX0_CRX1_MARK, as PJ3MD[1:0]=10 configures
    the shared "CRX0/CRX1" input signal, which is fed to both CAN
    inputs,
  - Adding the missing function GPIO definition for "CTX0&CTX1" to
    pinmux_func_gpios[],
  - Moving all CAN enums next to each other.

See SH7262 Group, SH7264 Group User's Manual: Hardware, Rev. 4.00:
  [1] Figure 1.2 (3) (Pin Assignment for the SH7264 Group (1-Mbyte
      Version),
  [2] Figure 1.2 (4) Pin Assignment for the SH7264 Group (640-Kbyte
      Version,
  [3] Table 1.4 List of Pins,
  [4] Figure 20.29 Connection Example when Using This Module as 1-Channel
      Module (64 Mailboxes x 1 Channel),
  [5] Table 32.10 Multiplexed Pins (Port J),
  [6] Section 32.2.30 (3) Port J Control Register 0 (PJCR0).

Note that the last 2 disagree about PJ2MD[2:0], which is probably the
root cause of this bug.  But considering [4], "CTx0&CTx1" in [5] must
be correct, and "CRx0&CRx1" in [6] must be wrong.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20191218194812.12741-4-geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/pinctrl/sh-pfc/pfc-sh7264.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/drivers/pinctrl/sh-pfc/pfc-sh7264.c b/drivers/pinctrl/sh-pfc/pfc-sh7264.c
index 4a95867deb8af..5a026601d4f9a 100644
--- a/drivers/pinctrl/sh-pfc/pfc-sh7264.c
+++ b/drivers/pinctrl/sh-pfc/pfc-sh7264.c
@@ -497,17 +497,15 @@ enum {
 	SD_WP_MARK, SD_CLK_MARK, SD_CMD_MARK,
 	CRX0_MARK, CRX1_MARK,
 	CTX0_MARK, CTX1_MARK,
+	CRX0_CRX1_MARK, CTX0_CTX1_MARK,
 
 	PWM1A_MARK, PWM1B_MARK, PWM1C_MARK, PWM1D_MARK,
 	PWM1E_MARK, PWM1F_MARK, PWM1G_MARK, PWM1H_MARK,
 	PWM2A_MARK, PWM2B_MARK, PWM2C_MARK, PWM2D_MARK,
 	PWM2E_MARK, PWM2F_MARK, PWM2G_MARK, PWM2H_MARK,
 	IERXD_MARK, IETXD_MARK,
-	CRX0_CRX1_MARK,
 	WDTOVF_MARK,
 
-	CRX0X1_MARK,
-
 	/* DMAC */
 	TEND0_MARK, DACK0_MARK, DREQ0_MARK,
 	TEND1_MARK, DACK1_MARK, DREQ1_MARK,
@@ -995,12 +993,12 @@ static const u16 pinmux_data[] = {
 
 	PINMUX_DATA(PJ3_DATA, PJ3MD_00),
 	PINMUX_DATA(CRX1_MARK, PJ3MD_01),
-	PINMUX_DATA(CRX0X1_MARK, PJ3MD_10),
+	PINMUX_DATA(CRX0_CRX1_MARK, PJ3MD_10),
 	PINMUX_DATA(IRQ1_PJ_MARK, PJ3MD_11),
 
 	PINMUX_DATA(PJ2_DATA, PJ2MD_000),
 	PINMUX_DATA(CTX1_MARK, PJ2MD_001),
-	PINMUX_DATA(CRX0_CRX1_MARK, PJ2MD_010),
+	PINMUX_DATA(CTX0_CTX1_MARK, PJ2MD_010),
 	PINMUX_DATA(CS2_MARK, PJ2MD_011),
 	PINMUX_DATA(SCK0_MARK, PJ2MD_100),
 	PINMUX_DATA(LCD_M_DISP_MARK, PJ2MD_101),
@@ -1245,6 +1243,7 @@ static const struct pinmux_func pinmux_func_gpios[] = {
 	GPIO_FN(CTX1),
 	GPIO_FN(CRX1),
 	GPIO_FN(CTX0),
+	GPIO_FN(CTX0_CTX1),
 	GPIO_FN(CRX0),
 	GPIO_FN(CRX0_CRX1),
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 037/542] drm/mipi_dbi: Fix off-by-one bugs in mipi_dbi_blank()
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Geert Uytterhoeven, Noralf Trønnes, Sasha Levin, dri-devel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Geert Uytterhoeven <geert+renesas@glider.be>

[ Upstream commit 2ce18249af5a28031b3f909cfafccc88ea966c9d ]

When configuring the frame memory window, the last column and row
numbers are written to the column resp. page address registers.  These
numbers are thus one less than the actual window width resp. height.

While this is handled correctly in mipi_dbi_fb_dirty() since commit
03ceb1c8dfd1e293 ("drm/tinydrm: Fix setting of the column/page end
addresses."), it is not in mipi_dbi_blank().  The latter still forgets
to subtract one when calculating the most significant bytes of the
column and row numbers, thus programming wrong values when the display
width or height is a multiple of 256.

Fixes: 02dd95fe31693626 ("drm/tinydrm: Add MIPI DBI support")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Signed-off-by: Noralf Trønnes <noralf@tronnes.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191230130604.31006-1-geert+renesas@glider.be
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/drm_mipi_dbi.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/drm_mipi_dbi.c b/drivers/gpu/drm/drm_mipi_dbi.c
index e34058c721bec..16bff1be4b8ac 100644
--- a/drivers/gpu/drm/drm_mipi_dbi.c
+++ b/drivers/gpu/drm/drm_mipi_dbi.c
@@ -367,9 +367,9 @@ static void mipi_dbi_blank(struct mipi_dbi_dev *dbidev)
 	memset(dbidev->tx_buf, 0, len);
 
 	mipi_dbi_command(dbi, MIPI_DCS_SET_COLUMN_ADDRESS, 0, 0,
-			 (width >> 8) & 0xFF, (width - 1) & 0xFF);
+			 ((width - 1) >> 8) & 0xFF, (width - 1) & 0xFF);
 	mipi_dbi_command(dbi, MIPI_DCS_SET_PAGE_ADDRESS, 0, 0,
-			 (height >> 8) & 0xFF, (height - 1) & 0xFF);
+			 ((height - 1) >> 8) & 0xFF, (height - 1) & 0xFF);
 	mipi_dbi_command_buf(dbi, MIPI_DCS_WRITE_MEMORY_START,
 			     (u8 *)dbidev->tx_buf, len);
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 038/542] drm/msm/adreno: fix zap vs no-zap handling
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Rob Clark, Sasha Levin, linux-arm-msm, dri-devel, freedreno
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Rob Clark <robdclark@chromium.org>

[ Upstream commit 15ab987c423df561e0949d77fb5043921ae59956 ]

We can have two cases, when it comes to "zap" fw.  Either the fw
requires zap fw to take the GPU out of secure mode at boot, or it does
not and we can write RBBM_SECVID_TRUST_CNTL directly.  Previously we
decided based on whether zap fw load succeeded, but this is not a great
plan because:

1) we could have zap fw in the filesystem on a device where it is not
   required
2) we could have the inverse case

Instead, shift to deciding based on whether we have a 'zap-shader' node
in dt.  In practice, there is only one device (currently) with upstream
dt that does not use zap (cheza), and it already has a /delete-node/ for
the zap-shader node.

Fixes: abccb9fe3267 ("drm/msm/a6xx: Add zap shader load")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c | 11 +++++++++--
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 11 +++++++++--
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
index b02e2042547f6..7d9e63e20dedd 100644
--- a/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a5xx_gpu.c
@@ -753,11 +753,18 @@ static int a5xx_hw_init(struct msm_gpu *gpu)
 		gpu->funcs->flush(gpu, gpu->rb[0]);
 		if (!a5xx_idle(gpu, gpu->rb[0]))
 			return -EINVAL;
-	} else {
-		/* Print a warning so if we die, we know why */
+	} else if (ret == -ENODEV) {
+		/*
+		 * This device does not use zap shader (but print a warning
+		 * just in case someone got their dt wrong.. hopefully they
+		 * have a debug UART to realize the error of their ways...
+		 * if you mess this up you are about to crash horribly)
+		 */
 		dev_warn_once(gpu->dev->dev,
 			"Zap shader not enabled - using SECVID_TRUST_CNTL instead\n");
 		gpu_write(gpu, REG_A5XX_RBBM_SECVID_TRUST_CNTL, 0x0);
+	} else {
+		return ret;
 	}
 
 	/* Last step - yield the ringbuffer */
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index dc8ec2c94301b..686c34d706b0d 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -537,12 +537,19 @@ static int a6xx_hw_init(struct msm_gpu *gpu)
 		a6xx_flush(gpu, gpu->rb[0]);
 		if (!a6xx_idle(gpu, gpu->rb[0]))
 			return -EINVAL;
-	} else {
-		/* Print a warning so if we die, we know why */
+	} else if (ret == -ENODEV) {
+		/*
+		 * This device does not use zap shader (but print a warning
+		 * just in case someone got their dt wrong.. hopefully they
+		 * have a debug UART to realize the error of their ways...
+		 * if you mess this up you are about to crash horribly)
+		 */
 		dev_warn_once(gpu->dev->dev,
 			"Zap shader not enabled - using SECVID_TRUST_CNTL instead\n");
 		gpu_write(gpu, REG_A6XX_RBBM_SECVID_TRUST_CNTL, 0x0);
 		ret = 0;
+	} else {
+		return ret;
 	}
 
 out:
-- 
2.20.1


^ permalink raw reply related

* [PATCH RFC 0/9] Address bugzilla 198053 and more ...
From: Chuck Lever @ 2020-02-14 15:49 UTC (permalink / raw)
  To: bfields; +Cc: linux-rdma, linux-nfs

Hi Bruce-

As promised, I'm resending the fix for 198053, now that the v5.6
merge window has closed. This fix gets splice-incapable file systems
working with NFS/RDMA. That's the first patch in this series.

We can discuss splitting the fix up again, if you so desire, but my
sense is that will make the fix more challenging to backport into
stable kernels.

The next logical step is to add support for multiple READ payloads
to the server's RPC-over-RDMA transport implementation. Subsequent
patches in this series start down that path. There is more work to
do to finish that task. Today I'm sending only what is code-complete
and working.

The primary issue is that today svcrdma assumes that rq_res's page
vector is exactly what needs to be pushed in a single Write chunk.
In other words, only one read payload is supported, and it has to
fit exactly into that page vector. And critically, the XDR pad for
that payload must not be included in the page vector.

I've already implemented changes to handle Writing more than one
chunk back to a client. See patches 4 and 7.

Patch 9 introduces a data structure to keep track of multiple Write
chunks and multiple read payloads. Next, the svc_rdma_sendto path
needs to be changed to use the information in this data structure to
exclude arbitrary segments of rq_res (ie, read payloads already sent
via explicit RDMA) when constructing each RPC/RDMA Reply.

Comments and input are welcome as always.


---

Chuck Lever (9):
      nfsd: Fix NFSv4 READ on RDMA when using readv
      NFSD: Clean up nfsd4_encode_readv
      svcrdma: Avoid DMA mapping small RPC Replies
      NFSD: Invoke svc_encode_read_payload in "read" NFSD encoders
      svcrdma: Add trace point to examine client-provided write segment
      svcrdma: De-duplicate code that locates Write and Reply chunks
      svcrdma: Post RDMA Writes while XDR encoding replies
      svcrdma: Refactor svc_rdma_sendto()
      svcrdma: Add data structure to track READ payloads


 fs/nfsd/nfs3xdr.c                          |    4 
 fs/nfsd/nfs4xdr.c                          |   32 ++--
 fs/nfsd/nfsxdr.c                           |    4 
 include/linux/sunrpc/svc.h                 |    3 
 include/linux/sunrpc/svc_rdma.h            |   21 ++
 include/linux/sunrpc/svc_xprt.h            |    2 
 include/trace/events/rpcrdma.h             |   47 +++++
 net/sunrpc/svc.c                           |   16 ++
 net/sunrpc/svcsock.c                       |    8 +
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |    2 
 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c    |   58 +++++--
 net/sunrpc/xprtrdma/svc_rdma_rw.c          |   42 +++--
 net/sunrpc/xprtrdma/svc_rdma_sendto.c      |  248 +++++++++++++---------------
 net/sunrpc/xprtrdma/svc_rdma_transport.c   |    1 
 14 files changed, 308 insertions(+), 180 deletions(-)

--
Chuck Lever

^ permalink raw reply

* [PATCH AUTOSEL 5.5 039/542] pxa168fb: Fix the function used to release some memory in an error handling path
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Christophe JAILLET, Lubomir Rintel, YueHaibing,
	Bartlomiej Zolnierkiewicz, Sasha Levin, dri-devel, linux-fbdev
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Christophe JAILLET <christophe.jaillet@wanadoo.fr>

[ Upstream commit 3c911fe799d1c338d94b78e7182ad452c37af897 ]

In the probe function, some resources are allocated using 'dma_alloc_wc()',
they should be released with 'dma_free_wc()', not 'dma_free_coherent()'.

We already use 'dma_free_wc()' in the remove function, but not in the
error handling path of the probe function.

Also, remove a useless 'PAGE_ALIGN()'. 'info->fix.smem_len' is already
PAGE_ALIGNed.

Fixes: 638772c7553f ("fb: add support of LCD display controller on pxa168/910 (base layer)")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Lubomir Rintel <lkundrak@v3.sk>
CC: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Bartlomiej Zolnierkiewicz <b.zolnierkie@samsung.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190831100024.3248-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/video/fbdev/pxa168fb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/video/fbdev/pxa168fb.c b/drivers/video/fbdev/pxa168fb.c
index 1410f476e135d..1fc50fc0694bc 100644
--- a/drivers/video/fbdev/pxa168fb.c
+++ b/drivers/video/fbdev/pxa168fb.c
@@ -766,8 +766,8 @@ static int pxa168fb_probe(struct platform_device *pdev)
 failed_free_clk:
 	clk_disable_unprepare(fbi->clk);
 failed_free_fbmem:
-	dma_free_coherent(fbi->dev, info->fix.smem_len,
-			info->screen_base, fbi->fb_start_dma);
+	dma_free_wc(fbi->dev, info->fix.smem_len,
+		    info->screen_base, fbi->fb_start_dma);
 failed_free_info:
 	kfree(info);
 
@@ -801,7 +801,7 @@ static int pxa168fb_remove(struct platform_device *pdev)
 
 	irq = platform_get_irq(pdev, 0);
 
-	dma_free_wc(fbi->dev, PAGE_ALIGN(info->fix.smem_len),
+	dma_free_wc(fbi->dev, info->fix.smem_len,
 		    info->screen_base, info->fix.smem_start);
 
 	clk_disable_unprepare(fbi->clk);
-- 
2.20.1


^ permalink raw reply related

* [PATCH 15/19] linux-user/arm: Replace ARM_FEATURE_VFP* tests for HWCAP
From: Richard Henderson @ 2020-02-14 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell
In-Reply-To: <20200214181547.21408-1-richard.henderson@linaro.org>

Use isar feature tests instead of feature bit tests.

Although none of QEMUs current cpus have VFPv3 without D32,
replace the large comment explaining why with one line that
sets ARM_HWCAP_ARM_VFPv3D16 under the correct conditions.
Mirror the test sequence used in the linux kernel.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 linux-user/elfload.c | 24 +++++++++++++++---------
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index f3080a1635..c52c814a2e 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -468,22 +468,28 @@ static uint32_t get_elf_hwcap(void)
 
     /* EDSP is in v5TE and above, but all our v5 CPUs are v5TE */
     GET_FEATURE(ARM_FEATURE_V5, ARM_HWCAP_ARM_EDSP);
-    GET_FEATURE(ARM_FEATURE_VFP, ARM_HWCAP_ARM_VFP);
     GET_FEATURE(ARM_FEATURE_IWMMXT, ARM_HWCAP_ARM_IWMMXT);
     GET_FEATURE(ARM_FEATURE_THUMB2EE, ARM_HWCAP_ARM_THUMBEE);
     GET_FEATURE(ARM_FEATURE_NEON, ARM_HWCAP_ARM_NEON);
-    GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPv3);
     GET_FEATURE(ARM_FEATURE_V6K, ARM_HWCAP_ARM_TLS);
-    GET_FEATURE(ARM_FEATURE_VFP4, ARM_HWCAP_ARM_VFPv4);
+    GET_FEATURE(ARM_FEATURE_LPAE, ARM_HWCAP_ARM_LPAE);
+
     GET_FEATURE_ID(arm_div, ARM_HWCAP_ARM_IDIVA);
     GET_FEATURE_ID(thumb_div, ARM_HWCAP_ARM_IDIVT);
-    /* All QEMU's VFPv3 CPUs have 32 registers, see VFP_DREG in translate.c.
-     * Note that the ARM_HWCAP_ARM_VFPv3D16 bit is always the inverse of
-     * ARM_HWCAP_ARM_VFPD32 (and so always clear for QEMU); it is unrelated
-     * to our VFP_FP16 feature bit.
+    /*
+     * Note that none of QEMU's cpus have double precision without single
+     * precision support in VFP, so only test the single precision field.
      */
-    GET_FEATURE(ARM_FEATURE_VFP3, ARM_HWCAP_ARM_VFPD32);
-    GET_FEATURE(ARM_FEATURE_LPAE, ARM_HWCAP_ARM_LPAE);
+    GET_FEATURE_ID(aa32_fpsp_v2, ARM_HWCAP_ARM_VFP);
+    if (cpu_isar_feature(aa32_fpsp_v3, cpu)) {
+        hwcaps |= ARM_HWCAP_ARM_VFPv3;
+        if (cpu_isar_feature(aa32_simd_r32, cpu)) {
+            hwcaps |= ARM_HWCAP_ARM_VFPD32;
+        } else {
+            hwcaps |= ARM_HWCAP_ARM_VFPv3D16;
+        }
+    }
+    GET_FEATURE_ID(aa32_simdfmac, ARM_HWCAP_ARM_VFPv4);
 
     return hwcaps;
 }
-- 
2.20.1



^ permalink raw reply related

* [PATCH AUTOSEL 5.5 033/542] ALSA: ctl: allow TLV read operation for callback type of element in locked case
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Takashi Sakamoto, Jaroslav Kysela, Takashi Iwai, Sasha Levin,
	alsa-devel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Takashi Sakamoto <o-takashi@sakamocchi.jp>

[ Upstream commit d61fe22c2ae42d9fd76c34ef4224064cca4b04b0 ]

A design of ALSA control core allows applications to execute three
operations for TLV feature; read, write and command. Furthermore, it
allows driver developers to process the operations by two ways; allocated
array or callback function. In the former, read operation is just allowed,
thus developers uses the latter when device driver supports variety of
models or the target model is expected to dynamically change information
stored in TLV container.

The core also allows applications to lock any element so that the other
applications can't perform write operation to the element for element
value and TLV information. When the element is locked, write and command
operation for TLV information are prohibited as well as element value.
Any read operation should be allowed in the case.

At present, when an element has callback function for TLV information,
TLV read operation returns EPERM if the element is locked. On the
other hand, the read operation is success when an element has allocated
array for TLV information. In both cases, read operation is success for
element value expectedly.

This commit fixes the bug. This change can be backported to v4.14
kernel or later.

Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Reviewed-by: Jaroslav Kysela <perex@perex.cz>
Link: https://lore.kernel.org/r/20191223093347.15279-1-o-takashi@sakamocchi.jp
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 sound/core/control.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/sound/core/control.c b/sound/core/control.c
index 7a4d8690ce41f..08ca7666e84cf 100644
--- a/sound/core/control.c
+++ b/sound/core/control.c
@@ -1430,8 +1430,9 @@ static int call_tlv_handler(struct snd_ctl_file *file, int op_flag,
 	if (kctl->tlv.c == NULL)
 		return -ENXIO;
 
-	/* When locked, this is unavailable. */
-	if (vd->owner != NULL && vd->owner != file)
+	/* Write and command operations are not allowed for locked element. */
+	if (op_flag != SNDRV_CTL_TLV_OP_READ &&
+	    vd->owner != NULL && vd->owner != file)
 		return -EPERM;
 
 	return kctl->tlv.c(kctl, op_flag, size, buf);
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 040/542] RDMA/i40iw: fix a potential NULL pointer dereference
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Xiyu Yang, Xin Tan, Leon Romanovsky, Jason Gunthorpe, Sasha Levin,
	linux-rdma
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Xiyu Yang <xiyuyang19@fudan.edu.cn>

[ Upstream commit 04db1580b5e48a79e24aa51ecae0cd4b2296ec23 ]

A NULL pointer can be returned by in_dev_get(). Thus add a corresponding
check so that a NULL pointer dereference will be avoided at this place.

Fixes: 8e06af711bf2 ("i40iw: add main, hdr, status")
Link: https://lore.kernel.org/r/1577672668-46499-1-git-send-email-xiyuyang19@fudan.edu.cn
Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Xin Tan <tanxin.ctf@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/hw/i40iw/i40iw_main.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/infiniband/hw/i40iw/i40iw_main.c b/drivers/infiniband/hw/i40iw/i40iw_main.c
index d44cf33df81ae..238614370927a 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_main.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_main.c
@@ -1225,6 +1225,8 @@ static void i40iw_add_ipv4_addr(struct i40iw_device *iwdev)
 			const struct in_ifaddr *ifa;
 
 			idev = in_dev_get(dev);
+			if (!idev)
+				continue;
 			in_dev_for_each_ifa_rtnl(ifa, idev) {
 				i40iw_debug(&iwdev->sc_dev, I40IW_DEBUG_CM,
 					    "IP=%pI4, vlan_id=%d, MAC=%pM\n", &ifa->ifa_address,
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 043/542] media: sun4i-csi: Deal with DRAM offset
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Chen-Yu Tsai, Maxime Ripard, Sakari Ailus, Mauro Carvalho Chehab,
	Sasha Levin, linux-media, linux-arm-kernel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Chen-Yu Tsai <wens@csie.org>

[ Upstream commit 249b286171fa9c358e8d5c825b48c4ebea97c498 ]

On Allwinner SoCs, some high memory bandwidth devices do DMA directly
over the memory bus (called MBUS), instead of the system bus. These
devices include the CSI camera sensor interface, video (codec) engine,
display subsystem, etc.. The memory bus has a different addressing
scheme without the DRAM starting offset.

Deal with this using the "interconnects" property from the device tree,
or if that is not available, set dev->dma_pfn_offset to PHYS_PFN_OFFSET.

Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../platform/sunxi/sun4i-csi/sun4i_csi.c      | 22 +++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
index f36dc6258900e..b8b07c1de2a8e 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.c
@@ -11,6 +11,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/of.h>
+#include <linux/of_device.h>
 #include <linux/of_graph.h>
 #include <linux/platform_device.h>
 #include <linux/pm_runtime.h>
@@ -155,6 +156,27 @@ static int sun4i_csi_probe(struct platform_device *pdev)
 	subdev = &csi->subdev;
 	vdev = &csi->vdev;
 
+	/*
+	 * On Allwinner SoCs, some high memory bandwidth devices do DMA
+	 * directly over the memory bus (called MBUS), instead of the
+	 * system bus. The memory bus has a different addressing scheme
+	 * without the DRAM starting offset.
+	 *
+	 * In some cases this can be described by an interconnect in
+	 * the device tree. In other cases where the hardware is not
+	 * fully understood and the interconnect is left out of the
+	 * device tree, fall back to a default offset.
+	 */
+	if (of_find_property(csi->dev->of_node, "interconnects", NULL)) {
+		ret = of_dma_configure(csi->dev, csi->dev->of_node, true);
+		if (ret)
+			return ret;
+	} else {
+#ifdef PHYS_PFN_OFFSET
+		csi->dev->dma_pfn_offset = PHYS_PFN_OFFSET;
+#endif
+	}
+
 	csi->mdev.dev = csi->dev;
 	strscpy(csi->mdev.model, "Allwinner Video Capture Device",
 		sizeof(csi->mdev.model));
-- 
2.20.1


^ permalink raw reply related

* [PATCH v2] ecryptfs: replace BUG_ON with error handling code
From: Aditya Pakki @ 2020-02-14 18:21 UTC (permalink / raw)
  To: pakki001
  Cc: kjlu, Tyler Hicks, Andrew Morton, Michael Halcrow, Adrian Bunk,
	Randy Dunlap, Theodore Ts'o, ecryptfs, linux-kernel

In crypt_scatterlist, if the crypt_stat argument is not set up
correctly, the kernel crashes. Instead, by returning an error code
upstream, the error is handled safely.

The issue is detected via a static analysis tool written by us.

Fixes: 237fead619984 (ecryptfs: fs/Makefile and fs/Kconfig)
Signed-off-by: Aditya Pakki <pakki001@umn.edu>
---
v1: Add missing fixes tag suggested by Markus and Tyler.
---
 fs/ecryptfs/crypto.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ecryptfs/crypto.c b/fs/ecryptfs/crypto.c
index db1ef144c63a..2c449aed1b92 100644
--- a/fs/ecryptfs/crypto.c
+++ b/fs/ecryptfs/crypto.c
@@ -311,8 +311,10 @@ static int crypt_scatterlist(struct ecryptfs_crypt_stat *crypt_stat,
 	struct extent_crypt_result ecr;
 	int rc = 0;
 
-	BUG_ON(!crypt_stat || !crypt_stat->tfm
-	       || !(crypt_stat->flags & ECRYPTFS_STRUCT_INITIALIZED));
+	if (!crypt_stat || !crypt_stat->tfm
+	       || !(crypt_stat->flags & ECRYPTFS_STRUCT_INITIALIZED))
+		return -EINVAL;
+
 	if (unlikely(ecryptfs_verbosity > 0)) {
 		ecryptfs_printk(KERN_DEBUG, "Key size [%zd]; key:\n",
 				crypt_stat->key_size);
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 045/542] RDMA/netlink: Do not always generate an ACK for some netlink operations
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Håkon Bugge, Mark Haywood, Leon Romanovsky, Jason Gunthorpe,
	Sasha Levin, linux-rdma
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Håkon Bugge <haakon.bugge@oracle.com>

[ Upstream commit a242c36951ecd24bc16086940dbe6b522205c461 ]

In rdma_nl_rcv_skb(), the local variable err is assigned the return value
of the supplied callback function, which could be one of
ib_nl_handle_resolve_resp(), ib_nl_handle_set_timeout(), or
ib_nl_handle_ip_res_resp(). These three functions all return skb->len on
success.

rdma_nl_rcv_skb() is merely a copy of netlink_rcv_skb(). The callback
functions used by the latter have the convention: "Returns 0 on success or
a negative error code".

In particular, the statement (equal for both functions):

   if (nlh->nlmsg_flags & NLM_F_ACK || err)

implies that rdma_nl_rcv_skb() always will ack a message, independent of
the NLM_F_ACK being set in nlmsg_flags or not.

The fix could be to change the above statement, but it is better to keep
the two *_rcv_skb() functions equal in this respect and instead change the
three callback functions in the rdma subsystem to the correct convention.

Fixes: 2ca546b92a02 ("IB/sa: Route SA pathrecord query through netlink")
Fixes: ae43f8286730 ("IB/core: Add IP to GID netlink offload")
Link: https://lore.kernel.org/r/20191216120436.3204814-1-haakon.bugge@oracle.com
Suggested-by: Mark Haywood <mark.haywood@oracle.com>
Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Tested-by: Mark Haywood <mark.haywood@oracle.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/core/addr.c     | 2 +-
 drivers/infiniband/core/sa_query.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/addr.c b/drivers/infiniband/core/addr.c
index 606fa6d866851..1753a9801b704 100644
--- a/drivers/infiniband/core/addr.c
+++ b/drivers/infiniband/core/addr.c
@@ -139,7 +139,7 @@ int ib_nl_handle_ip_res_resp(struct sk_buff *skb,
 	if (ib_nl_is_good_ip_resp(nlh))
 		ib_nl_process_good_ip_rsep(nlh);
 
-	return skb->len;
+	return 0;
 }
 
 static int ib_nl_ip_send_msg(struct rdma_dev_addr *dev_addr,
diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
index 8917125ea16d4..30d4c126a2db0 100644
--- a/drivers/infiniband/core/sa_query.c
+++ b/drivers/infiniband/core/sa_query.c
@@ -1068,7 +1068,7 @@ int ib_nl_handle_set_timeout(struct sk_buff *skb,
 	}
 
 settimeout_out:
-	return skb->len;
+	return 0;
 }
 
 static inline int ib_nl_is_good_resolve_resp(const struct nlmsghdr *nlh)
@@ -1139,7 +1139,7 @@ int ib_nl_handle_resolve_resp(struct sk_buff *skb,
 	}
 
 resp_out:
-	return skb->len;
+	return 0;
 }
 
 static void free_sm_ah(struct kref *kref)
-- 
2.20.1


^ permalink raw reply related

* Re: [PATCH 1/2] virtio-blk: fix hw_queue stopped on arbitrary error
From: dongli.zhang @ 2020-02-14 18:20 UTC (permalink / raw)
  To: Halil Pasic, Michael S. Tsirkin, Jason Wang
  Cc: Jens Axboe, linux-block, linux-s390, Cornelia Huck, Ram Pai,
	linux-kernel, virtualization, Christian Borntraeger,
	Stefan Hajnoczi, Paolo Bonzini, Lendacky, Thomas,
	Viktor Mihajlovski
In-Reply-To: <20200213123728.61216-2-pasic@linux.ibm.com>

Hi Halil,

When swiotlb full is hit for virtio_blk, there is below warning for once (the
warning is not by this patch set). Is this expected or just false positive?

[   54.767257] virtio-pci 0000:00:04.0: swiotlb buffer is full (sz: 16 bytes),
total 32768 (slots), used 258 (slots)
[   54.767260] virtio-pci 0000:00:04.0: overflow 0x0000000075770110+16 of DMA
mask ffffffffffffffff bus limit 0
[   54.769192] ------------[ cut here ]------------
[   54.769200] WARNING: CPU: 3 PID: 102 at kernel/dma/direct.c:35
report_addr+0x71/0x77
[   54.769200] Modules linked in:
[   54.769203] CPU: 3 PID: 102 Comm: kworker/u8:2 Not tainted 5.6.0-rc1+ #2
[   54.769204] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
[   54.769208] Workqueue: writeback wb_workfn (flush-253:0)
[   54.769211] RIP: 0010:report_addr+0x71/0x77
... ...
[   54.769226] Call Trace:
[   54.769241]  dma_direct_map_page+0xc9/0xe0
[   54.769245]  virtqueue_add+0x172/0xaa0
[   54.769248]  virtqueue_add_sgs+0x85/0xa0
[   54.769251]  virtio_queue_rq+0x292/0x480
[   54.769255]  __blk_mq_try_issue_directly+0x13e/0x1f0
[   54.769257]  blk_mq_request_issue_directly+0x48/0xa0
[   54.769259]  blk_mq_try_issue_list_directly+0x3c/0xb0
[   54.769261]  blk_mq_sched_insert_requests+0xb6/0x100
[   54.769263]  blk_mq_flush_plug_list+0x146/0x210
[   54.769264]  blk_flush_plug_list+0xba/0xe0
[   54.769266]  blk_mq_make_request+0x331/0x5b0
[   54.769268]  generic_make_request+0x10d/0x2e0
[   54.769270]  submit_bio+0x69/0x130
[   54.769273]  ext4_io_submit+0x44/0x50
[   54.769275]  ext4_writepages+0x56f/0xd30
[   54.769278]  ? cpumask_next_and+0x19/0x20
[   54.769280]  ? find_busiest_group+0x11a/0xa40
[   54.769283]  do_writepages+0x15/0x70
[   54.769288]  __writeback_single_inode+0x38/0x330
[   54.769290]  writeback_sb_inodes+0x219/0x4c0
[   54.769292]  __writeback_inodes_wb+0x82/0xb0
[   54.769293]  wb_writeback+0x267/0x300
[   54.769295]  wb_workfn+0x1aa/0x430
[   54.769298]  process_one_work+0x156/0x360
[   54.769299]  worker_thread+0x41/0x3b0
[   54.769300]  kthread+0xf3/0x130
[   54.769302]  ? process_one_work+0x360/0x360
[   54.769303]  ? kthread_bind+0x10/0x10
[   54.769305]  ret_from_fork+0x35/0x40
[   54.769307] ---[ end trace 923a87a9ce0e777a ]---

Thank you very much!

Dongli Zhang

On 2/13/20 4:37 AM, Halil Pasic wrote:
> Since nobody else is going to restart our hw_queue for us, the
> blk_mq_start_stopped_hw_queues() is in virtblk_done() is not sufficient
> necessarily sufficient to ensure that the queue will get started again.
> In case of global resource outage (-ENOMEM because mapping failure,
> because of swiotlb full) our virtqueue may be empty and we can get
> stuck with a stopped hw_queue.
> 
> Let us not stop the queue on arbitrary errors, but only on -EONSPC which
> indicates a full virtqueue, where the hw_queue is guaranteed to get
> started by virtblk_done() before when it makes sense to carry on
> submitting requests. Let us also remove a stale comment.
> 
> Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
> Cc: Jens Axboe <axboe@kernel.dk>
> Fixes: f7728002c1c7 ("virtio_ring: fix return code on DMA mapping fails")
> ---
> 
> I'm in doubt with regards to the Fixes tag. The thing is, virtio-blk
> probably made an assumption on virtqueue_add: the failure is either
> because the virtqueue is full, or the failure is fatal. In both cases it
> seems acceptable to stop the queue, although the fatal case is arguable.
> Since commit f7728002c1c7 it the dma mapping has failed returns -ENOMEM
> and not -EIO, and thus we have a recoverable failure that ain't
> virtqueue full. So I lean towards to a fixes tag that references that
> commit, although it ain't broken. Alternatively one could say 'Fixes:
> e467cde23818 ("Block driver using virtio.")', if the aforementioned
> assumption shouldn't have made in the first place.
> ---
>  drivers/block/virtio_blk.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 54158766334b..adfe43f5ffe4 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -245,10 +245,12 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
>  	err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num);
>  	if (err) {
>  		virtqueue_kick(vblk->vqs[qid].vq);
> -		blk_mq_stop_hw_queue(hctx);
> +		/* Don't stop the queue if -ENOMEM: we may have failed to
> +		 * bounce the buffer due to global resource outage.
> +		 */
> +		if (err == -ENOSPC)
> +			blk_mq_stop_hw_queue(hctx);
>  		spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags);
> -		/* Out of mem doesn't actually happen, since we fall back
> -		 * to direct descriptors */
>  		if (err == -ENOMEM || err == -ENOSPC)
>  			return BLK_STS_DEV_RESOURCE;
>  		return BLK_STS_IOERR;
> 

^ permalink raw reply

* Re: [PATCH] spapr: Rework hash<->radix transitions at CAS
From: Greg Kurz @ 2020-02-14 18:19 UTC (permalink / raw)
  To: David Gibson; +Cc: Alexey Kardashevskiy, Laurent Vivier, qemu-ppc, qemu-devel
In-Reply-To: <20200213222835.GG124369@umbus.fritz.box>

[-- Attachment #1: Type: text/plain, Size: 10721 bytes --]

On Fri, 14 Feb 2020 09:28:35 +1100
David Gibson <david@gibson.dropbear.id.au> wrote:

> On Thu, Feb 13, 2020 at 04:38:38PM +0100, Greg Kurz wrote:
> > Until the CAS negotiation is over, an HPT can be allocated on three
> > different paths:
> > 
> > 1) during machine reset if the host doesn't support radix,
> > 
> > 2) during CAS if the guest wants hash and doesn't support HPT resizing,
> >    in which case we pre-emptively resize the HPT to accomodate maxram,
> > 
> > 3) during CAS if no CAS reboot was requested, the guest wants hash but
> >    we're currently configured for radix.
> > 
> > Depending on the various combinations of host or guest MMU support,
> > HPT resizing guest support and the possibility of a CAS reboot, it
> > is quite hard to know which of these allocates the HPT that will
> > be ultimately used by the guest that wants to do hash. Also, some of
> > them have bugs:
> > 
> > - 2) calls spapr_reallocate_hpt() instead of spapr_setup_hpt_and_vrma()
> >   and thus doesn't update the VRMA size, even though we've just extended
> >   the HPT. Not sure what issues this can cause,
> > 
> > - 3) doesn't check for HPT resizing support and will always allocate a
> >   small HPT based on the initial RAM size. This caps the total amount of
> >   RAM the guest can see, especially if maxram is much higher than the
> >   initial ram.
> > 
> > We only support guests that do CAS and we already assume that the HPT
> > isn't being used when we do the pre-emptive resizing at CAS. It thus
> > seems reasonable to only allocate the HPT at the end of CAS, when no
> > CAS reboot was requested.
> > 
> > Consolidate the logic so that we only create the HPT during 3), ie.
> > when we're done with the CAS reboot cycles, and ensure HPT resizing
> > is taken into account. This fixes the radix->hash transition for
> > all cases.
> 
> Uh.. I'm pretty sure this can't work for KVM on a POWER8 host.  We
> need the HPT at all times there, or there's nowhere to put VRMA
> entries, so we can't run even in real mode.
> 

Well it happens to be working anyway because KVM automatically
creates an HPT (default size 16MB) in kvmppc_hv_setup_htab_rma()
if QEMU didn't do so already... Would a comment to emphasize this
be enough or do you prefer I don't drop the HPT allocation currently
performed at machine reset ?

> > The guest can theoretically call CAS several times, without a CAS
> > reboot in between. Linux guests don't do that, but better safe than
> > sorry, let's ensure we can also handle the symmetrical hash->radix
> > transition correctly: free the HPT and set the GR bit in PATE.
> > An helper is introduced for the latter since this is already what
> > we do during machine reset when going for radix.
> > 
> > As a bonus, this removes one user of spapr->cas_reboot, which we
> > want to get rid of in the future.
> > 
> > Signed-off-by: Greg Kurz <groug@kaod.org>
> > ---
> >  hw/ppc/spapr.c         |   25 +++++++++++++++-----
> >  hw/ppc/spapr_hcall.c   |   59 ++++++++++++++++++++----------------------------
> >  include/hw/ppc/spapr.h |    1 +
> >  3 files changed, 44 insertions(+), 41 deletions(-)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 828e2cc1359a..88bc0e4e3ca1 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -1573,9 +1573,19 @@ void spapr_setup_hpt_and_vrma(SpaprMachineState *spapr)
> >  {
> >      int hpt_shift;
> >  
> > +    /*
> > +     * HPT resizing is a bit of a special case, because when enabled
> > +     * we assume an HPT guest will support it until it says it
> > +     * doesn't, instead of assuming it won't support it until it says
> > +     * it does.  Strictly speaking that approach could break for
> > +     * guests which don't make a CAS call, but those are so old we
> > +     * don't care about them.  Without that assumption we'd have to
> > +     * make at least a temporary allocation of an HPT sized for max
> > +     * memory, which could be impossibly difficult under KVM HV if
> > +     * maxram is large.
> > +     */
> >      if ((spapr->resize_hpt == SPAPR_RESIZE_HPT_DISABLED)
> > -        || (spapr->cas_reboot
> > -            && !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE))) {
> > +        || !spapr_ovec_test(spapr->ov5_cas, OV5_HPT_RESIZE)) {
> >          hpt_shift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
> >      } else {
> >          uint64_t current_ram_size;
> > @@ -1604,6 +1614,12 @@ static int spapr_reset_drcs(Object *child, void *opaque)
> >      return 0;
> >  }
> >  
> > +void spapr_reset_patb_entry(SpaprMachineState *spapr)
> > +{
> > +    spapr->patb_entry = PATE1_GR;
> > +    spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
> > +}
> > +
> >  static void spapr_machine_reset(MachineState *machine)
> >  {
> >      SpaprMachineState *spapr = SPAPR_MACHINE(machine);
> > @@ -1624,10 +1640,7 @@ static void spapr_machine_reset(MachineState *machine)
> >           * without a HPT because KVM will start them in radix mode.
> >           * Set the GR bit in PATE so that we know there is no HPT.
> >           */
> > -        spapr->patb_entry = PATE1_GR;
> > -        spapr_set_all_lpcrs(LPCR_HR | LPCR_UPRT, LPCR_HR | LPCR_UPRT);
> > -    } else {
> > -        spapr_setup_hpt_and_vrma(spapr);
> > +        spapr_reset_patb_entry(spapr);
> >      }
> >  
> >      qemu_devices_reset();
> > diff --git a/hw/ppc/spapr_hcall.c b/hw/ppc/spapr_hcall.c
> > index b8bb66b5c0d4..57ddf0fa6d05 100644
> > --- a/hw/ppc/spapr_hcall.c
> > +++ b/hw/ppc/spapr_hcall.c
> > @@ -1677,6 +1677,7 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
> >      bool raw_mode_supported = false;
> >      bool guest_xive;
> >      CPUState *cs;
> > +    int maxshift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
> >  
> >      /* CAS is supposed to be called early when only the boot vCPU is active. */
> >      CPU_FOREACH(cs) {
> > @@ -1739,36 +1740,6 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
> >  
> >      guest_xive = spapr_ovec_test(ov5_guest, OV5_XIVE_EXPLOIT);
> >  
> > -    /*
> > -     * HPT resizing is a bit of a special case, because when enabled
> > -     * we assume an HPT guest will support it until it says it
> > -     * doesn't, instead of assuming it won't support it until it says
> > -     * it does.  Strictly speaking that approach could break for
> > -     * guests which don't make a CAS call, but those are so old we
> > -     * don't care about them.  Without that assumption we'd have to
> > -     * make at least a temporary allocation of an HPT sized for max
> > -     * memory, which could be impossibly difficult under KVM HV if
> > -     * maxram is large.
> > -     */
> > -    if (!guest_radix && !spapr_ovec_test(ov5_guest, OV5_HPT_RESIZE)) {
> > -        int maxshift = spapr_hpt_shift_for_ramsize(MACHINE(spapr)->maxram_size);
> > -
> > -        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_REQUIRED) {
> > -            error_report(
> > -                "h_client_architecture_support: Guest doesn't support HPT resizing, but resize-hpt=required");
> > -            exit(1);
> > -        }
> > -
> > -        if (spapr->htab_shift < maxshift) {
> > -            /* Guest doesn't know about HPT resizing, so we
> > -             * pre-emptively resize for the maximum permitted RAM.  At
> > -             * the point this is called, nothing should have been
> > -             * entered into the existing HPT */
> > -            spapr_reallocate_hpt(spapr, maxshift, &error_fatal);
> > -            push_sregs_to_kvm_pr(spapr);
> > -        }
> > -    }
> > -
> >      /* NOTE: there are actually a number of ov5 bits where input from the
> >       * guest is always zero, and the platform/QEMU enables them independently
> >       * of guest input. To model these properly we'd want some sort of mask,
> > @@ -1806,6 +1777,12 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
> >              error_report("Guest requested unavailable MMU mode (hash).");
> >              exit(EXIT_FAILURE);
> >          }
> > +        if (spapr->resize_hpt == SPAPR_RESIZE_HPT_REQUIRED &&
> > +            !spapr_ovec_test(ov5_guest, OV5_HPT_RESIZE)) {
> > +            error_report(
> > +                "h_client_architecture_support: Guest doesn't support HPT resizing, but resize-hpt=required");
> > +            exit(1);
> > +        }
> >      }
> >      spapr->cas_pre_isa3_guest = !spapr_ovec_test(ov1_guest, OV1_PPC_3_00);
> >      spapr_ovec_cleanup(ov1_guest);
> > @@ -1838,11 +1815,23 @@ static target_ulong h_client_architecture_support(PowerPCCPU *cpu,
> >          void *fdt;
> >          SpaprDeviceTreeUpdateHeader hdr = { .version_id = 1 };
> >  
> > -        /* If spapr_machine_reset() did not set up a HPT but one is necessary
> > -         * (because the guest isn't going to use radix) then set it up here. */
> > -        if ((spapr->patb_entry & PATE1_GR) && !guest_radix) {
> > -            /* legacy hash or new hash: */
> > -            spapr_setup_hpt_and_vrma(spapr);
> > +        if (!guest_radix) {
> > +            /*
> > +             * Either spapr_machine_reset() did not set up a HPT but one
> > +             * is necessary (because the guest isn't going to use radix),
> > +             * or the guest doesn't know about HPT resizing and we need to
> > +             * pre-emptively resize for the maximum permitted RAM. Set it
> > +             * up here. At the point this is called, nothing should have
> > +             * been entered into the existing HPT.
> > +             */
> > +            if (spapr->patb_entry & PATE1_GR || spapr->htab_shift < maxshift) {
> > +                /* legacy hash or new hash: */
> > +                spapr_setup_hpt_and_vrma(spapr);
> > +                push_sregs_to_kvm_pr(spapr);
> > +            }
> > +        } else {
> > +            spapr_free_hpt(spapr);
> > +            spapr_reset_patb_entry(spapr);
> >          }
> >  
> >          if (fdt_bufsize < sizeof(hdr)) {
> > diff --git a/include/hw/ppc/spapr.h b/include/hw/ppc/spapr.h
> > index 09110961a589..9d88b5596481 100644
> > --- a/include/hw/ppc/spapr.h
> > +++ b/include/hw/ppc/spapr.h
> > @@ -919,4 +919,5 @@ void spapr_check_pagesize(SpaprMachineState *spapr, hwaddr pagesize,
> >  
> >  void spapr_set_all_lpcrs(target_ulong value, target_ulong mask);
> >  hwaddr spapr_get_rtas_addr(void);
> > +void spapr_reset_patb_entry(SpaprMachineState *spapr);
> >  #endif /* HW_SPAPR_H */
> > 
> 


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply

* [PATCH AUTOSEL 5.5 046/542] media: sun4i-csi: Fix [HV]sync polarity handling
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Chen-Yu Tsai, Maxime Ripard, Sakari Ailus, Mauro Carvalho Chehab,
	Sasha Levin, linux-media, linux-arm-kernel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Chen-Yu Tsai <wens@csie.org>

[ Upstream commit 1948dcf0f928b8bcdca57ca3fba8545ba380fc29 ]

The Allwinner camera sensor interface has a different definition of
[HV]sync. While the timing diagram uses the names HSYNC and VSYNC,
the note following the diagram and register names use HREF and VREF.
Combined they imply the hardware uses either [HV]REF or inverted
[HV]SYNC. There are also registers to set horizontal skip lengths
in pixels and vertical skip lengths in lines, also known as back
porches.

Fix the polarity handling by using the opposite polarity flag for
the checks. Also rename `[hv]sync_pol` to `[hv]ref_pol` to better
match the hardware register description.

Fixes: 577bbf23b758 ("media: sunxi: Add A10 CSI driver")
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
Acked-by: Maxime Ripard <mripard@kernel.org>
Signed-off-by: Sakari Ailus <sakari.ailus@linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 .../media/platform/sunxi/sun4i-csi/sun4i_csi.h |  4 ++--
 .../media/platform/sunxi/sun4i-csi/sun4i_dma.c | 18 +++++++++++++-----
 2 files changed, 15 insertions(+), 7 deletions(-)

diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.h b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.h
index 001c8bde006ce..88d39b3554c4b 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.h
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_csi.h
@@ -22,8 +22,8 @@
 #define CSI_CFG_INPUT_FMT(fmt)			((fmt) << 20)
 #define CSI_CFG_OUTPUT_FMT(fmt)			((fmt) << 16)
 #define CSI_CFG_YUV_DATA_SEQ(seq)		((seq) << 8)
-#define CSI_CFG_VSYNC_POL(pol)			((pol) << 2)
-#define CSI_CFG_HSYNC_POL(pol)			((pol) << 1)
+#define CSI_CFG_VREF_POL(pol)			((pol) << 2)
+#define CSI_CFG_HREF_POL(pol)			((pol) << 1)
 #define CSI_CFG_PCLK_POL(pol)			((pol) << 0)
 
 #define CSI_CPT_CTRL_REG		0x08
diff --git a/drivers/media/platform/sunxi/sun4i-csi/sun4i_dma.c b/drivers/media/platform/sunxi/sun4i-csi/sun4i_dma.c
index 8b567d0f019bf..78fa1c535ac64 100644
--- a/drivers/media/platform/sunxi/sun4i-csi/sun4i_dma.c
+++ b/drivers/media/platform/sunxi/sun4i-csi/sun4i_dma.c
@@ -228,7 +228,7 @@ static int sun4i_csi_start_streaming(struct vb2_queue *vq, unsigned int count)
 	struct sun4i_csi *csi = vb2_get_drv_priv(vq);
 	struct v4l2_fwnode_bus_parallel *bus = &csi->bus;
 	const struct sun4i_csi_format *csi_fmt;
-	unsigned long hsync_pol, pclk_pol, vsync_pol;
+	unsigned long href_pol, pclk_pol, vref_pol;
 	unsigned long flags;
 	unsigned int i;
 	int ret;
@@ -278,13 +278,21 @@ static int sun4i_csi_start_streaming(struct vb2_queue *vq, unsigned int count)
 	writel(CSI_WIN_CTRL_H_ACTIVE(csi->fmt.height),
 	       csi->regs + CSI_WIN_CTRL_H_REG);
 
-	hsync_pol = !!(bus->flags & V4L2_MBUS_HSYNC_ACTIVE_HIGH);
-	vsync_pol = !!(bus->flags & V4L2_MBUS_VSYNC_ACTIVE_HIGH);
+	/*
+	 * This hardware uses [HV]REF instead of [HV]SYNC. Based on the
+	 * provided timing diagrams in the manual, positive polarity
+	 * equals active high [HV]REF.
+	 *
+	 * When the back porch is 0, [HV]REF is more or less equivalent
+	 * to [HV]SYNC inverted.
+	 */
+	href_pol = !!(bus->flags & V4L2_MBUS_HSYNC_ACTIVE_LOW);
+	vref_pol = !!(bus->flags & V4L2_MBUS_VSYNC_ACTIVE_LOW);
 	pclk_pol = !!(bus->flags & V4L2_MBUS_PCLK_SAMPLE_RISING);
 	writel(CSI_CFG_INPUT_FMT(csi_fmt->input) |
 	       CSI_CFG_OUTPUT_FMT(csi_fmt->output) |
-	       CSI_CFG_VSYNC_POL(vsync_pol) |
-	       CSI_CFG_HSYNC_POL(hsync_pol) |
+	       CSI_CFG_VREF_POL(vref_pol) |
+	       CSI_CFG_HREF_POL(href_pol) |
 	       CSI_CFG_PCLK_POL(pclk_pol),
 	       csi->regs + CSI_CFG_REG);
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 047/542] clk: at91: sam9x60: fix programmable clock prescaler
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Eugen Hristev, Stephen Boyd, Sasha Levin, linux-clk,
	linux-arm-kernel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Eugen Hristev <eugen.hristev@microchip.com>

[ Upstream commit 66d9f5214c9ba1c151478f99520b6817302d50dc ]

The prescaler works as parent rate divided by (PRES + 1) (is_pres_direct == 1)
It does not work in the way of parent rate shifted to the right by (PRES + 1),
which means division by 2^(PRES + 1) (is_pres_direct == 0)
Thus is_pres_direct must be enabled for this SoC, to make the right computation.
This field was added in
commit 45b06682113b ("clk: at91: fix programmable clock for sama5d2")
SAM9X60 has the same field as SAMA5D2 in the PCK

Fixes: 01e2113de9a5 ("clk: at91: add sam9x60 pmc driver")
Signed-off-by: Eugen Hristev <eugen.hristev@microchip.com>
Link: https://lkml.kernel.org/r/1575977088-16781-1-git-send-email-eugen.hristev@microchip.com
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/clk/at91/sam9x60.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/clk/at91/sam9x60.c b/drivers/clk/at91/sam9x60.c
index 86238d5ecb4da..77398aefeb6db 100644
--- a/drivers/clk/at91/sam9x60.c
+++ b/drivers/clk/at91/sam9x60.c
@@ -47,6 +47,7 @@ static const struct clk_programmable_layout sam9x60_programmable_layout = {
 	.pres_shift = 8,
 	.css_mask = 0x1f,
 	.have_slck_mck = 0,
+	.is_pres_direct = 1,
 };
 
 static const struct clk_pcr_layout sam9x60_pcr_layout = {
-- 
2.20.1


^ permalink raw reply related

* [PATCH RFC 2/9] NFSD: Clean up nfsd4_encode_readv
From: Chuck Lever @ 2020-02-14 15:49 UTC (permalink / raw)
  To: bfields; +Cc: linux-rdma, linux-nfs
In-Reply-To: <20200214151427.3848.49739.stgit@klimt.1015granger.net>

Address some minor nits I noticed while working on this function.

Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 fs/nfsd/nfs4xdr.c |    9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/nfs4xdr.c b/fs/nfsd/nfs4xdr.c
index 60be969d8be1..262f9fc76e4e 100644
--- a/fs/nfsd/nfs4xdr.c
+++ b/fs/nfsd/nfs4xdr.c
@@ -3591,7 +3591,6 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	__be32 nfserr;
 	__be32 tmp;
 	__be32 *p;
-	u32 zzz = 0;
 	int pad;
 
 	/*
@@ -3607,7 +3606,7 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	v = 0;
 	while (len) {
 		thislen = min_t(long, len, PAGE_SIZE);
-		p = xdr_reserve_space(xdr, (thislen+3)&~3);
+		p = xdr_reserve_space(xdr, thislen);
 		WARN_ON_ONCE(!p);
 		resp->rqstp->rq_vec[v].iov_base = p;
 		resp->rqstp->rq_vec[v].iov_len = thislen;
@@ -3616,7 +3615,6 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 	}
 	read->rd_vlen = v;
 
-	len = maxcount;
 	nfserr = nfsd_readv(resp->rqstp, read->rd_fhp, file, read->rd_offset,
 			    resp->rqstp->rq_vec, read->rd_vlen, &maxcount,
 			    &eof);
@@ -3625,16 +3623,17 @@ static __be32 nfsd4_encode_readv(struct nfsd4_compoundres *resp,
 		return nfserr;
 	if (svc_encode_read_payload(resp->rqstp, starting_len + 8, maxcount))
 		return nfserr_io;
-	xdr_truncate_encode(xdr, starting_len + 8 + ((maxcount+3)&~3));
+	xdr_truncate_encode(xdr, starting_len + 8 + xdr_align_size(maxcount));
 
 	tmp = htonl(eof);
 	write_bytes_to_xdr_buf(xdr->buf, starting_len    , &tmp, 4);
 	tmp = htonl(maxcount);
 	write_bytes_to_xdr_buf(xdr->buf, starting_len + 4, &tmp, 4);
 
+	tmp = xdr_zero;
 	pad = (maxcount&3) ? 4 - (maxcount&3) : 0;
 	write_bytes_to_xdr_buf(xdr->buf, starting_len + 8 + maxcount,
-								&zzz, pad);
+								&tmp, pad);
 	return 0;
 
 }


^ permalink raw reply related

* [PATCH AUTOSEL 5.5 049/542] clk: meson: meson8b: make the CCF use the glitch-free mali mux
From: Sasha Levin @ 2020-02-14 15:40 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: Martin Blumenstingl, Jerome Brunet, Sasha Levin, linux-amlogic,
	linux-clk, linux-arm-kernel
In-Reply-To: <20200214154854.6746-1-sashal@kernel.org>

From: Martin Blumenstingl <martin.blumenstingl@googlemail.com>

[ Upstream commit 8daeaea99caabe24a0929fac17977ebfb882fa86 ]

The "mali_0" or "mali_1" clock trees should not be updated while the
clock is running. Enforce this by setting CLK_SET_RATE_GATE on the
"mali_0" and "mali_1" gates. This makes the CCF switch to the "mali_1"
tree when "mali_0" is currently active and vice versa, which is exactly
what the vendor driver does when updating the frequency of the mali
clock.

This fixes a potential hang when changing the GPU frequency at runtime.

Fixes: 74e1f2521f16ff ("clk: meson: meson8b: add the GPU clock tree")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com>
Signed-off-by: Jerome Brunet <jbrunet@baylibre.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/clk/meson/meson8b.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/clk/meson/meson8b.c b/drivers/clk/meson/meson8b.c
index 67e6691e080c1..8856ce476ccfa 100644
--- a/drivers/clk/meson/meson8b.c
+++ b/drivers/clk/meson/meson8b.c
@@ -1764,8 +1764,11 @@ static struct clk_regmap meson8b_hdmi_sys = {
 
 /*
  * The MALI IP is clocked by two identical clocks (mali_0 and mali_1)
- * muxed by a glitch-free switch on Meson8b and Meson8m2. Meson8 only
- * has mali_0 and no glitch-free mux.
+ * muxed by a glitch-free switch on Meson8b and Meson8m2. The CCF can
+ * actually manage this glitch-free mux because it does top-to-bottom
+ * updates the each clock tree and switches to the "inactive" one when
+ * CLK_SET_RATE_GATE is set.
+ * Meson8 only has mali_0 and no glitch-free mux.
  */
 static const struct clk_hw *meson8b_mali_0_1_parent_hws[] = {
 	&meson8b_xtal.hw,
@@ -1830,7 +1833,7 @@ static struct clk_regmap meson8b_mali_0 = {
 			&meson8b_mali_0_div.hw
 		},
 		.num_parents = 1,
-		.flags = CLK_SET_RATE_PARENT,
+		.flags = CLK_SET_RATE_GATE | CLK_SET_RATE_PARENT,
 	},
 };
 
@@ -1885,7 +1888,7 @@ static struct clk_regmap meson8b_mali_1 = {
 			&meson8b_mali_1_div.hw
 		},
 		.num_parents = 1,
-		.flags = CLK_SET_RATE_PARENT,
+		.flags = CLK_SET_RATE_GATE | CLK_SET_RATE_PARENT,
 	},
 };
 
-- 
2.20.1


^ permalink raw reply related

* [PATCH 08/19] target/arm: Perform fpdp_v2 check first
From: Richard Henderson @ 2020-02-14 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell
In-Reply-To: <20200214181547.21408-1-richard.henderson@linaro.org>

Shuffle the order of the checks so that we test the ISA
before we test anything else, such as the register arguments.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/translate-vfp.inc.c | 143 +++++++++++++++++----------------
 1 file changed, 72 insertions(+), 71 deletions(-)

diff --git a/target/arm/translate-vfp.inc.c b/target/arm/translate-vfp.inc.c
index 5290828d0d..0c55140127 100644
--- a/target/arm/translate-vfp.inc.c
+++ b/target/arm/translate-vfp.inc.c
@@ -200,13 +200,13 @@ static bool trans_VSEL(DisasContext *s, arg_VSEL *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
 
@@ -333,13 +333,13 @@ static bool trans_VMINMAXNM(DisasContext *s, arg_VMINMAXNM *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vn | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vn | a->vd) & 0x10)) {
         return false;
     }
 
@@ -419,13 +419,13 @@ static bool trans_VRINT(DisasContext *s, arg_VRINT *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
-        ((a->vm | a->vd) & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vm | a->vd) & 0x10)) {
         return false;
     }
 
@@ -483,12 +483,12 @@ static bool trans_VCVT(DisasContext *s, arg_VCVT *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (dp && !dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (dp && !dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -1308,12 +1308,12 @@ static bool do_vfp_3op_dp(DisasContext *s, VFPGen3OpDPFn *fn,
     TCGv_i64 f0, f1, fd;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vn | vm) & 0x10)) {
         return false;
     }
 
@@ -1457,12 +1457,12 @@ static bool do_vfp_2op_dp(DisasContext *s, VFPGen2OpDPFn *fn, int vd, int vm)
     int veclen = s->vec_len;
     TCGv_i64 f0, fd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist */
-    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist */
+    if (!dc_isar_feature(aa32_simd_r32, s) && ((vd | vm) & 0x10)) {
         return false;
     }
 
@@ -1821,12 +1821,13 @@ static bool trans_VFM_dp(DisasContext *s, arg_VFM_dp *a)
         return false;
     }
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && ((a->vd | a->vn | a->vm) & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) &&
+        ((a->vd | a->vn | a->vm) & 0x10)) {
         return false;
     }
 
@@ -1920,12 +1921,12 @@ static bool trans_VMOV_imm_dp(DisasContext *s, arg_VMOV_imm_dp *a)
 
     vd = a->vd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (vd & 0x10)) {
         return false;
     }
 
@@ -2059,6 +2060,10 @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
 {
     TCGv_i64 vd, vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     /* Vm/M bits must be zero for the Z variant */
     if (a->z && a->vm != 0) {
         return false;
@@ -2069,10 +2074,6 @@ static bool trans_VCMP_dp(DisasContext *s, arg_VCMP_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2133,6 +2134,10 @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
     TCGv_i32 tmp;
     TCGv_i64 vd;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
         return false;
     }
@@ -2142,10 +2147,6 @@ static bool trans_VCVT_f64_f16(DisasContext *s, arg_VCVT_f64_f16 *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2199,6 +2200,10 @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
     TCGv_i32 tmp;
     TCGv_i64 vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_fp16_dpconv, s)) {
         return false;
     }
@@ -2208,10 +2213,6 @@ static bool trans_VCVT_f16_f64(DisasContext *s, arg_VCVT_f16_f64 *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2259,6 +2260,10 @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
     TCGv_ptr fpst;
     TCGv_i64 tmp;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -2268,10 +2273,6 @@ static bool trans_VRINTR_dp(DisasContext *s, arg_VRINTR_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2320,6 +2321,10 @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
     TCGv_i64 tmp;
     TCGv_i32 tcg_rmode;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -2329,10 +2334,6 @@ static bool trans_VRINTZ_dp(DisasContext *s, arg_VRINTZ_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2379,6 +2380,10 @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
     TCGv_ptr fpst;
     TCGv_i64 tmp;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_vrint, s)) {
         return false;
     }
@@ -2388,10 +2393,6 @@ static bool trans_VRINTX_dp(DisasContext *s, arg_VRINTX_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2411,12 +2412,12 @@ static bool trans_VCVT_sp(DisasContext *s, arg_VCVT_sp *a)
     TCGv_i64 vd;
     TCGv_i32 vm;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -2439,12 +2440,12 @@ static bool trans_VCVT_dp(DisasContext *s, arg_VCVT_dp *a)
     TCGv_i64 vm;
     TCGv_i32 vd;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
@@ -2493,12 +2494,12 @@ static bool trans_VCVT_int_dp(DisasContext *s, arg_VCVT_int_dp *a)
     TCGv_i64 vd;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vd & 0x10)) {
         return false;
     }
 
@@ -2529,6 +2530,10 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
     TCGv_i32 vd;
     TCGv_i64 vm;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!dc_isar_feature(aa32_jscvt, s)) {
         return false;
     }
@@ -2538,10 +2543,6 @@ static bool trans_VJCVT(DisasContext *s, arg_VJCVT *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2622,6 +2623,10 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
     TCGv_ptr fpst;
     int frac_bits;
 
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+        return false;
+    }
+
     if (!arm_dc_feature(s, ARM_FEATURE_VFP3)) {
         return false;
     }
@@ -2631,10 +2636,6 @@ static bool trans_VCVT_fix_dp(DisasContext *s, arg_VCVT_fix_dp *a)
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
-        return false;
-    }
-
     if (!vfp_access_check(s)) {
         return true;
     }
@@ -2722,12 +2723,12 @@ static bool trans_VCVT_dp_int(DisasContext *s, arg_VCVT_dp_int *a)
     TCGv_i64 vm;
     TCGv_ptr fpst;
 
-    /* UNDEF accesses to D16-D31 if they don't exist. */
-    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
+    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
         return false;
     }
 
-    if (!dc_isar_feature(aa32_fpdp_v2, s)) {
+    /* UNDEF accesses to D16-D31 if they don't exist. */
+    if (!dc_isar_feature(aa32_simd_r32, s) && (a->vm & 0x10)) {
         return false;
     }
 
-- 
2.20.1



^ permalink raw reply related

* [PATCH 04/19] target/arm: Set MVFR0.FPSP for ARMv5 cpus
From: Richard Henderson @ 2020-02-14 18:15 UTC (permalink / raw)
  To: qemu-devel; +Cc: peter.maydell
In-Reply-To: <20200214181547.21408-1-richard.henderson@linaro.org>

We are going to convert FEATURE tests to ISAR tests,
so FPSP needs to be set for these cpus, like we have
already for FPDP.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/arm/cpu.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index f0bd419dd8..92006e56c8 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1869,10 +1869,11 @@ static void arm926_initfn(Object *obj)
      */
     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
     /*
-     * Similarly, we need to set MVFR0 fields to enable double precision
-     * and short vector support even though ARMv5 doesn't have this register.
+     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
+     * support even though ARMv5 doesn't have this register.
      */
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
 }
 
@@ -1911,10 +1912,11 @@ static void arm1026_initfn(Object *obj)
      */
     cpu->isar.id_isar1 = FIELD_DP32(cpu->isar.id_isar1, ID_ISAR1, JAZELLE, 1);
     /*
-     * Similarly, we need to set MVFR0 fields to enable double precision
-     * and short vector support even though ARMv5 doesn't have this register.
+     * Similarly, we need to set MVFR0 fields to enable vfp and short vector
+     * support even though ARMv5 doesn't have this register.
      */
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSHVEC, 1);
+    cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPSP, 1);
     cpu->isar.mvfr0 = FIELD_DP32(cpu->isar.mvfr0, MVFR0, FPDP, 1);
 
     {
-- 
2.20.1



^ permalink raw reply related


This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.