LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed
* Re: [PATCH v4 3/3] kselftest: Extend vDSO selftest to clock_getres
From: Vincenzo Frascino @ 2019-05-28 11:57 UTC (permalink / raw)
  To: Michael Ellerman, linux-arch, linuxppc-dev, linux-s390,
	linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, Paul Mackerras, Martin Schwidefsky,
	Thomas Gleixner, Shuah Khan
In-Reply-To: <87lfyrp0d2.fsf@concordia.ellerman.id.au>

Hi Michael,

thank you for your reply.

On 28/05/2019 07:19, Michael Ellerman wrote:
> Vincenzo Frascino <vincenzo.frascino@arm.com> writes:
> 
>> The current version of the multiarch vDSO selftest verifies only
>> gettimeofday.
>>
>> Extend the vDSO selftest to clock_getres, to verify that the
>> syscall and the vDSO library function return the same information.
>>
>> The extension has been used to verify the hrtimer_resoltion fix.
> 
> This is passing for me even without patch 1 applied, shouldn't it fail
> without the fix? What am I missing?
> 

This is correct, because during the refactoring process I missed an "n" :)

if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_sec·!=·y.tv_sec))

Should be:

if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_nsec·!=·y.tv_nsec))

My mistake, I am going to fix the test and re-post v5 of this set.

Without my patch if you pass "highres=off" to the kernel (as a command line
parameter) it leads to a broken implementation of clock_getres since the value
of CLOCK_REALTIME_RES does not change at runtime.

Expected result (with highres=off):

# uname -r
5.2.0-rc2
# ./vdso_clock_getres
clock_id: CLOCK_REALTIME [FAIL]
clock_id: CLOCK_BOOTTIME [PASS]
clock_id: CLOCK_TAI [PASS]
clock_id: CLOCK_REALTIME_COARSE [PASS]
clock_id: CLOCK_MONOTONIC [FAIL]
clock_id: CLOCK_MONOTONIC_RAW [PASS]
clock_id: CLOCK_MONOTONIC_COARSE [PASS]

The reason of this behavior is that the only clocks supported by getres on
powerpc are CLOCK_REALTIME and CLOCK_MONOTONIC, the rest on the clocks use
always syscalls.

> # uname -r
> 5.2.0-rc2-gcc-8.2.0
> 
> # ./vdso_clock_getres
> clock_id: CLOCK_REALTIME [PASS]
> clock_id: CLOCK_BOOTTIME [PASS]
> clock_id: CLOCK_TAI [PASS]
> clock_id: CLOCK_REALTIME_COARSE [PASS]
> clock_id: CLOCK_MONOTONIC [PASS]
> clock_id: CLOCK_MONOTONIC_RAW [PASS]
> clock_id: CLOCK_MONOTONIC_COARSE [PASS]
> 
> cheers
> 
>> Cc: Shuah Khan <shuah@kernel.org>
>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>> ---
>>
>> Note: This patch is independent from the others in this series, hence it
>> can be merged singularly by the kselftest maintainers.
>>
>>  tools/testing/selftests/vDSO/Makefile         |   2 +
>>  .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
>>  2 files changed, 126 insertions(+)
>>  create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c

-- 
Regards,
Vincenzo

^ permalink raw reply

* [PATCH v5 0/3] Fix vDSO clock_getres()
From: Vincenzo Frascino @ 2019-05-28 12:04 UTC (permalink / raw)
  To: linux-arch, linuxppc-dev, linux-s390, linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, Paul Mackerras, Martin Schwidefsky,
	Thomas Gleixner, vincenzo.frascino, Shuah Khan

clock_getres in the vDSO library has to preserve the same behaviour
of posix_get_hrtimer_res().

In particular, posix_get_hrtimer_res() does:
    sec = 0;
    ns = hrtimer_resolution;
and hrtimer_resolution depends on the enablement of the high
resolution timers that can happen either at compile or at run time.

A possible fix is to change the vdso implementation of clock_getres,
keeping a copy of hrtimer_resolution in vdso data and using that
directly [1].

This patchset implements the proposed fix for arm64, powerpc, s390,
nds32 and adds a test to verify that the syscall and the vdso library
implementation of clock_getres return the same values.

Even if these patches are unified by the same topic, there is no
dependency between them, hence they can be merged singularly by each
arch maintainer.

Note: arm64 and nds32 respective fixes have been merged in 5.2-rc1,
hence they have been removed from this series.

[1] https://marc.info/?l=linux-arm-kernel&m=155110381930196&w=2

Changes:
--------
v5:
  - Rebased on 5.2-rc2
  - Fixed a bug in kselftest.
v4:
  - Address review comments.
v3:
  - Rebased on 5.2-rc1.
  - Address review comments.
v2:
  - Rebased on 5.1-rc5.
  - Addressed review comments.

Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>

Vincenzo Frascino (3):
  powerpc: Fix vDSO clock_getres()
  s390: Fix vDSO clock_getres()
  kselftest: Extend vDSO selftest to clock_getres

 arch/powerpc/include/asm/vdso_datapage.h      |   2 +
 arch/powerpc/kernel/asm-offsets.c             |   2 +-
 arch/powerpc/kernel/time.c                    |   1 +
 arch/powerpc/kernel/vdso32/gettimeofday.S     |   7 +-
 arch/powerpc/kernel/vdso64/gettimeofday.S     |   7 +-
 arch/s390/include/asm/vdso.h                  |   1 +
 arch/s390/kernel/asm-offsets.c                |   2 +-
 arch/s390/kernel/time.c                       |   1 +
 arch/s390/kernel/vdso32/clock_getres.S        |  12 +-
 arch/s390/kernel/vdso64/clock_getres.S        |  10 +-
 tools/testing/selftests/vDSO/Makefile         |   2 +
 .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
 12 files changed, 155 insertions(+), 16 deletions(-)
 create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c

-- 
2.21.0


^ permalink raw reply

* [PATCH v5 1/3] powerpc: Fix vDSO clock_getres()
From: Vincenzo Frascino @ 2019-05-28 12:04 UTC (permalink / raw)
  To: linux-arch, linuxppc-dev, linux-s390, linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, stable, Paul Mackerras,
	Martin Schwidefsky, Thomas Gleixner, vincenzo.frascino,
	Shuah Khan
In-Reply-To: <20190528120446.48911-1-vincenzo.frascino@arm.com>

clock_getres in the vDSO library has to preserve the same behaviour
of posix_get_hrtimer_res().

In particular, posix_get_hrtimer_res() does:
    sec = 0;
    ns = hrtimer_resolution;
and hrtimer_resolution depends on the enablement of the high
resolution timers that can happen either at compile or at run time.

Fix the powerpc vdso implementation of clock_getres keeping a copy of
hrtimer_resolution in vdso data and using that directly.

Fixes: a7f290dad32e ("[PATCH] powerpc: Merge vdso's and add vdso support
to 32 bits kernel")
Cc: stable@vger.kernel.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
---

Note: This patch is independent from the others in this series, hence it
can be merged singularly by the powerpc maintainers.

 arch/powerpc/include/asm/vdso_datapage.h  | 2 ++
 arch/powerpc/kernel/asm-offsets.c         | 2 +-
 arch/powerpc/kernel/time.c                | 1 +
 arch/powerpc/kernel/vdso32/gettimeofday.S | 7 +++++--
 arch/powerpc/kernel/vdso64/gettimeofday.S | 7 +++++--
 5 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index bbc06bd72b1f..4333b9a473dc 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -86,6 +86,7 @@ struct vdso_data {
 	__s32 wtom_clock_nsec;			/* Wall to monotonic clock nsec */
 	__s64 wtom_clock_sec;			/* Wall to monotonic clock sec */
 	struct timespec stamp_xtime;		/* xtime as at tb_orig_stamp */
+	__u32 hrtimer_res;			/* hrtimer resolution */
    	__u32 syscall_map_64[SYSCALL_MAP_SIZE]; /* map of syscalls  */
    	__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
 };
@@ -107,6 +108,7 @@ struct vdso_data {
 	__s32 wtom_clock_nsec;
 	struct timespec stamp_xtime;	/* xtime as at tb_orig_stamp */
 	__u32 stamp_sec_fraction;	/* fractional seconds of stamp_xtime */
+	__u32 hrtimer_res;		/* hrtimer resolution */
    	__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
 	__u32 dcache_block_size;	/* L1 d-cache block size     */
 	__u32 icache_block_size;	/* L1 i-cache block size     */
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 8e02444e9d3d..dfc40f29f2b9 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -389,6 +389,7 @@ int main(void)
 	OFFSET(WTOM_CLOCK_NSEC, vdso_data, wtom_clock_nsec);
 	OFFSET(STAMP_XTIME, vdso_data, stamp_xtime);
 	OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction);
+	OFFSET(CLOCK_REALTIME_RES, vdso_data, hrtimer_res);
 	OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size);
 	OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size);
 	OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_data, icache_log_block_size);
@@ -419,7 +420,6 @@ int main(void)
 	DEFINE(CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
 	DEFINE(CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
 	DEFINE(NSEC_PER_SEC, NSEC_PER_SEC);
-	DEFINE(CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC);
 
 #ifdef CONFIG_BUG
 	DEFINE(BUG_ENTRY_SIZE, sizeof(struct bug_entry));
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 325d60633dfa..4ea4e9d7a58e 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -963,6 +963,7 @@ void update_vsyscall(struct timekeeper *tk)
 	vdso_data->wtom_clock_nsec = tk->wall_to_monotonic.tv_nsec;
 	vdso_data->stamp_xtime = xt;
 	vdso_data->stamp_sec_fraction = frac_sec;
+	vdso_data->hrtimer_res = hrtimer_resolution;
 	smp_wmb();
 	++(vdso_data->tb_update_count);
 }
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index afd516b572f8..2b5f9e83c610 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -160,12 +160,15 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
 	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
 	bne	cr0,99f
 
+	mflr	r12
+  .cfi_register lr,r12
+	bl	__get_datapage@local
+	lwz	r5,CLOCK_REALTIME_RES(r3)
+	mtlr	r12
 	li	r3,0
 	cmpli	cr0,r4,0
 	crclr	cr0*4+so
 	beqlr
-	lis	r5,CLOCK_REALTIME_RES@h
-	ori	r5,r5,CLOCK_REALTIME_RES@l
 	stw	r3,TSPC32_TV_SEC(r4)
 	stw	r5,TSPC32_TV_NSEC(r4)
 	blr
diff --git a/arch/powerpc/kernel/vdso64/gettimeofday.S b/arch/powerpc/kernel/vdso64/gettimeofday.S
index 1f324c28705b..f07730f73d5e 100644
--- a/arch/powerpc/kernel/vdso64/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso64/gettimeofday.S
@@ -190,12 +190,15 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
 	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
 	bne	cr0,99f
 
+	mflr	r12
+  .cfi_register lr,r12
+	bl	V_LOCAL_FUNC(__get_datapage)
+	lwz	r5,CLOCK_REALTIME_RES(r3)
+	mtlr	r12
 	li	r3,0
 	cmpldi	cr0,r4,0
 	crclr	cr0*4+so
 	beqlr
-	lis	r5,CLOCK_REALTIME_RES@h
-	ori	r5,r5,CLOCK_REALTIME_RES@l
 	std	r3,TSPC64_TV_SEC(r4)
 	std	r5,TSPC64_TV_NSEC(r4)
 	blr
-- 
2.21.0


^ permalink raw reply related

* [PATCH v5 2/3] s390: Fix vDSO clock_getres()
From: Vincenzo Frascino @ 2019-05-28 12:04 UTC (permalink / raw)
  To: linux-arch, linuxppc-dev, linux-s390, linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, Paul Mackerras, Martin Schwidefsky,
	Thomas Gleixner, vincenzo.frascino, Shuah Khan
In-Reply-To: <20190528120446.48911-1-vincenzo.frascino@arm.com>

clock_getres in the vDSO library has to preserve the same behaviour
of posix_get_hrtimer_res().

In particular, posix_get_hrtimer_res() does:
    sec = 0;
    ns = hrtimer_resolution;
and hrtimer_resolution depends on the enablement of the high
resolution timers that can happen either at compile or at run time.

Fix the s390 vdso implementation of clock_getres keeping a copy of
hrtimer_resolution in vdso data and using that directly.

Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
---

Note: This patch is independent from the others in this series, hence it
can be merged singularly by the s390 maintainers.

 arch/s390/include/asm/vdso.h           |  1 +
 arch/s390/kernel/asm-offsets.c         |  2 +-
 arch/s390/kernel/time.c                |  1 +
 arch/s390/kernel/vdso32/clock_getres.S | 12 +++++++-----
 arch/s390/kernel/vdso64/clock_getres.S | 10 +++++-----
 5 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/arch/s390/include/asm/vdso.h b/arch/s390/include/asm/vdso.h
index 169d7604eb80..f3ba84fa9bd1 100644
--- a/arch/s390/include/asm/vdso.h
+++ b/arch/s390/include/asm/vdso.h
@@ -36,6 +36,7 @@ struct vdso_data {
 	__u32 tk_shift;			/* Shift used for xtime_nsec	0x60 */
 	__u32 ts_dir;			/* TOD steering direction	0x64 */
 	__u64 ts_end;			/* TOD steering end		0x68 */
+	__u32 hrtimer_res;		/* hrtimer resolution		0x70 */
 };
 
 struct vdso_per_cpu_data {
diff --git a/arch/s390/kernel/asm-offsets.c b/arch/s390/kernel/asm-offsets.c
index 41ac4ad21311..4a229a60b24a 100644
--- a/arch/s390/kernel/asm-offsets.c
+++ b/arch/s390/kernel/asm-offsets.c
@@ -76,6 +76,7 @@ int main(void)
 	OFFSET(__VDSO_TK_SHIFT, vdso_data, tk_shift);
 	OFFSET(__VDSO_TS_DIR, vdso_data, ts_dir);
 	OFFSET(__VDSO_TS_END, vdso_data, ts_end);
+	OFFSET(__VDSO_CLOCK_REALTIME_RES, vdso_data, hrtimer_res);
 	OFFSET(__VDSO_ECTG_BASE, vdso_per_cpu_data, ectg_timer_base);
 	OFFSET(__VDSO_ECTG_USER, vdso_per_cpu_data, ectg_user_time);
 	OFFSET(__VDSO_CPU_NR, vdso_per_cpu_data, cpu_nr);
@@ -87,7 +88,6 @@ int main(void)
 	DEFINE(__CLOCK_REALTIME_COARSE, CLOCK_REALTIME_COARSE);
 	DEFINE(__CLOCK_MONOTONIC_COARSE, CLOCK_MONOTONIC_COARSE);
 	DEFINE(__CLOCK_THREAD_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID);
-	DEFINE(__CLOCK_REALTIME_RES, MONOTONIC_RES_NSEC);
 	DEFINE(__CLOCK_COARSE_RES, LOW_RES_NSEC);
 	BLANK();
 	/* idle data offsets */
diff --git a/arch/s390/kernel/time.c b/arch/s390/kernel/time.c
index e8766beee5ad..8ea9db599d38 100644
--- a/arch/s390/kernel/time.c
+++ b/arch/s390/kernel/time.c
@@ -310,6 +310,7 @@ void update_vsyscall(struct timekeeper *tk)
 
 	vdso_data->tk_mult = tk->tkr_mono.mult;
 	vdso_data->tk_shift = tk->tkr_mono.shift;
+	vdso_data->hrtimer_res = hrtimer_resolution;
 	smp_wmb();
 	++vdso_data->tb_update_count;
 }
diff --git a/arch/s390/kernel/vdso32/clock_getres.S b/arch/s390/kernel/vdso32/clock_getres.S
index eaf9cf1417f6..fecd7684c645 100644
--- a/arch/s390/kernel/vdso32/clock_getres.S
+++ b/arch/s390/kernel/vdso32/clock_getres.S
@@ -18,20 +18,22 @@
 __kernel_clock_getres:
 	CFI_STARTPROC
 	basr	%r1,0
-	la	%r1,4f-.(%r1)
+10:	al	%r1,4f-10b(%r1)
+	l	%r0,__VDSO_CLOCK_REALTIME_RES(%r1)
 	chi	%r2,__CLOCK_REALTIME
 	je	0f
 	chi	%r2,__CLOCK_MONOTONIC
 	je	0f
-	la	%r1,5f-4f(%r1)
+	basr	%r1,0
+	la	%r1,5f-.(%r1)
+	l	%r0,0(%r1)
 	chi	%r2,__CLOCK_REALTIME_COARSE
 	je	0f
 	chi	%r2,__CLOCK_MONOTONIC_COARSE
 	jne	3f
 0:	ltr	%r3,%r3
 	jz	2f				/* res == NULL */
-1:	l	%r0,0(%r1)
-	xc	0(4,%r3),0(%r3)			/* set tp->tv_sec to zero */
+1:	xc	0(4,%r3),0(%r3)			/* set tp->tv_sec to zero */
 	st	%r0,4(%r3)			/* store tp->tv_usec */
 2:	lhi	%r2,0
 	br	%r14
@@ -39,6 +41,6 @@ __kernel_clock_getres:
 	svc	0
 	br	%r14
 	CFI_ENDPROC
-4:	.long	__CLOCK_REALTIME_RES
+4:	.long	_vdso_data - 10b
 5:	.long	__CLOCK_COARSE_RES
 	.size	__kernel_clock_getres,.-__kernel_clock_getres
diff --git a/arch/s390/kernel/vdso64/clock_getres.S b/arch/s390/kernel/vdso64/clock_getres.S
index 081435398e0a..022b58c980db 100644
--- a/arch/s390/kernel/vdso64/clock_getres.S
+++ b/arch/s390/kernel/vdso64/clock_getres.S
@@ -17,12 +17,14 @@
 	.type  __kernel_clock_getres,@function
 __kernel_clock_getres:
 	CFI_STARTPROC
-	larl	%r1,4f
+	larl	%r1,3f
+	lg	%r0,0(%r1)
 	cghi	%r2,__CLOCK_REALTIME_COARSE
 	je	0f
 	cghi	%r2,__CLOCK_MONOTONIC_COARSE
 	je	0f
-	larl	%r1,3f
+	larl	%r1,_vdso_data
+	l	%r0,__VDSO_CLOCK_REALTIME_RES(%r1)
 	cghi	%r2,__CLOCK_REALTIME
 	je	0f
 	cghi	%r2,__CLOCK_MONOTONIC
@@ -36,7 +38,6 @@ __kernel_clock_getres:
 	jz	2f
 0:	ltgr	%r3,%r3
 	jz	1f				/* res == NULL */
-	lg	%r0,0(%r1)
 	xc	0(8,%r3),0(%r3)			/* set tp->tv_sec to zero */
 	stg	%r0,8(%r3)			/* store tp->tv_usec */
 1:	lghi	%r2,0
@@ -45,6 +46,5 @@ __kernel_clock_getres:
 	svc	0
 	br	%r14
 	CFI_ENDPROC
-3:	.quad	__CLOCK_REALTIME_RES
-4:	.quad	__CLOCK_COARSE_RES
+3:	.quad	__CLOCK_COARSE_RES
 	.size	__kernel_clock_getres,.-__kernel_clock_getres
-- 
2.21.0


^ permalink raw reply related

* [PATCH v5 3/3] kselftest: Extend vDSO selftest to clock_getres
From: Vincenzo Frascino @ 2019-05-28 12:04 UTC (permalink / raw)
  To: linux-arch, linuxppc-dev, linux-s390, linux-kselftest
  Cc: Arnd Bergmann, Heiko Carstens, Paul Mackerras, Martin Schwidefsky,
	Thomas Gleixner, vincenzo.frascino, Shuah Khan
In-Reply-To: <20190528120446.48911-1-vincenzo.frascino@arm.com>

The current version of the multiarch vDSO selftest verifies only
gettimeofday.

Extend the vDSO selftest to clock_getres, to verify that the
syscall and the vDSO library function return the same information.

The extension has been used to verify the hrtimer_resoltion fix.

Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
---

Note: This patch is independent from the others in this series, hence it
can be merged singularly by the kselftest maintainers.

 tools/testing/selftests/vDSO/Makefile         |   2 +
 .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
 2 files changed, 126 insertions(+)
 create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c

diff --git a/tools/testing/selftests/vDSO/Makefile b/tools/testing/selftests/vDSO/Makefile
index 9e03d61f52fd..d5c5bfdf1ac1 100644
--- a/tools/testing/selftests/vDSO/Makefile
+++ b/tools/testing/selftests/vDSO/Makefile
@@ -5,6 +5,7 @@ uname_M := $(shell uname -m 2>/dev/null || echo not)
 ARCH ?= $(shell echo $(uname_M) | sed -e s/i.86/x86/ -e s/x86_64/x86/)
 
 TEST_GEN_PROGS := $(OUTPUT)/vdso_test
+TEST_GEN_PROGS += $(OUTPUT)/vdso_clock_getres
 ifeq ($(ARCH),x86)
 TEST_GEN_PROGS += $(OUTPUT)/vdso_standalone_test_x86
 endif
@@ -18,6 +19,7 @@ endif
 
 all: $(TEST_GEN_PROGS)
 $(OUTPUT)/vdso_test: parse_vdso.c vdso_test.c
+$(OUTPUT)/vdso_clock_getres: vdso_clock_getres.c
 $(OUTPUT)/vdso_standalone_test_x86: vdso_standalone_test_x86.c parse_vdso.c
 	$(CC) $(CFLAGS) $(CFLAGS_vdso_standalone_test_x86) \
 		vdso_standalone_test_x86.c parse_vdso.c \
diff --git a/tools/testing/selftests/vDSO/vdso_clock_getres.c b/tools/testing/selftests/vDSO/vdso_clock_getres.c
new file mode 100644
index 000000000000..15dcee16ff72
--- /dev/null
+++ b/tools/testing/selftests/vDSO/vdso_clock_getres.c
@@ -0,0 +1,124 @@
+// SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note
+/*
+ * vdso_clock_getres.c: Sample code to test clock_getres.
+ * Copyright (c) 2019 Arm Ltd.
+ *
+ * Compile with:
+ * gcc -std=gnu99 vdso_clock_getres.c
+ *
+ * Tested on ARM, ARM64, MIPS32, x86 (32-bit and 64-bit),
+ * Power (32-bit and 64-bit), S390x (32-bit and 64-bit).
+ * Might work on other architectures.
+ */
+
+#define _GNU_SOURCE
+#include <elf.h>
+#include <err.h>
+#include <fcntl.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <time.h>
+#include <sys/auxv.h>
+#include <sys/mman.h>
+#include <sys/time.h>
+#include <unistd.h>
+#include <sys/syscall.h>
+
+#include "../kselftest.h"
+
+static long syscall_clock_getres(clockid_t _clkid, struct timespec *_ts)
+{
+	long ret;
+
+	ret = syscall(SYS_clock_getres, _clkid, _ts);
+
+	return ret;
+}
+
+const char *vdso_clock_name[12] = {
+	"CLOCK_REALTIME",
+	"CLOCK_MONOTONIC",
+	"CLOCK_PROCESS_CPUTIME_ID",
+	"CLOCK_THREAD_CPUTIME_ID",
+	"CLOCK_MONOTONIC_RAW",
+	"CLOCK_REALTIME_COARSE",
+	"CLOCK_MONOTONIC_COARSE",
+	"CLOCK_BOOTTIME",
+	"CLOCK_REALTIME_ALARM",
+	"CLOCK_BOOTTIME_ALARM",
+	"CLOCK_SGI_CYCLE",
+	"CLOCK_TAI",
+};
+
+/*
+ * This function calls clock_getres in vdso and by system call
+ * with different values for clock_id.
+ *
+ * Example of output:
+ *
+ * clock_id: CLOCK_REALTIME [PASS]
+ * clock_id: CLOCK_BOOTTIME [PASS]
+ * clock_id: CLOCK_TAI [PASS]
+ * clock_id: CLOCK_REALTIME_COARSE [PASS]
+ * clock_id: CLOCK_MONOTONIC [PASS]
+ * clock_id: CLOCK_MONOTONIC_RAW [PASS]
+ * clock_id: CLOCK_MONOTONIC_COARSE [PASS]
+ */
+static inline int vdso_test_clock(unsigned int clock_id)
+{
+	struct timespec x, y;
+
+	printf("clock_id: %s", vdso_clock_name[clock_id]);
+	clock_getres(clock_id, &x);
+	syscall_clock_getres(clock_id, &y);
+
+	if ((x.tv_sec != y.tv_sec) || (x.tv_nsec != y.tv_nsec)) {
+		printf(" [FAIL]\n");
+		return KSFT_FAIL;
+	}
+
+	printf(" [PASS]\n");
+	return KSFT_PASS;
+}
+
+int main(int argc, char **argv)
+{
+	int ret;
+
+#if _POSIX_TIMERS > 0
+
+#ifdef CLOCK_REALTIME
+	ret = vdso_test_clock(CLOCK_REALTIME);
+#endif
+
+#ifdef CLOCK_BOOTTIME
+	ret += vdso_test_clock(CLOCK_BOOTTIME);
+#endif
+
+#ifdef CLOCK_TAI
+	ret += vdso_test_clock(CLOCK_TAI);
+#endif
+
+#ifdef CLOCK_REALTIME_COARSE
+	ret += vdso_test_clock(CLOCK_REALTIME_COARSE);
+#endif
+
+#ifdef CLOCK_MONOTONIC
+	ret += vdso_test_clock(CLOCK_MONOTONIC);
+#endif
+
+#ifdef CLOCK_MONOTONIC_RAW
+	ret += vdso_test_clock(CLOCK_MONOTONIC_RAW);
+#endif
+
+#ifdef CLOCK_MONOTONIC_COARSE
+	ret += vdso_test_clock(CLOCK_MONOTONIC_COARSE);
+#endif
+
+#endif
+	if (ret > 0)
+		return KSFT_FAIL;
+
+	return KSFT_PASS;
+}
-- 
2.21.0


^ permalink raw reply related

* Re: [PATCH] powerpc/configs: Rename foo_basic_defconfig to foo_base.config
From: Christoph Hellwig @ 2019-05-28 12:10 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, linux-kbuild
In-Reply-To: <20190528081614.26096-1-mpe@ellerman.id.au>

On Tue, May 28, 2019 at 06:16:14PM +1000, Michael Ellerman wrote:
> We have several "defconfigs" that are not actually full defconfigs
> they are just a base set of options which are then merged with other
> fragments to produce a working defconfig.
> 
> The most obvious example is corenet_basic_defconfig which only
> contains one symbol CONFIG_CORENET_GENERIC=y. But there is also
> mpc85xx_base_defconfig which doesn't actually enable CONFIG_PPC_85xx.
> 
> To avoid confusion, rename these config fragments to "foo_base.config"
> to make it clearer that they are not full defconfigs.

Adding linux-kbuild, maybe we can make the handling of these fragments
generic and actually document it..

>
> Reported-by: Christophe Leroy <christophe.leroy@c-s.fr>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/Makefile                                | 12 ++++++------
>  .../{corenet_basic_defconfig => corenet_base.config} |  0
>  .../{mpc85xx_basic_defconfig => mpc85xx_base.config} |  0
>  .../{mpc86xx_basic_defconfig => mpc86xx_base.config} |  0
>  4 files changed, 6 insertions(+), 6 deletions(-)
>  rename arch/powerpc/configs/{corenet_basic_defconfig => corenet_base.config} (100%)
>  rename arch/powerpc/configs/{mpc85xx_basic_defconfig => mpc85xx_base.config} (100%)
>  rename arch/powerpc/configs/{mpc86xx_basic_defconfig => mpc86xx_base.config} (100%)
> 
> diff --git a/arch/powerpc/Makefile b/arch/powerpc/Makefile
> index c345b79414a9..94f735db2229 100644
> --- a/arch/powerpc/Makefile
> +++ b/arch/powerpc/Makefile
> @@ -333,32 +333,32 @@ PHONY += powernv_be_defconfig
>  
>  PHONY += mpc85xx_defconfig
>  mpc85xx_defconfig:
> -	$(call merge_into_defconfig,mpc85xx_basic_defconfig,\
> +	$(call merge_into_defconfig,mpc85xx_base.config,\
>  		85xx-32bit 85xx-hw fsl-emb-nonhw)
>  
>  PHONY += mpc85xx_smp_defconfig
>  mpc85xx_smp_defconfig:
> -	$(call merge_into_defconfig,mpc85xx_basic_defconfig,\
> +	$(call merge_into_defconfig,mpc85xx_base.config,\
>  		85xx-32bit 85xx-smp 85xx-hw fsl-emb-nonhw)
>  
>  PHONY += corenet32_smp_defconfig
>  corenet32_smp_defconfig:
> -	$(call merge_into_defconfig,corenet_basic_defconfig,\
> +	$(call merge_into_defconfig,corenet_base.config,\
>  		85xx-32bit 85xx-smp 85xx-hw fsl-emb-nonhw dpaa)
>  
>  PHONY += corenet64_smp_defconfig
>  corenet64_smp_defconfig:
> -	$(call merge_into_defconfig,corenet_basic_defconfig,\
> +	$(call merge_into_defconfig,corenet_base.config,\
>  		85xx-64bit 85xx-smp altivec 85xx-hw fsl-emb-nonhw dpaa)
>  
>  PHONY += mpc86xx_defconfig
>  mpc86xx_defconfig:
> -	$(call merge_into_defconfig,mpc86xx_basic_defconfig,\
> +	$(call merge_into_defconfig,mpc86xx_base.config,\
>  		86xx-hw fsl-emb-nonhw)
>  
>  PHONY += mpc86xx_smp_defconfig
>  mpc86xx_smp_defconfig:
> -	$(call merge_into_defconfig,mpc86xx_basic_defconfig,\
> +	$(call merge_into_defconfig,mpc86xx_base.config,\
>  		86xx-smp 86xx-hw fsl-emb-nonhw)
>  
>  PHONY += ppc32_allmodconfig
> diff --git a/arch/powerpc/configs/corenet_basic_defconfig b/arch/powerpc/configs/corenet_base.config
> similarity index 100%
> rename from arch/powerpc/configs/corenet_basic_defconfig
> rename to arch/powerpc/configs/corenet_base.config
> diff --git a/arch/powerpc/configs/mpc85xx_basic_defconfig b/arch/powerpc/configs/mpc85xx_base.config
> similarity index 100%
> rename from arch/powerpc/configs/mpc85xx_basic_defconfig
> rename to arch/powerpc/configs/mpc85xx_base.config
> diff --git a/arch/powerpc/configs/mpc86xx_basic_defconfig b/arch/powerpc/configs/mpc86xx_base.config
> similarity index 100%
> rename from arch/powerpc/configs/mpc86xx_basic_defconfig
> rename to arch/powerpc/configs/mpc86xx_base.config
> -- 
> 2.20.1
> 
---end quoted text---

^ permalink raw reply

* Re: [PATCH RFC 0/5] Remove some notrace RCU APIs
From: Paul E. McKenney @ 2019-05-28 12:24 UTC (permalink / raw)
  To: Joel Fernandes
  Cc: rcu, Jonathan Corbet, linux-doc, Lai Jiangshan, Josh Triplett,
	Steven Rostedt, linux-kernel, Miguel Ojeda, Ingo Molnar,
	Mathieu Desnoyers, kvm-ppc, linuxppc-dev
In-Reply-To: <20190525181407.GA220326@google.com>

On Sat, May 25, 2019 at 02:14:07PM -0400, Joel Fernandes wrote:
> On Sat, May 25, 2019 at 08:50:35AM -0700, Paul E. McKenney wrote:
> > On Sat, May 25, 2019 at 10:19:54AM -0400, Joel Fernandes wrote:
> > > On Sat, May 25, 2019 at 07:08:26AM -0400, Steven Rostedt wrote:
> > > > On Sat, 25 May 2019 04:14:44 -0400
> > > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > 
> > > > > > I guess the difference between the _raw_notrace and just _raw variants
> > > > > > is that _notrace ones do a rcu_check_sparse(). Don't we want to keep
> > > > > > that check?  
> > > > > 
> > > > > This is true.
> > > > > 
> > > > > Since the users of _raw_notrace are very few, is it worth keeping this API
> > > > > just for sparse checking? The API naming is also confusing. I was expecting
> > > > > _raw_notrace to do fewer checks than _raw, instead of more. Honestly, I just
> > > > > want to nuke _raw_notrace as done in this series and later we can introduce a
> > > > > sparse checking version of _raw if need-be. The other option could be to
> > > > > always do sparse checking for _raw however that used to be the case and got
> > > > > changed in http://lists.infradead.org/pipermail/linux-afs/2016-July/001016.html
> > > > 
> > > > What if we just rename _raw to _raw_nocheck, and _raw_notrace to _raw ?
> > > 
> > > That would also mean changing 160 usages of _raw to _raw_nocheck in the
> > > kernel :-/.
> > > 
> > > The tracing usage of _raw_notrace is only like 2 or 3 users. Can we just call
> > > rcu_check_sparse directly in the calling code for those and eliminate the APIs?
> > > 
> > > I wonder what Paul thinks about the matter as well.
> > 
> > My thought is that it is likely that a goodly number of the current uses
> > of _raw should really be some form of _check, with lockdep expressions
> > spelled out.  Not that working out what exactly those lockdep expressions
> > should be is necessarily a trivial undertaking.  ;-)
> 
> Yes, currently where I am a bit stuck is the rcu_dereference_raw()
> cannot possibly know what SRCU domain it is under, so lockdep cannot check if
> an SRCU lock is held without the user also passing along the SRCU domain. I
> am trying to change lockdep to see if it can check if *any* srcu domain lock
> is held (regardless of which one) and complain if none are. This is at least
> better than no check at all.
> 
> However, I think it gets tricky for mutexes. If you have something like:
> mutex_lock(some_mutex);
> p = rcu_dereference_raw(gp);
> mutex_unlock(some_mutex);
> 
> This might be a perfectly valid invocation of _raw, however my checks (patch
> is still cooking) trigger a lockdep warning becase _raw cannot know that this
> is Ok. lockdep thinks it is not in a reader section. This then gets into the
> territory of a new rcu_derference_raw_protected(gp, assert_held(some_mutex))
> which sucks because its yet another API. To circumvent this issue, can we
> just have callers of rcu_dereference_raw ensure that they call
> rcu_read_lock() if they are protecting dereferences by a mutex? That would
> make things a lot easier and also may be Ok since rcu_read_lock is quite
> cheap.

Why not just rcu_dereference_protected(lockdep_is_held(some_mutex))?
The API is already there, and no need for spurious readers.

> > That aside, if we are going to change the name of an API that is
> > used 160 places throughout the tree, we would need to have a pretty
> > good justification.  Without such a justification, it will just look
> > like pointless churn to the various developers and maintainers on the
> > receiving end of the patches.
> 
> Actually, the API name change is not something I want to do, it is Steven
> suggestion. My suggestion is let us just delete _raw_notrace and just use the
> _raw API for tracing, since _raw doesn't do any tracing anyway. Steve pointed
> that _raw_notrace does sparse checking unlike _raw, but I think that isn't an
> issue since _raw doesn't do such checking at the moment anyway.. (if possible
> check my cover letter again for details/motivation of this series).

Understood, but regardless of who suggested it, if we are to go through
with it, good justification will be required.  ;-)

							Thanx, Paul

> thanks!
> 
>  - Joel
> 
> > 							Thanx, Paul
> > 
> > > thanks, Steven!
> > > 
> > 
> 


^ permalink raw reply

* Re: [alsa-devel] [PATCH] ASoC: fsl: sai: Fix clock source for mclk0
From: Daniel Baluta @ 2019-05-28 12:39 UTC (permalink / raw)
  To: Nicolin Chen
  Cc: alsa-devel@alsa-project.org, timur@kernel.org,
	Xiubo.Lee@gmail.com, Daniel Baluta, S.j. Wang, tiwai@suse.com,
	lgirdwood@gmail.com, broonie@kernel.org, dl-linux-imx,
	festevam@gmail.com, linuxppc-dev@lists.ozlabs.org,
	linux-kernel@vger.kernel.org
In-Reply-To: <20190421082627.GB8304@Asurada>

On Sun, Apr 21, 2019 at 11:26 AM Nicolin Chen <nicoleotsuka@gmail.com> wrote:
>
> On Sun, Apr 21, 2019 at 01:04:39AM -0700, Nicolin Chen wrote:
> > On Sun, Apr 21, 2019 at 10:26:40AM +0300, Daniel Baluta wrote:
> > > > Firstly, according to your commit message, neither imx8qm nor
> > > > imx6sx has an "mclk0" clock in the clock list. Either of them
> > > > starts with "mclk1". So, before you change the driver, I don't
> > > > think it's even a right thing to define an "mclk0" in the DT.
> > >
> > > From what I understand mclk0 means option 00b of MSEL bits which is:
> > > * busclk for i.MX8
> > > * mclk1 for i.MX6/7.
> >
> > MSEL bit is used for an internal clock MUX to select four clock
> > inputs. However,  these four clock inputs aren't exactly 1:1 of
> > SAI's inputs. As fas as I can tell, SAI only has one bus clock
> > and three MCLK[1-3]; the internal clock MUX maps the bus clock
> > or MCLK1 to its input0, and then linearly maps MCLK[1-3] to its
> > inputs[1-3]. So it doesn't sound right to me that you define an
> > "MCLK0" in the DT, as it's supposed to describe input clocks of
> > SAI block, other than its internal clock MUX's.
>
> Daniel, I think I's saying this too confident, though I do feel
> so :) But if you can prove me wrong and justify that there is an
> "MCLK0" as an external input of the SAI block, I will agree with
> this change.

Looking inside the RTL for SAI on i.MX8 I found that there
is a MUX with 4 inputs exactly as RM says:
- bus
- master clock 1
- master clock 2
- master clock 3

My point is that the DT is modelling the internal clock MUX
used for SAI to select its clock source.

thanks,
Daniel.

^ permalink raw reply

* Re: [PATCH v7 1/1] iommu: enhance IOMMU dma mode build options
From: Leizhen (ThunderTown) @ 2019-05-28 13:05 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: linux-ia64, Sebastian Ott, linux-doc, Hanjun Guo, Heiko Carstens,
	Paul Mackerras, H . Peter Anvin, linux-s390, Jonathan Corbet,
	Jean-Philippe Brucker, x86, Ingo Molnar, Fenghua Yu, Will Deacon,
	John Garry, linuxppc-dev, Borislav Petkov, Thomas Gleixner,
	Gerald Schaefer, Tony Luck, David Woodhouse, linux-kernel, iommu,
	Martin Schwidefsky, Robin Murphy
In-Reply-To: <20190527142140.GH8420@8bytes.org>



On 2019/5/27 22:21, Joerg Roedel wrote:
> Hi Zhen Lei,
> 
> On Mon, May 20, 2019 at 09:59:47PM +0800, Zhen Lei wrote:
>>  arch/ia64/kernel/pci-dma.c                |  2 +-
>>  arch/powerpc/platforms/powernv/pci-ioda.c |  3 ++-
>>  arch/s390/pci/pci_dma.c                   |  2 +-
>>  arch/x86/kernel/pci-dma.c                 |  7 ++---
>>  drivers/iommu/Kconfig                     | 44 ++++++++++++++++++++++++++-----
>>  drivers/iommu/amd_iommu_init.c            |  3 ++-
>>  drivers/iommu/intel-iommu.c               |  2 +-
>>  drivers/iommu/iommu.c                     |  3 ++-
>>  8 files changed, 48 insertions(+), 18 deletions(-)
> 
> This needs Acks from the arch maintainers of ia64, powerpc, s390 and
> x86, at least.
> 
> It is easier for them if you split it up into the Kconfig change and
> separete patches per arch and per iommu driver. Then collect the Acks on
> the individual patches.

OK, thanks. I will do it tomorrow.

> 
> Thanks,
> 
> 	Joerg
> 
> .
> 

-- 
Thanks!
BestRegards


^ permalink raw reply

* Re: [PATCH v10 12/12] ima: Store the measurement again when appraising a modsig
From: Mimi Zohar @ 2019-05-28 14:09 UTC (permalink / raw)
  To: Thiago Jung Bauermann, linux-integrity
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linuxppc-dev, David Woodhouse, Serge E. Hallyn
In-Reply-To: <20190418035120.2354-13-bauerman@linux.ibm.com>

Hi Thiago,

On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
> If the IMA template contains the "modsig" or "d-modsig" field, then the
> modsig should be added to the measurement list when the file is appraised.
> 
> And that is what normally happens, but if a measurement rule caused a file
> containing a modsig to be measured before a different rule causes it to be
> appraised, the resulting measurement entry will not contain the modsig
> because it is only fetched during appraisal. When the appraisal rule
> triggers, it won't store a new measurement containing the modsig because
> the file was already measured.
> 
> We need to detect that situation and store an additional measurement with
> the modsig. This is done by adding an IMA_MEASURE action flag if we read a
> modsig and the IMA template contains a modsig field.

With the new per policy rule "template" support being added, this
patch needs to be modified so that the per policy "template" format is
checked.  ima_template_has_modsig() should be called with the
template_desc being used.

thanks,

Mimi


> diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
> index 8e6475854351..f91ed4189f98 100644
> --- a/security/integrity/ima/ima_main.c
> +++ b/security/integrity/ima/ima_main.c
> @@ -282,9 +282,17 @@ static int process_measurement(struct file *file, const struct cred *cred,
>  		/* read 'security.ima' */
>  		xattr_len = ima_read_xattr(file_dentry(file), &xattr_value);
>  
> -		/* Read the appended modsig if allowed by the policy. */
> -		if (iint->flags & IMA_MODSIG_ALLOWED)
> -			ima_read_modsig(func, buf, size, &modsig);
> +		/*
> +		 * Read the appended modsig, if allowed by the policy, and allow
> +		 * an additional measurement list entry, if needed, based on the
> +		 * template format.
> +		 */
> +		if (iint->flags & IMA_MODSIG_ALLOWED) {
> +			rc = ima_read_modsig(func, buf, size, &modsig);
> +
> +			if (!rc && ima_template_has_modsig())
> +				action |= IMA_MEASURE;
> +		}
> 


^ permalink raw reply

* Re: [PATCH v2] mm: add account_locked_vm utility function
From: Daniel Jordan @ 2019-05-28 15:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Mark Rutland, Davidlohr Bueso, kvm, Alan Tull,
	Alexey Kardashevskiy, linux-fpga, linux-kernel, kvm-ppc,
	Daniel Jordan, linux-mm, Alex Williamson, Jason Gunthorpe,
	Moritz Fischer, Steve Sistare, Christoph Lameter, linuxppc-dev,
	Wu Hao
In-Reply-To: <20190525145118.bfda2d75a14db05a001e49ad@linux-foundation.org>

On Sat, May 25, 2019 at 02:51:18PM -0700, Andrew Morton wrote:
> On Fri, 24 May 2019 13:50:45 -0400 Daniel Jordan <daniel.m.jordan@oracle.com> wrote:
> 
> > locked_vm accounting is done roughly the same way in five places, so
> > unify them in a helper.  Standardize the debug prints, which vary
> > slightly, but include the helper's caller to disambiguate between
> > callsites.
> > 
> > Error codes stay the same, so user-visible behavior does too.  The one
> > exception is that the -EPERM case in tce_account_locked_vm is removed
> > because Alexey has never seen it triggered.
> > 
> > ...
> >
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1564,6 +1564,25 @@ long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
> >  int get_user_pages_fast(unsigned long start, int nr_pages,
> >  			unsigned int gup_flags, struct page **pages);
> >  
> > +int __account_locked_vm(struct mm_struct *mm, unsigned long pages, bool inc,
> > +			struct task_struct *task, bool bypass_rlim);
> > +
> > +static inline int account_locked_vm(struct mm_struct *mm, unsigned long pages,
> > +				    bool inc)
> > +{
> > +	int ret;
> > +
> > +	if (pages == 0 || !mm)
> > +		return 0;
> > +
> > +	down_write(&mm->mmap_sem);
> > +	ret = __account_locked_vm(mm, pages, inc, current,
> > +				  capable(CAP_IPC_LOCK));
> > +	up_write(&mm->mmap_sem);
> > +
> > +	return ret;
> > +}
> 
> That's quite a mouthful for an inlined function.  How about uninlining
> the whole thing and fiddling drivers/vfio/vfio_iommu_type1.c to suit. 
> I wonder why it does down_write_killable and whether it really needs
> to...

Sure, I can uninline it.  vfio changelogs don't show a particular reason for
_killable[1].  Maybe Alex has something to add.  Otherwise I'll respin without
it since the simplification seems worth removing _killable.

[1] 0cfef2b7410b ("vfio/type1: Remove locked page accounting workqueue")

^ permalink raw reply

* Re: [PATCH v2] powerpc/power: Expose pfn_is_nosave prototype
From: Rafael J. Wysocki @ 2019-05-28 15:48 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Len Brown, linux-s390, Mathieu Malaterre, linux-pm,
	Heiko Carstens, linux-kernel, Paul Mackerras, Pavel Machek,
	Martin Schwidefsky, linuxppc-dev
In-Reply-To: <875zpvqsy9.fsf@concordia.ellerman.id.au>

On Tuesday, May 28, 2019 3:16:30 AM CEST Michael Ellerman wrote:
> "Rafael J. Wysocki" <rjw@rjwysocki.net> writes:
> > On Friday, May 24, 2019 12:44:18 PM CEST Mathieu Malaterre wrote:
> >> The declaration for pfn_is_nosave is only available in
> >> kernel/power/power.h. Since this function can be override in arch,
> >> expose it globally. Having a prototype will make sure to avoid warning
> >> (sometime treated as error with W=1) such as:
> >> 
> >>   arch/powerpc/kernel/suspend.c:18:5: error: no previous prototype for 'pfn_is_nosave' [-Werror=missing-prototypes]
> >> 
> >> This moves the declaration into a globally visible header file and add
> >> missing include to avoid a warning on powerpc. Also remove the
> >> duplicated prototypes since not required anymore.
> >> 
> >> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> >> Signed-off-by: Mathieu Malaterre <malat@debian.org>
> >> ---
> >> v2: As suggestion by christophe remove duplicates prototypes
> >> 
> >>  arch/powerpc/kernel/suspend.c | 1 +
> >>  arch/s390/kernel/entry.h      | 1 -
> >>  include/linux/suspend.h       | 1 +
> >>  kernel/power/power.h          | 2 --
> >>  4 files changed, 2 insertions(+), 3 deletions(-)
> >> 
> >> diff --git a/kernel/power/power.h b/kernel/power/power.h
> >> index 9e58bdc8a562..44bee462ff57 100644
> >> --- a/kernel/power/power.h
> >> +++ b/kernel/power/power.h
> >> @@ -75,8 +75,6 @@ static inline void hibernate_reserved_size_init(void) {}
> >>  static inline void hibernate_image_size_init(void) {}
> >>  #endif /* !CONFIG_HIBERNATION */
> >>  
> >> -extern int pfn_is_nosave(unsigned long);
> >> -
> >>  #define power_attr(_name) \
> >>  static struct kobj_attribute _name##_attr = {	\
> >>  	.attr	= {				\
> >> 
> >
> > With an ACK from the powerpc maintainers, I could apply this one.
> 
> Sent.

Thanks!




^ permalink raw reply

* Re: [PATCH v2] powerpc/32: sstep: Move variable `rc` within CONFIG_PPC64 sentinels
From: Mathieu Malaterre @ 2019-05-28 15:52 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: Paul Mackerras, linuxppc-dev, LKML
In-Reply-To: <87d0k2q025.fsf@concordia.ellerman.id.au>

On Tue, May 28, 2019 at 1:40 PM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Mathieu Malaterre <malat@debian.org> writes:
>
> > Fix warnings treated as errors with W=1:
> >
> >   arch/powerpc/lib/sstep.c:1172:31: error: variable 'rc' set but not used [-Werror=unused-but-set-variable]
> >
> > Suggested-by: Christophe Leroy <christophe.leroy@c-s.fr>
> > Signed-off-by: Mathieu Malaterre <malat@debian.org>
> > ---
> > v2: as suggested prefer CONFIG_PPC64 sentinel instead of unused keyword
>
> I'd rather avoid adding more ifdefs if we can.
>
> I think this works?

It does ! ;)

Reviewed-by: Mathieu Malaterre <malat@debian.org>

> cheers
>
> diff --git a/arch/powerpc/lib/sstep.c b/arch/powerpc/lib/sstep.c
> index 3d33fb509ef4..600b036ddfda 100644
> --- a/arch/powerpc/lib/sstep.c
> +++ b/arch/powerpc/lib/sstep.c
> @@ -1169,7 +1169,7 @@ static nokprobe_inline int trap_compare(long v1, long v2)
>  int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>                   unsigned int instr)
>  {
> -       unsigned int opcode, ra, rb, rc, rd, spr, u;
> +       unsigned int opcode, ra, rb, rd, spr, u;
>         unsigned long int imm;
>         unsigned long int val, val2;
>         unsigned int mb, me, sh;
> @@ -1292,7 +1292,6 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>         rd = (instr >> 21) & 0x1f;
>         ra = (instr >> 16) & 0x1f;
>         rb = (instr >> 11) & 0x1f;
> -       rc = (instr >> 6) & 0x1f;
>
>         switch (opcode) {
>  #ifdef __powerpc64__
> @@ -1307,10 +1306,14 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>                 return 1;
>
>  #ifdef __powerpc64__
> -       case 4:
> +       case 4: {
> +               unsigned int rc;
> +
>                 if (!cpu_has_feature(CPU_FTR_ARCH_300))
>                         return -1;
>
> +               rc = (instr >> 6) & 0x1f;
> +
>                 switch (instr & 0x3f) {
>                 case 48:        /* maddhd */
>                         asm volatile(PPC_MADDHD(%0, %1, %2, %3) :
> @@ -1336,6 +1339,7 @@ int analyse_instr(struct instruction_op *op, const struct pt_regs *regs,
>                  * primary opcode which do not have emulation support yet.
>                  */
>                 return -1;
> +       }
>  #endif
>
>         case 7:         /* mulli */

^ permalink raw reply

* Re: [PATCH] dlpar: Fix a missing-check bug in dlpar_parse_cc_property()
From: Nathan Lynch @ 2019-05-28 16:38 UTC (permalink / raw)
  To: Gen Zhang, benh, paulus; +Cc: linuxppc-dev, linux-kernel
In-Reply-To: <20190526024240.GA14546@zhanggen-UX430UQ>

Gen Zhang <blackgod016574@gmail.com> writes:
> In dlpar_parse_cc_property(), 'prop->name' is allocated by kstrdup().
> kstrdup() may return NULL, so it should be checked and handle error.
> And prop should be freed if 'prop->name' is NULL.
>
> Signed-off-by: Gen Zhang <blackgod016574@gmail.com>
> ---
> diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
> index 1795804..c852024 100644
> --- a/arch/powerpc/platforms/pseries/dlpar.c
> +++ b/arch/powerpc/platforms/pseries/dlpar.c
> @@ -61,6 +61,10 @@ static struct property *dlpar_parse_cc_property(struct cc_workarea *ccwa)
>  
>  	name = (char *)ccwa + be32_to_cpu(ccwa->name_offset);
>  	prop->name = kstrdup(name, GFP_KERNEL);
> +	if (!prop->name) {
> +		dlpar_free_cc_property(prop);
> +		return NULL;
> +	}

Acked-by: Nathan Lynch <nathanl@linux.ibm.com>


^ permalink raw reply

* Re: [PATCH v1 00/15] Fixing selftests failure on Talitos driver
From: Christophe Leroy @ 2019-05-28 16:48 UTC (permalink / raw)
  To: Horia Geanta
  Cc: Herbert Xu, David S. Miller, linuxppc-dev, linux-kernel,
	linux-crypto
In-Reply-To: <VI1PR0402MB34850975FDD8F5F1CE366A9C981E0@VI1PR0402MB3485.eurprd04.prod.outlook.com>

Horia Geanta <horia.geanta@nxp.com> a écrit :

> On 5/21/2019 4:34 PM, Christophe Leroy wrote:
>> Several test failures have popped up following recent changes to crypto
>> selftests.
>>
>> This series fixes (most of) them.
>>
>> The last three patches are trivial cleanups.
>>
> Thanks Christophe.
>
> For the series:
> Reviewed-by: Horia Geantă <horia.geanta@nxp.com>
>
> Have you validated the changes also on SEC 2.x+?
> Asking since IIRC you mentioned having only HW with SEC 1 and  
> changes in patch
> "crypto: talitos - fix AEAD processing." look quite complex.

When I ported the driver to SEC1 some years ago I only had a SEC 1.2  
(mpc885) but I now have also a board with a mpc8321E which embeds a  
SEC 2.2 so I also tested the changes on it.

Christophe

>
> Thanks,
> Horia



^ permalink raw reply

* Re: [PATCH v4 3/3] kselftest: Extend vDSO selftest to clock_getres
From: Christophe Leroy @ 2019-05-28 17:01 UTC (permalink / raw)
  To: Vincenzo Frascino
  Cc: linux-arch, linux-s390, Arnd Bergmann, Heiko Carstens,
	linuxppc-dev, Paul Mackerras, linux-kselftest, Martin Schwidefsky,
	Thomas Gleixner, Shuah Khan
In-Reply-To: <afb7395f-43e9-c304-2db2-349e6727b687@arm.com>

Vincenzo Frascino <vincenzo.frascino@arm.com> a écrit :

> Hi Michael,
>
> thank you for your reply.
>
> On 28/05/2019 07:19, Michael Ellerman wrote:
>> Vincenzo Frascino <vincenzo.frascino@arm.com> writes:
>>
>>> The current version of the multiarch vDSO selftest verifies only
>>> gettimeofday.
>>>
>>> Extend the vDSO selftest to clock_getres, to verify that the
>>> syscall and the vDSO library function return the same information.
>>>
>>> The extension has been used to verify the hrtimer_resoltion fix.
>>
>> This is passing for me even without patch 1 applied, shouldn't it fail
>> without the fix? What am I missing?
>>
>
> This is correct, because during the refactoring process I missed an "n" :)
>
> if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_sec·!=·y.tv_sec))
>
> Should be:
>
> if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_nsec·!=·y.tv_nsec))

Maybe you'd better use timercmp() from sys/time.h

Christophe

>
> My mistake, I am going to fix the test and re-post v5 of this set.
>
> Without my patch if you pass "highres=off" to the kernel (as a command line
> parameter) it leads to a broken implementation of clock_getres since  
> the value
> of CLOCK_REALTIME_RES does not change at runtime.
>
> Expected result (with highres=off):
>
> # uname -r
> 5.2.0-rc2
> # ./vdso_clock_getres
> clock_id: CLOCK_REALTIME [FAIL]
> clock_id: CLOCK_BOOTTIME [PASS]
> clock_id: CLOCK_TAI [PASS]
> clock_id: CLOCK_REALTIME_COARSE [PASS]
> clock_id: CLOCK_MONOTONIC [FAIL]
> clock_id: CLOCK_MONOTONIC_RAW [PASS]
> clock_id: CLOCK_MONOTONIC_COARSE [PASS]
>
> The reason of this behavior is that the only clocks supported by getres on
> powerpc are CLOCK_REALTIME and CLOCK_MONOTONIC, the rest on the clocks use
> always syscalls.
>
>> # uname -r
>> 5.2.0-rc2-gcc-8.2.0
>>
>> # ./vdso_clock_getres
>> clock_id: CLOCK_REALTIME [PASS]
>> clock_id: CLOCK_BOOTTIME [PASS]
>> clock_id: CLOCK_TAI [PASS]
>> clock_id: CLOCK_REALTIME_COARSE [PASS]
>> clock_id: CLOCK_MONOTONIC [PASS]
>> clock_id: CLOCK_MONOTONIC_RAW [PASS]
>> clock_id: CLOCK_MONOTONIC_COARSE [PASS]
>>
>> cheers
>>
>>> Cc: Shuah Khan <shuah@kernel.org>
>>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>>> ---
>>>
>>> Note: This patch is independent from the others in this series, hence it
>>> can be merged singularly by the kselftest maintainers.
>>>
>>>  tools/testing/selftests/vDSO/Makefile         |   2 +
>>>  .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
>>>  2 files changed, 126 insertions(+)
>>>  create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c
>
> --
> Regards,
> Vincenzo



^ permalink raw reply

* Re: [PATCH v3 14/16] powerpc/32: implement fast entry for syscalls on BOOKE
From: Christophe Leroy @ 2019-05-28 17:03 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linuxppc-dev, linux-kernel, Nicholas Piggin
In-Reply-To: <87r28jp2b0.fsf@concordia.ellerman.id.au>

Michael Ellerman <mpe@ellerman.id.au> a écrit :

> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>> Le 23/05/2019 à 09:00, Christophe Leroy a écrit :
>>
>> [...]
>>
>>>> arch/powerpc/kernel/head_fsl_booke.o: In function `SystemCall':
>>>> arch/powerpc/kernel/head_fsl_booke.S:416: undefined reference to
>>>> `kvmppc_handler_BOOKE_INTERRUPT_SYSCALL_SPRN_SRR1'
>>>> Makefile:1052: recipe for target 'vmlinux' failed
>>>>
>>>>> +.macro SYSCALL_ENTRY trapno intno
>>>>> +    mfspr    r10, SPRN_SPRG_THREAD
>>>>> +#ifdef CONFIG_KVM_BOOKE_HV
>>>>> +BEGIN_FTR_SECTION
>>>>> +    mtspr    SPRN_SPRG_WSCRATCH0, r10
>>>>> +    stw    r11, THREAD_NORMSAVE(0)(r10)
>>>>> +    stw    r13, THREAD_NORMSAVE(2)(r10)
>>>>> +    mfcr    r13            /* save CR in r13 for now       */
>>>>> +    mfspr    r11, SPRN_SRR1
>>>>> +    mtocrf    0x80, r11    /* check MSR[GS] without clobbering reg */
>>>>> +    bf    3, 1975f
>>>>> +    b    kvmppc_handler_BOOKE_INTERRUPT_\intno\()_SPRN_SRR1
>>>>
>>>> It seems to me that the "_SPRN_SRR1" on the end of this line
>>>> isn't meant to be there...  However, it still fails to link with that
>>>> removed.
>>
>> It looks like I missed the macro expansion.
>>
>> The called function should be kvmppc_handler_8_0x01B
>>
>> Seems like kisskb doesn't build any config like this.
>
> I thought we did, ie:
>
> http://kisskb.ellerman.id.au/kisskb/buildresult/13817941/

That's a ppc64 config it seems. The problem was on booke32.

Christophe

>
> But clearly something is missing to trigger the bug.
>
> cheers



^ permalink raw reply

* Re: [PATCH v4 3/3] kselftest: Extend vDSO selftest to clock_getres
From: Vincenzo Frascino @ 2019-05-28 17:05 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: linux-arch, linux-s390, Arnd Bergmann, Heiko Carstens,
	linuxppc-dev, Paul Mackerras, linux-kselftest, Martin Schwidefsky,
	Thomas Gleixner, Shuah Khan
In-Reply-To: <20190528190132.Horde.a454OBLbW8Q4Xvx6vYRfSA1@messagerie.si.c-s.fr>

Hi Christophe,

On 28/05/2019 18:01, Christophe Leroy wrote:
> Vincenzo Frascino <vincenzo.frascino@arm.com> a écrit :
> 
>> Hi Michael,
>>
>> thank you for your reply.
>>
>> On 28/05/2019 07:19, Michael Ellerman wrote:
>>> Vincenzo Frascino <vincenzo.frascino@arm.com> writes:
>>>
>>>> The current version of the multiarch vDSO selftest verifies only
>>>> gettimeofday.
>>>>
>>>> Extend the vDSO selftest to clock_getres, to verify that the
>>>> syscall and the vDSO library function return the same information.
>>>>
>>>> The extension has been used to verify the hrtimer_resoltion fix.
>>>
>>> This is passing for me even without patch 1 applied, shouldn't it fail
>>> without the fix? What am I missing?
>>>
>>
>> This is correct, because during the refactoring process I missed an "n" :)
>>
>> if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_sec·!=·y.tv_sec))
>>
>> Should be:
>>
>> if·((x.tv_sec·!=·y.tv_sec)·||·(x.tv_nsec·!=·y.tv_nsec))
> 
> Maybe you'd better use timercmp() from sys/time.h
> 

timercmp() takes "struct timeval" not "struct timespec".

> Christophe
> 
>>
>> My mistake, I am going to fix the test and re-post v5 of this set.
>>
>> Without my patch if you pass "highres=off" to the kernel (as a command line
>> parameter) it leads to a broken implementation of clock_getres since  
>> the value
>> of CLOCK_REALTIME_RES does not change at runtime.
>>
>> Expected result (with highres=off):
>>
>> # uname -r
>> 5.2.0-rc2
>> # ./vdso_clock_getres
>> clock_id: CLOCK_REALTIME [FAIL]
>> clock_id: CLOCK_BOOTTIME [PASS]
>> clock_id: CLOCK_TAI [PASS]
>> clock_id: CLOCK_REALTIME_COARSE [PASS]
>> clock_id: CLOCK_MONOTONIC [FAIL]
>> clock_id: CLOCK_MONOTONIC_RAW [PASS]
>> clock_id: CLOCK_MONOTONIC_COARSE [PASS]
>>
>> The reason of this behavior is that the only clocks supported by getres on
>> powerpc are CLOCK_REALTIME and CLOCK_MONOTONIC, the rest on the clocks use
>> always syscalls.
>>
>>> # uname -r
>>> 5.2.0-rc2-gcc-8.2.0
>>>
>>> # ./vdso_clock_getres
>>> clock_id: CLOCK_REALTIME [PASS]
>>> clock_id: CLOCK_BOOTTIME [PASS]
>>> clock_id: CLOCK_TAI [PASS]
>>> clock_id: CLOCK_REALTIME_COARSE [PASS]
>>> clock_id: CLOCK_MONOTONIC [PASS]
>>> clock_id: CLOCK_MONOTONIC_RAW [PASS]
>>> clock_id: CLOCK_MONOTONIC_COARSE [PASS]
>>>
>>> cheers
>>>
>>>> Cc: Shuah Khan <shuah@kernel.org>
>>>> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
>>>> ---
>>>>
>>>> Note: This patch is independent from the others in this series, hence it
>>>> can be merged singularly by the kselftest maintainers.
>>>>
>>>>  tools/testing/selftests/vDSO/Makefile         |   2 +
>>>>  .../selftests/vDSO/vdso_clock_getres.c        | 124 ++++++++++++++++++
>>>>  2 files changed, 126 insertions(+)
>>>>  create mode 100644 tools/testing/selftests/vDSO/vdso_clock_getres.c
>>
>> --
>> Regards,
>> Vincenzo
> 
> 

-- 
Regards,
Vincenzo

^ permalink raw reply

* Re: [PATCH RFC 0/5] Remove some notrace RCU APIs
From: Joel Fernandes @ 2019-05-28 19:00 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: rcu, Jonathan Corbet, linux-doc, Lai Jiangshan, Josh Triplett,
	Steven Rostedt, linux-kernel, Miguel Ojeda, Ingo Molnar,
	Mathieu Desnoyers, kvm-ppc, linuxppc-dev
In-Reply-To: <20190528122447.GS28207@linux.ibm.com>

On Tue, May 28, 2019 at 05:24:47AM -0700, Paul E. McKenney wrote:
> On Sat, May 25, 2019 at 02:14:07PM -0400, Joel Fernandes wrote:
> > On Sat, May 25, 2019 at 08:50:35AM -0700, Paul E. McKenney wrote:
> > > On Sat, May 25, 2019 at 10:19:54AM -0400, Joel Fernandes wrote:
> > > > On Sat, May 25, 2019 at 07:08:26AM -0400, Steven Rostedt wrote:
> > > > > On Sat, 25 May 2019 04:14:44 -0400
> > > > > Joel Fernandes <joel@joelfernandes.org> wrote:
> > > > > 
> > > > > > > I guess the difference between the _raw_notrace and just _raw variants
> > > > > > > is that _notrace ones do a rcu_check_sparse(). Don't we want to keep
> > > > > > > that check?  
> > > > > > 
> > > > > > This is true.
> > > > > > 
> > > > > > Since the users of _raw_notrace are very few, is it worth keeping this API
> > > > > > just for sparse checking? The API naming is also confusing. I was expecting
> > > > > > _raw_notrace to do fewer checks than _raw, instead of more. Honestly, I just
> > > > > > want to nuke _raw_notrace as done in this series and later we can introduce a
> > > > > > sparse checking version of _raw if need-be. The other option could be to
> > > > > > always do sparse checking for _raw however that used to be the case and got
> > > > > > changed in http://lists.infradead.org/pipermail/linux-afs/2016-July/001016.html
> > > > > 
> > > > > What if we just rename _raw to _raw_nocheck, and _raw_notrace to _raw ?
> > > > 
> > > > That would also mean changing 160 usages of _raw to _raw_nocheck in the
> > > > kernel :-/.
> > > > 
> > > > The tracing usage of _raw_notrace is only like 2 or 3 users. Can we just call
> > > > rcu_check_sparse directly in the calling code for those and eliminate the APIs?
> > > > 
> > > > I wonder what Paul thinks about the matter as well.
> > > 
> > > My thought is that it is likely that a goodly number of the current uses
> > > of _raw should really be some form of _check, with lockdep expressions
> > > spelled out.  Not that working out what exactly those lockdep expressions
> > > should be is necessarily a trivial undertaking.  ;-)
> > 
> > Yes, currently where I am a bit stuck is the rcu_dereference_raw()
> > cannot possibly know what SRCU domain it is under, so lockdep cannot check if
> > an SRCU lock is held without the user also passing along the SRCU domain. I
> > am trying to change lockdep to see if it can check if *any* srcu domain lock
> > is held (regardless of which one) and complain if none are. This is at least
> > better than no check at all.
> > 
> > However, I think it gets tricky for mutexes. If you have something like:
> > mutex_lock(some_mutex);
> > p = rcu_dereference_raw(gp);
> > mutex_unlock(some_mutex);
> > 
> > This might be a perfectly valid invocation of _raw, however my checks (patch
> > is still cooking) trigger a lockdep warning becase _raw cannot know that this
> > is Ok. lockdep thinks it is not in a reader section. This then gets into the
> > territory of a new rcu_derference_raw_protected(gp, assert_held(some_mutex))
> > which sucks because its yet another API. To circumvent this issue, can we
> > just have callers of rcu_dereference_raw ensure that they call
> > rcu_read_lock() if they are protecting dereferences by a mutex? That would
> > make things a lot easier and also may be Ok since rcu_read_lock is quite
> > cheap.
> 
> Why not just rcu_dereference_protected(lockdep_is_held(some_mutex))?
> The API is already there, and no need for spurious readers.

Hmm, so I gave a bad example, here is a better example:

fib_get_table calls hlist_for_each_entry_rcu()
hlist_for_each_entry_rcu calls rcu_dereference_raw().

This is perfectly Ok to be called under rtnl_mutex. However rcu_dererence_raw
in hlist_for_each_entry_rcu has no way of knowing that the rtnl_mutex held is
sufficient for the protection since it is not directly called by the caller.

I am almost sure I saw other examples of rcu_dereference_raw being called
this way as well.

I was trying to make an "automatic" lockdep check for all this, but it is
quite hard to do so without passing down lockdep experessions down a call
chain thus complicating all such callchains.

Further I don't think code can trivially be converted from
rcu_dereference_raw to rcu_dereference_protected even if the protection being
offered is known, since the former does not do sparse checking and the latter
might trigger false sparse checks in case the pointer in concern is protected
both by RCU and non-RCU methods. I believe this is why you removed sparse
checking from rcu_dereference_raw as well:

http://lists.infradead.org/pipermail/linux-afs/2016-July/001016.html

> > > That aside, if we are going to change the name of an API that is
> > > used 160 places throughout the tree, we would need to have a pretty
> > > good justification.  Without such a justification, it will just look
> > > like pointless churn to the various developers and maintainers on the
> > > receiving end of the patches.
> > 
> > Actually, the API name change is not something I want to do, it is Steven
> > suggestion. My suggestion is let us just delete _raw_notrace and just use the
> > _raw API for tracing, since _raw doesn't do any tracing anyway. Steve pointed
> > that _raw_notrace does sparse checking unlike _raw, but I think that isn't an
> > issue since _raw doesn't do such checking at the moment anyway.. (if possible
> > check my cover letter again for details/motivation of this series).
> 
> Understood, but regardless of who suggested it, if we are to go through
> with it, good justification will be required.  ;-)

Ok ;-). About the names of the APIs, I thought of leaving rcu_dereference_raw
and its callers intact, and just rename:

 * hlist_for_each_entry_rcu_notrace
 * rcu_dereference_raw_notrace

to:
 * hlist_for_each_entry_rcu_sparse
 * rcu_dereference_raw_sparse

The _sparse would stand for "sparse checking". However I am open to better
names..

Such renaming would avoid confusion and keep the fact about sparse checking
less ambiguous.

thanks!

 - Joel


^ permalink raw reply

* Re: [PATCH v10 01/12] MODSIGN: Export module signature definitions
From: Thiago Jung Bauermann @ 2019-05-28 19:03 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linux-integrity, linuxppc-dev, David Woodhouse,
	Serge E. Hallyn
In-Reply-To: <1557416528.10635.62.camel@linux.ibm.com>


Mimi Zohar <zohar@linux.ibm.com> writes:

> On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
>> IMA will use the module_signature format for append signatures, so export
>> the relevant definitions and factor out the code which verifies that the
>> appended signature trailer is valid.
>> 
>> Also, create a CONFIG_MODULE_SIG_FORMAT option so that IMA can select it
>> and be able to use mod_check_sig() without having to depend on either
>> CONFIG_MODULE_SIG or CONFIG_MODULES.
>> 
>> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
>> Cc: Jessica Yu <jeyu@kernel.org>
>
> Just a couple minor questions/comments below.
>
> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>

Thanks for your review and your comments!

>> diff --git a/init/Kconfig b/init/Kconfig
>> index 4592bf7997c0..a71019553ee1 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -1906,7 +1906,7 @@ config MODULE_SRCVERSION_ALL
>>  config MODULE_SIG
>>  	bool "Module signature verification"
>>  	depends on MODULES
>> -	select SYSTEM_DATA_VERIFICATION
>> +	select MODULE_SIG_FORMAT
>>  	help
>>  	  Check modules for valid signatures upon load: the signature
>>  	  is simply appended to the module. For more information see
>> @@ -2036,6 +2036,10 @@ config TRIM_UNUSED_KSYMS
>>  
>>  endif # MODULES
>>  
>> +config MODULE_SIG_FORMAT
>> +	def_bool n
>> +	select SYSTEM_DATA_VERIFICATION
>
> Normally Kconfigs, in the same file, are defined before they are used.
>  I'm not sure if that is required or just a convention.

I think it's a convention, because it seemed to work in the current way.
For the next version I moved the config MODULE_SIG_FORMAT definition to
just before "menuconfig MODULES"

>> diff --git a/kernel/module_signature.c b/kernel/module_signature.c
>> new file mode 100644
>> index 000000000000..6d5e59f27f55
>> --- /dev/null
>> +++ b/kernel/module_signature.c
>> @@ -0,0 +1,45 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Module signature checker
>> + *
>> + * Copyright (C) 2012 Red Hat, Inc. All Rights Reserved.
>> + * Written by David Howells (dhowells@redhat.com)
>> + */
>> +
>> +#include <linux/errno.h>
>> +#include <linux/printk.h>
>> +#include <linux/module_signature.h>
>> +#include <asm/byteorder.h>
>> +
>> +/**
>> + * mod_check_sig - check that the given signature is sane
>> + *
>> + * @ms:		Signature to check.
>> + * @file_len:	Size of the file to which @ms is appended.
>
> "name" is missing.

Fixed.

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v10 11/12] ima: Define ima-modsig template
From: Thiago Jung Bauermann @ 2019-05-28 19:09 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linux-integrity, linuxppc-dev, David Woodhouse,
	Serge E. Hallyn
In-Reply-To: <1557442889.10635.88.camel@linux.ibm.com>


Mimi Zohar <zohar@linux.ibm.com> writes:

> On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
>> Define new "d-modsig" template field which holds the digest that is
>> expected to match the one contained in the modsig, and also new "modsig"
>> template field which holds the appended file signature.
>>
>> Add a new "ima-modsig" defined template descriptor with the new fields as
>> well as the ones from the "ima-sig" descriptor.
>>
>> Change ima_store_measurement() to accept a struct modsig * argument so that
>> it can be passed along to the templates via struct ima_event_data.
>>
>> Suggested-by: Mimi Zohar <zohar@linux.ibm.com>
>> Signed-off-by: Thiago Jung Bauermann <bauerman@linux.ibm.com>
>
> Thanks, Roberto. Just some thoughts inline below.
>
> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>

Thanks!

>> +/*
>> + * Validating the appended signature included in the measurement list requires
>> + * the file hash calculated without the appended signature (i.e., the 'd-modsig'
>> + * field). Therefore, notify the user if they have the 'modsig' field but not
>> + * the 'd-modsig' field in the template.
>> + */
>> +static void check_current_template_modsig(void)
>> +{
>> +#define MSG "template with 'modsig' field also needs 'd-modsig' field\n"
>> +	struct ima_template_desc *template;
>> +	bool has_modsig, has_dmodsig;
>> +	static bool checked;
>> +	int i;
>> +
>> +	/* We only need to notify the user once. */
>> +	if (checked)
>> +		return;
>> +
>> +	has_modsig = has_dmodsig = false;
>> +	template = ima_template_desc_current();
>> +	for (i = 0; i < template->num_fields; i++) {
>> +		if (!strcmp(template->fields[i]->field_id, "modsig"))
>> +			has_modsig = true;
>> +		else if (!strcmp(template->fields[i]->field_id, "d-modsig"))
>> +			has_dmodsig = true;
>> +	}
>> +
>> +	if (has_modsig && !has_dmodsig)
>> +		pr_notice(MSG);
>> +
>> +	checked = true;
>> +#undef MSG
>> +}
>> +
>
> There was some recent discussion about supporting per IMA policy rule
> template formats. This feature will allow just the kexec kernel image
> to require ima-modsig. When per policy rule template formats support
> is upstreamed, this function will need to be updated.

Indeed. Thanks for the clarification. For the next iteration I rebased
on top of Matthew Garret's "IMA: Allow profiles to define the desired
IMA template" patch. I'm currently adapting this check accordingly.

>> @@ -389,3 +425,25 @@ int ima_eventsig_init(struct ima_event_data *event_data,
>>  	return ima_write_template_field_data(xattr_value, event_data->xattr_len,
>>  					     DATA_FMT_HEX, field_data);
>>  }
>> +
>> +int ima_eventmodsig_init(struct ima_event_data *event_data,
>> +			 struct ima_field_data *field_data)
>> +{
>> +	const void *data;
>> +	u32 data_len;
>> +	int rc;
>> +
>> +	if (!event_data->modsig)
>> +		return 0;
>> +
>> +	/*
>> +	 * The xattr_value for IMA_MODSIG is a runtime structure containing
>> +	 * pointers. Get its raw data instead.
>> +	 */
>
> "xattr_value"? The comment needs some clarification.

Oops, forgot to update this comment. This is the new version:

	/*
	 * modsig is a runtime structure containing pointers. Get its raw data
	 * instead.
	 */

--
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: kmemleak: 1157 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
From: Mathieu Malaterre @ 2019-05-28 19:14 UTC (permalink / raw)
  To: Michael Ellerman, Catalin Marinas; +Cc: linuxppc-dev
In-Reply-To: <87tvdfp31x.fsf@concordia.ellerman.id.au>

Hi Michael !

Thanks for the kind help.

On Tue, May 28, 2019 at 7:21 AM Michael Ellerman <mpe@ellerman.id.au> wrote:
>
> Mathieu Malaterre <malat@debian.org> writes:
> > Hi there,
> >
> > Is there a way to dump more context (somewhere in of tree
> > flattening?). I cannot make sense of the following:
>
> Hmm. Not that I know of.
>
> Those don't look related to OF flattening/unflattening. That's just
> sysfs setup based on the unflattened device tree.
>
> The allocations are happening in safe_name() AFAICS.
>
> int __of_add_property_sysfs(struct device_node *np, struct property *pp)
> {
>         ...
>         pp->attr.attr.name = safe_name(&np->kobj, pp->name);
>
> And the free is in __of_sysfs_remove_bin_file():
>
> void __of_sysfs_remove_bin_file(struct device_node *np, struct property *prop)
> {
>         if (!IS_ENABLED(CONFIG_SYSFS))
>                 return;
>
>         sysfs_remove_bin_file(&np->kobj, &prop->attr);
>         kfree(prop->attr.attr.name);
>

Right. That helped a lot !

> There is this check which could be failing leading to us not calling the
> free at all:
>
> void __of_remove_property_sysfs(struct device_node *np, struct property *prop)
> {
>         /* at early boot, bail here and defer setup to of_init() */
>         if (of_kset && of_node_is_attached(np))
>                 __of_sysfs_remove_bin_file(np, prop);
> }
>
>
> So maybe stick a printk() in there to see if you're hitting that
> condition, eg something like:
>
>         if (of_kset && of_node_is_attached(np))
>                 __of_sysfs_remove_bin_file(np, prop);
>         else
>                 printk("%s: leaking prop %s on node %pOF\n", __func__, prop->attr.attr.name, np);
>

If I understand correctly those are false positive. I was first
starting to consider using something like kmemleak_not_leak, but I
remember that I have been using kmemleak for a couple of years now.
Those reports starting to show up only recently.

Catalin, do you have an idea why on a non-SMP machine kmemleak reports
leaks from:

[...]
void __init of_core_init(void)
{
[...]
 for_each_of_allnodes(np)
    __of_attach_node_sysfs(np);



> cheers
>
> > kmemleak: 1157 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
> >
> > Where:
> >
> > # head -40 /sys/kernel/debug/kmemleak
> > unreferenced object 0xdf44d180 (size 8):
> >   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
> >   hex dump (first 8 bytes):
> >     62 61 73 65 00 00 00 00                          base....
> >   backtrace:
> >     [<0ca59825>] kstrdup+0x4c/0xb8
> >     [<c8a79377>] kobject_set_name_vargs+0x34/0xc8
> >     [<661b4c86>] kobject_add+0x78/0x120
> >     [<c1416f27>] __of_attach_node_sysfs+0xa0/0x14c
> >     [<2a143d10>] of_core_init+0x90/0x114
> >     [<a353d0e1>] driver_init+0x30/0x48
> >     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
> >     [<dc60f815>] kernel_init+0x20/0x110
> >     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c
> > unreferenced object 0xdf44d178 (size 8):
> >   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
> >   hex dump (first 8 bytes):
> >     6d 6f 64 65 6c 00 97 c8                          model...
> >   backtrace:
> >     [<0ca59825>] kstrdup+0x4c/0xb8
> >     [<0eeb0a3b>] __of_add_property_sysfs+0x88/0x12c
> >     [<f6c64af0>] __of_attach_node_sysfs+0xcc/0x14c
> >     [<2a143d10>] of_core_init+0x90/0x114
> >     [<a353d0e1>] driver_init+0x30/0x48
> >     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
> >     [<dc60f815>] kernel_init+0x20/0x110
> >     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c
> > unreferenced object 0xdf4021e0 (size 16):
> >   comm "swapper", pid 1, jiffies 4294892297 (age 4766.460s)
> >   hex dump (first 16 bytes):
> >     63 6f 6d 70 61 74 69 62 6c 65 00 01 00 00 00 00  compatible......
> >   backtrace:
> >     [<0ca59825>] kstrdup+0x4c/0xb8
> >     [<0eeb0a3b>] __of_add_property_sysfs+0x88/0x12c
> >     [<f6c64af0>] __of_attach_node_sysfs+0xcc/0x14c
> >     [<2a143d10>] of_core_init+0x90/0x114
> >     [<a353d0e1>] driver_init+0x30/0x48
> >     [<84ed01b1>] kernel_init_freeable+0xfc/0x3fc
> >     [<dc60f815>] kernel_init+0x20/0x110
> >     [<faa1c5b0>] ret_from_kernel_thread+0x14/0x1c

^ permalink raw reply

* Re: [PATCH v10 12/12] ima: Store the measurement again when appraising a modsig
From: Thiago Jung Bauermann @ 2019-05-28 19:14 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linux-integrity, linuxppc-dev, David Woodhouse,
	Serge E. Hallyn
In-Reply-To: <1559052560.4090.14.camel@linux.ibm.com>


Mimi Zohar <zohar@linux.ibm.com> writes:

> Hi Thiago,
>
> On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
>> If the IMA template contains the "modsig" or "d-modsig" field, then the
>> modsig should be added to the measurement list when the file is appraised.
>>
>> And that is what normally happens, but if a measurement rule caused a file
>> containing a modsig to be measured before a different rule causes it to be
>> appraised, the resulting measurement entry will not contain the modsig
>> because it is only fetched during appraisal. When the appraisal rule
>> triggers, it won't store a new measurement containing the modsig because
>> the file was already measured.
>>
>> We need to detect that situation and store an additional measurement with
>> the modsig. This is done by adding an IMA_MEASURE action flag if we read a
>> modsig and the IMA template contains a modsig field.
>
> With the new per policy rule "template" support being added, this
> patch needs to be modified so that the per policy "template" format is
> checked. ima_template_has_modsig() should be called with the
> template_desc being used.

Right. Thanks for point out what needs to be done. After rebasing on top
of Matthew Garret's "IMA: Allow profiles to define the desired IMA
template" patch I changed ima_template_has_modsig() to check the
template_desc obtained from process_measurement().

--
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v10 09/12] ima: Implement support for module-style appended signatures
From: Thiago Jung Bauermann @ 2019-05-28 19:23 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linux-integrity, linuxppc-dev, David Woodhouse,
	Serge E. Hallyn
In-Reply-To: <1557442868.10635.87.camel@linux.ibm.com>


Mimi Zohar <zohar@linux.ibm.com> writes:

> Hi Thiago,
>
>> diff --git a/security/integrity/ima/ima_policy.c b/security/integrity/ima/ima_policy.c
>> index fca7a3f23321..a7a20a8c15c1 100644
>> --- a/security/integrity/ima/ima_policy.c
>> +++ b/security/integrity/ima/ima_policy.c
>> @@ -1144,6 +1144,12 @@ void ima_delete_rules(void)
>>  	}
>>  }
>>
>> +#define __ima_hook_stringify(str)	(#str),
>> +
>> +const char *const func_tokens[] = {
>> +	__ima_hooks(__ima_hook_stringify)
>> +};
>> +
>>  #ifdef	CONFIG_IMA_READ_POLICY
>>  enum {
>>  	mask_exec = 0, mask_write, mask_read, mask_append
>> @@ -1156,12 +1162,6 @@ static const char *const mask_tokens[] = {
>>  	"MAY_APPEND"
>>  };
>>
>> -#define __ima_hook_stringify(str)	(#str),
>> -
>> -static const char *const func_tokens[] = {
>> -	__ima_hooks(__ima_hook_stringify)
>> -};
>> -
>>  void *ima_policy_start(struct seq_file *m, loff_t *pos)
>>  {
>>  	loff_t l = *pos;
>
> Is moving this something left over from previous versions or there is
> a need for this change?

Well, it's not a strong need, but it's still relevant in the current
version. I use func_tokens in ima_read_modsig() in order to be able to
mention the hook name in mod_check_sig()'s error message:

In ima_read_modsig():

	rc = mod_check_sig(sig, buf_len, func_tokens[func]);

And in mod_check_sig():

		pr_err("%s: Module is not signed with expected PKCS#7 message\n",
		       name);

If you think it's not worth it to expose func_tokens, I can make
ima_read_modsig() pass a more generic const string such as "IMA modsig"
for example.

> Other than this, the patch looks good.

Nice!

--
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply

* Re: [PATCH v10 09/12] ima: Implement support for module-style appended signatures
From: Thiago Jung Bauermann @ 2019-05-28 19:27 UTC (permalink / raw)
  To: Mimi Zohar
  Cc: Herbert Xu, linux-doc, Dmitry Kasatkin, David S. Miller,
	Jonathan Corbet, linux-kernel, James Morris, David Howells,
	AKASHI, Takahiro, linux-security-module, keyrings, linux-crypto,
	Jessica Yu, linux-integrity, linuxppc-dev, David Woodhouse,
	Serge E. Hallyn
In-Reply-To: <1557835765.4139.9.camel@linux.ibm.com>


Mimi Zohar <zohar@linux.ibm.com> writes:

> Hi Thiago,
>
> On Thu, 2019-04-18 at 00:51 -0300, Thiago Jung Bauermann wrote:
>> 
>> @@ -326,6 +356,10 @@ int ima_appraise_measurement(enum ima_hooks func,
>> case INTEGRITY_UNKNOWN:
>> break;
>> case INTEGRITY_NOXATTRS:/* No EVM protected xattrs. */
>> +/* It's fine not to have xattrs when using a modsig. */
>> +if (try_modsig)
>> +break;
>> +/* fall through */
>> case INTEGRITY_NOLABEL:/* No security.evm xattr. */
>> cause = "missing-HMAC";
>> goto out;
>> @@ -340,6 +374,14 @@ int ima_appraise_measurement(enum ima_hooks func,
>> rc = xattr_verify(func, iint, xattr_value, xattr_len, &status,
>>  &cause);
>> 
>> +/*
>> + * If we have a modsig and either no imasig or the imasig's key isn't
>> + * known, then try verifying the modsig.
>> + */
>> +if (status != INTEGRITY_PASS && try_modsig &&
>> + (!xattr_value || rc == -ENOKEY))
>> +rc = modsig_verify(func, modsig, &status, &cause);
>
> EVM protects other security xattrs, not just security.ima, if they
> exist. As a result, evm_verifyxattr() could pass based on the other
> security xattrs.

Indeed! It doesn't make sense to test for status != INTEGRITY_PASS here.
Not sure what I was thinking. Thanks for spotting it. With your other
comments about this if clause, this code now reads:

	/*
	 * If we have a modsig and either no imasig or the imasig's key isn't
	 * known, then try verifying the modsig.
	 */
	if (try_modsig &&
	    (!xattr_value || xattr_value->type == IMA_XATTR_DIGEST_NG ||
	     rc == -ENOKEY))
		rc = modsig_verify(func, modsig, &status, &cause);

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox