linux-arch.vger.kernel.org archive mirror
* [PATCH 0/4] ARC fixes for 3.12-rc3
From: Vineet Gupta @ 2013-09-27 10:57 UTC
  To: linux-kernel, linux-arch
  Cc: arc-linux-dev, u.kleine-koenig, Noam Camus, Gilad Ben-Yossef,
	Vineet Gupta

ARC fixes for 3.12-rc3.

Thx,
-Vineet

Mischa Jonker (1):
  ARC: Handle zero-overhead-loop in unaligned access handler

Uwe Kleine-König (1):
  ARC: Use clockevents_config_and_register over
    clockevents_register_device

Vineet Gupta (2):
  ARC: Fix 32-bit wrap around in access_ok()
  ARC: Workaround spinlock livelock in SMP SystemC simulation

 arch/arc/include/asm/spinlock.h | 9 ++++++++-
 arch/arc/include/asm/uaccess.h  | 4 ++--
 arch/arc/kernel/time.c          | 7 ++-----
 arch/arc/kernel/unaligned.c     | 6 ++++++
 4 files changed, 18 insertions(+), 8 deletions(-)

-- 
1.8.1.2

* [PATCH 1/4] ARC: Handle zero-overhead-loop in unaligned access handler
From: Vineet Gupta @ 2013-09-27 10:57 UTC
  To: linux-kernel, linux-arch
  Cc: arc-linux-dev, u.kleine-koenig, Noam Camus, Gilad Ben-Yossef,
	Mischa Jonker, Vineet Gupta

From: Mischa Jonker <mjonker@synopsys.com>

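If a load or store is the last instruction in a zero-overhead-loop, and
it's misaligned, the loop would execute only once.

Fix that by emulating the loop-back in the fixup path: when the updated
return address reaches lp_end and lp_count is non-zero, jump back to
lp_start and decrement lp_count.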

Signed-off-by: Mischa Jonker <mjonker@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/kernel/unaligned.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/arc/kernel/unaligned.c b/arch/arc/kernel/unaligned.c
index 28d1700..7ff5b5c 100644
--- a/arch/arc/kernel/unaligned.c
+++ b/arch/arc/kernel/unaligned.c
@@ -245,6 +245,12 @@ int misaligned_fixup(unsigned long address, struct pt_regs *regs,
 		regs->status32 &= ~STATUS_DE_MASK;
 	} else {
 		regs->ret += state.instr_len;
+
+		/* handle zero-overhead-loop */
+		if ((regs->ret == regs->lp_end) && (regs->lp_count)) {
+			regs->ret = regs->lp_start;
+			regs->lp_count--;
+		}
 	}
 
 	return 0;
-- 
1.8.1.2
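
The loop-back test added above can be tried outside the kernel. The
following is a minimal user-space sketch, not the kernel code: struct
zol_regs and zol_advance() are hypothetical names that stand in for the
few pt_regs fields the patch touches and for the tail of
misaligned_fixup().

```c
#include <stdint.h>

/* Hypothetical stand-in for the pt_regs fields used by the patch. */
struct zol_regs {
	uint32_t ret;       /* address execution resumes at */
	uint32_t lp_start;  /* first instruction of the loop body */
	uint32_t lp_end;    /* address just past the last loop instruction */
	uint32_t lp_count;  /* loop counter, as tested by the patch */
};

/*
 * Step past an emulated instruction of instr_len bytes, then apply the
 * same loop-back test as the patch: if the new return address is the
 * loop end and the counter is non-zero, go around the loop again.
 */
static void zol_advance(struct zol_regs *regs, uint32_t instr_len)
{
	regs->ret += instr_len;

	if (regs->ret == regs->lp_end && regs->lp_count) {
		regs->ret = regs->lp_start;
		regs->lp_count--;
	}
}
```

With ret one 4-byte instruction short of lp_end and lp_count at 3, a
call to zol_advance(&regs, 4) leaves ret at lp_start and lp_count at 2.
Without the test, ret would simply fall through past lp_end, which is
the "loop executes only once" symptom.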

* [PATCH 2/4] ARC: Fix 32-bit wrap around in access_ok()
From: Vineet Gupta @ 2013-09-27 10:57 UTC
  To: linux-kernel, linux-arch
  Cc: arc-linux-dev, u.kleine-koenig, Noam Camus, Gilad Ben-Yossef,
	Vineet Gupta

Anton reported

 | LTP tests syscalls/process_vm_readv01 and process_vm_writev01 fail
 | similarly in one testcase test_iov_invalid -> lvec->iov_base.
 | Testcase expects errno EFAULT and return code -1,
 | but it gets return code 1 and ERRNO is 0 what means success.

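Essentially, the test case passed a pointer of -1, which access_ok()
failed to catch. It checked [@addr + @sz <= TASK_SIZE], and for
@addr == -1 the sum wraps around 32 bits to a small value, so the check
passed.

Fix that by rewriting the check as [@addr <= TASK_SIZE - @sz]. @sz is
already verified to be <= TASK_SIZE, so the subtraction cannot underflow.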

Reported-by: Anton Kolesov <Anton.Kolesov@synopsys.com>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/include/asm/uaccess.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index 3242082..30c9baf 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -43,7 +43,7 @@
  * Because it essentially checks if buffer end is within limit and @len is
  * non-ngeative, which implies that buffer start will be within limit too.
  *
- * The reason for rewriting being, for majorit yof cases, @len is generally
+ * The reason for rewriting being, for majority of cases, @len is generally
  * compile time constant, causing first sub-expression to be compile time
  * subsumed.
  *
@@ -53,7 +53,7 @@
  *
  */
 #define __user_ok(addr, sz)	(((sz) <= TASK_SIZE) && \
-				 (((addr)+(sz)) <= get_fs()))
+				 ((addr) <= (get_fs() - (sz))))
 #define __access_ok(addr, sz)	(unlikely(__kernel_ok) || \
 				 likely(__user_ok((addr), (sz))))
 
-- 
1.8.1.2
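
The wrap-around is easy to reproduce with plain 32-bit arithmetic. The
sketch below is illustrative only: SKETCH_TASK_SIZE is a made-up limit
standing in for TASK_SIZE / get_fs(), and user_ok_old() / user_ok_new()
are hypothetical names that mirror the two forms of __user_ok() in the
diff above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space limit, standing in for TASK_SIZE / get_fs(). */
#define SKETCH_TASK_SIZE 0x60000000u

/* Old form: the 32-bit sum addr + sz can wrap past zero. */
static bool user_ok_old(uint32_t addr, uint32_t sz)
{
	return sz <= SKETCH_TASK_SIZE &&
	       (uint32_t)(addr + sz) <= SKETCH_TASK_SIZE;
}

/* New form: sz <= limit is checked first, so limit - sz cannot underflow. */
static bool user_ok_new(uint32_t addr, uint32_t sz)
{
	return sz <= SKETCH_TASK_SIZE &&
	       addr <= SKETCH_TASK_SIZE - sz;
}
```

For addr = 0xFFFFFFFF (a pointer of -1) and sz = 16, the old form
computes a sum of 15 and accepts the range, while the new form rejects
it. Both forms agree on ranges that do not wrap.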

* [PATCH 3/4] ARC: Workaround spinlock livelock in SMP SystemC simulation
From: Vineet Gupta @ 2013-09-27 10:57 UTC
  To: linux-kernel, linux-arch
  Cc: arc-linux-dev, u.kleine-koenig, Noam Camus, Gilad Ben-Yossef,
	Vineet Gupta

Some ARC SMP systems lack native atomic R-M-W (LLOCK/SCOND) insns and
can only use atomic EX insn (reg with mem) to build higher level R-M-W
primitives. This includes a SystemC based SMP simulation model.

So rwlocks need to use a protecting spinlock for atomic cmp-n-exchange
operation to update reader(s)/writer count.

The spinlock operation itself looks as follows:

	mov reg, 1		; 1=locked, 0=unlocked
retry:
	EX reg, [lock]		; load existing, store 1, atomically
	BREQ reg, 1, retry	; if already locked, retry

In single-threaded simulation, SystemC alternates between the 2 cores,
running "N" insns on one before switching to the other. Additionally, an
insn with a global side effect, such as EX writing to shared mem, forces
a core switch too.

Given that, with 2 cores doing repeated EX on the same location, Linux
often got into a livelock, e.g. when both cores were contending for the
tasklist lock (gdbserver / hackbench), one for read and the other for
write, as the sequence diagram below shows:

           core1                                   core2
         --------                                --------
1. spin lock [EX r=0, w=1] - LOCKED
2. rwlock(Read)            - LOCKED
3. spin unlock  [ST 0]     - UNLOCKED
4.                                       spin lock [EX r=0,w=1] - LOCKED
                      -- resched core 1----

5. spin lock [EX r=1] - ALREADY-LOCKED

                      -- resched core 2----
6.                                       rwlock(Write) - READER-LOCKED
7.                                       spin unlock [ST 0]
8.                                       rwlock failed, retry again

9.                                       spin lock  [EX r=0, w=1]
                      -- resched core 1----

10. spinlock locked in #9, retry #5
11. spin lock [EX gets 1]
                      -- resched core 2----
...
...

The fix is to unlock using the EX insn too (step 7), which triggers
another SystemC scheduling pass and lets core1 proceed, avoiding the
livelock.

Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/include/asm/spinlock.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arc/include/asm/spinlock.h b/arch/arc/include/asm/spinlock.h
index f158197..b6a8c2d 100644
--- a/arch/arc/include/asm/spinlock.h
+++ b/arch/arc/include/asm/spinlock.h
@@ -45,7 +45,14 @@ static inline int arch_spin_trylock(arch_spinlock_t *lock)
 
 static inline void arch_spin_unlock(arch_spinlock_t *lock)
 {
-	lock->slock = __ARCH_SPIN_LOCK_UNLOCKED__;
+	unsigned int tmp = __ARCH_SPIN_LOCK_UNLOCKED__;
+
+	__asm__ __volatile__(
+	"	ex  %0, [%1]		\n"
+	: "+r" (tmp)
+	: "r"(&(lock->slock))
+	: "memory");
+
 	smp_mb();
 }
 
-- 
1.8.1.2
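
The lock/unlock pairing can be modelled portably with C11 atomics. This
is a sketch under assumptions, not the ARC implementation:
atomic_exchange() stands in for the EX insn, and ex_trylock(),
ex_lock() and ex_unlock() are hypothetical names. It only shows that
the release becomes an exchange instead of a plain store. It does not
reproduce the SystemC scheduling behaviour.

```c
#include <stdatomic.h>

#define SKETCH_UNLOCKED 0u
#define SKETCH_LOCKED   1u

/* Returns non-zero if the lock was free and is now held by the caller. */
static int ex_trylock(atomic_uint *lock)
{
	/* EX analogue: store 1 and fetch the previous value atomically. */
	return atomic_exchange(lock, SKETCH_LOCKED) == SKETCH_UNLOCKED;
}

static void ex_lock(atomic_uint *lock)
{
	while (!ex_trylock(lock))
		;	/* already locked, retry */
}

static void ex_unlock(atomic_uint *lock)
{
	/* Release with an exchange rather than a plain store, as in the patch. */
	(void)atomic_exchange(lock, SKETCH_UNLOCKED);
}
```

On real hardware the exchange in ex_unlock() buys nothing over a store.
In the simulation described above it matters only because EX is one of
the insns that forces a core switch.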

* [PATCH 4/4] ARC: Use clockevents_config_and_register over clockevents_register_device
From: Vineet Gupta @ 2013-09-27 10:57 UTC
  To: linux-kernel, linux-arch
  Cc: arc-linux-dev, u.kleine-koenig, Noam Camus, Gilad Ben-Yossef,
	Vineet Gupta

From: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

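clockevents_config_and_register() derives mult/shift and the min/max
delta limits from the timer frequency and then registers the device,
which is more clever and correct than doing each step by hand; so use it.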

[vgupta: fixed build failure due to missing ; in patch]

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
---
 arch/arc/kernel/time.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/arch/arc/kernel/time.c b/arch/arc/kernel/time.c
index 0e51e69..3fde7de 100644
--- a/arch/arc/kernel/time.c
+++ b/arch/arc/kernel/time.c
@@ -227,12 +227,9 @@ void __attribute__((weak)) arc_local_timer_setup(unsigned int cpu)
 {
 	struct clock_event_device *clk = &per_cpu(arc_clockevent_device, cpu);
 
-	clockevents_calc_mult_shift(clk, arc_get_core_freq(), 5);
-
-	clk->max_delta_ns = clockevent_delta2ns(ARC_TIMER_MAX, clk);
 	clk->cpumask = cpumask_of(cpu);
-
-	clockevents_register_device(clk);
+	clockevents_config_and_register(clk, arc_get_core_freq(),
+					0, ARC_TIMER_MAX);
 
 	/*
 	 * setup the per-cpu timer IRQ handler - for all cpus
-- 
1.8.1.2
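
For readers unfamiliar with the mult/shift arithmetic that the removed
lines set up by hand, here is a simplified user-space sketch. It is not
the kernel implementation: the kernel picks shift dynamically, whereas
this sketch fixes it at 24 as an assumption, and every sketch_* name is
hypothetical. It also assumes freq_hz is high enough for mult to be
non-zero.

```c
#include <stdint.h>

#define SKETCH_NSEC_PER_SEC 1000000000ull
#define SKETCH_SHIFT        24u	/* fixed here; an assumption of this sketch */

struct sketch_clockevent {
	uint32_t mult;
	uint32_t shift;
	uint64_t min_delta_ns;
	uint64_t max_delta_ns;
};

/* Convert timer ticks to nanoseconds: ns = (ticks << shift) / mult. */
static uint64_t sketch_delta2ns(uint64_t ticks,
				const struct sketch_clockevent *ce)
{
	return (ticks << ce->shift) / ce->mult;
}

/* Derive mult/shift from the frequency, then both delta limits in ns. */
static void sketch_config(struct sketch_clockevent *ce, uint32_t freq_hz,
			  uint32_t min_ticks, uint32_t max_ticks)
{
	ce->shift = SKETCH_SHIFT;
	/* mult scales nanoseconds to ticks: ticks = (ns * mult) >> shift */
	ce->mult = (uint32_t)(((uint64_t)freq_hz << ce->shift) /
			      SKETCH_NSEC_PER_SEC);
	ce->min_delta_ns = sketch_delta2ns(min_ticks, ce);
	ce->max_delta_ns = sketch_delta2ns(max_ticks, ce);
}
```

With a hypothetical 80 MHz core clock and a 32-bit timer, calling
sketch_config(&ce, 80000000, 0, 0xFFFFFFFF) gives a max_delta_ns of
roughly 53.7 seconds. The point of the combined helper is that the
caller passes limits in ticks and no longer has to order these steps
itself.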
