LinuxPPC-Dev Archive on lore.kernel.org

LinuxPPC-Dev Archive on lore.kernel.org
 help / color / mirror / Atom feed

* Re: [PATCH v2 6/6] powerpc/pkeys: Deny read/write/execute by default
From: Florian Weimer @ 2018-06-19 13:19 UTC (permalink / raw)
  To: Michael Ellerman, Ram Pai
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, luto, msuchanek
In-Reply-To: <87602fx84g.fsf@concordia.ellerman.id.au>

On 06/19/2018 02:39 PM, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
>> Deny all permissions on all keys, with some exceptions.  pkey-0 must
>> allow all permissions, or else everything comes to a screaching halt.
>> Execute-only key must allow execute permission.
> 
> Another ABI change.
> 
> Are we calling this a bug fix?

Yes.  Note that the default is configurable on x86 (default is deny), so 
it's not really an ABI change as such IMHO.

Thanks,
Florian

^ permalink raw reply

* Re: [PATCH] powerpc/mm/hash/4k: Free hugetlb page table caches correctly.
From: Aneesh Kumar K.V @ 2018-06-19 13:25 UTC (permalink / raw)
  To: Michael Ellerman, npiggin, benh, paulus; +Cc: linuxppc-dev
In-Reply-To: <87o9g7xby7.fsf@concordia.ellerman.id.au>

On 06/19/2018 04:47 PM, Michael Ellerman wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
> 
>> With 4k page size for hugetlb we allocate hugepage directories from its on slab
>> cache. With patch 0c4d26802 ("powerpc/book3s64/mm: Simplify the rcu callback for page table free")
>> we missed to free these allocated hugepd tables.
>>
>> Update pgtable_free to handle hugetlb hugepd directory table.
>>
>> Fixes:  0c4d26802 ("powerpc/book3s64/mm: Simplify the rcu callback for page table free")
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>   arch/powerpc/include/asm/book3s/32/pgalloc.h  |  1 +
>>   .../include/asm/book3s/64/pgtable-4k.h        | 21 +++++++++++++++++++
>>   .../include/asm/book3s/64/pgtable-64k.h       |  9 ++++++++
>>   arch/powerpc/include/asm/book3s/64/pgtable.h  |  5 +++++
>>   arch/powerpc/include/asm/nohash/32/pgalloc.h  |  1 +
>>   arch/powerpc/include/asm/nohash/64/pgalloc.h  |  1 +
>>   arch/powerpc/mm/hugetlbpage.c                 |  3 ++-
>>   arch/powerpc/mm/pgtable-book3s64.c            | 12 +++++++++++
> 
> Fails with 4K=y HUGETLBFS=n:
> 
>    arch/powerpc/mm/pgtable-book3s64.c:415:16: error: ‘H_16M_CACHE_INDEX’ undeclared (first use in this function); did you mean ‘H_PUD_CACHE_INDEX’?
> 
> ...
> 
>> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
>> index c1f4ca45c93a..468c3d83a2aa 100644
>> --- a/arch/powerpc/mm/pgtable-book3s64.c
>> +++ b/arch/powerpc/mm/pgtable-book3s64.c
>> @@ -409,6 +409,18 @@ static inline void pgtable_free(void *table, int index)
>>   	case PUD_INDEX:
>>   		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
>>   		break;
>> +#ifdef CONFIG_PPC_4K_PAGES
>> +		/* 16M hugepd directory at pud level */
>> +	case HTLB_16M_INDEX:
>> +		BUILD_BUG_ON(H_16M_CACHE_INDEX <= 0);
>> +		kmem_cache_free(PGT_CACHE(H_16M_CACHE_INDEX), table);
>> +		break;
>> +		/* 16G hugepd directory at the pgd level */
>> +	case HTLB_16G_INDEX:
>> +		BUILD_BUG_ON(H_16G_CACHE_INDEX <= 0);
>> +		kmem_cache_free(PGT_CACHE(H_16G_CACHE_INDEX), table);
>> +		break;
>> +#endif
> 
> Because this isn't protected by CONFIG_HUGETLBFS.
> 
> I assume this is correct?
> 
> diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
> index 468c3d83a2aa..9b7007fd075e 100644
> --- a/arch/powerpc/mm/pgtable-book3s64.c
> +++ b/arch/powerpc/mm/pgtable-book3s64.c
> @@ -409,7 +409,7 @@ static inline void pgtable_free(void *table, int index)
>   	case PUD_INDEX:
>   		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
>   		break;
> -#ifdef CONFIG_PPC_4K_PAGES
> +#if defined(CONFIG_PPC_4K_PAGES) && defined (CONFIG_HUGETLBFS)
>   		/* 16M hugepd directory at pud level */
>   	case HTLB_16M_INDEX:
>   		BUILD_BUG_ON(H_16M_CACHE_INDEX <= 0);
> 
> 
> cheers
> 

Sorry missed that. Can we use #ifdef CONFIG_HUGETLB_PAGE ? That is what 
we use to protect that in pgtable-4k.h

-aneesh

-aneesh

^ permalink raw reply

* Re: [PATCH v1] mm: relax deferred struct page requirements
From: Pavel Tatashin @ 2018-06-19 13:50 UTC (permalink / raw)
  To: jslaby
  Cc: mhocko, Steven Sistare, Daniel Jordan, benh, paulus,
	Andrew Morton, kirill.shutemov, Reza Arbab, schwidefsky,
	Heiko Carstens, x86, LKML, tglx, linuxppc-dev,
	Linux Memory Management List, linux-s390, mgorman
In-Reply-To: <b16029f0-ada0-df25-071b-cd5dba0ab756@suse.cz>

On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>
> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
> > On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
> >> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> >> as all the page initialization code is in common code.
> >>
> >> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> >> does not really use hotplug memory functionality. So, we can remove this
> >> requirement as well.
> >>
> >> This patch allows to use deferred struct page initialization on all
> >> platforms with memblock allocator.
> >>
> >> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> >> PPC with CONFIG_MEMORY_HOTPLUG disabled.
> >
> > There is slight risk that we will encounter corner cases on some
> > architectures with weird memory layout/topology
>
> Which x86_32-pae seems to be. Many bad page state errors are emitted
> during boot when this patch is applied:

Hi Jiri,

Thank you for reporting this bug.

Because 32-bit systems are limited in the maximum amount of physical
memory, they don't need deferred struct pages. So, we can add depends
on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.

However, before we do this, I want to try reproducing this problem and
root cause it, as it might expose a general problem that is not 32-bit
specific.

Thank you,
Pavel

^ permalink raw reply

* [PATCH 2/3] [v2] m68k: mac: use time64_t in RTC handling
From: Arnd Bergmann @ 2018-06-19 14:02 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, Geert Uytterhoeven,
	Joshua Thompson
  Cc: Mathieu Malaterre, Benjamin Herrenschmidt, Greg Ungerer,
	linux-m68k, linuxppc-dev, linux-kernel, y2038, Meelis Roos,
	Andreas Schwab, Arnd Bergmann
In-Reply-To: <20180619140229.3615110-1-arnd@arndb.de>

The real-time clock on m68k (and powerpc) mac systems uses an unsigned
32-bit value starting in 1904, which overflows in 2040, about two years
later than everyone else, but this gets wrapped around in the Linux
code in 2038 already because of the deprecated usage of time_t and/or
long in the conversion.

Getting rid of the deprecated interfaces makes it work until 2040 as
documented, and it could be easily extended by reinterpreting
the resulting time64_t as a positive number. For the moment, I'm
adding a WARN_ON() that triggers if we encounter a time before 1970
or after 2040 (the two are indistinguishable).

This brings it in line with the corresponding code that we have on
powerpc macintosh.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v2: Fix varargs passing bug pointed out by Andreas Schwab
    Fix a typo that caused a build regression
---
 arch/m68k/mac/misc.c | 62 +++++++++++++++++++++++++++++++++-------------------
 1 file changed, 39 insertions(+), 23 deletions(-)

diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index c68054361615..0a2572a6bfe5 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -26,33 +26,39 @@
 
 #include <asm/machdep.h>
 
-/* Offset between Unix time (1970-based) and Mac time (1904-based) */
+/* Offset between Unix time (1970-based) and Mac time (1904-based). Cuda and PMU
+ * times wrap in 2040. If we need to handle later times, the read_time functions
+ * need to be changed to interpret wrapped times as post-2040. */
 
 #define RTC_OFFSET 2082844800
 
 static void (*rom_reset)(void);
 
 #ifdef CONFIG_ADB_CUDA
-static long cuda_read_time(void)
+static time64_t cuda_read_time(void)
 {
 	struct adb_request req;
-	long time;
+	time64_t time;
 
 	if (cuda_request(&req, NULL, 2, CUDA_PACKET, CUDA_GET_TIME) < 0)
 		return 0;
 	while (!req.complete)
 		cuda_poll();
 
-	time = (req.reply[3] << 24) | (req.reply[4] << 16) |
-	       (req.reply[5] << 8) | req.reply[6];
+	time = (u32)((req.reply[3] << 24) | (req.reply[4] << 16) |
+		     (req.reply[5] << 8) | req.reply[6]);
+
+	/* it's either after year 2040, or the RTC has gone backwards */
+	WARN_ON(time < RTC_OFFSET);
+
 	return time - RTC_OFFSET;
 }
 
-static void cuda_write_time(long data)
+static void cuda_write_time(time64_t time)
 {
 	struct adb_request req;
+	u32 data = lower_32_bits(time + RTC_OFFSET);
 
-	data += RTC_OFFSET;
 	if (cuda_request(&req, NULL, 6, CUDA_PACKET, CUDA_SET_TIME,
 			 (data >> 24) & 0xFF, (data >> 16) & 0xFF,
 			 (data >> 8) & 0xFF, data & 0xFF) < 0)
@@ -86,26 +92,30 @@ static void cuda_write_pram(int offset, __u8 data)
 #endif /* CONFIG_ADB_CUDA */
 
 #ifdef CONFIG_ADB_PMU68K
-static long pmu_read_time(void)
+static time64_t pmu_read_time(void)
 {
 	struct adb_request req;
-	long time;
+	time64_t time;
 
 	if (pmu_request(&req, NULL, 1, PMU_READ_RTC) < 0)
 		return 0;
 	while (!req.complete)
 		pmu_poll();
 
-	time = (req.reply[1] << 24) | (req.reply[2] << 16) |
-	       (req.reply[3] << 8) | req.reply[4];
+	time = (u32)((req.reply[1] << 24) | (req.reply[2] << 16) |
+		     (req.reply[3] << 8) | req.reply[4]);
+
+	/* it's either after year 2040, or the RTC has gone backwards */
+	WARN_ON(time < RTC_OFFSET);
+
 	return time - RTC_OFFSET;
 }
 
-static void pmu_write_time(long data)
+static void pmu_write_time(time64_t time)
 {
 	struct adb_request req;
+	u32 data = lower_32_bits(time + RTC_OFFSET);
 
-	data += RTC_OFFSET;
 	if (pmu_request(&req, NULL, 5, PMU_SET_RTC,
 			(data >> 24) & 0xFF, (data >> 16) & 0xFF,
 			(data >> 8) & 0xFF, data & 0xFF) < 0)
@@ -269,8 +279,12 @@ static long via_read_time(void)
 		via_pram_command(0x89, &result.cdata[1]);
 		via_pram_command(0x8D, &result.cdata[0]);
 
-		if (result.idata == last_result.idata)
+		if (result.idata == last_result.idata) {
+			if (result.idata < RTC_OFFSET)
+				result.idata += 0x100000000ull;
+
 			return result.idata - RTC_OFFSET;
+		}
 
 		if (++count > 10)
 			break;
@@ -291,11 +305,11 @@ static long via_read_time(void)
  * is basically any machine with Mac II-style ADB.
  */
 
-static void via_write_time(long time)
+static void via_write_time(time64_t time)
 {
 	union {
 		__u8 cdata[4];
-		long idata;
+		__u32 idata;
 	} data;
 	__u8 temp;
 
@@ -585,12 +599,15 @@ void mac_reset(void)
  * This function translates seconds since 1970 into a proper date.
  *
  * Algorithm cribbed from glibc2.1, __offtime().
+ *
+ * This is roughly same as rtc_time64_to_tm(), which we should probably
+ * use here, but it's only available when CONFIG_RTC_LIB is enabled.
  */
 #define SECS_PER_MINUTE (60)
 #define SECS_PER_HOUR  (SECS_PER_MINUTE * 60)
 #define SECS_PER_DAY   (SECS_PER_HOUR * 24)
 
-static void unmktime(unsigned long time, long offset,
+static void unmktime(time64_t time, long offset,
 		     int *yearp, int *monp, int *dayp,
 		     int *hourp, int *minp, int *secp)
 {
@@ -602,11 +619,10 @@ static void unmktime(unsigned long time, long offset,
 		/* Leap years.  */
 		{ 0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366 }
 	};
-	long int days, rem, y, wday, yday;
+	int days, rem, y, wday, yday;
 	const unsigned short int *ip;
 
-	days = time / SECS_PER_DAY;
-	rem = time % SECS_PER_DAY;
+	days = div_u64_rem(time, SECS_PER_DAY, &rem);
 	rem += offset;
 	while (rem < 0) {
 		rem += SECS_PER_DAY;
@@ -657,7 +673,7 @@ static void unmktime(unsigned long time, long offset,
 
 int mac_hwclk(int op, struct rtc_time *t)
 {
-	unsigned long now;
+	time64_t now;
 
 	if (!op) { /* read */
 		switch (macintosh_config->adb_type) {
@@ -693,8 +709,8 @@ int mac_hwclk(int op, struct rtc_time *t)
 		         __func__, t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
 		         t->tm_hour, t->tm_min, t->tm_sec);
 
-		now = mktime(t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
-			     t->tm_hour, t->tm_min, t->tm_sec);
+		now = mktime64(t->tm_year + 1900, t->tm_mon + 1, t->tm_mday,
+			       t->tm_hour, t->tm_min, t->tm_sec);
 
 		switch (macintosh_config->adb_type) {
 		case MAC_ADB_IOP:
-- 
2.9.0

^ permalink raw reply related

* [PATCH 1/3] [v2] powerpc: mac: fix rtc read/write functions
From: Arnd Bergmann @ 2018-06-19 14:02 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, Geert Uytterhoeven,
	Joshua Thompson
  Cc: Mathieu Malaterre, Benjamin Herrenschmidt, Greg Ungerer,
	linux-m68k, linuxppc-dev, linux-kernel, y2038, Meelis Roos,
	Andreas Schwab, Arnd Bergmann

As Mathieu pointed out, my conversion to time64_t was incorrect and resulted
in negative times to be read from the RTC. The problem is that during the
conversion from a byte array to a time64_t, the 'unsigned char' variable
holding the top byte gets turned into a negative signed 32-bit integer
before being assigned to the 64-bit variable for any times after 1972.

This changes the logic to cast to an unsigned 32-bit number first for
the Macintosh time and then convert that to the Unix time, which then gives
us a time in the documented 1904..2040 year range. I decided not to use
the longer 1970..2106 range that other drivers use, for consistency with
the literal interpretation of the register, but that could be easily
changed if we decide we want to support any Mac after 2040.

Just to be on the safe side, I'm also adding a WARN_ON that will trigger
if either the year 2040 has come and is observed by this driver, or we
run into an RTC that got set back to a pre-1970 date for some reason
(the two are indistinguishable).

For the RTC write functions, Andreas found another problem: both
pmu_request() and cuda_request() are varargs functions, so changing
the type of the arguments passed into them from 32 bit to 64 bit
breaks the API for the set_rtc_time functions. This changes it
back to 32 bits.

The same code exists in arch/m68k/ and is patched in an identical way now
in a separate patch.

Fixes: 5bfd643583b2 ("powerpc: use time64_t in read_persistent_clock")
Reported-by: Mathieu Malaterre <malat@debian.org>
Reported-by: Andreas Schwab <schwab@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/powerpc/platforms/powermac/time.c | 29 ++++++++++++++++++++---------
 1 file changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/platforms/powermac/time.c b/arch/powerpc/platforms/powermac/time.c
index 7c968e46736f..12e6e4d30602 100644
--- a/arch/powerpc/platforms/powermac/time.c
+++ b/arch/powerpc/platforms/powermac/time.c
@@ -42,7 +42,11 @@
 #define DBG(x...)
 #endif

-/* Apparently the RTC stores seconds since 1 Jan 1904 */
+/*
+ * Offset between Unix time (1970-based) and Mac time (1904-based). Cuda and PMU
+ * times wrap in 2040. If we need to handle later times, the read_time functions
+ * need to be changed to interpret wrapped times as post-2040.
+ */
 #define RTC_OFFSET	2082844800

 /*
@@ -97,8 +101,11 @@ static time64_t cuda_get_time(void)
 	if (req.reply_len != 7)
 		printk(KERN_ERR "cuda_get_time: got %d byte reply\n",
 		       req.reply_len);
-	now = (req.reply[3] << 24) + (req.reply[4] << 16)
-		+ (req.reply[5] << 8) + req.reply[6];
+	now = (u32)((req.reply[3] << 24) + (req.reply[4] << 16) +
+		    (req.reply[5] << 8) + req.reply[6]);
+	/* it's either after year 2040, or the RTC has gone backwards */
+	WARN_ON(now < RTC_OFFSET);
+
 	return now - RTC_OFFSET;
 }

@@ -106,10 +113,10 @@ static time64_t cuda_get_time(void)

 static int cuda_set_rtc_time(struct rtc_time *tm)
 {
-	time64_t nowtime;
+	u32 nowtime;
 	struct adb_request req;

-	nowtime = rtc_tm_to_time64(tm) + RTC_OFFSET;
+	nowtime = lower_32_bits(rtc_tm_to_time64(tm) + RTC_OFFSET);
 	if (cuda_request(&req, NULL, 6, CUDA_PACKET, CUDA_SET_TIME,
 			 nowtime >> 24, nowtime >> 16, nowtime >> 8,
 			 nowtime) < 0)
@@ -140,8 +147,12 @@ static time64_t pmu_get_time(void)
 	if (req.reply_len != 4)
 		printk(KERN_ERR "pmu_get_time: got %d byte reply from PMU\n",
 		       req.reply_len);
-	now = (req.reply[0] << 24) + (req.reply[1] << 16)
-		+ (req.reply[2] << 8) + req.reply[3];
+	now = (u32)((req.reply[0] << 24) + (req.reply[1] << 16)	+
+		    (req.reply[2] << 8) + req.reply[3]);
+
+	/* it's either after year 2040, or the RTC has gone backwards */
+	WARN_ON(now < RTC_OFFSET);
+
 	return now - RTC_OFFSET;
 }

@@ -149,10 +160,10 @@ static time64_t pmu_get_time(void)

 static int pmu_set_rtc_time(struct rtc_time *tm)
 {
-	time64_t nowtime;
+	u32 nowtime;
 	struct adb_request req;

-	nowtime = rtc_tm_to_time64(tm) + RTC_OFFSET;
+	nowtime = lower_32_bits(rtc_tm_to_time64(tm) + RTC_OFFSET);
 	if (pmu_request(&req, NULL, 5, PMU_SET_RTC, nowtime >> 24,
 			nowtime >> 16, nowtime >> 8, nowtime) < 0)
 		return -ENXIO;
-- 
2.9.0

^ permalink raw reply related

* [PATCH 3/3] [v2] m68k: remove unused set_clock_mmss() helpers
From: Arnd Bergmann @ 2018-06-19 14:02 UTC (permalink / raw)
  To: Paul Mackerras, Michael Ellerman, Geert Uytterhoeven,
	Joshua Thompson
  Cc: Mathieu Malaterre, Benjamin Herrenschmidt, Greg Ungerer,
	linux-m68k, linuxppc-dev, linux-kernel, y2038, Meelis Roos,
	Andreas Schwab, Arnd Bergmann
In-Reply-To: <20180619140229.3615110-1-arnd@arndb.de>

Commit 397ac99c6cef ("m68k: remove dead timer code") removed set_rtc_mmss()
because it was unused in 2012. However, this was itself the only user of the
mach_set_clock_mmss() callback and the many implementations of that callback,
which are equally unused.

This removes all of those as well.

Acked-by: Greg Ungerer <gerg@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
v2: no changes
---
 arch/m68k/apollo/config.c       |  8 ------
 arch/m68k/atari/config.c        |  5 ----
 arch/m68k/atari/time.c          | 63 -----------------------------------------
 arch/m68k/bvme6000/config.c     | 45 -----------------------------
 arch/m68k/include/asm/machdep.h |  1 -
 arch/m68k/kernel/setup_mm.c     |  1 -
 arch/m68k/kernel/setup_no.c     |  1 -
 arch/m68k/mac/config.c          |  2 --
 arch/m68k/mac/misc.c            | 16 -----------
 arch/m68k/mvme147/config.c      |  7 -----
 arch/m68k/mvme16x/config.c      |  8 ------
 arch/m68k/q40/config.c          | 30 --------------------
 12 files changed, 187 deletions(-)

diff --git a/arch/m68k/apollo/config.c b/arch/m68k/apollo/config.c
index b2a6bc63f8cd..aef8d42e078d 100644
--- a/arch/m68k/apollo/config.c
+++ b/arch/m68k/apollo/config.c
@@ -31,7 +31,6 @@ extern void dn_sched_init(irq_handler_t handler);
 extern void dn_init_IRQ(void);
 extern u32 dn_gettimeoffset(void);
 extern int dn_dummy_hwclk(int, struct rtc_time *);
-extern int dn_dummy_set_clock_mmss(unsigned long);
 extern void dn_dummy_reset(void);
 #ifdef CONFIG_HEARTBEAT
 static void dn_heartbeat(int on);
@@ -156,7 +155,6 @@ void __init config_apollo(void)
 	arch_gettimeoffset   = dn_gettimeoffset;
 	mach_max_dma_address = 0xffffffff;
 	mach_hwclk           = dn_dummy_hwclk; /* */
-	mach_set_clock_mmss  = dn_dummy_set_clock_mmss; /* */
 	mach_reset	     = dn_dummy_reset;  /* */
 #ifdef CONFIG_HEARTBEAT
 	mach_heartbeat = dn_heartbeat;
@@ -240,12 +238,6 @@ int dn_dummy_hwclk(int op, struct rtc_time *t) {
 
 }
 
-int dn_dummy_set_clock_mmss(unsigned long nowtime)
-{
-	pr_info("set_clock_mmss\n");
-	return 0;
-}
-
 void dn_dummy_reset(void) {
 
   dn_serial_print("The end !\n");
diff --git a/arch/m68k/atari/config.c b/arch/m68k/atari/config.c
index 565c6f06ab0b..bd96702a1ad0 100644
--- a/arch/m68k/atari/config.c
+++ b/arch/m68k/atari/config.c
@@ -81,9 +81,6 @@ extern void atari_sched_init(irq_handler_t);
 extern u32 atari_gettimeoffset(void);
 extern int atari_mste_hwclk (int, struct rtc_time *);
 extern int atari_tt_hwclk (int, struct rtc_time *);
-extern int atari_mste_set_clock_mmss (unsigned long);
-extern int atari_tt_set_clock_mmss (unsigned long);
-
 
 /* ++roman: This is a more elaborate test for an SCC chip, since the plain
  * Medusa board generates DTACK at the SCC's standard addresses, but a SCC
@@ -362,13 +359,11 @@ void __init config_atari(void)
 		ATARIHW_SET(TT_CLK);
 		pr_cont(" TT_CLK");
 		mach_hwclk = atari_tt_hwclk;
-		mach_set_clock_mmss = atari_tt_set_clock_mmss;
 	}
 	if (hwreg_present(&mste_rtc.sec_ones)) {
 		ATARIHW_SET(MSTE_CLK);
 		pr_cont(" MSTE_CLK");
 		mach_hwclk = atari_mste_hwclk;
-		mach_set_clock_mmss = atari_mste_set_clock_mmss;
 	}
 	if (!MACH_IS_MEDUSA && hwreg_present(&dma_wd.fdc_speed) &&
 	    hwreg_write(&dma_wd.fdc_speed, 0)) {
diff --git a/arch/m68k/atari/time.c b/arch/m68k/atari/time.c
index c549b48174ec..9cca64286464 100644
--- a/arch/m68k/atari/time.c
+++ b/arch/m68k/atari/time.c
@@ -285,69 +285,6 @@ int atari_tt_hwclk( int op, struct rtc_time *t )
     return( 0 );
 }
 
-
-int atari_mste_set_clock_mmss (unsigned long nowtime)
-{
-    short real_seconds = nowtime % 60, real_minutes = (nowtime / 60) % 60;
-    struct MSTE_RTC val;
-    unsigned char rtc_minutes;
-
-    mste_read(&val);
-    rtc_minutes= val.min_ones + val.min_tens * 10;
-    if ((rtc_minutes < real_minutes
-         ? real_minutes - rtc_minutes
-         : rtc_minutes - real_minutes) < 30)
-    {
-        val.sec_ones = real_seconds % 10;
-        val.sec_tens = real_seconds / 10;
-        val.min_ones = real_minutes % 10;
-        val.min_tens = real_minutes / 10;
-        mste_write(&val);
-    }
-    else
-        return -1;
-    return 0;
-}
-
-int atari_tt_set_clock_mmss (unsigned long nowtime)
-{
-    int retval = 0;
-    short real_seconds = nowtime % 60, real_minutes = (nowtime / 60) % 60;
-    unsigned char save_control, save_freq_select, rtc_minutes;
-
-    save_control = RTC_READ (RTC_CONTROL); /* tell the clock it's being set */
-    RTC_WRITE (RTC_CONTROL, save_control | RTC_SET);
-
-    save_freq_select = RTC_READ (RTC_FREQ_SELECT); /* stop and reset prescaler */
-    RTC_WRITE (RTC_FREQ_SELECT, save_freq_select | RTC_DIV_RESET2);
-
-    rtc_minutes = RTC_READ (RTC_MINUTES);
-    if (!(save_control & RTC_DM_BINARY))
-	rtc_minutes = bcd2bin(rtc_minutes);
-
-    /* Since we're only adjusting minutes and seconds, don't interfere
-       with hour overflow.  This avoids messing with unknown time zones
-       but requires your RTC not to be off by more than 30 minutes.  */
-    if ((rtc_minutes < real_minutes
-         ? real_minutes - rtc_minutes
-         : rtc_minutes - real_minutes) < 30)
-        {
-            if (!(save_control & RTC_DM_BINARY))
-                {
-		    real_seconds = bin2bcd(real_seconds);
-		    real_minutes = bin2bcd(real_minutes);
-                }
-            RTC_WRITE (RTC_SECONDS, real_seconds);
-            RTC_WRITE (RTC_MINUTES, real_minutes);
-        }
-    else
-        retval = -1;
-
-    RTC_WRITE (RTC_FREQ_SELECT, save_freq_select);
-    RTC_WRITE (RTC_CONTROL, save_control);
-    return retval;
-}
-
 /*
  * Local variables:
  *  c-indent-level: 4
diff --git a/arch/m68k/bvme6000/config.c b/arch/m68k/bvme6000/config.c
index 2cfff4765040..143ee9fa3893 100644
--- a/arch/m68k/bvme6000/config.c
+++ b/arch/m68k/bvme6000/config.c
@@ -41,7 +41,6 @@ static void bvme6000_get_model(char *model);
 extern void bvme6000_sched_init(irq_handler_t handler);
 extern u32 bvme6000_gettimeoffset(void);
 extern int bvme6000_hwclk (int, struct rtc_time *);
-extern int bvme6000_set_clock_mmss (unsigned long);
 extern void bvme6000_reset (void);
 void bvme6000_set_vectors (void);
 
@@ -113,7 +112,6 @@ void __init config_bvme6000(void)
     mach_init_IRQ        = bvme6000_init_IRQ;
     arch_gettimeoffset   = bvme6000_gettimeoffset;
     mach_hwclk           = bvme6000_hwclk;
-    mach_set_clock_mmss	 = bvme6000_set_clock_mmss;
     mach_reset		 = bvme6000_reset;
     mach_get_model       = bvme6000_get_model;
 
@@ -305,46 +303,3 @@ int bvme6000_hwclk(int op, struct rtc_time *t)
 
 	return 0;
 }
-
-/*
- * Set the minutes and seconds from seconds value 'nowtime'.  Fail if
- * clock is out by > 30 minutes.  Logic lifted from atari code.
- * Algorithm is to wait for the 10ms register to change, and then to
- * wait a short while, and then set it.
- */
-
-int bvme6000_set_clock_mmss (unsigned long nowtime)
-{
-	int retval = 0;
-	short real_seconds = nowtime % 60, real_minutes = (nowtime / 60) % 60;
-	unsigned char rtc_minutes, rtc_tenms;
-	volatile RtcPtr_t rtc = (RtcPtr_t)BVME_RTC_BASE;
-	unsigned char msr = rtc->msr & 0xc0;
-	unsigned long flags;
-	volatile int i;
-
-	rtc->msr = 0;		/* Ensure clock accessible */
-	rtc_minutes = bcd2bin (rtc->bcd_min);
-
-	if ((rtc_minutes < real_minutes
-		? real_minutes - rtc_minutes
-			: rtc_minutes - real_minutes) < 30)
-	{
-		local_irq_save(flags);
-		rtc_tenms = rtc->bcd_tenms;
-		while (rtc_tenms == rtc->bcd_tenms)
-			;
-		for (i = 0; i < 1000; i++)
-			;
-		rtc->bcd_min = bin2bcd(real_minutes);
-		rtc->bcd_sec = bin2bcd(real_seconds);
-		local_irq_restore(flags);
-	}
-	else
-		retval = -1;
-
-	rtc->msr = msr;
-
-	return retval;
-}
-
diff --git a/arch/m68k/include/asm/machdep.h b/arch/m68k/include/asm/machdep.h
index 1605da48ebf2..49bd3266b4b1 100644
--- a/arch/m68k/include/asm/machdep.h
+++ b/arch/m68k/include/asm/machdep.h
@@ -22,7 +22,6 @@ extern int (*mach_hwclk)(int, struct rtc_time*);
 extern unsigned int (*mach_get_ss)(void);
 extern int (*mach_get_rtc_pll)(struct rtc_pll_info *);
 extern int (*mach_set_rtc_pll)(struct rtc_pll_info *);
-extern int (*mach_set_clock_mmss)(unsigned long);
 extern void (*mach_reset)( void );
 extern void (*mach_halt)( void );
 extern void (*mach_power_off)( void );
diff --git a/arch/m68k/kernel/setup_mm.c b/arch/m68k/kernel/setup_mm.c
index f35e3ebd6331..07244732eb41 100644
--- a/arch/m68k/kernel/setup_mm.c
+++ b/arch/m68k/kernel/setup_mm.c
@@ -88,7 +88,6 @@ void (*mach_get_hardware_list) (struct seq_file *m);
 /* machine dependent timer functions */
 int (*mach_hwclk) (int, struct rtc_time*);
 EXPORT_SYMBOL(mach_hwclk);
-int (*mach_set_clock_mmss) (unsigned long);
 unsigned int (*mach_get_ss)(void);
 int (*mach_get_rtc_pll)(struct rtc_pll_info *);
 int (*mach_set_rtc_pll)(struct rtc_pll_info *);
diff --git a/arch/m68k/kernel/setup_no.c b/arch/m68k/kernel/setup_no.c
index a98af1018201..3c53e4c366ac 100644
--- a/arch/m68k/kernel/setup_no.c
+++ b/arch/m68k/kernel/setup_no.c
@@ -51,7 +51,6 @@ char __initdata command_line[COMMAND_LINE_SIZE];
 
 /* machine dependent timer functions */
 void (*mach_sched_init)(irq_handler_t handler) __initdata = NULL;
-int (*mach_set_clock_mmss)(unsigned long);
 int (*mach_hwclk) (int, struct rtc_time*);
 
 /* machine dependent reboot functions */
diff --git a/arch/m68k/mac/config.c b/arch/m68k/mac/config.c
index e522307db47c..da1aeb966474 100644
--- a/arch/m68k/mac/config.c
+++ b/arch/m68k/mac/config.c
@@ -57,7 +57,6 @@ static unsigned long mac_orig_videoaddr;
 /* Mac specific timer functions */
 extern u32 mac_gettimeoffset(void);
 extern int mac_hwclk(int, struct rtc_time *);
-extern int mac_set_clock_mmss(unsigned long);
 extern void iop_preinit(void);
 extern void iop_init(void);
 extern void via_init(void);
@@ -158,7 +157,6 @@ void __init config_mac(void)
 	mach_get_model = mac_get_model;
 	arch_gettimeoffset = mac_gettimeoffset;
 	mach_hwclk = mac_hwclk;
-	mach_set_clock_mmss = mac_set_clock_mmss;
 	mach_reset = mac_reset;
 	mach_halt = mac_poweroff;
 	mach_power_off = mac_poweroff;
diff --git a/arch/m68k/mac/misc.c b/arch/m68k/mac/misc.c
index 0a2572a6bfe5..bf8df47a6d09 100644
--- a/arch/m68k/mac/misc.c
+++ b/arch/m68k/mac/misc.c
@@ -735,19 +735,3 @@ int mac_hwclk(int op, struct rtc_time *t)
 	}
 	return 0;
 }
-
-/*
- * Set minutes/seconds in the hardware clock
- */
-
-int mac_set_clock_mmss (unsigned long nowtime)
-{
-	struct rtc_time now;
-
-	mac_hwclk(0, &now);
-	now.tm_sec = nowtime % 60;
-	now.tm_min = (nowtime / 60) % 60;
-	mac_hwclk(1, &now);
-
-	return 0;
-}
diff --git a/arch/m68k/mvme147/config.c b/arch/m68k/mvme147/config.c
index f8a710fd84cd..adea549d240e 100644
--- a/arch/m68k/mvme147/config.c
+++ b/arch/m68k/mvme147/config.c
@@ -40,7 +40,6 @@ static void mvme147_get_model(char *model);
 extern void mvme147_sched_init(irq_handler_t handler);
 extern u32 mvme147_gettimeoffset(void);
 extern int mvme147_hwclk (int, struct rtc_time *);
-extern int mvme147_set_clock_mmss (unsigned long);
 extern void mvme147_reset (void);
 
 
@@ -92,7 +91,6 @@ void __init config_mvme147(void)
 	mach_init_IRQ		= mvme147_init_IRQ;
 	arch_gettimeoffset	= mvme147_gettimeoffset;
 	mach_hwclk		= mvme147_hwclk;
-	mach_set_clock_mmss	= mvme147_set_clock_mmss;
 	mach_reset		= mvme147_reset;
 	mach_get_model		= mvme147_get_model;
 
@@ -164,8 +162,3 @@ int mvme147_hwclk(int op, struct rtc_time *t)
 	}
 	return 0;
 }
-
-int mvme147_set_clock_mmss (unsigned long nowtime)
-{
-	return 0;
-}
diff --git a/arch/m68k/mvme16x/config.c b/arch/m68k/mvme16x/config.c
index 4ffd9ef98de4..6ee36a5b528d 100644
--- a/arch/m68k/mvme16x/config.c
+++ b/arch/m68k/mvme16x/config.c
@@ -46,7 +46,6 @@ static void mvme16x_get_model(char *model);
 extern void mvme16x_sched_init(irq_handler_t handler);
 extern u32 mvme16x_gettimeoffset(void);
 extern int mvme16x_hwclk (int, struct rtc_time *);
-extern int mvme16x_set_clock_mmss (unsigned long);
 extern void mvme16x_reset (void);
 
 int bcd2int (unsigned char b);
@@ -280,7 +279,6 @@ void __init config_mvme16x(void)
     mach_init_IRQ        = mvme16x_init_IRQ;
     arch_gettimeoffset   = mvme16x_gettimeoffset;
     mach_hwclk           = mvme16x_hwclk;
-    mach_set_clock_mmss	 = mvme16x_set_clock_mmss;
     mach_reset		 = mvme16x_reset;
     mach_get_model       = mvme16x_get_model;
     mach_get_hardware_list = mvme16x_get_hardware_list;
@@ -411,9 +409,3 @@ int mvme16x_hwclk(int op, struct rtc_time *t)
 	}
 	return 0;
 }
-
-int mvme16x_set_clock_mmss (unsigned long nowtime)
-{
-	return 0;
-}
-
diff --git a/arch/m68k/q40/config.c b/arch/m68k/q40/config.c
index 71c0867ecf20..96810d91da2b 100644
--- a/arch/m68k/q40/config.c
+++ b/arch/m68k/q40/config.c
@@ -43,7 +43,6 @@ extern void q40_sched_init(irq_handler_t handler);
 static u32 q40_gettimeoffset(void);
 static int q40_hwclk(int, struct rtc_time *);
 static unsigned int q40_get_ss(void);
-static int q40_set_clock_mmss(unsigned long);
 static int q40_get_rtc_pll(struct rtc_pll_info *pll);
 static int q40_set_rtc_pll(struct rtc_pll_info *pll);
 
@@ -175,7 +174,6 @@ void __init config_q40(void)
 	mach_get_ss = q40_get_ss;
 	mach_get_rtc_pll = q40_get_rtc_pll;
 	mach_set_rtc_pll = q40_set_rtc_pll;
-	mach_set_clock_mmss = q40_set_clock_mmss;
 
 	mach_reset = q40_reset;
 	mach_get_model = q40_get_model;
@@ -267,34 +265,6 @@ static unsigned int q40_get_ss(void)
 	return bcd2bin(Q40_RTC_SECS);
 }
 
-/*
- * Set the minutes and seconds from seconds value 'nowtime'.  Fail if
- * clock is out by > 30 minutes.  Logic lifted from atari code.
- */
-
-static int q40_set_clock_mmss(unsigned long nowtime)
-{
-	int retval = 0;
-	short real_seconds = nowtime % 60, real_minutes = (nowtime / 60) % 60;
-
-	int rtc_minutes;
-
-	rtc_minutes = bcd2bin(Q40_RTC_MINS);
-
-	if ((rtc_minutes < real_minutes ?
-	     real_minutes - rtc_minutes :
-	     rtc_minutes - real_minutes) < 30) {
-		Q40_RTC_CTRL |= Q40_RTC_WRITE;
-		Q40_RTC_MINS = bin2bcd(real_minutes);
-		Q40_RTC_SECS = bin2bcd(real_seconds);
-		Q40_RTC_CTRL &= ~(Q40_RTC_WRITE);
-	} else
-		retval = -1;
-
-	return retval;
-}
-
-
 /* get and set PLL calibration of RTC clock */
 #define Q40_RTC_PLL_MASK ((1<<5)-1)
 #define Q40_RTC_PLL_SIGN (1<<5)
-- 
2.9.0

^ permalink raw reply related

* Re: [PATCH v2 1/6] powerpc/pkeys: Enable all user-allocatable pkeys at init.
From: Ram Pai @ 2018-06-19 14:25 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, fweimer, luto, msuchanek
In-Reply-To: <878t7bx84n.fsf@concordia.ellerman.id.au>

On Tue, Jun 19, 2018 at 10:39:52PM +1000, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > In a multithreaded application, a key allocated by one thread must
> > be activate and usable on all threads.
> >
> > Currently this is not the case, because the UAMOR bits for all keys are
> > disabled by default. When a new key is allocated in one thread, though
> > the corresponding UAMOR bits for that thread get enabled, the UAMOR bits
> > for all other existing threads continue to have their bits disabled.
> > Other threads have no way to set permissions on the key, effectively
> > making the key useless.
> 
> This all seems a bit strongly worded to me. It's arguable whether a key
> should be usable by the thread that allocated it or all threads.
> 
> You could conceivably have a design where threads are blocked from using
> a key until they're given permission to do so by the thread that
> allocated the key.
> 
> But we're changing the behaviour to match x86 and because we don't have
> an API to grant another thread access to a key. Right?
> 

correct. The other threads have no way to access or change the
permissions on the key.


> > Enable the UAMOR bits for all keys, at process creation. Since the
> > contents of UAMOR are inherited at fork, all threads are capable of
> > modifying the permissions on any key.
> >
> > BTW: changing the permission on unallocated keys has no effect, till
> > those keys are not associated with any PTEs. The kernel will anyway
> > disallow to association of unallocated keys with PTEs.
> 
> This is an ABI change, which is bad, but I guess we call it a bug fix
> because things didn't really work previously?

Yes its a behaviorial change for the better. There is no downside
to the change because no applications should break. Single threaded
apps will continue to just work fine. Multithreaded applications,
which were unable to consume the API/ABI, will now be able to do so.

> 
> I'll tag it:
> 
>   Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem")
>   Cc: stable@vger.kernel.org # v4.16+

thanks,
RP

^ permalink raw reply

* Re: [PATCH v2 2/6] powerpc/pkeys: Save the pkey registers before fork
From: Ram Pai @ 2018-06-19 14:28 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, fweimer, luto, msuchanek
In-Reply-To: <877emvx84j.fsf@concordia.ellerman.id.au>

On Tue, Jun 19, 2018 at 10:39:56PM +1000, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > When a thread forks the contents of AMR, IAMR, UAMOR registers in the
> > newly forked thread are not inherited.
> >
> > Save the registers before forking, for content of those
> > registers to be automatically copied into the new thread.
> >
> > CC: Michael Ellerman <mpe@ellerman.id.au>
> > CC: Florian Weimer <fweimer@redhat.com>
> > CC: Andy Lutomirski <luto@kernel.org>
> > CC: Thiago Jung Bauermann <bauerman@linux.ibm.com>
> > Signed-off-by: Ram Pai <linuxram@us.ibm.com>
> 
> Again this is an ABI change but we'll call it a bug fix I guess.

yes. the same defense here too. its a behaviorial change for the better.
Single threaded applications will not see any behaviorial change.
Multithreaded apps, which were unable to consume, the behavior will now be
able to do so.

> 
> I'll add:
> 
>   Fixes: cf43d3b26452 ("powerpc: Enable pkey subsystem")
>   Cc: stable@vger.kernel.org # v4.16+

yes.  Thanks
RP

^ permalink raw reply

* Re: Constant ata messages on console with commit 28361c403683 ("libata: add extra internal command") (was Re: [GIT PULL 2/2] libata changes for v4.18-rc1)
From: Jens Axboe @ 2018-06-19 15:26 UTC (permalink / raw)
  To: Michael Ellerman, Tejun Heo
  Cc: linux-kernel, linux-ide, Linus Torvalds, linuxppc-dev
In-Reply-To: <87sh5jxmif.fsf@concordia.ellerman.id.au>

On 6/19/18 1:29 AM, Michael Ellerman wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>> On 6/18/18 1:33 AM, Michael Ellerman wrote:
>>> Tejun Heo <tj@kernel.org> writes:
>>> ...
>>>> Jens Axboe (10):
>>>>       libata: introduce notion of separate hardware tags
>>>>       libata: convert core and drivers to ->hw_tag usage
>>>>       libata: bump ->qc_active to a 64-bit type
>>>>       libata: use ata_tag_internal() consistently
>>>>       libata: remove assumption that ATA_MAX_QUEUE - 1 is the max
>>>>       sata_nv: set host can_queue count appropriately
>>>>       libata: add extra internal command
>>>
>>> Replying here because I can't find the original mail.
>>>
>>> The above commit is causing one of my machines to constantly spew ata
>>> messages on the console, according to bisect:
>>>
>>> # first bad commit: [28361c403683c2b00d4f5e76045f3ccd299bf99d] libata: add extra internal command
>>>
>>> To get it to boot I have to also apply:
>>>
>>>   88e10092f6a6 ("sata_fsl: use the right type for tag bitshift")
>>>
>>>
>>> The system boots OK and seems fine, except that it's just printing
>>> multiple of these per second:
>>>
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>
>>> And it never seems to stop.
>>>
>>> The machine is a Freescale/NXP P5020ds, using the sata_fsl driver
>>> presumably. Any ideas?
>>
>> Hmm that's odd. Can you include the boot log from a working boot as
>> well? Would be nice to see what devices are on the sata adapter.
>> The above just looks like a hardreset loop.
> 
> Ah yep. I stupidly assumed it was working, because the machine booted,
> but that's because the root disk is on ata1.
> 
> Booting the good kernel:
> 
>   ba80c3a572f4 ("sata_nv: set host can_queue count appropriately")
> 
> I see:
> 
>   root@p5020ds:~# ls -l /sys/class/ata_port/
>   total 0
>   lrwxrwxrwx 1 root root 0 Jun 19 06:49 ata1 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/ata_port/ata1
>   lrwxrwxrwx 1 root root 0 Jun 19 17:06 ata2 -> ../../devices/platform/ffe000000.soc/ffe221000.sata/ata2/ata_port/ata2
> 
>   root@p5020ds:~# ls -l /sys/class/block/ | grep ata
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda1 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda2 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda2
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda5 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda5
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sr0 -> ../../devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/block/sr0
> 
> So it's the DVD drive.
> 
>   root@p5020ds:/sys/devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/scsi_device/1:0:0:0/device# cat vendor 
>   Optiarc 
>   root@p5020ds:/sys/devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/scsi_device/1:0:0:0/device# cat model
>   DVD RW AD-7260S 
> 
> 
> Full boot log from a good boot attached if that's helpful.
> 
> All of the above looks the same when I boot with the broken setup, it
> just spams dmesg constantly.
> 
> One thing that is different, on the good kernel I see:
> 
>   root@p5020ds:~# mount /dev/sr0 /mnt
>   mount: no medium found on /dev/sr0
> 
> vs bad (88e10092f6a6):
> 
>   root@p5020ds:~# mount /dev/sr0 /mnt
>   mount: /dev/sr0 is already mounted or /mnt busy

Can you try the below patch, on both 4.17 and on current -git? Might
help shed some light on this. The fsl driver does weird stuff with
the internal tag, I'm guessing that's related.


diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index b8d9cfc60374..9bac5ba36dac 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -28,6 +28,9 @@
 #include <linux/of_irq.h>
 #include <linux/of_platform.h>
 
+#undef DPRINTK
+#define DPRINTK printk
+
 static unsigned int intr_coalescing_count;
 module_param(intr_coalescing_count, int, S_IRUGO);
 MODULE_PARM_DESC(intr_coalescing_count,
@@ -318,7 +321,7 @@ static void fsl_sata_set_irq_coalescing(struct ata_host *host,
 
 	DPRINTK("interrupt coalescing, count = 0x%x, ticks = %x\n",
 			intr_coalescing_count, intr_coalescing_ticks);
-	DPRINTK("ICC register status: (hcr base: 0x%x) = 0x%x\n",
+	DPRINTK("ICC register status: (hcr base: 0x%p) = 0x%x\n",
 			hcr_base, ioread32(hcr_base + ICC));
 }
 
@@ -1479,7 +1482,7 @@ static int sata_fsl_probe(struct platform_device *ofdev)
 	}
 
 	DPRINTK("@reset i/o = 0x%x\n", ioread32(csr_base + TRANSCFG));
-	DPRINTK("sizeof(cmd_desc) = %d\n", sizeof(struct command_desc));
+	DPRINTK("sizeof(cmd_desc) = %d\n", (int) sizeof(struct command_desc));
 	DPRINTK("sizeof(#define cmd_desc) = %d\n", SATA_FSL_CMD_DESC_SIZE);
 
 	host_priv = kzalloc(sizeof(struct sata_fsl_host_priv), GFP_KERNEL);

-- 
Jens Axboe

^ permalink raw reply related

* Re: [PATCH] arch: powerpc: pci-common: fix wrong return value check on phd_id
From: Guilherme Piccoli @ 2018-06-19 15:29 UTC (permalink / raw)
  To: Daniel Walker
  Cc: Andrew Morton, xe-kernel, Gavin Shan, linux-kernel,
	Paul Mackerras, Ian Munsie, linuxppc-dev, Mauro Rodrigues
In-Reply-To: <20180618165706.42679-1-danielwa@cisco.com>

On Mon, Jun 18, 2018 at 1:57 PM, Daniel Walker <danielwa@cisco.com> wrote:
> Cisco has a couple platforms which depend on the domain values getting
> set a certain way. We discovered our machines not detecting the pci
> devices, and traced it back to this commit,
>
> 63a7228 powerpc/pci: Assign fixed PHB number based on device-tree
> properties
>
> It seems that the code is expecting the return value of of_property_read_u64()
> to be the opposite of what it actually is.. It returns zero on success, and a
> negative return value on error. So if you only check when it's non-zero your
> going to set Opal for all platforms but Opal, which I assume is not what was
> expected.

Hi Daniel, thanks for exposing a potential issue and trying to fix.
(And thanks Mauro to point this message, since my email changed).

I disagree with your patch Daniel, I think it's wrong.
Since the logic of my original patch is kind of convoluted, for improve
(even my own) understanding, I tried to over-comment the logic here:
https://pastebin.ubuntu.com/p/4DNvd9b2bY

Please, take a look and see if you agree with the logic now.

Regarding your issue, I'm interested - I think perhaps you have some FW
flaw that might be exposing duplicated "reg" or "ibm,opal-phbid" properties.
Or maybe you somehow rely on the old incremental PHB setting and your
software stack is not working properly with the DT-based PHB values.

Please, elaborate the issue a bit more in order we can try to figure
out. And also, you could try reverting the below two patches and boot
the kernel to be sure it works:

61e8a0d5a02 powerpc/pci: Fix endian bug in fixed PHB numbering
63a72284b15 powerpc/pci: Assign fixed PHB number based on device-tree properties

*Notice* the first is a fix to my patch, and both should be tested together.
Be sure your current kernel has _both or neither_ these patches!

Cheers,

Guilherme

^ permalink raw reply

* Re: Constant ata messages on console with commit 28361c403683 ("libata: add extra internal command") (was Re: [GIT PULL 2/2] libata changes for v4.18-rc1)
From: Jens Axboe @ 2018-06-19 15:39 UTC (permalink / raw)
  To: Michael Ellerman, Tejun Heo
  Cc: linux-kernel, linux-ide, Linus Torvalds, linuxppc-dev
In-Reply-To: <87sh5jxmif.fsf@concordia.ellerman.id.au>

On 6/19/18 1:29 AM, Michael Ellerman wrote:
> Jens Axboe <axboe@kernel.dk> writes:
>> On 6/18/18 1:33 AM, Michael Ellerman wrote:
>>> Tejun Heo <tj@kernel.org> writes:
>>> ...
>>>> Jens Axboe (10):
>>>>       libata: introduce notion of separate hardware tags
>>>>       libata: convert core and drivers to ->hw_tag usage
>>>>       libata: bump ->qc_active to a 64-bit type
>>>>       libata: use ata_tag_internal() consistently
>>>>       libata: remove assumption that ATA_MAX_QUEUE - 1 is the max
>>>>       sata_nv: set host can_queue count appropriately
>>>>       libata: add extra internal command
>>>
>>> Replying here because I can't find the original mail.
>>>
>>> The above commit is causing one of my machines to constantly spew ata
>>> messages on the console, according to bisect:
>>>
>>> # first bad commit: [28361c403683c2b00d4f5e76045f3ccd299bf99d] libata: add extra internal command
>>>
>>> To get it to boot I have to also apply:
>>>
>>>   88e10092f6a6 ("sata_fsl: use the right type for tag bitshift")
>>>
>>>
>>> The system boots OK and seems fine, except that it's just printing
>>> multiple of these per second:
>>>
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>   ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>>>   ata2.00: configured for UDMA/100
>>>   ata2: Signature Update detected @ 0 msecs
>>>
>>> And it never seems to stop.
>>>
>>> The machine is a Freescale/NXP P5020ds, using the sata_fsl driver
>>> presumably. Any ideas?
>>
>> Hmm that's odd. Can you include the boot log from a working boot as
>> well? Would be nice to see what devices are on the sata adapter.
>> The above just looks like a hardreset loop.
> 
> Ah yep. I stupidly assumed it was working, because the machine booted,
> but that's because the root disk is on ata1.
> 
> Booting the good kernel:
> 
>   ba80c3a572f4 ("sata_nv: set host can_queue count appropriately")
> 
> I see:
> 
>   root@p5020ds:~# ls -l /sys/class/ata_port/
>   total 0
>   lrwxrwxrwx 1 root root 0 Jun 19 06:49 ata1 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/ata_port/ata1
>   lrwxrwxrwx 1 root root 0 Jun 19 17:06 ata2 -> ../../devices/platform/ffe000000.soc/ffe221000.sata/ata2/ata_port/ata2
> 
>   root@p5020ds:~# ls -l /sys/class/block/ | grep ata
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda1 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda1
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda2 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda2
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sda5 -> ../../devices/platform/ffe000000.soc/ffe220000.sata/ata1/host0/target0:0:0/0:0:0:0/block/sda/sda5
>   lrwxrwxrwx 1 root root 0 Jun 19 17:11 sr0 -> ../../devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/block/sr0
> 
> So it's the DVD drive.
> 
>   root@p5020ds:/sys/devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/scsi_device/1:0:0:0/device# cat vendor 
>   Optiarc 
>   root@p5020ds:/sys/devices/platform/ffe000000.soc/ffe221000.sata/ata2/host1/target1:0:0/1:0:0:0/scsi_device/1:0:0:0/device# cat model
>   DVD RW AD-7260S 
> 
> 
> Full boot log from a good boot attached if that's helpful.
> 
> All of the above looks the same when I boot with the broken setup, it
> just spams dmesg constantly.
> 
> One thing that is different, on the good kernel I see:
> 
>   root@p5020ds:~# mount /dev/sr0 /mnt
>   mount: no medium found on /dev/sr0
> 
> vs bad (88e10092f6a6):
> 
>   root@p5020ds:~# mount /dev/sr0 /mnt
>   mount: /dev/sr0 is already mounted or /mnt busy
> 
> cheers

Actually, just try this one on top of current -git.

diff --git a/drivers/ata/sata_fsl.c b/drivers/ata/sata_fsl.c
index b8d9cfc60374..4007a9ae650d 100644
--- a/drivers/ata/sata_fsl.c
+++ b/drivers/ata/sata_fsl.c
@@ -395,12 +395,6 @@ static inline unsigned int sata_fsl_tag(unsigned int tag,
 {
 	/* We let libATA core do actual (queue) tag allocation */
 
-	/* all non NCQ/queued commands should have tag#0 */
-	if (ata_tag_internal(tag)) {
-		DPRINTK("mapping internal cmds to tag#0\n");
-		return 0;
-	}
-
 	if (unlikely(tag >= SATA_FSL_QUEUE_DEPTH)) {
 		DPRINTK("tag %d invalid : out of range\n", tag);
 		return 0;
@@ -1229,7 +1223,7 @@ static void sata_fsl_host_intr(struct ata_port *ap)
 
 	/* Workaround for data length mismatch errata */
 	if (unlikely(hstatus & INT_ON_DATA_LENGTH_MISMATCH)) {
-		for (tag = 0; tag < ATA_MAX_QUEUE; tag++) {
+		for (tag = 0; tag <= ATA_MAX_QUEUE; tag++) {
 			qc = ata_qc_from_tag(ap, tag);
 			if (qc && ata_is_atapi(qc->tf.protocol)) {
 				u32 hcontrol;

-- 
Jens Axboe

^ permalink raw reply related

* [PATCH -tip v6 07/27] powerpc/kprobes: Remove jprobe powerpc implementation
From: Masami Hiramatsu @ 2018-06-19 16:07 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar
  Cc: Masami Hiramatsu, Ingo Molnar, H . Peter Anvin, linux-kernel,
	Ananth N Mavinakayanahalli, Andrew Morton, Steven Rostedt,
	linux-arch, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linuxppc-dev
In-Reply-To: <152942424698.15209.15245996287444292393.stgit@devbox>

Remove arch dependent setjump/longjump functions
and unused fields in kprobe_ctlblk for jprobes
from arch/powerpc. This also reverts commits
related __is_active_jprobe() function.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
---
 arch/powerpc/include/asm/kprobes.h             |    2 -
 arch/powerpc/kernel/kprobes-ftrace.c           |   15 -------
 arch/powerpc/kernel/kprobes.c                  |   54 ------------------------
 arch/powerpc/kernel/trace/ftrace_64_mprofile.S |   39 ++---------------
 4 files changed, 5 insertions(+), 105 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 9f3be5c8a4a3..674036db558b 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -88,7 +88,6 @@ struct prev_kprobe {
 struct kprobe_ctlblk {
 	unsigned long kprobe_status;
 	unsigned long kprobe_saved_msr;
-	struct pt_regs jprobe_saved_regs;
 	struct prev_kprobe prev_kprobe;
 };
 
@@ -104,7 +103,6 @@ extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);
 extern int kprobe_post_handler(struct pt_regs *regs);
 #ifdef CONFIG_KPROBES_ON_FTRACE
-extern int __is_active_jprobe(unsigned long addr);
 extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
 			   struct kprobe_ctlblk *kcb);
 #else
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 7a1f99f1b47f..1b316331c2d9 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -25,21 +25,6 @@
 #include <linux/preempt.h>
 #include <linux/ftrace.h>
 
-/*
- * This is called from ftrace code after invoking registered handlers to
- * disambiguate regs->nip changes done by jprobes and livepatch. We check if
- * there is an active jprobe at the provided address (mcount location).
- */
-int __is_active_jprobe(unsigned long addr)
-{
-	if (!preemptible()) {
-		struct kprobe *p = raw_cpu_read(current_kprobe);
-		return (p && (unsigned long)p->addr == addr) ? 1 : 0;
-	}
-
-	return 0;
-}
-
 static nokprobe_inline
 int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
 		      struct kprobe_ctlblk *kcb, unsigned long orig_nip)
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index e4c5bf33970b..600678fce0a8 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -611,60 +611,6 @@ unsigned long arch_deref_entry_point(void *entry)
 }
 NOKPROBE_SYMBOL(arch_deref_entry_point);
 
-int setjmp_pre_handler(struct kprobe *p, struct pt_regs *regs)
-{
-	struct jprobe *jp = container_of(p, struct jprobe, kp);
-	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-
-	memcpy(&kcb->jprobe_saved_regs, regs, sizeof(struct pt_regs));
-
-	/* setup return addr to the jprobe handler routine */
-	regs->nip = arch_deref_entry_point(jp->entry);
-#ifdef PPC64_ELF_ABI_v2
-	regs->gpr[12] = (unsigned long)jp->entry;
-#elif defined(PPC64_ELF_ABI_v1)
-	regs->gpr[2] = (unsigned long)(((func_descr_t *)jp->entry)->toc);
-#endif
-
-	/*
-	 * jprobes use jprobe_return() which skips the normal return
-	 * path of the function, and this messes up the accounting of the
-	 * function graph tracer.
-	 *
-	 * Pause function graph tracing while performing the jprobe function.
-	 */
-	pause_graph_tracing();
-
-	return 1;
-}
-NOKPROBE_SYMBOL(setjmp_pre_handler);
-
-void __used jprobe_return(void)
-{
-	asm volatile("jprobe_return_trap:\n"
-		     "trap\n"
-		     ::: "memory");
-}
-NOKPROBE_SYMBOL(jprobe_return);
-
-int longjmp_break_handler(struct kprobe *p, struct pt_regs *regs)
-{
-	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
-
-	if (regs->nip != ppc_kallsyms_lookup_name("jprobe_return_trap")) {
-		pr_debug("longjmp_break_handler NIP (0x%lx) does not match jprobe_return_trap (0x%lx)\n",
-				regs->nip, ppc_kallsyms_lookup_name("jprobe_return_trap"));
-		return 0;
-	}
-
-	memcpy(regs, &kcb->jprobe_saved_regs, sizeof(struct pt_regs));
-	/* It's OK to start function graph tracing again */
-	unpause_graph_tracing();
-	preempt_enable_no_resched();
-	return 1;
-}
-NOKPROBE_SYMBOL(longjmp_break_handler);
-
 static struct kprobe trampoline_p = {
 	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
index 9a5b5a513604..32476a6e4e9c 100644
--- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
+++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S
@@ -104,39 +104,13 @@ ftrace_regs_call:
 	bl	ftrace_stub
 	nop
 
-	/* Load the possibly modified NIP */
-	ld	r15, _NIP(r1)
-
+	/* Load ctr with the possibly modified NIP */
+	ld	r3, _NIP(r1)
+	mtctr	r3
 #ifdef CONFIG_LIVEPATCH
-	cmpd	r14, r15	/* has NIP been altered? */
+	cmpd	r14, r3		/* has NIP been altered? */
 #endif
 
-#if defined(CONFIG_LIVEPATCH) && defined(CONFIG_KPROBES_ON_FTRACE)
-	/* NIP has not been altered, skip over further checks */
-	beq	1f
-
-	/* Check if there is an active jprobe on us */
-	subi	r3, r14, 4
-	bl	__is_active_jprobe
-	nop
-
-	/*
-	 * If r3 == 1, then this is a kprobe/jprobe.
-	 * else, this is livepatched function.
-	 *
-	 * The conditional branch for livepatch_handler below will use the
-	 * result of this comparison. For kprobe/jprobe, we just need to branch to
-	 * the new NIP, not call livepatch_handler. The branch below is bne, so we
-	 * want CR0[EQ] to be true if this is a kprobe/jprobe. Which means we want
-	 * CR0[EQ] = (r3 == 1).
-	 */
-	cmpdi	r3, 1
-1:
-#endif
-
-	/* Load CTR with the possibly modified NIP */
-	mtctr	r15
-
 	/* Restore gprs */
 	REST_GPR(0,r1)
 	REST_10GPRS(2,r1)
@@ -154,10 +128,7 @@ ftrace_regs_call:
 	addi r1, r1, SWITCH_FRAME_SIZE
 
 #ifdef CONFIG_LIVEPATCH
-        /*
-	 * Based on the cmpd or cmpdi above, if the NIP was altered and we're
-	 * not on a kprobe/jprobe, then handle livepatch.
-	 */
+        /* Based on the cmpd above, if the NIP was altered handle livepatch */
 	bne-	livepatch_handler
 #endif
 

^ permalink raw reply related

* [PATCH -tip v6 18/27] powerpc/kprobes: Don't call the ->break_handler() in powerpc kprobes code
From: Masami Hiramatsu @ 2018-06-19 16:12 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar
  Cc: Masami Hiramatsu, Ingo Molnar, H . Peter Anvin, linux-kernel,
	Ananth N Mavinakayanahalli, Andrew Morton, Steven Rostedt,
	linux-arch, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, linuxppc-dev
In-Reply-To: <152942424698.15209.15245996287444292393.stgit@devbox>

Don't call the ->break_handler() from the powerpc kprobes code,
because it was only used by jprobes which got removed.

This also removes skip_singlestep() and embeds it in the
caller, kprobe_ftrace_handler(), which simplifies regs->nip
operation around there.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: linuxppc-dev@lists.ozlabs.org
---
 Changes in v6
   - Fix patch description
   - Move post handler emulation into kprobe_ftrace_handler().
---
 arch/powerpc/include/asm/kprobes.h   |   10 -------
 arch/powerpc/kernel/kprobes-ftrace.c |   46 +++++++++-------------------------
 arch/powerpc/kernel/kprobes.c        |   31 ++++++++---------------
 3 files changed, 23 insertions(+), 64 deletions(-)

diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 674036db558b..785c464b6588 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -102,16 +102,6 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
 extern int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 extern int kprobe_handler(struct pt_regs *regs);
 extern int kprobe_post_handler(struct pt_regs *regs);
-#ifdef CONFIG_KPROBES_ON_FTRACE
-extern int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-			   struct kprobe_ctlblk *kcb);
-#else
-static inline int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-				  struct kprobe_ctlblk *kcb)
-{
-	return 0;
-}
-#endif
 #else
 static inline int kprobe_handler(struct pt_regs *regs) { return 0; }
 static inline int kprobe_post_handler(struct pt_regs *regs) { return 0; }
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 1b316331c2d9..070d1d862444 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -25,35 +25,6 @@
 #include <linux/preempt.h>
 #include <linux/ftrace.h>
 
-static nokprobe_inline
-int __skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-		      struct kprobe_ctlblk *kcb, unsigned long orig_nip)
-{
-	/*
-	 * Emulate singlestep (and also recover regs->nip)
-	 * as if there is a nop
-	 */
-	regs->nip = (unsigned long)p->addr + MCOUNT_INSN_SIZE;
-	if (unlikely(p->post_handler)) {
-		kcb->kprobe_status = KPROBE_HIT_SSDONE;
-		p->post_handler(p, regs, 0);
-	}
-	__this_cpu_write(current_kprobe, NULL);
-	if (orig_nip)
-		regs->nip = orig_nip;
-	return 1;
-}
-
-int skip_singlestep(struct kprobe *p, struct pt_regs *regs,
-		    struct kprobe_ctlblk *kcb)
-{
-	if (kprobe_ftrace(p))
-		return __skip_singlestep(p, regs, kcb, 0);
-	else
-		return 0;
-}
-NOKPROBE_SYMBOL(skip_singlestep);
-
 /* Ftrace callback handler for kprobes */
 void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 			   struct ftrace_ops *ops, struct pt_regs *regs)
@@ -71,8 +42,6 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 	if (kprobe_running()) {
 		kprobes_inc_nmissed_count(p);
 	} else {
-		unsigned long orig_nip = regs->nip;
-
 		/*
 		 * On powerpc, NIP is *before* this instruction for the
 		 * pre handler
@@ -81,9 +50,18 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 
 		__this_cpu_write(current_kprobe, p);
 		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-		if (!p->pre_handler || !p->pre_handler(p, regs))
-			__skip_singlestep(p, regs, kcb, orig_nip);
-		else {
+		if (!p->pre_handler || !p->pre_handler(p, regs)) {
+			/*
+			 * Emulate singlestep (and also recover regs->nip)
+			 * as if there is a nop
+			 */
+			regs->nip += MCOUNT_INSN_SIZE;
+			if (unlikely(p->post_handler)) {
+				kcb->kprobe_status = KPROBE_HIT_SSDONE;
+				p->post_handler(p, regs, 0);
+			}
+			__this_cpu_write(current_kprobe, NULL);
+		} else {
 			/*
 			 * If pre_handler returns !0, it sets regs->nip and
 			 * resets current kprobe. In this case, we should not
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 600678fce0a8..f06747e2e70d 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -317,25 +317,17 @@ int kprobe_handler(struct pt_regs *regs)
 			}
 			prepare_singlestep(p, regs);
 			return 1;
-		} else {
-			if (*addr != BREAKPOINT_INSTRUCTION) {
-				/* If trap variant, then it belongs not to us */
-				kprobe_opcode_t cur_insn = *addr;
-				if (is_trap(cur_insn))
-		       			goto no_kprobe;
-				/* The breakpoint instruction was removed by
-				 * another cpu right after we hit, no further
-				 * handling of this interrupt is appropriate
-				 */
-				ret = 1;
+		} else if (*addr != BREAKPOINT_INSTRUCTION) {
+			/* If trap variant, then it belongs not to us */
+			kprobe_opcode_t cur_insn = *addr;
+
+			if (is_trap(cur_insn))
 				goto no_kprobe;
-			}
-			p = __this_cpu_read(current_kprobe);
-			if (p->break_handler && p->break_handler(p, regs)) {
-				if (!skip_singlestep(p, regs, kcb))
-					goto ss_probe;
-				ret = 1;
-			}
+			/* The breakpoint instruction was removed by
+			 * another cpu right after we hit, no further
+			 * handling of this interrupt is appropriate
+			 */
+			ret = 1;
 		}
 		goto no_kprobe;
 	}
@@ -350,7 +342,7 @@ int kprobe_handler(struct pt_regs *regs)
 			 */
 			kprobe_opcode_t cur_insn = *addr;
 			if (is_trap(cur_insn))
-		       		goto no_kprobe;
+				goto no_kprobe;
 			/*
 			 * The breakpoint instruction was removed right
 			 * after we hit it.  Another cpu has removed
@@ -370,7 +362,6 @@ int kprobe_handler(struct pt_regs *regs)
 		/* handler has already set things up, so skip ss setup */
 		return 1;
 
-ss_probe:
 	if (p->ainsn.boostable >= 0) {
 		ret = try_to_emulate(p, regs);
 

^ permalink raw reply related

* [PATCH -tip v6 24/27] bpf: error-inject: kprobes: Clear current_kprobe and enable preempt in kprobe
From: Masami Hiramatsu @ 2018-06-19 16:15 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar
  Cc: Masami Hiramatsu, Ingo Molnar, H . Peter Anvin, linux-kernel,
	Ananth N Mavinakayanahalli, Andrew Morton, Steven Rostedt,
	linux-arch, Vineet Gupta, Russell King, Catalin Marinas,
	Will Deacon, Tony Luck, Fenghua Yu, Ralf Baechle, James Hogan,
	Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Martin Schwidefsky, Heiko Carstens, Yoshinori Sato, Rich Felker,
	David S. Miller, Josef Bacik, Alexei Starovoitov, x86,
	linux-snps-arc, linux-arm-kernel, linux-ia64, linux-mips,
	linuxppc-dev, linux-s390, linux-sh, sparclinux
In-Reply-To: <152942424698.15209.15245996287444292393.stgit@devbox>

Clear current_kprobe and enable preemption in kprobe
even if pre_handler returns !0.

This simplifies function override using kprobes.

Jprobe used to require to keep the preemption disabled and
keep current_kprobe until it returned to original function
entry. For this reason kprobe_int3_handler() and similar
arch dependent kprobe handers checks pre_handler result
and exit without enabling preemption if the result is !0.

After removing the jprobe, Kprobes does not need to
keep preempt disabled even if user handler returns !0
anymore.

But since the function override handler in error-inject
and bpf is also returns !0 if it overrides a function,
to balancing the preempt count, it enables preemption
and reset current kprobe by itself.

That is a bad design that is very buggy. This fixes
such unbalanced preempt-count and current_kprobes setting
in kprobes, bpf and error-inject.

Note: for powerpc and x86, this removes all preempt_disable
from kprobe_ftrace_handler because ftrace callbacks are
called under preempt disabled.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: "Naveen N. Rao" <naveen.n.rao@linux.vnet.ibm.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Josef Bacik <jbacik@fb.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@linux-mips.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
---
 Changes in v6:
  - Simplify kprobe_ftrace_handler change on x86 and powerpc code.
 Changes in v5:
  - Fix kprobe_ftrace_handler in arch/powerpc too.
---
 arch/arc/kernel/kprobes.c            |    5 +++--
 arch/arm/probes/kprobes/core.c       |   10 +++++-----
 arch/arm64/kernel/probes/kprobes.c   |   10 +++++-----
 arch/ia64/kernel/kprobes.c           |   13 ++++---------
 arch/mips/kernel/kprobes.c           |    4 ++--
 arch/powerpc/kernel/kprobes-ftrace.c |   19 ++++++-------------
 arch/powerpc/kernel/kprobes.c        |    7 +++++--
 arch/s390/kernel/kprobes.c           |    7 ++++---
 arch/sh/kernel/kprobes.c             |    7 ++++---
 arch/sparc/kernel/kprobes.c          |    7 ++++---
 arch/x86/kernel/kprobes/core.c       |    4 ++++
 arch/x86/kernel/kprobes/ftrace.c     |    9 +++------
 kernel/fail_function.c               |    3 ---
 kernel/trace/trace_kprobe.c          |   11 +++--------
 14 files changed, 52 insertions(+), 64 deletions(-)

diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index 465365696c91..df35d4c0b0b8 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -231,6 +231,9 @@ int __kprobes arc_kprobe_handler(unsigned long addr, struct pt_regs *regs)
 		if (!p->pre_handler || !p->pre_handler(p, regs)) {
 			setup_singlestep(p, regs);
 			kcb->kprobe_status = KPROBE_HIT_SS;
+		} else {
+			reset_current_kprobe();
+			preempt_enable_no_resched();
 		}
 
 		return 1;
@@ -442,9 +445,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 	kretprobe_assert(ri, orig_ret_address, trampoline_address);
 	regs->ret = orig_ret_address;
 
-	reset_current_kprobe();
 	kretprobe_hash_unlock(current, &flags);
-	preempt_enable_no_resched();
 
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 3192350f389d..8d37601fdb20 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -300,10 +300,10 @@ void __kprobes kprobe_handler(struct pt_regs *regs)
 
 			/*
 			 * If we have no pre-handler or it returned 0, we
-			 * continue with normal processing.  If we have a
-			 * pre-handler and it returned non-zero, it prepped
-			 * for calling the break_handler below on re-entry,
-			 * so get out doing nothing more here.
+			 * continue with normal processing. If we have a
+			 * pre-handler and it returned non-zero, it will
+			 * modify the execution path and no need to single
+			 * stepping. Let's just reset current kprobe and exit.
 			 */
 			if (!p->pre_handler || !p->pre_handler(p, regs)) {
 				kcb->kprobe_status = KPROBE_HIT_SS;
@@ -312,8 +312,8 @@ void __kprobes kprobe_handler(struct pt_regs *regs)
 					kcb->kprobe_status = KPROBE_HIT_SSDONE;
 					p->post_handler(p, regs, 0);
 				}
-				reset_current_kprobe();
 			}
+			reset_current_kprobe();
 		}
 	} else {
 		/*
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index 076c3c0775a6..5daf3d721cb7 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -395,9 +395,9 @@ static void __kprobes kprobe_handler(struct pt_regs *regs)
 			/*
 			 * If we have no pre-handler or it returned 0, we
 			 * continue with normal processing.  If we have a
-			 * pre-handler and it returned non-zero, it prepped
-			 * for calling the break_handler below on re-entry,
-			 * so get out doing nothing more here.
+			 * pre-handler and it returned non-zero, it will
+			 * modify the execution path and no need to single
+			 * stepping. Let's just reset current kprobe and exit.
 			 *
 			 * pre_handler can hit a breakpoint and can step thru
 			 * before return, keep PSTATE D-flag enabled until
@@ -405,8 +405,8 @@ static void __kprobes kprobe_handler(struct pt_regs *regs)
 			 */
 			if (!p->pre_handler || !p->pre_handler(p, regs)) {
 				setup_singlestep(p, regs, kcb, 0);
-				return;
-			}
+			} else
+				reset_current_kprobe();
 		}
 	}
 	/*
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 74c8524e6309..aa41bd5cf9b7 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -478,12 +478,9 @@ int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 			 */
 			break;
 	}
-
 	kretprobe_assert(ri, orig_ret_address, trampoline_address);
 
-	reset_current_kprobe();
 	kretprobe_hash_unlock(current, &flags);
-	preempt_enable_no_resched();
 
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
@@ -851,13 +848,11 @@ static int __kprobes pre_kprobes_handler(struct die_args *args)
 	set_current_kprobe(p, kcb);
 	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
 
-	if (p->pre_handler && p->pre_handler(p, regs))
-		/*
-		 * Our pre-handler is specifically requesting that we just
-		 * do a return.  This is used for both the jprobe pre-handler
-		 * and the kretprobe trampoline
-		 */
+	if (p->pre_handler && p->pre_handler(p, regs)) {
+		reset_current_kprobe();
+		preempt_enable_no_resched();
 		return 1;
+	}
 
 #if !defined(CONFIG_PREEMPT)
 	if (p->ainsn.inst_flag == INST_FLAG_BOOSTABLE && !p->post_handler) {
diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c
index 7fd277bc59b9..54cd675c5d1d 100644
--- a/arch/mips/kernel/kprobes.c
+++ b/arch/mips/kernel/kprobes.c
@@ -358,6 +358,8 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
 
 	if (p->pre_handler && p->pre_handler(p, regs)) {
 		/* handler has already set things up, so skip ss setup */
+		reset_current_kprobe();
+		preempt_enable_no_resched();
 		return 1;
 	}
 
@@ -543,9 +545,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 	kretprobe_assert(ri, orig_ret_address, trampoline_address);
 	instruction_pointer(regs) = orig_ret_address;
 
-	reset_current_kprobe();
 	kretprobe_hash_unlock(current, &flags);
-	preempt_enable_no_resched();
 
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
diff --git a/arch/powerpc/kernel/kprobes-ftrace.c b/arch/powerpc/kernel/kprobes-ftrace.c
index 070d1d862444..e4a49c051325 100644
--- a/arch/powerpc/kernel/kprobes-ftrace.c
+++ b/arch/powerpc/kernel/kprobes-ftrace.c
@@ -32,11 +32,9 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 	struct kprobe *p;
 	struct kprobe_ctlblk *kcb;
 
-	preempt_disable();
-
 	p = get_kprobe((kprobe_opcode_t *)nip);
 	if (unlikely(!p) || kprobe_disabled(p))
-		goto end;
+		return;
 
 	kcb = get_kprobe_ctlblk();
 	if (kprobe_running()) {
@@ -60,18 +58,13 @@ void kprobe_ftrace_handler(unsigned long nip, unsigned long parent_nip,
 				kcb->kprobe_status = KPROBE_HIT_SSDONE;
 				p->post_handler(p, regs, 0);
 			}
-			__this_cpu_write(current_kprobe, NULL);
-		} else {
-			/*
-			 * If pre_handler returns !0, it sets regs->nip and
-			 * resets current kprobe. In this case, we should not
-			 * re-enable preemption.
-			 */
-			return;
 		}
+		/*
+		 * If pre_handler returns !0, it changes regs->nip. We have to
+		 * skip emulating post_handler.
+		 */
+		__this_cpu_write(current_kprobe, NULL);
 	}
-end:
-	preempt_enable_no_resched();
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
 
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index f06747e2e70d..5c60bb0f927f 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -358,9 +358,12 @@ int kprobe_handler(struct pt_regs *regs)
 
 	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
 	set_current_kprobe(p, regs, kcb);
-	if (p->pre_handler && p->pre_handler(p, regs))
-		/* handler has already set things up, so skip ss setup */
+	if (p->pre_handler && p->pre_handler(p, regs)) {
+		/* handler changed execution path, so skip ss setup */
+		reset_current_kprobe();
+		preempt_enable_no_resched();
 		return 1;
+	}
 
 	if (p->ainsn.boostable >= 0) {
 		ret = try_to_emulate(p, regs);
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index 3e34018960b5..7c0a095e9c5f 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -326,8 +326,11 @@ static int kprobe_handler(struct pt_regs *regs)
 			 */
 			push_kprobe(kcb, p);
 			kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-			if (p->pre_handler && p->pre_handler(p, regs))
+			if (p->pre_handler && p->pre_handler(p, regs)) {
+				pop_kprobe(kcb);
+				preempt_enable_no_resched();
 				return 1;
+			}
 			kcb->kprobe_status = KPROBE_HIT_SS;
 		}
 		enable_singlestep(kcb, regs, (unsigned long) p->ainsn.insn);
@@ -431,9 +434,7 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 
 	regs->psw.addr = orig_ret_address;
 
-	pop_kprobe(get_kprobe_ctlblk());
 	kretprobe_hash_unlock(current, &flags);
-	preempt_enable_no_resched();
 
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index 4fafe0cd12c6..241e903dd3ee 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -272,9 +272,12 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
 	set_current_kprobe(p, regs, kcb);
 	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
 
-	if (p->pre_handler && p->pre_handler(p, regs))
+	if (p->pre_handler && p->pre_handler(p, regs)) {
 		/* handler has already set things up, so skip ss setup */
+		reset_current_kprobe();
+		preempt_enable_no_resched();
 		return 1;
+	}
 
 	prepare_singlestep(p, regs);
 	kcb->kprobe_status = KPROBE_HIT_SS;
@@ -352,8 +355,6 @@ int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 	regs->pc = orig_ret_address;
 	kretprobe_hash_unlock(current, &flags);
 
-	preempt_enable_no_resched();
-
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
 		kfree(ri);
diff --git a/arch/sparc/kernel/kprobes.c b/arch/sparc/kernel/kprobes.c
index c684c96ef2e9..dfbca2470536 100644
--- a/arch/sparc/kernel/kprobes.c
+++ b/arch/sparc/kernel/kprobes.c
@@ -175,8 +175,11 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
 
 	set_current_kprobe(p, regs, kcb);
 	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-	if (p->pre_handler && p->pre_handler(p, regs))
+	if (p->pre_handler && p->pre_handler(p, regs)) {
+		reset_current_kprobe();
+		preempt_enable_no_resched();
 		return 1;
+	}
 
 	prepare_singlestep(p, regs, kcb);
 	kcb->kprobe_status = KPROBE_HIT_SS;
@@ -508,9 +511,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 	regs->tpc = orig_ret_address;
 	regs->tnpc = orig_ret_address + 4;
 
-	reset_current_kprobe();
 	kretprobe_hash_unlock(current, &flags);
-	preempt_enable_no_resched();
 
 	hlist_for_each_entry_safe(ri, tmp, &empty_rp, hlist) {
 		hlist_del(&ri->hlist);
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 0ac16a0d93e5..814e26b7c8a2 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -694,6 +694,10 @@ int kprobe_int3_handler(struct pt_regs *regs)
 			 */
 			if (!p->pre_handler || !p->pre_handler(p, regs))
 				setup_singlestep(p, regs, kcb, 0);
+			else {
+				reset_current_kprobe();
+				preempt_enable_no_resched();
+			}
 			return 1;
 		}
 	} else if (*addr != BREAKPOINT_INSTRUCTION) {
diff --git a/arch/x86/kernel/kprobes/ftrace.c b/arch/x86/kernel/kprobes/ftrace.c
index 02a6dd1b6bd0..ef819e19650b 100644
--- a/arch/x86/kernel/kprobes/ftrace.c
+++ b/arch/x86/kernel/kprobes/ftrace.c
@@ -45,8 +45,6 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 		/* Kprobe handler expects regs->ip = ip + 1 as breakpoint hit */
 		regs->ip = ip + sizeof(kprobe_opcode_t);
 
-		/* To emulate trap based kprobes, preempt_disable here */
-		preempt_disable();
 		__this_cpu_write(current_kprobe, p);
 		kcb->kprobe_status = KPROBE_HIT_ACTIVE;
 		if (!p->pre_handler || !p->pre_handler(p, regs)) {
@@ -60,13 +58,12 @@ void kprobe_ftrace_handler(unsigned long ip, unsigned long parent_ip,
 				p->post_handler(p, regs, 0);
 			}
 			regs->ip = orig_ip;
-			__this_cpu_write(current_kprobe, NULL);
-			preempt_enable_no_resched();
 		}
 		/*
-		 * If pre_handler returns !0, it sets regs->ip and
-		 * resets current kprobe, and keep preempt count +1.
+		 * If pre_handler returns !0, it changes regs->ip. We have to
+		 * skip emulating post_handler.
 		 */
+		__this_cpu_write(current_kprobe, NULL);
 	}
 }
 NOKPROBE_SYMBOL(kprobe_ftrace_handler);
diff --git a/kernel/fail_function.c b/kernel/fail_function.c
index 1d5632d8bbcc..b090688df94f 100644
--- a/kernel/fail_function.c
+++ b/kernel/fail_function.c
@@ -184,9 +184,6 @@ static int fei_kprobe_handler(struct kprobe *kp, struct pt_regs *regs)
 	if (should_fail(&fei_fault_attr, 1)) {
 		regs_set_return_value(regs, attr->retval);
 		override_function_with_return(regs);
-		/* Kprobe specific fixup */
-		reset_current_kprobe();
-		preempt_enable_no_resched();
 		return 1;
 	}
 
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index daa81571b22a..7e3b944b6ac1 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1217,16 +1217,11 @@ kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
 
 		/*
 		 * We need to check and see if we modified the pc of the
-		 * pt_regs, and if so clear the kprobe and return 1 so that we
-		 * don't do the single stepping.
-		 * The ftrace kprobe handler leaves it up to us to re-enable
-		 * preemption here before returning if we've modified the ip.
+		 * pt_regs, and if so return 1 so that we don't do the
+		 * single stepping.
 		 */
-		if (orig_ip != instruction_pointer(regs)) {
-			reset_current_kprobe();
-			preempt_enable_no_resched();
+		if (orig_ip != instruction_pointer(regs))
 			return 1;
-		}
 		if (!ret)
 			return 0;
 	}

^ permalink raw reply related

* Re: [PATCH] arch: powerpc: pci-common: fix wrong return value check on phd_id
From: Guilherme G. Piccoli @ 2018-06-19 16:26 UTC (permalink / raw)
  To: Daniel Walker, mpe, benh
  Cc: Andrew Morton, xe-kernel, linux-kernel, Paul Mackerras,
	linuxppc-dev, Mauro Rodrigues, linux-pci
In-Reply-To: <c56f43f5-7902-f7c8-20e1-955ad064c4d0@cisco.com>

> [...]
> What your doing is changing the phb_id to some transformation of "reg" for
> all platforms except which have "ibm,opal-phbid". This is what we observe.
> This is a radical altering of the prior phb_id selection before your patch.
>
> The question is, was that your intent ? We are expecting these numbers
> aren't getting altered in this way, this is why we have the patch. Your
> patch description appears to suggest you want this type of selection for
> "... pSeries and PowerNV cases)." , so I am assuming you did this by
> mistake. Also I don't see a reason to make this change for all platforms.

It was the intent - whenever we have device-tree information, the idea is to
use it to have more consistent numbering across reboots and PCI hotplugs.
The reason to change that originally is hotplugging a PCI device added
the device with a different PHB/Domain value, and network predictable naming
messed-up interfaces' names, leading to a machine without networking.

> [...]
>
> We have already done this, that is how we determined your patch was changing
> the domain values. There isn't much to tell w.r.t this issue on our side, we
> are expecting the domain numbers are set a specific way and they aren't
> getting set that way. The drivers which are looking for PCI devices are not
> in kernel space, and the PCI domain values are getting taken from userspace.
>

Oh okay, I imagined it was some crazy userspace-based PCI domain scheme
that led you to report the issue - confirmed heheh
So, you expect the old behavior right? Incremental PHB numbering.
I think we could have a kernel config option to allow that, this way
you could rebuild your kernel with that option and keep the old behavior.

This idea needs to be validated by the maintainers, or perhaps they could
propose another way to keep the old behavior to interested parties.

[Looping linux-pci too - original thread at:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-June/174760.html]

^ permalink raw reply

* Re: [PATCH] spi/fsl-espi: Add missing cell-index OF property
From: York Sun @ 2018-06-19 16:29 UTC (permalink / raw)
  To: jocke@infinera.com, linuxppc-dev, broonie@kernel.org
In-Reply-To: <20180619162304.20125-1-joakim.tjernlund@infinera.com>

Joakim,=0A=
=0A=
I am not the maintainer for this driver. Adding Mark Brown.=0A=
=0A=
York=0A=
=0A=
On 06/19/2018 09:23 AM, Joakim Tjernlund wrote:=0A=
> espi does not look for a OF cell-index property which=0A=
> makes the bus numbering dynamic only. This add an=0A=
> optional cell-index.=0A=
> =0A=
> Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>=0A=
> ---=0A=
>  drivers/spi/spi-fsl-espi.c | 8 ++++++++=0A=
>  1 file changed, 8 insertions(+)=0A=
> =0A=
> diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c=0A=
> index 1d332e23f6ed..56b71c5e2f10 100644=0A=
> --- a/drivers/spi/spi-fsl-espi.c=0A=
> +++ b/drivers/spi/spi-fsl-espi.c=0A=
> @@ -672,6 +672,14 @@ static int fsl_espi_probe(struct device *dev, struct=
 resource *mem,=0A=
>  =0A=
>  	dev_set_drvdata(dev, master);=0A=
>  =0A=
> +	if (dev->of_node) {=0A=
> +		u32 cell_index;=0A=
> +=0A=
> +		if (!of_property_read_u32(dev->of_node, "cell-index",=0A=
> +					  &cell_index))=0A=
> +			master->bus_num =3D cell_index;=0A=
> +	}=0A=
> +=0A=
>  	master->mode_bits =3D SPI_RX_DUAL | SPI_CPOL | SPI_CPHA | SPI_CS_HIGH |=
=0A=
>  			    SPI_LSB_FIRST | SPI_LOOP;=0A=
>  	master->dev.of_node =3D dev->of_node;=0A=
> =0A=
=0A=

^ permalink raw reply

* [PATCH] QE GPIO: Add qe_gpio_set_multiple
From: Joakim Tjernlund @ 2018-06-19 16:22 UTC (permalink / raw)
  To: York Sun, linuxppc-dev

This cousin to gpio-mpc8xxx was lacking a multiple pins method,
add one.

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
---
 drivers/soc/fsl/qe/gpio.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/soc/fsl/qe/gpio.c b/drivers/soc/fsl/qe/gpio.c
index 3b27075c21a7..819bed0f5667 100644
--- a/drivers/soc/fsl/qe/gpio.c
+++ b/drivers/soc/fsl/qe/gpio.c
@@ -83,6 +83,33 @@ static void qe_gpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
 	spin_unlock_irqrestore(&qe_gc->lock, flags);
 }
 
+static void qe_gpio_set_multiple(struct gpio_chip *gc,
+				 unsigned long *mask, unsigned long *bits)
+{
+	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
+	struct qe_gpio_chip *qe_gc = gpiochip_get_data(gc);
+	struct qe_pio_regs __iomem *regs = mm_gc->regs;
+	unsigned long flags;
+	int i;
+
+	spin_lock_irqsave(&qe_gc->lock, flags);
+
+	for (i = 0; i < gc->ngpio; i++) {
+		if (*mask == 0)
+			break;
+		if (__test_and_clear_bit(i, mask)) {
+			if (test_bit(i, bits))
+				qe_gc->cpdata |= (1U << (QE_PIO_PINS - 1 - i));
+			else
+				qe_gc->cpdata &= ~(1U << (QE_PIO_PINS - 1 - i));
+		}
+	}
+
+	out_be32(&regs->cpdata, qe_gc->cpdata);
+
+	spin_unlock_irqrestore(&qe_gc->lock, flags);
+}
+
 static int qe_gpio_dir_in(struct gpio_chip *gc, unsigned int gpio)
 {
 	struct of_mm_gpio_chip *mm_gc = to_of_mm_gpio_chip(gc);
@@ -298,6 +325,7 @@ static int __init qe_add_gpiochips(void)
 		gc->direction_output = qe_gpio_dir_out;
 		gc->get = qe_gpio_get;
 		gc->set = qe_gpio_set;
+		gc->set_multiple = qe_gpio_set_multiple;
 
 		ret = of_mm_gpiochip_add_data(np, mm_gc, qe_gc);
 		if (ret)
-- 
2.13.6

^ permalink raw reply related

* [PATCH] spi/fsl-espi: Add missing cell-index OF property
From: Joakim Tjernlund @ 2018-06-19 16:23 UTC (permalink / raw)
  To: York Sun, linuxppc-dev

espi does not look for a OF cell-index property which
makes the bus numbering dynamic only. This add an
optional cell-index.

Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
---
 drivers/spi/spi-fsl-espi.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/spi/spi-fsl-espi.c b/drivers/spi/spi-fsl-espi.c
index 1d332e23f6ed..56b71c5e2f10 100644
--- a/drivers/spi/spi-fsl-espi.c
+++ b/drivers/spi/spi-fsl-espi.c
@@ -672,6 +672,14 @@ static int fsl_espi_probe(struct device *dev, struct resource *mem,
 
 	dev_set_drvdata(dev, master);
 
+	if (dev->of_node) {
+		u32 cell_index;
+
+		if (!of_property_read_u32(dev->of_node, "cell-index",
+					  &cell_index))
+			master->bus_num = cell_index;
+	}
+
 	master->mode_bits = SPI_RX_DUAL | SPI_CPOL | SPI_CPHA | SPI_CS_HIGH |
 			    SPI_LSB_FIRST | SPI_LOOP;
 	master->dev.of_node = dev->of_node;
-- 
2.13.6

^ permalink raw reply related

* Re: [PATCH v2 6/6] powerpc/pkeys: Deny read/write/execute by default
From: Ram Pai @ 2018-06-19 16:31 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, fweimer, luto, msuchanek
In-Reply-To: <87602fx84g.fsf@concordia.ellerman.id.au>

On Tue, Jun 19, 2018 at 10:39:59PM +1000, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> > Deny all permissions on all keys, with some exceptions.  pkey-0 must
> > allow all permissions, or else everything comes to a screaching halt.
> > Execute-only key must allow execute permission.
> 
> Another ABI change.
> 
> Are we calling this a bug fix?

It is a ABI change. There are two cases where this could break an
existing application.

a) single threaded application, depending on the AMR bits of
	unallocated keys to do something.
       
	Not sure what one can achieve doing so.


b) Multithreaded application could see the difference. The scenarios is 
	i) Thread T2 allocates a key and associates with Memory M1
	ii) Thread T1 accesses the memory M1.

	Without the patch step (ii) will be successful.
	With the patch step (ii) will fail.

	I doubt any multithreaded applications are out there depending
	on this particular behavior. And if it does, than it is
	depending on a buggy behavior.

RP

^ permalink raw reply

* Re: [PATCH v2 5/6] powerpc/pkeys: make protection key 0 less special
From: Ram Pai @ 2018-06-19 16:34 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, fweimer, luto, msuchanek
In-Reply-To: <8736xjx847.fsf@concordia.ellerman.id.au>

On Tue, Jun 19, 2018 at 10:40:08PM +1000, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> > Applications need the ability to associate an address-range with some
> > key and latter revert to its initial default key. Pkey-0 comes close to
> > providing this function but falls short, because the current
> > implementation disallows applications to explicitly associate pkey-0 to
> > the address range.
> >
> > Lets make pkey-0 less special and treat it almost like any other key.
> > Thus it can be explicitly associated with any address range, and can be
> > freed. This gives the application more flexibility and power.  The
> > ability to free pkey-0 must be used responsibily, since pkey-0 is
> > associated with almost all address-range by default.
> >
> > Even with this change pkey-0 continues to be slightly more special
> > from the following point of view.
> > (a) it is implicitly allocated.
> > (b) it is the default key assigned to any address-range.
> > (c) its permissions cannot be modified by userspace.
> >
> > NOTE: (c) is specific to powerpc only. pkey-0 is associated by default
> > with all pages including kernel pages, and pkeys are also active in
> > kernel mode. If any permission is denied on pkey-0, the kernel running
> > in the context of the application will be unable to operate.
> 
> We could fix that by saving/restoring the AMR when we come into the
> kernel, and switching to a kernel-AMR with all keys accessible.
> 
> We'd then need to think about copy_to/from_user() gup etc. So maybe we
> don't want to do that. But it's not set in stone.
> 
> Are we calling this a bug fix?

Actually, I call it borderline bug fix. Its more of a feature.

RP

^ permalink raw reply

* Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple
From: York Sun @ 2018-06-19 16:38 UTC (permalink / raw)
  To: jocke@infinera.com, linuxppc-dev, Scott Wood
In-Reply-To: <20180619162216.20030-1-joakim.tjernlund@infinera.com>

On 06/19/2018 09:22 AM, Joakim Tjernlund wrote:=0A=
> This cousin to gpio-mpc8xxx was lacking a multiple pins method,=0A=
> add one.=0A=
> =0A=
> Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>=0A=
> ---=0A=
>  drivers/soc/fsl/qe/gpio.c | 28 ++++++++++++++++++++++++++++=0A=
>  1 file changed, 28 insertions(+)=0A=
> =0A=
=0A=
Joakim,=0A=
=0A=
I am not the maintainer for this driver. Adding Scott.=0A=
=0A=
York=0A=

^ permalink raw reply

* Re: [PATCH v2 4/6] powerpc/pkeys: Preallocate execute-only key
From: Ram Pai @ 2018-06-19 16:38 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, dave.hansen, aneesh.kumar, bsingharora, hbabu,
	mhocko, bauerman, Ulrich.Weigand, fweimer, luto, msuchanek
In-Reply-To: <871sd3x842.fsf@concordia.ellerman.id.au>

On Tue, Jun 19, 2018 at 10:40:13PM +1000, Michael Ellerman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > execute-only key is allocated dynamically. This is a problem.  When a
> > thread implicitly creates a execute-only key, and resets UAMOR for that
> > key, the UAMOR value does not percolate to all the other threads.  Any
> > other thread may ignorantly change the permissions on the key.  This can
> > cause the key to be not execute-only for that thread.
> >
> > Preallocate the execute-only key and ensure that no thread can change
> > the permission of the key, by resetting the corresponding bit in UAMOR.
> 
> OK this is a non-ABI changing bug fix AFAICS.
> 
> I'll add:
> 
>   Fixes: 5586cf61e108 ("powerpc: introduce execute-only pkey")
>   Cc: stable@vger.kernel.org # v4.16+
> 
> > diff --git a/arch/powerpc/mm/pkeys.c b/arch/powerpc/mm/pkeys.c
> > index b681bec..1f2389f 100644
> > --- a/arch/powerpc/mm/pkeys.c
> > +++ b/arch/powerpc/mm/pkeys.c
> > @@ -25,6 +25,7 @@
> >  #define IAMR_EX_BIT 0x1UL
> >  #define PKEY_REG_BITS (sizeof(u64)*8)
> >  #define pkeyshift(pkey) (PKEY_REG_BITS - ((pkey+1) * AMR_BITS_PER_PKEY))
> > +#define EXECUTE_ONLY_KEY 2
> 
> Do we ensure we have at least 3 keys anywhere?

No. Good to add. Can this be different patch?

However we do not have any systems with less than 16keys AFAICT.

RP

^ permalink raw reply

* Re: [PATCH] arch: powerpc: pci-common: fix wrong return value check on phd_id
From: Daniel Walker @ 2018-06-19 16:57 UTC (permalink / raw)
  To: Guilherme G. Piccoli, mpe, benh
  Cc: Andrew Morton, xe-kernel, linux-kernel, Paul Mackerras,
	linuxppc-dev, Mauro Rodrigues, linux-pci
In-Reply-To: <CALJn8nM+S7STs+_MTje1qVCpa6MvnROS2+b5_GDz2wYs8etapA@mail.gmail.com>

On 06/19/2018 09:26 AM, Guilherme G. Piccoli wrote:
>> [...]
>> What your doing is changing the phb_id to some transformation of "reg" for
>> all platforms except which have "ibm,opal-phbid". This is what we observe.
>> This is a radical altering of the prior phb_id selection before your patch.
>>
>> The question is, was that your intent ? We are expecting these numbers
>> aren't getting altered in this way, this is why we have the patch. Your
>> patch description appears to suggest you want this type of selection for
>> "... pSeries and PowerNV cases)." , so I am assuming you did this by
>> mistake. Also I don't see a reason to make this change for all platforms.
> It was the intent - whenever we have device-tree information, the idea is to
> use it to have more consistent numbering across reboots and PCI hotplugs.
> The reason to change that originally is hotplugging a PCI device added
> the device with a different PHB/Domain value, and network predictable naming
> messed-up interfaces' names, leading to a machine without networking.

Ok ..

>
>
>> [...]
>>
>> We have already done this, that is how we determined your patch was changing
>> the domain values. There isn't much to tell w.r.t this issue on our side, we
>> are expecting the domain numbers are set a specific way and they aren't
>> getting set that way. The drivers which are looking for PCI devices are not
>> in kernel space, and the PCI domain values are getting taken from userspace.
>>
> Oh okay, I imagined it was some crazy userspace-based PCI domain scheme
> that led you to report the issue - confirmed heheh
> So, you expect the old behavior right? Incremental PHB numbering.
> I think we could have a kernel config option to allow that, this way
> you could rebuild your kernel with that option and keep the old behavior.
>
> This idea needs to be validated by the maintainers, or perhaps they could
> propose another way to keep the old behavior to interested parties.
>
> [Looping linux-pci too - original thread at:
> https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-June/174760.html]

I didn't look into changing the behavior on our side because it didn't 
look like the intent of the patch was to make a global change. I can 
take a look at changing this behavior on our side , given that this was 
intended by your changes.

However, they're may be other platforms or drivers which depend on these 
numbers getting set a certain way, so there may be other userspace 
dependencies on these values.

Daniel

^ permalink raw reply

* Re: [PATCH] QE GPIO: Add qe_gpio_set_multiple
From: Scott Wood @ 2018-06-19 17:04 UTC (permalink / raw)
  To: York Sun, jocke@infinera.com, linuxppc-dev, Qiang Zhao
In-Reply-To: <VI1PR04MB207831899BEC8092D346712B9A700@VI1PR04MB2078.eurprd04.prod.outlook.com>

On Tue, 2018-06-19 at 16:38 +0000, York Sun wrote:
> On 06/19/2018 09:22 AM, Joakim Tjernlund wrote:
> > This cousin to gpio-mpc8xxx was lacking a multiple pins method,
> > add one.
> > 
> > Signed-off-by: Joakim Tjernlund <joakim.tjernlund@infinera.com>
> > ---
> >  drivers/soc/fsl/qe/gpio.c | 28 ++++++++++++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> > 
> 
> Joakim,
> 
> I am not the maintainer for this driver. Adding Scott.
> 
> York
> 

Qiang Zhao is the maintainer for drivers/soc/fsl/qe

-Scott

^ permalink raw reply

* Re: [PATCH 1/3] kbuild: add macro for controlling warnings to linux/compiler.h
From: Masahiro Yamada @ 2018-06-19 17:34 UTC (permalink / raw)
  To: Paul Burton, Arnd Bergmann
  Cc: Linux Kbuild mailing list, Mauro Carvalho Chehab, Linux-MIPS,
	Ingo Molnar, Matthew Wilcox, Thomas Gleixner, Douglas Anderson,
	Josh Poimboeuf, Andrew Morton, Matthias Kaehlcke, He Zhe,
	Benjamin Herrenschmidt, Michal Marek, Khem Raj, Christophe Leroy,
	Al Viro, Stafford Horne, Gideon Israel Dsouza, Kees Cook,
	Michael Ellerman, Heiko Carstens, Linux Kernel Mailing List,
	Paul Mackerras, linuxppc-dev
In-Reply-To: <20180616005323.7938-2-paul.burton@mips.com>

Hi.

2018-06-16 9:53 GMT+09:00 Paul Burton <paul.burton@mips.com>:
> From: Arnd Bergmann <arnd@arndb.de>
>
> I have occasionally run into a situation where it would make sense to
> control a compiler warning from a source file rather than doing so from
> a Makefile using the $(cc-disable-warning, ...) or $(cc-option, ...)
> helpers.
>
> The approach here is similar to what glibc uses, using __diag() and
> related macros to encapsulate a _Pragma("GCC diagnostic ...") statement
> that gets turned into the respective "#pragma GCC diagnostic ..." by
> the preprocessor when the macro gets expanded.
>
> Like glibc, I also have an argument to pass the affected compiler
> version, but decided to actually evaluate that one. For now, this
> supports GCC_4_6, GCC_4_7, GCC_4_8, GCC_4_9, GCC_5, GCC_6, GCC_7,
> GCC_8 and GCC_9. Adding support for CLANG_5 and other interesting
> versions is straightforward here. GNU compilers starting with gcc-4.2
> could support it in principle, but "#pragma GCC diagnostic push"
> was only added in gcc-4.6, so it seems simpler to not deal with those
> at all. The same versions show a large number of warnings already,
> so it seems easier to just leave it at that and not do a more
> fine-grained control for them.
>
> The use cases I found so far include:
>
> - turning off the gcc-8 -Wattribute-alias warning inside of the
>   SYSCALL_DEFINEx() macro without having to do it globally.
>
> - Reducing the build time for a simple re-make after a change,
>   once we move the warnings from ./Makefile and
>   ./scripts/Makefile.extrawarn into linux/compiler.h
>
> - More control over the warnings based on other configurations,
>   using preprocessor syntax instead of Makefile syntax. This should make
>   it easier for the average developer to understand and change things.
>
> - Adding an easy way to turn the W=1 option on unconditionally
>   for a subdirectory or a specific file. This has been requested
>   by several developers in the past that want to have their subsystems
>   W=1 clean.
>
> - Integrating clang better into the build systems. Clang supports
>   more warnings than GCC, and we probably want to classify them
>   as default, W=1, W=2 etc, but there are cases in which the
>   warnings should be classified differently due to excessive false
>   positives from one or the other compiler.
>
> - Adding a way to turn the default warnings into errors (e.g. using
>   a new "make E=0" tag) while not also turning the W=1 warnings into
>   errors.
>
> This patch for now just adds the minimal infrastructure in order to
> do the first of the list above. As the #pragma GCC diagnostic
> takes precedence over command line options, the next step would be
> to convert a lot of the individual Makefiles that set nonstandard
> options to use __diag() instead.
>
> [paul.burton@mips.com:
>   - Rebase atop current master.
>   - Add __diag_GCC, or more generally __diag_<compiler>, abstraction to
>     avoid code outside of linux/compiler-gcc.h needing to duplicate
>     knowledge about different GCC versions.
>   - Add a comment argument to __diag_{ignore,warn,error} which isn't
>     used in the expansion of the macros but serves to push people to
>     document the reason for using them - per feedback from Kees Cook.]
>
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> Signed-off-by: Paul Burton <paul.burton@mips.com>
> Cc: Michal Marek <michal.lkml@markovi.net>
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: Douglas Anderson <dianders@chromium.org>
> Cc: Al Viro <viro@zeniv.linux.org.uk>
> Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
> Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
> Cc: Matthew Wilcox <matthew@wil.cx>
> Cc: Matthias Kaehlcke <mka@chromium.org>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Josh Poimboeuf <jpoimboe@redhat.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Gideon Israel Dsouza <gidisrael@gmail.com>
> Cc: Christophe Leroy <christophe.leroy@c-s.fr>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Stafford Horne <shorne@gmail.com>
> Cc: Khem Raj <raj.khem@gmail.com>
> Cc: He Zhe <zhe.he@windriver.com>
> Cc: linux-kbuild@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-mips@linux-mips.org
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
>
>  include/linux/compiler-gcc.h   | 66 ++++++++++++++++++++++++++++++++++
>  include/linux/compiler_types.h | 18 ++++++++++
>  2 files changed, 84 insertions(+)
>
> diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
> index f1a7492a5cc8..aba64a2912d8 100644
> --- a/include/linux/compiler-gcc.h
> +++ b/include/linux/compiler-gcc.h
> @@ -347,3 +347,69 @@
>  #if GCC_VERSION >= 50100
>  #define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
>  #endif
> +
> +/*
> + * turn individual warnings and errors on and off locally, depending
> + * on version.
> + */
> +#define __diag_GCC(version, s) __diag_GCC_ ## version(s)
> +
> +#if GCC_VERSION >= 40600
> +#define __diag_str1(s) #s
> +#define __diag_str(s) __diag_str1(s)
> +#define __diag(s) _Pragma(__diag_str(GCC diagnostic s))
> +
> +/* compilers before gcc-4.6 do not understand "#pragma GCC diagnostic push" */
> +#define __diag_GCC_4_6(s) __diag(s)
> +#else
> +#define __diag(s)
> +#define __diag_GCC_4_6(s)
> +#endif
> +
> +#if GCC_VERSION >= 40700
> +#define __diag_GCC_4_7(s) __diag(s)
> +#else
> +#define __diag_GCC_4_7(s)
> +#endif
> +
> +#if GCC_VERSION >= 40800
> +#define __diag_GCC_4_8(s) __diag(s)
> +#else
> +#define __diag_GCC_4_8(s)
> +#endif
> +
> +#if GCC_VERSION >= 40900
> +#define __diag_GCC_4_9(s) __diag(s)
> +#else
> +#define __diag_GCC_4_9(s)
> +#endif
> +
> +#if GCC_VERSION >= 50000
> +#define __diag_GCC_5(s) __diag(s)
> +#else
> +#define __diag_GCC_5(s)
> +#endif
> +
> +#if GCC_VERSION >= 60000
> +#define __diag_GCC_6(s) __diag(s)
> +#else
> +#define __diag_GCC_6(s)
> +#endif
> +
> +#if GCC_VERSION >= 70000
> +#define __diag_GCC_7(s) __diag(s)
> +#else
> +#define __diag_GCC_7(s)
> +#endif
> +
> +#if GCC_VERSION >= 80000
> +#define __diag_GCC_8(s) __diag(s)
> +#else
> +#define __diag_GCC_8(s)
> +#endif
> +
> +#if GCC_VERSION >= 90000
> +#define __diag_GCC_9(s) __diag(s)
> +#else
> +#define __diag_GCC_9(s)
> +#endif


Hmm, we would have to add this for every release.



> diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
> index 6b79a9bba9a7..313a2ad884e0 100644
> --- a/include/linux/compiler_types.h
> +++ b/include/linux/compiler_types.h
> @@ -271,4 +271,22 @@ struct ftrace_likely_data {
>  # define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) == sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) == sizeof(long))
>  #endif
>
> +#ifndef __diag
> +#define __diag(string)
> +#endif
> +
> +#ifndef __diag_GCC
> +#define __diag_GCC(string)
> +#endif




__diag_GCC() takes two arguments,
so it should be:

#ifndef __diag_GCC
#define __diag_GCC(version, s)
#endif


Otherwise, this would cause warning like this:


arch/arm64/kernel/sys.c:40:1: error: macro "__diag_GCC" passed 2
arguments, but takes just 1
 SYSCALL_DEFINE1(arm64_personality, unsigned int, personality)
 ^~~~~~~~~~







> +#define __diag_push()  __diag(push)
> +#define __diag_pop()   __diag(pop)
> +
> +#define __diag_ignore(compiler, version, option, comment) \
> +       __diag_ ## compiler(version, ignored option)
> +#define __diag_warn(compiler, version, option, comment) \
> +       __diag_ ## compiler(version, warning option)
> +#define __diag_error(compiler, version, option, comment) \
> +       __diag_ ## compiler(version, error   option)
> +


To me, it looks like this is putting GCC/Clang specific things
in the common file, <linux/compiler_types.h> .

All compilers must use "ignored", "warning", "error",
not allowed to use "ignore".



I also wonder if we could avoid proliferating __diag_GCC_*.



>  #endif /* __LINUX_COMPILER_TYPES_H */
> --
> 2.17.1
>


I attached a bit different implementation below.

I used -Wno-pragmas to avoid unknown option warnings.




diff --git a/Makefile b/Makefile
index ca2af1a..d610d81 100644
--- a/Makefile
+++ b/Makefile
@@ -817,6 +817,8 @@ KBUILD_CFLAGS   += $(call cc-option,-Werror=designated-init)
 # change __FILE__ to the relative path from the srctree
 KBUILD_CFLAGS  += $(call cc-option,-fmacro-prefix-map=$(srctree)/=)

+KBUILD_CFLAGS   += $(call cc-option,-Wno-pragmas)
+
 # use the deterministic mode of AR if available
 KBUILD_ARFLAGS := $(call ar-option,D)

diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index f1a7492..3f9c1cc 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -3,6 +3,8 @@
 #error "Please don't include <linux/compiler-gcc.h> directly, include
<linux/compiler.h> instead."
 #endif

+#include <linux/stringify.h>
+
 /*
  * Common definitions for all gcc versions go here.
  */
@@ -259,6 +261,16 @@
  */
 #define __visible      __attribute__((externally_visible))

+/* turn individual warnings and errors on and off locally */
+#define __diag_gcc(s)  _Pragma(__stringify(GCC diagnostic s))
+
+#define __diag_push()  __diag_gcc(push)
+#define __diag_pop()   __diag_gcc(pop)
+
+#define __diag_ignore(option, comment) __diag_gcc(ignored __stringify(option))
+#define __diag_warn(option, comment)   __diag_gcc(warning __stringify(option))
+#define __diag_error(option, comment)  __diag_gcc(error __stringify(option))
+
 #endif /* GCC_VERSION >= 40600 */
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h
index 6b79a9b..32e354f 100644
--- a/include/linux/compiler_types.h
+++ b/include/linux/compiler_types.h
@@ -271,4 +271,24 @@ struct ftrace_likely_data {
 # define __native_word(t) (sizeof(t) == sizeof(char) || sizeof(t) ==
sizeof(short) || sizeof(t) == sizeof(int) || sizeof(t) ==
sizeof(long))
 #endif

+#ifndef __diag_push
+#define __diag_push()
+#endif
+
+#ifndef __diag_pop
+#define __diag_pop()
+#endif
+
+#ifndef __diag_ignore
+#define __diag_ignore(option, comment)
+#endif
+
+#ifndef __diag_warn
+#define __diag_warn(option, comment)
+#endif
+
+#ifndef __diag_error
+#define __diag_error(option, comment)
+#endif
+
 #endif /* __LINUX_COMPILER_TYPES_H */





Usage is

       __diag_push();
       __diag_ignore(-Wattribute-alias,
                     "Type aliasing is used to sanitize syscall arguments");
              ...
       __diag_pop();




Comments, ideas are appreciated.




-- 
Best Regards
Masahiro Yamada

^ permalink raw reply related

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox